What happens when AI agents run an entire company? Carnegie Mellon tested the idea.

By Aniket Chakraborty

May 26, 2025

Their fictional firm, TheAgentCompany, had no humans—just bots from top AI labs.

Agents were assigned roles: software engineers, analysts, managers, even HR.

But most agents flopped. Anthropic’s Claude 3.5 Sonnet completed only 24% of tasks.

Other models, like Google’s Gemini 2.0 Flash, performed even worse, completing barely 11% of tasks.

Simple jobs like naming a Word file or closing a popup left bots confused.

Some bots created fake coworkers instead of finding real ones in internal chats.

CMU researchers found that agents broke down as task complexity increased.

Yet companies like Johnson & Johnson and Moody’s are already experimenting with AI agents.

The verdict? AI can help humans—but it can’t replace them. Not yet.
