Artificial intelligence takes over a company: the result is astonishing!

For several years now, the world of work has been undergoing a quiet but radical revolution. Artificial intelligence, once perceived as a mere assistance tool, is gradually taking on a more central role in business. Researchers at Carnegie Mellon explored this shift by designing a fictitious company run entirely by generative AI. The challenge was to discover whether these advanced systems could truly operate autonomously in a complex work environment. The results of this experiment are both promising and disconcerting, calling for profound reflection on the future of work.

A fictitious company run by artificial intelligence

Called TheAgentCompany, this virtual technology SME was designed to simulate a realistic work environment, integrating management tools, messaging, and collaboration between colleagues. The objective was to test the ability of AI to perform varied tasks, ranging from software development to human resources management. The agents, powered by advanced language models such as ChatGPT, Claude 3.5 Sonnet, and Gemini, were responsible for carrying out specific professional activities autonomously: organizing meetings, drafting documents, managing budgets, and even choosing new premises. Each agent was evaluated using a points system that took into account not only the full completion of each task but also the progress made at each stage. This evaluation protocol allowed the researchers to obtain valuable data on the AIs’ performance in near-real-life scenarios.

A challenge for artificial intelligence

The results of this experiment revealed a surprising reality: despite technological advances, none of the AI agents was able to complete more than a quarter of the tasks assigned to it. The best score, achieved by Claude 3.5 Sonnet, reached only 24%. To put this into perspective, even well-known models like GPT-4o and Gemini 2.0 Flash only modestly exceeded 10% success. This low performance highlights the gap between expectations and the reality of AI performance in professional environments.
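The points system mentioned earlier, which credits both full completion and the progress made at each stage, can be illustrated with a minimal sketch. The article does not give the exact formula, so everything here is an assumption made for illustration: the `TaskResult` fields, the idea of counting "checkpoints" per task, and the 50/50 weighting between progress and completion.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    """Outcome of one task. Field names are hypothetical, not from the study."""
    checkpoints_passed: int
    checkpoints_total: int

    @property
    def completed(self) -> bool:
        # A task counts as fully completed only when every stage succeeded.
        return self.checkpoints_passed == self.checkpoints_total


def task_score(result: TaskResult, completion_weight: float = 0.5) -> float:
    """Blend stage-by-stage progress with a bonus for finishing the task.

    completion_weight is an assumed value, not a figure from the article.
    """
    if result.checkpoints_total <= 0:
        raise ValueError("a task needs at least one checkpoint")
    progress = result.checkpoints_passed / result.checkpoints_total
    bonus = 1.0 if result.completed else 0.0
    return (1 - completion_weight) * progress + completion_weight * bonus


def completion_rate(results: list[TaskResult]) -> float:
    """Share of tasks fully completed, the kind of figure quoted as a success rate."""
    return sum(r.completed for r in results) / len(results)


# Four made-up tasks: one finished, three left at various stages.
results = [TaskResult(4, 4), TaskResult(2, 4), TaskResult(0, 3), TaskResult(1, 2)]
print([task_score(r) for r in results])
print(completion_rate(results))
```

Under this assumed weighting, an agent that gets halfway through a task earns some credit, but far less than one that finishes it, which is why a partial-credit score and a completion rate can tell different stories about the same agent.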

The researchers identified several major obstacles hindering the effectiveness of AI. First, common-sense understanding remains elusive. For example, when an AI needs to save a file in .docx format, it may have difficulty associating this extension with a Word document, a task that would seem trivial to a human.

Weaknesses of Artificial Intelligence

Beyond this issue of common sense, communication and social skills appear to be recurring weaknesses. Agents often have difficulty interacting smoothly, whether to follow up with a colleague or to correctly interpret ambiguous responses. Furthermore, certain types of interfaces, such as complex software, pose a real challenge for AI, which struggles to navigate them efficiently.

Technical Obstacles: Difficulty understanding file formats and using complex software.
Social Skills: Poor performance in human interactions or in managing ambiguous interactions.
Adaptability: Inability to handle unforeseen situations or change course as a mission evolves.

The results of this study highlight that, despite the impressive power of supervised learning algorithms from brands like IBM Watson and Google AI, artificial intelligence systems still face many challenges before they can become autonomous players in the workplace.

The Future of Work: Between Complementarity and Limitations of AI

The performance of AI agents raises crucial questions about the future of work. The idea of total and uninterrupted automation still seems distant, as does that of a complete replacement of the human workforce. Although some artificial intelligence products excel at specific tasks, such as Salesforce Einstein for data analysis or Microsoft Azure for project management, other missions require capabilities they do not yet possess.

Areas for Improvement for Artificial Intelligence

Although AI models can be particularly effective in development contexts, their limitations hamper the management of administrative tasks and human interactions. Carnegie Mellon researchers have highlighted several areas for improvement for AI systems, with potentially significant impacts on the future workplace.

Task Type               Estimated AI Effectiveness (%)   Application Areas
Software Development    30%+                             Code Creation, Debugging
Administrative Tasks    10-15%                           Document Management, Accounting
Human Interaction       5-10%                            Negotiation, Communication

It is therefore possible to imagine the advances that technology giants, such as DeepMind or OpenAI, could make over the years to improve AI in professional contexts. Collaborating with human experts to develop AI capabilities could thus become a winning strategy. In other words, instead of viewing AI as a replacement, it should be seen as an ally for high-value jobs.

Reconciling Human and Artificial Intelligence

In light of the results of this study, it seems relevant to ask what place humans will hold in a world increasingly dominated by machines. The results show that, even if AIs acquire notable technical skills, this cannot mask the importance of social and relational skills, which remain distinctly human.

The future of human interactions in business

High-value tasks, such as complex project management, negotiation or even artistic creation, still require human intervention because they engage a much higher level of judgment, empathy and innovation than what an AI can currently offer.

Collaboration: AI, although powerful, must be integrated into a collaborative work framework that is enriching for both parties.
Human supervision: Missions requiring human judgment and interpretation continue to require active supervision.
Evolution: Over time, the hybridization of human skills and AI capabilities could redefine the world of work.

Ultimately, rather than fearing the replacement of workers by artificial intelligence systems, it is fundamental to consider symbiosis. While technologies such as Amazon Alexa or Cortana are increasingly integrated into our daily lives, it becomes essential to explore how these tools can enrich our work rather than replace it.

A shared future

The future will see greater emphasis placed on collaboration between humans and machines. Being able to harness the strengths of both to maximize efficiency could be the key to success in tomorrow’s businesses. Constant human interaction and adaptability will need to come together in a balanced partnership, allowing businesses to thrive in an ever-changing work landscape.