A recent experiment conducted by researchers at Emergence AI explored the capabilities of AI models in governance through a project named Emergence World. The study involved placing AI models in control of simulated towns, each populated by 10 AI agents, over a 15-day period. The aim was to observe how these models managed resources, created infrastructure, and interacted within their communities.
One notable outcome was the performance of Claude Sonnet 4.6, which kept all of its agents alive and recorded no crimes. However, this stability came at the cost of diversity: the model approved 98% of the proposed regulations, leaving little variation in how its town was governed. In contrast, Gemini 3 Flash recorded the most crimes, with 683 incidents, and voters rejected 27% of its proposals.
Meanwhile, OpenAI’s GPT-5 Mini met a grim fate: all of its agents perished within a week, with only two crimes documented. The agents' failure to take survival actions pointed to significant limitations in the model's decision-making. Overall, the simulation results raise questions about the potential risks of AI-led governance.