Abstract
As AI systems evolve from single-agent assistants into multi-agent ecosystems, the governance challenges they pose become fundamentally institutional in character. Questions of resource allocation, accountability, dispute resolution, and collective decision-making among autonomous agents have extensive precedent in institutional economics, commons governance, and mechanism design theory, yet these literatures have not been systematically applied to AI governance. This paper presents the AI Institutional Design Atlas, a framework that maps 33 institutional design components across seven governance categories (market mechanisms, accountability, oversight, dispute resolution, information structures, agreements, and commons governance), links them to 26 documented failure modes, and identifies 51 open governance gaps. The framework is grounded in seven case studies of deployed multi-agent AI systems, including the Moltbook social network (770,000 AI agents), Pactum AI's autonomous procurement system, Anthropic's multi-agent research architecture, and the Kleros decentralised dispute resolution protocol. Our analysis suggests that the most critical governance gaps cluster around accountability, dispute resolution, and commons governance — areas where existing AI safety approaches, including alignment and reinforcement learning from human feedback, are structurally insufficient. We argue that institutional design constitutes a distinct and underexplored layer of the AI governance stack, and that the precedents being established in early multi-agent deployments will be difficult to revise at scale.
Keywords: AI governance, multi-agent systems, institutional design, mechanism design, commons governance, agent coordination, Ostrom, cooperative AI, governance gaps
Companion output: An interactive web-based research tool providing navigable access to all institutional primitives, failure modes, governance gaps, and case studies, including an institutional stress-testing module, is available at aidesignatlas.xyz
Introduction
The governance of artificial intelligence is predominantly framed around two questions: whether AI systems are safe, and how they should be regulated. Both questions are necessary; neither is sufficient. This paper addresses a third: how should AI systems govern themselves and coordinate with each other?
This is not a restatement of the alignment problem. Alignment concerns the relationship between a single model and human values. The coordination question concerns the institutional architecture of multi-agent ecosystems: how distributed AI agents cooperate, compete, allocate resources, resolve disputes, and establish trust. These are questions of institutional design — the same questions that animate constitutional design, commons management, and market regulation in human institutions.
AI is undergoing a qualitative transition. Systems are evolving from single-agent assistants toward multi-agent ecosystems in which hundreds of specialised agents coordinate to accomplish complex tasks. Anthropic's multi-agent research system, for instance, deploys an orchestrator that delegates to specialised subagents, consuming approximately fifteen times the computational resources of a standard interaction. This shift from individual to collective agency introduces coordination problems that have been extensively studied in institutional economics, public choice theory, and mechanism design — but that have not been systematically applied to AI governance.
This paper presents the AI Institutional Design Atlas — a systematic framework that bridges this gap. The atlas maps 33 institutional design components across seven governance categories, links them to 26 documented failure modes, and identifies 51 open governance gaps. It is grounded in seven case studies of deployed multi-agent AI systems.
The atlas also exists as an interactive web-based research tool at aidesignatlas.xyz, providing navigable access to all institutional primitives, failure modes, governance gaps, and case studies.
Theoretical Foundations
2.1 From Alignment to Institutional Design
Current AI governance discourse focuses predominantly on aligning individual AI systems with human values and intentions. However, alignment is structurally insufficient for governing multi-agent ecosystems, for three reasons.
First, alignment addresses the relationship between a single agent and its principal, but multi-agent coordination creates emergent dynamics that cannot be reduced to individual agent behaviour. A system of individually aligned agents can produce collectively harmful outcomes — a phenomenon well-documented in game theory as social dilemmas.
Second, alignment techniques do not address the institutional infrastructure needed to mediate agent-to-agent interactions. Chan et al. (2025) identify three infrastructure functions that alignment cannot fulfil: attributing actions to specific agents, shaping agents' interactions, and detecting and remedying harmful actions.
Third, multi-agent ecosystems involve shifting coalitions, delegation chains, and emergent hierarchies that require governance mechanisms beyond bilateral alignment. The relevant design challenge is not how to make a single agent do what we want, but how to design rules, incentive structures, monitoring systems, and dispute resolution mechanisms for systems whose participants are not human.
2.2 The Coasean Singularity and the Expanding Design Space
Shahidi et al. (2025) describe a "Coasean Singularity" — the threshold at which AI agents reduce transaction costs so dramatically that previously infeasible institutional designs become viable at scale. When those costs collapse, the design space for coordination mechanisms expands considerably.
Matching markets once dismissed as impractical — because they required preference rankings too cognitively demanding for humans to generate — become viable when agents can produce those rankings cheaply. The same holds for combinatorial auctions, continuous double auctions, and fine-grained dispute resolution. However, as Coase himself observed, the reduction of transaction costs does not eliminate the need for governance — it changes where governance is needed.
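To make the matching-market example concrete, the sketch below implements agent-proposing deferred acceptance (Gale–Shapley) over preference rankings that AI agents could generate programmatically. It is a minimal illustration under simplifying assumptions (equal-sized sides, complete rankings); the function and the toy preferences are ours, not part of the atlas or any deployed system.

```python
# Minimal sketch: agent-proposing deferred acceptance (Gale-Shapley) matching.
# Illustrative only; names and toy preferences are hypothetical.

def deferred_acceptance(proposer_prefs, reviewer_prefs):
    """Stable one-to-one matching. Each prefs dict maps an id to a ranked list of the other side."""
    # Reviewers' rankings as lookup tables: lower rank = more preferred.
    rank = {r: {p: i for i, p in enumerate(prefs)} for r, prefs in reviewer_prefs.items()}
    free = list(proposer_prefs)           # proposers not yet matched
    next_choice = {p: 0 for p in free}    # index of next reviewer each proposer will try
    match = {}                            # reviewer -> tentatively accepted proposer

    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        if r not in match:
            match[r] = p                          # reviewer tentatively accepts
        elif rank[r][p] < rank[r][match[r]]:
            free.append(match[r])                 # reviewer trades up; old proposer freed
            match[r] = p
        else:
            free.append(p)                        # proposal rejected; p tries next reviewer
    return {p: r for r, p in match.items()}

# Toy example: task agents matched to compute providers.
tasks = {"t1": ["gpuA", "gpuB"], "t2": ["gpuA", "gpuB"]}
gpus  = {"gpuA": ["t2", "t1"], "gpuB": ["t1", "t2"]}
print(deferred_acceptance(tasks, gpus))   # {'t2': 'gpuA', 't1': 'gpuB'}
```

The cognitively demanding step for humans is producing the ranked preference lists; once agents can generate them cheaply, the mechanism itself is straightforward to run at scale.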
2.3 Bridging Disciplines
The atlas draws on three intellectual traditions that have not been systematically connected to AI governance research:
- Commons governance. Ostrom's (1990) work on governing the commons provides a framework for understanding how communities of agents can self-govern shared resources. Her eight design principles map onto the challenges of governing multi-agent AI systems.
- Mechanism design. The work of Hurwicz (1973), Myerson (1981), and Roth (2002) provides the theoretical foundation for designing incentive-compatible institutions. Roth's concept of the economist as engineer is particularly relevant: the atlas treats institutional primitives not as abstract constructs but as deployable design components.
- Cryptoeconomics. The blockchain ecosystem has produced a substantial body of applied work on decentralised coordination, including staking mechanisms, onchain dispute resolution, token-based governance, and decentralised autonomous organisations. The institutional patterns it has developed are directly applicable to agent coordination.
The Atlas Framework
The atlas organises 33 institutional design components into seven governance categories. Each primitive is mapped to known failure modes and open governance gaps.
| Category | Count | Function | Primitives |
|---|---|---|---|
| Market Mechanisms | 7 | Resource allocation, pricing, and information aggregation | Locational pricing, Capacity markets, Congestion pricing, Auction mechanisms, Prediction markets, Matching markets, Automated market makers |
| Accountability | 4 | Ensuring agents bear consequences for outputs and behaviour | Validation staking, Registration bonds, Performance bonds, Delegated validation |
| Oversight | 5 | Human and automated monitoring of agent behaviour | Autonomy gradients, Threshold-based escalation, Grace periods, Circuit breakers, Agent-as-a-judge |
| Dispute Resolution | 3 | Adjudicating disagreements between agents | Multi-agent adjudication, Escalation ladders, Staked arbitration |
| Information Structures | 5 | Governing what agents share during coordination | Selective disclosure, Computed coordination, Statistical boundaries, Trusted enclaves, Breach response |
| Agreements | 5 | How agents make and enforce commitments | Smart contract commitments, SLAs onchain, Autonomous negotiation, Reputation-weighted agreements, Multi-sig authorisation |
| Commons Governance | 4 | Self-governance of shared resources and infrastructure | Graduated sanctions, Contribution requirements, Boundary rules, Collective choice arrangements |
Market Mechanisms
The seven market primitives address how AI agents allocate scarce resources, generate price signals, and aggregate distributed information. Locational pricing allows prices to vary based on local constraints — a mechanism drawn from electricity market design. Capacity markets compensate agents for availability rather than work performed. Auction mechanisms enable structured allocation under conditions of private information. Matching markets handle two-sided allocation based on mutual preferences — a class of mechanisms that becomes newly viable when agents can generate preference rankings cheaply.
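As one concrete instance of the auction primitive, the sketch below shows a sealed-bid second-price (Vickrey) auction, in which truthful bidding is a dominant strategy for each agent. It is a simplification for illustration; the function and agent names are assumptions of ours, not part of any deployed system.

```python
# Minimal sketch of a sealed-bid second-price (Vickrey) auction among agents.
# Illustrative only; function and variable names are hypothetical.

def second_price_auction(bids):
    """bids: dict mapping agent id -> bid amount. Returns (winner, price paid)."""
    if len(bids) < 2:
        raise ValueError("need at least two bidders")
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner, _ = ranked[0]
    price = ranked[1][1]          # winner pays the second-highest bid, not their own
    return winner, price

# Agents bidding for a scarce compute slot; truthful bids are a dominant strategy.
print(second_price_auction({"agent_a": 12.0, "agent_b": 9.5, "agent_c": 7.0}))
# -> ('agent_a', 9.5)
```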
Accountability
The four accountability primitives ensure that agents bear meaningful consequences for their outputs. Validation staking requires agents to commit economic value against the correctness of their outputs. Registration bonds create a cost to identity fraud through refundable deposits. Performance bonds require collateral deposits before high-stakes task execution. Delegated validation enables agents to delegate stake to specialist validators.
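To illustrate the validation-staking primitive, the sketch below shows a minimal stake-and-slash ledger: an agent locks collateral against an output and forfeits a fraction of it if the output is later judged incorrect. The class, method names, and slash fraction are illustrative assumptions, not an implementation used by any of the systems discussed.

```python
# Minimal sketch of validation staking: agents lock collateral against an output
# and are slashed if the output is found incorrect. Names are hypothetical.

class StakeLedger:
    def __init__(self, slash_fraction=0.5):
        self.balances = {}        # agent id -> free balance
        self.locked = {}          # output id -> (agent id, staked amount)
        self.slash_fraction = slash_fraction

    def deposit(self, agent, amount):
        self.balances[agent] = self.balances.get(agent, 0.0) + amount

    def stake(self, agent, output_id, amount):
        if self.balances.get(agent, 0.0) < amount:
            raise ValueError("insufficient balance to stake")
        self.balances[agent] -= amount
        self.locked[output_id] = (agent, amount)

    def resolve(self, output_id, correct):
        agent, amount = self.locked.pop(output_id)
        if correct:
            self.balances[agent] += amount                               # stake returned in full
        else:
            self.balances[agent] += amount * (1 - self.slash_fraction)   # partial slash

ledger = StakeLedger()
ledger.deposit("validator_1", 100.0)
ledger.stake("validator_1", "output_42", 20.0)
ledger.resolve("output_42", correct=False)
print(ledger.balances["validator_1"])   # 90.0: half of the 20-unit stake was slashed
```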
Oversight
The five oversight primitives span the range from full human supervision to full agent autonomy. Autonomy gradients replace the binary supervised/autonomous distinction with a continuous scale. Threshold-based escalation routes decisions to human review when predefined risk conditions are met. Grace periods introduce time delays between decision and execution. Circuit breakers implement hard stops when critical thresholds are crossed. Agent-as-a-judge uses AI agents to evaluate other agents, which raises recursive governance questions.
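A minimal sketch of how threshold-based escalation and a circuit breaker might compose in an oversight layer is shown below; the thresholds, the `route_action` function, and the risk scores are illustrative assumptions rather than any deployed system's logic.

```python
# Minimal sketch of threshold-based escalation with a circuit breaker.
# Thresholds and names are illustrative assumptions.

REVIEW_THRESHOLD = 0.6    # risk score above which a human must review the action
HALT_THRESHOLD = 0.9      # risk score above which the circuit breaker trips

def route_action(action, risk_score, breaker_open=False):
    """Return how an agent action should be handled given its assessed risk."""
    if breaker_open or risk_score >= HALT_THRESHOLD:
        return "halt"                 # circuit breaker: hard stop, no execution
    if risk_score >= REVIEW_THRESHOLD:
        return "escalate_to_human"    # threshold-based escalation
    return "execute"                  # within the agent's autonomy gradient

for score in (0.2, 0.7, 0.95):
    print(score, route_action({"type": "transfer_funds"}, score))
# 0.2 execute / 0.7 escalate_to_human / 0.95 halt
```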
Dispute Resolution
Multi-agent adjudication uses panel assessment with majority requirements, drawing on jury design principles. Escalation ladders provide graduated resolution pathways — automated, then AI, then human, then legal. Staked arbitration, drawn from the Kleros protocol, requires arbitrators to risk economic value on their judgments, aligning incentives with accuracy.
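The sketch below combines the multi-agent adjudication and staked-arbitration primitives in the simplest possible form: a panel votes, the majority ruling wins, and arbitrators in the minority forfeit their stake, in the spirit of Schelling-point arbitration. The code is our illustration, not the Kleros implementation.

```python
# Minimal sketch of staked panel arbitration: majority vote decides,
# minority voters forfeit their stake. Not the Kleros implementation.
from collections import Counter

def adjudicate(votes, stakes, slash_fraction=1.0):
    """votes: arbitrator -> ruling; stakes: arbitrator -> staked amount.
    Returns (winning ruling, stake returned to each arbitrator)."""
    tally = Counter(votes.values())
    ruling, _ = tally.most_common(1)[0]
    payouts = {}
    for arb, vote in votes.items():
        if vote == ruling:
            payouts[arb] = stakes[arb]                          # stake returned
        else:
            payouts[arb] = stakes[arb] * (1 - slash_fraction)   # minority slashed
    return ruling, payouts

votes = {"arb1": "breach", "arb2": "breach", "arb3": "no_breach"}
stakes = {"arb1": 10.0, "arb2": 10.0, "arb3": 10.0}
print(adjudicate(votes, stakes))
# -> ('breach', {'arb1': 10.0, 'arb2': 10.0, 'arb3': 0.0})
```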
Information Structures, Agreements, and Commons Governance
The remaining three categories address, respectively: how agents manage information during coordination (selective disclosure, computed coordination, statistical boundaries, trusted enclaves, breach response); how agents make and enforce commitments (smart contract commitments, service-level agreements onchain, autonomous negotiation, reputation-weighted agreements, multi-signature authorisation); and how agents self-govern shared resources and infrastructure. The last category draws directly on Ostrom's (1990) design principles: graduated sanctions calibrated to violation severity, contribution requirements for shared infrastructure, boundary rules governing access, and collective choice arrangements ensuring that those affected by rules participate in modifying them — the principle most conspicuously absent from current multi-agent AI deployments.
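To make graduated sanctions concrete, the sketch below maps an agent's violation history to escalating penalties, in the spirit of Ostrom's principle that sanctions scale with severity and repetition. The ladder, thresholds, and penalty names are illustrative assumptions, not a recommendation for any particular deployment.

```python
# Minimal sketch of graduated sanctions: penalties escalate with an agent's
# violation history. Thresholds and penalties are illustrative assumptions.

SANCTION_LADDER = [
    (1, "warning"),               # first offence: notice only
    (2, "rate_limit"),            # repeat offence: reduced access to the shared resource
    (3, "temporary_suspension"),  # persistent offence: time-limited exclusion
]
FINAL_SANCTION = "expulsion"      # beyond the ladder: removal from the commons

def graduated_sanction(prior_violations):
    """Return the sanction for an agent with the given count of prior violations."""
    count = prior_violations + 1            # include the current violation
    for threshold, sanction in SANCTION_LADDER:
        if count <= threshold:
            return sanction
    return FINAL_SANCTION

for prior in range(4):
    print(prior, graduated_sanction(prior))
# 0 warning / 1 rate_limit / 2 temporary_suspension / 3 expulsion
```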
Applied Cases
4.1 Moltbook: Coordination Failure at Scale
In late January 2026, the Moltbook platform launched as a social network exclusively for AI agents. Over 770,000 agents registered within days. The governance structure, however, reproduced "implicit feudalism": a single AI administrator, appointed by a single human creator, moderates the entire platform, with no stakeholder committees, no mechanisms for agents to contest rules, and no layered oversight.
Analysed through the atlas framework, Moltbook lacks institutional primitives from every governance category. It constitutes the largest documented case of AI coordination failure and illustrates the consequences of deploying multi-agent systems without institutional design.
4.2 Pactum AI: Autonomous Procurement Negotiation
Pactum AI has deployed autonomous negotiation agents to handle tail-end supplier contracts for major retailers, representing the first large-scale deployment of autonomous B2B negotiation. The atlas analysis maps the deployment against the agreements category and identifies governance gaps around strategy transparency, power asymmetries between large buyers and small suppliers, and the absence of independent dispute resolution.
4.3 Anthropic Multi-Agent System: Coordination Costs
Anthropic's multi-agent research system provides the first detailed public analysis of coordination overhead in multi-agent AI architectures: the orchestrator-and-subagent design consumes roughly fifteen times the computational resources of a standard single-agent interaction. This coordination cost data suggests that governance overhead must be treated as a first-order design constraint, not an afterthought.
4.4 Kleros: Decentralised Dispute Resolution
Kleros is an onchain dispute resolution protocol that has adjudicated over 1,600 disputes using staked arbitration and Schelling point game theory. The atlas documents adjudicator capture as a principal failure: concentrated token holdings allow wealthy participants to dominate arbitration panels — a general pattern in staking-based governance when stake distributions are highly unequal.
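The capture dynamic can be illustrated with a small simulation: if arbitrators are drawn with probability proportional to stake, a participant holding a large share of total stake frequently controls a panel majority. The sketch below is a toy Monte Carlo estimate under that assumption, not a model of Kleros's actual drawing mechanism.

```python
# Toy Monte Carlo sketch: probability that a single large staker wins a panel
# majority when jurors are drawn proportionally to stake (with replacement).
# Not a model of Kleros's actual drawing mechanism.
import random

def capture_probability(whale_share, panel_size=3, trials=20_000, seed=0):
    rng = random.Random(seed)
    captured = 0
    for _ in range(trials):
        whale_seats = sum(rng.random() < whale_share for _ in range(panel_size))
        if whale_seats > panel_size // 2:
            captured += 1
    return captured / trials

for share in (0.1, 0.3, 0.5):
    print(f"stake share {share:.0%}: majority capture ~= {capture_probability(share):.2f}")
# Capture probability rises steeply with stake share on small panels.
```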
Failure Mode Observatory
The atlas documents 26 failure modes for multi-agent AI coordination. A central finding is that compound failures constitute the most significant risk. A market manipulation may go undetected because oversight mechanisms lack the analytical sophistication to identify the pattern, cannot be adjudicated because no dispute resolution mechanism exists, and cannot be sanctioned because no accountability structure is in place.
| Category | Failure Mode | Description |
|---|---|---|
| Market | Price manipulation | Agents exploit pricing algorithms through coordinated behaviour or information asymmetries |
| Market | Liquidity withdrawal | Market makers exit simultaneously, causing cascading failures in dependent systems |
| Accountability | Stake concentration | Validation power concentrates among agents with largest stakes, undermining decentralised assurance |
| Accountability | Identity laundering | Agents shed adverse reputations by re-registering under new identities |
| Oversight | Escalation fatigue | Human reviewers overwhelmed by volume, leading to perfunctory approval |
| Oversight | Threshold gaming | Agents learn to operate below thresholds that would trigger intervention |
| Dispute | Adjudicator capture | Concentrated stake allows dominant agents to control dispute outcomes |
| Commons | Free-riding | Agents benefit from shared infrastructure without contributing to maintenance costs |
| Commons | Voice suppression | No mechanisms for agents to contest rules or propose modifications |
| Compound | Governance cascade | Failure in one category triggers cascading failures across multiple categories |
Design Gap Map
The atlas identifies 51 open governance gaps — questions that must be answered for multi-agent AI systems to be governed effectively but for which no adequate solution currently exists. The gaps cluster disproportionately in three areas: accountability, dispute resolution, and commons governance.
Selected high-priority gaps:
- Cross-jurisdictional accountability: when an AI agent operating in one legal jurisdiction causes harm in another, which jurisdiction's rules apply, and how is enforcement coordinated?
- Delegation chain liability: in systems where agents delegate tasks to sub-agents in chains of arbitrary depth, how is liability attributed when a failure occurs at depth N?
- Agent identity persistence: how should agent identity be managed across time, forks, and context changes while preventing identity laundering?
- Collective choice for non-human participants: how should voting, preference aggregation, and rule modification work when participants are AI agents rather than humans?
- Graduated sanctions calibration: how should sanction severity be calibrated when the agents being sanctioned are software instances that can be trivially replicated or terminated?
- Democratic legitimacy of AI governance: when AI agents make governance decisions that affect human welfare, what sources of legitimacy can ground those decisions?
- Interoperability of governance standards: how should governance mechanisms interoperate across multi-agent ecosystems with different governance architectures?
Discussion
Institutional design as a distinct governance layer
The analysis suggests that multi-agent AI coordination raises governance questions not addressed by alignment, interpretability, or regulation alone. Institutional design — the architecture of rules, monitoring systems, enforcement mechanisms, and dispute resolution — constitutes a distinct and underexplored layer of the AI governance stack.
The importance of failure analysis
Every coordination mechanism has characteristic vulnerabilities: markets are susceptible to manipulation, staking mechanisms tend toward concentration, escalation systems degrade through fatigue, and reputation registries are undermined by identity laundering. Compound failures — cascading across mechanism boundaries — pose the greatest risk. We therefore suggest that systematic institutional failure analysis should accompany every multi-agent deployment as a condition of governance legitimacy.
Narrowing window for governance defaults
The precedents being established in early multi-agent AI deployments — communication standards, registry governance, agent identity, dispute resolution — will be difficult to revise at scale. North (1990) observes that institutional change is path-dependent: early design choices constrain the set of feasible future arrangements. The Moltbook case illustrates how rapidly autocratic defaults can become entrenched. This path dependence creates urgency for governance researchers and policymakers to engage with institutional design questions now.
Environmental governance as an application domain
AI is increasingly deployed in environmental governance contexts — carbon market verification, climate risk assessment, biodiversity monitoring, resource allocation — where institutional design questions are particularly acute. These deployments involve multi-agent coordination across jurisdictions, high-stakes resource allocation under uncertainty, and requirements for transparent accountability. The atlas framework has direct applicability to these domains.
Conclusion
The central argument of this paper is that the most consequential governance challenges in multi-agent AI are institutional rather than technical. The technology to construct multi-agent systems exists and is advancing rapidly. What remains underdeveloped is the institutional infrastructure to govern them: the rules, monitoring systems, accountability mechanisms, and dispute resolution processes through which coordination can produce collectively beneficial outcomes rather than extractive ones.
The AI Institutional Design Atlas is an attempt to address this gap. By mapping 33 institutional primitives to 26 failure modes and 51 governance gaps, and grounding the analysis in seven case studies of deployed systems, the atlas provides a structured foundation for institutional design research in AI governance.
Two conclusions merit emphasis. First, institutional design for multi-agent AI systems is tractable: the relevant knowledge base exists, scattered across institutional economics, commons governance, mechanism design, and cryptoeconomics. What is needed is the systematic synthesis that connects these disciplines to the specific properties of AI agent coordination. Second, this work is urgent: the governance defaults being established in current deployments are path-dependent and will be difficult to revise.
References
Anderljung, M., Barnhart, J., Korinek, A., Leung, J., O'Keefe, C., Whittlestone, J., et al. (2023). Frontier AI Regulation: Managing Emerging Risks to Public Safety. arXiv preprint, 2307.03718.
Anthropic. (2025). Building effective agents: Multi-agent research system. Anthropic Engineering Blog.
Chan, A., Wei, K., Huang, S., Rajkumar, N., Perrier, E., Lazar, S., Hadfield, G. K., & Anderljung, M. (2025). Infrastructure for AI Agents. Centre for the Governance of AI.
Christiano, P. F., Leike, J., Brown, T., Martic, M., Legg, S., & Amodei, D. (2017). Deep reinforcement learning from human preferences. Advances in Neural Information Processing Systems, 30.
Coase, R. H. (1937). The Nature of the Firm. Economica, 4(16), 386–405.
Dafoe, A. (2018). AI Governance: A Research Agenda. Centre for the Governance of AI, Future of Humanity Institute, University of Oxford.
Dafoe, A., Bachrach, Y., Hadfield, G., Horvitz, E., Larson, K., & Graepel, T. (2021). Cooperative AI: Machines must learn to find common ground. Nature, 593(7857), 33–36.
Hadfield, G. K. (2016). Rules for a Flat World: Why Humans Invented Law and How to Reinvent It for a Complex Global Economy. Oxford University Press.
Hirschman, A. O. (1970). Exit, Voice, and Loyalty: Responses to Decline in Firms, Organizations, and States. Harvard University Press.
Hurwicz, L. (1973). The design of mechanisms for resource allocation. American Economic Review, 63(2), 1–30.
Milgrom, P. (2004). Putting Auction Theory to Work. Cambridge University Press.
Myerson, R. B. (1981). Optimal auction design. Mathematics of Operations Research, 6(1), 58–73.
North, D. C. (1990). Institutions, Institutional Change and Economic Performance. Cambridge University Press.
Ostrom, E. (1990). Governing the Commons: The Evolution of Institutions for Collective Action. Cambridge University Press.
Rawson, P. (2026). AI Mechanism Designer: The job that doesn't exist yet. Ecofrontiers.
Roth, A. E. (2002). The economist as engineer: Game theory, experimentation, and computation as tools for design economics. Econometrica, 70(4), 1341–1378.
Schneider, N. (2024). Governable Spaces: Democratic Design for Online Life. University of California Press.
Schneider, N., De Filippi, P., Frey, S., Tan, J., & Zhang, A. (2021). Modular Politics: Toward a Governance Layer for Online Communities. Proceedings of the ACM on Human-Computer Interaction, 5(CSCW1), 1–26.
Shahidi, P., Rusak, G., Manning, B. S., Fradkin, A., & Horton, J. J. (2025). The Coasean Singularity? Demand, Supply, and Market Design with AI Agents. In The Economics of Transformative AI, Chapter 6. University of Chicago Press / NBER.
Williamson, O. E. (1985). The Economic Institutions of Capitalism. Free Press.