Agents of Chaos — Red-teaming for multi-agent AI systems

01/ intro

Multi-agent AI systems are the next major unlock. Banks will soon use agents for customer service, healthcare will use them for triage, and coding agents are already interacting with each other. Our economy is primed to incorporate groups of agents that coordinate, hand off, and act autonomously to absorb a vast amount of human-in-the-loop work.

However, these agents are not yet safe when they operate in multi-agent environments, which creates a massive safety bottleneck for real deployment.

We fix this bottleneck.

We custom-build the test environment that your agents will run in. We then use it to build and red-team groups of AI agents, improving both with each iteration. Each round removes sources of potential harm, which we measure as we go. The result is groups of interacting agents that can be deployed safely in high-pressure settings.

02/ background

Track record

With Agents of Chaos (Shapira et al., 2026), we were the first to publish work red-teaming multiple OpenClaw agents in a Discord server. The work, covered by Science and Wired, documents how autonomous agents leak private data, spoof identity, and report false success.

Since then, we have designed and led follow-up campaigns with OpenAI on internal red-teaming efforts: first a successful two-week campaign, then a month-long second campaign.

03/ approach

What we do

Red-teaming. We create ecosystems populated with many agents, with an experienced team acting as red-teamers, and run them continuously through short campaigns.

Custom adversarial infrastructure. We'll adapt your business's native environment and workflow into stress-testable campaign infrastructure. Each campaign runs on a custom runtime that exposes agent workspaces, memories, and tool access. This lets attacks land where they would in production.

Integrated harm reports. We'll generate a structured plan for you to integrate AI agents safely, and we'll act as third-party auditors for your integrations. We'll produce a daily-updated harm taxonomy, severity ratings, sanitised artifacts, and recommended mitigations so that your business can act immediately.