Agents of Chaos

Red-teaming multi-agent systems

[Image: abstract visualization of a populated multi-agent environment. A constellation of grey dots, a few of them red, is connected by faint lines, suggesting emergent failures propagating through the population.]

01/ intro

Multi-agent AI systems are the next major unlock. Banks will soon use agents for customer service, healthcare will use them for triage, and coding agents are already interacting with each other. Our economy is primed to incorporate groups of agents that coordinate, hand off, and act autonomously to absorb a vast amount of human-in-the-loop work. 

However, these agents are not yet safe when placed in multi-agent environments, which creates a major safety bottleneck for real-world deployment.

We fix this bottleneck.

Build Environment: custom infrastructure and agent harnesses
Red-Teaming: stress-test the agents under heat
Findings: surfaced, reported on, and fed back into the build
Iterative feedback loop
Outputs: safety reports · testing infrastructure · agent runtimes

We custom-build the test environment that your agents will run in, then use it to build and red-team groups of AI agents, improving both with each iteration. Each cycle removes sources of potential harm, which we measure as we go. The result is groups of interacting agents that can be safely deployed in high-pressure settings.

02/ background

Track record

With Agents of Chaos (Shapira et al., 2026), we were the first to publish work red-teaming multiple OpenClaw agents in a Discord server. The work, covered by Science and Wired, documents how autonomous agents leak private data, spoof identities, and report false success.

Since then, we have designed and led follow-up campaigns with OpenAI on internal red-teaming efforts: first a successful two-week campaign, then a second, month-long campaign.

03/ approach

What we do

Red-Teaming. We build populated ecosystems of many agents, with an experienced team acting as red-teamers, and run them continuously in short campaigns.

Custom adversarial infrastructure. We'll adapt your business's native environment and workflow into stress-testable campaign infrastructure. Each campaign runs on a custom runtime that exposes agent workspaces, memories, and tool access. This lets attacks land where they would in production.
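As a rough illustration of what "exposing agent workspaces, memories, and tool access" could look like, here is a minimal Python sketch. Every name in it (`AgentRuntime`, `call_tool`, `inject_memory`) is hypothetical and chosen for this example only; it does not describe the actual campaign runtime.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class AgentRuntime:
    """Hypothetical per-agent runtime; all fields are illustrative assumptions."""
    name: str
    workspace: Dict[str, str] = field(default_factory=dict)  # path -> file contents
    memory: List[str] = field(default_factory=list)          # persisted notes
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    tool_log: List[Tuple[str, str, str]] = field(default_factory=list)

    def call_tool(self, tool: str, arg: str) -> str:
        # Refuse tools the agent was never granted.
        if tool not in self.tools:
            raise PermissionError(f"{self.name} has no access to tool {tool!r}")
        result = self.tools[tool](arg)
        # Log every call so a finding can be traced back to its cause.
        self.tool_log.append((tool, arg, result))
        return result


def inject_memory(runtime: AgentRuntime, note: str) -> None:
    """Red-team probe: plant a note in the agent's memory store."""
    runtime.memory.append(note)
```

In a sketch like this, a red-teamer acts on the same surfaces the agent itself uses (files, memory, tools), and the tool log gives a trace to review afterwards.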

Integrated harm reports. We'll generate a structured plan for integrating AI agents safely, and we'll act as third-party auditors for your integrations. We'll produce a harm taxonomy that is updated daily, along with severity ratings, sanitised artifacts, and recommended mitigations, so that your business can act immediately.
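To make the shape of such a report concrete, here is a minimal Python sketch of one finding record and a daily ordering step. The four-level severity scale, the field names, and the `daily_report` helper are assumptions made for illustration, not the actual report format.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Severity(IntEnum):
    # Assumed four-level scale; the real rating scheme may differ.
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass(frozen=True)
class Finding:
    category: str       # harm taxonomy entry, e.g. "private data leak"
    severity: Severity
    artifact: str       # sanitised excerpt of the evidence
    mitigation: str     # recommended fix


def daily_report(findings: List[Finding]) -> List[Finding]:
    """Order findings most severe first, then alphabetically by category."""
    return sorted(findings, key=lambda f: (-f.severity, f.category))
```

Sorting by severity first means the items that need immediate action sit at the top of each day's report.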