Anthropic shipped a feature in Claude Code that lets multiple AI agents work on the same project at the same time — coordinating with each other, dividing labor, and reporting back as a unit. It's called Agent Teams. Here's what it actually does, how to use it, how it stacks up against every tool worth knowing about, and where OpenClaw fits into the picture.

What Agent Teams Actually Does

Until now, Claude Code operated as a single AI session. You gave it one task, it worked through it sequentially, and you waited. If the project was complex — say, building an API while also writing frontend components and tests — you were bottlenecked by one agent doing everything in order.

Agent Teams changes that. When you enable the feature, one Claude Code session acts as the team lead. It reads your prompt, plans the work, and spins up specialized teammates — each running in its own context window with its own focus area. One handles your backend. Another builds components. A third writes tests. A fourth reviews the code the others are producing.

The critical difference from just opening multiple terminal windows: these agents talk to each other. They share a task list. They send direct messages. They flag dependencies and conflicts without routing everything through you. The lead synthesizes their work and surfaces results.
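That coordination model is easiest to see as data structures. Here's a minimal conceptual sketch in Python, assuming nothing about Anthropic's actual implementation: a shared board the lead populates and teammates claim from, plus per-agent inboxes for direct messages. The names (`TaskBoard`, `Inbox`) and the sample tasks are illustrative only.

```python
from collections import defaultdict, deque

class TaskBoard:
    """Shared task list: the lead adds work, teammates claim and complete it."""
    def __init__(self):
        self.todo = deque()
        self.done = []

    def add(self, task):
        self.todo.append(task)

    def claim(self):
        # A teammate takes the next open task, or None if the board is empty.
        return self.todo.popleft() if self.todo else None

    def complete(self, task, result):
        self.done.append((task, result))

class Inbox:
    """Per-agent direct messages, so teammates flag dependencies to each other."""
    def __init__(self):
        self.messages = defaultdict(list)

    def send(self, recipient, sender, text):
        self.messages[recipient].append((sender, text))

    def read(self, recipient):
        return self.messages.pop(recipient, [])

# Illustrative flow: two tasks, one direct message, no human in the loop
board = TaskBoard()
board.add("implement /login endpoint")
board.add("build LoginForm component")

inbox = Inbox()
task = board.claim()  # backend agent picks up the endpoint
inbox.send("frontend", "backend", "auth payload shape changed: {token, ttl}")
board.complete(task, "PR #1")
```

The point of the sketch: the lead never relays the backend-to-frontend message, which is exactly what separates this from running several independent terminal sessions.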

It shipped six weeks ago — February 5, alongside the Opus 4.6 model release. It's still labeled experimental, but it's officially documented and actively being used at companies like Uber, Salesforce, and Accenture. And Anthropic hasn't stopped building on it since.

How to Enable It

Agent Teams is disabled by default. You turn it on one of two ways:

```shell
# Option 1: Environment variable
export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1
```

```json
// Option 2: Add to your settings.json
"CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": true
```

Once enabled, you describe your task and tell Claude to create an agent team. You can specify the team structure explicitly — "create a backend agent, a frontend agent, and a code reviewer" — or let Claude propose one based on your prompt. Either way, it won't spin up agents without your approval.

You need Claude Code v2.1.32 or later. Check with claude --version and update with npm update -g @anthropic-ai/claude-code if needed.

Practical Note: Agent Teams uses significantly more tokens than a single session. Each teammate has its own context window and runs independently. Early developer reports put real-world costs around $20 over two days for demo projects. For straightforward tasks, a single session or subagents are still faster and cheaper. Reserve Agent Teams for work where parallel exploration genuinely adds value — multi-layer features, debugging with competing hypotheses, or cross-stack coordination.

One thing that helps on cost: last Thursday (March 13), Anthropic dropped the long-context premium entirely. The 1M token context window now bills at standard rates — $5/$25 per million tokens for Opus 4.6, $3/$15 for Sonnet 4.6. Previously, anything over 200K tokens carried a 2x input / 1.5x output surcharge. That makes running multiple agents with large context windows meaningfully cheaper than it was even two weeks ago.
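The arithmetic is easy to check. A quick sketch using the rates from the paragraph above; the assumption that the old surcharge applied to the whole request once input crossed 200K tokens is mine, not Anthropic's documented behavior:

```python
def opus_request_cost(input_tokens, output_tokens, legacy_surcharge=False):
    """Dollar cost of one Opus 4.6 request at $5/$25 per million tokens."""
    IN_RATE, OUT_RATE = 5.00, 25.00  # dollars per million tokens
    cost_in = input_tokens / 1e6 * IN_RATE
    cost_out = output_tokens / 1e6 * OUT_RATE
    if legacy_surcharge and input_tokens > 200_000:
        # Pre-March-13 long-context premium: 2x input, 1.5x output
        cost_in *= 2.0
        cost_out *= 1.5
    return cost_in + cost_out

# A 900K-in / 50K-out request, old pricing vs. new
old = opus_request_cost(900_000, 50_000, legacy_surcharge=True)  # 10.875
new = opus_request_cost(900_000, 50_000)                         # 5.75
```

For a team of agents each carrying large context, that near-halving of long-context requests compounds quickly.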

Where It's Strongest

Not every task benefits from multiple agents. Based on what's shipping and what developers are reporting, the strongest use cases are:

New features that span layers. Frontend, backend, database, and tests — each owned by a different teammate working simultaneously instead of one agent switching context between all four.

Debugging with competing theories. Instead of testing one hypothesis at a time, spin up three teammates each investigating a different possible root cause. The one that finds it reports back. The others stop.

Research and review. Multiple teammates investigate different aspects of a problem, then share and challenge each other's findings before presenting conclusions.

Code review at scale. Last week (March 9), Anthropic followed up Agent Teams with a dedicated Code Review feature — a multi-agent system that automatically dispatches teams of agents to analyze pull requests in parallel, cross-verify findings, rank issues by severity, and leave detailed comments. It's available for Teams and Enterprise plans.
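The debugging-with-competing-theories pattern above can be sketched with nothing but the standard library. This is a conceptual analogy, not the Agent Teams API; each investigator here is a stand-in function where a real teammate would run actual checks:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def first_confirmed(hypotheses, investigate):
    """Run one investigator per hypothesis; return the first confirmed root cause."""
    with ThreadPoolExecutor(max_workers=len(hypotheses)) as pool:
        futures = {pool.submit(investigate, h): h for h in hypotheses}
        for fut in as_completed(futures):
            if fut.result():            # this investigator found the bug
                for other in futures:   # the others stop (if not yet started)
                    other.cancel()
                return futures[fut]
    return None

# Stand-in investigator: confirms exactly one of the three theories
confirmed = first_confirmed(
    ["race condition", "stale cache", "bad config"],
    lambda h: h == "stale cache",
)
```

First confirmed result wins, the rest are cancelled: the same shape as three teammates chasing different root causes.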

The numbers so far: code output per engineer is up 200% over the past year (Anthropic); PRs receiving substantive review comments jumped from 16% to 54% after Code Review; and a Code Review pass averages $15–$25 per pull request.

Agent Teams vs. Subagents

Claude Code already had subagents before this — quick, focused workers that run inside your main session and report results back. Agent Teams is fundamentally different. The distinction matters because choosing wrong wastes tokens and time.

| | Subagents | Agent Teams |
|---|---|---|
| Context | Run inside the main session's context window | Each teammate gets its own independent context window |
| Communication | Report results back to the main agent only | Message each other directly, share a task list |
| Best for | Quick, focused tasks: research lookups, file generation, single-scope work | Complex parallel work where agents coordinate, share findings, challenge each other |
| Cost | Lower; contained in one session | Higher; multiple independent sessions running simultaneously |
| Coordination | Main agent acts as intermediary | Peer-to-peer, with the lead orchestrating |
| Limits | 3–4 recommended max | Scales larger, but adds coordination overhead |

The short version: use subagents when workers just need to report back, use Agent Teams when they need to talk to each other.

How It Compares

Anthropic isn't alone in the multi-agent race. The field is crowded and moving fast. Here's how it actually breaks down as of mid-March 2026 — not just the headlines, but the architectural differences that determine which tool fits which job.

The Benchmark Reality

Six weeks ago (February 5), Anthropic launched Opus 4.6 and briefly held the Terminal-Bench 2.0 record at 65.4%. Twenty-seven minutes later, OpenAI released GPT-5.3-Codex and posted 77.3% on the same benchmark — a 12-point gap. On SWE-bench Verified (broader software engineering), Opus 4.6 scores 80.8% versus Codex at roughly 80%. On SWE-bench Pro (four languages, contamination-resistant), Opus leads at 59.0% to Codex's 56.8%. The takeaway: Codex wins on terminal execution speed. Opus wins on reasoning depth and long-context coherence. They're optimized for different workloads, and the benchmarks reflect that.

But the model is only half the story. Augment Code's agent ran Opus 4.5 on SWE-bench and solved 17 more problems than Claude Code itself — same model, different scaffolding. The agent's architecture matters as much as the model underneath.

Tier 1: The Coding Agents

Cursor shipped Cloud Agents three weeks ago (February 24) and changed the competitive calculus. Each agent runs in an isolated cloud VM with a full dev environment, browser access, and the ability to self-test. Developers can spin up 10 to 20 agents working simultaneously on separate features. Each produces a merge-ready PR with video recordings, screenshots, and logs as artifacts. Cursor reports 35% of their own internal merged PRs now come from autonomous agents (DevOps.com, Feb 24). The tradeoff: this is parallelized throughput on independent tasks — agents don't coordinate with each other the way Agent Teams does. Context management also suffers on very large codebases where manual scoping is required.

OpenAI Codex CLI is open-source (62,000+ GitHub stars), Rust-native, and optimized for raw speed and token efficiency. It hit 1M+ downloads in its first week as a desktop app. Each task runs in its own sandbox preloaded with your repository. Codex uses fewer tokens than Claude Code on identical tasks and runs significantly faster. It hasn't shipped anything equivalent to Agent Teams' peer-to-peer coordination — the focus is individual agent speed, not multi-agent orchestration.

GitHub Copilot + VS Code launched multi-agent support in January, letting you run Claude and Codex agents alongside Copilot in a single editor. Parallel subagents, custom agent definitions, and MCP Apps for interactive UI directly in chat. This is the most integrated editor experience — but it's coordinating between different vendors' agents rather than managing a unified team.

Tier 2: The Wider Field

Windsurf ($15/month) is a VS Code fork built around Cascade, an agent that plans and executes multi-step changes across your project. It supports JetBrains IDEs — which neither Cursor nor Claude Code does natively — and its persistent memory system learns your coding patterns across sessions. Ownership changed hands in 2025 when Cognition (makers of Devin) acquired it, and governance questions linger, but the product is solid for structured agent workflows at a lower price point than Cursor or Claude Code.

Aider (free, open-source, 20K+ GitHub stars) is the choice for senior engineers who live in the terminal and treat git as gospel. It's model-agnostic, works with Claude, GPT, Gemini, and local models, and produces clean commits with structured diffs. No agent orchestration — just a reliable, git-native pair programmer for developers who value correctness over automation.

Augment Code ($50/dev/month) was purpose-built for massive, multi-repository enterprise codebases. It's the first AI coding assistant with ISO/IEC 42001 certification for AI management systems. If your engineering org has sprawling repos and strict compliance requirements, Augment is the structural choice — though developer sentiment has cooled from its initial hype.

Cline, Continue.dev, and Kilo Code represent the open-source, model-agnostic tier. All are free, all work with multiple AI providers, and all appeal to developers who want control over what model they use and where their code goes. Continue.dev is particularly strong for air-gapped deployments. Kilo Code bundles features from both Cline and RooCode with zero API markup.

AWS Kiro takes a different approach entirely: spec-first, code-second. Before writing a line of code, it generates requirements, system design, and a dependency-ordered task list. This catches design mistakes at the requirements stage rather than the implementation stage — ideal for greenfield features and compliance-heavy environments.

Tier 3: Beyond Coding

Microsoft Copilot Cowork launched as a direct competitor to Anthropic's Cowork — an AI agent that reads, analyzes, and manipulates files across your desktop. Built partly on Anthropic technology, it selects the best model per task. This is multi-agent at the operating system level, not the coding level, and it triggered a massive stock selloff — Thomson Reuters dropped 16%, LegalZoom sank 20% — because investors recognized these tools are starting to displace specialized enterprise software.

Community Tools

The open-source community got to multi-agent before any vendor shipped it natively. Projects like claude-flow, ccswarm, and oh-my-claudecode pioneered agent orchestration for Claude Code. The community found a hidden "TeammateTool" system with 13 operations buried in Claude Code's binary weeks before Anthropic officially launched Agent Teams. That demand signal — developers reverse-engineering capabilities they need — is worth more than any benchmark.

The honest take

No single tool dominates across all dimensions. Claude Code Agent Teams leads for coordinated, multi-agent depth — agents that talk to each other, share findings, and challenge assumptions. Cursor Cloud Agents leads for parallelized throughput on independent tasks. Codex CLI leads on terminal speed and token efficiency. Aider leads for git-native correctness. Augment leads for enterprise-scale compliance. The most effective setups combine tools — using one for deep coding, another for rapid prototyping, another for automation.

For Agent Teams specifically: it's still experimental. Edge cases around session resumption and nested teams exist. For most individual developers, subagents or single-session Claude Code is still the right call for everyday work. Agent Teams earns its cost on larger, multi-layer projects where coordination genuinely saves time — and last week's pricing change makes that cost easier to justify.

Where OpenClaw Fits

Any complete picture of AI agents in 2026 has to include OpenClaw, but not as a competitor to Claude Code. They operate in fundamentally different layers.

OpenClaw (formerly Clawdbot/Moltbot) is an open-source, self-hosted AI agent created by Peter Steinberger. It runs 24/7 on your hardware or a VPS, monitoring your messaging apps — Telegram, WhatsApp, Slack, iMessage, Discord, Signal — executing scheduled tasks, scraping web pages, and calling APIs without waiting for a prompt. It's now at 247,000+ GitHub stars, making it the fastest-growing open-source project in AI history. Steinberger joined OpenAI in February 2026, and the project moved to an independent open-source foundation.

Claude Code is reactive. You start a session when you need to build something. OpenClaw is proactive. It uses cron jobs and a heartbeat system to monitor, alert, and act while you're doing something else entirely.
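The always-on pattern itself is simple. Here's a toy heartbeat loop in Python, with hypothetical `check` and `act` callbacks standing in for OpenClaw's far more elaborate machinery:

```python
import time

def heartbeat(check, act, beats, interval_s=60):
    """Poll `check` each interval; when it reports an event, hand it to `act`."""
    handled = []
    for _ in range(beats):
        event = check()          # e.g. "is there a new message to deal with?"
        if event is not None:
            handled.append(act(event))
        time.sleep(interval_s)
    return handled

# Toy run: three beats, one event, zero delay so the example finishes instantly
events = iter([None, "new message in #alerts", None])
handled = heartbeat(lambda: next(events), str.upper, beats=3, interval_s=0)
```

Cron covers the scheduled work; the heartbeat covers the "notice something and act" work. Claude Code has neither by default, which is the whole distinction.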

The January Meltdown

Earlier this year, thousands of OpenClaw users received suspension notices from Anthropic. The community had been routing OpenClaw through standard $20/month Claude.ai subscriptions instead of paying for dedicated API access, and Anthropic shut that down. The incident drew a hard line: OpenClaw can use Claude the model as its brain, but it is not Claude Code the product. Different rules, different economics, different trust boundaries.

Security: The Real Conversation

OpenClaw's power comes with real risk. It requires root-level access to your system — email credentials, API keys, calendar tokens, filesystem permissions. Security researchers found 135,000+ exposed instances on the open internet (O'Reilly Radar, Mar 2026). The ClawHavoc attack in early 2026 planted hundreds of malicious skills on ClawHub (the community plugin marketplace), distributing credential-stealing malware. Cisco's AI security research team confirmed a third-party skill was performing data exfiltration without user awareness (Sangfor Research, Mar 2026). Microsoft published security guidance stating OpenClaw should be treated as untrusted code execution and should not run on standard workstations (MindStudio, citing Microsoft Security Blog).

This isn't a reason to ignore OpenClaw. It's a reason to understand what you're deploying. The tool is powerful. The security posture requires deliberate hardening — dedicated hardware, filesystem restrictions, audited skills, localhost binding. That's a fundamentally different deployment model than installing Claude Code with npm install.
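What "deliberate hardening" means can be spot-checked mechanically. Here's a sketch of a pre-flight audit over a hypothetical config shape; `bind`, `skills`, `skill_allowlist`, and `filesystem_jail` are illustrative names, not OpenClaw's actual schema:

```python
LOOPBACK = {"127.0.0.1", "::1", "localhost"}

def audit(config):
    """Return a list of problems; an empty list means the basic checks pass."""
    issues = []
    if config.get("bind", "0.0.0.0") not in LOOPBACK:
        issues.append("gateway reachable beyond localhost")
    allowlist = set(config.get("skill_allowlist", []))
    for skill in config.get("skills", []):
        if skill not in allowlist:
            issues.append(f"unvetted skill installed: {skill}")
    if not config.get("filesystem_jail", False):
        issues.append("agent has unrestricted filesystem access")
    return issues

risky = audit({"bind": "0.0.0.0",
               "skills": ["calendar", "shell-exec"],
               "skill_allowlist": ["calendar"]})
safe = audit({"bind": "127.0.0.1", "skills": ["calendar"],
              "skill_allowlist": ["calendar"], "filesystem_jail": True})
```

The checks mirror the 135,000 exposed instances and the ClawHavoc skill attack above: open network binding and unvetted plugins are the two failure modes that actually showed up in the wild.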

| | OpenClaw | Claude Code |
|---|---|---|
| Primary use | Personal AI automation across messaging and workflows | AI coding assistant for development |
| Operational model | Always-on, event-driven (heartbeat + cron) | On-demand, reactive (start a session when needed) |
| Platform | Multi-platform messaging (Telegram, Discord, Slack, Signal, iMessage, WhatsApp, Teams) | Terminal only, with IDE extensions via MCP |
| Memory | Persistent two-layer (ephemeral + durable Markdown) | Session-based, with CLAUDE.md for project context |
| Proactive | Yes: monitors, alerts, and acts without prompting | No: purely reactive |
| Extensibility | 10,000+ community skills on ClawHub (vet carefully) | MCP servers and slash commands |
| Pricing | Free (open source) plus API costs for your chosen model | $20–$200/month subscription |
| Security model | Self-hosted, full control but full responsibility | Local execution with optional sandboxing |

The choice isn't "OpenClaw or Claude Code." It's "what problem are you solving?" OpenClaw won't debug your backend. Claude Code won't sort your Slack messages at 6am. Developers are already combining them — using Claude Code for deep coding work and OpenClaw as the automation layer that deploys, monitors, and manages everything around it.

The Bigger Pattern

Multi-agent coordination in coding tools is a preview of where all AI software is going. The industry is moving from single-model interactions to modular, multi-agent architectures — the same evolution software went through moving from monolithic apps to microservices.

Anthropic's Model Context Protocol (MCP), now under the Linux Foundation's Agentic AI Foundation, is becoming the standard connector layer. OpenAI and Microsoft have adopted it. Google is building managed MCP servers. The plumbing is being standardized, which means agents from different vendors will eventually interoperate.

According to Gartner (as reported by RTInsights and cited across multiple industry analyses), multi-agent system inquiries surged 1,445% between Q1 2024 and Q2 2025. They project 40% of enterprise applications will include task-specific AI agents by end of 2026, up from less than 5% in 2025.

What's happening in Claude Code right now — specialized agents splitting work, coordinating in real time, reviewing each other's output — is the template for how AI handles everything from financial analysis to project management to compliance review. The tools will be different. The architecture will be the same.

What Shipped This Week

Agent Teams was the headline feature six weeks ago, but Anthropic hasn't slowed down. The past seven days alone delivered a stack of updates that affect how Agent Teams and Claude Code operate day-to-day:

March 9: Multi-Agent Code Review launched — dedicated teams of agents analyze pull requests in parallel, cross-verify findings, and leave severity-ranked comments. Available on Teams and Enterprise.

March 11–12: Claude in Excel and PowerPoint upgraded — both add-ins now share full conversation context with each other, support skills, and connect to Amazon Bedrock, Vertex AI, or Microsoft Foundry via LLM gateway. Inline visualizations (charts, diagrams) also went live in chat responses.

March 13: The 1M token context window moved to standard pricing. No more long-context surcharge. A 900K-token request now costs the same per token as a 9K-token one.

March 14–15: Push-to-talk voice mode (/voice) began rolling out. The /loop command enables recurring scheduled tasks (cron-style). Effort levels simplified to low/medium/high with a new /effort command. MCP elicitation allows servers to request structured input mid-task. Voice STT expanded to 20 languages. Multi-agent tasks now use fewer tokens via more concise subagent reports.

That's five major feature drops in seven days. Whether or not you use Agent Teams, updating to the latest Claude Code (v2.1.76) gets you meaningful improvements across the board.

Quick Start

If you want to try Agent Teams today, here's the minimum path:

```shell
# 1. Update Claude Code
npm update -g @anthropic-ai/claude-code

# 2. Enable Agent Teams
export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1

# 3. Start a session
claude

# 4. Describe your task and tell it to create a team
# Example: "Build a REST API with authentication,
# a React dashboard, and integration tests.
# Create an agent team with specialists for each."
```

Claude will propose a team structure. Approve it, and the agents spin up. You can interact with individual teammates directly or let the lead coordinate. When you're done, the lead synthesizes the output.

For the best experience, you'll want tmux (Linux/macOS) or iTerm2 (macOS) installed — Agent Teams supports a split-pane display mode where you can watch each teammate's progress in real time. Without a terminal multiplexer, you'll still get results but you'll only see the lead's view.

Start with a project where parallel work makes obvious sense — like a feature that touches backend, frontend, and tests simultaneously. Don't use it for sequential tasks where one step depends on the previous one finishing. That's still subagent territory.

Sources

Anthropic, "Introducing Claude Opus 4.6" — Feb 5, 2026
Anthropic, Claude Code Agent Teams Documentation
Anthropic, Claude Code Subagents Documentation
Anthropic, Claude API Pricing — 1M Context at Standard Rates — Mar 13, 2026
OpenAI, "Introducing GPT-5.3-Codex" — Feb 5, 2026
TechCrunch, "Anthropic releases Opus 4.6 with new 'agent teams'" — Feb 5, 2026
VentureBeat, "OpenAI's GPT-5.3-Codex drops as Anthropic upgrades Claude" — Feb 5, 2026
The New Stack, "Anthropic launches a multi-agent code review tool" — Mar 9, 2026
DevOps.com, "Cursor Cloud Agents Get Their Own Computers" — Feb 24, 2026
VS Code Blog, "Your Home for Multi-Agent Development" — Feb 5, 2026
Wikipedia, OpenClaw
DigitalOcean, "What is OpenClaw?"
O'Reilly, "What OpenClaw Reveals About the Next Phase of AI Agents" — Mar 2026
MorphLLM, "Best AI for Coding 2026: Every Model Ranked" — Mar 2026
Faros AI, "Best AI Coding Agents for 2026" — Jan 2026
Sangfor Research, "OpenClaw: Exploring AI Agent Security Vulnerabilities" — Mar 2026
MindStudio, "What Is OpenClaw? The Open-Source AI Agent That Actually Does Things" — Feb 2026
Anthropic, Claude Code Changelog — Mar 2026