Claude Managed Agents: Deploy AI Agents Without Managing Infrastructure

May 03, 2026 Abhay 11 min read

Building an AI agent that runs autonomously — browses the web, executes code, reads and writes files, persists memory across sessions — requires infrastructure. You need a sandbox, a process that can run for hours without your web server timing out, and a way to resume from where you left off after a network hiccup.

Claude Managed Agents, launched in public beta in April 2026, offloads all of that to Anthropic. You send a task to a REST endpoint, and Anthropic runs the agent loop in a secure sandboxed container. The agent can run for hours, survives disconnections, and maintains memory across sessions.

This article covers when to use it, how to call it, and what it costs.

Managed Agents vs. the Agent SDK: What to Choose

Before diving into the API, here is the decision framework:

	Managed Agents	Agent SDK
Runtime	Anthropic-hosted	You host
Agent loop	Anthropic manages	You implement
Infrastructure	Fully managed	You manage
Sandbox	Built-in, secure	You set up
Long-running sessions	Natively supported	You handle timeouts
Memory	Built-in (public beta)	You implement
Customisation	Less control	Full control
Time to first agent	Minutes	Days to weeks
Runtime pricing	$0.08/session-hour	Your infra cost

Use Managed Agents when:

You want to ship an agent quickly without infrastructure work
Tasks run for hours and you need reliable resumption
You do not need custom tool integrations or unusual control flow
Your team does not have capacity to maintain agent infrastructure

Use the Agent SDK when:

You need custom tool integrations or non-standard architecture
Compliance requires your infrastructure (not Anthropic’s)
You have specific performance requirements the managed service cannot meet
You are building a product where the agent runtime is a core differentiator

How It Works

sequenceDiagram
    participant App as Your Application
    participant API as Managed Agents API
    participant Container as Secure Sandbox

    App->>API: POST /agents/sessions {task, tools, model}
    API->>Container: Provision sandbox container
    Container->>Container: Agent loop begins
    Note over Container: Agent uses web search, code execution,\nfile management autonomously
    App->>API: GET /agents/sessions/{id}/stream (SSE)
    Container-->>App: Stream: thinking steps, tool calls, progress
    Container->>Container: Task completes
    Container-->>App: Stream: final result + session summary

Every agent run gets its own isolated container. The container has access to built-in tools (web search, code execution, file management) and runs with the model you specify. The session persists even if your connection drops — you can reconnect and resume streaming from where you left off.

Creating Your First Agent Session

All Managed Agents requests require the beta header: anthropic-beta: managed-agents-2026-04-01

Python

import anthropic

client = anthropic.Anthropic()

# Create an agent session
session = client.beta.managed_agents.sessions.create(
    model="claude-sonnet-4-6",
    max_tokens=8192,
    tools=[
        {"type": "web_search"},
        {"type": "code_execution"}
    ],
    messages=[
        {
            "role": "user",
            "content": """
            Research the top 5 Kubernetes security vulnerabilities reported in 2026,
            write a concise summary for each, and produce a markdown report
            with a risk rating and recommended mitigations.
            Save the final report as 'k8s-security-2026.md'.
            """
        }
    ],
    betas=["managed-agents-2026-04-01"]
)

print(f"Session started: {session.id}")
print(f"Status: {session.status}")

TypeScript

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

async function runResearchAgent(task: string): Promise<string> {
  const session = await client.beta.managedAgents.sessions.create({
    model: "claude-sonnet-4-6",
    max_tokens: 8192,
    tools: [
      { type: "web_search" },
      { type: "code_execution" },
      { type: "file_management" }
    ],
    messages: [{ role: "user", content: task }],
    betas: ["managed-agents-2026-04-01"]
  });

  return session.id;
}
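
To make the beta header requirement concrete at the HTTP level, here is a minimal sketch that builds a session-creation request without sending it. The URL is an assumption (the path is taken from the sequence diagram above, and the real endpoint may carry a version prefix), as is the anthropic-version header, which is what other Claude API calls expect. The helper name build_session_request is hypothetical.

```python
import json
import urllib.request

# Assumption: host plus the path shown in the sequence diagram; the real
# endpoint may differ (for example, it may include a version prefix).
SESSIONS_URL = "https://api.anthropic.com/agents/sessions"
BETA_HEADER = "managed-agents-2026-04-01"  # beta header value from the article


def build_session_request(api_key: str, task: str) -> urllib.request.Request:
    """Build, but do not send, a session-creation request."""
    body = {
        "model": "claude-sonnet-4-6",
        "max_tokens": 8192,
        "tools": [{"type": "web_search"}, {"type": "code_execution"}],
        "messages": [{"role": "user", "content": task}],
    }
    return urllib.request.Request(
        SESSIONS_URL,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",  # assumption: same as other API calls
            "anthropic-beta": BETA_HEADER,
            "content-type": "application/json",
        },
    )


req = build_session_request("sk-placeholder", "Summarise the Kubernetes 1.31 release notes")
print(req.get_method(), req.full_url)
# urllib stores header names capitalised, hence "Anthropic-beta"
print(req.get_header("Anthropic-beta"))
```

In practice the SDK attaches these headers for you; the sketch is only useful for seeing what goes over the wire or for calling the API from a language without an SDK.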

Streaming Progress with Server-Sent Events

Agent tasks can take minutes or hours. Use Server-Sent Events to stream progress as it happens:

import anthropic
import json

client = anthropic.Anthropic()

# Create a session
session = client.beta.managed_agents.sessions.create(
    model="claude-opus-4-7",
    max_tokens=16384,
    tools=[
        {"type": "web_search"},
        {"type": "code_execution"}
    ],
    messages=[
        {
            "role": "user",
            "content": "Analyse our GitHub Actions CI logs from the last 7 days and identify the top 3 most common failure patterns."
        }
    ],
    betas=["managed-agents-2026-04-01"]
)

# Stream the agent's progress
with client.beta.managed_agents.sessions.stream(session.id) as stream:
    for event in stream:
        if event.type == "content_block_delta":
            # Agent is writing a response
            print(event.delta.text, end="", flush=True)
        elif event.type == "tool_use":
            # Agent is using a tool
            print(f"\n[Tool: {event.name}] {json.dumps(event.input)[:100]}...")
        elif event.type == "message_stop":
            print("\n\nAgent completed.")
            break
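
The event-handling branch above can be pulled into a small function and exercised without a live session. This is a sketch that assumes events expose the same attributes used in the loop (type, delta.text, name); the fake events are stand-ins built with SimpleNamespace, not real SDK objects, and collect_stream is a hypothetical helper name.

```python
from types import SimpleNamespace
from typing import Iterable, List, Tuple


def collect_stream(events: Iterable) -> Tuple[str, List[str]]:
    """Accumulate response text and tool names until the stream stops."""
    text_parts: List[str] = []
    tools_used: List[str] = []
    for event in events:
        if event.type == "content_block_delta":
            text_parts.append(event.delta.text)
        elif event.type == "tool_use":
            tools_used.append(event.name)
        elif event.type == "message_stop":
            break
    return "".join(text_parts), tools_used


# Stand-in events for a dry run; a real stream would come from the SDK.
fake_events = [
    SimpleNamespace(type="tool_use", name="web_search", input={"query": "ci failures"}),
    SimpleNamespace(type="content_block_delta", delta=SimpleNamespace(text="Top failure: ")),
    SimpleNamespace(type="content_block_delta", delta=SimpleNamespace(text="flaky tests.")),
    SimpleNamespace(type="message_stop"),
]

text, tools = collect_stream(fake_events)
print(text)
print(tools)
```

Keeping the handler separate from the connection makes it easy to unit test the logic that decides what to do with each event.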

Reconnecting to a Running Session

If your connection drops — or you want to check on a long-running session later — reconnect by session ID:

import anthropic

client = anthropic.Anthropic()

session_id = "session_01ABC..."  # saved from when you created the session

# Check current status
session = client.beta.managed_agents.sessions.retrieve(
    session_id,
    betas=["managed-agents-2026-04-01"]
)

print(f"Status: {session.status}")  # running, completed, failed

# If still running, reconnect to the stream
if session.status == "running":
    with client.beta.managed_agents.sessions.stream(session_id) as stream:
        for event in stream:
            print(event)
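
For unattended jobs you will probably want the reconnect to happen automatically. The following is a minimal sketch of a retry wrapper with exponential backoff. It assumes that reopening the stream resumes where the previous one stopped, as described above, and that a dropped connection surfaces as ConnectionError; the real SDK may raise a different exception type. The open_stream callable and the flaky_stream demo are hypothetical.

```python
import time
from typing import Callable, Iterable, Iterator


def stream_with_reconnect(
    open_stream: Callable[[], Iterable],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator:
    """Yield events, reopening the stream after a dropped connection."""
    attempt = 0
    while True:
        try:
            for event in open_stream():
                attempt = 0  # a delivered event resets the failure count
                yield event
            return  # stream ended normally
        except ConnectionError:
            attempt += 1
            if attempt >= max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...


# Demo: the first connection drops after one event, the second completes.
calls = {"n": 0}


def flaky_stream():
    calls["n"] += 1
    if calls["n"] == 1:
        yield "event-1"
        raise ConnectionError("connection dropped")
    yield "event-2"


events = list(stream_with_reconnect(flaky_stream, sleep=lambda seconds: None))
print(events)
```

With the SDK, open_stream would be a small function that opens the session stream for a saved session ID and yields its events.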

Memory Across Sessions

As of April 2026, Managed Agents supports persistent memory. Claude remembers facts from previous sessions and uses them automatically in new ones.

import anthropic

client = anthropic.Anthropic()

# First session — Claude learns your environment
session1 = client.beta.managed_agents.sessions.create(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    tools=[{"type": "web_search"}, {"type": "code_execution"}],
    memory={"enabled": True, "scope": "user"},  # persist memory for this user
    messages=[
        {
            "role": "user",
            "content": "Our production cluster runs Kubernetes 1.29 on AWS EKS in eu-west-1. "
                       "Our staging cluster is in us-east-1. Remember this for future sessions."
        }
    ],
    betas=["managed-agents-2026-04-01"]
)

# Later — a separate session, Claude remembers
session2 = client.beta.managed_agents.sessions.create(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    tools=[{"type": "web_search"}],
    memory={"enabled": True, "scope": "user"},
    messages=[
        {
            "role": "user",
            "content": "Is there a known CVE affecting our Kubernetes version?"
        }
        # Claude knows from memory that you run K8s 1.29
    ],
    betas=["managed-agents-2026-04-01"]
)

Scheduling Recurring Agent Runs

Managed Agents supports routines — cron-scheduled agent tasks that run automatically without your application triggering them:

import anthropic

client = anthropic.Anthropic()

# Run a security scan every Monday at 08:00 UTC
routine = client.beta.managed_agents.routines.create(
    name="weekly-security-scan",
    schedule="0 8 * * 1",  # cron expression
    model="claude-sonnet-4-6",
    max_tokens=8192,
    tools=[
        {"type": "web_search"},
        {"type": "code_execution"}
    ],
    messages=[
        {
            "role": "user",
            "content": """
            Scan our GitHub repository for:
            1. Dependencies with known CVEs published this week
            2. Any new Dependabot alerts
            3. Failed security checks in recent CI runs
            Produce a markdown summary and post it to our Slack webhook.
            """
        }
    ],
    webhook_url="https://your-app.com/webhooks/agent-results",
    betas=["managed-agents-2026-04-01"]
)

print(f"Routine created: {routine.id}")
print(f"Next run: {routine.next_run_at}")

The webhook receives the completed session result when the task finishes.
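
If the cron expression in the routine is unfamiliar, this sketch shows how its five fields (minute, hour, day of month, month, day of week) map to a timestamp. It assumes standard five-field cron syntax and handles only plain numbers and "*"; ranges, lists, and step values are deliberately left out. The function matches_cron is a hypothetical helper, not part of the SDK.

```python
from datetime import datetime, timezone


def matches_cron(expr: str, when: datetime) -> bool:
    """Check a datetime against a simple 5-field cron expression."""
    minute, hour, day, month, weekday = expr.split()
    # cron counts Sunday as 0; Python's weekday() counts Monday as 0
    cron_weekday = (when.weekday() + 1) % 7
    fields = [minute, hour, day, month, weekday]
    actual = [when.minute, when.hour, when.day, when.month, cron_weekday]
    return all(f == "*" or int(f) == a for f, a in zip(fields, actual))


monday = datetime(2026, 5, 4, 8, 0, tzinfo=timezone.utc)
tuesday = datetime(2026, 5, 5, 8, 0, tzinfo=timezone.utc)
print(matches_cron("0 8 * * 1", monday))   # Monday 08:00 UTC
print(matches_cron("0 8 * * 1", tuesday))  # Tuesday 08:00 UTC
```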

DevOps Use Cases

Automated Incident Triage

import anthropic

client = anthropic.Anthropic()

def triage_incident(alert_body: str) -> str:
    """
    Given a PagerDuty alert body, spin up an agent to investigate
    and return a preliminary diagnosis.
    """
    session = client.beta.managed_agents.sessions.create(
        model="claude-opus-4-7",
        max_tokens=16384,
        tools=[
            {"type": "web_search"},
            {"type": "code_execution"}
        ],
        system="You are an on-call SRE. Investigate the incident methodically. "
               "Check logs, metrics, and recent deployments. "
               "Produce a diagnosis and recommended immediate actions.",
        messages=[
            {"role": "user", "content": f"Incoming incident alert:\n\n{alert_body}"}
        ],
        betas=["managed-agents-2026-04-01"]
    )

    # Collect the full result
    result_text = ""
    with client.beta.managed_agents.sessions.stream(session.id) as stream:
        for event in stream:
            if event.type == "content_block_delta":
                result_text += event.delta.text

    return result_text


# Called by your PagerDuty webhook handler
incident_alert = """
CRITICAL: High error rate on payment-service
Time: 2026-05-03 14:32 UTC
Error rate: 12.4% (normal: 0.1%)
Affected endpoints: /api/v2/payments/create, /api/v2/payments/confirm
Recent deploy: payment-service v2.3.1 at 14:15 UTC
"""

diagnosis = triage_incident(incident_alert)
print(diagnosis)

Weekly Dependency Audit

import anthropic

client = anthropic.Anthropic()

# Run once; schedule as a routine for weekly automation
session = client.beta.managed_agents.sessions.create(
    model="claude-sonnet-4-6",
    max_tokens=8192,
    tools=[
        {"type": "web_search"},
        {"type": "code_execution"},
        {"type": "file_management"}
    ],
    messages=[
        {
            "role": "user",
            "content": """
            Audit our Node.js dependencies:
            1. Read package.json from the repo root
            2. For each direct dependency, check the npm registry for:
               - Current latest version vs what we pin
               - Known security advisories (npm audit equivalent)
            3. Produce a prioritised list: security issues first, then major version gaps
            4. Write the report to dependency-audit.md
            """
        }
    ],
    betas=["managed-agents-2026-04-01"]
)

print(f"Audit running: {session.id}")

Infrastructure Documentation Generator

import anthropic

client = anthropic.Anthropic()

session = client.beta.managed_agents.sessions.create(
    model="claude-opus-4-7",
    max_tokens=16384,
    tools=[
        {"type": "code_execution"},
        {"type": "file_management"}
    ],
    messages=[
        {
            "role": "user",
            "content": """
            Generate architecture documentation for our infrastructure:
            1. Read all Terraform files in infra/
            2. Read all Kubernetes manifests in k8s/
            3. Read all GitHub Actions workflows in .github/workflows/
            4. Produce a comprehensive architecture document covering:
               - AWS resources and their purposes
               - Service dependencies and data flows
               - CI/CD pipeline stages
               - Network boundaries and security groups
            5. Save as docs/architecture.md with Mermaid diagrams
            """
        }
    ],
    betas=["managed-agents-2026-04-01"]
)

Pricing

Managed Agents uses standard Claude API token pricing (same rates as direct API calls) plus a runtime charge for the sandboxed container:

Component	Price
Input/output tokens	Same as Claude API
Container runtime	$0.08 per session-hour
Web search	$10 per 1,000 searches
Code execution (with search)	Free

For a task that runs for 20 minutes on Sonnet 4.6 and consumes about 100k input tokens:

Tokens: ~$0.45 ($0.30 for 100k input tokens at $3/MTok, plus output tokens)
Container: ~$0.027 (20 min at $0.08/hr)
Total: roughly $0.48

Compare that to the engineering time to build and maintain the equivalent infrastructure.
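
The arithmetic behind the 20-minute estimate can be reproduced with a few lines. This is a rough sketch: the input rate and the container rate come from the figures above, while the output rate of $15/MTok and the 10k output tokens are assumptions chosen to land on the quoted ~$0.45. Web search charges are not included.

```python
from typing import Dict


def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    runtime_minutes: float,
    input_per_mtok: float = 3.00,    # Sonnet input rate quoted above
    output_per_mtok: float = 15.00,  # assumption: not stated in the article
    runtime_per_hour: float = 0.08,  # container rate from the pricing table
) -> Dict[str, float]:
    """Return token, container, and total cost in USD."""
    tokens = (input_tokens * input_per_mtok + output_tokens * output_per_mtok) / 1_000_000
    container = runtime_minutes / 60 * runtime_per_hour
    return {"tokens": tokens, "container": container, "total": tokens + container}


cost = estimate_cost(input_tokens=100_000, output_tokens=10_000, runtime_minutes=20)
print({name: round(value, 3) for name, value in cost.items()})
```

For any realistic token volume the container charge is a small fraction of the total, so token usage is the number to watch.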

New in May 2026: Dreaming, Outcomes, and Multiagent Orchestration

On May 6, 2026, Anthropic shipped three significant additions to Managed Agents. All three are available now.

Harvey reported a 6x jump in task completion rates after enabling multiagent orchestration. — Anthropic, May 6, 2026

Dreaming (Research Preview)

Dreaming is a scheduled background process that reviews your agent’s past sessions and memory stores, extracts patterns, and curates memories so your agents improve over time.

Standard memory lets agents recall specific facts from earlier sessions. Dreaming goes further: it surfaces patterns that a single session cannot see — recurring mistakes, workflows the agent converges on across many runs, preferences shared across a team. The result is an agent that gets measurably better at your specific workflows the more you use it.

You control how much autonomy dreaming has via two modes:

Auto mode (default): dreaming updates memory automatically based on session patterns
Review mode: dreaming surfaces proposed memory changes for human approval before they land. This is the right choice for regulated industries or teams that want oversight of what the agent “learns”.

The mode is set inside the memory configuration:

memory={
    "enabled": True,
    "scope": "user",
    "dreaming": {
        "enabled": True,
        "mode": "review"  # "auto" (default) or "review" (human approves changes)
    }
}

To enable dreaming on a session, pass the dreaming configuration inside memory when creating it. Omitting mode falls back to the default, auto:

session = client.beta.managed_agents.sessions.create(
    model="claude-sonnet-4-6",
    max_tokens=8192,
    tools=[{"type": "web_search"}, {"type": "code_execution"}],
    memory={"enabled": True, "scope": "user", "dreaming": {"enabled": True}},
    messages=[{"role": "user", "content": your_task}],
    betas=["managed-agents-2026-04-01"]
)
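
Since the memory dictionary is nested, a small builder can catch typos before a request goes out. This is a sketch only: the scope and mode names follow the options described in this article, the literal value "session" for session-level scope is an assumption, and build_memory_config is a hypothetical helper rather than anything the SDK provides.

```python
from typing import Optional

VALID_SCOPES = {"user", "session"}        # "session" literal is an assumption
VALID_DREAMING_MODES = {"auto", "review"}  # modes described above


def build_memory_config(scope: str = "user", dreaming_mode: Optional[str] = None) -> dict:
    """Return a memory dict; dreaming is added only when a mode is given."""
    if scope not in VALID_SCOPES:
        raise ValueError(f"unknown memory scope: {scope!r}")
    config: dict = {"enabled": True, "scope": scope}
    if dreaming_mode is not None:
        if dreaming_mode not in VALID_DREAMING_MODES:
            raise ValueError(f"unknown dreaming mode: {dreaming_mode!r}")
        config["dreaming"] = {"enabled": True, "mode": dreaming_mode}
    return config


print(build_memory_config("user", "review"))
```

The returned dictionary has the same shape as the memory argument shown in the examples above.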

Outcomes (Public Beta)

Outcomes let you define what success looks like, and the agent works toward your criteria rather than just completing a task description.

You write a rubric describing success. A separate grader evaluates the agent’s output against your criteria in its own context window — isolated from the agent’s reasoning so it cannot be influenced by how the agent justified its own choices.

session = client.beta.managed_agents.sessions.create(
    model="claude-opus-4-7",
    max_tokens=16384,
    tools=[{"type": "web_search"}, {"type": "code_execution"}],
    outcomes={
        "rubric": """
        The dependency audit report is successful if:
        1. All direct dependencies are listed with their current and pinned versions
        2. Every CVE found has a severity rating (critical/high/medium/low)
        3. The top 3 most urgent issues are clearly identified with fix instructions
        4. The report is actionable by a developer with no prior context
        """,
        "grader_model": "claude-sonnet-4-6"
    },
    messages=[
        {"role": "user", "content": "Audit our Node.js dependencies for security issues."}
    ],
    betas=["managed-agents-2026-04-01"]
)

The grader result is returned alongside the session output, so you can programmatically gate on quality — retry if the rubric score is below threshold, or escalate to a human review queue.
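
The gating logic described above might look like the following. This is a sketch under stated assumptions: the article does not show the shape of the grader result, so the score is assumed to be a number between 0 and 1 that you have already extracted, and next_action is a hypothetical helper.

```python
def next_action(score: float, threshold: float, attempt: int, max_attempts: int = 3) -> str:
    """Decide what to do with a graded session result."""
    if score >= threshold:
        return "accept"
    if attempt < max_attempts:
        return "retry"     # below threshold, retries remain
    return "escalate"      # out of retries: send to human review


# Hypothetical scores from three successive attempts at the same task.
for attempt, score in enumerate([0.55, 0.70, 0.92], start=1):
    print(attempt, score, next_action(score, threshold=0.8, attempt=attempt))
```

Capping retries matters because every retry is a full session with its own token and runtime cost.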

Multiagent Orchestration (Public Beta)

Multiagent orchestration lets a lead agent break a complex job into pieces and delegate each to a specialist agent with its own model, prompt, and tools. Specialists work in parallel on a shared filesystem and contribute to the lead agent’s overall context.

Unlike spawning subagents from the Claude Code CLI (which share a session), Managed Agents orchestration runs each specialist in its own fully isolated container. Specialists can check back with the lead agent mid-workflow because events are persistent and every agent remembers what it has done.

session = client.beta.managed_agents.sessions.create(
    model="claude-opus-4-7",
    max_tokens=32768,
    tools=[
        {"type": "web_search"},
        {"type": "code_execution"},
        {"type": "file_management"}
    ],
    orchestration={
        "enabled": True,
        "specialists": [
            {
                "name": "security-reviewer",
                "model": "claude-opus-4-7",
                "prompt": "You are a security specialist. Review code for vulnerabilities only.",
                "tools": [{"type": "code_execution"}]
            },
            {
                "name": "dependency-checker",
                "model": "claude-sonnet-4-6",
                "prompt": "You are a dependency specialist. Check npm audit and CVE databases.",
                "tools": [{"type": "web_search"}]
            },
            {
                "name": "report-writer",
                "model": "claude-sonnet-4-6",
                "prompt": "You write clear technical reports from findings provided to you.",
                "tools": [{"type": "file_management"}]
            }
        ]
    },
    messages=[
        {
            "role": "user",
            "content": """
            Run a comprehensive security audit:
            1. Review source code for vulnerabilities
            2. Check all dependencies for known CVEs
            3. Compile findings into a prioritised report
            Do these in parallel, then combine results.
            """
        }
    ],
    betas=["managed-agents-2026-04-01"]
)

The lead agent plans the work, delegates to specialists, and integrates results — you get parallelism without managing orchestration logic yourself.
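
To picture the pattern the lead agent follows, here is a local analogy using threads: fan out to specialists in parallel, then pass their findings to a writer. It is not how Managed Agents implements orchestration, and the specialist functions are hypothetical stand-ins that return canned strings instead of calling a model.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


# Hypothetical stand-ins for specialist agents; each just returns a string.
def security_reviewer(job: str) -> str:
    return f"no injection risks found in {job}"


def dependency_checker(job: str) -> str:
    return f"2 outdated packages in {job}"


def report_writer(findings: Dict[str, str]) -> str:
    lines = [f"- {name}: {result}" for name, result in sorted(findings.items())]
    return "Security audit\n" + "\n".join(lines)


def run_lead(job: str, parallel: Dict[str, Callable[[str], str]]) -> str:
    """Fan out to the parallel specialists, then hand results to the writer."""
    with ThreadPoolExecutor(max_workers=len(parallel)) as pool:
        futures = {name: pool.submit(fn, job) for name, fn in parallel.items()}
        findings = {name: future.result() for name, future in futures.items()}
    return report_writer(findings)


report = run_lead(
    "payment-service",
    {"security-reviewer": security_reviewer, "dependency-checker": dependency_checker},
)
print(report)
```

This is the coordination code you would otherwise write and maintain yourself, along with error handling and state sharing between workers.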

Limits and Constraints (Beta)

Managed Agents is in public beta as of May 2026. Current constraints:

Maximum session duration: 4 hours per session
Built-in tools: web search, code execution, file management (custom MCP tools: roadmap)
Memory scope: user-level or session-level (org-level: roadmap)
Streaming: SSE only (WebSockets: roadmap)
Concurrency: rate limits apply per workspace

These will expand as the beta matures. The beta header (managed-agents-2026-04-01) is required on all requests — when the feature becomes generally available, the header will no longer be required.

The ant CLI

Alongside Managed Agents, Anthropic launched the ant CLI — a command-line client for interacting with the Claude API, including managed agents, without writing code:

# Install
npm install -g @anthropic/ant

# Run a one-off agent task
ant run --model claude-sonnet-4-6 --tools web_search,code_execution \
  "Research Kubernetes 1.31 release notes and summarise breaking changes"

# List running sessions
ant sessions list

# Stream output from a running session
ant sessions stream session_01ABC...

# Create a routine
ant routines create --schedule "0 9 * * 1" --name "weekly-audit" \
  "Audit npm dependencies for security issues and write a report"

The ant CLI is useful for one-off agent tasks and for testing prompts before wiring them into an application.

Managed Agents is the fastest path from “I want an autonomous agent” to “I have a running autonomous agent.” The infrastructure decisions — sandboxing, session persistence, memory — are handled. You focus on the task.

For teams already running Claude Code or using the Claude API, adding Managed Agents for long-running autonomous tasks is a natural extension of the same toolchain.

Abhay Pratap Singh

DevOps Engineer passionate about automation, cloud infrastructure, and self-hosted tools. I write about Kubernetes, Terraform, DNS, and everything in between.
