

Best agent cloud platforms in 2026
- Agent cloud refers to the full infrastructure stack AI agents run on: isolated sandboxes for code execution, persistent compute for stateful workloads, background workers, storage, and optionally GPU inference.
- Most purpose-built sandbox tools cover isolated code execution only. Production agents typically need more than that.
- Key evaluation dimensions: isolation model, ephemeral vs persistent environments, GPU availability, BYOC (Bring Your Own Cloud) support, and pricing model.
- Northflank covers the full stack: microVM-based sandboxes (Kata Containers, Firecracker) and gVisor, both ephemeral and persistent environments with no forced time limits, on-demand GPUs, and self-serve BYOC across AWS, GCP, Azure, Oracle, Civo, CoreWeave, on-premises, and bare-metal. In production since 2021.
In infrastructure terms, "agent cloud" refers to the compute and orchestration layer that AI agents execute on. This article covers that meaning: what agents need to run in the cloud, and which platforms cover that scope today.
You are building something that runs agents, executes code those agents write, maintains state across sessions, and likely needs to do all of that inside your own cloud or at a cost that scales with your workload. The question is which platforms can handle that scope, and where each one draws the line.
An agent cloud is the infrastructure layer that AI agents run on. It covers the environments where agents execute code, the compute that keeps long-running agents alive, the storage that preserves state across sessions, and the orchestration that ties all of it together.
The term covers a wide spectrum. Some platforms in this category provide only sandboxed code execution. Others provide the full stack: agents, background workers, APIs, databases, and GPU inference under one control plane. Understanding where a platform sits on that spectrum matters before you commit to it.
For a foundational definition of what sandboxes are within this stack, see what is an AI sandbox.
Before comparing platforms, it helps to be clear about what the infrastructure layer needs to provide. Most production agents need more than isolated code execution.
- Sandboxes and isolated code execution: Agents write and execute code that may be LLM-generated, user-submitted, or untrusted. It needs to run in an environment isolated from your host system and other tenants. The isolation model matters: container-level isolation shares the host kernel, while microVMs (Firecracker, Kata Containers) and kernel-sandboxing tools like gVisor give each workload stronger isolation than standard containers. See best code execution sandbox for AI agents, how to sandbox AI agents.
- Ephemeral vs. persistent environments: Stateless sandboxes work for short-lived tasks. Agents with memory, session history, or accumulated state need environments that persist between runs. Some platforms impose hard session limits that break long-horizon workflows. See ephemeral execution environments for AI agents, persistent sandboxes.
- Background workers and async jobs: Agents spawn async tasks and scheduled jobs. A sandbox handles isolated execution of a single workload; a full runtime handles the lifecycle of workers and background processes alongside that. See code execution environment for autonomous agents, top AI agent runtime tools.
- GPU compute: Inference and compute-heavy tool use require GPUs. On-demand availability without quota requests or reserved capacity is a meaningful practical distinction between platforms.
- BYOC and deployment model: For enterprise deployments with data residency requirements or teams that need execution inside their own VPC, self-serve BYOC is a hard requirement. Related: self-hosted AI sandboxes, top BYOC AI sandboxes.
These are the dimensions that tend to be decisive when choosing infrastructure for agent workloads.
| Criterion | Why it matters |
|---|---|
| Isolation model | MicroVM vs container-level security for untrusted code |
| Ephemeral and persistent | Whether the platform supports both stateless and stateful workloads |
| Session limits | Maximum sandbox duration; relevant for long-horizon agent tasks |
| GPU availability | Required for inference and training workloads |
| BYOC support | Running execution inside your own VPC for compliance or data residency |
| Pricing model | Per-second billing, PaaS vs BYOC cost structure |
| SDK and API access | Integration surface for agent frameworks |
The table below covers isolation model, environment support, GPU and BYOC availability, compute pricing, and billing model across all platforms in this comparison. Pricing as of April 2026. Verify current rates on each platform's pricing page before making cost decisions.
| Platform | Isolation model | Ephemeral | Persistent | GPU | BYOC (Bring Your Own Cloud) | CPU pricing | Memory pricing | Billing |
|---|---|---|---|---|---|---|---|---|
| Northflank | MicroVM (Kata, Firecracker) + gVisor | Yes | Yes | Yes, L4: $0.80/hr, A100 40GB: $1.42/hr, A100 80GB: $1.76/hr, H100: $2.74/hr, H200: $3.14/hr, and more | Yes, self-serve | $0.01667/vCPU-hr | $0.00833/GB-hr | Per second |
| E2B | MicroVM (Firecracker) | Yes | Yes | No GPU compute | Enterprise only (AWS & GCP), requires contacting sales | $0.0504/vCPU-hr | $0.0162/GiB-hr | Per second |
| Modal | gVisor | Yes | No | Yes, L4: $0.80/hr, A100 40GB: $2.10/hr, A100 80GB: $2.50/hr, H100: $3.95/hr, H200: $4.54/hr | No | $0.1419/core-hr (2 vCPU) | $0.0242/GiB-hr | Per second |
| Fly.io Sprites | MicroVM (Firecracker) | Yes | Yes | No GPU compute | No | $0.07/CPU-hr | $0.04375/GB-hr | Per second, no idle |
| Runloop | MicroVM + container (two-layer) | Yes | Yes | No GPU compute | Enterprise only, requires contacting sales | $0.108/CPU-hr | $0.0252/GB-hr | Per second |
| Vercel Sandbox | MicroVM (Firecracker) | Yes | Beta | No GPU compute | No | $0.128/vCPU-hr | $0.0212/GB-hr | Active CPU only |
| Cloudflare Sandbox | Container | Yes | No | No GPU compute | No | $0.072/vCPU-hr | $0.009/GiB-hr | Active CPU |
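The per-vCPU and per-GB rates above compose into a per-sandbox hourly cost. A minimal sketch using the rates from the table (the 1 vCPU / 4 GB spec is an illustrative assumption; Modal is omitted here because it bills CPU per core rather than per vCPU):

```python
# Hourly sandbox cost from the published per-unit rates (April 2026).
RATES = {
    # provider: (USD per vCPU-hour, USD per GB-hour)
    "Northflank": (0.01667, 0.00833),
    "E2B":        (0.0504,  0.0162),
    "Fly.io":     (0.07,    0.04375),
    "Vercel":     (0.128,   0.0212),
}

def hourly_cost(provider: str, vcpu: float, gb: float) -> float:
    cpu_rate, mem_rate = RATES[provider]
    return vcpu * cpu_rate + gb * mem_rate

# Compare providers at an assumed 1 vCPU / 4 GB sandbox spec.
for provider in RATES:
    print(f"{provider}: ${hourly_cost(provider, vcpu=1, gb=4):.4f}/hr")
```

At this spec the spread is roughly 5x between the cheapest and most expensive managed rates, before any idle-billing or active-CPU discounts apply.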
The platforms below range from full-stack production runtimes to purpose-built sandbox tools. Each has a distinct scope and trade-off profile worth understanding before you commit.
Northflank is a production infrastructure platform that covers the complete stack an AI product needs: agents, APIs, background workers, databases, cron jobs, and isolated sandbox execution in one control plane. CPU and GPU workloads are both supported.
Sandboxes on Northflank use microVM-based isolation (Kata Containers, Firecracker) alongside gVisor, applied per workload depending on security and performance requirements. Environment creation takes 1-2 seconds end-to-end, accounting for the full orchestration cycle.
- MicroVM isolation (Kata Containers, Firecracker) and gVisor applied per workload type
- Both ephemeral and persistent environments with no forced time limits
- On-demand GPUs (L4, A100 40GB/80GB, H100, H200, and more) without quota requests or reservation
- Self-serve BYOC (Bring Your Own Cloud) across AWS, GCP, Azure, Oracle, Civo, CoreWeave, on-premises, and bare-metal
- API, CLI, and SSH access
- In production since 2021 across startups, public companies, and government deployments
The table below shows total monthly cost across providers at 200 sandboxes, using equivalent compute specs: plan nf-compute-100-4 on Northflank, with an m7i.2xlarge infrastructure node for the BYOC rows. Pricing as of April 2026.
| Model | Provider | Cloud cost | Vendor cost | Total |
|---|---|---|---|---|
| PaaS | Northflank | - | $7,200.00 | $7,200.00 |
| PaaS | E2B | - | $16,819.20 | $16,819.20 |
| PaaS | Modal | - | $24,491.50 | $24,491.50 |
| PaaS | Fly Sprites | - | $35,770.00 | $35,770.00 |
| PaaS | Runloop | - | $30,484.80 | $30,484.80 |
| PaaS | Vercel Sandbox | - | $31,068.80 | $31,068.80 |
| BYOC (0.2 request modifier) | Northflank | $1,500.00 | $560.00 | $2,060.00 |
| BYOC | E2B | $1,500.00 | $10,000.00 | $11,500.00 |
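The PaaS totals above are consistent with each sandbox running a 1 vCPU / 4 GB spec for a full 730-hour month at the per-unit rates quoted earlier (an inferred spec, not stated in the table; Northflank's figure reflects its fixed plan price, and Modal bills CPU per core). A sketch reproducing three of the rows under that assumption:

```python
# Reproduce the PaaS monthly totals, assuming 200 sandboxes, each running
# a 1 vCPU / 4 GB spec for a 730-hour month. Rates are the per-vCPU-hr
# and per-GB-hr figures from the pricing table earlier in the article.
HOURS_PER_MONTH = 730
SANDBOXES = 200

def monthly_total(cpu_rate: float, mem_rate: float,
                  vcpu: int = 1, gb: int = 4) -> float:
    hourly = vcpu * cpu_rate + gb * mem_rate
    return round(hourly * HOURS_PER_MONTH * SANDBOXES, 2)

print(monthly_total(0.0504, 0.0162))   # E2B row: 16819.2
print(monthly_total(0.07, 0.04375))    # Fly Sprites row: 35770.0
print(monthly_total(0.128, 0.0212))    # Vercel Sandbox row: 31068.8
```

Swapping in your own sandbox count, spec, and active hours gives a like-for-like estimate before idle-billing differences are considered.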
The BYOC row for Northflank uses a request modifier of 0.2. Each sandbox requests 20% of its plan's resources as a guaranteed minimum and can burst to the full plan limit when capacity is available on the node. This allows more sandboxes to run on the same hardware, reducing both cloud provider costs and the Northflank management fee. The modifier is configurable.
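The effect of the modifier on packing density reduces to simple arithmetic. A sketch (the 4 vCPU plan and 8 vCPU node are illustrative assumptions; real scheduling also accounts for memory, other workloads on the node, and system overhead):

```python
# Packing density under a request modifier: each sandbox *requests* only
# modifier * plan vCPU as a guaranteed minimum, and can burst to the full
# plan limit when the node has spare capacity.
def sandboxes_per_node(node_vcpu: int, plan_vcpu: int,
                       modifier: float = 1.0) -> int:
    # Work in millicores (as Kubernetes does) to avoid float rounding.
    request_millicores = round(plan_vcpu * modifier * 1000)
    return (node_vcpu * 1000) // request_millicores

full_reservation = sandboxes_per_node(8, 4)             # each requests 4 vCPU
with_modifier = sandboxes_per_node(8, 4, modifier=0.2)  # each requests 0.8 vCPU
print(full_reservation, with_modifier)
```

With these example numbers the 0.2 modifier packs five times as many sandboxes onto the same node, which is what drives the lower BYOC total in the table.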
To get started, see the sandboxes on Northflank and deploy sandboxes on Northflank documentation, or follow the guide to deploy sandboxes in your cloud for BYOC deployments. To integrate via code, see create sandbox with SDK.
Teams can get started directly (self-serve) or book a session with an engineer for specific infrastructure or compliance requirements.
E2B is a purpose-built sandbox tool for AI agents and LLM applications. It uses Firecracker microVMs and provides Python and TypeScript SDKs.
- Hobby tier: free, $100 usage credit, sessions up to 1 hour, up to 20 concurrent sandboxes
- Pro tier: $150/month plus usage, sessions up to 24 hours, up to 100 concurrent sandboxes
- No GPU compute
- BYOC is available but limited to enterprise customers on AWS and GCP; requires contacting sales
E2B covers the sandbox layer. If your agents need persistent workers, databases, background jobs, or GPU inference alongside code execution, you will need to run those on separate infrastructure.
Modal is a serverless Python-first platform that runs sandboxes in isolated gVisor environments.
- Scales to 50,000 or more concurrent sessions
- Sandbox pricing uses a separate, higher compute tier than standard Modal workloads
- GPU support across L4, A10, A100, H100, H200, and B200; GPU rates on the sandbox tier match standard Modal GPU pricing
- No BYOC; managed infrastructure only
For a more detailed comparison, see E2B vs Modal.
Sprites is a sandbox product from Fly.io built on Firecracker VMs.
- Each Sprite has a persistent filesystem (ext4) with checkpoint and restore support
- Up to 8 vCPUs and 16GB RAM per Sprite
- Per-second billing with no idle charge
- No GPU support
- No BYOC
Related: E2B vs Sprites.
Runloop focuses on sandbox environments (called Devboxes) for AI agent workflows, with evaluation tooling alongside.
- Basic plan: free with $50 in trial credits
- Pro plan: $250/month plus usage
- VPC deployment available on enterprise plans
Vercel Sandbox runs sandboxes in Firecracker microVMs on Vercel's managed infrastructure.
- Node.js and Python runtimes available
- Maximum session duration: 5 hours on Pro and Enterprise, 45 minutes on Hobby
- Persistent sandboxes with auto-save and resume available in beta
- No GPU support
- No BYOC; managed infrastructure only, single region
Cloudflare Sandbox is a beta product built on Cloudflare's Containers infrastructure, available on the Workers Paid plan.
- Provides a Linux environment with support for commands, file management, background processes, and exposing services from Workers applications
- Container-based isolation
- No GPU support
- No BYOC
Most platforms in this list handle one layer: sandboxed code execution. Northflank covers the full runtime: sandboxes, persistent services, background workers, databases, and GPU workloads under one control plane, with self-serve BYOC for teams that need execution inside their own VPC.
For production AI products, the operational surface is wider than isolated code execution. Agents need persistent memory, spawn async tasks, call APIs, and sometimes need GPU access for inference. Running each of those on separate platforms adds coordination overhead and more failure surfaces.
If you are evaluating Northflank for agent infrastructure, see the sandboxes on Northflank and deploy sandboxes in your cloud documentation, or follow the hands-on guide to spinning up a secure sandbox and microVM.
You can get started directly (self-serve) or book a call with the team.
At minimum, agents need isolated execution environments so untrusted code does not affect host systems or other tenants. Production agents typically also need persistent compute to maintain state across sessions, background workers for async tasks, storage for memory and outputs, and sometimes GPU access for inference. See code execution environment for autonomous agents for a more detailed breakdown.
A sandbox provides an isolated environment for executing code. An agent cloud platform covers a broader scope: sandboxes for execution, plus persistent compute, storage, workers, databases, and orchestration. Most purpose-built sandbox tools in this list focus on the execution layer only.
Of the platforms compared here, Northflank supports self-serve BYOC across AWS, GCP, Azure, Oracle, Civo, CoreWeave, on-premises, and bare-metal. E2B and Runloop offer BYOC with limitations, and both require contacting sales. Modal, Fly.io Sprites, Vercel Sandbox, and Cloudflare Sandbox are managed-only. See self-hosted AI sandboxes and top BYOC AI sandboxes.
Ephemeral sandboxes spin up for a task and terminate when it completes, leaving no state behind. Persistent sandboxes retain filesystem state, memory, and installed packages across runs. Agents with session history or accumulated context need persistent environments. See ephemeral execution environments for AI agents and persistent sandboxes.

