

Best platforms for long-running sandbox environments in 2026
Most sandbox platforms are built around short-lived execution. They work well when each run is isolated and stateless, but fall apart the moment your agent needs to maintain a working environment across sessions, build up state over time, or run a task that outlasts an arbitrary platform timeout. These are the platforms worth evaluating when persistence is the requirement.
- Northflank – Full-stack AI infrastructure platform with managed cloud and BYOC deployment into AWS, GCP, Azure, or bare-metal. Production-grade microVM sandboxes with Kata Containers, Firecracker, and gVisor isolation, unlimited sessions, databases, GPUs, CI/CD, and observability all in one place.
- E2B – Up to 24 hours on Pro with session persistence and snapshot support. Best for agents that need structured execution windows.
- CodeSandbox – Snapshot and fork environments with VM restore in under two seconds. State persists across sessions without rebuilding.
- Modal – Unlimited session duration with gVisor isolation and snapshot primitives for saving and restoring sandbox state.
- Fly.io Sprites – Persistent Linux VMs with 100GB NVMe storage that survive between sessions and idle automatically when not in use.
Most early sandbox decisions are made under prototype conditions, where every run is short, stateless, and independent. That works fine for quick code execution or one-shot agent tasks. It stops working the moment your agent needs to hold state.
A coding agent refactoring a large codebase across multiple interactions cannot start from scratch each time. An AI assistant maintaining memory of a user's project needs an environment that survives beyond a single session. A data pipeline agent processing files for hours cannot hit a platform timeout mid-run. These are not edge cases. They are the default shape of production agent workflows.
Short session limits force workarounds: checkpointing state to external storage, rebuilding environments on every run, and re-downloading datasets. Each one adds complexity and latency. Choosing a platform built for persistence from the start avoids the problem entirely.
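To make the first workaround concrete, here is a minimal sketch of checkpointing an agent's working directory and rebuilding it on a fresh run. It uses only the Python standard library. The function names and the idea of a local archive path standing in for "external storage" are assumptions for illustration, not any platform's API; in practice the archive would be uploaded to object storage and downloaded again before each restore.

```python
import tarfile
import tempfile
from pathlib import Path


def checkpoint(workdir: Path, archive: Path) -> Path:
    """Pack the working directory into a gzip tarball (hypothetical helper)."""
    archive.parent.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "w:gz") as tar:
        # arcname="." stores paths relative to workdir, so the archive
        # can be unpacked into any fresh directory later.
        tar.add(workdir, arcname=".")
    return archive


def restore(archive: Path, workdir: Path) -> None:
    """Rebuild a working directory from a checkpoint tarball."""
    workdir.mkdir(parents=True, exist_ok=True)
    # Use the safer "data" extraction filter where this Python version has it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(workdir, **kwargs)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp) / "work"
        work.mkdir()
        (work / "notes.txt").write_text("partial refactor state")

        saved = checkpoint(work, Path(tmp) / "store" / "ckpt.tar.gz")
        fresh = Path(tmp) / "fresh"
        restore(saved, fresh)
        print((fresh / "notes.txt").read_text())
```

Every run pays for the pack, transfer, and unpack steps, which is the added latency and complexity described above.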
Session length is only part of the equation. The platforms below differ significantly in how they handle state, what survives between runs, and how much infrastructure sits around the sandbox itself. Here is how they compare.
Northflank is a full-stack AI cloud platform with native support for long-running and persistent sandbox environments, accessible via UI, API, CLI, and GitOps. You define your sandbox environment once, specifying isolation model, storage, attached databases, secrets, and lifecycle rules, then provision it however fits your workflow: from a CLI command, an API call in a CI step, a Git trigger, or directly from an agent pipeline.

What sets Northflank apart for long-running use cases is the combination of no forced session limits and full-stack scope. Most sandbox platforms provision containers only. Northflank provisions databases, persistent volumes, S3-compatible object storage, background jobs, and encrypted secrets alongside your sandboxes, all from a single template on every trigger.
Key features:
- No session limits: Sandboxes run for seconds or weeks with no platform-imposed cutoff. Ephemeral and persistent environments are supported in the same control plane.
- Persistent storage: Attach volumes from 4GB to 64TB with multi-read-write support. Mount S3-compatible object storage for artifacts. Deploy managed databases alongside sandboxes for agent memory and execution history.
- Isolation options: Kata Containers with Cloud Hypervisor, Firecracker, and gVisor, applied per workload. Northflank's engineering team actively contributes to Kata Containers, QEMU, and Cloud Hypervisor upstream.
- API-first provisioning: Trigger, list, pause, resume, and delete sandbox environments programmatically from any CI system, script, or orchestration layer.
- Managed or BYOC: Deploy on Northflank's managed infrastructure or run sandboxes inside your own cloud. BYOC supports AWS, GCP, Azure, Oracle, CoreWeave, Civo, on-premises, and bare-metal, self-serve with no enterprise sales process required.
- GitOps-compatible: Sandbox environment templates can be version-controlled and synced bidirectionally with a Git repository.
- SOC 2 Type 2 certified: Relevant for teams with compliance requirements or regulated infrastructure.
cto.new migrated their entire sandbox infrastructure to Northflank in two days after EC2 metal instances made scaling costs unpredictable, going from unworkable provisioning to thousands of daily deployments with linear, per-second billing.
Best for: Production agents that maintain state across days or weeks. Platform engineering teams building agent infrastructure. Enterprise teams that need BYOC, compliance controls, and persistent storage at scale.
Pricing: $0.01667/vCPU-hour, $0.00833/GB-hour, H100 GPU at $2.74/hour all-inclusive. BYOC deployments bill against your own cloud account.
Get started on Northflank (self-serve, no demo required). Or book a demo with an engineer if you want to walk through your architecture first.
E2B supports session persistence and snapshots, letting you pause a sandbox and resume it later from the same state. On the Pro plan, sessions run for up to 24 hours, which covers the majority of agent workflows that do not need multi-day continuity. The Python and TypeScript SDKs handle the full lifecycle, including creation, execution, filesystem access, and teardown, and integrate cleanly with LangChain, OpenAI, and Anthropic tooling.
The 24-hour cap is the real constraint here. Workflows that span multiple days require either an upgrade to enterprise or engineering around the limit with external state management. BYOC is available, but only on AWS and only for enterprise customers.
Best for: Teams whose agents run in structured execution windows under 24 hours and who want clean SDK integration and reliable persistence within a session.
Pricing: Free tier with $100 one-time credit. Pro at $150/month with 24-hour sessions and configurable CPU and RAM.
CodeSandbox persists environment state across sessions and supports snapshot and fork workflows that are genuinely useful for long-running agent work. You can save a sandbox at any point, branch from it, and restore in under two seconds, which makes it practical for agents that need to experiment across multiple paths from the same base state.
Backed by Together AI, it accepts Dev Container images and standard environment formats. There is no BYOC option, and the platform skews toward web-focused use cases, but for teams building iterative agent workflows where resuming from a known state is more important than raw session length, it holds up well.
Best for: Iterative agent workflows, parallel runs from shared state, and web-focused coding tools where snapshot and restore matter more than unlimited session length.
Pricing: Community plan is free. Production at $0.0446/vCPU-hour plus $0.0149/GB-RAM-hour.
Modal supports unlimited session duration with no platform-imposed cap. Sandboxes use gVisor isolation and sit inside a broader ML infrastructure stack that scales to 20,000 concurrent containers with sub-second cold starts. The platform also provides snapshot primitives for saving and restoring sandbox state, which is useful for long-running workflows that need checkpoints.
The tradeoff is the SDK model. Environments are defined through Modal's Python library rather than arbitrary container images, which limits flexibility for teams not already working Python-first. There is no BYOC option.
Best for: Python-heavy agents running long ML inference, training, or data processing jobs that need persistence without a session cap.
Pricing: Usage-based per second. CPU from around $0.047/vCPU-hour. GPU billed separately from CPU and RAM.
Sprites are persistent Linux VMs with 100GB NVMe storage that survive indefinitely between sessions. They checkpoint and restore in around 300ms and idle automatically when not in use, so you pay nothing when the environment is sitting dormant. That billing model is well-suited for long-running agent environments that have unpredictable usage patterns, where always-on compute would be wasteful but cold-start rebuild time is unacceptable.
Fly.io CEO Kurt Mackey put it plainly: ephemeral sandboxes are obsolete for agents that need a real working environment. Sprites are built around that idea. The tradeoff is that sandbox creation takes one to twelve seconds, there is no BYOC, and the platform is still early-stage relative to the others here.
Best for: Agents that need a persistent warm environment between irregular sessions, and teams already on Fly.io who want session persistence without always-on costs.
Pricing: $0.07/CPU-hour and $0.04375/GB-hour of memory, no charge when idle.
The right choice depends on how long your sessions actually need to run and what surrounds the sandbox.
For sessions measured in days or weeks, Northflank and Modal are the options that explicitly run with no platform-imposed cap. Northflank adds microVM isolation, persistent volumes, databases, and BYOC on top of that.
For sessions under 24 hours, E2B covers most production use cases cleanly. CodeSandbox and Fly.io Sprites work well when persistence between irregular sessions matters more than raw duration. Modal fits if your long-running workloads are Python and ML first.
| Platform | Session limit | Persistence model | BYOC | Isolation |
|---|---|---|---|---|
| Northflank | Unlimited | Volumes, databases, S3 | Yes (AWS, GCP, Azure, bare-metal) | Kata Containers, Firecracker, gVisor |
| E2B | 24 hours | Session snapshots | AWS only, enterprise | Firecracker |
| CodeSandbox | None | Snapshots, fork and restore | No | microVM |
| Modal | None | Snapshot primitives | No | gVisor |
| Fly.io Sprites | None | 100GB NVMe, survives idle | No | Firecracker |
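The per-hour rates quoted in the pricing lines above can be turned into a rough monthly estimate. The sketch below is only arithmetic on those published numbers. The sandbox size (2 vCPU, 4 GB) and the hour counts are assumptions chosen for illustration, and `billed_hours` is left as an input because how each platform counts idle time is something to confirm against its current pricing page. Modal and E2B are left out because the article does not give a full per-resource rate for them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    vcpu_hour: float  # USD per vCPU-hour
    gb_hour: float    # USD per GB of memory per hour


# Rates as quoted in this article; verify before relying on them.
RATES = {
    "Northflank": Rate(vcpu_hour=0.01667, gb_hour=0.00833),
    "CodeSandbox": Rate(vcpu_hour=0.0446, gb_hour=0.0149),
    "Fly.io Sprites": Rate(vcpu_hour=0.07, gb_hour=0.04375),
}


def estimate(rate: Rate, vcpus: float, memory_gb: float, billed_hours: float) -> float:
    """Compute cost in USD for a sandbox billed for `billed_hours` hours."""
    hourly = vcpus * rate.vcpu_hour + memory_gb * rate.gb_hour
    return billed_hours * hourly


if __name__ == "__main__":
    # Assumed workload: 2 vCPU, 4 GB, compared at 730 billed hours
    # (always on for a month) and 100 billed hours (mostly idle).
    for name, rate in RATES.items():
        always_on = estimate(rate, vcpus=2, memory_gb=4, billed_hours=730)
        mostly_idle = estimate(rate, vcpus=2, memory_gb=4, billed_hours=100)
        print(f"{name}: ${always_on:.2f} always-on, ${mostly_idle:.2f} at 100 billed hours")
```

The gap between the two columns is the point of idle-based billing: a higher hourly rate can still come out cheaper when the environment is dormant most of the month.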
Session limits are usually a cost control mechanism. Running live containers indefinitely is expensive, and platforms with managed infrastructure pass that constraint to users. Platforms like Northflank and Fly.io Sprites solve this with idle-based billing or per-second pricing rather than hard cutoffs.
Session persistence means the container stays alive and active. State persistence means the filesystem and environment survive even when the container shuts down or idles. Fly.io Sprites persist the state even when the environment is not running. E2B and Northflank support both, depending on how you configure your environment.
On most sandbox-only platforms, you cannot run a database alongside the sandbox. Northflank is the exception: you can deploy Postgres, Redis, MySQL, or MongoDB in the same control plane as your sandbox and connect them directly. On other platforms, you would need an external database service.
Northflank and Modal both support unlimited session duration. Northflank adds stronger isolation, persistent volumes, and BYOC.
Sprites do not bill for idle time. They idle automatically when not in use, and billing stops. The 100GB NVMe filesystem persists through idle periods, so the environment is exactly as the agent left it when it wakes up.
On E2B, state within a session persists across executions while the session is active. When the session times out, all in-memory state and ephemeral filesystem data are lost. Snapshots let you save and restore specific states before a timeout, but this requires intentional checkpointing in your agent workflow.
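Intentional checkpointing usually means watching the session clock and saving state before the limit arrives. The sketch below shows one way to structure that, under stated assumptions: `snapshot` is a hypothetical callable standing in for whatever snapshot call your platform's SDK exposes (it is not E2B's API), and the limit and safety margin are values you choose.

```python
import time
from typing import Callable, Iterable


def run_with_checkpoints(
    steps: Iterable[Callable[[], None]],
    snapshot: Callable[[], None],
    session_limit_s: float,
    safety_margin_s: float,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Run steps in order, snapshotting before the session limit is reached.

    Returns the number of steps completed so the caller knows where to
    resume after restoring the snapshot in a new session.
    """
    started = clock()
    completed = 0
    for step in steps:
        remaining = session_limit_s - (clock() - started)
        if remaining <= safety_margin_s:
            # Too close to the timeout: save state and stop cleanly.
            snapshot()
            return completed
        step()
        completed += 1
    # All steps finished inside the window; save the final state too.
    snapshot()
    return completed
```

The safety margin should be larger than both the longest single step and the time the snapshot itself takes; otherwise the session can still expire mid-step.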
Ephemeral sandboxes made sense when agents were simple and tasks were short. Production agents in 2026 hold state, build up environments over time, and run tasks that span hours or days. The platform you pick needs to match that reality.
Northflank is the strongest option for teams that need unlimited sessions, real persistence with volumes and databases, and the flexibility to run inside their own infrastructure. The other platforms here each cover a slice of the problem well. Northflank is the one that covers it end to end.
You can get started for free on Northflank or talk to the team to see how persistent sandbox infrastructure fits your stack.


