Published 6th July 2026

What are persistent sandboxes? (and why AI agents need them)

Persistent sandboxes are isolated execution environments that retain their state across sessions, giving AI agents and developers a continuous workspace that picks up exactly where it left off.

If you've ever built an AI agent that needs to pick up where it left off, with the same files and same installed packages, you've already felt the problem that persistent sandboxes solve.

Most sandbox environments are ephemeral by design. They spin up, run some code, and disappear. That works for a lot of use cases. But the moment your agent needs to resume a task, maintain a working directory across sessions, or keep a long-running service alive between calls, ephemeral execution starts fighting against you.

This article breaks down what persistent sandboxes are, why they've become important for AI agent infrastructure, and what you should be evaluating when you're choosing a platform that supports them.

TL;DR: Key takeaways on persistent sandboxes

A persistent sandbox is an isolated execution environment that retains its filesystem state across executions, giving agents a persistent working directory and installed environment to return to.
Ephemeral sandboxes are destroyed after each run; persistent ones survive.
Persistent sandboxes are most relevant for AI agents that need to resume tasks, accumulate state, or run long-horizon workflows.
The tradeoff: persistent sandboxes require more infrastructure thinking around storage, security, and lifecycle management.

Northflank supports both persistent and ephemeral sandbox environments. It offers MicroVM-based isolation (Kata Containers, Firecracker) and gVisor depending on workload, 97ms median time-to-interactive per ComputeSDK benchmarks (July 2026), on-demand GPUs, bring your own cloud (BYOC) deployment across your own cloud accounts, on-premises, and bare metal infrastructure, API/CLI/SSH access, and SOC 2 Type 2 compliance. It has been in production since 2021 across startups, public companies, and government deployments.

What is a persistent sandbox?

A persistent sandbox is an isolated execution environment that keeps its state between sessions. When you close the connection and come back later, or when your agent makes a new call, everything is still there: the files you wrote and the packages you installed.

The word "sandbox" here is doing its usual job: this is an environment with enforced isolation, meaning code running inside it can't affect the host system or other tenants. The word "persistent" describes what happens to that environment over time. It isn't discarded when the session ends.

Compare this to an ephemeral sandbox, which is created fresh for each execution and discarded when the run ends. Ephemeral environments are great for untrusted one-shot code execution. Persistent environments are what you reach for when continuity of state is important.

If you want a deeper look at how the two compare in practice, ephemeral execution environments for AI agents covers the ephemeral side in more detail.

What's the difference between persistent and ephemeral sandboxes?

The distinction comes down to what survives when an execution ends. Here's a quick breakdown:

Aspect	Ephemeral sandbox	Persistent sandbox
State after run	Destroyed	Retained
Filesystem	Wiped	Survives across executions
Installed packages	Gone	Still there
Running processes	Terminated	Platform-dependent
Best for	Stateless, one-shot execution	Multi-step, stateful workloads
Security cleanup	Automatic	Requires lifecycle management

With an ephemeral sandbox, each run starts from a clean slate. Nothing carries over from the previous session: no files, no installed dependencies, no process state. This is useful for security-sensitive workloads where you want guaranteed cleanup, and for stateless tasks where you don't need continuity.

With a persistent sandbox, the environment survives between runs. Your agent can write a file during session one, disconnect, and find that file intact during session two. This is much closer to how a developer's local machine works, which is part of why it maps well to agent workflows.
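
As a rough local illustration of that difference (not how any particular platform implements it), the sketch below models a sandbox's filesystem as a plain directory. The names `run_ephemeral`, `run_persistent`, and `notes.txt` are hypothetical and exist only for this example.

```python
import tempfile
from pathlib import Path


def run_ephemeral(text: str) -> list[str]:
    """Each run gets a fresh directory that is deleted when the run ends."""
    with tempfile.TemporaryDirectory() as workdir:
        notes = Path(workdir) / "notes.txt"
        with notes.open("a") as f:
            f.write(text + "\n")
        return notes.read_text().splitlines()


def run_persistent(workdir: Path, text: str) -> list[str]:
    """Every run reuses the same directory, so earlier writes are still there."""
    notes = workdir / "notes.txt"
    with notes.open("a") as f:
        f.write(text + "\n")
    return notes.read_text().splitlines()


def demo() -> dict[str, list[str]]:
    # Stand-in for a persistent volume: a directory that outlives each run.
    with tempfile.TemporaryDirectory() as volume:
        run_persistent(Path(volume), "session one")
        persistent = run_persistent(Path(volume), "session two")
    run_ephemeral("session one")
    ephemeral = run_ephemeral("session two")
    return {"ephemeral": ephemeral, "persistent": persistent}


print(demo())
# {'ephemeral': ['session two'], 'persistent': ['session one', 'session two']}
```

The ephemeral run only ever sees what it wrote itself, while the persistent run sees both sessions. This models filesystem state only; it says nothing about isolation or running processes.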

In practice, the choice between persistent and ephemeral isn't always binary. Well-designed platforms let you choose per-workload: spin up short-lived execution pools for stateless tasks, and maintain long-running stateful services for the workflows that need them.

Why do AI agents need persistent sandboxes?

The shift toward persistent sandboxes is largely driven by how AI agents are built and deployed today.

Early sandbox use cases were straightforward: run user-submitted code safely, return the output, tear it down. The sandbox was a one-shot execution container. But AI agents, especially those built to complete multi-step tasks autonomously, don't work like that.

For instance, take an agent that's been asked to build a feature in a codebase. It needs to clone a repo, install dependencies, run tests, iterate on a fix, and re-run tests. If each tool call spins up a fresh sandbox, the agent has to reinstall everything from scratch every time. The iteration loop gets expensive and slow.
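
To put rough numbers on that loop, here's a back-of-the-envelope sketch. The timings (90 seconds of setup, 20 seconds per test run, 10 iterations) are made-up assumptions for illustration, not measurements from any platform.

```python
def total_seconds(iterations: int, setup_s: float, step_s: float, persistent: bool) -> float:
    """Wall-clock time for an agent loop of `iterations` tool calls.

    With a fresh sandbox per call, setup (clone + install) is paid every time.
    With a persistent sandbox, it is paid once and reused.
    """
    setups = 1 if persistent else iterations
    return setups * setup_s + iterations * step_s


# Hypothetical figures, chosen only to show the shape of the cost.
ITERATIONS, SETUP_S, STEP_S = 10, 90.0, 20.0

ephemeral = total_seconds(ITERATIONS, SETUP_S, STEP_S, persistent=False)
persistent = total_seconds(ITERATIONS, SETUP_S, STEP_S, persistent=True)

print(f"fresh sandbox per call: {ephemeral:.0f}s")   # 1100s
print(f"persistent sandbox:     {persistent:.0f}s")  # 290s
```

Pre-built images or dependency caches can shrink the setup term for ephemeral runs, so treat the first figure as a worst case rather than a rule.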

Let's say you also have an agent running a background data pipeline, processing files as they arrive, maintaining a working directory, and accumulating output. That's not a stateless task. It needs an environment that behaves like a running service, not a function invocation.

Persistent sandboxes are also key for:

Multi-step coding agents that need to build up a working environment incrementally
Long-horizon research agents that read, write, and revise documents over extended periods
Agent-powered development tools where the environment needs to feel continuous to the user
Stateful tool execution where the agent uses shell, filesystem, and process state as part of its reasoning loop

Running AI agents in production?

Northflank is a full workload runtime built for exactly this. You can run agents, APIs, workers, databases, and background jobs on a single platform, with persistent and ephemeral sandbox environments as first-class options.

Sandbox environments overview - see how persistent and ephemeral environments work on Northflank
Get started - self-serve setup
Pricing - CPU, memory, and GPU pricing
How to deploy ClawdBot on Northflank - a practical walkthrough of running an agent with sandbox environments on Northflank
Talk to an engineer - if you have specific infrastructure or compliance requirements and want to talk through how Northflank fits your setup

What should you look for in a persistent sandbox platform?

If you're evaluating platforms for persistent sandbox support, here are the things worth scrutinising:

How does persistence work?

Some platforms snapshot filesystem state and restore it on the next call. Others keep the environment running as a long-lived process. Others let you pause and resume. The underlying mechanism affects performance, cost, and what kinds of state persist (filesystem vs process vs memory).

What's the isolation model?

Persistence widens the attack surface, because long-lived environments accumulate state over time. You want MicroVM-level isolation, something like Firecracker or Kata Containers, or gVisor for user-space kernel sandboxing, rather than just container-level isolation, especially if you're running untrusted code or serving multiple tenants.

How fast does a new environment come up?

Even if you're using persistent environments for most workloads, you'll still need to spin up new ones. Pay attention to cold start time for the full environment creation path, not just component-level benchmarks.

Can you run the full workload on one platform?

Agents aren't just code execution. They need storage, APIs, background workers, and databases. If your sandbox platform only handles code execution, you end up stitching together multiple services. Look for platforms like Northflank that support full workload runtimes.

What does deployment look like in an enterprise context?

If you're building for an organisation with data residency or compliance requirements, confirm the platform supports deployment inside your own cloud or VPC. SOC 2 Type 2 compliance is also worth verifying. See Northflank's security page for its compliance posture.

Do you get GPU access?

For inference-heavy agent workloads, check GPU availability and whether you can provision GPUs on demand without quota requests.

How does Northflank handle persistent sandboxes?

Northflank is a full workload runtime that supports both persistent and ephemeral sandbox environments as first-class primitives.

Here's what that looks like in practice:

Persistent environments: Stateful services backed by persistent volumes, where filesystem state survives between executions
Ephemeral environments: Short-lived execution pools suited to stateless or one-shot workloads
Per-workload flexibility: Environment type is configured per workload; both can run on the same platform alongside the rest of your agent infrastructure, including APIs, workers, databases, and background jobs
Isolation: MicroVM-based (Kata Containers, Firecracker) or gVisor, depending on workload characteristics, for secure execution of untrusted code
Spin-up time: 97ms median time-to-interactive for sequential starts, 167ms under concurrent burst load (P99 216ms)
GPU access: On-demand GPUs, self-service provisioning, no quota requests
BYOC: Bring your own cloud (BYOC) deployment across your own cloud accounts, on-premises, and bare metal infrastructure, fully self-serve
Access: API, CLI, and SSH access
Compliance: SOC 2 Type 2 certified. See the security page for full details
Pricing: CPU at $0.01667/vCPU/hour, memory at $0.00833/GB/hour. GPU pricing on the pricing page
Production track record: In use since 2021 across startups, public companies, and government deployments
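
To see what the CPU and memory rates listed above add up to, here's a quick worked example. The sandbox shape (2 vCPU, 4 GB) and the usage hours are hypothetical, and the sketch covers CPU and memory only, since storage and GPU rates aren't listed here.

```python
CPU_USD_PER_VCPU_HOUR = 0.01667  # rate quoted in the list above
MEM_USD_PER_GB_HOUR = 0.00833    # rate quoted in the list above


def compute_cost(vcpus: float, memory_gb: float, hours: float) -> float:
    """CPU + memory cost in USD; storage, GPU, and network are not included."""
    hourly = vcpus * CPU_USD_PER_VCPU_HOUR + memory_gb * MEM_USD_PER_GB_HOUR
    return hourly * hours


# Hypothetical sandbox shape: 2 vCPU, 4 GB memory.
always_on = compute_cost(2, 4, hours=24 * 30)  # kept running for 30 days
bursty = compute_cost(2, 4, hours=40)          # 40 active hours in total

print(f"always-on for 30 days: ${always_on:.2f}")  # $48.00
print(f"40 active hours:       ${bursty:.2f}")     # $2.67
```

Whether a persistent environment accrues compute charges while idle depends on how it is run, which the list above doesn't specify, so the always-on figure is an upper bound for this shape.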

Northflank at 100,000 concurrent sandboxes. In the ComputeSDK 2026 Scale Invitational, Northflank reached 100,000 concurrent live sandboxes in 24 seconds from a cold start with zero failures, the fastest time of any participant. P99 allocate latency was 566ms and P99 readiness was 733ms, compared to 2.69s and 1.42s for E2B and 1.51s and 3.67s for Modal. At scale, tail latency determines the slowest sandbox your system waits on before the full fleet is live. See the full results.
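
To make the tail-latency point concrete, here's a small sketch with synthetic numbers. The samples are invented for illustration and have nothing to do with the benchmark results above; `percentile` is a simple nearest-rank helper written for this example.

```python
import math
import statistics


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of values at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Synthetic readiness times in milliseconds: 98 fast sandboxes and 2 slow ones.
readiness_ms = [100.0] * 98 + [900.0, 1500.0]

median = statistics.median(readiness_ms)
p99 = percentile(readiness_ms, 99)
fleet_ready = max(readiness_ms)  # the fleet is live only when the slowest one is

print(median, p99, fleet_ready)  # 100.0 900.0 1500.0
```

The median looks great, yet anything that waits for the whole fleet is gated by the slowest sandbox, which is why P99 figures say more about fleet readiness than medians do.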

Get started with Northflank or talk to an engineer if you want to discuss your company's specific infrastructure requirements.

When should you use persistent sandboxes vs ephemeral?

Neither approach is universally better. The right choice depends on what your workload needs.

Use case	Recommended environment
Agent resuming a task across executions	Persistent
Accumulating filesystem state (cloning repos, installing packages)	Persistent
Long-lived services or background processes	Persistent
Environment that behaves like a continuous workspace	Persistent
Stateless, one-shot code execution	Ephemeral
Guaranteed environment cleanup after each run	Ephemeral
Many parallel short-lived tasks with no continuity needed	Ephemeral
Short-burst workloads where cost efficiency is a priority	Ephemeral

For most production AI agent architectures, you'll want both available. The workloads that benefit from persistence and the workloads that benefit from ephemerality often coexist in the same system.
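
If you want to encode the table above as a routing rule in your own orchestration code, one possible sketch looks like this. The `Workload` fields and `recommend_environment` function are hypothetical names, and raising an error on conflicting requirements is just one design choice.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    resumes_across_runs: bool = False       # agent picks a task back up later
    accumulates_files: bool = False         # clones repos, installs packages
    runs_background_process: bool = False   # long-lived service or worker
    needs_guaranteed_cleanup: bool = False  # nothing may survive the run


def recommend_environment(w: Workload) -> str:
    """Return 'persistent' or 'ephemeral', following the table above."""
    wants_state = w.resumes_across_runs or w.accumulates_files or w.runs_background_process
    if w.needs_guaranteed_cleanup and wants_state:
        # The two requirements pull in opposite directions.
        raise ValueError("conflicting requirements: consider splitting the workload")
    return "persistent" if wants_state else "ephemeral"


coding_agent = Workload(resumes_across_runs=True, accumulates_files=True)
untrusted_snippet = Workload(needs_guaranteed_cleanup=True)

print(recommend_environment(coding_agent))       # persistent
print(recommend_environment(untrusted_snippet))  # ephemeral
```

A rule like this keeps the choice per workload, so one system can send its coding agent to a persistent environment and its one-shot snippets to an ephemeral pool.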

FAQ: persistent sandboxes

What is a persistent sandbox?

A persistent sandbox is an isolated execution environment that retains its filesystem state, including files and installed packages, between separate sessions or invocations. Unlike an ephemeral sandbox, which is destroyed after each run, it keeps that state even when there is no active connection.

What's the difference between a persistent sandbox and a container?

Containers can be either persistent or ephemeral, depending on how they're managed. The term "persistent sandbox" specifically refers to a sandboxed (isolated) environment designed to maintain state over time, usually with additional security primitives like MicroVM isolation on top of the container layer.

Do AI agents need persistent sandboxes?

It depends on the agent's task. Agents doing multi-step work, such as writing code across multiple executions, maintaining a working directory, or running background processes, benefit from persistent sandboxes. Agents doing stateless one-shot tasks can work fine with ephemeral environments.

Are persistent sandboxes less secure than ephemeral ones?

Not inherently, but they require more careful security design. Because state accumulates over time, persistent environments need robust isolation (MicroVM-level, not just container-level) and clear lifecycle management policies to limit exposure. Ephemeral environments get a degree of automatic cleanup that persistent ones don't.
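
As one example of what a lifecycle policy might look like, here's a minimal sketch that flags sandboxes for deletion once they've been idle too long or have simply existed too long. The `SandboxRecord` type, the field names, and the thresholds are all assumptions for illustration, not any platform's API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class SandboxRecord:
    sandbox_id: str
    created_at: datetime
    last_used_at: datetime


def expired(
    records: list[SandboxRecord],
    now: datetime,
    max_idle: timedelta,
    max_age: timedelta,
) -> list[str]:
    """Return IDs of sandboxes idle longer than max_idle or older than max_age."""
    return [
        r.sandbox_id
        for r in records
        if now - r.last_used_at > max_idle or now - r.created_at > max_age
    ]


now = datetime(2026, 7, 6, 12, 0, tzinfo=timezone.utc)
records = [
    SandboxRecord("active", now - timedelta(days=2), now - timedelta(minutes=5)),
    SandboxRecord("idle", now - timedelta(days=2), now - timedelta(days=1)),
    SandboxRecord("old", now - timedelta(days=40), now - timedelta(minutes=1)),
]

to_delete = expired(records, now, max_idle=timedelta(hours=12), max_age=timedelta(days=30))
print(to_delete)  # ['idle', 'old']
```

Capping total age as well as idle time means even a busy sandbox gets rebuilt from a clean base eventually, which bounds how much state can accumulate.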

How do persistent sandboxes handle state between sessions?

It varies by platform. Some keep the environment process running continuously. Others snapshot and restore filesystem state on reconnection. The mechanism affects what kinds of state persist (filesystem, in-memory, running processes) and the latency of resuming a session.
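
To give a feel for the snapshot-and-restore approach, here's a minimal local sketch using a zip archive as the snapshot. It's an illustration under simplifying assumptions rather than how any specific platform does it, and the `snapshot` and `restore` helpers are hypothetical names.

```python
import shutil
import tempfile
from pathlib import Path


def snapshot(workdir: Path, archive_base: Path) -> Path:
    """Save the working directory's files to a zip archive and return its path."""
    return Path(shutil.make_archive(str(archive_base), "zip", root_dir=str(workdir)))


def restore(archive: Path, workdir: Path) -> None:
    """Unpack a previously saved archive into a fresh working directory."""
    shutil.unpack_archive(str(archive), extract_dir=str(workdir))


def demo() -> str:
    with tempfile.TemporaryDirectory() as store:
        # Session one: write a file, snapshot it, and let the environment go away.
        with tempfile.TemporaryDirectory() as first:
            (Path(first) / "progress.txt").write_text("step 3 of 5")
            archive = snapshot(Path(first), Path(store) / "state")
        # Session two: a brand-new directory, restored from the snapshot.
        with tempfile.TemporaryDirectory() as second:
            restore(archive, Path(second))
            return (Path(second) / "progress.txt").read_text()


print(demo())  # step 3 of 5
```

Note what this captures: files only. Anything held in memory or in a running process during session one is gone by session two, which is the kind of difference worth checking when you compare persistence mechanisms.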

Can I use persistent and ephemeral sandboxes on the same platform?

Yes. Platforms like Northflank support both as first-class primitives, so you can choose the right model per workload without switching providers.

Related articles

Ephemeral sandbox environments: The direct counterpart to this article. Covers how ephemeral sandboxes work and when they're the right choice.
Ephemeral execution environments for AI agents: Goes deeper on ephemeral execution for agent workloads, including the security model behind short-lived environments.
What is an AI sandbox?: A foundational explainer on sandbox environments in the context of AI, covering isolation models and use cases.
Best code execution sandbox for AI agents: A platform comparison for agent builders evaluating code execution sandbox options.
How to sandbox AI agents: A practical guide to sandboxing agent workloads, including isolation strategies and platform setup.
Self-hosted AI sandboxes: Covers the self-hosted and BYOC route for teams with compliance or data residency requirements.
Top AI sandbox platforms for code execution: An overview of sandbox platforms for code execution, covering isolation, performance, and scalability.
Remote code execution sandbox: How remote code execution sandboxes work and the security considerations involved.
Best sandboxes for coding agents: Focused on coding agent use cases and what they need from a sandbox environment.
Code execution environment for autonomous agents: What autonomous agents need from a code execution environment, including state and continuity requirements.