Deborah Emeni
Published 4th May 2026

GPU sandboxes: isolation models and platform support in 2026

TL;DR: what you need to know about GPU sandboxes

  • A GPU sandbox is an isolated execution environment where a workload gets access to a GPU with a hardware or syscall-level boundary separating it from the host and other tenants. Most sandbox platforms do not support GPU workloads at all.
  • GPU isolation requires PCIe device passthrough at the hardware level: IOMMU configuration, VFIO binding, and a VMM that can pass the device through into a VM. Most sandbox platforms use Firecracker, which does not support GPU passthrough, making them CPU-only by design.
  • Northflank is one of the few platforms that supports sandboxing both CPU and GPU workloads. Where nested virtualization is available, Northflank uses microVM-based isolation (KVM / Kata Containers) to run GPU workloads, with the GPU passed through into the sandboxed runtime. This is the same execution model as CPU workloads, with strong isolation guarantees.
  • Where nested virtualization is unavailable, Northflank falls back to gVisor. GPU workloads still run inside the sandboxed environment, but the isolation boundary is at the syscall level rather than hardware virtualization. The deployment model is the same in both cases.

GPU sandboxes are isolated execution environments that give untrusted workloads access to a GPU while enforcing hardware or syscall-level boundaries between the workload, the host system, and other tenants.

Most sandbox platforms today are CPU-only, and the reason comes down to how GPU hardware virtualization works at the PCIe level.

This article explains the technical constraints that make GPU sandboxing harder than CPU sandboxing, how the nested virtualization requirement shapes which isolation model applies, and which platforms support GPU workloads in sandboxed environments today.

What is a GPU sandbox?

A GPU sandbox is an isolated execution environment that provides a workload with access to a GPU while preventing it from affecting the host system or other tenants.

A bare-metal GPU instance gives a single tenant unrestricted access with no isolation boundary. A GPU-enabled container shares the host kernel and NVIDIA driver namespace with everything else on that host.

A GPU sandbox adds an isolation layer using hardware virtualization or syscall interception, which is what makes multi-tenant GPU execution safe.

Why is GPU sandboxing harder than CPU sandboxing?

CPU sandboxing and GPU sandboxing solve different problems at different layers of the stack. Understanding the distinction explains why most sandbox platforms support only CPU workloads.

CPU isolation is a kernel problem

CPU workload isolation requires memory isolation, syscall boundaries, and process separation. Both microVMs, which give each workload its own kernel via hardware virtualization, and gVisor, which intercepts system calls in user space, solve this problem well. The problem is well-understood, and multiple viable solutions exist.

GPU isolation is a hardware problem

GPUs are PCIe devices. To sandbox a GPU workload you need:

  • An IOMMU (Intel VT-d or AMD-Vi) to enforce DMA isolation, preventing one tenant's GPU from reading or writing another tenant's memory over the PCIe bus
  • VFIO binding to detach the GPU from the host driver and assign it to a specific VM
  • IOMMU group separation, so devices that share a group are assigned together
  • A VMM that supports PCIe device passthrough

On multi-GPU systems with NVLink fabrics, nv-fabricmanager adds further host-level isolation requirements.
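
The IOMMU-group requirement is easy to inspect on a Linux host. Below is a minimal sketch, assuming a Linux machine with the IOMMU enabled (intel_iommu=on or amd_iommu=on on the kernel command line), that lists each IOMMU group and the PCI devices in it, which is how you check that a GPU can be detached cleanly:

```python
from pathlib import Path

# Each directory under /sys/kernel/iommu_groups is one isolation unit.
# Devices that share a group must be passed through to the same VM.
groups = Path("/sys/kernel/iommu_groups")

if not groups.is_dir() or not any(groups.iterdir()):
    print("No IOMMU groups: the IOMMU is disabled or unsupported.")
else:
    for group in sorted(groups.iterdir(), key=lambda p: int(p.name)):
        devices = sorted(d.name for d in (group / "devices").iterdir())
        print(f"group {group.name}: {', '.join(devices)}")
```

A GPU that shares a group with, say, a host storage controller cannot be safely passed through on its own; that is what IOMMU group separation means in practice.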

Why Firecracker does not support GPU passthrough

Firecracker excludes GPU passthrough as a design decision to minimize its attack surface. It implements only six emulated devices: virtio-net, virtio-block, virtio-balloon, virtio-vsock, serial console, and a minimal keyboard controller.

Because E2B, Fly.io Sprites, and Vercel Sandbox all use Firecracker for isolation, none of them can sandbox GPU workloads. GPU access inside a sandboxed VM requires a VMM that supports PCIe device passthrough. Northflank uses Kata Containers for microVM-based workloads; the VMMs Kata runs on, such as QEMU and Cloud Hypervisor, support device passthrough.
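
For context on what a passthrough-capable VMM relies on: before any VMM can take the device, the host must rebind the GPU from its native driver to vfio-pci. A hedged sketch of that sysfs procedure, with a placeholder PCI address (requires root, the vfio-pci module loaded, and it will detach the GPU from the host):

```python
from pathlib import Path

GPU = "0000:65:00.0"  # placeholder address; locate yours with lspci
dev = Path("/sys/bus/pci/devices") / GPU

# 1. Mark the device so vfio-pci claims it at the next driver probe.
(dev / "driver_override").write_text("vfio-pci\n")

# 2. Unbind the current host driver (e.g. nvidia), if one is attached.
if (dev / "driver").exists():
    (dev / "driver" / "unbind").write_text(GPU + "\n")

# 3. Trigger a re-probe; vfio-pci binds and the VMM can map the device.
Path("/sys/bus/pci/drivers_probe").write_text(GPU + "\n")
```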

See what is AWS Firecracker, Firecracker vs gVisor, and the guide to Cloud Hypervisor.

What is the nested virtualization constraint for GPU sandboxes?

Nested virtualization determines which isolation path is available for GPU sandboxes. MicroVM-based isolation requires KVM to be present, and most cloud VMs do not expose /dev/kvm because the host hypervisor does not forward the hardware virtualization extensions to guests running on top of it.

Without nested virtualization, hardware-level GPU passthrough through microVMs is not available, and platforms need to use an alternative isolation approach.
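
A machine's available path can be probed directly. A small sketch, reading standard Linux interfaces (the nested parameter exists only where the KVM modules are loaded, so treat a missing file as inconclusive rather than a definite no):

```python
import os
from pathlib import Path

# /dev/kvm present means this machine can host microVMs at all.
print("KVM device:", "present" if os.path.exists("/dev/kvm") else "absent")

# On KVM hosts, the module parameter shows whether nested
# virtualization is enabled ("Y" or "1" on enabled setups).
for module in ("kvm_intel", "kvm_amd"):
    param = Path(f"/sys/module/{module}/parameters/nested")
    if param.exists():
        print(f"{module} nested:", param.read_text().strip())
```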

Northflank handles this constraint differently depending on the underlying infrastructure.

Path 1: microVM with GPU passthrough (nested virtualization available)

When nested virtualization is available, Northflank uses microVM-based isolation to run GPU workloads. The GPU is passed through into the sandboxed runtime, and the workload runs inside a hardware-isolated VM with its own kernel. This is the same execution model used for CPU workloads, with strong isolation guarantees. See Kata Containers vs Firecracker vs gVisor and microVM vs gVisor.
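
Northflank wires this up automatically, but the underlying Kubernetes primitives are worth seeing once: a hardware-isolated GPU pod pairs a runtime class with a device-plugin resource request. A hypothetical sketch using the official Kubernetes Python client, where the runtime class name, image, and namespace are illustrative placeholders rather than Northflank's actual configuration:

```python
from kubernetes import client, config

# A pod that runs under a Kata Containers runtime class (microVM
# isolation) and requests one NVIDIA GPU via the standard device
# plugin resource. All names here are placeholders.
pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="gpu-sandbox-demo"),
    spec=client.V1PodSpec(
        runtime_class_name="kata",  # selects the microVM runtime
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="workload",
                image="nvidia/cuda:12.4.0-base-ubuntu22.04",
                command=["nvidia-smi"],
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "1"}  # one passed-through GPU
                ),
            )
        ],
    ),
)

config.load_kube_config()
client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```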

Path 2: gVisor (nested virtualization unavailable)

When nested virtualization is unavailable, Northflank falls back to gVisor. GPU workloads run inside the sandboxed environment with access to the GPU, but the isolation boundary sits at the syscall level rather than at hardware virtualization. The deployment model is identical across both paths: the same APIs, the same workload definitions, the same platform. What changes is the isolation mechanism applied underneath.
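
Because the interface is the same on both paths, a workload can confirm GPU access the same way whether it landed on microVM passthrough or gVisor. A minimal smoke test, assuming the NVIDIA stack is exposed into the sandbox:

```python
import os
import subprocess

# NVIDIA device nodes should be visible under both isolation models:
# passed through into the microVM, or proxied by gVisor.
nodes = [p for p in ("/dev/nvidiactl", "/dev/nvidia0", "/dev/nvidia-uvm")
         if os.path.exists(p)]
print("NVIDIA device nodes:", nodes or "none found")

# nvidia-smi exercises the full driver path end to end.
subprocess.run(["nvidia-smi"], check=False)
```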

How Northflank runs GPU sandboxes

Northflank supports both CPU and GPU workloads in isolated sandbox environments, with the isolation model adapting to the underlying infrastructure as described above.

Sandboxes start in approximately 1 to 2 seconds and support both ephemeral and persistent environments. GPU workloads run on on-demand NVIDIA GPUs including L4, A100 (40GB and 80GB), H100, and H200, with self-service provisioning and no quota requests required.

The same platform also runs APIs, background workers, databases, and GPU inference alongside sandboxes, so teams are not managing separate tooling for each workload type.

BYOC (Bring Your Own Cloud) deployment is available self-serve across AWS, GCP, Azure, Civo, Oracle Cloud, CoreWeave, and on-premises. Teams with data residency requirements or compliance mandates can deploy GPU sandboxes inside their own VPC using the same isolation model as the managed cloud. Northflank has been running production workloads across startups, public companies, and government deployments since 2021.

Get started with GPU sandboxes on Northflank

Get started (self-serve), or book a session with an engineer if you have specific infrastructure or compliance requirements.

Which sandbox platforms support GPU workloads?

The table below covers GPU support within the sandboxed execution environment specifically.

| Platform | GPU support | Isolation model | Bring Your Own Cloud (BYOC) | Persistent environments |
| --- | --- | --- | --- | --- |
| Northflank | Yes | Kata Containers (microVM) / gVisor | Yes, self-serve across multiple clouds | Yes (ephemeral and persistent) |
| Modal | Yes | gVisor | No, managed only | Partial (snapshotting available) |
| E2B | No (CPU only) | Firecracker microVMs | Enterprise only (AWS, GCP) | Yes (pause/resume, indefinite retention) |
| Fly.io Sprites | No (CPU only) | Firecracker microVMs | No, managed only | Yes (persistent NVMe filesystem) |
| Vercel Sandbox | No (CPU only) | Firecracker microVMs | No, managed only | Beta |
| Blaxel | No (CPU only) | Firecracker microVMs | Custom (contact sales) | Yes (standby with state preserved) |

For deeper platform comparisons, see E2B vs Modal vs Fly.io Sprites, E2B vs Modal, and top BYOC AI sandboxes.

When do you need a GPU sandbox?

Not every GPU workload requires a sandboxed environment. A single-tenant inference service or a dedicated training job on reserved hardware does not need the isolation layer.

GPU sandboxes are relevant when the execution environment is shared and workloads are untrusted or user-submitted:

  • Platforms where multiple tenants submit GPU-accelerated code, and each workload needs isolation from the others
  • AI agents calling local inference or embedding generation rather than an external API
  • Platforms running user-submitted training jobs or reinforcement learning reward evaluations at scale
  • Code execution products that allow users to attach GPUs to notebooks or execution environments

Pricing comparison for sandbox platforms

Pricing at scale differs significantly across platforms. The table below shows the total cost for 200 concurrent sandboxes across PaaS and BYOC deployment models, based on an nf-compute-100-4 plan on an m7i.2xlarge infrastructure node. Pricing as of May 2026. Verify current rates on each platform's pricing page before making cost decisions.

| Model | Provider | Cloud cost | Sandbox vendor cost | Total |
| --- | --- | --- | --- | --- |
| PaaS | Northflank | — | $7,200.00 | $7,200.00 |
| PaaS | E2B | — | $16,819.20 | $16,819.20 |
| PaaS | Modal | — | $24,491.50 | $24,491.50 |
| PaaS | Fly Sprites | — | $35,770.00 | $35,770.00 |
| PaaS | Vercel Sandbox | — | $31,068.80 | $31,068.80 |
| BYOC (0.2 overcommit) | Northflank | $1,500.00 | $560.00 | $2,060.00 |
| BYOC | E2B | $1,500.00 | $10,000.00 | $11,500.00 |

The BYOC row for Northflank reflects a request modifier of 0.2. On BYOC plans, Northflank applies an overcommit so more sandboxes run on the same hardware. For example, with a modifier of 0.2, each sandbox requests 20% of its plan resources as a guaranteed minimum but can burst to the full limit when capacity is available, fitting 40 sandboxes per node instead of 8. This comparison covers CPU sandbox workloads. GPU workload costs depend on GPU type and usage pattern. See the Northflank pricing page for current GPU rates.
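
The arithmetic behind those numbers is straightforward to check. A quick sketch using the figures above (the 8-sandboxes-per-node baseline is the table's assumption, not a universal constant):

```python
baseline_per_node = 8    # sandboxes per node at full (1.0) requests
request_modifier = 0.2   # guaranteed share of each sandbox's plan

per_node = baseline_per_node / request_modifier
print(per_node)                    # 40.0 sandboxes per node

target = 200
print(target / baseline_per_node)  # 25.0 nodes without overcommit
print(target / per_node)           # 5.0 nodes with the 0.2 modifier
```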

Frequently asked questions about GPU sandbox isolation

Why do most sandbox providers not support GPU workloads?

Most sandbox providers use Firecracker microVMs, which do not support GPU passthrough. GPU passthrough requires VFIO binding, IOMMU configuration, and a VMM that supports PCIe device passthrough. Firecracker's minimal virtio device set does not include these capabilities.

What is the difference between gVisor and microVM isolation for GPU workloads?

MicroVM isolation passes the GPU into a hardware-isolated VM; the workload runs its own kernel with a hardware boundary separating it from the host. gVisor intercepts NVIDIA device calls in user space (its nvproxy layer) and forwards them to the host driver. Both provide GPU access inside a sandboxed environment; the isolation boundary differs.

Does nested virtualization affect which isolation model applies?

Yes. MicroVM-based GPU passthrough requires KVM, which requires bare metal or a host with nested virtualization enabled. Where nested virtualization is unavailable, gVisor is the fallback. The deployment interface is identical in both cases.

Can GPU sandboxes run inside my own cloud account?

Northflank supports self-serve BYOC deployment across AWS, GCP, Azure, Civo, Oracle Cloud, CoreWeave, and on-premises. GPU sandboxes can run inside your own VPC using the same isolation model as the managed cloud.

What GPU types are available for sandboxed workloads on Northflank?

Northflank provides on-demand access to NVIDIA L4, A100 (40GB and 80GB), H100, H200, and additional GPU types with self-service provisioning and no quota requests. See the Northflank GPU documentation for the current list.

