
Run GPU workloads in your cluster

You can run GPU workloads on Northflank in your own cloud account.

To get started you'll need a cloud account integrated with your team, on a provider that supports the GPUs you want to use. You can check the GPUs available on Northflank here.

Deploy a cluster and a GPU node pool

To run GPU workloads you must deploy a node pool on a cluster with GPU-enabled nodes. You can deploy GPU node pools on an existing cluster, or create a new one. The GPU nodes you want to use must be available in the region that your cluster is deployed in.

Click here to deploy a new cluster.

You should create one node pool of non-GPU nodes with at least 4 vCPU and 8GB memory to provision Northflank system components.

Next, create a node pool and select the type of GPU node you want to deploy. Most GPU node types are listed as accelerator optimised and you can type this category into the node type dropdown to filter nodes, or search by node type name.

Timeslicing

The number of GPUs available on each node varies by node type. Without timeslicing, a GPU can be used by only one workload at a time.

You can enable timeslicing to allow multiple workloads to schedule on an available GPU on a node, which has benefits and drawbacks detailed below. You can specify the number of slices to allow per GPU.

Scheduling rules

In addition to standard scheduling rules you can choose whether to allow non-GPU workloads to provision to GPU node pools. By default, GPU node pools are restricted to GPU workloads; you can make use of any available capacity on nodes by disabling this restriction in advanced settings. However, if your GPU nodes are filled to capacity with non-GPU workloads, GPU workloads will be unable to schedule.

Creating a node pool with GPU nodes on a cloud provider in the Northflank application

Allow multiple workloads to use a GPU with timeslicing

Without timeslicing each GPU workload will be allocated to one GPU, and the node will only be able to schedule as many workloads as there are GPUs on the node.

You can enable timeslicing to allow multiple GPU workloads to be scheduled per GPU, and set the number of slices to allow on each GPU. The number of slices defines how many workloads can share execution time on each GPU on the node.

| GPUs per node | Node count | Timeslicing enabled (number of slices) | GPU workloads that can be scheduled |
| --- | --- | --- | --- |
| 1 | 1 | No (1) | 1 |
| 1 | 1 | Yes (10) | 10 |
| 4 | 3 | No (1) | 12 |
| 4 | 3 | Yes (10) | 120 |
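The capacity shown in the table above can be sketched as a simple product: GPUs per node, times node count, times the number of slices per GPU (1 when timeslicing is disabled). This is an illustrative calculation, not Northflank code:

```python
def schedulable_gpu_workloads(gpus_per_node: int, node_count: int, slices: int = 1) -> int:
    """Maximum number of GPU workloads a node pool can schedule.

    With timeslicing disabled, slices is 1 and each workload
    occupies a whole GPU.
    """
    return gpus_per_node * node_count * slices

# Rows of the table above:
print(schedulable_gpu_workloads(1, 1))      # 1
print(schedulable_gpu_workloads(1, 1, 10))  # 10
print(schedulable_gpu_workloads(4, 3))      # 12
print(schedulable_gpu_workloads(4, 3, 10))  # 120
```

Note that this is an upper bound on scheduling, not a performance guarantee: as described below, timesliced workloads share execution time and compete for GPU memory.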

Each workload is guaranteed an equal share of GPU execution time, but there is no guarantee on GPU memory allocation per workload. Each scheduled workload will try to use as much of the GPU's resources as possible, depending on the number of GPU workloads deployed to the node and the resources each workload requests. Workloads schedule processes on the GPU, and a single workload may schedule multiple processes, so the number of processes running on a GPU and the number of workloads scheduled per GPU are not in a one-to-one ratio.

important

Timesliced workloads are not fault or memory isolated. For example, if one workload on a time-sliced GPU crashes due to a memory error, all workloads sharing the GPU will crash.

Configure workloads to deploy on GPU nodes

To allow workloads to be scheduled to node pools with GPUs, you must create them in a project that is deployed on your cluster with GPU node pools.

You can then enable GPU deployment in advanced resource options, in the resources section, when you create a new service or job, or when you update an existing one.

After enabling GPU deployment, you must select the type of GPU to deploy the workload to. You must also choose whether to allow the workload to use a node pool with timeslicing enabled.
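If you manage workloads programmatically rather than through the UI, the same two choices (GPU type and timeslicing preference) would appear in your deployment settings. The sketch below is purely hypothetical: the field names (`gpu`, `enabled`, `configuration`, `gpuType`, `timeslicing`) and the example GPU type name are assumptions for illustration, not the documented Northflank API schema — consult the Northflank API reference for the actual shape.

```python
# Hypothetical sketch only: these field names are assumptions for
# illustration and may not match the real Northflank API schema.
import json


def gpu_deployment_settings(gpu_type: str, allow_timeslicing: bool) -> dict:
    # Mirrors the UI choices described above: pick a GPU type and
    # decide whether the workload may use a timesliced node pool.
    return {
        "gpu": {
            "enabled": True,
            "configuration": {
                "gpuType": gpu_type,  # e.g. "nvidia-a100" (assumed name)
                "timeslicing": allow_timeslicing,
            },
        }
    }


print(json.dumps(gpu_deployment_settings("nvidia-a100", True), indent=2))
```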

If no available nodes of the chosen type match the workload's timeslicing preference, your workload will not be scheduled until capacity becomes available on a matching node. You can gain more capacity by scaling up existing node pools, creating new ones, or scaling down other workloads that use the relevant node type.

GPU deployment options will only appear in a service or job if the project you are working in is deployed on a cluster with GPU node pools.

Enabling a resource to use GPU nodes on a cloud provider in the Northflank application

You can use labels and tags to further refine which node pools your workloads can be deployed to.

© 2024 Northflank Ltd. All rights reserved.