How to use the NVIDIA GPU operator on Kapsule and Kosmos with GPU Instances

Reviewed on 03 December 2024 • Published on 18 July 2023

Kubernetes Kapsule and Kosmos support NVIDIA’s official Kubernetes operator for all GPU pools. This operator is compatible with the RENDER-S, GPU-3070-S, H100 PCIe, L40S, and L4 offers.

The GPU operator is set up for all GPU pools created in Kubernetes Kapsule and Kosmos. It automates the installation of all required software on GPU worker nodes, such as the device plugin, the container toolkit, and the GPU drivers. For more information, refer to the GPU operator overview.

Before you start

To complete the actions presented below, you must have:

A Scaleway account logged into the console
Owner status or IAM permissions allowing you to perform actions in the intended Organization
Created a Kubernetes Kapsule or Kosmos cluster

How to get the GPU operator for a new pool?

Scaleway uses Helm to automate the deployment of the GPU operator in your GPU node pools. The operator is installed by default on every GPU pool.

Click Kubernetes in the Containers section of the side menu. The Kubernetes creation page displays.
From the drop-down menu, select the geographical region you want to manage.
Select the cluster you want to add a pool to.
Click the Pools tab.
Click the + Add pool button. The pool creation wizard displays.
If you are using a Kosmos cluster, you can optionally choose a pool type. Select a Scaleway Kubernetes Kapsule pool.
Choose the zone in which your pool will be deployed.
Click the GPU tab and select the GPU Instance you want to add.
Configure the pool options for your pool.
Click Add pool to deploy the pool. The GPU operator displays in the Easy Deploy tab of your pool and your kube-system namespace.

How to activate the GPU operator on existing node pools

To deploy the GPU operator on an existing pool, replace the existing nodes of that pool.

Important

The GPU Operator installs the drivers shortly after node creation.

If your workload is scheduled on a new node before this installation completes, it will be missing essential components such as the drivers. To avoid this, add a Kubernetes node selector to your workload:

spec:
  nodeSelector:
    nvidia.com/gpu.present: "true"

Alternatively, declare the specific hardware your workload requires:

spec:
  containers:
    - name: gpu-workload
      image: "rg.fr-par.scw.cloud/my-namespace/gpu-image:v1.0"
      resources:
        limits:
          nvidia.com/gpu: 1
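
The two fragments above can also be combined in a single manifest. The following is a minimal sketch of a complete Pod, assuming a hypothetical Pod name (`gpu-workload`) and reusing the placeholder image from the example above. Node selector values must be strings in Kubernetes, which is why `"true"` is quoted.

```yaml
# Minimal sketch of a Pod that only runs on GPU nodes.
# The Pod name and image are placeholders, not real resources.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-workload
spec:
  # Schedule only on nodes that carry the GPU label.
  nodeSelector:
    nvidia.com/gpu.present: "true"   # label values are strings, so keep the quotes
  containers:
    - name: gpu-workload
      image: "rg.fr-par.scw.cloud/my-namespace/gpu-image:v1.0"
      resources:
        limits:
          # Request one GPU; the Pod stays Pending until a node exposes this resource.
          nvidia.com/gpu: 1
```

Because the `nvidia.com/gpu` resource is only advertised on a node once the device plugin is running there, the resource limit by itself already keeps the Pod from starting on a node that is not ready. The node selector makes the intent explicit.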

How to edit the configuration of the GPU operator

The GPU operator on your Scaleway node pools is fully configurable, either through the Easy Deploy feature directly in the Scaleway console, or by using Helm.

Click Kubernetes in the Containers section of the side menu. The Kubernetes creation page displays.
From the drop-down menu, select the geographical region you want to manage.
Select the cluster you want to configure.
Click the Easy Deploy tab.
Click the See more icon > Edit next to the GPU operator deployment. A pop-up displays.
Edit the YAML configuration of the deployment to match your desired configuration.
Tip
Refer to the official NVIDIA documentation for a list of available Helm configuration options.
Click Update and deploy to update and deploy the configuration of the GPU operator.
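
As an illustration of what the YAML configuration edited in the steps above might contain, here is a short sketch of Helm values. It assumes the top-level keys of the upstream NVIDIA GPU operator chart (`driver`, `toolkit`, `devicePlugin`, `dcgmExporter`). The values shown are examples only and are not necessarily the defaults applied to Scaleway pools, so check the NVIDIA documentation and your current deployment before changing them.

```yaml
# Illustrative values only; not the actual Scaleway defaults.
driver:
  enabled: true        # let the operator install the GPU drivers on the nodes
toolkit:
  enabled: true        # deploy the NVIDIA container toolkit
devicePlugin:
  enabled: true        # advertise the nvidia.com/gpu resource to Kubernetes
dcgmExporter:
  enabled: false       # example change: turn off the DCGM metrics exporter
```

Keeping the edit limited to the keys you want to change makes it easier to review what differs from the deployed configuration.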