
How to deploy a model on Scaleway Managed Inference

Reviewed on 23 September 2024
Published on 06 March 2024

Before you start

To complete the actions presented below, you must have:

  • A Scaleway account logged into the console
  • Owner status or IAM permissions allowing you to perform actions in the intended Organization
  1. Click the AI & Data section of the Scaleway console, and select Managed Inference from the side menu to access the Managed Inference dashboard.
  2. Click Deploy a model to launch the model deployment wizard.
  3. Provide the necessary information:
    • Select the model and quantization to use for your deployment from the available options.
      Note

      Some models may require acceptance of an end-user license agreement. If prompted, review the terms and conditions and accept the license accordingly.

    • Choose the geographical region for the deployment.
    • Specify the GPU Instance type to be used with your deployment.
  4. Enter a name for the deployment, and optionally add tags.
  5. Configure the network settings for the deployment:
    • Enable Private Network for secure communication and restricted availability within Private Networks. Choose an existing Private Network from the drop-down list, or create a new one.
    • Enable Public Network to access resources via the public internet. Token protection is enabled by default.
    Important
    • Enabling both private and public networks will result in two distinct endpoints (public and private) for your deployment.
    • Deployments must have at least one endpoint, either public or private.
  6. Click Deploy model to launch the deployment process. Once the model is ready, it will be listed among your deployments.
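Once the deployment is ready, you can send it a test request to confirm the endpoint answers. The snippet below is a minimal sketch, assuming an LLM deployment with a public, token-protected endpoint that accepts OpenAI-compatible chat completion requests; the endpoint URL and model name are placeholders to be replaced with the values shown on your deployment's overview page, and the request authenticates with an IAM API key secret.

```python
import os
import requests

# Placeholders (assumptions): copy the real endpoint URL and model name
# from your deployment's overview page in the Scaleway console.
ENDPOINT_URL = "https://<your-deployment-endpoint>/v1/chat/completions"
MODEL_NAME = "<model-name>"  # the model selected in step 3

# IAM API key secret, used because token protection is enabled by default
# on public endpoints.
API_KEY = os.environ["SCW_SECRET_KEY"]

# OpenAI-compatible chat completion payload (assumes an LLM deployment).
payload = {
    "model": MODEL_NAME,
    "messages": [
        {"role": "user", "content": "Hello, what can you do?"}
    ],
    "max_tokens": 128,
}

response = requests.post(
    ENDPOINT_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```

If you enabled only a Private Network endpoint in step 5, run the same request from a resource attached to that Private Network, using the private endpoint URL shown in the console.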
See also
How to monitor a deployment