Fine-tune models like LLaMA 2
Optimize Transformer models and LLMs with efficient workflows, and accelerate the training of larger models with cutting-edge 4th-generation Tensor Core technology and the new FP8 8-bit data format (a minimal FP8 training sketch follows this feature list).
Accelerate your model training and inference with one of the most advanced AI chips on the market!
Accelerate your model-serving workloads with the Transformer Engine, which delivers up to 30x faster AI inference, and with new data formats.
With 2nd-generation secure Multi-Instance GPU (MIG), partition the GPU into isolated, right-sized instances to maximize utilization for everything from the smallest jobs to the biggest multi-GPU workloads.
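As a rough illustration of FP8 training on the H100, here is a minimal sketch assuming NVIDIA's Transformer Engine library (the `transformer-engine` Python package) is installed; the layer size, recipe settings, and toy loss are illustrative, not a reference implementation.

```python
# Minimal FP8 training step with NVIDIA Transformer Engine on an H100.
# Assumes the `transformer-engine` package is installed; sizes are illustrative.
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

model = te.Linear(768, 768, bias=True).cuda()  # TE layers support FP8 execution
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# DelayedScaling is Transformer Engine's standard FP8 scaling recipe.
fp8_recipe = recipe.DelayedScaling(margin=0, fp8_format=recipe.Format.E4M3)

inp = torch.randn(32, 768, device="cuda")

# The forward pass runs in FP8 inside this context; gradients and the
# optimizer step remain in higher precision.
with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    out = model(inp)

loss = out.float().pow(2).mean()  # toy loss for the sketch
loss.backward()
optimizer.step()
```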
Specification | Value |
---|---|
GPU | NVIDIA H100 PCIe Tensor Core |
GPU Memory | 80 GB HBM2e |
Processor | 24 vCPUs AMD EPYC (Zen 4) |
Processor frequency | 2.7 GHz |
Memory | 240 GB of RAM |
Memory type | DDR5 |
Bandwidth | 10 Gbps |
Storage | Block Storage for boot, plus 3 TB of NVMe Scratch Storage |
DC5 is one of Europe's greenest data centers, powered entirely by renewable wind and hydro energy (GO-certified) and cooled with ultra-efficient free and adiabatic cooling. With a PUE of 1.16 (vs. the 1.55 industry average), it slashes energy use by 30-50% compared to traditional data centers.
WAW2 runs on 100% wind power (GO-certified) and uses a combination of direct free cooling, free chilling, immersion systems, and air conditioning to optimize system cooling. With a PUE of 1.32—better than the industry average—it minimizes energy consumption for maximum efficiency.
"Execution difference is 40% in favor of using the H100 PCIe GPUs"
Sovereign AI specialists Golem.ai took a deep technical dive into the topic and shared their findings on our blog. “After running a hundred tests in total comparing Replicate.com with NVIDIA's H100 hosted by Scaleway, we conclude that the execution difference is 40% in favor of using the H100s,” says Golem.ai’s Kevin Baude.
Natural language processing (NLP): understands, interprets, and generates human language in a way that is both meaningful and contextually relevant.
Speech recognition: converts spoken language into written text, turning verbal communication into machine-readable data.
Generative AI: generates new content such as images, text, audio, and code. It autonomously produces novel, coherent outputs, expanding AI-generated content beyond replication or prediction.
Computer vision: enables machines to interpret and understand visual information from the world, much like human vision.
Recommender systems: predicts and suggests items of interest to users based on their preferences and behaviors, enhancing personalized recommendations and decision-making in a variety of applications. One example is the Deep Learning Recommendation Model v2 (DLRMv2), which employs DCNv2 cross layers and a multi-hot dataset synthesized from the Criteo dataset; a sketch of such a cross layer follows.
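As an illustration of the cross-interaction idea behind DCNv2, here is a minimal PyTorch sketch of one cross layer; the class name, dimensions, and layer count are ours, not the reference DLRMv2 implementation.

```python
import torch
import torch.nn as nn

class CrossLayerV2(nn.Module):
    """One DCNv2-style cross layer: x_next = x0 * (W @ x + b) + x."""
    def __init__(self, dim: int):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, x0: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        # Element-wise interaction between the original features x0 and the
        # transformed current features, with a residual connection.
        return x0 * self.linear(x) + x

# Illustrative usage: stack three cross layers over a dense feature vector.
dim, batch = 64, 8
x0 = torch.randn(batch, dim)
x = x0
for layer in [CrossLayerV2(dim) for _ in range(3)]:
    x = layer(x0, x)
print(x.shape)  # torch.Size([8, 64])
```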
Instance name | Number of GPUs | FP16 Tensor Core performance | VRAM | Price until June 30 | Price from July 1 |
---|---|---|---|---|---|
H100-1-80GB | 1x H100 PCIe Tensor Core | Up to 1,513 teraFLOPS | 80 GB | €2.52/hour | €2.73/hour |
H100-2-80GB | 2x H100 PCIe Tensor Core | Up to 3,026 teraFLOPS | 2 x 80 GB | €5.04/hour | €5.46/hour |
Benefit from a ready-to-use Ubuntu image to launch your favorite deep learning containers (NVIDIA driver and Docker environment pre-installed).
Easily launch JupyterLab or your favorite notebook environment thanks to the pre-installed Docker environment.
Access multiple container registries: your own built containers, Scaleway AI containers, the NVIDIA NGC catalog, or any other registry (see the sketch after this list).
Access hundreds of AI software packages optimized by NVIDIA and tested by industry leaders, to maximize the efficiency of your GPUs and boost your productivity.
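For example, an NGC container can be started with GPU access from the pre-installed Docker environment; here is a minimal sketch using the Docker SDK for Python (`pip install docker`), where the image tag is an example, not a required version.

```python
# Sketch: run an NVIDIA NGC PyTorch container with all GPUs attached,
# using the Docker SDK for Python. The image tag is an example only.
import docker

client = docker.from_env()

container = client.containers.run(
    "nvcr.io/nvidia/pytorch:24.01-py3",  # example NGC image tag
    command="python -c 'import torch; print(torch.cuda.get_device_name(0))'",
    device_requests=[
        # Equivalent to `docker run --gpus all`.
        docker.types.DeviceRequest(count=-1, capabilities=[["gpu"]])
    ],
    detach=True,
)
container.wait()                  # block until the container exits
print(container.logs().decode())  # e.g. "NVIDIA H100 PCIe"
```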
3 TB of Scratch Storage is included in the instance price, but any Block Storage you provision is at your own expense.
Because Scratch Storage is ephemeral and its contents are lost when the machine is powered off, we strongly recommend provisioning an extra Block Storage volume for any data you need to keep. The purpose of Scratch Storage is to speed up the transfer of your datasets to the GPU.
How do you use Scratch Storage? Follow the guide.
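As a minimal sketch, assuming Scratch Storage is mounted at `/scratch` and a Block Storage volume at `/mnt/block-storage` (both paths are assumptions; check the guide for your Instance's actual layout), staging a dataset could look like this:

```python
# Stage a dataset onto ephemeral NVMe Scratch Storage before training.
# Paths are assumptions; adapt them to your Instance's actual mounts.
import shutil
from pathlib import Path

src = Path("/mnt/block-storage/datasets/my-dataset")  # durable Block Storage
dst = Path("/scratch/my-dataset")                     # fast, ephemeral NVMe

if not dst.exists():
    shutil.copytree(src, dst)

# Point your data loader at the NVMe copy for faster reads, and write
# checkpoints back to Block Storage, since /scratch is wiped on power-off.
```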
These are two sizes of the same instance type, both built around the NVIDIA H100 PCIe Tensor Core GPU.
NVIDIA announced the H100 to enable companies to slash costs for deploying AI, "delivering the same AI performance with 3.5x more energy efficiency and 3x lower total cost of ownership, while using 5x fewer server nodes over the previous generation."
What inside the product backs up this claim?
In addition, at Scaleway we decided to host our H100 PCIe Instances in the adiabatic data center DC5. With a PUE (Power Usage Effectiveness, the ratio of total facility energy to IT equipment energy) of 1.16, versus an industry average of around 1.55, this data center saves between 30% and 50% of electricity compared with a conventional data center.
Stay tuned for our benchmarks on the topic!
NVIDIA Multi-Instance GPU (MIG) is a technology introduced by NVIDIA to improve the utilization and flexibility of its data center GPUs, particularly in virtualization and multi-tenant environments. It allows a single physical GPU to be partitioned into up to seven smaller instances, each operating as an independent MIG partition with its own dedicated resources, such as memory, cache, and compute cores.
Read the dedicated documentation to use MIG technology on your GPU instance
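As a minimal sketch, once MIG partitions exist, a CUDA application can target one of them by UUID through `CUDA_VISIBLE_DEVICES`; the UUID below is a placeholder, so list the real ones with `nvidia-smi -L` on your Instance.

```python
# Target a single MIG partition from PyTorch. MIG instances are addressed
# by UUID; the value below is a placeholder, not a real identifier.
import os

os.environ["CUDA_VISIBLE_DEVICES"] = "MIG-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"

import torch  # import after setting the variable so CUDA sees only that partition

print(torch.cuda.device_count())      # 1: only the selected MIG instance is visible
print(torch.cuda.get_device_name(0))  # reports the parent H100
```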
There are many criteria to take into account when choosing the right GPU instance. For more guidance, read the dedicated documentation on that topic.