Understanding NVIDIA NVLink

Reviewed on 13 March 2025 · Published on 13 March 2025

NVLink is NVIDIA’s high-bandwidth, low-latency GPU-to-GPU interconnect with built-in resiliency features, available on Scaleway’s H100-SGX Instances. It was designed to significantly improve performance and efficiency when connecting GPUs, CPUs, and other components within the same node. It provides much higher bandwidth (up to 900 GB/s of total GPU-to-GPU bandwidth in an 8-GPU configuration) and lower latency than traditional PCIe Gen 5 (up to 128 GB/s bidirectional for an x16 link), so more data can be transferred between GPUs in less time.

The high bandwidth and low latency make NVLink ideal for applications that require real-time data synchronization and processing, such as AI and HPC workloads. NVLink provides up to 900 GB/s of total bandwidth for multi-GPU I/O and shared memory accesses, roughly 7x the bandwidth of PCIe Gen 5. It also allows direct GPU-to-GPU interconnection, improving data transfer efficiency and reducing the need for CPU intervention, which can introduce bottlenecks.
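The 7x figure can be checked with back-of-the-envelope arithmetic. A minimal sketch, assuming NVIDIA's published NVLink 4.0 numbers for the H100 (18 links per GPU at 50 GB/s bidirectional each) and PCI-SIG's Gen 5 signaling rate (32 GT/s per lane, encoding overhead ignored for this rough estimate):

```python
# NVLink 4.0 (H100): 18 links per GPU, 50 GB/s bidirectional per link.
nvlink_links = 18
nvlink_gbps_per_link = 50  # GB/s, bidirectional
nvlink_total = nvlink_links * nvlink_gbps_per_link  # 900 GB/s

# PCIe 5.0 x16: 32 GT/s per lane, 16 lanes, 8 transfers per byte
# (128b/130b encoding overhead ignored for this rough estimate).
pcie_gtps_per_lane = 32
pcie_lanes = 16
pcie_per_direction = pcie_gtps_per_lane * pcie_lanes / 8  # 64 GB/s
pcie_bidirectional = 2 * pcie_per_direction               # 128 GB/s

print(f"NVLink 4.0 aggregate:       {nvlink_total} GB/s")
print(f"PCIe 5.0 x16 bidirectional: {pcie_bidirectional:.0f} GB/s")
print(f"Ratio: {nvlink_total / pcie_bidirectional:.1f}x")
```

Running this prints a ratio of about 7.0, matching the figure quoted above.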

NVLink supports the connection of multiple GPUs, enabling the creation of powerful multi-GPU systems capable of handling more complex and demanding workloads. Unified Memory Access allows GPUs to access each other’s memory directly without CPU mediation, which is particularly beneficial for large-scale AI and HPC workloads.
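On a running Instance, `nvidia-smi topo -m` prints a connectivity matrix showing how each GPU pair is linked: entries such as `NV18` indicate 18 NVLinks between the pair, while `PIX` or `SYS` indicate PCIe paths. A minimal sketch of parsing such a matrix; the sample output is illustrative, not captured from a real H100-SGX Instance, and real output includes extra columns (CPU/NUMA affinity) that a production parser would need to handle:

```python
def parse_topo_matrix(topo_output: str) -> dict:
    """Parse an `nvidia-smi topo -m` style matrix into {(row_gpu, col_gpu): link_type}."""
    lines = topo_output.strip("\n").splitlines()
    header = lines[0].split()  # column labels: GPU0, GPU1, ...
    links = {}
    for line in lines[1:]:
        fields = line.split()
        if not fields or not fields[0].startswith("GPU"):
            continue  # skip legend lines
        row = fields[0]
        for col, entry in zip(header, fields[1:]):
            if row != col:  # skip the diagonal ("X" = self)
                links[(row, col)] = entry
    return links


# Illustrative 2-GPU matrix (hypothetical, simplified output).
SAMPLE = "\tGPU0\tGPU1\nGPU0\tX\tNV18\nGPU1\tNV18\tX\n"
links = parse_topo_matrix(SAMPLE)
nvlinked = {pair for pair, link in links.items() if link.startswith("NV")}
print(nvlinked)  # pairs connected via NVLink
```

Pairs whose entry starts with `NV` communicate over NVLink; anything else falls back to a PCIe path.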

NVLink and PCI Express (PCIe) are both used for GPU communication, but NVLink is specifically designed to address the bandwidth and latency bottlenecks of PCIe in multi-GPU setups.

| Feature | NVLink 4.0 (H100-SGX) | PCIe 5.0 |
|---|---|---|
| Use case | High-performance computing, deep learning | General-purpose computing, graphics |
| Bandwidth | Up to 900 GB/s (aggregate, multi-GPU) | 128 GB/s (x16, bidirectional) |
| Latency | Lower than PCIe (sub-microsecond) | Higher than NVLink |
| Communication | Direct GPU-to-GPU | Through CPU or PCIe switch |
| Memory sharing | Unified memory space across GPUs | Requires CPU intervention (higher overhead) |
| Scalability | Multi-GPU direct connection via NVSwitch | Limited by PCIe lanes |
| Efficiency | Optimized for GPU workloads | More general-purpose |

In summary, NVLink, available on H100-SGX Instances, is superior for multi-GPU AI and HPC workloads due to its higher bandwidth, lower latency, and memory-sharing capabilities, while PCIe remains essential for broader system connectivity and general computing.
