

100% hosted in Europe

Best for AI-powered applications requiring low latency, full data privacy, and 100% compliance with the EU AI Act.

Interoperable technologies

No need to reinvent the wheel, change your code, or read docs forever. Your current OpenAI library or LangChain SDK works just fine!

Extremely cost efficient

Priced per million tokens, our APIs will start as low as €0.022 – up to 10x cheaper than our beloved American hyperscalers.

Everything you need to create apps with Generative AI

Exceptional developer experience meets best-in-class AI

Try all models for free

Scaleway offers a free playground allowing you to quickly experiment with different AI models. Once satisfied with the responses, simply export the payload and replicate at scale!

Go to the Generative APIs Playground
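
For illustration, an exported playground payload can be replayed with any HTTP client. A minimal sketch, assuming the OpenAI-compatible endpoint and the model name from the code example further down this page; the payload fields are whatever your export contains:

# A minimal sketch: replaying a payload exported from the playground.
# Endpoint and model mirror the chat-completion example below; adjust to your export.
import os
import requests

payload = {
    "model": "mistral-nemo-instruct-2407",
    "messages": [{"role": "user", "content": "Summarize this ticket in one line."}],
}

response = requests.post(
    "https://api.scaleway.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['SCW_API_KEY']}"},
    json=payload,
)
print(response.json()["choices"][0]["message"]["content"])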

Open weight FTW

Scaleway supports the distribution of cutting-edge open-weight models, whose performance in reasoning and features now rivals that of proprietary models like GPTx or Claude.

Find supported models

Low latency

End users in Europe will love getting the first tokens streamed in under 200 ms, ideal for interactive dialogue and agentic workflows, even at high context lengths.

Send your first API request
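
To see time-to-first-token for yourself, the OpenAI-compatible endpoint can stream the response chunk by chunk. A minimal sketch, reusing the client setup from the code example further down this page:

# A minimal streaming sketch; client setup matches the example below.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("SCW_API_KEY"),
    base_url="https://api.scaleway.ai/v1",
)

# stream=True yields tokens as they are generated instead of one final message
stream = client.chat.completions.create(
    model="mistral-nemo-instruct-2407",
    messages=[{"role": "user", "content": "Explain token streaming in one paragraph."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)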

Structured outputs

Our built-in JSON mode or JSON schema can distill and transform the diverse unstructured outputs of LLMs into actionable, reliable, machine-readable structured data.

How to use structured outputs
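
A minimal JSON-mode sketch, assuming the OpenAI-compatible response_format parameter (see the guide linked above for per-model JSON-schema support):

# Ask the model to answer with valid JSON only, then parse it.
import json
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("SCW_API_KEY"),
    base_url="https://api.scaleway.ai/v1",
)

completion = client.chat.completions.create(
    model="mistral-nemo-instruct-2407",
    messages=[
        {"role": "system", "content": "Extract the city and temperature as JSON."},
        {"role": "user", "content": "It is 19 degrees in Paris today."},
    ],
    response_format={"type": "json_object"},  # built-in JSON mode
)

data = json.loads(completion.choices[0].message.content)
print(data)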

Native function calling

Generative AI models served at Scaleway can connect to external tools. Integrate LLMs with custom functions or APIs, and easily build applications able to interface with external systems.
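
A minimal sketch following the OpenAI tools convention; get_weather is a hypothetical helper implemented on your side, not a Scaleway API:

# Describe an external tool the model may choose to call.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("SCW_API_KEY"),
    base_url="https://api.scaleway.ai/v1",
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical function, not a Scaleway API
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

completion = client.chat.completions.create(
    model="mistral-nemo-instruct-2407",
    messages=[{"role": "user", "content": "What is the weather in Paris?"}],
    tools=tools,
)

# If the model decided to call the tool, inspect the structured call
tool_calls = completion.choices[0].message.tool_calls
if tool_calls:
    print(tool_calls[0].function.name, tool_calls[0].function.arguments)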

Production-grade

Bridge the gap between prototype and production with a platform engineered for scale. Scaleway's inference stack runs on highly secure, reliable infrastructure in Paris. Not bad, right? It's French.

Read our security measures

Towards a sovereign AI where your data remains yours, and only in Europe.

Designed as a drop-in replacement for the OpenAI APIs

# Import modules
import os
from openai import OpenAI

# Initialize the OpenAI client, pointed at Scaleway's endpoint
client = OpenAI(
    api_key=os.environ.get("SCW_API_KEY"),
    base_url="https://api.scaleway.ai/v1",
)

# Create a chat completion request
completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Sing me a song about Xavier Niel"
        }
    ],
    model="mistral-nemo-instruct-2407",
)

# Print the model's reply
print(completion.choices[0].message.content)

Get started with tutorials

Frequently asked questions

What is Scaleway Generative APIs?

Generative APIs is Scaleway's fully managed service that makes frontier AI models from leading research labs available via a simple API call.

How can I get access to Scaleway Generative APIs?

Access to this service is restricted while in beta. You can request access by filling out the form on Scaleway's betas page.

What is the pricing of Scaleway Generative APIs?

The service is free of charge while in beta. Once generally available, Generative APIs will use pay-as-you-go pricing, i.e. pay per token: your consumption will be charged per million input and output tokens.
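
As a back-of-the-envelope illustration at the advertised starting rate of €0.022 per million tokens (actual per-model prices, and input versus output rates, may differ):

# A rough cost sketch at the starting rate quoted above; per-model rates may differ.
price_per_million_eur = 0.022
input_tokens = 800_000
output_tokens = 200_000
cost_eur = (input_tokens + output_tokens) / 1_000_000 * price_per_million_eur
print(f"{cost_eur:.4f} EUR")  # 0.0220 EUR for one million tokens total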

Where are Scaleway's inference servers located?

We currently host all models in a secure data center located in Paris, France. This may change in the future.

Can I use the OpenAI libraries and APIs?

Scaleway lets you seamlessly transition applications already using OpenAI. You can use any of the official OpenAI libraries, for example the OpenAI Python client library or the Azure OpenAI SDK, to interact with Scaleway Generative APIs. See the documentation for the supported APIs and parameters.

What is the difference with Scaleway Managed Inference?
  • Scaleway Generative APIs is a serverless service. This is most likely the easiest way to get started: we have set up the hardware, so you only pay per token/file and don't wait for boot-ups.

  • Scaleway Managed Inference, on the other hand, is meant to deploy curated models or your own models, with the quantization and instances of your choice. You get predictable throughput, as well as custom security: isolation in your private network, access control…

Both AI services offer text and multi-modal (image understanding) models, OpenAI compatibility and important capabilities like structured outputs.

This page has ended, but the opportunities with AI are boundless.
