100% hosted in Europe
Best for AI-powered applications requiring low latency, full data privacy, and 100% compliance with the EU AI Act.
Serverless endpoints to use the latest AI models via API. Towards a sovereign AI where your data remains yours, and only in Europe.
No need to reinvent the wheel, change your code, or read docs forever. Your current OpenAI library or Langchain SDK is just fine!
Priced per million tokens, our APIs will start as low as 0,022 EUR – up to 10X cheaper than our beloved American hyperscalers.
Provide exclusive and up-to-date information to your generative AI model using Retrieval-Augmented Generation (RAG), a technique that involves retrieving data from enterprise data sources and enriching the prompt with this data for more relevant and accurate answers.
RAG is easy with Scaleway: embeddings, vector database, Langchain - we've got you covered. Here's your step by step guide.
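To make the RAG flow concrete, here is a minimal sketch of the retrieval-and-enrichment step. It is deliberately self-contained: the document embeddings are toy vectors (in production they would come from an embeddings endpoint and live in a vector database), and retrieval uses plain cosine similarity rather than a real vector store.

```python
import math

# Toy document store with precomputed embeddings. In a real RAG setup these
# vectors would be produced by an embedding model and stored in a vector
# database; here they are hard-coded for illustration.
documents = [
    ("Our Paris datacenter opened in 2012.", [0.9, 0.1, 0.0]),
    ("Refunds are processed within 14 days.", [0.1, 0.9, 0.1]),
]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_embedding, top_k=1):
    """Return the top_k document texts most similar to the query embedding."""
    ranked = sorted(documents, key=lambda d: cosine(query_embedding, d[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]

# Pretend this vector is the embedding of "How long do refunds take?"
query_embedding = [0.2, 0.8, 0.1]
context = retrieve(query_embedding)

# Enrich the prompt with the retrieved context before sending it to the LLM
prompt = f"Context:\n{context[0]}\n\nQuestion: How long do refunds take?"
```

The enriched `prompt` is then sent as a normal chat completion, so the model answers from your data rather than from its training set alone.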
Scaleway Generative APIs enable frontier models to perform multi-step tasks using your organization's systems and data sources. Whether it's answering customer inquiries about a delivery status or processing bookings, these models can be granted secure access to your APIs through Serverless Functions. An autonomous agent interprets the user’s request and automatically triggers the required APIs and databases to complete the task.
Create LLM-based, multimodal assistants (copilots, chatbots, etc.) that understand user requests, automatically break down tasks, engage in dialogue to gather information, and boost productivity across a wide range of tasks. Translate languages, summarize content, analyze sentiment, answer questions... you name it.
Traditional OCR models struggle with tasks that require understanding both text and visuals, but the multimodal vision-language models (VLMs) available through Scaleway Generative APIs bridge this gap. VLMs are ideal for real-world applications like scanned documents and technical diagrams, making them a powerful toolkit for mixed-content processing.
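Calling a VLM looks just like a text-only chat completion, except the message content mixes text and image parts. Below is a sketch of the OpenAI-compatible message structure; the image URL and model name are illustrative, so check the Scaleway console for the VLMs actually available.

```python
# Illustrative image URL; replace with your own document scan or diagram.
image_url = "https://example.com/scanned-invoice.png"

# An OpenAI-style multimodal message: a list of content parts mixing
# text and image_url entries.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Extract the total amount from this invoice."},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }
]

# The request itself follows the same pattern as a text-only completion,
# e.g. client.chat.completions.create(model=..., messages=messages),
# using a vision-capable model from the Scaleway catalog.
```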
Analyze call and video recordings securely to identify sentiment, mood, risks, and needs. Speech-to-text capabilities offered by Scaleway Generative APIs, combined with powerful LLMs, already help telecom giants improve quality of service while providing agents with highly valuable insights.
Scaleway offers a free playground allowing you to quickly experiment with different AI models. Once satisfied with the responses, simply export the payload and replicate at scale!
Scaleway supports the distribution of cutting-edge open-weight models, whose performance in reasoning and features now rivals that of proprietary models like GPTx or Claude.
End-users in Europe will love a response time below 200ms to get the first tokens streamed, ideal for interactive dialog and agentic workflows even at high context lengths.
Our built-in JSON mode or JSON schema can distill and transform the diverse unstructured outputs of LLMs into actionable, reliable, machine-readable structured data.
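As a sketch of JSON mode, the OpenAI-compatible switch is the `response_format` parameter on the chat completion request; verify it against the list of parameters Scaleway supports. The snippet below builds such a request and parses a sample model reply locally, since with JSON mode enabled the reply is valid JSON.

```python
import json

# Request parameters for a JSON-mode chat completion. The model name is
# taken from the earlier example; response_format={"type": "json_object"}
# is the OpenAI-compatible way to request structured output.
request_kwargs = {
    "model": "mistral-nemo-instruct-2407",
    "messages": [
        {"role": "system", "content": "Reply in JSON with keys 'sentiment' and 'score'."},
        {"role": "user", "content": "The delivery was fast and the packaging flawless."},
    ],
    "response_format": {"type": "json_object"},
}

# These kwargs would be passed to client.chat.completions.create(**request_kwargs).
# With JSON mode enabled, the model's reply parses directly; here we use a
# sample output to show the downstream handling:
sample_output = '{"sentiment": "positive", "score": 0.94}'
parsed = json.loads(sample_output)
```

Because the output is machine-readable, it can feed straight into analytics pipelines or databases without brittle string parsing.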
Generative AI models served at Scaleway can connect to external tools. Integrate LLMs with custom functions or APIs, and easily build applications able to interface with external systems.
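Here is a minimal sketch of function calling. The tool schema follows the OpenAI-compatible format; `get_delivery_status` is a hypothetical helper standing in for your own API, and the tool-call arguments are shown as the model would return them.

```python
import json

# OpenAI-compatible tool definition describing a function the model may call.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_delivery_status",
            "description": "Look up the delivery status of an order.",
            "parameters": {
                "type": "object",
                "properties": {"order_id": {"type": "string"}},
                "required": ["order_id"],
            },
        },
    }
]

def get_delivery_status(order_id: str) -> str:
    # Hypothetical backend lookup; replace with a call to your own system.
    return f"Order {order_id} is out for delivery."

# Pass tools=tools to client.chat.completions.create(...). When the model
# decides to call the function, it returns the arguments as a JSON string,
# which you parse and dispatch like this:
tool_call_arguments = '{"order_id": "A1234"}'  # as returned by the model
args = json.loads(tool_call_arguments)
result = get_delivery_status(**args)
```

The `result` is then sent back to the model as a tool message so it can compose the final answer for the user.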
Bridge the gap between prototype and production with a platform engineered for scale. Scaleway's inference stack runs on highly secure, reliable infrastructure in Paris. Not bad, right? It's French.
Security and privacy for your data and applications
We do not collect, read, reuse, or analyze the content of your inputs, prompts, or outputs generated by the APIs. Why would we?
```python
# Import modules
from openai import OpenAI
import os

# Initialize the OpenAI client using Scaleway's endpoint
client = OpenAI(
    api_key=os.environ.get("SCW_API_KEY"),
    base_url="https://api.scaleway.ai/v1",
)

# Create a chat completion request
completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Sing me a song about Xavier Niel",
        }
    ],
    model="mistral-nemo-instruct-2407",
)

# Print the model's reply
print(completion.choices[0].message.content)
```
Generative APIs is Scaleway's fully managed service that makes frontier AI models from leading research labs available via a simple API call.
Access to this service is open to all Scaleway customers. You can begin using it right away via Scaleway's console playground or via the API; see the quickstart guide here.
This service is free while in beta. Once generally available, Generative APIs will move to "pay-as-you-go" pricing, also called "pay per token": your consumption will be charged per million input and output tokens.
We currently host all models in a secure datacenter located in Paris, France. This may change in the future.
Scaleway lets you seamlessly migrate applications already built on OpenAI. You can use any of the official OpenAI libraries, such as the OpenAI Python client library or the Azure OpenAI SDK, to interact with Scaleway Generative APIs. See the supported APIs and parameters here.
Scaleway Generative APIs is a serverless service. It is most likely the easiest way to get started: we have set up the hardware, so you only pay per token or file and never wait for boot-ups.
Scaleway Managed Inference, on the other hand, is meant for deploying curated models or your own models, with the quantization and instance types of your choice. You get predictable throughput as well as custom security: isolation in your private network, access control, and more.
Both AI services offer text and multimodal (image understanding) models, OpenAI compatibility, and important capabilities like structured outputs.