Scaleway

Model-as-a-service

Serve generative AI models and pay either for dedicated infrastructure or per million tokens.

Generative APIs

Serve the latest AI models via API and pay per million tokens.

Model                       Type             Input tokens          Output tokens
llama-3.1-8b-instruct       Text generation  €0.20/million tokens  €0.20/million tokens
llama-3.1-70b-instruct      Text generation  €0.90/million tokens  €0.90/million tokens
llama-3.3-70b-instruct      Text generation  €0.90/million tokens  €0.90/million tokens
mistral-nemo-instruct-2407  Text generation  €0.20/million tokens  €0.20/million tokens
qwen2.5-coder-32b-instruct  Code generation  €0.90/million tokens  €0.90/million tokens
pixtral-12b-2409            Image analysis   €0.20/million tokens  €0.20/million tokens
bge-multilingual-gemma2     Embedding        €0.20/million tokens  N/A
Legal notice

Prices before tax.
You benefit from a free tier on the first 1,000,000 tokens. You'll be charged from token number 1,000,001.
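
As a rough illustration of the free tier, the sketch below estimates a pre-tax bill from a total token count. The rates are copied from the table above; because each model lists the same price for input and output tokens, a single rate per model is used. The function name `estimate_cost` is hypothetical, and the sketch assumes the free 1,000,000 tokens are counted across input and output tokens combined for a single model, which the page does not spell out.

```python
from decimal import Decimal

# From the pricing note: the first 1,000,000 tokens are free.
FREE_TIER_TOKENS = 1_000_000

# EUR per million tokens, before tax, copied from the Generative APIs table.
# Input and output prices are identical for every model listed, so one rate
# per model is enough here (the embedding model has no output price).
PRICE_PER_MILLION = {
    "llama-3.1-8b-instruct": Decimal("0.20"),
    "llama-3.1-70b-instruct": Decimal("0.90"),
    "llama-3.3-70b-instruct": Decimal("0.90"),
    "mistral-nemo-instruct-2407": Decimal("0.20"),
    "qwen2.5-coder-32b-instruct": Decimal("0.90"),
    "pixtral-12b-2409": Decimal("0.20"),
    "bge-multilingual-gemma2": Decimal("0.20"),
}


def estimate_cost(model: str, total_tokens: int) -> Decimal:
    """Estimate the pre-tax cost in EUR for `total_tokens` on one model.

    Assumption (not stated on the page): the free tier is applied to the
    combined input + output token count.
    """
    billable = max(0, total_tokens - FREE_TIER_TOKENS)
    return PRICE_PER_MILLION[model] * billable / Decimal(1_000_000)


if __name__ == "__main__":
    # 5,000,000 tokens: 1,000,000 free, 4,000,000 billed at EUR 0.90/million.
    print(estimate_cost("llama-3.3-70b-instruct", 5_000_000))  # 3.6
    # Exactly at the free-tier limit: nothing is billed.
    print(estimate_cost("llama-3.1-8b-instruct", 1_000_000))   # 0
```

`Decimal` is used instead of floats so that small per-token amounts do not pick up binary rounding noise.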

Managed Inference

Deploy your managed AI infrastructure with dedicated GPUs and optimized models. You are charged for usage of the GPU type you choose. Billing starts only once the model is deployed.

Model                            GPU         Price       Approx. per month
llama-3.1-8b-instruct            L4-1-24G    €0.93/hour  ~€679/month
llama-3.3-70b-instruct           H100-2-80G  €6.68/hour  ~€4876/month
llama-3.1-70b-instruct           H100-1-80G  €3.40/hour  ~€2482/month
llama-3.1-70b-instruct           H100-2-80G  €6.68/hour  ~€4876/month
llama-3.1-nemotron-70b-instruct  H100-1-80G  €3.40/hour  ~€2482/month
llama-3.1-nemotron-70b-instruct  H100-2-80G  €6.68/hour  ~€4876/month
mistral-7b-instruct-v0.3         L4-1-24G    €0.93/hour  ~€679/month
mixtral-8x7b-instruct-v0.1       H100-1-80G  €3.40/hour  ~€2482/month
mixtral-8x7b-instruct-v0.1       H100-2-80G  €6.68/hour  ~€4876/month
mistral-nemo-instruct-2407       H100-1-80G  €3.40/hour  ~€2482/month
pixtral-12b-2409                 H100-1-80G  €3.40/hour  ~€2482/month
molmo-72b-0924                   H100-2-80G  €6.68/hour  ~€4876/month
qwen2.5-coder-32b-instruct       H100-1-80G  €3.40/hour  ~€2482/month
qwen2.5-coder-32b-instruct       H100-2-80G  €6.68/hour  ~€4876/month
sentence-t5-xxl                  L4-1-24G    €0.93/hour  ~€679/month
bge-multilingual-gemma2          L4-1-24G    €0.93/hour  ~€679/month
Legal notice

Prices before tax.
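
The approximate monthly figures for Managed Inference are consistent with multiplying the hourly GPU price by 730 hours (365 × 24 / 12) and rounding to the nearest euro. The page does not state this formula, so the 730-hour month in the sketch below is an inference from the listed numbers, and `approx_monthly_cost` is a hypothetical helper name.

```python
# Assumption: a "month" is 365 * 24 / 12 = 730 hours. This is inferred from
# the listed figures, not stated on the pricing page.
HOURS_PER_MONTH = 730

# EUR per hour, before tax, copied from the Managed Inference table.
HOURLY_PRICE_EUR = {
    "L4-1-24G": 0.93,
    "H100-1-80G": 3.40,
    "H100-2-80G": 6.68,
}


def approx_monthly_cost(gpu: str, hours: float = HOURS_PER_MONTH) -> int:
    """Return the pre-tax cost in EUR, rounded to the nearest euro,
    for running one deployment on `gpu` for `hours` hours."""
    return round(HOURLY_PRICE_EUR[gpu] * hours)


if __name__ == "__main__":
    for gpu in HOURLY_PRICE_EUR:
        print(f"{gpu}: ~EUR {approx_monthly_cost(gpu)}/month")
    # L4-1-24G: ~EUR 679/month
    # H100-1-80G: ~EUR 2482/month
    # H100-2-80G: ~EUR 4876/month
```

Because billing is hourly and starts only once the model is deployed, passing a smaller `hours` value gives an estimate for a deployment that runs for part of a month, for example `approx_monthly_cost("H100-1-80G", hours=100)`.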