Scaleway

Deployment Pricing

Serve Generative AI models and answer prompts from European end-consumers securely

Managed Inference

Choose from off-the-shelf optimized models and get a dedicated inference endpoint right away. You are charged for usage of the GPU type you choose. Billing starts only once the model is deployed.

Model                            GPU           Price         Approx. per month
Llama3.1-8b-instruct             L4-1-24G      €0.93/hour    ~€679/month
Llama3.3-70b-instruct            H100-2-80G    €6.68/hour    ~€4876/month
Llama3.1-70b-instruct            H100-1-80G    €3.40/hour    ~€2482/month
Llama3.1-70b-instruct            H100-2-80G    €6.68/hour    ~€4876/month
Llama3.1-Nemotron-70b-instruct   H100-1-80G    €3.40/hour    ~€2482/month
Llama3.1-Nemotron-70b-instruct   H100-2-80G    €6.68/hour    ~€4876/month
Mistral-7b-instruct-v0.3         L4-1-24G      €0.93/hour    ~€679/month
Mixtral-8x7b-instruct-v0.1       H100-1-80G    €3.40/hour    ~€2482/month
Mixtral-8x7b-instruct-v0.1       H100-2-80G    €6.68/hour    ~€4876/month
Mistral-nemo-instruct-2407       H100-1-80G    €3.40/hour    ~€2482/month
Pixtral-12b-2409                 H100-1-80G    €3.40/hour    ~€2482/month
Molmo-72b-2409                   H100-2-80G    €6.68/hour    ~€4876/month
Qwen2.5-coder-32b-instruct       H100-1-80G    €3.40/hour    ~€2482/month
Qwen2.5-coder-32b-instruct       H100-2-80G    €6.68/hour    ~€4876/month
Sentence-t5-xxl                  L4-1-24G      €0.93/hour    ~€679/month
BGE-Multilingual-Gemma2          L4-1-24G      €0.93/hour    ~€679/month
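
The approximate monthly figures are consistent with multiplying the hourly price by about 730 hours (8,760 hours per year divided by 12). The page does not state that factor, so treat it as an inference. A minimal Python sketch of the arithmetic follows. The names `HOURS_PER_MONTH`, `HOURLY_EUR`, and `monthly_estimate` are hypothetical and are not part of any Scaleway API.

```python
# Assumption: a "month" is 730 hours (8760 hours per year / 12).
# This factor is inferred from the listed prices, not stated on the page.
HOURS_PER_MONTH = 730

# Hourly prices before tax, in euros, copied from the listing above.
HOURLY_EUR = {
    "L4-1-24G": 0.93,
    "H100-1-80G": 3.40,
    "H100-2-80G": 6.68,
}


def monthly_estimate(gpu: str, hours: float = HOURS_PER_MONTH) -> float:
    """Return the estimated pre-tax cost in euros for running `gpu` for `hours`."""
    return HOURLY_EUR[gpu] * hours


if __name__ == "__main__":
    for gpu in HOURLY_EUR:
        print(f"{gpu}: ~€{monthly_estimate(gpu):.0f}/month")
```

Because billing starts only once the model is deployed, passing a smaller `hours` value gives a rough estimate for a deployment that runs for part of a month, for example `monthly_estimate("H100-1-80G", hours=100)`.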
Legal notice

Prices before tax