Deployment Pricing

Serve generative AI models and securely answer prompts from European end users

Managed Inference

Pick from off-the-shelf optimized models and get a dedicated inference endpoint right away. You are charged for usage of the GPU type you choose. Billing starts only once the model is deployed.

Model	GPU	Price	Approx. per month
Llama3.1-8b-instruct	L4-1-24G	€0.93/hour	~€679/month
Llama3.3-70b-instruct	H100-2-80G	€6.68/hour	~€4876/month
Llama3.1-70b-instruct	H100-1-80G	€3.40/hour	~€2482/month
Llama3.1-70b-instruct	H100-2-80G	€6.68/hour	~€4876/month
Llama3.1-Nemotron-70b-instruct	H100-1-80G	€3.40/hour	~€2482/month
Llama3.1-Nemotron-70b-instruct	H100-2-80G	€6.68/hour	~€4876/month
Mistral-7b-instruct-v0.3	L4-1-24G	€0.93/hour	~€679/month
Mixtral-8x7b-instruct-v0.1	H100-1-80G	€3.40/hour	~€2482/month
Mixtral-8x7b-instruct-v0.1	H100-2-80G	€6.68/hour	~€4876/month
Mistral-nemo-instruct-2407	H100-1-80G	€3.40/hour	~€2482/month
Pixtral-12b-2409	H100-1-80G	€3.40/hour	~€2482/month
Molmo-72b-2409	H100-2-80G	€6.68/hour	~€4876/month
Qwen2.5-coder-32b-instruct	H100-1-80G	€3.40/hour	~€2482/month
Qwen2.5-coder-32b-instruct	H100-2-80G	€6.68/hour	~€4876/month
Sentence-t5-xxl	L4-1-24G	€0.93/hour	~€679/month
BGE-Multilingual-Gemma2	L4-1-24G	€0.93/hour	~€679/month
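
The approximate monthly figures in the table appear to follow from the hourly rates multiplied by roughly 730 hours (365 × 24 / 12). The page does not state that factor, so treat it as an assumption. The sketch below reproduces the estimates under that assumption and can also be used to estimate the cost of a deployment that runs for only part of a month. The names `HOURLY_EUR`, `HOURS_PER_MONTH` and `approx_monthly_cost` are illustrative and not part of any official API.

```python
# Assumption: one billing month is about 730 hours (365 * 24 / 12).
# This factor is inferred from the table, not stated on the page.
HOURS_PER_MONTH = 730

# Hourly prices before tax, in euros, copied from the table above.
HOURLY_EUR = {
    "L4-1-24G": 0.93,
    "H100-1-80G": 3.40,
    "H100-2-80G": 6.68,
}


def approx_monthly_cost(gpu: str, hours: float = HOURS_PER_MONTH) -> int:
    """Return the estimated pre-tax cost in whole euros for `hours` of usage."""
    return round(HOURLY_EUR[gpu] * hours)


if __name__ == "__main__":
    for gpu in HOURLY_EUR:
        print(f"{gpu}: ~€{approx_monthly_cost(gpu)}/month")
```

Because billing starts only once a model is deployed, passing a smaller `hours` value (for example `approx_monthly_cost("H100-1-80G", hours=200)`) gives a rough estimate for an endpoint that is not kept running all month.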

Legal notice

Prices before tax
