Deployment Pricing

Serve generative AI models and securely answer prompts from European end consumers.

Managed Inference

Choose from off-the-shelf optimized models and get a dedicated inference endpoint right away. You are charged by the hour for the GPU type you choose.

| Model | Quantization | GPU | Price | Approx. per month |
|---|---|---|---|---|
| Llama3-8b-instruct | BF16 | L4-1-24G | €0.93/hour | ~€679/month |
| Llama3-70b-instruct | INT8 | H100-1-80G | €3.40/hour | ~€2482/month |
| Mistral-7b-instruct-v0.3 | BF16 | L4-1-24G | €0.93/hour | ~€679/month |
| Mixtral-8x7b-instruct-v0.1 | INT8 | H100-1-80G | €3.40/hour | ~€2482/month |
| Mixtral-8x7b-instruct-v0.1 | FP16 | H100-2-80G | €6.68/hour | ~€4876/month |
| Wizardlm-70B-V1.0 | FP8 | H100-1-80G | €3.40/hour | ~€2482/month |
| Wizardlm-70B-V1.0 | FP16 | H100-2-80G | €6.68/hour | ~€4876/month |
| Sentence-t5-xxl | FP32 | L4-1-24G | €0.93/hour | ~€679/month |
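
The monthly figures in the table appear to match the hourly price multiplied by about 730 hours (365 × 24 / 12). The page does not state this, so treat it as an assumption. The sketch below reproduces that arithmetic for an endpoint that runs continuously; the names `HOURS_PER_MONTH`, `HOURLY_PRICE_EUR`, and `monthly_cost` are illustrative and not part of any official API.

```python
# Assumption: one month is counted as 730 hours (365 * 24 / 12).
# This is inferred from the listed prices, not stated on the page.
HOURS_PER_MONTH = 730

# Hourly prices in euros per GPU type, copied from the pricing table.
HOURLY_PRICE_EUR = {
    "L4-1-24G": 0.93,
    "H100-1-80G": 3.40,
    "H100-2-80G": 6.68,
}


def monthly_cost(gpu: str, hours: float = HOURS_PER_MONTH) -> float:
    """Estimate the cost in euros of keeping one endpoint on `gpu` up for `hours`."""
    return HOURLY_PRICE_EUR[gpu] * hours


if __name__ == "__main__":
    for gpu in HOURLY_PRICE_EUR:
        # Round to whole euros, as the table does.
        print(f"{gpu}: ~€{monthly_cost(gpu):.0f}/month")
```

Passing a smaller `hours` value, for example `monthly_cost("H100-1-80G", hours=8 * 22)`, gives a rough estimate for an endpoint that only runs during working hours, assuming billing stops while the endpoint is deleted.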