Deployment Pricing
Serve Generative AI models and answer prompts from European end-consumers securely
Managed Inference
Choose from off-the-shelf optimized models and get a dedicated inference endpoint right away. You are charged by the hour for the GPU type you choose.
Model | Quantization | GPU | Price | Approx. per month |
---|---|---|---|---|
Llama3-8b-instruct | BF16 | L4-1-24G | €0.93/hour | ~€679/month |
Llama3-70b-instruct | INT8 | H100-1-80G | €3.40/hour | ~€2482/month |
Mistral-7b-instruct-v0.3 | BF16 | L4-1-24G | €0.93/hour | ~€679/month |
Mixtral-8x7b-instruct-v0.1 | INT8 | H100-1-80G | €3.40/hour | ~€2482/month |
Mixtral-8x7b-instruct-v0.1 | FP16 | H100-2-80G | €6.68/hour | ~€4876/month |
Wizardlm-70B-V1.0 | FP8 | H100-1-80G | €3.40/hour | ~€2482/month |
Wizardlm-70B-V1.0 | FP16 | H100-2-80G | €6.68/hour | ~€4876/month |
Sentence-t5-xxl | FP32 | L4-1-24G | €0.93/hour | ~€679/month |
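
The approximate monthly figures in the table are consistent with multiplying the hourly price by about 730 hours (365 × 24 / 12). The following is a minimal sketch of that arithmetic. The 730-hour month is an assumption inferred from the table rather than a stated billing rule, and the names `HOURS_PER_MONTH`, `HOURLY_PRICE_EUR`, and `approx_monthly_cost` are illustrative only.

```python
# Assumption: an average month of 730 hours (365 * 24 / 12), inferred from the table.
HOURS_PER_MONTH = 730

# Hourly prices in EUR per GPU type, copied from the table above.
HOURLY_PRICE_EUR = {
    "L4-1-24G": 0.93,
    "H100-1-80G": 3.40,
    "H100-2-80G": 6.68,
}


def approx_monthly_cost(gpu: str, hours: float = HOURS_PER_MONTH) -> int:
    """Return the approximate cost, in whole euros, of running one endpoint for `hours`."""
    return round(HOURLY_PRICE_EUR[gpu] * hours)


if __name__ == "__main__":
    for gpu in HOURLY_PRICE_EUR:
        print(f"{gpu}: ~€{approx_monthly_cost(gpu)}/month")
```

Under the same assumption, an endpoint that runs for only part of a month scales linearly: `approx_monthly_cost("H100-1-80G", hours=100)` gives 340.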