Managed Inference
Choose from off-the-shelf optimized models and get a dedicated inference endpoint right away. You are charged for usage of the GPU type you choose. Billing starts only once the model is deployed.
Model | GPU | Price | Approx. per month |
---|---|---|---|
Llama3.1-8b-instruct | L4-1-24G | €0.93/hour | ~€679/month |
Llama3.3-70b-instruct | H100-2-80G | €6.68/hour | ~€4876/month |
Llama3.1-70b-instruct | H100-1-80G | €3.40/hour | ~€2482/month |
Llama3.1-70b-instruct | H100-2-80G | €6.68/hour | ~€4876/month |
Llama3.1-Nemotron-70b-instruct | H100-1-80G | €3.40/hour | ~€2482/month |
Llama3.1-Nemotron-70b-instruct | H100-2-80G | €6.68/hour | ~€4876/month |
Mistral-7b-instruct-v0.3 | L4-1-24G | €0.93/hour | ~€679/month |
Mixtral-8x7b-instruct-v0.1 | H100-1-80G | €3.40/hour | ~€2482/month |
Mixtral-8x7b-instruct-v0.1 | H100-2-80G | €6.68/hour | ~€4876/month |
Mistral-nemo-instruct-2407 | H100-1-80G | €3.40/hour | ~€2482/month |
Pixtral-12b-2409 | H100-1-80G | €3.40/hour | ~€2482/month |
Molmo-72b-2409 | H100-2-80G | €6.68/hour | ~€4876/month |
Qwen2.5-coder-32b-instruct | H100-1-80G | €3.40/hour | ~€2482/month |
Qwen2.5-coder-32b-instruct | H100-2-80G | €6.68/hour | ~€4876/month |
Sentence-t5-xxl | L4-1-24G | €0.93/hour | ~€679/month |
BGE-Multilingual-Gemma2 | L4-1-24G | €0.93/hour | ~€679/month |
Legal notice
Prices before tax
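The monthly figures in the table appear to be the hourly price multiplied by roughly 730 hours (8,760 hours in a year divided by 12). The page does not state this factor; it is an assumption inferred from the listed numbers. Under that assumption, a minimal Python sketch for estimating pre-tax cost over a whole month or a shorter deployment could look like this. The `estimate_cost` helper and the dictionary name are hypothetical and not part of any official SDK.

```python
# Assumption: one "month" is 730 hours (8760 / 12). This factor is inferred
# from the table's monthly figures and is not stated on the page.
HOURS_PER_MONTH = 730

# Hourly prices before tax, copied from the table above (EUR per hour).
HOURLY_PRICE_EUR = {
    "L4-1-24G": 0.93,
    "H100-1-80G": 3.40,
    "H100-2-80G": 6.68,
}


def estimate_cost(gpu: str, hours: float = HOURS_PER_MONTH) -> float:
    """Return the pre-tax cost in EUR of one deployment on `gpu` for `hours`."""
    return round(HOURLY_PRICE_EUR[gpu] * hours, 2)


if __name__ == "__main__":
    # Print an approximate monthly cost for each GPU type.
    for gpu in HOURLY_PRICE_EUR:
        print(f"{gpu}: ~€{estimate_cost(gpu):.0f}/month")
```

Because billing is tied to how long the model stays deployed, passing a smaller `hours` value (for example, `estimate_cost("H100-1-80G", 10)`) gives a rough estimate for a short-lived endpoint.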