Voxist Speech-to-text

Audio and video transcription to power your apps. Exceptional accuracy and speed at scale, accessible through a simple API.


Presentation

Async Speech-to-Text

The Voxist API can transcribe pre-recorded audio and video files in seconds with human-level accuracy, and it scales to tens of thousands of files in parallel.

Realtime Speech-to-Text

The Voxist API can transcribe speech in real time, either for your CallBots or to assist your customer representatives.

Auto Punctuation and Casing

Automatically add punctuation and casing, including capitalization of proper nouns, to the transcription text.

Speaker Diarization

Detect the number of speakers in your audio file, with each word in the text associated with its speaker.

Word Timings

View word-by-word timestamps across the entire transcript text.
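
As an illustration of how word timings and speaker labels can be combined, the sketch below merges consecutive words from the same speaker into timed turns, which is a typical first step toward subtitles or call reports. The record layout (the `word`, `start`, `end`, and `speaker` fields) and the sample values are assumptions made for this example; they are not taken from the Voxist response schema, which is described in the Swagger documentation.

```python
from itertools import groupby

# Hypothetical word records. Field names and values are illustrative only
# and do NOT reflect the actual Voxist API response format.
words = [
    {"word": "Hello", "start": 0.00, "end": 0.42, "speaker": 0},
    {"word": "there.", "start": 0.45, "end": 0.80, "speaker": 0},
    {"word": "Hi,", "start": 1.10, "end": 1.35, "speaker": 1},
    {"word": "thanks.", "start": 1.40, "end": 1.95, "speaker": 1},
]


def speaker_turns(words):
    """Merge consecutive words from the same speaker into turns.

    Returns a list of (speaker, start, end, text) tuples, where start is the
    start time of the first word in the turn and end is the end time of the last.
    """
    turns = []
    # groupby only groups adjacent items, so a speaker who talks twice
    # produces two separate turns, which is what we want here.
    for speaker, group in groupby(words, key=lambda w: w["speaker"]):
        group = list(group)
        text = " ".join(w["word"] for w in group)
        turns.append((speaker, group[0]["start"], group[-1]["end"], text))
    return turns


for speaker, start, end, text in speaker_turns(words):
    print(f"[{start:.2f}-{end:.2f}] speaker {speaker}: {text}")
```

Each resulting turn carries the time span needed for a subtitle cue; a real integration would map the actual response fields onto this shape first.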


Who is the solution for?

  • CallBot developers who need real-time speech-to-text for seamless interactions
  • Call centers that want transcripts for after-the-fact reporting, or in real time to assist their customer representatives
  • Media companies, or any company that needs to create subtitles for videos, conferences, or meetings

How to get started

First obtain an API key; you can then call the APIs as described in the Swagger documentation.

Pricing

For a quote on specific models or on-premise deployment, contact Voxist by clicking the 'Contact' button.

Minutes      Hours      Voxist
100 000      1 667      0,65 €
500 000      8 333      0,52 €
1 000 000    16 667     0,42 €
3 000 000    50 000     0,32 €
6 000 000    100 000    0,32 €
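
The relationship between the columns can be checked with a short sketch: the Hours figures are the Minutes figures divided by 60 and rounded to the nearest hour. The lookup function below additionally assumes that each row is a minimum volume tier, which the page does not state. The page also does not give the unit of the price column, so the sketch returns the listed figure as-is and does not compute a total. The function names are made up for this example.

```python
from typing import Optional

# Rows transcribed from the pricing table: (minutes, listed hours, listed price in EUR).
TIERS = [
    (100_000, 1_667, 0.65),
    (500_000, 8_333, 0.52),
    (1_000_000, 16_667, 0.42),
    (3_000_000, 50_000, 0.32),
    (6_000_000, 100_000, 0.32),
]


def minutes_to_hours(minutes: int) -> int:
    """Convert minutes to hours, rounded to the nearest whole hour (half rounds up)."""
    return (minutes + 30) // 60


def listed_price(minutes: int) -> Optional[float]:
    """Return the listed price of the largest tier that `minutes` reaches.

    ASSUMPTION: each row is a minimum volume tier. Returns None when the
    volume is below the smallest listed tier.
    """
    price = None
    for tier_minutes, _hours, tier_price in TIERS:
        if minutes >= tier_minutes:
            price = tier_price
    return price


# Every Hours entry in the table matches the rounded conversion.
assert all(minutes_to_hours(m) == h for m, h, _ in TIERS)
```

Under the tier assumption, `listed_price(750_000)` falls in the 500 000-minute row, and a volume below 100 000 minutes has no listed price.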

Support

Standard support is available by email; phone support requires a subscription.

Learn more about Voxist Terms and Conditions.

Supported Languages

English, French, German, Spanish, Italian, Polish