Question 1

How do I get started with the text to speech API?

Accepted Answer

Install the Sarvam SDK (pip install sarvamai for Python, npm install sarvamai for Node.js), grab an API key from dashboard.sarvam.ai, and call the text-to-speech endpoint with your text, a language code, and a speaker name. You can generate audio in under five lines of code.

Question 2

How do I use TTS in Python with the Sarvam SDK?

Accepted Answer

Import the SarvamAI client, pass your API key, then call client.text_to_speech.convert() with text, language (e.g. "hi-IN"), speaker (e.g. "meera"), and model ("bulbul:v3"). The response contains raw audio bytes you can write directly to a .wav or .mp3 file.
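As a rough illustration of that flow, here is a minimal sketch of a helper that takes an already constructed client and saves the result to disk. The helper name `synthesize_to_file` is hypothetical. The keyword names (`text`, `language`, `speaker`, `model`) and the raw-bytes return value are taken from the answer above and are assumptions about the SDK, so check them against the SDK reference before relying on them.

```python
from pathlib import Path
from typing import Any


def synthesize_to_file(
    client: Any,
    text: str,
    out_path: str,
    language: str = "hi-IN",
    speaker: str = "meera",
    model: str = "bulbul:v3",
) -> int:
    """Request speech for `text`, write it to `out_path`, return bytes written."""
    # Assumption: keyword names mirror the wording of the answer above.
    audio = client.text_to_speech.convert(
        text=text,
        language=language,
        speaker=speaker,
        model=model,
    )
    # Assumption: the call returns raw audio bytes, as the answer states.
    if not isinstance(audio, (bytes, bytearray)):
        raise TypeError(f"expected audio bytes, got {type(audio).__name__}")
    return Path(out_path).write_bytes(bytes(audio))
```

Construct the client as the SDK documents, then call `synthesize_to_file(client, "नमस्ते", "hello.wav")`. Passing the client in keeps the helper easy to exercise with a stand-in object during testing.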

Question 3

Is there a free tier for the TTS API?

Accepted Answer

Yes. New accounts receive free credits on signup, enough to test the API across multiple languages and voices. After the free tier, pricing starts at Rs. 30 per 10,000 characters. Volume discounts and enterprise pricing are available for high-throughput use cases.
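To get a feel for the starting rate, the arithmetic can be sketched as below. The function name `estimate_cost_rupees` is hypothetical, and the sketch assumes simple linear billing at Rs. 30 per 10,000 characters. It does not model free credits, volume discounts, or any rounding the billing system may apply.

```python
def estimate_cost_rupees(num_chars: int, rate_per_10k: float = 30.0) -> float:
    """Estimate cost in rupees at a flat rate per 10,000 characters."""
    if num_chars < 0:
        raise ValueError("num_chars must be non-negative")
    # Linear pricing: characters / 10,000 * rate.
    return num_chars / 10_000 * rate_per_10k


# 25,000 characters at the starting rate -> 75.0
print(estimate_cost_rupees(25_000))
```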

Question 4

How do I build a voice generator API into my app?

Accepted Answer

Use the REST API for batch audio generation (up to 500 characters per request) or the WebSocket streaming API for real-time playback (up to 2,500 characters). Both return audio in your choice of format: MP3, WAV, AAC, OPUS, FLAC, PCM, MULAW, or ALAW.
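Because each request has a character limit, longer text has to be split before it is sent. The following is one possible sketch of that step, not part of the SDK: the function name `split_for_tts` is hypothetical, and it breaks on whitespace so words stay intact, hard-splitting only words that exceed the limit on their own. The default of 500 matches the REST limit quoted above; pass 2500 for the streaming limit.

```python
def split_for_tts(text: str, limit: int = 500) -> list[str]:
    """Split text into chunks of at most `limit` characters, on whitespace."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    chunks: list[str] = []
    current = ""
    for word in text.split():
        # A single word longer than the limit cannot fit in any request,
        # so flush what we have and cut the word into limit-sized pieces.
        while len(word) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:limit])
            word = word[limit:]
        candidate = word if not current else f"{current} {word}"
        if len(candidate) <= limit:
            current = candidate
        else:
            # Adding this word would overflow the chunk; start a new one.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as its own request and the resulting audio segments concatenated in order. Splitting on sentence boundaries instead of whitespace would likely give more natural prosody at the joins, at the cost of a slightly more involved splitter.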

Question 5

What speech synthesis models are available?

Accepted Answer

Bulbul v3 is the current production model with 35+ voices across 11 Indian languages. It supports configurable pace (0.5x to 2x), temperature control (0.01 to 1.0), and automatic text preprocessing for numbers, dates, currencies, and mixed-language input.

Question 6

Does the TTS engine support real-time streaming?

Accepted Answer

Yes. The WebSocket streaming API delivers sub-250ms time-to-first-byte, making it suitable for voice agents, live narration, and interactive applications. Connect via WebSocket, send text chunks, and receive audio frames as they are generated.
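To check time-to-first-byte on your own connection, one option is to time the arrival of the first audio frame. The sketch below is transport-agnostic and does not open a WebSocket itself: `collect_with_ttfb` and `fake_stream` are hypothetical names, and `fake_stream` only simulates a server so the example runs on its own. In practice you would pass in the async iterator of frames from whichever WebSocket client you use.

```python
import asyncio
import time
from typing import AsyncIterator, Tuple


async def collect_with_ttfb(frames: AsyncIterator[bytes]) -> Tuple[bytes, float]:
    """Drain an async stream of audio frames.

    Returns the concatenated audio and the seconds elapsed before the
    first frame arrived.
    """
    start = time.perf_counter()
    first_at = None
    buffer = bytearray()
    async for frame in frames:
        if first_at is None:
            first_at = time.perf_counter()
        buffer.extend(frame)
    if first_at is None:
        raise ValueError("stream produced no audio frames")
    return bytes(buffer), first_at - start


async def fake_stream(delay_s: float = 0.01, n: int = 3) -> AsyncIterator[bytes]:
    """Stand-in for a real audio stream: yields n small frames with a delay."""
    for i in range(n):
        await asyncio.sleep(delay_s)
        yield bytes([i]) * 4


if __name__ == "__main__":
    audio, ttfb = asyncio.run(collect_with_ttfb(fake_stream()))
    print(f"received {len(audio)} bytes, first frame after {ttfb * 1000:.1f} ms")
```

For playback rather than measurement, the same loop would hand each frame to the audio output as it arrives instead of buffering the whole stream.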

Question 7

Which Indian languages does the TTS API support?

Accepted Answer

Bulbul v3 supports 11 languages: Hindi, Bengali, Tamil, Telugu, Gujarati, Kannada, Malayalam, Marathi, Punjabi, Odia, and Indian-accented English. Each language has multiple speaker voices with distinct tones and styles.

Question 8

How do I control voice speed and output quality?

Accepted Answer

Pass the pace parameter (0.5 for slow, 2.0 for fast) and temperature (lower values for more consistent output, higher for expressiveness). You can also choose sample rates from 8 kHz to 24 kHz and pick any of the eight supported audio codecs depending on your playback environment.
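Validating these settings on the client side catches out-of-range values before a request is made. The sketch below collects the ranges quoted in the answers above (pace 0.5 to 2.0, temperature 0.01 to 1.0, sample rate 8 kHz to 24 kHz, eight output formats). The class name `TTSOptions`, its field names, its default values, and the lowercase format spellings are all hypothetical and are not taken from the API.

```python
from dataclasses import dataclass

# Formats listed in the answers above; lowercase spelling is an assumption.
SUPPORTED_FORMATS = ("mp3", "wav", "aac", "opus", "flac", "pcm", "mulaw", "alaw")


@dataclass(frozen=True)
class TTSOptions:
    """Client-side container for synthesis settings, validated on creation."""

    pace: float = 1.0             # hypothetical default
    temperature: float = 0.5      # hypothetical default
    sample_rate_hz: int = 24_000  # hypothetical default
    audio_format: str = "wav"     # hypothetical default

    def __post_init__(self) -> None:
        if not 0.5 <= self.pace <= 2.0:
            raise ValueError("pace must be between 0.5 and 2.0")
        if not 0.01 <= self.temperature <= 1.0:
            raise ValueError("temperature must be between 0.01 and 1.0")
        # Only the overall range is checked; the API may accept
        # just a few specific rates within it.
        if not 8_000 <= self.sample_rate_hz <= 24_000:
            raise ValueError("sample_rate_hz must be between 8000 and 24000")
        if self.audio_format not in SUPPORTED_FORMATS:
            raise ValueError(f"audio_format must be one of {SUPPORTED_FORMATS}")
```

For example, `TTSOptions(pace=0.8, sample_rate_hz=8_000, audio_format="mulaw")` describes a slower voice in a telephony-style output, while `TTSOptions(pace=2.5)` raises `ValueError`.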

Text to Speech API for India

Production-grade speech synthesis

Low latency streaming

Configurable controls

Plug-and-play integrations

11 Indian languages

35+ unique voices

Built for every use case

Dubbing & localization

Developer-first platform

Best text to speech engine for Indian languages

Listener preference rate (8kHz)

Works with your stack

Enterprise-ready. Data stays in India.

No training on your data

Deploy on your terms

Security and governance

Simple, transparent pricing
