Sarvam AI
Sarvam Motif

Voice agents that speak every Indian language

TTS built for Indian languages, not adapted from English. It handles Hinglish, pronounces Indian names correctly, and streams with sub-250ms first-byte latency over telephony-grade 8kHz audio.

Voice Agents, IVR, Call Center AI, Conversational AI

Voices

Shubh (Male)
Shreya (Female)
Manan (Male)
Ishita (Female)

The problem with voice agents in India

Code-switching

Hinglish breaks mid-sentence

A customer says 'Mera account balance check karna hai, last transaction bhi batao.' Most TTS engines choke on the Hindi-English switch. Bulbul V3 handles Hinglish, Tanglish, and Benglish natively because it was trained on how Indians actually speak.

Pronunciation

Indian names become gibberish

On most TTS engines, 'Lajpat Nagar' comes out garbled, 'Dr. Lal PathLabs' is unrecognizable, and 'Koramangala' turns into nonsense. Bulbul V3 is trained on Indian speech data, so names, addresses, and abbreviations sound right.

8kHz

Telephony audio quality matters

Voice agents run over phone lines at 8kHz, not studio-quality 24kHz. Most TTS sounds acceptable at high sample rates but falls apart at telephony quality. Bulbul V3 is optimized for 8kHz output, so your voice agent sounds natural on actual phone calls.
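
To make the 8kHz constraint concrete, here is a minimal sketch that uses only Python's standard wave module. It reads the sample rate from a WAV payload and computes the uncompressed PCM bitrate at telephony and studio sample rates. The helper names (wav_info, pcm_bitrate_kbps, make_silent_wav) are hypothetical and not part of any Sarvam SDK. The silent clips stand in for real TTS output.

```python
import io
import wave


def wav_info(wav_bytes: bytes) -> dict:
    """Read sample rate, channel count, and bit depth from WAV bytes."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return {
            "sample_rate_hz": wav.getframerate(),
            "channels": wav.getnchannels(),
            "bits_per_sample": wav.getsampwidth() * 8,
        }


def pcm_bitrate_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> float:
    """Uncompressed PCM bitrate in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def make_silent_wav(sample_rate_hz: int, seconds: float = 0.1) -> bytes:
    """Build a short silent 16-bit mono WAV, used here only as sample input."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16-bit samples
        wav.setframerate(sample_rate_hz)
        wav.writeframes(b"\x00\x00" * int(sample_rate_hz * seconds))
    return buffer.getvalue()


telephony = wav_info(make_silent_wav(8000))
studio = wav_info(make_silent_wav(24000))

print(pcm_bitrate_kbps(telephony["sample_rate_hz"], telephony["bits_per_sample"]))  # 128.0
print(pcm_bitrate_kbps(studio["sample_rate_hz"], studio["bits_per_sample"]))  # 384.0
```

A check like wav_info can guard a telephony pipeline against accidentally sending 24kHz audio to a phone line.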

Integrate in under an hour

Install the SDK

Run pip install sarvamai (Python) or npm install sarvamai (Node.js). The API is OpenAI-compatible, so if you've integrated any LLM, this will feel familiar.

Pick a voice for your brand

35+ voices across 11 languages. Arjun for authoritative banking, Meera for warm customer service, Amol for casual Hinglish. Preview all voices before committing.

Stream over WebSocket

Sub-250ms first-byte latency. Send text, get audio chunks in real-time. Works with LiveKit, Pipecat, Twilio, Exotel, and Ozonetel out of the box.
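
One way to verify a first-byte latency figure in your own environment is to time how long a stream takes to yield its first non-empty audio chunk. The sketch below is a minimal, SDK-independent illustration: it accepts any iterable of byte chunks, and fake_tts_stream is a hypothetical stand-in for a real WebSocket stream, not Sarvam's API.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple


def first_chunk_latency_ms(
    chunks: Iterable[bytes],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[float, Iterator[bytes]]:
    """Return (ms until the first non-empty chunk, iterator over the audio chunks)."""
    start = clock()
    iterator = iter(chunks)
    first = b""
    for first in iterator:
        if first:  # skip empty keep-alive chunks
            break
    elapsed_ms = (clock() - start) * 1000.0

    def replay() -> Iterator[bytes]:
        # Hand back the chunk we consumed, then the rest of the stream.
        if first:
            yield first
        yield from iterator

    return elapsed_ms, replay()


def fake_tts_stream(delay_s: float = 0.05) -> Iterator[bytes]:
    """Hypothetical stream: waits, then yields two 20 ms frames of 8kHz audio."""
    time.sleep(delay_s)
    yield b"\x00" * 160
    yield b"\x00" * 160


latency_ms, audio = first_chunk_latency_ms(fake_tts_stream())
total_bytes = sum(len(chunk) for chunk in audio)
print(f"first chunk after {latency_ms:.0f} ms, {total_bytes} bytes total")
```

Measured this way, the number includes network time between your server and the TTS endpoint, so run it from the region where your voice agent is deployed.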

Go live with enterprise SLAs

SOC 2 Type II certified. ISO 27001. All data processed in India (RBI data localization compliant). 99.9% uptime SLA on enterprise plans.

from sarvamai import SarvamAI

# Replace YOUR_KEY with your Sarvam API subscription key.
client = SarvamAI(api_subscription_key="YOUR_KEY")

# Synthesize a Hinglish sentence with the Bulbul V3 model.
audio = client.text_to_speech.convert(
    text="Hello, main Suresh bol raha hoon ABC Finance se. Aapka loan approve ho gaya hai.",
    target_language_code="hi-IN",  # Hindi (India)
    model="bulbul:v3",
    speaker="arjun",  # voice name
    pitch=0,
    pace=1.1,
    enable_preprocessing=True
)
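
The sample above stops at the API call. A minimal sketch of saving the result might look like the following. It assumes the response exposes an audios list of base64-encoded WAV strings; that response shape is an assumption to confirm against the Sarvam API reference. The helper save_first_audio is hypothetical, and a stand-in response object is used so the sketch runs without an API key.

```python
import base64
import io
import wave
from pathlib import Path
from types import SimpleNamespace


def save_first_audio(response, path: str) -> int:
    """Decode the first audio clip on `response`, write it to `path`, return its size.

    ASSUMPTION: `response.audios` is a list of base64-encoded WAV strings.
    """
    audios = getattr(response, "audios", None)
    if not audios:
        raise ValueError("response contains no audio")
    wav_bytes = base64.b64decode(audios[0])
    Path(path).write_bytes(wav_bytes)
    return len(wav_bytes)


# Stand-in for a real API response: 0.1 s of silent 8kHz, 16-bit mono audio.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)
    wav.setframerate(8000)
    wav.writeframes(b"\x00\x00" * 800)
fake_response = SimpleNamespace(
    audios=[base64.b64encode(buffer.getvalue()).decode("ascii")]
)

written = save_first_audio(fake_response, "greeting.wav")
print(f"wrote {written} bytes to greeting.wav")
```

With a real response, pass the object returned by client.text_to_speech.convert in place of fake_response.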

Works with your telephony stack

Drop-in integration with leading voice infrastructure providers. REST and WebSocket APIs.

LiveKit: Real-time voice
Pipecat: Voice pipelines
Exotel: Cloud telephony
Ozonetel: Contact center
Twilio: Programmable voice
Any SIP/VoIP: REST or WebSocket
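
As an illustration of what connecting TTS output to a telephony provider can involve, the sketch below frames 8kHz mu-law audio into JSON messages for a Twilio Media Streams WebSocket. It assumes the audio is already mu-law encoded at 8kHz, uses a 20ms frame size as a common choice rather than a requirement, and uses a made-up stream SID. The message shape follows Twilio's Media Streams format as best understood here; confirm it against Twilio's documentation before relying on it. The function name is hypothetical.

```python
import base64
import json
from typing import Iterator

SAMPLE_RATE_HZ = 8000  # telephony sample rate
FRAME_MS = 20          # frame duration chosen for this sketch
# mu-law uses one byte per sample, so a 20 ms frame is 160 bytes.
BYTES_PER_FRAME = SAMPLE_RATE_HZ * FRAME_MS // 1000


def twilio_media_messages(mulaw_audio: bytes, stream_sid: str) -> Iterator[str]:
    """Split mu-law audio into frames and wrap each one in a "media" message."""
    for offset in range(0, len(mulaw_audio), BYTES_PER_FRAME):
        frame = mulaw_audio[offset:offset + BYTES_PER_FRAME]
        yield json.dumps({
            "event": "media",
            "streamSid": stream_sid,
            "media": {"payload": base64.b64encode(frame).decode("ascii")},
        })


one_second = b"\xff" * SAMPLE_RATE_HZ  # 0xFF is silence in mu-law
messages = list(twilio_media_messages(one_second, stream_sid="MZ-example"))
print(len(messages))  # 50
```

Each string would be sent as one WebSocket text message. Other providers use different envelopes, but the framing arithmetic is the same.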

Built for real-time voice agents

<250ms First-byte latency

Real-time streaming over WebSocket. No buffering, no awkward pauses in conversation.

8kHz Telephony-optimized

Sounds natural at phone-line quality, not just studio quality. Optimized for the audio quality your callers actually hear.

35+ Natural voices

Pick a voice that fits your brand. Adjust warmth, authority, and pacing for the exact call flow.

11 Indian languages

Hindi, Tamil, Telugu, Bengali, Malayalam, Marathi, Gujarati, Kannada, Punjabi, Odia, Assamese. One API, all languages.

SOC 2 + ISO 27001

Enterprise-grade security. Data processed in India. No training on customer data. RBI compliant.

Starting at ₹30 per 10K characters.
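
For rough budgeting, the starting price can be turned into a simple estimate. This sketch assumes linear pricing at the starting rate and counts characters with Python's len; actual plans, volume tiers, and the way characters are billed may differ. The function name and the call volume are hypothetical.

```python
RATE_INR_PER_10K_CHARS = 30.0  # the "starting at" price quoted above


def estimate_cost_inr(total_characters: int) -> float:
    """Estimate TTS cost in rupees, assuming linear pricing at the starting rate."""
    if total_characters < 0:
        raise ValueError("total_characters must be non-negative")
    return round(total_characters / 10_000 * RATE_INR_PER_10K_CHARS, 2)


script = "Hello, main Suresh bol raha hoon ABC Finance se. Aapka loan approve ho gaya hai."
calls_per_month = 100_000  # hypothetical call volume

monthly_characters = len(script) * calls_per_month
print(f"{monthly_characters} characters, about ₹{estimate_cost_inr(monthly_characters)} per month")
```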

How Sarvam compares

Listener preference rate (8kHz), in percent. Higher is better.

Competitor               Competitor win rate   Tie rate   Bulbul V3 win rate
ElevenLabs Flash V2.5    10.37                 11.68      77.95
ElevenLabs V3 Alpha      28.14                 28.21      43.64
Cartesia Sonic-3         29.43                 30.49      40.08

Who uses voice agents?

Banking & financial services

SBI, HDFC, ICICI handle millions of Hindi calls daily. Voice agents process balance inquiries, loan status, and EMI reminders. RBI mandates data localization. Sarvam processes everything in India.

Insurance

LIC has 300 million policyholders. Claims processing, renewal reminders, and policy servicing in Tamil, Telugu, Hindi, and more. Complex terms like 'sum assured' and 'maturity benefit' spoken naturally.

E-commerce & logistics

Order confirmations, delivery updates, COD collection calls. Indian addresses and PIN codes pronounced correctly. Handles 'cash on delivery' in Hinglish without breaking.

Telecom

Jio, Airtel, Vi serve 1 billion+ subscribers. Plan info, recharge confirmations, complaint resolution. Data pack names and validity periods communicated in the customer's language.

Why data localization matters for voice agents

The RBI mandate

India's banking regulator (RBI) requires customer data to be stored and processed within India. For voice agents handling financial conversations, this means your TTS provider must process audio generation on Indian infrastructure. Sarvam's entire stack runs in India. No data leaves the country, no exceptions.

Latency matters

Beyond compliance, latency matters. A voice agent with a TTS provider hosted in US-East adds 200-300ms of network round-trip on top of generation time. That's the difference between a natural conversation and an awkward pause. Sarvam's India-hosted infrastructure delivers sub-250ms first-byte latency to Indian callers.
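
The latency argument above is simple addition, sketched below. The 250ms generation figure is reused from the first-byte latency quoted on this page purely for illustration, and the function name is hypothetical.

```python
def time_to_first_audio_ms(generation_ms: float, extra_network_rtt_ms: float) -> float:
    """Time until the caller hears audio: generation time plus added network round-trip."""
    return generation_ms + extra_network_rtt_ms


GENERATION_MS = 250  # illustrative generation budget

scenarios = {
    "no added round-trip": 0,
    "US-East hosting, low end": 200,
    "US-East hosting, high end": 300,
}

for label, rtt_ms in scenarios.items():
    total = time_to_first_audio_ms(GENERATION_MS, rtt_ms)
    print(f"{label}: {total:.0f} ms")
```

Under these assumptions, the added round-trip roughly doubles the time before the caller hears the first audio.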

The enterprise checklist

For enterprises evaluating TTS providers, the checklist is short: Does it sound natural in Hindi? Does it handle Hinglish? Is it fast enough for real-time conversation? Does it meet RBI data localization? Does it have SOC 2? Sarvam checks all five.

Need the full stack? Try Samvaad.

Samvaad is Sarvam's end-to-end conversational AI platform. It goes beyond TTS to handle the entire voice agent pipeline, purpose-built for Indian languages.

  • Speech recognition (STT) in 11 Indian languages with Hinglish support
  • Dialogue management with context tracking across multi-turn conversations
  • Intent detection and entity extraction tuned for Indian business domains
  • Full conversation handling, from caller greeting to resolution, in a single API

Start building your voice agent

Powered by Bulbul V3