Sarvam AI

Generate IVR audio in 11 Indian languages

Replace pre-recorded prompts with dynamic text to speech. Change a menu prompt by editing text. No studio, no re-recording, no deployment cycle.

IVR · Banking IVR · Contact Center · Telephony TTS · Hindi IVR

Voices

Shubh (Male)
Shreya (Female)
Manan (Male)
Ishita (Female)

The hidden cost of pre-recorded IVR

Operations

Every menu change costs weeks

A new product launch or a new compliance disclaimer means booking a studio, recording in Hindi, Tamil, Telugu, Bengali, and English, QA-testing every file, and then deploying to production. With dynamic TTS, you update the text and the change goes live immediately across all 11 languages.

Personalization

Dynamic TTS personalizes every call

Static IVR says 'Your account balance is...' and stops there. Dynamic TTS generates 'Suresh ji, aapke khaate mein 45,230 rupaye hain' in real time. Name, balance, branch, and transaction date are all personalized per caller.

8kHz

Built for how phones actually sound

IVR runs at 8kHz telephony quality, not studio-grade 24kHz. Bulbul V3 is optimized specifically for 8kHz output. What callers hear on an actual phone line sounds natural, because the model was trained for that sample rate.

From pre-recorded to dynamic in one sprint

Define your prompt templates

Map your existing IVR tree. Convert each pre-recorded prompt into a text template with placeholders: customer_name, balance, due_date, branch_name. Your existing IVR logic stays the same.
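
As a rough illustration of this step, the sketch below keeps one text template per prompt and fills it per caller. The template store, the prompt IDs, and the `render_prompt` helper are hypothetical names chosen for the example; only the placeholder names come from the list above.

```python
from string import Template

# Hypothetical template store: one entry per prompt in the existing IVR tree.
# Placeholder names follow the ones listed above.
PROMPT_TEMPLATES = {
    "balance_inquiry": Template(
        "Namaste $customer_name ji. "
        "Aapke khaate mein $balance rupaye hain."
    ),
    "payment_due": Template(
        "Hello $customer_name. Your payment is due on $due_date "
        "at the $branch_name branch."
    ),
}


def render_prompt(prompt_id: str, **fields: str) -> str:
    """Fill a template; raises KeyError if a placeholder value is missing."""
    return PROMPT_TEMPLATES[prompt_id].substitute(**fields)


text = render_prompt("balance_inquiry", customer_name="Suresh", balance="45,230")
print(text)
```

Using `substitute` rather than `safe_substitute` is a deliberate choice here: a missing field fails loudly instead of reading a raw placeholder name to the caller.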

Detect or select caller language

Route to the correct language based on caller DTMF selection or automatic detection. One API endpoint handles all 11 Indian languages. No separate integration per language.
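
One way this routing could look in code is a lookup from the caller's keypress to a language code, with automatic detection and a default as fallbacks. This is a minimal sketch: the menu layout and the `select_language` helper are hypothetical, and apart from `hi-IN` (used in the code sample on this page) the language codes are assumed to follow the same pattern.

```python
from typing import Optional

# Hypothetical DTMF menu. Only "hi-IN" is taken from this page's sample code;
# the other codes are assumed to follow the same pattern.
DTMF_LANGUAGE_MENU = {
    "1": "hi-IN",
    "2": "en-IN",
    "3": "ta-IN",
    "4": "te-IN",
    "5": "bn-IN",
}
DEFAULT_LANGUAGE = "hi-IN"


def select_language(dtmf_digit: Optional[str], detected: Optional[str] = None) -> str:
    """Prefer the caller's explicit keypress, then a detected language, then the default."""
    if dtmf_digit in DTMF_LANGUAGE_MENU:
        return DTMF_LANGUAGE_MENU[dtmf_digit]
    if detected in DTMF_LANGUAGE_MENU.values():
        return detected
    return DEFAULT_LANGUAGE


print(select_language("3"))
print(select_language(None, detected="bn-IN"))
```

The returned code would then be passed as the target language of the single TTS call, so the rest of the IVR flow does not branch per language.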

Generate audio in real time

Fill templates with live customer data, send to Sarvam's API, get natural audio back in under 250ms. Latency low enough for real-time IVR playback without caller-perceptible delay.
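
To check the latency figure against your own deployment, you could time each synthesis call and compare it with a budget. The sketch below assumes a `synthesize` callable that wraps whatever TTS client you use; `timed_synthesis`, `within_budget`, and the stub are hypothetical helpers, and the 250ms budget is the figure quoted above.

```python
import time
from typing import Callable, Tuple

LATENCY_BUDGET_S = 0.250  # the 250ms figure quoted above


def timed_synthesis(synthesize: Callable[[str], bytes], text: str) -> Tuple[bytes, float]:
    """Run one synthesis call and return (audio bytes, elapsed seconds)."""
    start = time.perf_counter()
    audio = synthesize(text)
    elapsed = time.perf_counter() - start
    return audio, elapsed


def within_budget(elapsed_s: float, budget_s: float = LATENCY_BUDGET_S) -> bool:
    """True if the call finished inside the latency budget."""
    return elapsed_s <= budget_s


def fake_synthesize(text: str) -> bytes:
    """Stand-in for a real TTS call so the sketch runs offline."""
    return text.encode("utf-8")


audio, elapsed = timed_synthesis(fake_synthesize, "Namaste")
print(f"{elapsed * 1000:.1f} ms, within budget: {within_budget(elapsed)}")
```

In production you would likely log the elapsed time per call and fall back to a cached or pre-recorded prompt when a call exceeds the budget.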

Stream to your telephony stack

Standard WAV/PCM output at 8kHz. Works with Genesys, Avaya, Cisco UCCE, Twilio, Exotel, Ozonetel, and any SIP/VoIP system. No telephony migration required.
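
Before handing audio to the telephony stack, it can be worth confirming that a file really is 8kHz mono. The sketch below does this with Python's standard `wave` module; `describe_wav` and `is_telephony_ready` are hypothetical helper names, and the mono requirement is an assumption about typical telephony setups rather than something stated above.

```python
import io
import wave


def describe_wav(wav_bytes: bytes) -> dict:
    """Read the WAV header and report the basic audio format."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return {
            "sample_rate": wav.getframerate(),
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "duration_s": wav.getnframes() / wav.getframerate(),
        }


def is_telephony_ready(wav_bytes: bytes, expected_rate: int = 8000) -> bool:
    """True if the audio is mono at the expected telephony sample rate."""
    info = describe_wav(wav_bytes)
    return info["sample_rate"] == expected_rate and info["channels"] == 1


# Self-check with one second of 8kHz, 16-bit mono silence.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)
    wav.setframerate(8000)
    wav.writeframes(b"\x00\x00" * 8000)
sample = buffer.getvalue()

print(describe_wav(sample))
print(is_telephony_ready(sample))
```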

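import base64

from sarvamai import SarvamAI

client = SarvamAI(
    api_subscription_key="YOUR_KEY"
)

# Example caller data; in production these values come from your own customer lookup
customer_name = "Suresh"
balance = "45,230"
last_txn_date = "12 May"
last_txn_amount = "2,500"

# Dynamic IVR prompt with caller's data
prompt = (
    f"Namaste {customer_name} ji. "
    f"Aapke savings account mein "
    f"{balance} rupaye ka balance hai. "
    f"Pichhla transaction {last_txn_date} "
    f"ko {last_txn_amount} rupaye ka tha. "
    f"Aur jaankari ke liye ek dabayen."
)

audio = client.text_to_speech.convert(
    text=prompt,
    target_language_code="hi-IN",
    model="bulbul:v3",
    speaker="shreya",
    pace=1.0,
    # Request 8kHz output explicitly; confirm this parameter against the API reference
    speech_sample_rate=8000,
    enable_preprocessing=True
)

# The response carries base64-encoded WAV audio; decode it before writing
with open("ivr_prompt.wav", "wb") as f:
    f.write(base64.b64decode(audio.audios[0]))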

What dynamic IVR prompts sound like

Banking · Hindi

Account balance inquiry

नमस्ते सुरेश जी। आपके सेविंग्स अकाउंट में 45,230 रुपये का बैलेंस है। पिछला ट्रांज़ैक्शन 12 मई को 2,500 रुपये का था। अकाउंट स्टेटमेंट के लिए 1 दबाएं।

Insurance · English

Policy renewal reminder

Hello. Your policy number 8847291 is due for renewal on 15 June. Premium amount is Rs 12,400. Press 1 to renew, press 2 to speak with an agent.

Telecom · Tamil

Prepaid balance and validity

உங்கள் மெயின் பேலன்ஸ் 147 ரூபாய். டேட்டா பேலன்ஸ் 2.5 ஜிபி மீதமுள்ளது. வேலிடிட்டி 18 ஜூன் வரை. ரீசார்ஜ் செய்ய 1 அழுத்தவும்.

Logistics · Telugu

Delivery confirmation OTP

మీ ఆర్డర్ నంబర్ 7721 డెలివరీ అవుతోంది. డెలివరీ కన్ఫర్మ్ చేయడానికి OTP 4839. దయచేసి ఈ కోడ్ డెలివరీ పార్ట్‌నర్‌కి చెప్పండి.

Built for enterprise telephony

<250ms Generation latency

End-to-end from API call to audio response, including network round-trip on India-hosted infrastructure.

8kHz Telephony-native

Bulbul V3 is optimized for 8kHz output. IVR prompts sound natural on actual phone lines, not just in browser demos.

India-hosted SOC 2 + ISO 27001 certified

All data processed within India. No caller data stored post-processing. Audit logs for every API call.

11 Languages, one API

Hindi, Tamil, Telugu, Bengali, Kannada, Malayalam, Marathi, Gujarati, Odia, Punjabi, and English from a single endpoint.

Starting at ₹30 per 10K characters.
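
For budgeting, the per-character price translates into a simple volume estimate. The sketch below is a back-of-the-envelope calculation only: the prompt length, prompts per call, and call volume are made-up inputs, `estimate_cost_inr` is a hypothetical helper, and the rate is the "starting at" price quoted above, so actual pricing tiers may differ.

```python
RATE_INR_PER_10K_CHARS = 30.0  # "starting at" price quoted above; real tiers may differ


def estimate_cost_inr(chars_per_prompt: int, prompts_per_call: int, calls: int) -> float:
    """Estimate TTS spend for a given call volume at the quoted starting rate."""
    total_chars = chars_per_prompt * prompts_per_call * calls
    return total_chars / 10_000 * RATE_INR_PER_10K_CHARS


# Hypothetical workload: 150-character prompts, 4 prompts per call, 100,000 calls.
cost = estimate_cost_inr(chars_per_prompt=150, prompts_per_call=4, calls=100_000)
print(f"Estimated cost: ₹{cost:,.0f}")  # → Estimated cost: ₹180,000
```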

How Sarvam compares

Listener preference rate (8kHz), higher is better

ElevenLabs Flash V2.5: competitor win 10.37%, tie 11.68%, Bulbul V3 win 77.95%
ElevenLabs V3 Alpha: competitor win 28.14%, tie 28.21%, Bulbul V3 win 43.64%
Cartesia Sonic-3: competitor win 29.43%, tie 30.49%, Bulbul V3 win 40.08%
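
Because a sizeable share of comparisons end in a tie, it can help to look at the split among listeners who did express a preference. The sketch below reads the three figures for each competitor in legend order (competitor win, tie, Bulbul V3 win), which is an assumption about how the chart is laid out, and computes Bulbul V3's share of non-tied comparisons. `decisive_share` is a hypothetical helper, not a metric the page itself reports.

```python
# Figures from the chart, read in legend order: (competitor win, tie, Bulbul V3 win), in percent.
# The ordering is an assumption based on the legend.
PREFERENCE_RATES = {
    "ElevenLabs Flash V2.5": (10.37, 11.68, 77.95),
    "ElevenLabs V3 Alpha": (28.14, 28.21, 43.64),
    "Cartesia Sonic-3": (29.43, 30.49, 40.08),
}


def decisive_share(competitor_win: float, tie: float, bulbul_win: float) -> float:
    """Bulbul V3's share of comparisons in which listeners picked a winner (ties excluded)."""
    return bulbul_win / (bulbul_win + competitor_win)


for name, rates in PREFERENCE_RATES.items():
    print(f"{name}: {decisive_share(*rates):.1%} of non-tied comparisons")
```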

Who uses dynamic IVR?

Banking & financial services

Account balance inquiries, transaction alerts, loan status updates. Each prompt personalized with the caller's name, amounts, and dates, generated in their preferred language. No pre-recorded prompt library to maintain when products or compliance requirements change.

Insurance

Policy renewals, claim status, premium reminders. Every call reads out policy numbers, amounts, and due dates correctly in the policyholder's language. Update prompts instantly when IRDAI guidelines change.

Telecom

Balance inquiries, plan activations, complaint status. When tariff plans change, prompts update by changing text, not by re-recording across languages. Subscribers hear natural audio in their preferred language.

Government & public services

Aadhaar, DigiLocker, UMANG, and state-level portals need IVR in regional languages. Dynamic TTS serves citizens in 11 Indian languages from a single API, with all data processed within India.

Data sovereignty and compliance

Data localization

RBI's 2018 circular on storage of payment system data requires that payment system data be stored only in India. For IVR systems that handle account balances, transaction details, and OTPs, this is not optional. Routing caller data to overseas TTS servers risks breaching that requirement. Sarvam processes all data within India, on Indian infrastructure, with no cross-border data transfer.

Latency

Hosting location also affects latency. International API calls add 150-300ms of network latency before the TTS engine even starts processing. For IVR, where callers expect an immediate audio response after pressing a key, this delay is perceptible and frustrating. Sarvam's India-hosted infrastructure delivers sub-250ms end-to-end latency, including network round-trip.

Certifications

Enterprise IVR requires SOC 2 Type II certification, ISO 27001 compliance, no storage of caller data post-processing, audit logs for every API call, and SLA-backed uptime. Sarvam meets all of these. Sarvam does not store or train on any customer data passed through the API. Every request is processed and discarded.

From better IVR to full conversational AI

Dynamic IVR prompts are step one. Samvaad takes your phone system from menu navigation to real conversations. Callers speak naturally, the agent understands intent, and calls resolve without transfers. Same infrastructure, same compliance, same 11 languages.

  • Speech recognition (Saaras STT) in 11 Indian languages with Hinglish support
  • Conversational AI that handles complete calls, not just menu navigation
  • Intent detection and entity extraction tuned for Indian business domains (banking, insurance, telecom)
  • Single integration for STT, TTS, and dialogue management

Modernize your IVR with a single API. Powered by Bulbul V3.