Sarvam AI

Speech to Text

Convert speech to text with natural accuracy

Voice to text, audio to text. One API for all. Multilingual speech recognition that handles accents, noise, and code-switching.

Every word, captured accurately

Code-mixing. Numbers. Proper nouns. Abbreviations.

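हाँ मैंने [PROPER NOUN] Flipkart से [NUMBER] ₹12,499 का [ENTITY] Samsung Galaxy [CODE SWITCH] फ़ोन ऑर्डर किया था। ऑर्डर ID है [ABBREVIATION] FLK-78234-XN। [CODE SWITCH] डिलीवरी एड्रेस है 42, [PROPER NOUN] Koramangala 5th Block, [PROPER NOUN] Bangalore। मेरा फ़ोन नंबर है [PHONE NUMBER] 9840950950। [CODE SWITCH] प्लीज़ डिलीवरी जल्दी कर दीजिए, मेरा पुराना फ़ोन बिल्कुल काम नहीं कर रहा।

(English: Yes, I ordered a Samsung Galaxy phone from Flipkart for ₹12,499. The order ID is FLK-78234-XN. The delivery address is 42, Koramangala 5th Block, Bangalore. My phone number is 9840950950. Please deliver it quickly, my old phone is not working at all.)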

Same audio, multiple formats

One API call. Transcription, translation, transliteration, or verbatim output.

Formatted

Transcribe

Get clean, formatted transcripts with automatic number normalization, punctuation, and sentence segmentation.

Cross-lingual

Translate

Transcribe and translate Indic audio to English in a single API call. Works across all 23 supported languages.

Romanized

Transliteration

Convert speech in any supported Indian language into romanized (Latin-script) text. Useful for search indexing and chat interfaces.

Raw

Verbatim

Preserve every filler word, hesitation, and spoken number exactly as uttered. Ideal for compliance and legal transcriptions.

Powering real-world audio experiences

From contact centers to voice agents. Real use cases, already in production.

Code Mixing

Seamless code-mixing

Understands when speakers switch between Hindi, English, and regional languages mid-sentence.

Cross-language detection

Mid-sentence switching

Natural transcription
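
A quick way to see how much script mixing a transcript contains is to count transitions between Devanagari and Latin tokens. The sketch below is a client-side illustration only and says nothing about how the models detect code-mixing; the helper names `script_of` and `count_switches` are hypothetical. It measures script changes in the output text, so English words written in Devanagari (such as फ़ोन) are not counted as switches.

```python
import unicodedata


def script_of(token: str) -> str:
    """Classify a token by the script of its first alphabetic character."""
    for ch in token:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name.startswith("DEVANAGARI"):
                return "devanagari"
            if name.startswith("LATIN"):
                return "latin"
            return "other"
    # Tokens with no letters (numbers, prices, punctuation) carry no script.
    return "other"


def count_switches(text: str) -> int:
    """Count Devanagari <-> Latin transitions between consecutive scripted tokens."""
    scripts = [s for s in map(script_of, text.split()) if s != "other"]
    return sum(1 for a, b in zip(scripts, scripts[1:]) if a != b)


if __name__ == "__main__":
    sample = "मैंने Flipkart से फ़ोन order किया"
    print(count_switches(sample))
```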

Call Center

Telephony-optimized

Handles real call center audio: 8kHz, background noise, multiple speakers.

8kHz audio support

Multi-speaker handling

Call center grade
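
Before sending telephony recordings, it can help to confirm what sample rate and channel count a file actually has. The following is a minimal sketch using only Python's standard `wave` module; the helper names `wav_info` and `make_silent_wav` are illustrative and not part of any Sarvam SDK.

```python
import io
import wave


def wav_info(data: bytes) -> dict:
    """Read sample rate, channel count, and duration from WAV bytes."""
    with wave.open(io.BytesIO(data), "rb") as wf:
        rate = wf.getframerate()
        return {
            "sample_rate_hz": rate,
            "channels": wf.getnchannels(),
            "duration_s": wf.getnframes() / rate,
        }


def make_silent_wav(sample_rate_hz: int, seconds: float) -> bytes:
    """Build a mono 16-bit silent WAV in memory, for demonstration only."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 16-bit samples
        wf.setframerate(sample_rate_hz)
        wf.writeframes(b"\x00\x00" * int(sample_rate_hz * seconds))
    return buf.getvalue()


if __name__ == "__main__":
    info = wav_info(make_silent_wav(8000, 2.0))
    print(info)  # an 8 kHz mono clip, typical of call center audio
```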

Noisy Audio

Handle noisy audio

Background noise, cross-talk, poor connections. Our models maintain accuracy even in challenging acoustic conditions.

Noise robust

Cross-talk handling

Poor connection tolerant

Developer-first platform

Drop-in SDKs for Python and Node.js. Go from zero to first transcription in under 5 minutes.

REST & WebSocket APIs

Standard REST for batch transcription, WebSocket for real-time streaming with sub-150ms time to first token.
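
Real-time streaming generally means sending audio in small, fixed-duration pieces rather than as one file. The sketch below shows only the client-side chunking step for raw PCM audio; the WebSocket message format is not described here, so nothing in it reflects the actual streaming protocol. The function name `pcm_chunks` and the 100 ms default are assumptions chosen for illustration.

```python
from typing import Iterator


def pcm_chunks(
    pcm: bytes,
    sample_rate_hz: int,
    chunk_ms: int = 100,
    sample_width: int = 2,
) -> Iterator[bytes]:
    """Yield consecutive chunks of raw mono PCM, each covering chunk_ms of audio."""
    # Compute whole samples per chunk first so chunks never split a sample.
    bytes_per_chunk = sample_rate_hz * chunk_ms // 1000 * sample_width
    if bytes_per_chunk <= 0:
        raise ValueError("chunk_ms is too small for this sample rate")
    for offset in range(0, len(pcm), bytes_per_chunk):
        yield pcm[offset:offset + bytes_per_chunk]


if __name__ == "__main__":
    one_second = bytes(16000 * 2)  # 1 s of 16 kHz, 16-bit mono silence
    chunks = list(pcm_chunks(one_second, 16000))
    print(len(chunks), len(chunks[0]))
```

Smaller chunks reduce the delay before the first audio reaches the server but increase the number of messages sent; the right size depends on the latency target.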

SDKs & libraries

Official Python and Node.js SDKs with TypeScript support. Install the Python SDK with pip install sarvamai.

Streaming modes

Choose Accurate, Balanced, or Fast modes depending on your latency vs. accuracy needs.

Free tier included

Start building immediately. No credit card, no sales call, no minimum commitment.

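from sarvamai import SarvamAI

# Authenticate with your API subscription key.
client = SarvamAI(api_subscription_key="YOUR_SARVAM_API_KEY")

# Transcribe a local Hindi (hi-IN) recording with the saaras:v3 model.
response = client.speech_to_text.transcribe(
    file_path="audio.wav",
    language="hi-IN",
    model="saaras:v3"
)

# Full transcript text.
print(response.transcript)

# Per-word start times, in seconds.
for word in response.words:
    print(f"[{word.start:.2f}s] {word.text}")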
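
Word-level start times like those printed above can be grouped into caption-style lines by splitting wherever the speaker pauses. This is a rough sketch, not part of the SDK: the `Word` dataclass is a stand-in that mirrors only the `start` and `text` fields used in the snippet, and the 0.8 second gap threshold is an arbitrary assumption.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Word:
    """Stand-in for a word object with a start time (seconds) and text."""
    start: float
    text: str


def caption_lines(words: Iterable[Word], max_gap_s: float = 0.8) -> List[str]:
    """Group words into lines, starting a new line after a long gap between word starts."""
    lines: List[str] = []
    current: List[str] = []
    line_start = 0.0
    prev_start = None
    for w in words:
        # Flush the current line when the gap since the previous word is too long.
        if prev_start is not None and current and w.start - prev_start > max_gap_s:
            lines.append(f"[{line_start:.2f}s] " + " ".join(current))
            current = []
        if not current:
            line_start = w.start
        current.append(w.text)
        prev_start = w.start
    if current:
        lines.append(f"[{line_start:.2f}s] " + " ".join(current))
    return lines


if __name__ == "__main__":
    demo = [Word(0.0, "हाँ"), Word(0.4, "मैंने"), Word(2.0, "Flipkart"), Word(2.3, "से")]
    for line in caption_lines(demo):
        print(line)
```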

Enterprise-ready. Responsible AI.

Built with safety, compliance, and data sovereignty at the core.

SOC 2 Type II & ISO 27001

Enterprise-grade security certifications. Annual audits, documented controls, continuous monitoring.

Data sovereignty

All audio processed and stored in India. No cross-border transfers. Full compliance with Indian data regulations.

No training on your data

Your API inputs are never used for model training. Zero data retention after processing unless explicitly requested.

PII redaction

Automatically detect and redact sensitive information like Aadhaar numbers, phone numbers, and addresses from transcripts.
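
To make the idea concrete, here is a simplified pattern-based sketch of redaction applied to a finished transcript. It is not the service's redaction method, which is not described here; the regular expressions, the placeholder labels, and the function name `redact` are all assumptions. It covers only 12-digit Aadhaar-style numbers and 10-digit Indian mobile numbers, and does not attempt addresses.

```python
import re

# Assumed pattern: 12 digits, optionally grouped 4-4-4, first digit 2-9.
AADHAAR_RE = re.compile(r"(?<!\d)[2-9]\d{3}[ -]?\d{4}[ -]?\d{4}(?!\d)")
# Assumed pattern: 10-digit mobile number starting 6-9, optional +91 prefix.
PHONE_RE = re.compile(r"(?<!\d)(?:\+91[ -]?)?[6-9]\d{9}(?!\d)")


def redact(text: str) -> str:
    """Replace Aadhaar-like and phone-like numbers with placeholder labels."""
    # Aadhaar first: its 12-digit pattern is longer than the phone pattern.
    text = AADHAAR_RE.sub("[AADHAAR]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


if __name__ == "__main__":
    print(redact("My number is 9840950950 and Aadhaar is 2345 6789 0123."))
```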

Content safety filters

Automated detection and flagging of harmful, abusive, or sensitive content in transcriptions.

Audit-ready logging

Comprehensive API usage logs, access controls, and RBAC for enterprise governance and compliance reporting.

The most affordable transcription API

Start free. Scale as you grow. No hidden costs.

Base plan

₹1.5 per minute

Free trial included

No credit card required. Get API keys instantly.

Volume discounts available
Enterprise pricing available
Flexible pricing plans
Usage analytics
Integration with APIs
Best for startups
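
For budgeting, the base rate of ₹1.5 per minute can be turned into a quick estimate. The sketch below assumes usage is pro-rated by the second and ignores the free trial and any volume discounts; actual billing granularity is not stated here, so treat the result as an approximation. The function name `estimate_cost_inr` is illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP

# Base plan rate quoted above.
RATE_INR_PER_MINUTE = Decimal("1.5")


def estimate_cost_inr(total_seconds: float) -> Decimal:
    """Estimate cost in rupees, assuming per-second pro-rating (an assumption)."""
    minutes = Decimal(str(total_seconds)) / Decimal(60)
    cost = minutes * RATE_INR_PER_MINUTE
    # Round to whole paise.
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    # 1,000 calls averaging 4 minutes each.
    print(estimate_cost_inr(1000 * 4 * 60))
```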

Start building with India's best ASR. Go live in minutes.