Saaras V3
India's most accurate Speech-to-Text API
Lowest word error rates across 23 Indian languages. Real-time streaming, native code-switching, speaker diarization.
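For readers who want to check accuracy claims on their own audio, here is a minimal sketch of how word error rate (WER) is commonly computed: the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The function name and the whitespace tokenization are assumptions made for illustration; this is not the benchmark procedure used for the figures on this page.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution and one deletion against a four-word reference.
print(word_error_rate("a b c d", "a x c"))
```

In practice both transcripts are usually normalized first (casing, punctuation, number formatting) so that formatting differences are not counted as errors.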
Built for real workloads, not demos
Production-grade ASR with predictable latency, enterprise SLAs, and developer-first APIs.

Streaming-first architecture
Sub-150ms time to first token. Configurable Accurate, Balanced, and Fast modes for every latency requirement.

Code-switching & noise robust
Trained on 1M+ hours of real-world audio. Handles code-mixed speech, noisy telephony, and diverse accents.

Drop-in SDKs
Go live in under 10 minutes with official Python and Node.js SDKs. Pipecat & LiveKit ready.

23 Indian languages
All 22 scheduled languages plus English. Unified multilingual model with automatic language detection.

Beyond raw transcripts
Speaker diarization, word-level timestamps, output format control, and automatic language detection built in.
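As an illustration of what diarization plus word-level timestamps make possible downstream, the sketch below merges consecutive words from the same speaker into turns. The `Word` record and its `end` and `speaker` fields are hypothetical, chosen for this example; the actual response schema may differ.

```python
from dataclasses import dataclass


@dataclass
class Word:
    # Hypothetical word record; field names are assumptions for this sketch.
    text: str
    start: float  # seconds
    end: float    # seconds
    speaker: str


def group_into_turns(words):
    """Merge consecutive words spoken by the same speaker into one turn."""
    turns = []
    for w in words:
        if turns and turns[-1]["speaker"] == w.speaker:
            turns[-1]["text"] += " " + w.text
            turns[-1]["end"] = w.end
        else:
            turns.append(
                {"speaker": w.speaker, "start": w.start, "end": w.end, "text": w.text}
            )
    return turns


words = [
    Word("hello", 0.00, 0.40, "A"),
    Word("there", 0.45, 0.80, "A"),
    Word("hi", 1.10, 1.30, "B"),
]
for turn in group_into_turns(words):
    print(f'[{turn["start"]:.2f}-{turn["end"]:.2f}] {turn["speaker"]}: {turn["text"]}')
```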
Same audio, multiple formats
One API call. Transcription, translation, transliteration, or verbatim output.
Transcribe
Get clean, formatted transcripts with automatic number normalization, punctuation, and sentence segmentation.
Translate
Transcribe and translate Indic audio to English in a single API call. Works across all 23 supported languages.
Transliterate
Convert speech in any Indian language into romanized (Latin-script) text. Great for search indexing and chat interfaces.
Verbatim
Preserve every filler word, hesitation, and spoken number exactly as uttered. Ideal for compliance and legal transcriptions.
Powering real-world audio experiences
From contact centers to voice agents. Real use cases, already in production.
Seamless code-mixing
Understands when speakers switch between Hindi, English, and regional languages mid-sentence.
Cross-language detection
Mid-sentence switching
Natural transcription
Telephony-optimized
Handles real call center audio: 8kHz, background noise, multiple speakers.
8kHz audio support
Multi-speaker handling
Call center grade
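Before sending call recordings, it can help to confirm what a file actually contains. The sketch below uses Python's standard `wave` module to report sample rate, channel count, and duration. The helper names are illustrative and not part of any SDK, and `make_silent_wav` exists only to produce a test input.

```python
import io
import wave


def describe_wav(source):
    """Return basic audio properties for a WAV file path or file-like object."""
    with wave.open(source, "rb") as wf:
        rate = wf.getframerate()
        return {
            "sample_rate_hz": rate,
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "duration_s": wf.getnframes() / rate,
        }


def make_silent_wav(sample_rate, seconds):
    """Build an in-memory mono 16-bit WAV of silence (test input only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(sample_rate)
        wf.writeframes(b"\x00\x00" * int(sample_rate * seconds))
    return buf.getvalue()


# Simulate a one-second narrowband telephony clip and inspect it.
info = describe_wav(io.BytesIO(make_silent_wav(8000, 1.0)))
print(info)
```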
Handle noisy audio
Background noise, cross-talk, poor connections. Our models maintain accuracy even in challenging acoustic conditions.
Noise robust
Cross-talk handling
Poor connection tolerant
Developer-first platform
Drop-in SDKs for Python and Node.js. Go from zero to first transcription in under 5 minutes.
REST & WebSocket APIs
Standard REST for batch transcription, WebSocket for real-time streaming with sub-150ms time to first token.
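Real-time streaming clients typically send audio in small fixed-duration frames rather than as one file. The sketch below shows only that client-side chunking step for raw PCM audio. The 20 ms frame size, the 16 kHz default, and the function name are assumptions for illustration; the WebSocket message format itself is not shown here.

```python
def pcm_frames(pcm, sample_rate=16000, frame_ms=20, sample_width=2, channels=1):
    """Yield fixed-duration frames from raw PCM bytes; the last may be shorter."""
    samples_per_frame = sample_rate * frame_ms // 1000
    frame_bytes = samples_per_frame * sample_width * channels
    if frame_bytes <= 0:
        raise ValueError("frame size must be positive")
    for offset in range(0, len(pcm), frame_bytes):
        yield pcm[offset:offset + frame_bytes]


# One second of 16 kHz mono 16-bit silence.
one_second = b"\x00" * 32000
frames = list(pcm_frames(one_second))
print(len(frames), len(frames[0]))
```

Smaller frames reduce the delay before the first audio reaches the server, at the cost of more messages per second.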
SDKs & libraries
Official Python and Node.js SDKs with TypeScript support. Install the Python SDK with pip install sarvamai.
Streaming modes
Choose Accurate, Balanced, or Fast modes depending on your latency vs. accuracy needs.
Free tier included
Start building immediately. No credit card, no sales call, no minimum commitment.
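from sarvamai import SarvamAI

# Authenticate with your API subscription key
client = SarvamAI(api_subscription_key="YOUR_SARVAM_API_KEY")

# Transcribe a local Hindi audio file with the Saaras V3 model
response = client.speech_to_text.transcribe(
    file_path="audio.wav",
    language="hi-IN",
    model="saaras:v3"
)

print(response.transcript)

# Print each word with its start time in seconds
for word in response.words:
    print(f"[{word.start:.2f}s] {word.text}")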
Enterprise-ready. Responsible AI.
Built with safety, compliance, and data sovereignty at the core.
SOC 2 Type II & ISO 27001
Enterprise-grade security certifications. Annual audits, documented controls, continuous monitoring.
Data sovereignty
All audio processed and stored in India. No cross-border transfers. Full compliance with Indian data regulations.
No training on your data
Your API inputs are never used for model training. Zero data retention after processing unless explicitly requested.
PII redaction
Automatically detect and redact sensitive information like Aadhaar numbers, phone numbers, and addresses from transcripts.
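To make the idea concrete, here is a rough pattern-based sketch of redaction applied to a finished transcript. It is not the service's redaction method: it assumes only that Aadhaar numbers are 12 digits (often written in groups of four) and that Indian mobile numbers are 10 digits starting with 6 to 9, optionally prefixed with +91. Addresses are not handled, and the function name and placeholder tokens are made up for this example.

```python
import re

# 12 digits, optionally grouped 4-4-4 with spaces or hyphens.
AADHAAR_RE = re.compile(r"(?<!\d)\d{4}[ -]?\d{4}[ -]?\d{4}(?!\d)")
# 10-digit mobile number starting with 6-9, with an optional +91 prefix.
PHONE_RE = re.compile(r"(?<!\d)(?:\+91[ -]?)?[6-9]\d{9}(?!\d)")


def redact(text):
    """Replace Aadhaar-like and phone-like digit runs with placeholders."""
    # Aadhaar first, so a 12-digit run is not partially treated as a phone number.
    text = AADHAAR_RE.sub("[AADHAAR]", text)
    return PHONE_RE.sub("[PHONE]", text)


print(redact("My Aadhaar is 2345 6789 0123 and phone is +91 9876543210."))
```

Simple patterns like these will miss numbers that are spoken as words and can over-match other 12-digit identifiers, which is why a built-in redaction feature is preferable for production use.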
Content safety filters
Automated detection and flagging of harmful, abusive, or sensitive content in transcriptions.
Audit-ready logging
Comprehensive API usage logs, access controls, and RBAC for enterprise governance and compliance reporting.
Base plan
Free trial included
No credit card required. Get API keys instantly.
Start building with India's best ASR. Go live in minutes.