Speechmatics

Low-latency speech-to-text APIs powering multilingual, multi-speaker Voice AI

Free tier available · All audiences · API available

Key strengths

  • Sub-second real-time speech-to-text with high accuracy
  • 55+ language support covering over half the world's population
  • Flexible deployment: cloud, on-premises, and on-device
  • Enterprise-grade security: ISO 27001, GDPR, HIPAA, SOC 2 Type II certified
  • Specialized models for verticals like medical, legal, and contact centers

Free tier + paid plans
Cambridge, United Kingdom
Founded 2006
Self-hostable

Speechmatics API — Technical Reference

Authentication

All requests require a Bearer token in the Authorization header. Obtain your API key from the Speechmatics dashboard.
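
As a minimal sketch of the header described above, the helper below builds the Bearer header from a key. The function names and the SPEECHMATICS_API_KEY environment variable are assumptions made for illustration, not official settings.

```python
import os


def auth_headers(api_key: str) -> dict[str, str]:
    """Return the Authorization header expected on every request."""
    if not api_key:
        raise ValueError("API key is empty")
    return {"Authorization": f"Bearer {api_key}"}


def auth_headers_from_env(var_name: str = "SPEECHMATICS_API_KEY") -> dict[str, str]:
    # var_name is a hypothetical convention for this sketch; use whatever
    # secret store or variable name your deployment already has.
    return auth_headers(os.environ.get(var_name, ""))
```

Reading the key from the environment keeps it out of source control; the same dictionary can be passed to any HTTP or WebSocket client that accepts custom headers.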

Real-Time Transcription (WebSocket)

Connect to the real-time endpoint over WebSocket and stream audio in chunks. The API returns transcripts as JSON messages, typically in under one second.

# Example: Start a real-time session (conceptual)
wscat -c wss://eu2.rt.speechmatics.com/v2 \
  -H "Authorization: Bearer <YOUR_API_KEY>"
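
Once the socket is open, a session roughly consists of one JSON frame that starts recognition, a series of binary audio frames, and one JSON frame that ends the stream. The sketch below only builds those frames and does not open a connection. The message names (StartRecognition, EndOfStream), the audio_format fields, and the 4096-byte chunk size are assumptions based on the publicly documented real-time protocol and should be checked against the current API reference.

```python
import json
from typing import Iterator


def start_recognition(language: str = "en", sample_rate: int = 16000,
                      enable_partials: bool = True) -> str:
    """Build the JSON text frame that opens a session (assumed message shape)."""
    return json.dumps({
        "message": "StartRecognition",
        "audio_format": {"type": "raw", "encoding": "pcm_s16le",
                         "sample_rate": sample_rate},
        "transcription_config": {"language": language,
                                 "enable_partials": enable_partials},
    })


def audio_chunks(pcm: bytes, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield fixed-size binary frames; the last frame may be shorter."""
    for offset in range(0, len(pcm), chunk_size):
        yield pcm[offset:offset + chunk_size]


def end_of_stream(chunks_sent: int) -> str:
    """Build the closing frame, reporting how many audio frames were sent."""
    return json.dumps({"message": "EndOfStream", "last_seq_no": chunks_sent})
```

A client would send start_recognition() as text, each item from audio_chunks() as a binary frame, and finally end_of_stream(n), where n is the number of chunks sent.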

Batch Transcription (REST)

curl -X POST https://asr.api.speechmatics.com/v2/jobs \
  -H "Authorization: Bearer <YOUR_API_KEY>" \
  -F "data_file=@audio.wav" \
  -F 'config={"type":"transcription","transcription_config":{"language":"en"}}'
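
The same request can be assembled in Python using only the standard library. This is a sketch rather than an official client: build_job_request is a hypothetical helper, the multipart body is built by hand to mirror the two -F fields of the curl command, and the request is constructed but not sent.

```python
import json
import uuid
import urllib.request

JOBS_URL = "https://asr.api.speechmatics.com/v2/jobs"  # taken from the curl example


def build_job_request(api_key: str, filename: str, audio: bytes,
                      language: str = "en") -> urllib.request.Request:
    """Assemble (but do not send) the multipart POST that creates a batch job."""
    config = json.dumps({
        "type": "transcription",
        "transcription_config": {"language": language},
    })
    boundary = uuid.uuid4().hex  # random separator between form fields
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        b'Content-Disposition: form-data; name="config"\r\n\r\n',
        config.encode(), b"\r\n",
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="data_file"; '
        f'filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        audio, b"\r\n",
        f"--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        JOBS_URL, data=body, method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to urllib.request.urlopen would submit the job; the shape of the response is not covered here.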

Key Parameters

Parameter          Description
language           BCP-47 language code (55+ supported)
enable_partials    Stream partial results before the final transcript
diarization        Enable speaker separation (speaker or channel)
operating_point    Accuracy model: standard or enhanced
custom_vocabulary  Domain-specific terms to boost accuracy
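
To show how these parameters might combine, the sketch below builds a transcription_config dictionary and rejects values the table does not list. The helper name and the validation rules are assumptions for illustration. The custom vocabulary option is left out because this reference does not specify its request format.

```python
from typing import Optional

OPERATING_POINTS = {"standard", "enhanced"}
DIARIZATION_MODES = {"speaker", "channel"}


def transcription_config(language: str,
                         operating_point: Optional[str] = None,
                         diarization: Optional[str] = None,
                         enable_partials: Optional[bool] = None) -> dict:
    """Build a transcription_config dict, leaving out options that are not set."""
    if operating_point is not None and operating_point not in OPERATING_POINTS:
        raise ValueError(f"operating_point must be one of {sorted(OPERATING_POINTS)}")
    if diarization is not None and diarization not in DIARIZATION_MODES:
        raise ValueError(f"diarization must be one of {sorted(DIARIZATION_MODES)}")
    config: dict = {"language": language}
    if operating_point is not None:
        config["operating_point"] = operating_point
    if diarization is not None:
        config["diarization"] = diarization
    if enable_partials is not None:
        # Partial results only make sense for streaming sessions.
        config["enable_partials"] = enable_partials
    return config
```

For example, transcription_config("en", operating_point="enhanced", diarization="speaker") yields a dictionary that can be placed under the "transcription_config" key of a job or session config.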

Deployment Options

  • Cloud API — Multi-region SaaS (EU, US)
  • On-Premises — Docker/Kubernetes deployment with no data egress
  • On-Device — Optimized quantized models for edge hardware (e.g., laptop CPUs)