Provider Directory

Compare supported providers by category, pricing, and expected latency. OVoice exposes all of them through a single API surface.

Reference values are sample public rates as of February 11, 2026.

Speech-to-Text

Transcribe user speech into text in realtime.

Deepgram Nova-2

Pricing: $0.0043 / min

Latency: ~200-280ms

Best use: Realtime transcription and low-latency assistants

OpenAI Whisper

Pricing: $0.006 / min

Latency: ~300-450ms

Best use: General-purpose speech recognition

AssemblyAI

Pricing: $0.00065 / sec (about $0.039 / min)

Latency: ~250-350ms

Best use: Cost-sensitive multi-channel workloads
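
The speech-to-text rates above mix per-minute and per-second units, so a side-by-side comparison needs a common unit. The following is a minimal sketch that normalizes the sample rates to dollars per minute. The names `STT_RATES`, `price_per_minute`, and `stt_prices_per_minute` are illustrative only and are not part of any OVoice API.

```python
# Sample STT rates copied from the listings above: (price in USD, billing unit).
STT_RATES = {
    "Deepgram Nova-2": (0.0043, "min"),
    "OpenAI Whisper": (0.006, "min"),
    "AssemblyAI": (0.00065, "sec"),
}

# How many seconds each billing unit covers.
SECONDS_PER_UNIT = {"sec": 1, "min": 60}


def price_per_minute(price: float, unit: str) -> float:
    """Convert a per-second or per-minute rate to dollars per minute."""
    return price * 60 / SECONDS_PER_UNIT[unit]


def stt_prices_per_minute() -> dict[str, float]:
    """Return every sample STT rate expressed in dollars per minute."""
    return {
        name: round(price_per_minute(price, unit), 6)
        for name, (price, unit) in STT_RATES.items()
    }


if __name__ == "__main__":
    # Print providers from cheapest to most expensive per minute.
    for name, rate in sorted(stt_prices_per_minute().items(), key=lambda kv: kv[1]):
        print(f"{name}: ${rate:.4f} / min")
```

Rounding to six decimal places only hides floating-point noise; it does not change the listed sample rates.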

Text-to-Speech

Generate natural voice responses from model output.

ElevenLabs Turbo

Pricing: $0.18 / 1K chars

Latency: ~220-400ms

Best use: Natural voice quality and premium output

OpenAI TTS-1

Pricing: $0.015 / 1K chars

Latency: ~160-260ms

Best use: Fast, low-cost synthetic voice

Cartesia Sonic

Pricing: $0.045 / 1K chars

Latency: ~120-210ms

Best use: Realtime conversational response

Language Models

Generate responses, reasoning, and tool-calling decisions.

Gemini 2.5 Flash

Pricing: $0.075 / $0.30 per 1M tokens (input / output)

Latency: ~120-250ms

Best use: Fast reasoning with low token cost

GPT-4o

Pricing: $2.50 / $10.00 per 1M tokens (input / output)

Latency: ~250-420ms

Best use: General quality and tool-heavy agents

Claude Sonnet 4

Pricing: $3.00 / $15.00 per 1M tokens (input / output)

Latency: ~280-430ms

Best use: Long-context reasoning and instruction following
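
To compare full stacks rather than single providers, the sample figures can be combined into a rough per-turn estimate. The sketch below picks one provider from each category and makes two assumptions that the listings do not state: the three stages run strictly back to back (streaming overlap would lower the real figure), and the language-model prices are dollars per 1M input / output tokens. All names here (`Stage`, `latency_range_ms`, `turn_cost_usd`) are hypothetical helpers, not OVoice functions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One pipeline stage with its sample latency range in milliseconds."""
    name: str
    latency_ms: tuple[int, int]


# Sample values copied from the listings above.
STT = Stage("Deepgram Nova-2", (200, 280))
LLM = Stage("Gemini 2.5 Flash", (120, 250))
TTS = Stage("Cartesia Sonic", (120, 210))

STT_USD_PER_MIN = 0.0043
LLM_USD_PER_1M_INPUT = 0.075   # assumed unit: per 1M input tokens
LLM_USD_PER_1M_OUTPUT = 0.30   # assumed unit: per 1M output tokens
TTS_USD_PER_1K_CHARS = 0.045


def latency_range_ms(stages: list[Stage]) -> tuple[int, int]:
    """Sum best-case and worst-case latencies, assuming sequential stages."""
    low = sum(stage.latency_ms[0] for stage in stages)
    high = sum(stage.latency_ms[1] for stage in stages)
    return low, high


def turn_cost_usd(audio_seconds: float, input_tokens: int,
                  output_tokens: int, reply_chars: int) -> float:
    """Estimate the cost of one conversational turn across all three stages."""
    stt = audio_seconds / 60 * STT_USD_PER_MIN
    llm = (input_tokens * LLM_USD_PER_1M_INPUT
           + output_tokens * LLM_USD_PER_1M_OUTPUT) / 1_000_000
    tts = reply_chars / 1000 * TTS_USD_PER_1K_CHARS
    return stt + llm + tts


if __name__ == "__main__":
    low, high = latency_range_ms([STT, LLM, TTS])
    print(f"Estimated turn latency: {low}-{high} ms")
    cost = turn_cost_usd(audio_seconds=10, input_tokens=500,
                         output_tokens=100, reply_chars=400)
    print(f"Estimated turn cost: ${cost:.6f}")
```

With these sample numbers the sequential latency estimate is 440-740 ms. In the example turn, speech synthesis accounts for most of the cost, so the choice of text-to-speech provider is likely to matter most for the total.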