Provider Directory

Compare supported providers by category, pricing, and expected latency. OVoice exposes all of them through a single API surface.

Reference values are sample public rates as of February 11, 2026.

Speech-to-Text

Transcribe user speech into text in real time.

Deepgram Nova-2

Pricing: $0.0043 / min

Latency: ~200-280ms

Best use: Realtime transcription and low-latency assistants

OpenAI Whisper

Pricing: $0.006 / min

Latency: ~300-450ms

Best use: General purpose speech recognition

AssemblyAI

Pricing: $0.00065 / sec (about $0.039 / min)

Latency: ~250-350ms

Best use: Cost-sensitive multi-channel workloads
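
Because the speech-to-text rates above mix per-minute and per-second billing, it can help to normalize them before comparing. The following is a minimal Python sketch using only the sample rates listed in this section; the names `STT_RATES`, `price_per_minute`, and `transcription_cost` are illustrative and not part of any OVoice or provider API, and the estimate ignores billing minimums and rounding rules that a provider may apply.

```python
# Sample STT rates copied from the list above: (price in USD, billing unit in seconds).
STT_RATES = {
    "Deepgram Nova-2": (0.0043, 60),  # listed per minute
    "OpenAI Whisper": (0.006, 60),    # listed per minute
    "AssemblyAI": (0.00065, 1),       # listed per second
}


def price_per_minute(provider: str) -> float:
    """Normalize a listed rate to USD per minute of audio."""
    price, unit_seconds = STT_RATES[provider]
    return price * 60 / unit_seconds


def transcription_cost(provider: str, audio_seconds: float) -> float:
    """Estimate the cost of transcribing `audio_seconds` of audio.

    Assumption: linear billing with no minimum charge or rounding.
    """
    return price_per_minute(provider) * audio_seconds / 60


if __name__ == "__main__":
    for name in STT_RATES:
        print(
            f"{name}: ${price_per_minute(name):.4f}/min, "
            f"${transcription_cost(name, 600):.4f} for a 10-minute call"
        )
```

Normalized this way, the AssemblyAI sample rate works out to roughly $0.039 per minute, which is the figure to compare against the two per-minute rates.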

Text-to-Speech

Generate natural voice responses from model output.

ElevenLabs Turbo

Pricing: $0.18 / 1K chars

Latency: ~220-400ms

Best use: Natural voice quality and premium output

OpenAI TTS-1

Pricing: $0.015 / 1K chars

Latency: ~160-260ms

Best use: Fast, low-cost synthetic voice

Cartesia Sonic

Pricing: $0.045 / 1K chars

Latency: ~120-210ms

Best use: Realtime conversational response
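
Text-to-speech rates above are quoted per 1K characters, so the cost of a spoken reply scales with its length. A small Python sketch of that arithmetic follows, using the sample rates from this section; `TTS_RATES_PER_1K_CHARS` and `synthesis_cost` are hypothetical names, and the sketch assumes every character (including spaces and punctuation) is billed, which may differ by provider.

```python
# Sample TTS rates copied from the list above, in USD per 1,000 characters.
TTS_RATES_PER_1K_CHARS = {
    "ElevenLabs Turbo": 0.18,
    "OpenAI TTS-1": 0.015,
    "Cartesia Sonic": 0.045,
}


def synthesis_cost(provider: str, text: str) -> float:
    """Estimate the cost of synthesizing `text`.

    Assumption: every character counts toward billing, with no minimum charge.
    """
    return TTS_RATES_PER_1K_CHARS[provider] * len(text) / 1000


if __name__ == "__main__":
    reply = "Thanks for calling. Your order has shipped and should arrive on Friday."
    for name in TTS_RATES_PER_1K_CHARS:
        print(f"{name}: ${synthesis_cost(name, reply):.5f} for {len(reply)} characters")
```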

Large Language Models

Generate responses, reasoning, and tool-calling decisions.

Gemini 2.5 Flash

Pricing: $0.075 / $0.30 per 1M tokens (input / output)

Latency: ~120-250ms

Best use: Fast reasoning with low token cost

GPT-4o

Pricing: $2.50 / $10.00 per 1M tokens (input / output)

Latency: ~250-420ms

Best use: General quality and tool-heavy agents

Claude Sonnet 4

Pricing: $3.00 / $15.00 per 1M tokens (input / output)

Latency: ~280-430ms

Best use: Long-context reasoning and instruction following
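
The language model prices above are listed as pairs. The Python sketch below estimates the cost of a single model turn under the assumption that each pair means USD per 1M input tokens and USD per 1M output tokens, which is the common convention for these models. The names `LLM_RATES` and `completion_cost` are illustrative only and do not come from OVoice or any provider SDK.

```python
# Sample LLM rates copied from the list above.
# Assumption: each pair is (input, output) in USD per 1,000,000 tokens.
LLM_RATES = {
    "Gemini 2.5 Flash": (0.075, 0.30),
    "GPT-4o": (2.50, 10.00),
    "Claude Sonnet 4": (3.00, 15.00),
}

TOKENS_PER_PRICING_UNIT = 1_000_000


def completion_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one model call from its token counts."""
    input_rate, output_rate = LLM_RATES[model]
    total = input_tokens * input_rate + output_tokens * output_rate
    return total / TOKENS_PER_PRICING_UNIT


if __name__ == "__main__":
    # Example turn: 1,000 prompt tokens in, 200 response tokens out.
    for name in LLM_RATES:
        print(f"{name}: ${completion_cost(name, 1000, 200):.6f} per turn")
```

Output tokens are priced several times higher than input tokens for all three models, so response length tends to dominate per-turn cost in short voice exchanges.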