sourc.dev
Claude 3.5 Sonnet input $3.00/1M ↓ -50%
GPT-4o input $2.50/1M
Gemini 1.5 Pro input $1.25/1M
Mistral Large input $2.00/1M ↓ -33%
DeepSeek V3 input $0.27/1M
synced 2026-04-05

Throughput

How many requests your provider can actually handle

What is throughput

Throughput is the number of tokens or requests a model API can process per unit of time. It is measured in tokens per second (TPS) for individual requests, or requests per minute (RPM) for aggregate capacity.
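The two measurements above can be sketched as simple ratios. This is an illustrative helper, not part of any provider's API; the sample numbers are assumptions for demonstration:

```python
from dataclasses import dataclass

@dataclass
class UsageSample:
    tokens: int      # tokens generated in one request
    seconds: float   # wall-clock time for that request

def tokens_per_second(sample: UsageSample) -> float:
    """Per-request throughput (TPS): tokens divided by elapsed time."""
    return sample.tokens / sample.seconds

def requests_per_minute(request_count: int, window_seconds: float) -> float:
    """Aggregate throughput (RPM): completed requests scaled to a minute."""
    return request_count / window_seconds * 60

# Example: 500 tokens generated in 10 s -> 50 TPS
print(tokens_per_second(UsageSample(tokens=500, seconds=10.0)))
# Example: 90 requests completed in a 180 s window -> 30 RPM
print(requests_per_minute(90, 180.0))
```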

Different providers optimise for different throughput profiles. Groq's custom hardware delivers extremely high TPS for individual requests. OpenAI and Anthropic optimise for high concurrent RPM across many users.

Why it matters

Throughput determines whether your application can scale. A model that is cheap but slow may cost more in practice than a faster, pricier alternative — while requests queue, your users are waiting and your infrastructure sits idle. sourc.dev tracks speed (TPS) as a verified attribute where available.
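The cheap-but-slow tradeoff is easy to quantify: at a fixed workload, wall-clock time scales inversely with TPS. The TPS figures below are made-up round numbers for illustration, not measured benchmarks:

```python
def time_to_serve(total_tokens: int, tps: float) -> float:
    """Seconds of wall-clock time to generate total_tokens at a given TPS."""
    return total_tokens / tps

WORKLOAD = 1_000_000  # tokens to generate (assumed workload)

# Hypothetical providers: one slow and cheap, one fast and pricier.
slow_seconds = time_to_serve(WORKLOAD, tps=30)    # ~33,333 s, roughly 9.3 hours
fast_seconds = time_to_serve(WORKLOAD, tps=300)   # ~3,333 s, roughly 56 minutes

print(f"slow: {slow_seconds / 3600:.1f} h, fast: {fast_seconds / 60:.0f} min")
```

Whether the 10x time saving justifies a higher per-token price depends on what idle users and idle infrastructure cost you in that window.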

Verified March 2026 · Source: Groq benchmark data, provider documentation

Related terms
Latency · Rate limit · API endpoint