sourc.dev
Claude 3.5 Sonnet input $3.00/1M ↓ -50%
GPT-4o input $2.50/1M
Gemini 1.5 Pro input $1.25/1M
Mistral Large input $2.00/1M ↓ -33%
DeepSeek V3 input $0.27/1M
synced 2026-04-05

Throughput

How many requests your provider can actually handle

What is throughput

Throughput is the number of tokens or requests a model API can process per unit of time. It is measured in tokens per second (TPS) for individual requests, or requests per minute (RPM) for aggregate capacity.
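The two measurements above can be sketched as simple ratios. This is an illustrative helper, not part of any provider's API; the sample numbers are assumptions for demonstration:

```python
from dataclasses import dataclass

@dataclass
class UsageSample:
    tokens: int      # tokens generated in one request
    seconds: float   # wall-clock time for that request

def tokens_per_second(sample: UsageSample) -> float:
    """Per-request throughput (TPS): tokens divided by elapsed time."""
    return sample.tokens / sample.seconds

def requests_per_minute(request_count: int, window_seconds: float) -> float:
    """Aggregate throughput (RPM): completed requests scaled to a minute."""
    return request_count / window_seconds * 60

# Example: 500 tokens generated in 10 s -> 50 TPS
print(tokens_per_second(UsageSample(tokens=500, seconds=10.0)))
# Example: 90 requests completed in a 180 s window -> 30 RPM
print(requests_per_minute(90, 180.0))
```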

Different providers optimise for different throughput profiles. Groq's custom hardware delivers extremely high TPS for individual requests. OpenAI and Anthropic optimise for high concurrent RPM across many users.

Why it matters

Throughput determines whether your application can scale. A model that is cheap but slow may cost more in practice than a faster, pricier alternative — while requests queue, your users are waiting and your infrastructure sits idle. sourc.dev tracks speed (TPS) as a verified attribute where available.
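The cheap-but-slow tradeoff is easy to quantify: at a fixed workload, wall-clock time scales inversely with TPS. The TPS figures below are made-up round numbers for illustration, not measured benchmarks:

```python
def time_to_serve(total_tokens: int, tps: float) -> float:
    """Seconds of wall-clock time to generate total_tokens at a given TPS."""
    return total_tokens / tps

WORKLOAD = 1_000_000  # tokens to generate (assumed workload)

# Hypothetical providers: one slow and cheap, one fast and pricier.
slow_seconds = time_to_serve(WORKLOAD, tps=30)    # ~33,333 s, roughly 9.3 hours
fast_seconds = time_to_serve(WORKLOAD, tps=300)   # ~3,333 s, roughly 56 minutes

print(f"slow: {slow_seconds / 3600:.1f} h, fast: {fast_seconds / 60:.0f} min")
```

Whether the 10x time saving justifies a higher per-token price depends on what idle users and idle infrastructure cost you in that window.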

Verified March 2026 · Source: Groq benchmark data, provider documentation

Related terms
Latency · Rate limit · API endpoint