Throughput
How many requests your provider can actually handle
What is throughput
Throughput is the number of tokens or requests a model API can process per unit of time. It is measured in tokens per second (TPS) for individual requests, or requests per minute (RPM) for aggregate capacity.
Different providers optimise for different throughput profiles. Groq's custom hardware delivers extremely high TPS for individual requests. OpenAI and Anthropic optimise for high concurrent RPM across many users.
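To make the two units concrete, here is a minimal sketch of how TPS and RPM could be computed from your own request logs. The `RequestRecord` type, its field names, and the sample values are assumptions made for illustration; they do not come from any provider's API or from measured data.

```python
from dataclasses import dataclass


@dataclass
class RequestRecord:
    """One completed API call (hypothetical log format)."""
    start_s: float        # wall-clock time the request was sent, in seconds
    end_s: float          # wall-clock time the last token arrived, in seconds
    output_tokens: int    # tokens generated in the response


def tokens_per_second(record: RequestRecord) -> float:
    """Per-request throughput: output tokens divided by elapsed time."""
    elapsed = record.end_s - record.start_s
    if elapsed <= 0:
        raise ValueError("request must have a positive duration")
    return record.output_tokens / elapsed


def requests_per_minute(records: list[RequestRecord]) -> float:
    """Aggregate throughput: completed requests over the observed window."""
    if not records:
        return 0.0
    window_s = max(r.end_s for r in records) - min(r.start_s for r in records)
    if window_s <= 0:
        raise ValueError("observation window must be positive")
    return len(records) * 60.0 / window_s


# Illustrative values only, not measurements from any provider.
sample = [
    RequestRecord(start_s=0.0, end_s=2.0, output_tokens=400),
    RequestRecord(start_s=1.0, end_s=5.0, output_tokens=400),
    RequestRecord(start_s=4.0, end_s=6.0, output_tokens=300),
]

print(tokens_per_second(sample[0]))   # TPS for the first request
print(requests_per_minute(sample))    # RPM across the whole sample
```

Note that the two figures answer different questions: the first describes how fast a single response streams, while the second describes how many overlapping requests complete in a given window.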
Why it matters
Throughput determines whether your application can scale. A model that is cheap but slow may cost more in practice than a faster, pricier alternative, because users wait longer for each response and your infrastructure sits idle while it is generated. sourc.dev tracks speed (TPS) as a verified attribute where available.
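The cost trade-off can be sketched with some simple arithmetic. The example below assumes a deliberately simplified model in which one worker is held for the full duration of each request, so the effective cost is the token price plus the cost of that idle worker. Every price, speed, and infrastructure rate here is hypothetical and chosen only to illustrate the effect; none describes a real provider.

```python
def effective_cost_per_request(
    output_tokens: int,
    price_per_million_tokens: float,
    tokens_per_second: float,
    infra_cost_per_hour: float,
) -> float:
    """Token cost plus the cost of a worker held idle while waiting.

    Assumes one worker is blocked for the whole generation time,
    which is a simplification of real deployments.
    """
    token_cost = output_tokens * price_per_million_tokens / 1_000_000
    wait_s = output_tokens / tokens_per_second
    idle_cost = wait_s * infra_cost_per_hour / 3600
    return token_cost + idle_cost


# Hypothetical figures: a cheap, slow model versus a pricier, faster one.
cheap_slow = effective_cost_per_request(
    output_tokens=1000,
    price_per_million_tokens=1.00,
    tokens_per_second=20,
    infra_cost_per_hour=0.50,
)
fast_pricey = effective_cost_per_request(
    output_tokens=1000,
    price_per_million_tokens=3.00,
    tokens_per_second=200,
    infra_cost_per_hour=0.50,
)

print(f"cheap but slow:  ${cheap_slow:.6f} per request")
print(f"fast but pricey: ${fast_pricey:.6f} per request")
```

Under these assumed numbers the slower model comes out more expensive per request despite its lower token price. The result depends entirely on the inputs, so substitute your own prices, measured TPS, and infrastructure costs before drawing conclusions.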
Verified March 2026 · Source: Groq benchmark data, provider documentation