sourc.dev
Claude 3.5 Sonnet input $3.00/1M ↓ -50%
GPT-4o input $2.50/1M
Gemini 1.5 Pro input $1.25/1M
Mistral Large input $2.00/1M ↓ -33%
DeepSeek V3 input $0.27/1M
synced 2026-04-05

Batch pricing

Some API calls cost half as much if you can wait

What is batch pricing

Batch pricing is a discounted rate offered by model providers for API requests submitted in bulk with no latency guarantee. Instead of processing each request immediately, the provider queues batch requests and processes them when compute capacity is available — typically within 24 hours.

OpenAI offers 50% off for batch API requests. Anthropic offers a similar discount. The model output is identical — the only difference is response time.

Why it matters

If your workload does not require real-time responses — data processing, content generation at scale, overnight evaluation runs — batch pricing cuts your model cost in half. The constraint is latency, not capability.
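The savings are easy to estimate. A minimal sketch (the `batch_cost` helper is hypothetical; the $2.50/1M GPT-4o input price is from the table above, and the 50% discount matches OpenAI's published batch rate):

```python
def batch_cost(tokens: int, price_per_million: float, discount: float = 0.5) -> float:
    """Dollar cost of `tokens` input tokens at the discounted batch rate."""
    return tokens / 1_000_000 * price_per_million * (1 - discount)

# 200M input tokens of GPT-4o at $2.50/1M:
standard = 200_000_000 / 1_000_000 * 2.50   # $500.00 at the real-time rate
batched = batch_cost(200_000_000, 2.50)     # $250.00 if you can wait up to 24h
```

For a recurring overnight job, that difference compounds every run.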

Verified March 2026 · Source: OpenAI batch API docs

Related terms
Input price · Output price · Rate limit