sourc.dev
Claude 3.5 Sonnet input $3.00/1M ↓ -50%
GPT-4o input $2.50/1M
Gemini 1.5 Pro input $1.25/1M
Mistral Large input $2.00/1M ↓ -33%
DeepSeek V3 input $0.27/1M
synced 2026-04-05
#46 of 50

Batch API

Half price, same model, no latency guarantee

What is the batch API

The batch API is a mode offered by model providers where you submit a file of requests and receive results within 24 hours at a 50% discount. OpenAI's batch API accepts JSONL files with up to 50,000 requests per batch.
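A batch input file is just one JSON object per line. The sketch below builds a small JSONL file in the shape OpenAI's batch API documents (`custom_id`, `method`, `url`, `body`); the model name and prompts are illustrative placeholders, not values from this page.

```python
import json

# Illustrative requests — in practice these would come from your own dataset.
requests = [
    {"id": f"req-{i}", "prompt": f"Summarise document {i}."}
    for i in range(3)
]

with open("batch_input.jsonl", "w") as f:
    for r in requests:
        line = {
            "custom_id": r["id"],            # unique per request; echoed back in the results file
            "method": "POST",
            "url": "/v1/chat/completions",   # endpoint each request targets
            "body": {
                "model": "gpt-4o",
                "messages": [{"role": "user", "content": r["prompt"]}],
            },
        }
        f.write(json.dumps(line) + "\n")
```

The file is then uploaded and referenced when creating the batch job; results arrive as a matching JSONL file keyed by `custom_id`.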

The cost works out like this. Take 10,000 requests to GPT-4o averaging 1,000 input tokens each: that is 10M input tokens, which at the standard $2.50/1M rate costs $25.00. The same 10,000 requests via the batch API cost $12.50. Same model, same output quality. The only difference is that you wait up to 24 hours instead of getting responses in seconds.
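The arithmetic above can be written as a two-line helper. This is a sketch of input-token cost only (output tokens are priced separately), using the figures from this page.

```python
def input_cost_usd(n_requests: int, tokens_per_request: int,
                   price_per_million: float, batch: bool = False) -> float:
    """Input-token cost for a job; batch=True applies the 50% batch discount."""
    total_tokens = n_requests * tokens_per_request
    cost = total_tokens / 1_000_000 * price_per_million
    return cost / 2 if batch else cost

# 10,000 requests x 1,000 input tokens at GPT-4o's $2.50/1M
standard = input_cost_usd(10_000, 1_000, 2.50)              # 25.0
batched = input_cost_usd(10_000, 1_000, 2.50, batch=True)   # 12.5
```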

Why it matters

Any workload that does not require real-time responses — data extraction, content generation, evaluation runs, training data preparation — should use batch pricing. It is the single largest cost reduction available without changing models or reducing quality. sourc.dev tracks batch processing support as a capability flag.

Verified March 2026 · Source: OpenAI batch API docs

Related terms
Batch pricing · Input price · Rate limit