sourc.dev
Claude 3.5 Sonnet input $3.00/1M ↓ -50%
GPT-4o input $2.50/1M
Gemini 1.5 Pro input $1.25/1M
Mistral Large input $2.00/1M ↓ -33%
DeepSeek V3 input $0.27/1M
synced 2026-04-05
#47 of 50

Async vs sync

The architecture decision behind every API call

What is async vs sync

Synchronous (sync) API calls block until the response is complete — your code waits. Asynchronous (async) calls return immediately with a reference, and you check back later for the result. Streaming is a third pattern: the response arrives token by token as the model generates it.
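The sync/async contrast can be sketched with simulated calls — a minimal sketch, not a real API client. The `asyncio.sleep` / `time.sleep` delays are hypothetical stand-ins for network round-trip and generation time, and the function names are illustrative:

```python
import asyncio
import time

def sync_call(prompt: str) -> str:
    """Synchronous: the caller blocks until the response is complete."""
    time.sleep(0.1)  # stand-in for network round-trip + generation
    return f"response to {prompt!r}"

async def async_call(prompt: str) -> str:
    """Asynchronous: awaiting yields control so other work can proceed."""
    await asyncio.sleep(0.1)  # stand-in for network round-trip + generation
    return f"response to {prompt!r}"

# Sync: two calls run back to back -- total wait is the sum of both.
t0 = time.monotonic()
results_sync = [sync_call("a"), sync_call("b")]
elapsed_sync = time.monotonic() - t0

# Async: both calls wait concurrently -- total wait is roughly one call.
async def main() -> list[str]:
    return await asyncio.gather(async_call("a"), async_call("b"))

t0 = time.monotonic()
results_async = asyncio.run(main())
elapsed_async = time.monotonic() - t0
```

With two 0.1 s calls, the sync version waits about 0.2 s while the async version waits about 0.1 s — the gap widens with every additional concurrent request.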

Sync is simpler to implement. Async is more efficient at scale — your application can process other work while waiting for model responses. Streaming gives the fastest perceived latency because the user sees output appearing before generation is complete.
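The perceived-latency advantage of streaming can be shown with a simulated token stream — a sketch, assuming a hypothetical per-token delay rather than a real model:

```python
import time
from typing import Iterator

def stream_tokens(text: str, delay: float = 0.01) -> Iterator[str]:
    """Simulated streaming response: tokens arrive as they are generated."""
    for token in text.split():
        time.sleep(delay)  # stand-in for per-token generation time
        yield token

start = time.monotonic()
tokens = []
first_token_at = None
for tok in stream_tokens("the quick brown fox jumps"):
    if first_token_at is None:
        # Time to first token: what the user perceives as latency.
        first_token_at = time.monotonic() - start
    tokens.append(tok)
total_time = time.monotonic() - start
```

The first token arrives after one token's delay, long before the full response is complete — which is why streaming feels fast even when total generation time is unchanged.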

Why it matters

The choice between sync, async, and streaming determines your application's perceived speed and scalability. A chatbot needs streaming for user experience. A batch pipeline needs async for throughput. A simple script can use sync for simplicity. Most production applications use streaming for user-facing features and async for background processing.

Verified March 2026 · Source: OpenAI streaming docs, Anthropic docs

Related terms
Streaming · Latency · API endpoint
← All terms
← Batch API · Cost per query →