About sourc.dev
What sourc.dev is
sourc.dev is a data observatory for the AI and developer ecosystem. We track language models, AI tools, software products, and open data APIs — measuring pricing, uptime, capability, and change over time. Every data point has a source URL and a verification date. Nothing is estimated. Nothing is sponsored. The data is the product.
Why a data observatory, and why now
The AI and developer ecosystem has an information problem. There are hundreds of language models in production today, thousands of AI tools built on top of them, and an ever-expanding landscape of software products and open data APIs that integrate or compete with AI-native alternatives. The rate of change is faster than any static article, blog post, or review roundup can track. A comparison published on Monday can be outdated by Thursday. A pricing page captured in a screenshot may not reflect the pricing page live right now.
Then there is the opinion problem. Search for "best LLM for coding" and you will find dozens of results. Most are affiliate-driven or based on subjective experience. Many are from 2023 or early 2024 and reference models that have since been deprecated, re-priced, or replaced by newer versions with different capability profiles. An opinion that was reasonable eighteen months ago is now actively misleading because the underlying data has shifted. The ecosystem does not need more opinions. It needs a single place where the facts are current, verified, and timestamped.
Finally, there is the time problem. A single data point in isolation — a model's input price today — tells you almost nothing without context. Has the price gone up or down? How fast? Did competitors move at the same time? Without historical data, every snapshot is an island. sourc.dev solves this by treating history as a first-class citizen. Our attribute_history table is append-only. When a value changes, the old value is never overwritten. It is preserved with its source URL and verification date intact. The timeline is the asset.
The timing is not accidental. McKinsey's 2024 Global Survey on AI found that 65 percent of organisations are now regularly using generative AI, up from 33 percent just twelve months prior. That is the fastest adoption curve McKinsey has ever documented for an enterprise technology. When adoption doubles in a year, the demand for reliable, structured, independently verified data about the tools being adopted grows just as fast. The window for establishing the authoritative data source is open now, and that is exactly what sourc.dev is built to be.
How data is tracked
Every entity on sourc.dev is a structured record with typed attributes. A language model entry tracks input price per million tokens, output price per million tokens, context window size, whether weights are open, EU data residency status, API rate limits, and more. An AI tool entry tracks its pricing tier, the LLMs it depends on, API availability, and open-source status. A SaaS product entry tracks subscription pricing, integrations, and uptime. The schema is consistent across entity types, which means attributes can be compared across categories — you can ask "which tools offer EU data residency?" and get a structured, filterable answer.
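The consistent schema described above can be sketched as a typed record. The class shapes and field names below are illustrative assumptions, not sourc.dev's actual schema; they only show how a shared attribute key makes cross-category filtering possible.

```python
from dataclasses import dataclass, field

@dataclass
class Attribute:
    name: str         # e.g. "input_price_per_1m"
    value: object     # typed value: float, int, bool, or str
    source_url: str   # canonical primary source for this value
    verified_at: str  # ISO date the value was last confirmed

@dataclass
class Entity:
    entity_type: str  # "llm", "ai_tool", "saas", "open_data_api"
    name: str
    attributes: dict[str, Attribute] = field(default_factory=dict)

    def get(self, attr: str):
        a = self.attributes.get(attr)
        return a.value if a else None

def filter_entities(entities, attr, expected):
    """Cross-category filter: the same attribute key works for any entity type."""
    return [e for e in entities if e.get(attr) == expected]
```

Because every entity type shares the same attribute structure, a question like "which tools offer EU data residency?" reduces to one filter over one key.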
The data pipeline runs automated daily checks against primary source URLs. For each tracked attribute, the system visits the canonical source — an official pricing page, an API documentation endpoint, a model card, a status page — and extracts the current value. When a value changes, the change is scored against structural bounds defined for that attribute type. A model's input price dropping from $15.00 to $10.00 per million tokens is within expected bounds for a competitive price cut. A model's context window jumping from 128K to 10 million tokens would exceed bounds and trigger review.
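The bounds scoring can be sketched as a per-attribute predicate. The actual structural bounds sourc.dev uses are not published; the limits below are illustrative only, chosen so the two examples in the text fall on the expected sides.

```python
def within_bounds(attr: str, old: float, new: float) -> bool:
    """Return True if a detected change is plausible enough to auto-approve."""
    if attr in ("input_price_per_1m", "output_price_per_1m"):
        if old <= 0:
            return False
        ratio = new / old
        # A $15.00 -> $10.00 cut (ratio ~0.67) is a routine competitive move;
        # anything beyond a 4x swing in either direction triggers review.
        return 0.25 <= ratio <= 4.0
    if attr == "context_window":
        # A 128K -> 10M jump (~78x) would exceed this and be routed to review.
        return new <= old * 8
    # Attributes with no defined bounds always go to human review.
    return False
```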
This creates a system of structural trust. Changes within bounds are auto-approved and written directly to the attribute_history table. Changes that exceed bounds are routed to a human review queue and verified manually before publication. The result is that routine updates flow through quickly while anomalies are caught and checked. Every published value passes through a human decision, either indirectly via the pre-approved bounds a reviewer defined, or directly via manual review.
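The routing step above can be sketched as a single function. The record shape, status labels, and predicate interface are assumptions for illustration; only the table name (attribute_history) comes from the document.

```python
def process_change(history, review_queue, record, within_bounds):
    """Route a detected change: append in-bounds changes to the history,
    send out-of-bounds changes to the human review queue.

    `within_bounds` is a predicate encoding the per-attribute structural
    bounds. `history` stands in for the append-only attribute_history
    table: records are only ever added, never overwritten.
    """
    if within_bounds(record):
        record["status"] = "auto_approved"
        history.append(record)
    else:
        record["status"] = "pending_review"
        review_queue.append(record)
    return record["status"]
```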
Verification is traceable. Every data point on sourc.dev traces to a specific URL that was live at the verified_at timestamp recorded alongside the value. If a source URL goes dead, the data point remains in the history table but is flagged as unverifiable. The history table itself is append-only — nothing is overwritten, nothing is deleted. Every change is a permanent record with its provenance intact. This means sourc.dev can answer not just "what is the price now?" but "what was the price on any given date, and where did that number come from?"
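The "what was the price on any given date?" query above is a point-in-time lookup over the append-only history. A minimal sketch, assuming rows carry entity, attribute, value, source_url, and verified_at fields (the real table layout is not public):

```python
def value_as_of(history, entity, attr, date):
    """Return (value, source_url) for the latest record at or before `date`.

    Relies on ISO 8601 date strings, which sort lexicographically in
    chronological order. Returns None if no record existed yet.
    """
    rows = [r for r in history
            if r["entity"] == entity
            and r["attribute"] == attr
            and r["verified_at"] <= date]
    if not rows:
        return None
    latest = max(rows, key=lambda r: r["verified_at"])
    return latest["value"], latest["source_url"]
```

Because old rows are never overwritten, the same table answers both "what is the price now?" and "what was it on any past date, and from which source?"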
Who operates this
sourc.dev is operated by Fredrik Kallioniemi, founder of HODLR & CO Labs, based in Sweden. The site is built with Claude Code, deployed on Cloudflare Pages, and backed by Supabase Postgres in eu-north-1 (Stockholm). The pipeline runs on Cloudflare Workers.
This site is AI-assisted. Data collection, change detection, and description generation use the Claude API. All published data is verified against primary sources before it appears. Human review is the final gate.
The attribute_history table is append-only. The data is stored in the EU. The operator is Swedish. The methodology is documented here.
What we track
- input_price_per_1m — Cost to send one million tokens to a model.
- output_price_per_1m — Cost to receive one million tokens from a model.
- context_window — Maximum tokens a model can process in one call.
- drift_index — How much a model's behaviour has changed over time, tracked via canary prompts. A sourc.dev proprietary metric.
- uptime — Rolling availability of an API endpoint.
- eu_data_residency — Whether data is processed and stored within the European Union.
- open_weights — Whether model weights are publicly available for download and self-hosting.
- api_rate_limit — Maximum requests per minute on free and paid tiers.
- verified_at — The date sourc.dev last confirmed this data point against its primary source.
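Of the attributes above, drift_index is the only derived metric. Its actual formula is proprietary and unpublished; one plausible shape, consistent with "tracked via canary prompts", is to replay a fixed prompt set and measure the fraction of responses that changed:

```python
def drift_index(baseline: dict, current: dict) -> float:
    """Fraction of canary prompts whose response differs from the baseline.

    `baseline` and `current` map each canary prompt to the model's
    response text captured at two different times. 0.0 means no
    observable behaviour change; 1.0 means every canary changed.
    This is an illustrative sketch, not sourc.dev's real formula.
    """
    if not baseline:
        return 0.0
    changed = sum(1 for prompt, answer in baseline.items()
                  if current.get(prompt) != answer)
    return changed / len(baseline)
```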
Attribution and data use
Data on sourc.dev is free to browse. If you cite sourc.dev data, please include: entity name, attribute, value, and the verification date shown on the page.
Example: “GPT-4o input pricing: $5.00 per million tokens (sourc.dev, verified 2024-05-13)”
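For programmatic use, the requested citation format is simple to generate. The field order (entity, attribute, value, verification date) comes from the guidance above; the helper itself is just a convenience sketch.

```python
def cite(entity: str, attribute: str, value: str, verified_at: str) -> str:
    """Format a sourc.dev citation string in the requested field order."""
    return f"{entity} {attribute}: {value} (sourc.dev, verified {verified_at})"
```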
The structured data and historical time series are available via API under commercial licensing. Contact us via the submit form.
What is coming
sourc.dev launched in March 2026 with language models tracked at depth. The roadmap is structured in phases, each expanding the scope of the observatory while maintaining the same data standards: verified sources, typed attributes, and append-only history.
- Month 1 — 30 LLMs with generational history and full attribute coverage
- Month 2 — AI tools directory: code assistants, agents, RAG infrastructure
- Month 3 — SaaS and open data API directories
- Month 6 — Full Source Tool: answer four questions, get data-ranked results
Submit a listing or data correction via the submit form.