How AI Infrastructure Is Measured
sourc.dev is an AI infrastructure observatory that tracks pricing, integrations, benchmarks, and ecosystem relationships across 271 published AI entities — models, tools, SaaS platforms, APIs, and companies. Every data point is verified, sourced, and timestamped. Every formula is published and versioned.
Every number on this page has a source. Every source has a date. Every formula has a version number.
That is not an adopted standard. It is the standard built into this platform — the standard demanded before trusting any data with capital.
If the source cannot be shown, the number is not published.
This page explains how sourc.dev measures the AI infrastructure market. Nothing here is an opinion. Everything here is a methodology.
One thing to understand before reading further
sourc.dev does not tell you what is best or worst. It does not rank by value judgment. It records what the data shows.
If you ask a question and the data supports an answer, you get one. If the data does not support it, that is stated clearly. What matters most to you is yours to determine. The interpretation is always yours.
The market moves faster than the tools built to measure it
The AI infrastructure market is moving faster than any traditional analysis framework was built to handle. These are not estimates. These are verified observations from the sourc.dev archive.
| What is tracked | What the data shows | Verified |
|---|---|---|
| Price deflation since 2020 | GPT-4o is 95.8% cheaper than the June 2020 baseline | 2026-04-05 |
| Ecosystem concentration | 48.1% of tracked AI tools depend on 2 providers | 2026-04-05 |
| Provider lock-in | 9 tools have a single certified LLM dependency | 2026-04-05 |
| Price change by category | Inference APIs avg 94% cheaper than 2020 | 2026-04-05 |
| Enterprise signal presence | 28 entities with verified SSO, SCIM, and marketplace presence | 2026-04-05 |
| China market presence | 7 entities with verified Chinese auth or distribution | 2026-04-05 |
| Published entities | 271 across models, tools, SaaS, APIs, and companies | 2026-04-05 |
| Verified relations | 933 — each with source URL and signal date | 2026-04-05 |
| Pricing verified | 106 entities — full access price confirmed from source | 2026-04-05 |
| Archive depth | 2,625 timestamped observations — growing every day | 2026-04-05 |
None of these values are manually estimated. Each is computed from verified, sourced data and updated automatically. The archive grows every day. What it recorded yesterday cannot be reconstructed tomorrow.
What the data can answer
The same data answers different questions depending on who is asking. Here are examples drawn directly from the archive.
We can surface virtually any fact the data supports. If we cannot answer, we do not have the data — and we say so. What you see on this page is not all you can get.
For the investor
| Question | What the data shows |
|---|---|
| What is the certified dependency distribution across LLM providers? | OpenAI: 22 tools (42.3%) — Anthropic: 19 tools (36.5%) — top 2 combined: 48.1% |
| How has AI infrastructure pricing changed since 2020? | GPT-4o: 95.8% cheaper. Gemini 1.5 Flash: 99.8% cheaper. Verified against official pricing pages. |
| Which entities have verified enterprise signal presence? | 28 entities with SSO, SCIM, and cloud marketplace presence all confirmed |
| What percentage of the tracked AI ecosystem depends on a single provider? | 9 tools have a single certified LLM dependency — no verified alternative |
| What is the rate of new certified integrations per provider? | Velocity Index in development — active Q3 2026 with 90-day relation history |
For the CTO
| Question | What the data shows |
|---|---|
| Which coding tools have verified SSO and EU data residency? | 4 entities satisfy both criteria — verified from official documentation |
| If Anthropic changes pricing tomorrow, which tools are affected? | 19 tools have certified Anthropic dependency — 3 exclusively, with no alternative |
| Which inference APIs are OpenAI-compatible and usage-based? | 11 APIs with verified compatibility and usage-based pricing |
| Which tools have verified presence in Chinese app stores or auth? | 7 entities with WeChat, Alipay, or Huawei AppGallery presence confirmed |
| What is Cursor's integration score vs the category? | Cursor: 18 — category average: 6 — verified from relation graph |
For the developer or creator
| Question | What the data shows |
|---|---|
| What does the value-per-dollar distribution look like across models? | Gemini 1.5 Flash VDS: 1,052 — GPT-4o VDS: 35.5 — difference: 30× |
| Which tools have a verified free tier and API access? | 33 entities with both confirmed |
| Which coding tools have verified VS Code and JetBrains presence? | 4 tools with both marketplace listings confirmed |
| What is the integration score of major open source AI frameworks? | LangChain: 38 integration points — LlamaIndex: 12 — all verified |
| What are the verified pricing routes to Claude 3.5 Sonnet? | 3 verified routes — direct API, OpenRouter, Amazon Bedrock — all priced |
These are examples. The data behind sourc.dev can answer hundreds of variations of these questions — combinations of price, capability, integration, compliance, geography, and market position.
The depth is in the data, and there is more under the hood than what the public display shows.
Contact → for institutional access, API licensing, or custom intelligence.
The knowledge graph
Most AI data platforms store lists. A list tells you that Cursor exists, that Claude 3.5 Sonnet exists, that Anthropic exists. It tells you what each costs and how each benchmarks.
A list cannot tell you that 34 tools depend on Anthropic. It cannot tell you that if Anthropic changes its pricing tomorrow, those 34 products are directly affected. It cannot tell you that 3 of those tools have no alternative — Anthropic is their only certified LLM dependency.
sourc.dev is not a list. It is a map.
Every entity tracked — models, tools, SaaS platforms, APIs, companies — is a node in a directed, weighted, and temporal knowledge graph. The connections between them are not inferred or estimated. They are verified, sourced, and dated.
Why relations are weighted differently
Not all connections are equal. A certified integration — where both parties publicly document the relationship — is fundamentally different from a community mention where one party references the other informally. They are weighted accordingly.
| Relation type | What it means | Weight |
|---|---|---|
| Certified | Both parties publicly document the integration | 3 |
| API partnership | Documented commercial agreement | 2 |
| Community | One party documents the integration | 1 |
| Available via | Passive hosting or routing relationship | 0.5 |
This weighting matters because it determines how much each connection contributes to ecosystem scoring. When sourc.dev records Anthropic's Blast Radius Score as 34, it means 34 tools have certified — not just mentioned — dependencies on Anthropic's models. That is a precise claim. The weight system is why it is precise.
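The weighting can be sketched as a small scoring function. The relation shape and field names below are illustrative assumptions, not sourc.dev's actual schema:

```python
# Relation weights as published in the table above.
RELATION_WEIGHTS = {
    "certified": 3.0,
    "api_partnership": 2.0,
    "community": 1.0,
    "available_via": 0.5,
}

def integration_density(relations, entity):
    """Sum of weights over all verified relations touching `entity`."""
    return sum(
        RELATION_WEIGHTS[r["type"]]
        for r in relations
        if entity in (r["source"], r["target"])
    )

def blast_radius(relations, provider):
    """Count of distinct entities with a *certified* inbound dependency."""
    return len({
        r["source"] for r in relations
        if r["target"] == provider and r["type"] == "certified"
    })
```

A community mention contributes one-third of a certified integration, so two certified dependencies outweigh five community mentions — which is why a Blast Radius Score counts only certified inbound relations.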
Why the history of the graph matters as much as its current state
A relation that existed in 2024 but was removed in 2025 is not deleted. It is marked deprecated, with a date and a source documenting the change.
When did a tool add support for Claude 3.5 Sonnet? When did it drop GPT-3.5? Which AI integrations formed in the month after a major pricing change? These questions can only be answered if someone was recording the graph at the time those changes happened.
We were. And we will be tomorrow.
What the graph looks like today
| Graph metric | Current value |
|---|---|
| Active relations | 933 — all with source URL and signal date |
| Most connected entity | Anthropic — 34 certified inbound relations |
| Tools with multi-provider certified relations | Majority — lower concentration risk |
| Tools with single-provider certified dependency | 9 — higher concentration risk |
| Deprecated relations tracked | Yes — full history preserved |
How the system works
sourc.dev is not a manually maintained catalogue. It is a system of interconnected processes that continuously finds, verifies, enriches, monitors, and preserves data about the AI infrastructure ecosystem. Each process has individual value. Together they are self-reinforcing — and together they produce something no single process could produce alone.
Discovery — finding what exists
When a new entity enters the graph, it is automatically checked against every existing entity for potential connections. If a verified relation points to an entity not yet tracked, that entity becomes a candidate and goes through the same process. The graph expands with the ecosystem — not by manual mapping, but by following verified connections wherever they lead.
Discovery finds it. But without the rest of the system, a discovered entity is just a name. What gives it meaning is what comes next.
Linking — verifying how things connect
Every relation in the graph requires a primary source — an official page, a documented partnership, a marketplace listing, a press release. Relations are not inferred from similarity or assumed from co-occurrence. They are verified against evidence and dated when that evidence was found.
Linking is what turns a list of entities into a map. It is also what makes Blast Radius Score meaningful — because every connection counted in that score was individually verified.
Enrichment — describing what exists
Entities are continuously enriched with verified attributes — pricing, benchmark scores, authentication signals, distribution presence, compliance flags, lifecycle stage. Each attribute is sourced independently and timestamped. Nothing is assumed from the category an entity belongs to.
Enrichment is what gives metrics something to compute on. The sourc Value Index, the Price Deflation Index, the Integration Density Score — none of these exist without enrichment filling the underlying data.
Pipeline — keeping it current
A daily automated pipeline monitors pricing changes across tracked entities. When a price changes, the new value is recorded with its source and date. The old value is preserved. Nothing is overwritten.
Every pipeline run adds to the archive. Every day that passes is one more day of verified, timestamped pricing history that cannot be reconstructed retroactively. The pipeline does not just keep data current — it builds the time-series that makes trend analysis possible.
Radar — detecting what changes
The ecosystem does not only change on a daily schedule. New integrations form. Products launch and deprecate. Partnerships are announced. Prices shift without warning. Radar monitors for these changes across entity pages, official documentation, and structured data sources.
When something material changes, it is recorded — not replaced. The radar ensures that no significant change in the ecosystem passes unrecorded. Every observation that radar captures becomes part of the permanent archive.
Notifications and feeds — communicating what happened
Changes detected by radar and pipeline do not stay buried in raw data. They are surfaced as intelligence signals — plain-language statements of what changed, when it changed, and by how much.
A signal is always derived from verified data. It is never editorial. It describes what the data shows — not what it means for any particular decision. That interpretation remains with the reader.
The archive — the only source for what was
For historical values, trend analysis, and time-series data, sourc.dev's own archive is the authoritative source. Not because it was decided to be — but because no other source has it.
A price recorded on November 1 2024 with a verified source URL cannot be found anywhere else after the fact. It exists only because it was captured at the time. The same is true for every deprecated relation, every enriched attribute, every pipeline observation since the platform launched.
The archive is append-only. No row is ever deleted or modified. It is the reason every other process in this system has permanent value — because every discovery, every verified link, every enriched attribute, every pipeline run, every radar signal becomes part of a record that compounds every day.
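The append-only rule can be sketched in a few lines: a correction is a new row, never an update, and historical reads resolve to the latest observation as of a given date. The class and field names are illustrative, not the platform's implementation:

```python
from datetime import date

class AppendOnlyArchive:
    """Append-only store: corrections are new rows, never updates."""

    def __init__(self):
        self._rows = []  # full history, in insertion order; never mutated

    def record(self, entity, field, value, source_url, observed):
        """Every observation is appended with its source and date."""
        self._rows.append({
            "entity": entity, "field": field, "value": value,
            "source_url": source_url, "observed": observed,
        })

    def as_of(self, entity, field, when):
        """Latest observation for (entity, field) on or before `when`."""
        matches = [r for r in self._rows
                   if r["entity"] == entity and r["field"] == field
                   and r["observed"] <= when]
        return max(matches, key=lambda r: r["observed"]) if matches else None
```

Because old rows survive, the same query answered "as of" two different dates returns two different — and both correct — values.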
The combination
Discovery finds it. Linking verifies it. Enrichment describes it. Pipeline keeps it current. Radar catches what changes. Notifications communicate it. The archive preserves it all — permanently, append-only, irreproducible.
Remove any one process and the whole degrades. Keep all seven and the system is self-reinforcing: each process makes the others more valuable, and the resulting archive is worth more than the sum of its parts.
That is what makes sourc.dev an observatory rather than a directory.
How data is handled
Pricing — why sourc.dev tracks what it tracks
sourc.dev tracks one price per entity: the monthly price at which a professional user gains full access to the product's AI capabilities — including all models, integrations, and extensions that the knowledge graph describes. This is called price_full_access_usd.
Free tiers are not tracked. Free tiers are capability-limited, quota-limited, and feature-limited. They do not describe the product the relations graph documents. Pricing that does not unlock what the graph describes is not comparable to pricing that does.
If a product's full capabilities are genuinely free — open source, no paywall, no quota — the price is $0.00. That is the correct answer.
| Price type | Tracked | Why |
|---|---|---|
| Pro / full access | Yes | Describes the real product |
| Free tier | No | Capability-limited — not comparable |
| Usage-based | Yes — as $0.00 | Full access from first call |
| Open source | Yes — as $0.00 | No capability paywall |
| Enterprise custom | No — noted separately | Not publicly verifiable |
Freshness — why every data point has an age
Data ages. A price verified three months ago may no longer be accurate. A relation documented last year may have been deprecated. Every data point carries a verification date — and is flagged when too long has passed since last verification.
| Data category | Maximum age before flagged |
|---|---|
| Pricing — API sourced | 24 hours |
| Pricing — page verified | 7 days |
| Pricing — manually verified | 90 days |
| Benchmarks | 180 days |
| Auth and distribution flags | 90 days |
| Relation status | 90 days |
Stale data is visible as such. It is not hidden. A stale value with a visible date is more useful than no value — because it tells you something real, with an honest caveat attached.
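The staleness policy reduces to a lookup table and an age check. A minimal sketch, with category keys chosen here for illustration:

```python
from datetime import date, timedelta

# Maximum permitted age per data category, from the policy table above.
MAX_AGE = {
    "pricing_api": timedelta(days=1),
    "pricing_page": timedelta(days=7),
    "pricing_manual": timedelta(days=90),
    "benchmark": timedelta(days=180),
    "auth_flag": timedelta(days=90),
    "relation_status": timedelta(days=90),
}

def is_stale(category: str, verified_on: date, today: date) -> bool:
    """Flag a value once its age exceeds the limit for its category."""
    return (today - verified_on) > MAX_AGE[category]
```

A flagged value is still displayed — with its verification date — rather than hidden.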
Quality gate — why some metrics show Calibrating
A metric is not published until its underlying data meets minimum thresholds. Below the threshold, the metric shows Calibrating.
This is not a placeholder. It is not an estimate. It is an honest statement that the data is not yet sufficient to publish a reliable number. A number that cannot be defended should not be shown.
| Metric | Minimum requirement |
|---|---|
| Price Deflation Index | 2 verified price points — baseline and current |
| Value Density Score | Benchmark and price both verified |
| Integration Density Score | ≥ 3 verified relations with source URL |
| Blast Radius Score | ≥ 5 certified inbound relations |
| sourc Value Index | Integration score and price both verified |
| Velocity Index | ≥ 90 days of relation history |
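The gate itself is mechanical: each metric carries a predicate over its underlying data, and the computed value is published only when the predicate holds. Field names here are illustrative, not the platform's actual schema:

```python
# Minimum data requirements per metric, mirroring the table above.
MIN_REQUIREMENTS = {
    "integration_density": lambda d: d["verified_relations"] >= 3,
    "blast_radius": lambda d: d["certified_inbound"] >= 5,
    "value_density": lambda d: d["benchmark"] is not None and d["price"] is not None,
}

def publish(metric, data, compute):
    """Return the computed value, or 'Calibrating' below the quality gate."""
    if MIN_REQUIREMENTS[metric](data):
        return compute(data)
    return "Calibrating"
```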
Unknown is not false
If a data point has not been verified, it appears as unknown. Not as zero. Not as false. Not as absent.
Unknown means unverified. False means checked and confirmed absent.
A product with unknown SAML support is not the same as a product with verified no SAML support. That difference is never collapsed.
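The distinction maps directly onto three-valued logic — for example, Python's `Optional[bool]`. The entity shape and the `saml_verified` field are hypothetical, used here only to show the principle:

```python
from typing import Optional

# Three distinct states, never collapsed:
#   True  -> verified present
#   False -> checked and confirmed absent
#   None  -> unknown (never verified)
def saml_status(entity: dict) -> Optional[bool]:
    return entity.get("saml_verified")  # a missing key stays None, not False

def with_verified_saml(entities: list) -> list:
    """Keep only entities with verified SAML; unknown is excluded, not coerced to False."""
    return [e for e in entities if e.get("saml_verified") is True]
```

Note the filter tests `is True` rather than truthiness — that is precisely what keeps unknown and false from collapsing into each other.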
Built to trust with capital
This platform was not built to look rigorous. It was built because verified, sourced, timestamped intelligence on the AI infrastructure market was needed — for real investment decisions, with real capital.
The standard is simple: trustworthy enough for capital allocation — published. Not trustworthy enough — not published.
Every night, that standard is measured across seven dimensions and expressed as a single number — visible on this page.
| Quality dimension | What it measures |
|---|---|
| Accuracy | How close are values to primary source reality |
| Completeness | What proportion of entities have verified critical attributes |
| Freshness | How current is the data relative to the staleness policy |
| Consistency | No contradictions between raw facts and computed metrics |
| Validity | All values conform to defined schemas and source tiers |
| Traceability | Every published value traceable to source URL and date |
| Integrity | Append-only archive — no historical value deleted or overwritten |
How entities are ranked — and why relations come first
Every entity on sourc.dev has a tier. That tier is not editorially assigned. It is computed from data.
Ranking is by relations — not by size, revenue, funding, or reputation — because relations are the only objective measure of ecosystem weight in the AI infrastructure market. An entity that 34 other products depend on carries more ecosystem weight than one that 2 products depend on. That is a data observation, not a value judgment.
This is why Anthropic ranks as ecosystem infrastructure. Not because it was decided editorially. Because 34 published tools have certified dependencies on Anthropic's models. If Anthropic changes its API tomorrow, those 34 products are directly affected.
Tier is not permanent. It is computed from current data and changes as the ecosystem changes. An entity that gains certified integrations moves up. One that loses them moves down. The data decides — always.
| Tier | What the data shows | Determined by |
|---|---|---|
| Infrastructure | 15+ entities carry certified dependency on this one | BRS ≥ 15, IDS ≥ 20 |
| Established | Broad certified presence across the ecosystem | BRS 5–14, IDS 8–19 |
| Emerging | Growing certified presence, not yet at critical mass | BRS 2–4, IDS 3–7 |
| Niche | Limited certified footprint | BRS 0–1, IDS 0–2 |
Thresholds calibrated against current entity distribution. Reviewed quarterly as the platform grows.
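The tier assignment can be sketched as a threshold cascade over the two scores. How mixed profiles (one score high, the other low) resolve is an assumption here — the published table states both thresholds per tier but not the tie-breaking rule:

```python
def tier(brs: int, ids: int) -> str:
    """Tier from Blast Radius Score (BRS) and Integration Density Score (IDS),
    evaluated top-down against the published thresholds."""
    if brs >= 15 and ids >= 20:
        return "Infrastructure"
    if brs >= 5 and ids >= 8:
        return "Established"
    if brs >= 2 and ids >= 3:
        return "Emerging"
    return "Niche"
```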
What is not published — and why
Transparency includes being explicit about what is left out.
| Not published | Why |
|---|---|
| Free tier pricing | Not comparable — capability and quota limited |
| Enterprise custom pricing | Not publicly verifiable |
| Recommendations or opinions | sourc.dev records data — it does not advise |
| Unverified data as verified | Unknown is published as unknown |
| Metrics below Quality Gate | Calibrating — data not yet sufficient |
| Predictive claims | sourc.dev measures what is, not what will be |
The goal is not to tell you what to do. The goal is to give you the most accurate, sourced, and structured picture of what is — so you can decide for yourself.
The metrics
Price Deflation Index
Measures how much cheaper a model has become relative to the GPT-3 Davinci baseline of June 2020 — the first large language model to be widely commercially available at scale.
The 2020 baseline gives every model a common, immovable reference point. Consistency over time is more valuable than perfection at a single point.
Current example: GPT-4o PDI 4.2 — 95.8% cheaper than the 2020 baseline. Gemini 1.5 Flash PDI 0.17 — 99.8% cheaper.
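The published examples are consistent with expressing the current price as a percentage of the June 2020 baseline. The concrete figures below — a $60-per-million-token baseline (GPT-3 Davinci's launch price of $0.06 per 1K tokens) and a $2.50-per-million GPT-4o input price — are assumptions for illustration, not sourc.dev source data:

```python
DAVINCI_2020_PER_M = 60.00  # assumed baseline: $0.06 / 1K tokens, June 2020

def pdi(current_price_per_m: float,
        baseline_per_m: float = DAVINCI_2020_PER_M) -> float:
    """Current price as a percentage of the June 2020 baseline."""
    return 100.0 * current_price_per_m / baseline_per_m

def deflation_pct(pdi_value: float) -> float:
    """How much cheaper than the baseline, in percent."""
    return 100.0 - pdi_value
```

Under these assumed inputs, a $2.50 price yields a PDI of about 4.2 and a deflation of about 95.8% — matching the published GPT-4o example.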
Value Density Score
Measures one dimension of intelligence per dollar — specifically, how much MMLU-measured reasoning capability each dollar of input cost purchases.
This is a single-dimension metric. A model with a lower score may still be the right choice when speed, context length, or instruction-following matters more than cost per benchmark point. The limitation is documented because it should be known.
Current example: Gemini 1.5 Flash VDS 1,052 — GPT-4o VDS 35.5.
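The published examples are consistent with a simple ratio of benchmark score to input price per million tokens. The MMLU scores and prices used below are illustrative assumptions, not sourc.dev source data:

```python
def vds(mmlu_score: float, input_price_per_m_usd: float) -> float:
    """Benchmark points purchased per dollar of input cost (per M tokens)."""
    return mmlu_score / input_price_per_m_usd
```

With an assumed MMLU of 78.9 at $0.075 per million input tokens, the ratio lands near the published 1,052; an assumed 88.7 at $2.50 lands near 35.5.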
Integration Density Score
Measures how deeply embedded an entity is in the AI infrastructure ecosystem — weighted by the strength of each integration type. A certified integration contributes more than a community mention. Both are counted. Neither is hidden.
Current example: LangChain IDS 38 — LlamaIndex IDS 12.
Blast Radius Score
Measures ecosystem dependency. How many other published entities have a certified dependency on this one. If this entity changes pricing, degrades, or becomes unavailable — how many AI products are directly affected.
Current example: Anthropic Blast Radius Score 34. A pricing change reaches 34 certified tools directly.
sourc Value Index
Measures value per dollar, adjusted by entity type. How much AI capability does each dollar of monthly spend deliver — within each domain.
Version 1 is intra-domain only — valid for comparing tools against tools, and models against models. A normalized cross-domain version is in development. Partial metrics are labeled as partial.
Current example: Cursor sourc Value Index 4,650 — coding tools domain.
Velocity Index
Measures the rate of new certified integrations over a rolling 90-day window. Shows the direction of ecosystem position change over time.
In development. Active Q3 2026 when sufficient relation history is available.
Versioning — why history is never rewritten
When a formula changes, it receives a new version number. Historical values computed under the old formula are never overwritten. Both coexist in the archive with their respective version tags.
The archive is append-only by design. No row is ever deleted or modified. This is not a technical preference. It is a business rule — because the archive is the asset.
Changelog
| Version | Date | Change |
|---|---|---|
| v1.1 | 2026-03-31 | PDI formula corrected. BRS and IDS formally differentiated. SVI reclassified as v1 intra-domain metric. Tier system strengthened with computed thresholds. Pricing decision rewritten. Data quality assurance added. |
| v1.0 | 2026-03-31 | Initial publication. |
Founder's Note
I built sourc.dev in 2025 because I had a problem I could not solve with any existing tools. I am an investor. I needed verified, sourced, timestamped intelligence on the AI infrastructure market before making capital allocation decisions. That intelligence did not exist in a form I could trust. So I built it myself.
This is self-funded, self-incubated, and built by one person. The standard I hold this data to is simple: would I trust this number with my own capital? If the answer is no, it does not get published.
What I have built I consider excellent — but not perfect. It will probably never be perfect. That is part of the mission.
I am not on social media. My time goes to building — the platform, the data infrastructure behind it, the API stack, and a mobile application. There is more work than there is time. I prefer it that way.
Founder, sourc.dev / HODLR & CO Labs / Cloud Consulting Sweden AB
Citations and contact
For citations in published work:
HODLR & CO Labs. https://sourc.dev/methodology
© 2026 HODLR & CO Labs. All rights reserved.
For questions about methodology, corrections, institutional access, or licensing:
Every formula change is logged. Every error is corrected publicly. This is the standard. Full stop.