Web index

How AI-readable is the open web?

Lyrenth crawls and audits the open web for the signals that matter to AI agents reading it: structured data coverage, render-mode mix, heading hygiene, content density, and access friction. The numbers below are aggregates across our public index, refreshed every 10 minutes.

Indexed documents
9,566,640

Total pages currently in the canonical index

Audited pages
8,207,932

85.8% of indexed pages audited

Avg AI Readiness
5.3 / 10

Weighted mean across seven content signals

Per-signal coverage

What the audit measures

Each audited page is scored on seven content-derived signals. Below is the corpus-wide mean for each one: how the open web performs against AI-agent-friendly hygiene, aggregated across every page Lyrenth has audited.

Structured data
20% weight, 26% corpus mean

Whether the page carries JSON-LD blocks (Article, Product, FAQPage, etc.). Structured data lets agents extract facts without parsing prose.
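As a rough illustration of this signal, the sketch below collects the JSON-LD blocks on a page and lists their `@type` values using only the Python standard library. It is not Lyrenth's audit code: the names `JsonLdCollector` and `jsonld_types` are hypothetical, and nested `@graph` containers are not followed.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append("".join(self._buffer))


def jsonld_types(html):
    """Return the sorted top-level @type values from parseable JSON-LD blocks."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    types = set()
    for raw in collector.blocks:
        try:
            doc = json.loads(raw)
        except json.JSONDecodeError:
            continue  # a malformed block carries no usable facts
        for item in doc if isinstance(doc, list) else [doc]:
            if not isinstance(item, dict):
                continue
            declared = item.get("@type")
            if isinstance(declared, str):
                types.add(declared)
            elif isinstance(declared, list):
                types.update(t for t in declared if isinstance(t, str))
    return sorted(types)
```

A page would pass a check like this when `jsonld_types(html)` is non-empty; how the real audit turns that into a 0.0 to 1.0 score is not stated here.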

Heading hygiene
15% weight, 63% corpus mean

Single H1, monotonic descent through H2 / H3. Predictable structure makes a page easier to skim and section.
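One way to read that rule, sketched in Python: exactly one H1, it comes first, and no heading is more than one level deeper than the heading before it. This interpretation and the names `HeadingLevels` and `heading_hygiene_ok` are assumptions for illustration, not the audit's actual definition.

```python
from html.parser import HTMLParser


class HeadingLevels(HTMLParser):
    """Records the level (1-6) of each heading tag in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def heading_hygiene_ok(html):
    """True if there is a single leading H1 and no skipped level on the way down."""
    parser = HeadingLevels()
    parser.feed(html)
    parser.close()
    levels = parser.levels
    if levels.count(1) != 1 or levels[0] != 1:
        return False
    # Going back up (H3 -> H2) is fine; jumping down two levels (H1 -> H3) is not.
    return all(later - earlier <= 1 for earlier, later in zip(levels, levels[1:]))
```

For example, `<h1>`, `<h2>`, `<h3>`, `<h2>` passes, while `<h1>` followed directly by `<h3>` fails.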

Static renderability
20% weight, 50% corpus mean

Share of pages served without a headless-Chromium escalation. Static-renderable pages cost less for crawlers and never arrive empty to first-pass scrapers.

Content density
20% weight, 39% corpus mean

Ratio of meaningful markdown to raw HTML. High density means most of the page is content, not nav / chrome / ads.
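A simplified stand-in for this ratio is sketched below. It compares visible text length to raw HTML length, where the audit itself compares extracted markdown. The list of skipped tags and the function name `content_density` are assumptions, chosen only to make the idea concrete.

```python
from html.parser import HTMLParser

# Assumption: text inside these elements counts as chrome rather than content.
SKIPPED_TAGS = {"script", "style", "nav", "header", "footer", "aside"}


class ContentText(HTMLParser):
    """Gathers text that sits outside the skipped elements."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def content_density(html):
    """Characters of content text divided by characters of raw HTML."""
    if not html:
        return 0.0
    parser = ContentText()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so indentation does not count as content.
    text = " ".join(" ".join(parser.chunks).split())
    return len(text) / len(html)
```

A page that is mostly one long paragraph scores close to 1.0 under this sketch, while a page that is mostly navigation links scores close to 0.0.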

Title & description
10% weight, 66% corpus mean

Non-empty, sensible-length title and description that are not generic placeholders.
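A check along these lines might look like the sketch below. The length ranges and the placeholder list are illustrative guesses, not the thresholds the audit uses, and `metadata_ok` is a hypothetical name.

```python
# Assumption: a small sample of titles that say nothing about the page.
GENERIC_TITLES = {"home", "index", "untitled", "untitled document", "new page", "welcome"}


def metadata_ok(title, description, title_range=(10, 70), description_range=(50, 160)):
    """True if both fields are present, within range, and the title is not a placeholder."""
    title = (title or "").strip()
    description = (description or "").strip()
    if title.lower() in GENERIC_TITLES:
        return False
    title_fits = title_range[0] <= len(title) <= title_range[1]
    description_fits = description_range[0] <= len(description) <= description_range[1]
    return title_fits and description_fits
```

Empty or missing fields fail automatically, because a length of zero falls below both minimums.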

Open access
10% weight, 100% corpus mean

Whether content reaches readers (and agents) without hitting a paywall or login wall. Higher means the page is open.

Content depth
5% weight, 75% corpus mean

Whether the page has enough words to be substantive on its own. Sub-stub pages get partial credit; pages with no body fail outright.
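The partial-credit behavior can be pictured as a simple ramp, as in this sketch. The 300-word threshold for full credit is an assumed value, and `content_depth_score` is a hypothetical name; the real cutoffs are not published here.

```python
def content_depth_score(body_text, full_credit_words=300):
    """Score body length from 0.0 to 1.0, with linear partial credit for short pages."""
    word_count = len(body_text.split())
    if word_count == 0:
        return 0.0  # no body at all: fails outright
    return min(1.0, word_count / full_credit_words)
```

Under these assumptions a 150-word page scores 0.5, and anything at or above 300 words scores 1.0.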

How to read this

What these numbers mean

For AI agent builders

A high site-wide structured-data percentage means agents can extract facts cheaply. Low static renderability means more of your crawl budget goes to headless rendering, and low content density means more of your input tokens go to chrome, not content. Lyrenth normalizes these for you in a single AIDocument shape regardless of how the source page is built.

Read the quickstart

For site owners

The signals above are exactly what Lyrenth measures per page on your verified domains. Lyrenth handles AI agent traffic to your site: clean, structured pages served to every agent reading you. Verify a domain to see your own AI Readiness score on the dashboard.

Verify a domain

Methodology: every audited page is scored on seven content signals, each 0.0 to 1.0. Per-page scores roll up to a domain average and a corpus-wide mean. Raw page content stays in our private index; only aggregate counts and means are public. Refreshed every 10 minutes. Full methodology in /llms-full.txt.
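The roll-up can be sketched from the published numbers alone. The weights and corpus means below are the ones listed on this page; the function names and the assumption that the headline score is the weighted mean scaled to 10 are ours, though the arithmetic is consistent with the 5.3 / 10 shown above.

```python
# Weights in percent, as listed per signal on this page.
WEIGHTS = {
    "structured_data": 20,
    "heading_hygiene": 15,
    "static_renderability": 20,
    "content_density": 20,
    "title_description": 10,
    "open_access": 10,
    "content_depth": 5,
}

# Corpus-wide means from this page, expressed on the 0.0 to 1.0 scale.
CORPUS_MEANS = {
    "structured_data": 0.26,
    "heading_hygiene": 0.63,
    "static_renderability": 0.50,
    "content_density": 0.39,
    "title_description": 0.66,
    "open_access": 1.00,
    "content_depth": 0.75,
}


def readiness(scores, weights=WEIGHTS):
    """Weighted mean of per-signal scores (each 0.0 to 1.0), scaled to 0-10."""
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[name] * scores[name] for name in weights)
    return 10 * weighted_sum / total_weight


def domain_average(page_scores):
    """Plain mean of per-page readiness scores for one domain."""
    return sum(page_scores) / len(page_scores)


print(round(readiness(CORPUS_MEANS), 1))  # 5.3
```

Because the weighted mean is linear, applying it to the corpus means gives the same result as averaging per-page scores, provided every page is scored on all seven signals.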