LYRENTH
AI-READABLE WEB INDEX

The web,
rebuilt for AI.

Lyrenth turns messy web pages into clean, structured AIDocuments: smaller, cheaper, and faster for agents, assistants, and model companies to read than raw HTML. One index. The open web, made machine-readable.

AI-readable pages indexed - Live
828,205,406
Indexed domains
89,182,445
AI-readability score
5.4 / 10
Pages audited
826,951,683
02 · THE TRANSFORM

From layout noise
to structured intelligence.

An indexed page is not a scraped page. It is parsed, cleaned, normalized, and rewritten into a single canonical document a model can read in one pass.

RAW HTML · example.com/article · 412 KB · NOISY
<nav class="header-mega">
  cookie consent · gdpr banner
<div id="ad-slot-top-970x250">
  <script src="analytics.js">
  newsletter modal · share bar
<article>
  The actual content lives here,
  buried under boilerplate.
</article>
  related links · 38 footer nav items
  tracking pixels · web fonts · css
</body>
AIDOCUMENT · lyrenth://example.com/article · 9.4 KB
{
  "url": "example.com/article",
  "title": "The actual content",
  "lang": "en",
  "content": [ 14 clean blocks ],
  "entities": [ … ],
  "links": [ canonical only ],
  "tokens": 1840,
  "noise_removed": 0.91,
  "freshness": "2m ago"
}
412 KB → 9.4 KB · Per-page payload
91% · Boilerplate removed
1 · Canonical document
1 pass · Model-ready
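On the consumer side, the AIDocument fields shown in the example can be modeled as a typed record. The sketch below is in Python. The field names come from the example; the Python types, the `AIDocument` class, the `parse_aidocument` helper, and the sample values are assumptions made for illustration, not a published schema.

```python
from dataclasses import dataclass, field, fields
from typing import Any


@dataclass
class AIDocument:
    """Consumer-side view of the example fields; the types are assumed."""
    url: str
    title: str
    lang: str
    tokens: int
    noise_removed: float  # fraction of the raw page dropped, e.g. 0.91
    freshness: str
    content: list[Any] = field(default_factory=list)
    entities: list[Any] = field(default_factory=list)
    links: list[Any] = field(default_factory=list)


def parse_aidocument(payload: dict[str, Any]) -> AIDocument:
    """Build an AIDocument from decoded JSON, ignoring keys this sketch does not know."""
    known = {f.name for f in fields(AIDocument)}
    return AIDocument(**{k: v for k, v in payload.items() if k in known})


# Illustrative payload: keys mirror the example, values are placeholders.
sample = {
    "url": "example.com/article",
    "title": "The actual content",
    "lang": "en",
    "content": ["first block", "second block"],
    "entities": [],
    "links": [],
    "tokens": 1840,
    "noise_removed": 0.91,
    "freshness": "2m ago",
}

doc = parse_aidocument(sample)
print(doc.title, doc.tokens)
```

Dropping unknown keys keeps the sketch tolerant of fields that the example does not show.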
LIVE CORPUS · CONTROL SURFACE
SHARDS 64 · DOMAINS 89.2M
828,205,406
AI-readable pages indexed
Indexed domains · across the open web
89.2M
Pages audited · scored for AI-readability
827.0M
AI-readability score · average across audited pages
5.4
Corpus freshness · recrawl cadence by plan
24h / 90d
04 · THE COST OF MESSY HTML

AI reads the web the hard way.

Every crawl drags in navs, ads, cookie walls, scripts, trackers, and layout markup. Models pay, in tokens, latency, and dollars, to parse junk before they reach a single useful sentence.

<nav> · cookie-consent · ad-slot · analytics.js · share-bar · newsletter-modal · footer × 38 · web-fonts · tracking-px · inline-css
TOKENS PER RAW PAGE
SIGNAL · 9%
NOISE · 91%
Avg. tokens to read 1 raw page · ~14,800
Tokens of actual signal · ~1,330
As a Lyrenth AIDocument · ~1,840
Cost & latency reduction · up to 8×
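The percentages and the 8× figure follow directly from the three token counts above. A short Python check, using only those published averages (the variable names are ours):

```python
RAW_TOKENS = 14_800     # avg. tokens to read one raw page
SIGNAL_TOKENS = 1_330   # tokens of actual signal in that page
AIDOC_TOKENS = 1_840    # the same page as a Lyrenth AIDocument

signal_share = SIGNAL_TOKENS / RAW_TOKENS   # share of a raw page that is signal
noise_share = 1 - signal_share              # share that is boilerplate
reduction = RAW_TOKENS / AIDOC_TOKENS       # raw tokens per AIDocument token

print(f"signal: {signal_share:.0%}")        # signal: 9%
print(f"noise: {noise_share:.0%}")          # noise: 91%
print(f"reduction: {reduction:.1f}x")       # reduction: 8.0x
```

The AIDocument is slightly larger than the bare signal (about 1,840 tokens against about 1,330) because it also carries structure and metadata such as entities and links.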
05 · THE ADAPTER LAYER

A new layer between
the open web and AI.

Lyrenth sits in the middle of the stack, continuously turning the live web into a single machine-readable format every AI system can consume.

INPUT

The open web

HTML · JS · noise

Billions of messy, inconsistent, render-heavy pages.

PROCESSING · LYRENTH

Parse · clean · normalize

render → extract → structure

Render, strip boilerplate, extract entities, normalize to schema.

OUTPUT

AIDocument JSON

clean · canonical · small

One structured document per URL, versioned and fresh.

CONSUMERS

AI systems

agents · RAG · labs

Read per-URL, or license the corpus in bulk.
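To make the "strip boilerplate" stage concrete, here is a minimal Python sketch of that one step using only the standard library. It is not Lyrenth's pipeline: the `SKIP_TAGS` set, the `BoilerplateStripper` class, and the `to_document` helper are hypothetical, the sketch assumes well-formed HTML, and it does no rendering or entity extraction.

```python
from html.parser import HTMLParser

# Tags treated as boilerplate in this sketch (an assumed rule set).
SKIP_TAGS = {"nav", "script", "style", "footer", "aside", "header", "form"}


class BoilerplateStripper(HTMLParser):
    """Collects text that sits outside any boilerplate tag."""

    def __init__(self) -> None:
        super().__init__()
        self.skip_depth = 0           # > 0 while inside a boilerplate tag
        self.blocks: list[str] = []   # cleaned text blocks, in page order

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse whitespace
        if text and self.skip_depth == 0:
            self.blocks.append(text)


def to_document(url: str, html: str) -> dict:
    """Return a small dict shaped loosely like the AIDocument example."""
    parser = BoilerplateStripper()
    parser.feed(html)
    parser.close()
    return {"url": url, "content": parser.blocks}


page = (
    "<html><body><nav>Home Pricing</nav>"
    "<article><p>The actual content lives here.</p></article>"
    "<script>track()</script><footer>38 links</footer></body></html>"
)
print(to_document("example.com/article", page))
```

A production extractor would also need to handle unclosed tags, rendered JavaScript, and content-density heuristics, none of which this sketch attempts.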

06 · PER-URL RETRIEVAL

One request.
One clean document.

Point Lyrenth at any URL and receive a structured AIDocument (content, entities, links, and metadata) ready to drop into a context window or a vector store.

Every read resolves against the shared index, not the origin. When a thousand agents request the same URL, the origin sees one fetch and everyone else is served from cache in milliseconds. And most of the web doesn't change between reads: for the typical page, the cached copy is the page. No other fetcher amortizes like this: a per-call scraper re-fetches the same unchanged page for every customer, every time. When freshness actually matters, force_refresh re-crawls on demand.
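The amortization described above can be illustrated with a toy cross-caller cache. This is a sketch in Python under stated assumptions: the `SharedIndex` class, its `read` method, and `fake_origin` are hypothetical, and only the `force_refresh` name comes from the text. How Lyrenth actually stores, expires, or refreshes documents is not described here.

```python
from typing import Callable


class SharedIndex:
    """Toy shared cache: the origin is fetched once per URL unless a refresh is forced."""

    def __init__(self, fetch_origin: Callable[[str], str]) -> None:
        self._fetch_origin = fetch_origin
        self._cache: dict[str, str] = {}
        self.origin_fetches = 0  # how many times the origin was actually hit

    def read(self, url: str, force_refresh: bool = False) -> str:
        if force_refresh or url not in self._cache:
            self._cache[url] = self._fetch_origin(url)
            self.origin_fetches += 1
        return self._cache[url]


def fake_origin(url: str) -> str:
    """Stand-in for a real crawl; returns a placeholder document."""
    return f"document for {url}"


index = SharedIndex(fake_origin)

# A thousand callers ask for the same URL.
for _ in range(1000):
    index.read("example.com/article")
print(index.origin_fetches)  # 1

# One caller needs a fresh copy.
index.read("example.com/article", force_refresh=True)
print(index.origin_fetches)  # 2
```

The sketch is single-threaded. A concurrent version would also need to coalesce simultaneous misses for the same URL so that they trigger only one origin fetch.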

REST · per-URL read
1 fetch · every caller, cross-caller cache
JSON · Stable schema
GET /v1/read?url=example.com/article
POST /v1/aidocument
GET /v1/stats
AIDocument.json · 200 · clean
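A per-URL read could be issued from Python roughly as follows. Only the `/v1/read` path and the `url` query parameter come from the endpoint list above. The host `api.lyrenth.example` is a placeholder, the helper names are hypothetical, and authentication is left out because the page mentions an API key without saying how it is sent.

```python
import json
import urllib.request
from urllib.parse import urlencode

BASE_URL = "https://api.lyrenth.example"  # placeholder host, not a real endpoint


def build_read_url(target: str) -> str:
    """Build the GET /v1/read URL for a target page."""
    # safe="/" keeps path slashes readable, matching the example request.
    query = urlencode({"url": target}, safe="/")
    return f"{BASE_URL}/v1/read?{query}"


def read(target: str) -> dict:
    """Fetch and decode the JSON response (assumes an unauthenticated GET)."""
    with urllib.request.urlopen(build_read_url(target)) as response:
        return json.load(response)


print(build_read_url("example.com/article"))
```

Only `build_read_url` runs here; `read` is shown for shape and will not resolve against the placeholder host.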
07 · BUILT FOR

Every system that
needs to read the web.

If it consumes web data, it runs better on clean AIDocuments than on raw HTML.

01 · Autonomous agents: Browse and act with structured pages instead of burning context on markup. (PER-URL · STREAM)
02 · AI assistants: Answer with fresh, normalized web content and clean citations. (RETRIEVE · CITE)
03 · AI search & discovery: Build on a corpus already structured for ranking and recall. (SEARCH · RANK)
04 · RAG systems: Skip the scrape-and-clean pipeline and embed AIDocuments directly. (EMBED · INDEX)
05 · Model labs: Train and evaluate on a clean, deduplicated, web-scale corpus. (CORPUS · BULK)
06 · Enterprise & research: Monitor, extract, and analyze the live web as structured data. (PIPELINE · API)
Not raw crawling
Not scraping-as-a-service
Not a normal search engine
Not another data broker
08 · THE CATEGORY

Lyrenth is the
AI-readable
web index.

The web, transformed into clean structured data that AI systems can read, search, and use directly.

GET STARTED

Read the web
like a machine.

Get an API key and pull your first AIDocument in under a minute. Web-scale, structured, and live.