Use case

Web ingestion for RAG without the scrape-and-clean pipeline.

Every RAG pipeline that touches the web ends up rebuilding the same stages: fetch, render, strip, normalize, dedupe. Lyrenth replaces all of them with one call against the AI-readable web index: pages arrive as clean, structured AIDocuments ready to chunk and embed.

Get API access → Free tier · 2,000 reads/mo · no credit card

Chunk-ready Markdown

Boilerplate, navigation, and consent banners are stripped in the index's extraction pipeline; heading structure survives, so your chunker works with real sections.
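Because heading structure survives, a chunker can split on headings rather than on arbitrary character counts. The sketch below is one way to do that with the Python standard library only; it is an illustration, not part of the Lyrenth API, and the function name `chunk_by_heading` and the `path`/`text` keys are assumptions made for this example. It handles ATX-style (`#`) headings only.

```python
import re

# ATX heading: 1-6 '#' characters, whitespace, then the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def chunk_by_heading(markdown: str) -> list[dict]:
    """Split Markdown into one chunk per heading-delimited section.

    Each chunk records its heading path (e.g. ["Guide", "Install"]) so
    the section context can be stored alongside the embedded text.
    """
    chunks: list[dict] = []
    path: list[tuple[int, str]] = []  # stack of (level, title)
    body: list[str] = []

    def flush() -> None:
        # Emit the text collected under the current heading path, if any.
        text = "\n".join(body).strip()
        if text:
            chunks.append({"path": [title for _, title in path], "text": text})
        body.clear()

    in_fence = False
    for line in markdown.splitlines():
        # Track fenced code so '#' comments inside code are not read as headings.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        match = None if in_fence else HEADING.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Drop headings at the same or a deeper level before descending.
            while path and path[-1][0] >= level:
                path.pop()
            path.append((level, match.group(2)))
        else:
            body.append(line)
    flush()
    return chunks
```

Each returned chunk can be embedded as-is, with the heading path kept as metadata or prepended to the text for retrieval context.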

Stable envelope for pipelines

The v2 AIDocument schema is a published contract: source trace, cache truth, structure, and signals in the same place for every URL, forever.

Fewer tokens to embed

Documents come back 40-99% smaller than raw HTML on real pages (benchmarks below), so embedding and reranking budgets stretch further on every refresh cycle.

Freshness without re-scraping

Reads within your plan's freshness window serve from the index; stale pages re-crawl automatically. LangChain and LlamaIndex loaders are one pip install away.
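The freshness rule can be pictured as a simple age check. The sketch below only illustrates the logic described above; it is not Lyrenth's implementation, and the name `is_fresh` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_fresh(indexed_at: datetime, window: timedelta,
             now: Optional[datetime] = None) -> bool:
    """Return True if a page indexed at `indexed_at` is still inside the window.

    A fresh page would be served from the index; a stale one would be
    re-crawled first. Both timestamps are expected to be timezone-aware.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - indexed_at <= window
```

For example, with a 24-hour window a page indexed 6 hours ago counts as fresh, and one indexed 30 hours ago counts as stale.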

Measured, not promised

Real pages, read through the production index, with the economics the API itself reports:

MDN: Overview of HTTP	56,692 raw	5,085 indexed	91.0% saved
Node.js fs module	259,563 raw	102,372 indexed	60.6% saved
PostgreSQL SELECT reference	29,509 raw	17,500 indexed	40.7% saved
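The "saved" column follows directly from the raw and indexed sizes: saved = (1 - indexed / raw) × 100. A minimal sketch to reproduce the figures, or to run the same arithmetic on your own corpus; the helper name `percent_saved` is ours, and the calculation is the same whatever unit the two sizes share.

```python
def percent_saved(raw: int, indexed: int) -> float:
    """Percentage reduction from the raw size to the indexed size, to one decimal."""
    if raw <= 0:
        raise ValueError("raw size must be positive")
    return round((1 - indexed / raw) * 100, 1)


# (raw, indexed) pairs copied from the rows above.
benchmarks = {
    "MDN: Overview of HTTP": (56_692, 5_085),
    "Node.js fs module": (259_563, 102_372),
    "PostgreSQL SELECT reference": (29_509, 17_500),
}

for page, (raw, indexed) in benchmarks.items():
    print(f"{page}: {percent_saved(raw, indexed)}% saved")
```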

All ten benchmarks

Point it at the sources your RAG system already ingests: the free tier's 2,000 reads a month covers a real evaluation corpus, no card required.

Get API access → First 1,000 builders: Starter free for 3 months