Use case
Web ingestion for RAG without the scrape-and-clean pipeline.
Every RAG pipeline that touches the web ends up rebuilding the same stages: fetch, render, strip, normalize, dedupe. Lyrenth replaces all of them with one call against the AI-readable web index: pages arrive as clean, structured AIDocuments ready to chunk and embed.
Get API access → Free tier · 2,000 reads/mo · no credit card
Chunk-ready Markdown
Boilerplate, navigation, and consent banners are stripped in the index's extraction pipeline; heading structure survives, so your chunker works with real sections.
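Because heading structure survives extraction, a chunker can split on headings instead of on fixed character counts. The following is a minimal sketch of that idea, not Lyrenth's own chunker: `Chunk` and `chunk_by_heading` are hypothetical names, and the input is assumed to be plain Markdown with ATX (`#`) headings.

```python
import re
from dataclasses import dataclass

# ATX heading: one to six '#' characters, whitespace, then the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
FENCE = "`" * 3  # code-fence marker; '#' lines inside fences are not headings


@dataclass
class Chunk:
    path: list[str]  # heading trail, e.g. ["HTTP", "Caching"]
    text: str        # section body without the heading line


def chunk_by_heading(markdown: str) -> list[Chunk]:
    """Split Markdown into one chunk per section, keeping the heading trail."""
    chunks: list[Chunk] = []
    trail: list[tuple[int, str]] = []  # stack of (level, title)
    body: list[str] = []
    in_fence = False

    def flush() -> None:
        # Emit the accumulated body under the current heading trail.
        text = "\n".join(body).strip()
        if text:
            chunks.append(Chunk([title for _, title in trail], text))
        body.clear()

    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        match = None if in_fence else HEADING.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Drop headings at the same or deeper level before pushing.
            while trail and trail[-1][0] >= level:
                trail.pop()
            trail.append((level, match.group(2)))
        else:
            body.append(line)
    flush()
    return chunks
```

Each chunk carries its heading trail, which can be prepended to the text before embedding so that a retrieved passage keeps its section context.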
Stable envelope for pipelines
The v2 AIDocument schema is a published contract: source trace, cache truth, structure, and signals in the same place for every URL, forever.
Fewer tokens to embed
Indexed pages come out roughly 40-91% smaller than raw HTML in the benchmarks below, so embedding and reranking budgets stretch further on every refresh cycle.
Freshness without re-scraping
Reads within your plan's freshness window are served from the index; stale pages are re-crawled automatically. LangChain and LlamaIndex loaders are one pip install away.
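The freshness rule above amounts to a simple age comparison. The sketch below only illustrates that rule and makes no claim about how the index implements it: `read_source`, `indexed_at`, and `freshness_window` are hypothetical names, and the window length is whatever your plan specifies.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def read_source(
    indexed_at: datetime,
    freshness_window: timedelta,
    now: Optional[datetime] = None,
) -> str:
    """Return "index" if the stored copy is fresh enough, else "recrawl"."""
    now = now or datetime.now(timezone.utc)
    age = now - indexed_at
    return "index" if age <= freshness_window else "recrawl"


# Example with a hypothetical 24-hour window and a fixed clock.
now = datetime(2025, 1, 2, 12, 0, tzinfo=timezone.utc)
window = timedelta(hours=24)
print(read_source(now - timedelta(hours=3), window, now))   # index
print(read_source(now - timedelta(hours=30), window, now))  # recrawl
```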
Measured, not promised
Real pages, read through the production index, with the economics the API itself reports:
| Page | Raw | Indexed | Saved |
| --- | --- | --- | --- |
| MDN: Overview of HTTP | 56,692 | 5,085 | 91.0% |
| Node.js fs module | 259,563 | 102,372 | 60.6% |
| PostgreSQL SELECT reference | 29,509 | 17,500 | 40.7% |
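The savings column follows directly from the raw and indexed figures: saved = 1 − indexed / raw. A short check, using only the numbers in the table (`percent_saved` is an illustrative helper, not part of any API):

```python
def percent_saved(raw: int, indexed: int) -> float:
    """Share of the raw size removed by indexing, in percent, to one decimal."""
    return round(100 * (1 - indexed / raw), 1)


rows = {
    "MDN: Overview of HTTP": (56_692, 5_085),
    "Node.js fs module": (259_563, 102_372),
    "PostgreSQL SELECT reference": (29_509, 17_500),
}

for page, (raw, indexed) in rows.items():
    print(f"{page}: {percent_saved(raw, indexed)}% saved")
```

Running the same calculation on your own corpus gives a quick estimate of how far an embedding budget would stretch.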
Point it at the sources your RAG system already ingests: the free tier's 2,000 reads a month covers a real evaluation corpus, no card required.
Get API access → First 1,000 builders: Starter free for 3 months