From the team

What we learned building Neureil

Real write-ups on feeding web data to LLMs, cutting token costs, and the iterations that led to where we are today.

Our Journey
v0.1 — alpha
Late March 2026
We started with raw fetch
The first version was just a wrapper around Node's fetch. Pass a URL, get back raw HTML. Simple. We tested it with GPT-4o on a few news articles and the token counts were brutal. One BBC article came back at over 11,000 tokens. The model spent 80% of its budget reading nav bars, cookie notices, and comment sections before it hit the actual story.
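For flavor, v0.1 was essentially this (a sketch; fetchRaw is an illustrative name, not anything we shipped):

// v0.1, more or less: fetch the URL, return whatever comes down
async function fetchRaw(url) {
  const res = await fetch(url)
  return res.text() // raw HTML, nav bars, cookie notices and all
}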
v0.2 — readability layer
Early April 2026
Added Mozilla Readability, saw the first real drop
We dropped Mozilla's Readability library in and reran our 50-URL test set. Average token count went from around 8,400 down to 1,100 per page. That was the first moment we thought this could be a real product. But it still had issues: tables came out malformed, code blocks lost their indentation, and a lot of metadata was missing.
avg 8,400 tokens → 1,100 tokens
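The extraction step was the standard Readability recipe. Roughly this, assuming html is the fetched page and url its address:

import { Readability } from '@mozilla/readability'
import { JSDOM } from 'jsdom'

// parse the HTML into a DOM, then let Readability pull the article out
const dom = new JSDOM(html, { url }) // passing url helps resolve relative links
const article = new Readability(dom.window.document).parse()
// article.title, article.byline, article.textContent, ...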
v0.3 — structured output
Mid April 2026
Switched from plain text to structured JSON
Plain text output wasn't enough for agents that needed to understand page structure. We redesigned the output to be a structured JSON object with separate fields for title, author, published date, main content, and code blocks. This made it usable as context without any post-processing on the caller's side. We also fixed the table and code block handling.
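The schema has evolved since then (the current response shape appears further down), but the v0.3 output looked roughly like this, with illustrative field names:

// roughly the v0.3 shape (illustrative)
{
  "title": "...",
  "author": "...",
  "published": "...",
  "content": "...",
  "codeBlocks": ["..."]
}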
v0.4 — caching + benchmarks
Late April 2026
Built caching, ran the proper benchmark
We added a 24-hour response cache so repeated requests to the same URL are instant. Then we did a proper benchmark across 200 URLs covering news, documentation sites, e-commerce, and developer blogs. The result was an average 92% token reduction versus raw HTML. We published those numbers on the landing page and they have held up since. P95 latency for uncached requests was 847ms.
92% token reduction · 200 URL benchmark
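The cache itself is conceptually simple. A minimal in-memory sketch (the production version lives in shared storage; extract stands in for the real extraction path):

// 24-hour TTL cache, sketched with a Map
const TTL_MS = 24 * 60 * 60 * 1000
const cache = new Map()

async function cachedExtract(url) {
  const hit = cache.get(url)
  if (hit && Date.now() - hit.at < TTL_MS) return hit.data // cache hit: instant
  const data = await extract(url) // stand-in for the real extraction path
  cache.set(url, { at: Date.now(), data })
  return data
}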
v0.5 — JS rendering
Early May 2026
Added JavaScript rendering for SPAs
A lot of modern sites, especially documentation and product pages, render their content client-side with React or Vue. Plain fetch returned near-empty HTML for those. We added a headless rendering path via Puppeteer that kicks in when the initial fetch returns insufficient content. The routing logic picks the fastest path automatically so you never have to think about it.
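A sketch of that routing, using a simple byte-count heuristic as a stand-in for the real insufficient-content check:

import puppeteer from 'puppeteer'

const MIN_CONTENT_BYTES = 2048 // illustrative threshold, not our actual heuristic

async function getHtml(url) {
  const html = await fetch(url).then(r => r.text())
  if (html.length >= MIN_CONTENT_BYTES) return html // plain fetch was enough

  // SPA case: render client-side content in headless Chrome
  const browser = await puppeteer.launch()
  try {
    const page = await browser.newPage()
    await page.goto(url, { waitUntil: 'networkidle0' })
    return await page.content()
  } finally {
    await browser.close()
  }
}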
v1.0 — public launch
May 13, 2026
Launched publicly with API keys and billing
After six weeks of internal testing, with a few friends running it in their agents, we launched publicly with a full API key dashboard, usage tracking, and two pricing tiers. The product you are reading about today. Still lots to build.
Latest posts
// before neureil — raw html as context
const html = await fetch(url).then(r => r.text())
// 9,200 tokens. mostly nav & ads.
 
// after — clean extraction
const data = await fetch('https://api.neureil.com/extract', {
  method: 'POST',
  headers: { 'content-type': 'application/json' },
  body: JSON.stringify({ url })
}).then(r => r.json())
// 740 tokens. just the content.
How to Feed Web Data to an LLM Without Wasting 90% of Your Context Window
Raw HTML is the worst possible input for a language model. Here is what actually happens when you pass a webpage as context, why it breaks your agents, and how to fix it.
Read post
// token usage audit — gpt-4o, 1000 requests
const rawHtml = { avg: 8420, total: 8_420_000 }
const cleaned = { avg: 674, total: 674_000 }
 
const costPer1M = 2.50 // gpt-4o input
const savings = (rawHtml.total - cleaned.total) / 1_000_000 * costPer1M
console.log(`Saved $${savings.toFixed(2)} per 1k calls`)
// Saved $19.37 per 1k calls
How We Cut Token Costs by 92% for AI Agents That Browse the Web
LLM bills scale with tokens. If your agent reads web pages, most of those tokens are garbage. We ran the numbers, and the waste is more expensive than you think.
Read post
// traditional pipeline
async function scrape(url) {
  const html = await fetch(url).then(r => r.text())
  const $ = cheerio.load(html) // hand-roll selectors
  return $('.article-body p').text() // breaks every deploy
}
 
// ai-ready pipeline
const result = await neureil.extract(url)
// always works, zero selectors
Web Scraping for AI Pipelines Is Completely Different from Traditional Scraping
CSS selectors and XPath were built for extracting specific fields from known page layouts. AI pipelines need something else entirely. Here is what changed.
Read post
// neureil response shape
{
"title": "Understanding Transformers",
"author": "Andrej Karpathy",
"published": "2025-11-04",
"content": "The attention mechanism...",
"tokens": 812,
"cached": false
}
Why Structured Data Extraction APIs Are Replacing Custom Scrapers in AI Stacks
Developers used to write custom scrapers for every site. Now that AI is the consumer, the requirements have completely changed. Structure beats selectors.
Read post