
HTML to Markdown: cut token count by up to 85%

Send raw HTML to LeanTokn, get back clean Markdown. Fewer tokens, same meaning. Works with GPT-4, Claude, Gemini, Llama — any LLM that reads text.

avg. 82% token reduction
< 200 ms p50 latency
1M tokens free / month
live demo

Paste any HTML. See the Markdown.

No API key needed · free demo
why it matters

Tokens are money. HTML wastes most of them.

HTML is token-heavy by design

A typical web page carries navigation bars, nested divs, inline styles, data attributes, and scripts — none of which a language model uses. HTML often carries 6–10× more tokens than the same content in Markdown.

Same content, fraction of the cost

LeanTokn preserves every heading, paragraph, list, table, link, and code block. Your model reads the same information. You pay for roughly 15% of the tokens.

One line in your pipeline

No library to install. No parsing logic to maintain. One HTTP POST before you embed or prompt. Works from any language or runtime that can make an HTTP request.
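
The "6–10×" ratio and the "15% of the tokens" figure above are two views of the same quantity. A minimal sketch of the arithmetic, with hypothetical function names chosen for illustration:

```python
def reduction_from_ratio(ratio: float) -> float:
    """Fraction of tokens removed when the HTML holds `ratio` times
    as many tokens as the equivalent Markdown."""
    return 1 - 1 / ratio


def ratio_from_remaining(remaining: float) -> float:
    """HTML-to-Markdown token ratio implied by the share of tokens left."""
    return 1 / remaining


# A 6x ratio removes about 83% of tokens; a 10x ratio removes 90%.
low = reduction_from_ratio(6)
high = reduction_from_ratio(10)

# Paying for 15% of the tokens corresponds to a ratio of about 6.7x.
implied_ratio = ratio_from_remaining(0.15)
```

So a page at the low end of the range keeps about 17% of its tokens and one at the high end keeps 10%.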

use cases

Built for developers who feed the web to LLMs

RAG pipelines

Scrape, trim, embed. Smaller chunks → better retrieval.

Web search agents

Trim search results before passing to your agent.

Documentation Q&A

Index HTML docs as Markdown for cleaner context.

Content summarisation

Trim article HTML before sending to a summariser.

LLM chatbots

Trim scraped product pages for shopping assistants.

Data extraction

Clean structured content from messy HTML before parsing.

integration

Drop-in. Any language.

curl -X POST https://api.leantokn.com/v1/trim \
  -H "Authorization: Bearer ltk_..." \
  -H "Content-Type: application/json" \
  -d '{"prompt": "<html>...</html>", "type": "html"}'
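
The same request can be sketched in Python using only the standard library. The endpoint, headers, and body fields are taken from the curl call above. The function names and the timeout are illustrative, and the code assumes the API answers with JSON, which this page does not show.

```python
import json
import urllib.request

API_URL = "https://api.leantokn.com/v1/trim"  # endpoint from the curl example


def build_trim_request(html: str, api_key: str) -> urllib.request.Request:
    """Build the same POST the curl example sends, without sending it."""
    body = json.dumps({"prompt": html, "type": "html"}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def trim_html(html: str, api_key: str, timeout: float = 10.0):
    """Send the request and return the decoded response.

    Assumption: the response body is JSON. Its schema is not documented
    here, so the parsed object is returned as-is instead of guessing at
    field names.
    """
    request = build_trim_request(html, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Call `trim_html(page_html, your_key)` with your own key. Keeping request construction separate from sending makes the payload easy to inspect or unit-test without touching the network.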
benchmarks

Real-world token savings

Measured on real pages using cl100k_base tokenisation, the encoding used by GPT-4. Other models, including Claude, use their own tokenizers, so absolute counts will differ somewhat.

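| Content type | Input tokens | Output tokens | Reduction |
| --- | --- | --- | --- |
| News article | 4,200 | 680 | 84% |
| Documentation page | 6,100 | 1,100 | 82% |
| E-commerce product page | 3,800 | 490 | 87% |
| GitHub README (raw HTML) | 2,900 | 520 | 82% |
| Wikipedia article | 9,400 | 1,600 | 83% |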
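
The reduction column follows directly from the two token columns. A short sketch that recomputes it from the figures in the table (the variable and function names are illustrative):

```python
# (content type, input tokens, output tokens), copied from the table above
BENCHMARKS = [
    ("News article", 4200, 680),
    ("Documentation page", 6100, 1100),
    ("E-commerce product page", 3800, 490),
    ("GitHub README (raw HTML)", 2900, 520),
    ("Wikipedia article", 9400, 1600),
]


def reduction_pct(input_tokens: int, output_tokens: int) -> float:
    """Percentage of input tokens removed by the conversion."""
    return (1 - output_tokens / input_tokens) * 100


# Per-row reduction, rounded to whole percent as in the table.
per_row = {name: round(reduction_pct(i, o)) for name, i, o in BENCHMARKS}

# Pooled reduction: total tokens out versus total tokens in.
total_in = sum(i for _, i, _ in BENCHMARKS)
total_out = sum(o for _, _, o in BENCHMARKS)
pooled = reduction_pct(total_in, total_out)
```

Across these five pages, 26,400 input tokens become 4,390 output tokens, a pooled reduction of about 83%.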
faq

Common questions

What is the difference between this and html2text or pandoc?

html2text and pandoc are local tools: they require installation and maintenance, and their output can change between versions. LeanTokn is a hosted API: one HTTP call, no dependencies, consistent output. It also returns exact token counts using the cl100k_base tokenizer (the encoding GPT-4 uses), so you know precisely how much context you saved.

Does it preserve links, tables, and code blocks?

Yes. LeanTokn preserves the semantic structure of your HTML — headings, links, ordered and unordered lists, tables, blockquotes, inline code, and fenced code blocks. Navigation bars, ads, scripts, style tags, and boilerplate are stripped.

How do I use this in Python, Node.js, or Go?

Any language that can make an HTTP POST request works. The curl example above shows the request shape. For Python, Node.js, Go, Ruby, PHP, or any other language, POST to /v1/trim with your API key in the Authorization header and body { "prompt": "<html>", "type": "html" }.

Is there a free tier?

Yes. All accounts include 1 million tokens saved per month on low-intensity calls at no charge. HTML→Markdown conversion is a low-intensity operation. No credit card required to start.

What is the maximum input size?

The API accepts up to 1 MB per request. Most web pages are well under this limit. For very large pages with embedded base64 images, strip the <img src="data:..."> tags before sending.
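
A minimal sketch of that pre-processing step. It assumes "1 MB" means 1,000,000 bytes and that the limit applies to the HTML itself; neither is spelled out here. The regex is a pragmatic shortcut rather than a full HTML parser, and the names are illustrative.

```python
import re

MAX_REQUEST_BYTES = 1_000_000  # assumption: "1 MB" read as 10**6 bytes

# Matches an <img> tag whose src attribute starts with a data: URI.
DATA_URI_IMG = re.compile(
    r"""<img\b[^>]*\bsrc\s*=\s*["']data:[^>]*>""",
    re.IGNORECASE,
)


def strip_inline_images(html: str) -> str:
    """Remove <img> tags that embed their image as a data: URI."""
    return DATA_URI_IMG.sub("", html)


def fits_size_limit(html: str, limit: int = MAX_REQUEST_BYTES) -> bool:
    """Check the UTF-8 encoded size of the HTML against the limit."""
    return len(html.encode("utf-8")) <= limit
```

Images referenced by URL are left untouched; only inline data: payloads are dropped.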

Can I process multiple pages in parallel?

Yes — there is no per-account concurrency limit. Fire as many requests as you need simultaneously. Typical RAG ingestion pipelines process thousands of pages per minute.
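
One way to fan out requests from Python is a thread pool, sketched below. `convert` stands for whatever function performs the POST for a single page (a hypothetical callable, not part of any LeanTokn client), and `max_workers=16` is an arbitrary client-side choice, not a service limit.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def convert_all(
    pages: Iterable[str],
    convert: Callable[[str], str],
    max_workers: int = 16,
) -> List[str]:
    """Run `convert` over every page concurrently.

    Results come back in the same order as the input pages. An
    exception raised by any single call propagates to the caller.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(convert, pages))
```

Threads suit this workload because each call spends most of its time waiting on the network.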

Is this the same as running a local Markdown converter?

Similar outcome, very different workflow. LeanTokn is infrastructure — call it from your pipeline and get back Markdown plus exact token counts. No dependencies to install, no library version drift, works identically across every language and environment.

Start saving tokens today

1 million tokens free every month. No credit card required.

▸ Get an API key
Read the docs ↗