In 2024, a developer named Jeremy Howard proposed a tiny text file that was supposed to fix AI search. By late 2025, more than 844,000 websites had added one. There's just one problem: the AI companies it was built for mostly ignore it.
That file is llms.txt. If you run a blog and you've watched ChatGPT, Claude, and Perplexity start quoting websites in their answers, you've probably wondered whether you need one too.
Here's the short answer. llms.txt is a plain-text file that lists your most important pages for AI crawlers to read. As of May 2026, no major AI provider, not OpenAI, Anthropic, or Google, confirms that it reads yours. Adding one costs five minutes and won't hurt. It also won't get you cited.
This guide cuts through the hype. You'll learn what llms.txt is, what the spec says, who actually reads it (almost nobody), and the handful of things that genuinely move your blog into AI answers. No fluff, no "the future is here" hand-waving. Just what the server logs show.
What is llms.txt?
llms.txt is defined as a Markdown file, placed at the root of your domain (yoursite.com/llms.txt), that hands large language models a curated map of your site's most important content. Jeremy Howard, co-founder of Answer.AI, published the spec on September 3, 2024.
Think of it as a menu, not a meal. Instead of forcing an AI crawler to wade through your nav bars, cookie banners, and footer links, you hand it a clean list: here are my best pages, here's what each one covers. The format is human-readable on purpose.
It's often compared to two files you already know:
robots.txt tells crawlers what they're allowed to access.
sitemap.xml lists every URL so search engines can find them.
llms.txt curates and summarizes your best content for language models.
The pitch is simple. AI models have limited context windows. A tidy Markdown summary theoretically lowers the cost of understanding your site, which should raise your odds of being quoted. Theoretically. We'll get to why that hasn't played out yet.
What an llms.txt file actually looks like
The spec is short. The only required element is an H1 with your site name. After it come an optional blockquote summary, optional free-form notes, and H2 sections that list links in Markdown. Each link can carry a short note after a colon.
Here's a minimal example for a blog:
# Acme Blog
> Practical guides on indie SaaS marketing, SEO, and shipping fast.
## Core guides
- [SEO for founders](https://acme.com/blog/seo-for-founders): Our starter guide
- [Programmatic SEO 101](https://acme.com/blog/pseo-101): How to publish at scale
- [AI content that ranks](https://acme.com/blog/ai-content): What works in 2026
## Optional
- [Changelog](https://acme.com/changelog): Product updates
The ## Optional section is special. The spec says the links under it can be skipped when a shorter context is needed. The format is Markdown rather than XML because the spec expects these files to be read mostly by language models and agents, though it is still precise enough for an ordinary parser to handle.
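To make the Optional rule concrete, here is a minimal sketch of how a consumer might parse the file and drop that section when context is tight. It is an illustration, not a reference parser: the function names, the regex, and the `include_optional` flag are all assumptions made for this example, and it only handles the simple shape shown above.

```python
import re

# Matches "- [Title](url): optional note" (simplified; assumed for this sketch)
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<note>.*))?$"
)


def parse_llms_txt(text):
    """Return (title, summary, sections); sections maps H2 name -> list of links."""
    title, summary, sections, current = None, None, {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and title is None:
            title = line[2:].strip()
        elif line.startswith("> ") and summary is None and current is None:
            summary = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                sections[current].append(match.groupdict())
    return title, summary, sections


def links_for_context(sections, include_optional=True):
    """Flatten the link lists, skipping the 'Optional' section if asked to."""
    links = []
    for name, items in sections.items():
        if name == "Optional" and not include_optional:
            continue
        links.extend(items)
    return links


SAMPLE = """# Acme Blog
> Practical guides on indie SaaS marketing, SEO, and shipping fast.
## Core guides
- [SEO for founders](https://acme.com/blog/seo-for-founders): Our starter guide
- [Programmatic SEO 101](https://acme.com/blog/pseo-101): How to publish at scale
- [AI content that ranks](https://acme.com/blog/ai-content): What works in 2026
## Optional
- [Changelog](https://acme.com/changelog): Product updates
"""

title, summary, sections = parse_llms_txt(SAMPLE)
short_list = links_for_context(sections, include_optional=False)
print(title, "-", len(short_list), "links kept when context is tight")
```

The point of the sketch is the last function: everything under Optional is the first thing a reader is free to throw away.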
That's the whole standard. Copy the block above, swap in your URLs, save it as llms.txt, and upload it to your root directory. You're done in five minutes.
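If you'd rather generate the file than hand-edit it, a few lines of Python will do. This is a sketch under assumptions: `build_llms_txt` and its argument shapes are invented here, and the page list is the same made-up Acme example, not anything the spec prescribes.

```python
def build_llms_txt(site_name, summary, sections):
    """Render an llms.txt body.

    sections: dict mapping an H2 heading to a list of (title, url, note) tuples.
    """
    lines = [f"# {site_name}", ""]
    if summary:
        lines += [f"> {summary}", ""]
    for heading, links in sections.items():
        lines += [f"## {heading}", ""]
        for title, url, note in links:
            entry = f"- [{title}]({url})"
            if note:
                entry += f": {note}"
            lines.append(entry)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical content, mirroring the example above
content = build_llms_txt(
    "Acme Blog",
    "Practical guides on indie SaaS marketing, SEO, and shipping fast.",
    {
        "Core guides": [
            ("SEO for founders", "https://acme.com/blog/seo-for-founders", "Our starter guide"),
            ("Programmatic SEO 101", "https://acme.com/blog/pseo-101", "How to publish at scale"),
        ],
        "Optional": [
            ("Changelog", "https://acme.com/changelog", "Product updates"),
        ],
    },
)
print(content)
```

Write the result to a file named llms.txt in your site's root (for example with `pathlib.Path("llms.txt").write_text(content, encoding="utf-8")`) and regenerate it whenever your page list changes.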
llms.txt vs robots.txt vs sitemap.xml
These three files get lumped together, but they do different jobs. Two of them are established standards that crawlers honor. One is not. Here's how they stack up.
| File | What it does | Who reads it | Official standard? |
|---|---|---|---|
| robots.txt | Sets crawl permissions (allow / disallow) | All major search and AI crawlers | Yes: in use since 1994, formalized as RFC 9309 in 2022 |
| sitemap.xml | Lists every URL for discovery | Google, Bing, most crawlers | Yes, backed by Google and Bing |
| llms.txt | Curates and summarizes top pages for LLMs | Almost no one (see below) | No, a 2024 proposal |
The gap in that last column is the whole story. robots.txt and sitemap.xml get honored because search engines committed to them. llms.txt is a proposal that the major AI labs never agreed to support.
So when a tool promises that llms.txt is "the standard ChatGPT and Claude use," check the last column again. A file one developer published is not a standard until the companies it targets adopt it. As of May 2026, they haven't.
Do ChatGPT, Claude, and Perplexity read llms.txt?
No. As of May 2026, no major AI provider has confirmed that its crawler fetches or uses your llms.txt file. This is the single most important fact in this guide, and most articles bury it.
Google's John Mueller put it bluntly in mid-2025:
"No AI system currently uses llms.txt. It's super-obvious if you look at your server logs. The consumer LLMs/chatbots will fetch your pages, for training and grounding, but none of them fetch the llms.txt file."
That last line matters. The bots that crawl your real HTML, like GPTBot, ClaudeBot, and PerplexityBot, just don't ask for the summary file you made for them. Here's where each major player stands.
| Provider | Publishes its own llms.txt? | Confirmed it reads yours? |
|---|---|---|
| OpenAI (ChatGPT) | Yes, for docs | No |
| Anthropic (Claude) | Yes, for docs | No |
| Google (Gemini, AI Overviews) | No | No, publicly declined |
| Perplexity | No | No official confirmation |
Notice the trap. OpenAI and Anthropic publish llms.txt files for their own developer docs. People see that and assume the crawlers must read everyone else's. They don't. Publishing a file and consuming one are different things.
The contrarian truth: a solution in search of a problem
Here's the part the hype skips. llms.txt may not just be ineffective today. It may be structurally pointless.
Ryan Law, Director of Content Marketing at Ahrefs, ran the analysis and didn't hedge. "There's no evidence that llms.txt improves AI retrieval, boosts traffic, or enhances model accuracy," he wrote, before calling it a solution in search of a problem.
John Mueller compared it to the keywords meta tag. That comparison stings if you know SEO history. The keywords meta tag let site owners declare their own topics. Search engines ignored it for years because anyone can lie about their own pages. llms.txt has the same flaw. It's a self-reported file the site controls, so a smart crawler trusts your actual content over your summary of it.
There's also a chicken-and-egg problem. A model only benefits from llms.txt if it fetches the file. None of them fetch it. So even a perfectly written llms.txt sits at your root, unread, while ClaudeBot and GPTBot crawl your real HTML. That's not a knock on the idea. It's just what the logs show.
The Crawl-Read-Cite framework
To see why llms.txt underdelivers, use a simple model. For an AI engine to quote your blog, three things have to happen in order. Call it the Crawl-Read-Cite framework.
Crawl — an AI bot has to fetch the page. If it never requests a URL, nothing downstream matters.
Read — the bot has to parse and understand the content: clean HTML, clear headings, structured data.
Cite — the model has to judge your page worth quoting in an answer over every competitor.
Now map llms.txt onto that. It aims to help with step two, Read, by handing over a tidy summary. But it depends on step one, Crawl, because a bot has to fetch the llms.txt file itself. Since no major bot fetches it, llms.txt fails before it can help. It's a step-two fix that never clears step one.
The lesson isn't "ignore AI search." It's "spend your effort where the funnel actually leaks." For almost every blog, that means making your real pages easy to Crawl, Read, and Cite, not maintaining a file no crawler requests.
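One Crawl-step check you can run today: make sure your robots.txt isn't blocking the AI bots you want. The sketch below uses Python's standard `urllib.robotparser`. The robots.txt content and the URL are made up for illustration, and `blocked_bots` is a helper name invented for this example.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens for the AI crawlers discussed in this guide
AI_BOTS = ["GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot"]


def blocked_bots(robots_txt, url, bots=AI_BOTS):
    """Return the bots that this robots.txt text forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in bots if not parser.can_fetch(bot, url)]


# Hypothetical robots.txt that blocks one AI crawler and allows everyone else
ROBOTS = """User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

print(blocked_bots(ROBOTS, "https://acme.com/blog/seo-for-founders"))
```

If a bot you care about shows up in that list, no amount of Read or Cite work will matter for it, because the page never gets fetched.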
What actually gets your blog cited by AI
If llms.txt won't do it, what will? The same moves from any solid answer engine optimization playbook, applied to pages crawlers already fetch. The data here is real, not aspirational.
Start with structure and data, because both correlate with citations:
Lead with the answer. Roughly 44% of all ChatGPT citations come from the first third of a page, per a 2025 study of citation patterns. Put the direct answer up top.
Add statistics. The Princeton GEO study (KDD 2024) found that adding citations, quotes, and statistics lifted AI visibility by 30 to 40%. Pages with five or more stats get cited noticeably more often.
Use tables and lists. Comparison pages with three or more tables earn around 26% more ChatGPT citations than prose-only pages.
Earn real authority. The same E-E-A-T trust signals that get you cited in AI Overviews and ChatGPT, like backlinks, brand mentions, and depth, carry over from classic search.
Then handle the plumbing that lets crawlers Read you: semantic HTML, descriptive headings, fast load times, and blog schema markup so machines can parse your FAQs and how-tos. None of that lives in an llms.txt file. All of it lives in your actual pages.
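As one example of that plumbing, here is a rough sketch that turns question-and-answer pairs into schema.org FAQPage markup. The `faq_jsonld` helper and the sample question are assumptions for illustration. Your CMS or theme may already emit this for you, so check your page source before adding a second copy.

```python
import json


def faq_jsonld(pairs):
    """Build a JSON-LD <script> tag for a list of (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    # Escape "</" so answer text can never close the script tag early
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"


snippet = faq_jsonld(
    [
        (
            "What is an llms.txt file?",
            "A Markdown file at the root of your domain that lists your most important pages.",
        ),
    ]
)
print(snippet)
```

Paste the output into the page's HTML, and keep the answers identical to the visible text on the page.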
Where llms.txt might still make sense
There's one place llms.txt genuinely earns its keep, and it isn't blog SEO. Jeremy Howard built it for coding agents and documentation. When a developer points Cursor or Claude at your API docs, a clean llms.txt can hand the agent the exact pages it needs without burning context on navigation.
That's why the real adopters are developer-tool companies: Mintlify, Cloudflare, Tinybird, and Anthropic's own docs. For a documentation site or an API reference that agents pull from directly, llms.txt is useful today.
The trouble started when SEO tools rebranded a docs-and-agents convention as a "get cited by ChatGPT" tactic for marketing blogs. Those are two different jobs. Nobody is manually feeding your blog into Cursor. And the search crawlers behind AI answers don't fetch the file. So the original use case works. The SEO use case that got hyped doesn't. Knowing the difference saves you from chasing the wrong win.
Should you add an llms.txt anyway?
Maybe. The honest answer is that it's cheap insurance, not a growth tactic. If a major lab adopts llms.txt next year, you'll already be set. If they never do, you spent five minutes. Just don't expect traffic from it, and don't let it crowd out the work that ranks you.
Adoption is climbing fast off a tiny base, which is why it feels bigger than it is:
BuiltWith counted 844,000-plus sites with an llms.txt file by late October 2025.
SE Ranking found only 10.13% of nearly 300,000 domains had one.
A crawl of the Majestic Million grew from 15 sites in February 2025 to 105 by May, a 600% jump from almost nothing.
If you want one, keep it accurate and let it follow your real content. This is exactly the kind of plumbing an AI publishing setup should handle for you. Quillly, for instance, serves its own llms.txt at quillly.com/llms.txt and auto-generates the files that crawlers actually read, your XML sitemap and RSS feed, every time your AI ships a post with publish_blog. The goal is to never hand-maintain this stuff again.
Whatever you choose, serve your llms.txt with a noindex header (X-Robots-Tag: noindex on the HTTP response) so it doesn't surface as a thin page in Google. Mueller recommended exactly that.
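What that header looks like in practice depends on your stack. Most sites would set it in their web server or CDN config. Purely as an illustration, here is a tiny WSGI app that serves the file with `X-Robots-Tag: noindex`. The file body, the `app` function, and the `call` helper are all made up for this sketch.

```python
# Hypothetical file contents; in real use you would read llms.txt from disk
LLMS_BODY = b"# Acme Blog\n\n> Practical guides on indie SaaS marketing.\n"


def app(environ, start_response):
    """Minimal WSGI app: serve /llms.txt with a noindex header, 404 otherwise."""
    if environ.get("PATH_INFO") == "/llms.txt":
        headers = [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(LLMS_BODY))),
            ("X-Robots-Tag", "noindex"),  # tells search engines not to index the file
        ]
        start_response("200 OK", headers)
        return [LLMS_BODY]
    start_response("404 Not Found", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"Not found\n"]


def call(path):
    """Invoke the app directly (no network) and return (status, headers, body)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app({"PATH_INFO": path}, start_response))
    return captured["status"], captured["headers"], body


status, headers, body = call("/llms.txt")
print(status, headers["X-Robots-Tag"])
```

To try it locally, you could pass `app` to `wsgiref.simple_server.make_server("", 8000, app)` and call `serve_forever()`, then inspect the response headers with curl or your browser's network tab.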
How to tell if AI is actually reading your blog
You don't have to guess whether AI crawlers visit your site. Check the evidence directly. Two sources tell you almost everything: your server logs and your referral analytics.
In your server logs (or a tool like Cloudflare's bot analytics), filter by user agent. The AI crawlers that matter announce themselves:
GPTBot and OAI-SearchBot handle OpenAI's training and search crawling.
ClaudeBot is Anthropic's crawler.
PerplexityBot is Perplexity's crawler.
Google-Extended is Google's token for AI training access.
Look at what they request. You'll almost always see them fetching your real posts and your sitemap.xml. You'll almost never see a request for /llms.txt. That single observation, in your own logs, settles the debate faster than any article can.
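If your host gives you raw access logs, a short script can run this check for you. The sketch below assumes the common "combined" log format. The regex, the `summarize` function, and the sample lines (including the simplified user-agent strings) are invented for illustration, so adjust them to whatever your server actually writes.

```python
import re
from collections import Counter

# Assumes the "combined" access log format used by default in nginx and Apache
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Google-Extended is a robots.txt token, not a user-agent string, so it is
# left out: it will not appear in request logs.
AI_BOTS = ("GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot")


def summarize(lines):
    """Count requests per AI bot, and separately those that asked for /llms.txt."""
    hits, llms_hits = Counter(), Counter()
    for line in lines:
        match = LOG_RE.match(line)
        if not match:
            continue
        agent = match["agent"].lower()
        bot = next((b for b in AI_BOTS if b.lower() in agent), None)
        if bot is None:
            continue
        hits[bot] += 1
        if match["path"].split("?")[0] == "/llms.txt":
            llms_hits[bot] += 1
    return hits, llms_hits


# Made-up sample lines for demonstration only
SAMPLE_LOG = """\
203.0.113.5 - - [12/May/2026:10:01:22 +0000] "GET /blog/seo-for-founders HTTP/1.1" 200 5123 "-" "Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)"
203.0.113.9 - - [12/May/2026:10:02:10 +0000] "GET /sitemap.xml HTTP/1.1" 200 912 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"
198.51.100.7 - - [12/May/2026:10:03:41 +0000] "GET /llms.txt HTTP/1.1" 200 310 "-" "Mozilla/5.0 (X11; Linux x86_64) Firefox/126.0"
203.0.113.5 - - [12/May/2026:10:04:02 +0000] "GET /blog/pseo-101 HTTP/1.1" 200 4871 "-" "Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)"
"""

hits, llms_hits = summarize(SAMPLE_LOG.splitlines())
print("AI bot requests:", dict(hits))
print("AI bot requests for /llms.txt:", dict(llms_hits))
```

Point `summarize` at your real log file (for example, `summarize(open("access.log", encoding="utf-8"))`) and compare the two counters. Anyone can fake a user-agent string, so treat the numbers as a signal, not proof.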
Then check referral traffic. In your analytics, watch for visits from chatgpt.com, perplexity.ai, and gemini.google.com. Those are humans clicking a citation inside an AI answer. That number, not your llms.txt status, is the metric that maps to revenue.
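If your analytics tool lets you export referrer URLs, you can tally AI referrals yourself. This sketch only covers the three domains named above. The `ai_source` helper and the sample referrers are invented for illustration.

```python
from collections import Counter
from urllib.parse import urlparse

# Referrer domains named in this guide, mapped to a display label
AI_REFERRERS = {
    "chatgpt.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "gemini.google.com": "Gemini",
}


def ai_source(referrer):
    """Return the AI product a referrer URL belongs to, or None."""
    host = (urlparse(referrer).hostname or "").lower()
    for domain, label in AI_REFERRERS.items():
        if host == domain or host.endswith("." + domain):
            return label
    return None


# Made-up referrer values, as they might appear in an analytics export
referrers = [
    "https://chatgpt.com/",
    "https://www.perplexity.ai/search?q=llms.txt",
    "https://www.google.com/",
    "https://gemini.google.com/app",
    "-",
]

counts = Counter(label for label in map(ai_source, referrers) if label)
print(dict(counts))
```

Join those counts against landing pages and you have the list of posts that earn AI referrals.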
Here's a quick audit to run once a month:
Filter server logs for GPTBot, ClaudeBot, PerplexityBot, and OAI-SearchBot.
Confirm they're fetching your posts and sitemap (they should be).
Check whether anything requests /llms.txt (it almost certainly won't).
Track referral sessions from chatgpt.com, perplexity.ai, and gemini.google.com.
Note which posts earn those AI referrals, then write more like them.
If a major provider ever does start fetching llms.txt at scale, your logs will show it first. Until then, let the data lead. Spend your hour on the posts AI already crawls, not the file it skips.
Frequently asked questions about llms.txt
What is an llms.txt file?
An llms.txt file is a plain-text Markdown file at the root of your domain that lists and summarizes your most important pages for AI models. Jeremy Howard proposed it in September 2024. It's meant to give tools like ChatGPT and Claude a clean map of your site, similar to how a sitemap helps search engines find your URLs.
Does Google use llms.txt?
No. Google has publicly said it doesn't use llms.txt and has no plans to. John Mueller stated that no AI system currently fetches the file, and compared it to the old keywords meta tag that search engines learned to ignore. Google Search and AI Overviews crawl your real pages instead.
Do ChatGPT and Claude read your llms.txt?
There's no confirmation that they do. OpenAI and Anthropic publish llms.txt files for their own documentation, but neither has said its crawler reads yours during search or answers. Server-log analysis from 2025 and 2026 shows AI bots fetch your HTML pages, not your llms.txt file.
Is llms.txt worth it in 2026?
It's low-risk but low-reward. Creating one takes about five minutes and won't hurt your SEO if you noindex it. But there's no evidence it improves AI citations or traffic today. Treat it as optional insurance, not a ranking tactic, and spend your real effort on content quality and structure.
How do I create an llms.txt file?
Make a Markdown file with one H1 (your site name), an optional one-line summary in a blockquote, and H2 sections listing your best pages as Markdown links. Save it as llms.txt and upload it to your root directory so it loads at yoursite.com/llms.txt. Keep the links accurate and current.
What's the difference between llms.txt and robots.txt?
robots.txt tells crawlers what they may access, and it's an official standard every major bot respects. llms.txt tries to summarize your best content for language models, and it's a 2024 proposal no major AI lab has adopted. One controls access. The other suggests reading material that crawlers currently skip.
Will llms.txt get my blog cited by AI?
Not on its own. Because no major AI crawler fetches the file, it can't influence what gets quoted. What does drive citations is leading with direct answers, adding statistics and tables, using clean structured HTML, and earning authority. Those signals live in your actual pages, which AI bots already crawl.
The bottom line
llms.txt is the most over-hyped, under-tested file in SEO right now. Three things to remember. First, it's a 2024 proposal, not a standard, and as of May 2026 no major AI provider confirms it reads yours. Second, the evidence points one way: Ahrefs found no evidence of a retrieval benefit, and Google's John Mueller says the file goes unfetched in server logs. Third, real AI citations come from pages crawlers already read. Answer-first structure, statistics, tables, and schema all help there, and the Princeton study measured a 30 to 40% visibility lift from adding citations, quotes, and statistics.
Add an llms.txt if you want cheap insurance. Just noindex it and move on. Then put your energy where the funnel actually leaks: shipping well-structured posts on your own domain, fast.
Want your AI to write the post and handle the sitemap, schema, and publishing that actually get you cited? Connect Quillly to Claude, ChatGPT, or Cursor in 30 seconds.
