llms.txt

llms.txt is a proposed convention for a Markdown file at the root of a domain that provides a clean, AI-readable index of a site’s most important content. It was first proposed by Jeremy Howard in September 2024 and has since been adopted by a growing list of documentation sites, SaaS platforms, and content publishers.

Google’s position on llms.txt

Google’s May 2026 AI optimisation guide explicitly states that llms.txt files are not necessary for AI search visibility [1]. The company directly addresses this as a common misconception: “You don’t need LLMS.txt files… special markup… to ‘chunk’ content… [or] to rewrite content for AI systems.”

By Google’s account, then, publishing llms.txt does nothing to improve inclusion in AI-generated answers or AI citations. AI systems can extract and parse web pages directly, without special files. If your site already earns traditional search visibility through quality content, clear structure, and authoritative authorship, you already have everything needed for AI visibility.

A common objection: Google itself publishes llms.txt and Markdown pages on developers.google.com, which appears to contradict this stance. Google’s John Mueller addressed this directly in May 2026: the files are not published for search. Developer documentation benefits from clean, parseable Markdown because AI coding tools work more accurately when reading structured reference material. Mueller described the Markdown versions as “a temporary crutch, perhaps to save some tokens” rather than a search signal [2]. Publishing llms.txt or Markdown pages for a developer documentation site is a functional decision, not an SEO one.

What is llms.txt for?

The problem llms.txt addresses is straightforward. Large language models reading a website at inference time face the same challenges a human would: navigation menus, footer noise, pop-ups, advertising, and HTML scaffolding all dilute the signal. Sites are built primarily for browser rendering, not for clean text extraction.

A useful frame: llms.txt addresses functionality, not discovery. Discovery is how pages get found via search. Functionality is what happens once a visitor or agent has already arrived. The two require different tools and have different success metrics. llms.txt belongs in the functionality column, which is why publishing it has no bearing on search visibility.

llms.txt provides a curated, Markdown-formatted entry point that says: here is the canonical structure of this site, here are the most important pages, here is what they contain. It is a hint to AI systems about what matters, in a format they can ingest cleanly.

How does llms.txt differ from robots.txt and sitemap.xml?

| File | Purpose | Audience |
| --- | --- | --- |
| robots.txt | What can be crawled | All crawlers |
| sitemap.xml | What URLs exist | Search engine crawlers |
| llms.txt | What matters and what it means | LLM-based retrieval and reasoning systems |

The three are complementary, not competing. A site can (and ideally should) have all three.

llms.txt syntax

The format is intentionally simple Markdown. A minimal example:

# Site Name

> One-line description of what the site is about and who maintains it.

Optional longer paragraph providing context about the site, its purpose, and any guidance for how AI systems should use the content (attribution, citation preferences, etc.).

## Section heading

- [Page title](https://example.com/page/): Short description of what this page contains.
- [Another page](https://example.com/another/): Description.

## Another section

- [Page](https://example.com/page2/): Description.

Headings group related content. Bullet links describe individual pages. Descriptions are optional but recommended; they give the AI system a hint about what each linked page covers.
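
Because the format is this regular, a consumer can read it with very little code. The following is a minimal sketch in Python, not a reference parser: it assumes the layout shown above (one title line, one blockquote summary, section headings followed by bullet links) and ignores everything else. The names parse_llms_txt, LlmsTxt, Link, and SAMPLE are hypothetical and exist only for illustration.

```python
import re
from dataclasses import dataclass, field

# Matches "- [Title](url)" with an optional ": description" suffix.
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.*))?$"
)


@dataclass
class Link:
    title: str
    url: str
    description: str = ""


@dataclass
class LlmsTxt:
    title: str = ""
    summary: str = ""
    sections: dict = field(default_factory=dict)  # heading -> list of Link


def parse_llms_txt(text: str) -> LlmsTxt:
    """Parse the simple layout shown above; unknown lines are skipped."""
    doc = LlmsTxt()
    current = None  # heading of the section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and not doc.title:
            doc.title = line[2:].strip()
        elif line.startswith("> ") and not doc.summary:
            doc.summary = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            doc.sections[current] = []
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                doc.sections[current].append(
                    Link(match["title"], match["url"], match["desc"] or "")
                )
    return doc


# Illustrative input only; the URLs are placeholders.
SAMPLE = """# Site Name

> One-line description.

## Section heading

- [Page title](https://example.com/page/): Short description.
- [Another page](https://example.com/another/)
"""
```

Calling parse_llms_txt(SAMPLE) yields one section with two links, the second with an empty description, which reflects that descriptions are optional.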

llms-full.txt

Some sites also publish llms-full.txt, a longer document that contains the full content of the site (or a curated subset of it) in plain Markdown. The intent is to give AI systems the option of consuming the entire site content in a single fetch rather than crawling each page individually.

This is most useful for documentation sites, where the goal is to enable an LLM to answer detailed questions about a product without partial-context errors caused by retrieving only one page at a time.
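
As a rough illustration of the single-fetch idea, the sketch below joins already-converted Markdown pages into one document. There is no agreed layout for llms-full.txt that this article relies on, so the separator and the "Source:" line are assumptions made for readability, and build_llms_full is a hypothetical helper name.

```python
def build_llms_full(site_title: str, pages: list) -> str:
    """Concatenate (url, markdown) pairs into a single Markdown document.

    The "---" separator and "Source:" line are illustrative choices,
    not part of any published convention.
    """
    chunks = [f"# {site_title}"]
    for url, markdown in pages:
        chunks.append(f"---\nSource: {url}\n\n{markdown.strip()}")
    return "\n\n".join(chunks) + "\n"


# Placeholder pages; in practice these would come from your content source.
PAGES = [
    ("https://example.com/guide/", "## Guide\n\nHow to get started.\n"),
    ("https://example.com/api/", "## API\n\nEndpoint reference.\n"),
]
```

Keeping the source URL next to each page lets a reader, human or machine, trace an answer back to the page it came from.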

When is llms.txt useful?

Since llms.txt does not improve AI search visibility, consider publishing it only for user experience reasons, not for SEO or generative engine optimisation (GEO):

  • Documentation sites. A clean Markdown index of API references, guides, and tutorials can improve user experience for visitors who want a single-document reference. Some AI systems may voluntarily use it, but it is not required for citations.
  • Knowledge bases and reference content. If visitors would benefit from a curated, navigable index of your content, llms.txt serves that purpose, but not for search visibility.
  • Personal and professional sites where you control presentation. Publishing a curated index reflects your editorial judgment about what matters, which may be valuable for your audience even if AI systems ignore it.
  • Browser-based AI agents. Chrome’s Lighthouse 13.3 introduced an Agentic Browsing audit that checks for an llms.txt file [3]. The rationale is agent navigation: with the file available, a browser-based AI agent can understand site structure without crawling every page. If your site is likely to be accessed by task-automation agents or AI-assisted browsers, this is a practical case for publishing the file, separate from search visibility.

The upside of publishing llms.txt is minimal, and the downside is negligible. If you have existing use cases where visitors would benefit from a clean Markdown index, publish it. If you are considering it primarily for AI search visibility, save the effort.

When it does not help

  • E-commerce product catalogues. For search visibility and AI-generated shopping answers, structured data (product schema) and a clean sitemap are what matter. Shopify’s native llms.txt implementation exposes MCP/UCP checkout endpoints for AI shopping agents: a transactional use case, not a retrieval one. Publishing llms.txt does not improve a product’s chances of appearing in an AI-generated shopping answer.
  • General SEO or GEO. Publishing llms.txt does not improve rankings, citations, or AI visibility. Quality content, clear structure, and authoritative authorship are what matter.

Adoption status

As of May 2026, llms.txt remains a community convention, not a formal standard. Google has explicitly confirmed that its AI systems do not require or prefer llms.txt files for search visibility. No confirmed citation benefit exists.

Chrome’s Lighthouse (version 13.3) introduced an Agentic Browsing audit category that includes an llms.txt check [3]. The audit treats the file as optional but flags server errors when fetching it. Chrome for Developers describes llms.txt as helping AI agents understand site structure when browsing on a user’s behalf. This is distinct from Google Search’s position: Lighthouse is auditing for agent navigation readiness, not search ranking or citation inclusion.

The most significant platform-level adoption to date is Shopify. In early May 2026, Shopify quietly added llms.txt, agents.md, and sitemap_agentic_discovery.xml to every store on the platform, with no public announcement. The implementation exposes MCP/UCP commerce endpoints designed for agent-driven checkout rather than search retrieval, and is customisable via a templates/llms.txt.liquid theme file.

Crawl research from WISLR, tracking bot activity across Shopify stores over 60 days, found that the major AI crawlers (Bingbot, GPTBot, ClaudeBot) are not reading any of the new agent-discovery files. All three hit legacy paths from third-party workarounds instead. The only active reader is Microsoft’s commerce platform, which polls the UCP endpoint weekly as part of Copilot’s checkout integration: a transactional signal, not a search one [4].

Crawling ≠ utilisation

A common argument for llms.txt is crawl logs showing AI bots hitting the file. Crawl logs confirm accessibility, not utilisation. Mark Williams-Cook illustrated the gap with cats.txt: a deliberately fictional standard that AI crawlers also fetch. A bot requesting a file and a bot acting on its contents are two different things.
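
To make the distinction concrete, here is a small sketch of the kind of log analysis that usually sits behind such claims. It assumes access logs in the common "combined" format and matches bots by user-agent substring; the log lines and the count_llms_txt_fetches name are invented for illustration. Note what the result is: a count of requests, and nothing more.

```python
import re
from collections import Counter

# Combined log format: "METHOD path PROTO" status size "referer" "user-agent"
LOG_RE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

# User-agent substrings to look for; extend as needed.
BOTS = ("GPTBot", "ClaudeBot", "bingbot")


def count_llms_txt_fetches(lines) -> Counter:
    """Count requests for /llms.txt per bot. Fetches only, not usage."""
    hits = Counter()
    for line in lines:
        match = LOG_RE.search(line)
        if not match or match["path"] != "/llms.txt":
            continue
        agent = match["agent"].lower()
        for bot in BOTS:
            if bot.lower() in agent:
                hits[bot] += 1
    return hits


# Invented sample lines using documentation-range IP addresses.
SAMPLE_LOG = [
    '203.0.113.5 - - [01/May/2026:10:00:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.2)"',
    '203.0.113.6 - - [01/May/2026:10:05:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; bingbot/2.0)"',
    '203.0.113.5 - - [02/May/2026:10:00:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.2)"',
    '203.0.113.7 - - [02/May/2026:11:00:00 +0000] "GET / HTTP/1.1" 200 9000 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
]
```

A non-zero count here shows that the file is reachable and requested. Whether its contents influence any answer cannot be read from the server side.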

Some publishers continue to publish llms.txt for documentation purposes or UX reasons, unrelated to search visibility. The convention persists as a voluntary publishing standard, similar to how some sites publish human-readable sitemaps alongside sitemap.xml, even though search engines do not require them.

The practical consensus: llms.txt is not a search-related tool. Do not publish it expecting SEO or AI search benefits. Publish it only if you want to provide curated documentation for your audience.

Frequently asked questions

Does publishing llms.txt improve AI search visibility?
No. Google’s 2026 AI optimisation guide explicitly states that llms.txt files are unnecessary for AI citations or search visibility. AI systems can parse web pages directly. If your site earns traditional search visibility, you already have everything needed for AI visibility.

Where should llms.txt live?
If you choose to publish one, place it at the root of the domain (https://example.com/llms.txt), following the same convention as robots.txt. Publication itself remains optional, since AI systems do not require or prefer the file.

Does publishing llms.txt help with traditional SEO?
No. It is not a search engine signal. Googlebot does not use it for indexing or ranking.

Can I use llms.txt to opt out of AI scraping?
No. llms.txt is an opt-in surface for guiding AI systems toward your content. It cannot be used to block scraping. Opt out of AI scraping via robots.txt directives targeting specific user agents (GPTBot, ClaudeBot, Google-Extended, etc.).
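
As a sketch of what such an opt-out can look like, the snippet below embeds an example robots.txt and checks it with Python’s standard urllib.robotparser module. The rules shown are one possible policy (block the three tokens named above, allow everyone else), not a recommendation, and is_allowed is a hypothetical helper. Compliance is voluntary on the crawler’s side.

```python
from urllib.robotparser import RobotFileParser

# Example policy: block specific AI user agents, allow all other crawlers.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With this policy, is_allowed(ROBOTS_TXT, "GPTBot", "https://example.com/page/") is False, while the same call for "Googlebot" is True.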

Should I publish llms.txt?
Only if your audience would benefit from a curated, navigable Markdown index of your content. Do not publish it expecting SEO or GEO benefits; do publish it if it improves user experience.

Is there a tool that generates llms.txt automatically?
Several open-source generators exist for common static site generators (Astro, Next.js, Jekyll, Hugo). For larger sites, generating llms.txt from your sitemap and content collection is straightforward. The harder work is curating which pages to include and writing useful descriptions.
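
The mechanical part can be sketched in a few lines of Python. This example assumes a standard sitemap.xml using the sitemaps.org namespace and a hand-maintained mapping from URL to section, title, and description; that mapping (called curated here) is the curation step, and it is hypothetical, as are the function names. URLs without an entry in the mapping are left out.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml: str) -> list:
    """Return every <loc> URL from a sitemap.xml document."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def build_llms_txt(sitemap_xml: str, site_name: str, summary: str, curated: dict) -> str:
    """Render llms.txt from sitemap URLs that appear in `curated`.

    `curated` maps URL -> (section, title, description).
    """
    sections = {}
    for url in sitemap_urls(sitemap_xml):
        if url in curated:
            section, title, description = curated[url]
            sections.setdefault(section, []).append((title, url, description))

    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines += [f"## {heading}", ""]
        for title, url, description in links:
            suffix = f": {description}" if description else ""
            lines.append(f"- [{title}]({url}){suffix}")
        lines.append("")
    return "\n".join(lines)


# Placeholder data for illustration.
SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/guide/</loc></url>
  <url><loc>https://example.com/api/</loc></url>
  <url><loc>https://example.com/legal/cookies/</loc></url>
</urlset>"""

CURATED = {
    "https://example.com/guide/": ("Docs", "Guide", "How to get started."),
    "https://example.com/api/": ("Docs", "API reference", ""),
}
```

Here the cookies page is in the sitemap but not in the curated mapping, so it does not appear in the output. That filter is where the editorial judgment lives.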

Footnotes

  1. Google’s AI optimization guide

  2. John Mueller on llms.txt and Markdown for agents (Bluesky, via Lily Ray)

  3. llms.txt audit (Chrome for Developers / Lighthouse)

  4. The bots crawling Shopify’s agentic stack (WISLR Research)