Google teaches you how to AI-Optimise your site
Every consultant in your DMs is selling the same product this year. Add an llms.txt file to your root. Restructure every article into question-and-answer blocks. Sprinkle a new AiOptimized schema across your pages. Hire them to “GEO” your site for $4,000 a month so ChatGPT and Google’s AI Overviews start citing you. Google published their official guide on 15 May 2026 and the answer to all of it is no. None of that works. None of it is required. The closest thing to a special trick is making sure your page can be indexed and shown with a snippet, which is the same bar every blog post has had to clear for the last twenty years.
I have been watching agencies pivot their entire pricing tier into “Generative Engine Optimisation” for the better part of a year. The decks all look the same. A chart of ChatGPT user growth next to a chart of declining click-through rates, a slide about citations, then a $40k engagement to “prepare your site for the AI era.” Some of them are running this on the same clients I run consulting work for. The clients ask me what I think. I have been telling them to wait for Google to say something on the record. Google said something.
If you are paying for files, markup, or rewrites that exist only for AI, you are paying for a service Google has explicitly told you does nothing.
What every agency is currently selling you
Three flavours of grift dominate the market. The first is llms.txt. It is a proposed text file that lives at your root and lists the URLs you want large language models to crawl. The pitch is “robots.txt for AI.” It sounds reasonable, which is why it spread. Google’s stance is that it does not use it, will not use it, and you do not need it. No pipeline at Google reads that file. Adding it does not get you cited. Removing it does not get you ignored.
The second is AI-specific schema. Some vendor sold you the idea that a new flavour of JSON-LD or a custom <meta name="ai-content"> tag tells Gemini what to surface. There is no such tag. There is no special schema. The same Article, Organization, FAQPage, HowTo, and BreadcrumbList that already worked for rich results are what Google reads. Anything beyond that is invented.
The third is content chunking and AI-friendly rewriting. The pitch here is that you take every page and reshape it into 80-word answer blocks so a model can quote you cleanly. Google’s response is that pages get pulled into AI Overviews from the same index that powers normal Search results. If your page ranks for the query, you are eligible to be cited. If it does not, no amount of chunking saves you. The work goes into ranking, not into reshaping content the model is already capable of reading.
What Google actually said you must do
The guide is short on purpose. The eligibility rule is one sentence: a page must be indexed and eligible to be shown in Google Search with a snippet. That is the whole bar. If your page can show up in normal Search with a snippet under it, it can be cited in an AI Overview or in AI Mode. If it cannot, no AI surface will pull from it.
That sentence unpacks into a checklist your dev team can run today.
Your pages must be indexable. Open Google Search Console and look at the Page indexing report. If your important URLs are showing up as “Crawled - currently not indexed” or “Excluded by ‘noindex’ tag”, AI surfaces cannot use them either. Fix the indexing problem before anything else. This is the cheapest, highest-leverage edit on most sites I have audited.
Your robots.txt must allow Googlebot. AI Overviews and AI Mode are part of Search, not separate crawlers. The directive that controls them is the same one that controls everything else. If you blanket-blocked AI bots last year in a panic, double-check you did not accidentally close off your own appearance in AI Overviews along with it. The user agent to leave open is Googlebot. Google-Extended is a separate robots.txt token, not a crawler of its own, and it governs whether your content is used for Gemini training and grounding. According to Google’s crawler documentation it does not affect inclusion in Search, so allowing or blocking it does not change whether you appear in AI Overviews.
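A quick way to test the robots.txt point is Python's standard-library `urllib.robotparser`. This is a minimal sketch, not Google's own parser: the two robots.txt bodies and the example.com URL are made up for illustration, and the stdlib parser does not reproduce every matching rule Googlebot applies, so treat the result as a first check and not a verdict.

```python
from urllib.robotparser import RobotFileParser


def googlebot_allowed(robots_txt: str, url: str) -> bool:
    """Return True if this robots.txt text lets Googlebot fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)


# Hypothetical "panic" file: blocks every crawler, Googlebot included.
PANIC_BLOCK = """\
User-agent: *
Disallow: /
"""

# Hypothetical targeted file: blocks one third-party crawler, leaves the rest open.
TARGETED_BLOCK = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

if __name__ == "__main__":
    page = "https://example.com/writing/some-post/"
    print(googlebot_allowed(PANIC_BLOCK, page))     # False
    print(googlebot_allowed(TARGETED_BLOCK, page))  # True
```

To run it against a live site, fetch your own /robots.txt and pass the body in as `robots_txt`.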
Your pages must produce a snippet. Pages that explicitly opt out of snippets with <meta name="robots" content="nosnippet"> or data-nosnippet attributes on key content are telling Google they cannot be summarised. Google takes the hint and excludes them from AI surfaces too.
<!-- Default: allow snippets, large image previews, AI surfacing -->
<meta name="robots" content="max-snippet:-1, max-image-preview:large">
<!-- Block AI Overviews and AI Mode from using this page -->
<meta name="robots" content="nosnippet">
<!-- Cap quoted text at 160 characters, AI Overviews respect this -->
<meta name="robots" content="max-snippet:160">
<!-- Exclude a single paragraph from any snippet, AI or otherwise -->
<p data-nosnippet>This sentence will not appear in a Google snippet.</p>
The nosnippet, max-snippet and data-nosnippet controls are the closest thing to AI-specific tuning Google actually supports, and they are the same snippet controls that already govern normal Search results. They let you decide how much of your page an AI surface can quote. Most sites should leave them open. If you have legal text, gated content, or anything you would rather not have summarised verbatim, that is where these tags earn their keep.
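To see what a page currently tells Google, you can scan its HTML for these directives. The sketch below uses only the standard-library `html.parser`. The `audit` function and its result keys are my own naming, not part of any Google tool. It reads HTML only, so an `X-Robots-Tag` response header would need a separate check.

```python
from html.parser import HTMLParser


class RobotsMetaAudit(HTMLParser):
    """Collect meta robots directives and count data-nosnippet elements."""

    def __init__(self):
        super().__init__()
        self.directives = set()
        self.nosnippet_elements = 0

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        name = (attr_map.get("name") or "").lower()
        if tag == "meta" and name in ("robots", "googlebot"):
            content = attr_map.get("content") or ""
            for part in content.split(","):
                directive = part.strip().lower().replace(" ", "")
                if directive:
                    self.directives.add(directive)
        if "data-nosnippet" in attr_map:
            self.nosnippet_elements += 1


def audit(html):
    """Summarise indexing and snippet eligibility for one HTML document."""
    parser = RobotsMetaAudit()
    parser.feed(html)
    found = parser.directives
    return {
        # "none" is shorthand for "noindex, nofollow".
        "indexable": "noindex" not in found and "none" not in found,
        # max-snippet:0 has the same effect as nosnippet.
        "snippet_allowed": "nosnippet" not in found and "max-snippet:0" not in found,
        "data_nosnippet_elements": parser.nosnippet_elements,
    }


# Hypothetical page used only to exercise the audit.
PAGE = """
<html><head>
<meta name="robots" content="max-snippet:-1, max-image-preview:large">
</head><body>
<p>Public paragraph.</p>
<p data-nosnippet>Legal text kept out of snippets.</p>
</body></html>
"""

if __name__ == "__main__":
    print(audit(PAGE))
    # {'indexable': True, 'snippet_allowed': True, 'data_nosnippet_elements': 1}
```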
The boring diff I shipped to this site this weekend
I read Google’s guide on Friday. By Sunday night I had merged a PR to this site applying every recommendation in it. None of what I changed was AI-specific. All of it was overdue boring SEO work I had let drift because the site was “ranking fine.”
The metadata that makes a snippet possible was the largest single change. The shared layout at src/layouts/BaseLayout.astro now emits a full set of Open Graph and Twitter tags on every page: og:image, og:image:alt, og:site_name, og:locale, twitter:image, twitter:title, twitter:description. I had declared twitter:card="summary_large_image" without an actual image set, which is the kind of half-finished metadata that quietly disqualifies a page from rich previews. The layout now also swaps og:type between website and article per page, and emits article:published_time, article:author, and article:tag on posts. Every top-level page got a unique meta description instead of silently inheriting the homepage’s.
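Half-finished metadata like this is easy to catch with a script. The sketch below is not the Astro layout itself. It is a standalone Python check, and `EXPECTED_TAGS` is an assumption drawn from the tags listed above, to be adjusted for your own layout. It reports missing tags and flags a `summary_large_image` card with no image behind it.

```python
from html.parser import HTMLParser

# Assumed checklist, taken from the tags named in the paragraph above.
EXPECTED_TAGS = (
    "description",
    "og:type", "og:image", "og:image:alt", "og:site_name", "og:locale",
    "twitter:card", "twitter:image", "twitter:title", "twitter:description",
)


class MetaCollector(HTMLParser):
    """Collect non-empty <meta> tags keyed by their property or name attribute."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        key = attr_map.get("property") or attr_map.get("name")
        content = (attr_map.get("content") or "").strip()
        if key and content:
            self.tags[key.lower()] = content


def preview_problems(html):
    """Return a list of human-readable problems with the page's preview metadata."""
    collector = MetaCollector()
    collector.feed(html)
    tags = collector.tags
    problems = ["missing " + name for name in EXPECTED_TAGS if name not in tags]
    has_image = "twitter:image" in tags or "og:image" in tags
    if tags.get("twitter:card") == "summary_large_image" and not has_image:
        problems.append("summary_large_image card declared without any image")
    return problems


# Hypothetical head section with the card declared but no image set.
HALF_FINISHED = """
<meta name="description" content="A post about snippets.">
<meta property="og:type" content="article">
<meta name="twitter:card" content="summary_large_image">
"""

if __name__ == "__main__":
    for problem in preview_problems(HALF_FINISHED):
        print(problem)
```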
Structured data, but only the flavours Google’s guide names. A sitewide JSON-LD @graph containing a Person (linked to my LinkedIn through sameAs) and a WebSite, both with @id so child pages can reference them by ID instead of duplicating the data. Every article page now emits a BlogPosting with headline, datePublished, author and publisher references, image, keywords, and mainEntityOfPage, plus a BreadcrumbList. The writing index emits a Blog listing every post. The about page emits a ProfilePage that references the same Person. These are the schema types the guide names explicitly. There is no AiOptimized, no GenerativeContent, no custom AI markup, because none of those exist.
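The @id referencing pattern is easier to see as data. This is a sketch of the shape described above, not the markup this site actually emits. The example.com origin, the names, the `/writing/` path and the `#person` and `#website` fragment IDs are all placeholders, and the BreadcrumbList is left out to keep it short.

```python
import json

SITE = "https://example.com"     # placeholder origin
PERSON_ID = SITE + "/#person"    # hypothetical @id values
WEBSITE_ID = SITE + "/#website"


def site_graph():
    """Sitewide @graph: one Person and one WebSite, each addressable by @id."""
    return {
        "@context": "https://schema.org",
        "@graph": [
            {
                "@type": "Person",
                "@id": PERSON_ID,
                "name": "Example Author",
                "url": SITE,
                "sameAs": ["https://www.linkedin.com/in/example-author"],
            },
            {
                "@type": "WebSite",
                "@id": WEBSITE_ID,
                "url": SITE,
                "name": "Example Site",
                "publisher": {"@id": PERSON_ID},
            },
        ],
    }


def blog_posting(slug, headline, date_published, image, keywords):
    """BlogPosting that points at the Person by @id instead of repeating it."""
    url = SITE + "/writing/" + slug + "/"
    return {
        "@context": "https://schema.org",
        "@type": "BlogPosting",
        "headline": headline,
        "datePublished": date_published,
        "author": {"@id": PERSON_ID},
        "publisher": {"@id": PERSON_ID},
        "image": image,
        "keywords": keywords,
        "mainEntityOfPage": {"@type": "WebPage", "@id": url},
    }


def as_script_tag(data):
    """Serialise to a JSON-LD script tag, escaping '</' so a string cannot close it."""
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


if __name__ == "__main__":
    post = blog_posting(
        "some-post", "Some post", "2024-01-01",
        SITE + "/covers/some-post.png", ["seo"],
    )
    print(as_script_tag(site_graph()))
    print(as_script_tag(post))
```

The reference only resolves if the sitewide graph is on the same page as the BlogPosting, which is why it belongs in the shared layout.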
Semantic HTML the crawler can parse without guessing. Article dates are now <time datetime="ISO"> instead of a styled span. The author byline gets rel="author". Every article cover image inside src/components/ArticleCard.astro and the homepage hero now has an alt attribute that describes the article instead of being blank. A new src/pages/rss.xml.ts generates a feed from the writing collection, and the base layout emits <link rel="alternate" type="application/rss+xml"> so a crawler finds it without me having to tell anyone. The sitemap config inside astro.config.mjs now writes lastmod, changefreq, and priority per URL, with article pages prioritised above tag and listing pages, and the homepage at 1.0. Google’s sitemap documentation says it ignores changefreq and priority, so lastmod is the field doing the work there.
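For readers not on Astro, the sitemap part can be sketched in a few lines of Python with the standard-library `xml.etree.ElementTree`. This is not the astro.config.mjs change described above. Only the homepage priority of 1.0 comes from the text; the `/writing/` path, the 0.8 and 0.5 values and the dates are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def _tag(name):
    """Qualify an element name with the sitemap namespace."""
    return "{%s}%s" % (SITEMAP_NS, name)


def priority_for(path):
    """Homepage first, articles next, tag and listing pages last (assumed values)."""
    if path == "/":
        return "1.0"
    if path.startswith("/writing/") and path != "/writing/":
        return "0.8"
    return "0.5"


def build_sitemap(origin, pages):
    """Build sitemap XML from (path, lastmod) pairs, where lastmod is an ISO date."""
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(_tag("urlset"))
    for path, lastmod in pages:
        url = ET.SubElement(urlset, _tag("url"))
        ET.SubElement(url, _tag("loc")).text = origin + path
        ET.SubElement(url, _tag("lastmod")).text = lastmod
        ET.SubElement(url, _tag("priority")).text = priority_for(path)
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Placeholder pages and dates.
    print(build_sitemap("https://example.com", [
        ("/", "2024-01-03"),
        ("/writing/some-post/", "2024-01-03"),
        ("/tags/seo/", "2024-01-01"),
    ]))
```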
What I deliberately did not add. No llms.txt. No AI-specific markup file. No chunked rewrite of any article. No edits to my robots.txt, which was already permissive, because the guide is explicit that there is no separate AI directive to add. Every one of these would have been billable hours from an agency. Every one of them does nothing.
The part nobody wants to repeat
Once your page is eligible, the guide says the deciding factor is content quality. Same ranking systems. Same E-E-A-T signals. The phrase Google uses is “unique, compelling, and useful” content. That is the lever. Not the file extensions, not the JSON-LD, not the chunked rewrites.
The work this actually demands is uncomfortable for a content team that has been running on AI-generated filler for two years. Original reporting beats summary. First-hand experience beats aggregation. Proprietary data beats restating a public study. Expert interpretation beats a paraphrase of someone else’s report. The page that gets cited in an AI Overview is the one the model could not have generated on its own. If your article is downstream of three other articles the model already crawled, you are not adding anything to the answer and you will not get the link.
This is also why scaled AI content farms are dying quietly. Google’s helpful content system has been deprioritising them for two years, the AI Overviews layer compounds that pressure, and the guide reiterates that “scaled content abuse” violates spam policy regardless of whether a human or a model wrote it. Volume is no longer a strategy. The page that ranks is the one the rest of the internet is downstream of.
Stop optimising for AI. Start being worth citing.
Here is what to do this week. Open your robots.txt and confirm Googlebot is allowed. Open Search Console and fix every important URL flagged as not indexed. Audit your homepage and your top ten pages for a meta robots tag and decide whether you want nosnippet on any of them. In most cases the answer is no. Delete the llms.txt file if your last agency added one. It is doing nothing, and leaving it in place just confuses future audits. Stop paying anyone for a “GEO audit” that is not already inside a normal SEO audit. The audit you needed in 2023 is the audit you need now.
Then go write the page only you could have written. The one with the screenshot from your client’s real dashboard. The one with the conversation you had with the founder. The one with the number nobody else has. That page is the one an AI Overview pulls from. The other pages are training data.
If you want the original document, read Google’s full guide here. Fair warning, it will feel anticlimactic if you have been paying for the alternative.