What technical changes does an agency need to make on a client's website for GEO?
Six technical changes do most of the work: add valid JSON-LD schema to every page, publish an llms.txt index, fix canonical URLs, make sure content is server-rendered (or pre-rendered) rather than client-only JavaScript, allow AI crawlers in robots.txt, and write meta descriptions in the 150-160 character band. None of these are GEO-specific tricks. They are standard technical-SEO hygiene that AI engines happen to reward heavily.
1. Add valid JSON-LD schema to every page
AI engines parse schema.org structured data to understand what a page is and what entity it describes. The minimum every page needs is an Article or WebPage object plus an Organization object identifying the publisher.
An example for a blog post on a client's site:
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "Article",
"headline": "How to choose a B2B email platform",
"author": {
"@type": "Person",
"name": "Jane Doe",
"url": "https://example.com/team/jane-doe"
},
"publisher": {
"@type": "Organization",
"name": "Example Co",
"logo": {
"@type": "ImageObject",
"url": "https://example.com/logo.png"
}
},
"datePublished": "2026-03-01",
"dateModified": "2026-04-12",
"mainEntityOfPage": "https://example.com/blog/b2b-email-platform-guide"
}
</script>
Layer additional types where they fit: FAQPage on Q&A sections, HowTo on step-by-step guides, Product with Offer on pricing pages, BreadcrumbList on navigation. Validate every page with the Schema Markup Validator at validator.schema.org before shipping.
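Before sending pages to the validator, a quick local check can catch pages that ship without schema at all. The following is a minimal Python sketch using only the standard library. The names JsonLdExtractor, missing_article_fields, and REQUIRED_ARTICLE_FIELDS are hypothetical, and the field list is an assumption for illustration, not a schema.org requirement. It does not replace the Schema Markup Validator.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


# Assumed minimum field set for this sketch; adjust per client.
REQUIRED_ARTICLE_FIELDS = ("headline", "author", "publisher", "datePublished")


def missing_article_fields(html):
    """Return required fields absent from the first top-level Article object.

    Returns None when the page has no parseable Article JSON-LD.
    """
    parser = JsonLdExtractor()
    parser.feed(html)
    for raw in parser.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed JSON is skipped, same as having no schema
        if isinstance(data, dict) and data.get("@type") == "Article":
            return [f for f in REQUIRED_ARTICLE_FIELDS if f not in data]
    return None
```

Run it over the rendered HTML of each template (blog post, landing page, pricing page) and treat a None result or a non-empty list as a page to fix.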
2. Publish an llms.txt at the domain root
llms.txt is an emerging convention (proposed by Jeremy Howard, adopted by Anthropic and several others) that gives AI crawlers a clean index of your site's most important content. It lives at https://example.com/llms.txt and looks like:
# Example Co
> B2B email platform for ecommerce brands.
## Docs
- [Getting started](https://example.com/docs/start): How to set up your first campaign
- [API reference](https://example.com/docs/api): REST API endpoints and auth
## Guides
- [Email deliverability](https://example.com/guides/deliverability): Improving inbox placement
Not every engine reads it yet. It takes about an hour to ship and works as a leading indicator: if engines start relying on it heavily, sites that already publish one are positioned to benefit first.
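For clients with many docs and guides, the file can be generated rather than hand-edited. Here is a minimal Python sketch that renders the same layout as the example above. The function name build_llms_txt and the shape of the sections argument are assumptions made for illustration.

```python
def build_llms_txt(site_name, summary, sections):
    """Render an llms.txt body in the layout shown in the example.

    sections maps a section title to a list of
    (label, url, description) tuples.
    """
    lines = [f"# {site_name}", f"> {summary}"]
    for title, links in sections.items():
        lines.append(f"## {title}")
        for label, url, description in links:
            lines.append(f"- [{label}]({url}): {description}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_llms_txt(
        "Example Co",
        "B2B email platform for ecommerce brands.",
        {
            "Docs": [
                ("Getting started", "https://example.com/docs/start",
                 "How to set up your first campaign"),
            ],
        },
    ))
```

Feeding it from the CMS or sitemap keeps the index current as pages are added or retired.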
3. Fix canonical URLs
Every page needs a single canonical URL declared with <link rel="canonical" href="https://example.com/canonical-path">. Trailing-slash variants, query-parameter variants, and URLs carrying tracking parameters (utm_source and the like) should all declare the same canonical. AI engines deduplicate citations by canonical, so a page with three URL variants splits its citation share three ways.
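To audit a crawl export for variants that should share one canonical, a normalizer along these lines can help. This is a sketch under stated assumptions: the TRACKING_PARAMS list is illustrative, and the policy (https, lowercase host, no trailing slash except on the root) is one reasonable choice, not the only one. The name canonicalize is hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Illustrative list of tracking parameters; extend it for the client's stack.
TRACKING_PARAMS = {
    "utm_source", "utm_medium", "utm_campaign",
    "utm_term", "utm_content", "gclid", "fbclid",
}


def canonicalize(url):
    """Collapse common URL variants onto one canonical form."""
    parts = urlsplit(url)
    # Keep only query parameters that are not known tracking tags.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    # Assumed policy: no trailing slash, except on the site root.
    path = parts.path.rstrip("/") or "/"
    # Force https, lowercase the host, and drop any fragment.
    return urlunsplit(("https", parts.netloc.lower(), path, urlencode(query), ""))
```

Group crawled URLs by canonicalize(url). Any group with more than one member is a set of variants whose canonical tags should be checked.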
4. Server-render or pre-render the body content
If the page's main text only appears after JavaScript runs, most AI crawlers will not see it. ChatGPT's browse tool and Perplexity's crawler execute some JS but not reliably. The fix is server-side rendering (Next.js, Remix, Rails, Django, ASP.NET MVC) or static pre-rendering (Astro, Hugo, Eleventy) for any page meant to be cited.
Test it: curl -A "Mozilla/5.0 (compatible; PerplexityBot/1.0)" https://example.com/page and check whether the article text is in the response body. If you see an empty shell with a loading spinner, fix it.
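The same check can be scripted across many pages. Below is a minimal Python sketch that takes HTML you have already fetched (for example, saved curl output) and reports whether a known phrase from the article appears outside script and style tags. The names VisibleTextParser and text_in_initial_html are hypothetical.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text nodes, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def text_in_initial_html(html, expected_phrase):
    """True if expected_phrase appears in the markup without running JS."""
    parser = VisibleTextParser()
    parser.feed(html)
    # Collapse whitespace so line breaks in the markup do not hide a match.
    text = " ".join(" ".join(parser.chunks).split())
    return expected_phrase.lower() in text.lower()
```

Pick a sentence from the middle of the article as the phrase. A False result on the raw response means the body is being injected client-side.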
5. Allow AI crawlers in robots.txt
Many sites quietly block AI crawlers by default. The fix is explicit allow rules for the bots you want citing you:
User-agent: GPTBot
Allow: /
User-agent: PerplexityBot
Allow: /
User-agent: ClaudeBot
Allow: /
User-agent: Google-Extended
Allow: /
User-agent: CCBot
Allow: /
Some clients have legal reasons to block specific bots (training-data concerns). Have that conversation explicitly rather than blocking by default. Note that GPTBot is OpenAI's training-data crawler; ChatGPT search uses a separate crawler, OAI-SearchBot. Google-Extended controls whether content is used for Gemini training. Each can be allowed or blocked independently of the others.
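To find out which bots an inherited robots.txt already blocks, Python's standard urllib.robotparser can evaluate the rules offline. This is a minimal sketch: the AI_BOTS tuple mirrors the user agents listed above, and the function name blocked_ai_bots is hypothetical. It reflects how Python's parser interprets the file, which may differ in edge cases from how each crawler does.

```python
from urllib.robotparser import RobotFileParser

# User agents from the rules above; add or remove bots per client policy.
AI_BOTS = ("GPTBot", "PerplexityBot", "ClaudeBot", "Google-Extended", "CCBot")


def blocked_ai_bots(robots_txt, url, bots=AI_BOTS):
    """Return the bots that the given robots.txt text blocks from url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in bots if not parser.can_fetch(bot, url)]
```

Pass in the text of the client's current robots.txt and a representative article URL. An empty list means none of the listed bots is blocked for that URL.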
6. Write meta descriptions in the 150-160 char band
AI engines often lift the meta description as the snippet they cite. Too short (under 120 chars) and engines fall back to extracting an arbitrary sentence; too long (over 165) and it gets truncated mid-thought. Aim for one complete sentence, 150-160 characters, that answers the page's core question and ends with a specific hook.
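The length bands above are easy to enforce in a content audit. Here is a minimal Python sketch using the thresholds from this section (under 120 too short, over 165 too long, 150-160 the target). The function name and the returned labels are hypothetical.

```python
def check_meta_description(description):
    """Classify a meta description by the length bands described above."""
    length = len(description.strip())
    if length < 120:
        return "too short"
    if length > 165:
        return "too long"
    if 150 <= length <= 160:
        return "in target band"
    # 120-149 or 161-165: usable, but outside the 150-160 target.
    return "acceptable"
```

Length is only the mechanical half of the check. Whether the sentence answers the page's core question still needs a human read.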
Anti-patterns to remove on inherited client sites
- Lazy-loaded article body. The first 500 words of a blog post should be in the initial HTML, not loaded on scroll.
- Author bylines that link to nowhere. AI engines treat anonymous content as lower-trust. Real Person schema with sameAs linking to a LinkedIn or X profile lifts citation rates noticeably in our scans.
- Pop-ups blocking content. Cookie banners and email modals that hide the article body during initial render confuse crawlers. Keep them out of the critical content area.
- Duplicate pages. A page reachable at both /product and /products/main splits authority. Pick one canonical and 301 the other.