The technical audit every website needs for AI
Why an AI audit differs from an SEO audit
I notice many people think: I already had an SEO audit done, so I am good. And yes, there is overlap. But an AI audit has a different focus. You are not optimizing for a ranking algorithm that orders pages. You are optimizing for language models that directly answer questions.
That is a fundamentally different beast.
An SEO audit asks: can Google find and understand you? An AI audit asks: can a language model extract your content, understand who you are, and subsequently cite you as a reliable source?
At Kobalt, I run this audit as an entry point for new clients. In two to three hours I have a complete picture. This article is the extended version of that checklist. Go through it, do it yourself, and discover where you stand.
Work through the checklist from top to bottom. The order is intentional: start with the basics (server and access), then structure (HTML and data), then content. Problems at an early stage block everything that follows. Like in baseball: you can have the best batter, but if the pitcher cannot even find the strike zone, nothing happens.
Step 1: server and access
The first questions are simple but crucial. Is an AI crawler even allowed to visit your website? And if it does, does the server respond quickly enough?
You would be surprised how often the answer is "no."
- Check robots.txt: are GPTBot, ClaudeBot, PerplexityBot and Googlebot-Extended allowed? I surprisingly often see Disallow rules that accidentally block everything.
- Measure your TTFB with GTmetrix or WebPageTest. Target: below 400ms. Anything above that? Houston, we have a problem.
- Check server logs for HTTP 429 and 503 responses for known bot user-agents.
- Verify HTTPS: no mixed content, valid certificate.
- Is there an llms.txt file at the root domain? If not: add one. Takes five minutes.
- Do you have a sitemap.xml that is up to date?
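The robots.txt check from the list above is easy to automate. A minimal sketch with Python's standard library, using a hypothetical robots.txt that blocks GPTBot (the bot list and example URL are placeholders; point it at your own file):

```python
from urllib import robotparser

# Hypothetical robots.txt: blocks GPTBot, allows everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Googlebot-Extended"]

def check_ai_access(robots_txt: str, url: str = "https://example.com/") -> dict:
    """Return a mapping of bot name -> allowed? for the given robots.txt."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in AI_BOTS}

if __name__ == "__main__":
    for bot, allowed in check_ai_access(ROBOTS_TXT).items():
        print(f"{bot}: {'allowed' if allowed else 'BLOCKED'}")
```

Run it against your live robots.txt (fetch the file first, or use `RobotFileParser.set_url` plus `read`) and you instantly see which AI crawlers are locked out.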
Step 2: HTML structure and semantics
HTML structure is the backbone of AI readability. A language model reads your DOM from top to bottom and tries to derive from that what is primary and secondary.
No eyes. No intuition for visual hierarchy. Just structure.
- Heading hierarchy: exactly one H1 per page? H2s as logical sections? No skipped levels (e.g. jumping from H1 straight to H4)?
- Semantic elements: does the page use article, nav, main, aside, header and footer? Or is it a spaghetti of div elements? (I wrote a whole article about this.)
- Alt texts on images: descriptive and relevant, or empty and generic?
- Load the page with JavaScript disabled. Does the content disappear? Then it is a problem.
- Canonical tag: does it point to the correct URL?
- Hreflang implementation for multilingual sites: correct and complete?
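The heading checks above can also be scripted. A rough sketch using only the standard library's HTML parser (a real audit tool would handle more edge cases, but this catches the two most common problems):

```python
from html.parser import HTMLParser

class HeadingCollector(HTMLParser):
    """Collects h1-h6 heading levels in document order."""
    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1].isdigit():
            self.levels.append(int(tag[1]))

def audit_headings(html: str) -> list:
    """Return a list of heading-hierarchy problems (empty list = clean)."""
    collector = HeadingCollector()
    collector.feed(html)
    problems = []
    h1_count = collector.levels.count(1)
    if h1_count != 1:
        problems.append(f"expected exactly one h1, found {h1_count}")
    for prev, cur in zip(collector.levels, collector.levels[1:]):
        if cur > prev + 1:
            problems.append(f"skipped level: h{prev} followed by h{cur}")
    return problems
```

For example, `audit_headings("<h1>Title</h1><h4>Deep</h4>")` flags the jump from H1 to H4, while a clean H1, H2, H3 sequence returns an empty list.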
Step 3: structured data and Schema.org
Structured data is the direct communication layer between your website and AI models. Without structured data, an AI model misses context that you have but have never made explicit.
I always compare it to bird rings. A bird without a ring is anonymous. A bird with a ring tells you species, origin, age. Structured data is the ring around your content.
- Organization or LocalBusiness schema on the homepage: name, URL, logo, contact info.
- Article or BlogPosting schema on all blog posts and news articles.
- FAQPage schema on pages with frequently asked questions.
- Validate everything with the Google Rich Results Test and the Schema Markup Validator. Invalid markup does not count.
- BreadcrumbList schema for navigation context.
- Authors linked via Person schema with sameAs reference to LinkedIn.
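To make the Organization item from the list above concrete, here is a minimal JSON-LD sketch for a homepage. All values (name, URLs, phone number) are placeholders; swap in your own details:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example Agency",
  "url": "https://www.example.com",
  "logo": "https://www.example.com/logo.png",
  "contactPoint": {
    "@type": "ContactPoint",
    "telephone": "+31-20-0000000",
    "contactType": "customer service"
  },
  "sameAs": ["https://www.linkedin.com/company/example"]
}
</script>
```

Drop it in the head of the homepage, then validate it with the Schema Markup Validator before shipping.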
Use the browser extension "Structured Data Testing Tool" to quickly see which markup is present on a page. Faster than pulling up the source code every time. I use it daily.
Step 4: content and E-E-A-T signals
The final step. Is the content structured in a way that an AI model understands who you are, why you are trustworthy, and what you want to say?
- Do all authors have a complete author page with bio, photo and external profiles?
- Does core content have source citations? AI models prefer content that substantiates its claims.
- Readability: Flesch reading ease score above 50? Not too academic, not too simple.
- Expertise signals: certifications, publications, speaking experience, client cases with concrete figures.
- Clear "About us" and contact page with consistent NAP (name, address, phone) information.
- Internal links: context-rich and logical, or bare "click here" links?
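The Flesch score mentioned in the list above follows a simple formula: 206.835 − 1.015 × (words/sentences) − 84.6 × (syllables/words). A rough sketch with a naive vowel-group syllable counter (real readability tools use pronunciation dictionaries, so treat the output as an estimate):

```python
import re

def flesch_reading_ease(text: str) -> float:
    """Estimate the Flesch reading ease score of an English text."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)

    def syllables(word: str) -> int:
        # Naive estimate: count groups of consecutive vowels.
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

    n = max(1, len(words))
    total_syllables = sum(syllables(w) for w in words)
    return 206.835 - 1.015 * (n / sentences) - 84.6 * (total_syllables / n)
```

Short sentences with short words score high; long academic sentences score low. Aim for the 50-plus range the checklist mentions.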
After these four steps, you know where you stand. Most websites score well on two out of four and poorly on the rest. That is normal. The point is: now you know what to fix.
Do not want to do this yourself? Our scanner at aeo-expert.nl/scan automates the technical checks. And if you need help with implementation afterwards, you know where to find us.
Frequently asked questions
How long does a full AI audit take?
For a website of 20 to 50 pages, I plan two to three hours for a thorough manual audit. Larger websites require more time or a combined approach: automated tools for the broad scan, manual inspection for nuance. Use our scanner at aeo-expert.nl/scan as a starting point.
Which errors do you encounter most often?
In order of frequency: robots.txt blocking AI crawlers (surprisingly common), missing or invalid Schema.org markup, JavaScript-generated content that cannot be crawled, inconsistent heading hierarchy and missing author information. The first two can be resolved in an hour. The rest requires more attention.
Do I need a separate AI audit alongside my regular SEO audit?
Ideally, you integrate AI readiness into your existing SEO audit process. Many checks overlap. The AI-specific additions: robots.txt configuration for AI crawlers, llms.txt, Schema.org validation and E-E-A-T signals. With a good SEO audit template, the AI layer costs an extra half hour.
An AI audit is not a one-time action. It is a baseline measurement. The real value lies in what you do with that information afterwards.
How does your website score on AI readiness?
Get your AEO score within 30 seconds and discover what you can improve.