Semantic HTML: the language humans, machines and AI understand
The div-soup problem
I open the source code of a new client's website. And then I see this:
- div.wrapper > div.container > div.row > div.col-12 > div.content > div.text > div.paragraph
- Not a single semantic tag in sight.
- Seven layers of div nesting for a simple paragraph.
Div-soup. And I immediately know: this is what we are going to talk about.
Look, this is not just ugly. It actively harms how AI models understand your website. A language model has no eyes. No intuition for visual hierarchy. It reads the structure of your HTML and tries to derive from that what is important and what is not. What is navigation and what is content. What is a quote and what is a caption.
If that structure is not there? It guesses. And guessing leads to incorrect interpretations, missed context and ultimately: fewer citations for you.
But wait. Is that actually a big deal? Yes. Yes, it is.
An AI model processing a page with correct semantic HTML makes better extraction decisions. It knows what the main content is, what the navigation is and what the sidebar is. That directly leads to more accurate citations.
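To make the contrast concrete, here is a minimal sketch of the nesting described above next to a semantic equivalent. The class names come from the example; the title and paragraph text are placeholders, and the semantic version is one reasonable structure, not the only correct one.

```html
<!-- Before: seven anonymous wrappers around a single paragraph -->
<div class="wrapper">
  <div class="container">
    <div class="row">
      <div class="col-12">
        <div class="content">
          <div class="text">
            <div class="paragraph">Placeholder paragraph text.</div>
          </div>
        </div>
      </div>
    </div>
  </div>
</div>

<!-- After: the tags themselves state what the content is -->
<main>
  <article>
    <h1>Placeholder title</h1>
    <p>Placeholder paragraph text.</p>
  </article>
</main>
```

In the first version, a parser only learns that there are boxes inside boxes. In the second, it can tell that there is one main content area containing one self-contained piece with a title and a paragraph.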
Six elements that make the difference
HTML5 introduced semantic elements that communicate structure. Without CSS classes, without ARIA attributes. They have existed for over ten years. And yet: many websites still do not use them. Or use them incorrectly.
A quick tour of the most important ones:
- main: the primary content of the page. One per page. AI models use this to determine what is core content and what is noise.
- article: a self-contained piece of content that retains value independently of context. A blog post, a news article. This element tells AI: this is citable.
- section: a thematic grouping within an article or page. Helps AI models understand the structure of longer pieces.
- aside: supplementary content. Sidebar, context block, definition box. Related but not essential.
- nav: navigation. AI models recognize this and process it differently from body content.
- header and footer: page or section header and footer. Help AI models distinguish recurring elements from unique content.
That is it. Six elements, counting header and footer as a pair. It is not rocket science. It is discipline.
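Put together, the elements above could form a page skeleton like the following. This is a hedged sketch: the link targets, headings and text are placeholders, and a real page will have more in the head and body.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Placeholder page title</title>
</head>
<body>
  <!-- Recurring page header with the site navigation -->
  <header>
    <nav aria-label="Main">
      <a href="/">Home</a>
      <a href="/blog/">Blog</a>
    </nav>
  </header>

  <!-- The primary content: one main per page -->
  <main>
    <!-- A self-contained piece that still makes sense on its own -->
    <article>
      <header>
        <h1>Placeholder article title</h1>
      </header>
      <!-- Thematic groupings, each with its own heading -->
      <section>
        <h2>First topic</h2>
        <p>Placeholder text.</p>
      </section>
      <section>
        <h2>Second topic</h2>
        <p>Placeholder text.</p>
      </section>
    </article>
  </main>

  <!-- Related but not essential content -->
  <aside>
    <h2>Related</h2>
    <p>Placeholder definition box.</p>
  </aside>

  <!-- Recurring page footer -->
  <footer>
    <p>Placeholder footer text.</p>
  </footer>
</body>
</html>
```

Note that header appears twice: once for the page and once inside the article. That is allowed, since a header belongs to its nearest sectioning ancestor.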
Practical implementation
I can hear you thinking: nice story, Reinier, but I am not going to rebuild my entire website. You do not have to.
Implementing semantic HTML is not a major refactoring. You replace a handful of div elements with their semantic equivalent. The CSS barely needs to change, because browsers render semantic elements as block-level boxes by default, just like div. Only selectors that name the div explicitly (such as div.nav) need to be updated.
- Identify the main content and wrap it in a main element. One main per page.
- Wrap self-contained content pieces (blog posts, articles) in article elements.
- Replace div.nav or div.menu with nav.
- Replace div.header with header and div.footer with footer.
- Use section for logical blocks within an article. Give each section a heading.
- Use aside for sidebar content and related links that do not belong to the main content.
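The steps above can be sketched as a before and after. The classes header, nav and footer follow the selectors mentioned in the steps; content, post and sidebar are hypothetical class names chosen for illustration.

```html
<!-- Before: structure is only implied by class names -->
<div class="header">
  <div class="nav">...</div>
</div>
<div class="content">
  <div class="post">
    <div class="post-part">...</div>
  </div>
</div>
<div class="sidebar">...</div>
<div class="footer">...</div>

<!-- After: same classes kept, only the tag names change -->
<header class="header">
  <nav class="nav">...</nav>
</header>
<main class="content">
  <article class="post">
    <section class="post-part">
      <h2>Placeholder section heading</h2>
      ...
    </section>
  </article>
</main>
<aside class="sidebar">...</aside>
<footer class="footer">...</footer>
```

Keeping the class attributes is a deliberate choice: a CSS rule written as .nav still matches the new nav element, so the styling keeps working while the markup gains meaning. The classes can be cleaned up later, once nothing depends on them.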
At Kobalt, I carry out these changes as the first step. It takes an afternoon for an average website and the effect on AI readability is directly measurable. Sometimes I wonder why not everyone does this. And then I remember: because nobody explains it.
Beyond the basics
Semantic HTML is the foundation. But there is more.
The lang attribute on the html element (lang="nl" for a Dutch site, for example) tells AI models what language your content is written in. Crucial for multilingual websites. ARIA role attributes provide extra context for elements that have no native semantic equivalent.
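A minimal sketch of both ideas, assuming a Dutch page with one English quotation and a tab widget. Tabs are used here because HTML has no native tab element; the ids, labels and text are placeholders, and the script that switches tabs is left out.

```html
<!DOCTYPE html>
<!-- Page language declared once, on the root element -->
<html lang="nl">
<head>
  <meta charset="utf-8">
  <title>Placeholder page title</title>
</head>
<body>
  <main>
    <p>Placeholder Dutch text.</p>

    <!-- A passage in another language overrides lang locally -->
    <blockquote lang="en">
      <p>Placeholder English quotation inside the Dutch page.</p>
    </blockquote>

    <!-- No native element exists for tabs, so ARIA roles describe them -->
    <div role="tablist" aria-label="Placeholder tab group">
      <button role="tab" id="tab-1" aria-selected="true" aria-controls="panel-1">Tab 1</button>
      <button role="tab" id="tab-2" aria-selected="false" aria-controls="panel-2" tabindex="-1">Tab 2</button>
    </div>
    <div role="tabpanel" id="panel-1" aria-labelledby="tab-1">Placeholder panel 1.</div>
    <div role="tabpanel" id="panel-2" aria-labelledby="tab-2" hidden>Placeholder panel 2.</div>
  </main>
</body>
</html>
```

The rule of thumb: reach for a native element first (nav, main, button) and add a role only where no native element fits. A role on a div describes the element but does not give it any behavior.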
It is like an ecosystem: everything is connected. Semantic HTML is the soil. Structured data is the vegetation. ARIA is the microclimate. Together they make your website readable for everything and everyone.
A client in the media sector had a website built entirely on div elements. After migrating to semantic HTML, the AEO score rose by 12 points in a single scan. No content changes, no new structured data. Just better HTML structure. Twelve points.
Frequently asked questions
Does semantic HTML also affect regular Google rankings?
Indirectly, yes. Google already understands div-soup fairly well. But semantic HTML complements the structured data behind rich results, improves accessibility (not a confirmed direct ranking factor, but closely tied to the user experience Google tries to reward) and makes your content easier to extract for featured snippets. It is never a disadvantage.
My website is built in a page builder. Can I still use semantic HTML?
Depends on the builder. Divi, Elementor and Beaver Builder generate many div elements but offer options to set semantic wrappers. Look for an "HTML tag" setting per block; the exact name varies by builder. Some builders let you choose section or article. It is not perfect, but it is better than nothing.
Is there a tool to quickly see how semantic my website is?
The accessibility tree in Chrome DevTools (right-click the page, choose Inspect, then open the Accessibility pane in the Elements panel) shows how automated processors see your page. The Web Developer Toolbar browser extension visualizes the heading structure. What a screen reader understands is a good indication of what an AI crawler can understand too.
Semantic HTML is not the last step in your AI optimization. It is the first. Without good structure, all other improvements are half measures.