Schema markup in 2026: what AI engines read and what they ignore.

Every GEO checklist includes schema markup. Most stop there. They do not tell you which schema types AI engines actually pull from, which ones sit quietly in the background as infrastructure, or which ones you are probably implementing wrong. This is that breakdown.

Schema is not a ranking signal — it is a legibility signal

Schema markup does not move you up in Google search results. That is not what it does. What it does is make your content machine-readable — it tells the crawler what type of entity it is looking at, what that entity offers, and how to verify its identity against external sources.

A business without schema is like a document without headings. The content is there. The crawler can find it. But extracting structured facts from unstructured prose is harder, and in a competition between two pages covering the same topic, the one that is easier to parse wins the citation.

The distinction matters because it changes how you prioritise. You are not trying to trick a ranking algorithm. You are trying to tell a retrieval system: this is who we are, this is what we do, here are the third-party sources that confirm it.

The schemas that drive AI citations

Five schema types drive the majority of AI citations across ChatGPT, Perplexity, and Google AI Overviews. They are not equal.

FAQPage

FAQPage is the highest-return schema for GEO. AI engines retrieve Q&A pairs directly — not the surrounding prose, the actual question and answer text. A page with ten well-formed FAQPage questions is ten citable units. Each one can be retrieved and attributed independently. Write the question the way a user would ask it. Write the answer as a complete sentence that stands alone without context.

HowTo

Perplexity and ChatGPT both answer procedural queries by pulling HowTo steps. When someone asks "how do I set up X," the engine retrieves the step array, not the introduction. A five-step HowTo with precise step descriptions — not vague labels — wins a citation for every step it gets pulled into. The step name and text fields both matter. "Step 3: configure" is not a step description. "Step 3: open your DNS provider settings and add a TXT record with the value provided in the verification email" is.
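
As a rough illustration, a step array with both fields filled in could be generated like this. It is a sketch under assumptions: `build_howto` is a hypothetical helper, and the title and first step are invented example data. Only the DNS step text comes from the paragraph above. `HowTo`, `HowToStep`, `position`, `name`, and `text` are schema.org terms.

```python
import json


def build_howto(title, steps):
    """Build a schema.org HowTo object from (name, text) pairs.

    Hypothetical helper for illustration; not part of any library.
    """
    return {
        "@context": "https://schema.org",
        "@type": "HowTo",
        "name": title,
        "step": [
            {
                "@type": "HowToStep",
                "position": index,
                "name": name,
                "text": text,
            }
            for index, (name, text) in enumerate(steps, start=1)
        ],
    }


# Invented example data, apart from the DNS step quoted in the text.
howto = build_howto(
    "How to verify a domain",
    [
        (
            "Open the verification email",
            "Open the verification email and copy the TXT record value it contains.",
        ),
        (
            "Add the TXT record",
            "Open your DNS provider settings and add a TXT record with the "
            "value provided in the verification email.",
        ),
    ],
)

print(json.dumps(howto, indent=2))
```

Each step carries a short `name` and a full `text`, so a step lifted on its own still reads as a complete instruction.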

Organization + sameAs

Without sameAs, an AI engine cannot verify your entity. It sees a name and a URL. It cannot confirm you are the same entity described on LinkedIn, listed on Google Maps, or documented on Wikidata. Entity disambiguation depends on sameAs. When the engine cannot confirm identity, it defaults to sources that can be confirmed. The sameAs array is your identity graph.

Article / BlogPosting

Article schema carries three signals that affect retrievability: datePublished (freshness), author (attribution), and publisher (entity link). An authoritative page without Article schema looks undated and unattributed to a crawler. An 18-month-old piece of content with a recent dateModified tells the engine it has been reviewed and is still current. Leave these fields out and you compete on prose quality alone.
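
A minimal sketch of an Article block that carries all three signals might look like the following. The helper name `build_article`, the headline, the author, the publisher, and the dates are all placeholders chosen for illustration. The dates are 18 months apart to mirror the example above.

```python
import json
from datetime import date


def build_article(headline, published, modified, author_name,
                  publisher_name, publisher_url):
    """Build a schema.org Article object with date and attribution fields.

    Hypothetical helper for illustration only.
    """
    if modified < published:
        raise ValueError("dateModified cannot be earlier than datePublished")
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "datePublished": published.isoformat(),   # freshness
        "dateModified": modified.isoformat(),     # last review
        "author": {"@type": "Person", "name": author_name},  # attribution
        "publisher": {                            # entity link
            "@type": "Organization",
            "name": publisher_name,
            "url": publisher_url,
        },
    }


# Placeholder values, not taken from any real page.
article = build_article(
    "Example headline",
    date(2024, 9, 1),
    date(2026, 3, 1),
    "Jane Doe",
    "Example Co",
    "https://example.com",
)

print(json.dumps(article, indent=2))
```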

LocalBusiness / ProfessionalService and industry subtypes

The schema.org hierarchy goes deep. Attorney, Dentist, AccountingService, ITConsultancy — each unlocks platform-specific citation surfaces. Google AI Overviews pulls LocalBusiness types into Maps-backed answers. Bing Copilot uses ProfessionalService subtypes for local intent queries. Using @type: "LocalBusiness" when you are actually an "Attorney" or "FinancialPlanningService" is leaving specificity on the table.
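
One way to catch an overly generic type before it ships is a small check on the declared @type. This is only a sketch: the two sets below are illustrative assumptions, not an official schema.org grouping, and `type_specificity` is a hypothetical helper.

```python
# Assumption: which types count as "too generic" is a judgment call.
BASE_TYPES = {"Thing"}
PARENT_TYPES = {"Organization", "LocalBusiness"}


def type_specificity(node):
    """Classify a JSON-LD node's @type as 'generic', 'review', or 'specific'."""
    declared = node.get("@type", [])
    # @type may be a single string or a list of strings.
    types = [declared] if isinstance(declared, str) else list(declared)
    if not types or all(t in BASE_TYPES for t in types):
        return "generic"
    if all(t in BASE_TYPES | PARENT_TYPES for t in types):
        return "review"  # valid, but a more specific subtype may apply
    return "specific"


for example in ({"@type": "Thing"}, {"@type": "LocalBusiness"}, {"@type": "Attorney"}):
    print(example["@type"], "->", type_specificity(example))
```

A "review" result is not an error. It is a prompt to look for a subtype that describes the business more precisely.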

The schemas that are table stakes but do not drive citations

BreadcrumbList, WebSite, and WebPage are worth having. They are not worth optimising for GEO. AI engines use them for navigation context — sitelinks in search, URL structure interpretation — but they do not lift from these types when generating citations. A Perplexity answer does not quote your BreadcrumbList. A ChatGPT response does not reference your WebPage name.

Treat these three as infrastructure. Implement them correctly once and move on. The citation-driving work is in FAQPage, HowTo, Organization with sameAs, Article, and type specificity.

The most common schema mistakes we find in audits

Client after client, the same patterns appear. These are the four we find most often.

JavaScript-injected JSON-LD

The most damaging mistake: the JSON-LD block sits in a script tag that is injected by JavaScript after DOM load. AI crawlers fetch the server-rendered HTML. They do not execute JavaScript on most pages. What they see is an empty object or no schema at all. The fix is simple — server-render the JSON-LD into the document head. In Next.js, this means returning the script from a Server Component, not appending it in a useEffect.
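
The principle is framework-independent: the JSON-LD must already be in the HTML string the server sends. Here is a minimal sketch in plain Python rather than Next.js. The function names `render_jsonld_script` and `render_page` are hypothetical, and a real site would use its own templating layer.

```python
import html
import json


def render_jsonld_script(data):
    """Return a <script> tag holding the JSON-LD, ready for the HTML head."""
    payload = json.dumps(data, ensure_ascii=False)
    # Escape "</" so a string value such as "</script>" cannot close the
    # tag early. "<\/" is still valid JSON and decodes to "</".
    payload = payload.replace("</", "<\\/")
    return '<script type="application/ld+json">' + payload + "</script>"


def render_page(title, body_html, schema):
    """Build the full HTML response on the server, schema included."""
    return (
        "<!doctype html><html><head>"
        f"<title>{html.escape(title)}</title>"
        f"{render_jsonld_script(schema)}"
        "</head><body>"
        f"{body_html}"
        "</body></html>"
    )


# Placeholder organisation data for the example.
page = render_page(
    "Demo",
    "<p>Hello</p>",
    {"@context": "https://schema.org", "@type": "Organization", "name": "Example Co"},
)
print(page)
```

Because the script tag is part of the response body, a crawler that never runs JavaScript still receives it.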

@type: "Thing" when you mean something specific

"Thing" is the base type. Using it tells the engine nothing about what you are. Pick the most specific applicable subtype: LocalBusiness, ProfessionalService, SoftwareApplication, EducationalOrganization. The engine uses the type to route your entity to the right citation surface.

FAQPage on decorative headings

We see FAQPage schema on pages where the H2s are section titles ("Why choose us?", "Our process"), not actual questions. The schema validator passes. The engine retrieves the markup, finds that the "question" is not a genuine question with a real answer, and either skips it or rates it low for retrieval. Each FAQPage entry should be a question a real user would type and an answer that fully addresses it.
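
A crude filter can catch the most obvious cases before a human review. The sketch below is a heuristic only, and its rules are assumptions: it flags entries that are not phrased as questions, entries written in the brand's voice ("us", "our", "we"), and answers too short to stand alone. The helper `weak_faq_entries` and the eight-word threshold are hypothetical.

```python
import re

BRAND_VOICE = {"we", "us", "our"}  # assumption: users write "you", not "us"


def weak_faq_entries(faq_page, min_answer_words=8):
    """Return (question, reason) pairs for FAQPage entries that look decorative."""
    problems = []
    for entry in faq_page.get("mainEntity", []):
        question = entry.get("name", "").strip()
        answer = entry.get("acceptedAnswer", {}).get("text", "").strip()
        words = set(re.findall(r"[a-z']+", question.lower()))
        if not question.endswith("?"):
            problems.append((question, "not phrased as a question"))
        elif words & BRAND_VOICE:
            problems.append((question, "written in the brand's voice"))
        elif len(answer.split()) < min_answer_words:
            problems.append((question, "answer too short to stand alone"))
    return problems


faq = {
    "@type": "FAQPage",
    "mainEntity": [
        {"@type": "Question", "name": "Why choose us?",
         "acceptedAnswer": {"@type": "Answer", "text": "Because we care."}},
        {"@type": "Question", "name": "Our process",
         "acceptedAnswer": {"@type": "Answer", "text": "Discovery, build, launch."}},
        {"@type": "Question", "name": "What is GEO?",
         "acceptedAnswer": {"@type": "Answer", "text": (
             "Generative Engine Optimisation is the practice of structuring a "
             "website so AI engines can retrieve, attribute, and cite its content."
         )}},
    ],
}

for question, reason in weak_faq_entries(faq):
    print(f"{question!r}: {reason}")
```

A check like this cannot judge whether an answer is actually good. It only narrows the list a person has to read.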

Organization without contactPoint or sameAs

An Organization block with just name, url, and logo is the minimum viable schema. It does almost nothing for entity authority. The contactPoint property adds a machine-readable contact surface. The sameAs array is what enables identity confirmation. Both are required for the schema to do real GEO work.
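
For comparison, an Organization block that includes both properties could be assembled as follows. This is a sketch with placeholder data: `build_organization` is a hypothetical helper, and every name, URL, and email address below is invented.

```python
import json


def build_organization(name, url, logo, same_as, contact_type, email):
    """Build a schema.org Organization with contactPoint and sameAs."""
    if not same_as:
        raise ValueError("sameAs needs at least one external profile URL")
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "logo": logo,
        "contactPoint": {
            "@type": "ContactPoint",
            "contactType": contact_type,
            "email": email,
        },
        "sameAs": list(same_as),
    }


# All values are placeholders.
organization = build_organization(
    name="Example Co",
    url="https://example.com",
    logo="https://example.com/logo.png",
    same_as=[
        "https://www.linkedin.com/company/example-co",
        "https://github.com/example-co",
    ],
    contact_type="customer support",
    email="hello@example.com",
)

print(json.dumps(organization, indent=2))
```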

A correct minimal FAQPage block looks like this, served in a script tag in the document head, not injected after load:

    {
      "@context": "https://schema.org",
      "@type": "FAQPage",
      "mainEntity": [
        {
          "@type": "Question",
          "name": "What is GEO?",
          "acceptedAnswer": {
            "@type": "Answer",
            "text": "Generative Engine Optimisation is the practice of structuring a website so AI engines can retrieve, attribute, and cite its content in generated answers."
          }
        }
      ]
    }

sameAs is the most underused schema property

Entity disambiguation is the hardest problem in GEO. AI engines see thousands of entities with similar names. The way they confirm which one you are is by cross-referencing your sameAs links against known data sources. Each link in the array is a vote that the entity at this URL is the same entity described on your site.

For a professional services business, a complete sameAs array includes a LinkedIn company page, Google Maps listing, Crunchbase profile, GitHub organisation page, and any relevant directory listing (Clutch, Upwork, G2, or DesignRush, depending on your industry), plus a Wikidata entity once the brand qualifies for one. If any of these profiles do not exist yet, create them. The schema link only works if the destination exists and is populated.

LinkedIn company page — the strongest professional identity signal for B2B entities
Google Maps listing — required for local-intent queries and AI Overview local answers
Crunchbase profile — used for company age, funding, and category verification
GitHub organisation — credibility signal for technology companies and development studios
Wikidata entity — the gold standard for entity disambiguation; pursue this once the brand has sufficient third-party coverage
Industry directory (Clutch, Upwork, G2) — category-specific citation surfaces

Engines that cannot confirm identity via sameAs default to other sources that can be confirmed. You are not just adding links — you are deciding whether the engine uses your page or someone else's when the query matches your entity.
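
Before checking that each destination exists, it helps to catch structural problems in the array itself. The sketch below runs offline and makes no network requests. `check_same_as` is a hypothetical helper and the URLs are placeholders. Confirming that a link resolves and names the right entity is a separate, manual or HTTP-based step.

```python
from urllib.parse import urlsplit


def check_same_as(site_url, same_as):
    """Return (link, problem) pairs for structurally bad sameAs entries."""
    site_host = urlsplit(site_url).hostname
    seen = set()
    findings = []
    for link in same_as:
        parts = urlsplit(link)
        normalised = link.rstrip("/")
        if parts.scheme != "https" or not parts.hostname:
            findings.append((link, "not an absolute https URL"))
        elif parts.hostname == site_host:
            findings.append((link, "points back to your own site"))
        elif normalised in seen:
            findings.append((link, "duplicate entry"))
        seen.add(normalised)
    return findings


# Placeholder URLs for illustration.
findings = check_same_as(
    "https://example.com",
    [
        "https://www.linkedin.com/company/example-co",
        "linkedin.com/company/example-co",
        "https://example.com/about",
        "https://www.linkedin.com/company/example-co/",
    ],
)

for link, problem in findings:
    print(f"{link}: {problem}")
```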

How to validate schema without just using the Rich Results Test

The Rich Results Test validates syntax. It tells you whether your JSON-LD parses correctly. It does not tell you whether the schema is server-rendered, whether the content actually matches the claimed type, or whether AI engines are seeing it at all.

Three steps give a more complete picture.

01 — curl and grep for server-rendered schema

Run curl on the live URL and grep the output for application/ld+json. If the script tag is not in the raw HTML response, it is JavaScript-injected and AI crawlers are not seeing it. This is the most important check and it takes 30 seconds.
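
The same check can be scripted. The sketch below parses a raw HTML string with the standard library and returns every JSON-LD block it finds, with no JavaScript executed. The class and function names are hypothetical, and the User-Agent string is a placeholder. `fetch_raw_html` is defined but not called here, so the example runs without network access.

```python
import json
import urllib.request
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []   # successfully parsed JSON-LD objects
        self.errors = []   # raw text of blocks that failed to parse

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            raw = "".join(self._buffer)
            try:
                self.blocks.append(json.loads(raw))
            except json.JSONDecodeError:
                self.errors.append(raw)
            self._in_jsonld = False


def jsonld_in_raw_html(html_text):
    """Return the JSON-LD objects present in the HTML as served."""
    parser = JsonLdExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.blocks


def fetch_raw_html(url):
    """Fetch the server response without rendering it (like curl)."""
    request = urllib.request.Request(url, headers={"User-Agent": "schema-check/0.1"})
    with urllib.request.urlopen(request, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


sample = (
    '<html><head><script type="application/ld+json">'
    '{"@context": "https://schema.org", "@type": "FAQPage", "mainEntity": []}'
    "</script></head><body></body></html>"
)
print([block["@type"] for block in jsonld_in_raw_html(sample)])
```

An empty list for a live page, for example from `jsonld_in_raw_html(fetch_raw_html(url))`, means the schema is either missing or injected after load.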

02 — Google Search Console URL Inspection

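The URL Inspection tool in Search Console shows you the rendered HTML as Googlebot sees it after crawling. Check the Enhancements section: if your schema types are not listed, Googlebot is not finding them either. Search Console only reports the types Google supports for rich results, so treat a missing entry for any other type as inconclusive. For batch coverage, use the rich result status reports under Enhancements to find pages with schema errors at scale.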

03 — Bing Webmaster Tools markup validator

Bing's validation endpoint checks schema coverage against Bing's own index, which feeds Copilot and ChatGPT search. A schema that passes Google's validation can still have issues in Bing's parser, particularly around JSON-LD encoding and character escaping. Run both. The two validators flag different error classes.

The schema audit we run on every site

Our GEO audit scores schema across nine dimensions. This is the full checklist.

Nine schema types validated: Organization, LocalBusiness or ProfessionalService subtype, Article or BlogPosting, FAQPage, HowTo, BreadcrumbList, WebSite, WebPage, and AggregateRating where applicable
Server-render confirmed: JSON-LD present in raw HTML response, not injected post-load
sameAs link graph verified: each linked URL resolves, the destination entity name matches, no broken or redirected links
FAQPage question count: minimum five well-formed Q&A pairs on any page targeting informational queries
HowTo presence: step count, step name quality, and tool or supply fields where warranted
Article dates and authors: datePublished, dateModified, and author with @type Person on all editorial content
AggregateRating: present where the business has verifiable review data, absent where it does not (fabricated ratings are a disqualifier)
Type specificity: @type uses the most specific applicable subtype, not a generic parent
contactPoint: present on Organization with contactType and telephone or email
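
A few of these items can be expressed as simple checks on a parsed JSON-LD object. The sketch below is illustrative only and is not the scanner described in this article: `audit_node` is a hypothetical helper that covers three of the nine items and reports findings without computing a score.

```python
def audit_node(node):
    """Return a list of findings for one JSON-LD object."""
    findings = []
    node_type = node.get("@type")

    if node_type in ("Article", "BlogPosting"):
        for field in ("datePublished", "dateModified"):
            if field not in node:
                findings.append(f"{node_type}: missing {field}")
        author = node.get("author")
        if not isinstance(author, dict) or author.get("@type") != "Person":
            findings.append(f"{node_type}: author should be an object with @type Person")

    elif node_type == "FAQPage":
        count = len(node.get("mainEntity", []))
        if count < 5:
            findings.append(f"FAQPage: {count} Q&A pairs, minimum is 5")

    elif node_type == "Organization":
        contact = node.get("contactPoint")
        has_channel = isinstance(contact, dict) and (
            "telephone" in contact or "email" in contact
        )
        if not has_channel or "contactType" not in contact:
            findings.append(
                "Organization: contactPoint needs contactType and telephone or email"
            )
        if not node.get("sameAs"):
            findings.append("Organization: sameAs is missing or empty")

    return findings


# Placeholder object: the minimum viable Organization block.
bare_org = {"@type": "Organization", "name": "Example Co", "url": "https://example.com"}
for finding in audit_node(bare_org):
    print(finding)
```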

Each item scores 0 to 100 in the schema dimension of the audit. A perfect schema score does not guarantee citations: retrievability, content structure, and entity authority all feed the final score. But a schema score below 70 reliably suppresses citation frequency even when the content is strong. We have the data across 60-plus audits to confirm the correlation.

The engine does not read your content the way a human does. It reads what you have told it you are.

If you want to see where your schema stands, the GEO portal at geo.atellius.com runs the full nine-point schema check and returns findings with fix instructions. The same scanner runs against this site on every deploy.
