Methodology

What We Check and Why

Our infrastructure scan runs 10 automated checks against your website. Every check is labeled with an evidence tier so you know exactly what is verified by real crawler behavior versus what is speculative.

Evidence Tiers

VERIFIED

Major search engines or AI crawlers are confirmed to read and act on this signal. Backed by official documentation from Google, OpenAI, or protocol specifications.

EMERGING

Proposed standard with growing adoption but not yet confirmed to be read by major AI systems. We include these checks when adoption exceeds 10% of top sites, labeled clearly so you can decide whether to act on them.

How This Differs from HubSpot's AEO Grader

HubSpot AEO Grader

Asks AI models "what do you think of this brand?" Measures sentiment, recognition, and share of voice. A brand perception tool. Score out of 100.

citability.dev

Scans your actual website infrastructure. Measures whether AI crawlers can technically find and parse your content. A technical readiness tool. Every check shows evidence and explanation.

They answer "does AI know you?" We answer "can AI find you?" You can score 80/100 on HubSpot and still fail our scan if your site blocks AI crawlers. The tools are complementary.

The 10 Checks

01

robots.txt

VERIFIED

What it checks

Checks whether your robots.txt file exists and is accessible.

Who reads it

All major crawlers: Googlebot, GPTBot (OpenAI), ClaudeBot (Anthropic), CCBot (Common Crawl), Meta-ExternalAgent.

Why it matters

robots.txt is the standard protocol (RFC 9309) for communicating crawl permissions. A missing file (404) is generally treated as permission to crawl everything, while a file that returns server errors (5xx) can lead crawlers to treat the whole site as disallowed. AI-specific directives (allow/disallow GPTBot, ClaudeBot) are only possible if the file exists.
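
As an illustration of what AI-specific directives look like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt body, the URL, and the `crawl_permissions` helper are hypothetical, and this goes one step beyond the existence check described above; it is not the scanner's actual implementation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body: blocks GPTBot, allows everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

# Crawler names taken from the "Who reads it" list above.
AGENTS = ["Googlebot", "GPTBot", "ClaudeBot", "CCBot"]


def crawl_permissions(robots_txt, url, agents=AGENTS):
    """Return {agent: True/False} for whether each agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


permissions = crawl_permissions(ROBOTS_TXT, "https://example.com/pricing")
for agent, allowed in permissions.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

With this sample file, only GPTBot is blocked; the other agents fall through to the wildcard group.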

Source documentation

02

sitemap.xml

VERIFIED

What it checks

Checks whether a sitemap exists at your domain root.

Who reads it

All search engines and AI crawlers use sitemaps for content discovery.

Why it matters

Sitemaps tell crawlers which pages exist and when they were last updated. Without one, crawlers rely on link-following, which misses orphaned pages and provides no freshness signal.
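
To show what "which pages exist and when they were last updated" looks like to a crawler, here is a small sketch that reads `<loc>` and `<lastmod>` from a sitemap with the standard library. The sample sitemap and the `sitemap_entries` helper are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap: the second URL has no freshness signal.
SITEMAP_XML = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2026-01-15</lastmod>
  </url>
  <url>
    <loc>https://example.com/orphaned-page</loc>
  </url>
</urlset>
"""


def sitemap_entries(xml_text):
    """Return a list of (loc, lastmod) tuples; lastmod is None when absent."""
    root = ET.fromstring(xml_text)
    return [
        (url.findtext("sm:loc", namespaces=NS),
         url.findtext("sm:lastmod", namespaces=NS))
        for url in root.findall("sm:url", NS)
    ]


entries = sitemap_entries(SITEMAP_XML)
for loc, lastmod in entries:
    print(loc, lastmod or "(no lastmod)")
```

The orphaned page is discoverable only because the sitemap lists it, which is the case link-following would miss.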

Source documentation

03

Answer-First Content

VERIFIED

What it checks

Analyzes whether your homepage leads with a direct, extractable answer rather than generic marketing copy.

Who reads it

Google (featured snippets, AI Overviews), Perplexity, ChatGPT browse mode.

Why it matters

AI systems extract concise answers from pages. Content that buries the answer below navigation, hero images, or generic taglines is less likely to be selected for AI-generated responses.

Source documentation

04

Content Freshness

VERIFIED

What it checks

Checks for date signals: published dates, modified dates, or dateModified in schema.

Who reads it

Google (QDF algorithm), AI systems that prioritize recent content.

Why it matters

Stale content without date signals gets deprioritized. Google's Query Deserves Freshness (QDF) algorithm and AI training pipelines both use recency as a quality signal.
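
One way date signals could be collected from a page is sketched below with the standard `html.parser`. The set of meta keys in `DATE_META_KEYS` is our assumption for the example, not a list taken from the scanner, and `dateModified` inside JSON-LD would need separate handling.

```python
from html.parser import HTMLParser

# Assumed meta names/properties that carry dates (illustrative, not exhaustive).
DATE_META_KEYS = {"article:published_time", "article:modified_time", "date"}


class DateSignalParser(HTMLParser):
    """Collects (source, value) pairs for date-bearing tags."""

    def __init__(self):
        super().__init__()
        self.signals = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            key = (attrs.get("property") or attrs.get("name") or "").lower()
            if key in DATE_META_KEYS and attrs.get("content"):
                self.signals.append((key, attrs["content"]))
        elif tag == "time" and attrs.get("datetime"):
            self.signals.append(("time", attrs["datetime"]))


def find_date_signals(html):
    parser = DateSignalParser()
    parser.feed(html)
    return parser.signals


# Hypothetical page fragment.
HTML = """
<head><meta property="article:modified_time" content="2026-02-01T09:00:00Z"></head>
<body><time datetime="2025-11-20">November 20, 2025</time></body>
"""
signals = find_date_signals(HTML)
print("PASS" if signals else "FAIL", signals)
```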

Source documentation

05

Structured Data (JSON-LD)

VERIFIED

What it checks

Checks for JSON-LD structured data blocks on your homepage.

Who reads it

Google, Bing, and AI systems parse JSON-LD to understand entities and page purpose.

Why it matters

Schema markup (Organization, Article, FAQPage, HowTo) gives machines explicit context about your content. Pages with rich schema are more likely to generate rich results and be understood correctly by AI models.
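
For illustration, the sketch below extracts JSON-LD blocks from a page and lists their `@type` values using only the standard library. The sample markup and the helper names (`JsonLdParser`, `schema_types`) are hypothetical; a production parser would also need to handle `@graph` containers.

```python
import json
from html.parser import HTMLParser


class JsonLdParser(HTMLParser):
    """Collects parsed <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # malformed block: skipped in this sketch


def schema_types(html):
    """Return the top-level @type of every valid JSON-LD block."""
    parser = JsonLdParser()
    parser.feed(html)
    types = []
    for block in parser.blocks:
        for item in block if isinstance(block, list) else [block]:
            if isinstance(item, dict) and "@type" in item:
                types.append(item["@type"])
    return types


# Hypothetical homepage fragment.
HTML = """
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization",
 "name": "Example Co", "url": "https://example.com"}
</script>
"""
types = schema_types(HTML)
print(types)
```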

Source documentation

06

Meta Description

VERIFIED

What it checks

Checks for a meta description tag with meaningful content (more than 10 characters).

Who reads it

All search engines use it for snippet generation. AI systems use it as a page summary signal.

Why it matters

The meta description is often the first text an AI system reads about your page. A missing or generic description means the AI must guess your page's purpose from the body content.
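
A check of this kind can be sketched in a few lines. The 10-character threshold comes from the description above; the parser class and function name are hypothetical, and the real scan may judge "meaningful" differently.

```python
from html.parser import HTMLParser

MIN_LENGTH = 10  # threshold stated in "What it checks"


class MetaDescriptionParser(HTMLParser):
    """Captures the first <meta name="description"> content value."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        is_description = (attrs.get("name") or "").lower() == "description"
        if tag == "meta" and is_description and self.description is None:
            self.description = (attrs.get("content") or "").strip()


def has_meaningful_description(html):
    parser = MetaDescriptionParser()
    parser.feed(html)
    return parser.description is not None and len(parser.description) > MIN_LENGTH


# Hypothetical inputs.
GOOD = '<head><meta name="description" content="Scan your site for AI crawler readiness."></head>'
TOO_SHORT = '<head><meta name="description" content="Home"></head>'

for label, html in (("good", GOOD), ("too short", TOO_SHORT)):
    print(label, "PASS" if has_meaningful_description(html) else "FAIL")
```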

Source documentation

07

Canonical URL

VERIFIED

What it checks

Checks for a rel=canonical tag pointing to the authoritative version of the page.

Who reads it

All search engines and AI training crawlers.

Why it matters

Without a canonical, duplicate versions of your content (www vs non-www, HTTP vs HTTPS, query parameters) compete with each other. AI systems may cite the wrong version or split authority across duplicates.
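
The sketch below shows how a canonical tag collapses a duplicate URL (here, one with a tracking parameter) onto a single authoritative address. The page fragment, URLs, and helper names are assumptions for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CanonicalParser(HTMLParser):
    """Captures the href of the first <link rel="canonical">."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rels = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.href is None:
            self.href = attrs.get("href")


def canonical_url(html, page_url):
    """Return the absolute canonical URL, or None if the tag is missing."""
    parser = CanonicalParser()
    parser.feed(html)
    return urljoin(page_url, parser.href) if parser.href else None


# Hypothetical page reached through a duplicate URL.
PAGE = '<head><link rel="canonical" href="/pricing"></head>'
resolved = canonical_url(PAGE, "https://example.com/pricing?utm_source=newsletter")
print(resolved)
```

Resolving the href against the page URL matters because canonical links are sometimes written as relative paths.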

Source documentation

08

HTTPS

VERIFIED

What it checks

Checks whether your site is served over HTTPS.

Who reads it

Google (confirmed ranking signal since 2014), all AI crawlers that fetch content.

Why it matters

HTTPS is a baseline trust signal. AI systems that fetch live content (ChatGPT browse mode, Perplexity) require HTTPS for secure retrieval. Non-HTTPS sites may be skipped or flagged.

Source documentation

09

Heading Hierarchy

VERIFIED

What it checks

Checks for at least one H1 tag and proper heading structure.

Who reads it

Search engines use headings to understand content hierarchy. AI models use them to identify key topics.

Why it matters

A clear H1 > H2 > H3 hierarchy helps AI systems extract the main topic and subtopics from your page. Pages without an H1 lack a clear primary topic signal.
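
"Proper heading structure" is not defined precisely above, so the following sketch makes an assumption: it flags a missing H1 and any jump that skips a level (for example H2 straight to H4). The helper names are hypothetical.

```python
from html.parser import HTMLParser


class HeadingParser(HTMLParser):
    """Records heading levels (1-6) in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def heading_issues(html):
    """Return a list of problems; an empty list means the check passes."""
    parser = HeadingParser()
    parser.feed(html)
    issues = []
    if 1 not in parser.levels:
        issues.append("no H1")
    # Assumed rule: going deeper by more than one level counts as a skip.
    for prev, cur in zip(parser.levels, parser.levels[1:]):
        if cur > prev + 1:
            issues.append(f"H{prev} followed by H{cur} (skipped level)")
    return issues


# Hypothetical fragments.
GOOD = "<h1>Guide</h1><h2>Setup</h2><h3>Install</h3><h2>Usage</h2>"
BAD = "<h2>Setup</h2><h4>Install</h4>"

print(heading_issues(GOOD))
print(heading_issues(BAD))
```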

Source documentation

10

Social Sharing Readiness

VERIFIED

What it checks

Checks for Open Graph tags (og:title, og:description, og:image).

Who reads it

Social platforms (LinkedIn, Twitter/X, Facebook), AI systems that preview links.

Why it matters

OG tags control how your page appears when shared. AI systems that browse the web use these as quick metadata signals. Missing OG tags mean your content previews are auto-generated and often wrong.
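
A minimal sketch of this check: collect `og:*` meta properties and report which of the three tags named above are missing. The sample markup and helper names are hypothetical.

```python
from html.parser import HTMLParser

# The three tags listed under "What it checks".
REQUIRED_OG = ("og:title", "og:description", "og:image")


class OpenGraphParser(HTMLParser):
    """Collects og:* properties, keeping the first value for each."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = (attrs.get("property") or "").lower()
        if tag == "meta" and prop.startswith("og:") and attrs.get("content"):
            self.tags.setdefault(prop, attrs["content"])


def missing_og_tags(html):
    """Return the required OG tags that the page does not define."""
    parser = OpenGraphParser()
    parser.feed(html)
    return [name for name in REQUIRED_OG if name not in parser.tags]


# Hypothetical head fragment with no og:image.
HTML = """
<head>
  <meta property="og:title" content="Example Co">
  <meta property="og:description" content="What Example Co does, in one line.">
</head>
"""
missing = missing_og_tags(HTML)
print("PASS" if not missing else f"FAIL, missing: {missing}")
```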

Source documentation

What We Deliberately Exclude

llms.txt

Proposed in 2024, ~10% adoption as of 2026. No major AI company (Google, OpenAI, Anthropic, Meta) confirms reading it. Only 1 of the 50 most-cited domains has one. We monitor adoption but do not penalize for absence.

ai.txt / .well-known/llms.json

Not established standards. No confirmed crawler support. Including them would inflate your score without improving your actual AI visibility.

Proprietary "AI scores"

We do not generate opaque scores from black-box algorithms. Every check is a specific, verifiable test with a binary PASS/FAIL result and a cited source explaining why it matters.

Run the scan on your site

Free. No account required. Results in 10 seconds.
