Which websites were included in the AI visibility benchmark?

We audited 6 websites spanning domain authority 28 to 97: Ahrefs (DA 92), Semrush (DA 91), Reddit (DA 97), Medium (DA 95), X (DA 96), and chudi.dev (DA 28). The range was chosen to test whether authority correlates with AI visibility.

Which AI platforms were tested?

Each site was tested against ChatGPT (OpenAI), Perplexity, and Claude (Anthropic). We queried each platform with questions the site should be able to answer and tracked whether the AI mentioned the brand (visibility) or cited a URL (citability).

What was the main finding of the benchmark?

In this benchmark, domain authority did not predict AI citation rates. Ahrefs (DA 92) was 100% visible but only 5% cited. Reddit (DA 97) failed basic infrastructure checks. The three factors that predicted citations were answer-first content, dateModified schema, and original data.

Benchmarking AI Visibility: 6 Sites, 3 Platforms, 10 Checks

We audited 6 websites for AI visibility using the AI Visibility Readiness framework. The goal was straightforward: measure whether the infrastructure signals that AI crawlers need actually predict which sites get cited by AI platforms. For the conceptual model behind what citability is and the five pillars that drive it, see the citability framework.

The results challenged the assumption that authority and traffic drive AI citations.

The Benchmark Setup

Sites audited: Ahrefs (DA 92), Semrush (DA 91), Reddit (DA 97), Medium (DA 95), X (DA 96), chudi.dev (DA 28).

Checks per site: 10 automatically tested infrastructure signals, namely robots.txt, sitemap.xml, answer-first content, content freshness, structured data (JSON-LD), meta description, canonical URL, HTTPS, heading hierarchy, and social sharing readiness (Open Graph tags).
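As an illustration of what a robots.txt check might look at, here is a minimal sketch using Python's standard `urllib.robotparser`. The crawler names (`GPTBot`, `ClaudeBot`, `PerplexityBot`), the helper `ai_crawler_access`, and the sample file are assumptions for illustration; the benchmark does not state which user agents its scan evaluates or how it grades the result.

```python
from urllib.robotparser import RobotFileParser

# Assumed crawler names; the benchmark does not list the agents it checks.
AI_USER_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot"]


def ai_crawler_access(robots_txt: str, page_url: str) -> dict[str, bool]:
    """Return, per user agent, whether robots.txt permits fetching page_url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, page_url) for agent in AI_USER_AGENTS}


# Hypothetical robots.txt that blocks one AI crawler and allows the rest.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

access = ai_crawler_access(SAMPLE_ROBOTS, "https://example.com/blog/post")
for agent, allowed in access.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Parsing the file from a string keeps the sketch offline; against a live site, the same parser can load the file with `set_url()` and `read()`.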

AI platform testing: Each site was queried on ChatGPT, Perplexity, and Claude with questions the site should answer. Two outcomes were tracked per query: visibility (the AI mentions the brand) and citability (the AI includes a URL as a source).
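The two tracked outcomes can be turned into percentages with a simple tally. The sketch below assumes each query is recorded as one row with two booleans; the `QueryResult` type, the `rates` helper, and the sample rows are hypothetical and are not benchmark data.

```python
from dataclasses import dataclass


@dataclass
class QueryResult:
    platform: str          # e.g. "ChatGPT", "Perplexity", "Claude"
    brand_mentioned: bool  # visibility: the AI named the brand
    url_cited: bool        # citability: the AI included a URL as a source


def rates(results: list[QueryResult]) -> tuple[float, float]:
    """Return (visibility %, citability %) across all recorded queries."""
    if not results:
        raise ValueError("no query results")
    total = len(results)
    visible = sum(r.brand_mentioned for r in results)
    cited = sum(r.url_cited for r in results)
    return 100 * visible / total, 100 * cited / total


# Made-up outcomes for illustration only.
sample = [
    QueryResult("ChatGPT", True, False),
    QueryResult("Perplexity", True, True),
    QueryResult("Claude", False, False),
    QueryResult("Claude", True, False),
]
visibility_pct, citability_pct = rates(sample)
print(f"visible: {visibility_pct:.0f}%, cited: {citability_pct:.0f}%")
```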

Infrastructure Results

Site	DA	robots.txt	sitemap	Answer-First	Freshness	JSON-LD	Meta Desc	Canonical	HTTPS	Headings	OG Tags	Score
ahrefs.com	92	Pass	Pass	Pass	Pass	Pass	Pass	Pass	Pass	Pass	Pass	10/10
semrush.com	91	Pass	Pass	Pass	Pass	Pass	Pass	Pass	Pass	Pass	Pass	10/10
chudi.dev	28	Pass	Pass	Pass	Pass	Pass	Pass	Pass	Pass	Pass	Pass	10/10
reddit.com	97	Partial	Pass	Fail	Fail	Fail	Pass	Pass	Pass	Pass	Pass	7/10
medium.com	95	Pass	Partial	Fail	Partial	Fail	Pass	Pass	Pass	Pass	Pass	7/10
x.com	96	Pass	Partial	Fail	Fail	Fail	Partial	Pass	Pass	Fail	Pass	5/10

The three highest-DA sites scored lowest on infrastructure. Reddit, Medium, and X failed on the signals most relevant to AI extraction: answer-first content, structured data, and content freshness.

AI Platform Results

Site	AI Visible	AI Cited	Visibility-Citation Gap
ahrefs.com	100%	5%	95 points
semrush.com	Partial	Partial	Moderate
chudi.dev	29%	0%	29 points
reddit.com	Untested (infra fail)	Untested	N/A
medium.com	Untested (infra fail)	Untested	N/A
x.com	Untested (infra fail)	Untested	N/A

The visibility-citation gap is the central finding. Ahrefs has 100% AI visibility, meaning every AI platform recognizes the brand when asked. But only 5% of queries produced a cited URL. The AI knows Ahrefs exists from training data. It does not need to cite the source.
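The gap is the difference between the two percentages, expressed in percentage points. A minimal sketch, using only the two sites with numeric values in the table; the function name `visibility_citation_gap` is ours, not part of the framework.

```python
def visibility_citation_gap(visible_pct: float, cited_pct: float) -> float:
    """Gap in percentage points between brand mentions and cited URLs."""
    return visible_pct - cited_pct


# (visible %, cited %) as reported in the AI platform results table.
reported = {"ahrefs.com": (100, 5), "chudi.dev": (29, 0)}

gaps = {site: visibility_citation_gap(v, c) for site, (v, c) in reported.items()}
for site, gap in gaps.items():
    print(f"{site}: {gap} points")
```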

Check-by-Check Analysis

Highest Pass Rate: HTTPS, Canonical, OG Tags

These three checks passed on every site. They are baseline web standards adopted universally. Passing them is necessary but not differentiating for AI visibility.

Lowest Pass Rate: Answer-First Content, JSON-LD, Freshness

These three checks had the lowest pass rates and the strongest correlation with AI citability. Sites that failed these checks were either not testable for AI citability or showed 0% citation rates.
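The pass rates behind both groups can be recomputed from the infrastructure results table. The sketch below encodes each row as a ten-character string and counts only full passes per check; treating Partial as a non-pass is an assumption made here, since the scoring rule for partial results is not spelled out.

```python
CHECKS = ["robots.txt", "sitemap", "Answer-First", "Freshness", "JSON-LD",
          "Meta Desc", "Canonical", "HTTPS", "Headings", "OG Tags"]

# Rows copied from the infrastructure table: P = Pass, ~ = Partial, F = Fail.
RESULTS = {
    "ahrefs.com":  "PPPPPPPPPP",
    "semrush.com": "PPPPPPPPPP",
    "chudi.dev":   "PPPPPPPPPP",
    "reddit.com":  "~PFFFPPPPP",
    "medium.com":  "P~F~FPPPPP",
    "x.com":       "P~FFF~PPFP",
}

# Number of sites with a full Pass on each check (Partial is not counted).
pass_counts = {
    check: sum(row[i] == "P" for row in RESULTS.values())
    for i, check in enumerate(CHECKS)
}

all_pass = [c for c, n in pass_counts.items() if n == len(RESULTS)]
fewest = min(pass_counts.values())
lowest = [c for c, n in pass_counts.items() if n == fewest]

print("Passed on every site:", all_pass)
print(f"Fewest passes ({fewest}/{len(RESULTS)}):", lowest)
```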

Answer-first content was the sharpest differentiator. Reddit, Medium, and X all fail because their page layouts prioritize navigation, user interactions, and dynamic content over direct answers. The pages are designed for human browsing, not AI extraction.

JSON-LD structured data was missing entirely on Reddit, Medium, and X. Ahrefs and Semrush both had comprehensive schema markup across their content pages. chudi.dev had 9 schema types including TechArticle, FAQPage, and Person with expertise signals.

Content freshness via dateModified schema was present on Ahrefs and Semrush blogs but absent on social platforms. Reddit and X content has timestamps but not in schema markup that AI crawlers parse.
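To make the distinction concrete, here is a rough sketch of how a freshness check could look for `dateModified` in a page's JSON-LD, using only the Python standard library. The `JsonLdExtractor` class, the `find_date_modified` helper, and the sample page (headline and dates included) are hypothetical; the actual scan may work differently, and this version does not descend into nested structures such as `@graph`.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect parsed objects from <script type="application/ld+json"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._buffer: list[str] = []
        self.blocks: list[dict] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                parsed = json.loads("".join(self._buffer))
            except json.JSONDecodeError:
                return  # skip malformed blocks
            # A block may hold a single object or a list of objects.
            items = parsed if isinstance(parsed, list) else [parsed]
            self.blocks.extend(i for i in items if isinstance(i, dict))


def find_date_modified(html: str) -> list[str]:
    """Return every top-level dateModified value found in the page's JSON-LD."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    return [b["dateModified"] for b in extractor.blocks if "dateModified" in b]


# Hypothetical article markup; headline and dates are placeholders.
article = {
    "@context": "https://schema.org",
    "@type": "TechArticle",
    "headline": "Example article",
    "datePublished": "2025-01-10",
    "dateModified": "2025-03-02",
}
with_schema = (
    '<html><head><script type="application/ld+json">'
    + json.dumps(article)
    + "</script></head><body></body></html>"
)
# A visible timestamp without schema markup, as on a social feed.
without_schema = "<html><body><time>2 hours ago</time></body></html>"

print(find_date_modified(with_schema))
print(find_date_modified(without_schema))
```

The second page carries a human-readable timestamp but yields no `dateModified`, which is the situation described for Reddit and X.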

What This Means for the Industry

The benchmark reveals a split between two types of AI citation:

Training data citations - Reddit, Medium, and X get cited because massive volumes of their content exist in AI training data. AI platforms have internalized their content and can reference it without needing infrastructure signals. This only works at internet-scale volumes.

Infrastructure-driven citations - For everyone else, AI crawlers need to discover, parse, and evaluate your content through standard web infrastructure. The 10 checks we test are the minimum viable infrastructure for this discovery path.

If your site does not have internet-scale content volumes (and most do not), the infrastructure path is your only option. The good news is that the infrastructure is fixable. The checks are specific and verifiable, and the fixes are well documented.

Run Your Own Benchmark

Every data point in this benchmark came from the same scan you can run on your own site. Start a free scan to see which of the 10 checks your site passes and which it fails.

The full methodology, including evidence tiers and source documentation for each check, is available on our methodology page. For why entity authority sets the ceiling on these benchmark numbers (it is the slowest pillar to move and the highest-impact one when it does), see Chudi's Entity Mention Graph and AI Visibility on chudi.dev.