AUTOMATIONSWITCH

Open Dataset · SARI v0.2

SARI Publisher Audit Dataset (Wave 1)

Per-site agent-readability scores for 103 publisher websites audited against the Scaletific Agent-Readability Index. Sortable in-page, downloadable as CSV, citable under CC-BY-4.0. Sourced from the same audit pipeline that produced the companion deep dive and visual findings.

103 publisher sites
54 median SARI score
45.8 mean SARI score
6 have llms.txt
8 differentiated AI policy

Audit run window: 2026-05-08 to 2026-05-09 · License: CC-BY-4.0 · Last updated:

What's in the dataset

One row per publisher site, 25 columns per row. Category scores are integers (Discovery, AI Bot Policy) or one-decimal floats (the three averaged categories: Article Structure, Identity Attribution, and Content Addressability). Boolean signals are true / false. Per-bot directives are true when the directive is present in the parsed robots.txt; the policy_class derived field summarises the per-bot pattern.

Identity

  • site · apex domain
  • name · publisher name
  • cohort · editorial cohort label
  • tier · 1=top, 2=mid, 3=long-tail/niche
  • confidence · high / medium / low

Scores

  • score_total · 0–100
  • score_discovery · 0–25
  • score_article_structure · 0–30
  • score_identity_attribution · 0–20
  • score_content_addressability · 0–15
  • score_ai_bot_policy · 0–10
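
The category maxima listed above add up to 100, which suggests that score_total is the sum of the five category scores. The leaderboard rows are consistent with that reading, but it is an inference rather than a documented rule. A minimal Python sketch of a per-row sanity check under that assumption follows; the function name, the tolerance, and the mapping of the leaderboard's D, A, I, C, P columns onto the score_* fields are illustrative choices, not part of the dataset.

```python
# Category maxima as listed in the schema (score_* columns).
CATEGORY_MAX = {
    "score_discovery": 25,
    "score_article_structure": 30,
    "score_identity_attribution": 20,
    "score_content_addressability": 15,
    "score_ai_bot_policy": 10,
}


def check_scores(row: dict, tolerance: float = 0.05) -> list:
    """Return a list of problems found in one row's score fields.

    Assumption: score_total equals the sum of the five category scores,
    up to one-decimal rounding.
    """
    problems = []
    for column, maximum in CATEGORY_MAX.items():
        if not 0 <= row[column] <= maximum:
            problems.append(f"{column}={row[column]} is outside 0-{maximum}")
    expected = sum(row[column] for column in CATEGORY_MAX)
    if abs(expected - row["score_total"]) > tolerance:
        problems.append(
            f"score_total={row['score_total']} but categories sum to {expected:.1f}"
        )
    return problems


# Values copied from the leaderboard row for pocket-lint.com.
pocket_lint = {
    "score_total": 79.7,
    "score_discovery": 20,
    "score_article_structure": 30.0,
    "score_identity_attribution": 15.0,
    "score_content_addressability": 11.7,
    "score_ai_bot_policy": 3,
}

print(check_scores(pocket_lint))  # an empty list means no problems were found
```

The tolerance absorbs floating-point noise when one-decimal values are summed; a stricter check could compare values rounded to one decimal place instead.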

Discovery

  • has_llms_txt · boolean
  • has_sitemap · boolean
  • has_mcp_well_known · boolean
  • ai_bot_directive_count · integer

Per-bot

  • blocks_GPTBot · boolean
  • allows_GPTBot · boolean
  • blocks_ClaudeBot · boolean
  • blocks_Google_Extended · boolean
  • blocks_PerplexityBot · boolean
  • addresses_OAI_SearchBot · boolean
  • allows_OAI_SearchBot · boolean
  • policy_class · derived enum
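
When loading the CSV, the column types described above can be restored with a small coercion step. The sketch below is one way to do it in Python, assuming the CSV header uses the field names listed here and that booleans are serialised as the strings true and false. The sample row is hypothetical (example.com is not in the dataset, and its policy_class value is a placeholder), and only a subset of the columns is shown.

```python
import csv
import io

# Column typing taken from the schema lists; anything else stays a string.
INT_COLUMNS = {"tier", "score_discovery", "score_ai_bot_policy", "ai_bot_directive_count"}
FLOAT_COLUMNS = {
    "score_total",
    "score_article_structure",
    "score_identity_attribution",
    "score_content_addressability",
}
BOOL_PREFIXES = ("has_", "blocks_", "allows_", "addresses_")


def coerce(column: str, value: str):
    """Convert one CSV cell to int, float, bool, or leave it as a string."""
    if column in INT_COLUMNS:
        return int(value)
    if column in FLOAT_COLUMNS:
        return float(value)
    if column.startswith(BOOL_PREFIXES):
        lowered = value.strip().lower()
        if lowered not in ("true", "false"):
            raise ValueError(f"{column}: expected true/false, got {value!r}")
        return lowered == "true"
    return value


def load_rows(handle) -> list:
    """Read CSV text from a file-like object into a list of typed dicts."""
    return [
        {column: coerce(column, value) for column, value in record.items()}
        for record in csv.DictReader(handle)
    ]


# Hypothetical sample, NOT a row from the dataset.
SAMPLE_CSV = (
    "site,tier,score_total,score_discovery,has_llms_txt,blocks_GPTBot,policy_class\n"
    "example.com,2,41.5,10,false,true,partial\n"
)

rows = load_rows(io.StringIO(SAMPLE_CSV))
print(rows[0])
```

For the real file, the same function can be given an open file handle, for example `load_rows(open(path, newline="", encoding="utf-8"))`, where `path` points at the downloaded CSV.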

Full leaderboard (103 rows)

Sorted by total SARI score, descending. Each row has a stable anchor link (e.g. #wired-com) so individual records can be referenced in citations or shared in research. Rows tagged low confidence were excluded from cohort medians because fewer than three articles could be sampled.
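
The one anchor shown (#wired-com for wired.com) suggests that anchors are formed by replacing the dots in the apex domain with hyphens. The Python sketch below builds a citable record link on that assumption; the rule is inferred from a single example and may not hold for every row, and the function names are illustrative. The base URL is the one given in the suggested citation.

```python
# Dataset page URL as given in the suggested citation.
DATASET_URL = "https://automationswitch.com/research/agent-legibility-audit/dataset"


def row_anchor(site: str) -> str:
    """Assumed rule: lower-case the apex domain and replace dots with hyphens."""
    return "#" + site.strip().lower().replace(".", "-")


def record_url(site: str) -> str:
    """Link to one publisher's row on the dataset page."""
    return DATASET_URL + row_anchor(site)


print(row_anchor("wired.com"))  # → #wired-com
print(record_url("wired.com"))
```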

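Column key: D = Discovery (0–25), A = Article Structure (0–30), I = Identity Attribution (0–20), C = Content Addressability (0–15), P = AI Bot Policy (0–10).

| # | Name | Site | Cohort | Total | D | A | I | C | P | llms | mcp | AI policy | Conf. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Polygon | polygon.com | culture-entertainment | 81.0 | 20 | 30.0 | 15.0 | 10.0 | 6 |  |  | Blanket | high |
| 2 | Pocket-lint | pocket-lint.com | reviews-service | 79.7 | 20 | 30.0 | 15.0 | 11.7 | 3 |  |  | Partial | high |
| 3 | Seeking Alpha | seekingalpha.com | business-finance | 75.0 | 15 | 29.0 | 15.0 | 10.0 | 6 |  |  | Blanket | high |
| 4 | The Verge | theverge.com | tech | 75.0 | 10 | 30.0 | 15.0 | 10.0 | 10 |  |  | Differentiated | high |
| 5 | Eater | eater.com | vertical-food | 74.0 | 10 | 29.0 | 15.0 | 10.0 | 10 |  |  | Differentiated | high |
| 6 | Bloomberg | bloomberg.com | top-tier-news | 73.0 | 15 | 27.0 | 10.0 | 15.0 | 6 |  |  | Blanket | high |
| 7 | Vox | vox.com | top-tier-news | 73.0 | 10 | 28.0 | 15.0 | 10.0 | 10 |  |  | Differentiated | high |
| 8 | Marketing Brew | marketingbrew.com | vertical-marketing | 71.0 | 20 | 30.0 | 5.0 | 10.0 | 6 |  |  | Blanket | high |
| 9 | Morning Brew | morningbrew.com | newsletter-hybrid | 71.0 | 20 | 30.0 | 5.0 | 10.0 | 6 |  |  | Blanket | high |
| 10 | ZDNet | zdnet.com | tech | 71.0 | 10 | 30.0 | 15.0 | 10.0 | 6 |  |  | Blanket | high |
| 11 | CNET | cnet.com | tech | 70.0 | 10 | 29.0 | 15.0 | 10.0 | 6 |  |  | Blanket | high |
| 12 | Vulture | vulture.com | culture-entertainment | 70.0 | 10 | 30.0 | 10.0 | 10.0 | 10 |  |  | Differentiated | high |
| 13 | CNBC | cnbc.com | business-finance | 69.0 | 20 | 23.0 | 10.0 | 10.0 | 6 |  |  | Blanket | high |
| 14 | Travel + Leisure | travelandleisure.com | vertical-travel | 69.0 | 10 | 28.0 | 15.0 | 10.0 | 6 |  |  | Blanket | high |
| 15 | Variety | variety.com | culture-entertainment | 69.0 | 10 | 28.0 | 10.0 | 15.0 | 6 |  |  | Blanket | high |
| 16 | NPR | npr.org | top-tier-news | 68.5 | 20 | 27.5 | 5.0 | 10.0 | 6 |  |  | Blanket | medium |
| 17 | Rest of World | restofworld.org | indie-longform | 68.3 | 10 | 24.0 | 15.0 | 13.3 | 6 |  |  | Blanket | high |
| 18 | The Register | theregister.com | tech | 68.3 | 10 | 28.3 | 10.0 | 10.0 | 10 |  |  | Differentiated | high |
| 19 | 404 Media | 404media.co | indie-longform | 67.0 | 10 | 29.0 | 15.0 | 10.0 | 3 |  |  | Partial | high |
| 20 | Lonely Planet | lonelyplanet.com | vertical-travel | 67.0 | 10 | 29.0 | 15.0 | 10.0 | 3 |  |  | Partial | high |
| 21 | Scientific American | scientificamerican.com | vertical-science | 67.0 | 10 | 26.0 | 15.0 | 10.0 | 6 |  |  | Blanket | high |
| 22 | The Guardian | theguardian.com | top-tier-news | 66.0 | 10 | 30.0 | 10.0 | 10.0 | 6 |  |  | Blanket | high |
| 23 | VentureBeat | venturebeat.com | tech | 64.0 | 10 | 29.0 | 5.0 | 10.0 | 10 |  |  | Differentiated | high |
| 24 | Adweek | adweek.com | industry-trade | 63.0 | 10 | 30.0 | 10.0 | 10.0 | 3 |  |  | Partial | high |
| 25 | Associated Press | apnews.com | top-tier-news | 62.7 | 10 | 30.0 | 6.7 | 10.0 | 6 |  |  | Blanket | high |
| 26 | Tom's Guide | tomsguide.com | reviews-service | 62.3 | 5 | 29.0 | 15.0 | 13.3 | 0 |  |  | Silent | high |
| 27 | Inc. | inc.com | business-finance | 62.0 | 10 | 24.0 | 15.0 | 10.0 | 3 |  |  | Partial | high |
| 28 | Modern Healthcare | modernhealthcare.com | industry-trade | 62.0 | 10 | 26.0 | 5.0 | 15.0 | 6 |  |  | Blanket | high |
| 29 | The New York Times | nytimes.com | top-tier-news | 62.0 | 10 | 26.0 | 10.0 | 10.0 | 6 |  |  | Blanket | medium |
| 30 | Semafor | semafor.com | top-tier-news | 62.0 | 10 | 24.0 | 15.0 | 10.0 | 3 |  |  | Partial | high |
| 31 | The Athletic | theathletic.com | vertical-sports | 62.0 | 10 | 26.0 | 10.0 | 10.0 | 6 |  |  | Blanket | medium |
| 32 | Bon Appétit | bonappetit.com | vertical-food | 61.0 | 10 | 30.0 | 5.0 | 10.0 | 6 |  |  | Blanket | high |
| 33 | Condé Nast Traveler | cntraveler.com | vertical-travel | 61.0 | 10 | 30.0 | 5.0 | 10.0 | 6 |  |  | Blanket | high |
| 34 | MedPage Today | medpagetoday.com | vertical-health | 61.0 | 10 | 30.0 | 5.0 | 10.0 | 6 |  |  | Blanket | high |
| 35 | The New Yorker | newyorker.com | top-tier-news | 61.0 | 10 | 30.0 | 5.0 | 10.0 | 6 |  |  | Blanket | high |
| 36 | Pitchfork | pitchfork.com | culture-entertainment | 61.0 | 10 | 30.0 | 5.0 | 10.0 | 6 |  |  | Blanket | high |
| 37 | MIT Technology Review | technologyreview.com | tech | 61.0 | 10 | 30.0 | 5.0 | 10.0 | 6 |  |  | Blanket | high |
| 38 | Wired | wired.com | tech | 61.0 | 10 | 30.0 | 5.0 | 10.0 | 6 |  |  | Blanket | high |
| 39 | Financial Times | ft.com | top-tier-news | 60.7 | 10 | 24.7 | 10.0 | 10.0 | 6 |  |  | Blanket | high |
| 40 | The Washington Post | washingtonpost.com | top-tier-news | 60.5 | 10 | 24.5 | 10.0 | 10.0 | 6 |  |  | Blanket | medium |
| 41 | Fast Company | fastcompany.com | business-finance | 60.3 | 10 | 22.3 | 15.0 | 10.0 | 3 |  |  | Partial | high |
| 42 | DEV Community | dev.to | tech | 60.0 | 10 | 30.0 | 11.7 | 8.3 | 0 |  |  | Silent | high |
| 43 | Engadget | engadget.com | tech | 60.0 | 5 | 30.0 | 15.0 | 10.0 | 0 |  |  | Silent | high |
| 44 | The Hollywood Reporter | hollywoodreporter.com | culture-entertainment | 60.0 | 10 | 24.0 | 5.0 | 15.0 | 6 |  |  | Blanket | high |
| 45 | FiveThirtyEight | fivethirtyeight.com | indie-longform | 58.0 | 10 | 30.0 | 5.0 | 10.0 | 3 |  |  | Partial | high |
| 46 | Business Insider | businessinsider.com | business-finance | 57.3 | 10 | 26.0 | 5.0 | 13.3 | 3 |  |  | Partial | high |
| 47 | Politico | politico.com | top-tier-news | 57.0 | 10 | 26.0 | 5.0 | 10.0 | 6 |  |  | Blanket | high |
| 48 | Tom's Hardware | tomshardware.com | reviews-service | 57.0 | 5 | 27.0 | 15.0 | 10.0 | 0 |  |  | Silent | high |
| 49 | Fortune | fortune.com | business-finance | 56.0 | 10 | 23.0 | 10.0 | 10.0 | 3 |  |  | Partial | high |
| 50 | InfoQ | infoq.com | tech | 55.0 | 5 | 30.0 | 10.0 | 10.0 | 0 |  |  | Silent | high |
| 51 | Moz Blog | moz.com | vertical-marketing | 54.0 | 10 | 26.0 | 5.0 | 10.0 | 3 |  |  | Partial | high |
| 52 | Wirecutter | wirecutter.com | reviews-service | 54.0 | 5 | 29.0 | 10.0 | 10.0 | 0 |  |  | Silent | high |
| 53 | BBC News | bbc.com | top-tier-news | 52.0 | 10 | 21.0 | 5.0 | 10.0 | 6 |  |  | Blanket | high |
| 54 | The Information | theinformation.com | tech | 51.9 | 15 | 17.3 | 8.3 | 8.3 | 3 |  |  | Partial | high |
| 55 | Forbes | forbes.com | business-finance | 50.7 | 10 | 19.0 | 5.0 | 6.7 | 10 |  |  | Differentiated | high |
| 56 | CNN | cnn.com | top-tier-news | 50.6 | 10 | 16.3 | 8.3 | 10.0 | 6 |  |  | Blanket | high |
| 57 | STAT News | statnews.com | vertical-health | 47.3 | 10 | 19.3 | 5.0 | 10.0 | 3 |  |  | Partial | high |
| 58 | Reuters | reuters.com | top-tier-news | 46.6 | 10 | 17.3 | 3.3 | 10.0 | 6 |  |  | Blanket | high |
| 59 | MarketWatch | marketwatch.com | business-finance | 46.0 | 5 | 26.0 | 10.0 | 5.0 | 0 |  |  | Silent | high |
| 60 | Puck News | puck.news | top-tier-news | 41.0 | 5 | 21.0 | 5.0 | 10.0 | 0 |  |  | Silent | high |
| 61 | Search Engine Journal | searchenginejournal.com | vertical-marketing | 41.0 | 5 | 21.0 | 5.0 | 10.0 | 0 |  |  | Silent | high |
| 62 | National Geographic | nationalgeographic.com | vertical-science | 40.0 | 5 | 15.0 | 5.0 | 15.0 | 0 |  |  | Silent | high |
| 63 | Trusted Reviews | trustedreviews.com | reviews-service | 40.0 | 10 | 10.7 | 3.3 | 10.0 | 6 |  |  | Blanket | high |
| 64 | Epicurious | epicurious.com | vertical-food | 39.3 | 10 | 10.0 | 3.3 | 10.0 | 6 |  |  | Blanket | high |
| 65 | MarTech | martech.org | vertical-marketing | 37.7 | 5 | 17.7 | 5.0 | 10.0 | 0 |  |  | Silent | high |
| 66 | Serious Eats | seriouseats.com | vertical-food | 36.0 | 10 | 0.0 | 5.0 | 15.0 | 6 |  |  | Blanket | medium |
| 67 | 9to5Mac | 9to5mac.com | tech | 35.0 | 5 | 15.0 | 5.0 | 10.0 | 0 |  |  | Silent | high |
| 68 | Nature | nature.com | vertical-science | 32.7 | 10 | 0.0 | 5.0 | 11.7 | 6 |  |  | Blanket | high |
| 69 | Ars Technica | arstechnica.com | tech | 31.0 | 10 | 0.0 | 5.0 | 10.0 | 6 |  |  | Blanket | high |
| 70 | The Economist | economist.com | top-tier-news | 31.0 | 10 | 0.0 | 5.0 | 10.0 | 6 |  |  | Blanket | high |
| 71 | The Conversation | theconversation.com | indie-longform | 31.0 | 10 | 0.0 | 5.0 | 10.0 | 6 |  |  | Blanket | high |
| 72 | Kottke | kottke.org | indie-longform | 30.0 | 10 | 0.0 | 5.0 | 5.0 | 10 |  |  | Differentiated | high |
| 73 | The Atlantic | theatlantic.com | top-tier-news | 30.0 | 10 | 0.0 | 0.0 | 10.0 | 10 |  |  | Blanket | medium |
| 74 | Medium (platform) | medium.com | platform | 29.7 | 10 | 0.0 | 5.0 | 11.7 | 3 |  |  | Partial | high |
| 75 | Aeon | aeon.co | indie-longform | 28.0 | 10 | 0.0 | 5.0 | 10.0 | 3 |  |  | Partial | high |
| 76 | Consumer Reports | consumerreports.org | reviews-service | 28.0 | 10 | 0.0 | 5.0 | 10.0 | 3 |  |  | Partial | high |
| 77 | Healthline | healthline.com | vertical-health | 28.0 | 15 | 0.0 | 0.0 | 10.0 | 3 |  |  | Partial | high |
| 78 | TechCrunch | techcrunch.com | tech | 28.0 | 10 | 0.0 | 1.7 | 13.3 | 3 |  |  | Partial | high |
| 79 | PCMag | pcmag.com | reviews-service | 27.7 | 10 | 0.0 | 1.7 | 10.0 | 6 |  |  | Blanket | high |
| 80 | Digital Trends | digitaltrends.com | reviews-service | 26.0 | 10 | 0.0 | 0.0 | 10.0 | 6 |  |  | Blanket | high |
| 81 | Axios | axios.com | top-tier-news | 23.0 | 10 | 0.0 | 0.0 | 10.0 | 3 |  |  | Partial | high |
| 82 | Food52 | food52.com | vertical-food | 20.0 | 5 | 0.0 | 5.0 | 10.0 | 0 |  |  | Silent | high |
| 83 | Ghost (platform) | ghost.org | platform | 20.0 | 5 | 0.0 | 5.0 | 10.0 | 0 |  |  | Silent | high |
| 84 | Mayo Clinic | mayoclinic.org | vertical-health | 20.0 | 5 | 0.0 | 5.0 | 10.0 | 0 |  |  | Silent | high |
| 85 | The Pudding | pudding.cool | indie-longform | 20.0 | 5 | 0.0 | 5.0 | 10.0 | 0 |  |  | Silent | high |
| 86 | Quanta Magazine | quantamagazine.org | vertical-science | 20.0 | 5 | 0.0 | 5.0 | 10.0 | 0 |  |  | Silent | high |
| 87 | The Wall Street Journal | wsj.com | top-tier-news | 20.0 | 5 | 0.0 | 5.0 | 10.0 | 0 |  |  | Silent | high |
| 88 | The Hustle | thehustle.co | newsletter-hybrid | 18.7 | 5 | 7.0 | 0.0 | 6.7 | 0 |  |  | Silent | high |
| 89 | Longreads | longreads.com | indie-longform | 18.0 | 10 | 0.0 | 0.0 | 5.0 | 3 |  |  | Partial | high |
| 90 | Substack (platform) | substack.com | platform | 16.7 | 5 | 0.0 | 5.0 | 6.7 | 0 |  |  | Silent | high |
| 91 | Quartz | qz.com | business-finance | 15.0 | 5 | 0.0 | 5.0 | 5.0 | 0 |  |  | Silent | high |
| 92 | Rtings | rtings.com | reviews-service | 15.0 | 5 | 0.0 | 5.0 | 5.0 | 0 |  |  | Silent | high |
| 93 | Smithsonian Magazine | smithsonianmag.com | vertical-science | 15.0 | 5 | 0.0 | 5.0 | 5.0 | 0 |  |  | Silent | high |
| 94 | ESPN | espn.com | vertical-sports | 13.0 | 10 | 0.0 | 0.0 | 0.0 | 3 |  |  | Partial | low |
| 95 | Gizmodo | gizmodo.com | tech | 13.0 | 10 | 0.0 | 0.0 | 0.0 | 3 |  |  | Partial | low |
| 96 | Daring Fireball | daringfireball.net | indie-longform | 8.0 | 5 | 0.0 | 0.0 | 0.0 | 3 |  |  | Partial | low |
| 97 | Harvard Business Review | hbr.org | business-finance | 8.0 | 5 | 0.0 | 0.0 | 0.0 | 3 |  |  | Partial | low |
| 98 | WebMD | webmd.com | vertical-health | 8.0 | 5 | 0.0 | 0.0 | 0.0 | 3 |  |  | Partial | low |
| 99 | Defector | defector.com | culture-entertainment | 5.0 | 5 | 0.0 | 0.0 | 0.0 | 0 |  |  | Silent | low |
| 100 | Notebookcheck | notebookcheck.net | reviews-service | 5.0 | 5 | 0.0 | 0.0 | 0.0 | 0 |  |  | Silent | low |
| 101 | Search Engine Land | searchengineland.com | vertical-marketing | 5.0 | 5 | 0.0 | 0.0 | 0.0 | 0 |  |  | Silent | low |
| 102 | The Ringer | theringer.com | vertical-sports | 5.0 | 5 | 0.0 | 0.0 | 0.0 | 0 |  |  | Silent | low |
| 103 | Time | time.com | top-tier-news | 5.0 | 5 | 0.0 | 0.0 | 0.0 | 0 |  |  | Silent | low |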
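
The exclusion of low-confidence rows from cohort medians can be illustrated with a short Python sketch. It uses the three vertical-sports rows from the leaderboard; the function name and tuple layout are illustrative, and the medians it prints are computed here from those rows rather than taken from a published figure.

```python
from collections import defaultdict
from statistics import median

# (site, cohort, score_total, confidence), copied from the leaderboard.
ROWS = [
    ("theathletic.com", "vertical-sports", 62.0, "medium"),
    ("espn.com", "vertical-sports", 13.0, "low"),
    ("theringer.com", "vertical-sports", 5.0, "low"),
]


def cohort_medians(rows, exclude_low: bool = True) -> dict:
    """Median score_total per cohort, optionally skipping low-confidence rows."""
    by_cohort = defaultdict(list)
    for _site, cohort, total, confidence in rows:
        if exclude_low and confidence == "low":
            continue
        by_cohort[cohort].append(total)
    return {cohort: median(scores) for cohort, scores in by_cohort.items()}


print(cohort_medians(ROWS))                     # → {'vertical-sports': 62.0}
print(cohort_medians(ROWS, exclude_low=False))  # → {'vertical-sports': 13.0}
```

The gap between the two results shows why the filter matters: rows that score near zero because too few articles could be sampled would otherwise pull a small cohort's median down sharply.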

Reuse and citation

Released under CC-BY-4.0. Reuse with attribution. Suggested citation:

Nouriel, M. (2026). SARI Publisher Audit Dataset (Wave 1) [Data set]. Automation Switch. https://automationswitch.com/research/agent-legibility-audit/dataset

For the methodology, the audit script, and the editorial framing, see the companion deep dive. For visual summaries, see the infographic. For other waves and future research, see the research hub.