WHY RSS PUNCHES ABOVE ITS WEIGHT.
RSS is the most-consumed endpoint by AI fetchers - roughly 40% of all logged AI requests across the 47-site research network. It punches above its weight for two reasons: it is structured, predictable, and cheap to parse (so bots prefer it over HTML for discovering changes); and it is chronological, so it gives bots a clean view of what is new without crawling the whole site.
The corollary: a site without a feed gets crawled less. A site with a stripped or broken feed gets crawled even less. The default RSS configuration on most CMSes is stripped, costing you visibility you did not realise you were leaving on the table.
FOUR SIGNALS.
An enriched feed includes all four:
- Full content in <content:encoded> CDATA blocks. Not just titles, not just excerpts: the entire article body, as formatted HTML.
- Accurate timestamps in RFC-822 format (Mon, 27 Apr 2026 09:00:00 +0000). Both pubDate on each item and lastBuildDate at the channel level.
- Discoverability: a <link rel="alternate" type="application/rss+xml"> in the HTML <head> of every relevant page, and an <atom:link rel="self"> inside the feed itself.
- Robots.txt access: the feed path is not blocked. Sounds obvious; broken on enough sites to be worth checking.
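Taken together, a feed carrying the first three signals might look like the fragment below (the fourth, robots.txt access, lives outside the feed). The domain, slug, and dates are placeholders:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Your Site</title>
    <link>https://yoursite.com/</link>
    <description>Example enriched feed</description>
    <lastBuildDate>Mon, 27 Apr 2026 09:05:00 +0000</lastBuildDate>
    <atom:link rel="self" type="application/rss+xml"
               href="https://yoursite.com/feed.xml"/>
    <item>
      <title>Example post</title>
      <link>https://yoursite.com/example-post</link>
      <guid>https://yoursite.com/example-post</guid>
      <pubDate>Mon, 27 Apr 2026 09:00:00 +0000</pubDate>
      <content:encoded><![CDATA[
        <p>The entire article body, as formatted HTML.</p>
      ]]></content:encoded>
    </item>
  </channel>
</rss>
```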
WHAT CMS DEFAULTS BREAK.
The five most common feed defects from audits:
- Titles only: the feed contains <title> + <link> + <description> (one-line summary), no <content:encoded>. Bots can discover the URL but cannot extract content from the feed itself.
- Excerpt-only: <description> contains the first 200 characters with "..." appended. Same problem at a smaller scale.
- Broken XML: unescaped & or invalid CDATA boundaries. Validators reject the feed; bots may skip it entirely.
- Wrong dates: pubDate is the build time of the feed, not the article's publication time. This makes the entire feed look perpetually new, which AI bots can detect and discount.
- Missing self-link: <atom:link rel="self" href="https://yoursite.com/feed.xml"/> is absent. Some validators flag it; some bots ignore the feed.
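Most of these defects can be caught mechanically. Here is a sketch of an audit function using only the Python standard library; the 500-character threshold and the message strings are illustrative choices, not part of any spec, and the wrong-dates defect can only be fully caught by comparing against the article itself:

```python
import email.utils
import xml.etree.ElementTree as ET

CONTENT = "{http://purl.org/rss/1.0/modules/content/}encoded"
ATOM_LINK = "{http://www.w3.org/2005/Atom}link"

def audit_feed(xml_text: str) -> list[str]:
    """Return a list of defect descriptions found in an RSS feed string."""
    try:
        root = ET.fromstring(xml_text)  # defect 3: broken XML fails here
    except ET.ParseError as exc:
        return [f"broken XML: {exc}"]
    channel = root.find("channel")
    if channel is None:
        return ["no <channel> element"]
    problems = []
    # defect 5: missing <atom:link rel="self">
    if not any(el.get("rel") == "self" for el in channel.findall(ATOM_LINK)):
        problems.append("missing <atom:link rel='self'>")
    for item in channel.findall("item"):
        title = item.findtext("title") or "?"
        # defects 1-2: no body, or a body too short to be a full article
        # (500 chars is an illustrative threshold, not a standard)
        body = item.findtext(CONTENT) or ""
        if len(body) < 500:
            problems.append(f"item '{title}': thin or missing <content:encoded>")
        # defect 4 needs the article for comparison; at minimum,
        # verify the pubDate parses as an RFC-822 date
        pub = item.findtext("pubDate") or ""
        try:
            email.utils.parsedate_to_datetime(pub)
        except (TypeError, ValueError):
            problems.append(f"item '{title}': bad pubDate '{pub}'")
    return problems
```

Run it against the raw feed XML; an empty list means none of the five defects were detected.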
FIXES BY PLATFORM.
The four most common platforms:
- WordPress: install an RSS plugin (e.g., RSS Editor) and toggle "full content" in its settings. Default themes often serve excerpts; the core flag is at Settings -> Reading -> "For each post in a feed, include" -> "Full text".
- Shopify: blog feeds default to titles + URL only. Use an app (Search & Discovery) or a custom Liquid template to add full content.
- Ghost: full content by default. Verify the feed validates and includes accurate timestamps. Most issues here are AI-paraphrased excerpts in the description field.
- Custom / static (Next.js, Astro, Eleventy, etc.): write a feed generator that emits the full body as HTML in <content:encoded>. The mishamanko.com feed builder is in this repo at scripts/build-feed.py as a reference.
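For the custom/static case, such a generator can be sketched with the standard library alone. This is an illustration, not the scripts/build-feed.py referenced above; the Post shape and field names are assumptions of this sketch:

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from email.utils import format_datetime  # emits RFC-822 dates, e.g. "Mon, 27 Apr 2026 09:00:00 +0000"
from xml.sax.saxutils import escape

@dataclass
class Post:
    title: str
    url: str
    published: datetime  # timezone-aware publication time, NOT build time
    body_html: str       # full article body, already rendered to HTML

def build_feed(site_title: str, site_url: str, feed_url: str, posts: list[Post]) -> str:
    items = []
    for p in sorted(posts, key=lambda p: p.published, reverse=True):
        # NOTE: a body containing "]]>" would close the CDATA section early;
        # production code must split or escape that sequence.
        items.append(
            "  <item>\n"
            f"    <title>{escape(p.title)}</title>\n"
            f"    <link>{escape(p.url)}</link>\n"
            f"    <guid>{escape(p.url)}</guid>\n"
            f"    <pubDate>{format_datetime(p.published)}</pubDate>\n"
            f"    <content:encoded><![CDATA[{p.body_html}]]></content:encoded>\n"
            "  </item>"
        )
    now = format_datetime(datetime.now(timezone.utc))
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<rss version="2.0"\n'
        '     xmlns:content="http://purl.org/rss/1.0/modules/content/"\n'
        '     xmlns:atom="http://www.w3.org/2005/Atom">\n'
        "<channel>\n"
        f"  <title>{escape(site_title)}</title>\n"
        f"  <link>{escape(site_url)}</link>\n"
        f"  <description>{escape(site_title)}</description>\n"
        f"  <lastBuildDate>{now}</lastBuildDate>\n"
        f'  <atom:link rel="self" type="application/rss+xml" href="{escape(feed_url)}"/>\n'
        + "\n".join(items) + "\n"
        "</channel>\n</rss>\n"
    )
```

Note the two deliberate choices: per-item pubDate comes from the article's own publication time, and only lastBuildDate reflects the build, which avoids the "perpetually new" defect above.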
VERIFICATION CHECKS.
After implementation, verify with three tests:
1. W3C feed validator (validator.w3.org/feed). Should return zero errors.
2. Curl the feed: curl -s https://yoursite.com/feed.xml | head -200. Verify the first item contains a <content:encoded> block with substantive HTML.
3. Check robots.txt and Cloudflare config: curl -A "GPTBot/1.0" https://yoursite.com/feed.xml should return 200, not 403.
THE BOTTOM LINE.
RSS is the highest leverage-per-hour intervention in the implementation playbook. Two to four hours of work to enrich a feed yields a measurable increase in AI bot traffic within weeks. Most sites have not done it. The competitive advantage is real but short-lived - assume the SERP catches up in 12-18 months, and ship the work now while it still differentiates.