Say Hello

CLAUDEBOT, CLAUDE-USER,
CLAUDE-SEARCHBOT.

Anthropic added Claude-SearchBot in early 2026, completing a three-bot architecture (training, on-demand, search index). This is the practitioner breakdown plus the deprecated identifiers people are still using by mistake.

Anthropic Bots: 3
Newest Bot: Early 2026
Deprecated UAs: 2
Robots.txt: Strict

ANTHROPIC'S 2026 UPDATE.

Anthropic added Claude-SearchBot in early 2026. The full active stack is now ClaudeBot (training), Claude-User (on-demand), and Claude-SearchBot (search index). The older identifiers Claude-Web and anthropic-ai are deprecated but still showing up in copy-pasted robots.txt files everywhere.

Anthropic's bots are notable for two things: strong robots.txt compliance (stricter than most) and a deliberate, low-aggression crawl pattern compared to OpenAI's bots and Perplexity. They visit fewer pages but tend to spend more time per page, and they retry less aggressively when they hit 4xx errors.

Finding 01.
The training crawler

CLAUDEBOT - TRAINING DATA.

Purpose: bulk crawling for Claude model training. Similar role to GPTBot.

Crawl pattern: slow, broad, predictable. Respects robots.txt strictly. Honors Crawl-delay directives more reliably than most other AI bots.

Correct policy: allow if comfortable with training-data inclusion, block if not. Same reasoning as GPTBot.
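If you want to opt out of training without touching the other two bots, the robots.txt groups look roughly like this. A minimal sketch: the token names are Anthropic's documented ones, and the Crawl-delay value is illustrative (ClaudeBot is the one that reliably honors it).

```
# Opt out of Claude training data; keep on-demand and search access.
User-agent: ClaudeBot
Disallow: /

# ClaudeBot honors Crawl-delay more reliably than most AI bots;
# the value here is illustrative, not a recommendation.
User-agent: ClaudeBot
Crawl-delay: 10

User-agent: Claude-User
Allow: /

User-agent: Claude-SearchBot
Allow: /
```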

Finding 02.
The on-demand fetcher

CLAUDE-USER - ON-DEMAND.

Purpose: fetches specific URLs when a Claude user asks Claude to read a particular page or follows a citation link. Per-request.

Crawl pattern: bursty, user-triggered. Tends to spend longer per page than ChatGPT-User; the model reads and reasons more thoroughly.

Correct policy: allow. Blocking Claude-User just produces 403s when users try to follow citations to your page.

Finding 03.
The new search-index crawler

CLAUDE-SEARCHBOT - CITATION INDEX.

Purpose: builds the index Claude uses for web search and citation. Added to the public bot list in early 2026.

Crawl pattern: aggressive on fresh URLs, similar to OAI-SearchBot. Hits RSS and sitemaps first, then HTML. Lower total volume than OpenAI's equivalent because Claude's web search has lower query volume than ChatGPT's.

Correct policy: allow. This is the bot whose visits correlate with Claude citations. Blocking it means giving up Claude's citation surface entirely.
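To see whether Claude-SearchBot visits actually precede your Claude citations, you first need to separate the three bots in your access logs. A hypothetical sketch: the bot token names are the documented ones, but the log format and matching-by-substring approach are simplifying assumptions, not a full user-agent parser.

```python
# Count visits per Anthropic bot in raw access-log lines by matching
# the user-agent token. The three token names do not overlap as
# substrings, so a plain containment check is sufficient here.
BOTS = ("ClaudeBot", "Claude-User", "Claude-SearchBot")

def count_bot_hits(log_lines):
    counts = {bot: 0 for bot in BOTS}
    for line in log_lines:
        for bot in BOTS:
            if bot in line:
                counts[bot] += 1
                break  # a line matches at most one bot token
    return counts
```

Run it over a day of logs and compare the Claude-SearchBot series against your citation data; the training bot's visits should not correlate with anything citation-shaped.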

CLAUDEBOT
Training. Slow, broad, strictly respects robots.txt and Crawl-delay. Block to opt out of training.
CLAUDE-USER
On-demand. User-triggered fetches. Tends to dwell longer per page than other on-demand bots.
CLAUDE-SEARCHBOT
Search index. Newest of the three. Drives Claude citations. Allow unless you actively want zero Claude visibility.

THE DEPRECATED IDENTIFIERS.

Two old Anthropic identifiers still show up in robots.txt files everywhere: Claude-Web and anthropic-ai. Neither appears on Anthropic's current bot list. Entries targeting them are harmless clutter, but they are a reliable sign of a config copied from an outdated template, and they do nothing to control the three active bots.
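The stale pattern and its replacement look roughly like this; an illustrative fragment, assuming the original intent was to opt out of training:

```
# Stale pattern still found in copied robots.txt files:
User-agent: Claude-Web
Disallow: /

User-agent: anthropic-ai
Disallow: /

# Current equivalent, if the intent was a training opt-out:
User-agent: ClaudeBot
Disallow: /
```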

PERPLEXITY SIDEBAR.

While we are on bot families: Perplexity also runs a two-tier model worth knowing. PerplexityBot handles bulk crawling for the Perplexity index; Perplexity-User handles on-demand fetches when users ask Perplexity to read a specific page.

Perplexity is the most aggressive of the major AI bots in raw request volume. Its bulk crawler hits feeds and sitemaps frequently and is the one most likely to trigger rate-limiting on smaller sites. Allow both; rate-limit gently if needed (no hard 429s on bot crawl bursts).
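"Rate-limit gently" can be made concrete at the web-server level. A hypothetical nginx sketch: the zone name, rate, and burst values are illustrative assumptions, and the key point is `burst` + `delay`, which queues excess requests instead of rejecting them with hard 429s.

```nginx
# Inside the http {} context. Requests with an empty key are not
# rate-limited, so only PerplexityBot traffic enters the zone.
map $http_user_agent $perplexity_bot {
    default          "";
    ~*PerplexityBot  $binary_remote_addr;
}

limit_req_zone $perplexity_bot zone=aibots:10m rate=2r/s;

server {
    location / {
        # burst+delay smooths crawl bursts rather than returning 429s
        limit_req zone=aibots burst=20 delay=10;
    }
}
```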

THE BOTTOM LINE.

Three Anthropic bots, two Perplexity bots, three OpenAI bots (covered separately). That is eight named user-agents to handle correctly in your robots.txt. The deprecated identifiers, Claude-Web and anthropic-ai, add two more to be aware of and remove. Audit your config against the current list; drop the deprecated entries; verify the policy on each of the active eight.
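The audit step above can be sketched mechanically. A simplified example, assuming the eight active tokens and two deprecated ones named in this piece; the parsing is a plain line scan of `User-agent:` groups, not a full robots.txt parser, so treat the output as a starting checklist.

```python
# Report which active AI user-agents lack an explicit User-agent group
# in a robots.txt body, and which deprecated tokens are still present.
ACTIVE = (
    "ClaudeBot", "Claude-User", "Claude-SearchBot",  # Anthropic
    "PerplexityBot", "Perplexity-User",              # Perplexity
    "GPTBot", "ChatGPT-User", "OAI-SearchBot",       # OpenAI
)
DEPRECATED = ("Claude-Web", "anthropic-ai")

def audit(robots_txt):
    agents = {
        line.split(":", 1)[1].strip().lower()
        for line in robots_txt.splitlines()
        if line.strip().lower().startswith("user-agent:")
    }
    missing = [ua for ua in ACTIVE if ua.lower() not in agents]
    stale = [ua for ua in DEPRECATED if ua.lower() in agents]
    return missing, stale
```

A bot with no explicit group falls back to your `User-agent: *` rules, which may or may not be what you intended for each of the eight.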

Stop Guessing What AI Sees

MEASURE THE LEVERS
THAT ACTUALLY EXIST.

If you want this methodology applied to your specific site - your real logs, your real citation data, your real fix list - the audit is the productized way to do it.