
VERIFYING REAL GPTBOT.

Most articles on AI bots skip the verification step. Spoofed user-agents are common - some scrapers pretend to be GPTBot to bypass rate limits, some adversaries pretend to be Googlebot to map your site. Here are the actual verification commands per bot.

Methods: 2 · Spoof Rate: Significant · Detection Time: Seconds · Source: Vendor Docs

WHY VERIFICATION MATTERS.

User-agent strings are trivially spoofed: anything can claim to be GPTBot in a single request header. Real verification happens at the network level, via forward-confirmed reverse DNS or a match against the operator's published IP ranges. Most published articles on AI bots skip this step entirely.
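
How low is the bar? One header is the entire claimed identity. A minimal demonstration (the user-agent string below approximates OpenAI's published one; the version number changes over time):

  # Any client can send this header. Nothing about it is verified.
  curl https://example.com/ \
    -A "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; GPTBot/1.2; +https://openai.com/gptbot"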

Practical reasons to verify: you want to grant special treatment to real AI bots (e.g., bypass rate-limits) without granting it to spoofers; you want to filter your bot logs for accurate measurement; you want to detect adversaries probing your site under a friendly disguise.

Finding 01.
Method 1

REVERSE DNS.

The cleanest method. Each major bot's IPs reverse-resolve to a hostname on the operator's domain; spoofed IPs do not. Always forward-confirm the result - resolve the returned hostname back to an IP and check it matches - because PTR records are set by whoever controls the address block.

  • GPTBot, OAI-SearchBot, ChatGPT-User reverse to *.openai.com
  • ClaudeBot, Claude-User, Claude-SearchBot reverse to *.anthropic.com
  • PerplexityBot, Perplexity-User reverse to *.perplexity.ai
  • Bingbot reverses to *.search.msn.com
  • Googlebot, Google-Extended reverse to *.googlebot.com or *.google.com
  • Bytespider reverses to *.bytedance.com

THE VERIFICATION COMMAND.

From any terminal, given an IP from your access log:
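
A minimal sketch using the standard host utility. The IP below is illustrative - substitute one from your own logs:

  # Step 1: reverse DNS. The PTR record for a real bot lands on the
  # operator's domain (see the list above).
  IP=20.171.206.4
  host "$IP"

  # Step 2: forward-confirm. Resolve the returned hostname back to an
  # address and check it matches the original IP. PTR records are set
  # by whoever controls the address block, so step 1 alone can be faked.
  NAME=$(host "$IP" | awk '/pointer/ {print $NF}' | sed 's/\.$//')
  host "$NAME" | grep -qwF "$IP" && echo "verified" || echo "FAILED"

If host is not installed, dig -x "$IP" +short does the same reverse lookup. The forward-confirmation step is the same procedure Google documents for verifying Googlebot.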

Finding 02.
Method 2

PUBLISHED IP RANGES.

Most operators publish their bot IP ranges. Faster than reverse DNS for high-volume verification; a fetch sketch follows the list.

  • OpenAI: https://openai.com/gptbot.json - JSON list of CIDR ranges per bot.
  • Anthropic: https://docs.anthropic.com/en/docs/agents-and-tools/claude-code/security#crawler-and-bot-allowlist - published per bot.
  • Google: https://www.gstatic.com/ipranges/goog.json + https://developers.google.com/search/docs/crawling-indexing/verifying-googlebot.
  • Bing: https://www.bing.com/toolbox/bingbot.json.
  • Perplexity: published in their docs.
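
A minimal fetch-and-inspect sketch for the OpenAI and Google files, assuming curl and jq are available. The prefixes/ipv4Prefix field names match Google's published schema; spot-check gptbot.json against the live file before depending on them:

  # OpenAI: CIDR ranges for GPTBot.
  curl -s https://openai.com/gptbot.json | jq -r '.prefixes[].ipv4Prefix'

  # Google publishes the same schema; drop entries that only carry
  # an ipv6Prefix.
  curl -s https://www.gstatic.com/ipranges/goog.json \
    | jq -r '.prefixes[].ipv4Prefix // empty'

Membership testing against the fetched ranges appears in the production sketch below.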

PRACTICAL USAGE.

In production, the typical flow:

  • Match the user-agent string first - it is cheap, and it filters out the traffic that never claims to be a bot.
  • For requests that do claim a bot identity, check the source IP against the operator's published ranges (Method 2) - a local lookup with no network round-trip.
  • Fall back to forward-confirmed reverse DNS (Method 1) when the IP is not in the published file, and cache the verdict per IP so the DNS cost is paid at most once.
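
A minimal sketch of that flow as a shell function, assuming a cron job keeps /etc/gptbot-ranges.txt refreshed from gptbot.json. The path, the function name, and the test IP are illustrative, and grepcidr (packaged on most distros) is just one way to do the CIDR match:

  verify_gptbot() {
    local ip="$1"

    # Fast path: is the IP inside a published range? (Method 2)
    # grepcidr exits 0 on a match, like grep.
    if echo "$ip" | grepcidr -f /etc/gptbot-ranges.txt >/dev/null; then
      return 0
    fi

    # Slow path: forward-confirmed reverse DNS (Method 1), for
    # addresses the published file has not caught up with yet.
    local name
    name=$(host "$ip" | awk '/pointer/ {print $NF}' | sed 's/\.$//')
    case "$name" in
      *.openai.com) host "$name" | grep -qwF "$ip" && return 0 ;;
    esac
    return 1
  }

  # Grant the rate-limit bypass only on a verified result. In production,
  # cache the verdict per IP (not shown) so DNS is queried at most once.
  verify_gptbot "203.0.113.7" && echo "allow" || echo "standard limits"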

THE BOTTOM LINE.

Network-level verification is mandatory if you are doing anything special for AI bots (bypassing rate-limits, returning cached responses, serving optimised HTML). Otherwise spoofers get the same privileges and the special treatment is meaningless. The verification commands are simple, the IP-range files are public, and there is no excuse for treating user-agent strings as ground truth in 2026.

Stop Guessing What AI Sees

MEASURE THE LEVERS THAT ACTUALLY EXIST.

If you want this methodology applied to your specific site - your real logs, your real citation data, your real fix list - the audit is the productized way to do it.