
INTERNATIONAL AI VISIBILITY.

Subdirectory vs subdomain. Hreflang done right. Bytespider in non-English markets. Translation quality vs MT. Schema in multiple languages. The full multi-language AI visibility strategy.

Hreflang: Required
Subdirectory: Preferred
Bytespider: Critical
Translation: Human

WHY INTERNATIONAL AI VISIBILITY IS DIFFERENT.

English-language AI visibility advice translates poorly to other markets. The bot priorities differ (Bytespider matters more than GPTBot in China), as do the dominant search engines (Naver in Korea, Yandex in Russia), the schema-language conventions, and the audience surfaces.

If you are operating internationally, the playbook below covers the structural decisions that move the needle. It is not exhaustive - per-market tactics deserve their own articles - but it is the correct foundation.

Finding 01.
Decision 01

SUBDIRECTORY VS SUBDOMAIN VS CCTLD.

Three URL patterns for international sites:

  • Subdirectory: example.com/de/. Single domain, multiple language paths. Easiest to manage; SEO authority concentrated on one domain.
  • Subdomain: de.example.com. Separate hostname per language under the same registered domain. Clean separation but splits authority.
  • ccTLD: example.de. Country-code TLD per market. Best local-search signal but most expensive (separate domain registrations, separate hosting, separate authority-building).

FOR AI VISIBILITY, SUBDIRECTORY WINS.

Subdirectory is the strongest pattern for AI visibility because it concentrates authority on a single domain that AI bots can ground entity claims against. Subdomain or ccTLD splits the authority across multiple sites; the AI engine has to learn each separately.

ccTLDs still matter for traditional local-search ranking (Google.de prefers .de domains for German queries), but for AI surfaces the cost of split authority typically outweighs the local-ranking benefit unless you are operating at scale in each market.
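
To make the three patterns concrete, here is a minimal Python sketch that builds the localized URL for each pattern and counts how many distinct hosts a crawler would have to learn. The market list, the example.* domains, and the function names are illustrative assumptions, not part of any real site or library.

```python
from urllib.parse import urlsplit

# Hypothetical markets: language code -> country code used for the ccTLD.
MARKETS = {"de": "de", "fr": "fr", "ja": "jp"}


def localized_urls(pattern: str, path: str = "/pricing") -> dict[str, str]:
    """Return one URL per language for the given URL pattern."""
    urls = {}
    for lang, country in MARKETS.items():
        if pattern == "subdirectory":
            urls[lang] = f"https://example.com/{lang}{path}"
        elif pattern == "subdomain":
            urls[lang] = f"https://{lang}.example.com{path}"
        elif pattern == "cctld":
            urls[lang] = f"https://example.{country}{path}"
        else:
            raise ValueError(f"unknown pattern: {pattern!r}")
    return urls


def distinct_hosts(urls: dict[str, str]) -> set[str]:
    """Hosts across which links and entity signals are spread."""
    return {urlsplit(url).hostname for url in urls.values()}


for pattern in ("subdirectory", "subdomain", "cctld"):
    hosts = distinct_hosts(localized_urls(pattern))
    print(f"{pattern}: {len(hosts)} host(s) -> {sorted(hosts)}")
```

With the subdirectory pattern every language version resolves to the same host; the other two patterns produce one host per market, which is the split described above.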

Finding 02.
Implementation

HREFLANG FOR AI.

Hreflang tells search engines and AI engines the language and regional targeting of each page. It can be declared in three places: HTML <link rel="alternate" hreflang="..."> elements in <head>, <xhtml:link> entries in sitemap.xml, and HTTP Link: headers (optional for HTML pages, but the cleanest option for non-HTML resources such as PDFs). Whichever you use, keep the annotations consistent across them.

Critical: include hreflang="x-default" for the canonical fallback. Most implementations skip this; without it, AI engines may default to whichever language version they crawled first, which is often wrong.
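
As a rough sketch of what a complete annotation set looks like, the Python below renders the <head> tags and the equivalent HTTP Link header for one page, always appending x-default. The function names and example URLs are assumptions made for illustration; adapt them to your own templating layer.

```python
from html import escape


def _with_default(alternates: dict[str, str], default_lang: str) -> dict[str, str]:
    """Sort the alternates and append an x-default entry for the fallback."""
    if default_lang not in alternates:
        raise ValueError(f"default language {default_lang!r} has no URL")
    entries = dict(sorted(alternates.items()))
    entries["x-default"] = alternates[default_lang]
    return entries


def hreflang_links(alternates: dict[str, str], default_lang: str) -> list[str]:
    """Render <link> tags for the <head> of every language version."""
    return [
        f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}">'
        for code, url in _with_default(alternates, default_lang).items()
    ]


def link_header(alternates: dict[str, str], default_lang: str) -> str:
    """Render the value of an HTTP Link header, e.g. for a PDF."""
    return ", ".join(
        f'<{url}>; rel="alternate"; hreflang="{code}"'
        for code, url in _with_default(alternates, default_lang).items()
    )


pages = {"en": "https://example.com/", "de": "https://example.com/de/"}
print("\n".join(hreflang_links(pages, default_lang="en")))
```

The same set of tags, including the self-referencing entry, goes on every language version of the page.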

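Validate the hreflang implementation with a crawler-based audit such as Sitebulb's hreflang audit. (Google retired Search Console's International Targeting report, which used to surface hreflang errors, in 2022.) Common defects: missing reciprocal links between language pairs, inconsistent language codes (use ISO 639-1 for the language and ISO 3166-1 alpha-2 for the optional region), and mismatched hreflang and lang attributes.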
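
Two of these defects, missing return links and malformed codes, can be checked mechanically once you have the hreflang annotations per crawled page. The sketch below assumes the crawl output is already a plain dictionary (page URL mapped to its hreflang code and target URL pairs); that input shape and the function name are hypothetical, and the check only validates the shape of a code, not whether it is a registered ISO value.

```python
import re

# Shape check only: language, optional 4-letter script, optional 2-letter
# region. A well-formed but unregistered code such as "en-UK" still passes.
CODE_SHAPE = re.compile(r"^(x-default|[a-z]{2}(-[a-z]{4})?(-[a-z]{2})?)$", re.IGNORECASE)


def audit_hreflang(pages: dict[str, dict[str, str]]) -> list[str]:
    """Return a list of human-readable hreflang problems."""
    problems = []
    for source, alternates in pages.items():
        for code, target in alternates.items():
            if not CODE_SHAPE.match(code):
                problems.append(f"{source}: malformed hreflang code {code!r}")
            # Self-references and the x-default fallback need no return link.
            if target == source or code.lower() == "x-default":
                continue
            back = pages.get(target)
            if back is None:
                problems.append(f"{source}: alternate {target} was not crawled")
            elif source not in back.values():
                problems.append(f"{target}: missing return link to {source}")
    return problems


crawl = {
    "https://example.com/": {
        "en": "https://example.com/",
        "de": "https://example.com/de/",
    },
    # The German page forgets to point back at the English one.
    "https://example.com/de/": {"de": "https://example.com/de/"},
}
for problem in audit_hreflang(crawl):
    print(problem)
```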

Finding 03.
Bytespider

BYTESPIDER IN NON-ENGLISH MARKETS.

Article No. 13 covers Bytespider in detail. For international AI visibility specifically: in East and Southeast Asian markets (China, Vietnam, Indonesia, and other parts of Southeast Asia), Bytespider drives more AI bot traffic than GPTBot and ClaudeBot combined. Doubao (China's ChatGPT-equivalent) is the dominant consumer LLM in China.

Implication: if you are blocking Bytespider as part of a Western "block training bots" policy, you are cutting yourself off from these markets. The correct policy in international contexts is to allow Bytespider and, if its volume strains your origin, rate-limit it at Cloudflare.
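
One way to confirm that a robots.txt policy actually lets Bytespider in is to parse it with Python's standard-library robots.txt parser before deploying it. The robots.txt contents and the helper name below are assumptions for illustration; the sketch covers only the allow/block decision, not rate limiting, which happens at the CDN.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: Bytespider is explicitly allowed; every other bot
# falls under the default group.
ROBOTS_TXT = """\
User-agent: Bytespider
Allow: /

User-agent: *
Disallow: /admin/
"""

# The kind of blanket block this section warns against.
BLOCKING_ROBOTS_TXT = """\
User-agent: Bytespider
Disallow: /
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Evaluate a robots.txt body for one user agent and URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(can_crawl(ROBOTS_TXT, "Bytespider", "https://example.com/de/"))
print(can_crawl(BLOCKING_ROBOTS_TXT, "Bytespider", "https://example.com/de/"))
```

The first call reports that Bytespider may fetch the German section; the second shows the same request refused under the blocking policy.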

Finding 04.
Translation

TRANSLATION QUALITY.

Machine-translated content gets cited less than human-translated content. AI engines detect MT-style phrasing and discount the source. The mechanism is partly perplexity-based (MT produces unusual word patterns that the engine flags as low-quality) and partly direct (the engine's training data includes both MT-flagged and high-quality content; it learned to prefer the latter).

Practical tiers:

  • Best: native human translation with editorial review. Highest cost, best citation rates.
  • Acceptable: MT followed by human editing (post-editing flow). Medium cost, decent citation rates.
  • Risky: pure MT with no human review. Cheapest, lowest citation rates. Use only for low-priority content.

Finding 05.
Schema

SCHEMA IN MULTIPLE LANGUAGES.

Schema markup needs to match the language of the page it is on. An Article on a German page should have a German headline; a Person on a Japanese page should have a Japanese name.

Keep @id values stable across languages: the canonical entity is the same person or organisation, and only the visible labels change. Use one @id per entity and point to it from each language version.
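
A minimal sketch of that rule in Python: the Article's labels and inLanguage value change per page, while the author's @id stays identical. The @id URL, the author, the headlines, and the helper name are all invented for illustration; inLanguage is a standard schema.org property of CreativeWork types such as Article.

```python
import json

# Hypothetical canonical identifier, shared by every language version.
AUTHOR_ID = "https://example.com/#/person/jane-doe"


def article_jsonld(lang: str, url: str, headline: str, author_name: str) -> dict:
    """Build Article JSON-LD whose labels follow the page language."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "@id": f"{url}#article",
        "url": url,
        "inLanguage": lang,
        "headline": headline,
        "author": {
            "@type": "Person",
            "@id": AUTHOR_ID,  # same entity on every page
            "name": author_name,  # localized label only
        },
    }


german = article_jsonld(
    "de", "https://example.com/de/hreflang/", "Hreflang richtig umsetzen", "Jane Doe"
)
japanese = article_jsonld(
    "ja", "https://example.com/ja/hreflang/", "hreflang の正しい実装", "ジェーン・ドウ"
)

# ensure_ascii=False keeps the Japanese labels readable in the markup.
print(json.dumps(japanese, ensure_ascii=False, indent=2))
```

Because both pages reference the same author @id, an engine that merges structured data by identifier sees one person with two localized names rather than two unrelated people.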

Schema generators on most CMSes handle this once the site is configured for multiple languages. WordPress (Rank Math, Yoast) and Shopify both translate schema labels appropriately in a multi-language setup.

THE BOTTOM LINE.

International AI visibility is structurally different from English-only AI visibility, mostly because the bot priorities and audience surfaces differ. Subdirectory + clean hreflang + Bytespider-allowed + human-translated content + per-language schema is the foundation. From there, per-market tactics layer on top.
