GPTBot

OpenAI's official web crawler that indexes content for ChatGPT's browse mode and RAG features. Blocking it in robots.txt removes your content from ChatGPT's retrieval system. User agent: GPTBot/1.0. Allowing access is recommended for any brand pursuing ChatGPT citation presence.

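GPTBot is OpenAI’s official web crawler: the bot that fetches and indexes web content for use in ChatGPT’s browse mode and retrieval-augmented generation features. It identifies itself with the user agent token GPTBot (GPTBot/1.0 in the full string), which is the name to use in robots.txt rules. OpenAI also documents separate agents, OAI-SearchBot and ChatGPT-User, for search and user-initiated browsing, so check its current crawler documentation before relying on GPTBot rules alone.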

What GPTBot crawls for

GPTBot collects web content to:

Power ChatGPT’s browse mode (live web retrieval for current information)
Build retrieval indexes used when ChatGPT performs web searches
Potentially contribute data to OpenAI’s training pipeline (though this is separate from browse mode)

Controlling GPTBot access

You control GPTBot access through your robots.txt file:

To block GPTBot entirely:

User-agent: GPTBot
Disallow: /

To block specific sections:

User-agent: GPTBot
Disallow: /private/
Disallow: /members-only/

To explicitly allow GPTBot (for example, when a broader rule would otherwise block it):

User-agent: GPTBot
Allow: /
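
Before deploying a rule set, it can help to check how a parser reads it. The following is a minimal sketch using Python’s standard urllib.robotparser module; the is_allowed helper and the example.com URLs are hypothetical names introduced for illustration, and Python’s parser may differ in edge cases from how GPTBot itself interprets rules.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The "block specific sections" rules from above (hypothetical site).
PARTIAL_BLOCK = """\
User-agent: GPTBot
Disallow: /private/
Disallow: /members-only/
"""

if __name__ == "__main__":
    for path in ("/blog/post", "/private/report", "/members-only/area"):
        url = "https://example.com" + path
        print(path, is_allowed(PARTIAL_BLOCK, "GPTBot", url))
```

With these rules, the blog URL should be reported as allowed and the two restricted paths as blocked. Crawlers not named in the file are unaffected, since no rule applies to them.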

The AI visibility tradeoff

Blocking GPTBot means your content won’t appear in ChatGPT’s browse mode results or RAG-powered responses from OpenAI products. For most brands pursuing AI visibility, blocking GPTBot is counterproductive, because it prevents citation by one of the most widely used AI assistants.

Blocking may be appropriate for paywalled content, private customer data, and content under active copyright dispute. For general marketing and educational content, allowing GPTBot is the default recommendation.

Other AI crawlers to know

PerplexityBot: Perplexity AI
Anthropic-AI / ClaudeBot: Anthropic / Claude
Googlebot: Google Search + AI Overviews (not blocking this one is table stakes)
bingbot: Microsoft Bing + Copilot
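
To see which of these crawlers actually visit a site, the tokens above can be matched against the user agent strings in server logs. This is a rough sketch: the AI_CRAWLER_TOKENS table and identify_crawler function are hypothetical names, the matching is a simple case-insensitive substring check, and user agent strings can be spoofed, so a match is a hint rather than proof of identity.

```python
from typing import Optional

# Substring tokens taken from the list above, mapped to their operators.
AI_CRAWLER_TOKENS = {
    "gptbot": "OpenAI",
    "perplexitybot": "Perplexity AI",
    "anthropic-ai": "Anthropic",
    "claudebot": "Anthropic",
    "googlebot": "Google",
    "bingbot": "Microsoft",
}


def identify_crawler(user_agent: str) -> Optional[str]:
    """Return the operator whose token appears in user_agent, else None."""
    ua = user_agent.lower()
    for token, operator in AI_CRAWLER_TOKENS.items():
        if token in ua:
            return operator
    return None


if __name__ == "__main__":
    # Illustrative user agent strings, not copied from real logs.
    samples = [
        "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    ]
    for ua in samples:
        print(identify_crawler(ua), "<-", ua)
```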

