GPTBot is OpenAI’s official web crawler: the bot that fetches web content for use in OpenAI’s products, including ChatGPT. It identifies itself with a user agent string that contains the token GPTBot (for example, GPTBot/1.0), and that token is what robots.txt rules match against.
What GPTBot crawls for
GPTBot collects web content to:
- Power ChatGPT’s browse mode (live web retrieval for current information)
- Build retrieval indexes used when ChatGPT performs web searches
- Potentially contribute data to OpenAI’s training pipeline (though this is separate from browse mode)
Controlling GPTBot access
You control GPTBot access through your robots.txt file:
To block GPTBot entirely:
User-agent: GPTBot
Disallow: /
To block specific sections:
User-agent: GPTBot
Disallow: /private/
Disallow: /members-only/
To explicitly allow (or verify it’s allowed):
User-agent: GPTBot
Allow: /
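To confirm that a set of rules behaves as intended, the robots.txt text can be checked programmatically. The following is a minimal sketch using Python’s standard-library urllib.robotparser. The helper name is_allowed and the example.com URLs are assumptions made for illustration, and the parser’s matching may differ in edge cases from how GPTBot itself interprets a file.

```python
from urllib.robotparser import RobotFileParser

# The "block specific sections" rules from above, as they would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /private/
Disallow: /members-only/
"""


def is_allowed(robots_txt: str, url: str, agent: str = "GPTBot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text.

    Hypothetical helper for illustration; it parses the text locally and
    does not make any network request.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # example.com paths are placeholders, not real site URLs.
    for path in ("/blog/post", "/private/report.html", "/members-only/area"):
        url = "https://example.com" + path
        print(path, is_allowed(ROBOTS_TXT, url))
```

To check a live site instead, RobotFileParser also offers set_url() and read(), which fetch the file over the network before can_fetch() is called.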
The AI visibility tradeoff
Blocking GPTBot means your content won’t appear in ChatGPT’s browse mode results or RAG-powered responses from OpenAI products. For most brands pursuing AI visibility, blocking GPTBot is counterproductive because it directly prevents citation by one of the most widely used AI assistants.
Blocking may be appropriate for paywalled content, private customer data, and content under active copyright dispute. For general marketing and educational content, allowing GPTBot is the default recommendation.
Other AI crawlers to know
- PerplexityBot: Perplexity AI
- Anthropic-AI / ClaudeBot: Anthropic / Claude
- Googlebot: Google Search + AI Overviews (leaving this one unblocked is table stakes)
- bingbot: Microsoft Bing + Copilot
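One practical use of this list is spotting these crawlers in server access logs. The sketch below does a case-insensitive substring match on the user agent tokens named above. The function name identify_crawler, the token table, and the sample strings are assumptions for illustration. Because a user agent header can be spoofed, a match is a hint and not proof of the requester’s identity.

```python
from typing import Optional

# Tokens taken from the list above (plus GPTBot), mapped to their operators.
AI_CRAWLER_TOKENS = {
    "GPTBot": "OpenAI",
    "PerplexityBot": "Perplexity AI",
    "anthropic-ai": "Anthropic",
    "ClaudeBot": "Anthropic",
    "Googlebot": "Google",
    "bingbot": "Microsoft",
}


def identify_crawler(user_agent: str) -> Optional[str]:
    """Return the first known crawler token found in `user_agent`, else None.

    Hypothetical helper: a plain case-insensitive substring check,
    not a verification of who actually sent the request.
    """
    lowered = user_agent.lower()
    for token in AI_CRAWLER_TOKENS:
        if token.lower() in lowered:
            return token
    return None


if __name__ == "__main__":
    # Sample strings for illustration only.
    samples = [
        "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)",
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36",
    ]
    for ua in samples:
        token = identify_crawler(ua)
        operator = AI_CRAWLER_TOKENS.get(token, "not a listed crawler")
        print(token, "-", operator)
```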