SEO · Free tool
robots.txt Generator
Pick a template (WordPress, Shopify, staging-block, AI-block), add custom rules, get a ready-to-deploy /robots.txt.
Save as /robots.txt at site root
User-agent: *
Allow: /
Disallow: /private/
Sitemap: https://www.example.com/sitemap.xml
Host: https://www.example.com
Goes at the root of the domain
Place this file at https://yoursite.com/robots.txt. Crawlers only request robots.txt from the root of a host, so a copy in a sub-folder is ignored, and each subdomain needs its own file. The Sitemap: directive is read by the major search crawlers (Google, Bing, Yandex, DuckDuckGo) regardless of Search Console, which makes it the simplest way to expose sitemaps to engines where you have no dashboard access.
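As a rough illustration of the "root of the host" rule, the sketch below derives the robots.txt URL that governs any given page URL. The function name robots_url is hypothetical and not part of this tool; it only uses Python's standard urllib.parse.

```python
from urllib.parse import urlsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL that applies to page_url (hypothetical helper)."""
    parts = urlsplit(page_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {page_url!r}")
    # Keep scheme and host (including any port); drop path, query and fragment.
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


print(robots_url("https://www.example.com/shop/item?id=3"))
print(robots_url("https://blog.example.com/2024/post/"))
```

The two calls print https://www.example.com/robots.txt and https://blog.example.com/robots.txt, which shows why a subdomain is not covered by the file on www.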
Block crawling, not indexing
A common mistake is assuming that Disallow: hides pages from search. It doesn't; it only blocks crawling. A blocked URL can still appear in results with a placeholder snippet if other sites link to it. To fully remove a page from Google, use the robots meta tag generator to set noindex first, let Google re-crawl the page, and only then add the Disallow. The order matters: once a URL is disallowed, the crawler no longer fetches it and so never sees the noindex. Validate your rules against specific URLs with the robots.txt tester. The common pitfalls are catalogued in robots.txt mistakes that hide your site.
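Before adding the Disallow, it can help to confirm that the page really serves noindex. The following is a minimal sketch, not this tool's implementation: has_noindex and _RobotsMetaParser are hypothetical names, the check covers only a generic robots meta tag and plain X-Robots-Tag values, and it ignores bot-specific forms such as a googlebot meta tag or a header value prefixed with a user-agent.

```python
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for directive in attr.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noindex(html: str, headers=None) -> bool:
    """True if the HTML or the response headers ask engines not to index."""
    parser = _RobotsMetaParser()
    parser.feed(html)
    directives = set(parser.directives)
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            directives.update(d.strip().lower() for d in value.split(","))
    # "none" is shorthand for "noindex, nofollow".
    return bool(directives & {"noindex", "none"})


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(page))
print(has_noindex("<p>hello</p>", {"X-Robots-Tag": "noindex"}))
print(has_noindex("<p>hello</p>"))
```

The three calls print True, True and False. The headers argument is assumed to be a plain dict of response headers fetched by whatever HTTP client you already use.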
Quick reference
- Paths are case-sensitive; user-agents are not.
- The longer (more specific) matching rule wins; Allow wins on ties.
- Sitemap lines can repeat for multiple sitemaps.
- For pre-launch sites, combine the staging template with an IP allowlist; robots.txt only asks crawlers to stay away and does not restrict access.
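The precedence rule in the list above (longest matching rule wins, Allow wins on ties, paths are case-sensitive) can be sketched as follows. This is a simplified illustration under stated assumptions: is_allowed is a hypothetical name, the rules are those of a single user-agent group, and patterns are treated as plain path prefixes, so the * and $ wildcards that real crawlers support are not handled.

```python
def is_allowed(rules, path):
    """Decide whether path may be crawled.

    rules: list of (directive, path_prefix) tuples for one user-agent group,
    e.g. [("Allow", "/"), ("Disallow", "/private/")].
    """
    best_len = -1
    best_allow = True  # no matching rule means crawling is allowed
    for directive, prefix in rules:
        # An empty pattern matches nothing; startswith() is case-sensitive.
        if not prefix or not path.startswith(prefix):
            continue
        is_allow = directive.lower() == "allow"
        longer = len(prefix) > best_len
        tie_won_by_allow = len(prefix) == best_len and is_allow
        if longer or tie_won_by_allow:
            best_len = len(prefix)
            best_allow = is_allow
    return best_allow


rules = [
    ("Allow", "/"),
    ("Disallow", "/private/"),
    ("Allow", "/private/press/"),
]
print(is_allowed(rules, "/private/notes.html"))     # False
print(is_allowed(rules, "/private/press/kit.pdf"))  # True
print(is_allowed(rules, "/Private/notes.html"))     # True
```

The last call returns True because /Private/ does not match the lowercase /private/ rule, which is the kind of mismatch the case-sensitivity note warns about.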
FAQ
Where does robots.txt go?
Root of the domain only - https://example.com/robots.txt. Sub-folder robots.txt files are ignored. Subdomains need their own (blog.example.com/robots.txt is separate from www.example.com/robots.txt).
Can I block Google from indexing using robots.txt?
No - Disallow prevents crawling, not indexing. A blocked URL can still appear in search results (with a generic snippet) if other sites link to it. To remove a page from the index, use the noindex meta tag or the X-Robots-Tag header instead, and leave the URL crawlable so the directive can be read.
Should I link my sitemap from robots.txt?
Yes - the major search engines that fetch robots.txt also read the Sitemap: directive. It is the easiest way to expose sitemaps to Bing, Yandex, DuckDuckGo and other crawlers for which you have no Search Console-style dashboard.
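To show what a crawler has to do to pick up those lines, here is a small sketch that extracts every Sitemap: entry from the text of a robots.txt file. The name sitemap_urls is hypothetical, and the parsing is deliberately simple: it treats everything after a # as a comment and matches the directive name case-insensitively.

```python
def sitemap_urls(robots_txt: str) -> list[str]:
    """Return all Sitemap: values found in robots_txt, in file order."""
    urls = []
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        key, sep, value = line.partition(":")     # split at the first colon only
        if sep and key.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls


example = """\
User-agent: *
Disallow: /private/
Sitemap: https://www.example.com/sitemap.xml
sitemap: https://www.example.com/news-sitemap.xml  # second sitemap
"""
print(sitemap_urls(example))
```

For the example file this prints both sitemap URLs, since Sitemap lines can repeat and are independent of any User-agent group.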