Free Indian Tools

robots.txt Generator

Pick a template (WordPress, Shopify, staging-block, AI-block), add custom rules, get a ready-to-deploy /robots.txt.

Template | Sitemap URL | Canonical host | Extra rules (one per line)

Save as /robots.txt at site root

User-agent: *
Allow: /
Disallow: /private/

Sitemap: https://www.example.com/sitemap.xml
Host: https://www.example.com

Goes at the root of the domain

Place this file at https://yoursite.com/robots.txt. Crawlers ignore robots.txt files in sub-folders, and each subdomain needs its own file. The Sitemap: directive is read by the major crawlers (Google, Bing, Yandex, DuckDuckGo) whether or not the site is registered in Search Console, so it is the simplest way to expose sitemaps to engines where you have no dashboard access.
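To make the placement rule concrete, here is a minimal Python sketch that derives the robots.txt location governing any page URL. The function name robots_txt_url is hypothetical, and the example URLs are placeholders; the only assumption is that a crawler looks for the file at the root of the same scheme and host as the page.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL that applies to page_url (illustrative helper)."""
    parts = urlsplit(page_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {page_url!r}")
    # Keep scheme and host (plus any port); drop path, query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://www.example.com/shop/item?id=7"))
# https://www.example.com/robots.txt
print(robots_txt_url("https://blog.example.com/2024/post/"))
# https://blog.example.com/robots.txt
```

The second call shows why a subdomain needs its own file: the host differs, so the lookup never reaches the www file.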

Block crawling, not indexing

A common mistake is assuming that Disallow: hides pages from search. It doesn't; it only blocks crawling. A blocked URL can still appear in results with a placeholder snippet if other sites link to it. To fully remove a page from Google, use the robots meta tag generator to set noindex first, let Google re-crawl the page so it sees the tag, and only then add the Disallow. If the page is blocked first, the crawler never reads the noindex. Validate your rules against specific URLs with the robots.txt tester. The pitfalls are catalogued in robots.txt mistakes that hide your site.
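As a quick local check of what a rule set blocks, the sketch below uses Python's standard urllib.robotparser. The helper name is_crawlable and the sample rules are assumptions for illustration. Note that the standard-library parser applies rules in file order, which can differ from Google's longest-match behaviour in edge cases, so treat it as a sanity check rather than a definitive verdict. It also only answers "may this URL be crawled?"; it says nothing about whether the URL is indexed.

```python
from urllib.robotparser import RobotFileParser

# Sample rules, assumed for this example only.
RULES = """\
User-agent: *
Disallow: /private/
"""


def is_crawlable(rules: str, url: str, agent: str = "*") -> bool:
    """Report whether `agent` may crawl `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


print(is_crawlable(RULES, "https://www.example.com/private/report.html"))  # False
print(is_crawlable(RULES, "https://www.example.com/blog/"))                # True
```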

Quick reference

Paths are case-sensitive; user-agents are not.
The longer (more specific) matching rule wins; Allow wins on ties.
Sitemap lines can repeat for multiple sitemaps.
For pre-launch sites, combine the staging template with an IP allowlist; robots.txt is advisory and does not restrict access on its own.
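The precedence rules in this list can be sketched in a few lines of Python. This is a simplified illustration, not a full parser: it assumes plain prefix patterns (no * or $ wildcards), a single user-agent group, and a hypothetical is_allowed helper that takes rules as (directive, path) pairs.

```python
def is_allowed(rules: list[tuple[str, str]], path: str) -> bool:
    """Apply longest-match precedence; Allow wins when two matches tie in length."""
    best_len = -1
    allowed = True  # a path matched by no rule may be crawled
    for kind, prefix in rules:
        # An empty pattern matches nothing; startswith is case-sensitive.
        if not prefix or not path.startswith(prefix):
            continue
        is_allow = kind.lower() == "allow"  # directive names are case-insensitive
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


rules = [("Allow", "/"), ("Disallow", "/private/"), ("Allow", "/private/press/")]
print(is_allowed(rules, "/private/report.pdf"))     # False: /private/ is longer than /
print(is_allowed(rules, "/private/press/kit.zip"))  # True: the longer Allow wins
print(is_allowed(rules, "/Private/report.pdf"))     # True: paths are case-sensitive
```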

FAQ

Where does robots.txt go?

Root of the domain only - https://example.com/robots.txt. Sub-folder robots.txt files are ignored. Subdomains need their own (blog.example.com/robots.txt is separate from www.example.com/robots.txt).

Can I block Google from indexing using robots.txt?

No - Disallow prevents crawling, not indexing. A blocked URL can still appear in search results (with a generic snippet) if other sites link to it. To remove a page from the index, use the noindex meta tag or the X-Robots-Tag header instead, and leave the page crawlable so the directive can be read.
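To confirm that a page actually carries a noindex signal, something like the following Python sketch can inspect a response you have already fetched. The has_noindex helper and the sample inputs are hypothetical; it only looks at a generic robots meta tag and a plain X-Robots-Tag value, and deliberately ignores crawler-specific variants such as a googlebot meta tag or a user-agent prefix in the header.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(headers: dict, html: str) -> bool:
    """True if the header or the HTML asks search engines not to index the page."""
    header_value = next(
        (v for k, v in headers.items() if k.lower() == "x-robots-tag"), ""
    )
    header_directives = [d.strip().lower() for d in header_value.split(",")]
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in header_directives + finder.directives


print(has_noindex({"X-Robots-Tag": "noindex, nofollow"}, ""))           # True
print(has_noindex({}, '<meta name="robots" content="noindex">'))        # True
print(has_noindex({}, '<meta name="robots" content="index, follow">'))  # False
```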

Should I link my sitemap from robots.txt?

Yes - the major search engines that fetch robots.txt also pick up the Sitemap: directive. It is the easiest way to expose sitemaps to Bing, Yandex, DuckDuckGo and any other engine where you have no webmaster dashboard access.
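For a sense of how little a crawler needs in order to discover sitemaps this way, here is a minimal Python sketch that pulls every Sitemap: line out of a robots.txt body. The sitemap_urls helper and the sample file are assumptions for illustration; real crawlers may apply extra validation.

```python
def sitemap_urls(robots_txt: str) -> list[str]:
    """Return the URLs from all Sitemap: lines, in file order."""
    urls = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        key, sep, value = line.partition(":")  # split at the first colon only
        if sep and key.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls


sample = """\
User-agent: *
Disallow: /private/

Sitemap: https://www.example.com/sitemap.xml
sitemap: https://www.example.com/news-sitemap.xml
"""
print(sitemap_urls(sample))
# ['https://www.example.com/sitemap.xml', 'https://www.example.com/news-sitemap.xml']
```

Because the lines are independent of any User-agent group, they can sit anywhere in the file and can repeat.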