noindex vs Disallow vs canonical — when to use which

Google's three indexing controls are constantly confused with one another. Here is the right one for each scenario: noindex for removal, Disallow for crawl-blocking, canonical for consolidation.

Quick frame: Three tools, three jobs: noindex removes pages from Google's index; Disallow blocks crawling (but not necessarily indexing); canonical consolidates ranking signals across duplicate URLs. Picking the wrong one is among the most common technical SEO mistakes.

The decision tree

Want the page gone from search? → noindex (via meta tag or X-Robots-Tag header).
Want to save Google's crawl budget? → Disallow in robots.txt.
Have multiple URLs serving the same content? → canonical to the chosen one.

noindex — for removal

Use the robots meta generator to emit:

<meta name="robots" content="noindex,follow" />

Google must crawl the page to see this tag, so do not Disallow the URL in robots.txt. Once Google has seen the noindex, the page typically drops from the index over the next 1–4 weeks.
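Before waiting on the re-crawl, it can help to confirm that the noindex is actually being served. The following is a minimal sketch, not a definitive checker: `has_noindex` is a hypothetical helper name, it takes HTML and response headers you have already fetched yourself, and it ignores user-agent-scoped directives such as `googlebot: noindex`.

```python
from html.parser import HTMLParser
from typing import Mapping, Optional


class _RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        self.directives.update(
            d.strip().lower() for d in content.split(",") if d.strip()
        )


def has_noindex(html: str, headers: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if the meta tag or the X-Robots-Tag header says noindex.

    Simplification: directives prefixed with a crawler name are not handled.
    """
    parser = _RobotsMetaParser()
    parser.feed(html)

    header_directives = set()
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            header_directives.update(d.strip().lower() for d in value.split(","))

    return "noindex" in parser.directives or "noindex" in header_directives
```

The header path matters for non-HTML files such as PDFs, which have no place for a meta tag.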

Common use cases:

Thank-you pages after form submission.
Internal search result pages.
Pagination pages 2+ if they offer no unique value.
Staging environments accessible to crawlers.

Disallow — for crawl-block

Use the robots.txt generator for the syntax. Disallow blocks crawling. The URL can still appear in search results with a placeholder snippet if other sites link to it.

Common use cases:

/admin/, /wp-login.php — block bot exploration of admin paths.
/search/ — internal search results with infinite URL variants.
/checkout/, /cart/ — pages with no SEO value.

Don't use Disallow for content you want hidden — use noindex instead.
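To see which URLs a set of Disallow rules actually blocks, the rules can be tested locally. This sketch uses Python's standard `urllib.robotparser`; the `ROBOTS_TXT` contents, the `is_crawl_blocked` helper, and the example.com URLs are illustrative assumptions built from the paths listed above. Note that this parser does simple prefix matching and does not implement Google's wildcard extensions, so treat it as an approximation of Googlebot's behaviour.

```python
from urllib import robotparser

# Illustrative robots.txt covering the use cases listed above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /wp-login.php
Disallow: /search/
Disallow: /checkout/
Disallow: /cart/
"""


def is_crawl_blocked(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt rules forbid `agent` from crawling `url`."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


if __name__ == "__main__":
    for path in ("/admin/users", "/search/shoes", "/blog/post"):
        url = "https://example.com" + path
        print(path, "blocked" if is_crawl_blocked(ROBOTS_TXT, url) else "crawlable")
```

A "blocked" result only means the page will not be crawled; as noted above, the URL itself can still show up in search results.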

canonical — for consolidation

Use the canonical tag generator to point duplicate URLs to a single canonical. Google indexes the canonical and treats the others as alternates, consolidating link equity.

Common use cases:

Product variants by colour / size sharing the same description.
Same content reachable via multiple paths.
Print-friendly versions of pages.
URLs with tracking parameters (utm_*).
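For the tracking-parameter case, the canonical URL can usually be derived mechanically. The sketch below is one possible approach, assuming every query parameter starting with `utm_` is tracking-only; `strip_tracking` and `canonical_tag` are hypothetical helper names, and the example.com URL is a placeholder.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_tracking(url: str) -> str:
    """Drop utm_* query parameters (and any fragment) from a URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith("utm_")
    ]
    # The fragment is deliberately left empty: it is not sent to the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_tag(url: str) -> str:
    """Build the <link rel="canonical"> tag for the cleaned URL."""
    return f'<link rel="canonical" href="{escape(strip_tracking(url), quote=True)}" />'


if __name__ == "__main__":
    print(canonical_tag("https://example.com/shoes?utm_source=news&id=7&utm_medium=email"))
```

Every tracked variant of a page then emits the same tag, which is what lets Google fold them into one indexed URL.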

The most common confusion: Disallow + want it removed

The error: you Disallow a URL in robots.txt expecting it to drop from search. It doesn't — Google can't crawl, so it can't see your noindex, so the URL stays indexed with a placeholder.

Correct sequence to remove:

Remove the Disallow (let Google crawl).
Add noindex meta tag.
Wait for re-crawl (1–4 weeks).
Once page is dropped from index, optionally re-add Disallow.
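The sequence above can be sketched as a small state check. This is an illustration only: `next_removal_step` is a hypothetical helper, the `has_noindex` and `still_indexed` flags are values you would determine yourself (for example from the page source and from Search Console), and `urllib.robotparser` only approximates how Googlebot reads robots.txt.

```python
from urllib import robotparser


def next_removal_step(
    robots_txt: str,
    url: str,
    has_noindex: bool,
    still_indexed: bool,
    agent: str = "Googlebot",
) -> str:
    """Return the next action in the removal sequence for one URL."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    blocked = not parser.can_fetch(agent, url)

    if not still_indexed:
        return "done: re-adding Disallow is now optional"
    if blocked:
        # A blocked page is never fetched, so any noindex on it is invisible.
        return "remove the Disallow so Google can crawl the page"
    if not has_noindex:
        return "add a noindex meta tag"
    return "wait for the re-crawl"
```

The ordering of the checks is the point: the Disallow is tested before the noindex because a noindex on a blocked page has no effect.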

When to use multiple at once

Layering is fine when each serves its purpose:

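canonical on tracking-parameter URLs plus Disallow on separate low-value paths such as /cart/ (canonical consolidates, Disallow saves crawl budget). Keep the two on different URLs: Google cannot read a canonical tag on a page it is blocked from crawling.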
noindex + canonical for thin pages with a related canonical home.

But don't combine noindex + Disallow on the same URL: Google can't crawl the page, so it never sees the noindex.

For wider crawl-control patterns, see robots.txt mistakes that hide your site.

FAQ

Q. Does noindex pass link equity? A. With noindex,follow, links on the page can still pass equity. Over long periods Google may come to treat a noindexed page as noindex,nofollow and stop following its links, but for typical use cases follow still works.

Q. How long until a noindexed page disappears? A. 1–4 weeks for most sites. It can be faster if you request a re-crawl of the URL through the URL Inspection tool in Search Console.

Q. Can canonical and hreflang conflict? A. Each language version should canonical to itself; hreflang then references all variants. They work in parallel.
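As an illustration of the hreflang answer, the sketch below generates the head tags for one language version: a self-referencing canonical plus an hreflang entry for every variant, including the page itself. The `head_links` helper, the language codes, and the example.com URLs are assumptions made for the example.

```python
def head_links(current_lang: str, variants: dict[str, str]) -> list[str]:
    """Build canonical and hreflang <link> tags for one language version.

    `variants` maps a language code to that version's URL.
    """
    # Each language version canonicalises to itself, not to a "main" language.
    tags = [f'<link rel="canonical" href="{variants[current_lang]}" />']
    # hreflang lists every variant, the current page included.
    for lang, href in variants.items():
        tags.append(f'<link rel="alternate" hreflang="{lang}" href="{href}" />')
    return tags


if __name__ == "__main__":
    pages = {"en": "https://example.com/en/", "de": "https://example.com/de/"}
    print("\n".join(head_links("de", pages)))
```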
