Crawl budget for small Indian sites — when it actually matters

Crawl budget worries are usually overblown for small sites. Here is when it genuinely matters, how to diagnose it, and the fixes that actually shift the dial.

19 April 2026 · 2 min read

Quick frame: For sites under ~10,000 URLs, crawl budget is rarely the bottleneck. Indexing problems on small sites are usually about content quality, internal linking or canonical confusion — not crawl exhaustion. Diagnose first; optimise crawl budget only when the data demands it.

When crawl budget genuinely matters

E-commerce with 100,000+ SKUs (especially with faceted filters).
News sites publishing 50+ articles daily.
Sites with aggressive parameter-based URL generation (calendars, search).
Sites with chronically slow server response (TTFB > 1s consistently).

If your site doesn't fit one of these, crawl budget is probably not your problem.

Diagnose first

Search Console → Settings → Crawl stats shows you:

Total requests by Googlebot per day.
Average response time.
File type breakdown.

If "Average response time" is consistently over 500ms, slow server responses are probably limiting how much Googlebot crawls. Fix that before anything else.

If "Total requests" is high but the number of indexed pages is low (Page indexing report, formerly Coverage), the problem is wasted crawl: Google is fetching pages that don't deserve indexing.
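Server access logs can show where wasted crawl is going. The following is a minimal sketch, not a finished tool: it assumes the combined access-log format, treats any URL with a query string as a parameter URL, and trusts the user-agent string (which can be spoofed), so read the result as an estimate. The function name and the sample log lines are made up for illustration.

```python
import re
from collections import Counter

# Assumption: "combined" access-log format; adjust the pattern for your server.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) .*"(?P<agent>[^"]*)"$'
)

def googlebot_crawl_split(lines):
    """Count Googlebot requests to clean URLs vs. URLs carrying a query string."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line.strip())
        if not match or "Googlebot" not in match.group("agent"):
            continue  # skip unparseable lines and non-Googlebot traffic
        kind = "parameter" if "?" in match.group("path") else "clean"
        counts[kind] += 1
    return counts

# Invented sample lines, for illustration only.
GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/124.0 Safari/537.36"
SAMPLE_LOG = [
    f'66.249.66.1 - - [19/Apr/2026:10:00:00 +0530] "GET /kurtas HTTP/1.1" 200 5120 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [19/Apr/2026:10:00:02 +0530] "GET /kurtas?sort=price HTTP/1.1" 200 5090 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [19/Apr/2026:10:00:04 +0530] "GET /search?q=saree HTTP/1.1" 200 4800 "-" "{GOOGLEBOT}"',
    f'203.0.113.7 - - [19/Apr/2026:10:00:05 +0530] "GET /kurtas?sort=price HTTP/1.1" 200 5090 "-" "{BROWSER}"',
]

counts = googlebot_crawl_split(SAMPLE_LOG)
total = sum(counts.values())
print(f"{counts['parameter']} of {total} Googlebot requests hit parameter URLs")
```

In this sample, 2 of the 3 Googlebot requests go to parameter URLs. On a real log, a large parameter share alongside a low indexed count points towards the first fix below.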

Three highest-leverage fixes

1. Block useless parameter URLs

Faceted search can generate near-infinite URL variants. Add these rules to robots.txt:

User-agent: *
Disallow: /search?
Disallow: /*?sort=
Disallow: /*?filter=

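Note that a robots.txt block stops Googlebot from fetching the page, so it never sees a canonical tag on a blocked URL. Use canonical tags pointing back to the clean page on parameter URLs you leave crawlable, not on the ones you have disallowed.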
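It helps to check which URLs the three Disallow patterns above actually catch before deploying them. The sketch below is a simplified matcher written for this article, not Google's parser: it assumes Google-style wildcards (`*` matches any run of characters, a trailing `$` anchors the end), ignores Allow rules and rule precedence, and uses invented example paths.

```python
import re

def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a robots.txt path rule into a regex ('*' wildcard, optional trailing '$')."""
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))

def is_blocked(path: str, disallow_rules: list[str]) -> bool:
    """True if any Disallow rule matches the path from its first character."""
    return any(rule_to_regex(rule).match(path) for rule in disallow_rules)

RULES = ["/search?", "/*?sort=", "/*?filter="]

# Hypothetical paths, for illustration only.
for path in ["/search?q=kurta", "/kurtas?sort=price", "/kurtas?page=2&sort=price", "/kurtas"]:
    print(path, "blocked" if is_blocked(path, RULES) else "allowed")
```

Under these assumptions the first two paths are blocked and the last two are allowed. The third path slips through because `/*?sort=` only matches when `sort` is the first query parameter; a further rule such as `Disallow: /*&sort=` would cover the case where it comes later.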

2. Speed up the server

TTFB matters more for crawl budget than for users. At the same number of parallel connections, a 200ms TTFB lets Googlebot fetch up to 5× as many pages per day as a 1000ms TTFB.
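The 5× figure is back-of-envelope arithmetic. A rough sketch, assuming response time is the only limit and that Googlebot keeps a fixed number of connections busy all day (it does not crawl back-to-back in practice, so these are ceilings, and the function name is illustrative):

```python
SECONDS_PER_DAY = 24 * 60 * 60

def max_fetches_per_day(ttfb_ms: float, connections: int = 1) -> float:
    """Upper bound on fetches per day if every fetch takes ttfb_ms on each connection."""
    return connections * SECONDS_PER_DAY * 1000 / ttfb_ms

fast = max_fetches_per_day(200)    # 432,000 fetches per connection
slow = max_fetches_per_day(1000)   # 86,400 fetches per connection
print(fast / slow)                 # 5.0
```

The absolute numbers are far above what Google would request from a small site; only the ratio carries over.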

3. Prune low-value pages

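Add a noindex robots meta tag (for example via the robots meta generator) to the pages below. Keep them crawlable in robots.txt, because Googlebot has to fetch a page to see the tag: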

Thin tag pages (under 3 substantive posts).
Author archives no one searches for.
Date archives (year / month / day).
Pagination pages 2+ with no unique content.
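The list above can be turned into a template-level rule. This is a minimal sketch, assuming a CMS that can tell you the archive type, post count and page number; the `ArchivePage` fields and the `kind` labels are hypothetical, and the threshold of 3 posts comes from the list.

```python
from dataclasses import dataclass

@dataclass
class ArchivePage:
    kind: str                          # "tag", "author", "date" or anything else (hypothetical labels)
    post_count: int = 0                # substantive posts listed on the page
    page_number: int = 1               # 1 = first page of a paginated archive
    gets_search_traffic: bool = False  # your own judgement from Search Console data
    has_unique_content: bool = False   # e.g. an intro written for this page

def should_noindex(page: ArchivePage) -> bool:
    """Apply the pruning list: thin tags, unsearched authors, date archives, bare pagination."""
    if page.page_number >= 2 and not page.has_unique_content:
        return True
    if page.kind == "tag":
        return page.post_count < 3
    if page.kind == "author":
        return not page.gets_search_traffic
    return page.kind == "date"

def robots_meta(page: ArchivePage) -> str:
    """Render the robots meta tag for the page's <head>."""
    content = "noindex, follow" if should_noindex(page) else "index, follow"
    return f'<meta name="robots" content="{content}">'

print(robots_meta(ArchivePage(kind="tag", post_count=2)))
print(robots_meta(ArchivePage(kind="tag", post_count=8)))
```

The first call renders a noindex tag and the second an index tag. Using "noindex, follow" rather than a robots.txt block keeps the page fetchable, which is what lets Google see the directive.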

What doesn't actually help

Manually requesting indexing for thousands of URLs (capped per day).
Submitting the sitemap repeatedly (Google re-fetches automatically).
Adding crawl-delay (Google ignores it).

The companion checks

Once crawl budget is healthy, the next question is discoverability: see how Google discovers new pages. For sitemap hygiene, see XML sitemap best practices.

FAQ

Q. Should I block JavaScript / CSS to save crawl budget? A. Never — Googlebot needs CSS and JS to render. Blocking them breaks rendering and downgrades the page.

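Q. Does crawl budget reset every day? A. Only loosely. Google does not publish a fixed daily quota; it adjusts how much it crawls each host continuously, based on past response times and crawl yield, so in Crawl stats it tends to look like a soft daily allowance.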

Q. Does adding more content increase crawl budget? A. Indirectly — sites that publish more often get crawled more often. But the crawl quota scales with site authority and server speed, not just publishing volume.
