XML sitemap best practices for India — sizing, lastmod, submission

XML sitemap rules that matter in 2026 — 50,000 URL ceiling, accurate lastmod, sitemap index grouping, robots.txt and Search Console submission. India focus.

24 April 2026 · 2 min read


Quick frame: a single XML sitemap caps at 50,000 URLs or 50 MB uncompressed, whichever is hit first. Beyond that, split into multiple sitemaps and reference them via a sitemap index. The single field that actually matters is lastmod: accurate, content-change-driven dates. Google ignores priority and changefreq.

Generating the sitemap

Two paths depending on tooling:

  • Framework-native: Next.js, Hugo, and Astro can all generate the sitemap as part of the build, either built in or through an official integration.
  • Manual / custom: paste URLs into the XML sitemap generator, download, host at /sitemap.xml.

For multi-thousand URL sites with multiple content types, group into themed sitemaps (sitemap-blog.xml, sitemap-products.xml, sitemap-pages.xml) and combine via the sitemap index generator.
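
For custom pipelines, the grouping and splitting described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names, the sitemap-<group>-<n>.xml naming scheme, and the example.in URLs are assumptions, and it enforces only the 50,000 URL ceiling, not the 50 MB size limit.

```python
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-file URL ceiling from the sitemap protocol


def build_urlset(entries):
    """entries: list of (loc, lastmod) pairs; lastmod is an ISO date string or None."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:  # omit the tag rather than emit a made-up date
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


def chunk(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_sitemaps(base_url, groups, max_urls=MAX_URLS):
    """groups: {"blog": [(loc, lastmod), ...], ...} -> {filename: XML bytes}.

    Each group is split into files of at most max_urls entries, and every
    file is listed in a sitemap index stored under "sitemap.xml".
    """
    files = {}
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for name, entries in groups.items():
        for n, part in enumerate(chunk(entries, max_urls), start=1):
            filename = f"sitemap-{name}-{n}.xml"  # hypothetical naming scheme
            files[filename] = build_urlset(part)
            sitemap = ET.SubElement(index, "sitemap")
            ET.SubElement(sitemap, "loc").text = f"{base_url}/{filename}"
    files["sitemap.xml"] = ET.tostring(index, encoding="utf-8", xml_declaration=True)
    return files
```

Calling build_sitemaps("https://www.example.in", groups) returns a dict of file names to XML bytes that a deploy step could write to the web root. A production version would also check each file's byte size against the 50 MB limit.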

lastmod is the only field that matters (for Google)

Google ignores both priority and changefreq. The signal that actually drives re-crawl prioritisation is lastmod, but only when it's honest.

A sitemap that bumps lastmod daily without any actual content change gets its dates ignored. A sitemap that updates lastmod only when content genuinely changes (republish, body edit, schema bump) gets those URLs re-crawled sooner.
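
One way to keep lastmod honest in a custom build is to fingerprint each page's content and move the date only when the fingerprint changes. The sketch below assumes the build can persist a small per-URL record between runs; the function names and the record shape ("hash", "lastmod") are hypothetical.

```python
import hashlib
from datetime import date


def content_fingerprint(body: str) -> str:
    """Stable hash of the page content that should drive lastmod."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def next_lastmod(body, previous, today):
    """Return the record to store for this build.

    previous: {"hash": ..., "lastmod": ...} from the last build, or None
    for a page that has never been published. lastmod moves to `today`
    only when the content hash differs from the stored one.
    """
    digest = content_fingerprint(body)
    if previous and previous["hash"] == digest:
        return {"hash": digest, "lastmod": previous["lastmod"]}
    return {"hash": digest, "lastmod": today.isoformat()}
```

Hash only what counts as a real change (body text, structured data), not timestamps or template chrome that are rewritten on every build, otherwise every page will look modified each night.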

Reference from robots.txt

Always add the sitemap URL to your robots.txt:

Sitemap: https://www.example.in/sitemap.xml

The major crawlers (Google, Bing, Yandex, DuckDuckGo) read this directive. It's the easiest way to expose sitemaps to engines without dashboard access. The URL must be fully qualified (scheme and host included), and the line can sit anywhere in the file. For multi-sitemap sites, you can list multiple Sitemap: lines.
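
To confirm what a crawler would see, Python's standard-library robots.txt parser can list the declared sitemaps. This is a small sketch: the robots.txt content and the declared_sitemaps helper are illustrative, and parsing a string instead of fetching the live file keeps the example offline.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt with two Sitemap: lines.
ROBOTS_TXT = """\
User-agent: *
Allow: /

Sitemap: https://www.example.in/sitemap-blog.xml
Sitemap: https://www.example.in/sitemap-products.xml
"""


def declared_sitemaps(robots_txt: str):
    """Return the sitemap URLs declared in a robots.txt body (empty list if none)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() (Python 3.8+) returns None when no Sitemap: line is present.
    return parser.site_maps() or []


sitemaps = declared_sitemaps(ROBOTS_TXT)
for url in sitemaps:
    print(url)
```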

Submission and monitoring

  1. Google Search Console → Sitemaps → submit your sitemap URL.
  2. Bing Webmaster Tools → Sitemaps → submit same URL.
  3. Re-check Search Console → Sitemaps weekly. Compare the number of URLs you submitted with the number actually indexed (the Page indexing report can be filtered by sitemap). Large gaps indicate crawl or content quality issues.

Indian-context sizing examples

  • Small blog (50–500 URLs): single sitemap.xml, hand-crafted is fine.
  • Mid e-commerce (1,000–50,000 URLs): single auto-generated sitemap; might split products from blog later.
  • Large marketplace (50,000+ URLs): sitemap index referencing multiple themed sitemaps, regenerated nightly.

Common pitfalls

  • Sitemap contains non-200 URLs (404s, 301s) — these waste crawl budget. Audit and prune.
  • Sitemap contains noindex pages — strip them. The sitemap should list indexable URLs only.
  • Sitemap contains the wrong protocol (http instead of https) — fix at generation time.
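
The three checks above are easy to automate once you have crawl data for the URLs in the sitemap. The sketch below assumes some earlier step has already produced (url, status, noindex) tuples, for example from a crawler export; the function names and that tuple shape are assumptions.

```python
from urllib.parse import urlsplit


def audit_entry(url, status, noindex):
    """Return a list of reasons this URL should not be in the sitemap."""
    problems = []
    if urlsplit(url).scheme != "https":
        problems.append("not https")
    if status != 200:  # covers 404s and redirects such as 301
        problems.append(f"status {status}")
    if noindex:
        problems.append("noindex")
    return problems


def prune(crawl_results):
    """Split crawl results into URLs to keep and URLs to drop (with reasons)."""
    keep, drop = [], {}
    for url, status, noindex in crawl_results:
        problems = audit_entry(url, status, noindex)
        if problems:
            drop[url] = problems
        else:
            keep.append(url)
    return keep, drop
```

Feeding the keep list back into sitemap generation, and logging the drop dict, gives a simple audit trail for each build.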

The companion piece on robots.txt is "robots.txt mistakes that hide your site".

FAQ

Q. Should I include hreflang in the sitemap or in HTML? A. Either works. For sites with many language variants, sitemap-level hreflang reduces per-page bloat.
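
For the sitemap-level option, each url entry carries xhtml:link alternates for every language version, itself included. A minimal sketch of generating that markup follows; the en-in and hi-in codes, the example.in URLs, and the function name are illustrative assumptions.

```python
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Note: register_namespace changes process-wide serializer settings.
ET.register_namespace("", SITEMAP_NS)       # sitemap tags without a prefix
ET.register_namespace("xhtml", XHTML_NS)    # alternates as xhtml:link


def build_hreflang_sitemap(alternates):
    """alternates: {hreflang code: URL} for the versions of ONE page.

    Emits one <url> per version, each listing all versions as alternates.
    """
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc in alternates.values():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        for lang, href in alternates.items():
            ET.SubElement(
                url,
                f"{{{XHTML_NS}}}link",
                rel="alternate",
                hreflang=lang,
                href=href,
            )
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


xml_bytes = build_hreflang_sitemap({
    "en-in": "https://www.example.in/en/pricing",
    "hi-in": "https://www.example.in/hi/pricing",
})
```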

Q. Does Google index a sitemap on submission? A. No — submission tells Google to fetch the sitemap; it's a guide for crawling, not a guarantee of indexing.

Q. How often should I re-submit? A. You don't need to. Google re-fetches sitemaps it already knows about on its own schedule, and accurate lastmod values tell it which URLs changed. Manual re-submission is rarely useful.
