Script Deep Dive: generate-sitemap.ps1
Sitemap Protocol 0.9 · changefreq · priority

What This Script Does
A sitemap doesn't guarantee indexing; it is only a hint. Search engines are free to ignore listed pages and to crawl pages that are not listed. Still, an accurate, up-to-date sitemap can noticeably speed up the discovery of new content, especially on new websites and for pages that are rarely updated.

The Three Fields of the Sitemap Protocol
Each entry has a required <loc> and three optional fields:
<loc> (required): the full URL of the page, including protocol and domain
<lastmod>: last modified date, in YYYY-MM-DD format. Search engines use it to prioritize crawling recently updated pages
<changefreq>: update frequency hint (daily / weekly / monthly), helping crawlers allocate their crawl budget
<priority>: relative priority (0.0-1.0). Homepage is 1.0, blog posts are 0.7, tool pages are 0.3-0.5
Honestly, modern search engines pay less attention to [...]

What Pages Are Scanned
The current version scans the following, generating approximately 40 URLs:
src/content/blog/ and blog/en/, automatically skipping drafts
src/content/page/ and page/en/
archive.html, tags.html, stats.html
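
To make the fields described above concrete, here is a minimal sketch of how such entries could be serialized into a Sitemap Protocol 0.9 document. It is written in Python for illustration only and is not the site's PowerShell script; `UrlEntry`, `build_sitemap`, and the example.com URLs are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional
import xml.etree.ElementTree as ET

# Namespace defined by Sitemap Protocol 0.9.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


@dataclass
class UrlEntry:
    """One <url> entry (hypothetical helper type for this sketch)."""
    loc: str                           # required: full URL with protocol and domain
    lastmod: Optional[date] = None     # optional: written as YYYY-MM-DD
    changefreq: Optional[str] = None   # optional: e.g. "daily", "weekly", "monthly"
    priority: Optional[float] = None   # optional: 0.0 to 1.0


def build_sitemap(entries: Iterable[UrlEntry]) -> str:
    """Serialize entries into a sitemap XML string, omitting unset optional fields."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = entry.loc
        if entry.lastmod is not None:
            ET.SubElement(url, "lastmod").text = entry.lastmod.isoformat()
        if entry.changefreq is not None:
            ET.SubElement(url, "changefreq").text = entry.changefreq
        if entry.priority is not None:
            ET.SubElement(url, "priority").text = f"{entry.priority:.1f}"
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Priorities follow the values quoted in the text: homepage 1.0, blog post 0.7.
    print(build_sitemap([
        UrlEntry("https://example.com/", date(2026, 1, 15), "weekly", 1.0),
        UrlEntry("https://example.com/blog/hello.html", date(2026, 1, 10), "monthly", 0.7),
    ]))
```

Only `<loc>` is always written; the other three elements appear only when a value is supplied, which matches their optional status in the protocol.
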

Why Static Generation Instead of Dynamic
Some websites use CGI or PHP to dynamically generate sitemaps (scanning the database on each request). This site chose static generation. The reasons are straightforward:
This is a consistent design philosophy: static first, CGI as a last resort.

Output Location and Auto-Discovery
The sitemap outputs to the project root (not [...] This site's [...]

Search Engine Discovery Flow
A complete search engine discovery flow goes like this: the crawler first reads [...]
In 2026, a 90s-style website with no JavaScript and pure table-based layout can still be found by Google, thanks in no small part to the sitemap and solid SEO metadata.
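
The text above does not say what the crawler reads first. One widely used convention, assumed here rather than confirmed for this site, is a `Sitemap:` line in robots.txt. The sketch below shows how a client could extract that line using Python's standard `urllib.robotparser`; `discover_sitemaps`, the robots.txt body, and the example.com URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def discover_sitemaps(robots_txt: str) -> list[str]:
    """Return the sitemap URLs advertised by Sitemap: lines in a robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap: line is present (Python 3.8+).
    return parser.site_maps() or []


# Hypothetical robots.txt content, not taken from the site described here.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Allow: /
Sitemap: https://example.com/sitemap.xml
"""

if __name__ == "__main__":
    for url in discover_sitemaps(EXAMPLE_ROBOTS_TXT):
        print(url)
```

Parsing a string rather than fetching a live URL keeps the example offline and repeatable; a real crawler would download robots.txt first.
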