Hugo and sitemap.txt

I’m slowly moving the Infinite Ink website out of the 1990s and, as part of this process, I’m learning about sitemaps, which were introduced by Google in 2005 (~15 years ago).

About sitemaps

Sitemaps are used by search engines to crawl and index websites. Details are at these links:

To learn about sitemaps in Hugo, see:

Hugo’s default sitemap

Hugo’s default is to output a file named sitemap.xml in a project’s public/ directory using the built-in sitemap template, which you can view at:

As of November 2020, this template looked like this:

{{ printf "<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"yes\"?>" | safeHTML }}
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
  xmlns:xhtml="http://www.w3.org/1999/xhtml">
  {{ range .Data.Pages }}
  <url>
    <loc>{{ .Permalink }}</loc>{{ if not .Lastmod.IsZero }}
    <lastmod>{{ safeHTML ( .Lastmod.Format "2006-01-02T15:04:05-07:00" ) }}</lastmod>{{ end }}{{ with .Sitemap.ChangeFreq }}
    <changefreq>{{ . }}</changefreq>{{ end }}{{ if ge .Sitemap.Priority 0.0 }}
    <priority>{{ .Sitemap.Priority }}</priority>{{ end }}{{ if .IsTranslated }}{{ range .Translations }}
    <xhtml:link
                rel="alternate"
                hreflang="{{ .Language.Lang }}"
                href="{{ .Permalink }}"
                />{{ end }}
    <xhtml:link
                rel="alternate"
                hreflang="{{ .Language.Lang }}"
                href="{{ .Permalink }}"
                />{{ end }}
  </url>
  {{ end }}
</urlset>

As the {{ range .Data.Pages }} line in the template above shows, this produces a list of all .Data.Pages of a website.

Using sitemap.txt instead of sitemap.xml

As part of my foray into the world of sitemaps, I’ve decided to use a simple sitemap.txt file rather than a complicated (to me) sitemap.xml file. To learn about the sitemap.txt file, see:

To set up Infinite Ink’s website to use a sitemap.txt file, I took the following three steps.

1. Override the default sitemap output file name

To tell Hugo to name the sitemap file sitemap.txt, I put the following in Infinite Ink’s config.yaml:

sitemap:
  filename: sitemap.txt

If your Hugo config file is written in TOML rather than YAML, you can use this syntax:

[sitemap]
  filename = "sitemap.txt"

2. Override the default sitemap template

To use my own sitemap template, rather than what’s listed above in Hugo’s default sitemap, I created layouts/_default/sitemap.xml in my Infinite Ink project root, which contains this:

Note that I hand-code which portals are listed because some Infinite Ink portals are not yet ready to be announced to the world.

To view Infinite Ink’s current sitemap, see www.ii.com/sitemap.txt.

The output file, sitemap.txt, must be UTF-8 encoded.
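
For readers who would rather list every page automatically than hand-code the list, a minimal sketch of a plain-text sitemap template might look like the following. It is not Infinite Ink’s actual template. It assumes the file lives at layouts/_default/sitemap.xml, that .Data.Pages holds the pages to list (as in the default template), and that piping through safeHTML is wanted so that characters such as & in a URL are not HTML-escaped.

```go-html-template
{{- /* Hypothetical sketch, not Infinite Ink's actual template. */ -}}
{{- /* Emits one absolute URL per line; safeHTML is assumed here to avoid HTML escaping. */ -}}
{{ range .Data.Pages -}}
{{ .Permalink | safeHTML }}
{{ end -}}
```

With a template along these lines, the generated sitemap.txt should contain nothing but absolute URLs, one per line, which is the layout the plain-text sitemap format expects.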

3. Update robots.txt

To tell web crawlers about this sitemap file, I added the following to Infinite Ink’s robots.txt file.

Sitemap: https://www.ii.com/sitemap.txt
User-agent: *

I did this directly on Infinite Ink’s web server, but it’s possible to use Hugo to maintain a website’s robots.txt.
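
As a rough sketch of the Hugo-managed approach, assuming enableRobotsTXT is set to true in the site config and a custom template is placed at layouts/robots.txt, the template could be as small as the one below. The absURL call builds the sitemap address from the site’s baseURL, so the result depends on that setting. This is an illustration, not the file Infinite Ink uses.

```go-html-template
{{- /* Hypothetical layouts/robots.txt; assumes enableRobotsTXT: true in config.yaml. */ -}}
Sitemap: {{ "sitemap.txt" | absURL }}
User-agent: *
```

If baseURL is https://www.ii.com/, this should render lines close to the two shown in step 3, and the sitemap address then follows any later change to baseURL.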
