Hugo and sitemap.txt

I’m slowly moving the Infinite Ink website out of the 1990s and, as part of this process, I’m learning about sitemaps, which were introduced by Google in 2005 (~16 years ago).

About sitemaps

Sitemaps are used by search engines to crawl and index websites. Details are at these links:

To learn about sitemaps in Hugo, see:

Hugo’s default sitemap

Hugo’s default is to output a file named sitemap.xml in a project’s public/ directory using the built-in sitemap template, which you can view at:

As of February 2021, this template looks like this:

{{ printf "<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"yes\"?>" | safeHTML }}
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
  xmlns:xhtml="http://www.w3.org/1999/xhtml">
  {{ range .Data.Pages }}
    {{- if .Permalink -}}
  <url>
    <loc>{{ .Permalink }}</loc>{{ if not .Lastmod.IsZero }}
    <lastmod>{{ safeHTML ( .Lastmod.Format "2006-01-02T15:04:05-07:00" ) }}</lastmod>{{ end }}{{ with .Sitemap.ChangeFreq }}
    <changefreq>{{ . }}</changefreq>{{ end }}{{ if ge .Sitemap.Priority 0.0 }}
    <priority>{{ .Sitemap.Priority }}</priority>{{ end }}{{ if .IsTranslated }}{{ range .Translations }}
    <xhtml:link
                rel="alternate"
                hreflang="{{ .Language.Lang }}"
                href="{{ .Permalink }}"
                />{{ end }}
    <xhtml:link
                rel="alternate"
                hreflang="{{ .Language.Lang }}"
                href="{{ .Permalink }}"
                />{{ end }}
  </url>
    {{- end -}}
  {{ end }}
</urlset>

As the line {{ range .Data.Pages }} near the top of the template shows, this produces a list of all .Data.Pages of a website.
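To make the contrast with a plain-text sitemap concrete, here is a minimal Python sketch (not part of Hugo or of Infinite Ink’s setup) that reduces a sitemap.xml to just its loc URLs, one per line. The function name sitemap_xml_to_urls and the example.org sample data are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by the <urlset>, <url>, and <loc> elements in the template above.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_xml_to_urls(xml_bytes: bytes) -> list[str]:
    """Return the text of every <url>/<loc> element, in document order."""
    root = ET.fromstring(xml_bytes)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical sample data; example.org stands in for a real site.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.org/</loc></url>
  <url>
    <loc>https://example.org/about/</loc>
    <lastmod>2021-02-01T00:00:00+00:00</lastmod>
  </url>
</urlset>
"""

if __name__ == "__main__":
    # Prints one URL per line, which is the shape of a sitemap.txt file.
    print("\n".join(sitemap_xml_to_urls(SAMPLE)))
```

Everything except the URL itself (lastmod, changefreq, priority, and the translation links) is dropped, which is the trade-off of the text format.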

Using sitemap.txt instead of sitemap.xml

As part of my foray into the world of sitemaps, I’ve decided to use a simple sitemap.txt file rather than a complicated (to me) sitemap.xml file. To learn about the sitemap.txt file, see:

To set up Infinite Ink’s website to use a sitemap.txt file, I took the following three steps.

1. Override the default sitemap output file name

To tell Hugo to name the sitemap file sitemap.txt, I put the following in Infinite Ink’s config.yaml:

sitemap:
  filename: sitemap.txt

If your Hugo config file is written in TOML rather than YAML, you can use this syntax:

[sitemap]
  filename = "sitemap.txt"

2. Override the default sitemap template

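To use my own sitemap template rather than the one listed in the section Hugo’s default sitemap above, I created layouts/_default/sitemap.xml in Infinite Ink’s project root. The template file keeps the name sitemap.xml; only the output file is renamed by the filename setting from step 1. The template contains this: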

To learn about site.Params.mainSections, which is used in line 2 of this layout file, see gohugo.io/functions/where/#mainsections.

Note that I hand-code the list of portals because some Infinite Ink portals are not ready to be indexed by search engines.

The output file, sitemap.txt, must be UTF-8 encoded.
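The UTF-8 requirement, along with the one-URL-per-line format, can be checked with a small script. The following Python sketch is an illustration under stated assumptions, not an official validator: the function name check_sitemap_txt is hypothetical, the example.org URLs are placeholders, and the checks cover only UTF-8 decoding, one absolute http(s) URL per line, and the 50,000-URL limit from the sitemaps.org protocol.

```python
from urllib.parse import urlsplit

MAX_URLS = 50_000  # per-file URL limit in the sitemaps.org protocol


def check_sitemap_txt(data: bytes) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError as err:
        return [f"not valid UTF-8: {err}"]

    problems = []
    lines = text.splitlines()
    if len(lines) > MAX_URLS:
        problems.append(f"{len(lines)} lines exceeds the limit of {MAX_URLS} URLs")

    for number, line in enumerate(lines, start=1):
        try:
            parts = urlsplit(line)
        except ValueError:
            parts = None
        # Each line must be a fully qualified http(s) URL and nothing else.
        if parts is None or parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append(f"line {number}: not an absolute http(s) URL: {line!r}")
    return problems


if __name__ == "__main__":
    # Hypothetical file contents; in practice, read public/sitemap.txt as bytes.
    sample = b"https://example.org/\nhttps://example.org/about/\n"
    print(check_sitemap_txt(sample) or "no problems found")
```

Reading the file in binary mode and decoding it explicitly is what makes the encoding check meaningful; opening it in text mode would hide which encoding was actually on disk.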

To view Infinite Ink’s current sitemap, see www.ii.com/sitemap.txt.

3. Update robots.txt

To tell web crawlers about this sitemap file, I added the following to Infinite Ink’s robots.txt file.

Sitemap: https://www.ii.com/sitemap.txt
User-agent: *

I did this directly on Infinite Ink’s web server, but it is possible to use Hugo to maintain a website’s robots.txt.
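To confirm that a robots.txt parser picks up the Sitemap line, Python’s standard urllib.robotparser module can be used. This is a minimal sketch that parses the two lines shown above from a string rather than fetching them from the server; the helper name sitemaps_in is an assumption for illustration, and site_maps() requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content shown above, inlined so the sketch runs offline.
ROBOTS_TXT = """\
Sitemap: https://www.ii.com/sitemap.txt
User-agent: *
"""


def sitemaps_in(robots_txt: str) -> list[str]:
    """Return the URLs named by Sitemap lines in the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    print(sitemaps_in(ROBOTS_TXT))
```

The Sitemap line is independent of any User-agent group, so the parser reports it regardless of where it appears in the file.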
