Migrating to Hugo
I have been meaning to move away from Wordpress to a static site generator for a very long time, due to:
- The slowness of WP, since every page request makes multiple database calls due to the spaghetti code nature of WP and its plugin architecture. Caching can help somewhat, but it has brittle edge cases.
- Its record of security holes. I mitigated this somewhat by isolating PHP as much as possible.
- It is almost impossible to follow front-end optimization best-practices like minimizing the number of CSS and JS files because each WP plugin has its own
My original plan was to go with Acrylamid, but about a year ago I started experimenting with Hugo. Hugo is blazing fast because it is implemented in Go rather than a slow language like Python or Ruby, and this is game-changing. Nonetheless, it took me over a year to migrate. This post is about the issues I encountered and the workflow I adopted to make it work.
Wordpress content migration
There is a migration tool, but it is far from perfect despite the author’s best efforts, mostly because of the baroque nature of Wordpress itself when combined with plugins and an old site that used several generations of image gallery technology.
Unfortunately, that required rewriting many posts, specially those with photos or embedded code.
Photo galleries
Hugo does not (yet) support image galleries natively. I started looking at the HugoPhotoSwipe project, but got frustrated by bugs in its home-grown YAML parser that broke round-trip editing, and made it very difficult to get galleries with text before and after the gallery proper. The Python-based smartcrop for thumbnails is also excruciatingly slow.
I wrote hugopix to address this. It uses a simpler one-way index file generation method, and the much faster Go smartcrop implementation by Artyom Pervukhin.
Broken asset references
Posts with photo galleries were particularly broken, due to WP’s insistence on
replacing photos with links to image pages. I wrote a tool to help me
find broken images and other assets, and organize them in a more rational way
(e.g. not have PDFs or source code samples be put in static/images
).
It also has a mode to identify unused assets, e.g. 1.5GB of images that no longer belong in the hugo tree as their galleries are moving elsewhere.
Password-protected galleries
I used to have galleries of family events on my site, until an incident where some Dutch forum started linking to one of my cousin’s wedding photos and making fun of her. At that point I put a pointed error message for that referrer and controlled access using WP’s protected feature. That said, private family photos do not belong on a public blog and I have other dedicated password-protected galleries with Lightroom integration that make more sense for that use case, so I just removed them from the blog, shaving off 1.5GB of disk in the bargain.
Search
There are systems that can provide search without any server component, e.g. the JavaScript-based search in Sphinx, and I looked at some of the options referenced by the Hugo documentation like the Bleve-based hugoidx but the poor documentation gave me pause, and I’d rather not run Node.js on my server as needed by hugo-lunr.
Having recently implemented full-text search in Temboz using SQLite’s FTS5 extension, I felt more comfortable building my own search server in Go. Because Hugo and fts5index share the same Go template language, this makes a seamless integration in the site’s navigation and page structure easy.
Theme
There is no avoiding this, moving to a new blogging system requires a rewrite of a new theme if you do not want to go with a canned theme. Fortunately, Hugo’s theme system is sane, unlike Wordpress’, because it does not have to rely on callbacks and hooks as much as with WP plugins.
RSS GUIDs and permalinks
One pet peeve of mine is when sites change platform with new GUIDs or permalinks in the RSS feeds, causing a flood of old-new articles to appear in my feed reader. Since I believe in showing respect to my readers, I had to avoid this at all costs, and also put in place redirects as needed to avoid 404s for the few pages that did change permalinks (mostly image galleries).
Doing so required copying the embedded RSS template and changing:
<guid>{{ .Permalink }}</guid>
to:
<guid isPermaLink="false">{{ .Params.rss_guid | default .Permalink }}</guid>
The next step was to add rss_guid
to the front matter of the last 10
articles in my legacy RSS feed.