March 25, 2021

Bleve

I think I’ve finally got this working properly, but who knows how much longer it will stay online so I’d better write this quickly.

I’d been looking for a better way to run searches for this blog from within the config. That is to say, for certain routes, I’d like posts with certain characteristics to appear. The criteria for those posts amount to a search query, and each query is associated with a route in the configuration.

I’d gotten the route configuration pretty well settled, but didn’t have a good way to query the in-memory database for these disparate criteria. Until now.

There is a Go package called Bleve that is working reasonably well to return posts based on facets. I define my posts in markdown with yaml frontmatter. In this way, I can associate metadata of any kind with any post. In particular, all of my posts have categories, a date, their “slug”, and authors. Posts are also assigned a facet of metadata based on the repository of posts they come from, such as “posts” or “pages”, which are stored separately. I can then search for a related post by sending the query to the Bleve query processor. Bleve returns the search results as a list of document IDs. For me, these IDs are the slugs of the posts, which lets me pull the full post from the in-memory key-value store holding the posts.

The Bleve searches allow multiple facets. If I want a list of posts within the “blog” category, I’d search for “Repo:posts Categories:blog”. I can also tell Bleve to return a specific number of results starting at a specific index in the list, which is really handy for pagination. I do not yet have a pagination control on the site, but when I do, it will be able to feed that into the query. The query side already works; the site just doesn’t render the HTML that would let you move from page to page yet. I’m still considering an elegant way to define that in the page templates.

Bleve is useful, but it has some weird flaws. For one, the documentation is atrocious. Most Go libraries have at least some minimal godoc documentation (built with a standard tool into a standard format, which is not great but is at least something), whereas Bleve seems to be OK with documenting just a few things and letting you sift through the source for all of the details.

Worse, there are performance differences between my Mac and the Ubuntu server that the site runs on. Maybe the performance should be different on different hardware, but I wouldn’t expect such a drastic difference.

Without changing some concurrency settings, Bleve crashes with out-of-memory errors pretty easily on Ubuntu. It seems to have trouble with concurrent index insertions, which to me basically means it’s not thread-safe. I’m going to have to rewrite a bunch of stuff to work around this properly. If I use only one worker instead of 32 (and 32 is only a limit on Ubuntu, not OS X), then the whole thing seems to work reliably enough. At least, I’m not seeing it eat memory like it was, and it’s not crashing like before.

Unfortunately, the additional searching takes a toll on performance, and I’m back up into the 300ms range. This is really weird because locally, round-trip times are under 60ms for the whole page. But on a server dedicated to this one application, it’s taking hundreds of milliseconds. I know I’m quibbling over milliseconds, but that was the whole point! Now I’m going to have to build a cache to get the speed I want, and that just feels like cheating.

Ok. Next up, a thing in Svelte that lets you create a configuration file. If it can do round-trip editing of the config, that’ll be pretty sweet. Then I’ll ship the app with a few pages of documentation, and all you’ll need to do to operate the whole app is literally run it with go run sn.go and then open the browser to look at the site.