Today I ran into a great use case for a local LLM: converting grep output into config code.

While looking at some metrics for the various websites I run, I noticed this blog (brainsik.net) was suddenly sending 404s for very old URLs from when I was briefly hosted on Tumblr. Those URLs had a format like “/post/[long-number]/title-of-the-post”. IIRC, the long number was the actual identifier and it didn’t matter what text came after. Why these were suddenly being hit after a decade-plus, I do not know.

When I migrated from Tumblr to my own static setup I added the following to the markdown front matter for posts I created on Tumblr:

tumblr_permalink:
- http://brainsik.tumblr.com/post/3282351590/bomb-crater-swimming-pools

I didn’t want to lose where the original source was, just in case. That came through today. I was able to grep through my posts directory and quickly see everything that had a Tumblr URL. Here’s the truncated output:

$ rg 'tumblr.com/post'
content/posts/2011-02-14-bomb-crater-swimming-pools.md
6:- http://brainsik.tumblr.com/post/3282351590/bomb-crater-swimming-pools

content/posts/2012-12-15-python-cron-task-exit-if-already-running.md
6:- http://brainsik.tumblr.com/post/37951578978/python-cron-task-exit-if-already-running

content/posts/2011-02-13-tools-never-die-waddaya-mean-never.md
6:- http://brainsik.tumblr.com/post/3277916749/tools-never-die-waddaya-mean-never

This output had everything I needed to create Caddy redirect rules like:

redir /post/3282351590/* /2011/bomb-crater-swimming-pools/ permanent

There were enough entries that I didn’t want to edit things by hand. I started to consider whether it would be quicker to write some dirty Perl or Python when I thought: “Can an LLM just convert this for me?” I didn’t want the LLM to write a script; I wanted it to directly output the redir rules.
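For the record, the throwaway script I was tempted to write might have looked something like this. A sketch, not what I actually ran: it assumes the rg output format shown above (a filename line followed by matching lines), and the helper name is mine:

```python
import re

def rules_from_rg(output: str) -> list[str]:
    """Convert `rg 'tumblr.com/post'` output into Caddy redir rules."""
    rules = []
    filename = None
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.endswith(".md"):
            # A filename line, e.g. content/posts/2011-02-14-bomb-crater-swimming-pools.md
            filename = line
            continue
        m = re.search(r"tumblr\.com/post/(\d+)", line)
        if m and filename:
            # Derive the new path from the filename: year + slug
            # (strip directories, the YYYY-MM-DD- prefix, and the .md suffix).
            name = filename.rsplit("/", 1)[-1].removesuffix(".md")
            year, slug = name[:4], name[11:]
            rules.append(f"redir /post/{m.group(1)}/* /{year}/{slug}/ permanent")
    return rules
```

Piping the full rg output through this would print one redir line per post. But the whole point was to not write even this much code for a one-off.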

Turns out not only is this within some LLMs’ capability, it can be done with a tiny local model! Here’s the truncated result from a local Gemma 3n (4B)¹ model:

redir /post/3282351590/* /2011/bomb-crater-swimming-pools/ permanent
redir /post/37951578978/* /2012/python-cron-task-exit-if-already-running/ permanent
redir /post/3277916749/* /2011/tools-never-die-waddaya-mean-never/ permanent

This is a really small model. It’s so small I can run it on my phone:

Screenshot from iOS showing the LocallyAI app using the Gemma 3n model. We see the end of the prompt and the first 3 results are the same as when run on the desktop.

It’s exciting to me that a one-off, bespoke text conversion problem like this can be quickly solved by a local LLM. The main issue was that I initially didn’t trust the output, so I took the time to verify it. Given how high LLM error rates are, I’m not sure when I would just yolo and take the result at face value. That verification cost is quite a factor in deciding when to use this tactic.
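Eyeballing worked here, but with a longer list I’d want a mechanical cross-check. One sketch (the function name, and the assumption that post files are named like `YYYY-MM-DD-slug.md` and contain the Tumblr URL in their front matter, are mine):

```python
import re
from pathlib import Path

RULE = re.compile(r"redir /post/(\d+)/\* /(\d{4})/([^/]+)/ permanent")

def verify_rules(rules: str, posts_dir: str) -> list[str]:
    """Return the redir rules that could NOT be confirmed against the sources."""
    unconfirmed = []
    for line in rules.splitlines():
        m = RULE.fullmatch(line.strip())
        if not m:
            continue
        post_id, year, slug = m.groups()
        # A rule checks out if a post file matching the year and slug
        # actually contains that Tumblr post id in its front matter.
        ok = any(
            post_id in p.read_text()
            for p in Path(posts_dir).glob(f"{year}-*-{slug}.md")
        )
        if not ok:
            unconfirmed.append(line.strip())
    return unconfirmed
```

An empty return means every rule traces back to a real post file, which is about as much trust as a one-off like this needs.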

In the past, I found much value in solving these coding puzzles. They felt like good exercises to keep those programming muscles toned. People should probably still do them, especially while still building up those chops. At this point in my life / career, programming is a tool, not the job, and having done so many of these puzzles already, I find I often just want to solve the main problem at hand and move on. I’m curious to find out when it feels right to still do them.


  1. The exact model I’m using on the laptop is mlx-community/gemma-3n-E4B-it-lm-4bit