Every once in a while, we all feel the need to modify something that someone else built.

Sometimes those patches make sense to upstream, but not always.
Sometimes they need a bit more time to bake, before they're ready to share with the world.
Sometimes they're too specific to your environment.
Sometimes it's just some personal preference, that the upstream wouldn't want to force upon everyone.
And sometimes, just sometimes, you just want to run it yourself now, before it has had the time to make it through the whole review gauntlet.

Hey.
Just a heads up, this is a (very) long-winded[1] announcement for a new project of mine, Lappverk.
The rest of the post will provide some background on the hows and whys, and I'd recommend sticking around for that.
But you can always jump straight over there, if you'd like. I won't hold you.

This is easy enough, right?

These days, pretty much every project uses Git. It is, famously, a DVCS. A Distributed Version Control System. Distributed. It's right there in the acronym. That means that it can work without any central repository.

You can just clone anything you want, and commit whatever changes you want. You can make your own new central repository, and push your changes to there... original project be damned!

Hell, most Git hosts[2] have an easy-to-use button to do just that: "Fork"!

So... that's the dream, right? Just use Git for what it's good at? Right? Right?

One tiny catch...

Git really wants history to roll forward. You commit a change, and then you make another commit that builds on that. You never change an existing commit, you just add new commits with the further changes.

Or, well. You can change a commit,[3] but Git doesn't have any way to relate the two commits. It just sees two entirely different timelines, and gets very confused if they ever end up mixing. There's a reason that "Never rewrite public branches" is a very common mantra.
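
To make that concrete, here's a small sketch in a throwaway repository (the file name, identity, and commit message are all made up for the example). It amends a commit, and then asks Git whether the old commit is part of the new one's history:

```shell
set -eu

# Throwaway repository; every name in here is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
before=$(git rev-parse HEAD)

# "Change" the commit. Git actually writes a brand new commit object.
echo "second draft" > notes.txt
git commit -q -a --amend -m "Add notes"
after=$(git rev-parse HEAD)

echo "before: $before"
echo "after:  $after"

# The old commit is not an ancestor of the new one,
# so nothing in the history connects the two.
if git merge-base --is-ancestor "$before" "$after"; then
    echo "related"
else
    echo "unrelated: two separate timelines"
fi
```

The old commit still exists in the local object store for a while, but nothing that you push says "this one replaced that one".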

And that works when you're working on a (relatively) static foundation. When you're working on your own project, or even working on a patch that you expect to integrate relatively soon into some public project.

You pick some foundation (usually just main), make your feature branch, send it through review,[4] maybe backport it to a few older versions, and then it just... becomes part of the project's history. Irrelevant... until, a few years later, someone comes back to bisect past it, looking for where on $DEITY's Green Earth™ that damn bug was introduced.

But such is not the fate of the capital-P Patch. When we make a Patch that we intend to maintain over time, we have different needs.

Suddenly, we want to have the opposite of what Git wants us to do.
We want to have atomic patches representing, well, one idea.
We want to be able to change that patch.
We want to track how the patch changed, just like how Git lets us track changes through files.

A Patch lives on, isolated from the upstream. For months, for years, for decades.
It gets combined with other, unrelated, patches.
It gets[5] rebased as the upstream releases new versions.
It gets split out from the rest, when it's finally time to submit it upstream.
And, sometimes, it gets discarded.

And Git just.. can't really provide that.
If we roll forward, all of our different changes inevitably end up tangled together.
If we continuously rewrite the history of one branch.. we don't really have a history anymore. We just have a shared folder with extra steps.

Hey, what about Jujutsu?

Jujutsu (jj) is a pretty new version control system, whose main claim to fame is making it easier to change your history. It even keeps an "operation log" tracking the repository's state over time!

In theory, that'd be a great start... but jj still only concerns itself with local behaviour. When you push a jj repository, you're still only pushing the Git history that it currently corresponds to. If you rewrite the history and push again, you push a completely distinct Git history, just like what Git itself would have produced. There is, as far as I know, no support for sharing or collaborating on the operation log itself.

The granularity of the oplog is also a bit unhelpful.. it automatically creates a new entry for every jj command you run, which ends up closer to your editor's undo history than it is to the intentional kind of history that Git tracks.

Surely, this has been solved already?

A while ago, the question came up at my old job. My (then-)manager referred us to an old Stack Overflow question on the topic. ...a question that he wrote, about a decade earlier. One that, to this day, doesn't really have an answer, beyond a dejected shrug.

Surely... this has to be a common problem already?

Now.. who has run into this problem before? Who has to manage a lot of patches for... everything?

If you're anything like me, you're all shouting at your monitors at this point.

Natalie, you're talking about Linux distributions!

Yes.. yes.. point taken. Let's have a look at how the pros do it. How do they do it, anyway?

A patchy quilt

Most distributions maintain their patches as .patch files. You know the ones, the ones you get from git diff (and diff!).

diff --git a/README.org b/README.org
index 3699653..3686b62 100644
--- a/README.org
+++ b/README.org
@@ -3,3 +3,5 @@
 Lorem ipsum, blah blah.
+
+Look, here's one just now!

As it turns out, they have some pretty neat properties. They're just text, so we can version and diff them like any other text file. We can store them in our regular version control system of choice. (Git?)

But... where do they come from, anyway?

Red Hat doesn't really have an answer, beyond "run diff!". That works when you have one change that you want to save... but it doesn't really help us beyond that. If we have multiple changes, it's on us to split up each change cleanly, or to manage changes to these diffs. Oh well.

Nixpkgs tells us to prefer generating patches from upstream commits if possible. That's reasonable advice - if it's been merged then the patch probably won't be needed for any future releases, so we just want things to work now. But for other cases, there's little to say apart from "Spin up a temporary git clone, and export each patch yourself". Meh.

But Arch, Debian, and Ubuntu paint a different picture. They refer to this otherwise little-known tool called Quilt. Quilt gives us something that looks.. remarkably like a version control system. We can record changes, and navigate between them. We can view rudimentary histories! And it all operates natively on .patch files!

Oh happy days, our problems are solved!

But it quickly becomes evident that Quilt is no Git. Aside from the commands being pretty far from what we're used to,[6] it's pretty easy to screw up.

Quilt doesn't really have a baseline for what it expects the original source tree to look like, so every time we want to change a file we need to tell it that before we make the change, or the change will end up attributed to the wrong patch (or missed, altogether).

And, that's kind of a theme throughout. It can do what you want, but there'll be some awkwardness involved.

Can't we have the same kind of readable and versionable (text-first) history, but in a way that feels more... Gitty?

Enter, Lappverk

What if we could just.. import our patchset into Git, at the start of our session, and then export it back out as a series of patch files when we're done? That way we're effectively following the commandments of The Holy Quiltmother, but still get to keep our familiar Git UI.

After all, nothing's more Git than Git.

As it turns out, Git actually has built-in commands for doing that kind of import and export: format-patch (creates patch files from the history, one patch per commit) and am (imports patch files, creating one commit per patch file).

They're not quite there, you still need to specify revision ranges manually, and make sure you're not importing the same patch twice. And they don't quite roundtrip cleanly, so every time you do an am && format-patch cycle you end up with ever-so-slightly different patch files.
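
For reference, a bare-bones version of that cycle with nothing but stock Git might look something like this. It's a sketch, not what Lappverk runs: the repository, the upstream-base tag, and the commit messages are all invented for the example.

```shell
set -eu

work=$(mktemp -d)
cd "$work"
git init -q src
cd src
git config user.name "Example"
git config user.email "example@example.invalid"

# Stand-in for the upstream release that we want to patch
echo "Lorem ipsum, blah blah." > README.org
git add README.org
git commit -q -m "Upstream release"
git tag upstream-base

# Our patch, as a regular commit on top of it
echo "Look, here's one just now!" >> README.org
git commit -q -a -m "Add a local note"
patched_tree=$(git rev-parse 'HEAD^{tree}')

# Export: one patch file per commit in the range that we spell out by hand
git format-patch -q -o "$work/patches" upstream-base..HEAD

# Import: go back to the pristine base, then replay the patch files
git checkout -q --detach upstream-base
git am -q "$work"/patches/*.patch
imported_tree=$(git rev-parse 'HEAD^{tree}')

ls "$work/patches"
echo "patched tree:  $patched_tree"
echo "imported tree: $imported_tree"
```

The file contents survive the trip, so both tree IDs match. The commit itself is recreated by git am though, and the first line of every patch file embeds the ID of the commit it was exported from, which is one place where the churn between exports can come from.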

But those are edges we can file off, the core idea was there.

And so, I made Lappverk, a tool that adds some conventions around the workflow.

It normalizes commit metadata, so every export looks the same. It keeps track of the upstream, and it knows where the upstream commits end and your patches begin. And (imports/exports aside), it's just the same old git that your tooling already knows how to work with.
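
To give a rough idea of what "every export looks the same" means, here's a sketch that gets part of the way there using two stock git format-patch flags, --zero-commit and --no-signature. This is only an illustration of the idea, and I'm not claiming that it's how Lappverk does it internally. The repository, tag, and identities are made up.

```shell
set -eu

work=$(mktemp -d)
cd "$work"
git init -q src
cd src
git config user.name "Example"
git config user.email "example@example.invalid"

echo "base" > file.txt
git add file.txt
git commit -q -m "Upstream release"
git tag upstream-base

echo "tweak" >> file.txt
git commit -q -a -m "Add tweak"
first_id=$(git rev-parse HEAD)

export_series() {
    # --zero-commit: write an all-zero ID instead of the real commit ID
    # --no-signature: leave out the trailing Git version line
    git format-patch -q --zero-commit --no-signature -o "$1" upstream-base..HEAD
}

export_series "$work/first"

# Re-import as a different committer, which guarantees a different commit ID
git checkout -q --detach upstream-base
GIT_COMMITTER_NAME="Someone Else" GIT_COMMITTER_EMAIL="else@example.invalid" \
    git am -q "$work"/first/*.patch
second_id=$(git rev-parse HEAD)

export_series "$work/second"

echo "first commit:  $first_id"
echo "second commit: $second_id"
if diff -r "$work/first" "$work/second"; then
    echo "exports are identical"
fi
```

The two commits have different IDs, but the exported patch files don't record them, so the patch repository only sees a change when the patch itself changes.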

It's pretty neat, if I dare say so myself:

# Create the new project and patch series
lappverk init project example/patches/tyck --upstream https://codeberg.org/natkr/tyck.git
lappverk init patch-series example/patches/tyck/0.1.0 --base v0.1.0

# Check out the current state of the patch series
# (which'll currently just be the upstream release)
# DANGER: This will clear all current state in the worktree
# - all unexported changes will be lost.
lappverk checkout example/patches/tyck/0.1.0

# Go to the checkout
# lappverk checkout also prints the path, so you can pushd that as well
pushd "$(lappverk path worktree example/patches/tyck/0.1.0)"

# Make some changes!
echo "This edition brought to you by lappverk!" >> templates/comment-list.html.j2
git add templates/comment-list.html.j2
git commit -m "Add lappverk propaganda"

# Update the patch series from the current commit of the worktree
# @ is a shorthand for the active project path
lappverk export @/0.1.0

# We can now inspect the patches, and commit them to our patch repository
popd
git add example/patches/tyck
git status
# > On branch main
# > Your branch is up to date with 'origin/main'.
# >
# > Changes to be committed:
# >   (use "git restore --staged <file>..." to unstage)
# >         new file:   example/patches/tyck/0.1.0/0001-Add-lappverk-propaganda.patch
# >         new file:   example/patches/tyck/0.1.0/lappverk-series.toml
# >         new file:   example/patches/tyck/lappverk-project.toml
git commit -m "Add new patch series for tyck v0.1.0!"

# Now anyone can obtain the patched source tree!
lappverk checkout example/patches/tyck/0.1.0

It's honestly kind of addictive, suddenly being able to easily just.. make the small tweaks I want, without having to worry about the overhead of a full-fat fork.

In fact, this very blog is powered by Lappverk right now!

Acknowledging history

Remember how my involvement here started out at the old job?

In fact, Lappverk started out as Patchable, the internal tool I developed for that very use case. Lappverk is effectively "Patchable with the Stackable-specific assumptions stripped out".

Without Patchable there probably still wouldn't have been a Lappverk. And my table of contents would have been a little more broken.

Farewell, for now!

Now go forth, and patch some software of your own!

  1. I'm me, after all...

  2. At least Forgejo (including Codeberg), GitHub, and GitLab... I suspect Gitorious had it too, back before they kicked the bucket.

  3. And some people are very eager to tell you so.

  4. Maybe repeating the process a few times.

  5. Well, ideally...

  6. This also means that any external tooling like IDEs will be pretty confused.