path: root/_posts/2021-02-22-can-we-please-move-past-git.md
---
title: Can We Please Move Past Git?
---

> Git is fundamentally a content-addressable filesystem with a VCS user interface written on top of it.
> 
> — [Pro Git §10.1](https://git-scm.com/book/en/v2/Git-Internals-Plumbing-and-Porcelain)
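
If "content-addressable" doesn't mean much to you: the name of every object Git stores is a hash of its contents. Here's a minimal Python sketch of how a blob's ID gets computed, assuming a classic SHA-1 repository. `store` and `put` are made-up names standing in for `.git/objects`, not anything Git actually exposes.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git gives a blob with this content."""
    # Git hashes a small header ("blob <size>\0") followed by the raw bytes.
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()

# Toy stand-in for the object database: content goes in, its hash is the key.
store = {}

def put(content: bytes) -> str:
    oid = git_blob_id(content)
    store[oid] = content
    return oid

print(put(b"hello world\n"))  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
```

That ID should match what `git hash-object` reports for a file containing `hello world` and a trailing newline. Everything else in Git is, very roughly, more objects pointing at each other by names like that one.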

Most software development is not like the Linux kernel's development; as such, Git is not designed for most software development.
Like Samuel Hayden tapping the forces of Hell itself to generate electricity, the foundations on which Git is built are overkill on the largest scale, and when the interface concealing all that complexity cracks, the nightmares which emerge can only be dealt with by the Doom Slayer that is [Oh Shit Git](https://ohshitgit.com/).
Of course, the far more common error handling method is [start over from scratch](https://xkcd.com/1597/).

Git is bad.
But are version control systems like operating systems in that they're all various kinds of bad, or is an actually good VCS possible?
I don't know, but I can test some things and see what comes up.

## Mercurial

[Mercurial](https://www.mercurial-scm.org/) is a distributed VCS that's around the same age as Git, and I've seen it called the Betamax to Git's VHS, which my boomer friends tell me is an apt analogy, but I'm too young for that to carry meaning.
So let me see what all the fuss is about.

Well, I have some bad news.
From the [download page](https://www.mercurial-scm.org/downloads), under "Requirements":

> Mercurial uses Python (version **2.7**). Most ready-to-run Mercurial distributions include Python or use the Python that comes with your operating system.

Emphasis theirs, but I'd have added it myself otherwise.
Python 2 has been dead for a very long time now, and saying you require Python 2 makes me stop caring faster than referring to "GNU/Linux".
If you've updated it to Python 3, cool, don't say it uses Python 2.
Saying it uses Python 2 makes me think you don't have your shit together, and in fairness, that makes two of us, but I'm not asking people to use my version control system (so far, at least).

You can't be better than Git if you're that outdated. (Although you can totally be better than Git by developing a reputation for having a better UI than Git; word of mouth helps a lot.)

## Subversion

I am a fan of subverting things, and I have to respect wordplay.
So let's take a look at [Subversion](https://subversion.apache.org/) (sorry, "Apache® Subversion®").

There are no official binaries at all, and the most-plausible-looking blessed unofficial binary for Windows is TortoiseSVN.
I'm looking through the [manual](https://tortoisesvn.net/docs/release/TortoiseSVN_en/tsvn-repository.html#tsvn-repository-layout), and I must say, the fact that branches and tags aren't actually part of the VCS, but instead conventions on top of it, isn't good.
When I want to make a new branch, it's usually "I want to try an experiment, and I want to make it easy to give up on this experiment."
Also, I'm not married to the idea of distributed VCSes, but I do tend to start a project well before I've set up server-side infrastructure for it, and Subversion is not designed for that sort of thing at all.
So I think I'll pass.

You can't be better than Git if the server setup precedes the client setup when you're starting a new project.
(Although you can totally be better than Git by having monotonically-ish increasing revision numbers.)

## Fossil

[Fossil](https://fossil-scm.org/) is kinda nifty: it handles not just code but also issue tracking, documentation authoring, and a bunch of the other things that services like GitHub staple on after the fact.
Where Git was designed for the Linux kernel, which has a fuckton of contributors and needs to scale absurdly widely, Fossil was designed for SQLite, which has a very small number of contributors and does not solicit patches.
My projects tend to only have one contributor, so this should in principle work fine for me.

However, a few things about Fossil fail to spark joy.
The fact that repository metadata is stored as an independent file separate from the working directory, for example, is a design decision that doesn't merge well with my existing setup.
If I were to move my website into Fossil, I would need somewhere to put `boringcactus.com.fossil` outside of `D:\Melody\Projects\boringcactus.com` where the working directory currently resides.
The [documentation](https://fossil-scm.org/home/doc/trunk/www/whyusefossil.wiki#definitions) suggests `~/Fossils` as a folder in which repository metadata can be stored, but that makes my directory structure more ugly.
The rationale for doing it this way instead of having `.fossil` in the working directory like `.git` etc. is that multiple checkouts of the same repository are simpler when repository metadata is outside each of them.
Presumably the SQLite developers do that sort of thing a lot, but I don't, and I don't know anyone who does, and I've only ever done it once (back in the days when the only way to use GitHub Pages was to make a separate `gh-pages` branch).
Cluttering up my filesystem just so you can support a weird edge case that I don't need isn't a great pitch.

But sure, let's check this out.
The [docs](https://fossil-scm.org/home/doc/trunk/www/inout.wiki) have instructions for importing a Git repo to Fossil, so let's follow them:

```text
PS D:\Melody\Projects\boringcactus.com> git fast-export --all | fossil import --git D:\Melody\Projects\misc\boringcactus.com.fossil
]ad fast-import line: [S IN THE
```

Well, then.
You can't be better than Git if your instructions for importing from Git don't actually work.
(Although you can totally be better than Git if you can keep track of issues etc. alongside the code.)

## Darcs

[Darcs](http://darcs.net/) is a distributed VCS that's a little different to Git etc.
Git etc. have the *commit* as the fundamental unit on which all else is built, whereas Darcs has the *patch* as its fundamental unit.
This means that a branch in Darcs refers to a set of patches, not a commit.
As such, Darcs can be more flexible with its history than Git can: a Git commit depends on its temporal ancestor ("parent"), whereas a Darcs patch depends only on its logical ancestor (e.g. creating a file before adding text to it).
This approach also improves the way that [some types of merge](https://tahoe-lafs.org/~zooko/badmerge/simple.html) are handled; I'm not sure how often this sort of thing actually comes up, but the fact that it could is definitely suboptimal.
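
To make the temporal-versus-logical distinction concrete, here's a toy Python model. It is emphatically not Darcs's actual algorithm: as far as I understand it, Darcs works dependencies out from what the patches touch, whereas this sketch just declares them by hand, and `Patch` and `closure` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Patch:
    name: str
    depends_on: frozenset = frozenset()  # names of patches this one logically needs

def closure(wanted: str, patches: dict) -> set:
    """Everything you must take to apply `wanted`: it plus its logical ancestors."""
    needed, stack = set(), [wanted]
    while stack:
        name = stack.pop()
        if name not in needed:
            needed.add(name)
            stack.extend(patches[name].depends_on)
    return needed

# Four patches, recorded in this order.
patches = {p.name: p for p in [
    Patch("create README"),
    Patch("add text to README", frozenset({"create README"})),
    Patch("create LICENSE"),
    Patch("fix typo in README", frozenset({"add text to README"})),
]}

# In a parent-chain model, the fourth change drags all three earlier ones along.
# In a patch model, only the logical ancestors come with it.
print(sorted(closure("fix typo in README", patches)))
# ['add text to README', 'create README', 'fix typo in README']
```

Note that "create LICENSE" stays behind even though it was recorded earlier, which is the flexibility a parent chain can't give you.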

So that's pretty cool; let's take a look for ourselves.
Oh.
Well, then.
The [download page](http://darcs.net/Binaries) is only served over plain HTTP - there's just nothing listening on that server over HTTPS - and the downloaded binaries are also served over plain HTTP.
That's not a good idea.
I'll pass, thanks.

You can't be better than Git while serving binaries over plain HTTP.
(Although you can totally be better than Git by having nonlinear history and doing interesting things with patches.)

## Pijul

[Pijul](https://pijul.org/) is (per [the manual](https://pijul.org/manual/why_pijul.html))

> the first distributed version control system to be based on a sound mathematical theory of changes. It is inspired by Darcs, but aims at solving the soundness and performance issues of Darcs.

Inspired by Darcs but better, you say?
You have my attention.
Also of note: the developers are building their own GitHub clone, which they use to [host Pijul itself](https://nest.pijul.com/pijul/pijul).
That gives a really nice view of how a GitHub clone built on top of Pijul would work, and it also offers free hosting.

The manual gives [installation instructions](https://pijul.org/manual/installing.html) for a couple Linuces and OS X, but not Windows, and not Alpine Linux, which is the only WSL distro I have installed.
However, someone involved in the project [showed up in my mentions to say that it works on Windows](https://twitter.com/nuempe/status/1359614145415548939), so we'll just follow the generic instructions and see what happens:

```text
PS D:\Melody\Projects> cargo install pijul --version "~1.0.0-alpha"
    Updating crates.io index
  Installing pijul v1.0.0-alpha.38
  Downloaded <a bunch of stuff>
   Compiling <a bunch of stuff>
error: linking with `link.exe` failed: exit code: 1181
  |
  = note: "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\BuildTools\\VC\\Tools\\MSVC\\14.27.29110\\bin\\HostX64\\x64\\link.exe" <lots of bullshit>
  = note: LINK : fatal error LNK1181: cannot open input file 'zstd.lib'


error: aborting due to previous error
```

So it doesn't work for me on Windows.
(There's a chance that instructions would help, but in the absence of those, I will simply give up.)
Let's try it over on Linux:

```text
UberPC-V3:~$ cargo install pijul --version "~1.0.0-alpha"
<lots of output>
error: linking with `cc` failed: exit code: 1
  |
  = note: "cc" <a mountain of arguments>
  = note: /usr/lib/gcc/x86_64-alpine-linux-musl/9.3.0/../../../../x86_64-alpine-linux-musl/bin/ld: cannot find -lzstd
          /usr/lib/gcc/x86_64-alpine-linux-musl/9.3.0/../../../../x86_64-alpine-linux-musl/bin/ld: cannot find -lxxhash
          collect2: error: ld returned 1 exit status


error: aborting due to previous error
UberPC-V3:~$ sudo apk add zstd-dev xxhash-dev
UberPC-V3:~$ cargo install pijul --version "~1.0.0-alpha"
<lots of output again because cargo install forgets dependencies immediately smdh>
   Installed package `pijul v1.0.0-alpha.38` (executable `pijul`)
```

Oh hey, would you look at that, it actually worked, and all I had to do was wait six months for each compile to finish (and make an educated guess about what packages to install).
So for the sake of giving back, let's add those instructions to the manual, so nobody else has to bang their head against the wall like I'd done the past few times I tried to get Pijul working for myself.

First, clone the [repository for the manual](https://nest.pijul.com/pijul/manual):

```text
UberPC-V3:~$ pijul clone https://nest.pijul.com/pijul/manual
Segmentation fault
```

Oh my god.
That's extremely funny.
Oh fuck that's *hilarious* - I sent that to a friend and her reaction reminded me that Pijul is *written in Rust*.
This VCS so profoundly doesn't work on my machine that it manages to segfault in a language that's supposed to make segfaults impossible.
Presumably the segfault came from C code called over FFI whose `unsafe` preconditions weren't met, but still, that's just amazing.

*Update 2021-02-24*: One of the Pijul authors reached out to me to help debug things.
Apparently [`mmap` on WSL is just broken](https://github.com/microsoft/WSL/issues/658), which explains the segfault.
They also pointed me towards [the state of the art in getting Pijul to work on Windows](https://nest.pijul.com/pijul/pijul/discussions/140#57f9eead-915f-45c1-a169-a5bd417d9213), which I confirmed worked locally and then [set up automated Windows builds using GitHub Actions](https://github.com/boringcactus/pijul-windows-builds).
So now that we have a working Pijul install, let's see if we can add that CI setup to the manual:

```text
PS D:\Melody\Projects\misc> pijul clone https://nest.pijul.com/pijul/manual pijul-manual
✓ Updating remote changelist
✓ Applying changes       47/47
✓ Downloading changes    47/47
✓ Outputting repository
```

Hey, that actually works!
We can throw some text into the installation page (and more text into the getting started page) and then use `pijul record` to commit our changes.
That pulls up *Notepad* as the default text editor, which fails to spark joy, but that's a papercut that's entirely understandable for alpha software not primarily developed on this OS.
Instead of having "issues" and "pull requests" as two disjoint things, the Pijul Nest lets you add changes to any discussion, which I very much like.
Once we've recorded our change and made a discussion on the repository, we can `pijul push boringcactus@nest.pijul.com:pijul/manual --to-channel :34` and it'll attach the change we just made to [discussion #34](https://nest.pijul.com/pijul/manual/discussions/34).
(It appears to be having trouble finding my SSH keys or persisting known SSH hosts, which means I have to re-accept the fingerprint and re-enter my Nest password every time, but that's not the end of the world.)

So yeah, Pijul definitely still isn't production-ready, but it shows some real promise. That said, you can't be better than Git if you aren't production-ready.
(Although you can totally be better than Git by having your own officially-blessed GitHub clone sorted out already.)
(And maybe, with time, you can eventually be better than Git.)

## What Next?

None of the existing VCSes that I looked at were unreservedly better than Git, but they all had aspects that would help beat Git.

A tool which is actually better than Git should start by being no worse than Git:

* allow importing existing Git repositories
* don't require Git users to relearn every single thing - we already had to learn Git, we've been through enough

Then, to pick and choose the best parts of other VCSes, it should

* have a UI that's better, or at least perceived as better, than Git's - ideally minimalism and intuitiveness will get you there, but user testing is gonna be the main thing
* avoid opaque hashes as the primary identifier for things - `r62` carries more meaning than `7c7bb33` - but not at the expense of features that are actually important
* go beyond just source code, and cover issues, documentation wikis, and similar items, so that (for at least the easy cases) the entire state of the project is contained within version control
* approach history as not just a linear sequence of facts but a *story*
* offer hosting to other developers who want to use your VCS, so they don't have to figure that out themselves to get started in a robust way
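
The `r62`-versus-`7c7bb33` point doesn't require giving up hashes: the hash can stay the real identifier with friendlier numbers layered on top. A rough Python sketch of that idea follows, where `RevisionIndex` and its methods are hypothetical names and not any existing VCS's API.

```python
class RevisionIndex:
    """Hand out r1, r2, ... for hashes in the order this clone first saw them."""

    def __init__(self):
        self._hashes = []    # position i holds the hash for r(i+1)
        self._numbers = {}   # hash -> revision number

    def record(self, commit_hash: str) -> str:
        # Seeing the same hash twice must not burn a new number.
        if commit_hash not in self._numbers:
            self._hashes.append(commit_hash)
            self._numbers[commit_hash] = len(self._hashes)
        return f"r{self._numbers[commit_hash]}"

    def resolve(self, name: str) -> str:
        """Accept either an r62-style name or a raw hash."""
        if name.startswith("r") and name[1:].isdigit():
            return self._hashes[int(name[1:]) - 1]
        return name  # hex hashes never start with "r", so no ambiguity

index = RevisionIndex()
print(index.record("7c7bb33"))  # r1
print(index.resolve("r1"))      # 7c7bb33
```

The catch is that numbers handed out this way only mean something inside one clone, since two clones can see the same changes in different orders. That's exactly the kind of trade-off the "not at the expense of features that are actually important" caveat is about.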

And just for kicks, a couple of extra features that nobody has but everybody should:

* the CLI takes a back seat to the GUI (or TUI, I guess) - seeing the state gets easier that way, discovering features gets easier that way, teaching to people who aren't CLI-literate gets easier that way
* contributor names & emails aren't immutable - trans people exist, and `git filter-branch` makes it about as difficult to change my name as the state of Colorado did
* if you build in issue/wiki/whatever tracking, also build in CI in some way
* avoid internal jargon - either say things in plain $LANG or develop a consistent and intuitive metaphor and use it literally everywhere
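
One way the mutable-names item could work is to have each change point at a stable contributor ID and look the name up at display time, so nothing in history has to be rewritten. This is a toy Python sketch of that idea, not how any existing VCS does it; `Contributors` and everything in it are hypothetical.

```python
import uuid

class Contributors:
    """Maps a stable, meaningless ID to whatever name and email are current."""

    def __init__(self):
        self._current = {}  # contributor id -> (name, email)

    def register(self, name: str, email: str) -> str:
        cid = uuid.uuid4().hex
        self._current[cid] = (name, email)
        return cid

    def update(self, cid: str, name: str, email: str) -> None:
        # History stores only the ID, so this is the single place that changes.
        self._current[cid] = (name, email)

    def display(self, cid: str) -> str:
        name, email = self._current[cid]
        return f"{name} <{email}>"

people = Contributors()
me = people.register("Old Name", "old@example.com")
history = [{"author": me, "message": "initial commit"}]

people.update(me, "New Name", "new@example.com")
print(people.display(history[0]["author"]))  # New Name <new@example.com>
```

The old change shows the new name without a single hash moving, because the name was never part of what got hashed.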

I probably don't have the skills, and I certainly don't have the free time, to build an Actually Good VCS myself.
But if you want to, here's what you're aiming for.
Good luck.
If you can pull it off, you'll be a hero.
And if you can't, you'll be in good company.