Git's submodules are so universally derided that there's practically an entire industry devoted to providing alternatives for managing dependencies.
But like anything in git, it's often worth giving the man pages a good going-over to figure out whether there are options that do what you want, or whether things have improved lately.
What I want
So, Metre is my exemplar project. It's got a slew of submodules, in part because some of our customers run (really) ancient versions of Linux and so we're going to need to statically link. Yay, fun!
But that means managing and shipping our own build of OpenSSL, for example - and that's a terrifying prospect for our Security Guy (lovely chap called Simon). It's pretty terrifying for me, too, actually.
In practical terms, then, our release cycle involves advancing along a stable branch on all the submodules, such that we're confident that we've picked up any bugfixes. This needs to be as simple as possible - really, a single command we can run as we need to.
But we also want high confidence that checking out a particular commit hash of Metre will give us the same dependencies we built with.
Git Submodule Add
Initially, I went for git submodule and a lot of manual work. I (lead dev) wasn't happy with this. Simon The Security Guy wasn't happy with this. Pete, one of our senior devs, conducted a full review of the project and highlighted it too.
The problem is that, with one slip, a dependency could be left carrying a serious security issue. And Metre is meant to be all about security.
The plus-side of git submodule is that it tracks the commit hashes of submodules, and you can check them all out at the right hash with either a git clone --recursive or a git submodule update --init --recursive.
We considered switching to something else, but then we'd lose much of the built-in smarts of git submodule, and that's also a pain.
Oh, look - branches!
A deep dive into man git-submodule and man 7 gitsubmodules, however, found me gold.
First, there's a -b branch switch to git submodule add. That adds the submodule at a specific branch, and moreover sets the "tracking branch" - the one git normally pulls from - to the remote origin branch just as you'd normally do.
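To see what -b actually records, here's a minimal sketch using two throwaway repositories in a temp directory. The names upstream, parent, deps/lib and stable are made up for the demo, and the protocol.file.allow override is only there because recent versions of git refuse to clone a submodule from a local path by default.

```shell
set -e
work=$(mktemp -d)
cd "$work"
# Identity for the demo commits, so this runs without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Hypothetical dependency with a branch called "stable".
git init -q upstream
git -C upstream commit -q --allow-empty -m "initial"
git -C upstream branch stable

# Hypothetical parent project; add the dependency, tracking "stable".
git init -q parent
cd parent
git -c protocol.file.allow=always submodule --quiet add -b stable "$work/upstream" deps/lib

# The branch is recorded in the tracked .gitmodules file...
git config -f .gitmodules --get submodule.deps/lib.branch
# ...and the submodule itself is checked out on that branch.
git -C deps/lib rev-parse --abbrev-ref HEAD
```

Both of the last two commands should report stable.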
Second, I found a config option of submodule.{submodule name}.branch, which stores this. This isn't quite as great as you'd think, though, because while you can set this in git config for the repository, that local configuration isn't tracked, so nobody else's clone will see it.
Fear My Editor Skillz
However, submodule configuration is stored in the repository in the .gitmodules file at the top. So you can edit that file, find the section, and simply add a branch key right there:
[submodule "deps/spiffing"]
    path = deps/spiffing
    url = http://github.com/surevine/spiffing
[submodule "deps/openssl"]
    path = deps/openssl
    url = git://git.openssl.org/openssl.git
    branch = OpenSSL_1_1_0-stable
If you leave the branch key out, git falls back to a default: master in older versions, the remote's HEAD in newer ones. So if that's all you wanted, you've got that already.
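If you'd rather not hand-edit the file, git config can write the same key when pointed at .gitmodules with -f. This is a small sketch against a cut-down copy of the file above, in a scratch directory; in a real checkout you'd run just the git config line from the top of the repository and then commit .gitmodules.

```shell
set -e
cd "$(mktemp -d)"

# A cut-down copy of the OpenSSL entry, without the branch key.
cat > .gitmodules <<'EOF'
[submodule "deps/openssl"]
    path = deps/openssl
    url = git://git.openssl.org/openssl.git
EOF

# Write the branch key into .gitmodules rather than the local .git/config.
git config -f .gitmodules submodule.deps/openssl.branch OpenSSL_1_1_0-stable

# Read it back to confirm.
git config -f .gitmodules --get submodule.deps/openssl.branch
```

Newer versions of git also have a git submodule set-branch command that, as far as I can tell, does the same job.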
Updating Branches
The normal command for updating submodules is git submodule update. There are three flags of interest:
--init performs a git submodule init if the submodule isn't already cloned into place, and nothing otherwise - so it's always safe to use.
--recursive recurses through each submodule, running the same git submodule update command in each.
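--remote is the magic - instead of checking out the commit recorded in the parent, it fetches each submodule and moves it to the tip of its remote tracking branch, much like a git pull along that branch. It's this that we want.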
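The difference --remote makes is easiest to see side by side. This is a sketch with throwaway local repositories; upstream, parent, deps/lib, stable and the gitl helper are all invented for the demo, and gitl exists only so that local-path submodule URLs are allowed.

```shell
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Demo-only wrapper: permit submodule URLs that are local paths.
gitl() { git -c protocol.file.allow=always "$@"; }

# Hypothetical dependency with a "stable" branch.
git init -q upstream
git -C upstream commit -q --allow-empty -m "v1"
git -C upstream branch stable

# Hypothetical parent project, pinned to the dependency's first commit.
git init -q parent
cd parent
gitl submodule --quiet add -b stable "$work/upstream" deps/lib
git commit -q -m "Add deps/lib"
pinned=$(git -C deps/lib rev-parse HEAD)

# Upstream lands a bugfix on stable.
git -C "$work/upstream" checkout -q stable
git -C "$work/upstream" commit -q --allow-empty -m "v2 bugfix"
latest=$(git -C "$work/upstream" rev-parse stable)

# Without --remote the submodule stays on the commit the parent recorded.
gitl submodule --quiet update --init --recursive
after_plain=$(git -C deps/lib rev-parse HEAD)

# With --remote it is fetched and moved to the tip of the tracked branch.
gitl submodule --quiet update --init --recursive --remote
after_remote=$(git -C deps/lib rev-parse HEAD)

echo "recorded: $pinned  after plain update:    $after_plain"
echo "upstream: $latest  after --remote update: $after_remote"
```

After the --remote run the parent sees deps/lib as modified; nothing is recorded until you git add and commit the new pointer.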
Workflow Summary
So now the workflow looks like this:
git clone --recursive git@github.com:surevine/metre - clones the repository and checks out the HEAD of master.
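git checkout foo - checks out the foo branch or commit. The submodule commits recorded for foo come along with it, but the submodule working directories only follow if checkout recurses into submodules (--recurse-submodules, or the submodule.recurse setting below).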
git submodule update --init --recursive --remote - updates all submodules recursively along their tracking branches. Without the --remote, it'll reset the submodule working directories to the "right" commit for the parent.
Finally, you can:
git config submodule.recurse true - tells git that most commands should act recursively, in particular git pull.
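Putting the whole cycle together, here's a rough sketch of a release bump followed by the reproducibility check I wanted at the start: go back to an old release and confirm the dependency comes back at the commit it was built with. As before, everything runs in throwaway local repositories, and upstream, parent, deps/lib, stable, release-1 and gitl are invented names, not anything from Metre.

```shell
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Demo-only wrapper: permit submodule URLs that are local paths.
gitl() { git -c protocol.file.allow=always "$@"; }

# Hypothetical dependency and parent project, tagged as a first release.
git init -q upstream
git -C upstream commit -q --allow-empty -m "v1"
git -C upstream branch stable
git init -q parent
cd parent
gitl submodule --quiet add -b stable "$work/upstream" deps/lib
git commit -q -m "Release 1"
git tag release-1
pinned=$(git -C deps/lib rev-parse HEAD)

# Upstream lands a bugfix on stable.
git -C "$work/upstream" checkout -q stable
git -C "$work/upstream" commit -q --allow-empty -m "bugfix"

# Release step: advance along the stable branch, then commit the new pointer.
gitl submodule --quiet update --init --recursive --remote
git add deps/lib
git commit -q -m "Release 2: bump deps/lib"
bumped=$(git -C deps/lib rev-parse HEAD)

# Reproducibility check: back to release 1, update without --remote.
git checkout -q release-1
gitl submodule --quiet update --init --recursive
restored=$(git -C deps/lib rev-parse HEAD)

echo "release 1 built with: $pinned"
echo "release 2 built with: $bumped"
echo "restored on checkout: $restored"
```

The first and last hashes should match, which is the guarantee I was after: the branch only decides where --remote moves to, while the commit in the parent still pins exactly what gets built.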