Learning a distributed version control system is kind of a revelation. If you’re like me you’ve probably gone through the following progression. You started off with SourceSafe. You thought, wow, how did we ever get things done without version control?
Sometime after that you switched to cvs. If you were like me you probably thought, wow, I can’t believe we ever put up with all those locks and only allowing a single person to edit a file at once. How stupid was that?
A few years later you might have switched to subversion or svn. At the time I remember thinking, wow, how did we ever live without atomic commits? Then, after that, I got experience with perforce or p4 and I remember thinking, OMG, how did I ever live without this speed?
About 15 months ago I learned mercurial or hg. Joel Spolsky turned me on to it with 2 great articles. The first was about how his employees all started using hg until it finally dawned on him that maybe he should look into why.
That led him to write a great set of tutorials about some of the benefits of distributed version control. I followed those tutorials and promptly set about using hg in some small projects.
Some of the people collaborating on those projects had git experience and they pointed out how disappointed they were using mercurial. I was basically thinking like “whatever, you probably are just not familiar with it.”
Chromium, the project I currently work on, started allowing git as an option. After hearing so many raves about git I finally took the plunge. It’s been at least 6 months since I started using git and I can say that I had another of those “WOW” moments. This time it’s “Wow, how did I ever get along without easy branching?”
People have tried to explain how git works, what the commands are, how it’s different from svn, p4 or hg. Rather than go into technical details about how it works I’m just going to show my workflow. I believe this workflow helps me be more efficient, helps me write more code, and is not encouraged by svn, p4, or hg.
In git you always work in branches. Git itself has no concept of a main branch. There is a convention that many projects use of a main branch called ‘master’ but nothing in git enforces that. It’s just a convention. I happen to follow it. I keep a ‘master’ branch which is effectively my copy of the official version with all the latest code contributed by everyone on the team. To update that and get the latest I do this
$ git checkout master
Switched to branch 'master'
$ git pull
At setup time I created a ‘master’ branch and it is set to pull the latest stuff from the place I cloned from so ‘git pull’ gets all the latest changes.
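If you want to see that setup for yourself without touching a real project, here is a minimal sketch in a throwaway directory. The folder names ‘upstream’ and ‘mine’ and the demo identity are made up for illustration, and ‘upstream’ is just a local stand-in for the place you cloned from.

```shell
#!/bin/sh
# Sketch only: 'upstream' and 'mine' are hypothetical names, not chromium's layout.
set -e
work=$(mktemp -d)
cd "$work"

# A stand-in for the official repository, with one commit on 'master'.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master      # make sure the first branch is 'master'
git config user.email "demo@example.com"     # placeholder identity for the demo
git config user.name "Demo User"
echo one > file.txt
git add file.txt
git commit -q -m "first"
cd ..

# Cloning sets up a local 'master' that tracks the clone source.
git clone -q upstream mine
cd mine
git rev-parse --abbrev-ref 'master@{upstream}'

# Someone lands a change in the official copy...
( cd ../upstream && echo two >> file.txt && git commit -q -a -m "second" )

# ...and pulling on 'master' brings it in (--ff-only just keeps the demo strict).
git checkout -q master
git pull -q --ff-only
git log --format=%s
```

The ‘rev-parse’ line shows which remote branch ‘master’ pulls from, which is what makes a bare ‘git pull’ work.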
Now I want to do some work. I’m going to add support for texture compression so I make a new branch
$ git checkout -b texture-compression
Switched to a new branch 'texture-compression'
I edit some files and commit them locally. I upload the changes for code review and then send the changes to our try bots (servers that build the project) to see if they work on various platforms.
$ git commit -a -m "Added texture compression"
$ cl upload
$ try
While the try bots are running I start on something else, branching off master.
$ git checkout -b add-webgl-js-logging master
And in generally under 1 second I’m now working on a new thing. I write some code, I get half way in and someone comes by and asks if I can look at why the new TexStorage function we added is having problems. No problem.
$ git commit -a -m "work in progress"
$ git checkout -b texstorage-work master
And now I’m working on a fresh branch, it has none of the other changes I’ve worked on today. I build, debug, find the issue and tell my co-worker how to fix it. Now I want to get back to working on the WebGL JS Logging stuff so
$ git checkout add-webgl-js-logging
And I’m back to working on that. I get an email from the try bots that there’s a problem with my texture compression stuff. I check in my work here and switch to that.
$ git commit -a -m "work in progress"
$ git checkout texture-compression
I make a few changes, commit it locally and start the try servers on my new stuff
$ git commit -a -m "fixed bug in texture-compression, added unit test"
$ cl upload
$ try
$ git checkout add-webgl-js-logging
And I’m back to working on that.
Does this workflow of switching between different things seem useful to you? How did I accomplish this in svn or p4 or hg? I had multiple copies of the entire project checked out in different folders. In git none of that is necessary. hg basically tells you to do the same: if you want to work on 2 different things at once you should ‘hg clone’, which means ‘copy everything’. On chromium that takes a couple of minutes in hg. In git, making a new branch to start working on something else takes under 2 seconds, even on chrome which is 27k files!
On top of that, hg, svn and p4 arguably don’t encourage this kind of branching. In git I can see what I’m working on easily.
$ git branch -vv
* add-webgl-js-logging 513b342 work in progress
  master               2bc582a networking latency issue 54125 fixed
  texstorage-work      2bc582a networking latency issue 54125 fixed
  texture-compression  a34d46c fixed bug in texture-compression, added unit test
This shows that I’m on branch ‘add-webgl-js-logging’. There are 4 branches total. It’s pretty clear that ‘texstorage-work’ is at the same state as ‘master’ and what was last done on the ‘texture-compression’ branch.
What would be the equivalent in hg, svn or p4? Given that checking out the entire project again is very slow on those systems most likely the best you can do is make a few folders ‘checkout01’, ‘checkout02’, ‘checkout03’, ‘checkout04’ and try to mentally remember that you were working on texture compression in checkout02 and webgl logging in checkout04. I actually used to do that. In both p4 and svn I’d give my checkout folders more interesting names like ‘mars’, ‘penelope’, ‘samson’ and other random names but I’d have no way of knowing which folder contained which work except to remember in my head or manually switch to each one and do an ‘svn status’ or something similar.
Now it’s easy. Branches are SUPER CHEAP so I can make as many as I want. Switching between them is trivial and fast, and git’s design encourages it by making it easy to see what I’m doing.
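If you’d like to try the branch juggling above without a real project, here is a rough sketch that replays it in a throwaway repository. The file names and the demo identity are made up, and the chromium-specific ‘cl’ and ‘try’ steps are left out since they are not part of git.

```shell
#!/bin/sh
# Sketch only: a scratch repo standing in for a real project.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # make sure the first branch is 'master'
git config user.email "demo@example.com"     # placeholder identity for the demo
git config user.name "Demo User"

echo "base" > notes.txt
git add notes.txt
git commit -q -m "initial commit"

# First task: branch off and commit locally.
git checkout -q -b texture-compression
echo "compression" >> notes.txt
git commit -q -a -m "Added texture compression"

# Second task: a fresh branch started from master, not from the first task.
git checkout -q -b add-webgl-js-logging master
echo "logging" > logging.txt
git add logging.txt
git commit -q -m "work in progress"

# Interruption: another fresh branch from master, then back again.
git checkout -q -b texstorage-work master
git checkout -q add-webgl-js-logging

# One line per branch, with the current one starred.
git branch -vv
```

Each new branch starts from ‘master’, so none of them sees the others’ changes, and ‘texstorage-work’ ends up at the same commit as ‘master’ because nothing was committed on it.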
Let’s keep going just to finish this up
I get an email from the try bots that says everything went well with texture-compression so let’s check that in. Now in chromium’s case they’ve added a commit queue system so at this point I can just go to our code review site and click ‘commit’ and the bots will do some more thorough tests and if they all pass the code would automatically be committed. We happen to still support svn at this point in time (I’m sure all the git fan team members can’t wait for the day we switch to 100% git). But, assuming I want to check in by hand, I save the stuff I’m working on for logging, switch to the texture-compression branch and commit.
$ git commit -a -m "latest work in progress"
$ git checkout texture-compression
$ cl dcommit
It’s now checked in. Let’s switch back to webgl logging and try to check that in.
$ git checkout add-webgl-js-logging
$ cl dcommit
conflict foo/bar/js-logging.cc has changed
aborted
Something changed since I started working on this feature and it’s telling me I should fix that first. Let’s grab the latest. Since I happen to base everything off my own local copy of master I would generally do this.
$ git checkout master
$ git pull
$ git rebase master add-webgl-js-logging
$ cl dcommit
That switches to the ‘master’ branch, gets all the latest changes off the net and merges them into master so it matches. I then switch back to add-webgl-js-logging, telling it to ‘rebase’ on master, which effectively means: take all my changes and put them aside, update the add-webgl-js-logging branch so it matches master, then reapply my changes on top. If I didn’t care about my local ‘master’ branch I could just rebase directly off the latest stuff from the net with this
$ git fetch origin
$ git rebase origin/master add-webgl-js-logging
$ cl dcommit
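To make the ‘put my changes aside, catch up, reapply on top’ idea concrete, here is a small sketch in a scratch repository. The file names and commit messages are invented, and a local commit on ‘master’ stands in for whatever ‘git pull’ would have brought down. In ‘git rebase’, the branch to rebase onto comes first and the branch being rebased comes second.

```shell
#!/bin/sh
# Sketch only: a scratch repo; the extra commit on master imitates a 'git pull'.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # make sure the first branch is 'master'
git config user.email "demo@example.com"     # placeholder identity for the demo
git config user.name "Demo User"

echo "base" > base.txt
git add base.txt
git commit -q -m "initial commit"

# My feature branch with one commit of its own.
git checkout -q -b add-webgl-js-logging
echo "logging" > logging.txt
git add logging.txt
git commit -q -m "work in progress"

# Meanwhile master moves ahead.
git checkout -q master
echo "upstream change" > upstream.txt
git add upstream.txt
git commit -q -m "change from a teammate"

# Upstream first, branch second: switch to the branch and replay its
# commits on top of the current master.
git rebase -q master add-webgl-js-logging

# Newest first: my commit now sits directly on top of master's latest.
git log --format=%s
```

After the rebase the feature branch contains everything on ‘master’ plus my own commit on top, which is the state a commit tool would want before landing the change.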
Note that I haven’t gone into any of the details of adding files, resolving conflicts or other things. That’s not the point of this post. The point is to show my workflow in git. This is why git is so popular. It encourages a certain style of workflow that none of the other popular systems do AFAIK.
If you want to learn git I highly recommend you read at least the first and second chapter of ProGit. Another huge advantage to git is this concept called ‘the stage’ which has no equivalent in any other system I know of. It lets you easily edit a bunch of files but then choose which ones will get committed when, even down to selecting individual lines in a file. The first 2 chapters of ProGit will cover the stage and some other differences of a distributed system (hg, git) vs a non-distributed system (svn, cvs, p4).
I also highly recommend you start with github and their help.
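As a quick taste of the stage, here is a minimal sketch in a scratch repository with made-up file names. Two files get edited, but only one is staged and committed.

```shell
#!/bin/sh
# Sketch only: a scratch repo with two hypothetical files.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"     # placeholder identity for the demo
git config user.name "Demo User"

echo "a" > a.txt
echo "b" > b.txt
git add a.txt b.txt
git commit -q -m "initial commit"

# Edit both files...
echo "change to a" >> a.txt
echo "change to b" >> b.txt

# ...but put only a.txt on the stage, then commit what is staged.
git add a.txt
git commit -q -m "change a only"

# b.txt is still modified in the working copy and was not committed.
git status --short
```

Note there is no ‘-a’ on that last commit, since ‘-a’ would have swept up b.txt as well. For choosing individual chunks inside a file, ‘git add -p’ walks through them interactively.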
Because it’s so different there is a learning curve for git. It’s not nearly as bad as people make it out to be. I think most of the complaints come from 3-4 years ago and things have gotten much better.
Once you learn git I don’t think you’ll ever want to go back to another system.
NOTE: If you’re curious about ‘cl’ and ‘try’ above, in our actual workflow they are installed as git plugins. Why, I have no idea. Git has what some might call a plugin system. Any script named git-name on your PATH will be run if you type ‘git name’. In the case above, ‘try’ and ‘cl’ are custom shell scripts, git-try and git-cl, that run python scripts git_try.py and git_cl.py respectively and are part of chromium, not git. There’s no particular reason to name them as git plugins AFAIK. They could just as easily be named ‘try’, ‘chromium_try’ and ‘commit_to_chromium’ or something. For your project you’d likely do something different. You might ‘try’ by building locally. You might commit by pushing to a specific remote branch or by putting up a pull request like on github. Again, the point is the workflow, not the specifics.
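Here is a tiny sketch of that plugin mechanism. The command name ‘git-hello’ is made up for illustration and has nothing to do with chromium’s git-cl or git-try.

```shell
#!/bin/sh
# Sketch only: 'git-hello' is a hypothetical command name.
set -e
bindir=$(mktemp -d)

# Any executable named git-<name> that git can find on PATH...
cat > "$bindir/git-hello" <<'EOF'
#!/bin/sh
echo "hello from a custom git subcommand"
EOF
chmod +x "$bindir/git-hello"

PATH="$bindir:$PATH"
export PATH

# ...can be run as 'git <name>'.
git hello
```

Git does nothing more here than find the script and run it, which is why the same script would work just as well under any other name.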