Wednesday, August 13, 2008

Git

For about a month and a half, I've been secretly using Git at work for my local source control management. Shhh.

At work we manage code in Perforce, so eventually my code has to work its way back into P4. Still, I now am finding it difficult to imagine working efficiently without using Git on a daily basis, and the thought of continuing to bridge new projects with the P4 repository makes me cringe a little.

Git is extremely powerful partly because it lets you create branches quickly and in place. How many times have you heard this: "I'd like to try that, but I would have to create another Perforce branch because I've got too much going on in my branch right now." I don't know about you, but I've heard it a lot. I've said it a lot. P4 developers on large projects tend to have only one branch because it takes so long to get a second one going. Keeping two going all the time doesn't help unless you use both regularly: if you don't use the second branch for a while, you integrate down and have to recompile everything from scratch. With Git, on the other hand, switching branches happens in place, so the compiler only has to recompile and re-link the difference between the branches.
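Here's a rough sketch of what in-place branching looks like, assuming git is installed. The temp directory, the file name, and the branch name "experiment" are all made up for illustration; this isn't my actual work setup.

```shell
#!/bin/sh
set -e

# Throwaway repository in a temp directory (all names here are illustrative).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

# One commit on the default branch.
echo "stable" > feature.txt
git add feature.txt
git commit -q -m "Initial commit"
main_branch=$(git rev-parse --abbrev-ref HEAD)

# Create and switch to an experimental branch in the same working directory.
git checkout -q -b experiment
echo "risky idea" > feature.txt
git commit -q -am "Try the risky idea"

# Switch back in place: only files that differ between the branches are rewritten.
git checkout -q "$main_branch"
cat feature.txt
```

The final cat shows the original contents again. Because only the files that differ get touched on disk, a timestamp-based build only has to rebuild those.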

Git also makes it easy to share between peers. Well, that's all it does, actually -- so it can be both a downside and an advantage. The good part is that two people sharing their code doesn't require copying files up to a fileshare and copying them back down (unless you want to do file integrations in P4, which aren't very fun).
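As a minimal sketch of peer-to-peer sharing, assuming git is installed: two repositories on the same machine, where one pulls straight from the other. The developers "alice" and "bob" and the file name are hypothetical; in real life the second repository would usually sit on a coworker's machine and be reached over ssh or a network path.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
cd "$work"

# Alice's repository (hypothetical developer).
git init -q alice
cd alice
git config user.email "alice@example.com"
git config user.name "Alice"
echo "shared helper" > helper.txt
git add helper.txt
git commit -q -m "Add helper"
cd ..

# Bob clones directly from Alice's repository; no central server is involved.
git clone -q alice bob

# Alice keeps working.
cd alice
echo "second line" >> helper.txt
git commit -q -am "Extend helper"
cd ..

# Bob pulls Alice's new commit straight from her repository.
cd bob
git pull -q --ff-only
cat helper.txt
```

After the pull, Bob's copy of helper.txt matches Alice's, and both repositories point at the same commit.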

Git can work completely offline, so you're not dependent on a central server to check in code and share with others. Granted, I don't think the future holds much for "offline" technologies, but for the time being this can be pretty advantageous.

One question that arises with Git is, "Where's the main development line?" In my case, I just integrate back into our Perforce server using one of the many P4-git bridge scripts. Normally, though, it's not that straightforward to identify the "dev line." The issue with Git is that there's (potentially) no central server, and any checkin from any Git repository can be pulled over into any other. Github has a nice graph and explanation for how this works.

In practice one can make a central line for their code and let everyone submit to it, just like Perforce. Once you do this, however, the advantages of using Git are lost somewhat. The idea of a central gatekeeper who lets code through is lost at that point. Git works really well for open source projects because of this "gatekeeper" concept; however, I'm still unsure how it would go with large-scale development.

Probably the biggest downside to Git is that it's pretty darn obscure. Apparently the last two years have been spent trying to make it usable by anyone except Linus Torvalds. The Windows ports are relatively poor and non-Windowsy, but they do work.

7 comments:

Anonymous said...

What do you think of this post by this Orbitz engineer?

http://daveonscm.blogspot.com/2007/09/agile-branches-vs-streams.html

Trimbo said...

I read his post, read up more on AccuRev, watched some videos, and I have to admit I'm still not completely clear on what he's talking about in this blog post. Mostly I feel like AccuRev is just a terminology switch for the sake of marketing. The concepts are similar to other SCMs.

For example, "If you promote a single task (i.e. bunch-o-files) to Integration, the other 3 task streams -automatically- have visibility to the newer versions! "

This is what every other SCM in the world does, with the terminology changed. Someone promotes to a line that other people have branched from, they can then choose to integrate down from that line.

He also writes "unlike using branches, streams don't require massive merging all over the place." The demo I saw requires just as much merging as every other SCM. So what's the difference between a merge and a "massive merge"? What does this mean?

Maybe what he means is that AccuRev seems able to isolate sections of a project in a GUI-friendly way, which is certainly an advantage over Git and Perforce. Git requires that everything you're working on be in the same repository. That's great when you never need to merge projects together to make a bigger product. Perforce lets you manipulate branch/clientspecs to any level you want, and maintenance can become a nightmare. So the idea of having an SCM that's abstracted a little bit more, like AccuRev, seems pretty good. However, I don't really see how it's technically different from other SCMs, just that it has its own set of decisions about how to handle the data, and a whole new set of terminology.

Anonymous said...

I get the feeling that the streams technology is something cool that got lost in translation between the AccuRev engineers and marketing people.

The best I can tell is that AccuRev streams are basically automatic integrations that can be scheduled and gated. The primary advantage would therefore be the ability to subscribe to particular paths without having to manually integrate, particularly where multiple branches are involved, and where a traditional SCM would require manual integration per step. I don't see how this side-steps the conflict/merge problem for hot code, but I can see how it could be helpful for auxiliary files such as build tags, particularly when the stream pushes are gated by automated test systems. It also seems like something that is REALLY cool and REALLY helpful 99.99% of the time and then 0.01% of the time creates an enormous train wreck in the depot.

Honestly, though, I don't see how this couldn't be layered on top of a traditional branching SCM. Providing better UI and branch automation would, I think, get you very far to what AccuRev claims to achieve. Certainly P4 could have a much nicer branching UI than what it currently has in P4Win/P4V, which is basically just a dialog on top of "p4 branch."

Timinator said...

I just downloaded Git. It makes no sense. I don't even know how to add files! How can I have a source control app if I can't even ADD FILES. Epic fail.

Trimbo said...

git init
git add (filenames)
git commit

A great intro to git is Peepcode's git book. Check it out here.

Wu-Man said...

What bidirectional bridge are you using to sync with Perforce? I need to use one for my day job as well.

Trimbo said...

Wu-Man: I use the git-p4 that's in the git source.

For integration into git, I have a cron job set up to check P4. If something was checked in, it syncs into master.

I work in a separate branch -- always -- then rebase from master (i.e. the synced-up P4) and submit that using git-p4.

This workflow works pretty great if you take the time to set it up. I changed what I'm working on and haven't yet set it up for the larger tree I'm working with.