There are many, many git tutorials out there. http://rogerdudler.github.io/git-guide/ is pretty cool and gives a nice overview. However, it doesn't quite set you up to play around with the commands. Here is a completely self-contained walk-through that doesn't require any networking to experiment with a reasonable git setup.
Let's work through a simple scenario where two friends (Mark and Barb) want to hack on some code together. Since they are both jet-setting around the country and typically working from laptops, they decide to set up a central repository for their code base. (Note: These days, you can do that on github in about 45 seconds.) We're going to simulate the three computers (Barb's laptop, Mark's laptop, and the central repository server) on one physical computer. There's no fundamental difference from using physically remote machines: you'd have to set up the networking support (typically using ssh/sshd) and authentication (public/private keys), and you'd specify the remote machine when executing commands.
Without further ado (about nothing), let's create a pseudo-laptop and create an initial (empty) code base:
% mkdir marks_code # I just spun up a new computer -- didn't you see it?
% cd marks_code # and logged on
% git init . # create a git repo here; currently empty
Initialized empty Git repository in /Users/mfenner/gitdemo/marks_code/.git/
Now, since we know we want to use git, we can use git to create an initial repository that isn't keeping track of anything (yet). We do this with the git init command. Incidentally, to get help on this command (on Unix-y systems), type man git-init. git is structured as "one big program" with lots of sub-commands (git init, git log, git diff, etc.), but the man pages (help documentation) are made for each command. You specify it as git-<cmd> for whatever command cmd you want to investigate.
Ok, let's populate the code base.
% cat > test.py
print "Hello World!"
And we can commit that file into the repository. Currently, the file exists on the hard drive, but the repository knows nothing about it. Let's tell the repository it exists (by adding it). First, we can take a look at what git knows about with git status.
% git status
On branch master
Initial commit
Untracked files:
(use "git add <file>..." to include in what will be committed)
test.py
nothing added to commit but untracked files present (use "git add" to track)
Notice how helpful git is. It tells you that you can use git add to add a file. Let's do it:
% git add test.py
% git status # git will also tell you how to "unadd" test.py
At this point, we've told git that we care about this file, but we haven't actually committed the file into the repository/history. Here goes:
% git commit -m "Added hello world functionality"
I used -m to specify the commit message (a note about the reason for this commit). If you don't put something in there, git will dump you into an editor and let you type a message in. Now would be a great time to try out: git log, git show, and git blame.
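If you'd rather try those three commands somewhere disposable, here is a minimal sketch that repeats the steps so far in a scratch directory. The demo variable, the mktemp location, and the placeholder name and email are my assumptions, not part of the walk-through; the git symbolic-ref line is only there so the first branch is called master whatever your local defaults are.

```shell
set -eu
demo=$(mktemp -d)                        # scratch directory standing in for Mark's laptop
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/master  # name the first branch "master" regardless of local defaults
git config user.name "Mark"              # placeholder identity, stored only in this repo
git config user.email "mark@example.com"

echo 'print "Hello World!"' > test.py
git add test.py
git commit -q -m "Added hello world functionality"

git log --oneline        # one line per commit: short id and message
git show --stat HEAD     # the latest commit plus a summary of the files it touched
git blame test.py        # which commit and author last changed each line
commit_count=$(git rev-list --count HEAD)
```

Nothing here touches your real repositories, so feel free to delete the scratch directory when you are done poking around.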
Fair enough. Now we could have Barb run a magic git command, git clone <path/to>/marks_code, and she would get a (standalone) copy of the repository. There are two issues with this approach:
- We would have to do some "magic" to let me request changes from her, because she cloned me without my involvement. She could send me changes, but I couldn't drive that process. This isn't a show stopper: you'll see a command in a second that would let us do a bi-directional hook up of our repos (phew, that sounds steamy).
- A bigger issue is that neither of us is working on a server/workstation computer. That is, we aren't working from a fixed location. And, our machines will randomly blip on and off the Internet as we do some work, take a flight, go to sleep, etc. When our computers go "dark" the other person literally can't send or request updates.
To resolve these issues, we'll use a "central repository" that we both communicate with. This also gives us an offsite backup for free, if you are using github or the equivalent. Note, my mockup demo here is not offsite (I sincerely hope I didn't have to tell you that)!
% cd .. # go up one level (out of marks_code)
% mkdir central # create a server
% cd central # and login (ha, easy!)
% git init --bare # create a special type of repository
# that won't be used by anyone locally
Now, let's feed the new central repository. First, let's see who Mark's repo is set up to communicate with:
% cd ../marks_code # go back to mark's laptop
% git remote -v # who are my remote connections (currently none!)
I'm not connected to anyone else (sad face). Let's hook me up to the central repo for both pushing and pulling -- that is, sending and receiving updates to the repo.
% git remote add origin /Users/mfenner/gitdemo/central # link the repos
% git push --set-upstream origin master # link the master branches
# (and send my repo as an update)
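To check that the push really landed, one option is to compare the commit id that master points at on both sides. This is a sketch under my own assumptions: the repositories live in a scratch directory held in a demo variable instead of /Users/mfenner/gitdemo, and the identity is a placeholder.

```shell
set -eu
demo=$(mktemp -d)

# Mark's laptop, with one commit.
git init -q "$demo/marks_code"
cd "$demo/marks_code"
git symbolic-ref HEAD refs/heads/master  # make sure the branch is called "master"
git config user.name "Mark"              # placeholder identity
git config user.email "mark@example.com"
echo 'print "Hello World!"' > test.py
git add test.py
git commit -q -m "Added hello world functionality"

# The central "server": a bare repository has history but no working files.
git init -q --bare "$demo/central"

git remote add origin "$demo/central"
git push -q --set-upstream origin master
git remote -v                            # origin is now listed for fetch and push
git ls-remote origin                     # the refs the central repository holds

local_tip=$(git rev-parse master)
central_tip=$(git --git-dir="$demo/central" rev-parse master)
```

If the two ids match, the central repository has exactly what Mark has.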
Now, we can tell Barb about the "central server" and she can make her own copy of it.
% cd ..
% mkdir barbs_code && cd barbs_code
% git clone /Users/mfenner/gitdemo/central
# cloning creates a directory named after the trailing part of the path and puts the repo in it
% cd central
% git log --oneline
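A clone arrives already wired up to the repository it came from, so Barb doesn't need the git remote add step. Here is a small sketch of that, assuming a scratch directory (demo) and a throwaway seed repository of my own invention to give the central repo something to hand out. It also passes a second argument to git clone, which picks the directory name instead of reusing central.

```shell
set -eu
demo=$(mktemp -d)

# Stand-in for the central server, seeded with one commit from a throwaway repo.
git init -q --bare "$demo/central"
git --git-dir="$demo/central" symbolic-ref HEAD refs/heads/master
git init -q "$demo/seed"
cd "$demo/seed"
git symbolic-ref HEAD refs/heads/master
git config user.name "Mark"              # placeholder identity
git config user.email "mark@example.com"
echo 'print "Hello World!"' > test.py
git add test.py
git commit -q -m "Added hello world functionality"
git push -q "$demo/central" master

# Barb's side: the second argument names the directory that gets created.
git clone -q "$demo/central" "$demo/barbs_code"
cd "$demo/barbs_code"
git log --oneline                        # Mark's commit came along
git remote -v                            # origin already points at the central repo
clone_branch=$(git rev-parse --abbrev-ref HEAD)
clone_commits=$(git rev-list --count HEAD)
```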
Very often, when you are adding new stuff, you do it in a "branch". Why?
- It helps keep new/different code isolated.
- It lets you keep master as a (hopefully) 100% working version of the code.
- It makes a record of when and how new features were added to the code.
Enough talk, let's fight. Here, we're working as Barb.
% git branch # view current branches
% git checkout -b add-function # create a new branch called add-function and activate it
% git branch
# update test.py and testmod.py
% cat > testmod.py
def foo(x,y):
    return x + y
% more test.py
from testmod import foo
print "Hello World!"
print foo(3,5)
# Now, get those changes stored away!
% git add testmod.py
% git commit -a # -a does "auto-add stuff git knows about"
# Now, switch back to the boring, old master
% git checkout master
% ls # why is testmod.pyc around?
# it is "just" a file on the disk, git doesn't know about it
# merge our new functionality (after testing, of course!)
% git merge add-function # merge named branch onto current branch
% git status # barb's gotten ahead of the remote master
% git push
# Barb can also clean up the (no longer needed) development branch
% git branch -d add-function
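The whole branch round trip can be replayed in a scratch repository without a central server. The sketch below is mine, with a placeholder identity and a made-up commit message. It uses git checkout -b to create and switch in one step, and it lists the directory after switching back to master to show that testmod.py only exists on the branch until the merge.

```shell
set -eu
demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Barb"              # placeholder identity
git config user.email "barb@example.com"
echo 'print "Hello World!"' > test.py
git add test.py
git commit -q -m "Added hello world functionality"

git checkout -q -b add-function          # create the branch and switch to it
printf 'def foo(x, y):\n    return x + y\n' > testmod.py
git add testmod.py
git commit -q -m "Add foo helper"

git checkout -q master
ls                                       # only test.py: testmod.py lives on add-function so far
git merge -q add-function                # master had not moved, so this is a fast-forward
git branch -d add-function               # safe to delete: its work is now in master
branches=$(git for-each-ref --format='%(refname:short)' refs/heads)
```

git branch -d refuses to delete a branch whose commits haven't been merged, which makes it a reasonably safe cleanup step.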
So, I (Mark) have been sleeping (!) and I wake up and want to start coding. Before I do anything else, I see if the repo has been updated -- I try to grab any work that Barb did overnight. Easy as pie:
% git pull
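For the curious, here is a sketch of the whole overnight exchange on one machine: Mark pushes, Barb clones and pushes a commit, and Mark pulls it down. The scratch directory, the identities, and the commit message are my placeholders. I also pass --ff-only, which is my addition, so the pull only succeeds when Mark has no local commits of his own to reconcile.

```shell
set -eu
demo=$(mktemp -d)
git init -q --bare "$demo/central"
git --git-dir="$demo/central" symbolic-ref HEAD refs/heads/master

# Mark seeds the central repository.
git init -q "$demo/marks_code"
cd "$demo/marks_code"
git symbolic-ref HEAD refs/heads/master
git config user.name "Mark"
git config user.email "mark@example.com"
echo 'print "Hello World!"' > test.py
git add test.py
git commit -q -m "Added hello world functionality"
git remote add origin "$demo/central"
git push -q --set-upstream origin master

# Barb clones, commits, and pushes while Mark sleeps.
git clone -q "$demo/central" "$demo/barbs_code"
cd "$demo/barbs_code"
git config user.name "Barb"
git config user.email "barb@example.com"
printf 'def foo(x, y):\n    return x + y\n' > testmod.py
git add testmod.py
git commit -q -m "Add foo helper"
git push -q

# Mark wakes up and grabs her work.
cd "$demo/marks_code"
git pull -q --ff-only
git log --oneline                        # both commits are here now
marks_commits=$(git rev-list --count HEAD)
```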
I was asleep and I'm still a bit groggy. I haven't had my coffee yet and disaster strikes:
% echo "amanaplanacanalpanama" > test.py # NUKED my file! ACK!
# GIT! CAN YOU HELP ME!?!
% git status
On branch master
Your branch is up-to-date with 'origin/master'.
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
modified: test.py
no changes added to commit (use "git add" and/or "git commit -a")
Look at the line with git checkout -- <file>. I can use that to save me from myself:
% git checkout -- test.py
# PHEW!
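Here is the same rescue as a self-contained sketch you can run in a scratch directory (the demo variable and the identity are placeholders of mine). One caution worth repeating: the checkout throws the uncommitted edit away for good, so it only helps when the committed version is the one you want back.

```shell
set -eu
demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Mark"              # placeholder identity
git config user.email "mark@example.com"
echo 'print "Hello World!"' > test.py
git add test.py
git commit -q -m "Added hello world functionality"

echo "amanaplanacanalpanama" > test.py   # clobber the file
git diff --stat                          # git notices the damage
git checkout -- test.py                  # put back the last committed version
restored=$(cat test.py)
git status --short                       # prints nothing: the working directory is clean again
```

Newer versions of git also offer git restore test.py for the same job.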
Last, but not least, say I want to look at an older version of the code (before Barb got involved):
% git log --oneline
% git checkout <aCommitID> # enter a hex id like 54eccfc
# (just the first few digits are enough)
% git checkout master # click our heels to get home
Note, we didn't do git checkout HEAD (which is usually legal) because we were in zombie land (a "detached HEAD" state): HEAD itself was pointing at the old commit, so checking it out would have left us right where we were. This also means that if we wanted to do new work on this old commit (sort of like going back in time and starting again), we would have to create a new branch and work off of it, which git told us when we did the checkout of the older commit id.
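A quick way to see the detached state for yourself is to ask git which branch HEAD refers to. The sketch below is built on my own assumptions: a scratch repository with two commits, a placeholder identity, and a hypothetical branch name, old-start, for work that begins from the older commit.

```shell
set -eu
demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Mark"              # placeholder identity
git config user.email "mark@example.com"

echo 'print "Hello World!"' > test.py
git add test.py
git commit -q -m "Added hello world functionality"
first=$(git rev-parse HEAD)              # remember the older commit's id

echo 'print "Hello again!"' >> test.py
git commit -q -a -m "Say hello twice"

git checkout "$first"                    # prints git's detached-HEAD advice
detached_state=$(git rev-parse --abbrev-ref HEAD)  # "HEAD", since no branch is checked out
git checkout -q -b old-start             # a branch gives new work on the old commit a name
git checkout -q master                   # and home again
git for-each-ref --format='%(refname:short)' refs/heads
```

Commits made while detached aren't attached to any branch, which is why creating a branch first is the safer habit.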
Additional Resources
You can grab a copy of this notebook.
Even better, you can view it using nbviewer.
License
Unless otherwise noted, the contents of this notebook are under the following license. The code in the notebook should be considered part of the text (i.e., licensed and treated as follows).
DrsFenner.org Blog And Notebooks by Mark and Barbara Fenner is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Permissions beyond the scope of this license may be available at drsfenner.org/blog/about-and-contacts.