
Good git Behavior for the CSA Code Team


I am in the midst of training and evaluating people who are interested in working on the Store Locator Plus code projects. Some people are veteran coders. Others are learning to code in a team environment for the first time. Here are some basic guidelines on using git when working on the CSA codebase; they can be useful if you are using git version control for any team project.

Use Descriptive Commit Messages

Your commit messages should be descriptive at least 99% of the time. These are not good commit messages:

[Image: examples of poor commit messages]

When the team or project lead goes to integrate your code with the work everyone else has contributed, they should have some clue what each commit was doing. A single word is NEVER a good commit message. At the very least you should almost always be using a noun and a verb in a commit message. “Debug actions” would be at least a little better than “debug”. Better yet, use a short sentence.

Create Separate Debugging Branches

Since we are on the topic of debug commits, they should rarely be in your “mainline” branch of code. Yes, debugging happens. Yes, debug commits often end up in a series of commits on a code branch, especially when working on a complex piece of code. However, if you start out with a cycle of coding where you know “I’m going to do a lot of debugging to figure out what is going on here”, then it is almost always a good idea to start by creating a new “debug_this_whacky_thing” branch and dumping all your “code barf” in there until you figure things out.

When you do, go back to the “real_work” branch, check that out, and put the valuable pieces of code from your learned lessons in that branch.

If you manage to stumble across a useful piece of code on your “testing_stuff” branch you can always add it on to your “real_work” branch with something called “cherry picking”. That is a git command, and in SmartGit it is simple to execute. Check out the real_work branch, then go select the one or two commits that did something useful from the debugging_code_barf branch and “cherry pick” them.
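On the command line, that flow might look like the sketch below. The branch names come from the examples above; the throwaway repository, file names, and commit messages are made up purely for illustration.

```shell
set -e
# Throwaway repository so the sketch is safe to run anywhere.
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/real_work
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Add application skeleton"

# Do the messy investigation on its own branch.
git checkout -q -b debug_this_whacky_thing
echo "debug output" > trace.txt
git add trace.txt
git commit -q -m "Add temporary trace output"
echo "real fix" > fix.txt
git add fix.txt
git commit -q -m "Fix store lookup"
useful=$(git rev-parse HEAD)

# Back on the real branch, take only the commit worth keeping.
git checkout -q real_work
git cherry-pick "$useful" >/dev/null
git log --oneline
```

The trace commit stays behind on the debug branch, which can be deleted once you are done with it.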

Commit Often

Small, frequent commits are better with just about any version control system and ESPECIALLY when using git. They tend to create fewer code conflicts during a merge. This does not mean committing every single line of code on a per-commit basis. However, you should commit every time you write code that changed something and are at a “stopping point”. Typically this is at the point of “ok I am going to test this now and see if it does what I expected”. Often it is best to do a “dry run” and make sure there are no blatant errors such as syntax errors before committing. Try to commit unbroken, if not functional, code. In other words, it should not crash whatever you are working on with an immediate and obvious “you forgot a curly bracket” error.

Use Branches

Like the debugging branch noted above, any time you start a new concept, path, model, design, or feature, start a new branch. Try to work from a root point, such as the last major release of a product or the last tested-to-be-working version of the software. Unless your new concept requires the code of a prior effort, going back to the root “last working base copy we published” is a good starting point. The project or team lead will merge code into a prerelease or production (master) branch or an integration branch to create a new product release version.
If you have done work on several feature branches that are not dependent on each other but work better together, create your own integration branch.   “my_super_pack” branch can be a merge-commit of your “feature_a”, “super_awesome_feature”, and “feature_b” branches.
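As a rough sketch of a personal integration branch, assuming two independent feature branches that both start from the same root point (the repository, files, and commit messages here are hypothetical):

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Publish last working base copy"

# Each feature starts from the same root point.
for feature in feature_a feature_b; do
    git checkout -q -b "$feature" master
    echo "$feature" > "$feature.txt"
    git add "$feature.txt"
    git commit -q -m "Add $feature"
done

# Personal integration branch that pulls the features together.
git checkout -q -b my_super_pack master
git merge -q --no-ff -m "Merge feature_a into my_super_pack" feature_a
git merge -q --no-ff -m "Merge feature_b into my_super_pack" feature_b
git log --graph --oneline
```

The feature branches themselves stay untouched, so the lead can still merge them one at a time.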

CSA Branch Standards

At CSA I like to use a 3-branch methodology for managing projects.    The branches are master, prerelease, and integration. All 3 branches are only aligned when a new production version is released and there is no ongoing development on the project.
master – always points to the latest production release available to the general public.   This is where the current commit pointer ends up after integration and prerelease phases are complete and the production scripts are executed.  This branch is always tagged with the current production release number.  Developers only start new branches here if a prerelease branch does not exist.
prerelease – always points to the latest release of the software that was published to the public in prerelease format. Prerelease software is not fully tested, though it has usually passed rudimentary functional testing. This is considered the “beta” version of the next master release. All CSA Premier Members and beta test groups are given access to prerelease software. This branch is always tagged with the current software version number, which is bumped if further changes are needed before “going to production”. Developers almost always start new branches here.
integration – this branch points to the current integration branch used by the project manager to pull together developer commits in preparation for conflict resolution and rudimentary software testing prior to being given an official “prerelease” stamp.  This is the release used for internal testing and development and should be considered unstable.    Developers rarely start new code branches on this branch.

 


Update Git On CentOS 6.3

As of this writing, CentOS 6.3 has a default git version of 1.7.1-2.  This version is what you will have installed if you run the typical install command:

# yum install git

However, GitHub and many other services require git version 1.7.10 or higher. It turns out there is a very easy way to get a newer git. You need to do this from a privileged account, but then the process is simple.

Add RPM Forge to Yum Repos

# wget 'http://packages.sw.be/rpmforge-release/rpmforge-release-0.5.2-2.el6.rf.x86_64.rpm'

# rpm --import http://apt.sw.be/RPM-GPG-KEY.dag.txt

# rpm -i 'http://packages.sw.be/rpmforge-release/rpmforge-release-0.5.2-2.el6.rf.x86_64.rpm'

# yum clean all

Install New Version from RPM Forge

# cd /etc/yum.repos.d
# vim (or whatever) rpmforge.repo

change the enabled=0 flag to enabled=1 in the section labelled [rpmforge-extras].

# yum  update
# yum provides git

This will return a longer list of available git modules.

Install the newer git by copying the FULL REPO NAME.   For example:

# yum install git-1.7.11.3-1.el6.rfx.x86_64 
# git --version

You should have the new release installed.

Now go back and edit rpmforge.repo and disable the rpmforge-extras repository.  Then ensure the yum directory is cleaned up.

# yum update

Cleaner Git Log With Merges

In some of our repositories now, Panhandler in particular, there are a lot of merge commits.  As the master branch containing the API definition gets updated, each driver branch merges in master so that it can be updated to target the latest API.  This results in the driver branches having various merge commits that bring in master, and that can make the output of a simple ‘git log’ more verbose than you may want.

One useful option for git-log is ‘--no-merges’, which will omit any merge commits.  If we have a repo like this

O---O---A---B---C---D---E---F    [master]
         \           \
  O---O---G---H---I---J---K      [driver]

and we run
$ git checkout driver
$ git log --no-merges
then we won’t see commits G or J, since those are points where master was merged into the driver branch.

But one issue is that we will still see the commits from the master branch.  That can be annoying if all we want to see are the driver commits.  So what we can do is tell Git that we want to exclude them, like so:
$ git log driver ^master
This should be read as, “Display all commits in ‘driver’ that are not in ‘master’”, which in this case shows every commit that exists only on the driver branch, merge commits G and J included. Add ‘--no-merges’ to drop those as well.
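Here is a small sketch that combines the two ideas on a scratch repository. The `commit` helper, file names, and commit messages are invented for the demonstration; only the final git-log invocation is the point.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

# Hypothetical helper for this sketch: commit <file> <message>
commit() {
    echo "$2" > "$1"
    git add "$1"
    git commit -q -m "$2"
}

commit api.txt "master: define API"
git checkout -q -b driver
commit driver.txt "driver: add driver"
git checkout -q master
commit api2.txt "master: extend API"
git checkout -q driver
git merge -q --no-ff -m "Merge master into driver" master
commit driver2.txt "driver: target new API"

# Only the commits made on 'driver', without merges or master's history.
driver_only=$(git log --no-merges --format=%s driver ^master)
echo "$driver_only"
```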

You have probably used the ‘..’ syntax before, writing something like
$ git log driver..origin/driver
to see all of the commits in the ‘origin/driver’ that are not in ‘driver’.  That ‘..’ syntax is actually a short-cut for the first example.  It is entirely equivalent to
$ git log ^driver origin/driver
“All of the commits which are not in ‘driver’ but that are in ‘origin/driver’”.

There are a lot of options for controlling which commits are displayed by git-log.  Ultimately those options stem from git-rev-list, which is the plumbing command used by git-log and many other parts of Git to produce a list of commits.  The documentation for git-rev-list has all the details about ‘History Simplification’, which can help you narrow down your focus when working in a repository that has a lot of merges.


More Info From Git Branches

For those of you that follow our posts on a regular basis, you’ll know we are somewhat biased toward using Git for version control.   While Git takes some getting used to, once you have the basics down it is a very powerful tool.   As you become more familiar with Git, you start to realize there are quite a few “power tools” available to you that let you delve deeper into the history of your project.   The following article posted by Eric Ritz on our corporate mailing list provides yet another useful tip on how to get more from Git.  Enjoy!

– Lance

Something I don’t see advertised very much is the fact that git-branch has a verbose parameter: ‘--verbose’, or just ‘-v’.  This parameter will show you which commit each branch is currently pointing to.

For example, from my copy of the Panhandler repo:

$ git branch -v
driver/commission-junction 9c9721e CJ: Implement set_results_page()
driver/ebay                e1f86f3 eBay: Update test script to test recent API changes
master                     599a2db Document the type of each member of PanhandlerProduct
* phpunit-test               19d97e5 TEMP Skip one test since I don’t have a connection

If you use the parameter twice you will get an additional piece of information: the upstream branch for each branch (if any) and how many commits separate the two.  As an example, this is from the MoneyPress Commission Junction repo:

$ git branch -vv
* (no branch) bc97baa Disable curl peer verification check
master      7f0b529 [origin/master: behind 27] Remove the old keywords.php
next        88a41ec [origin/next] Update version and update history for v1.0.4

The difference here is the inclusion of the upstream branches in square brackets.  And in the case of ‘master’ I see how far behind I am.

If you want a quick overview of where every branch is, then ‘git branch -vv’ is a useful way to get that.  Or ‘git branch -avv’ if you want to see all the remote branches too.  Personally, I find this useful after fetching a lot of updates, as it lets me see how far behind my tracking branches are.  It’s also a quick way to tell if multiple branches are pointing to the same commit, which I occasionally want to know.
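To see the “behind” marker for yourself, something like the following sketch should do it. It uses a local bare repository as a stand-in for the remote, and the file and commit messages are made up.

```shell
set -e
top=$(mktemp -d) && cd "$top"
git init -q --bare origin.git          # stand-in for the real remote
git init -q work && cd work
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$top/origin.git"

echo "one" > file.txt
git add file.txt
git commit -q -m "First commit"
git push -q -u origin master           # -u records origin/master as upstream

echo "two" >> file.txt
git commit -q -am "Second commit"
git push -q origin master

# Pretend we fell behind: rewind local master by one commit.
git reset -q --hard HEAD^
git branch -vv
```

The last command should list master with an `[origin/master: behind 1]` annotation.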

You may find it useful in those or other situations.


Working With Git

Here are some basic cheat-sheet style hints for working with the git version control system.

Creating A Repository

These are the steps for starting a new repository with git.  The commands here assume you will be working with a group of people that all need access to the repository from your server.  You will need command line access to the Linux shell to do this.

In our example we assume our developers all belong to the user group “dev” on the Linux box and our storage of the git repositories is in the /devdir/gitrepos directory.

  1. Log in as a user who belongs to the dev group.
  2. Make your way over to /devdir/gitrepos.
  3. Create the directory for your project. Make sure it ends with .git, e.g. wego.git.
  4. Go inside the directory you just made.
  5. Enter git --bare init --shared=group
  6. Finally do a chmod a+x hooks/post-update.

That last step is required because without it you would always have to run git update-server-info in the remote repository after every push. Git comes with a hook that automatically handles that for when you are pushing over SSH or HTTP, or really anything except Git’s own protocol.

Also the part about ending the repo name with .git is not mandatory—simply convention. People tend to suffix that whenever the repository is “bare”, meaning it has no working copy associated with it.
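As a shell sketch of the steps above, with a temporary directory standing in for /devdir/gitrepos (an assumption so the example can run anywhere). It uses the `git init --bare` spelling, and because recent Git versions ship the hook as hooks/post-update.sample, it copies that into place before the chmod.

```shell
set -e
# Assumption: a temporary directory stands in for /devdir/gitrepos.
gitrepos=$(mktemp -d)
cd "$gitrepos"
mkdir wego.git && cd wego.git
git init -q --bare --shared=group

# Newer Git installs only provide post-update.sample; enable it if needed.
if [ -f hooks/post-update.sample ] && [ ! -f hooks/post-update ]; then
    cp hooks/post-update.sample hooks/post-update
fi
if [ -f hooks/post-update ]; then
    chmod a+x hooks/post-update
fi

git rev-parse --is-bare-repository
```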

Adding Content To Your Newly Created Repository

Now that you have a new, and empty, git repository you are ready to begin work in your development directory.  Here we are assuming you are working on your development server.  This may not be, and often is not, the same server where you store your git repositories.

  1. Login to your development server.
  2. Go to the directory where you want to start working on your new wego project.
  3. Enter git init
  4. Enter git remote add origin ssh://yourname@yourdomain.com/devdir/gitrepos/wego.git/
    • Obviously substitute your login and repository name above.
  5. Load up something to put in the repo…
    1. touch README
    2. git add README
    3. git commit README
  6. git push origin master

If you clone an existing repo Git will set remote to the http (e.g. http://yourhost.yourdomain.com/wego.git/) address, which you cannot push back to because the server won’t allow it for security reasons; otherwise anyone in the world could update our repos. So do this right after you clone the repo:

  • git config remote.origin.url ssh://yourname@yourdomain.com/devdir/gitrepos/wego.git/

Then git push origin master will do the right thing.

For bonus points you can open the .git/config file inside your local repo (or create it) and add this:

[branch "master"]
        remote = origin
        merge = refs/heads/master
[remote "origin"]
        url = ssh://yourname@yourdomain.com/devdir/gitrepos/wego.git/
        fetch = +refs/heads/*:refs/remotes/origin/*

This tells Git that your master branch should track the origin branch. So now you can just type git push and git pull to upload and download commits.
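If you would rather not edit the config file by hand, pushing with -u should record the same tracking information for you. This is only a sketch: a local bare repository stands in for the ssh:// URL, and the README commit is the one from the steps above.

```shell
set -e
top=$(mktemp -d) && cd "$top"
git init -q --bare wego.git            # stand-in for the shared repository
git init -q wego && cd wego
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$top/wego.git"

touch README
git add README
git commit -q -m "Add README"

# -u sets branch.master.remote and branch.master.merge in .git/config.
git push -q -u origin master
git config branch.master.remote
git config branch.master.merge
```

After that, a plain git push or git pull on master knows where to go.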

Cloning That Repository Somewhere Else

Now that you’ve got stuff in your git repository other people are going to want to use it.   They can copy (clone) the repository by logging into their development server and using the git clone command:

# git clone ssh://theirname@yourdomain.com/devdir/gitrepos/wego.git/ ./wego

This assumes that theirname@yourdomain.com is a user in your development group on the server where the original repository lives.

On A New Development Server?  Setup Your Environment

This is what you’ll want to do if you are working on your own development server.  It will save you some typing when you add stuff to the repository when working in a shared development environment.

  1. git config --global user.name "Your Name"
  2. git config --global user.email "you@example.com"
  3. git config --global core.autocrlf true
  4. git config --global core.safecrlf warn

On A Shared Server? Setting Author Per Commit

# git commit --author='user <user@email.com>'

Ignore That File

You can tell Git to ignore various files in your directory structure by adding a file .gitignore into the parent directory.  Inside that file you can specify a file name, or a group of files via a wild-card match.   This gets checked against each file regardless of the directory to which it belongs, in essence it is a recursive match.

Ignore the .htaccess file and anything ending with .backup as an extension
 # vi .gitignore
 .htaccess
 *.backup
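One way to confirm that the patterns really match at every directory level is git check-ignore, which prints the paths Git would ignore. A minimal sketch, with a made-up directory layout:

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
printf '%s\n' '.htaccess' '*.backup' > .gitignore

# Hypothetical files, some nested, to show the match is recursive.
mkdir -p site/admin
touch .htaccess site/admin/.htaccess site/notes.backup site/index.php

# Prints only the ignored paths; index.php should not appear.
git check-ignore .htaccess site/admin/.htaccess site/notes.backup site/index.php || true
```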

Show Me The Branches

Get a visual look at the branches from a command line / ASCII terminal.

# git log --graph --oneline --decorate --all

Rename A Git Branch

You named your branch “temp/this-will-never-work” then realized what you did in there actually DID work.  Here is how you rename it locally and back on the remote box.

 # git checkout $branch
 # git branch -m minor/nystore
 # git push origin HEAD

This will add a NEW branch on the remote box, the old branch will still remain, so you’ll need to do some cleanup:

Local cleanup of remote references:

 # git branch -r -d origin/branch-to-delete

For git version 1.6.6.1 or higher:

 # git push origin --delete branch-to-delete

For older git version, you should upgrade, but this trick may help:

 # git push origin :branch-to-delete
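Put together, a rename might look like this sketch. It runs against a local bare repository standing in for the remote box; the branch names are the ones used above and the committed file is made up.

```shell
set -e
top=$(mktemp -d) && cd "$top"
git init -q --bare origin.git          # stand-in for the remote box
git init -q work && cd work
git symbolic-ref HEAD refs/heads/temp/this-will-never-work
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$top/origin.git"

echo "idea" > idea.txt
git add idea.txt
git commit -q -m "Try out an idea"
git push -q origin HEAD

# Rename locally, publish the new name, then remove the old remote branch.
git branch -m minor/nystore
git push -q origin HEAD
git push -q origin --delete temp/this-will-never-work

git ls-remote --heads origin
```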

Oops, Didn’t Mean That, Redo

To redo a commit you just pushed…

Delete the remote branch (you don’t always want to do this, be careful here) using the method described above in “Rename A Git Branch”.

Go back to the last group of edits before you made your commit:

  # git reset --soft HEAD^

The HEAD^ says “go back from where I am now (HEAD) by one level”.   The --soft tells git to “keep all my edits please”, as opposed to --hard, which throws away all of your work.
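The local half of that redo can be sketched as follows, on a scratch repository with made-up file contents and messages. After the soft reset the edits are still staged, so they can be committed again with a better message.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "one" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

echo "two" >> notes.txt
git commit -q -am "Oops, wrong message"

# Step back one commit but keep the edits staged.
git reset --soft HEAD^
git status --short

# Commit the same edits again, this time properly.
git commit -q -m "Extend notes with second entry"
```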

Cherry Picking

Cherry picking allows you to build a new branch based on select commits from other branches.  Here is an example:

# Add the files from this commit to your current branch, but don't commit them yet
# This just puts the updated files in your directory
#
git cherry-pick --no-commit e85cd343234521c392950a208d677843c92a0392
# Change the author on those new files & add them with a commit
#
git commit -a --author='Lance C <Lance-isnt-here@cybersprocket.com>'
# Go get another one...
#
git cherry-pick -n 15a5454403f4fd75f469d6410d6de06ab008027d
# Now commit the new stuff, using the commit message that was
# in the commit ID specified here
#
git commit -c 15a5454403f4fd75f469d6410d6de06ab008027d
# Go and delete the driver/cafepress branch on the origin repo
#
git push origin --delete driver/cafepress
# Rename our driver/temp-cafepress branch to driver/cafepress
#
git branch -m driver/temp-cafepress driver/cafepress
# Go push our stuff back to origin
#
git push origin HEAD

Tagging Your Work

Tags are useful for marking your commits with short phrases, release numbers, or other pieces of info that will help you identify important commits.  Here are some short tips on how to tag your work.

First, you’ll need a PGP key.   Login to your linux box and stay in your home directory.  Run this command:

gpg --gen-key

Look for the key ID in the first few lines of output.  It should look something like this:

gpg: /home/lcleveland/.gnupg/trustdb.gpg: trustdb created
gpg: key D7809C04 marked as ultimately trusted
public and secret key created and signed.


This 8-character key ID is your signing key.  Now use that to set your git configuration for your user so that your tags can be automatically signed:

git config --global user.signingkey $KEY_ID

Now to tag commits:

git tag -s v1.0
git push --tags origin master

Versioning Word Documents In Git

We need your help!


Cyber Sprocket is looking to qualify for a small business grant so we can continue our development efforts. We are working on a custom application builder platform so you can build custom mobile apps for your business. If we reach our 250-person goal we have a better chance of being selected.

It is free and takes less than 2 minutes!

Go to www.missionsmallbusiness.com.
Click on the “Login and Vote” button.
Put “Cyber Sprocket” in the search box and click search.
When our name comes up click on the vote button.

 

And now on to our article…

 

At first I didn’t know if I should write this email. I really, really, really do not like dealing with Word documents. It has nothing to do with Word specifically as a product; I hate documents in that kind of format in general, including the stuff OpenOffice.org produces. I don’t like working with WYSIWYG documents, at all. One argument I can make against using Word files on projects is that you can’t meaningfully put them in a repository.

Well—this isn’t true. You can do it, and actually do things like diff Word documents. So ultimately I decided it is more helpful to share this information than to secretly hide it in an attempt to keep people from using that God awful format. Of course, I’m going to regret it as soon as there’s some Word document in one of the repositories…

A rarely used feature of Git (in my experience) is its ability to assign ‘attributes’ to files. You do this by making a .gitattributes file in the repository. It is a text file that maps file names or globs to attributes. A simple example would be

*.fl[av] binary

This tells Git that all ‘flv’ and ‘fla’ files are binary, and therefore Git should never try to diff them or perform any CRLF conversions, regardless of any other settings.

Something else we can do with attributes is control how diffs are generated for files. For our specific task here, we want to tell Git to use our customized ‘diff driver’ for Word documents. We can start out by putting this in our attributes file:

*.doc diff=word

Now whenever Git diffs ‘doc’ files it will invoke the ‘word’ driver. Which means now we have to define that driver. We can do this in one of three places.

    1. Our personal, global .gitconfig file.
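    2. A .gitconfig file in the repository that can be shared by developers. (Git does not read this file on its own; each developer has to pull it in once, for example with git config include.path ../.gitconfig on Git 1.7.10 or newer.)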
    3. The .git/config file in the repository, which is not shared.

Adding support for diffing certain files is something we typically want to share with everyone on a project, so the second choice makes the most sense here. But the way we define the driver is the same regardless of where we actually do it. First I will show you what we have to put in the file to define the driver, then discuss it.

    [diff "word"]
        textconv = strings

The first line should look familiar if you have messed around with your .gitconfig file before; it is your typical INI file section header. When we assigned the attribute ‘diff=word’ that means Git will look for the section ‘diff "word"’ for the definition. The second line sets the ‘textconv’ property of the driver; this property names a program or command that is capable of translating the file into a text format which Git can then diff like normal. The ‘strings’ program is part of the GNU binutils package, which you can get on all platforms. It rips out all of the printable strings from a binary file.

With that said, it should be clear now how this helps us diff Word documents. Our driver passes in the ‘doc’ file to a program that can take out all of the printable strings. Even though Word is a binary format, it stores the text of the document as text strings that we can pull out. Once we have done that, Git is capable of diffing the file like normal, and we can meaningfully use tools like ‘git log -p’ to get an idea of the changes that some commit made to a Word document.

This technique can be used with any file format for which you can generate meaningful text output. For example, if you use a tool to take the metadata out of image files then you can make a driver for that and get useful diff info. This never affects the way Git stores these files; they will still be handled just like any other binary file. The benefits are only cosmetic, allowing us to use Git’s diffing tools to get a better idea of what changes have been applied to those binary files. But nonetheless, that information can be very useful when working with such files.
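For a quick way to wire this up and check it, here is a sketch that defines the driver in the repository’s own .git/config (the third, unshared location) and asks Git which diff driver it would pick for a Word file. The file name report.doc is hypothetical and does not need to exist for the check.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .

# Map Word documents to the 'word' diff driver.
echo '*.doc diff=word' > .gitattributes

# Define the driver; this writes a [diff "word"] section to .git/config.
git config diff.word.textconv strings

# Ask Git which diff attribute applies to a (hypothetical) document.
git check-attr diff -- report.doc
```

With that in place, and provided the strings program is installed, git log -p on a committed ‘doc’ file should show the extracted text rather than a binary-files notice.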


Basic Overview of Bare Git Repositories


During a meeting yesterday we were talking about our internal repo library and reading some file out of all of the repositories for the purposes of generating some type of overview page. And it was mentioned that this would be a little tricky since all of the Git repositories would be “bare repositories.” I want to explain what that means.

A bare repository is one with no working tree. That is, there is nothing checked out. At the top level of every repository you will find a ‘.git’ directory. If you made that directory the only one in the repository, then that repository would be bare. This is the reason why it is convention for bare repository directories to be named ‘foo.git’, since when you clone them they are the contents of ‘/foo/.git’.

For example, here is a bare version of a project database interface repository.

$ git clone --bare ~/Projects/3DC/DBInterface/
Initialized empty Git repository in /home/eric/Temp/DBInterface.git/

$ cd DBInterface.git/ && ls
branches/  config  description  HEAD  hooks/  info/  objects/  packed-refs  refs/

Notice the directory has none of the files checked out. You cannot do work in a bare repository, because of this. You cannot even try to check out a working tree.

$ git checkout HEAD^
fatal: This operation must be run in a work tree

So what’s the point of bare repositories then?

  1. They make useful conduits for pushing and pulling work. Git will fight like Hell to stop you from pushing into a repository with a work tree, because this is potentially destructive. Since bare repos have no work tree, this is a non-issue for them.
  2. They save space. All of the files for the repository are stored—one way or another—in the ‘objects’ directory in the example above. Checking out files will create a copy of much of those objects. If we need a repository only for pushing and pulling, and do not intend to do work there, then we save space by having nothing checked out.
  3. They are safer from accidental destruction. Because they have no working tree, many commands will fail automatically. You cannot screw up a bare repository via mistaken merge or rebase or whatever. They will all fail.

Note that all three of these benefits come from the fact that bare repositories have no working tree.

But while that is all nice, it complicates the task we were talking about. Namely, reading the contents of some given file from a bare repository. A moment’s consideration tells us that this must be possible somehow. If it were impossible to get at the file information, then how could we ever meaningfully clone a bare repository to do actual programming? So that file info is somewhere.

To figure out how to get it, we have to understand the low-level Git objects. There are four of them:

  1. Blobs
  2. Trees
  3. Commits
  4. Tags

In our case we only care about the first two. All objects have two components:

  1. SHA-1 Hash for a Name
  2. Content

The name is always a hash of the content. The nature of that content depends on the object in question. Blobs are nothing more than content and a hash of said content. When you add a file in a repository, Git creates a blob for the contents of that file. Blobs store nothing but the content; they do not even store things like file names. When people say that Git tracks content and not files, they are referring to this. Because blobs store only content, if you add duplicates of a file with different names, Git stores them only once, in one blob.
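You can see the “content, not file names” idea directly with git hash-object, which computes the name a blob would get. In this sketch the two file names are made up; because the content is identical, the two hashes should be identical too.

```shell
set -e
dir=$(mktemp -d) && cd "$dir"

# Two files, different names, same content.
printf 'same content\n' > first_name.txt
cp first_name.txt second_name.txt

# hash-object prints the blob name Git would use for each file.
hash_a=$(git hash-object first_name.txt)
hash_b=$(git hash-object second_name.txt)
echo "$hash_a"
echo "$hash_b"
```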

Git stores those file names in trees. The content of a tree is a list of file names and permissions associated with blobs or other trees. They are like the ‘directories’ of low-level Git objects. We can get at trees by using the so-called ‘plumbing’ commands. Let’s say we are still in the bare repository from the previous example. We want to see the files on the ‘master’ branch. We can do that with the ls-tree command, which takes a ‘tree-ish’ argument. In Git terminology, a ‘tree-ish’ is anything which can name a single tree, which may not be the name of a tree itself.

What I mean is this: we already know that a branch is nothing more than a pointer to a certain commit. A commit object is a snapshot of a single tree at a given point in history, along with some other information like message, commit time, author name, and so on. Because a commit is always a snapshot of one tree, a commit is an acceptable tree-ish.

This all means we can do this in our bare repository:

$ git ls-tree master
040000 tree 59913b97f0f1ce48b2a8c6f5d77fa986418f3292    Schemata

This output is telling us the ‘master’ branch has one tree. If we cloned and checked out the branch, we would have the ‘Schemata’ directory. Since we have the name of the tree here—the hash—we can look inside that directory without cloning anything or checking anything out.

$ git ls-tree 59913b97f0f1ce
100644 blob ecc73640542ac00ee0dbfb5e781fa219ea9f2abd    SharedUsers.MySQL

The tree has one blob inside, with the name ‘SharedUsers.MySQL’. We can look at that blob if we want.

$ git show ecc7364

[ Long SQL file here… ]

Using these two commands we can navigate the Git trees and blobs as if they were the directories and files of a non-bare repository. At this point I think the solution of pulling out info from a fixed file should be obvious. We run ls-tree on the ‘master’ branch (or whatever is our convention), grep for the hash of the blob for our file, and then print out its contents, probably into some other script or program.
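A rough sketch of that last step follows. It builds a tiny repository and a bare clone of it so the commands have something to read; the SQL content is invented, and awk is used in place of grep simply to pick the hash column out of the ls-tree output.

```shell
set -e
top=$(mktemp -d) && cd "$top"

# Build a tiny source repository, then a bare clone of it.
git init -q src && cd src
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
mkdir Schemata
echo "CREATE TABLE shared_users (id INT);" > Schemata/SharedUsers.MySQL
git add Schemata
git commit -q -m "Add shared users schema"
cd "$top"
git clone -q --bare src bare.git
cd bare.git

# ls-tree prints "<mode> <type> <hash><TAB><path>"; field 3 is the blob hash.
blob=$(git ls-tree master Schemata/SharedUsers.MySQL | awk '{print $3}')

# Print the blob's content without checking anything out.
contents=$(git cat-file blob "$blob")
echo "$contents"
```

When the path is fixed and known in advance, git show master:Schemata/SharedUsers.MySQL gets to the same content in one step.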


Undoing Mistakes Easily in Git

When I was explaining Git stuff to a co-worker on Friday, I told him that if anything ever goes wrong, he should just remember this to undo it:

$ git checkout @{1}

I told him that would put him back in the state he was in prior to whatever his last command was. I did not go on to give a full, technical explanation of the cryptic command, as I did not think he needed to know it; I rarely need to understand the mechanics of a safety device in order to benefit from it.

But here I’m going to explain what this odd looking command is really doing. And it is worth remembering.

Whenever the tip of a branch changes—maybe you make a commit, or reset something, or simply checkout another branch—Git records that change in something called the reflog. You can run the command git-reflog in any repository to get a peek of it. It looks something like this:

$ cd ~/Scripts/Git && git reflog show ejmr/git-meta
8419d22 ejmr/git-meta@{0}: rebase -i (finish): refs/heads/ejmr/git-meta onto fadfe03
7ffaf89 ejmr/git-meta@{1}: commit: Make git-redimen work
5478546 ejmr/git-meta@{2}: commit: META tweak a comment
63b9804 ejmr/git-meta@{3}: rebase finished: refs/heads/ejmr/git-meta onto fadfe0343019350f98cbf72c883df01dd5b5d6ae
024923b ejmr/git-meta@{4}: rebase finished: refs/heads/ejmr/git-meta onto 0e728470bc140928ea6de7e8551452a860f095aa
8a53c14 ejmr/git-meta@{5}: commit: WIP Initial attempt at rewriting git-redmine to use git-meta
72f73c1 ejmr/git-meta@{6}: commit: WIP First version of git-meta
f3af9ba ejmr/git-meta@{7}: Branch: renamed refs/heads/ejmr/fix-git-redmine to refs/heads/ejmr/git-meta

This is showing me the entire reflog for my git-meta branch that I had been working on recently. Note that from its conception the branch has changed eight times, and the reflog has recorded every one faithfully. By reading it backwards you can see where I renamed an existing branch, made two commits, rebased them on other work, made two more commits, then did an interactive rebase to smash things together.

That’s all fine and great. The real benefit of this is that I can return to any of those previous states. Running the command

$ git checkout ejmr/git-meta@{6}

will put me right back at my first attempt at writing the script. This underscores something fundamentally important about Git: look at how I rebased my work. Twice. Whenever you do a rebase and look at the results in some tool like gitk you would think those old commits have gone away. Clearly this is not the case, since I can get back to them via the reflog. The point I want to stress then is this:

Git garbage collects objects with no references. The reflog counts as a reference.

You probably now have a good idea of why I told my co-worker that command. Since the reflog counts your current state as {0}, {1} is always the previous state. The only thing left to explain is the absence of a branch name. If you run git-reflog without any arguments then it is the same as running

$ git reflog show HEAD

So likewise, the command I told him is the same as

$ git checkout HEAD@{1}

which moves HEAD back to its previous state. This can be a really useful way to undo some type of screw up. Accidentally mess up a rebase? Use the above. Forget the best way to undo a merge? Use the above. So on and so on.
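As a rough illustration of that kind of undo, here is a sketch that runs in a throwaway repository. The temporary directory, file name, and commit messages are all made up for the example. Note that checking out HEAD@{1} leaves you on a detached HEAD, which is fine for looking around; moving the branch itself back takes a further step such as git reset --hard.

```shell
#!/usr/bin/env bash
set -e

# Throwaway repository (hypothetical paths and names throughout).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo one > notes.txt
git add notes.txt
git commit -q -m "Add notes"
echo two >> notes.txt
git commit -q -am "Extend notes"
good=$(git rev-parse HEAD)

# The screw-up: throw away the second commit.
git reset -q --hard HEAD~1

# HEAD@{1} is where HEAD pointed before the reset.
git checkout -q 'HEAD@{1}'
restored=$(git rev-parse HEAD)

# Show the recorded history of HEAD movements.
git --no-pager reflog show HEAD
```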

But the reflog is not a silver bullet, for a few reasons:

  1. When you delete a branch, you also delete its reflog.
  2. By default, entries in any reflog are not kept longer than thirty days if they are unreachable from the current tip, that is, if the reflog is the only remaining reference to them.
  3. By default, entries are not kept longer than ninety days, under any circumstances.

Those last two times can be configured by setting the values of gc.reflogexpireunreachable and gc.reflogexpire respectively. I do not recommend setting either to "never" so that you have references to everything forever, because then Git will not garbage collect those otherwise unreachable objects, and you will potentially waste a lot of space.

Anyways, remember that

$ git checkout @{1}

is your friend.
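
For reference, here is a sketch of inspecting and changing those two settings for a single repository. The 180 and 60 day values are arbitrary examples, not recommendations, and the throwaway repository is only there so the commands are safe to run.

```shell
#!/usr/bin/env bash
set -e

# Throwaway repository so no real configuration is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Unset keys print nothing and exit non-zero, meaning the defaults apply.
git config --get gc.reflogExpire || echo "gc.reflogExpire: default"
git config --get gc.reflogExpireUnreachable || echo "gc.reflogExpireUnreachable: default"

# Example values only: keep reachable entries longer than unreachable ones.
git config gc.reflogExpire "180 days"
git config gc.reflogExpireUnreachable "60 days"

reachable=$(git config --get gc.reflogExpire)
unreachable=$(git config --get gc.reflogExpireUnreachable)
echo "$reachable / $unreachable"
```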

Commit Templates

On a lot of our commit messages we add a ‘Redmine-Ticket’ line with the appropriate ticket number. On some projects we also add a ‘Version’ line naming the build with which that patch is associated (using tags would be better for that in my opinion, but that’s a different subject). So most commits on one project have two lines like so:

Redmine-Ticket: 1763
Version: 6.80

Typing that over and over can get old, but fortunately it can be automated. Git allows you to specify a template file for commits, which you can do globally or per-repository by configuring the value of ‘commit.template’, which should name the template file. Let’s say I make a file called ‘ProjectCommit.template’ with these contents:

Redmine-Ticket:
Version: 6.80

Then I can do this inside the repository:

$ git config commit.template 'ProjectCommit.template'

Now whenever I commit anything, Git will automatically use the contents of that file as my initial commit message, which saves me the trouble of having to type that stuff out. I just add the ticket number and an appropriate message.

You do not have to commit these templates to use them. You can make your own template and then add a line in the file .git/info/exclude to ignore it locally. Or for certain projects it may be valuable to have a template as part of the repository that everyone uses.
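
Putting those pieces together, a minimal sketch of the per-repository setup might look like this. It runs in a throwaway repository, and the template file name is simply the one from the example above.

```shell
#!/usr/bin/env bash
set -e

# Throwaway repository for the demonstration.
repo=$(mktemp -d)
cd "$repo"
git init -q

# The template from the example above.
printf 'Redmine-Ticket:\nVersion: 6.80\n' > ProjectCommit.template
git config commit.template ProjectCommit.template

# Ignore the template locally instead of committing it.
mkdir -p .git/info
echo 'ProjectCommit.template' >> .git/info/exclude

template=$(git config --get commit.template)
untracked=$(git status --porcelain)
echo "template: $template"
```

If the local exclude rule works, git status reports nothing for the template file.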

Easy Documentation for Git, MySQL, PHP, et cetera

This is what I do on my box to quickly find documentation, which you guys may find helpful.  Especially those of you on Linux—although you could do this on Windows too.

Most package managers make available ‘-doc’ packages, like php-doc, mysql-doc, and so on.  Install these for all the major software you use.

Next, install ‘screen’.

Now put this in your Bash config:

# Displays various types of documentation.

function doc() {
    case "$1" in
    'llvm')
        screen -t 'LLVM Documentation' w3m /usr/share/doc/llvm-doc/html/index.html ;;
    'erlang')
        screen -t 'Erlang Documentation' firefox /usr/share/doc/erlang-doc-html/html/doc/index.html ;;
    'python')
        screen -t 'Python Documentation' w3m /usr/share/doc/python3-doc/html/index.html ;;
    'php')
        screen -t 'PHP Documentation' w3m /usr/share/doc/php-doc/html/index.html ;;
    'ghc')
        firefox /usr/share/doc/ghc6-doc/index.html & ;;
    'postgresql')
        screen -t 'PostgreSQL Documentation' w3m /usr/share/doc/postgresql-doc-8.4/html/index.html ;;
    'mysql')
        screen -t 'MySQL Documentation' w3m /usr/share/doc/mysql-doc-5.0/refman-5.0-en.html-chapter/index.html ;;
    'apache')
        screen -t 'Apache Documentation' w3m /usr/share/doc/apache2-doc/manual/index.html ;;
    'j')
        screen -t 'J Documentation' w3m ~/Software/j602/help/index.htm ;;
    'lua')
        screen -t 'Lua Documentation' w3m /usr/share/doc/lua5.1-doc/doc/index.html ;;
    'git')
        screen -t 'Git Documentation' w3m /usr/local/share/doc/git-doc/index.html ;;
    'lighttpd')
        screen -t 'Lighttpd Documentation' w3m /usr/share/doc/lighttpd-doc/ ;;
    'plt-scheme')
        screen -t 'PLT Scheme Documentation' w3m /usr/share/plt/doc/index.html ;;
    'gambit')
        screen -t 'Gambit Documentation' w3m /usr/share/doc/gambit-doc/html/index.html ;;
    'tintin++')
        screen -t 'TinTin++ Documentation' zless /usr/share/doc/tintin++/tintin19.txt.gz ;;
    'sqlite')
        screen -t 'SQLite Documentation' w3m /usr/share/doc/sqlite3-doc/index.html ;;
    'django')
        screen -t 'Django Documentation' w3m /usr/share/doc/python-django-doc/html/index.html ;;
    'sbcl')
        screen -t 'SBCL Documentation' w3m /usr/share/doc/sbcl-doc/html/index.html ;;
    'boost')
        screen -t 'Boost Documentation' w3m /usr/share/doc/libboost-doc/HTML/index.htm ;;
    'smalltalk')
        screen -t 'GNU Smalltalk Documentation' info Smalltalk  ;;
    'haskell-tutorial')
        screen -t 'Haskell 98 Tutorial' w3m /usr/share/doc/haskell98-tutorial/html/index.html ;;
    'haskell-report')
        screen -t 'Haskell 98 Report' w3m /usr/share/doc/haskell98-report/html/index.html ;;
    'java')
        firefox "/home/eric/Documents/Books/Programming/Java SDK/index.html" & ;;
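    *)
        # Unknown topic: say so instead of silently doing nothing.
        echo "doc: no documentation configured for '$1'" >&2
        return 1 ;;
    esac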
}

Replace ‘w3m’ with the browser you want to use.  And make sure the paths are correct.  If you’re on a Debian-based box, that’s where those doc packages will end up.

Now whenever you’re at the terminal you can simply run stuff like

$ doc git
$ doc postgresql

to browse through the official docs.
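
Since the paths differ from box to box, one possible variation is a small helper that checks the path before launching anything. This is only a sketch: open_doc and DOC_BROWSER are hypothetical names, not part of the function above.

```shell
# Hypothetical helper: open a documentation path only if it exists.
# DOC_BROWSER is an assumed variable name; it defaults to w3m.
DOC_BROWSER=${DOC_BROWSER:-w3m}

open_doc() {
    local title="$1" path="$2"
    if [ ! -e "$path" ]; then
        echo "doc: no documentation found at $path" >&2
        return 1
    fi
    screen -t "$title" "$DOC_BROWSER" "$path"
}
```

Each branch of the case statement could then become a call such as open_doc 'Git Documentation' /usr/local/share/doc/git-doc/index.html.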

For Git in particular you will have to build the HTML docs.  In the Git source directory:

$ make html
$ sudo make install-html

There ya go, easy way to look up docs quickly.

Version Control Your Home Directory

Git is useful not only for working with software projects, but also for keeping a history of things in your home directory.  If you’re on Linux, then I recommend putting your entire home directory under Git. And if you’re on Windows, put the most appropriate directory under version control.

At first this may sound a little off the wall.

However, I am not recommending that you track everything.  In fact, I suggest that you very carefully pick and choose what you want to track.  This is what I did: after initializing the repository in my home directory, I opened the file .git/info/exclude and added this:

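# Ignore everything at the top level
/*

Which tells Git to ignore everything in my home directory.  The leading slash anchors the pattern to the top of the repository; a bare ‘*’ would also match the files inside any directory you un-ignore later, so they would stay ignored.  Then after that I added specific rules to un-ignore the files I was interested in.  You can do this by prefixing the pattern with the ‘!’ character.  For example:

# Ignore everything at the top level
/*
# But save my Emacs files
!/.emacs
!/.emacs.d/
# And my Git config
!/.gitconfig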

And so on.  By doing this you can easily manage what Git tracks without creating a huge mess of your home directory repository.  Then you have all the benefits of using Git for things in your home directory, and can do stuff like push your home directory repository to one of our servers as a backup.
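
To check which files such rules actually let through, a sketch like the following can be run against a stand-in directory rather than your real home. The file names are invented, and the patterns are anchored with a leading slash so that files inside an un-ignored directory are not matched again by the catch-all rule.

```shell
#!/usr/bin/env bash
set -e

# Stand-in for a home directory (all file names are hypothetical).
home=$(mktemp -d)
cd "$home"
git init -q
mkdir -p .git/info .emacs.d Downloads
touch .emacs .emacs.d/init.el .gitconfig Downloads/big.iso

# Ignore everything at the top level, then un-ignore selected entries.
cat > .git/info/exclude <<'EOF'
/*
!/.emacs
!/.emacs.d/
!/.gitconfig
EOF

# Stage whatever is not ignored and list the result.
git add -A
tracked=$(git ls-files)
echo "$tracked"
```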

Git: Version 1.6.6

The next major version of Git—1.6.6—should be out any day now.  At which point I will let you guys know.  But I wanted to go ahead and tell you about the changes to take note of.

git-config

In your config files, variables that take paths can begin with ~/ or ~user/ and will be expanded as expected.  Something that caught Chris out today, actually.

git-checkout

If the remote repository has a branch called ‘foo’, and you do not have a local branch called ‘foo’, then the command:

$ git checkout foo

will create and checkout a local version of the remote branch.  It will be the same thing as currently doing:

$ git checkout -t origin/foo

git-fetch

There are a few new options to this command, but one really important one: --prune.  In Git 1.6.6 you can run:

$ git fetch --prune origin

which does the same thing as:

$ git fetch origin && git remote prune origin
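
A small sketch of the effect, using two throwaway repositories where one stands in for the remote. The directory and branch names are made up for the example.

```shell
#!/usr/bin/env bash
set -e
work=$(mktemp -d)
cd "$work"

# A stand-in "remote" with an extra branch (all names are hypothetical).
git init -q upstream
cd upstream
git config user.name "Example"
git config user.email "example@example.com"
echo a > f.txt
git add f.txt
git commit -q -m "Initial commit"
git branch stale
cd ..

git clone -q upstream clone

# The branch is deleted upstream after we cloned.
cd upstream
git branch -q -D stale
cd ../clone

# Count remote-tracking refs for the deleted branch before and after.
before=$(git branch -r | grep -c 'origin/stale' || true)
git fetch -q --prune origin
after=$(git branch -r | grep -c 'origin/stale' || true)
echo "origin/stale refs before: $before, after: $after"
```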

git-merge

Has a new option: --ff-only.  This will make the merge fail if it is not a fast-forward.  Currently you can do this by pushing.  For example, if I wanted to merge ‘foo’ into ‘bar’ but only if it would fast-forward, then I would have to do:

$ git push . foo:bar

But now if I am on the ‘bar’ branch I can do:

$ git merge --ff-only foo

Which semantically makes more sense.  The git-push form makes sense, but it is very idiomatic Git, whereas the new git-merge usage is very readable.

Because git-pull calls git-merge, it also has the --ff-only option.
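
To see both outcomes, here is a sketch in a throwaway repository using the ‘foo’ and ‘bar’ branch names from above. The file names and commit messages are invented for the example.

```shell
#!/usr/bin/env bash
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "Base commit"
git branch bar
git checkout -q -b foo
echo foo > foo.txt
git add foo.txt
git commit -q -m "Add foo.txt on foo"

# bar is strictly behind foo, so the merge is a fast-forward.
git checkout -q bar
if git merge -q --ff-only foo; then first=fast-forwarded; else first=refused; fi

# Let the branches diverge: one new commit on each side.
echo bar > bar.txt
git add bar.txt
git commit -q -m "Add bar.txt on bar"
git checkout -q foo
echo more >> foo.txt
git commit -q -am "Extend foo.txt on foo"

# Now a fast-forward is impossible, so --ff-only refuses to merge.
git checkout -q bar
if git merge --ff-only foo >/dev/null 2>&1; then second=fast-forwarded; else second=refused; fi
echo "first: $first, second: $second"
```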

git-rebase --interactive

Has a new command called ‘reword’.  It does the same thing ‘edit’ does during an interactive rebase, except it only lets you edit the commit message without returning control to the shell.  This is extremely useful.  Currently if you want to clean up your commit messages you have to:

$ git rebase -i next

Then set all the commits to ‘edit’.  Then on each one:

# Change the message in your editor.
$ git commit --amend
$ git rebase --continue

Using ‘reword’ instead of ‘edit’ lets you skip the git-commit and git-rebase calls.

git-notes

This is completely new.  It lets you add notes to commits without actually changing the commit.  The notes are separate objects in the Git database which are associated with the commits.  If you are in charge of a repository then you will likely find this useful for keeping notes on various commits in dev branches.
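
A short sketch of what that looks like in practice, run in a throwaway repository. It uses the git notes add form found in current Git releases, and the note text is just an example.

```shell
#!/usr/bin/env bash
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo hi > f.txt
git add f.txt
git commit -q -m "Add f.txt"
before=$(git rev-parse HEAD)

# Attach a note to the commit; the commit object itself is not rewritten.
git notes add -m "Needs review before merging" HEAD
after=$(git rev-parse HEAD)
note=$(git notes show HEAD)

# The note is displayed alongside the commit in the log output.
git --no-pager log -1
```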

git-push

A lot of people were pissed off with Git 1.6.0 because it broke backwards compatibility with a number of commands, even though the community tried to tell people in advance.  Git 1.7.0 will be coming out after 1.6.6, and again there will be compatibility issues.  So this time around the community is really making sure that people know about it, by issuing huge warnings with commands that are going to change.

This relates to git-push because two uses of the command will not work by default in 1.7.0:

1. Trying to push into the currently checked out branch.

2. Trying to delete the remote’s current branch, that is, running git push $remote :$foo where $foo is the current HEAD in $remote.

Because both of these will be disabled in 1.7.0, you will see loud warnings for them in 1.6.6.  You really should not be doing either of these things anyways.