Git: Checking out the Next Commit

Use case: I sometimes like to browse a repository by going to its initial commit and stepping through the first few commits one by one to see how the repository evolved. This is instructive when learning unfamiliar domains and languages, and it gives insight into other people's problem-solving processes.

Solution (Windows):

  1. Check out the initial commit using git checkout $(git rev-list --reverse --max-parents=0 HEAD | Select-Object -First 1).
  2. Navigate to the next commit by running git checkout $(git rev-list --reverse <branch> ^HEAD | Select-Object -First 1) as many times as desired, where <branch> is the name of the default branch (e.g. main or master).

Solution (Linux):

  1. Check out the initial commit using git checkout $(git rev-list --reverse --max-parents=0 HEAD | head -n 1).
  2. Navigate to the next commit by running git checkout $(git rev-list --reverse <branch> ^HEAD | head -n 1) as many times as desired, where <branch> is the name of the default branch (e.g. main or master).
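
If you step through repositories often, the two Linux commands can be wrapped in small shell functions. This is only a sketch: the names git_first and git_next are made up for illustration (they are not git commands), and defaulting the branch to main is an assumption.

```shell
# Hypothetical helpers; neither name is a built-in git command.

# Check out the oldest parentless commit reachable from HEAD.
git_first() {
    git checkout --quiet "$(git rev-list --reverse --max-parents=0 HEAD | head -n 1)"
}

# Check out the next commit on the way to the given branch (assumed default: main).
git_next() {
    local branch="${1:-main}"
    local next
    next="$(git rev-list --reverse "$branch" ^HEAD | head -n 1)"
    if [ -z "$next" ]; then
        # Nothing is reachable from the branch that HEAD cannot already reach.
        echo "No commits left between HEAD and $branch" >&2
        return 1
    fi
    git checkout --quiet "$next"
}
```

After defining them in your shell, run git_first once, then git_next (or git_next master) as many times as desired.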

If you want an explanation of these commands, read on.

Getting the initial commit

git rev-list --max-parents=0 HEAD returns the commits reachable from HEAD that don't have a parent commit. The initial commit is one of them, since by definition it has no parent. But the command can return more than one commit. Why? A repository can have multiple parentless commits, for example after its history has been rewritten or an unrelated history has been merged in. The React repository is a good example of this. If we run git rev-list --max-parents=0 HEAD on that repository, we get four commits:

a33c0b897c502f82f59fe283e71323b9cb0503a6
848327760f4d351e41f75385709c7748cfff9164
5e0dfdac54e509a97483af1f78af09451c03bfbb
75897c2dcd1dd3a6ca46284dd37e13d22b4b16b4

To figure out the correct one, let's use git rev-list --max-parents=0 HEAD --pretty to prettify the output:

commit a33c0b897c502f82f59fe283e71323b9cb0503a6
Author: ****** <*******.com>    
Date:   Tue May 11 16:10:34 2021 -0700         
                                            
    Initial commit                             
                                            
commit 848327760f4d351e41f75385709c7748cfff9164
Author: ****** <*******.com>    
Date:   Tue Aug 13 10:09:17 2019 -0700         
                                            
    Initializing empty merge repo              
                                            
commit 5e0dfdac54e509a97483af1f78af09451c03bfbb
Author: ****** <*******.com>    
Date:   Tue Jan 22 11:04:37 2019 -0800         
                                            
    Initial commit                             
                                            
commit 75897c2dcd1dd3a6ca46284dd37e13d22b4b16b4
Author: ****** <*******.com>    
Date:   Wed May 29 12:46:11 2013 -0700         
                                            
    Initial public release     

I blanked out the authors' details for privacy, but we can see these commits are ordered newest first. We want the earliest commit, which is 75897c2dcd1dd3a6ca46284dd37e13d22b4b16b4 from 2013. To get it to the top of the list, we run git rev-list --reverse --max-parents=0 HEAD to reverse the order of the output:

75897c2dcd1dd3a6ca46284dd37e13d22b4b16b4 <-- we want this
5e0dfdac54e509a97483af1f78af09451c03bfbb
848327760f4d351e41f75385709c7748cfff9164
a33c0b897c502f82f59fe283e71323b9cb0503a6

To extract that commit hash, we can pipe the output of this command into another command. This is what the | does. It takes the output of the left-hand side and sends it as input to the right-hand side. In our case, the left-hand side is git rev-list --reverse --max-parents=0 HEAD and the right-hand side is Select-Object -First 1 or head -n 1, depending on which shell is being used (PowerShell on Windows, or a Unix shell such as bash on Linux). Select-Object -First 1 and head -n 1 do the same thing here: they output only the first N lines of their input. We're specifying 1 as we only want the first line. Now we have our commit hash.

$(...) runs a subcommand (this is called command substitution in Unix shells and the subexpression operator in PowerShell), so everything inside the parentheses is executed first. Once finished, its output is substituted into the surrounding command. So, using the React example above, running our subcommand essentially leads to git checkout 75897c2dcd1dd3a6ca46284dd37e13d22b4b16b4 being executed.
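
A throwaway repository makes these pieces easy to try out. The sketch below is only an illustration: the helper name commit_at, the branch name other, the commit messages and the dates are all made up, and merging an unrelated history is just one way a repository can end up with two parentless commits.

```shell
repo="$(mktemp -d)"
cd "$repo"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# commit_at DATE MESSAGE: hypothetical helper that makes an empty commit dated DATE.
commit_at() {
    GIT_AUTHOR_DATE="$1" GIT_COMMITTER_DATE="$1" \
        git commit --quiet --allow-empty -m "$2"
}

git init --quiet
commit_at "2013-05-29T12:00:00" "first root"
git branch -M main

# An orphan branch starts a second history whose first commit has no parent.
git checkout --quiet --orphan other
commit_at "2019-01-22T12:00:00" "second root"

# Merging it makes both parentless commits reachable from main.
git checkout --quiet main
git merge --quiet --allow-unrelated-histories -m "merge other" other

# Newest first by default: "second root", then "first root".
git log --max-parents=0 --format=%s HEAD
# Oldest first with --reverse: "first root", then "second root".
git log --reverse --max-parents=0 --format=%s HEAD

# The pipe keeps only the first line; $(...) hands it to git checkout.
git checkout --quiet "$(git rev-list --reverse --max-parents=0 HEAD | head -n 1)"
git log -1 --format=%s    # prints "first root"
```

git log is used here instead of git rev-list only so that commit subjects are printed rather than hashes; it accepts the same --max-parents and --reverse options.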

Moving to the next commit

The only difference between getting the initial commit and moving to the next one is that we use git rev-list --reverse <branch> ^HEAD as the first part of the subcommand. For the branch, I'll use main to make things more concrete.

Starting small and ignoring the --reverse flag for now, running git rev-list main ^HEAD gets all commits that are reachable from main and excludes those that are reachable from HEAD. What does that mean? Let's say we have the following set of commits on the main branch:

A<-B<-C<-D<-E

where A is the initial commit and E is the most recent commit. Note the arrows pointing backwards. These are the links to the parent commits. Assume we've checked out commit B:

A<-B<-C<-D<-E
   │
   └ HEAD points here

So which commits are reachable from main? All of them. E has a parent of D which has a parent of C and so on. rev-list will follow those parent links to get the list of commits reachable from main. What commits are reachable from HEAD? We can get to B since we're already there, and following the parent link gets us to A. So we have:

Reachable from main: A<-B<-C<-D<-E
Reachable from HEAD: A<-B

As a reminder, the command we're using will exclude those reachable from HEAD so we're left with:

C<-D<-E

If we did nothing else, the command would output these commits newest first: E, D, then C. But E isn't the next commit after B. C is. We use the --reverse flag to reverse this order, giving us C, D, then E. As with getting the initial commit, these results are piped to either Select-Object or head, depending on the shell, and we take only the first result. That result is the next commit we want to git checkout. In our example, this would execute git checkout C. Note that this walkthrough assumes a linear history like the one above; on a branch that contains merges, the first commit in the reversed list is not guaranteed to be a direct child of HEAD.
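
The A to E example can be reproduced in a scratch repository. This is a minimal sketch that assumes a linear history of five empty commits whose messages are the letters used above; the temporary directory and the demo identity are placeholders.

```shell
repo="$(mktemp -d)"
cd "$repo"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init --quiet
for subject in A B C D E; do
    git commit --quiet --allow-empty -m "$subject"
done
git branch -M main

# Detach HEAD at B, three parent links back from E at the tip of main.
git checkout --quiet main~3

# Reachable from main but not from HEAD, newest first: E, D, C.
git log --format=%s main ^HEAD

# Same set, oldest first; the first line is the next commit.
next_subject="$(git log --reverse --format=%s main ^HEAD | head -n 1)"
echo "$next_subject"    # prints C

# The actual step: check out that commit by hash.
git checkout --quiet "$(git rev-list --reverse main ^HEAD | head -n 1)"
git log -1 --format=%s  # prints C
```

Repeating the last checkout moves HEAD to D and then to E, after which the rev-list output is empty because nothing on main is left to visit.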
