GitHub rebasing for the terrified

This is a follow-up to my GitHub primer post, where I introduced most of the git concepts. One of the things that catches a lot of people out when they first use git is the idea of rebasing. While you're working on the code base, other people can be working in their own forks, so things can get out of date, and this can cause problems when you go to merge your changes back in. So how do you avoid getting into "rebase hell"? This is the stuff I wish I'd properly appreciated before I started, and it includes the commands I use to get out of problems.

I'm going to repeat the diagram from the previous post because I'm going to refer to some steps in it by number:

I'm going to refer to the "main" branch here as it's now the default that GitHub creates, but many repositories will still be using the old default of "master" as their upstream development branch. If your project uses "master", or you're working from an alternate upstream branch, substitute that instead in everything that follows. In all of these examples my personal branch name is "fixsomethingbranch" but obviously that can be whatever you like, and is typically something more meaningful to what you are doing.

So here are my tips for avoiding rebase hell:

1. Don't implement things on the "main" branch

First things first: in order to avoid getting into trouble in the first place, as a rule of thumb for most projects DON'T SKIP STEP 5 IN THE DIAGRAM - YOU SHOULD AVOID MAKING CHANGES ON THE `main` BRANCH. I feel this needs to be in bold! This branch should be kept as clean as possible in your local file system and should only pick up changes that have been integrated into the upstream codebase. Why is this? It means you always have a "known stable base" on your local file system to apply your changes to. When you want to start doing some work, create your branch from the main branch by doing:
  • git checkout main; git checkout -b fixsomethingbranch
Or as a single command:
    • git checkout -b fixsomethingbranch main

    You could even have a "create new git branch" alias to avoid missing this:
    • alias gnb="git checkout main -b"
    Although that might confuse things if you have some projects using "main" and others using something else - you'd need separate aliases, so it might be better to stick with the full version.
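If you want to convince yourself of what this gives you, here is a minimal sketch you can run in a scratch directory. The file name and the "Demo User" identity are placeholders I've made up for the demo; nothing here touches a real project.

```shell
set -e
# Throwaway repository so nothing here touches a real project.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/main     # call the first branch "main" on any git version
git config user.name "Demo User"          # placeholder identity, local to this repository
git config user.email "demo@example.com"

echo "stable" > README
git add README
git commit -q -m "Initial commit on main"

# Create the working branch explicitly from main, then do the work there.
git checkout -q -b fixsomethingbranch main
echo "my fix" >> README
git commit -q -am "Fix something"

# main is untouched: the branch has commits main lacks, but not the other way round.
git rev-list --count main..fixsomethingbranch
git rev-list --count fixsomethingbranch..main
```

The two counts at the end are the commits that are on your branch but not on main, and the reverse. As long as the second one is zero, your local main is still a clean base.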

    2. Keep your checked out "main" in sync with upstream regularly

    I'd settle for "Before you start work on a new branch" as a definition of "regularly". Step 4 in the diagram covers this step, but it's always a good idea to keep your local copy of the branch in sync with the upstream repository. Assuming you have a remote set up called "up" tracking the upstream repository[*] you can do this with:
    • git checkout main
    • git pull up main
    • git push origin main
    There shouldn't be any merge conflicts at this point, as you won't be making changes on main. If you have committed to main (or possibly if someone has force pushed changes into the upstream repository's main branch) then you can use:
    • git checkout main
    • git reset --hard up/main
    • git push -f origin main
    If you've just run the previous commands, the initial checkout shouldn't be necessary, but I included it again because I'm paranoid, and I don't want you to be on the wrong branch when you push things back.

    [*] - If you don't have a suitable upstream, you really ought to have one, so create one with git remote add up https://github.com/project/repository - the name "up" is up to you - many people use "upstream" instead of up so bear that in mind when reading anyone else's examples, but I prefer having fewer characters to type...
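To see the sync in action without touching GitHub, here is a rough sketch that uses two local bare repositories as stand-ins for the upstream project and your fork. The directory names (upstream.git, fork.git, seed, work) and the demo identity are my own inventions for the example; only the three commands at the end are the ones from this section.

```shell
set -e
base=$(mktemp -d)

# Two bare repositories stand in for the upstream project and your fork on GitHub.
git init -q --bare "$base/upstream.git"
git init -q --bare "$base/fork.git"

# Seed both with the same first commit on main.
git init -q "$base/seed"
cd "$base/seed"
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "one" > file.txt
git add file.txt
git commit -q -m "First upstream commit"
git push -q "$base/upstream.git" main
git push -q "$base/fork.git" main

# Your local checkout: "origin" is the fork, "up" is the upstream project.
git clone -q -b main "$base/fork.git" "$base/work"
cd "$base/work"
git remote add up "$base/upstream.git"

# Meanwhile, someone else's change lands upstream.
( cd "$base/seed" \
  && echo "two" >> file.txt \
  && git commit -q -am "Second upstream commit" \
  && git push -q "$base/upstream.git" main )

# The sync itself: the same three commands as above.
git checkout -q main
git pull -q up main
git push -q origin main

git log --oneline main
```

After this, the local main, the fork's main and the upstream main all point at the same commit, which is exactly the state you want before creating a new branch.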

    3. Rebase your branch on main if possible before submitting a PR

    While you don't necessarily have to do this, particularly if it's a change you've made quickly since syncing with upstream, it's good practice to ensure your changes will merge smoothly. It becomes more important if you know some changes affecting the same parts of the code have gone into the upstream codebase (and if you have unit tests, it's good to rebase and run them before submitting a PR to reduce the risk of failure). While you can submit a PR and then rebase, even in the GitHub UI, it's nicer to your reviewers if they don't have to see unfinished changes. If you're not confident with this process yet, it might be worth taking a note of the commit hash at the tip of your branch first, so in the worst case you can git reset --hard <hash> to get back to where you started before attempting the rebase. To rebase your branch onto main, follow the process in the previous section to bring your local copy of the main branch up to date, then go back to your own branch and perform a rebase as follows:
    • git checkout fixsomethingbranch
    • git rebase main
    If there is a merge conflict then you'll get a message like this which shows a conflict in the file ansible/vagrant/Vagrantfile.CentOS6 because my local branch had changed the same lines as another merged PR and git doesn't know what to do with it:

    [sxa@sainz AdoptOpenJDK_Unix_Playbook]$ git rebase main
    First, rewinding head to replay your work on top of it...
    Applying: pbTests: switch mirrorlist to vault on centOS6 yum repo
    Using index info to reconstruct a base tree...
    A ansible/Vagrantfile.CentOS6
    Falling back to patching base and 3-way merge...
    Auto-merging ansible/vagrant/Vagrantfile.CentOS6
    CONFLICT (content): Merge conflict in ansible/vagrant/Vagrantfile.CentOS6
    error: Failed to merge in the changes.
    Patch failed at 0001 pbTests: switch mirrorlist to vault on centOS6 yum repo
    hint: Use 'git am --show-current-patch' to see the failed patch

    Resolve all conflicts manually, mark them as resolved with
    "git add/rm <conflicted_files>", then run "git rebase --continue".
    You can instead skip this commit: run "git rebase --skip".
    To abort and get back to the state before "git rebase", run "git rebase --abort".

    While this looks like a big scary message, it's not that bad in practice. It gives you all of the information you need. In this case there is one problematic file, "ansible/vagrant/Vagrantfile.CentOS6". You can see the local patch that you made in your branch with "git am --show-current-patch" (or "git rebase --show-current-patch" on newer versions of git). If you edit the file you'll see marker lines starting with <<<<<<<, ======= and >>>>>>>. Between the first two are the changes that came from main (marked as <<<<<<< HEAD), and between the last two are the lines you've changed in your branch. Edit the file to make it look "correct" after resolving the conflict (be sure to remove the marker lines!). Note that there may be more than one conflict in each file - I usually search the file for "====" to locate them.

    Once you've done the edits, save the file(s), exit your editor and use git add on the files you've changed, then run git rebase --continue. You can also use git rebase --skip, which drops the commit from your branch that is currently being applied, but that's not usually what you want as the change in that commit will be missing from your branch afterwards. If you decide to give up on the rebase altogether at this point because it's too complicated to deal with right now, you can issue git rebase --abort.
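If you'd like to practise this somewhere safe first, the sketch below manufactures a conflict in a throwaway repository and then resolves it. The file settings.txt, its contents and the demo identity are all made up for the example. GIT_EDITOR=true is only there so the script doesn't stop to open an editor for the commit message.

```shell
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "mirror=original" > settings.txt
git add settings.txt
git commit -q -m "Initial settings"

# Your branch changes the line one way...
git checkout -q -b fixsomethingbranch main
echo "mirror=vault" > settings.txt
git commit -q -am "Switch mirror to vault"
before=$(git rev-parse HEAD)   # safety net: git reset --hard "$before" would undo the rebase

# ...while main changes the same line another way.
git checkout -q main
echo "mirror=archive" > settings.txt
git commit -q -am "Switch mirror to archive"

# The rebase stops on the conflict; "|| true" keeps this script going.
git checkout -q fixsomethingbranch
git rebase main >/dev/null 2>&1 || true
cat settings.txt               # shows the <<<<<<<, ======= and >>>>>>> markers

# Resolve: write the content you actually want, stage it, and continue.
echo "mirror=vault" > settings.txt
git add settings.txt
GIT_EDITOR=true git rebase --continue >/dev/null

git log --oneline main..fixsomethingbranch
```

In real life the "resolve" step is you editing the file by hand rather than overwriting it with echo, but the add and continue steps are the same.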

    Once you've completed that, you'll probably need to force push your branch up again if you've already pushed it once, since you're changing the commit history that you had already pushed to your branch:
    • git push -f origin
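A slightly more cautious variant, which is not what I use above but may be worth knowing about, is git's --force-with-lease option: it only overwrites the remote branch if it is still where your local copy last saw it. The sketch below shows a plain push being rejected after a rebase and the forced push going through. The local bare repository fork.git is a stand-in for GitHub, and the file names and identity are made up.

```shell
set -e
base=$(mktemp -d)
git init -q --bare "$base/fork.git"       # stand-in for your fork on GitHub

git init -q "$base/work"
cd "$base/work"
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
git remote add origin "$base/fork.git"

echo "base" > file.txt
git add file.txt
git commit -q -m "Base"
git push -q origin main

git checkout -q -b fixsomethingbranch main
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix something"
git push -q origin fixsomethingbranch

# main moves on, and the branch is rebased onto it, which rewrites its commit.
git checkout -q main
echo "more" > other.txt
git add other.txt
git commit -q -m "Other work"
git checkout -q fixsomethingbranch
git rebase -q main

# A plain push is refused because the history no longer matches.
if git push -q origin fixsomethingbranch 2>/dev/null; then
    echo "plain push succeeded"
else
    echo "plain push rejected"
fi

# Overwrite the remote branch, but only if nobody else has moved it in the meantime.
git push -q --force-with-lease origin fixsomethingbranch
```

Whether you use -f or --force-with-lease is a matter of taste on a branch only you push to; the second is simply harder to hurt yourself with if someone else has pushed in the meantime.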

    4. Help! I didn't start on main and now have extra commits in my PR!

    I refer you to point 1 in this blog post :-)

    OK, if you're asking for help with this I'll cover how to get out of it... Sometimes people find they've ended up with "extra" commits when they create a PR and are unsure how it happened. This typically happens if you started on another branch with some of your changes, and created the new branch from there (other changes could have gone into the upstream repository first, which will make the commit hashes different). If you follow the convention of always checking out from a clean "main" branch you should not run into this problem. The best thing to do if you end up in this situation is to remove the extra commits from your branch history, and then you can rebase on main (if necessary) to pull them back in. To find the commits you need to erase from your history, use:
    • git log --oneline
    to see how far back you need to go to cover all of the erroneous commits, then use:
    • git rebase -i HEAD~5
    The number 5 here is an example (and note that the character before it is a tilde, not a hyphen) - go back as far as you need to cover all of the commits you want to remove. When your editor starts, change "pick" to "drop" (or "d") for the extra commits you have from the old branch, then save. That will remove the commits from your history. If you had already pushed your branch back up to GitHub, you'll need to "force push" to send your changes up:
    • git push -f origin fixsomethingbranch
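Here is a small sketch of that clean-up in a throwaway repository. Normally you would change "pick" to "drop" by hand in your editor; so that the example can run unattended I've set GIT_SEQUENCE_EDITOR to a sed command that makes the same edit to the first line of the list. The branch name oldbranch, the file names and the identity are made up for the demo.

```shell
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base"

# An older branch with unrelated work...
git checkout -q -b oldbranch main
echo "old" > old.txt
git add old.txt
git commit -q -m "Unrelated old work"

# ...and the new branch accidentally started from it instead of from main.
git checkout -q -b fixsomethingbranch oldbranch
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix something"

git log --oneline main..fixsomethingbranch    # two commits, one of which is the extra

# Equivalent to changing the first "pick" to "drop" by hand in the editor.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/drop/'" git rebase -i HEAD~2 >/dev/null 2>&1

git log --oneline main..fixsomethingbranch    # only "Fix something" is left
```

The oldest commit is listed first in the rebase list, which is why the sed command edits line 1 here. In a real clean-up, check each line against the git log output before dropping anything.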

    5. Help - someone else pushed to my branch!

    In some projects, the committers may have the ability to push directly to your branch on GitHub. They may do this to rebase on top of the latest changes. If they do this then your local copy will be out of sync with the GitHub copy, and when you next pull you will get an error such as this:

    hint: You have divergent branches and need to specify how to reconcile them.
    hint: You can do so by running one of the following commands sometime before
    hint: your next pull:
    hint:
    hint:   git config pull.rebase false  # merge
    hint:   git config pull.rebase true   # rebase
    hint:   git config pull.ff only       # fast-forward only
    hint:
    hint: You can replace "git config" with "git config --global" to set a default
    hint: preference for all repositories. You can also pass --rebase, --no-rebase,
    hint: or --ff-only on the command line to override the configured default per
    hint: invocation.
    fatal: Need to specify how to reconcile divergent branches.

    In general you probably want to rebase, so either use the second git config command in the example to set that as a default, or use git pull --rebase.
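As a rough illustration, the sketch below recreates the situation with a local bare repository playing the part of GitHub and a second clone playing the part of the committer. For simplicity the committer just adds a commit to your branch rather than rebasing it. All the directory names, file names and identities are invented for the demo.

```shell
set -e
base=$(mktemp -d)
git init -q --bare "$base/fork.git"       # stand-in for your fork on GitHub

# Your clone, with a branch pushed up ready for a PR.
git init -q "$base/mine"
cd "$base/mine"
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
git remote add origin "$base/fork.git"
echo "base" > file.txt
git add file.txt
git commit -q -m "Base"
git checkout -q -b fixsomethingbranch main
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix something"
git push -q -u origin fixsomethingbranch

# A committer pushes an extra commit straight to your branch.
git clone -q -b fixsomethingbranch "$base/fork.git" "$base/theirs"
(
  cd "$base/theirs"
  git config user.name "Helpful Committer"
  git config user.email "committer@example.com"
  echo "tweak" > tweak.txt
  git add tweak.txt
  git commit -q -m "Committer tweak"
  git push -q origin fixsomethingbranch
)

# Meanwhile you commit locally, so the two copies have now diverged.
echo "more" > more.txt
git add more.txt
git commit -q -m "More of my work"

# Replay your local commit on top of what is now on the remote, then push as normal.
git pull -q --rebase
git push -q origin fixsomethingbranch

git log --oneline main..fixsomethingbranch
```

Because your commit was replayed on top of theirs, the final push is an ordinary fast-forward and needs no -f.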

    Summary

    Hopefully if you're reading this there has been something useful in there for you. Once you've got used to the operations in here they don't seem so scary, but they can be really daunting at first as you worry about whether you're going to lose any changes you've made or get stuck in a rebase. This post should help you avoid that situation in the first place, or get out of it if you're stuck.
