Keeping a git repository in sync after forking
With my recent motivation to renew this site, I’ve built the site’s visuals from scratch using what’s called a starter theme, Sage. To build upon this theme and to increase interactivity with the greater internet, I’m applying principles and ideas I’ve learned from the Indieweb community.
As I’ve been integrating new features into Sage, I’m essentially logging the base changes I’m making to Sage and have forked the Sage repository into an Indieweb version that I’ve named Sage Indieweb.
I had a few goals in creating this new repository (repo):
- Update Sage to include updated Microformats (Indieweb requirement)
- Make minimal updates to the original code
- Non-destructively add other code changes using WordPress specific functions and hooks
- Provide a central repository to keep track of theme updates that might help other WordPress theme developers to update their own themes
- Keep Sage Indieweb in sync with the Sage 9 point releases
The last bullet point took me months to figure out how to accomplish without spending hours manually copying and pasting the various updates. This fork started with Sage version 9.0.5 and I’ve recently updated it to 9.0.7 (the latest in the master branch, to be precise) using a series of git commands that took a lot of trial and error. I had to piece together a bunch of Stack Overflow answers and Git tutorials to make this work, and even what I’m documenting might not be as efficient as it could be.
After forking Sage 9.0.5 and converting it into a new repo, I made several changes that I added to the repo over the course of a few months. My slow pace was not just from a lack of coding time; it was also anxiety from watching the original Sage repository make progress while not knowing how to bring in those changes without manual work.
A couple of weeks ago, I had the itch to finally update Sage Indieweb to match Sage 9.0.7, two point releases ahead of the fork. I dove in further to learn more about a few git commands: git remote, git fetch, and git cherry-pick. Finally, I pushed through a lot of research and came up with what I hope will be a good workflow to update other repositories in this situation.
The new Sage Indieweb repo I created had all the same history up to the 9.0.5 release, so it seemed to make sense to just merge the newer Sage updates directly into Sage Indieweb. However, with more reading, I realized this wasn’t a good option because I want to keep the git history clean.
Here was my rough mental model:
- Add a remote repository for Sage to associate to my repository, maybe using git subdirectory or subtree, maybe just remote branch
- Fetch the updated changes to Sage
- Squash the newer commits into one combined commit
- Merge squashed commit bundle into my repository
Unfortunately, I was a little off, and the plan ended up looking like this:
- Add a remote repository for Sage
- Fetch the updated changes to Sage
- Cherry pick a range of commits into Sage Indieweb
- Squash the cherry picked commits into one combined commit
- Update Github repository
Let’s go through every step.
Add a remote repository for Sage
I was certain there had to be a way for me to associate the new repo with the original but nothing described what I wanted. Both repos are, effectively, unrelated since I took the forked repo, removed all the history, and made it into a new standalone repository on Github with a clean git history.
$ git remote -v
origin https://github.com/asuh/sage-indieweb.git (fetch)
origin https://github.com/asuh/sage-indieweb.git (push)
After trial and error, I realized it was possible to just add a new remote repository with a new name. This is how I associated it with the new repo.
$ git remote add sage https://github.com/roots/sage.git
$ git remote -v
origin https://github.com/asuh/sage-indieweb.git (fetch)
origin https://github.com/asuh/sage-indieweb.git (push)
sage https://github.com/roots/sage.git (fetch)
sage https://github.com/roots/sage.git (push)
Okay, so let’s get everything from the repo.
Fetch the updated changes to Sage
There might be a more efficient way for me to do this but I just fetched all of Sage including its various branches and tags.
$ git fetch sage
This made me nervous that the fetch might have changed my local files, but the coast was clear.
$ git status
On branch master
Your branch is up to date with 'origin/master'.
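On the "more efficient way" question, here is a minimal sketch of fetching a single branch without tags, and of checking that a fetch only updates remote-tracking refs and leaves the working tree alone. The directories upstream-demo and fork-demo are placeholders invented for this example so it runs locally without network access; in the real workflow the remote is the Sage repository added above.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)                       # throwaway sandbox

# Hypothetical upstream: one commit on master, one more on a side branch.
git init -q "$work/upstream-demo"
cd "$work/upstream-demo"
git symbolic-ref HEAD refs/heads/master # force the branch name for the demo
git commit -q --allow-empty -m "upstream: first commit"
upstream_tip=$(git rev-parse master)
git checkout -q -b side
git commit -q --allow-empty -m "upstream: side work"

# Hypothetical fork with its own, unrelated history.
git init -q "$work/fork-demo"
cd "$work/fork-demo"
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "fork: first commit"
git remote add sage "$work/upstream-demo"

# Fetch only master and skip tags, instead of every branch and tag.
git fetch -q --no-tags sage master

tracked_tip=$(git rev-parse refs/remotes/sage/master)
if git rev-parse -q --verify refs/remotes/sage/side >/dev/null; then
  side_branch="fetched"
else
  side_branch="not fetched"
fi
dirty=$(git status --porcelain)         # empty when nothing local changed
echo "sage/master at $tracked_tip, side branch $side_branch"
```

The fetched commits are only reachable through sage/master at this point; nothing lands on the local master branch until a later merge or cherry-pick.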
Now comes the trickiest part.
Cherry pick a range of commits into Sage Indieweb
Originally, I was hoping I could just squash all the newest commits down into one commit and merge that new commit. Maybe this is possible, but I didn’t see a straightforward way to do it without using a merge command, and even that didn’t look like a good fit.
I found out that it’s possible to cherry pick a range of commits. It took lots of trial and error in practice since I wasn’t sure if I needed additional flags. In fact, I was hoping I could just add a flag like --squash to the cherry pick command to squash everything together as I’m cherry picking that range, but that was not possible.
$ git cherry-pick 37c7e0d..19057f6
error: could not apply 9040a3d... Normalize and enforce single quotes in scripts
hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add <paths>' or 'git rm <paths>'
hint: and commit the result with 'git commit'
Darn, it’s a merge conflict.
So what I’m telling git to do is grab all the commits from just after 9.0.5, which has a SHA-1 of 37c7e0d, all the way to the latest commit. The two dots between the SHAs signify a range: every commit after the first one, up to and including the second.
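Before running a ranged cherry-pick, it can help to preview what the two-dot range selects. The sketch below builds a small throwaway history (the range-demo directory and the four empty commits are invented for illustration) and lists the commits in the range oldest first, along with a count of merge commits in the range, which are the ones that cause trouble later.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)                  # throwaway sandbox
git init -q "$work/range-demo"
cd "$work/range-demo"

for n in 1 2 3 4; do
  git commit -q --allow-empty -m "commit $n"
done

start=$(git rev-parse HEAD~2)      # plays the role of 37c7e0d (the 9.0.5 commit)
end=$(git rev-parse HEAD)          # plays the role of 19057f6 (the latest commit)

# Commits reachable from $end but not from $start, oldest first.
# $start itself is excluded from the range.
picked=$(git log --reverse --format=%s "$start..$end")

# Merge commits inside the range; a plain cherry-pick stops on each of them.
merge_count=$(git rev-list --count --merges "$start..$end")

echo "$picked"
echo "merge commits in range: $merge_count"
```

Running the same two commands against the real SHAs gives a rough idea of how many stops to expect before starting.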
You can see the first of many merge conflicts or issues that stopped the process along the way. The errors were either because of my Indieweb updates to Sage or because of remote merges that occurred on the Sage repo.
Wait, what? Cherry picking doesn’t apply merge commits from the remote repo on its own because they have more than one parent, so git can’t tell which side of the merge to replay. I had to choose how to bring in each of those upstream merges myself. So confusing.
So, when I got a regular merge conflict like the one above, I went to my editor, made the updates and necessary changes, then continued the cherry picking process.
$ git cherry-pick --continue
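The resolve-and-continue loop can be reproduced in a sandbox. This is a minimal sketch with invented names (conflict-demo, style.txt, the upstream-demo branch); the echo that rewrites the file stands in for fixing the conflict in an editor.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_EDITOR=true             # accept the default commit message unattended
work=$(mktemp -d)                  # throwaway sandbox
git init -q "$work/conflict-demo"
cd "$work/conflict-demo"
git symbolic-ref HEAD refs/heads/master
echo "quote = double" > style.txt
git add style.txt
git commit -q -m "base"

# Hypothetical upstream change on its own branch.
git checkout -q -b upstream-demo
echo "quote = single" > style.txt
git commit -q -am "Normalize quotes"
upstream_commit=$(git rev-parse HEAD)

# A competing local change to the same line on master.
git checkout -q master
echo "quote = backtick" > style.txt
git commit -q -am "local tweak"

# The pick exits non-zero when it stops on a conflict.
if git cherry-pick "$upstream_commit" >/dev/null 2>&1; then
  result="applied cleanly"
else
  result="conflict"
  echo "quote = single" > style.txt   # stand-in for editing the file by hand
  git add style.txt                   # mark the conflict as resolved
  git cherry-pick --continue >/dev/null
fi

subject=$(git log -1 --format=%s)
total=$(git rev-list --count HEAD)
echo "$result, then committed: $subject"
```

If a pick goes badly wrong, git cherry-pick --abort returns the branch to where it was before the sequence started.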
When I came across a conflict caused by an upstream merge commit, it was more complicated. Lots of wrong commands helped me learn I had to add a flag to choose what’s called the mainline, which tells git which parent of the merge to treat as the base when replaying its changes. It’s still confusing to me but you can read more about it.
$ git cherry-pick -m 1 93ee95d
On branch master
Your branch is ahead of 'origin/master' by 9 commits.
(use "git push" to publish your local commits)
You are currently cherry-picking commit 93ee95d.
nothing to commit, working tree clean
The previous cherry-pick is now empty, possibly due to conflict resolution.
If you wish to commit it anyway, use:
git commit --allow-empty
If you wish to skip this commit, use:
git reset
Then "git cherry-pick --continue" will resume cherry-picking
the remaining commits.
Oh man, now I’m confused again. But after referring to the logs some more, researching what was going on, and getting more errors, I decided to just $ git commit --allow-empty to get past this, since I likely already had the commit’s changes in there.
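To make the mainline idea more concrete, the sketch below builds a tiny history with one merge commit and cherry-picks it with -m 1. All names (mainline-demo, the upstream-demo and feature branches, the text files) are invented for illustration. Parent 1 of a merge is the branch that was checked out when the merge was made, so -m 1 replays what the merge brought in from the other branch.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)                  # throwaway sandbox
git init -q "$work/mainline-demo"
cd "$work/mainline-demo"
git symbolic-ref HEAD refs/heads/master
echo base > base.txt
git add base.txt
git commit -q -m "base"

# Hypothetical upstream history: a feature branch merged back with a merge commit.
git checkout -q -b upstream-demo
git checkout -q -b feature
echo feature > feature.txt
git add feature.txt
git commit -q -m "add feature"
git checkout -q upstream-demo
echo other > other.txt
git add other.txt
git commit -q -m "other upstream work"
git merge -q --no-ff -m "Merge feature" feature
merge_commit=$(git rev-parse HEAD)

# A merge commit records one "parent" header per parent.
parent_count=$(git cat-file -p "$merge_commit" | grep -c '^parent ')

# Back on master, replay the merge relative to its first parent:
# only the changes that arrived from the feature branch are applied.
git checkout -q master
git cherry-pick -m 1 "$merge_commit" >/dev/null

echo "parents: $parent_count"
ls
```

In this sketch other.txt does not come along, because it was already part of parent 1 and so is not part of what the merge changed relative to the mainline. When those changes are already present on the target branch, the pick ends up empty, which matches the "previous cherry-pick is now empty" message above.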
All in all, I ran a lot of $ git log and $ git cherry-pick --continue to get through this. In the end, I verified that everything was brought over.
Squash the cherry picked commits into one combined commit
While the command $ git merge --squash
is what I wanted to use, I ended up having to use $ git rebase -i 32bb46d
, 32bb46d
being the latest commit made, to date, to the Sage master branch in order to squash everything down to one combined commit. I need to read more to see if the commit is required because it would have been nice to just $ git rebase -i
.
I looked through the log of what I committed and the commit message needed to be cleaned up as well. Since I hadn’t pushed it to the repo yet, I edited it with $ git commit --amend to give it an appropriate title.
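For comparison, here is one non-interactive way to get the same squashed result. It is not what I ran, just a sketch of an alternative: a soft reset moves the branch back to a base commit while keeping every change staged, and a single commit then records all of it. The squash-demo directory, notes.txt, and the commit messages are invented for the example.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)                  # throwaway sandbox
git init -q "$work/squash-demo"
cd "$work/squash-demo"
echo base > notes.txt
git add notes.txt
git commit -q -m "last commit before the picks"
base=$(git rev-parse HEAD)         # the commit everything gets squashed on top of

# Three hypothetical cherry-picked commits.
for n in 1 2 3; do
  echo "change $n" >> notes.txt
  git commit -q -am "picked commit $n"
done

# Move the branch pointer back to $base; the index and files stay as they are...
git reset -q --soft "$base"
# ...so one commit now captures the combined changes with a fresh message.
git commit -q -m "Update to upstream release"

squashed=$(git rev-list --count "$base..HEAD")
lines=$(grep -c '' notes.txt)
echo "commits after base: $squashed"
```

Like the interactive rebase, this rewrites history, so it is only safe before the commits have been pushed.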
Update Github repository
Once all of this was cleaned up and the history looked good, it was time to update the repository. $ git push and off it goes!
Thoughts about this process
This was an exercise in frustration as I was trying to figure out how to do all of this. Git is a fascinating tool that can accomplish a lot, especially when working on teams and using feature branches.
- $ git log is an ok default but there’s so much more I could use it for visually
- I wish upstream merges weren’t so tough to merge into local repositories. This use of mainline is not obvious and I feel like there’s a UX improvement that could be made here, similar to an interactive rebase view.
The learning process isn’t always pretty, but I have a better grasp of other concepts that’ll spin off from merging and fetching.