Git newcomers often steer clear of git rebase because of its reputation as a piece of Git magic, but used correctly it can make a team's life considerably easier. In this article we compare git rebase with the closely related git merge command and identify where rebase fits into a typical real-world Git workflow.
Concept overview
First of all, understand that git rebase solves the same problem as git merge. Both commands integrate changes from one branch into another; they just go about it in different ways.
Consider this scenario: you start developing a new feature in a dedicated branch, and meanwhile another team member updates the main branch. The result is a forked commit history, which will be familiar to anyone who has used Git as a collaboration tool.
Now assume the new commits in the main branch are relevant to the feature you are developing. To bring them into your feature branch, you have two options: merge and rebase.
Use merge
The simplest option is to merge the main branch into the feature branch:
git checkout feature
git merge main
Or condense this into a single line:
git checkout feature && git merge main
This creates a new merge commit in the feature branch that ties together the histories of both branches.
Merging is friendly because it is non-destructive: the existing branch history is not changed in any way. This avoids all the pitfalls of rebasing (discussed in detail below).
On the other hand, it also means the feature branch gains an extraneous merge commit every time you need to incorporate upstream changes. If the main branch is very active, these merge commits can pollute the feature branch's history considerably. Advanced git log options can mitigate the problem, but the history can still be hard for other developers to follow.
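The merge behaviour described above can be reproduced in a throwaway repository. The following is only a sketch: the temporary directory, the commit helper, the file names and the committer identity are all made up for illustration.

```shell
#!/bin/sh
# Sketch: fork a history in a scratch repository, then merge main into feature.
set -e
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the initial branch "main"

# Hypothetical helper: commit one small file named after its argument.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b feature
commit feature-work
git checkout -q main
commit main-work                        # main moves on while feature is in progress

git checkout -q feature
feature_before=$(git rev-parse HEAD)
git merge -q --no-edit main             # adds a merge commit to feature

# A merge commit records two parents; the first one is the old feature tip.
parent_count=$(git cat-file -p HEAD | grep -c '^parent ')
first_parent=$(git rev-parse HEAD^1)
git log --oneline --graph
```

The graph printed at the end shows the fork and the merge commit that joins it, while the earlier feature commit keeps its original hash.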
Use rebase
As an alternative to merging, you can rebase the feature branch onto the tip of the main branch:
git checkout feature
git rebase main
This moves the entire feature branch so that it begins at the tip of the main branch, which brings in all of the new commits from main. However, instead of creating a merge commit, rebasing rewrites the history of the original branch by creating a brand new commit for each of its commits.
The major benefit of rebasing is a much cleaner project history. First, it eliminates the extra merge commits that git merge requires. Second, rebasing produces a linear history: you can follow the tip of the feature branch all the way back to the starting point of the branch without encountering any forks. This makes the history easier to navigate with commands such as git log, git bisect and gitk.
However, this tidy history comes at two costs: safety and traceability. First, if you do not follow the golden rule of rebasing, rewriting project history can be disastrous for a collaborative workflow. Second, rebasing loses the context that a merge commit provides, so you cannot tell when upstream changes were incorporated into the feature branch.
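To see both effects (new commits and a linear history) in practice, the sketch below rebases a small feature branch in a scratch repository. The directory, the commit helper and the file names are illustrative assumptions only.

```shell
#!/bin/sh
# Sketch: rebase feature onto main and inspect what changed.
set -e
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main

# Hypothetical helper: commit one small file named after its argument.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b feature
commit feature-1
commit feature-2
git checkout -q main
commit main-work

git checkout -q feature
old_tip=$(git rev-parse HEAD)
git rebase -q main                      # replay feature-1 and feature-2 on top of main
new_tip=$(git rev-parse HEAD)

# The replayed commits are new objects, and no merge commit exists anywhere.
merge_commits=$(git rev-list --merges --count HEAD)
main_is_ancestor=no
git merge-base --is-ancestor main feature && main_is_ancestor=yes
echo "old tip: $old_tip"
echo "new tip: $new_tip"
git log --oneline --graph
```

The two tips differ even though the file contents of the feature commits are the same, which is exactly why rewritten history is a problem once other people have the old commits.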
Interactive rebase
Interactive rebasing lets you alter commits as they are moved to the new base. This is even more powerful than an automated rebase, since it gives you complete control over the branch's commit history. Typically it is used to tidy up a messy history before merging a feature branch into main.
To start an interactive rebase, pass the -i option to the git rebase command:
git checkout feature
git rebase -i main
This opens a text editor listing all of the commits that are about to be moved:
pick 33d5b7a Message for commit #1
pick 9480b3d Message for commit #2
pick 5c67e61 Message for commit #3
This listing defines exactly what the branch will look like after the rebase. By changing the pick command or reordering the entries, you can make the branch's history look however you want. For example, if the second commit fixes a small problem in the first, you can squash the two together by using the fixup command instead of pick:
pick 33d5b7a Message for commit #1
fixup 9480b3d Message for commit #2
pick 5c67e61 Message for commit #3
When you save and close the file, Git performs the rebase according to your instructions. In the example above, commit #2 is folded into commit #1, so the resulting history contains two commits instead of three.
Eliminating insignificant commits like this makes the project history easier to read and understand. This is something git merge simply cannot do.
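The same fixup can be scripted for experimentation. In this sketch the GIT_SEQUENCE_EDITOR environment variable replaces the text editor with a sed command that changes the second line of the todo list from pick to fixup; in everyday use you would edit the list by hand. The repository, the commit helper and the commit names are assumptions made for the example.

```shell
#!/bin/sh
# Sketch: run an interactive rebase non-interactively by scripting the todo edit.
set -e
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main

# Hypothetical helper: commit one small file named after its argument.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b feature
commit one
commit two                              # pretend this fixes something in "one"
commit three

# Git appends the todo file path to this command; sed rewrites line 2 in place.
GIT_SEQUENCE_EDITOR="sed -i.bak '2s/^pick/fixup/'" git rebase -q -i main

count=$(git rev-list --count main..feature)
git log --oneline main..feature
```

After the rebase the branch holds two commits: "one", which now also contains the changes from "two", and "three".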
The golden rule of rebasing
Once you understand what rebasing is, the most important thing to learn is when not to do it. The golden rule of git rebase is to never use it on public branches.
For example, think about what would happen if you rebased the main branch onto your feature branch.
The rebase moves all of the commits in main onto the tip of feature. The problem is that this happened only in your local repository. All of the other developers are still working with the original main. Since rebasing creates brand new commits, Git will think that your main branch has diverged from everyone else's.
The only way to synchronize the two main branches is to merge them back together, which produces an extra merge commit and two sets of commits containing the same changes (the original ones, and the ones from your rebased branch). Needless to say, this is a very confusing situation.
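The divergence can be made visible with a local bare repository standing in for the shared remote. This is a sketch under assumed names (the directory layout, the commit helper and the commit names are invented); it deliberately breaks the golden rule in a sandbox to show the result.

```shell
#!/bin/sh
# Sketch: rebase a published main branch and measure how far it diverges.
set -e
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

tmp=$(mktemp -d)
git init -q --bare "$tmp/remote.git"    # stands in for the team's shared repository
git init -q "$tmp/work"
cd "$tmp/work"
git symbolic-ref HEAD refs/heads/main

# Hypothetical helper: commit one small file named after its argument.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b feature
commit feature-work
git checkout -q main
commit main-work
git remote add origin "$tmp/remote.git"
git push -q -u origin main              # everyone else now builds on this main

git rebase -q feature                   # golden rule broken: main is public

# Count commits that exist only on origin/main (left) or only on local main (right).
set -- $(git rev-list --left-right --count origin/main...main)
behind=$1
ahead=$2
echo "local main is $ahead ahead of and $behind behind origin/main"
```

The published main-work commit and its rewritten copy carry the same change but are different commits, so the local branch ends up both ahead of and behind the remote.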
So before you run git rebase, always ask yourself, "Is anyone else using this branch?" If the answer is yes, stop and think about a non-destructive way to make your changes (for example, the git revert command). Otherwise, you are safe to rewrite history as much as you like.
Force-Pushing
If you try to push the rebased main branch back to a remote repository, Git will refuse because your local branch conflicts with the remote main branch. However, you can force the push to go through by passing the --force option, like this:
# Be very careful with this command!
git push --force
This overwrites the remote main branch to match the rebased one in your repository, which makes things very confusing for the rest of your team. So only use the force option when you know exactly what you are doing.
One of the few times you should force-push is when you have cleaned up a private feature branch locally after already pushing it to a remote repository. This is like saying, "Oops, I did not really want to push that original version of the feature branch. Take the current one instead." Even then, it is important that nobody else is working off the original version of the branch.
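The private-branch case can be sketched as follows. Note one addition that is not part of the text above: the example uses git push --force-with-lease, a standard variant of --force that only overwrites the remote branch if it is still where your last fetch or push left it. The bare repository, the commit helper and the commit names are assumptions for the demonstration.

```shell
#!/bin/sh
# Sketch: push a private branch, clean it up locally, then replace the pushed version.
set -e
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

tmp=$(mktemp -d)
git init -q --bare "$tmp/remote.git"    # stands in for the remote repository
git init -q "$tmp/work"
cd "$tmp/work"
git symbolic-ref HEAD refs/heads/main

# Hypothetical helper: commit one small file named after its argument.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git remote add origin "$tmp/remote.git"
git checkout -q -b feature
commit one
commit two
git push -q -u origin feature           # private branch pushed, e.g. as a backup

# Local cleanup: fold the second commit into the first.
GIT_SEQUENCE_EDITOR="sed -i.bak '2s/^pick/fixup/'" git rebase -q -i main

# A plain push is refused because the histories no longer line up.
if git push -q origin feature 2>/dev/null; then
    plain_push=accepted
else
    plain_push=rejected
fi

# Assumption: nobody else uses this branch, so replacing it is acceptable.
git push -q --force-with-lease origin feature

local_tip=$(git rev-parse feature)
remote_tip=$(git -C "$tmp/remote.git" rev-parse refs/heads/feature)
echo "plain push: $plain_push"
```

If a collaborator had pushed to the branch in the meantime, the lease check would fail instead of silently discarding their commits, which is why some teams prefer it to a bare --force.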
Workflow practice
Whatever the size of your team, rebasing can be integrated smoothly into its existing workflow. In this section, we look at the benefits rebasing can offer at various stages of a feature's development.
The first step in any workflow that uses git rebase is to create a dedicated branch for each feature. This gives you the branch structure needed to use rebase safely.
Local cleanup
One of the best ways to bring rebasing into your workflow is to clean up local, in-progress features. By periodically performing an interactive rebase, you can make sure each commit in your feature is focused and meaningful. This lets you write code without worrying about breaking it up into isolated commits, because you can fix the history up after the fact.
When calling git rebase, you have two options for the new base: the feature's parent branch (for example, main), or an earlier commit in the feature itself. We saw an example of the first option in the section on interactive rebase. The latter option is handy when you only need to fix up the last few commits. For example, the following commands begin an interactive rebase of only the last three commits:
git checkout feature
git rebase -i HEAD~3
By specifying HEAD~3 as the new base, you are not actually moving the branch; you are just interactively rewriting the three commits that follow it. Note that this does not bring upstream changes into the feature branch.
If you want to rewrite the entire feature branch with this method, the git merge-base command can find the original base of the feature branch. The following command returns the commit ID of that base, which you can then pass to git rebase:
git merge-base feature main
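One way to feed that commit ID straight into git rebase is command substitution. The sketch below does this in a scratch repository and scripts the todo edit with GIT_SEQUENCE_EDITOR so it can run unattended; the repository, the commit helper and the commit names are hypothetical.

```shell
#!/bin/sh
# Sketch: rewrite the whole feature branch in place, starting from its original base.
set -e
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main

# Hypothetical helper: commit one small file named after its argument.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
base=$(git rev-parse HEAD)
git checkout -q -b feature
commit one
commit two
git checkout -q main
commit main-work                        # upstream work that must not be pulled in
git checkout -q feature

fork_point=$(git merge-base feature main)

# Interactively, this would be: git rebase -i "$(git merge-base feature main)"
GIT_SEQUENCE_EDITOR="sed -i.bak '2s/^pick/fixup/'" git rebase -q -i "$fork_point"

remaining=$(git rev-list --count "$fork_point"..feature)
echo "feature now has $remaining commit(s) on top of $fork_point"
```

Because the new base is the branch's own starting point, the feature commits are rewritten where they are and the newer commit on main stays out of the branch.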
Interactive rebasing like this is a good way to introduce git rebase into your workflow, as it only affects local branches. All other developers see is your finished product: a clean, easy-to-follow feature branch history.
But again, this only works for private feature branches. If you are collaborating with other developers on the same feature branch, that branch is public, and you must not rewrite its history.
git merge offers no equivalent way to clean up local commits.
Introduce changes from upstream
At the beginning of this article, we saw how a feature branch can incorporate upstream changes from main using either git merge or git rebase. Merging is the safe option that preserves the complete history, whereas rebasing creates a linear history by moving the feature branch onto the tip of main.
This use of git rebase is similar to a local cleanup (and can be performed at the same time), except that it also incorporates the upstream commits from main.
Keep in mind that you can rebase onto a remote branch instead of main. This can happen when you are collaborating on the same feature with another developer and need to incorporate their changes into your repository.
For example, if you and another developer named John have both added commits to the feature branch, then after you fetch the remote feature branch from John's repository, your local feature branch and john/feature will have diverged.
You can resolve this fork exactly as you integrate upstream changes from main: either merge john/feature into your local feature branch, or rebase your local feature branch onto the tip of john/feature.
Note that this rebase does not violate the golden rule, because only your local feature commits are being moved; everything before them is untouched. This is like saying, "Add my changes to what John has already done." In most circumstances, this is more intuitive than synchronizing with the remote branch via a merge commit.
By default, the git pull command performs a merge, but you can make it integrate the remote branch with a rebase by passing the --rebase option.
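The collaboration scenario can be sketched with two clones of a local bare repository. The layout is an assumption made for the example: a shared remote named origin rather than a remote pointing at John's own repository, plus invented directory, helper and commit names.

```shell
#!/bin/sh
# Sketch: two developers commit to the same feature branch; one pulls with --rebase.
set -e
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

tmp=$(mktemp -d)

# Hypothetical helper: commit one small file in the repository given as $1.
commit() { (cd "$1" && echo "$2" > "$2.txt" && git add "$2.txt" && git commit -q -m "$2"); }

git init -q --bare "$tmp/remote.git"
git -C "$tmp/remote.git" symbolic-ref HEAD refs/heads/feature

# John creates the feature branch and publishes it.
git init -q "$tmp/john"
git -C "$tmp/john" symbolic-ref HEAD refs/heads/feature
commit "$tmp/john" base
git -C "$tmp/john" remote add origin "$tmp/remote.git"
git -C "$tmp/john" push -q -u origin feature

# You clone it, then both of you add a commit.
git clone -q "$tmp/remote.git" "$tmp/me"
commit "$tmp/john" john-work
git -C "$tmp/john" push -q origin feature
commit "$tmp/me" my-work

# Replay your local commit on top of John's instead of creating a merge commit.
cd "$tmp/me"
git pull -q --rebase origin feature

top=$(git log --format=%s -n 1)
below=$(git log --format=%s -n 1 HEAD~1)
merge_commits=$(git rev-list --merges --count HEAD)
git log --oneline --graph
```

Setting the pull.rebase configuration option to true makes a plain git pull behave this way by default.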
Using pull requests for feature review
If you use pull requests as part of your code review process, you should avoid using git rebase after creating the pull request. As soon as you open the pull request, other developers will be looking at your commits, which means the branch is now public. Rewriting its history at that point makes it impossible for Git and your teammates to determine which commits belong to the feature.
Any changes from other developers need to be incorporated with git merge instead of git rebase.
For this reason, it is usually a good idea to clean up your commit history with an interactive rebase before submitting the pull request.
Integrate approved features
After a feature has been approved by your team, you have the option of rebasing it onto the tip of the main branch before using git merge to integrate it into main.
This is similar to incorporating upstream changes into a feature branch, but since you are not allowed to rewrite commits in the main branch, you eventually have to use git merge to integrate the feature. However, by rebasing before the merge, you are assured that the merge will be a fast-forward, resulting in a perfectly linear history. This also gives you the chance to clean up the commit history before actually merging.
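The fast-forward can be checked explicitly. In this sketch the merge is run with --ff-only, which makes git merge fail rather than create a merge commit if a fast-forward is not possible; the scratch repository, the commit helper and the commit names are assumptions.

```shell
#!/bin/sh
# Sketch: rebase an approved feature onto main, then fast-forward main to it.
set -e
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main

# Hypothetical helper: commit one small file named after its argument.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b feature
commit feature-work
git checkout -q main
commit main-work

git checkout -q feature
git rebase -q main                      # feature now sits directly on top of main

git checkout -q main
git merge -q --ff-only feature          # main simply moves forward to the feature tip

main_tip=$(git rev-parse main)
feature_tip=$(git rev-parse feature)
merge_commits=$(git rev-list --merges --count main)
git log --oneline --graph
```

Both branches end up pointing at the same commit and the history contains no merge commits.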
If you are not entirely comfortable with git rebase, you can always perform the rebase on a temporary branch. That way, if you accidentally mess up your feature's history, you can check out the original feature branch and try again. For example:
git checkout feature
git checkout -b temporary-branch
git rebase -i main
# [Clean up the history]
git checkout main
git merge temporary-branch
Summary
That is all you really need to know to start rebasing your branches. If you prefer a clean, linear history free of unnecessary merge commits, reach for git rebase instead of git merge when integrating changes from another branch.
On the other hand, if you want to preserve the complete history of your project and avoid the risk of rewriting public commits, you can stick with git merge. Either option is perfectly valid, but at least now you have the option of using git rebase where it helps.