Avoiding Merge Regressions

One of the great things about Git is how easy it makes merging. Two developers can work on the same file and, in most cases, the merge algorithm will silently and successfully combine their changes without any manual intervention.

But merging is not magic, and it's not bulletproof. It's possible for changes to conflict (e.g. two developers edit the same line), and it's also possible for changes that don't strictly "conflict" to nevertheless cause a regression.

Regressions due to merges can be very frustrating, so here are five tips to avoid them.

1. Little and often

Merges are more likely to be successful if they are performed regularly. Wherever possible, avoid long-lived branches and instead integrate into the master branch frequently. This is the philosophy behind "continuous integration": frequent merges allow us to rapidly detect and resolve problems. This might require you to adopt techniques such as "feature flags", which allow in-progress work to be present but inactive in the production branch of the code. If you absolutely cannot avoid using a long-lived feature branch, then at least merge the latest changes from master into your feature branch on a regular basis, to avoid a "big bang" merge of several months of work once the feature is complete.
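As a rough illustration of the feature flag idea, here is a minimal Python sketch. Everything in it is hypothetical: the `FEATURE_` environment variable convention, the flag name, and the checkout functions are made up for the example, and a real project would more likely use a dedicated flag library or configuration service.

```python
import os


def is_enabled(flag_name, env=None):
    """Return True if the (hypothetical) flag is switched on.

    Assumption: flags are read from environment variables named
    FEATURE_<NAME>. Passing `env` explicitly makes this easy to test.
    """
    env = os.environ if env is None else env
    value = env.get(f"FEATURE_{flag_name.upper()}", "off")
    return value.lower() in {"1", "on", "true"}


def legacy_checkout(total):
    # Existing behaviour that production currently relies on.
    return f"Charged {total:.2f}"


def new_checkout(total):
    # In-progress work: already merged to master, but inactive by default.
    return f"Charged {total:.2f} via new payment flow"


def checkout(total, env=None):
    # The flag decides at runtime which code path is used.
    if is_enabled("new_checkout", env):
        return new_checkout(total)
    return legacy_checkout(total)
```

With the flag off, `checkout` behaves exactly as before, so the unfinished code can be merged early and often without affecting users.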

2. Pay special attention to merge conflicts

Git identifies changes that cannot be automatically merged as "conflicts". These require you to choose whether to accept the changes from the source or target branch, or whether to rewrite the code in such a way that incorporates the modifications from both sides (which is usually the right choice).
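To make "incorporating both sides" concrete, here is a small, made-up Python example. Assume both branches edited the same line of a hypothetical `build_request_options` function: one added a timeout, the other added retries. The two `_ours` and `_theirs` functions exist only to show what each side looked like; the third is one possible resolution that keeps both intents.

```python
def build_request_options_ours(url):
    # Feature branch version: added a timeout.
    return {"url": url, "timeout": 30}


def build_request_options_theirs(url):
    # Master version: added retries on the same line.
    return {"url": url, "retries": 3}


def build_request_options(url):
    # Resolved version: rewritten to keep the changes from both sides,
    # rather than accepting one branch and discarding the other.
    return {"url": url, "timeout": 30, "retries": 3}


def keeps_both_sides(url):
    """Check that the resolution still contains everything each side added."""
    merged = build_request_options(url).items()
    ours = build_request_options_ours(url).items()
    theirs = build_request_options_theirs(url).items()
    return ours <= merged and theirs <= merged
```

Simply accepting "ours" or "theirs" here would silently drop the other developer's change, which is exactly the kind of regression this tip is about.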

Sometimes, due to unfortunate characteristics of the development tools in use, you can find that certain high-churn files are constantly producing conflicts. In older .NET Framework projects, for example, csproj and packages.config files would constantly require manual merges. The volume of these trivial conflicts can cause developers to get lazy and start resolving them too quickly, without due care and attention.

Whenever your code conflicts with someone else's changes, find out who made the conflicting changes. They should be involved in the merge process. I recommend "pair merging" where possible, where you agree together on the resolution of the conflict before completing the merge. But if that's not possible, at least make contact with the author of the conflicting change, and ask them to specifically review the changes to conflicting files.

3. Code reviews

Code reviews are also an important part of avoiding regressions. I recommend using a "pull request" process, where no code gets into the master branch without going through a code review. If any merge conflicts are involved, then all authors whose code conflicted with your changes should be invited to the code review, in addition to whoever is usually invited.

I also recommend that you explicitly highlight areas of special concern in the pull request description. This is especially important if a code review contains many files, as it's possible for reviewers to get code review fatigue after looking through the first few hundred changes, and start missing important things. Make sure reviewers are aware of the high-risk areas of change, including any merge conflicts.

4. Unit tests

Unit tests have several benefits, but one particularly valuable one is that they protect us against regressions. If your feature gets broken by someone else's merge, it's very easy to point the finger of blame at them, but first you should ask yourself the question, "Why didn't I write a unit test that could have detected this?" Undetected regressions indicate gaps in automated test coverage.
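Here is a minimal sketch of the kind of test that can catch a merge regression, using Python's built-in `unittest` module. The `apply_discount` and `order_total` functions are hypothetical. The scenario assumed is that one developer relies on the discount being a percentage (10 means 10%), while a change on another branch might alter it to expect a fraction (0.1). Git would merge both without a conflict, but the test would fail.

```python
import unittest


def apply_discount(price, percent):
    """Return the price after taking `percent` (0-100) off."""
    return round(price * (1 - percent / 100), 2)


def order_total(prices, discount_percent=0.0):
    """Sum the prices, then apply the discount once to the subtotal."""
    return apply_discount(sum(prices), discount_percent)


class OrderTotalRegressionTest(unittest.TestCase):
    def test_discount_is_a_percentage(self):
        # Pins down the contract my feature depends on: 10 means 10%.
        # If a merge changes apply_discount to expect 0.1, this fails.
        self.assertEqual(order_total([40.0, 60.0], discount_percent=10), 90.0)

    def test_no_discount_by_default(self):
        self.assertEqual(order_total([5.0, 15.0]), 20.0)
```

Running the suite (for example with `python -m unittest`) as part of every pull request build means this kind of breakage is caught before the merge reaches master.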

5. M&M's (Microservices & Modularization)

If you're performing many merges, it's because many developers are working on the same codebase, which probably means that you have a large "monolith". One of the benefits of adopting a microservices architecture, and of extracting components out into their own modules (e.g. NuGet packages in .NET), is that it makes it much easier for different teams to work on different features without stepping on each other's toes. So lots of merge conflicts may indicate that your service boundaries are in the wrong place, or that you have code with too many responsibilities.

Collective responsibility

It's important to recognize that these five suggestions are not all aimed at the individual who performs the merge. In fact, only #2 is directly aimed at the merger. The others are the shared responsibility of the rest of the development team. And these five suggestions aren't the only ways we can reduce the likelihood of merge conflicts. If you're interested in more on this topic, I've also written about how applying principles like the Open-Closed Principle and the Single Responsibility Principle results in code that is easier to merge.