
One of the great things about Git is how easy it makes merging. Two developers can work on the same file and in most cases the merge algorithm will silently and successfully combine their changes without any manual intervention required.

But merging is not magic, and it's not bulletproof. It's possible for changes to conflict (e.g. two developers edit the same line), and it's also possible for changes that don't strictly "conflict" to nevertheless cause a regression.

Regressions due to merges can be very frustrating, so here are five tips to avoid them.

1. Little and often

Merges are more likely to be successful if they are performed regularly. Wherever possible avoid long-lived branches and instead integrate into the master branch frequently. This is the philosophy behind "continuous integration": frequent merges allow us to rapidly detect and resolve problems. This might require you to adopt techniques such as "feature flags", which allow in-progress work to be present but inactive in the production branch of the code. If you absolutely cannot avoid using a long-lived feature branch, then at least merge the latest changes from master into your feature branch on a regular basis, to avoid a "big bang" merge of several months of work once the feature is complete.
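As a rough sketch of what a feature flag can look like (the flag name, the featureFlags helper and the pricing methods here are purely illustrative), the half-finished code path lives in master but only runs when the flag is switched on:

public decimal CalculatePrice(Order order)
{
    // the new, in-progress pricing code has been merged into master,
    // but stays dormant until the flag is enabled in configuration
    if (featureFlags.IsEnabled("NewPricingEngine"))
    {
        return CalculatePriceV2(order);
    }
    return CalculatePriceV1(order);
}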

2. Pay special attention to merge conflicts

Git identifies changes that cannot be automatically merged as "conflicts". These require you to choose whether to accept the changes from the source or target branch, or whether to rewrite the code in a way that incorporates the modifications from both sides (which is usually the right choice).
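For example, a conflict in a C# file looks something like this (the method and branch names are made up for illustration):

<<<<<<< HEAD
var total = ApplyBulkDiscount(order);
=======
var total = ApplyLoyaltyDiscount(order);
>>>>>>> feature/loyalty-discounts

// a resolution that incorporates both sides, rather than just picking one:
var total = ApplyLoyaltyDiscount(ApplyBulkDiscount(order));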

Sometimes, due to unfortunate characteristics of the development tools in use, you can find that certain high-churn code files are constantly producing conflicts. In older .NET Framework projects, for example, csproj and packages.config files would constantly require manual merges. The volume of these trivial conflicts can cause developers to get lazy, and start resolving them too rapidly without due care and attention.

Whenever your code conflicts with someone else's changes, find out who made the conflicting changes. They should be involved in the merge process. I recommend "pair merging" where possible, where you agree together on the resolution of the conflict before completing the merge. But if that's not possible, at least make contact with the author of the conflicting change, and ask them to specifically review the changes to conflicting files.

3. Code reviews

Code reviews are also an important part of avoiding regressions. I recommend using a "pull request" process, where no code gets into the master branch without going through a code review. If any merge conflicts are involved, then all authors whose code conflicted with your changes should be invited to the code review, in addition to whoever is usually invited.

I also recommend that in the pull request description, you should explicitly highlight areas of special concern. This is especially important if a code review contains many files, as it's possible for reviewers to get code review fatigue after looking through the first few hundred changes, and start missing important things. Make sure reviewers are aware of the high-risk areas of change, which includes any merge conflicts.

4. Unit tests

Unit tests have several benefits, but one particularly valuable one is that they protect us against regressions. If your feature gets broken by someone else's merge, it's very easy to point the finger of blame at them, but you should first ask yourself the question "why didn't I write a unit test that could have detected this?" Undetected regressions indicate gaps in automated test coverage.
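As a hypothetical example (the test framework here is xUnit, and the pricing classes are made up), a small regression test like this would have flagged the broken merge in the build, long before anyone started assigning blame:

[Fact]
public void BulkDiscount_IsAppliedToLargeOrders()
{
    var order = CreateOrderWithQuantity(100);

    var total = new PricingService().CalculatePrice(order);

    // if a merge accidentally removes the bulk discount, this test fails immediately
    Assert.Equal(90m, total);
}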

5. M&M's (Microservices & Modularization)

If you're performing many merges, it's because many developers are working on the same codebase, which probably means that you have a large "monolith". One of the benefits of adopting a microservices architecture, and extracting components out into their own modules (e.g. NuGet packages in .NET), is that it makes it much easier for different teams to work on different features without stepping on each other's toes. So lots of merge conflicts may indicate that your service boundaries are in the wrong place, or that you have code with too many responsibilities.

Collective responsibility

It's important to recognize that these five suggestions are not all aimed at the individual who performs the merge. In fact, only #2 is directly aimed at the merger; the others are the shared responsibility of the rest of the development team. And these five suggestions aren't the only ways we can reduce the likelihood of merge conflicts. If you're interested in more on this topic, I've also written about how applying principles like the Open/Closed Principle and the Single Responsibility Principle results in code that is easier to merge.



Last week I was tasked with tracking down a perplexing problem with an API - every call was returning a 500 error, but there was nothing in the logs that gave us any clue why. The problem was not reproducible locally, and with firewall restrictions getting in the way of remote debugging, it took me a while to find the root cause.

Once I had worked out what the problem was, it was an easy fix. But the real issue was that we had some code paths where exceptions could go unlogged. After adding exception logging to some places that had been missed, it became immediately obvious what was going wrong. Had this issue manifested itself on a production system, we could have been looking at prolonged periods of downtime, simply because we weren't logging these exceptions.

So this post is a quick reminder to check all your services - are there any places where exceptions can be thrown that don't end up in your logs? Here are three places to check.

Application startup

When an application starts up, one of the first things you should do is create a logger and log a "Starting up" message at informational level. This is invaluable in providing a quick sanity check that your application code did in fact start running and that it is correctly configured for logging.

I also like to log an additional message once all the startup code has completed. This alerts you if your service only manages to get half-way through initialization, or if there is a long-running operation hidden in the start-up code (which is usually a design flaw).

Of course, you should also wrap the whole startup code in an exception handler, so that any failures to start the service are easy to diagnose. Something like this is a good approach:

public static void Main()
{
    var logger = CreateLogger();
    try 
    {
        logger.Information("Starting up");
        Startup();
        logger.Information("Started up");
    }
    catch (Exception ex)
    {
        logger.Error(ex, "Startup error");
    }
}

Middleware

In our particular case, the issue was in the middleware of our web API. This meant the exception wasn't technically "unhandled" - a lower level of the middleware was already catching the exception and turning it into a 500 response. It just wasn't getting logged.

Pretty much all web API frameworks provide ways for you to hook into unhandled exceptions and perform your own custom logic. ASP.NET Core has exception-handling middleware that you can customize, and the older ASP.NET Web API allows you to implement a custom IExceptionHandler or IExceptionLogger. Make sure you know how to do this for the web framework you're using.
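In ASP.NET Core, for example, one minimal approach (just a sketch, assuming an ILogger named logger is in scope - UseExceptionHandler is another option) is a small piece of inline middleware, registered early in the pipeline, that logs and rethrows:

app.Use(async (context, next) =>
{
    try
    {
        await next();
    }
    catch (Exception ex)
    {
        // log before any lower-level handler turns this into a plain 500 response
        logger.LogError(ex, "Unhandled exception for {Path}", context.Request.Path);
        throw;
    }
});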

Long-running threads

Another place where logging can be forgotten is in a long-running thread such as a message pump that's reading messages from a queue and processing them. In this scenario, you probably have an exception handler around the handling of each message, but you also need to log any exceptions at the pump level - e.g. if it loses connection to the message broker, you don't want it to die silently and end up no longer processing messages.

In this next sample, we've remembered to log exceptions handling a message, but not exceptions fetching the next message.

while(true)
{
    // don't forget to handle exceptions that happen here too!
    var message = FetchNextMessage(); 
    try
    {
        Handle(message);
    }
    catch(Exception ex)
    {
        logger.Error(ex, "Failed to handle message");
        // don't throw, we want to keep processing messages
    }
}
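
A more robust version of the pump might look something like the sketch below (the back-off delay is arbitrary, and whether to retry forever or crash loudly after repeated failures is a judgement call for your scenario), so that broker failures are logged too:

while (true)
{
    try
    {
        var message = FetchNextMessage();
        try
        {
            Handle(message);
        }
        catch (Exception ex)
        {
            logger.Error(ex, "Failed to handle message");
            // don't throw, we want to keep processing messages
        }
    }
    catch (Exception ex)
    {
        // a lost connection to the broker now gets logged instead of dying silently
        logger.Error(ex, "Failed to fetch next message");
        Thread.Sleep(TimeSpan.FromSeconds(5)); // crude back-off before retrying
    }
}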

You might already have this

Of course, some programming frameworks and hosting platforms have good out-of-the-box logging baked in, which saves you the effort of writing this yourself. But it is worth double-checking that you have sufficient logging of all exceptions at whatever point they are thrown. An easy way to do this is to throw a few deliberate exceptions in various places in your code (e.g. an MVC controller constructor, middleware, application startup, etc.) and double-check that they find their way into the logs. You'll be glad you did so when something weird happens in production.

In a world of microservices, observability is more critical than ever, and ensuring that all exceptions are adequately logged is a small time investment that can pay big dividends.



Recently I've been having a lot of discussions with teams wanting to move towards a cloud-based microservices architecture. And inevitably the question arises whether the best choice would be to go with containers, or a serverless "Functions as a Service" (FaaS) approach.

To keep this discussion from becoming too abstract, let's imagine we're planning to host our application in Azure. Should we create an AKS (Azure Kubernetes Service) cluster and implement each microservice as a container? Or should we use Azure Functions, and implement each microservice as a Function App?

And to keep this article from becoming too long, I'm going to restrict myself to making just a few key points in favour of both approaches.

It's not either/or

First, it's important to point out that hybrid architectures are possible. There is no rule preventing you from using both AKS and Azure Functions, playing to the strengths of each platform. And if you're migrating from a monolith, you may well be running alongside some legacy Virtual Machines anyway.

Also, if you like the Azure Functions programming model, it's quite possible to host Azure Functions in a container. And if you like the consumption-based pricing model and elastic scale associated with serverless, then technologies like Azure Container Instances can be combined with AKS to essentially give you serverless containers.

And while serverless essentially forces you in the direction of PaaS for your databases, event brokers, identity providers etc, you can do exactly the same with containers - there's no reason why they can't reach out to PaaS services for these concerns rather than containerizing everything.

A few strengths of containers

What factors might cause us to favour containers?

Containers are particularly good for migrating legacy services. If you've already implemented a batch process, or web API, then getting that running in a container is much easier than rewriting it for serverless.

Containers make it trivial for us to adopt third party dependencies that aren't easily available (or cost-effective) as PaaS. There's a wealth of open source containerized services you can easily make use of, such as Redis, RabbitMQ, MongoDB, and Elasticsearch. You have the freedom to choose when and if it makes sense to switch to PaaS versions of these services (one nice pattern is to use containerized databases for dev/test environments, but a PaaS database like Azure SQL Database in production).

Containers have a particularly good story for local development. If I have 20 microservices, I can bundle them all into a Docker compose file, and start them all up in an instant. With serverless, you need to come up with your own strategy for how developers can test a microservice in the context of the overall application.

A containerized approach can also simplify the security story. With serverless, you're typically exposing each microservice with an HTTP endpoint publicly on the internet. That means each service could potentially be attacked, and great care must be taken to ensure only trusted clients can call each service. With a Kubernetes cluster, you don't need to expose all your microservices outside the cluster - only certain services are exposed by an ingress controller.

A few strengths of serverless

What are some key strengths of serverless platforms like Azure Functions?

Serverless promotes rapid development by providing a simplified programming model that integrates easily with a selection of external services. For example, Azure Functions makes it trivial to connect to many Azure services such as Azure Service Bus, Cosmos DB and Key Vault.
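As an illustration (the queue name, connection setting and function name are made up), with the in-process Azure Functions model a single binding attribute takes care of all the Service Bus plumbing:

[FunctionName("ProcessOrder")]
public static void Run(
    [ServiceBusTrigger("orders", Connection = "ServiceBusConnection")] string message,
    ILogger log)
{
    // the Functions runtime handles connecting to the queue and scaling out;
    // we just write the message handler
    log.LogInformation("Processing order message: {Message}", message);
}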

Serverless encourages an event-driven nanoservice model. Although containers place no constraints on what programming models you use, they make it easy to perpetuate older development paradigms involving large heavyweight services. Serverless platforms strongly push us in the direction of event-driven approaches which are inherently more scalable, and promote light-weight small "nanoservices" that can be easily discarded and rewritten to adapt to changing business requirements (a key driver behind the idea of "microservices").

Serverless can offer extremely low cost systems, by supporting a "scale to zero" approach. This is extremely compelling for startups, who want to keep their initial costs to a minimum during a proof of concept phase, and also allows lots of dev/test service deployments in the cloud without worrying about cost. By contrast, with containers, you would almost always have a core number of nodes in your cluster that were always running (so with containers you might control cost either by running locally, or by sharing a Kubernetes cluster).

Serverless also excels in supporting rapid scale out. Azure Functions very quickly scales from 0 to dozens of servers under heavy load, and you're still only paying for the time your functions are actually running. Achieving this kind of scale out is more work to configure with containerized platforms, but on the flip side, with container orchestrators you will have much more control over the exact rules governing scale out.

Summary

Both containers and serverless are excellent approaches to building microservices, and they are constantly borrowing each other's best ideas, so the difference isn't huge (and maybe this question won't even be meaningful in 5-10 years).

Which one would I pick? Well, for a more "startupy" application, where it's greenfield development with a small number of developers trying to prove out a business idea, I think serverless really shines, whereas for more "enterprisey" applications, with a lot more components, development teams and maybe some legacy components involved, I think containerized approaches are more promising. In fact, most systems I work on are essentially "hybrid", combining aspects of serverless, containers and plain old virtual machines.

Finally, for an amusing take on the topic, make sure you check out this genius serverless vs containers rap battle from the Think FaaS podcast.

Want to learn more about how easy it is to get up and running with Azure Container Instances? Be sure to check out my Pluralsight course Azure Container Instances: Getting Started.