0 Comments Posted in:

In this series:

In the previous installment of this series, we talked about how Azure Service Bus offers us the ability to send our messages to a queue, where each message we send is intended to be handled once, or we can send the message to a topic which allows multiple subscribers to handle each message.

This raises the question of how to decide whether to use a queue or a subscription. The answer is that it depends a lot on what sort of message you are sending.

So far in this series we've talked a lot about "messages", which is simply the generic term we use to describe the data that we are transmitting via Service Bus. We've already seen that a message has a "payload" plus metadata. But its completely up to us what we put in that payload.

Types of messages

There are several types of messages we can send:

  • Commands are messages that instruct the recipient to perform a specific task. (e.g. SendEmail). They are usually posted to queues, since the task is usually intended to be performed a single time (e.g. the SendEmail command should result in one email being sent).
  • Events are messages that inform the recipient that something has happened. (e.g. OrderReceived). These are usually posted to topics, since the sender doesn't necessarily know or care about what should happen next. Maybe multiple actions should take place, in which case there would be one subscription per action.
  • Requests are messages that ask the recipient to fetch some data and to send it as another message, called the response. The request is often sent to a queue, and may specify which queue the response should be sent on (which might even be a temporary queue that is deleted after the response message has been sent). Sometimes this pattern is used to implement synchronous behaviour over the inherently asynchronous nature of messaging, by blocking while we wait for the response message to arrive (I generally consider that to be an anti-pattern though - I prefer to just use a HTTP API if I need to synchronously request data).
  • Of course, you can send anything you like in a message. Some people simply serialize documents as messages. This might be an entity from a database such as an Order or a Customer. This sort of message doesn't really imply anything about the intent of the message, so it's another pattern I generally avoid.

Let's focus a bit more on command and event messages which I consider to be the two most useful.

Command messages

As we've already said, command messages request that an action is performed, and is usually sent to a queue.

My preference is for the command handlers to be as dumb as possible. In other words, the code sending the command should have already made any business logic decisions about exactly what needs to be done. The command message should therefore contain enough information for the handler to simply perform the action.

It's very important for queue message handlers to be idempotent. That is to say, if for whatever reason the command message gets handled twice, it should have the same effect as handling it once. Your chances of achieving this will greatly improve if you adhere to the single responsibility principle and ensure that your message handler only does one thing. That way it's much easier to reason about what would happen if the message handler failed half way through and had to be retried.

Often commands can be raised in a "fire and forget" manner. In other words, the sender just requests that the action is taken, but doesn't need to be notified when the action has been completed. That keeps things simple, but there may be occasions when you need to know when the action has completed, or maybe take mitigating action if the requested task could not be performed for some reason. In these cases, you might permit the message handlers to raise some kind of "done" event. However, this approach should be treated with caution as it can result in using messages to build a poor man's workflow engine, something that is better achieved with a dedicated framework like Durable Functions (which I've created a Pluralsight course on).

Sometimes with commands, people want a way to prioritize them. Some command messages should be processed urgently, maybe because an end user is waiting for that task to complete, while other command messages represent "background" actions that can wait. Queues are inherently first-in first-out, and don't natively support prioritization. There are a few approaches you can take to this. One of the simplest is to simply have two queues - a high priority and a low priority queue. (It is possible to implement roughly the same thing using a topic with multiple subscriptions and use metadata on the message to filter the message to the appropriate subscription, although I prefer just to use queues).

If your prioritization needs become more complex than simply high and low priorities, then it may be that queues are not the most appropriate technology, and the "commands" could be stored in a database, allowing for much richer and more complex sorting to determine what should be done next. Of course if you take this approach, you lose out on some of the benefits of queues, and often end up requiring you to write more complex code, so this should only be attempted if you really need the flexibility to handle commands in arbitrary orders.

Another important consideration with any message handlers is what level of parallelization you require. How many command messages can or should be handled simultaneously? In many cases, it's quite safe to process dozens of messages at the same time. But you do need to consider whether concurrent message handling could overwhelm a downstream resource. We've already seen that the Service Bus SDK lets us specify how many concurrent messages should be handled by a listener process.

Finally, another question that arises is whether to have a queue per message type (e.g. one queue for SendEmail commands, and another queue for CreateZipFile commands), or one "command queue" that you put all command messages onto. There is no one right answer to this question. With a queue per message type there is more potential for parallel operations and more control over prioritization. I often go for a compromise approach where similar commands (e.g. commands that result in blob storage operations, or commands that result in communication with a specific web API) are put on the same queue, rather than going to the extreme of just one queue or having dozens of queues.

Event messages

Whilst command messages have their place, my preference when designing an architecture is to make use of event messages wherever possible. Sometimes these are called "domain events", as each message describes some interesting thing that happens in the domain of the microservice that raises them.

Domain events are particularly great for cross-cutting concerns like auditing. If I want to generate an audit log, I can simply subscribe to the relevant "event" messages and turn them into audit messages, without that concern having to be littered in multiple places through the code. Another example would be user notifications - often users might want to get emailed or texted when certain things happen in the system, and events are a great way of triggering such behaviour.

Try to avoid command messages masquerading as events. Sometimes I see an event named something like "SendEmailRequested". It's basically being too opinionated on what the subscribers should do. Instead raise an event that says what happened, and if the right thing to do is send an email in response to that, then a subscription can be created to send the email.

One challenge with event messages is deciding how much information they should contain. Imagine an event message contains a user id, but all (or most of) the subscribers also need to know the user name and email address. If the message only contains the user id, then all the subscribers will simultaneously need to look up the same user name and email address, which can have a performance impact.

However, if you decide that the message should contain the user name and email address, you are in danger of bloating the message, but also the code raising the event might not have that information to hand, and so have to perform a lookup itself. This can result in you negating some of the performance benefits of messaging, as you are now having to make an extra network call before sending an ongoing message.

I can't say I know what the right answer is here, and I generally take a compromise approach. If the publisher of the event happens to have information to hand that would likely be useful to all or most subscribers, then I put it into the event. But if the publisher doesn't have that information to hand, then I'm fine with requiring the subscribers to look it up themselves, even if that might result in multiple subscribers making the same downstream query. Caching can often help alleviate some of the performance impact in this situation.

Often an application might publish all of its domain events to a single topic, and Azure Service Bus allows subscriptions to either receive all events, or set up "filtered" subscriptions to allow them to only receive the specific events they are interested in. One interesting question is whether in a microservices architecture each microservice should have its own topic, or whether its OK for all microservices to publish to a single topic, which becomes a giant stream of events. I've heard a few different opinions on which is best, and to be honest I'm not sure there's one right answer. You might start with a single shared topic, but break it up into something more granular as your system grows much larger and the volume of messages becomes overwhelming.

As with queues, the single responsibility principle still matters. If there are three things you need to do in response to a single domain event message, then create three subscriptions, and let each one perform one of the actions. This makes it easier to write idempotent code, and also gets things done faster thanks to parallelism. One thing you need to beware of is re-publishing a domain event to a topic because you want to trigger one of its subscribers again (e.g. as part of some kind of error recovery process). That would result in all the subscribers running for a second time, potentially producing unwanted results.

Finally, as I mentioned earlier when discussing commands, it's very tempting to start to chain handlers together into primitive "workflows" with each subsequent stage in the workflow being triggered by a domain event message. This can result in fragile and hard to debug workflows, and so if you detect yourself doing that kind of thing, then consider whether a dedicated workflow framework like Azure Durable Functions is a better fit.

Summary

There are many types of message you might choose to send, but commands and events are two of the most important, and your choice of topics or queues usually depends on what sort of message you're sending. It's worth investing some time up front deciding what messages make most sense for your application, which may well be a mixture of commands and events.

Stay tuned for the next installment in this series coming soon...

Want to learn more about how easy it is to get up and running with Durable Functions? Be sure to check out my Pluralsight course Azure Durable Functions Fundamentals.

0 Comments Posted in:

In this series:

So far in this series I've mostly focused on queues which are relatively straightforward to understand. The sender sends a message to the queue, and then listeners can connect to the queue and ask for messages. In this diagram, we send one message to a queue, but there are two listeners (this is called the "competing consumers" pattern). Only one of the two listeners will receive and process the queue message.

queues

Topics and Subscriptions

Azure Service Bus allows you to post messages to a topic rather than a queue. The difference is that a topic doesn't store messages. It simply forwards messages onto any subscriptions on that topic. If a topic has no subscriptions, then posting messages to it essentially results in messages being lost.

Note: this means its really important that you create all necessary subscriptions before posting messages to your topic.

If a topic has two subscriptions, then when we post a message to the topic, a copy of that message is placed on both subscriptions. That means in this example, the single message will be received by listener 3, and by either listener 1 or listener 2 (since they are "competing consumers" on subscription 1).

topics

Pub-sub

The type of messaging that topics and subscriptions enable is often called "pub-sub", or "publish and subscribe". It's a great fit when your messages are essentially "events" - in other words, the message simply announces that something has happened.

This allows a single message to result in multiple actions. With this architecture, the sender of the message doesn't usually care what those actions are. It just wants to let other parts of the system that something has happened. We'll talk more about messages as "events" later in this series as this is a very common pattern for event-driven architectures.

Creating a topic and subscription

Earlier in this series we saw how to use the Azure CLI to create an Azure Service Bus namespace and created a queue. Let's also create a topic called topic1 and create two subscriptions for that topic (subscription1 and subscription2):

$topicName = "topic1"
az servicebus topic create -g $resourceGroup `
    --namespace-name $namespaceName -n $topicName

az servicebus topic subscription create -g $resourceGroup `
    --namespace-name $namespaceName `
    --topic-name $topicName -n "subscription1"

az servicebus topic subscription create -g $resourceGroup `
    --namespace-name $namespaceName `
    --topic-name $topicName -n "subscription2"

Posting a Message to a Topic

For this demo, I've updated my Sender application in GitHub to do some basic command line parsing, allowing us to specify not just the Service Bus connection string, and message contents, but to specify either a queue name or a topic name to send to. You can look at the code on GitHub to see how that works.

The more interesting part is that now, instead of using a QueueClient, we use a TopicClient to post the message. Both classes implement implement the ISenderClient interface allowing us to use the same code to send a message whichever option we choose:

ISenderClient senderClient;
if(topicName != null)
    senderClient = new TopicClient(connectionString, topicName);
else 
    senderClient = new QueueClient(connectionString, queueName);

var body = Encoding.UTF8.GetBytes(messageText);
var message = new Message(body);
// configure other properties of the message here if we want
await senderClient.SendAsync(message);
Console.WriteLine($"Sent message to {senderClient.Path}");
await senderClient.CloseAsync();

With these changes, we can run our application with dotnet run and pass our own command line arguments (after -- to ensure they are interpreted as application arguments) like this:

dotnet run -- -c $connectionString -t "topic1" -m "OrderReceived"

If all has gone well, the "OrderReceived" message we sent will have been forwarded by the topic onto both subscription1 and subscription2. Let's see how to receive the messages next.

Receiving Messages on a Subscription

I made a similar change to the Receiver application in the demo project, allowing the user to choose between listening on a queue or a subscription.

To listen on a subscription we need to use a SubscriptionClient instead of a QueueClient, and the SubscriptionClient needs to know both the name of the topic and subscription it should be listening on. But apart from that, everything remains the same, and we can take advantage of the IReceiverClient interface that both clients implement.

IReceiverClient listenerClient;
if (topicName != null)
{
    listenerClient = new SubscriptionClient(connectionString, topicName, subscriptionName);
}
else
{
    listenerClient = new QueueClient(connectionString, queueName);
}
var messageHandlerOptions = new MessageHandlerOptions(OnException);
listenerClient.RegisterMessageHandler(OnMessage, messageHandlerOptions);
Console.WriteLine($"Listening on {listenerClient.Path}, press any key");
Console.ReadKey();
await listenerClient.CloseAsync();

Now we can run our application and listen to subscription1 on topic1 and we'll get the OrderReceived message we sent earlier.

dotnet run -- -c $connectionString -t "topic1" -s "subscription1"

And if we run again listening on subscription2, we'll see that this subscription also picks up a copy of the OrderReceived message. This essentially allows us to have to separate message handlers, each performing an important task - e.g. one might take payment for the order, and the other might send the customer an order confirmation email.

dotnet run -- -c $connectionString -t "topic1" -s "subscription2"

Summary

Topics and subscriptions offer a powerful alternative to queues, allowing us to create more than one handler for the same message. But we haven't said too much yet about why this functionality is so useful. Next up we'll talk about two of the most important types of message - commands and events.


0 Comments Posted in:

In this series:

In an earlier post, we saw some basic code examples of how to send and receive messages from an Azure Service Bus queue. In this post, we're going to look at some options we can configure when receiving messages.

MessageHandlerOptions

If you recall, our basic message receiving code looked a bit like this, where we passed a message handler function to RegisterMessageHandler.

static void Main(string[] args)
{
    var connectionString = args[0];
    var queueName = "queue1";
    var queueClient = new QueueClient(connectionString, queueName);
    var messageHandlerOptions = new MessageHandlerOptions(OnException);
    queueClient.RegisterMessageHandler(OnMessage, messageHandlerOptions);
    Console.WriteLine("Listening, press any key");
    Console.ReadKey();
}

But you can also see that we can pass an instance of MessageHandlerOptions, which allows us to control the way we receive messages. It only has four properties, so let's look at each of them.

ExceptionReceivedHandler

We've already seen that we're required to provide an exception handler function. This will be called if there are problems communicating with the Service Bus. We not only get access to the exception that was thrown, but the ExceptionReceivedContext tells us a little bit more about what exactly it was trying to do when the problem occurred. In our demo we just write to the console, but here you'd typically write out to logs, and maybe even do something that will cause an alert to be raised if this is a critical application running in production.

static Task OnException(ExceptionReceivedEventArgs args)
{
    Console.WriteLine("Got an exception:");
    Console.WriteLine(args.Exception.Message);
    Console.WriteLine(args.ExceptionReceivedContext.Action);
    Console.WriteLine(args.ExceptionReceivedContext.ClientId);
    Console.WriteLine(args.ExceptionReceivedContext.Endpoint);
    Console.WriteLine(args.ExceptionReceivedContext.EntityPath);
    return Task.CompletedTask;
}

MaxConcurrentCalls

We can also specify how many messages we want to process concurrently. By default, this is set to 1, which means that we'll process 1 message off the queue, and only when that has finished will we move onto the next one.

Note that MaxConcurrentCalls applies to just the process that you are running. If for example you had three instances of the Receiver application each listening to the same queue with MaxConcurrentCalls set to 1, then you would process three messages in parallel.

In most situations, a certain amount of concurrent message handling is desirable, since it allows you to work through a queue more quickly, but as we mentioned earlier in this series, there are some trade-offs. If you handle too many messages in parallel, you could overload downstream services such as a database. Also, if all the messages relate to the same entity in a database, you could find that each handler tries to make conflicting updates to the same row in a database table, resulting in concurrency errors.

I updated my Receiver demo application to optionally sleep for 5 seconds in the message handler, and then set MaxConcurrentCalls to 4. Then when I ran it with 5 messages already in the queue, it was clear that the first four were handled concurrently, and then the fifth was handled afterwards. Feel free to download the demo application and experiment with it yourself.

AutoComplete

One nice feature of Service Bus is that when you retrieve a message from the queue, it's not necessarily removed instantaneously. Instead, that message is "locked" for a time period, so it's not visible to anyone else. When you've finished handling the message you then "complete" that message which tells Service Bus you've finished handling the message, and it can safely delete it. This is called the "peek-lock" mode, and is the default behaviour with the SDK.

However, you can also "abandon" the message - in other words you tell Service Bus that for whatever reason, you were not able to process the message. In that case the message becomes available again to be retried.

If AutoComplete is set to true (which is the default), then assuming your handler completes successfully, the Service Bus SDK will automatically complete the message for you. I recommend you use this behaviour, but in some more advanced scenarios you might want to take control yourself.

If you are completing messages yourself, then you do so by calling CompleteAsync (or AbandonAsync) on the QueueClient, and you need to pass in the LockToken for the message, which is available on the SystemProperties of the Message.

await queueClient.CompleteAsync(message.SystemProperties.LockToken);

If you never get back to the Service Bus to complete or abandon the message, then Service Bus will eventually time out the "lock" you have on the message and make it visible again, which is what the next property we'll discuss is about.

MaxAutoRenewDuration

Ideally, your message handler should complete quite quickly, before Service Bus decides to time it out. But sometimes your message handler might legitimately need to perform a long-running task. In this case, you need to keep checking in with Service Bus to tell it that you're still working on the message, and that it should retain the lock. With the Service Bus SDK, you can do this with MaxAutoRenewDuration.

The default is five minutes, meaning that periodically, the SDK will talk back to Service Bus asking to "renew" the lock. It might do this every 30 seconds or so. But after five minutes, it will no longer renew the lock, so it's important if you have a handler that might take an hour or more, to extend this duration to greater than the longest time you might need. Otherwise, while one instance of your message handling microservice is processing the message, it could become visible again, and another instance might concurrently work on the same message, which can often result in bugs that are hard to track down.

Summary

The default message receive options are fine for many use cases, but as you build more complex applications, you will find that you need to fine tune some of these parameters. In particular I've found that it is very useful to make MaxAutoRenewDuration and MaxConcurrentCalls adjustable through configuration in your application, so if you find in production that there is a problem with throughput for a queue, or with messages taking longer to process than expected, you can tweak the values without needing to update source code.

In our next post, we'll start looking at the concepts of Topics and Subscriptions.