
In my recent Pluralsight course "Versioning and Evolving Microservices with ASP.NET Core", I built my demos on top of an existing microservices application that had been created as the basis for a Microservices in ASP.NET Core learning path on Pluralsight (which is nearing completion).

For messaging between microservices, this application used Rebus, a very simple service bus implementation in .NET that lets you plug in a number of different services as the back-end. For example, in Azure you could use Azure Service Bus, Azure Storage Queues or Azure SQL Database, and many other "transports" are available.

In this post, I'll go through the basics of setting up Rebus in ASP.NET Core, with one microservice sending messages and another handling them.

I'm going to use Azure Storage Queues as the transport because they are (1) very simple and cheap, and (2) have an emulator you can use for local development. And of course, because Rebus gives us an abstraction, we are free to switch to a different underlying transport later.

Sending messages

In the project that will send messages using Rebus, we'll add references to the following NuGet packages in our csproj file:

<PackageReference Include="Rebus" Version="6.4.1" />
<PackageReference Include="Rebus.AzureQueues" Version="1.0.0" />
<PackageReference Include="Rebus.Microsoft.Extensions.Logging" Version="2.0.0" />
<PackageReference Include="Rebus.ServiceProvider" Version="5.0.6" />

Next, in Startup.ConfigureServices we'll use the AddRebus method to set up Rebus. I'm configuring it to integrate with ASP.NET Core logging, and I'm using UseAzureStorageQueuesAsOneWayClient for the transport as this project only needs to send messages, not receive them.

I need to pass in a CloudStorageAccount, which in development can be created from the Azure Storage Emulator connection string (UseDevelopmentStorage=true).

I also need to tell it which queue to route each message type to. In my simple example I have a single message type called VoteMessage, and I'll send it to a queue whose name I read from configuration.

var storageAccount = CloudStorageAccount.Parse(Configuration.GetConnectionString("AzureQueues"));

// Configure and register Rebus
services.AddRebus((configure, provider) => configure
    .Logging(l => l.MicrosoftExtensionsLogging(provider.GetRequiredService<ILoggerFactory>()))
    .Transport(t => t.UseAzureStorageQueuesAsOneWayClient(storageAccount))
    .Routing(r => r.TypeBased().Map<VoteMessage>(Configuration["AzureQueues:QueueName"]))
    );

Also in my Startup.Configure method I added the following call to UseRebus:

app.ApplicationServices.UseRebus();

Finally, in my controller or Razor page that wants to send the message, I simply take a dependency on Rebus.Bus.IBus, and call the Send method. Rebus knows what queue to send it to based on the type of message I'm sending.

await bus.Send(new VoteMessage()
{
    PollId = Poll.Id,
    Option = option,
    VoterId = voterId
});
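
For completeness, the VoteMessage itself is just a plain class. Here's a minimal sketch based on the properties used above (the property types are my assumption):

public class VoteMessage
{
    public int PollId { get; set; }
    public string Option { get; set; }
    public string VoterId { get; set; }
}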

Handling messages

In the ASP.NET Core application that handles the messages, the steps are similar. In my very simple example, I'm putting the handlers into an existing ASP.NET Core web API application, although in a production app, I'd want to separate this concern and have a dedicated worker process.

The steps are almost identical. We reference the same four NuGet packages, and our Startup.ConfigureServices method looks like this:

services.AddRebus((configure, provider) => configure
    .Logging(l => l.MicrosoftExtensionsLogging(provider.GetRequiredService<ILoggerFactory>()))
    .Transport(t => t.UseAzureStorageQueues(storageAccount, Configuration["AzureQueues:QueueName"])));

Note that we don't need to set up any routing because this project isn't sending any messages. But we do need to tell it which queue to listen on, which is passed as an argument to the UseAzureStorageQueues method.

Rebus also needs to know how to create the appropriate handler for each message type. The easiest way to do this is to tell it to scan an assembly for all handlers with a call to AutoRegisterHandlersFromAssemblyOf. This will register every class in the assembly that implements IHandleMessages<T>. It also means that your handler classes can take dependencies on anything that can be resolved from the ASP.NET Core DI container.

services.AutoRegisterHandlersFromAssemblyOf<VoteMessageHandler>();

Again, in the Startup.Configure method we need to call UseRebus. Note that this will actually attempt to start listening on the queue, so if your Storage Queue is inaccessible or your configuration is wrong, you may get an error here (another reason not to put message handlers in the same project as your APIs).

app.ApplicationServices.UseRebus();

Finally, we need a message handler for our VoteMessage, which we create by implementing IHandleMessages<VoteMessage> and its Handle method. As you can see, it's very straightforward:

public class VoteMessageHandler : IHandleMessages<VoteMessage>
{
    private readonly VoteDbContext dbContext;
    private readonly ILogger<VoteMessageHandler> logger;

    public VoteMessageHandler(VoteDbContext dbContext, ILogger<VoteMessageHandler> logger)
    {
        this.dbContext = dbContext;
        this.logger = logger;
    }

    public async Task Handle(VoteMessage message)
    {
        // save to the database
    }
}

Summary

Obviously Rebus has a lot more capabilities than I'm showing here. But the nice thing about it is that the barrier to entry is very low: you can start simple, adding basic messaging with swappable transports to your ASP.NET Core microservices with minimal effort.



Many of you will know that with ASP.NET Core, it's really easy to generate OpenAPI (Swagger) documentation. In fact, it's now part of the default template for a web API. If you type dotnet new webapi you'll get a project that already references the Swashbuckle.AspNetCore NuGet package, which gives you a nice web page showing all the endpoints in your API and lets you test them easily.

Swagger UI

There's also a link on this page to a swagger.json document which describes your API in such a way that a client could be automatically generated from it.
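
Under the hood, this is just Swashbuckle wired up in Startup. Here's a rough sketch of what that registration typically looks like (the exact code the template generates varies by SDK version, and the API title here is a placeholder):

// in ConfigureServices (requires the Swashbuckle.AspNetCore package and Microsoft.OpenApi.Models)
services.AddSwaggerGen(c =>
{
    c.SwaggerDoc("v1", new OpenApiInfo { Title = "WeatherApi", Version = "v1" });
});

// in Configure
app.UseSwagger();
app.UseSwaggerUI(c => c.SwaggerEndpoint("/swagger/v1/swagger.json", "WeatherApi v1"));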

Not this again!

Now, auto-generated clients are something I've been burned by in the past. The tooling to generate them was often complex and painful to automate, and it produced ugly generated code that was never flexible enough to do what I wanted. So I avoided them for a long time.

However, with the rise of gRPC, it seems auto-generated clients are making something of a comeback, and the latest Visual Studio can now generate clients for both gRPC and OpenAPI services.

Add Service Reference

I recently needed to make a quick demo app where one ASP.NET Core web application needed to call into an ASP.NET Core web API, and it seemed an ideal opportunity to give auto-generated clients another chance. It was mostly a smooth experience, but I did run into a couple of minor issues, so I thought I'd document my findings here.

Adding a Service Reference in Visual Studio

Adding a service reference in Visual Studio is very easy. Select the project in Solution Explorer and choose Project | Add Service Reference. From here you can choose whether to add a reference for an OpenAPI or gRPC service.

If you choose OpenAPI, you can either point to a Swagger document on disk or reference one via a URL. I chose a local Swagger JSON file, which I'd saved by visiting the Swagger page for my web API and downloading the JSON.

We'll discuss the configuration options shortly, but if you just accept the defaults you end up with something like the following in your csproj file:

  <ItemGroup>
    <OpenApiReference Include="..\api\swagger-v1.json">
      <CodeGenerator>NSwagCSharp</CodeGenerator>
      <Link>OpenAPIs\swagger-v1.json</Link>
    </OpenApiReference>
  </ItemGroup>

Customising the generated code

How do we use the generated client? Well, first we need to know what it's called. We can actually see the code if we navigate into the obj folder. In my example, the file was called swagger-v1Client.cs, and the name of the generated class was swagger_v1Client, which was not what I wanted.

What's more, the names of the methods on the client were not what I would consider intuitive either. So let's look at a few ways to improve the generated code.

My first tip is that your Swagger file should include an operationId for each operation. You can achieve this by providing a Name property in the HTTP method attributes in the web API. Here's a simple example:

[HttpGet(Name=nameof(Get))]
public IEnumerable<WeatherForecast> Get()

Second, in the csproj file we can specify a number of options. For example, we can control the class name, namespace, and output path of the generated file by adding extra properties to the OpenApiReference node:

<Namespace>Weather</Namespace>
<ClassName>WeatherServiceClient</ClassName>
<OutputPath>WeatherServiceClient.cs</OutputPath>

Finally, there are two more things I want to change about the generated client. First, I want an interface, allowing unit tests to mock the client if necessary. Second, by default the constructor takes two parameters - a base URL and an HttpClient. The base URL gets in the way of easily registering this client in our DI container, so I want to turn that off and configure the base URL a different way.

Both of these changes can be achieved by customising the NSwag options (NSwag is the tool that generates the client). The options I changed were setting /UseBaseUrl to false and /GenerateClientInterfaces to true. We can set these with an Options property:

<Options>/UseBaseUrl:false /GenerateClientInterfaces:true</Options>

Here's what the full configuration in the csproj file looks like after these customizations:

  <ItemGroup>
    <OpenApiReference Include="..\api\swagger-v1.json">
      <CodeGenerator>NSwagCSharp</CodeGenerator>
      <Link>OpenAPIs\swagger-v1.json</Link>
      <Namespace>Weather</Namespace>
      <ClassName>WeatherServiceClient</ClassName>
      <OutputPath>WeatherServiceClient.cs</OutputPath>
      <Options>/UseBaseUrl:false /GenerateClientInterfaces:true</Options>
    </OpenApiReference>
  </ItemGroup>

Note that Visual Studio sometimes seemed reluctant to re-generate the client even though I had changed the options. Deleting the file from the obj folder seemed to get it working again.

Registering the client

The final step is to register the client with the AddHttpClient method in my Startup.ConfigureServices method. (Check out Steve Gordon's series on HttpClientFactory for more information on why this is a good idea.)

This allows us to set the base address for the client after it has been created, fetching the value from configuration:

services.AddHttpClient<IWeatherServiceClient, WeatherServiceClient>(
    (provider, client) =>
    {
        client.BaseAddress = new Uri(Configuration.GetValue(
            "WeatherServiceBaseAddress", "https://localhost:44369/"));
    });

With this step completed, any controller or Razor page that needs to access the web API can just take a dependency on IWeatherServiceClient.
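
For example, an MVC controller might consume it like this (a sketch: WeatherController is hypothetical, and I'm assuming the Get operationId above results in a generated method called GetAsync):

public class WeatherController : Controller
{
    private readonly IWeatherServiceClient weatherClient;

    public WeatherController(IWeatherServiceClient weatherClient)
    {
        // resolved from the DI container thanks to the AddHttpClient registration above
        this.weatherClient = weatherClient;
    }

    public async Task<IActionResult> Index()
    {
        // GetAsync is the client method NSwag generated from the "Get" operationId
        var forecasts = await weatherClient.GetAsync();
        return View(forecasts);
    }
}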



In this article I want to discuss the materialized view pattern and why I think serverless is a good match for it. I've written this as my submission to the "Serverless September" initiative. Be sure to check out the other great contributions to this series.

Materialized Views

Let's start with a brief overview of what is meant by a "materialized view". In an ideal world, we'd store all of our data in a database that was perfectly optimized for every query we threw at it. No matter what question we asked of the data, it would respond quickly and efficiently.

In the real world, unfortunately, that's just not possible. Whether you're storing your data in a relational database like SQL Server or a document database like Cosmos DB, there are going to be occasions where the queries you want to run are simply too slow and costly.

The materialized view pattern is a simple idea. We build a new "view" of the data that stores the data in just the right way to support a particular query. There are several different approaches to how the materialized view is generated, but the basic idea is that every time we update application data, we also update any materialized views that depend on that data.

You can create as many materialized views as you want. It's quite common to create one for a specific view on a web page or mobile application, since performance is especially important when an end user is waiting for a page load.

An Example Scenario

Let's consider a very simple example scenario. Imagine we have an e-Commerce website. Our database will store products (all the things we sell) and orders (all the things we've sold).

If our database is a relational database, then we'd probably have at least three tables - one for the products, one for orders, and one for "order lines" as a single order may be for multiple products.

database schema example

And if our database was a document database, then we'd probably use a document for each product and a document for each order, including its "order lines". Here's a simplified representation of what an order document might look like in a document database:

{
    'id': 1234,
    'customerEmail': 'customer@example.com',
    'lines': [
        { 'productId': 129512, 'quantity': 1},
        { 'productId': 145982, 'quantity': 2}
    ]
}

So far, so good. But let's imagine that when a customer views a product, we also want to show them some recommendations of a few products that other customers who bought this item also purchased.

Whether we used a relational database or a document database, this query is perfectly possible to implement. All we need to do is find all orders that include the product currently being viewed, then group all the "order lines" by product id, sort them by the most popular, and take the top few rows. Simple! But very slow and expensive. If we have millions of orders in our database, the query will probably take far too long to complete for the web page to load without timing out.

Instead we can create a materialized view that is specifically designed to return the product recommendations. For each product we could keep a running total of the counts of other items purchased at the same time. And now what was previously an expensive aggregate operation across all order lines in the entire database becomes a simple lookup by product id.
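
To make the difference concrete, here's a rough LINQ sketch of the two approaches (the orders collection, the viewedProductId variable and the recommendationsByProductId lookup are all assumptions for illustration):

// expensive: aggregate over every order that contains the viewed product
var recommendedProductIds = orders
    .Where(o => o.OrderLines.Any(l => l.ProductId == viewedProductId))
    .SelectMany(o => o.OrderLines)
    .Where(l => l.ProductId != viewedProductId)
    .GroupBy(l => l.ProductId)
    .OrderByDescending(g => g.Sum(l => l.Quantity))
    .Take(5)
    .Select(g => g.Key)
    .ToList();

// cheap: with a materialized view, it's a single lookup by product id
var recommendations = recommendationsByProductId[viewedProductId];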

Eventual Consistency

Having decided we want a materialized view, we have a problem. How will we keep the view up to date?

Basically every time a new order comes in, as well as saving it to the database, we also need to update the materialized view. If a new order is placed for products A and B, then we need to update two rows in our materialized view: the row for product A to add B as a recommendation, and vice versa.

The most common approach taken for materialized view generation is asynchronous updates. For example, when a new order comes in, we might publish a message to an event bus, and a subscriber to that message can be responsible for updating the materialized view.

This approach has several benefits. It means the code storing a new order can remain quick, and the potentially slower operation of updating the materialized view can happen in the background later. It also means we can scale out with many materialized views being updated in parallel.

But it does also mean that there will be a short window of time where our materialized view isn't fully up to date. This is known as the problem of "eventual consistency". The recommendations materialized view might not be fully up to date immediately, but will eventually become so.

This idea can worry people. Isn't it bad to show our customers data that isn't fully up to date? Actually it turns out that in many cases it doesn't matter as much as you might think. In our example, it hardly matters at all. So what if the recommendations I see for related products aren't perfectly up to date?

Of course, the asynchronous approach to updating materialized views isn't strictly required. There are ways to put the update to the main data and all materialized views inside a single transaction, so they are always consistent. However, this comes at a significant performance penalty and may also involve the complexities of distributed transactions.

Why Serverless?

Now that we've considered what the materialized view pattern is and why we might want to use it, let's see why I think serverless is a good fit for implementing these views.

First, materialized view updates are usually event-driven. And serverless "Functions as a Service" frameworks like Azure Functions are great for implementing event handlers. You could create an Azure Function for each materialized view you want to keep up to date. The built-in scalability of Azure Functions means that even if you have lots of materialized views to update, they can be processed in parallel.

Second, sometimes you need to rebuild an entire materialized view from scratch. This is common when you're introducing a new materialized view, so you can't just subscribe to updates to the source data - you need to go back to the beginning. This generates a significant backlog of work, and serverless platforms like Azure Functions have the capacity to rapidly scale out to help you catch up quickly.

Alternatively, sometimes materialized views are generated on a schedule. For example, maybe every night at midnight you completely rebuild one of the views. Again, this is something that serverless is great at - it's easy to create a timer-triggered Azure Function (although if you're processing a lot of data in a single function you'll need to take care not to time out - on the consumption plan, Azure Functions have a time limit which defaults to 5 minutes).
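
A nightly rebuild like that only needs a few lines to set up. Here's a minimal sketch (the function name is a placeholder and the rebuild logic is elided):

[FunctionName("RebuildRecommendationsView")]
public static void Run(
    [TimerTrigger("0 0 0 * * *")] TimerInfo timer, // runs every night at midnight
    ILogger log)
{
    log.LogInformation("Starting nightly materialized view rebuild");
    // read the source data and regenerate the view from scratch here
}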

Finally, it can be very hard to predict ahead of time how much compute resource is needed to maintain a materialized view. Some views may require very infrequent updates, while others are constantly churning. Some changes can come in bursts. Again, having a platform that dynamically scales out to meet demand, but also can scale to zero to save money during idle periods gives you a cost-effective way to pay only for the compute you need.

Cosmos DB change feed

The materialized view pattern can be implemented against any database, including SQL Database and Cosmos DB. In fact, it's quite common to pair this pattern with "event sourcing", where you simply store "events" that represent the changes to application state, rather than the current state of the application, which is typically what a regular database would store.

Cosmos DB has a really helpful feature called the "change feed" which allows you to replay all the changes to documents in a collection. That is to say, every time a new document is created or updated, a new entry appears in the change feed. You can then either subscribe to that change feed, to be notified about every change, or replay it from the start, to receive notifications of all historic changes. This makes it ideal for both generating a new materialized view and keeping it up to date.

The change feed needs to keep track of how far through the backlog of change events you are, and this is done by maintaining a "lease", which is simply stored in a container in your Cosmos DB database.

Azure Functions comes with a built-in trigger called the CosmosDBTrigger that makes it really easy to subscribe to these change notifications, and use them to update your materialized views. And it gives you the changes in batches (typically 100 changes in a batch), enabling you to make your materialized view update code more efficient.

Sample application

To demonstrate serverless materialized view generation with the Cosmos DB change feed, I decided to create a very simple proof of concept application. The scenario I chose to implement was the e-Commerce example I described above, where we want to generate product "recommendations" based on order history from all customers.

I chose Cosmos DB for my database, and my database design is very simple: a products collection, which holds a document per product, and an orders collection, which holds a document for each order. For the recommendations, I could have created a recommendation document per product (so the recommendation materialized view was entirely separate), but for this demo I decided to store the recommendations directly in the product document itself.

I made a very simple console application to auto-generate products and random orders that use the products. Here's the code for the product generator and the order generator.

I also wrote a function to produce the recommendations on the fly without the aid of a materialized view. This shows just how slow and expensive this query is. (With Cosmos DB, every query consumes a number of "request units" (RUs), and the more orders you have, the more RUs this query will consume.) It was already getting very slow with just a few thousand orders in the database, and I suspect that by the time there are millions of orders, it simply wouldn't be possible to run this in real time.

Then I created an Azure Function application that contained a single function to update my materialized view. This uses a CosmosDBTrigger which means this function will be triggered every time there is an update to documents in the collection we're monitoring. In our case, it's the orders collection, so any new orders will trigger this function. I also need to specify where the "lease" is stored that will keep track of how far through the updates we are. And I've set StartFromBeginning to true, although technically for a materialized view like our recommendations there is no need to do that if we don't mind starting with no recommendations.

I'm also using a CosmosDB binding so that I can perform additional queries and updates to the products collection. Here's the function declaration:

[FunctionName("UpdateRecommendationView")]
public static async Task Run([CosmosDBTrigger(
        databaseName: databaseName,
        collectionName: Constants.OrdersContainerId,
        ConnectionStringSetting = connectionStringSettingName,
        LeaseCollectionName = "leases",
        StartFromBeginning = true,
        CreateLeaseCollectionIfNotExists = true)]
    IReadOnlyList<Document> input,

    [CosmosDB(
        databaseName: databaseName,
        collectionName: Constants.ProductsContainerId,
        ConnectionStringSetting = "OrdersDatabase")] DocumentClient productsClient,

    ILogger log)

Inside the function, the input parameter that the CosmosDBTrigger is bound to contains a list of updates. Each update includes the full document contents, so we can simply loop round and examine each order. For each order that contains more than one product, we update the product document of every product in that order, incrementing the purchase counts of the other products bought alongside it.

Obviously this could be quite slow, but that's one of the advantages of building these views asynchronously - it's happening in the background and not affecting the end user's experience. I could also rework this code slightly to take advantage of the batched nature of the updates, which would possibly reduce the number of document updates we need to make for each batch - currently we would update the same product more than once if it featured in multiple orders in the batch.
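
Here's a rough sketch of that batching idea (not the sample's actual code, just an illustration): accumulate the related-product counts for the whole batch in memory first, then update each affected product document once.

// productId -> (relatedProductId -> count of times bought together in this batch)
var batchCounts = new Dictionary<string, Dictionary<string, int>>();
foreach (var doc in input)
{
    Order order = (dynamic)doc;
    foreach (var line in order.OrderLines)
    {
        foreach (var other in order.OrderLines.Where(l => l.ProductId != line.ProductId))
        {
            if (!batchCounts.TryGetValue(line.ProductId, out var counts))
            {
                counts = new Dictionary<string, int>();
                batchCounts[line.ProductId] = counts;
            }
            counts[other.ProductId] = counts.TryGetValue(other.ProductId, out var c) ? c + 1 : 1;
        }
    }
}
// now perform one ReadDocumentAsync/ReplaceDocumentAsync per product in batchCounts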

I did make one small performance enhancement, which is to ensure that the recommendation list size for each product doesn't grow above 50, just using a very simplistic technique. Obviously a real-world recommendations engine would be more sophisticated.

foreach (var doc in input)
{
    Order order = (dynamic) doc;
    log.LogInformation("Order: " + order.Id);

    if (order.OrderLines.Count <= 1)
    {
        // no related products to process
        continue;
    }
    foreach (var orderLine in order.OrderLines)
    {
        var docUri = UriFactory.CreateDocumentUri(databaseName, Constants.ProductsContainerId, orderLine.ProductId);
        var document = await productsClient.ReadDocumentAsync(docUri,
            new RequestOptions { PartitionKey = new PartitionKey("clothing") }); // partition key is "/Category" and all sample data is in the "clothing" category
        Product p = (dynamic) document.Resource;
        if (p.Recommendations == null)
        {
            p.Recommendations = new Dictionary<string, RelatedProduct>();
        }

        log.LogInformation("Product:" + orderLine.ProductId);
        foreach (var boughtWith in order.OrderLines.Where(l => l.ProductId != orderLine.ProductId))
        {
            if (p.Recommendations.ContainsKey(boughtWith.ProductId))
            {
                p.Recommendations[boughtWith.ProductId].Count++;
                p.Recommendations[boughtWith.ProductId].MostRecentPurchase = order.OrderDate;
            }
            else
            {
                p.Recommendations[boughtWith.ProductId] = new RelatedProduct()
                {
                    Count = 1,
                    MostRecentPurchase = order.OrderDate
                };
            }
        }
        if (p.Recommendations.Count > 50)
        {
            // simplistic way to keep recommendations list from getting too large
            log.LogInformation("Truncating recommendations");
            p.Recommendations = p.Recommendations
                .OrderByDescending(kvp => kvp.Value.Count) // keep the most frequently bought-together products
                .ThenByDescending(kvp => kvp.Value.MostRecentPurchase)
                .Take(50)
                .ToDictionary(kvp => kvp.Key, kvp => kvp.Value);
        }
        // update the product document with the new recommendations
        // (another optimization would be to not update if the recommendations hadn't actually changed)
        await productsClient.ReplaceDocumentAsync(document.Resource.SelfLink, p);
    }
}

Summary

The materialized view pattern is a very useful and powerful way to enable complex queries to be performed rapidly. The supporting views can be generated asynchronously, and serverless platforms like Azure Functions are a great fit for this. The CosmosDB change feed greatly simplifies the task of setting this up, and has built-in support in Azure Functions.

Despite not having used the Cosmos DB change feed processor before, I found it really quick and easy to get up and running with. You can learn more about the Cosmos DB change feed here.

Want to learn more about how easy it is to get up and running with Azure Functions? Be sure to check out my Pluralsight courses Azure Functions Fundamentals and Microsoft Azure Developer: Create Serverless Functions.