
I recently gave a talk about some of the challenges I encountered writing audio applications in C#. One of the key issues I talked about was how to handle situations where I wanted to use algorithms in C# that existed in other languages, but no "fully managed" implementation was available.

The two main scenarios in which this occurs are:

1. Using DSP algorithms such as a resampler or a Fast Fourier Transform (FFT). Other examples might be digital filters, low-latency convolution, echo suppression, pitch detection and varispeed playback, all of which are very useful building blocks for audio applications and non-trivial to implement yourself.

2. Using codecs to encode or decode audio in proprietary formats such as MP3, AAC, Ogg/Vorbis etc. Again, these require specialist knowledge that makes it unrealistic to simply implement your own version just because the programming language you have selected doesn't provide that capability in its core framework.

Porting or Interop?

Every time I ran into issues like this with NAudio (which was very frequent) I had two main choices:

Option 1 was to find an existing (usually C/C++) implementation of the algorithm and port it to C#.

Option 2 was to write P/Invoke wrappers for a native DLL so a .NET application can make use of the existing implementation.
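To make option 2 concrete, here's a minimal sketch of what a P/Invoke wrapper might look like. The DLL name, entry point and signature here are all hypothetical, standing in for whatever the native library's header actually declares:

using System;
using System.Runtime.InteropServices;

public static class NativeDecoder
{
    // Hypothetical native export - the real DLL name, entry point and
    // signature would come from the library's C header file.
    [DllImport("nativedecoder.dll", CallingConvention = CallingConvention.Cdecl)]
    private static extern int decode_frame(
        byte[] input, int inputLength, short[] output, int maxSamples);

    // A friendlier .NET-facing method that hides the raw native call
    public static short[] DecodeFrame(byte[] input)
    {
        var output = new short[4096];
        int samples = decode_frame(input, input.Length, output, output.Length);
        if (samples < 0)
            throw new InvalidOperationException($"Decode failed: {samples}");
        Array.Resize(ref output, samples);
        return output;
    }
}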

In this article I want to explore the benefits and disadvantages of each approach, and ask the question of whether in the future there might be anything that makes life easier for us to consume code written in other languages.

Interop to native code

Let's start with writing P/Invoke wrappers around a native DLL. There are several advantages to this approach.

First of all, using a native library ought to be faster. Although in theory a straight port of C code to C# carries only a minimal performance penalty, an established library implementing a codec or DSP algorithm will likely have undergone a fair amount of performance tuning that a straightforward port might not fully benefit from.

Secondly, it's less error-prone. Although the syntax of C/C++ and C# is superficially very similar, there are plenty of gotchas waiting for you, particularly when dealing with pointers or bitwise operations.

Thirdly, and perhaps most importantly, it's much easier to keep up to date. Porting an existing codebase from one language to another is something you want to do once and then forget. However, if the codebase you ported is receiving regular bugfixes or updates, it is a real pain to keep having to migrate the code deltas across.

A good example here is the Concentus open source port of the Opus reference library to C#. The project is an amazing achievement, but the author understandably no longer has time to maintain the port, resulting in it falling behind the latest version of Opus. By contrast, a P/Invoke wrapper requires much less maintenance.

It's not all good news though. First of all, a native wrapper is only usable if you have binaries for the specific platform you want to run on. Even for Windows, that often means you need 32-bit and 64-bit versions of the wrappers (and native DLLs). And if you want to run on other supported platforms (such as Linux, or ARM64), then you can't use regular Windows DLLs - you need to find platform-specific native libraries.

And that's not all. There are several scenarios in which your .NET code runs in a sandboxed environment. I first ran into this with NAudio when I wanted to use it in Silverlight, but in other situations, such as Windows 10X applications, Blazor applications, or environments like Azure App Service, you may not be able to call native APIs at all, or may be restricted as to which APIs you can use.

In these environments, you often have no option but to find some way to create a fully managed version of the algorithm you need. So let's consider porting next.

Porting to managed code

Despite the considerable amount of additional work, there are several benefits to porting to managed code. Obviously, one huge advantage is that the code is now fully portable and able to run on any platform that .NET can run on, whether that be in a browser with Blazor or as part of a Xamarin Android or iOS app.

Another advantage is that you have the opportunity to reshape the API to be more idiomatic, making it feel more natural for .NET developers.
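For example (the names here are hypothetical, not an actual NAudio API), a ported resampler could swap a C-style pointer-and-length signature for something more natural in .NET:

// A C-style API like: int resample(float* in, int inLen, float* out, int* outLen)
// can become a .NET-friendly interface working with spans:
public interface IResampler
{
    // Returns the number of samples written to output
    int Resample(ReadOnlySpan<float> input, Span<float> output);
}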

You also get the safety benefits that come with the managed execution environment, such as protection against overrunning the bounds of arrays or dereferencing pointers to memory that has already been freed.

One issue I ran into with NAudio and porting was to do with licensing. A large percentage of open source audio code is licensed under GPL or LGPL, which are both incompatible with the more commercially friendly MIT license that I was using.

This meant that even when there were perfectly good algorithms available for porting to C#, I wasn't able to use them. This was especially annoying when I needed a good resampler, and it was quite some time before I found one that I was able to get permission to include in NAudio (the WDLResampler).

And although I listed performance as a benefit of interop, there are actually some potential performance benefits to fully porting an algorithm to managed code. That's because interop itself adds some overhead. In his superb article on performance improvements in .NET Core 3.0, Stephen Toub says "one of the key factors in enabling those performance improvements was in moving a lot of native code to managed".

Isn't there a better way?

For many parts of NAudio, due to the tradeoffs of interop versus managed code, I ended up doing both. I wrapped three separate native resampler APIs (ACM, DMO and MFT) in addition to porting a managed one. I also created wrappers for two MP3 decoders, as well as creating a fully managed one.

But what if we didn't have to keep doing this? Why isn't there some kind of universal format that would allow us to share code between almost any language? It feels like it's about time that something like that should exist. And maybe we're finally getting close...

Intermediate representations

Imagine a world where no matter what language an algorithm was written in, whether C, Java, C#, JavaScript, Python etc, you could consume it on any runtime - .NET, node.js, Java, Python, etc. (Let's assume for now that the code in question is not inherently tied to a particular operating system, which is the case for the sorts of code I'm talking about here).

The way this would be possible is with some kind of "intermediate representation". You take the original code, and convert it into a common representation that allows it to run in more than one environment.

We've actually seen this many times. For example, languages like CoffeeScript and TypeScript are not directly supported in browsers, but we can transpile them to the intermediate representation of JavaScript, which allows us to run code written in these source languages on any platform that can run JavaScript.

JavaScript transpilation

And of course the .NET framework itself has excellent language interoperability by virtue of the fact that we can compile to CIL (Common Intermediate Language). This means that code I write in C# can be consumed in F# and vice versa.

Common intermediate language

All this is very nice, but it's far from universal. We just have small families of related languages that interoperate nicely, while everything else is still outside.

For example, Java also has an intermediate representation called "bytecode". The Java Virtual Machine (JVM) can run any code that compiles to bytecode, but can't directly run things compiled to CIL. Likewise the .NET Common Language Runtime (CLR) can't run Java bytecode.

But in recent years, we've seen the emergence of a new intermediate format that shows promise to take cross-language interoperability a lot further.

Is WebAssembly the universal binary format?

WebAssembly (abbreviated "Wasm") is a portable binary format. It's a similar idea to CIL or Java bytecode. It can be thought of as the instruction set for a virtual machine. What makes this particularly universal is that anything that can run JavaScript can run programs compiled to WebAssembly. So already the ubiquitous nature of the web means that it runs in a very broad range of environments.

What's particularly impressive about WebAssembly is that many languages traditionally thought of as low-level, compile-to-native languages can also target it. This means that code written in C or C++, as well as newer languages like Rust or Go, can compile directly into WebAssembly.

WebAssembly

There have been some really impressive demos, such as running the Doom engine or AutoCAD on WebAssembly. Both are examples of very large legacy C/C++ codebases that would previously have been unthinkable to run in a browser.

What grabbed my attention more though was the ability to compile C++ DSP algorithms into WebAssembly allowing them to be directly used from JavaScript applications. The WebDSP project is a great example of this.

It gets even better. Although you can't compile C# directly to WebAssembly (because it relies on capabilities of the CLR that aren't offered by WebAssembly), you can compile a stripped-down version of the .NET CLR into WebAssembly. And this is the magic that powers Blazor. Essentially, this means that you can write C# code and have it run in any browser, which is pretty incredible. Blazor has tremendous momentum and popularity, and even those of us who are a bit jaded after the death of Silverlight can see that it has a promising future.

The way Blazor works is something along these lines: the C# code is still compiled to CIL, but because the CLR can itself run on WebAssembly, it can load and run that CIL in the browser.

Blazor

And .NET isn't the only runtime that can execute in the browser thanks to WebAssembly. There's Pyodide which brings the Python 3.8 runtime to the browser via WebAssembly, along with the Python scientific stack including NumPy, Pandas, Matplotlib, SciPy, and scikit-learn. And there's a great list here pointing to several similar projects for other popular programming languages.

What's still missing

All this is pretty cool, but it still doesn't get me what I want. Most of the audio-related libraries I want to consume in .NET are written in C/C++, and although they can be compiled to WebAssembly, and theoretically I could write a C# Blazor app that calls into those WebAssembly libraries, that's not actually what I want to do. The missing piece of the puzzle would be for a regular .NET application, hosted by the .NET CLR, to be able to call methods in a WebAssembly library and have them run as though they were fully managed code. Is something like that possible?

WASM in .NET

Well, theoretically it ought to be possible. After all, WebAssembly is a fairly constrained set of instructions, each of which could be mapped to CIL instructions. And it turns out that there are a couple of open source projects attempting to do exactly that.

The first is a project started by Eric Sink, called wasm2cil. As an example of what can be achieved, Eric says "I can start with the C code for SQLite and its shell application, compile it to Wasm with Clang, "transpile" it to a .NET assembly, and then run it with .NET Core. The result is the SQLite shell, entirely as managed code, with no pinvokes involved."

Super cool stuff, and it's open source. However, Eric is also clear that this is an experimental work in progress. There are various bits and pieces not yet supported.

Another project along similar lines I found is dotnet-webassembly by Ryan Lamansky. It appears to be under active development, although it too is not necessarily complete. There is a GitHub issue about whether it is mature enough to run the ffmpeg libraries, which contain a huge toolkit of codec-related capabilities. It seems that it initially did not work, but some progress has been made on this front recently.

So it seems we are getting close, but not there yet. If it does become possible to take a DSP or codec library compiled into WASM, and call it as though it were a fully managed .NET library without a large performance penalty, that would be a huge boost to the .NET platform in general. C# developers like me could use almost any C/C++ library we wanted without needing to do interop to native code or go through the pain of porting.

It's interesting and encouraging to see that the .NET team are taking WebAssembly seriously and have many features planned to improve the experience for running .NET applications on WebAssembly. Of course, that doesn't directly address my concern of running WebAssembly code on the .NET platform, but I'm hoping at some point it will get on the radar.

Summary

In conclusion, it can be a frustrating experience when your language of choice (in my case C#) doesn't give you easy access to perfectly good existing code that was written in other languages. It would be wonderful if there were some kind of universal "intermediate format" where code that implements things like DSP algorithms and codecs (which are inherently portable between languages and operating systems) could be easily used no matter what programming language you were using or what platform you were running on. WebAssembly shows some real potential to be that universal format, and I'll be keeping a close eye on this space over the next few years to see how things evolve.

For now though, when I want access to codecs or DSP in C#, I'm still stuck with choosing between P/Invoke or porting, and more often than I would like, I end up doing both.



In this post I want to give an overview of what happens when you turn on the Docker tooling in Visual Studio 2019. If you're like me, you want to know a bit about what will happen under the hood before using a feature like this. I have questions like, "what changes will be made to my project files?", "will I still be able to run the projects normally (i.e. not containerized)?", "what about team members using VS Code instead?"

So for those who have not yet dived deeply into the world of containers, here's a basic guide to how you can try it out yourself for a very simple "microservices" application.

Demo scenario

To start with, let's set up a very simple demo scenario. We'll create a Visual Studio solution that has two web apps which will be our "microservices".

dotnet new web -o Microservice1
dotnet new web -o Microservice2
dotnet new sln
dotnet sln add Microservice1
dotnet sln add Microservice2

And optionally we can update the Startup.Configure method to help us differentiate between the two microservices:

app.UseEndpoints(endpoints =>
{
    endpoints.MapGet("/", async context =>
    {
        await context.Response.WriteAsync("Hello from Microservice1!");
    });
});

Launch Profiles

The "traditional" way to launch multiple microservices in Visual Studio to would be to go to "Project | Set Startup Projects..", select "Multiple Startup Projects" and set both microservices to "Start".

Set startup projects screenshot

Now when we run in VS2019, by default, our two microservices will run hosted by IIS Express. Mine started up on ports 44394 and 44365, and you can see the configured port numbers in the Properties/launchSettings.json file for each microservice.

Here's an example, and you'll notice that out of the box I've got two "profiles" - one that runs using IIS Express, and one (called "Microservice1") that uses dotnet run to host your service on Kestrel.

{
  "iisSettings": {
    "windowsAuthentication": false,
    "anonymousAuthentication": true,
    "iisExpress": {
      "applicationUrl": "http://localhost:7123",
      "sslPort": 44365
    }
  },
  "profiles": {
    "IIS Express": {
      "commandName": "IISExpress",
      "launchBrowser": true,
      "environmentVariables": {
        "ASPNETCORE_ENVIRONMENT": "Development"
      }
    },
    "Microservice1": {
      "commandName": "Project",
      "dotnetRunMessages": "true",
      "launchBrowser": true,
      "applicationUrl": "https://localhost:5001;http://localhost:5000",
      "environmentVariables": {
        "ASPNETCORE_ENVIRONMENT": "Development"
      }
    }
  }
}

If we were to run our two microservices directly from the command-line with dotnet run, they would not use IIS Express, and we'd find that only one of the two would start up, as they'd both try to listen on port 5001. There are a few options for overriding this when you are running from the command-line (one example is shown below), but since this post is about Visual Studio, let's see how we can select which profile each of our microservices uses.
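For instance, passing the --urls argument (or equivalently setting the ASPNETCORE_URLS environment variable) overrides the URLs a service listens on:

dotnet run --urls "https://localhost:5003;http://localhost:5002"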

To change the launch profile for a project, first we need to right-click on that project in the Solution Explorer and choose "Set as Startup Project" (n.b. I'm sure there must be a way to do this without switching away from multiple startup projects, but I haven't found it if there is!).

This will give us access to a drop-down menu in the Visual Studio command bar which lets us switch between IIS Express and directly running the project (which shows with the name of the project so Microservice1 in this example).

Change launch profile

Once we have done this for both microservices, we can change back to multiple startup projects, and we need to make one final change, modifying the applicationUrl setting in launchSettings.json for Microservice2 so that it doesn't clash with Microservice1. I've chosen ports 5002 and 5003 for this example:

"Microservice2": {
    "commandName": "Project",
    "dotnetRunMessages": "true",
    "launchBrowser": true,
    "applicationUrl": "https://localhost:5003;http://localhost:5002",
    "environmentVariables": {
    "ASPNETCORE_ENVIRONMENT": "Development"
    }
}

Now when we run in VS2019, we'll see two command windows that run the microservices directly with dotnet run and both services can run simultaneously. If they wanted to communicate with each other, we'd need to give them application settings holding the URL and port numbers they can use to find each other.

All that was just a bit of background on how to switch between launch profiles, but it's useful to know as there'll be a third option when we enable Docker.

Enabling Docker for a Project

In order to use Docker support for VS2019, you obviously do need Docker Desktop installed and running on your PC. I have mine set to Linux container mode and running on WSL2.

Once we have Docker installed and running, then we can right-click Microservice1 in the Solution Explorer, and select "Add | Docker Support...".

Add Docker Support

This will bring up a dialog letting you choose either Linux or Windows as the Target OS. I went with the default of Linux.

Once you do this, several things will happen.

First, a Dockerfile is created for you. Here's the one it created for my microservice:

#See https://aka.ms/containerfastmode to understand how Visual Studio uses this Dockerfile to build your images for faster debugging.

FROM mcr.microsoft.com/dotnet/aspnet:5.0-buster-slim AS base
WORKDIR /app
EXPOSE 80
EXPOSE 443

FROM mcr.microsoft.com/dotnet/sdk:5.0-buster-slim AS build
WORKDIR /src
COPY ["Microservice1/Microservice1.csproj", "Microservice1/"]
RUN dotnet restore "Microservice1/Microservice1.csproj"
COPY . .
WORKDIR "/src/Microservice1"
RUN dotnet build "Microservice1.csproj" -c Release -o /app/build

FROM build AS publish
RUN dotnet publish "Microservice1.csproj" -c Release -o /app/publish

FROM base AS final
WORKDIR /app
COPY --from=publish /app/publish .
ENTRYPOINT ["dotnet", "Microservice1.dll"]

What's nice about this Dockerfile is that it's completely standard. It's not a special "Visual Studio" Dockerfile. It's just the same as you would use if you were working from Visual Studio Code instead.

The next change of note is to our csproj file. It's added a UserSecretsId, which is a way to help us keep secrets out of source code in a development environment. It's also set the DockerDefaultTargetOS to Linux, which is what we selected.
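As an aside, values stored against that UserSecretsId are managed with the dotnet user-secrets tool. The key and value here are purely illustrative:

dotnet user-secrets set "ConnectionStrings:MyDb" "some-secret-value" --project Microservice1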

But notice that we've also now got a reference to the Microsoft.VisualStudio.Azure.Containers.Tools.Targets NuGet package.

<Project Sdk="Microsoft.NET.Sdk.Web">

  <PropertyGroup>
    <TargetFramework>net5.0</TargetFramework>
    <UserSecretsId>2e9eb51f-13b8-406a-9735-92c975674696</UserSecretsId>
    <DockerDefaultTargetOS>Linux</DockerDefaultTargetOS>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="Microsoft.VisualStudio.Azure.Containers.Tools.Targets" Version="1.10.9" />
  </ItemGroup>

</Project>

This new package reference might make you a bit nervous. Does this mean that we can now only build our application with Visual Studio, or only build on a machine that has Docker installed? Has it made our microservice somehow dependent on Docker in order to run successfully?

Fortunately, the answer to each of those questions is no. All this package does is build a container image when we build our project in Visual Studio.

If we issue a docker image ls command we'll see that there is now a microservice1 docker image tagged dev. This is what Visual Studio will use to run our microservice in a container.

Docker image ls

However, if you issue a docker ps command which shows you running containers, you might be surprised to see that this container is already running, despite not having started debugging yet.

docker ps

What's going on here? Why is Visual Studio running my microservice without me asking it to? The answer is, this container isn't actually running microservice1 yet. Instead it is a pre-warmed container that Visual Studio has already started to speed up the development loop of working with containerized projects.

You can learn more about this at the link provided at the top of the auto-generated Dockerfile. There's lots of excellent information in that document, so make sure you take some time to read through it.

The basic takeaway is that this container uses volume mounts so that whenever you build a new version of your code, it doesn't need to create a new Docker image. The existing container that is already running will simply start running your code, which is in a mounted volume.

You can use the docker inspect command to see details of the mounted volumes, but again there is a helpful breakdown available here explaining what each one is for. There are mounts for your source code, the compiled code, and NuGet packages for example.
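For example, using a Go template to show just the mounts (substitute the container ID or name reported by docker ps):

docker inspect -f "{{ json .Mounts }}" <container-id>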

If you exit Visual Studio it will clean up after itself and remove this container, so if you do a docker ps -a you should no longer see the microservice1 container.

Before we see how to run, let's just quickly look at the two other changes that happened when we enabled Docker support for the service.

The first is that a new profile has been added to the launchSettings.json file that we saw earlier. This means that for each project in our solution that we enable Docker support for, we can either run it as a Docker container, or switch back to one of the alternatives (IIS Express or dotnet run) if we prefer.

"Docker": {
    "commandName": "Docker",
    "launchBrowser": true,
    "launchUrl": "{Scheme}://{ServiceHost}:{ServicePort}",
    "publishAllPorts": true,
    "useSSL": true
}

Finally, it also helpfully creates a .dockerignore file for us which protects our Docker images from being bloated or unintentionally containing secrets.
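The generated file contains entries along these lines (a representative subset rather than the exact generated list):

**/bin
**/obj
**/.vs
**/.git
**/node_modules
**/secrets.dev.yaml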

Running from Visual Studio

So far we've only converted one of our microservices to use Docker, but we can still run both of them if we have the "multiple startup projects" option selected. Each project simply uses the launch profile that's selected, so we can run one microservice as a Docker container and one with dotnet run or IIS Express if we want.

The obvious caveat here is that each technique for starting microservices (IIS Express, Docker, dotnet run) will use a different port number. So if your microservices need to communicate with each other you'll need some kind of service discovery mechanism. Tye is great for this, but that's a post for another day. By the end of this post we'll see Docker Compose in action which gives us a nice solution to this problem.

While you're running the application, you can check out the container logs using the excellent Visual Studio Container window. This not only lets you see the logs, but also the environment variables, browse the file system, see which ports are in use and connect to the container in a terminal window.

Visual Studio Container Window

The debugger is also set up to automatically attach to the code running in the container, so we can set breakpoints exactly as though we were running directly on our local machine.

Container orchestration

If you are building a microservices application, then you likely have several projects that need to be started, as well as possibly other dependent containerized services that need to run at the same time. A common approach is to use a Docker Compose YAML file to set this up, and again Visual Studio can help us with this.

I've added "Docker Support" to my second microservice using the same technique described above, and now we can add "Container Orchestrator" support by right-clicking on one of our microservices and selecting "Add | Container Orchestrator Support...".

Add Container Orchestrator Support

Next, we will be asked which container orchestrator we want to use. This can either be "Kubernetes/Helm" or "Docker Compose". Kubernetes is an increasingly common choice for hosting containers in production and Docker Desktop does allow you to run a single-node local Kubernetes cluster. However, I think for beginners to Docker, the Docker Compose route is a little simpler to get started with, so I'll choose Docker Compose.

Choose Container Orchestration

We'll again get prompted to choose an OS - I chose Linux as that's what I chose for the Docker support.

Let's look at what gets created when I add Docker Compose support. First, a new project is added to my solution, which is a "Docker Compose" project (.dcproj):

Docker Compose Project

The project includes an auto-generated docker-compose.yml file. A Docker Compose file holds a list of all the containers that you want to start up together when you run your microservices application. This can be just your own applications, but can also include additional third-party containers you want to start at the same time (e.g. a Redis cache).

Here, the created Docker Compose file is very simple, just referencing one microservice, and indicating where the Dockerfile can be found to enable building:

version: '3.4'

services:
  microservice1:
    image: ${DOCKER_REGISTRY-}microservice1
    build:
      context: .
      dockerfile: Microservice1/Dockerfile

There is also a docker-compose.override.yml file. An override file allows you to specify additional or alternative container settings that apply to a specific environment. So you could have one override file for local development, and one for production. Here, the override file is specifying the environment variables we want to set, the ports we want to expose and the volumes that should be mounted.

version: '3.4'

services:
  microservice1:
    environment:
      - ASPNETCORE_ENVIRONMENT=Development
      - ASPNETCORE_URLS=https://+:443;http://+:80
    ports:
      - "80"
      - "443"
    volumes:
      - ${APPDATA}/Microsoft/UserSecrets:/root/.microsoft/usersecrets:ro
      - ${APPDATA}/ASP.NET/Https:/root/.aspnet/https:ro

The other small change is that our microservice .csproj file has been updated with a reference to the Docker Compose project:

<DockerComposeProjectPath>..\docker-compose.dcproj</DockerComposeProjectPath>

If we do the same for Microservice2 and add container orchestration support, it will simply update our Docker Compose file with an additional entry:

version: '3.4'

services:
  microservice1:
    image: ${DOCKER_REGISTRY-}microservice1
    build:
      context: .
      dockerfile: Microservice1/Dockerfile

  microservice2:
    image: ${DOCKER_REGISTRY-}microservice2
    build:
      context: .
      dockerfile: Microservice2/Dockerfile

The other change is that we've now gone back to having a single "startup project". However, this startup project is the Docker Compose project, so when we start debugging in Visual Studio, it will launch all of the services listed in our Docker Compose file.

Docker Compose Startup

And when we start debugging, there will simply be one container running for each of the services in the Docker Compose YAML file.

Docker Compose Containers

Running with Docker Compose may seem similar to simply starting multiple projects, but it does offer some additional benefits.

First, Docker Compose will run the containers on the same Docker network, enabling them to communicate easily with each other. They can refer to each other by name as Docker Compose gives them a hostname the same as the container name. This means microservice1 could call microservice2 simply at the address http://microservice2.
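As a quick illustration (assuming the default template endpoints from earlier), Microservice1 could call Microservice2 like this:

// Docker Compose DNS resolves the service name "microservice2" to the
// correct container on the shared network (port 80 inside the network)
var client = new HttpClient();
var greeting = await client.GetStringAsync("http://microservice2/");
Console.WriteLine(greeting); // "Hello from Microservice2!"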

Second, we are free to add additional dependent services to our Docker Compose file. In .NET applications, a very common required dependency is a SQL database, and I wrote a tutorial on containerizing SQL Server Express that explains how you can do that.
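As a rough sketch (the image tag and password here are illustrative), adding SQL Server as an extra service in the Compose file looks something like this:

sqlserver:
  image: mcr.microsoft.com/mssql/server:2019-latest
  environment:
    - ACCEPT_EULA=Y
    - SA_PASSWORD=Your_password123
  ports:
    - "1433:1433"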

Summary

In this post we've seen that it's very straightforward to add Container and Container Orchestrator support to a Visual Studio project. But I've also hopefully shown that you don't necessarily have to go all in on this if you're new to Docker and just want to experiment a bit.

If you have other team members who do not have Docker installed, they can simply continue building and running the services in the usual way. And if they don't want to use the Visual Studio tooling, they can still use regular Docker (and Docker Compose) commands to build and run the containers from the command line.



If you have huge amounts of data in Azure Blob Storage, you may want to consider reducing your costs by using one of the cheaper "access tiers".

Hot and Cool Tiers

The default access tier is "Hot", which means that the blob is readily available to access at any time. It's intended for blobs that need to be accessed frequently.

You can also move blobs into the "Cool" tier, which has a reduced cost with the tradeoff of slightly lower availability. Its ideal use case is blobs that you will store for at least a month and don't need to access frequently.

Archive Tier

There's another access tier, called "Archive" which provides greatly reduced storage costs, but essentially makes your blobs "offline". To read the contents of a blob in Archive, you must first "rehydrate" it back into the Hot or Cool access tiers, which can take several hours. So you should only use the Archive tier for situations where you can accept a delay if you do need to access the file again in the future.

Note that there is also an "early deletion fee" to pay if you remove something from archive storage within 180 days of putting it in there.

Automatic Archiving

One really nice feature that complements these access tiers is the ability to automatically move blobs between access tiers. So for example you could set a rule that any blob that hasn't been accessed for six months gets automatically moved into archive.
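These rules are part of Azure Storage "lifecycle management" policies, which are defined as JSON. Here's a rough sketch of what such a rule looks like; this example moves block blobs that haven't been modified for six months into the Archive tier (archiving based on last access time is also possible, via daysAfterLastAccessTimeGreaterThan, if access tracking is enabled on the account):

{
  "rules": [
    {
      "name": "archiveStaleBlobs",
      "enabled": true,
      "type": "Lifecycle",
      "definition": {
        "filters": { "blobTypes": [ "blockBlob" ] },
        "actions": {
          "baseBlob": {
            "tierToArchive": { "daysAfterModificationGreaterThan": 180 }
          }
        }
      }
    }
  ]
}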

Changing Access Tiers with the SDK

Let's see how we can use the Azure Blob Storage SDK, which is available in the Azure.Storage.Blobs NuGet package to move blobs between access tiers.

We'll start off by creating a container client to work with:

// from the Azure.Storage.Blobs NuGet package
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

var service = new BlobServiceClient(connectionString);
var containerClient = service.GetBlobContainerClient("mycontainer");

Example 1 - Directly Creating a Blob in the Archive Tier

In this example we'll directly create a new blob (archive.txt) and put it into the Archive access tier directly. You might do this if you are backing up large amounts of data for long-term storage that you don't expect to need to access again.

var archiveBlobClient = containerClient.GetBlobClient("archive.txt");
if(!await archiveBlobClient.ExistsAsync())
{
    var ms = new MemoryStream(Encoding.UTF8.GetBytes("Archive"));
    var uploadOptions = new BlobUploadOptions() 
        { AccessTier = AccessTier.Archive };
    await archiveBlobClient.UploadAsync(ms, uploadOptions);
}

Example 2 - Moving an Existing Blob to the Archive Tier

In this example we're going to use a blob called moveToArchive.txt. If it doesn't exist, we'll create it (in the default Hot access tier), but then immediately ask for it to be archived by calling SetAccessTierAsync.

If it does exist, we'll simply find out what tier it is in by calling GetPropertiesAsync and checking the value of AccessTier. Moving a blob into Archive should happen immediately.

var moveToArchiveBlobClient = containerClient.GetBlobClient("moveToArchive.txt");
if (!await moveToArchiveBlobClient.ExistsAsync())
{
    // upload new file to hot access tier
    var ms = new MemoryStream(Encoding.UTF8.GetBytes("Move to Archive"));
    await moveToArchiveBlobClient.UploadAsync(ms);
    
    // then move it into the Archive access tier
    await moveToArchiveBlobClient.SetAccessTierAsync(AccessTier.Archive);
}
else
{
    // already exists - check what tier it is in
    var props = await moveToArchiveBlobClient.GetPropertiesAsync();
    if (props.Value.AccessTier == "Archive")
    {
        Console.WriteLine("File has successfully been moved to Archive");
    }
}

Example 3 - Rehydrate a blob from Archive

This example is a little more involved. We're going to create a blob called rehydrateme.txt in the archive tier. If the blob already exists, we'll check what access tier it is in. If it is still in the Archive tier, and we haven't requested rehydration yet, we'll do so by calling SetAccessTierAsync.

We can tell if a rehydration is in progress by checking the ArchiveStatus property. This will have the value rehydrate-pending-to-hot or rehydrate-pending-to-cool. In that case we simply need to wait until the rehydration completes and the AccessTier property changes from Archive.

We can also inspect the value of AccessTierChangedOn to know when the access tier last changed for the blob.

var rehydrateBlobClient = containerClient.GetBlobClient("rehydrateme.txt");
if (!await rehydrateBlobClient.ExistsAsync())
{
    Console.WriteLine("Creating file in archive to rehydrate");
    var ms = new MemoryStream(Encoding.UTF8.GetBytes("Rehydrate"));
    // upload directly as archive
    var uploadOptions = new BlobUploadOptions() 
        { AccessTier = AccessTier.Archive };
    await rehydrateBlobClient.UploadAsync(ms, uploadOptions);
}
else
{
    // already exists - check what tier it is in
    var props = await rehydrateBlobClient.GetPropertiesAsync();
    if (props.Value.AccessTier == "Archive")
    {
        if (props.Value.ArchiveStatus == null)
        {
            Console.WriteLine("Requesting rehydrate");
            await rehydrateBlobClient.SetAccessTierAsync(AccessTier.Hot);
        }
        else
        {
            Console.WriteLine($"Still rehydrating... {props.Value.ArchiveStatus}, changed {props.Value.AccessTierChangedOn}");
        }
    }
    else
    {
        Console.WriteLine($"Rehydrated blob is now in {props.Value.AccessTier}");
    }
}

When I ran this to test it out, the blob was still rehydrating several hours after I changed access tiers. I'm planning to do some more tests to find out what the average duration of a rehydration is, but it does appear that you should be expecting hours rather than minutes.

According to the documentation, it can take up to 15 hours. If you need it faster, you can pass a RehydratePriority of High to the SetAccessTierAsync method, which should mean that your blob is available within an hour.
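That looks something like this, using the optional rehydratePriority parameter on SetAccessTierAsync:

// Request a high-priority rehydration back to the Hot tier
await rehydrateBlobClient.SetAccessTierAsync(
    AccessTier.Hot,
    rehydratePriority: RehydratePriority.High);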

What happens when we try to read a blob in the Archive tier?

Finally, you might be wondering what happens if you have existing code that tries to read the contents of a blob in the Archive access tier. The answer is that it will get a 409 error back ("This operation is not permitted on an archived blob").

So for example, this code which directly uses the blob client to read a blob in the Archive access tier will throw a RequestFailedException:

try
{
    var b = await archiveBlobClient.DownloadAsync();
    using var s = new StreamReader(b.Value.Content);
    var contents = await s.ReadToEndAsync();
    Console.WriteLine("Contents of archive blob: " + contents);
}
catch (RequestFailedException e)
{
    Console.WriteLine("FAILED TO DOWNLOAD ARCHIVE BLOB: " + e.Message);
}

And this code which generates a readable SAS Uri for a blob in archive, and then attempts to use a HttpClient to download the contents of that blob will get a HttpRequestException with a 409 error.

// generate a SAS Uri
var accountName = Regex.Match(connectionString, "AccountName=([^;]+);").Groups[1].Value;
var accountKey = Regex.Match(connectionString, "AccountKey=([^;]+);").Groups[1].Value;
var cred = new StorageSharedKeyCredential(accountName, accountKey);
var sasBuilder = new BlobSasBuilder();
sasBuilder.BlobContainerName = archiveBlobClient.BlobContainerName;
sasBuilder.BlobName = archiveBlobClient.Name;
sasBuilder.SetPermissions(BlobSasPermissions.Read);
sasBuilder.ExpiresOn = DateTimeOffset.Now.AddHours(1);
var qparams = sasBuilder.ToSasQueryParameters(cred);
var sasUri = $"{archiveBlobClient.Uri}?{qparams}";

// now try to read from the archive SAS Uri
var h = new HttpClient();
try
{
    var contentSas = await h.GetStringAsync(sasUri);
    Console.WriteLine("SAS content of archive blob" + contentSas);	
}
catch (HttpRequestException hrx)
{
    // will get a 409 error
    Console.WriteLine("FAILED TO DOWNLOAD ARCHIVE SAS: " + hrx.Message);
}

Summary

The Azure Blob Storage SDK makes it very simple to move files in and out of different Access tiers. This gives you the potential for substantial cost savings if you are storing vast amounts of data that you rarely need to access. However, you may need to rework your application to cope with the delay of moving blobs out of the Archive tier.