
This follows on from my previous post about how to upload blobs using the Azure V12 Blob Storage SDK. It also updates a post I created a couple of years back about how to efficiently copy blobs between containers.

There are several approaches you can take to copy blobs, and some of them are much more efficient than others. Let's look at four scenarios in this post:

  1. Copying by reading and writing (the slow way!)
  2. Fast copying within a storage account
  3. Copying between storage accounts
  4. Copying to a writable SAS Uri

Setup

To get us started, let's see some code that gets us blob clients for two blobs that are in the same storage account, but different containers. One is the source blob, and the other represents where we want to copy it to:

using Azure.Storage.Blobs;              // BlobServiceClient
using Azure.Storage.Blobs.Specialized;  // BlockBlobClient

var connectionString = "your storage connection string";
var service = new BlobServiceClient(connectionString);
var sourceContainerName = "container1";
var targetContainerName = "container2";
var sourcePath = "blob1.zip";
var targetPath = "blob1-copy.zip";
var sourceBlob = new BlockBlobClient(connectionString, sourceContainerName, sourcePath);
var targetBlob = new BlockBlobClient(connectionString, targetContainerName, targetPath);

Example 1 - Copying by reading and writing

You might be tempted to implement the copy using something like this:

// Streams the whole blob down to the client and back up again
using var sourceStream = await sourceBlob.OpenReadAsync();
await targetBlob.UploadAsync(sourceStream);

This works just fine, but notice what you are doing. You are downloading the entire contents of the source blob from blob storage and then re-uploading it, which isn't particularly efficient.

This kind of approach should only be taken when the source file is not in Azure Blob Storage. For example, if you were copying a file from an Amazon S3 bucket, you could generate a presigned URL (similar to a SAS) and download it with an HttpClient:

var uri = new Uri("my-aws-s3-presigned-url");
// httpClient is an existing HttpClient instance
using var s = await httpClient.GetStreamAsync(uri);
await targetBlob.UploadAsync(s);

However, assuming you are copying from and to Azure blob storage, there are much faster ways to copy blobs.

Example 2 - Fast copying within a storage account

What is the faster technique? Well, the StartCopyFromUriAsync method can help us here. This can perform an instantaneous copy if the source and target are both in the same storage account, which is a huge timesaver.

This method accepts the URI of the "source file". If the source file is in the same storage account, this can simply be the Uri of the source blob - as no additional credentials are needed:

await targetBlob.StartCopyFromUriAsync(sourceBlob.Uri);

Example 3 - Copying between storage accounts

If the source file is from another storage account, then you need to generate a readable SAS Uri, to allow the target to copy from the file. That would look something like this:

var sourceBlobReadSas = sourceBlob.GenerateSasUri(BlobSasPermissions.Read, 
    DateTimeOffset.Now.AddHours(2));
await targetBlob.StartCopyFromUriAsync(sourceBlobReadSas);

Note that StartCopyFromUriAsync will copy the blob contents, but if you want to also copy things like metadata, tags and access tier, then you should pass in an instance of BlobCopyFromUriOptions, similar to what we did in my previous post about uploading blobs.
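
As a rough sketch of what passing copy options might look like, here's one way to do it. The helper names BuildCopyOptions and StartCopyWithOptionsAsync, and the metadata, tag and tier values, are made up for illustration, so substitute whatever you actually want on the destination blob:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Azure.Storage.Blobs.Models;
using Azure.Storage.Blobs.Specialized;

// Hypothetical helper: the values below are placeholders, not anything the SDK requires
static BlobCopyFromUriOptions BuildCopyOptions() => new BlobCopyFromUriOptions
{
    // Metadata to set on the destination blob
    Metadata = new Dictionary<string, string> { ["origin"] = "container1/blob1.zip" },
    // Blob index tags to set on the destination blob
    Tags = new Dictionary<string, string> { ["project"] = "demo" },
    // Access tier for the destination blob
    AccessTier = AccessTier.Cool
};

// Hypothetical helper: starts the copy with the options above
static async Task StartCopyWithOptionsAsync(BlockBlobClient targetBlob, Uri sourceUri)
{
    await targetBlob.StartCopyFromUriAsync(sourceUri, BuildCopyOptions());
}
```

If you want the destination to mirror the source, you could read the source blob's properties and tags first and feed those into the options instead of hard-coded values.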

The method is called StartCopyFromUriAsync because the copy doesn't necessarily finish instantaneously, which is what happens when you copy across storage accounts. In that case, you need to wait for the pending copy to complete.

Here's a simple method I created that waits for the copy to complete, by polling the status every second:

async Task WaitForPendingCopyToCompleteAsync(BlockBlobClient toBlob)
{
    var props = await toBlob.GetPropertiesAsync();
    while (props.Value.BlobCopyStatus == CopyStatus.Pending)
    {
        await Task.Delay(1000);
        props = await toBlob.GetPropertiesAsync();
    }

    if (props.Value.BlobCopyStatus != CopyStatus.Success)
    {
        throw new InvalidOperationException($"Copy failed: {props.Value.BlobCopyStatus}");
    }
}
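
As an alternative to hand-rolled polling, StartCopyFromUriAsync returns a CopyFromUriOperation, which you can await directly. This is only a sketch: CopyAndWaitAsync is a hypothetical helper name, and it's worth checking the SDK docs for how a failed or aborted copy is reported before relying on it.

```csharp
using System;
using System.Threading.Tasks;
using Azure.Storage.Blobs.Models;
using Azure.Storage.Blobs.Specialized;

// Hypothetical helper: starts a copy and lets the SDK poll until the operation completes
async Task CopyAndWaitAsync(BlockBlobClient targetBlob, Uri sourceUri)
{
    CopyFromUriOperation operation = await targetBlob.StartCopyFromUriAsync(sourceUri);
    // Polls the target blob on our behalf until the operation reports completion
    await operation.WaitForCompletionAsync();
}
```

The hand-written helper has the advantage that it makes the success check explicit, which is why I've kept it for the rest of this post.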

Example 4 - Copying to a writable SAS Uri

Finally, let's consider the situation where we don't have the connection string to the target blob, only a writable SAS Uri. Here's an example of how that writable SAS Uri might get created:

var targetUri = targetBlob.GenerateSasUri(BlobSasPermissions.Write |
        BlobSasPermissions.Create |
        BlobSasPermissions.List |
        BlobSasPermissions.Read,
        DateTimeOffset.Now.AddHours(2));

Now we can construct a BlockBlobClient from the writable SAS, and use StartCopyFromUriAsync passing in the source Uri (which needs to be a readable SAS Uri). Then we can use our WaitForPendingCopyToCompleteAsync helper method seen above.

async Task CopySasToSasUri(Uri sourceUri, Uri targetUri)
{
    var toBlob = new BlockBlobClient(targetUri);
    await toBlob.StartCopyFromUriAsync(sourceUri); 
    await WaitForPendingCopyToCompleteAsync(toBlob);
}

Note that we must have Read permissions on the target SAS or we won't be able to read its properties to find out when the copy completed.

Bonus - renaming blobs

Hopefully this post helps you save time by copying blobs the quick way. One interesting side note is that although there is no "rename" feature in the Azure Blob Storage SDK, you can achieve it very quickly with a copy (using StartCopyFromUriAsync) followed by a delete. The copy is instantaneous because the source and target are in the same storage account.
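
Here's a minimal sketch of a rename done as copy-then-delete. RenameBlobAsync is a hypothetical helper name, and I'm assuming both names live in the same container and that you have permission to delete the original:

```csharp
using System;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;
using Azure.Storage.Blobs.Specialized;

// Hypothetical helper: "renames" a blob by copying it to the new name, then deleting the old one
async Task RenameBlobAsync(BlobContainerClient container, string oldName, string newName)
{
    var source = container.GetBlockBlobClient(oldName);
    var target = container.GetBlockBlobClient(newName);

    await target.StartCopyFromUriAsync(source.Uri);

    // Poll once a second until the copy is no longer pending
    var props = await target.GetPropertiesAsync();
    while (props.Value.BlobCopyStatus == CopyStatus.Pending)
    {
        await Task.Delay(1000);
        props = await target.GetPropertiesAsync();
    }

    if (props.Value.BlobCopyStatus != CopyStatus.Success)
    {
        throw new InvalidOperationException($"Copy failed: {props.Value.BlobCopyStatus}");
    }

    // Only remove the original once the copy is known to have succeeded
    await source.DeleteAsync();
}
```

Deleting only after a confirmed success means a failed copy leaves the original blob untouched.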

Another thing worth mentioning is that the official docs recommend taking out a blob lease while you do a copy. This is to prevent other clients from modifying the blob while it is being copied. That wasn't an issue for me, since the source blobs I am copying will never be modified after initially being uploaded.
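
If you do need that protection, a sketch of the lease approach might look something like this. CopyWithSourceLeaseAsync is a hypothetical helper name, and I'm assuming a same-account copy where the source client has permission to acquire leases:

```csharp
using System;
using System.Threading.Tasks;
using Azure.Storage.Blobs.Models;
using Azure.Storage.Blobs.Specialized;

// Hypothetical helper: holds a lease on the source blob for the duration of the copy
async Task CopyWithSourceLeaseAsync(BlockBlobClient sourceBlob, BlockBlobClient targetBlob)
{
    BlobLeaseClient lease = sourceBlob.GetBlobLeaseClient();
    // A negative duration requests a lease that does not expire on its own
    await lease.AcquireAsync(TimeSpan.FromSeconds(-1));
    try
    {
        await targetBlob.StartCopyFromUriAsync(sourceBlob.Uri);

        // Poll once a second until the copy is no longer pending
        var props = await targetBlob.GetPropertiesAsync();
        while (props.Value.BlobCopyStatus == CopyStatus.Pending)
        {
            await Task.Delay(1000);
            props = await targetBlob.GetPropertiesAsync();
        }

        if (props.Value.BlobCopyStatus != CopyStatus.Success)
        {
            throw new InvalidOperationException($"Copy failed: {props.Value.BlobCopyStatus}");
        }
    }
    finally
    {
        // Release the lease whether or not the copy succeeded
        await lease.ReleaseAsync();
    }
}
```

The try/finally is there so that a failed copy doesn't leave the source blob locked indefinitely.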
