
It's only been a month since I released my Durable Functions Fundamentals course on Pluralsight, but it's great to see that the platform is continuing to evolve and pick up new features.

Version 1.5 was released yesterday, so I thought I'd highlight some of my favourite updates since the course came out.

Durable Functions and Azure Functions v2

Azure Functions version 2 has been in development for some time and is hopefully close to going live, but it isn't quite finished yet. So for my course I created a sample app with Azure Functions v1 and then ported it to Azure Functions v2.

The changes between the two were relatively minor, but one gotcha with Durable Functions and Azure Functions v2 revolves around the CreateCheckStatusResponse method of DurableOrchestrationClient.

Here's my starter function that kicks off a new orchestration. Notice that it takes an HttpRequestMessage parameter and returns an HttpResponseMessage:

[FunctionName("ProcessVideoStarter")]
public static async Task<HttpResponseMessage> Run(
    [HttpTrigger(AuthorizationLevel.Function, "get", Route = null)]
    HttpRequestMessage req,
    [OrchestrationClient] DurableOrchestrationClient starter,
    TraceWriter log)
{
    string video = req.RequestUri.ParseQueryString()["video"];

    if (video == null)
    {
        return req.CreateResponse(HttpStatusCode.BadRequest,
            "Please pass the video location in the query string");
    }

    log.Info($"About to start orchestration for {video}");

    var orchestrationId = await starter.StartNewAsync("O_ProcessVideo", video);

    return starter.CreateCheckStatusResponse(req, orchestrationId);
}

This compiles just fine in Azure Functions v2, but if you use the templates to create a new HTTP-triggered function, you'll end up with a function that takes an HttpRequest and returns an IActionResult. That prevents you from using the CreateCheckStatusResponse method in these new-style HTTP-triggered functions.

The good news is that there is now a CreateHttpManagementPayload method that generates a JSON payload containing the correct URIs for this orchestration's management APIs, which you can use for the response. So here's an updated version of my starter function that uses the new-style function signature:

[FunctionName("ProcessVideoStarter")]
public static async Task<IActionResult> Run(
    [HttpTrigger(AuthorizationLevel.Function, "get", Route = null)]
    HttpRequest req,
    [OrchestrationClient] DurableOrchestrationClient starter,
    TraceWriter log)
{
    // note: the GetQueryParameterDictionary indexer throws for a missing key;
    // req.Query yields null when the parameter is absent, so the null check works
    string video = req.Query["video"];

    if (video == null)
    {
        return new BadRequestObjectResult(
            "Please pass the video location in the query string");
    }

    log.Info($"About to start orchestration for {video}");

    var orchestrationId = await starter.StartNewAsync("O_ProcessVideo", video);
    var payload = starter.CreateHttpManagementPayload(orchestrationId);
    return new OkObjectResult(payload);
}

Custom Orchestration Status

Another great recently added feature is the ability to store a custom orchestration status against any running orchestration. The orchestrator function can set this value by calling SetCustomStatus on the DurableOrchestrationContext.

When you use the REST API to query the status of a running orchestration, you get to see the value of this custom status, making it great for tracking how far through the workflow you are.

For example, you might update it with a simple string just before calling an activity or sub-orchestration:

ctx.SetCustomStatus("sending approval request email");

This would result in the following when we queried the status for this instance:

{
  "instanceId": "741b87080ed74430a17863d9ee437101",
  "runtimeStatus": "Running",
  "input": "example.mp4",
  "customStatus": "sending approval request email",
  "output": null,
  "createdTime": "2018-06-22T10:54:50Z",
  "lastUpdatedTime": "2018-06-22T10:54:58Z"
}

But you're not limited to strings. You can store any serializable object in there, so it could include the IDs of sub-orchestrations that have been kicked off, the URLs of files created in blob storage, or the IDs of rows added to a database. It's a simple feature, but very powerful if you want to make your workflows easier to diagnose.
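For example, you could store a small progress object (a hypothetical sketch: the shape of the object and the property names here are my own invention, and `transcodeId` is assumed to have been captured earlier in the orchestrator):

```csharp
// Inside an orchestrator function; ctx is the DurableOrchestrationContext.
// Any JSON-serializable object works - this anonymous type is just an illustration.
ctx.SetCustomStatus(new
{
    stage = "transcoding",
    transcodeOrchestrationId = transcodeId,
    thumbnailUrl = "https://myaccount.blob.core.windows.net/thumbs/example.png"
});
```

The whole object then appears under `customStatus` when you query the instance status.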

Enumerating Instances

Another great new feature in 1.5 is that we can now call an API (or use the GetStatusAsync method on DurableOrchestrationClient) to retrieve details of all orchestrations stored in the task hub. This lets you discover the orchestration IDs of any running orchestrations if you've failed to keep track of them.
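Here's a sketch of how you might use it (the function name and logging are my own; I'm assuming the new parameterless GetStatusAsync overload introduced in 1.5):

```csharp
// Hypothetical HTTP-triggered function that lists every orchestration
// instance in the task hub - handy when you've lost track of the IDs.
[FunctionName("ListOrchestrations")]
public static async Task<IActionResult> Run(
    [HttpTrigger(AuthorizationLevel.Function, "get")] HttpRequest req,
    [OrchestrationClient] DurableOrchestrationClient client,
    TraceWriter log)
{
    var statuses = await client.GetStatusAsync(); // all instances in the task hub
    foreach (var status in statuses)
    {
        log.Info($"{status.InstanceId}: {status.RuntimeStatus}");
    }
    return new OkObjectResult(statuses);
}
```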

I think that management APIs like this will be vital in persuading developers to give Durable Functions a go in production. However, the API needs to go further and support paging and filtering of the orchestrations, as there could be a vast number of them if Durable Functions is being used heavily. It would also be good to be able to purge the event sourcing history of old orchestrations from the task hub, either simply to clean up and save space, or as part of a required data retention cleanup.

I've submitted a GitHub issue with my desired enhancements to this feature, so do get involved in the discussion if you think this is something that would be useful to you as well.

JavaScript support

Finally, I have to mention that it seems like great progress is being made on adding JavaScript support to Durable Functions. It's not a feature I've tried myself yet, but there are some sample apps which give you a good feel for the required syntax in a JavaScript orchestrator function. Hopefully supporting more languages will result in Durable Functions being embraced by a wider pool of developers.

Want to learn more about how easy it is to get up and running with Durable Functions? Be sure to check out my Pluralsight course Azure Durable Functions Fundamentals.


I started creating NAudio back in 2002, using v1.0 of the .NET Framework and developing on the open source SharpDevelop IDE.

Of course, a huge amount has changed in the .NET world since then. However, since NAudio was heavily used in commercial applications running on Windows XP, I was always reluctant to depend on newer .NET features that would cause us to have to stop supporting legacy versions of Windows, or to maintain two different versions of NAudio.

So NAudio remained for a long time on .NET 2.0, before eventually upgrading to .NET 3.5. This means it has not yet taken advantage of Task and async/await, and no .NET Standard version has been built, which means it can't be used from .NET Core.

Part of the reason for this is simply that NAudio is very Windows-centric. A large part of the codebase consists of P/Invoke or COM interop wrappers around the various Windows audio APIs. So even if a .NET Standard build were to be created, much of the functionality would fail to work if you tried to use it in a .NET Core app running on Linux.

Having said that, there are a fair number of general-purpose utility classes in NAudio that would be usable cross-platform, and it seems a shame to block .NET Core apps from using them, especially when they are running on Windows. So over my Easter holidays I started to see what it would take to make a .NET Standard version of NAudio.

New csproj format

The first step was moving to the new csproj file format. I really love this new format, and it brings a lot of benefits. You don't need to explicitly include source files - they are picked up automatically. It has a nicer way of specifying NuGet dependencies, and it can also contain all the metadata needed to define the project as a NuGet package. It even supports the unsafe code blocks I need, via a simple AllowUnsafeBlocks property.

Here's the first part of my csproj file, specifying three target frameworks (more on that later!), and the metadata for the NuGet package:

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFrameworks>netstandard2.0;net35;uap10.0.10240</TargetFrameworks>
    <Version>1.9.0-preview1</Version>
    <Authors>Mark Heath &amp; Contributors</Authors>
    <AllowUnsafeBlocks>true</AllowUnsafeBlocks>
    <Description>NAudio, an audio library for .NET</Description>
    <PackageLicenseUrl>https://github.com/naudio/NAudio/blob/master/license.txt</PackageLicenseUrl>
    <PackageProjectUrl>https://github.com/naudio/NAudio</PackageProjectUrl>
    <PackageTags>C# .NET audio sound</PackageTags>
    <RepositoryUrl>https://github.com/naudio/NAudio</RepositoryUrl>
    <Copyright>© Mark Heath 2018</Copyright>
    <GeneratePackageOnBuild>true</GeneratePackageOnBuild>
    <GenerateDocumentationFile Condition=" '$(Configuration)' == 'Release' ">true</GenerateDocumentationFile>
  </PropertyGroup>

Target Frameworks

I started off by trying to target the lowest version of .NET Standard I could get away with - .NET Standard 1.6 - as I'd had some success with a proof-of-concept attempt in the past. But I soon found there was just too much to fix up, and that .NET Standard 2.0 was going to be much easier.

But I also wanted to see if I could still produce a .NET 3.5 build, so I added the net35 target framework moniker. This resulted in some code that could compile for .NET 3.5 and not for .NET Standard 2.0 and vice versa.

There were two ways of handling this. First, I could exclude whole C# files that weren't going to work on a given target framework. In .NET Standard, for example, the WinForms user interface components were never going to work, and a number of the input and output device implementations used things like Windows Forms objects or the Windows Registry, so they couldn't easily be made to compile for .NET Standard 2.0 either. Here's how to exclude certain files for a particular framework:

  <ItemGroup Condition=" '$(TargetFramework)' == 'netstandard2.0' ">
    <Compile Remove="Utils\ProgressLog*.*" />
    <Compile Remove="Gui\*.*" />
    <Compile Remove="Wave\MmeInterop\WaveWindow.cs" />
    <Compile Remove="Wave\MmeInterop\WaveCallbackInfo.cs" />
    <Compile Remove="Wave\WaveInputs\WaveIn.cs" />
    <Compile Remove="Wave\WaveOutputs\WaveOut.cs" />
    <Compile Remove="Wave\WaveOutputs\AsioOut.cs" />
    <Compile Remove="Wave\WaveOutputs\AsioAudioAvailableEventArgs.cs" />
    <Compile Remove="Wave\WaveFormats\WaveFormatCustomMarshaler.cs" />
  </ItemGroup>

I also needed to reference System.Windows.Forms for the .NET 3.5 target framework only, which can be done like this:

  <ItemGroup Condition=" '$(TargetFramework)' == 'net35' ">
    <Reference Include="System.Windows.Forms" />
  </ItemGroup>

There were still some individual bits of code I needed to exclude depending on which target framework I was building. For this, I used the built-in conditional compilation symbols, which are NET35 and NETSTANDARD2_0.

A fair amount of marshaling code needed to switch based on the target. Here's a simple example:

public static int SizeOf<T>()
{
#if NET35
    return Marshal.SizeOf(typeof(T));
#else
    return Marshal.SizeOf<T>();
#endif
}

One of the biggest challenges at this point was making sense of the thousands of error messages I was getting. Unfortunately, Visual Studio doesn't seem to offer any way to build for just one target framework, so you end up with errors for different target frameworks mixed together, and it can be confusing to know which is which. Add to that the fact that ReSharper also struggles with multi-targeting, and it was quite a frustrating experience chasing out all the compile errors.
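One mitigation is to drop to the command line, where the dotnet CLI can restrict a build to a single target framework via its -f / --framework option (the project path here is assumed):

```shell
# Build one target framework at a time so the errors aren't interleaved
dotnet build NAudio/NAudio.csproj -f netstandard2.0
dotnet build NAudio/NAudio.csproj -f net35
```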

There is a dropdown in the top-left of the VS 2017 code editor window that lets you see the syntax highlighting for whichever target framework you choose, which is handy for checking which code inside #if blocks is active.

UWP

Over the past few years I've maintained a rather hacky and incomplete build of NAudio for Windows 8, WinRT, and Universal Windows apps, and I wanted to see if I could target that as well. The framework moniker is uap10.0, which I'm sure I had compiling successfully at Easter, but for whatever reason I can't make it compile at all on my new dev machine, so for now I've specifically targeted uap10.0.10240, which is the first version of Windows 10.

Obviously lots of NAudio has to be excluded to target UWP, and I also had to reference the Microsoft.NETCore.UniversalWindowsPlatform NuGet package, as well as MSBuild.Sdk.Extras, whose targets file I then imported. To be honest, I don't really understand exactly what this does or why it was necessary; I was really floundering at this point until I found a project by Oren Novotny and copied the csproj from there (it has since changed to be .NET Standard 2.0 only).

Here's my UWP specific parts of the project file:

  <ItemGroup Condition=" '$(TargetFramework)' == 'uap10.0.10240' ">
    <Compile Remove="Utils\ProgressLog*.*" />
    <Compile Remove="Gui\*.*" />
    <Compile Remove="Wave\Compression\*.cs" />
    <Compile Remove="Wave\Asio\*.cs" />
    <Compile Remove="Wave\MmeInterop\WaveWindow.cs" />
    <Compile Remove="Wave\MmeInterop\WaveCallbackInfo.cs" />
    <Compile Remove="Wave\Midi\MidiInterop.cs" />
    <Compile Remove="Wave\WaveInputs\WaveIn.cs" />
    <Compile Remove="Wave\WaveInputs\WaveInEvent.cs" />
    <Compile Remove="Wave\WaveInputs\WasapiCapture.cs" />
    <Compile Remove="Wave\WaveInputs\WasapiLoopbackCapture.cs" />
    <Compile Remove="Wave\WaveOutputs\WaveOut.cs" />
    <Compile Remove="Wave\WaveOutputs\WaveOutEvent.cs" />
    <Compile Remove="Wave\WaveOutputs\DirectSoundOut.cs" />
    <Compile Remove="Wave\WaveOutputs\WasapiOut.cs" />
    <Compile Remove="Wave\WaveOutputs\MediaFoundationEncoder.cs" />
    <Compile Remove="Wave\WaveOutputs\AsioOut.cs" />
    <Compile Remove="Wave\WaveOutputs\AsioAudioAvailableEventArgs.cs" />
    <Compile Remove="Wave\WaveStreams\AudioFileReader.cs" />
    <Compile Remove="Wave\WaveStreams\Mp3FileReader.cs" />
    <Compile Remove="Wave\WaveStreams\WaveFormatConversionProvider.cs" />
    <Compile Remove="Wave\WaveStreams\WaveFormatConversionStream.cs" />
    <Compile Remove="Wave\WaveFormats\WaveFormatCustomMarshaler.cs" />
    <Compile Remove="Wave\WaveProviders\MediaFoundationResampler.cs" />
    <Compile Remove="FileFormats\Mp3\Mp3FrameDecompressor.cs" />
  </ItemGroup>

  <ItemGroup Condition=" '$(TargetFramework)' == 'uap10.0' or '$(TargetFramework)' == 'uap10.0.10240'">
    <PackageReference Include="Microsoft.NETCore.UniversalWindowsPlatform" Version="6.0.8" />
    <PackageReference Include="MSBuild.Sdk.Extras" Version="1.5.4" PrivateAssets="All" />
  </ItemGroup>
  <Import Project="$(MSBuildSDKExtrasTargets)" Condition="Exists('$(MSBuildSDKExtrasTargets)')" />

I also had to add a lot more code exclusions, as UWP is missing lots of small attributes that were part of my interop signatures. For example:

    [ComImport,
#if !WINDOWS_UWP
    System.Security.SuppressUnmanagedCodeSecurity,
#endif
    Guid("59eff8b9-938c-4a26-82f2-95cb84cdc837"),
    InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]

Success?

With this in place, I was eventually able to get NAudio building, outputting a single NuGet package containing a .NET 3.5 dll, a .NET Standard DLL, and a UAP DLL.

A quick test revealed that this worked great - I could add it to regular .NET Framework apps, and use all the WinForms stuff if I wanted. I could also use it from a .NET Core console app and play audio with the WaveOutEvent class (on Windows of course)! And I could use it from a UWP application as well.

So mission accomplished, perhaps? Well, I have one reservation about what I've done so far, and that's whether it's worth having the UWP target framework. With this target, it's not possible to build the NAudio solution on Windows 7, or on any machine without the UWP 10240 SDK installed. This would be a pain for contributors who want to make pull requests or experiment locally.

I could of course go back to separate solutions and a nuspec file to piece it all together. But given how experimental the UWP code still is, and the fact that newer versions of UWP now support .NET Standard 2.0 anyway, I'm wondering whether I'd be better off just targeting .NET Standard 2.0 and .NET 3.5, producing an entirely separate assembly for the few UWP-specific classes I have, and requiring that you use it on versions of Windows 10 that support .NET Standard 2.0. Let me know in the comments if you have a strong preference either way.

Try it out!

If you want to be nosey and see what I'm doing, the code is available on GitHub in the netstandard branch, and I've pushed a 1.9.0-preview1 package to NuGet, which you're encouraged to try; please report your results.

What's Next?

My idea is that NAudio 1.9 will support both .NET Standard 2.0 and .NET 3.5 (I'm not sure yet about UWP), and that it will be the last version in which I avoid making breaking changes to the public API.

For NAudio 2.0, I'd like to modernize the interfaces for playback and recording, taking advantage of Span<T> and async/await for a much more modern coding style. I'd also split the behaviour out into more packages. The core NAudio 2.0 package would contain .NET Standard cross-platform-friendly code, but you'd then add additional NuGet packages for WaveIn/WaveOut, WASAPI, or ASIO. There'd be another for UWP.

Of course, this assumes I manage to find some free time to work on NAudio 2.0, which realistically there won't be much of, but that's at least an idea of where I'd like to go next with the project.

Want to get up to speed with the fundamental principles of digital audio and how to go about writing audio applications with NAudio? Be sure to check out my Pluralsight courses, Digital Audio Fundamentals and Audio Programming with NAudio.


Ten years ago I blogged that one of my most wanted C# language features was the ability to perform reinterpret casts between different array types (e.g. cast a byte[] to a float[]). This is something you frequently need to do in audio programming, where performance matters and you want to avoid unnecessary copies or memory allocations.

NAudio has used a trick involving explicit struct offsets for some time, but it does have some gotchas and I've always held out hope that one day we'd get proper language support for doing this.

Span<T>

So I'm very happy that in .NET Core 2.1, the new Span<T> functionality gives me exactly what I wanted. It's very exciting to see the significant performance optimisations this is already bringing to ASP.NET Core and wider parts of the .NET framework.

I was keen to try out Span<T> to see if it could be used in NAudio, and so while I was at the MVP Summit in March, I put together a quick proof of concept, using an early beta release of the System.Memory functionality. I was privileged to meet Krzysztof Cwalina while I was there who was able to give me some pointers for how to use the new functionality.

I've now updated my app to use the final released bits, and published the code to GitHub, so here's a quick runthrough of the changes I made and their benefits.

IWaveProvider and ISampleProvider

The two main interfaces in NAudio that define a class that can provide a stream of audio are IWaveProvider and ISampleProvider. IWaveProvider allows you to read audio into a byte array, and so is flexible enough to cover audio in any format. ISampleProvider is for when you are dealing exclusively with IEEE floating point samples, which is typically what you want to use whenever you are performing any mixing or audio manipulation with audio streams.

Both interfaces are very simple. They report the WaveFormat of the audio they provide, and define a Read method, to which you pass an array that you want audio to be written into. This is of course for performance reasons. You don't want to be allocating new memory buffers every time you read some audio as this will be happening many times every second during audio playback.

public interface IWaveProvider
{
    WaveFormat WaveFormat { get; }
    int Read(byte[] buffer, int offset, int count);
}

public interface ISampleProvider
{
    WaveFormat WaveFormat { get; }
    int Read(float[] buffer, int offset, int count);
}

Notice that both Read methods take an offset parameter. This is because in some circumstances, the start of the buffer is already filled with audio, and we don't want the new audio to overwrite it. The count parameter specifies how many elements we want to be written into the buffer, and the Read method returns how many elements were actually written into the buffer.
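As a concrete illustration of those semantics (a hypothetical consumer, not NAudio code; `sampleProvider` is assumed to be an ISampleProvider), a caller that wants a full buffer typically loops, passing the running total as the offset:

```csharp
// Read one second of audio, looping because a single Read call
// may return fewer samples than requested.
var buffer = new float[sampleProvider.WaveFormat.SampleRate * sampleProvider.WaveFormat.Channels];
int totalRead = 0;
while (totalRead < buffer.Length)
{
    // offset = samples already written; count = samples still wanted
    int read = sampleProvider.Read(buffer, totalRead, buffer.Length - totalRead);
    if (read == 0) break; // end of stream
    totalRead += read;
}
```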

So what does this look like if we take advantage of Span<T>? Well, it eliminates the need for an offset and a count, as a Span<T> already encapsulates both concepts.

The updated interfaces look like this:

public interface IWaveProvider
{
    WaveFormat WaveFormat { get; }
    int Read(Span<byte> buffer);
}

public interface ISampleProvider
{
    WaveFormat WaveFormat { get; }
    int Read(Span<float> buffer);
}

This not only simplifies the interface, but it greatly simplifies the implementation, as the offset doesn't need to be factored into every read or write from the buffer.

Creating Spans

There are several ways to create a Span<T>. You can go from a regular managed array to a Span, specifying the desired offset and number of elements:

var buffer = new float[WaveFormat.SampleRate * WaveFormat.Channels];
// create a Span based on this buffer
var spanBuffer = new Span<float>(buffer, offset, samplesRequired);

You can also create a Span based on unmanaged memory. This is used by the WaveOutBuffer class, because the buffer is passed to some Windows APIs that expect the memory pointer to remain valid after the API call completes. That means we can't risk passing a pointer to a managed array, as the garbage collector could move the memory at any time.

In this example, we allocate some unmanaged memory with Marshal.AllocHGlobal, and then create a new Span based on it. Unfortunately, there is no Span constructor taking an IntPtr, forcing us to use an unsafe code block to turn the IntPtr into a void*.

var bufferPtr = Marshal.AllocHGlobal(bufferSize);
// ...
Span<byte> span;
unsafe
{
    span = new Span<byte>(bufferPtr.ToPointer(), bufferSize);
}

It's also possible to create a new Span from an existing Span. For example, in the original implementation of OffsetSampleProvider, we need to read samplesRequired samples into an array called buffer, into an offset we've calculated from the original offset we were passed plus the number of samples we've already written into the buffer:

var read = sourceProvider.Read(buffer, offset + samplesRead, samplesRequired);

But the Span<T> implementation uses Slice to create a new Span of the desired length (samplesRequired), and from the desired offset (samplesRead) into the existing Span. The fact that our existing Span already starts in the right place eliminates the need for us to add on an additional offset, eliminating a common cause of bugs.

var read = sourceProvider.Read(buffer.Slice(samplesRead, samplesRequired));

Casting

I've said that one of the major benefits of Span<T> is the ability to perform reinterpret casts. So we can essentially turn a Span<byte> into a Span<float> or vice versa. The way you do this changed from the beta bits - now you use MemoryMarshal.Cast, but it is pretty straightforward.

This greatly simplifies a lot of the helper classes in NAudio that enable you to switch between IWaveProvider and ISampleProvider. Here's a simple snippet from SampleToWaveProvider that makes use of MemoryMarshal.Cast.

public int Read(Span<byte> buffer)
{
    // reinterpret the byte span as a float span - no copying involved
    var f = MemoryMarshal.Cast<byte, float>(buffer);
    var samplesRead = source.Read(f);
    return samplesRead * 4; // 4 bytes per float sample
}

This eliminates the need for the WaveBuffer hack that we previously needed to avoid copying in this method.

Span<T> Limitations

There were a few limitations I ran into that are worth noting. First of all, a Span<T> can't be used as a class member (read Stephen Toub's article to understand why). So in the WaveOutBuffer class, where I wanted to reuse some unmanaged memory, I couldn't construct a Span<T> up front and reuse it. Instead, I had to hold onto the pointer to the unmanaged memory, and then construct a Span on demand.
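The on-demand construction looks something like this (a simplified sketch of the pattern, not the actual WaveOutBuffer code; the class and method names are my own):

```csharp
using System;
using System.Runtime.InteropServices;

// Sketch: hold the unmanaged pointer as a field, and construct the
// Span<byte> only at the point of use, since Span<T> can't be stored
// as a field of a regular class.
class UnmanagedBuffer : IDisposable
{
    private readonly IntPtr bufferPtr;
    private readonly int bufferSize;

    public UnmanagedBuffer(int size)
    {
        bufferSize = size;
        bufferPtr = Marshal.AllocHGlobal(size);
    }

    // created on demand - never stored
    public Span<byte> AsSpan()
    {
        unsafe
        {
            return new Span<byte>(bufferPtr.ToPointer(), bufferSize);
        }
    }

    public void Dispose() => Marshal.FreeHGlobal(bufferPtr);
}
```

Because the memory is unmanaged, the pointer stays valid across the Windows API calls, and the garbage collector never moves it.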

This limitation also impacts the way we might design an audio recording interface for NAudio. For example, suppose we had an AudioAvailable event that was raised whenever recorded audio was available. We might want it to provide us a Span<T> containing that audio:

interface IAudioCapture
{
    void Start();
    void Stop();
    event EventHandler<AudioCaptureEventArgs> AudioAvailable;
    event EventHandler<StoppedEventArgs> RecordingStopped;
}

// not allowed:
public class AudioCaptureEventArgs : EventArgs
{
    public AudioCaptureEventArgs(Span<byte> audio)
    {
        Buffer = audio;
    }

    public Span<byte> Buffer { get; }
}

But this isn't possible. We'd have to switch to Memory<T> instead. We can't even create a callback like this, as Span<T> can't be used as the generic type parameter of Func<T>:

void OnDataAvailable(Func<Span<byte>> callback);

However, one workaround that does compile is to use Span<T> in a custom delegate type:

void OnDataAvailable(AudioCallback callback);

// ...
delegate void AudioCallback(Span<byte> x);

I'm not sure yet whether this approach is preferable to using Memory<T>. The recording part of my proof of concept application isn't finished yet and so I'll try both approaches when that's ready.
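For comparison, the Memory<T> version of the event args would look something like this (a sketch; the handler shown in the comment is hypothetical):

```csharp
using System;

// Memory<byte> is allowed as a class member, so this version compiles
// where the Span<byte> version could not.
public class AudioCaptureEventArgs : EventArgs
{
    public AudioCaptureEventArgs(Memory<byte> audio)
    {
        Buffer = audio;
    }

    public Memory<byte> Buffer { get; }
}

// A handler would call e.Buffer.Span at the point it actually
// needs to touch the samples:
// void OnAudioAvailable(object sender, AudioCaptureEventArgs e)
// {
//     var samples = e.Buffer.Span;
//     ...
// }
```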

Next steps

There is still a fair amount I'd like to do with this sample to take full advantage of Span<T>. There are more array allocations that could be eliminated, and also there should now be no need for any pinned GCHandle instances.

There are also plenty more NAudio classes that could be converted to take advantage of Span<T>. Currently the sample app just plays a short tone generated with the SignalGenerator, so I'd like to add audio file reading, as well as recording. Feel free to submit PRs or raise issues if you'd like to help shape what might become the basis for a future NAudio 2.0.

Span<T> and .NET Standard

Of course, one big blocker to the adoption of Span<T> is that it is currently supported on .NET Core 2.1 only. It's not part of .NET Standard 2.0, and it seems there are no immediate plans to create a new version of the .NET Standard that supports Span<T>, presumably due to the challenges of back-porting all this to the regular .NET Framework. This is a shame, because it means that NAudio cannot realistically adopt it if we want one consistent programming model across all target frameworks.

Conclusion

Span<T> is a brilliant innovation that has the potential to bring major performance benefits to lots of scenarios, including audio. For the time being, though, it is only available in .NET Core applications.

Want to get up to speed with the fundamental principles of digital audio and how to go about writing audio applications with NAudio? Be sure to check out my Pluralsight courses, Digital Audio Fundamentals and Audio Programming with NAudio.