One of the challenges that frequently arises when writing audio code in C# is that you get a byte array containing raw audio that would be better presented as a short (Int16) array, or a float (Single) array. (There are other formats too: some audio is 32 bit int, some is 64 bit floating point, and then there is the ever-annoying 24 bit audio.) In C/C++ the solution is simple: cast the address of the byte array to a short * or a float * and access each sample directly.

Unfortunately, in .NET casting byte arrays into another type is not allowed:

byte[] buffer = new byte[1000];
short[] samples = (short[])buffer; // compile error!

This means that, for example, in NAudio, when the WaveIn class returns a byte[] in its DataAvailable event, you usually need to convert it manually into 16 bit samples (assuming you are recording 16 bit audio). There are several ways of doing this. I’ll run through five approaches, and finish up with some performance measurements.

BitConverter.ToInt16

Perhaps the simplest conceptually is to use the System.BitConverter class. This allows you to convert a pair of bytes at any position in a byte array into an Int16. To do this you call BitConverter.ToInt16. Here’s how you read through each sample in a 16 bit buffer:

byte[] buffer = ...;
for(int n = 0; n < buffer.Length; n+=2)
{
   short sample = BitConverter.ToInt16(buffer, n);
}

For byte arrays containing IEEE float audio, the principle is similar, except you call BitConverter.ToSingle. 24 bit audio can be dealt with by copying three bytes into a temporary four byte array and using ToInt32.
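
Here is a minimal sketch of the 24 bit approach just described. It assumes packed little-endian samples and a little-endian machine, and the ReadSample helper name is my own invention for illustration. One detail worth noting: if you place the three audio bytes in the top three bytes of the temporary array and then shift right by 8, the sign of negative samples is preserved.

```csharp
using System;

static class BitConverter24
{
    // Hypothetical helper: reads one packed little-endian 24 bit sample.
    // temp is a reusable 4 byte scratch array, so nothing is allocated per sample.
    public static int ReadSample(byte[] buffer, int offset, byte[] temp)
    {
        temp[0] = 0;                   // lowest byte is left empty
        temp[1] = buffer[offset];      // least significant audio byte
        temp[2] = buffer[offset + 1];
        temp[3] = buffer[offset + 2];  // most significant audio byte, carries the sign
        // BitConverter uses the machine byte order; this sketch assumes little-endian.
        // The arithmetic right shift brings the value back into 24 bit range
        // while keeping the sign.
        return BitConverter.ToInt32(temp, 0) >> 8;
    }
}
```

Allocate the scratch array once (new byte[4]) and pass it in for every sample in the buffer.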

BitConverter also includes a GetBytes method to do the reverse conversion, but you must then manually copy the return of that method into your buffer.
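
As a rough sketch of that reverse conversion (WriteSample is a hypothetical helper name, not part of any library):

```csharp
using System;

static class BitConverterWriter
{
    // Hypothetical helper: writes one 16 bit sample into buffer at the given byte offset.
    public static void WriteSample(byte[] buffer, int offset, short sample)
    {
        // GetBytes allocates a new 2 byte array on every call,
        // which creates work for the garbage collector.
        byte[] bytes = BitConverter.GetBytes(sample);
        buffer[offset] = bytes[0];
        buffer[offset + 1] = bytes[1];
    }
}
```

The per-sample allocation is the main reason to be wary of this approach in a tight loop.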

Bit Manipulation

Those who are more comfortable with bit manipulation may prefer to use bit shift and bitwise OR to convert each pair of bytes into a sample:

byte[] buffer = ...;
for (int n = 0; n < buffer.Length; n+=2)
{
   short sample = (short)(buffer[n] | buffer[n+1] << 8);
}

This technique can be extended for 32 bit integers, and is very useful for 24 bit, where none of the other techniques work very well. However, for IEEE float, it is a bit more tricky, and one of the other techniques should be preferred.
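
For 24 bit, a sketch along the following lines should work, assuming packed little-endian samples (the ReadSample name is hypothetical). The three bytes are shifted into the top of an int and then shifted back down so that the sign is extended correctly.

```csharp
static class BitManipulation24
{
    // Hypothetical helper: reads one packed little-endian 24 bit sample.
    public static int ReadSample(byte[] buffer, int offset)
    {
        // Place the bytes in bits 8..31, then use an arithmetic right shift
        // to return to 24 bit range with the sign preserved.
        return ((buffer[offset] << 8)
              | (buffer[offset + 1] << 16)
              | (buffer[offset + 2] << 24)) >> 8;
    }
}
```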

For the reverse conversion, you need to write more bit manipulation code.
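
A minimal sketch of what that might look like for 16 bit little-endian samples (WriteSample is a hypothetical name):

```csharp
static class BitManipulationWriter
{
    // Hypothetical helper: writes one 16 bit sample as two little-endian bytes.
    public static void WriteSample(byte[] buffer, int offset, short sample)
    {
        buffer[offset] = (byte)(sample & 0xFF);            // low byte first
        buffer[offset + 1] = (byte)((sample >> 8) & 0xFF); // high byte second
    }
}
```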

Buffer.BlockCopy

Another option is to copy the whole buffer into an array of the correct type. Buffer.BlockCopy can be used for this purpose:

byte[] buffer = ...;
short[] samples = new short[buffer.Length / 2]; // two bytes per 16 bit sample
Buffer.BlockCopy(buffer, 0, samples, 0, buffer.Length); // offsets and count are in bytes

Now the samples array contains the samples in easy to access form. If you are using this technique, try to reuse the samples buffer to avoid making extra work for the garbage collector.

For the reverse conversion, you can do another Buffer.BlockCopy.
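
A sketch of that reverse copy, for when you have a short[] and need to hand it to an API that only accepts a byte[] (the ToBytes helper is hypothetical). Note that the offsets and count passed to Buffer.BlockCopy are always measured in bytes, not in array elements.

```csharp
using System;

static class BlockCopyReverse
{
    // Hypothetical helper: copies 16 bit samples into a new byte array.
    public static byte[] ToBytes(short[] samples)
    {
        byte[] bytes = new byte[samples.Length * sizeof(short)];
        // The final argument is a byte count, not a sample count.
        Buffer.BlockCopy(samples, 0, bytes, 0, bytes.Length);
        return bytes;
    }
}
```

In real code you would probably reuse the destination array rather than allocating a new one for every buffer.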

WaveBuffer

One cool trick NAudio has up its sleeve (thanks to Alexandre Mutel) is the WaveBuffer class. This uses the [StructLayout(LayoutKind.Explicit)] attribute to effectively create a union of a byte[], a short[], an int[] and a float[]. This allows you to “trick” C# into letting you access a byte array as though it were a short array. You can read more about how this works here. If you’re worried about its stability, NAudio has been successfully using it with no issues for many years. (The only gotcha is that you probably shouldn’t pass it into anything that uses reflection, as underneath, .NET knows that it is still a byte[], even if it has been passed as a float[]. So, for example, don’t use it with Array.Copy or Array.Clear.) WaveBuffer can allocate its own backing memory, or bind to an existing byte array, as shown here:

byte[] buffer = ...;
var waveBuffer = new WaveBuffer(buffer);
// now you can access the samples using waveBuffer.ShortBuffer, e.g.:
var sample = waveBuffer.ShortBuffer[sampleIndex];

This technique works just fine with IEEE float, accessed through the FloatBuffer property. It doesn’t help with 24 bit audio though.

One big advantage is that no reverse conversion is needed. Just write into the ShortBuffer, and the modified samples are already in the byte[].
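
To give a feel for the underlying trick, here is a stripped-down illustration of an explicit-layout union. This is not NAudio's actual WaveBuffer source; the ByteShortUnion and UnionDemo names are made up for this sketch. One thing to be aware of is that the Length of the short[] view still reports the number of bytes, so you need to track the sample count yourself.

```csharp
using System;
using System.Runtime.InteropServices;

// Simplified illustration only: both fields share the same storage,
// so they refer to the same array object.
[StructLayout(LayoutKind.Explicit)]
struct ByteShortUnion
{
    [FieldOffset(0)] public byte[] Bytes;
    [FieldOffset(0)] public short[] Shorts;
}

static class UnionDemo
{
    // Writes a 16 bit sample through the short[] view; the bytes land in buffer.
    public static void WriteSample(byte[] buffer, int sampleIndex, short sample)
    {
        var union = new ByteShortUnion { Bytes = buffer };
        union.Shorts[sampleIndex] = sample;
    }

    // Reads a 16 bit sample through the short[] view of the same byte array.
    public static short ReadSample(byte[] buffer, int sampleIndex)
    {
        var union = new ByteShortUnion { Bytes = buffer };
        return union.Shorts[sampleIndex];
    }
}
```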

Unsafe Code

Finally, there is a way in C# that you can work with pointers as though you were using C++. This requires that you configure your project to allow “unsafe” code. “Unsafe” means that you could corrupt memory if you are not careful, but so long as you stay in bounds, there is nothing unsafe at all about this technique. Unsafe code must be in an unsafe context, so you can either use an unsafe block or mark your method as unsafe.

byte[] buffer = ...;
unsafe 
{
    fixed (byte* pBuffer = buffer)
    {
        short* pSample = (short*)pBuffer;
        // now we can access samples via pSample e.g.:
        var sample = pSample[sampleIndex];
    }
}

This technique can easily be used for IEEE float as well. It also can be used with 24 bit if you use int pointers and then bit manipulation to blank out the fourth byte.

As with WaveBuffer, there is no need for reverse conversion. You can use the pointer to write sample values directly into the memory for your byte array.
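
For example, a sketch of modifying 16 bit samples in place might look like this (HalveVolume is a hypothetical method, and the project must be compiled with unsafe code enabled):

```csharp
static class UnsafeGain
{
    // Hypothetical example: halves the amplitude of every 16 bit sample in place.
    public static unsafe void HalveVolume(byte[] buffer)
    {
        int sampleCount = buffer.Length / 2; // stay in bounds: two bytes per sample
        fixed (byte* pBuffer = buffer)       // pins the array while we hold the pointer
        {
            short* pSample = (short*)pBuffer;
            for (int n = 0; n < sampleCount; n++)
            {
                // Writing through the pointer updates the byte array directly.
                pSample[n] = (short)(pSample[n] / 2);
            }
        }
    }
}
```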

Performance

So which of these methods performs the best? I had my suspicions, but as always, the best way to optimize code is to measure it. I set up a simple test application which went through a four minute MP3 file, converting it to WAV and finding the peak sample values over periods of a few hundred milliseconds at a time. This is the type of code you would use for waveform drawing. I measured how long each one took to go through a whole file (I excluded the time taken to read and decode MP3). I was careful to write code that avoided creating work for the garbage collector.
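
To give an idea of the kind of work being measured, here is a rough sketch of peak finding over a block of 16 bit little-endian audio, using the bit manipulation technique. This is not the actual test application; the PeakFinder name and its parameters are assumptions made for illustration.

```csharp
using System;

static class PeakFinder
{
    // Hypothetical helper: returns the largest absolute sample value
    // found in byteCount bytes of 16 bit audio starting at offset.
    public static int FindPeak(byte[] buffer, int offset, int byteCount)
    {
        int peak = 0;
        for (int n = offset; n + 1 < offset + byteCount; n += 2)
        {
            // Widen to int first: Math.Abs on a short of -32768 would overflow.
            int sample = (short)(buffer[n] | buffer[n + 1] << 8);
            int magnitude = Math.Abs(sample);
            if (magnitude > peak) peak = magnitude;
        }
        return peak;
    }
}
```

Calling this once per block of a few hundred milliseconds gives you the values you would need to draw a waveform.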

Each technique was quite consistent in its timings:

Technique          Debug Build      Release Build
BitConverter       263, 265, 264    166, 167, 167
Bit Manipulation   254, 243, 250    104, 104, 103
Buffer.BlockCopy   205, 206, 204    104, 103, 103
WaveBuffer         239, 264, 263    97, 97, 97
Unsafe             173, 172, 162    98, 98, 98

As can be seen, BitConverter is the slowest approach, and should probably be avoided. Buffer.BlockCopy was the biggest surprise for me: the additional copy was so quick that it paid for itself very quickly. WaveBuffer was surprisingly slow in debug build, but very good in Release build. It is especially impressive given that it doesn’t need to pin its buffers like the unsafe code does, so it may well be the quickest possible technique in the long run, as it doesn’t hinder the garbage collector from compacting memory. As expected, the unsafe code gave very fast performance. The other takeaway is that you really should be using Release build if you are doing audio processing.

Anyone know an even faster way? Let me know in the comments.

Want to get up to speed with the fundamental principles of digital audio and how to go about writing audio applications with NAudio? Be sure to check out my Pluralsight courses, Digital Audio Fundamentals and Audio Programming with NAudio.

Comments

Comment by javO

Nice post. Is there any way for the WaveBuffer approach to work with a 32 bit sample?

Comment by Mark H

@javO, yes, WaveBuffer deals with int and float, both of which are 32 bit

Comment by Riley L

This was super useful, thanks for the post!

Comment by mark gamache

var shorts = MemoryMarshal.Cast<byte,short>(bytes);

Comment by Wentao Lu

short* pSample = (short*)buffer;
shouldn't this be
short* pSample = (short*)pBuffer;
instead?

Comment by Tom

I see a real disconnect here... You are showing conversion to a short array and so on, when NAudio seems to use byte arrays for all its classes. Shouldn't we be seeing something like this:
I have a short[] array that contains 16 bit samples of audio. Each Int16 value contains audio and it is arranged as left channel, right channel. I need to use the BufferedWaveProvider but it only takes byte[] arrays. How is the above useful? Same thing in the Pluralsight tutorials.
So the above blog should be showing the reverse of all of this, no? I mean I need to convert my short array to a byte array and not vice versa.
