
A while ago on my blog I wrote about a C# language feature that I wanted - reinterpret casts between arrays of structs. One reason this would be so useful to me is that I want to improve the design of my open source audio library NAudio, and create an IWaveProvider interface that allows people to output their audio data in whatever format is most convenient for them. Audio is sometimes in 16 bit integer format, sometimes in 32 bit floating point format, and sometimes in compressed blocks of bytes (other common scenarios include 24 bit integers and 64 bit double precision floating point audio).

In the world of C/C++, this isn't a problem. The Read method of IWaveProvider simply needs to take a void pointer that can be cast to a byte, short or float pointer as appropriate. In .NET things are not nearly so simple. True, there are 'unsafe' pointers in C#, but using them immediately excludes developers from other .NET languages such as VB.NET from using the framework. Also, when reading or writing data from files in .NET, you must work with the System.IO.Stream class that expects reads and writes to provide byte arrays, requiring a manual copy from the pointer to an array.

My initial idea was to simply provide a variety of Read functions for IWaveProvider, and provide helper functions in an abstract base class that would allow users just to implement the one Read method whose signature best fitted their needs:

interface IWaveProvider 
{
   int Read(byte[] buffer, int byteCount);
   int Read(short[] buffer, int sampleCount);
   int Read(float[] buffer, int sampleCount);
   ...

However, a new contributor to the NAudio project, Alexandre Mutel, has come up with an ingenious solution thanks to a brilliant piece of lateral thinking. Suppose we define a WaveBuffer class that uses an explicit structure layout:

[StructLayout(LayoutKind.Explicit, Pack=2)]
public class WaveBuffer : IWaveBuffer
{
   [FieldOffset(0)]
   public int numberOfBytes;
   [FieldOffset(4)]
   private byte[] byteBuffer;
   [FieldOffset(4)]
   private float[] floatBuffer;
   [FieldOffset(4)]
   private short[] shortBuffer;
   ...

This class has some interesting capabilities. You can set byteBuffer to point to a new byte array, but then access it using floatBuffer. Sounds dangerous? Well it compiles, and initial tests show that it works just fine. It is true that using the floatBuffer accessor will let you write beyond the end of available space, but so long as you never write more than the requested number of samples to the buffer, you are safe. This structure even survives garbage collections without any issues.
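As a rough illustration of the behaviour described above, here is a minimal, self-contained sketch. It is not the actual NAudio class: the name `WaveBufferSketch`, the `MaxFloats` and `MaxShorts` helpers, and the choice of offset 8 for the array fields (so the references stay pointer-aligned on a 64-bit runtime as well as a 32-bit one) are all assumptions made for this example.

```csharp
using System;
using System.Runtime.InteropServices;

// Illustrative sketch only; not the actual NAudio WaveBuffer.
// Assumption: the array references sit at offset 8 so that they remain
// pointer-aligned on both 32-bit and 64-bit runtimes.
[StructLayout(LayoutKind.Explicit, Pack = 2)]
public class WaveBufferSketch
{
    [FieldOffset(0)] public int numberOfBytes;
    [FieldOffset(8)] private byte[] byteBuffer;
    [FieldOffset(8)] private float[] floatBuffer;
    [FieldOffset(8)] private short[] shortBuffer;

    public WaveBufferSketch(int sizeInBytes)
    {
        // Always allocate through the byte view, so the array's real
        // length is a byte count and no typed view can overrun it
        // without tripping the normal bounds check.
        byteBuffer = new byte[sizeInBytes];
        numberOfBytes = sizeInBytes;
    }

    // Three views of the same underlying array object.
    public byte[] ByteBuffer => byteBuffer;
    public float[] FloatBuffer => floatBuffer;
    public short[] ShortBuffer => shortBuffer;

    // Hypothetical helpers: how many samples of each type actually fit.
    public int MaxFloats => numberOfBytes / sizeof(float);
    public int MaxShorts => numberOfBytes / sizeof(short);
}

public static class WaveBufferDemo
{
    // Writes a float through the float view and reads it back
    // through the byte view of the same array.
    public static float RoundTrip(float value)
    {
        var buffer = new WaveBufferSketch(8);
        buffer.FloatBuffer[0] = value;
        return BitConverter.ToSingle(buffer.ByteBuffer, 0);
    }
}
```

Note that in this sketch `FloatBuffer.Length` still reports the byte count, because the length is stored in the array object itself. That is why the sample limit is tracked separately (here via `MaxFloats`) rather than taken from the array.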

This allows us to simplify our IWaveProvider interface dramatically:

interface IWaveProvider
{
   int Read(IWaveBuffer buffer);
   ...

Implementers of the Read method then have a choice of which buffer they write into. If they simply want to write samples (whether 16 bit integers or 32 bit floats) that is fine, but equally if it is easier to provide their data as a byte array (for example when reading from a WAV file), then that can be done. The WaveBuffer trick effectively gives us the casting feature we need.
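To make that choice concrete, the following sketch shows two providers filling the same kind of buffer, one through the float view and one through the byte view. The interface members are elided in the snippets above, so everything here is hypothetical: the `ByteBuffer`, `FloatBuffer` and `ByteCapacity` members, the convention that `Read` returns the number of bytes written, and the `SketchWaveBuffer`, `ConstantFloatProvider` and `StreamProvider` classes are invented for illustration.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

// Hypothetical members; the post does not show the real interface.
public interface IWaveBuffer
{
    byte[] ByteBuffer { get; }
    float[] FloatBuffer { get; }
    int ByteCapacity { get; }
}

public interface IWaveProvider
{
    // Assumed convention: returns the number of bytes written.
    int Read(IWaveBuffer buffer);
}

// Sketch of the overlapping-field buffer (offset 8 is an assumption
// that keeps the references aligned on 64-bit runtimes).
[StructLayout(LayoutKind.Explicit, Pack = 2)]
public class SketchWaveBuffer : IWaveBuffer
{
    [FieldOffset(0)] private int byteCapacity;
    [FieldOffset(8)] private byte[] byteBuffer;
    [FieldOffset(8)] private float[] floatBuffer;

    public SketchWaveBuffer(int sizeInBytes)
    {
        byteBuffer = new byte[sizeInBytes];
        byteCapacity = sizeInBytes;
    }

    public byte[] ByteBuffer => byteBuffer;
    public float[] FloatBuffer => floatBuffer;
    public int ByteCapacity => byteCapacity;
}

// A generator produces samples, so the float view is the natural one.
public class ConstantFloatProvider : IWaveProvider
{
    private readonly float value;
    public ConstantFloatProvider(float value) { this.value = value; }

    public int Read(IWaveBuffer buffer)
    {
        int sampleCount = buffer.ByteCapacity / sizeof(float);
        float[] samples = buffer.FloatBuffer;
        for (int n = 0; n < sampleCount; n++)
        {
            samples[n] = value;
        }
        return sampleCount * sizeof(float);
    }
}

// A file-style source already has bytes, so it fills the byte view directly.
public class StreamProvider : IWaveProvider
{
    private readonly Stream source;
    public StreamProvider(Stream source) { this.source = source; }

    public int Read(IWaveBuffer buffer)
    {
        return source.Read(buffer.ByteBuffer, 0, buffer.ByteCapacity);
    }
}
```

A consumer of either provider can then pick whichever view suits it, for example reading `FloatBuffer[0]` after a `StreamProvider` has filled the bytes.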

Sounds too good to be true? Well there are some potential concerns. This approach could be described as a bit of a "hack". Do we know for sure that in a future version of the .NET framework it will still work (or even compile)? Does it work with 64 bit Windows? Could there be a garbage collection scenario we have not yet encountered that would cause us problems? Would people object to using a hack like this right at the core of the NAudio framework?

The solution to these concerns is fairly simple. We will use an interface, IWaveBuffer, instead of using WaveBuffer itself. This allows us to create an alternative implementation if we ever find that WaveBuffer has any issues.

So the plan is that NAudio will be migrating to use IWaveProvider and IWaveBuffer in the future (not for version 1.2, but probably appearing in the following version), but if anyone can think of any problems with using the proposed WaveBuffer class, I would be interested to hear them.

Want to get up to speed with the fundamental principles of digital audio and how to go about writing audio applications with NAudio? Be sure to check out my Pluralsight courses, Digital Audio Fundamentals and Audio Programming with NAudio.

Comments

Comment by Anonymous

At least in C# Express Edition this does not work.
I just tried it in the following manner:

[StructLayout(LayoutKind.Explicit, Pack=2)]
public struct TGridModelDataBlock
{
   [FieldOffset(0)]
   private byte[] byteData;

   [FieldOffset(0)]
   private int[,] intData;

   [FieldOffset(0)]
   private float[,] floatData;

   public TGridModelDataBlock(uint cbBlocksize)
   {
      Debug.Assert(cbBlocksize == GriddedElevationModel.cbGridBlockSize);

      intData = null;
      floatData = null;
      byteData = new byte[cbBlocksize];
   }

   public int[,] IntData
   {
      get
      {
         return intData;
      }
   }

   public float[,] FloatData
   {
      get
      {
         return floatData;
      }
   }

   public bool Read(BinaryReader reader)
   {
      try
      {
         reader.Read(byteData, 0, byteData.Length);

         return true;
      }
      catch(Exception)
      {
      }

      return false;
   }

   public bool Write(BinaryWriter writer)
   {
      try
      {
         writer.Write(byteData, 0, byteData.Length);

         return true;
      }
      catch(Exception)
      {
      }

      return false;
   }
}

and when I try to access the float data, I get an exception.

I am trying to overlay a matrix of 4-byte signed integers with an array of 4-byte floats without using a fixed buffer, and so far I have failed...

In C/C++ this is easy of course ;-)

Kind regards,

Jeroen Posch

Comment by Mark H

I suspect it is your use of multi-dimensional arrays that is causing the problem. We just use single dimensional arrays.

Comment by Q

Just curious, why "Pack=2"?

Comment by Mark H

I think it is just that when you use LayoutKind.Explicit, it is a good idea to specify the pack, so you know it will do the same thing every time. I don't recall a specific problem that this setting solved. We want to know that each array will reside in exactly the same place.

Comment by Q

That's cool, I understand the value of setting the Packing in general, just didn't know why you chose "2" in particular. I've tried it with "1" and it seems to work just as well. Just curious so that I could avoid any problems that I may not have yet seen.

Very cool idea though.

Comment by Rob Westgeest

I like the idea a lot; however, this code does not work for me.

public interface ISampleBuffer
{
   byte[] Bytes { get; set; }
   float[] Floats { get; set; }
}

[StructLayout(LayoutKind.Explicit, Pack=2)]
public struct SampleBuffer : ISampleBuffer
{
   [FieldOffset(0)]
   private float[] floatBuffer;
   [FieldOffset(0)]
   private byte[] byteBuffer;
   public byte[] Bytes { get { return byteBuffer; } set { byteBuffer = value; } }
   public float[] Floats { get { return floatBuffer; } set { floatBuffer = value; } }
}

For some reason 'Bytes' is shown as a float[] in the debugger and (quite consistently) the following test throws an exception in the call to BitConverter.ToSingle():

[Test]
public void CastAFloatArrayToByteArray()
{
   ISampleBuffer buffer = new SampleBuffer();
   buffer.Floats = new float[] { 1.3f, -3.999f };
   Assert.AreEqual(1.3f, BitConverter.ToSingle(buffer.Bytes, 0));
}

failed: System.ArgumentException : Destination array is not long enough to copy all the items in the collection. Check array index and length.

Comment by Mark H

Hi Rob,
your problem is that you initialised the float array first, making it think that the byte array has only two elements (this is just a weird side-effect of this hack). Initialise the byte array to a blank array of 8 elements, then write over the float values. Then your test should pass:

ISampleBuffer buffer = new SampleBuffer();
buffer.Bytes = new byte[8];
buffer.Floats[0] = 1.3f;
buffer.Floats[1] = -3.999f;
Assert.AreEqual(1.3f, BitConverter.ToSingle(buffer.Bytes, 0));

Comment by Rob Westgeest

It does!.. You're a genius Mark,

I am not sure it solves my problem though. This works, provided that I copy my float values one by one to my pre-allocated buffer.

The problem I am trying to solve is: I have a buffer of floats (sample data) and I want to write it to a stream as fast as possible. I suspect that one by one is not the fastest way to do that, and I kind of hoped that your neat little trick would enable me to just assign the float array and get its byte[] representation.

Would the only advantage of this approach be that I don't need pinning?

Comment by Mark H

Hi Rob,
if the creation of the float array is out of your control then you might be out of luck.
Yes, the lack of need for pinning is a good advantage of this approach.
You might be able to experiment with the unchecked keyword if all you need to do is read out of the byte buffer - you can safely go past what .NET thinks is the end of the buffer.

Comment by Marc Bernier

I had a similar issue with a packed array of longs that I was retrieving from a RocksDb. I was using a fixed{} block to cast the byte* to a long*, but I like this approach better. The only caveat was that the long array's Length was 8 times too big, but that's easily dealt with.

Marc Bernier