Concatenating Segments of an Audio File with NAudio

Today, someone asked how they could play segments of audio from a WAV file. So for example, they wanted to play a 5 second segment of audio that started 1 minute into the source file, and then a 30 second segment that started 10 seconds into the source file, and so on.

There are lots of ways you could tackle this. One approach (which I don’t recommend) is to have some kind of a timer that detects where playback is up to in a file, and then jumps to a new position if you’ve gone past the end of the currently playing segment. The reason I don’t recommend this is that it’s horribly inaccurate. With NAudio, it’s actually possible to get accuracy right down to the sample level, so the instant we reach the end of the first segment, we jump seamlessly to the next.

Let’s see how we can do this.

The key is to create our own custom IWaveProvider. In NAudio an IWaveProvider is a simple interface that provides audio. You just need to implement the Read method to fill the supplied buffer with audio, and the WaveFormat property to indicate the format of the audio provided by the Read method. When you reach the end of the audio, Read should return 0.
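To make that contract concrete, here's a minimal sketch of about the simplest IWaveProvider I can think of. SilenceProvider is a hypothetical class I've made up for illustration (it isn't part of NAudio): it hands out a fixed amount of silence and then returns 0 from Read to signal the end.

```csharp
using System;
using NAudio.Wave;

// Hypothetical example class: provides a fixed duration of silence, then ends.
class SilenceProvider : IWaveProvider
{
    private int bytesRemaining;

    public SilenceProvider(WaveFormat format, TimeSpan duration)
    {
        WaveFormat = format;
        // Round down to a whole number of sample frames
        var bytes = (int)(format.AverageBytesPerSecond * duration.TotalSeconds);
        bytesRemaining = bytes - (bytes % format.BlockAlign);
    }

    public WaveFormat WaveFormat { get; }

    public int Read(byte[] buffer, int offset, int count)
    {
        var toWrite = Math.Min(count, bytesRemaining);
        // Zeroed bytes are silence for 16-bit PCM and IEEE float audio
        Array.Clear(buffer, offset, toWrite);
        bytesRemaining -= toWrite;
        return toWrite; // 0 once all the audio has been delivered
    }
}
```

Anything that accepts an IWaveProvider, such as the Init method of WaveOutEvent, should be able to consume a class like this.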

IWaveProvider has no concept of current “position” or overall “length” – you can implement WaveStream if you need those. But for this example, I’m assuming that we know in advance what “segments” we want to play and we just need to play them each through once.

So let me first show you the code for the SegmentPlayer which is our custom IWaveProvider, and then I’ll explain how it works and how to use it.

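using System;
using System.Collections.Generic;
using NAudio.Wave;

class SegmentPlayer : IWaveProvider
{
    private readonly WaveStream sourceStream;
    // Each segment is (start offset in bytes, length in bytes) within the source stream
    private readonly List<Tuple<int, int>> segments = new List<Tuple<int, int>>();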
    private int segmentIndex = -1;
    
    public SegmentPlayer(WaveStream sourceStream)
    {
        this.sourceStream = sourceStream;
    }
    
    public WaveFormat WaveFormat => sourceStream.WaveFormat;
    
    public void AddSegment(TimeSpan start, TimeSpan duration)
    {
        if (start + duration > sourceStream.TotalTime) 
            throw new ArgumentOutOfRangeException(nameof(duration), "Segment goes beyond end of input");
        segments.Add(Tuple.Create(TimeSpanToOffset(start),TimeSpanToOffset(duration)));
    }
    
    public int TimeSpanToOffset(TimeSpan ts)
    {
        var bytes = (int)(WaveFormat.AverageBytesPerSecond * ts.TotalSeconds);
        bytes -= (bytes%WaveFormat.BlockAlign);
        return bytes;
    }
    
    public int Read(byte[] buffer, int offset, int count)
    {
        int bytesRead = 0;
        // On the first call, seek to the start of the first segment (if there is one)
        if (segmentIndex < 0) SelectNewSegment();
        while (bytesRead < count && segmentIndex < segments.Count)
        {
            var fromThisSegment = ReadFromCurrentSegment(buffer,offset+bytesRead,count-bytesRead);
            // Nothing left in this segment, so move on to the next one
            if (fromThisSegment == 0) SelectNewSegment();
            bytesRead += fromThisSegment;
        }
        return bytesRead;
    }
    
    private int ReadFromCurrentSegment(byte[] buffer, int offset, int count)
    {
        var (segmentStart,segmentLength) = segments[segmentIndex];
        var bytesAvailable = (int)(segmentStart + segmentLength - sourceStream.Position);
        var bytesRequired = Math.Min(bytesAvailable,count);
        return sourceStream.Read(buffer, offset, bytesRequired);
    }
    
    private void SelectNewSegment()
    {
        segmentIndex++;
        // Only reposition if there is another segment; indexing past the end of the list would throw
        if (segmentIndex < segments.Count)
            sourceStream.Position = segments[segmentIndex].Item1;
    }
}

The first thing to notice is that we need a WaveStream to be passed to us as an input. This is because although our SegmentPlayer won’t support repositioning itself, we do need to be able to reposition within the source file to get the audio for each segment. Since WaveFileReader and Mp3FileReader both inherit from WaveStream, you could use a WAV or MP3 file as the source of the audio.

Now of course, you could dispense with the WaveStream altogether and just pass in a byte array of audio for each segment. That would perform better at the cost of potentially using a lot of memory if the segments are long.

The next thing to point out is that we have a list of “segments”, which are tuples containing the start position (in bytes) and duration (also in bytes) of each segment of audio within the source file. We have an AddSegment method that allows you to more conveniently specify these segments in terms of their start time and duration as TimeSpan instances. Notice in the TimeSpanToOffset method that we are very careful to respect the BlockAlign of the source file, to ensure we always seek to the start of a sample frame.
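To see what that alignment does to real numbers, here's the same arithmetic pulled out into a standalone sketch. FrameAlignment and ToAlignedOffset are hypothetical names I've made up for this illustration, and the format values are passed in directly rather than read from a WaveFormat.

```csharp
using System;

static class FrameAlignment
{
    // Same calculation as TimeSpanToOffset, with the format values passed in explicitly
    public static int ToAlignedOffset(TimeSpan ts, int averageBytesPerSecond, int blockAlign)
    {
        // Truncate to a whole number of bytes...
        var bytes = (int)(averageBytesPerSecond * ts.TotalSeconds);
        // ...then round down to the start of a sample frame
        return bytes - (bytes % blockAlign);
    }
}
```

For 44.1kHz 16-bit stereo audio (176,400 bytes per second, with a BlockAlign of 4), a time of 1.003 seconds works out at 176,929 bytes before alignment, which is one byte into a sample frame. Rounding down gives 176,928, the start of that frame.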

The bulk of the work is done in the Read method. We’re asked for a certain number of bytes of audio (count) to be provided. So we read as many as we can from the current segment, and if we still need some more, we move to the next segment. Moving to the next segment requires a reposition within the source file. Only when we reach the end of the segment list do we return fewer than count bytes of audio from our Read method.

Now this is a very quick and simple implementation. We could improve it in several ways such as caching the audio in each segment to avoid seeking on disk, or by upgrading it to be a WaveStream and allow repositioning. With a bit of multi-threading care we could even support dynamically adding new segments while you are playing. But I hope that this serves as a good example of how by implementing a custom IWaveProvider you can powerfully extend the capabilities of NAudio.

Let’s wrap up by seeing how to use the SegmentPlayer. In this example our source audio is an MP3 file. We set up four segments that we want to be played back to back and use WaveOutEvent to play them. We could have used WaveFileWriter.CreateWaveFile had we wanted to write the output to a WAV file instead of playing it.

// Needs: using System; using System.Threading; using NAudio.Wave;
using (var source = new Mp3FileReader("example.mp3"))
using (var player = new WaveOutEvent())
{
    var segmentPlayer = new SegmentPlayer(source);
    segmentPlayer.AddSegment(TimeSpan.FromSeconds(2), TimeSpan.FromSeconds(5));
    segmentPlayer.AddSegment(TimeSpan.FromSeconds(20), TimeSpan.FromSeconds(10));
    segmentPlayer.AddSegment(TimeSpan.FromSeconds(5), TimeSpan.FromSeconds(15));
    segmentPlayer.AddSegment(TimeSpan.FromSeconds(25), TimeSpan.FromSeconds(5));
    player.Init(segmentPlayer);
    player.Play();
    while(player.PlaybackState == PlaybackState.Playing)
        Thread.Sleep(1000);
}

The duration of the audio will be exactly 35 seconds in this instance (5 + 10 + 15 + 5), as each segment will be played immediately after the previous one ends.
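If you'd rather save the result than play it, the CreateWaveFile route mentioned above might look something like this. SegmentExport and SaveToWav are names I've made up for this sketch; the only NAudio call involved is WaveFileWriter.CreateWaveFile.

```csharp
using NAudio.Wave;

static class SegmentExport
{
    // Hypothetical helper: drains a finite IWaveProvider (such as a SegmentPlayer
    // with all its segments already added) into a WAV file at outputPath.
    public static void SaveToWav(IWaveProvider provider, string outputPath)
    {
        // CreateWaveFile keeps calling Read until it returns 0,
        // so the provider must eventually run out of audio.
        WaveFileWriter.CreateWaveFile(outputPath, provider);
    }
}
```

In the example above you would call SegmentExport.SaveToWav(segmentPlayer, "segments.wav") inside the using block in place of the WaveOutEvent code, where "segments.wav" is just a placeholder file name.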

Comments

September 6. 2017 06:45

Hi Mark
I'm having trouble compiling this code, errors in function ReadFromCurrentSegment. I hope you can give me some pointers please.
var segmentStart, segmentLength = segments[segmentIndex];
CS0819: Implicitly-typed variables cannot have multiple declarators
CS0818: Implicitly-typed variables must be initialized
var bytesAvailable = (int)(segmentStart + segmentLength - sourceStream.Position);
CS0165: Use of unassigned local variable 'segmentStart'
MS VS Community 2015
Version 14.0.25420.01 Update 3
NAudio 1.7.3.0
Thanks

Tgd87

September 6. 2017 07:09

ah, OK, silly me, I'm using some C# 7 syntax that you'll need VS2017 for.
or just do something like
var segmentStart = segments[segmentIndex].Item1;
var segmentLength = segments[segmentIndex].Item2;

Mark Heath

September 6. 2017 07:16

Thanks for the quick reply Mark.
Works now.
Cheers

October 31. 2018 12:50

Hi Mark,
Can you help me concatenate a list of MemoryStreams containing WAV files? I have tried using the WaveFileReader and WaveFileWriter classes but had no luck; I get an exception like "Unable to read beyond the end of the stream". How can I solve this issue?

Arnel Ambrosio

November 2. 2018 14:32

that exception would suggest an invalid WAV file. The WaveFileReader can read WAV files stored in a MemoryStream.

November 12. 2018 10:32

Hi Mark,
Yes you are right, thank you.

November 12. 2018 10:40

Hi Mark,
I have another question. Although it is not related to concatenating audio files, maybe you can help. How can I convert WAV files to AAC format? Does NAudio support it? I found a tutorial that says it can be done like this:
var aacFilePath = Path.Combine(outputFolder, "test.wav");
using (var reader = new WaveFileReader(testFilePath))
{
MediaFoundationEncoder.EncodeToAac(reader, aacFilePath);
}
But when I tried this on Windows 10, it threw an error saying a certain DLL is missing. Does the conversion depend on the Windows version?