
Today, someone asked how they could play segments of audio from a WAV file. So for example, they wanted to play a 5 second segment of audio that started 1 minute into the source file, and then a 30 second segment that started 10 seconds into the source file, and so on.

There are lots of ways you could tackle this. One approach (which I don’t recommend) is to have some kind of timer that checks where playback is up to in the file, and then jumps to a new position if you’ve gone past the end of the currently playing segment. The reason I don’t recommend this is that it’s horribly inaccurate. With NAudio, it’s actually possible to get accuracy right down to the sample level, so the instant we reach the end of the first segment, we jump seamlessly to the next.

Let’s see how we can do this.

The key is to create our own custom IWaveProvider. In NAudio an IWaveProvider is a simple interface that provides audio. You just need to implement the Read method, which fills the supplied buffer with audio and returns the number of bytes written, and the WaveFormat property, which indicates the format of the audio provided by the Read method. When you reach the end of the audio, Read should return 0.

IWaveProvider has no concept of current “position” or overall “length” – you can implement WaveStream if you need those. But for this example, I’m assuming that we know in advance what “segments” we want to play and we just need to play them each through once.

So let me first show you the code for the SegmentPlayer which is our custom IWaveProvider, and then I’ll explain how it works and how to use it.

using System;
using System.Collections.Generic;
using NAudio.Wave;

class SegmentPlayer : IWaveProvider
{
    private readonly WaveStream sourceStream;
    // Each segment is (start offset in bytes, length in bytes) within the source stream
    private readonly List<Tuple<int, int>> segments = new List<Tuple<int, int>>();
    private int segmentIndex = -1; // -1 means we have not started reading yet
    
    public SegmentPlayer(WaveStream sourceStream)
    {
        this.sourceStream = sourceStream;
    }
    
    public WaveFormat WaveFormat => sourceStream.WaveFormat;
    
    public void AddSegment(TimeSpan start, TimeSpan duration)
    {
        if (start + duration > sourceStream.TotalTime)
            throw new ArgumentOutOfRangeException(nameof(duration), "Segment goes beyond end of input");
        segments.Add(Tuple.Create(TimeSpanToOffset(start), TimeSpanToOffset(duration)));
    }
    
    public int TimeSpanToOffset(TimeSpan ts)
    {
        var bytes = (int)(WaveFormat.AverageBytesPerSecond * ts.TotalSeconds);
        bytes -= (bytes % WaveFormat.BlockAlign); // round down to the start of a sample frame
        return bytes;
    }
    
    public int Read(byte[] buffer, int offset, int count)
    {
        int bytesRead = 0;
        if (segmentIndex < 0) SelectNewSegment(); // first call: seek to the first segment
        while (bytesRead < count && segmentIndex < segments.Count)
        {
            var fromThisSegment = ReadFromCurrentSegment(buffer, offset + bytesRead, count - bytesRead);
            if (fromThisSegment == 0) SelectNewSegment(); // segment exhausted, move to the next
            bytesRead += fromThisSegment;
        }
        return bytesRead;
    }
    
    private int ReadFromCurrentSegment(byte[] buffer, int offset, int count)
    {
        var (segmentStart, segmentLength) = segments[segmentIndex];
        // Never read past the end of the current segment
        var bytesAvailable = (int)(segmentStart + segmentLength - sourceStream.Position);
        var bytesRequired = Math.Min(bytesAvailable, count);
        return sourceStream.Read(buffer, offset, bytesRequired);
    }
    
    private void SelectNewSegment()
    {
        segmentIndex++;
        // Only seek if there is another segment; otherwise the loop in Read simply ends
        if (segmentIndex < segments.Count)
            sourceStream.Position = segments[segmentIndex].Item1;
    }
}

The first thing to notice is that we need a WaveStream to be passed to us as an input. This is because although our SegmentPlayer won’t support repositioning itself, we do need to be able to reposition within the source file to get the audio for each segment. Since WaveFileReader and Mp3FileReader both derive from WaveStream, you could use a WAV or MP3 file as the source of the audio.

Now of course, you could dispense with the WaveStream altogether and just pass in a byte array of audio for each segment. That would perform better at the cost of potentially using a lot of memory if the segments are long.

The next thing to point out is that we have a list of “segments”, which are tuples containing the start position (in bytes) and duration (also in bytes) of each segment of audio within the source file. We have an AddSegment method that allows you to more conveniently specify these segments in terms of their start time and duration as TimeSpan instances. Notice in the TimeSpanToOffset method that we are very careful to respect the BlockAlign of the source file, to ensure we always seek to the start of a sample frame.
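To make the block alignment step concrete, here’s a small standalone sketch of the same arithmetic, with the format values passed in as plain integers rather than read from a WaveFormat. The OffsetMath class and its parameter names are made up for this illustration. The numbers assume 16-bit stereo audio at 44.1kHz, which works out to 176,400 bytes per second and 4 bytes per sample frame.

```csharp
using System;

// Hypothetical standalone version of TimeSpanToOffset, for illustration only
static class OffsetMath
{
    public static int TimeSpanToOffset(TimeSpan ts, int averageBytesPerSecond, int blockAlign)
    {
        // Truncate to a whole number of bytes first...
        var bytes = (int)(averageBytesPerSecond * ts.TotalSeconds);
        // ...then round down to the start of a sample frame
        bytes -= bytes % blockAlign;
        return bytes;
    }
}
```

With those values, 3 milliseconds is 529.2 bytes, which truncates to 529 and then rounds down to 528 (exactly 132 sample frames), so OffsetMath.TimeSpanToOffset(TimeSpan.FromMilliseconds(3), 176400, 4) returns 528. Without that last step we would seek one byte into a sample frame and the channels would come out as noise.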

The bulk of the work is done in the Read method. We’re asked for a certain number of bytes of audio (count) to be provided. So we read as many as we can from the current segment, and if we still need some more, we move to the next segment. Moving to the next segment requires a reposition within the source file. Only when we reach the end of the segment list do we return less than count bytes of audio from our Read method.
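If you want to watch a single Read call span two segments without any audio hardware, here’s a rough sketch of the same loop with a plain MemoryStream standing in for the WaveStream. SegmentReaderDemo is a made-up name, and the segments are given directly as byte offsets rather than TimeSpans, so treat it as an illustration of the idea rather than a drop-in replacement.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Illustration only: the segment read loop over any seekable Stream
class SegmentReaderDemo
{
    private readonly Stream source;
    private readonly List<(int Start, int Length)> segments = new List<(int, int)>();
    private int segmentIndex = -1;

    public SegmentReaderDemo(Stream source) { this.source = source; }

    public void AddSegment(int start, int length) => segments.Add((start, length));

    public int Read(byte[] buffer, int offset, int count)
    {
        int bytesRead = 0;
        if (segmentIndex < 0) SelectNewSegment(); // first call: seek to the first segment
        while (bytesRead < count && segmentIndex < segments.Count)
        {
            var (start, length) = segments[segmentIndex];
            // Never read past the end of the current segment
            int available = (int)(start + length - source.Position);
            int n = source.Read(buffer, offset + bytesRead, Math.Min(available, count - bytesRead));
            if (n == 0) SelectNewSegment(); // segment used up: seek to the next one
            bytesRead += n;
        }
        return bytesRead;
    }

    private void SelectNewSegment()
    {
        segmentIndex++;
        if (segmentIndex < segments.Count) source.Position = segments[segmentIndex].Start;
    }
}
```

Suppose the source holds the bytes 0 to 19 and we add the segments (10, 3) and (2, 4). A first Read asking for 5 bytes returns 10, 11, 12, 2, 3, crossing the segment boundary in the middle of the call. A second Read asking for 5 returns only the 2 remaining bytes (4, 5), and a third returns 0.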

Now this is a very quick and simple implementation. We could improve it in several ways such as caching the audio in each segment to avoid seeking on disk, or by upgrading it to be a WaveStream and allow repositioning. With a bit of multi-threading care we could even support dynamically adding new segments while you are playing. But I hope that this serves as a good example of how by implementing a custom IWaveProvider you can powerfully extend the capabilities of NAudio.
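As a rough idea of what that “multi-threading care” might look like, here’s one possible sketch of a segment list that can be appended to from one thread while the playback thread pulls segments off it. SegmentQueue and its members are hypothetical names of my own, not part of NAudio, and a simple lock is just one of several ways to do this.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: segments can be added while another thread consumes them
class SegmentQueue
{
    private readonly object gate = new object();
    private readonly List<(long Start, long Length)> segments = new List<(long, long)>();
    private int nextIndex;

    // Called from the UI (or any other) thread
    public void Add(long start, long length)
    {
        lock (gate) segments.Add((start, length));
    }

    // Called from the playback thread when the current segment is exhausted
    public bool TryTakeNext(out (long Start, long Length) segment)
    {
        lock (gate)
        {
            if (nextIndex < segments.Count)
            {
                segment = segments[nextIndex++];
                return true;
            }
            segment = default;
            return false;
        }
    }
}
```

One thing to bear in mind with this design: returning 0 from Read signals the end of the audio, so a player that accepts segments on the fly would probably want to fill the buffer with silence while the queue is empty instead of returning 0.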

Let’s wrap up by seeing how to use the SegmentPlayer. In this example our source audio is an MP3 file. We set up four segments that we want to be played back to back and use WaveOutEvent to play them. Had we wanted to write the output to a WAV file rather than play it, we could have used WaveFileWriter.CreateWaveFile instead.

using (var source = new Mp3FileReader("example.mp3"))
using (var player = new WaveOutEvent())
{
    var segmentPlayer = new SegmentPlayer(source);
    segmentPlayer.AddSegment(TimeSpan.FromSeconds(2), TimeSpan.FromSeconds(5));
    segmentPlayer.AddSegment(TimeSpan.FromSeconds(20), TimeSpan.FromSeconds(10));
    segmentPlayer.AddSegment(TimeSpan.FromSeconds(5), TimeSpan.FromSeconds(15));
    segmentPlayer.AddSegment(TimeSpan.FromSeconds(25), TimeSpan.FromSeconds(5));
    player.Init(segmentPlayer);
    player.Play();
    while(player.PlaybackState == PlaybackState.Playing)
        Thread.Sleep(1000);
}

The duration of the audio will be exactly 35 seconds in this instance (5 + 10 + 15 + 5), as each segment will be played immediately after the previous one ends.

Want to get up to speed with the fundamental principles of digital audio and how to go about writing audio applications with NAudio? Be sure to check out my Pluralsight courses, Digital Audio Fundamentals and Audio Programming with NAudio.