In this post I will demonstrate how you can implement varispeed playback with NAudio using the excellent SoundTouch library. To do so, I’ve prepared a very simple Windows Forms Application that lets you load an audio file and play it at varying speeds.

The key parts of the application are as follows. First of all, the SoundTouchInterop32.cs and SoundTouchInterop64.cs files contain the PInvoke signatures needed to call the x86 and x64 builds of SoundTouch (SoundTouch.dll and SoundTouch_x64.dll respectively). Make sure these DLLs can be found at runtime, for example by placing them next to the executable or in a folder on the PATH.

I’ve also created a wrapper class called SoundTouch which simplifies access to SoundTouch and calls the correct PInvoke method depending on whether the process is 64 bit or not.
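As a rough illustration of that pattern (not the demo's actual source), a wrapper can keep one set of PInvoke declarations per DLL and choose between them with Environment.Is64BitProcess. The class names below are hypothetical, and the sketch assumes the DLLs export soundtouch_getVersionId with the cdecl calling convention, as the SoundTouch C API does.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical class names, used only to illustrate the pattern.
static class Native32
{
    // Binds to the x86 build of SoundTouch.
    [DllImport("SoundTouch.dll", CallingConvention = CallingConvention.Cdecl)]
    public static extern uint soundtouch_getVersionId();
}

static class Native64
{
    // Binds to the x64 build of SoundTouch.
    [DllImport("SoundTouch_x64.dll", CallingConvention = CallingConvention.Cdecl)]
    public static extern uint soundtouch_getVersionId();
}

static class SoundTouchVersion
{
    // Pick the binding that matches the bitness of the current process.
    public static uint GetVersionId()
    {
        return Environment.Is64BitProcess
            ? Native64.soundtouch_getVersionId()
            : Native32.soundtouch_getVersionId();
    }
}
```

Because DllImport only loads the library on the first call, the DLL for the other bitness does not need to be loadable by the current process.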

The next part is VarispeedSampleProvider. This implements NAudio’s ISampleProvider interface so it can be easily inserted into a signal chain. It also exposes a PlaybackRate property which can be set to 1.0 for regular speed and 2.0 for 2x playback etc.

public VarispeedSampleProvider(ISampleProvider sourceProvider, 
    int readDurationMilliseconds, SoundTouchProfile soundTouchProfile)

The constructor for VarispeedSampleProvider takes three arguments. The first is the source provider, which is the input stream you want to speed up or slow down (in our example an audio file). The second is readDurationMilliseconds, which controls how much audio is read from the source provider in a single read. When you change the playback speed, the duration of audio read from the source no longer matches the time taken to play it: at 2.0, for example, 100ms of source audio plays back in roughly 50ms. Something like 100ms will be fine to use here.
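To make the relationship between read duration and playing time concrete, here is a small arithmetic sketch. The helper names are hypothetical and this is not necessarily how the demo sizes its buffers; it only assumes interleaved float samples, as used by ISampleProvider.

```csharp
using System;

// Hypothetical helper for illustration only.
static class ReadSizing
{
    // Number of float samples (all channels interleaved) covering the given duration.
    public static int SamplesPerRead(int sampleRate, int channels, int readDurationMilliseconds)
    {
        return (int)((long)sampleRate * readDurationMilliseconds / 1000) * channels;
    }

    // Approximate playing time of one source read once the playback rate is applied.
    public static double OutputMilliseconds(int readDurationMilliseconds, double playbackRate)
    {
        return readDurationMilliseconds / playbackRate;
    }
}
```

For 44.1kHz stereo audio and a 100ms read, SamplesPerRead(44100, 2, 100) gives 8820 samples. At a playback rate of 2.0 that read lasts about 50ms when played, and at 0.5 about 200ms.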

Finally, the SoundTouchProfile allows us to specify what SoundTouch options we want to use. There are a few switches you can experiment with which adjust quality and performance. The most significant is UseTempo. In tempo mode, SoundTouch will pitch compensate when you change speed, so the music will remain at the same pitch, just a different speed. This avoids the “chipmunk effect” when you play back at higher speeds. In the demo app I let you switch between modes while playback is stopped.
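To put a number on the “chipmunk effect”: when audio is simply played faster or slower with no pitch compensation, the pitch moves by 12 × log2(rate) semitones. The sketch below is plain arithmetic with a hypothetical helper name, and does not call SoundTouch.

```csharp
using System;

// Hypothetical helper for illustration only.
static class PitchShift
{
    // Pitch change in semitones for uncompensated speed changes: 12 * log2(rate).
    public static double SemitonesWithoutCompensation(double playbackRate)
    {
        return 12.0 * Math.Log(playbackRate, 2.0);
    }
}
```

A rate of 2.0 therefore raises the pitch by an octave (12 semitones) and 0.5 lowers it by an octave, whereas with pitch compensation the shift stays at zero.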

All that remains is to create an AudioFileReader, pass it into the VarispeedSampleProvider, and then pass that to the output device (WaveOutEvent in our example) to be played.
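A minimal sketch of that chain might look like the following. It needs the NAudio package plus the demo's VarispeedSampleProvider and SoundTouchProfile classes in scope. The SoundTouchProfile constructor arguments shown (useTempo, useAntiAliasing) are an assumption, so check the class in the demo project before relying on them.

```csharp
using System;
using NAudio.Wave;

static class PlaybackExample
{
    static void Main(string[] args)
    {
        // args[0]: path to an audio file that AudioFileReader can open.
        using (var reader = new AudioFileReader(args[0]))
        using (var output = new WaveOutEvent())
        {
            // Assumption: the profile constructor takes (useTempo, useAntiAliasing).
            var profile = new SoundTouchProfile(true, false);

            // Read 100ms from the source at a time, as suggested above.
            var varispeed = new VarispeedSampleProvider(reader, 100, profile);
            varispeed.PlaybackRate = 1.5f; // 1.0 is regular speed

            output.Init(varispeed);
            output.Play();
            Console.WriteLine("Playing at 1.5x. Press Enter to stop.");
            Console.ReadLine();
            output.Stop();
        }
    }
}
```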

The one other thing worth mentioning in the demo project is that I allow you to reposition in the file, and when you do so, it’s a good idea to tell SoundTouch that a reposition has taken place so internal buffers can be flushed. This is done by calling Reposition on the VarispeedSampleProvider.
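For example, a seek helper could move the reader and then notify the varispeed provider. The helper name is hypothetical, and the sketch assumes Reposition takes no arguments.

```csharp
using System;
using NAudio.Wave;

// Hypothetical helper for illustration only.
static class SeekHelper
{
    public static void SeekTo(AudioFileReader reader, VarispeedSampleProvider varispeed, TimeSpan position)
    {
        // Move the read position in the source file.
        reader.CurrentTime = position;

        // Tell the varispeed provider so audio buffered from the old position can be flushed.
        varispeed.Reposition();
    }
}
```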

Want to see the code? You can access it on GitHub.

Want to get up to speed with the fundamental principles of digital audio and how to go about writing audio applications with NAudio? Be sure to check out my Pluralsight courses, Digital Audio Fundamentals and Audio Programming with NAudio.

Comments

Comment by Tgd87

Hi Mark
I am building a prototype lip-sync application for animation.
I'd like to adjust the speed and tempo of segments (based on milliseconds) within a single audio file.
I am trying to get your varispeed code to work with the concatenation class without much luck. This might not even be the best way to achieve what I am after.
May I ask what you would suggest I look at to varispeed adjust millisecond-based segments within a single audio file?
Thanks for your time. Please let me know if I can provide any further information.
Background
I have an audio file of spoken words and have aligned the text with the audio so I have a list of words with start and end times in milliseconds.
I'd like to be able to process each word's pitch and tempo using the millisecond timings from the alignment.
I understand that if I change the timing of one segment A, then I need to add or subtract the change in A's duration from the positions of the following segments B, C, D, etc. This is a simple recalculation step, I believe.
Cheers

Comment by Mark Heath

I'd make my own custom IWaveProvider for something like this. That way you can control exactly how many bytes are passed through the VarispeedSampleProvider before changing the pitch settings to other values.

Comment by Tgd87

Great, thanks for your reply, Mark. I'll give that a shot.
Cheers

Comment by ScooterGirl

The following line of code in class VarispeedSampleProvider throws two errors on build:
public WaveFormat WaveFormat => sourceProvider.WaveFormat;
Errors:
; expected
Invalid token ';' in class, struct, or interface member declaration
Any ideas?

Comment by Mark Heath

What version of Visual Studio are you using?

Comment by ScooterGirl

VS2013. I’m thinking I should be in 2015 or greater?

Comment by Mark Heath

Yes, I'd recommend going to 2017. There's no good reason not to, and you can use the latest C# features.

Comment by Zam

Hello Mark, may I use this useful SampleProvider on my own Open Source project... maybe also doing some modifications?

Comment by Mark Heath

Of course, glad it's of use to you.

Comment by Zam

Thanks!
Also gratz for the great work

Comment by blahblahblahblah

Thanks for this tutorial! Please excuse the stupid question, but do you know how it would be possible to load a WAV file instead of MP3?
EDIT:
Never mind, I just needed to create a WaveStream first:
WaveStream waveStream = new WaveFileReader(filePath);
WaveChannel32 inputStream = new WaveChannel32(waveStream);

Comment by Mark Heath

Just use AudioFileReader for a simpler way to do this.

Comment by FlusherCheese

What do you mean? I wanted to do the same thing...

Comment by Mijin

Hi Mark, I'm using your code in an app that processes educational audio files (e.g. to automatically repeat sections, or play sections more slowly, etc)
It's all good, except that the sound quality is quite poor when slowing down audio; it becomes very choppy; it sounds like some kind of unintentional reverb.
Any clues? From initial debugging it looks like the issue is in the actual SoundTouch library, not your wrapper.

Comment by Mark Heath

Yes, unfortunately it's very hard to generate natural sounding slowed down audio. You'll hear similar artefacts slowing down a YouTube video below 50%

Comment by Mijin

Thanks for the reply.
I would say though that YouTube-quality audio slowdown would be fantastic. The results I'm getting right now are definitely inferior; even slowing down to 80% speed already results in audio that you need to strain to clearly hear exactly what's been spoken.

Comment by Mijin

OK, I just tried randomly screwing around with the SoundTouch settings (without entirely knowing what I was doing).
I found that reducing the SequenceMS parameter greatly improves the clarity of slowed down speech.
