Varispeed Playback with NAudio using SoundTouch

May 11. 2016 Posted in: NAudio

In this post I will demonstrate how you can implement varispeed playback with NAudio using the excellent SoundTouch library. To do so, I’ve prepared a very simple Windows Forms Application that lets you load an audio file and play it at varying speeds.

The key parts of the application are as follows. First of all, the SoundTouchInterop32.cs and SoundTouchInterop64.cs files contain the P/Invoke signatures needed to call the x86 and x64 builds of SoundTouch, which are SoundTouch.dll and SoundTouch_x64.dll respectively. Make sure both DLLs can be found at run time, for example by placing them next to the executable.

I’ve also created a wrapper class called SoundTouch which simplifies access to SoundTouch and calls the correct P/Invoke method depending on whether the process is 64-bit or not.
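
As a rough sketch of that dispatch pattern (not the demo project's actual wrapper), the code below declares one P/Invoke signature per DLL and chooses between them with Environment.Is64BitProcess. The class names are hypothetical. soundtouch_getVersionId is one of the functions exported by the SoundTouch DLL interface, and you may need to state the calling convention explicitly to match the DLL build you ship.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical names; the demo project keeps its signatures in
// SoundTouchInterop32.cs and SoundTouchInterop64.cs.
static class Interop32Sketch
{
    // Bound to the x86 build of SoundTouch.
    [DllImport("SoundTouch.dll")]
    public static extern uint soundtouch_getVersionId();
}

static class Interop64Sketch
{
    // Bound to the x64 build of SoundTouch.
    [DllImport("SoundTouch_x64.dll")]
    public static extern uint soundtouch_getVersionId();
}

static class SoundTouchDispatchSketch
{
    // Route the call to whichever DLL matches the bitness of the current process.
    public static uint GetVersionId()
    {
        return Environment.Is64BitProcess
            ? Interop64Sketch.soundtouch_getVersionId()
            : Interop32Sketch.soundtouch_getVersionId();
    }
}
```

The DLL is only loaded when one of the extern methods is first called, so a 64-bit process never tries to load the 32-bit library and vice versa.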

The next part is VarispeedSampleProvider. This implements NAudio’s ISampleProvider interface so it can be easily inserted into a signal chain. It also exposes a PlaybackRate property which can be set to 1.0 for regular speed and 2.0 for 2x playback etc.
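
To show what implementing ISampleProvider involves, here is a minimal pass-through provider. It is a hypothetical class for illustration only and performs no time stretching; the real VarispeedSampleProvider does its SoundTouch processing inside Read.

```csharp
using NAudio.Wave;

// Hypothetical example: forwards audio unchanged to show the interface shape.
class PassThroughSampleProvider : ISampleProvider
{
    private readonly ISampleProvider sourceProvider;

    public PassThroughSampleProvider(ISampleProvider sourceProvider)
    {
        this.sourceProvider = sourceProvider;
    }

    // The output format is the same as the source format.
    public WaveFormat WaveFormat
    {
        get { return sourceProvider.WaveFormat; }
    }

    // Write up to 'count' samples into 'buffer' starting at 'offset' and
    // return how many were written. Returning 0 signals end of stream.
    public int Read(float[] buffer, int offset, int count)
    {
        return sourceProvider.Read(buffer, offset, count);
    }
}
```

Because the class both consumes and exposes an ISampleProvider, it can sit anywhere between a reader and an output device.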

public VarispeedSampleProvider(ISampleProvider sourceProvider, 
    int readDurationMilliseconds, SoundTouchProfile soundTouchProfile)

The constructor for VarispeedSampleProvider takes three arguments. The first is the source provider, which is the input stream you want to speed up or slow down (in our example, an audio file). The second, readDurationMilliseconds, controls how much audio is read from the source provider in a single read. When you change the playback speed, the duration of audio read from the source no longer matches the time it takes to play the result, so this value describes the input side only. Something like 100ms will be fine to use here.
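
The arithmetic behind the read duration can be sketched as follows. The class and method names are hypothetical, and the output duration is an approximation that ignores SoundTouch's internal buffering.

```csharp
using System;

static class ReadSizeSketch
{
    // Number of float samples (all channels interleaved) that cover
    // readDurationMilliseconds of source audio.
    public static int SourceSamplesPerRead(int sampleRate, int channels, int readDurationMilliseconds)
    {
        return (int)((long)sampleRate * channels * readDurationMilliseconds / 1000);
    }

    // Approximate time needed to play one source read at the given rate.
    public static double OutputMilliseconds(int readDurationMilliseconds, double playbackRate)
    {
        return readDurationMilliseconds / playbackRate;
    }
}
```

For 44.1kHz stereo audio and a 100ms read, SourceSamplesPerRead(44100, 2, 100) gives 8820 samples. At a playback rate of 2.0 that read lasts about 50ms on output, and at 0.5 about 200ms.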

Finally, the SoundTouchProfile allows us to specify what SoundTouch options we want to use. There are a few switches you can experiment with which adjust quality and performance. The most significant is UseTempo. In tempo mode, SoundTouch will pitch compensate when you change speed, so the music will remain at the same pitch, just a different speed. This avoids the “chipmunk effect” when you play back at higher speeds. In the demo app I let you switch between modes while playback is stopped.
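
To put a number on the “chipmunk effect”, the sketch below estimates the pitch change you would hear at a given playback rate. It is an idealised model with hypothetical names: without pitch compensation, frequencies scale with the playback rate, so the shift is 12 times the base-2 logarithm of the rate, while tempo mode is treated as leaving pitch unchanged.

```csharp
using System;

static class PitchShiftSketch
{
    // Approximate pitch change in semitones at the given playback rate.
    public static double SemitoneShift(double playbackRate, bool useTempo)
    {
        if (useTempo)
        {
            return 0.0; // tempo mode: speed changes, pitch is compensated
        }
        // No compensation: frequencies are multiplied by the playback rate.
        return 12.0 * Math.Log(playbackRate, 2.0);
    }
}
```

With UseTempo off, 2x playback comes out 12 semitones (one octave) higher and 0.5x playback one octave lower.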

All that remains is to create an AudioFileReader, pass it into the VarispeedSampleProvider, and then pass that to the output device (WaveOutEvent in our example) to be played.
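
A minimal console sketch of that chain might look like this. It assumes the demo project's VarispeedSampleProvider and SoundTouchProfile classes are in scope, and the two-argument SoundTouchProfile constructor (useTempo, useAntiAliasing) is an assumption, so check it against the code on GitHub.

```csharp
using System;
using NAudio.Wave;

class VarispeedPlaybackSketch
{
    static void Main(string[] args)
    {
        // Path of the audio file to play, passed on the command line.
        string path = args[0];

        using (var reader = new AudioFileReader(path))
        using (var outputDevice = new WaveOutEvent())
        {
            // Assumption: SoundTouchProfile(useTempo, useAntiAliasing).
            var profile = new SoundTouchProfile(true, false);

            // Read 100ms of source audio at a time, as suggested above.
            var speedControl = new VarispeedSampleProvider(reader, 100, profile);
            speedControl.PlaybackRate = 1.5f; // 1.0 is normal speed

            outputDevice.Init(speedControl);
            outputDevice.Play();

            Console.WriteLine("Playing at 1.5x. Press Enter to stop.");
            Console.ReadLine();
        }
    }
}
```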

One other thing worth mentioning is that the demo project lets you reposition within the file. When you do so, it’s a good idea to tell SoundTouch that a reposition has taken place so that its internal buffers can be flushed. This is done by calling Reposition on the VarispeedSampleProvider.
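
As a small sketch, a seek helper could pair the two steps like this. The helper and its class are hypothetical; CurrentTime is the standard NAudio WaveStream property that AudioFileReader inherits, and Reposition is the method described above.

```csharp
using System;
using NAudio.Wave;

static class SeekSketch
{
    // Hypothetical helper: move the reader, then notify the varispeed stage.
    public static void SeekTo(AudioFileReader reader,
        VarispeedSampleProvider speedControl, TimeSpan position)
    {
        reader.CurrentTime = position; // move the read position in the file
        speedControl.Reposition();     // let SoundTouch discard stale buffered audio
    }
}
```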

Want to see the code? You can access it on GitHub.

Comments

September 11. 2017 04:52

Hi Mark
I am building a prototype lip-sync application for animation.
I'd like to adjust the speed and tempo of segments (based on milliseconds) within a single audio.
I am trying to get your varispeed code to work with the concatenation class without much luck. This might not even be the best way to achieve what I am after.
May I ask what you would suggest I look at to varispeed adjust millisecond-based segments within a single audio file?
Thanks for your time. Please let me know if I can provide any further information.
Background
I have an audio file of spoken words and have aligned the text with the audio so I have a list of words with start and end times in milliseconds.
I'd like to be able to process each word's pitch and tempo using the millisecond timings from the alignment.
I understand that if I change the timing of one segment A then I need to add/subtract the final segment time for A from the positions of the next segments B, C, D, etc. This is a simple recalculating step I believe.
Cheers

Tgd87

September 12. 2017 12:30

I'd make my own custom IWaveProvider for something like this. That way you can control exactly how many bytes are passed through the VarispeedSampleProvider before changing the pitch settings to other values.

Mark Heath

September 13. 2017 01:53

Great thanks for your reply Mark. I'll give that a shot
Cheers

Tgd87

October 15. 2018 20:36

The following line of code in class VarispeedSampleProvider throws two errors on build:
public WaveFormat WaveFormat => sourceProvider.WaveFormat;
Errors:
; expected
Invalid token ';' in class, struct, or interface member declaration
Any ideas?

ScooterGirl

October 15. 2018 20:52

what version of Visual Studio are you using?

Mark Heath

October 16. 2018 01:39

VS2013. I’m thinking I should be in 2015 or greater?

ScooterGirl

October 16. 2018 14:33

Yes, I'd recommend going to Visual Studio 2017. There's no good reason not to, and you can use the latest C# features.

Mark Heath
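
For anyone who has to stay on an older compiler: the failing line uses an expression-bodied property, a C# 6 feature that arrived with Visual Studio 2015. A sketch of the equivalent older syntax is below; the surrounding class is hypothetical and exists only so the property has something to live in.

```csharp
using NAudio.Wave;

// Hypothetical container class for the property.
class WaveFormatPropertySketch
{
    private readonly ISampleProvider sourceProvider;

    public WaveFormatPropertySketch(ISampleProvider sourceProvider)
    {
        this.sourceProvider = sourceProvider;
    }

    // C# 6 and later:
    // public WaveFormat WaveFormat => sourceProvider.WaveFormat;

    // Equivalent form accepted by the compiler that ships with VS2013.
    public WaveFormat WaveFormat
    {
        get { return sourceProvider.WaveFormat; }
    }
}
```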

November 17. 2018 21:17

Hello Mark, may I use this useful SampleProvider on my own Open Source project... maybe also doing some modifications?

Zam

November 17. 2018 21:50

Of course, glad it's of use to you

Mark Heath

November 18. 2018 07:35

Thanks!
Also gratz for the great work

Zam

December 5. 2019 17:43

Thanks for this tutorial! Please excuse the stupid question, but do you know how it would be possible to load a WAV file instead of MP3?
EDIT:
Nevermind, I just needed to create a WaveStream first:
WaveStream waveStream = new WaveFileReader(filePath);
WaveChannel32 inputStream = new WaveChannel32(waveStream);

blahblahblahblah

December 11. 2019 11:50

Just use AudioFileReader for a simpler way to do this

Mark Heath

April 12. 2020 01:01

What do you mean? i wanted to do the same thing...

FlusherCheese

August 26. 2020 09:33

Hi Mark, I'm using your code in an app that processes educational audio files (e.g. to automatically repeat sections, or play sections more slowly, etc)
It's all good, except that the sound quality is quite poor when slowing down audio; it becomes very choppy; it sounds like some kind of unintentional reverb.
Any clues? From initial debugging it looks like the issue is in the actual SoundTouch library, not your wrapper.

Mijin

August 26. 2020 09:46

Yes, unfortunately it's very hard to generate natural sounding slowed down audio. You'll hear similar artefacts slowing down a YouTube video below 50%

Mark Heath

August 27. 2020 03:33

Thanks for the reply.
I would say though that YouTube-quality audio slowdown would be fantastic. The results I'm getting right now are definitely inferior; even slowing down to 80% speed already results in audio that you need to strain to clearly hear exactly what's been spoken.

Mijin

August 28. 2020 16:40

OK, I just tried randomly screwing around with the SoundTouch settings (without entirely knowing what I was doing).
I found that reducing the SequenceMS parameter greatly improves the clarity of slowed down speech.

Mijin