Varispeed Playback with NAudio using SoundTouch
In this post I will demonstrate how you can implement varispeed playback with NAudio using the excellent SoundTouch library. To do so, I’ve prepared a very simple Windows Forms Application that lets you load an audio file and play it at varying speeds.
The key parts of the application are as follows. First of all, the interop files (one of which is SoundTouchInterop64.cs) include the necessary PInvoke signatures to call the x86 and x64 versions of SoundTouch respectively. The x64 build is SoundTouch_x64.dll. Make sure the native DLLs are available in the path when running.
I’ve also created a wrapper class called SoundTouch, which simplifies access to SoundTouch and calls the correct PInvoke method depending on whether the process is 64-bit or not.
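As a rough illustration of that dispatch, here is a minimal sketch rather than the demo's actual wrapper. The class name SoundTouchNative is hypothetical, the x86 DLL name is an assumption (only SoundTouch_x64.dll is named above), and soundtouch_createInstance is the instance-creation export declared in SoundTouch's DLL header.

```csharp
using System;
using System.Runtime.InteropServices;

public static class SoundTouchNative
{
    // The x64 DLL name is taken from the post; the x86 name is an assumption.
    private const string Dll32 = "SoundTouch.dll";
    private const string Dll64 = "SoundTouch_x64.dll";

    [DllImport(Dll32, EntryPoint = "soundtouch_createInstance", CallingConvention = CallingConvention.Cdecl)]
    private static extern IntPtr CreateInstance32();

    [DllImport(Dll64, EntryPoint = "soundtouch_createInstance", CallingConvention = CallingConvention.Cdecl)]
    private static extern IntPtr CreateInstance64();

    // Name of the native library the current process would load.
    public static string LibraryName
    {
        get { return Environment.Is64BitProcess ? Dll64 : Dll32; }
    }

    // Dispatch to the import that matches the process bitness.
    public static IntPtr CreateInstance()
    {
        return Environment.Is64BitProcess ? CreateInstance64() : CreateInstance32();
    }
}
```

Because P/Invoke bindings are resolved on first call, the import for the other architecture is never loaded, so only the matching DLL needs to be present.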
The next part is VarispeedSampleProvider. This implements NAudio’s ISampleProvider interface so it can be easily inserted into a signal chain. It also exposes a PlaybackRate property, which can be set to 1.0 for regular speed, 2.0 for 2x playback, and so on.
public VarispeedSampleProvider(ISampleProvider sourceProvider, int readDurationMilliseconds, SoundTouchProfile soundTouchProfile)
The constructor for VarispeedSampleProvider takes a source provider, which is the input stream you want to speed up (in our example an audio file), and readDurationMilliseconds, which controls how much will be read from the source provider in a single read. Obviously when you are speeding up or slowing down, the amount of audio you need to read from the source is different from the amount of time taken to play that audio. Something like 100ms will be fine to use here.
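To make the arithmetic concrete, the sketch below shows one way a read duration could translate into a sample count, and how long that chunk lasts once the speed is changed. The ReadSizing class and its method names are hypothetical and are not taken from the demo code.

```csharp
using System;

public static class ReadSizing
{
    // Number of interleaved float samples that cover the given duration of source audio.
    public static int SamplesForDuration(int sampleRate, int channels, int durationMilliseconds)
    {
        return (sampleRate * channels * durationMilliseconds) / 1000;
    }

    // Approximate playback time of a source chunk at the given playback rate.
    public static double OutputMilliseconds(int sourceMilliseconds, double playbackRate)
    {
        return sourceMilliseconds / playbackRate;
    }
}
```

For 44.1 kHz stereo and a 100ms read, that is 8820 samples per read, which play back in roughly 50ms at 2x speed.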
SoundTouchProfile allows us to specify what SoundTouch options we want to use. There are a few switches you can experiment with which adjust quality and performance. The most significant is UseTempo. In tempo mode, SoundTouch will pitch compensate when you change speed, so the music will remain at the same pitch, just a different speed. This avoids the “chipmunk effect” when you play back at higher speeds. In the demo app I let you switch between modes while playback is stopped.
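To put a number on the “chipmunk effect”: when speed is changed without pitch compensation, pitch rises or falls with the rate, and the shift in semitones is 12 times the base-2 logarithm of the rate. The helper below is a hypothetical illustration of that formula and is not part of the demo.

```csharp
using System;

public static class PitchMath
{
    // Pitch shift in semitones when audio is resampled at the given rate
    // with no pitch compensation (i.e. not in tempo mode).
    public static double SemitoneShift(double playbackRate)
    {
        return 12.0 * Math.Log(playbackRate, 2.0);
    }
}
```

So 2x playback without tempo mode raises everything by a full octave (12 semitones), while 0.5x lowers it by an octave.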
All that remains is to create an AudioFileReader, pass it into the VarispeedSampleProvider, and then pass that to the output device (WaveOutEvent in our example) to be played.
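A minimal sketch of that signal chain might look like the following. It relies on the VarispeedSampleProvider and SoundTouchProfile classes from the demo project; the SoundTouchProfile constructor arguments and the VarispeedDemo class are assumptions made for illustration, so check them against the GitHub code.

```csharp
using System;
using System.Threading;
using NAudio.Wave;

public static class VarispeedDemo
{
    public static void Play(string path)
    {
        using (var reader = new AudioFileReader(path))
        using (var output = new WaveOutEvent())
        {
            // Assumption: the profile takes (useTempo, useAntiAliasing).
            var profile = new SoundTouchProfile(true, false);

            // 100 is the read duration in milliseconds, as suggested in the post.
            var speedControl = new VarispeedSampleProvider(reader, 100, profile);
            speedControl.PlaybackRate = 1.5f; // 1.0 is normal speed

            output.Init(speedControl);
            output.Play();

            // Block until playback finishes; a real UI would handle PlaybackStopped instead.
            while (output.PlaybackState == PlaybackState.Playing)
            {
                Thread.Sleep(200);
            }
        }
    }
}
```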
The one other thing worth mentioning in the demo project is that I allow you to reposition in the file, and when you do so, it’s a good idea to tell SoundTouch that a reposition has taken place so internal buffers can be flushed. This is done by calling Reposition.
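As a hedged sketch of a seek handler: the source reader is moved first, and Reposition is called afterwards so that stale audio is not played. It is assumed here that Reposition takes no arguments and lives on the demo's VarispeedSampleProvider; the SeekHelper class is hypothetical.

```csharp
using System;
using NAudio.Wave;

public static class SeekHelper
{
    // Assumption: Reposition() is parameterless and exposed by the demo's provider.
    public static void SeekTo(AudioFileReader reader, VarispeedSampleProvider speedControl, TimeSpan position)
    {
        reader.CurrentTime = position; // move the source stream
        speedControl.Reposition();     // let SoundTouch flush its internal buffers
    }
}
```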
Want to see the code? You can access it on GitHub.
I am building a prototype lip-sync application for animation.
I'd like to adjust the speed and tempo of segments (based on milliseconds) within a single audio.
I am trying to get your varispeed code to work with the concatenation class without much luck. This might not even be the best way to achieve what I am after.
May I ask what you would suggest I look at to varispeed adjust millisecond-based segments within a single audio file?
Thanks for your time. Please let me know if I can provide any further information.
I have an audio file of spoken words and have aligned the text with the audio so I have a list of words with start and end times in milliseconds.
I'd like to be able to process each word's pitch and tempo using the millisecond timings from the alignment.
I understand that if I change the timing of one segment A then I need to add/subtract the change in A's duration from the positions of the next segments B, C, D, etc. This is a simple recalculation step, I believe.
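The recalculation step described here could be sketched as follows. The WordSegment type and Retime method are hypothetical names, and the sketch assumes the segments are sorted by start time and that each word has its own playback rate.

```csharp
using System;
using System.Collections.Generic;

public sealed class WordSegment
{
    public string Word;
    public double StartMs;
    public double EndMs;
    public double Rate; // 1.0 = unchanged, 2.0 = twice as fast
}

public static class SegmentTiming
{
    // Returns new segments whose times reflect the per-word rate changes.
    public static List<WordSegment> Retime(IEnumerable<WordSegment> segments)
    {
        var result = new List<WordSegment>();
        double shift = 0; // accumulated change in duration of all earlier segments

        foreach (var s in segments)
        {
            double oldLength = s.EndMs - s.StartMs;
            double newLength = oldLength / s.Rate;
            double newStart = s.StartMs + shift;

            result.Add(new WordSegment
            {
                Word = s.Word,
                StartMs = newStart,
                EndMs = newStart + newLength,
                Rate = s.Rate
            });

            shift += newLength - oldLength;
        }
        return result;
    }
}
```

Gaps between words keep their original length in this version, because only the audio inside each segment is stretched.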
Mark Heath: I'd make my own custom IWaveProvider for something like this. That way you can control exactly how many bytes are passed through the VarispeedSampleProvider before changing the pitch settings to other values.
Tgd87: Great, thanks for your reply Mark. I'll give that a shot.
ScooterGirl: The following line of code in class VarispeedSampleProvider throws two errors on build:
public WaveFormat WaveFormat => sourceProvider.WaveFormat;
Invalid token ';' in class, struct, or interface member declaration
Mark Heath: What version of Visual Studio are you using?
ScooterGirl: VS2013. I’m thinking I should be in 2015 or greater?
Mark Heath: Yes, I'd recommend going to 2017. There's no good reason not to, and you can use the latest C# features.
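For anyone who has to stay on VS2013: the error is consistent with expression-bodied members, which arrived in C# 6 (VS2015). The same read-only property can be written with a classic getter. The sketch below uses a hypothetical stand-in class with a string property so that it compiles without NAudio; in the demo the property type would be WaveFormat.

```csharp
using System;

// Hypothetical stand-in for the provider, used only to show the syntax.
public sealed class FormatHolder
{
    private readonly string format;

    public FormatHolder(string format)
    {
        this.format = format;
    }

    // C# 6 and later: public string Format => format;
    // C# 5 (VS2013) compatible equivalent:
    public string Format
    {
        get { return format; }
    }
}
```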
Zam: Hello Mark, may I use this useful SampleProvider on my own Open Source project... maybe also doing some modifications?
Mark Heath: Of course, glad it's of use to you.
Also gratz for the great work
blahblahblahblah: Thanks for this tutorial! Please excuse the stupid question, but do you know how it would be possible to load a WAV file instead of MP3?
Never mind, I just needed to create a WaveStream first:
WaveStream waveStream = new WaveFileReader(filePath);
WaveChannel32 inputStream = new WaveChannel32(waveStream);
Mark Heath: Just use AudioFileReader for a simpler way to do this.
FlusherCheese: What do you mean? I wanted to do the same thing...
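As an illustration of the AudioFileReader suggestion: NAudio's AudioFileReader chooses a suitable reader from the file extension (WAV, MP3, AIFF and others) and already exposes floating point samples, so no separate WaveFileReader or WaveChannel32 step is needed. The ReaderExample wrapper below is a hypothetical name for this sketch.

```csharp
using NAudio.Wave;

public static class ReaderExample
{
    // Works for "song.wav" just as it does for "song.mp3".
    public static AudioFileReader Open(string filePath)
    {
        // AudioFileReader is both a WaveStream and an ISampleProvider,
        // so it can be passed straight into VarispeedSampleProvider.
        return new AudioFileReader(filePath);
    }
}
```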
Mijin: Hi Mark, I'm using your code in an app that processes educational audio files (e.g. to automatically repeat sections, or play sections more slowly, etc.).
It's all good, except that the sound quality is quite poor when slowing down audio; it becomes very choppy; it sounds like some kind of unintentional reverb.
Any clues? From initial debugging it looks like the issue is in the actual SoundTouch library, not your wrapper.
Mark Heath: Yes, unfortunately it's very hard to generate natural sounding slowed down audio. You'll hear similar artefacts slowing down a YouTube video below 50%.
Mijin: Thanks for the reply.
I would say though that YouTube-quality audio slowdown would be fantastic. The results I'm getting right now are definitely inferior; even slowing down to 80% speed already results in audio that you need to strain to clearly hear exactly what's been spoken.
Mijin: OK, I just tried randomly screwing around with the SoundTouch settings (without entirely knowing what I was doing).
I found that reducing the SequenceMS parameter greatly improves the clarity of slowed down speech.
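For readers who want to try the same tweak, here is a hedged sketch of setting the sequence length through the native API. It calls the soundtouch_setSetting export directly; the demo's own wrapper may already expose this differently, and the SoundTouchSettings class name is hypothetical. The handle is assumed to come from soundtouch_createInstance.

```csharp
using System;
using System.Runtime.InteropServices;

public static class SoundTouchSettings
{
    // Value of SETTING_SEQUENCE_MS in SoundTouch.h.
    public const int SettingSequenceMs = 3;

    [DllImport("SoundTouch_x64.dll", EntryPoint = "soundtouch_setSetting", CallingConvention = CallingConvention.Cdecl)]
    private static extern int SetSetting(IntPtr handle, int settingId, int value);

    // Returns true if SoundTouch accepted the new sequence length.
    public static bool SetSequenceMs(IntPtr handle, int milliseconds)
    {
        return SetSetting(handle, SettingSequenceMs, milliseconds) != 0;
    }
}
```

A suitable value depends on the material, so it is worth trying a few settings by ear on speech recordings.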