Part of The Podcast Host

Cleanvoice Review: Auto-Remove Filler Words From Your Audio

AI is no stranger to the audio world. It’s been used for quite some time, from behind the scenes in R&D to the audio tools available on the market today. I recently took a look at a newer player in the audio AI world, Cleanvoice. Cleanvoice’s main purpose is to automatically filter out filler words or sounds, such as um, ah, and uh, in any language. In this Cleanvoice review, I’ll take you through my findings on the accuracy and quality of its edits. For this test, I used control samples in English and German. Will this replace podcast editors? Read on to find out!

Cleanvoice at a Glance

The user interface for Cleanvoice is housed 100% on their website. It uses a drag-and-drop system, which makes it quite user-friendly. Once your audio is uploaded, enter your license key, hit “Clean my audio”, and you’re well on your way. Processing time varies with file duration. If you do not want to wait, the system will email you a link to download the filtered file later. The link stays active for 7 days, after which your files are removed from their server.

Cleanvoice Review: The English Test

Below is a raw recording with an excessive amount of filler words:

Now let’s listen to the AI-edited file:

The Findings:

The best way to really know how software will perform is to see how it handles its task in imperfect scenarios. I deliberately included 19 words that are considered filler by Cleanvoice. Additionally, I made my pacing awkward and inconsistent, and I slurred some of the filler words into the preceding words without pausing in between.

Right away I noticed that it took an already inconsistent pacing and made it somewhat worse due to awkward “edits”.  For instance, the word that immediately followed a removed filler was often placed right up to the word preceding the filler that was removed. This created unnatural and choppy sentences.  In other cases, endings of actual words were cut off. This seemed to be the predominant action taken by the software where the filler words/sounds were slurred with preceding words. In fewer instances, beginnings of words were chopped off. In addition, a few filler words were not removed at all.  This resulted in some very awkward-sounding editing.

The original test file was approximately 45 seconds long.  The “edited” file was approximately 39 seconds.    

When an editor removes words, they need to decide how much space each separate instance requires. The pacing and inflection of the surrounding words play a major role in that decision. No single rule applies to every cut. You must use your ears to find what sounds the most natural to a listener.
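To make the idea of per-cut spacing concrete, here is a minimal sketch in Python. It is not how Cleanvoice or any particular editor works; the function name `remove_filler`, the 10 ms fade length, and the toy sample rate are all assumptions chosen for illustration. Each cut gets its own pause length, and a short fade on either side keeps the word edges from sounding clipped.

```python
def remove_filler(samples, start, end, sample_rate, pause_s, fade_s=0.01):
    """Cut samples[start:end] and leave `pause_s` seconds of silence in its place.

    Short fades on both sides of the cut avoid clicks and hard word edges.
    """
    head = list(samples[:start])
    tail = list(samples[end:])
    fade_n = max(1, round(fade_s * sample_rate))

    # Fade out the last samples before the cut (gain falls to 0).
    n = min(fade_n, len(head))
    for i in range(n):
        head[len(head) - n + i] *= 1 - (i + 1) / n

    # Fade in the first samples after the cut (gain rises from 0).
    n = min(fade_n, len(tail))
    for i in range(n):
        tail[i] *= i / n

    silence = [0.0] * round(pause_s * sample_rate)
    return head + silence + tail


# Two fillers in the same one-second clip, each given a different pause.
# Work from the last cut to the first so the earlier indices stay valid.
clip = [1.0] * 1000  # placeholder audio at a toy rate of 1000 Hz
clip = remove_filler(clip, 700, 800, 1000, pause_s=0.02)
clip = remove_filler(clip, 200, 400, 1000, pause_s=0.10)
```

The point of the `pause_s` argument is that it is chosen per cut. A fixed value for every filler is roughly what produces the butted-up, choppy result described above.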

The Human Edit: English

Let’s listen to a human-edited file:

Granted, the pacing/delivery still sucks. If you wanted the pacing 100% smoothed out, you would need inflection correction or a pickup recording.   This is not required, however, as the general meaning remains intact.   The main focus here is that words weren’t cut off, nor are there partial sounds.  In the AI edit, there were a few spots that sounded like burping while speaking!

Cleanvoice Review: The German Test

Kessi, of Trilunis Studios, lent her voice and her native German for this test.

This recording is more conversational sounding than my deliberately inconsistent pacing in the English test. Her pacing flows nicely.  It’s longer than the English test at approximately two minutes.  Kessi’s observations are as follows:

  • Cleanvoice cut out quite a few words in the middle of sentences.
  • It cut off words that didn’t even have “ah” or “um” in them.
  • The incorrect cuts of a word’s first syllable were almost always preceded by an “um”.
  • It did catch the majority of the “ah” fillers.

Conclusion on German Cleanvoice Test:

The audio suffered similarly to the English test. Syllables (or parts) of words were cut off when preceded or followed by filler words, or when the word itself contained filler-like pronunciation.

What Cleanvoice Did Well:

  • It kept the file format intact. Upload a WAV, you get a WAV. Upload an MP3, you get an MP3.
  • It kept the sample rate intact. The test files were 48kHz and 96kHz, and each edited file remained at its source sample rate.
  • The interface is clean and easy to use.
  • No digital clicks were present in the edit.

Note: The bit depth was 32 bit upon download. However, neither file was recorded at 32 bit, so the downloaded files will never be true 32 bit audio.
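As a rough illustration of why a higher bit depth on download adds nothing, the sketch below pads some made-up 16 bit integer samples out to a 32 bit range and then counts how many bits actually carry information. The 16 bit source depth, the integer (rather than floating point) format, and both function names are assumptions for the example; the review does not state the original bit depth or how Cleanvoice converts.

```python
def pad_to_32_bit(samples_16):
    """Re-express signed 16 bit samples in a signed 32 bit range."""
    return [s << 16 for s in samples_16]


def effective_bits(samples_32):
    """Return 32 minus the number of low bits that are zero in every sample."""
    combined = 0
    for s in samples_32:
        combined |= s & 0xFFFFFFFF  # view each sample as an unsigned 32 bit word
    if combined == 0:
        return 0
    trailing_zeros = (combined & -combined).bit_length() - 1
    return 32 - trailing_zeros


original = [0, 1, -1, 12345, -32768, 32767]  # hypothetical 16 bit samples
padded = pad_to_32_bit(original)
bits_used = effective_bits(padded)
```

Under these assumptions the padded file is twice the size, but the lower 16 bits of every sample are empty, so the resolution is still that of the source.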


What Could Be Improved:

  • At the time of this review, the edits are not very consistent and are sometimes quite inaccurate. I would not recommend it for commercial use or lengthy projects.
  • The maximum upload size is 400MB. For longer recordings, this forces you to upload a compressed MP3 instead of a WAV. Usually, once a file has been converted to a compressed format, you do not want to do any further editing or processing on it. File compression should always be the last thing you do.
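To get a feel for how quickly an uncompressed file hits that upload limit, here is a back-of-the-envelope calculation. The sample rates are the ones used in this review; the bit depths, channel counts, and the reading of "400MB" as decimal megabytes are assumptions, and the small WAV header is ignored.

```python
def max_wav_minutes(limit_bytes, sample_rate_hz, bit_depth, channels):
    """Minutes of uncompressed PCM audio that fit within limit_bytes."""
    bytes_per_second = sample_rate_hz * (bit_depth // 8) * channels
    return limit_bytes / bytes_per_second / 60


LIMIT_BYTES = 400 * 1000 * 1000  # assumes "400MB" means decimal megabytes

# Hypothetical recording setups: (sample rate in Hz, bit depth, channels)
setups = [(48000, 16, 2), (48000, 24, 1), (96000, 24, 2)]
for rate, depth, channels in setups:
    minutes = max_wav_minutes(LIMIT_BYTES, rate, depth, channels)
    print(f"{rate} Hz, {depth} bit, {channels} ch: about {minutes:.0f} minutes")
```

At 48kHz, 16 bit stereo the limit works out to a little under 35 minutes, so a typical hour-long episode would not fit as a WAV under these assumptions.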

Cleanvoice Review: Final Thoughts

AI is becoming more prevalent in our everyday lives. The thing with AI is that it needs time to learn and improve. It will be interesting to see this process play out. Cleanvoice is currently working on several new features, according to their website. Perhaps their filler AI editor will see some improvements when those new features are released.

In conclusion, the edits produced are too choppy and inaccurate in both English and German.  It’s safe to say that a podcast editor’s job is secure, for now. Lastly, the concept of Cleanvoice is exciting. It is one that has potential to become a must in the audio toolkit. However, it’s not quite there yet.  

Editor’s Note

Cleanvoice has released a beta test of multi-track editing. Again, it’ll be interesting to see how this tool improves.

Some podcasters love to learn and hone their editing and production skills. Others want to focus 100% on their content, and choose to automate or outsource the audio work. If you’re in the latter camp, be sure to check out Alitu, our own ‘Podcast Maker’ tool which makes recording, editing, and publishing a podcast as simple as humanly possible. Sign up for a 7-day free trial and give it a spin.