AI podcasting tools are popping out like bunnies in springtime, promising quick, efficient and superhuman results. These tools vary, and none can exceed human ability, given enough time and resources. But some AI podcasting tools might be worth your consideration if you need to do a task quickly. Let’s look at processes like scripting and recording, marketing and promotion, and making your cover art or podcast logo. We’ll examine AI tools for these tasks, how they work, and whether or not they can help you.
Transcriptions, Speech, and Artificial Voices
No matter what you do with your podcast, two central concerns are what you say and how you say it. Some people need help getting the words recorded, whether in audio or text form. Here are some AI tools that can help you make your podcast more accessible.
Whilst you shouldn’t use ChatGPT to script your podcast, it can certainly create a decent first draft for you to work from. It’s worth noting, too, that the demand to try ChatGPT right now is so high that the servers are usually overwhelmed.
If you do decide to use ChatGPT for your podcast, fact-check it and use it carefully. As with all tools, don’t use one for every task.
Whisper is another tool by OpenAI (the team who brought us ChatGPT). Whisper has been trained on 680,000 hours of audio data on the web, meaning it can generate some of the most accurate auto-generated podcast transcripts to date.
Generating transcripts for your podcast can be expensive or time-consuming. You either pay a ton of cash for a human to do the job or you create auto-transcripts that take almost as long to correct as they would do to create from scratch. Using OpenAI’s Whisper could save us all a whole bunch of time and money.
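For the curious, here is a minimal sketch of what running Whisper yourself might look like in Python. It assumes the open-source `openai-whisper` package and ffmpeg are installed. The `"base"` model size and the helper names `format_timestamp`, `format_segments`, and `transcribe_episode` are illustrative choices, not anything the tools in this article prescribe.

```python
def format_timestamp(seconds: float) -> str:
    """Turn a number of seconds into an HH:MM:SS label."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def format_segments(segments) -> str:
    """Render Whisper-style segments (dicts with 'start' and 'text') as readable lines."""
    lines = []
    for segment in segments:
        label = format_timestamp(segment["start"])
        lines.append(f"[{label}] {segment['text'].strip()}")
    return "\n".join(lines)


def transcribe_episode(audio_path: str, model_name: str = "base") -> str:
    """Transcribe an audio file with Whisper and return a timestamped transcript."""
    # Imported here so the formatting helpers work even without the package.
    import whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    return format_segments(result["segments"])
```

Calling `transcribe_episode("episode.mp3")` (a hypothetical file name) would return one line per spoken segment. You'd still want to proofread the result, since names and jargon are where auto-transcripts tend to slip.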
Descript started as a nifty transcription tool with artificial voices. Over time, it’s proven itself as a dependable podcasting tool for transcripts while growing to include video and storyboarding features. As Descript’s Lyrebird AI research division continues to grow, their stock voices have changed to seem more natural. If you record a 30-minute script for Descript’s Overdub voice training, it can generate new dialogue using an Overdub voice based on your recording.
So can you copy and paste a page of text into Descript’s Compositions window, and voila, instant podcast? To find out, I took a page of a story in the public domain from Project Gutenberg and used that as my podcast text. Descript can help you edit and upload your audio directly to your media host, so it’s pretty good as podcasting tools go. The AI voice, however, remained to be seen.
How Do Overdub Voices Work in Descript?
Training a new Overdub voice took about 40 minutes. I read Descript’s training script about wildlife and climate. Because I’m not a professional voice actor, it was tiring, my throat hurt, and my voice didn’t sound great.
Descript warned me that my pricing tier would limit my Overdub voice to a 1,000-word vocabulary unless I upgraded to a higher tier. I copied a story in the public domain from If: Worlds of Science Fiction, from July of 1952, and pasted it in as a new project. Descript showed me a pop-up warning, saying that I’d need to upgrade, or Descript would replace words outside the vocabulary with “gibberish.” The text was pretty simple, though, so I wasn’t concerned.
Listen to these samples and find out if you can tell the difference. Here’s a Descript recording of me reading Robots Of The World: Arise! by Mari Wolf.
Here’s a Descript recording of my Overdub voice reading the same text.
Well, they did warn me about the gibberish.
Overdub also took my broad American vowels and made them even flatter. Of course, if I still don’t like my voice, I can have any of Descript’s stock voices read my text for no additional charge.
Different voices in Descript’s stable have some emotional features. You can train your Overdub voice to imitate some emotional states. Training Overdub to imitate all the emotional states that a podcast audience would accept as meaningful would take much longer than recording my voice live.
The alternative would be to pay a voice actor to read my preferred text, record it and send it back. This would be more expensive and not as fast, but the result would have a more significant emotional impact.
Planning Your Podcast With AI
When you have an idea for a new podcast, there are hurdles like impostor syndrome, writer’s block, or self-doubt. These can hold you back and keep you from progressing. Unfortunately, there isn’t any AI software that generates Pure Genius and Performance Excellence. But there are a few tools that can give you a crutch to lean on as you get started.
Banned in school districts across the nation for its ability to write a persuasive research paper, ChatGPT is one of the most talked about AI models on the planet, and it’s a lot of fun to play with. But depending on it to script your podcast episodes for you might not be the best idea. ChatGPT can make false information sound like facts so solid that it made me plan a trip to Glasgow to see the dinosaur custard cream factory.
Truthiness aside, ChatGPT is a great sounding board and idea generator. If you want to write a podcast episode about the tallest buildings in the world, ChatGPT will provide you with a list of buildings. Then, you’re free to research these in more reliable detail and add your own perspective. What would it sound like when Spidey swings between them? How much would it cost to clean all those windows? Which would Godzilla find the most tempting? Only you can analyze and deliver this to your audience.
When you get an idea for a new podcast, everything feels great. After a day or two, the shine wears off. If the idea doesn’t seem as rewarding as it did a minute ago, you can lose momentum. Or, your great idea might be stuck in, “What do I do next?”
The Alitu Showplanner can help. Part cheerleader, part strategist, this tool asks you questions about different aspects of your idea and uses your answers to formulate a launch plan. Here’s some more info about how Alitu can generate a launch kit so encouraging, it even shows you how “a podcast about watching paint dry” is a good idea.
AI Audio: Editing, Production
AI can offer a helping hand with some of the tricky and time-consuming aspects of recording, editing, and production. This isn’t necessarily about creating great-sounding audio for you out of thin air – it’s more about using tech to streamline and enhance what you’re already doing.
Neural network noise reduction, anyone?
Alitu: The Podcast Maker
Alitu is our very own ‘podcast maker’ app. With one single login and subscription, you can record (solo or calls), edit, publish, and distribute your podcast. Where the “AI” part comes in is mainly during the production process. Alitu automatically applies noise reduction, compression, and EQ to your audio. It levels everything up and optimises your loudness levels. It’ll also generate AI episode transcripts for you, making it a fantastic tool for saving time and money.
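To give a feel for what “levelling” involves, here is a deliberately simplified Python sketch. It is not how Alitu works internally. It assumes your audio is a list of float samples between -1.0 and 1.0 and uses a plain RMS level. Broadcast-style loudness is normally measured in LUFS, which adds frequency weighting and gating on top of this idea. The function names and the -16 dB target are assumptions for illustration.

```python
import math


def rms_dbfs(samples):
    """Return the RMS level of float samples (-1.0..1.0) in dB relative to full scale."""
    if not samples:
        raise ValueError("samples must not be empty")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # pure silence has no measurable level
    return 10 * math.log10(mean_square)


def normalize_to_target(samples, target_dbfs=-16.0):
    """Scale samples so their RMS level lands on target_dbfs, clipping to [-1.0, 1.0]."""
    current = rms_dbfs(samples)
    if current == float("-inf"):
        return list(samples)  # nothing to boost in silence
    # Convert the dB difference into a linear gain factor.
    gain = 10 ** ((target_dbfs - current) / 20)
    return [max(-1.0, min(1.0, s * gain)) for s in samples]
```

A quiet recording sitting at -40 dB would be multiplied by a gain of roughly 16 to reach -16 dB. Real tools pair this with compression so that loud peaks don’t clip, which is why the hard clip here is only a safety net.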
You don’t need podcast music, but it has its benefits. Music can add a layer of professionalism and immediately make your show recognizable to podcast listeners. There are many ways to find and buy podcast music, including the AI route…
Melobytes: AI-Generated Music
Melobytes is an AI music platform you can lose hours playing around in. It’ll generate songs for you based on lyrics you enter, or even images you upload. There are text-to-speech features, as well as loads of other tools you can tinker with. Quality of output can vary quite drastically, but if you achieve nothing else, you’ll have a good laugh.
Machine Learning Marketing and PR Tools
A media kit, at its most basic, is a folder containing your podcast art and a fact sheet. It can include press releases about specific milestones for your podcast. The fact sheet (to oversimplify) is the Who, What, Why, Where, When, and How of your podcast. They’re building blocks of information. Journalists use these routinely to write about any topic, so this fits neatly in their tool kit. This makes it easier for you to get press coverage and reviews.
Dubb Media, as Katie wrote, is an AI podcasting tool that’s a lot of fun to play with. It can help you figure out what stands out most about your episode and make a transcript and video clips that look cool on social media. Dubb can save time; once you upload your information, it works independently and sends you a summary after a few hours. But, it doesn’t make the simplest and most utilitarian asset of all: a fact sheet. Also, it’s not sophisticated enough to fully understand a podcast episode and explain it reliably or accurately.
Podcastmarketing.ai is another tool that transcribes your podcast and then uses artificial intelligence to distil it into show notes, episode descriptions, episode titles, quote cards, and social media posts.
To import your episode, you search by title, and the search option pulls the data from Apple Podcasts. I tried to use my podcast, but the search engine found an episode of Seattle Morning News instead. The episode discussed current world, national and local news.
The transcript was mostly accurate. The show notes, however, weren’t.
Podcastmarketing.ai also doesn’t save time. To generate your content, you must keep the browser window open and active on your screen while it’s working. You could walk away and let your computer run, but you can’t use your computer to do a separate task. The user interface says, “Building your listener pitch usually takes <1 minute.” I waited longer than that, and it never generated the pitch or other assets. At the time of writing, this isn’t an AI podcasting tool that can save you time, effort, or money, nor can it improve your existing work.
Capsho displays what appears to be an uncanny level of understanding when its AI generates your marketing materials. What makes Capsho work so well is that it asks you more questions than other AI marketing tools.
As an example, I edited a test episode of ADWIT. This was a year-end wrap-up episode, where we discussed the outstanding audio fiction we’d heard over the past year, as well as our wins, challenges, and goals for the next year.
Capsho generated the transcript and asked me to select an overall tone for the voice. I picked “quirky.” When I generated the materials, it would offer options to click on, such as “educational” or “engaging.” You can also select formats, such as lists of actionable tasks.
Capsho is programmed with emotionally affecting language. The AI analyzes your transcript and condenses it into something like a delicious, sweet beverage. Here’s the episode title and description Capsho generated.
Unlocking Creativity: Stories That Make an Impact
As I discussed my favorite audio 6630 Productions with my friend Sarah, I was surprised to discover the unexpected magic behind the scenes of these stories. From passionate creators to captivating soundscapes, I found a thriving community of audio drama that opened up a whole new world of exploration and creative growth. Little did I know at the time, this discovery would lead me on a journey of connection and growth that I could have never imagined.
In this episode, you will be able to:
1. Discover creative escapes during personal challenges for audio fiction inspiration.
2. Master the art of boundary-setting to balance your audio drama workload and commitments.
3. Uncover the power of keyword trends to attract a diverse podcast audience.
4. Rejoice in the achievement of audio fiction writing to stay motivated.
5. Grasp the significance of teamwork, acknowledgment, and growth in the audio drama sphere.
Capsho’s description of an episode of ADWIT.
That’s distractingly effusive, but, other than some minor editing errors, not bad at all. Does this AI podcasting tool “understand” what the episode is about? It appears to describe both what we said, and what we meant, which is eerie.
Regarding Capsho’s pricing, their free trial is good for one episode. Their Podcaster plan ($29) generates a transcript and helps you make episode titles, descriptions, and show notes. Their Entrepreneur plan ($90/month) does all this, and even helps you tailor content specifically for the type of platform you’re using, whether LinkedIn or TikTok. This tier also includes a monthly Podcast Growth Masterclass.
Buzzsprout’s Cohost AI
Buzzsprout has been one of the best things to come out of Florida since Tom Petty. It’s an inexpensive and straightforward podcast hosting service. I’m one of those podcast snobs who doesn’t like their audio at 96 kbps mono, but most podcasters don’t care. Now, with Cohost AI, Buzzsprout has taken a big chunk out of your podcast publishing workflow. For an additional $10-$30 per month (depending on how many hours of data you upload each month), Cohost AI transcribes your podcast episode, offers five titles, and an episode description of roughly 230 words. It also breaks your episode into chapters and helps you place chapter markers.
This is an optional add-on. I’ve tested both Capsho and Cohost AI, and this feels similar to Capsho but with fewer choices. I enjoy the challenge of writing show notes. It feels like I’m getting my baby dressed up to send her out into the world. But, for the time it would save me, this option is very hard to turn down.
Ausha’s Social Media Posts with ChatGPT
Ausha has unveiled their ChatGPT-powered Social Media Manager, available for all Ausha customers. When you’re logged in, click on the “Communication” tab, and then “start a post on Twitter, Linkedin, Facebook or Instagram and use the button ‘Generate with AI’ to help you get the perfect text for your social media publications.” Sounds really easy, but where does ChatGPT get information about your podcast to generate that post’s text? How do you know it’s relevant? If my episode title is “Three cool cats walk into a bar,” does ChatGPT know what I mean by “cool,” “cat,” “walk into,” or “bar?”
Podsqueeze is another tool that transcribes your episode, then makes promotion assets based on your transcript. Enter your RSS feed, select the episode you want to work on, and wait. In about ten minutes, Podsqueeze generates a transcript, and:
- Show notes
- Timestamps and chapter markers
- Tweets with emoji
- Links & Mentions (a list of things to link, with time stamps)
- A blog post
- A newsletter issue
- Keywords to focus on
- Quotes and a quote image
The pricing is moderate for the features Podsqueeze provides. The free tier has a lot of uses for someone who makes less than an hour of podcasts per month.
- Free: Transcribe and generate assets from your RSS feed for up to 50 minutes of podcast time per month, and save your content in your dashboard forever.
- Starter: $15/month or $144/year gets you the same features as the Free tier, plus you can upload local audio or video files, for up to 160 minutes of podcast time per month.
- Pro: $29/month or $288/year gets you all the same features as the Starter level, for 320 minutes of podcast time per month.
As with any of these tools, transcript accuracy affects the quality of the assets Podsqueeze generates. It generates more kinds of content than Cohost AI, like tweets or the list of mentions to link. Podsqueeze’s AI is trained differently than Capsho’s; it doesn’t have as much pizazz. It’s a timesaver if you don’t mind taking the time to go in and “tune” the assets for accuracy. What sets it apart is the list of links and mentions, and that it can transcribe podcasts in over 30 languages.
Apparently, Texo can “eliminate inaccurate transcripts, add efficiency to your content process, and publish AI-written content developed with your audience in mind!” Okay, so it generates promotion assets based on a podcast transcript, like so many others. It “automatically extracts Headlines, Show Notes, Key Themes, Questions, Quotes, Social Media Posts and Hashtags.”
You can also “interact with a custom AI chatbot to extract any other content you might want.”
Texo’s free tier lets you upload one audio file per month. After that:
- Standard: Upload 5 files a month for $35.10, plus you can ask the AI chatbot 5 questions per upload
- Pro: Upload 10 files, plus you can ask the AI chatbot 10 questions per upload, for $80.10/month
- Business: Upload 20 audio files per month, plus you can ask the AI chatbot 50 questions per upload, for $224.10.
It’s interesting how the price per episode goes up as you move up the tiers. Usually, a lower unit price is the reason to buy in bulk. More money, more intelligence? I couldn’t tell you, because I didn’t get to try it. I’m still waiting for that confirmation email. The pricing seems awfully steep compared to Texo’s peers.
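If you’d like to check the per-file maths on the paid tiers yourself, a few lines of Python will do it. This sketch uses the prices and upload limits listed above, and assumes the Business price is also monthly. The names `TIERS` and `price_per_file` are just illustrative.

```python
# Texo's paid tiers as listed above: (name, price in USD, files per month).
TIERS = [
    ("Standard", 35.10, 5),
    ("Pro", 80.10, 10),
    ("Business", 224.10, 20),
]


def price_per_file(price, files_per_month):
    """Return what a single uploaded file costs on a given tier."""
    return price / files_per_month


per_file = {name: price_per_file(price, files) for name, price, files in TIERS}

for name, cost in per_file.items():
    print(f"{name}: about ${cost:.2f} per file")
```

That works out to roughly $7 per file on Standard, $8 on Pro, and a little over $11 on Business, so the unit price climbs as the plan gets bigger.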
Flowjin, Headliner and Wavve
Imagine that you could feed your podcast episode into software, which would transcribe it and make it into short videos for social media, YouTube, or both. You might say, “Yes, I’ve heard of this. It’s called Headliner.” And yes, you’d be absolutely right. They’ve been making short video clips with options to include a waveform or captions for years. Some of you may say, “Nope, sorry, that’s Wavve.” You would also be right. Now, there’s a new contender in the audio-to-video space, Flowjin.
Flowjin appears to use AI to break your podcast’s audio into chapters, before making each chapter into a video file with a static image. That’s the one technical advantage over the others. You don’t have to think about what parts of the podcast are important. Flowjin also asks you for Twitter handles and images of the speakers; this is optional.
What’s the big difference between all of these tools? Ultimately, pricing.
So as not to overwhelm you with pricing tiers and features, here’s where their pricing and features are most alike, and where they differ most.
Headliner, Wavve, and Flowjin each have a free tier. Headliner’s is free, forever, for 5 videos and up to 10 minutes of transcription per month. Wavve’s is up to two minutes of video and transcription per month. Flowjin’s is a one-time-only upload of up to 80 minutes of audio, transcribed and converted into video clips.
- Headliner: Free, $7.99 or $19.99
- Wavve: Free, $10.99, $16.99, or $27.99
- Flowjin: $290, $490, $1490, or $2290.
If you wanted to make your podcast into 15 hours of video per month, you could either pay $20 a month, or $1490. I think I know what I’d pick.
Podium transcribes your podcast and makes show notes and chapters, evaluates possible episode titles, highlights quotable moments, generates a list of SEO keywords, and makes social media posts based on that transcript. The free trial is good for up to three hours of audio. For $16 a month, you can get all of this for six hours a month of audio with Podium’s Creator plan. For $149 a month, you get this for 60 hours of audio.
Podium generates these as downloadable .txt files, or .vtt files, which you can use to overlay text in a video. Their Podium GPT tool generates darn near any kind of written content you can imagine based on your transcript: Twitter threads, LinkedIn posts, you name it, the sky’s the limit. I asked it to generate rules for a tabletop role-playing game based on my podcast transcript. It created a Candyland-style board game; not as complex as an actual TTRPG, but a playable game nonetheless.
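If you haven’t met a .vtt file before, it’s a plain-text caption format (WebVTT): a `WEBVTT` header, then cues that pair a time range with the text to show. Here’s a small Python sketch of how such a file is put together. It illustrates the general format, not Podium’s actual output, and the function names `vtt_timestamp` and `to_webvtt` are made up for this example.

```python
def vtt_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def to_webvtt(cues) -> str:
    """Build a WebVTT document from (start_seconds, end_seconds, text) tuples."""
    blocks = ["WEBVTT"]
    for start, end, text in cues:
        # Each cue is a timing line followed by the caption text.
        blocks.append(f"{vtt_timestamp(start)} --> {vtt_timestamp(end)}\n{text}")
    # Blank lines separate the header and each cue.
    return "\n\n".join(blocks) + "\n"
```

Passing in something like `[(0, 2.5, "Welcome to the show.")]` gives you a header, a blank line, the timing line, and the caption text, which is all a video editor needs to overlay the words at the right moment.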
Podium’s AI is really fast, compared to the previous tools I’ve tried. When I tested this, Podium said, more than once, that my side gig podcast, ADWIT, which I host with Sarah Golding, was hosted by Neil Jones. His name comes up once in the transcript. I’m not sure why Podium locked onto this like a transporter beam on a Starfleet captain.
Podium GPT appears to generate new text based on texts that it’s previously generated. Since it generated an episode summary that said Neil Jones is the host of ADWIT, when I asked who the host is, it doubled down. Nothing against Neil Jones, but his name was only mentioned once in the transcript. He’s not in the episode at all.
And, soon you can use Podium’s latest tool, Podbook, to turn your podcast into a book. In the meantime, maybe Podium could sacrifice speed for accuracy.
Can AI Understand and Promote Your Podcast Episodes?
It’s possible that someday AI podcasting tools may be able to extrapolate meaning from partial data, subtext, or nuance. Mark Lee, Professor Emeritus of Computer Science at Aberystwyth University, wrote that AI can’t reach its full potential without a physical body. It lacks a theory of mind or sense of self. Podcasts are about human experience. It takes a human to understand what a podcast episode is really about. When the AI tool knows what to ask the human user and how to use that information, it can improve. But there’s no one-size-fits-all instant AI solution.
Artificial Intelligence Cover Art and Logos
It’s not hard to make podcast cover art, but it takes special effort and a good eye for attractive visuals. Design programs like Canva have templates which can help you make a good one. There are dozens of tutorials on YouTube to show you how to make a podcast logo using open-source (free) image editing software, like Glimpse or GIMP.
And then there are AI art generators. Brace yourself. Here there be dragons.
Midjourney: An Accessible Example of AI Art Generators
Midjourney, like DALL-E and Stable Diffusion, is a program that generates images based on text descriptions. Founder David Holz says it’s intended for professional artists to quickly make prototypes for clients before making a complete product. None of these AI podcasting tools that make art are identical, but they work similarly. You enter a text description of what you want the art to look like. It generates an image based on that description.
How Does Midjourney Work?
Midjourney’s beta test is available via a Discord server. Users can make 25 images for free, and then pay $10 a month for up to 200 images, or $30 a month for unlimited use. There’s a bit of a learning curve to prompt Midjourney to make the exact image in your mind.
First, I tried to make an image for Podcraft. I asked for “a logo for a podcast about the craft of making a podcast.”
These four look cool. But, the colors are very different from our brand palette. Plus, I’ll have to fix the text (or change our podcast’s name). If I could break these images into layers, I could adjust the colors, font, and text using image editing software. Clearly, I need to learn how to prompt Midjourney differently. So, I asked for “a logo for a podcast about the craft of making a podcast, indigo, cobalt blue, yellow.”
I thought that Midjourney would serve me the same image style as the first try, but with the requested colour palette. Instead, it gave me a completely different image.
So, yes, I could use Midjourney to make a simple logo, but I’d spend as much time prompting Midjourney as I would making it in Canva. Plus, if I wanted to repurpose portions of the image for my podcast merchandise, website, or other promotion assets, I’d need to use image editing software.
AI Art Generators and Ethics
Or, I could hire an artist, because if I expect to get paid for my podcasting work, shouldn’t I pay an artist for their work, too? AI art generators learned how to generate art using five billion images scraped from all over the internet, without the artists’ consent. At the time of this writing, at least two copyright lawsuits have been filed against the creators of Midjourney and tools like it. Using these tools is an ethical puzzle, to say the least.
AI Tools for Podcasters: Summary
There’s no doubt about it; AI will become a big part of the podcasting industry in the next few years. Everyone, from listening apps to podcast hosting platforms, will look for ways AI can enhance their services and user experience.
AI tools are so new that many don’t have the bugs worked out just yet. There are plenty of resources to support improvement, though. In 2022, investors collectively poured at least 1.37 billion dollars into companies that make AI generators. Much of what AI learns depends on its input. Right now, “fun” tools like ChatGPT and Midjourney are getting loads of user input, so they have a lot of material. But, let’s not forget Tay, Microsoft’s chatbot who received so much racist, sexist invective that within 24 hours, she advocated for genocide.
There are plenty of reasons for software developers to make and promote AI podcasting tools. Podcasts provide a never-ending fountain of spoken text about all kinds of topics, with various languages, opinions, inflections and attitudes. They’re the perfect training ground for AI.
AI-Generated Podcasts: Just Because You Can, Does It Mean You Should?
Your podcast is more than a recording of words, sounds and music, edited as an audio file, and sent out into the world. How the voices sound, and what kind of audience connection and engagement happens, are beyond the reach of algorithms. A good podcast is a collage of unique ideas wrought digitally and shared with the world. AI podcasting tools can save some time. They may save you money. They can help you start your podcasting tasks, but they shouldn’t complete them for you. There isn’t an AI generator for meaning or emotional value.