AI podcasting tools are popping out like bunnies in springtime, promising quick, efficient and superhuman results. These tools vary, and none can outdo what a human can achieve with enough time and resources. But if you need to have a task done quickly, some AI podcasting tools might be worth your consideration. Let’s look at processes like scripting and recording, marketing and promotion, and making your cover art or podcast logo. We’ll examine AI tools for these tasks, how they work, and whether or not they can help you.
Text, Speech, and Artificial Voices
At the beginning of your podcasting journey, there are hurdles like impostor syndrome, writer’s block or self-doubt. These can hold you back and keep you from progressing. Unfortunately, there isn’t AI software that generates Pure Genius and Performance Excellence. But, there are a few tools that can give you a crutch to lean on as you get started.
Banned in school districts across the nation for its ability to write a persuasive research paper, ChatGPT is a lot of fun to play with. But depending on it to script your podcast episodes for you might not be the best idea. ChatGPT can make inaccurate information sound like facts so solid that it made me plan a trip to Glasgow to see the dinosaur custard cream factory.
Truthiness aside, ChatGPT is a great sounding board and idea generator. If you want to write a podcast episode about the tallest buildings in the world, ChatGPT will provide you with a list of buildings. Then you’re free to research these in more reliable detail, and add your own perspective. What would it sound like when Spidey swings between them? How much would it cost to clean all those windows? Which would Godzilla find the most tempting? Only you can analyze and deliver this to your audience.
Whilst you shouldn’t use ChatGPT to script your podcast, it can certainly create a decent first draft for you to work from. It’s worth noting, too, that the demand to try ChatGPT right now is so high that the servers are usually overwhelmed.
If you do decide to use ChatGPT for your podcast, fact-check it and use it carefully. As with all tools, don’t use one for every task.
Whisper is another tool by OpenAI (the team who brought us ChatGPT). Whisper has been trained on 680,000 hours of audio data collected from the web, meaning it can produce some of the most accurate automatic podcast transcripts to date.
Generating transcripts for your podcast can be expensive or time-consuming. You either pay a ton of cash for a human to do the job or you create auto-transcripts that take almost as long to correct as they would to create from scratch. Using OpenAI’s Whisper could save us all a whole bunch of time and money.
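If you’re comfortable running a script, here’s a rough sketch of what a Whisper transcript workflow might look like in Python. It assumes you’ve installed the open-source openai-whisper package and ffmpeg. The function names, the "base" model size, and the timestamp layout are my own illustrative choices, not anything OpenAI prescribes.

```python
def format_timestamp(seconds: float) -> str:
    """Turn a number of seconds into an HH:MM:SS label."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def format_transcript(segments) -> str:
    """Build a readable transcript, one timestamped line per segment.

    Each segment is expected to be a dict with "start" and "text" keys,
    which is the shape Whisper returns in result["segments"].
    """
    lines = []
    for segment in segments:
        stamp = format_timestamp(segment["start"])
        lines.append(f"[{stamp}] {segment['text'].strip()}")
    return "\n".join(lines)


def transcribe_episode(audio_path: str, model_name: str = "base") -> str:
    """Hypothetical helper: transcribe one audio file with Whisper."""
    # Imported here so the formatting helpers work without the package.
    import whisper  # assumes: pip install openai-whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    return format_transcript(result["segments"])
```

Calling transcribe_episode("episode.mp3") (a placeholder file name) should return a timestamped transcript. You’d still want to read it through and correct names and jargon before publishing.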
Descript started as a nifty transcription tool with artificial voices. Over time, it’s proven itself as a dependable podcasting tool for transcripts while growing to include video and storyboarding features. As Descript’s Lyrebird AI research division continues to grow, their stock voices have changed to seem more natural. If you record a 30-minute script for Descript’s Overdub voice training, it can generate new dialogue using an Overdub voice based on your recording.
So can you copy and paste a page of text into Descript’s Compositions window, and voila, instant podcast? To find out, I took a page of a story in the public domain from Project Gutenberg and used that as my podcast text. Descript can help you edit and upload your audio directly to your media host, so it’s pretty good as podcasting tools go. How well the AI voice would perform, however, remained to be seen.
How Do Overdub Voices Work in Descript?
Training a new Overdub voice took about 40 minutes. I read Descript’s training script about wildlife and climate. Because I’m not a professional voice actor, it was tiring, my throat hurt, and my voice didn’t sound great.
Descript warned me that my pricing tier would only let me have 1000 words of a recognizable language with my Overdub voice, unless I upgraded to a higher price tier. I copied a story in the public domain from If: Worlds of Science Fiction, from July of 1952, and pasted it in as a new project. Descript showed me a pop-up warning, saying that I’d need to upgrade, or Descript would replace words outside the vocabulary with “gibberish.” The text was pretty simple, though, so I wasn’t concerned.
Listen to these samples and find out if you can tell the difference. Here’s a Descript recording of me reading Robots Of The World: Arise! by Mari Wolf.
Here’s a Descript recording of my Overdub voice reading the same text.
Well, they did warn me about the gibberish.
Overdub also took my broad American vowels and made them even flatter. Of course, if I still don’t like my voice, I can have any of Descript’s stock voices read my text for no additional charge.
Different voices in Descript’s stable have some emotional features. You can train your Overdub voice to imitate some emotional states. Training Overdub to imitate all the emotional states that a podcast audience would accept as meaningful would take much longer than recording my voice live.
The alternative would be to pay a voice actor to read my preferred text, record it and send it back. This would be more expensive and not as fast, but the result would have a more significant emotional impact.
Marketing and PR Tools
A media kit, at its most basic, is a folder containing your podcast art and a fact sheet. It can include press releases about specific milestones for your podcast. The fact sheet (to oversimplify) is the Who, What, Why, Where, When, and How of your podcast. They’re building blocks of information. Journalists use these routinely to write about any topic, so this fits neatly in their tool kit. This makes it easier for you to get press coverage and reviews.
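Since a fact sheet is just those six building blocks, you could keep yours as a small template and fill it in once. Here’s a minimal Python sketch of that idea. The FactSheet class and the sample answers are invented for illustration, and you’d swap in your own show’s details.

```python
from dataclasses import dataclass, fields


@dataclass
class FactSheet:
    """The six building blocks of a podcast fact sheet."""
    who: str
    what: str
    why: str
    where: str
    when: str
    how: str

    def render(self) -> str:
        """Return the fact sheet as plain text, one labeled line per field."""
        return "\n".join(
            f"{field.name.capitalize()}: {getattr(self, field.name)}"
            for field in fields(self)
        )


# Placeholder answers for a made-up show; replace with your own.
example = FactSheet(
    who="Hosted by a two-person indie team",
    what="A weekly show about the craft of podcasting",
    why="To help new podcasters get started",
    where="Available in all major podcast apps",
    when="New episodes every Tuesday",
    how="Recorded remotely and edited in-house",
)

print(example.render())
```

The output is plain text on purpose, so you can paste it into an email or drop it in your media kit folder as is.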
Dubb Media, as Katie wrote, is an AI podcasting tool that’s a lot of fun to play with. It can help you figure out what stands out most about your episode and make a transcript and video clips that look cool on social media. Dubb can save time; once you upload your information, it works independently and sends you a summary after a few hours. But, it doesn’t make the simplest and most utilitarian asset of all: a fact sheet. Also, it’s not sophisticated enough to fully understand a podcast episode and explain it reliably or accurately.
Podcastmarketing.ai is another tool that transcribes your podcast and then uses artificial intelligence to distil it into show notes, episode descriptions, episode titles, quote cards, and social media posts.
To import your episode, you search by title, and the search option pulls the data from Apple Podcasts. I tried to use my podcast, but the search engine found an episode of Seattle Morning News instead. The episode discussed current world, national and local news.
The transcript was mostly accurate. The show notes, however, weren’t.
Podcastmarketing.ai also doesn’t save time. To generate your content, you must keep the browser window open and active on your screen while it’s working. You could walk away and let your computer run, but you can’t use your computer to do a separate task. The user interface says, “Building your listener pitch usually takes <1 minute.” I waited longer than that, and it never generated the pitch or other assets. At the time of writing, this isn’t an AI podcasting tool that can save you time, effort, or money, nor can it improve your existing work.
Can AI Understand and Promote Your Podcast?
It’s possible that someday AI podcasting tools may be able to extrapolate meaning from partial data, subtext, or nuance. Until then, it takes a human to understand what a podcast episode is really about.
Cover Art and Logos
It’s not hard to make podcast cover art, but it takes special effort and a good eye for attractive visuals. Design programs like Canva have templates which can help you make a good one. There are dozens of tutorials on YouTube to show you how to make a podcast logo using open-source (free) image editing software, like Glimpse or GIMP.
And then there are AI art generators. Brace yourself. Here there be dragons.
Midjourney: An Accessible Example of AI Art Generators
Midjourney, like DALL-E and Stable Diffusion, is a program that generates images based on text descriptions. Founder David Holz says it’s intended for professional artists to quickly make prototypes for clients before making a complete product. None of these AI podcasting tools that make art are identical, but they work similarly. You enter a text description of what you want the art to look like. It generates an image based on that description.
How Does Midjourney Work?
Midjourney’s beta test is available via a Discord server. Users can make 25 images for free, and then pay $10 a month for up to 200 images, or $30 a month for unlimited use. There’s a bit of a learning curve to prompt Midjourney to make the exact image in your mind.
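To get a feel for what those prices mean per image, here’s a quick back-of-the-envelope sketch in Python. It uses only the figures quoted above, and it assumes the 25 free images are a one-time trial and that you’d pick the cheapest paid plan that covers your monthly usage. The plan names are my own labels, not Midjourney’s.

```python
FREE_TRIAL_IMAGES = 25   # one-time allowance (assumed, not monthly)
BASIC_PRICE = 10         # dollars per month
BASIC_LIMIT = 200        # images per month on the $10 plan
UNLIMITED_PRICE = 30     # dollars per month


def monthly_plan(images_per_month: int) -> tuple[str, int]:
    """Pick the cheapest paid plan that covers the requested usage."""
    if images_per_month <= 0:
        return ("none", 0)
    if images_per_month <= BASIC_LIMIT:
        return ("basic", BASIC_PRICE)
    return ("unlimited", UNLIMITED_PRICE)


def cost_per_image(images_per_month: int) -> float:
    """Dollars per image at a given monthly volume."""
    if images_per_month <= 0:
        raise ValueError("images_per_month must be positive")
    _, price = monthly_plan(images_per_month)
    return price / images_per_month


for volume in (40, 200, 600):
    plan, price = monthly_plan(volume)
    print(f"{volume} images: {plan} plan, ${price}/month, "
          f"${cost_per_image(volume):.2f} per image")
```

If you only need one logo, most of those images will be discarded drafts, so the real cost per usable image is higher than these numbers suggest.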
First, I tried to make an image for Podcraft. I asked for “a logo for a podcast about the craft of making a podcast.”
These four look cool. But, the colors are very different from our brand palette. Plus, I’ll have to fix the text (or change our podcast’s name). If I could break these images into layers, I could adjust the colors, font, and text using image editing software. Clearly, I need to learn how to prompt Midjourney differently. So, I asked for “a logo for a podcast about the craft of making a podcast, indigo, cobalt blue, yellow.”
I thought that Midjourney would serve me the same image style as the first try, but with the requested color palette. Instead, it gave me a completely different image.
So, yes, I could use Midjourney to make a simple logo, but I’d spend as much time prompting Midjourney as I would making it in Canva. Plus, if I wanted to repurpose portions of the image for my podcast merchandise, website, or other promotion assets, I’d need to use image editing software.
AI Art Generators and Ethics
Or, I could hire an artist, because if I expect to get paid for my podcasting work, shouldn’t I pay an artist for their work, too? AI art generators learned how to generate art using five billion images scraped from all over the internet, without the artists’ consent. At the time of this writing, at least two copyright lawsuits have been filed against the creators of Midjourney and tools like it. Using these tools is an ethical puzzle, to say the least.
AI Tools for Podcasters: Just Because You Can, Does It Mean You Should?
AI tools are so new that many don’t have the bugs worked out just yet. There are plenty of resources to support improvement, though. In 2022, investors collectively poured at least 1.37 billion dollars into companies that make AI generators. Much of what AI learns depends on its input. Right now, “fun” tools like ChatGPT and Midjourney are getting loads of user input, so they have a lot of material. But, let’s not forget Tay, Microsoft’s chatbot who received so much racist, sexist invective that within 24 hours she advocated for genocide.
There are plenty of reasons for software developers to make and promote AI podcasting tools. Podcasts provide a never-ending fountain of spoken text about all kinds of topics, with various languages, opinions, inflections and attitudes. They’re the perfect training ground for AI.
But your podcast is more than a recording of words, sounds and music, edited as an audio file, and sent out into the world. How the voices sound, and what kind of audience connection and engagement happen, are beyond the reach of algorithms. A good podcast is a collage of unique ideas wrought digitally and shared with the world. AI podcasting tools can save some time. They may save you money. They can help you start your podcasting tasks, but they shouldn’t complete them for you. There isn’t an AI generator for meaning or emotional value.