How to Do Crosstalk Edits: At-a-glance
‘Crosstalk’ is the term used to describe when more than one person is talking at the same time on a podcast. Sometimes this can be a good thing; other times, not so much.
Doing a crosstalk edit can help in 3 ways:
1. Identify sound issues: It can give a peek into what, if any, sound issues need special attention. This helps with scheduling editing time for the episode.
2. Monotask: Making all crosstalk decisions in this edit can help you focus on bigger content decisions in the next pass.
3. Simplify: Mixing multiple tracks down to 1 track after a crosstalk edit can make things simpler when doing your content edit.
Podcasting is an emotional medium, so it’s important to keep emotional moments in a podcast. When editing episodes with energetic conversation, it can be tempting to remove all crosstalk. It may seem cleaner that way. But those messy talking intersections hold the emotional elements that will probably connect with your listeners the most.
What’s a Crosstalk Edit?
Crosstalk refers to the parts of a multitrack session where more than one participant is speaking at the same time. Here is an example from Audacity, a commonly used audio editor. At the end of the white highlighted area, you can see that the speaker in the top track and the speaker in the bottom track are talking at the same time.
In order to make any adjustments to this in the editing phase, you’d need to have separate audio tracks (like the two tracks above) because you need to isolate each speaker’s talking bits. When many voices are recorded on a single track and there’s crosstalk, the sounds blend together, and you can’t adjust one voice without affecting the others. Think of it like crayons that melt in the sun: once they blend together you can’t separate the colors out again. I know, I tried as a kid.
Okay, to be fair, crosstalk can include sounds other than talking such as coughing, mic bumps, and such. But the decision tree for unwanted sounds like these is easy: silence them (and add room noise if needed).
For information on how to deal with those pesky other sounds, check out Stop! What’s That Sound: Troubleshooting Audio Issues.
What we’re focused on in this post are the talking collisions.
Overview of My Crosstalk Edit Process
My personal crosstalk edit process goes something like this:
- I start at the beginning of the track and look for the first crosstalk segment
- when I find a crosstalk section, I ask myself a few questions (more on these shortly)
- I make any needed changes to one of the audio tracks
- I move on to the next crosstalk section
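The "look for the next crosstalk segment" step above is done by eye in the DAW. For anyone curious how the same idea could be roughed out in code, here is a minimal sketch. It assumes each speaker's track is already loaded as a list of float samples between -1.0 and 1.0, and it treats a stretch as crosstalk when both tracks are louder than a threshold in the same short window. The function names, the 50 ms window and the 0.02 threshold are all hypothetical choices for illustration, not settings taken from Audacity or any other tool.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of float samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def find_crosstalk(track_a, track_b, sample_rate, window_s=0.05, threshold=0.02):
    """Return (start_seconds, end_seconds) spans where both tracks are active.

    window_s and threshold are illustrative values, not recommendations.
    """
    window = max(1, int(sample_rate * window_s))
    length = min(len(track_a), len(track_b))
    spans = []
    start = None  # sample index where the current overlap began, if any
    for i in range(0, length, window):
        both_active = (rms(track_a[i:i + window]) > threshold
                       and rms(track_b[i:i + window]) > threshold)
        if both_active and start is None:
            start = i
        elif not both_active and start is not None:
            spans.append((start / sample_rate, i / sample_rate))
            start = None
    if start is not None:
        # The overlap runs to the end of the shorter track.
        spans.append((start / sample_rate, length / sample_rate))
    return spans
```

A list of spans like this only tells you where to listen. Whether each overlap should be kept, faded or silenced is still a judgment call made by ear.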
For a 60-minute interview with 2 speaking tracks, this usually takes me about 20 minutes. The total time spent depends on how many crosstalk moments there are and how picky I get with what is silenced, faded out or volume adjusted.
Now, as Colin did in How Long Does it Take to Create a Heavily Produced Show like the Serial Podcast?, I’d like to share a specific personal crosstalk example with you.
Sounds to Keep: Adjusting Laughter & Confirmation Moments
The screenshot below shows two mono tracks from an interview podcast I edited. The top one is me talking and the bottom one is the guest. You can see some crosstalk at the end of the highlighted area. This is where the guest was laughing and then saying “right, right, right”. The size of the waveform shows us that it’s a rather loud laugh.
When I listened to this crosstalk section, I asked myself these questions:
Question 1: Does this crosstalk show a connection between the speakers OR build conversation momentum?
Answer: Yes! Both the laughter and the “right, right, right” are connection building moments.
Question 2: Is either track more important? Does it need to stand out more than the other?
Answer: Yes. The talking track (top) needs to stand out because that’s where the meaning in the conversation is.
Question 3: What tool would be best to adjust the crosstalk from the less important track?
Answer: In this case, I could have lowered the volume of the laughter and “right, right, right” parts of the bottom track. Instead, I decided to use “fade out” to highlight the intensity of the laugh for a brief second and then lower it to let the talking in the top track stand out for most of the crosstalk time.
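To make the difference between those two options concrete, here is a small sketch. It assumes the laugh is a list of float samples and models the fade as a simple linear ramp from full level down to silence. Both function names are hypothetical, and the linear ramp is an assumption for illustration rather than a description of how any particular DAW implements its fade.

```python
def lower_volume(samples, gain):
    """Scale every sample by the same gain, e.g. 0.5 for half the amplitude."""
    return [s * gain for s in samples]


def fade_out(samples):
    """Linear fade: the first sample keeps its level, the last one reaches zero."""
    n = len(samples)
    if n < 2:
        return list(samples)  # nothing to ramp across
    return [s * (1 - i / (n - 1)) for i, s in enumerate(samples)]


laugh = [1.0, 1.0, 1.0, 1.0, 1.0]
print(lower_volume(laugh, 0.5))  # evenly quieter from start to finish
print(fade_out(laugh))           # full burst first, then tapering away
```

With a flat volume reduction the laugh is quieter the whole way through. With the fade, the listener still gets the initial burst of the laugh before it tapers off under the talking track.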
What you do NOT want to do in cases like this is completely silence the laughter in the less important track. This laughter is a connection moment that listeners will enjoy hearing. Moments of engagement in podcasts are gems and you never want to remove them. I learned this lesson the hard way. When I first started to edit podcasts, I overedited a lot of things, including crosstalk, and accidentally deleted valuable moments like this. The results were conversations that sounded flat and empty. Keep the emotions in but vary their sound intensity to highlight the words instead.
Sounds to Silence: Overly Repeated Phrases & Boredom Confirmations
When making these crosstalk decisions it’s important to think both of that exact crosstalk moment AND the entire session. Why? If the guest in the above example had said “right, right, right” too often in this interview, the impact for the listener would probably have changed from a connection moment to something repetitive and annoying. She didn’t do this, but some guests do. Some hosts do. It happens. So as you’re making these decisions, keep all of your previous crosstalk decisions for that episode in mind and silence as needed. Ctrl+L, the silence shortcut in Audacity, is your friend!
Voice Tone Can Tell You a Lot
Also, it’s important to listen to the tone of the speaker to figure out what kind of confirmation the speaker is giving. For example, in this “right, right, right” example, I paid attention to the energy in her voice. I needed to figure out if this was an enthusiastic “right, right, right” or a bored “right, right, right.” A bored confirmation is NOT a connection moment and might actually clue the listener in to the fact that the speaker is getting bored. You know how we sometimes say “yes, yes, yes” to speed someone up in their long explanation?
Waveforms Can Also Provide Information
The same confirmation words and phrases can be used differently so the meaning is sometimes in the way the speaker is saying them. The tone of the speaker’s voice will let you know, so listen carefully and if it is boredom confirmation, silence it. A quick trick is to look at the waveform. Bigger waveforms tend to indicate energy and enthusiasm but smaller waveform bits can show a lack of energy and possibly these boredom moments. If you see these in crosstalk, silence them. They do not give the conversation any momentum or show a connection with the host.
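The "bigger waveform, more energy" trick above is a visual judgment. If you wanted a rough number for it, one common measure is the RMS level of a segment in decibels relative to full scale. The sketch below assumes float samples between -1.0 and 1.0. The function name and the example amplitudes are made up for illustration.

```python
import math


def level_dbfs(samples):
    """RMS level in dB relative to full scale (1.0); -inf for pure silence."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10 * math.log10(mean_square)


# Hypothetical segments: a lively confirmation and a flat, low-energy one.
enthusiastic = [0.5, -0.5] * 100
bored = [0.05, -0.05] * 100
print(level_dbfs(enthusiastic) - level_dbfs(bored))  # gap in dB between the two
```

A low level is only a hint. A quiet speaker or a distant mic will also produce small waveforms, so the tone you hear should still make the final call.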
A Note on DAWs:
In case you’re wondering, DAW means “Digital Audio Workstation” – basically, whatever software you use to record and edit your podcast.
I do need to warn you that doing this kind of crosstalk edit is easier on some DAWs than others. The above example is given with Audacity but I’ve also done this type of edit with Hindenburg and Adobe Audition. Hindenburg has a half waveform, which makes it harder to see the words in the waveform. Personally, I find it harder to do a crosstalk edit with Hindenburg but love other features that they have, like auto-levelling and a super handy clipboard.
Audition has a full waveform but it takes more steps to fade and silence one track in a multitrack session. With all this in mind, I find Audacity the easiest of the 3 DAWs to do a quick, clean crosstalk edit. There are undoubtedly others that are just as easy. I just haven’t played with them…yet.
And if you’d like more advice and guidance on editing, or on any other aspect of podcasting, be sure to check out The Podcast Host Academy. In there, you’ll get access to all of our video courses, templates, resources, and weekly live Q&A sessions, too!