How to Monitor Your Sound When Recording & Editing a Podcast

Speakers and Headphones and Monitors (Oh my!)

Monitors are probably the most important audio device in our pursuit of professional sound and audio clarity. To be effective when mixing, studio recording technicians, engineers, sound designers and producers need to be able to hear the sound represented as accurately as possible. Only then can they create a universally listenable production, whether the audience listens through earbuds, stereo speakers or in a car traveling down the highway.

So what are monitors? How do they differ from speakers? What about headphones? And what do you need to know to make good monitor purchasing decisions?

Speakers versus Monitors

The best way to understand the difference between speakers and monitors is to think of “speakers” as a more generic term and “monitors” (short for studio monitors or reference monitors) as more specific. Reference monitors are speakers. However, not all speakers are reference monitors. Even more confusing is the fact that video monitors, stage monitors, computer monitors, and studio monitors are all shortened to just “monitors.” In any given studio setting, several of these are often present at once.

I don’t have any hard and fast rules for knowing which monitor someone is referring to, other than context. Typically when we refer to monitors in the studio, we mean reference monitors. Here’s a general breakdown of the differences:

Speaker or Loudspeaker
A speaker is any device used to project sound from a signal source. At minimum a speaker typically consists of a woofer, and many add a tweeter to carry high-frequency sounds more accurately. Speakers come in various sizes and degrees of quality, and the term covers all of the other types listed here.

PA or Live Loudspeaker
Typically used in theatre and music reinforcement systems and for public speeches and ceremonies, a PA is simply an amplified speaker. PAs (short for public address systems) are less concerned with accurate representation of sound and more concerned with amplification to reach the most people. In the case of theatre and music loudspeakers, there is generally an attempt to produce high-fidelity sound for audiences, but the sound is often sweetened to make it clearer and more pleasing, so these speakers don’t necessarily reproduce a sound accurately.

Stage Monitor or Live Monitor
A stage monitor is simply a loudspeaker designed so performers can hear themselves while on stage. It is aimed at the musician or performer, rather than at the audience.

Studio Monitor or Reference Monitor
These are speakers specifically designed to have a flat response so they accurately represent sound.

How monitors work

Just like microphones, monitors (and speakers) are transducers, meaning they convert energy from one form to another. However, where microphones convert sound vibrations into electrical energy, speakers go the other way, converting electrical signals back to sound so we can hear what was recorded.

Headphones vs Reference Monitors

Like “speakers,” “headphones” is a very general term for speakers that are worn over (or just inside) the ears. And just like speakers, there are balanced varieties that are referred to as reference monitors as well. Often they are termed over-the-ear or in-ear monitors. Over-the-ear monitors are more often used in studio situations as they do not bleed sound into open microphones. They are often referred to as “cans.”

Headphone-style monitors are typically used for monitoring while recording, but can also be used for basic mixing and editing. It is generally inadvisable to do complete mixes through cans because they can misrepresent how the audio will sound in an acoustic space. Air, walls and various acoustic anomalies will make reverb sound harsher and EQ sound more midrange-heavy and unpleasant to the ear.

Simply put, headphone monitors often make everything sound too good, and as a result, producers will use more reverb and EQ than needed, making the audio sound muddy, over-reverbed and distorted when played through other speakers and monitors. So even if you do headphone mixes to edit dialog and narration, I recommend mixing audio at least once through a set of monitors before delivering your audio to an audience to ensure things sound consistent from audio device to audio device.


Frequency response
This is the range of frequencies your monitor can accurately represent. The range of human hearing is from 20 Hz to 20 kHz (20,000 Hz). This range decreases over time from the moment we’re born. Typically, we want to be able to listen to our mixes over a range of 40 Hz to 15 kHz or better. Small monitors should reach down to at least 70 Hz and can be paired with a subwoofer to extend the low end.
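As a rough illustration of the ranges above, here is a minimal Python sketch that compares frequency ranges in octaves (each octave is a doubling of frequency) and flags a monitor whose low end stops short of the 40 Hz target. The function names are hypothetical, and the 40 Hz default is simply the figure suggested in this article, not an industry standard.

```python
import math


def octaves_covered(low_hz: float, high_hz: float) -> float:
    """Return how many octaves lie between two frequencies."""
    if low_hz <= 0 or high_hz <= low_hz:
        raise ValueError("need 0 < low_hz < high_hz")
    return math.log2(high_hz / low_hz)


def needs_subwoofer(monitor_low_hz: float, target_low_hz: float = 40.0) -> bool:
    """True if the monitor's lowest rated frequency is above the target."""
    return monitor_low_hz > target_low_hz


if __name__ == "__main__":
    # Full range of human hearing versus the suggested mixing range.
    print(f"20 Hz to 20 kHz: {octaves_covered(20, 20_000):.2f} octaves")
    print(f"40 Hz to 15 kHz: {octaves_covered(40, 15_000):.2f} octaves")
    # A small monitor rated down to 70 Hz (assumed example value).
    print("Subwoofer suggested:", needs_subwoofer(70))
```

Seen this way, a monitor that starts at 70 Hz instead of 40 Hz gives up a little less than one octave of bass, which is the gap a subwoofer is meant to fill.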

Off-Axis Response
Have you ever looked at a computer monitor or TV from the side and seen how the colors start looking wrong? Figures become distorted and more difficult to recognize. If so, then you’ll be familiar with the concept of off-axis response, which is similar, but with sound. Off-axis response is the shift in decibels when you’re not listening directly from the center of the listening field.

To test for off-axis response, start at the center of the listening field while playing some audio. Move a small step off-center to the left or right. If there is a dramatic change in the sound (more than about 3 dB), you are hearing what is referred to as a narrow off-axis response.

Typically, a good off-axis response allows two people (like a producer and an engineer) to sit side by side and listen to a mix with little change in the sound. It is for this reason that choosing monitors with a wide off-axis response (or simply “wide response”) is important.
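If you have a measurement microphone or an SPL meter app, the 3 dB rule of thumb above can be checked with a little arithmetic. The sketch below is only an illustration: it assumes you have already measured the signal level (for example, RMS amplitude) on-axis and off-axis using the same test tone, and the function names are hypothetical.

```python
import math

# Rule-of-thumb threshold taken from the article text (about 3 dB).
NARROW_THRESHOLD_DB = 3.0


def level_change_db(on_axis_level: float, off_axis_level: float) -> float:
    """Level difference in dB between off-axis and on-axis amplitude readings."""
    if on_axis_level <= 0 or off_axis_level <= 0:
        raise ValueError("levels must be positive amplitudes")
    return 20 * math.log10(off_axis_level / on_axis_level)


def is_narrow_response(on_axis_level: float, off_axis_level: float) -> bool:
    """True if the change exceeds the threshold in either direction."""
    change = level_change_db(on_axis_level, off_axis_level)
    return abs(change) > NARROW_THRESHOLD_DB


if __name__ == "__main__":
    # Assumed example readings: the off-axis amplitude is half the on-axis one.
    print(f"Change: {level_change_db(1.0, 0.5):.2f} dB")
    print("Narrow response:", is_narrow_response(1.0, 0.5))
```

If your meter already reports decibels, skip the conversion and simply subtract the two readings.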

Transient Response
Transient response describes how accurately the monitors reproduce the attack and decay of a sound. A transient is the attack of your sound: a short, high-amplitude signal at the beginning of a waveform. If you clap into a microphone, the transient is the first high peak at the instant your hands come together. Reference monitors need to represent our sounds so that they aren’t boomy or muddy and so that they decay properly over time.

To test transient response, play a sound or music that you are familiar with. I prefer to use orchestral music or jazz, as these involve a wider range of frequencies. Listen to high notes, low notes and mid tones. Is there anything that feels or sounds “off” about the sound? Do cymbals sound harsh or uncharacteristically bright? Do basses sound unclear and lacking in definition? Typically, this is either a frequency response or a transient response problem.

Clarity and Detail
The object of selecting monitors is to find monitors in your budget that represent your recorded sound with precision. When listening through reference monitors, you should be able to pick up on small details that often aren’t heard through other speakers. Can you hear the saxophone player breathe before he plays? At the very least a good set of monitors will make subtle details more apparent than when listening through other speakers.

Other Specs to Consider

Low distortion
Any unwanted distortion is problematic. Look for monitors with distortion below 3% from 40 Hz to 20 kHz at 90 dB SPL.
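Spec sheets quote distortion either as a percentage or in decibels relative to the main signal, so it can help to convert between the two. This is a small sketch of that standard amplitude-ratio conversion; the function names are hypothetical and it says nothing about how any particular manufacturer measures distortion.

```python
import math


def distortion_percent_to_db(percent: float) -> float:
    """Convert a distortion percentage to dB relative to the signal."""
    if percent <= 0:
        raise ValueError("percent must be positive")
    return 20 * math.log10(percent / 100)


def distortion_db_to_percent(db: float) -> float:
    """Convert a distortion figure in dB back to a percentage."""
    return 100 * 10 ** (db / 20)


if __name__ == "__main__":
    # The 3% guideline from the text, expressed in dB below the signal.
    print(f"3% distortion is about {distortion_percent_to_db(3):.1f} dB")
    print(f"-40 dB is about {distortion_db_to_percent(-40):.1f}% distortion")
```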

Sensitivity
This is how loud the speaker plays, measured as Sound Pressure Level (SPL) at 1 meter with 1 watt of input. 93 dB/W/m is high; 85 dB/W/m is low.
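To see what a dB/W/m figure means in practice, here is a rough Python estimate of loudness at other power levels and listening distances. It is a sketch under simplifying assumptions: a single speaker in free-field conditions (no room reflections), where level rises by 10 dB for every tenfold increase in power and drops by about 6 dB for every doubling of distance. Real rooms will differ, and the function name is hypothetical.

```python
import math


def estimated_spl(sensitivity_db: float, watts: float, meters: float) -> float:
    """Estimate SPL in dB from a dB/W/m rating, amplifier power and distance.

    Assumes free-field conditions (inverse-square law, no room reflections).
    """
    if watts <= 0 or meters <= 0:
        raise ValueError("watts and meters must be positive")
    power_gain_db = 10 * math.log10(watts)
    distance_loss_db = 20 * math.log10(meters)
    return sensitivity_db + power_gain_db - distance_loss_db


if __name__ == "__main__":
    # Compare the "low" and "high" ratings from the text at 10 W and 1 m.
    for rating in (85, 93):
        spl = estimated_spl(rating, watts=10, meters=1)
        print(f"{rating} dB/W/m at 10 W, 1 m: about {spl:.0f} dB SPL")
```

Under these assumptions the 8 dB gap between the two ratings stays the same at any power, so the less sensitive speaker needs roughly six times the amplifier power to play equally loud.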

Nearfield vs Midfield

Many professional studios are equipped with midfield monitors. They represent sound over a longer distance for bigger rooms. Typical home studios are equipped with nearfield monitors, which are smaller, less expensive and well-equipped to handle sound over shorter distances. Nearfield speakers also reduce problems associated with room acoustics as they are designed to be aimed directly at the listener's ears from a shorter distance to achieve the most direct sound possible.

2-way and 3-way

This is a tricky specification. It refers to the number of speakers (drivers) in a monitor and how they are arranged. In general, 3-way speakers have the potential to produce more accurate sound, with less midrange distortion at louder volumes, but this is only sometimes true in practice. Inferior components, shortcuts, and misrepresented data can make the difference hard to gauge. Always listen to monitors to verify whether they are right for you.

Active vs Passive

Active, or powered, monitors have a built-in amplifier. You just plug them in and turn them on, and the internal amplifier powers the monitor. Passive monitors require a separate amplifier to drive them. Most producers will be looking for active monitors.


Monitors affect your recording technique and the quality of your productions. Careful selection, placement and leveling provide a reliable foundation for creating universally listenable mixes.

Specifications aside, it is extremely important to listen to your monitors. I highly recommend going to your local music or audio equipment store and listening to several monitors. Many of these stores have listening rooms where you can listen to the same audio over several different monitor systems. Listen carefully. Step from side to side. Walk around the room to get different perspectives on what you are hearing. Monitors are the most important tools in your arsenal for achieving great, professional sound. Make sure you trust them.