What factors can we look at to determine our studio needs when it comes to an audio interface? What technical considerations are important? Over the course of my education and career, I have likely purchased more audio interfaces than any other piece of equipment. Here, I'll attempt to pass on everything I've learned along the way.
What is an Audio Interface?
First of all, what’s an “audio interface”? The term refers to any device that converts analog audio into a computer-readable digital format. Audio interfaces come in all shapes and sizes. There’s the built-in sound card inside your computer. There are tiny USB dongles. There’s the circuitry built into a USB microphone. And then there are breakout devices, ranging from a simple USB interface with a microphone input all the way up to the large rackmount interfaces found in many studios.
While the information below applies to most audio interfaces, this article focuses on the home studio. I am primarily addressing small two- and four-channel breakout devices that connect to your computer via USB, Firewire or a similar digital port.
What about the Interface within My Computer?
There’s little that would compel me to recommend using the audio interface built into your laptop. Built-in interfaces are noisy, unbalanced and made of components of varying quality in order to keep costs down. This applies to the interfaces in smartphones as well.
For most applications, I recommend recording away from your computer altogether. Without the benefits of a control room, fan noise and other computer sounds will bleed into the recording.
In my studio, when we set out to record voices, I typically shut down any electronics and computer systems in the studio, pull out a standalone recorder and record with a clean background, free of fans and mouse clicks.
However, there are times when having the computer nearby is necessary to productivity or workflow. For that, we need an interface that will suit our needs. And that’s where things get tricky.
Different sound devices are made for different applications. For example, I wouldn’t use a Presonus Audiobox to record sound effects. The mic preamps are too noisy for detailed sound recording and the device doesn’t handle the sampling rates I need. For voice, however, the Audiobox would likely suit many people’s needs.
So, what are you looking for? Let’s take a look at the different features you’ll find, and what they mean.
USB vs Firewire
I hear arguments regarding USB vs Firewire all the time. I don’t intend to rehash them, as most of them are highly subjective. The arguments are often weighted by whether the individual uses a Mac or PC, and they fail to look at the bottom line: both USB and Firewire are plenty fast enough to handle your audio signal.
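To put "fast enough" into numbers, here is a rough sketch comparing the raw data rate of a stereo recording with nominal bus speeds. The function name is my own, the bus figures are theoretical maximums rather than real-world throughput, and protocol overhead is ignored.

```python
# Nominal bus speeds in megabits per second (theoretical maximums).
BUS_SPEEDS_MBPS = {
    "USB 2.0": 480,
    "FireWire 400": 400,
    "FireWire 800": 800,
}


def audio_bitrate_mbps(sample_rate_hz, bit_depth, channels):
    """Raw PCM data rate in megabits per second (no protocol overhead)."""
    return sample_rate_hz * bit_depth * channels / 1_000_000


if __name__ == "__main__":
    # Two channels of 24-bit audio at 48 kHz.
    rate = audio_bitrate_mbps(48_000, 24, 2)
    print(f"Stereo 24-bit/48 kHz audio: {rate:.3f} Mbit/s")
    for bus, speed in BUS_SPEEDS_MBPS.items():
        print(f"{bus}: about {rate / speed:.2%} of nominal bandwidth")
```

Even with generous allowance for overhead, a two-channel voice recording uses a small fraction of what either bus can carry.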
The deciding factor in whether to use USB or Firewire (or PCI, or Fireport, or Thunderbolt) is your computer. If your computer doesn’t have a Firewire port, there isn’t a compelling argument to purchase a separate Firewire card just to make other hardware work.
Analog vs Digital
Overall, the average user is going to be concerned with analog inputs and outputs (I/O). Analog I/O refers to any connection where audio information is transmitted as a continuously varying electrical signal. In other words, if you plug a microphone, guitar or monitor into an input or output to transmit or receive sound, the I/O is analog.
This becomes confusing because the main purpose of an audio interface is to convert your audio to a digital signal, or to convert from digital back to analog in the case of outputs. An easy, if oversimplified, way to think of it is that if you can hear it with your ears, even if it is from a digital source, what you’re hearing is an analog signal.
Digital Inputs and Outputs (I/O)
Digital I/O ports include MIDI, SCSI, Timecode, optical, S/PDIF and other ports used to carry machine-readable digital information from one system to another without converting it to an analog signal first.
I cover digital inputs in the context of recording voice only because it’s commonly a feature on audio devices. If you use outboard mic preamps, digital mixers or other digital devices in your chain, it’s important to ensure your devices have compatible digital inputs. Otherwise, you can likely overlook this feature.
Analog Inputs and Outputs are one of the most important items to pay attention to. Typical external sound cards support both XLR and balanced TRS (¼”) cables, often in the same port (referred to as a combo connector). Ensure you use a balanced connection to avoid unwanted electrical interference. And make sure the audio connections on your interface match the connections on your microphones, monitors and other equipment.
Another Analog I/O factor is the number of ports. Audio Interfaces typically have two inputs and two outputs (2×2). Some have four inputs and two outputs (4×2).
It’s important to know how many devices you intend to connect to your audio interface in order to determine the number of input and output ports you need. Multiple monitor combinations (5.1, quad, 7.1, etc.) require an output for each monitor. Recording multiple actors requires an input for each microphone you intend to plug into the device. For most podcasters and voice actors, a 2×2 or even a 1×2 recording device (like most USB microphones) is sufficient.
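The port count boils down to simple arithmetic. As a minimal sketch (the function names and layout strings are hypothetical, not taken from any product spec), this tallies the outputs a monitor layout needs and checks whether a given interface covers your setup:

```python
def outputs_needed(layout):
    """Number of analog outputs for a monitor layout such as "5.1"."""
    named = {"mono": 1, "stereo": 2, "quad": 4}
    if layout in named:
        return named[layout]
    # "5.1" means 5 full-range speakers plus 1 subwoofer channel.
    mains, subs = layout.split(".")
    return int(mains) + int(subs)


def interface_fits(inputs, outputs, microphones, layout):
    """True if the interface has a port for every microphone and monitor."""
    return inputs >= microphones and outputs >= outputs_needed(layout)


if __name__ == "__main__":
    # A 2x2 interface with one microphone and stereo monitors.
    print(interface_fits(2, 2, microphones=1, layout="stereo"))
    # The same interface cannot drive a 5.1 monitor setup.
    print(interface_fits(2, 2, microphones=1, layout="5.1"))
```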
MIDI is used primarily for communicating between sequencing software and digital instruments and equipment. MIDI can also be used to control external peripherals like rackmount reverb modules. Unless you are a musician, a sound designer or a lighting designer, you likely don’t need MIDI ports.
Word clocks are used to synchronize digital playback and recording devices. Typically the internal clock from your computer hardware or software is sufficient. Again, unless you are heavily into creative audio production and design, a word clock is likely not a determining factor in choosing an audio interface.
There are a number of different audio driver types. The key here is to ensure your audio interface meets your hardware and software needs. PCs typically run ASIO, WDM and MME; of these, ASIO is the low-latency option. Core Audio is OS X’s low-latency driver. Ensuring your audio device is ASIO or Core Audio compatible will give you the widest compatibility. Check reviews for known PC or Mac compatibility issues before you purchase.
I’ll be honest here. This is primarily a Pro Tools consideration. It’s also one of the reasons I’ve gone through so many audio devices. If you’re working in Pro Tools, this can be a frustrating bit of trial and error. Something as simple as the crystal used in the sync clock of your audio device can make Pro Tools choke. This is less of an issue since version 11, and there are a number of workarounds using a go-between software audio interface like ASIO4ALL (PC) or Soundflower (Mac).
Sampling rates for most applications run from 44.1 to 192 kHz. Lower sampling rates are available, but they’re typically used for telephony, toys and other situations that do not require high-fidelity recording. In general, most studios record voice at 48 kHz. This allows some breathing room in the audio, and gives audio editors and designers a little room to stretch and pitch voices without losing audio fidelity.
For most voice and interview podcasts, 44.1 kHz is sufficient to capture excellent vocal audio. For voice actors, I recommend sticking to the 48 kHz studio standards. If you do voice for creature effects, higher sampling rates may be desired.
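The reasoning behind these numbers is that a digital recording can only capture frequencies up to half its sampling rate (the Nyquist frequency). The sketch below uses hypothetical helper names and assumes a simple "slow the playback down" style of pitch shift, where every frequency drops by the same factor. It also assumes your microphone and preamp actually pick up content that high.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sampling rate can capture."""
    return sample_rate_hz / 2


def highest_frequency_after_slowdown(sample_rate_hz, slowdown_factor):
    """Top of the recorded spectrum once playback is slowed down."""
    return nyquist_hz(sample_rate_hz) / slowdown_factor


if __name__ == "__main__":
    for rate in (44_100, 48_000, 96_000, 192_000):
        # Slowing playback to half speed drops the pitch one octave.
        top = highest_frequency_after_slowdown(rate, 2)
        print(f"{rate} Hz: captures up to {nyquist_hz(rate):.0f} Hz, "
              f"{top:.0f} Hz after a one-octave pitch drop")
```

A 44.1 kHz recording pitched down an octave tops out around 11 kHz, which can sound dull. A 192 kHz recording still reaches well past the 20 kHz limit of human hearing after the same treatment, which is why higher rates appeal for creature work.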
Bit depth is important in determining your dynamic range for recording. Dynamic range is the difference between the softest signal your device can register above the noise floor and the loudest signal your device can reproduce without distortion. 16-bit is suitable for most podcast voice applications. For voice actors and studio professionals, 24-bit is standard. 32-bit is starting to become more common, but again it is typically used by folks like me who need to record loud engines, shotguns and jets without distorting.
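As a rule of thumb, each bit adds roughly 6 dB of theoretical dynamic range. The following sketch (function name is my own) applies the standard 20·log10(2^N) formula for integer PCM. Real converters deliver less than this because of analog noise, and 32-bit floating-point recording works differently, so it is not covered here.

```python
import math


def dynamic_range_db(bit_depth):
    """Theoretical dynamic range of integer PCM at the given bit depth."""
    return 20 * math.log10(2 ** bit_depth)


if __name__ == "__main__":
    for bits in (16, 24):
        print(f"{bits}-bit: about {dynamic_range_db(bits):.1f} dB")
```

That works out to roughly 96 dB for 16-bit and 144 dB for 24-bit, which is why 24-bit leaves far more room to set conservative recording levels.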
This is for those of us who use our audio devices in the field. Handheld devices, like the Zoom H4, often have a setting to operate as an audio device on most computers, making them more adaptable for recording interviews in that cafe, rather than being tied to the studio.
In A Beginner’s Guide to Microphones for Voice, I covered a number of different technical considerations when comparing microphones. The good news is, most of that information applies here as well. Let’s review!
Frequency response refers to the range of frequencies your equipment can accurately reproduce at an equal level. Audio interfaces should record anything from 20 Hz to 20 kHz, the range of human hearing, and I would be highly dubious of any interface that doesn’t cover at least that range.
Impedance is a measure of your equipment’s resistance. Low-impedance, or low-Z, inputs allow long mic cable runs without introducing noise or reducing frequencies.
Equivalent Noise Level
Also known as self-noise, the equivalent noise level is the electrical noise or hiss a microphone produces. In general, a self-noise specification of 28dB and lower is acceptable for quality recording.
Signal to Noise Ratio (S/N)
This is the difference (in dB) between the audio interface’s sensitivity and the equivalent noise level. 64dB and higher is good.
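The two figures are linked. For microphones, the usual convention is to measure signal-to-noise against a 94 dB SPL reference tone, so the ratio is simply 94 minus the self-noise. The sketch below assumes that convention; interface spec sheets may use a different reference, so treat it as an illustration rather than a way to compare datasheets directly. The names are my own.

```python
REFERENCE_SPL_DB = 94  # 1 pascal, the usual microphone reference level


def signal_to_noise_db(self_noise_db, reference_db=REFERENCE_SPL_DB):
    """Signal-to-noise ratio implied by an equivalent noise level."""
    return reference_db - self_noise_db


if __name__ == "__main__":
    for noise in (15, 28, 30):
        print(f"Self-noise {noise} dB -> S/N {signal_to_noise_db(noise)} dB")
```

Under that assumption, 28 dB of self-noise corresponds to a 66 dB signal-to-noise ratio, and the 64 dB threshold corresponds to 30 dB of self-noise.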
As with microphones, the choice of an audio interface largely depends on your application. Here’s what to look out for depending on your recording purposes:
Analog I/O: 1 or 2 input, 2 output
Sampling Rate: 44.1 kHz
This is a CD-quality setup which is great for general voice applications where creative effects like pitching and time stretching won’t be used. It converts well to mp3 and keeps storage size down. Battery-operated devices will often last longer at lower sampling rates and bit depths.
Analog I/O: 1 or 2 input, 2 output
Sampling Rate: 48 kHz
A studio-compliant configuration that’s good for general recording but allows some headroom for pitching and time stretching vocal performances. If you’re recording professionally for any studio, unless directed otherwise, these are the settings to heed.
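To see what the two configurations mean for storage, here is a small sketch for uncompressed audio. The bit depths (16-bit for the first setup, 24-bit for the second) are my assumption based on the bit depth discussion earlier, and the function name is hypothetical.

```python
def bytes_per_minute(sample_rate_hz, bit_depth, channels=1):
    """Size of one minute of uncompressed PCM audio, in bytes."""
    return sample_rate_hz * (bit_depth // 8) * channels * 60


if __name__ == "__main__":
    # Assumed pairings: 44.1 kHz with 16-bit, 48 kHz with 24-bit.
    podcast = bytes_per_minute(44_100, 16)
    studio = bytes_per_minute(48_000, 24)
    print(f"44.1 kHz / 16-bit mono: {podcast / 1_000_000:.2f} MB per minute")
    print(f"48 kHz / 24-bit mono:   {studio / 1_000_000:.2f} MB per minute")
    print(f"Studio setup is about {studio / podcast:.2f}x larger")
```

For a mono voice track, that is roughly 5.3 MB per minute against 8.6 MB per minute before any compression.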