Catbird Linux: Using Compression and Limiting for Podcast Sound

Written and curated by WebDev Philip C.

Managing the amplitude dynamics of your voice is something every podcaster or voiceover actor should do. It is the element of sound processing which adjusts your vocal loudness over syllabic timeframes, making your voice more even, strong, and listenable despite other sounds which may be in your listeners' environments.

Even in a normal speaking voice, some syllables are louder than others, and we add or remove still more loudness for emphasis.

A normal voice waveform.

Early in the histories of radio and audio recording, engineers sought a way to reduce the impact of background noise. Also, the same technical people wanted a way to give the human voice more "punch."

Turning up the microphone gain or asking the talker to speak louder was not a suitable solution. Sure, vocal technique matters some, but overdriven audio circuits sound terrible. The audio waveform, which is an alternating current, becomes flattened and the spectrum fills with distortion harmonics. Something important does happen in an overdriven amplifier, though, which engineers found useful for utility and service communications: a cheap and easy improvement in peak-to-average ratio. Clipped signals have a much lower ratio of peak to average power.

Another way to think of the effect is that a wide decibel range of inputs is squashed into a narrow output range, easily 10:1 or 20:1 for audio clippers. Clipped voices can still be readable in bad conditions, when non-clipped voices are lost in the background noise. Methods have been developed to reduce the strength of distortion products on a radio signal, but these still don't produce a clean enough sound for podcasters or music studios.

A clipped voice waveform.
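To see the peak-to-average effect in numbers, here is a minimal sketch in plain Python. The test signal is a made-up tone that swells and fades a few times per second, only loosely imitating syllables, and the function names and the 0.2 clipping ceiling are my own choices for illustration.

```python
import math

def crest_factor_db(samples):
    """Peak-to-RMS ratio in decibels (one way to express peak vs. average)."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(peak / rms)

def hard_clip(samples, ceiling):
    """Flatten everything beyond +/- ceiling, as an overdriven stage would."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]

# Hypothetical test signal: a 200 Hz tone whose loudness swells and fades.
rate = 8000
signal = []
for n in range(rate):
    swell = 0.2 + 0.8 * abs(math.sin(2 * math.pi * 3 * n / rate))
    signal.append(swell * math.sin(2 * math.pi * 200 * n / rate))

clipped = hard_clip(signal, 0.2)

print(round(crest_factor_db(signal), 1), "dB peak-to-RMS before clipping")
print(round(crest_factor_db(clipped), 1), "dB peak-to-RMS after clipping")
```

The second figure comes out smaller than the first: the clipped signal spends most of its time near its own peak, which is why it cuts through noise and also why it sounds harsh.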

In addition to clipping a waveform, the ratio between peak and average voice power can be improved by compression: rapidly changing the audio circuit's gain over a period ranging from several milliseconds to a few hundred milliseconds. This is not a hard cropping of each cycle of a signal, but instead a gain change which holds loud intervals at a reasonable level while amplifying the quieter ones, so they are more similar in amplitude. In modern equipment, the amount of compression is expressed in ratios comparing the decibel ranges of input versus output signals (for example, 2:1, 3:1, or 4:1). Also, the speed of a compressor's attack and the slowness of its decay can be adjusted for best effect on the audio.
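The ratio is easiest to understand as arithmetic on decibel levels. The sketch below assumes a simple "hard knee" curve and a hypothetical threshold of -20 dB; real compressors add a knee, attack and decay smoothing, and makeup gain on top of this.

```python
def compressed_level_db(level_db, threshold_db, ratio):
    """Static compression curve with a hard knee; all values in dB."""
    if level_db <= threshold_db:
        return level_db  # below the threshold the level is left alone
    # Above the threshold, each extra input dB yields only 1/ratio output dB.
    return threshold_db + (level_db - threshold_db) / ratio

# Hypothetical threshold; the ratios are the ones mentioned in the text.
threshold = -20.0
for ratio in (2.0, 3.0, 4.0):
    quiet = compressed_level_db(-20.0, threshold, ratio)
    loud = compressed_level_db(-2.0, threshold, ratio)
    print(f"{ratio:.0f}:1 turns an 18 dB input range into {loud - quiet:.1f} dB")
```

At 2:1 the 18 dB spread between those two syllables shrinks to 9 dB, at 3:1 to 6 dB, and at 4:1 to 4.5 dB.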

Consider these very nice microphones for an upgrade of your vocal sound.

Tonor Q9 (USB)

Blue Yeti Pro (USB)

Shure MV7 (XLR / USB)

Limiting is a function in modern audio processors which prevents signals from exceeding a prescribed level. A limiter behaves as a compressor with a very rapid attack, an almost infinite ratio, and a decay time of a few tens of milliseconds after a strong signal drops below the limiting level.
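As a rough illustration, a limiter can be sketched as a gain that drops instantly when a sample would exceed the ceiling and then recovers gradually. This is a simplified model of my own, not the algorithm of any particular plugin; the 50 ms release and the test tones are hypothetical, and there is no lookahead.

```python
import math

def limit(samples, ceiling, release_ms, sample_rate):
    """Peak limiter sketch: instant attack, exponential release of the gain."""
    release_coeff = math.exp(-1.0 / (release_ms / 1000.0 * sample_rate))
    gain = 1.0
    out = []
    for s in samples:
        # Largest gain (at most 1.0) that keeps this sample under the ceiling.
        needed = min(1.0, ceiling / abs(s)) if s != 0.0 else 1.0
        if needed < gain:
            gain = needed  # attack: clamp down immediately
        else:
            # release: drift back up toward the gain this sample allows
            gain = release_coeff * gain + (1.0 - release_coeff) * needed
        out.append(s * gain)
    return out

rate = 8000
ceiling = 10.0 ** (-1.0 / 20.0)  # -1 dB relative to full scale
burst = [1.5 * math.sin(2 * math.pi * 200 * n / rate) for n in range(400)]
tail = [0.3 * math.sin(2 * math.pi * 200 * n / rate) for n in range(400)]
limited = limit(burst + tail, ceiling, release_ms=50.0, sample_rate=rate)

print("input peak: ", round(max(abs(s) for s in burst + tail), 3))
print("output peak:", round(max(abs(s) for s in limited), 3))
```

Because the gain never rises above what the current sample allows, the output stays at or below the ceiling, while the quieter tail regains its level over the release time.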

A compressed voice waveform.

Consider modern FM radio to be an example of how much compression and limiting is probably too much. FM stations tend to be obsessed with loudness, keeping their content overprocessed to both "beat the noise" and compete with other stations. They do this even though strong FM stations are essentially free of background noise, quite unlike AM radio.

In practice, as a podcaster, you should experiment with your audio and find the best settings for a light to moderate amount of compression. After the compressor, a stage of limiting can catch any signals not already reduced enough and prevent them from clipping or simply exceeding the desired amplitude limits. You want your signal to fit within the loudness limits of the platform you upload to, while not losing the softer syllables too far down below the maximum levels. It should be clean, clear, and enjoyable for listeners to hear while wearing earbuds and going about their daily tasks.

An optional final processing stage, after compression and limiting, is a gate. Gates remove the unwanted low-level sounds heard in the recording environment: whirring fans or air conditioning, humming motors, or perhaps a bit of electronic hiss from a microphone / preamplifier unit.

A gated voice waveform.
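The idea behind a gate can be sketched in a few lines: measure the level over short windows and mute any window that falls below the threshold. This is a bare-bones version under my own assumptions (a 10 ms window, hard on/off switching, synthetic "hiss" and "voice" signals); a real gate fades in and out using its attack and release times so the switching is not audible.

```python
import math

def gate(samples, threshold_db, sample_rate, window_ms=10.0):
    """Mute each short window whose RMS level is below the threshold."""
    window = max(1, int(sample_rate * window_ms / 1000.0))
    threshold = 10.0 ** (threshold_db / 20.0)  # dB -> linear amplitude
    out = []
    for start in range(0, len(samples), window):
        block = samples[start:start + window]
        rms = math.sqrt(sum(s * s for s in block) / len(block))
        out.extend(block if rms >= threshold else [0.0] * len(block))
    return out

rate = 8000
# Hypothetical material: 100 ms of faint hiss (-40 dB), then 100 ms of "voice".
hiss = [0.01 * (-1) ** n for n in range(800)]
voice = [0.5 * math.sin(2 * math.pi * 200 * n / rate) for n in range(800)]
gated = gate(hiss + voice, threshold_db=-25.0, sample_rate=rate)

print("hiss removed:", all(s == 0.0 for s in gated[:800]))
print("voice kept:  ", gated[800:] == voice)
```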

I have written here of "audio equipment" as if the items above are separate boxes you must find at an electronics shop. You need not visit a shop; you only need to install a good audio editor and some processing plugins on your computer. These are stages of processing: mathematics and code applied to your recording. There are several free digital audio editor / workstation packages which can do excellent work on your recordings. There are also commercial packages which work nicely and are priced for perfection. Consider features and pick your tools.

Here are some typical settings which work, or at least serve as a start, for setting up compression and limiting in your podcast audio chain. It is assumed that you have made the best recording you can and have already done any necessary noise reduction and equalization. Compression and limiting should come last, followed only by a gate if one is applied.

  1. Do other audio processing, then normalize the result.
  2. Apply compression with these settings:
    • Lookahead Time (if available) 20 ms
    • Attack Time 5 ms
    • Decay Time 500 ms
    • Compression Threshold -15 dB
    • Compression Ratio 2:1 (or 3:1 for stronger effect)
    • Knee sharpness 2.5 dB
    • Makeup Gain 10 dB
  3. Apply limiting with these settings:
    • Limiting Level -1 dB
    • Lookahead Time (if available) 20 ms
    • Attack Time (if available) 5 ms
    • Decay Time 100 ms
  4. Normalize again, to your usual reference.
  5. Apply an audio gate, if needed.
    • Gate Threshold -25 dB
    • Attack Time 10 ms
    • Release Time 150 ms
  6. Save or export a copy of this processed audio for your archive.
  7. Normalize to the loudness level (e.g. -16 LUFS) and convert to the format type (e.g. mp3) for your podcast platform.
  8. Save or export a copy of this finished audio for your podcast platform.
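The normalizing steps in the list above come down to simple gain arithmetic. Here is a hedged sketch: the function names, the -1 dB peak reference, and the -19.5 LUFS reading are all assumptions for illustration, and the loudness measurement itself is assumed to come from your editor's meter rather than from this code.

```python
def db_to_gain(db):
    """Convert a decibel change to a linear amplitude multiplier."""
    return 10.0 ** (db / 20.0)

def peak_normalize(samples, target_db=-1.0):
    """Scale so the largest sample sits at target_db relative to full scale."""
    peak = max(abs(s) for s in samples)
    if peak == 0.0:
        return list(samples)  # silence: nothing to scale
    scale = db_to_gain(target_db) / peak
    return [s * scale for s in samples]

def loudness_offset_db(measured_lufs, target_lufs=-16.0):
    """Gain in dB needed to move a measured loudness to the target."""
    return target_lufs - measured_lufs

# Peak normalizing a tiny made-up clip to a -1 dB reference.
clip = [0.1, -0.25, 0.2]
normalized = peak_normalize(clip, target_db=-1.0)
print("new peak:", round(max(abs(s) for s in normalized), 3))

# Suppose the meter reads -19.5 LUFS and the platform asks for -16 LUFS.
offset = loudness_offset_db(-19.5)
print("apply", offset, "dB, a multiplier of", round(db_to_gain(offset), 3))
```

In that example the file needs 3.5 dB of gain to reach the target. If a boost like this pushes peaks past your limiting level, it is worth checking the peaks again before export.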

Notice how short the attack times are in the above settings? That is for the purpose of quickly reacting to the onset of strong pops or transients in the voice. The compressor begins acting quickly, then evens out the amplitude dynamics across the time interval of spoken syllables. The limiter, however, acts rapidly if a strong sound reaches the limiting level, then releases control more quickly as the sound fades.
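To make the attack and decay times concrete, here is a minimal sketch of the kind of level detector a compressor might use: a one-pole smoother that rises with the attack time constant and falls with the decay time constant. The formula and names are a common textbook approach that I am assuming here, not the internals of any specific plugin, and the test burst is made up.

```python
import math

def smoothing_coeff(time_ms, sample_rate):
    """One-pole smoothing coefficient for a given time constant."""
    return math.exp(-1.0 / (time_ms / 1000.0 * sample_rate))

def follow_envelope(samples, attack_ms, release_ms, sample_rate):
    """Track signal level: fast on the way up, slow on the way down."""
    attack = smoothing_coeff(attack_ms, sample_rate)
    release = smoothing_coeff(release_ms, sample_rate)
    env = 0.0
    out = []
    for s in samples:
        level = abs(s)
        coeff = attack if level > env else release
        env = coeff * env + (1.0 - coeff) * level
        out.append(env)
    return out

rate = 8000
# Hypothetical input: 50 ms at full level, then 50 ms of silence.
burst = [1.0] * 400 + [0.0] * 400
env = follow_envelope(burst, attack_ms=5.0, release_ms=500.0, sample_rate=rate)

print("after 5 ms of sound:   ", round(env[39], 3))
print("after 50 ms of sound:  ", round(env[399], 3))
print("after 50 ms of silence:", round(env[799], 3))
```

With a 5 ms attack the detector is most of the way up within a few milliseconds, while a 500 ms decay lets it fall only slightly over the following 50 ms, so the gain does not pump between syllables.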

Those settings are a start; work with the parameters to find what works best for the audio content you create. They work great for speech; longer release times and reduced compression may work for music.

As always, good luck in your content creation, and never fear testing and tweaking to make a good thing even better.

© 2020 - 2024, All Rights Reserved.
This website is reader-supported. As an Amazon affiliate, I earn from qualifying purchases.