… for Codecs & Media
Tip #577: VoIP Audio is Not High-Quality
Codecs are also why our phones work.
We’ve all heard of codecs. These convert audio or video from analog signals into digital data and back.
Just as codecs are the heart of digital visual media, they are also at the heart of VoIP, which stands for Voice over IP. This technology is what allows you to connect a telephone to the Internet and have it actually work.
An audio codec works its magic by sampling the audio signal thousands of times per second. For instance, CD-quality audio, the kind typically stored in a WAV file, is sampled 44,100 times a second. Each tiny sample is converted into digitized data and prepared for transmission. When the 44,100 samples are reassembled, the pieces of audio missing between each sample are so small that, to the human ear, it sounds like one continuous second of audio.
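To make the idea of sampling concrete, here is a minimal Python sketch. It is an illustration only, not how any particular codec is implemented: the function name `sample_tone`, the 440 Hz test tone, the 8,000-samples-per-second rate, and the 16-bit sample size are all assumptions chosen for the example.

```python
import math

def sample_tone(freq_hz, sample_rate_hz, duration_s, bits=16):
    """Sample a sine tone and quantize each sample to a signed integer."""
    peak = 2 ** (bits - 1) - 1                  # largest value a signed sample can hold
    count = round(sample_rate_hz * duration_s)  # how many samples cover the duration
    return [
        round(peak * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz))
        for n in range(count)
    ]

# Hypothetical example: one second of a 440 Hz tone at 8,000 samples per second.
samples = sample_tone(freq_hz=440, sample_rate_hz=8000, duration_s=1.0)
print(len(samples))
```

Each number in `samples` is one "tiny sample" of the waveform. Raising `sample_rate_hz` shrinks the gap between neighboring samples.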
What I learned recently is that the codecs used for VoIP sample far less often than high-fidelity audio does. Most sample only 8,000 times per second. According to the Nyquist theorem, the highest frequency a digital system can carry is half its sample rate. This means that the highest frequency carried by most VoIP systems is 4,000 Hz. This is well below the frequency range of many consonants, such as “S” and “T.”
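The Nyquist arithmetic is simple enough to check in a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, and the 6,000 Hz figure is only an illustrative stand-in for high-frequency consonant energy, not a measured value.

```python
def nyquist_limit_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent: half the rate."""
    return sample_rate_hz / 2

def fits(frequency_hz, sample_rate_hz):
    """True if the frequency lies below the Nyquist limit for this sample rate."""
    return frequency_hz < nyquist_limit_hz(sample_rate_hz)

voip_rate = 8000                      # samples per second, as described above
print(nyquist_limit_hz(voip_rate))    # 4000.0
print(fits(3000, voip_rate))          # True
print(fits(6000, voip_rate))          # False (illustrative sibilant frequency)
```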
In case you were wondering, codecs use advanced algorithms to help sample, sort, compress and packetize audio data. CS-ACELP (conjugate-structure algebraic-code-excited linear prediction), standardized as ITU-T G.729, is one of the most prevalent algorithms in VoIP. It compresses speech to just 8 kbit/s, which conserves the available bandwidth. Annex B of the standard adds silence suppression, a transmission rule that basically states “if no one is talking, don’t send any data.” The efficiency created by this rule is one of the greatest ways in which packet switching is superior to circuit switching.
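The “if no one is talking, don’t send any data” rule can be sketched as a simple energy check on each block of samples. To be clear, this is not the Annex B detector, which is considerably more sophisticated. It is a toy illustration, and the function names, the `threshold` value, and the test signal are all assumptions. The frame size of 80 samples corresponds to 10 ms of audio at 8,000 samples per second.

```python
def frame_energy(frame):
    """Average squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)

def frames_to_send(samples, frame_size=80, threshold=1000.0):
    """Yield only the frames loud enough to count as speech (toy rule)."""
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        if frame_energy(frame) > threshold:   # hypothetical loudness cutoff
            yield frame

# Hypothetical signal: 100 ms of silence followed by 100 ms of "talking".
silence = [0] * 800
talking = [500, -500] * 400
sent = list(frames_to_send(silence + talking))
print(len(sent), "of", (len(silence) + len(talking)) // 80, "frames sent")
```

In this made-up example only the loud half of the signal is transmitted, which is the bandwidth saving the rule is after.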
And, no, that won’t be on the quiz.