To: S. Andrea Sundaram
I've become interested in sampling rates and whatnot over the past couple of months, and I just read the first and second parts of your “Understanding Digital Music -- What Bit Depth and Sample Rate Really Mean” articles. I was thinking the same thing about the timing of each sample and the resulting digitized wave, but I couldn't find it articulated in any other article or discussion. Seems like common sense.
A lot of people are adamant that 44.1kHz is all you need because the human ear can only hear up to a 20kHz tone. But I don't understand the relevance of the Nyquist theory in determining appropriate sample rates -- because music isn't continuous tones. Surely we can hear the difference between two different instruments each creating a 20kHz tone, or even two copies of the same instrument. Electric guitars can produce notes above 15kHz (as I understand), and one guitar sounds different from another guitar. Wouldn't those differences only be apparent at resolutions well beyond 44.1kHz? Doesn't it stand to reason that we can perceive far more detail than can be recorded and reproduced from 44.1kHz or even 96kHz?
The Nyquist theorem tells us how fast we need to sample a signal in order to capture a given bandwidth; however, determining the appropriate bandwidth requires careful consideration. When creating the CD's Red Book standard, the folks at Sony and Philips decided that a bandwidth of 22.05kHz was adequate, and a 44.1kHz sample rate was compatible with the existing video equipment used to store digital audio at the time.
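The relationship between sample rate and captured bandwidth can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the aliasing calculation assumes an idealized sampler with no anti-aliasing filter in front of it, which is not how a real converter is built.

```python
def nyquist_bandwidth(sample_rate_hz):
    """Highest frequency a given sample rate can represent (half the rate)."""
    return sample_rate_hz / 2


def alias_frequency(tone_hz, sample_rate_hz):
    """Apparent frequency of a pure tone after idealized sampling.

    Assumes no anti-aliasing filter: a tone above half the sample rate
    folds back into the captured band instead of being removed.
    """
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


if __name__ == "__main__":
    cd_rate = 44100  # Red Book CD sample rate, in Hz
    print("CD bandwidth (Hz):", nyquist_bandwidth(cd_rate))
    for tone in (15000, 20000, 25000):
        print(tone, "Hz tone appears at", alias_frequency(tone, cd_rate), "Hz")
```

Tones at or below 22.05kHz come back at their own frequency; the 25kHz tone falls outside the captured bandwidth and would show up at a false frequency unless it were filtered out before sampling.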
The high E on the guitar's 24th fret has a fundamental frequency of about 1320Hz, but its harmonics do reach beyond 15kHz. Differences in the levels of each harmonic are, in large part, responsible for producing the timbres of different types of instruments, or of different instruments of the same type. For more information on this topic, see the "Understanding Harmonics" article I wrote for our SoundStage! Access publication.
The body of each note can be treated as a continuous wave. Sampling at 44.1kHz will capture a large proportion of that harmonic information, although many instruments generate harmonics in excess of 20kHz.
The start of each note -- whether it is plucked, hit, bowed, or tongued -- not only provides information about which instrument is playing, but is also a very basic component of musical expression. A spectral analysis of these transient events reveals a wealth of high-frequency components, but the ear's recognition of these sounds may be better understood in the time domain. Either way, accurately recording these transients requires a system with a bandwidth well beyond 20kHz.
Researchers are still working to create a model of the human hearing system, but it's safe to say that it's capable of more than identifying continuous tones from 20Hz to 20kHz. In that light, insisting that a 44.1kHz sampling frequency is sufficient to capture everything we can hear seems a bit simplistic. . . . S. Andrea Sundaram
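As a rough check on the guitar figures in the reply above, the arithmetic can be sketched in Python. The assumptions here are not from the reply itself: standard tuning with the open high E string at 329.63Hz (A4 = 440Hz), equal temperament, and an ideal string whose harmonics are exact integer multiples of the fundamental. Real strings are slightly inharmonic, so treat the numbers as approximate. The function names are hypothetical.

```python
OPEN_HIGH_E_HZ = 329.63   # assumed: E4 in standard tuning, A4 = 440 Hz
CD_BANDWIDTH_HZ = 22050   # half the 44.1 kHz CD sample rate


def fret_frequency(open_hz, fret):
    """Equal-tempered pitch: each fret raises the pitch by one semitone."""
    return open_hz * 2 ** (fret / 12)


def harmonics_below(fundamental_hz, limit_hz):
    """Ideal-string harmonics (integer multiples) at or below limit_hz."""
    count = int(limit_hz // fundamental_hz)
    return [fundamental_hz * n for n in range(1, count + 1)]


if __name__ == "__main__":
    fundamental = fret_frequency(OPEN_HIGH_E_HZ, 24)
    captured = harmonics_below(fundamental, CD_BANDWIDTH_HZ)
    print("24th-fret fundamental (Hz):", round(fundamental, 2))
    print("Harmonics within CD bandwidth:", len(captured))
    print("Of those, above 15 kHz:", len([h for h in captured if h > 15000]))
```

Under these assumptions the fundamental comes out near 1320Hz, and the note's higher harmonics extend past 15kHz while still falling within the 22.05kHz bandwidth of a CD.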