Sampling Spec Wars
Technology, Digital Sampling
Published by Keyboard/November 1987, posted June 2002
Peter Gotcher used to teach tennis and play drums, but these days he's paying the
rent by running Digidesign,
a music software company specializing in digital sampling
technology.
- WHEN I WAS 14, I EXPERIENCED ONE of life's great disappointments. With
my savings from a summer of watering and weeding in hand (about $130),
I convinced Mom to drive me to the local department store to buy my
first stereo system. I selected a 300-Watt Supersystem, confident that I had
chosen the ultimate audio weapon for parental aggravation. Once home, I
frantically assembled my new prize, cranked the volume to 11 (okay, 10),
inserted an 8-track cartridge of Black Sabbath's first album and
. . . not much happened. My Supersystem was only slightly louder than
my sister's flute, and sounded even worse (I didn't think that
was possible).
It seems my stereo was (bold type) 300 watts, (fine print) Peak Music Power. Well,
"peak music power" basically means that the amplifier could produce 300 watts, but only
for one microsecond, at 8921Hz, with 50% distortion, into a speaker of infinitely low
impedance and in a vacuum. In my bedroom, it produced about 15 watts. The lesson
I learned the hard way was that a watt is not a watt unless it is an RMS measurement,
across a broad frequency range and at a low distortion level.
- Spec confusion is commonplace in the hi-fi biz. The bad news is that it's infiltrating
the electronic keyboard biz as well. Potential consumers are bombarded with literally
hundreds of different buzzwords (manufacturers like to coin new phrases to increase the
marketing allure of a new box), and there are often many different methods for measuring
the same spec (signal-to-noise ratio, dynamic range, distortion, etc.). Just when I
(and many of you, I'm sure) had finally figured out the stereo shenanigans, along
comes digital sampling with yet another set of mysterious factors that determine
sound quality. Confusion reigns, it seems, among both musicians and manufacturers
when comparing samplers - particularly when the discussion centers around those
naughty little things called "bits."
Bits are like watts. Some people think the more the better. Not necessarily so, though,
and therein lies the question this column asks: When does an eight-bit sampler sound
better than a 12-bit sampler, which may in turn sound better than a 16-bit sampler? The
answer lies in understanding the signal path of a typical sampler. It goes like this:
The signal to be sampled arrives at the sampler's sample input. The first circuit it
encounters is an input filter, whose responsibility it is to remove all frequencies
that are too high for the sampler to record. If frequencies higher than 1/2 the sample
rate reach the next stage (the analog-to-digital converter), a nasty phenomenon called
aliasing may occur, producing a particularly obnoxious type of digital distortion in the
audible signal. To avoid aliasing, a lowpass filter with a very steep rolloff
(often called a brick wall filter) is needed. If the filter's slope is not steep enough,
aliasing occurs. If its corner frequency (the frequency at which it begins to roll off)
is too low, high frequencies are lost, and the resulting sample sounds dull.
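
To see why anything above half the sample rate is trouble, here is a minimal sketch of where a too-high sine wave lands once it has been sampled. The function name `alias_frequency` and the test frequencies are made up for illustration, and it assumes an ideal sampler with no input filter at all.

```python
def alias_frequency(f_in, sample_rate):
    """Frequency (Hz) at which a sine of f_in Hz shows up after sampling.

    Assumes an ideal sampler with no input filter (illustration only).
    """
    nyquist = sample_rate / 2.0
    # A sampler cannot tell f_in apart from f_in plus any multiple of the sample rate.
    folded = f_in % sample_rate
    # Anything left above Nyquist is reflected back down below it.
    if folded > nyquist:
        folded = sample_rate - folded
    return folded


if __name__ == "__main__":
    for f in (10_000, 14_000, 18_000, 25_000):
        print(f, "Hz in ->", alias_frequency(f, 30_000), "Hz out")
```

At a 30kHz sample rate, an 18kHz input comes back as a 12kHz tone that was never in the original sound, which is exactly what the brick wall filter is there to prevent.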
- A good input filter is difficult to design and expensive to manufacture, but the
penalty for poor or compromised design is noise, distortion, and excessive phase shift.
Analog circuits that inhabit the same box (and often the same circuit board) as digital
circuits are likely to pick up strange digital noises (clock frequencies and radio
frequency emissions). All of a sampler's analog circuitry is susceptible to digital noise,
but any noise caused by the input filter is digitized (sampled) and becomes an
inextricable part of the sound. The input filter is the weakest link in many samplers.
After the input signal has been subjected to the abuse of the input filter, the next
stop is the analog-to-digital (A/D) converter, the device responsible for converting
the analog signal (a varying voltage) into a digital bit stream (0's and 1's).
A/D converters come in many flavors. Some companies, including E-mu and Kurzweil,
do nifty little tricks with the incoming data, essentially squeezing or compressing
more data into fewer bits. Data compression of this sort (also known as nonlinear data
formats) makes the bit war scheme even more difficult to sort out. Nonlinear data
formats can increase the dynamic range of sample playback, but some nonlinear formats
compromise performance in other areas, including transient response, noise modulation,
etc. We'll have more on this in a future column. Are data compression schemes just a
lot of hype? Not necessarily. They're just harder to evaluate from published specs.
Which brings us to the common type of A/D conversion: linear. A linear converter
measures the voltage level of the input signal each time a sample is taken, and outputs
a digital word that represents its best approximation of the level (amplitude) of the
signal.
- The accuracy of this approximation is determined by the resolution of the A/D
converter. A 16-bit digital word can represent over 65,000 different levels, but an eight-bit
digital word can only represent 256 different levels. The A/D converter rounds
the sample measurement off to the nearest available value that can be represented by a
single digital word. The fewer different levels you have to describe the waveform, the
greater the possible error for each sample. The digital waveform that results is a jagged,
staircased shape that approximately traces the input analog waveform. But the picture
of the original waveform is distorted, resulting in a type of digital distortion called
quantization noise - a grainy sound that is particularly audible at low signal levels.
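
The rounding described above can be sketched in a few lines. This is a simplified model, not any particular converter: the names `levels`, `quantize`, and `worst_error` are invented here, the signal is assumed to span -1.0 to +1.0, and the test tone is a sine at 90% of full scale.

```python
import math


def levels(bits):
    """Number of distinct values a linear digital word of this size can hold."""
    return 2 ** bits


def quantize(x, bits):
    """Round x (assumed to lie in -1.0..1.0) to the nearest available level."""
    n = levels(bits)
    step = 2.0 / n                       # full scale spans -1.0 to +1.0
    code = round(x / step)               # nearest integer code
    code = max(-n // 2, min(n // 2 - 1, code))  # clip to the codes that exist
    return code * step


def worst_error(bits, points=1000):
    """Largest rounding error over one cycle of a sine at 90% of full scale."""
    worst = 0.0
    for i in range(points):
        x = 0.9 * math.sin(2 * math.pi * i / points)
        worst = max(worst, abs(x - quantize(x, bits)))
    return worst


if __name__ == "__main__":
    for bits in (8, 12, 16):
        print(bits, "bits:", levels(bits), "levels, worst error", worst_error(bits))
```

The error never exceeds half a step, and each added bit halves the step, which is why the graininess shrinks so quickly as the word gets longer.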
- Clearly, a 16-bit linear sample more
accurately describes the audio waveform than an eight-bit linear sample, resulting in a
more accurate reproduction of the sound, less quantizing noise, a greater dynamic
range, and so on. Anyone who has read a basic description of sampling (in case you
haven't, see Keyboard, Dec. '85, and Terry Fryer's Jan. '86-June '86 columns) accepts this
immutable fact, and therein lies the source of much spec confusion. Imagine a 16-bit
sampler with a crummy input filter. The 16-bit A/D converter might have a theoretical
dynamic range of 96dB, but if the input filter has a signal-to-noise ratio of only 70dB,
the battle has been lost before the signal is even digitized. The sampler's overall
S/N ratio is no better than its weakest link.
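
A rough sketch of the weakest-link arithmetic follows. It assumes each stage adds uncorrelated noise measured against the same full-scale signal, so noise powers simply add. The function names are invented for this example, and the 90dB filter in the second case is a hypothetical figure, not a measurement of any real sampler.

```python
import math


def ideal_dynamic_range_db(bits):
    """Theoretical dynamic range of a linear converter: 20*log10(2**bits)."""
    return 20 * math.log10(2 ** bits)


def combined_snr_db(*stage_snrs_db):
    """Overall S/N of a chain of stages, assuming uncorrelated noise that
    adds as power and is referred to the same signal level (a simplification).
    """
    noise_power = sum(10 ** (-snr / 10) for snr in stage_snrs_db)
    return -10 * math.log10(noise_power)


if __name__ == "__main__":
    # 16-bit converter behind the 70dB input filter from the text.
    print(combined_snr_db(70, ideal_dynamic_range_db(16)))
    # 12-bit converter behind a hypothetical 90dB input filter.
    print(combined_snr_db(90, ideal_dynamic_range_db(12)))
```

Under these assumptions the 16-bit box ends up just under 70dB, while the 12-bit box with the cleaner front end lands around 72dB. The bit count on the front panel tells you the ceiling, not where the sampler actually sits.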
Noise and distortion aside, which factors determine the frequency response of a sampler?
The performance of the digital process is determined by a few key factors, but the analog
circuits involved are much less predictable. The widest possible frequency response of a
sampler is 1/2 of its sample rate (the number of times per second that the A/D converter
measures the analog waveform). For example, if a sampler uses a sampling rate of 30kHz
(30,000 samples per second), the theoretical frequency response would be 15kHz. Right? Wrong.
This is a common misconception. I have never seen a sampler with flat frequency response
at its Nyquist frequency (1/2 the sample rate).
- Once again, the guilty party is the analog input filter. As I mentioned earlier, all
frequencies above the Nyquist frequency must be removed from the analog signal
before it reaches the A/D converter to avoid the disastrous effects of aliasing. All analog
filters have a slope. To achieve the degree of filtering needed at the Nyquist frequency,
they must start rolling off at a lower frequency. As a result, the real frequency
response of our 30kHz sampler might be -3dB at 10kHz, -10dB at 12kHz, and -40dB at
15kHz. Filters with a sharper rolloff require a greater number of poles in the filter design,
increasing both the cost of the filter and (usually) the amount of phase shift it causes.
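
To get a feel for how many poles a steep slope costs, here is a sketch that assumes a textbook Butterworth lowpass response. That filter type is my assumption for illustration. Real sampler input filters may use other designs, and this will not reproduce the example figures above exactly. The function names are invented here.

```python
import math


def butterworth_attenuation_db(f, corner, poles):
    """Attenuation (dB) of an ideal Butterworth lowpass at frequency f.

    Uses the standard magnitude response |H|^2 = 1 / (1 + (f/corner)^(2*poles)).
    """
    return 10 * math.log10(1 + (f / corner) ** (2 * poles))


def poles_needed(corner, nyquist, target_db):
    """Smallest pole count that reaches target_db of attenuation at Nyquist."""
    poles = 1
    while butterworth_attenuation_db(nyquist, corner, poles) < target_db:
        poles += 1
    return poles


if __name__ == "__main__":
    # -3dB corner at 10kHz, Nyquist at 15kHz (30kHz sample rate), 40dB wanted.
    print(poles_needed(10_000, 15_000, 40))
    for n in (2, 4, 8, 12):
        print(n, "poles:", round(butterworth_attenuation_db(15_000, 10_000, n), 1), "dB at 15kHz")
```

With these assumptions it takes a 12-pole filter to be 40dB down at 15kHz while staying only 3dB down at 10kHz. Push the corner frequency higher to save the top end and the pole count climbs further still.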