Currents by John Eargle (Jan. 1992)


BITTER JITTER AND SWEETER DITHER

The words "dither" and "jitter" crop up often in reviews of CD players, and I know that many readers of Audio aren't sure just what these terms mean. Both words have their roots in Middle English, and both connote some degree of nervousness or unease. In everyday use, jitter usually means physical shaking, while dither often carries the meaning of indecision or confusion.

In digital technology, jitter refers to uneven timing in the sampling of digital data being converted back into the analog domain. By design, the sampling of data from the CD is under the control of a crystal oscillator, which is a highly accurate timing reference. Even motional irregularities of the disc transport mechanism itself are corrected by feeding the digital data into a buffer, from which it is "clocked out" under the precise control of the crystal oscillator. Any jitter remaining in a modern CD player is usually quite small and is not likely to be a function of any mechanical parameter of the player.
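To see why uneven timing matters at all, here is a rough sketch: a sine wave is sampled once at ideal clock instants and once at instants disturbed by a random timing error, and the RMS difference between the two is reported. The 10 kHz tone, the Gaussian jitter model, the jitter values, and the function name `jitter_error_rms` are illustrative assumptions, not measurements of any player.

```python
import math
import random

def jitter_error_rms(freq_hz, jitter_rms_s, sample_rate=44100.0, n=20000, seed=1):
    """RMS error (full scale = 1.0) from sampling a sine at jittered instants.

    Assumes Gaussian timing error that is independent from sample to sample;
    real clock jitter can look quite different.
    """
    rng = random.Random(seed)
    total = 0.0
    for k in range(n):
        t = k / sample_rate                  # ideal crystal-clock instant
        dt = rng.gauss(0.0, jitter_rms_s)    # timing error for this sample
        ideal = math.sin(2 * math.pi * freq_hz * t)
        actual = math.sin(2 * math.pi * freq_hz * (t + dt))
        total += (actual - ideal) ** 2
    return math.sqrt(total / n)

for jitter in (0.0, 1e-9, 10e-9):
    err = jitter_error_rms(10_000.0, jitter)
    print(f"jitter {jitter * 1e9:4.0f} ns rms -> error {err:.2e} of full scale")
```

In this model the error grows in proportion to both the timing error and the signal frequency, which is why a steady clock matters more than a steady disc.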

Which gets us to the subject of CD rings. Many people are still in love with analog, and CD rings provide an element of comfort for those users who feel that anything they can do to their systems under the rubric of tender loving care will make them sound better. In the analog days, TLC could improve sound; anti-static devices, record cleaners, and a host of other accessories could make greater or lesser improvements. CD rings are sold for the purpose of "smoothing out" variations in the angular velocity of the CD itself and are supposed to make it sound better. But as we have seen, the data from the disc goes into a buffer and out again under crystal oscillator control.

Nothing we do to the disc can improve this. In fact, some players may well misbehave when a CD ring is used, because of the rotational inertia the ring adds.

In any event, jitter is bad, but if it occurs at all, it is usually as a result of electronic, not mechanical, design or function. It can be audible, if not properly addressed, but most CD players and other devices take measures to prevent or cure it.

Dither is something else entirely. It is quite beneficial in digital recording systems and has been a part of signal transmission systems for years. Basically, dithering is a method of adding low-level noise to a system for the purpose of randomizing, or confusing, the small-signal behavior of the system. It has been used in mechanical systems to minimize backlash in gear trains, and it is widely used in TV transmission.

Nakajima et al. mention in The Sony Book of Digital Audio Technology that dither can be generated by amplifying noise from a zener diode.

Recall the early days of digital, when many critics stated that decaying reverberant signals often "disappeared" when those signals fell below the theoretical lower limit of system resolution.

It is not likely that this was truly observed in normal recordings, but in theory, and in the laboratory, it could be demonstrated! In a 16-bit digital system, the maximum signal-to-noise ratio is on the order of 96 dB. Any attempt to encode a signal that is smaller than the least significant bit of the system will result in nothing at all being encoded. In a manner of speaking, the signal is too small and simply falls in the cracks! Another problem widely discussed in the early days of digital recording was that system distortion, as a percentage of the input signal, actually increased as the system input was lowered. This too was true, in a system completely without dither, whereas in any analog system, distortion almost always diminishes as the input signal level is reduced.
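Both effects can be reproduced with a few lines of arithmetic. The sketch below assumes an idealized quantizer that simply rounds to the nearest code, with no dither; the helper names, the test-tone levels, and the use of total quantization error as a stand-in for distortion are assumptions made for illustration.

```python
import math

def quantize(x):
    """Round to the nearest integer code; one code step is 1 LSB."""
    return math.floor(x + 0.5)

def sine(peak_lsb, n=1000, cycles=7):
    """n samples of a sine whose peak level is given in LSBs."""
    return [peak_lsb * math.sin(2 * math.pi * cycles * k / n) for k in range(n)]

def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))

# Ratio of the full 16-bit range to one LSB, in decibels (about 96 dB).
snr_db = 20 * math.log10(2 ** 16)
print(f"16-bit range: {snr_db:.1f} dB")

# A sine peaking at 0.4 LSB never reaches a rounding threshold,
# so every sample is encoded as the same code.
tiny = [quantize(s) for s in sine(0.4)]
print("codes used for a 0.4 LSB sine:", sorted(set(tiny)))

# Quantization error as a percentage of the signal grows as the level drops.
error_percent = {}
for peak in (1000.0, 10.0, 1.0):
    x = sine(peak)
    err = [quantize(s) - s for s in x]
    error_percent[peak] = 100 * rms(err) / rms(x)
    print(f"peak {peak:6.0f} LSB -> error {error_percent[peak]:.2f}% of signal")
```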

These two problems can be eliminated by adding a small amount of dither to the input. This added noise input effectively lessens the maximum S/N ratio of the digital system ever so slightly, but it's a small price to pay for the advantages obtained. I'll briefly try to explain how it works.

First, the human hearing mechanism has a remarkable ability to hear a signal buried in noise. If the noise is fairly broadband (free of any specific, prominent frequency components), then we can easily detect a midrange sine wave signal buried some 10 to 12 dB below the level of the broadband noise. That is, we can detect a signal that has a negative S/N ratio! The ear's ability to do this is a measure of its ability to "lock on" to a signal when there are no nearby frequency components to mask that signal. This means that the ear's dynamic range (its ability to detect signals from the loudest all the way down to the quietest) is sometimes greater than the S/N ratio of the signals being detected.
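A loose numerical analogy, and no more than that, shows why a negative S/N ratio need not hide a tone. The sketch below puts a sine 12 dB below broadband Gaussian noise and then measures the level at just two frequencies. It is not a model of hearing; the record length, the frequencies chosen, and the single-frequency measurement are all assumptions.

```python
import math
import random

N = 4096
rng = random.Random(7)

noise_rms = 1.0
sine_rms = noise_rms * 10 ** (-12 / 20)   # tone sits 12 dB below the noise
peak = sine_rms * math.sqrt(2)
tone_bin, empty_bin = 100, 150            # cycles per N-sample record

x = [peak * math.sin(2 * math.pi * tone_bin * k / N) + rng.gauss(0.0, noise_rms)
     for k in range(N)]

def amplitude_at(signal, bin_index):
    """Amplitude of the component at one frequency (a single DFT bin)."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * bin_index * k / n) for k, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * bin_index * k / n) for k, s in enumerate(signal))
    return 2 * math.hypot(re, im) / n

tone_amp = amplitude_at(x, tone_bin)
empty_amp = amplitude_at(x, empty_bin)
print(f"tone peak as generated:         {peak:.3f}")
print(f"level measured at the tone:     {tone_amp:.3f}")
print(f"level measured away from tone:  {empty_amp:.3f}")
```

The noise power is spread across all frequencies while the tone's power is concentrated at one, so a narrow enough look at that one frequency finds the tone standing well clear of its surroundings.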

The second factor operating here is that the presence of dither noise keeps the digital system from ever attempting to encode an input signal below the system's theoretical limit. What the system does instead is to encode a mixture of dither noise and that low-level input signal. Under these conditions, the low-level signal will be modulated by the noise itself and will appear as what is known as duty-cycle modulation or, more formally, pulse-width modulation (PWM) of the noise. When this is done, the ear tends to average out the signal from the noise, just the way it does in an analog tape recording system, where a sine wave may be similarly buried below broadband noise. In fact, the digital system at this point in its low-level operation resembles an analog system in many ways, surprising as that may seem.
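The duty-cycle idea can be sketched with a steady input smaller than one LSB. Without dither the quantizer always outputs code 0. With dither it flips between 0 and 1, and the fraction of time spent at 1 follows the input level. The uniform dither spanning plus or minus half an LSB, the steady test levels, and the function names are assumptions chosen to keep the example simple.

```python
import math
import random

def quantize(x):
    """Round to the nearest integer code; one code step is 1 LSB."""
    return math.floor(x + 0.5)

def average_code(level_lsb, dither, n=20000, seed=3):
    """Mean output code for a steady input, with or without uniform dither."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n):
        noise = rng.uniform(-0.5, 0.5) if dither else 0.0
        total += quantize(level_lsb + noise)
    return total / n

for level in (0.1, 0.25, 0.4):
    plain = average_code(level, dither=False)
    dithered = average_code(level, dither=True)
    print(f"input {level:.2f} LSB -> average code {plain:.3f} undithered, "
          f"{dithered:.3f} dithered")
```

Since the only codes produced here are 0 and 1, the dithered average is simply the proportion of samples sitting at code 1, which is the duty cycle that any averaging process, the ear included, can turn back into a level.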

Now, let's get back to those critics who said that reverberant signals often disappeared at low levels in digital recording. I earlier implied that this did not actually happen in real-world recordings, but only in the laboratory.

The reason? There has always been a certain amount of unintentional dither noise in most signal paths. It can be due to the self-noise inherent in microphones or to the input noise in all recording consoles. In order to function as dither, these various noise sources must have an amplitude about one-half (or more) of the least significant bit, and this is almost always the case.
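How much noise is enough can be explored with a similar sketch. A 1 kHz sine peaking at 0.4 LSB is mixed with Gaussian noise of various levels, quantized, and then correlated against the original tone to see how much of it survived. Treating the "amplitude" of the noise as an RMS value, the Gaussian noise model, and the function name `recovered_peak` are assumptions.

```python
import math
import random

def quantize(x):
    """Round to the nearest integer code; one code step is 1 LSB."""
    return math.floor(x + 0.5)

def recovered_peak(noise_rms_lsb, peak_lsb=0.4, n=20000, seed=5):
    """Peak level (in LSBs) of the test tone found in the quantized output."""
    rng = random.Random(seed)
    num = den = 0.0
    for k in range(n):
        ref = math.sin(2 * math.pi * 1000.0 * k / 44100.0)   # 1 kHz reference
        code = quantize(peak_lsb * ref + rng.gauss(0.0, noise_rms_lsb))
        num += code * ref        # correlate the output with the reference
        den += ref * ref
    return num / den

for noise in (0.0, 0.1, 0.5):
    print(f"noise {noise:.1f} LSB rms -> tone recovered at "
          f"{recovered_peak(noise):.3f} LSB peak (sent at 0.400)")
```

Under these assumptions, noise well below half an LSB lets only a fraction of the tone through, while noise of about half an LSB passes it at very nearly its original level.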

This discussion has pointed up the difference between two often confused terms, dither and jitter. It has also pointed up the differences between two other terms, signal-to-noise ratio and dynamic range, which themselves are often confused. Remember, the maximum S/N ratio of a system is simply a measure, in decibels, of the maximum signal level relative to the level of the system's noise floor. The effective dynamic range of the system may be some 10 to 12 dB greater, taking into account the ability of the ear to detect signals below the noise floor.

(adapted from Audio magazine, Jan. 1992)
