96/24: Digital Heaven or Digital Hell? (Jul. 1998)


By Bob Katz and Ken Kantor

-----------------

BOB KATZ IS AN AUDIOPHILE RECORDING ENGINEER AND PRESIDENT AND CHIEF MASTERING ENGINEER OF DIGITAL DOMAIN, ORLANDO, FLA. HIS RECORDING OF PAQUITO D'RIVERA'S BIG BAND WON THE 1996 LATIN-JAZZ GRAMMY AWARD.

If you want to learn about the sound of high-end digital audio technology, ask a professional mastering engineer. Day after day we slave over hot consoles, making careful A/B decisions about dithering (what flavor dither should we use today?), equalization (should it be IIR, FIR, or analog?), word length (is this new 32-bit processor really better than yesterday's 20-bit?), and sample rate (which sample-rate converter really sounds best?). Every decision is important to us, and our clients (the record companies and the artists) are concerned about the quality of sound we deliver to them.

When I began digital mastering in 1988, the digital audio art was just coming out of the dark ages. I built the first workable model of Bob Adams' (then of dbx) 128-times oversampling analog-to-digital converter and used it to engineer the world's first high-oversampled Compact Discs for Chesky Records. The latest version of this ADC, made by UltraAnalog, has a 5-bit front end that is capable of 20-bit (120-dB) dynamic range when its output is decimated down to the CD sample rate. All high-quality audio A/D converters made today are oversampling designs, and this has made a big contribution to the exponential improvement in CDs since 1988.

The Second Digital Audio Revolution

I mark 1993 as the beginning of the second digital audio revolution. Before then, I was recording direct to 16-bit with pink-noise dither added in the analog domain. (Dither is used to linearize quantization and enable signals at levels below -96 dBFS to remain audible.) Around 1993, engineers began recording at 20 bits, which sounds wonderful in the studio, and trying to find ways to get that sound quality to the consumer. Pretty tough job. Engineer Keith Johnson compares the task of fitting 20-bit sound into 16 bits to rolling a bowling ball down a garden hose.
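The parenthetical claim about dither, that it lets signals smaller than one quantization step survive, can be illustrated with a minimal Python sketch. It is not the analog pink-noise dither described above: it assumes textbook TPDF (triangular) dither added digitally, a hypothetical 0.4-LSB sine as the test signal, and a fixed random seed so the run is repeatable.

```python
import math
import random

def quantize(x):
    """Round to the nearest integer step (1.0 represents one LSB)."""
    return math.floor(x + 0.5)

def sine(n, amplitude_lsb=0.4, period=100):
    """Hypothetical test tone whose peak is smaller than half an LSB."""
    return amplitude_lsb * math.sin(2 * math.pi * n / period)

def recovered_amplitude(outputs, period=100):
    """Correlate the quantized output with the tone to estimate how much of it survived."""
    acc = sum(q * math.sin(2 * math.pi * n / period) for n, q in enumerate(outputs))
    return 2.0 * acc / len(outputs)

rng = random.Random(1998)  # fixed seed, assumption for repeatability
N = 48_000                 # a whole number of tone periods

# Without dither every sample rounds to zero: the tone is simply gone.
plain = [quantize(sine(n)) for n in range(N)]

# TPDF dither: difference of two uniform values, spanning +/-1 LSB.
dithered = [quantize(sine(n) + rng.random() - rng.random()) for n in range(N)]

print(recovered_amplitude(plain))     # 0.0
print(recovered_amplitude(dithered))  # close to 0.4, buried in noise
```

The dithered output is noisier, but the tone is still present in it at its original level, which is the sense in which dither "linearizes" the quantizer.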

So we began using high-resolution noise-shaped dither, which allows us to get more of that 120-dB dynamic range to you. That doesn't mean we've increased the ratio of forte to piano on typical CDs (though I'd like to). But this small change in noise at the lowest levels has improved the sense of ambience, space, warmth, depth, and separation on our CDs. This is audible even at normal listening levels, with almost any kind of music. In that respect, CDs made today sound much better than those made in 1983, or even 1990.
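To give a rough idea of what "noise-shaped" means, here is a minimal Python sketch of the simplest possible shaper, a first-order error-feedback loop. This is an assumption chosen for illustration; the high-resolution shapers used in mastering are higher-order, psychoacoustically tuned designs, and nothing here reproduces them. The test tone, block size, and seed are likewise hypothetical.

```python
import math
import random

def requantize(samples, shaped, seed=0):
    """Reduce fractional samples to whole LSB steps with TPDF dither.

    When `shaped` is True, the previous sample's total error is subtracted
    from the next input (first-order error feedback).
    """
    rng = random.Random(seed)
    out, err_prev = [], 0.0
    for x in samples:
        target = x - err_prev if shaped else x
        dither = rng.random() - rng.random()
        y = math.floor(target + dither + 0.5)
        err_prev = y - target  # error made on this sample (dither + rounding)
        out.append(y)
    return out

def low_band_error_power(samples, out, block=64):
    """Crude low-pass measurement: power of the error averaged over blocks."""
    errs = [y - x for x, y in zip(samples, out)]
    means = [sum(errs[i:i + block]) / block
             for i in range(0, len(errs) - block + 1, block)]
    return sum(m * m for m in means) / len(means)

# Hypothetical test tone, 100 LSB peak, with fractional sample values.
tone = [100.0 * math.sin(2 * math.pi * n / 480) for n in range(64_000)]

plain_lf = low_band_error_power(tone, requantize(tone, shaped=False))
shaped_lf = low_band_error_power(tone, requantize(tone, shaped=True))
print(plain_lf, shaped_lf)  # the shaped figure is far smaller
```

The shaper does not remove noise; total noise power actually rises. It moves the noise toward high frequencies, leaving less of it in the low band measured here.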

Some skeptics may find it hard to believe that so little a change in noise makes so much difference. Listen for yourself: Compare two CDs whose only technical difference is in how they were dithered. Both were made in a transition period before we began to record 20-bit. The original session tapes are 16-bit (DAT), all from the same recording session. The first CD, Clark Terry Live at the Village Gate (Chesky JD49), was produced using simple pink-noise dither, which was applied twice, first during recording and then during post-production. The second CD, Clark Terry, The Second Set (Chesky JD127), was made with pink-noise dither during recording and high-resolution noise-shaped dither in post-production. The second CD sounds dramatically warmer, with a wider, deeper soundstage and ambience, all because we used differently shaped noise at a nominal -96 dBFS! I promise that no equalization was used on either recording, though you'd think they were made with different microphones or in a different hall. If only we could completely remove the veil of 16-bit dither and present an original 20- or 24-bit recording to you. That is the promise of DVD.

Every day in my control room, I have the pleasure of auditioning high-resolution, 20- to 24-bit recordings. True 24-bit dynamic range (144 dB) is not achievable with current A/D or D/A converters; jitter and thermal noise limit even the best to around 20 to 21 bits (on a good day). But there is some resolution below 20 bits. Thus, all else being equal, 20-bit converters sound dramatically better than 16-bit, and 24-bit converters slightly better still. I cut about a CD a day, and each day is a clear demonstration: It takes at least 20 bits to capture the full ambient and spatial qualities from our mixes.
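The bit-depth and decibel figures quoted here follow from the usual rule of thumb of roughly 6 dB per bit. A small Python sketch, assuming the simple ratio 20·log10(2^N) and ignoring the extra 1.76 dB that applies to a full-scale sine:

```python
import math

def dynamic_range_db(bits):
    """Ratio of full scale to one quantization step, in dB."""
    return 20.0 * math.log10(2 ** bits)

def effective_bits(range_db):
    """Inverse: how many bits a given dynamic range corresponds to."""
    return range_db / (20.0 * math.log10(2))

for bits in (16, 20, 24):
    print(bits, round(dynamic_range_db(bits), 1))
# 16 96.3
# 20 120.4
# 24 144.5
```

By the same arithmetic, a converter limited by jitter and thermal noise to "20 to 21 bits" has roughly 120 to 126 dB of usable range.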

But what about 24-bit? Robert Stuart of Meridian Audio reminds us that it is possible to hear a 24-bit truncation in an 18-bit reproduction system. Working with a 16-bit reproduction system, I have clearly heard 20-bit truncations. So everything has to be done right. Intermediate DSP operations must be performed at 24-bit or better accuracy to maintain purity of tone when the final product is to be 16 to 20 bits. This illustrates the concept of professional headroom.
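Why intermediate precision matters can be seen in a toy Python sketch. It assumes a hypothetical chain of ten gain stages whose net gain is about 1, and compares rounding to whole steps after every stage with carrying extra precision through the chain and rounding once at the end. The gain values and the test ramp are illustrative only, not taken from any real processor.

```python
import math

def round_half_up(x):
    """Round to the nearest integer step."""
    return math.floor(x + 0.5)

# Hypothetical processing chain: alternating cut and boost, net gain ~1.
GAINS = [0.7, 1 / 0.7] * 5

def process(sample, round_each_stage):
    """Run one sample through the chain, optionally rounding after every stage."""
    value = float(sample)
    for gain in GAINS:
        value *= gain
        if round_each_stage:
            value = float(round_half_up(value))
    return round_half_up(value)  # final word-length reduction

def rms_error(round_each_stage, samples):
    """RMS difference between the processed output and the ideal result."""
    ideal_gain = math.prod(GAINS)
    squares = [(process(s, round_each_stage) - s * ideal_gain) ** 2 for s in samples]
    return math.sqrt(sum(squares) / len(squares))

samples = range(-1000, 1001)  # simple integer ramp as test input
rms_short = rms_error(True, samples)   # short word length at every stage
rms_long = rms_error(False, samples)   # extra precision, rounded once
print(rms_short, rms_long)
```

Rounding at every stage leaves a measurable accumulated error, while the single final rounding is essentially exact in this toy case. A real chain would also dither each reduction, which this sketch omits.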

However, there is some justification for reducing a 24-bit signal to 20 at the end of the chain, because consumer D/A converters seldom exceed 20-bit precision. You probably will get better sound from such converters if they are fed signals that have been dithered down to 20 bits than if they are presented with raw 24-bit signals that they then truncate during conversion. I don't think I can hear the difference between a 24-bit source and a 20-bit, noise-shaped reduction of it, but I can clearly hear a dithered reduction to 18 or 16 bits. So I don't see any reason to get into a war over 24 versus 20 bits. I have no doubt, however, that we need at least 20 bits; 16 is not enough.

The Third Digital Audio Revolution

Recently a digital equalizer was introduced that employs double-sampling technology. It accepts up to 24-bit words at 44.1 or 48 kHz, upsamples the signal to 88.2 or 96 kHz, performs 32-bit EQ calculations, and then re-samples the output back to 24 bits at 44.1 or 48 kHz. I was very skeptical, thinking that these heavy calculations would degrade the sound, but the equalizer won me over. Its low distortion gives the midrange an open sound. The improvement is measurable and quite audible: more, well, analog than I've heard from any other digital equalizer.

Which brings us to the third digital audio revolution: calculation and recording at higher sample rates. No, we haven't magically developed ultrasonic hearing capabilities, but there is good scientific foundation for improvement. In a white paper on the subject, Dr. James A. (Andy) Moorer, Sonic Solutions' senior vice president for advanced development, explains that, in general, "keeping the sound at a high sampling rate, from recording to the final stage will...produce a better product, since the effect of the quantization will be less at each stage." In other words, because errors are spread over a much wider bandwidth, we notice less distortion in the band from 20 Hz to 20 kHz. Sources of such distortion include cumulative coefficient inaccuracies in filter (EQ) and level calculations.
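The "spread over a wider bandwidth" argument can be put into numbers with a short Python sketch. It assumes the idealized case in which quantization error behaves as white noise spread evenly from 0 Hz to half the sample rate, and it takes 20 kHz as the top of the audio band; real calculation errors are not guaranteed to be this well behaved.

```python
import math

def in_band_noise_db(sample_rate_hz, audio_band_hz=20_000.0):
    """Share of total white quantization noise power that falls in the audio band, in dB."""
    nyquist_hz = sample_rate_hz / 2.0
    return 10.0 * math.log10(audio_band_hz / nyquist_hz)

# Doubling the sample rate halves the in-band share of the same total noise.
gain_db = in_band_noise_db(48_000) - in_band_noise_db(96_000)
print(round(gain_db, 2))  # 3.01
```

Under these assumptions each doubling of the sample rate buys about 3 dB at every processing stage, which is modest per stage but accumulates over a long chain.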

Moorer also points out that the improvement afforded by high sampling rates "is a binaural (two-ear) phenomenon. If we plug one ear, it is unlikely that anyone would be able to distinguish a 96-kHz recording from a 48-kHz recording; ...some kind of time-domain resolution between the left and right ear signals is more accurately preserved at 96 kHz." And he notes that because of the errors in the decimation stages of typical consumer D/A converters, "on the average, it is likely that a consumer-quality 96-kHz converter will sound better than a consumer-quality 44.1 or 48-kHz converter, simply because it might be built with one less decimation/quantization stage." (For a copy of Moorer's paper, contact Chris Kryzan via e-mail at kryzan@sonic.com.) The sonic improvements from recording at 96 kHz are not as dramatic as from an increase in word length, but they are important enough, in my opinion, to justify using more storage space on the consumer DVD. Mike Story, chief engineer of dCS, has given other reasons why 96-kHz sampling can sound better than the current standard.

In a paper presented at the 96-kHz mastering workshop at the 103rd Convention of the Audio Engineering Society, Story also focused on binaural and localization improvements. He demonstrated that relaxed anti-alias filtering constraints (e.g., Nyquist filtering at 48 kHz instead of 22.05 kHz) result in better spatial resolution. He said that the energy spread of digital filters designed for 48 kHz produces an equivalent distance smear of ±15 centimeters (at the speed of sound), whereas digital filters designed for 96 kHz keep almost all the filter dispersion within a very tight 1.5 centimeters. (Copies of this paper are available via e-mail from mstory@dcsltd.co.uk.)

Good News on Disc Capacity

And there's good news with regard to storage capacity, provided the DVD Forum heeds Bob Stuart's advice. Stuart and the late Michael Gerzon showed that a combination of lossless compression, noise shaping, and pre-emphasis can significantly reduce the storage requirements for 96-kHz audio at no sacrifice to sound quality.

Since this subject was covered extensively by Stuart himself in the April issue of this magazine, I'll summarize simply by noting that appropriate application of these techniques would enable more than 74 minutes of five-channel, 24-bit, 96-kHz audio to be packed onto just one side of a DVD Audio disc.
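A quick back-of-envelope calculation in Python shows why these techniques are needed for that figure. The 4.7-billion-byte capacity used below is the nominal size of a single-layer DVD side and is an assumption here; the article does not say which disc type is meant.

```python
def raw_pcm_bytes(sample_rate_hz, bits, channels, minutes):
    """Uncompressed linear PCM size in bytes."""
    return sample_rate_hz * bits * channels * minutes * 60 // 8

DVD_SINGLE_LAYER_BYTES = 4_700_000_000  # assumption: nominal single-layer side

raw = raw_pcm_bytes(96_000, 24, 5, 74)
print(raw)                                      # 6393600000
print(round(DVD_SINGLE_LAYER_BYTES / raw, 3))   # 0.735
```

Uncompressed, the program would need about 6.4 billion bytes, so under this assumption the data would have to shrink to roughly 73 percent of its raw size to fit on one such side.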

And the good news doesn't stop there. Remember the lesson of the improved Clark Terry CD? Record companies are sitting on a new gold mine. Even old, 16-bit/44.1-kHz session tapes can exhibit more life and purity of tone if properly reprocessed and reissued on a 20-bit, 88.2-kHz DVD-Audio disc.

I will continue to advocate higher-resolution digital recording and processing and to practice what I preach. The benefits are apparent on many currently available CDs, and you may not have to hold your breath much longer to hear even better sound in the home.

------------------------------

KEN KANTOR IS CO-FOUNDER AND CHIEF TECHNOLOGY OFFICER OF VERGENCE TECHNOLOGY, INC., A MAKER OF HIGH-PERFORMANCE STUDIO RECORDING EQUIPMENT, INCLUDING THE NHT PRO LINE OF MONITOR LOUDSPEAKERS. A FREQUENT CONTRIBUTOR TO AUDIO, KEN HAS AN EXTENSIVE BACKGROUND IN ELECTRONIC AND ACOUSTIC ENGINEERING AND CAN BE READ ONLINE AT WWW.E-TOWN.COM.

Nobody who knows what he is talking about would ever claim that 44.1-kHz, 16-bit audio is a major limiting factor in home playback fidelity. A lot of work went into choosing a standard that is, for all intents and purposes, flawless in this situation. In the home, the playback level is adjusted by the user's volume control, and so 90+ dB of signal-to-noise ratio reaches far beyond any musical needs, especially considering the domestic ambient-noise floor. Bandwidth is not a limitation, either. The audio signal will most likely undergo only one D/A conversion, and even three or four would not be an issue. All the junk you hear about quantization and low-level detail is spewed by people who haven't a clue what Nyquist math really says about signal reproduction (or they do but are trying to sell something anyway).

Even so, no one who is well-informed would claim that 44.1-kHz, 16-bit audio is totally transparent under all conditions. It is easy to come up with artificial signals and conditions that will highlight various limitations and artifacts. This doesn't mean they are any kind of problem in the home, but they sure can be in the studio. When recording, an engineer doesn't have the luxury of adjusting volume on the fly. Headroom is a necessity. And with the mixing and processing of many channels, 16 bits just doesn't cut it. It is usable, but it isn't easy or fun. Going to 18 bits is a big improvement, and 20 bits is an outright luxury, allowing total freedom in recording levels as well as complete processing and mixing flexibility.

As to 44.1-kHz sampling, that's more controversial. It should be okay, really, and it is typically the sample rate of choice over 48 kHz (also available in most studios), since it makes things easier to send to CD. But it does demand careful filter design, and everyone secretly wishes it were up at least around 60 kHz, just to be sure.

Well, then, it's settled, right? Technology has advanced, memory is cheap, and we can have a wonderful audio medium by moving from the current standard up to a 60-kHz, 20-bit system. Hell, we can even go to 88.2 kHz (double the CD rate) to make the changeover simpler. Recording engineers will be thrilled, audio purists will be placated (at least the sane ones), and we'll need only a little more than twice the amount of storage space we use now. Plenty of room left for multichannel. I'm there, dude!

Oh yeah? Well, then, what's up with 96/24? It certainly is unjustified sonically, and it seems a little extreme even for marketing hype. Not only does it pointlessly burn up more storage space, it makes studio processing equipment and computer hard-disk-based recording systems a total nightmare. Affordable high-end consumer digital recording or performance-oriented computer sound cards? Guess. "Big Audio" doesn't like just how good inexpensive digital recording and playback equipment is getting. Pushing for 96/24 seems downright greedy to me, if the motive is really planned obsolescence.
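The storage cost of each format discussed here is easy to check with a few lines of Python. The sketch assumes uncompressed linear PCM with the same channel count as CD, so only sample rate and word length change.

```python
CD_BITS_PER_SECOND_PER_CHANNEL = 44_100 * 16

def storage_vs_cd(sample_rate_hz, bits):
    """Storage per channel relative to 44.1-kHz/16-bit CD audio."""
    return (sample_rate_hz * bits) / CD_BITS_PER_SECOND_PER_CHANNEL

for name, rate, bits in [("60/20", 60_000, 20),
                         ("88.2/20", 88_200, 20),
                         ("96/24", 96_000, 24)]:
    print(name, round(storage_vs_cd(rate, bits), 2))
# 60/20 1.7
# 88.2/20 2.5
# 96/24 3.27
```

On this basis a 60-kHz, 20-bit system needs about 1.7 times the space of CD, 88.2/20 needs 2.5 times, and 96/24 about 3.3 times.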

Recently I was discussing this issue with a well-known recording engineer/producer friend of mine. I was also adding the angle that a big incentive behind 96/24 is the fear that the mechanical/optical media and gear manufacturers have of solid-state memory. After all, we are getting ever closer to the point where you can fit a CD's worth of 44/16 on a chip. Easy to manufacture, no transport needed. Where would this leave the billions (trillions?) of dollars the Sonys and Philipses of the world have invested in optical media development? If you were in the disc business, wouldn't you want to up the ante? The more memory you can hog, the safer your technology is. Mere 44/16 or 88/20 would look downright low-fi to the consumer. Optical would stay atop the alleged quality heap.

But my friend offered an interesting view. As a person who makes his living producing audio "content," he was telling me how annoyed producers were at the introduction of the CD. Used to be, you got paid for finishing an LP with maybe 20 minutes of music a side. All of a sudden, people expected an hour of sound. But since production budgets didn't go up proportionately, producers got paid the same to do 50% more work. Now there's DVD, and people expect even more playing time. A movie lasts 90 minutes or more, so why should I pay for only 60 minutes of music? Producers are dreading this. Bands are dreading it. Solution: Waste bandwidth. Fill up that disc with extra bits, and tell people they are getting better quality.

I say: BS! Whatever the reason, or reasons, behind 96/24, it is a dumb idea, and you might have to pay for it. But you shouldn't want to. Instead of moaning about Divx, which is bound to go down in flames on its own, watch out for the digital excess of 96-kHz/24-bit audio, an equally anti-consumer concept you may well be forced to live with forever.

[Adapted from July 1998 Audio magazine article.]

Also see: 96/24 on DVD: First Impressions of the 96-Khz/24-Bit Stereo Music DVDs From Chesky and Classic Records (July 1998)