CAN YOU REALLY HEAR THOSE HI-FI SPECS? It all boils down to dynamic range and achievable loudness [Jan. 1976]



By Mitchell Cotter [Mitchell Cotter trained as a classical pianist, studied engineering, and headed the Consumers Union audio division for seven years.]

"The human ear functions as a refined acoustic-frequency analyzer that has some unusually fast-acting sensing abilities for spectral character."

Component specifications, and the terms in which they are expressed, present a particularly knotty problem for the neophyte audiophile. The stumbling block isn't just the vocabulary, but the unclear relationship that exists between technical specifications and audible performance. For example, it is evident that 100 watts of power is "better" than 10 watts, but if some specification number is twice (or half) another number, will the performance be twice (or half) as good? And beyond that, just how good is good enough? These questions seem simple, and they are certainly legitimate, but for a variety of reasons they are very difficult to answer. In a nutshell, the problem is this: how to correlate objectively measurable sonic and electrical phenomena with subjectively perceived sound. Psychoacoustics, which is a branch of the science of auditory perception, has discovered many little-appreciated aspects of ear/brain performance that have an enormous influence on what we hear and under what circumstances. Before setting particular design goals for high fidelity components, it therefore makes sense to establish, as best one can, the significance of these psychoacoustic phenomena in respect to the science and art of sound reproduction.

We asked Mitchell Cotter, an engineer with broad concerns in the areas of psychoacoustics, recording techniques, and component design and evaluation, to examine some aspects of these questions for us. Mr. Cotter's emphasis throughout is on the sonic significance of the various specifications rather than the tempting opportunities they provide for one-upmanship ploys among advanced audiophiles.

-Larry Klein, Technical Editor

It should be understood at the outset that the ultimate function of an audio component is to satisfy the ear of the listener. And it is the ear, with all its limitations and its peculiarities, that should determine how a component is designed. For these reasons it is worth taking a look at what we know about the sense of hearing as it relates specifically to musical perception.

The two major descriptive qualities of hearing sensation are (1) the sense of spectrum and (2) the sense of loudness. And both of these, further, are perceived most meaningfully as they change in time. Without exaggeration, it can be said that all specifications and hearing phenomena are derived from variants and combinations of these few basic qualities.

1. SPECTRUM

Spectrum is concerned with the concepts both of pitch and of tonal color or sound character. Two instruments may emit sounds of the same pitch, but they will be of a clearly different character.


------------- Fig. 1. A sine wave, which represents a single pure mode of vibration devoid of harmonics, is not a musically interesting sound.

----- Fig. 2. The waveforms of two different notes played on a clarinet. Although the frequencies of the notes and the overall shapes of their waveforms are different, the harmonic frequency structures are such as to make them both immediately recognizable to the ear as a clarinet.


----------- Fig. 3 (left): perceived pitch vs. frequency. Most of the span of pitch change occurs below the top note of the piano (4,000 Hz). These data were taken at a low level, 40 dB above threshold. Fig. 4 (right): when the level changes, the subjective change in pitch of pure tones shows large shifts for the important melodic frequencies (about 150 to 10,000 Hz).


-------------- Fig. 5. The waveform shown (a group of closely related harmonics of similar strengths) gives rise to the lower-frequency dashed-line "envelope." The ear responds to this frequency that does not actually exist.

Pitch varies with frequency, but not in a direct way. Frequency describes the number of complete vibrations per second of a pure (uncolored) tone. The unit of measure for frequency now in use is the hertz (after Heinrich Hertz, a nineteenth-century German physicist) and is abbreviated Hz, or kHz for thousand (kilo) hertz. A pure tone will appear as a sine wave when graphed as, say, air-pressure variation with time.

Such a waveform, photographed in its audio-signal embodiment on an oscilloscope screen, is shown in Figure 1. Musically, this pure tone is not very interesting. Tonal color results from the complication of the waveform as in Figure 2, which represents the sound of a note on the clarinet. Here the pure sine-wave fundamental tone has been made more interesting musically by the addition of the specific harmonic-frequency structure characteristic of the clarinet. In other words, the individual sound of an instrument is the result of its unique frequency spectrum: the combination of the fundamental tone with an assortment of harmonics.

Pitch is the subjective response to the basic frequency of a wave. The lower the frequency, the lower the pitch; however, keep in mind that this relationship is not direct. The subjective sense of pitch is "calibrated" in mels and has been measured using many test subjects. Figure 3 shows the relationship between subjective mels and objective frequency. The foldout chart accompanying this discussion relates frequency to the musical scale and demonstrates that the higher-frequency end of the scale (5,000 Hz and up) does not provide significant pitch effects. One reason is that this range is too compressed; frequencies from 5,000 to 20,000 Hz represent only the last part of the mel range, while the audible frequencies below 5,000 Hz span most of it. In addition, it appears that the upper frequencies, if assigned too great a role in composition, are much too irritating to the ear to have musical value. The data of Figure 3 were determined at only one rather low level of intensity, and increasing the sound level causes the pitch to shift. Figure 4 shows that as the level is increased, as it would be in the performance of music, substantial pitch changes occur.
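The compression of the upper range can be illustrated numerically. The sketch below uses a widely quoted closed-form approximation of the mel scale; that formula is an assumption brought in purely for illustration and is not the measured data of Figure 3, so the numbers should be read as indicative only. The function name `hz_to_mel` is likewise hypothetical.

```python
import math

def hz_to_mel(frequency_hz: float) -> float:
    """Approximate pitch in mels (assumed formula, not the Figure 3 data)."""
    return 2595.0 * math.log10(1.0 + frequency_hz / 700.0)

# Express each frequency as a share of the pitch span up to 20,000 Hz.
top_of_scale = hz_to_mel(20_000.0)
for frequency in (100, 1_000, 5_000, 20_000):
    mels = hz_to_mel(frequency)
    share = 100.0 * mels / top_of_scale
    print(f"{frequency:>6} Hz -> {mels:7.0f} mels ({share:5.1f}% of the scale)")
```

Under this approximation a 1,000-Hz tone lands at very nearly 1,000 mels, and roughly 60 percent of the pitch scale is used up by 5,000 Hz, even though 5,000 to 20,000 Hz covers three quarters of the frequency range.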

The sound-intensity level covered in Figure 4 pretty well spans the range of the softest pianissimo to the loudest fortissimo. This indicates that when music is performed at normal levels, these subjective pitch changes are occurring constantly. What does this mean when it comes to the reproduction of this performance? It is quite clear that if the sound level of the reproduction is substantially different from that of the original, some alteration must occur, and that alteration will have musical significance.

For complex sounds with a rich structure of harmonics, the ear's sense of pitch does some surprising things. For example, the pitch perceived may arise from the spacing in frequency of the harmonics rather than from the strength of the fundamental tone. As a matter of fact, one may "hear" the fundamental even when it is simply not there: when a set of tones differing by some constant is heard simultaneously (say, 500, 600, 700, 800, 900, and 1,000 Hz), 100 Hz, the constant between them all, is heard as the pitch of the combination. If the odd multiples (500, 700, and 900 Hz) are dropped from this series, then the pitch appears to be the constant 200 Hz. Replace them, and the phantom 100 Hz returns. This so-called synthetic pitch effect occurs because, to an extent, it is the sound "envelope" that the ear responds to (see Figure 5) rather than just discrete frequencies. This has important consequences in the reproduction of music.

Because of it, certain musical sounds that in actuality are relatively deficient in fundamental bass frequencies nevertheless appear to include them.
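For an idealized set of exact harmonics, the "constant between them all" is simply the greatest common divisor of the frequencies. The sketch below works through the 500-to-1,000-Hz example on that assumption; it is an arithmetic illustration only, not a model of how the ear actually derives pitch, and the function name `implied_pitch_hz` is hypothetical.

```python
from functools import reduce
from math import gcd

def implied_pitch_hz(partials_hz):
    """Largest frequency of which every partial is a whole-number multiple."""
    return reduce(gcd, partials_hz)

# The series from the text: no 100-Hz component is actually present.
full_series = [500, 600, 700, 800, 900, 1000]

# Dropping the odd multiples of 100 Hz leaves 600, 800, and 1,000 Hz.
even_only = [f for f in full_series if (f // 100) % 2 == 0]

print(implied_pitch_hz(full_series))  # 100
print(implied_pitch_hz(even_only))    # 200
```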

Illustrations such as Fig. 5 are "static," long-time averages and cannot express the changing quality of sound. However, it is just these changes and the way they are arranged that are the essence of music. The human ear functions as a refined acoustic-frequency analyzer that has some unusually fast-acting sensing abilities for spectral character. We recognize musical instruments not only by the characteristic harmonic structure in their sounds, but also by the distribution or variation of that structure in time. To the ear, the long-term average of the sound spectrum is therefore not as significant as the short-term, which, oddly enough, is why the smoothness of the frequency response of an audio system is of such importance. The presence of sharp peaks (or valleys) in the frequency response causes resonances at the frequency of the peak; the sharper and higher the peak, the more pronounced the effect. Note that these "steps" in response level, if they cover a small part of an octave, are not heard as frequency-balance aberrations. Rather, if steep enough, they tend to cause "ringing" at the frequency related to the step. A frequency-response curve taken using only sine waves often doesn't reveal a ringing problem, which is why tone-burst tests are helpful in evaluating sound equipment, particularly speakers. The presence of significant ringing or resonance shows itself as an alteration of the tone burst at the point of turn-on or turn-off (see Figure 6, a and b).
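The idea behind a tone-burst test can be sketched in a few lines. The example below feeds a short burst of sine wave into a simple two-pole digital resonator, which stands in for a resonant device, and measures how much of the output energy arrives after the input has stopped. Everything here is an assumption chosen for illustration: the sample rate, the burst length, the resonator itself, and the two damping values. It is not a model of any real loudspeaker.

```python
import math

SAMPLE_RATE = 48_000    # samples per second (assumed)
TONE_HZ = 1_000         # burst frequency, also the resonance frequency (assumed)
BURST_SAMPLES = 384     # 8 cycles of 1 kHz at 48 kHz
TOTAL_SAMPLES = 1_536   # burst followed by silence

def tone_burst():
    """A few cycles of sine wave followed by silence."""
    w = 2 * math.pi * TONE_HZ / SAMPLE_RATE
    return [math.sin(w * n) if n < BURST_SAMPLES else 0.0
            for n in range(TOTAL_SAMPLES)]

def resonator(signal, pole_radius):
    """Two-pole resonance at TONE_HZ; pole_radius near 1.0 means little damping."""
    w = 2 * math.pi * TONE_HZ / SAMPLE_RATE
    a1 = 2 * pole_radius * math.cos(w)
    a2 = pole_radius ** 2
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = x + a1 * y1 - a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def ringing_fraction(pole_radius):
    """Share of the output energy that arrives after the input has stopped."""
    out = resonator(tone_burst(), pole_radius)
    tail = sum(v * v for v in out[BURST_SAMPLES:])
    total = sum(v * v for v in out)
    return tail / total

print(f"well damped:   {ringing_fraction(0.50):.4f}")
print(f"lightly damped: {ringing_fraction(0.99):.4f}")
```

The lightly damped case leaves a much larger share of its energy in the silent interval after the burst, which is the numerical counterpart of the tail visible between bursts on an oscilloscope.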


--------- RELATIVE LOUDNESS LEVELS OF COMMON SOUNDS


---------- THE FREQUENCIES OF MUSIC (Ranges of fundamental components of tones for the principal instruments and voices).

The presence of such resonances does things to the time relationships among the various components that make up the spectrum of sounds.

These selective and unwanted delays change the character of instrumental sounds, particularly during the attack portions of the notes (see Figure 7).

Loudspeakers (and to a lesser extent microphones) are the principal devices that cause such effects. One reason the problem occurs in speakers is that the energy pushing at the voice coil that drives the cone takes a finite time to propagate itself over the cone surface that radiates the sound.

Since nothing in nature is infinitely damped, stiff, or weightless, some resonance inevitably occurs in the cone itself. The kind of material and the degree of damping determine the quality of the performance, but insofar as the design can be executed so that the electrical audio signal causes the entire radiating surface to move in step, problems are avoided at the outset. Today there are many electrostatic and recently developed dynamic speaker designs that distribute the drive force over the entire radiating diaphragm.


------------- Fig. 6. Ringing is relatively absent (top) and present in rather large amounts (bottom) in these two oscilloscope photographs of tone bursts. Such ringing is directly associated with resonances in the device under test, in this case a loudspeaker. Note that, between the bursts, the speaker is unable to completely stop producing an output and, in the serious case, reluctant to start.

The resulting lack of high-stress flexure of the radiating surface in these designs greatly reduces the tendency to resonate, keeps all parts of the "cone" in phase, and usually results in sonic clarity or transparency.

Once the overall frequency balance of the sound is achieved, the transient-response factors assume great significance. This time-domain behavior of loudspeakers is certainly an area that does not receive a great deal of attention as far as specifications go, and yet it would seem to be one of the most important in respect to the natural reproduction of music. Loudspeakers whose frequency responses seem to be generally the same also sound different for another reason: the sound radiated into the listening room is distributed differently by different designs, and the reflections from the wall surfaces of any particular listening room significantly affect the sound that arrives at the listener's ears. In general, the spectral character (frequency balance and range) of the sound heard in the average room is always an integration of the sound radiated directly toward the listener and the sound reflections coming to him from the room surfaces.


----------- Fig. 7. The "bite" in the attack of a musical instrument is dependent on the rate at which the fundamental and various harmonics of the note being played grow in strength. In the three examples (trumpet, violin, and clarinet), the fundamentals (labeled 1) and the harmonics (2, 3, 4, etc.) can actually be seen to develop at different rates and amplitudes. The sharp onset of the trumpet note is indicated by the sudden, rapid rise of its entire harmonic structure. These patterns change significantly not only from instrument to instrument, but even for the same instrument, depending on the note being played, the loudness, and the way it is played. Musicians achieve expression in playing their instruments by controlling these patterns.

Frequency range (more commonly termed frequency "response") is one of the most commonly referred to hi-fi specifications. Is a response from 10 to 35,000 Hz (35 kHz) better than one of 15 to 25 kHz? In audio amplifiers, where it is relatively easy (and common) to achieve almost any frequency response desired, the quoted range differences are often trivial. In tape recorders, and especially in speakers and phono pickups, the deviations are far greater and have much more sonic significance. If amplifiers differ at all in spectrum it is usually in their magnetic phono equalization section. The equalization applied should conform to the RIAA standard within 1 dB if its effect on sound quality is to be negligible. Unfortunately, there turns out to be a very complex interaction between phono pickups and the magnetic phono-input circuits of some preamps, and this factor has been considered in equipment test reports in STEREO REVIEW for several years.

How serious are such deviations in response (they usually appear graphically as smooth downward slopes at the highest and lowest frequencies rather than sudden dips or peaks)? The answer involves the second of the two salient qualities of hearing mentioned above: the sense of loudness.


--------- Fig. 8. Objective sound intensity is compared with subjective phons in this series of equal-loudness curves. The dashed line at bottom represents the threshold of audibility. Note that at low levels the ear is quite insensitive to bass frequencies.

2. LOUDNESS

Psycho-acousticians discovered many years ago that the subjective experience of loudness is very definitely a different quantity from objective physical measurements of sound intensity.

Furthermore, these differences are affected to a much greater degree by signal frequency and intensity than are the pitch-vs.-frequency effects just described above.

Historically, musical instruments and performances developed out of live listening evaluations. Or, to put it another way, the specific sounds of today's music were chosen by the ears of our forebears. Modern technical analysis can now explain why these sounds "evolved" the way they did. The data in Figure 8 show the loudness-vs.-frequency properties of hearing. One notices the severe squeezing together of the equal-loudness contours in the bottom three bass octaves. Note particularly that the entire span of loudness becomes a very small range of physical sound pressure, and that the deeper bass requires significantly more energy to achieve a given level of perceived loudness. Music profoundly reflects these attributes of hearing in three important ways. (1) Musical bass, if it is to be heard as equivalently loud, requires very much more physical energy than the mid-range, which is why one finds much more energy-output capability in the bass instruments. (2) In the deeper bass, the audible sound pressure covers the small range of about 40 dB (this is about 40 dB less than that of the mid-range; see Figure 8), and it is common for the musical dynamics (loudness variations) in the bass end to use something even less than that. In recordings, even of the "purist" kind, this tends to make the bass either inaudible or oppressive unless the original sound level is achieved in the reproduction. The deeper the bass and the wider its dynamic range, the more difficult it is both to record and to properly reproduce it. (3) Because audible deep bass embodies so much energy, the listener invariably experiences effects on his body's exterior as well as in the cochlea of the ear.

It should be apparent that low-frequency sounds are heard very differently from middle and high-frequency sounds. We have seen that a particular sound originally heard at a particular level of loudness will change its pitch and relative loudness greatly if reproduced at a different sonic level. One may well ask why this should be the case, and wonder whether nature has not short-changed us in the hearing department. Given all the distortions of frequency and level they introduce, one might be tempted to judge the performance of our ears as below the quality level of a modest hi-fi system. There is, of course, much more to the story than that, and there are a variety of adaptive evolutionary reasons why the various properties of hearing are actually rather more appropriate to the history of the species than they might appear. The important thing to understand in this regard is that we hear and respond not so much to continuous or steady properties of sound as we do to the changes and contrasts. (In this the sense of hearing is not unlike the senses of taste and smell.) Indoors, where most musical sounds are heard, sound intensity does not vary greatly with distance because of the diffusion of sound by reverberation. The reciprocal nature of the intensity-distance relationship (when you double the distance from the sound source you quarter the intensity) ceases to operate at from 2 to 3 feet in a small room and from 8 to 10 in a very large one. In general, we are aware that particular musical sounds have specific intensity levels and are therefore heard at a particular degree of loudness.
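The free-field rule mentioned above (double the distance, quarter the intensity) translates into decibels as follows. This is a minimal sketch of the inverse-square relationship only; as the text points out, it stops applying indoors beyond a few feet, where reverberant sound dominates. The function name `level_change_db` is hypothetical.

```python
import math

def level_change_db(near_distance, far_distance):
    """Level change in dB when moving from near to far, with no reflections."""
    intensity_ratio = (near_distance / far_distance) ** 2
    return 10.0 * math.log10(intensity_ratio)

# Doubling the distance quarters the intensity, a drop of about 6 dB.
print(round(level_change_db(1.0, 2.0), 1))  # -6.0
```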

This brings us to one of the most significant factors relating to hearing and audio-equipment specifications. The dynamic range and the ways in which it is employed are of major importance in the development and expression of musical values. It is clear that the range of loudness of a musical performance (not the maximum, but the ratio of maximum to minimum) correlates quite strongly with the aesthetic judgment of quality. There is little doubt that the more accomplished performers shade and vary their loudness palette to a greater degree than the less artistic. It is also important to note that performers adapt their efforts to the environment in which they are playing so as to produce the subjective effect desired.

Whatever the location, there is an "artistic" interaction with the room that sets the scale of effort so that fitting levels of sound are created to satisfy the ears of the performer(s)--and the audience as well.

But what happens to these musical judgments when a recording is made or the sound is broadcast? Tape recorders, and in fact all the elements commonly used today in the recording and broadcast chains, have significant limitations in the dynamic range they can handle. Noise enters at the lowest signal levels and distortion occurs at the loudest, unless the range from loudest to softest is compressed. In broadcasting especially, the tendency is to apply compression for the "background music" type of listening; otherwise the softest passages would be masked by the environment's noise level, which, in an automobile at least, can be severe indeed. (It should be noted that homes are often quieter than concert halls.) The best of today's recordings avoid excessive compression, but some amount is present in all but a few exceptional discs. How much compression is "excessive" has to be determined by the demands and limitations of our ears, the listening environment, and the equipment capability. In any case, it is obviously necessary to look first at the loudness range of live music before we can judge what is required in our audio systems.

The perceived dynamics of music are a subjective process in the observer and should therefore be reckoned in units that reflect the subjective sense of loudness. This unit is the sone, and it is related to the objective measurement of sound intensity as shown in the graph of Figure 8. The transient loudness peaks in music reach about 200 to 300 sones, which is about the maximum comfortable level. The minimum found in music is about 0.5 sone. The lowest levels in music are usually determined by the ambient noise level during the performance: obviously there's no point in playing notes that cannot be heard above the program-rattling of the audience. The spread in acoustic sound pressure covers a range of 80 to 85 dB, which provides a subjective range of only about 500 to 1. A dynamic range of 80 dB is actually unusual, and though some individual performances do occasionally reach it, 50 to 60 dB is common in a single work. In a recently analyzed performance of the Beethoven Violin Concerto, Isaac Stern covered a 50-dB range in his dynamics. A solo piano recital studied ranged from 108 dB sound pressure down to 43 dB, with some musically important pianissimos fading away below. In general, music ranges from a few to about a hundred sones for the most common span of loudness.
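The arithmetic behind these figures can be laid out as a short sketch. The first three values come straight from the numbers quoted above. The function `sones_from_phons` is an added assumption: it applies the common rule of thumb that loudness doubles for every 10-phon increase above 40 phons, which is not taken from this article and only roughly approximates the measured curves.

```python
import math

def pressure_ratio(level_difference_db):
    """Sound-pressure ratio corresponding to a level difference in dB."""
    return 10.0 ** (level_difference_db / 20.0)

def sones_from_phons(phons):
    """Rule-of-thumb loudness (assumption: doubles every 10 phons above 40)."""
    return 2.0 ** ((phons - 40.0) / 10.0)

# Figures quoted in the text.
piano_range_db = 108 - 43        # the solo piano recital
subjective_ratio = 250 / 0.5     # midpoint of 200-300 sones over 0.5 sone

print(piano_range_db)                 # 65
print(round(pressure_ratio(80)))      # 10000
print(subjective_ratio)               # 500.0

# Assumed rule of thumb: an 80-phon span is a few hundred to one in sones.
print(sones_from_phons(120) / sones_from_phons(40))  # 256.0
```

An 80-dB span is a 10,000-to-1 ratio in sound pressure, yet under either reckoning it amounts to only a few hundred to one in subjective loudness, which is the point the paragraph above makes.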

These values define for us the maximum speaker acoustic-power output required for a given room, and therefore the rest of the equipment performance factors relating to dynamic range as well. If a hi-fi system can reproduce the highest required sound levels of a live symphonic performance or the amplified output of a "soft" rock group, then the maximum levels of the dynamic range are covered. (There is no way the original sound levels of a "heavy-metal" rock group can be duplicated in the home.) However, we still have to consider the limitations imposed by the noise, hum, and distortion contributed by the sound system itself.

Will these be audible when the amplifier volume control is set for realistic reproduction levels? If they are, then the terms "noise," "hum," and "distortion" are meaningful; otherwise they are merely numbers, not sounds.

It is important to realize that an amplifier's volume control does not affect the dynamic range of the signal, but only slides the relative signal-level range up and down the scale of loudness. If the program signal has the full dynamic range, then, when the volume control is turned down, the lowest levels of the noise and perhaps even some distortion may drop below audibility along with some of the music. It can easily be seen that the interrelated factors of dynamic range and achievable loudness to a great degree determine the audible significance of all the other specifications. Given the vagaries of our ears and the vast technical problems inherent in recording and reproduction, it is a wonder that we have come as far as we have in the acoustic simulation of complex musical realities using plastic, metal, glass, and vibrating paper. But, come to think of it, musical instruments use odd materials too: elephant ivory, catgut, cowhide, and horsehair besides.

=============

A Short Glossary of Sound Measurement

Acoustic-power Output: The total amount of energy radiated per unit of time by an acoustic source such as a loudspeaker. It refers to all the power emitted in all directions and thus differs from measurements taken at only one point (or perhaps several) in the sound field produced by such a source.

Compression: Restriction of the variations in loudness of recorded or reproduced sound accomplished by reducing the peaks of loudness to levels nearer the average loudness. The softer passages may be raised in level. Compression may be applied automatically or manually.

Decibel (dB): A unit of measure developed because the greater-than-million-to-one range of human sensitivity to audible intensities led to calculations too cumbersome for efficient handling. Mathematically, it is twenty times the logarithm of the ratio between two quantities, 20 dB representing the ratio between 10 and 1, 40 dB the ratio between 100 and 1, and 6 dB representing the ratio between 2 and 1. (The reference level is often not explicitly stated, as, for example, in sound-pressure measurements, where the reference is 0.00002 newton per square meter.)

Loudness: This is the exact term for the subjective impression of sound intensity, and it differs from the purely physical objective measure of sound pressure. Its unit of measurement is the sone, 1 sone corresponding to the subjective loudness of a 1,000-Hz tone at a sound pressure of 40 dB above the standard reference level (see Decibel above). A perceived loudness of 2 sones is exactly double that of 1 sone, 40 sones is twice 20 sones, etc.

Mel: The subjective unit of pitch. The perceived pitch of a tone of 1,000 Hz at a sound pressure of 40 dB above the threshold of hearing (0 dB) is 1,000 mels. Like the subjective measure of loudness above, the mel is not linearly related to frequency, but a perceived doubling of pitch corresponds to a doubling of the number of mels.

Pitch: See Mel above.

Resonance: The emphasis of a certain pitch or frequency of sound originating in the physical or electrical tendency of a body or a circuit to store energy at that frequency. Sounds resulting from resonances are usually slow to build up and slow to decay. In musical instruments this is controlled and useful, but in audio it is undesirable because it introduces frequency-response irregularities.

RIAA Standard (Phono): The way in which the low and high frequencies are reduced and increased, respectively, in the making of phonograph records in order to improve the dynamic range for music and to obtain adequate playing time. Low frequencies take up more groove space, and high frequencies are accompanied by noise. The frequency balance is restored in playback through circuits that have a standard response-equalizing characteristic, the one selected by the RIAA (Recording Industry Association of America) to complement the recording equalization.

Sound Pressure: The average variation in the atmospheric pressure per unit area caused by sound waves passing through the atmosphere, measured in decibels relative to a standard reference level (see Decibel above). About one five-thousandth of 1 atmosphere of pressure (15 lbs./sq. in.) corresponds to the loudest sound in music, and about one five-billionth of 1 atmosphere (the 0.00002 newton per square meter reference) corresponds to the threshold of hearing (0 dB) at a frequency of 1,000 Hz.
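The decibel arithmetic used in the glossary entries above can be checked with a short sketch. The reference pressure is the one given in the Decibel entry; the value used for one standard atmosphere (101,325 newtons per square meter) is an added assumption, and the function names are hypothetical.

```python
import math

REFERENCE_PRESSURE = 0.00002   # newtons per square meter, from the Decibel entry
ONE_ATMOSPHERE = 101_325.0     # newtons per square meter (assumed standard value)

def db_from_ratio(ratio):
    """Twenty times the logarithm of the ratio between two quantities."""
    return 20.0 * math.log10(ratio)

def sound_pressure_level(pressure):
    """Level in dB relative to the standard reference pressure."""
    return db_from_ratio(pressure / REFERENCE_PRESSURE)

# The ratios listed in the Decibel entry.
for ratio in (2, 10, 100):
    print(ratio, round(db_from_ratio(ratio), 1))   # 6.0, 20.0, 40.0

# One five-thousandth of an atmosphere, the loudest sound in music.
print(round(sound_pressure_level(ONE_ATMOSPHERE / 5_000)))  # 120
```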

Also see:

EQUIPMENT TEST REPORTS--Hirsch-Houck Laboratory test results on: the Kensonic Accuphase T-100 AM/stereo FM tuner, Technics SA-5550 receiver, Leslie DVX speaker system, and Pioneer PL-15D41 turntable [Jan 1976]

 


Source: Stereo Review (USA magazine)
