Digital Domain (Sept. 1986, by Ken Pohlmann)





AN INTERIORESTING SUBJECT


What ties together left and right hemispheres, pairs of ears, phantoms, loudspeakers, and your head's shadow? The answer is psychoacoustics. What does psychoacoustics have to do with digital audio? Increasingly a lot, as the mysteries of this largely unexplored area are being revealed and exploited by the signal-processing clout of digital audio techniques. Let's define psychoacoustics and consider specific examples of its principles in action, a side trip in preparation for understanding digital signal processing's imminent impact on psychoacoustics.

Despite all the talk about pressure functions, velocity of propagation, and colliding molecules, the real business of sound takes place inside our heads at the ear/brain interface. Until they are perceived, all sounds are merely academic concepts. Acoustical perception explains our subjective response to anything we hear; it is the ultimate arbiter in acoustical matters, because it is only our response to audio which fundamentally matters. Psychoacoustics seeks to reconcile acoustical stimuli--and all the objective scientific and physical properties which surround them--with the psychological responses they evoke in individual listeners.

The ear is a very sensitive organ. The mental judgments which result when it is coupled to the interpretive powers of the brain form the basis for all the enjoyment we experience from sound and music. Compared to the physical properties of sound, psychoacoustics presents a formidable opportunity for basic research into such factors as aural associations, the effect of musical training, attentional ability, and organization of memory for musical information. In addition, while many responses are common to all listeners, any single listener's overall response to what he or she hears is a unique reflection of individual experience.

The basis of psychoacoustics and all aural perception is the ear/brain system, composed of two ears and two brain hemispheres. It is a wonderfully complex machine, and while some of the simple mechanisms are fairly well understood, the system itself is still largely a mystery. Normal left and right ears do not, as far as we know, differ physiologically in their capacity for detecting sound, but their respective right and left brain halves certainly do.

Each of us has one brain (more or less), but the two halves loosely divide the brain's functions.

Interestingly enough, and also mysteriously enough, the connections from the ears to the brain halves are crossed--the right ear is wired to the left brain half and the left ear to the right brain half. There is some overlap in the connections, but the primary links are crossed. That leads to an interesting question. It has been found that the left cerebral hemisphere processes most of our speech (verbal) information. Thus, the right ear is perceptually superior for spoken words.

On the other hand, it is mainly the right temporal lobe which processes melodic (nonverbal) information. Therefore, we are better at perceiving melodies heard by the left ear. Should Congress pass a law placing music on the left loudspeaker, and lyrics on the right? Nobody really knows--aspiring Nobel Prize recipients, take note.

All of this raises many questions, including one of basic design. Specifically, why do people have two ears? With one good ear we can fully perceive amplitude, frequency, loudness, and timbre. But primeval man needed two ears for localization--that is, to know what direction the man-eaters were coming from. Today, of course, modern man still desperately needs two ears--otherwise his headphones would fall off.

Localization provides a fine subject for demonstrating psychoacoustics in action, and for showing both the sophistication and the simple-mindedness of the ear/brain system. The ear/brain uses four main cues to localize sound: Relative intensity, time of incidence, phase, and complexity of waveform. Provided that two ears are available (and two are needed for localization), relative intensity difference is perhaps the most important cue. A sound from one side will have greater intensity at the near ear because of the inverse-square law, which dictates that sound intensity falls off with the square of the distance as it propagates. Intensity is also influenced by the head's acoustic shadow; high frequencies are blocked by the head and thus further attenuated at the far ear. This shadowing is important at frequencies above 1 kHz but insignificant at lower frequencies, because long wavelengths tend to bend around the head.
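
To put a rough number on the inverse-square contribution, here is a minimal sketch in Python. The 1-meter listening distance and the head radius are illustrative assumptions, and the head-shadow component that dominates above 1 kHz is deliberately left out:

```python
import math

def inverse_square_ild_db(source_angle_deg, distance_m=1.0,
                          head_radius_m=0.0875):
    """Interaural level difference (dB) from the inverse-square law
    alone, for a source at distance_m and source_angle_deg
    (0 = straight ahead, 90 = fully to one side).
    Head shadowing is ignored; the geometry is an assumption.
    """
    theta = math.radians(source_angle_deg)
    # Place the ears head_radius_m to either side of head center.
    near = math.hypot(distance_m * math.cos(theta),
                      distance_m * math.sin(theta) - head_radius_m)
    far = math.hypot(distance_m * math.cos(theta),
                     distance_m * math.sin(theta) + head_radius_m)
    # Intensity falls as 1/r^2, which is 20 * log10(r) in decibel terms.
    return 20.0 * math.log10(far / near)

print(inverse_square_ild_db(90))  # source fully to one side at 1 m: ~1.5 dB
```

Note how small the number is: At typical listening distances, the inverse-square difference amounts to only a decibel or two, which is why the head's shadow carries so much of the load at high frequencies.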

The second cue, time of incidence, exercises the brain's computational power; the brain rapidly calculates time differences of less than a few ten-thousandths of a second between one ear and the other. The ear nearer to the sound receives it first, a cue to its direction of origin. This cue would be ineffective for a steady-state continuous tone, but is highly useful for any changing waveform.
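
The arithmetic behind "a few ten-thousandths of a second" is simple enough to check. Assuming a speed of sound of 343 meters per second and an around-the-head path of about 0.21 meter (both round-number assumptions):

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value
EAR_SPACING = 0.21      # m, an assumed around-the-head path

# A source fully to one side arrives at the far ear later by roughly
# the interaural path divided by the speed of sound.
max_itd_s = EAR_SPACING / SPEED_OF_SOUND
print(f"maximum interaural time difference: {max_itd_s * 1e6:.0f} microseconds")
# -> about 610 microseconds, i.e. a few ten-thousandths of a second
```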

The other two cues are near relatives of the first two. With continuous tones, the brain seems to compare the phase between the two ears. The greater the calculated phase difference, the farther to one side the sound's origin appears to be. Of course, this cue is frequency-dependent, occurring only where the path length between the two ears is a wavelength or less. In addition, there is some evidence that the ear is not sensitive to phase information outside the midrange band.
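
The frequency dependence can be estimated the same way. The phase comparison stays unambiguous only while the interaural path is a wavelength or less, which with the same assumed numbers gives:

```python
SPEED_OF_SOUND = 343.0  # m/s
EAR_SPACING = 0.21      # m, same assumed interaural path as before

# Wavelength equals the path length at f = c / d; above that frequency
# the phase difference wraps around and becomes ambiguous.
phase_cue_limit_hz = SPEED_OF_SOUND / EAR_SPACING
print(f"phase cue useful below roughly {phase_cue_limit_hz:.0f} Hz")
# -> roughly 1600 Hz, in line with the midrange limit noted above
```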

Finally, the complexity of the waveform plays a part. The head attenuates high-frequency components but not lower ones, and the brain perceives the resulting timbre differences between the ears--the more distant ear hears less high-frequency information.

Thus, lower frequencies are localized primarily by time of incidence or phase between the two ears, while high frequencies are localized by amplitude difference. The shape of the outer ear helps to determine front/back localization. Slight head movements which shift the ear/brain's placement in the sound field also help in deciphering the cues.

A pair of loudspeakers (or headphones) provides a perfect laboratory to study the psychoacoustics of localization. When sound is produced from the left speaker, our ear/brain uses the four cues to determine the left-hand direction of origin; likewise for the right speaker. But when equal sound is produced from both speakers, a fairly amazing phenomenon takes place: Instead of localizing sound at the left and right speakers, our highly evolved ear/brain decides that the sound is coming from the empty space between the speakers, even though other sensory organs such as our eyes clearly show that nothing is there. (However, localization is sharper in a darkened room.) Each ear receives the same information, and that information is stubbornly decoded as coming from straight ahead. Our interpretation of the cues leaves us no choice. We have created a phantom image.

The ear/brain's gullibility in creating phantom images is the keystone of stereo reproduction. When the correct spatial information is recorded along with the music, the ear/brain decodes it to recreate the panorama of a sound stage. As some die-hards enjoy pointing out, stereo is nothing more than two-channel monaural. The rest is purely interpretive.

The principal device used to accomplish stereo encoding is a panning potentiometer, or pan pot. By varying the localization cues, phantom images may be placed anywhere along the line between two speakers. A pan pot functions somewhat like the balance control on a home stereo; when rotated, it varies the relative amplitude of the signal between two channels. The ear/brain subsequently uses those amplitude cues to determine localization, and presto--the image appears to move.
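
In software, a pan pot reduces to a few lines. Here is a minimal sketch of one common constant-power pan law (the cosine/sine taper is a standard choice, not necessarily the taper of any particular console):

```python
import math

def pan(sample, position):
    """Split a mono sample into left/right by amplitude alone.
    position: 0.0 = hard left, 0.5 = center, 1.0 = hard right.
    The cosine/sine pair keeps total power constant as the
    phantom image moves between the speakers.
    """
    angle = position * math.pi / 2.0
    return sample * math.cos(angle), sample * math.sin(angle)

print(pan(1.0, 0.5))  # center: ~0.707 to each channel, a phantom image
```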

It's only an illusion, but a pretty good one. Of course, it works best with sounds that fit the ear/brain's amplitude-cue criteria. As we've already noted, high frequencies are the best candidates; low frequencies are more difficult to localize. Stability of placement is also dependent on the timbre of the waveform, the interaction with other signals present in the individual loudspeaker, the effects of listening-room acoustics, and, of course, listener placement.

But amplitude is only one of the cues the ear/brain uses to determine localization. What about time of incidence? Indeed, time cues may also be used to create phantom images. If equal-amplitude signals are supplied to our loudspeakers at the same time, the phantom image appears in the middle.

But if a time delay of more than about 2 ms is introduced to one speaker, the sound will appear to come from the earlier speaker, and the later speaker perceptually disappears. This nifty bit of psychoacoustics is known as the Haas effect; it states that the ear/brain is drawn to the earlier source, ignoring the later one. The effect is good for delays up to about 40 ms. With longer delays, the ear/brain has time to realize the trick, and perceives the two sounds as discrete impulses. Delays shorter than about 2 ms merely shift the sound partway between sources. So, by playing with such short delays, we may move the phantom image along the line between our two sound sources, our ear/brain obediently using time-of-incidence cues to create the phantoms.
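
A sketch of the same trick in code, assuming a 44.1-kHz sample rate and using NumPy (both convenient assumptions, not requirements):

```python
import numpy as np

def delay_pan(mono, delay_ms, sample_rate=44100):
    """Build a stereo pair from a mono signal by delaying one channel.
    Delays under ~2 ms pull the phantom image partway toward the
    earlier channel; longer delays up to ~40 ms localize the sound at
    the earlier speaker (the Haas effect). An illustrative sketch,
    not a calibrated psychoacoustic model.
    """
    delay_samples = int(round(delay_ms * sample_rate / 1000.0))
    left = mono  # the earlier, dominant channel
    right = np.concatenate([np.zeros(delay_samples), mono])[:len(mono)]
    return np.stack([left, right], axis=1)

# One second of a 440-Hz tone; a 5-ms delay snaps it to the left speaker.
t = np.arange(44100) / 44100.0
stereo = delay_pan(np.sin(2 * np.pi * 440 * t), delay_ms=5.0)
```

Of course, localization cues can also be obtained the old-fashioned way.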

When a pair of spaced-apart omnidirectional microphones is placed in front of an orchestra, the orchestra's spatial information is encoded free of charge. For example, the sound of the violins will be picked up by both microphones, but the violins are closer to the left microphone than to the right, so their sound output will be louder at the left microphone and will arrive there earlier.

When the two channels are reproduced, the amplitude and time-of-incidence information encoded along with the music will cause our ear/brain to place the violins on the left. Likewise, the rest of the orchestra members will seem to be seated in their respective chairs between our loudspeakers. In addition, combinations of cues such as amplitude, time of incidence, hall reverberation, and high-frequency attenuation convey a sense of depth, placing the woodwinds behind the cellos and the percussion behind the woodwinds.
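
The cues such a spaced pair encodes are easy to estimate from geometry alone. A minimal sketch, with the microphone spacing and the violins' position as purely illustrative assumptions:

```python
import math

SPEED_OF_SOUND = 343.0  # m/s

def spaced_pair_cues(source_xy, mic_left_xy, mic_right_xy):
    """Level difference (dB, positive = louder at the left mic) and
    arrival-time difference (ms, positive = earlier at the left mic)
    that a spaced omni pair would encode for a single source.
    Positions are (x, y) in meters; free-field propagation assumed.
    """
    d_left = math.dist(source_xy, mic_left_xy)
    d_right = math.dist(source_xy, mic_right_xy)
    level_db = 20.0 * math.log10(d_right / d_left)  # inverse-square law
    time_ms = (d_right - d_left) / SPEED_OF_SOUND * 1000.0
    return level_db, time_ms

# Violins 2 m left of center, 4 m in front of a pair spaced 1 m apart:
print(spaced_pair_cues((-2.0, 4.0), (-0.5, 0.0), (0.5, 0.0)))
# -> about 0.9 dB louder and 1.3 ms earlier at the left microphone
```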

If you've got two ears, you are a bona fide spatial localization expert. Specifically, what you have underneath that $8 haircut is a marvelously complex decoding machine, obedient to many kinds of aural cues. Of course, when listening to a recording, you are largely constrained by the accuracy of the aural cues that it presents--to the degree that they are imperfect, they limit your psychoacoustic enjoyment of the recording.

Enter DSP--Digital Signal Processing. With high-speed number-crunching techniques performed in the digital domain, the caliber of those cues can be greatly improved, as we'll see next month. Meanwhile, consider this problem: If your stereo system supplies all the acoustical cues of a concert hall, are you listening at home or in the concert hall?

(adapted from Audio magazine, Sept. 1986)
