Currents by John Eargle (Sept. 1990)


AN 'EARING 'EARING

The "Sound of Audio" was the subject of the eighth International Conference presented by the Audio Engineering Society this past May in Washington, D.C. Subtitled Perception and Measurement, Recording and Reproduction, the conference covered virtually all aspects of audio technology that bear on our understanding of current and future developments in the field.

The chairman of the conference was Skip Pizzi, of National Public Radio, and the papers chairman was Floyd Toole, of the National Research Council of Canada. Special thanks go to these men for organizing the conference and pulling together facilities, demonstrations, and appropriate distinguished chairmen for the various informative sessions.

One of the ground rules was that each speaker was expected to submit a paper suitable for preprinting. These were duplicated, and a bound set was presented to each of the nearly 200 attendees. The complete manuscripts, with any changes or additions, will be published by the AES later this year and will be available for general sale.

Audio readers should keep their eyes open for this valuable collection of conference proceedings.

In addition to the seven major sessions, various audio demonstrations illustrated many of the effects and techniques that were discussed during the actual sessions. Some of these point clearly to possible new developments in consumer electronics, while others represent variations on currently available technology.

The first session, chaired by Louis Fielder of Dolby Laboratories, was titled "Perceiving the Sound of Audio." Neil Viemeister of the University of Minnesota began the conference with an overview of psychoacoustics and auditory perception. Viemeister described the nature of hearing sensitivity, and it may surprise readers of Audio to learn that the ear responds, at its lowest threshold, to an eardrum displacement of about 1/100 the diameter of a hydrogen molecule! The upper limit is some 120 dB greater, representing an intensity ratio of 10¹² to 1. This is all the more astounding when we remember that the "front end" of the ear relies on mechanical leverage between the eardrum and the inner ear. Viemeister went on to the temporal aspects of hearing, pointing out the ears' remarkable ability to sort out timing differences between them on the order of a few microseconds, and then turned to loudness and pitch perception.
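For readers who like to check the arithmetic, the 120-dB figure can be verified with a few lines of Python. This is only a sketch of the standard decibel definitions; the function names are illustrative and do not come from Viemeister's paper.

```python
def db_to_intensity_ratio(db):
    """Convert a level difference in dB to an intensity (power) ratio."""
    return 10 ** (db / 10)


def db_to_pressure_ratio(db):
    """Convert a level difference in dB to a sound-pressure (amplitude) ratio."""
    return 10 ** (db / 20)


# The ear's range quoted in the text: roughly 120 dB from threshold to upper limit.
print("intensity ratio:", db_to_intensity_ratio(120))
print("pressure ratio: ", db_to_pressure_ratio(120))
```

The same 120 dB corresponds to an intensity ratio of a million million but a pressure ratio of only a million, because intensity goes as the square of pressure.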

Frederic Wightman, of the University of Wisconsin, discussed aspects of hearing in three dimensions. The classical aspects of lateral localization with emphasis on arrival time and intensity differences at the ears were reviewed as a prerequisite to a discussion of recent experiments that emphasize the importance of the pinnae (the outer structures of the ears) in providing cues for fore/aft and up/down localization of sound sources. Essentially, the convolutions of the pinnae provide significant spectral shaping of sounds as a function of both lateral and vertical angles. Also, the specific frequency shaping is virtually unique to each person, providing a consistent frame of reference by which each person learns to assign directions.

William Hartmann, of Michigan State University, closed the first session with a paper on localization of sound sources in a room. In real-world situations, there are always reflections that both enhance our appreciation of the environment around us and interfere with our localization efforts. For example, it is difficult to localize the source of a sine wave, since small amounts of reflected sound can profoundly alter the phase relationships of that signal at the two ears. On the other hand, the source of a complex signal such as pink noise can usually be accurately localized, primarily by timing information coming from the many micro-transients that the signal contains. A good bit of the time, we are only marginally sure of our localization judgments, but this suffices in many listening situations where our attention may be drawn to other aspects of audition. In that sense, our localization techniques are adaptive, providing us an important ability to relearn quickly in difficult environments.

Don Keele, Senior Editor of Audio, was the chairman for the second session, "Measuring the Sound of Audio." The first speaker, Richard Cabot, of Audio Precision, discussed audible effects versus objective measurements in the electrical signal path. In a paper noted for its detail and extensive bibliography, Cabot discussed the many forms of both linear and nonlinear distortion that can intrude on the audio signal. The audibility of various types of distortion is dependent on certain thresholds, and of course the annoyance of a given type of distortion depends as well on conditioning and learning. Both audibility and annoyance are subject to a variety of masking effects by the program itself.

David Klepper, of KMK Associates, then discussed the basic relationships between live music and architectural acoustics. Klepper presented a slide tour of modern concert hall design, with emphasis on the many acoustical and commercial trade-offs involved. Reverberation time, early reflections, and ratios of direct to reflected sound, and the balance among them, are the objective measurements leading to subjective descriptions such as intimacy, warmth, and clarity. As halls become larger, and as they are called on to fulfill other purposes, music requirements per se run the risk of being shortchanged. The skillful acoustical consultant is one who can minimize the maximum risk.

John Bradley, of the National Research Council of Canada, discussed methods of quantifying auditorium acoustics. Such terms as deutlichkeit (clarity), running liveness, and center time are measurements of clarity and definition of music. All are relatively simple measurements and represent single-number descriptors of the effectiveness of a given hall for the performance of music. Such terms as clarity index, articulation index, and speech transmission index are all objective measures of the effectiveness of speech communication in an auditorium. Again, these are all relatively simple measurements whose accuracy can be borne out in actual syllable articulation tests. Bradley stated that there is a relatively small number of measurements necessary to explain the bulk of subjective assessment of auditorium acoustics. These in turn have led to new parameters which can only lead to more predictable halls for music and speech.
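To give a feel for how simple these single-number descriptors are, here is a rough Python sketch of three of them computed from a room impulse response. It assumes the commonly used definitions (an 80-millisecond early/late split for the music clarity index, 50 milliseconds for deutlichkeit, and an energy-weighted mean arrival time for center time) and an impulse response that starts at the arrival of the direct sound. The function names and the synthetic decay used as input are hypothetical, not taken from Bradley's paper.

```python
import math


def split_energy(impulse_response, sample_rate, split_seconds):
    """Return (early, late) energy on either side of split_seconds."""
    split = int(round(split_seconds * sample_rate))
    squared = [h * h for h in impulse_response]
    return sum(squared[:split]), sum(squared[split:])


def clarity_index_db(impulse_response, sample_rate, split_seconds=0.080):
    """Early-to-late energy ratio in dB (requires some late energy)."""
    early, late = split_energy(impulse_response, sample_rate, split_seconds)
    return 10.0 * math.log10(early / late)


def deutlichkeit(impulse_response, sample_rate, split_seconds=0.050):
    """Fraction of the total energy arriving before split_seconds."""
    early, late = split_energy(impulse_response, sample_rate, split_seconds)
    return early / (early + late)


def center_time(impulse_response, sample_rate):
    """Energy-weighted mean arrival time, in seconds."""
    squared = [h * h for h in impulse_response]
    weighted = sum((n / sample_rate) * e for n, e in enumerate(squared))
    return weighted / sum(squared)


# Hypothetical input: a smooth exponential decay sampled at 1 kHz.
ir = [math.exp(-n / 200.0) for n in range(2000)]
print("clarity index (dB):", clarity_index_db(ir, 1000))
print("deutlichkeit:      ", deutlichkeit(ir, 1000))
print("center time (s):   ", center_time(ir, 1000))
```

A real measurement would substitute a measured impulse response for the synthetic decay, usually after band-filtering.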

As you can see, the first day was a busy one. But it wasn't quite over. That evening, Floyd Toole moderated a panel discussion on the reviewing of audio products. Panelists included Don Keele, Ed Foster, Julian Hirsch, Len Feldman, John Atkinson, Peter Aczel, David Clark, and David Ranada. In lively interplay with the audience, the reviewers provided insight into their methods of and criteria for equipment evaluation.

At the same time, the demonstration rooms were up and running. Some of the interesting exhibits there included Dolby Surround decoding, Ambisonics, various artificial-head recording methods, synthesizing images over headphones with variable height as well as left-right positioning, and a number of loudspeaker crosstalk cancelling schemes that produced very clear out-of-bounds (to the side) localization for listeners seated on the median plane.

David Clark demonstrated an automotive stereo system which had a delayed center loudspeaker for keeping the phantom center image from collapsing to the nearest loudspeaker. Additional delays from a set of four side and back loudspeakers filled in early reflections one might hear in a typical living room.

The second day got underway with a paper by Floyd Toole on loudspeakers and rooms for stereophonic sound reproduction. His paper, which was actually a continuation of the first afternoon's session, dealt with the many effects of room boundary conditions on loudspeaker performance. In addition to affecting the low-frequency loading on the loudspeakers, the boundary characteristics determine reverberation time and may provide significant discrete reflections. The relative positions of the loudspeakers and the listener can also bring into play profound response aberrations due to the normal, or preferred, low-frequency modes of the room. By way of practical advice, Toole outlined methods for analyzing the mode structure of the room and repositioning the loudspeakers to alleviate the modal problems.
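As a rough illustration of what analyzing the mode structure involves, the textbook formula for an idealized rectangular room with rigid walls can be sketched in Python as follows. This is not Toole's own procedure; the room dimensions, the speed of sound, and the function name are all assumptions made for the example.

```python
import itertools
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def room_modes(lx, ly, lz, max_order=2, c=SPEED_OF_SOUND):
    """List (frequency_hz, (nx, ny, nz)) for an ideal rigid-walled rectangular room.

    Uses f = (c / 2) * sqrt((nx/lx)^2 + (ny/ly)^2 + (nz/lz)^2).
    """
    modes = []
    for nx, ny, nz in itertools.product(range(max_order + 1), repeat=3):
        if nx == ny == nz == 0:
            continue  # the (0, 0, 0) case is not a mode
        f = (c / 2.0) * math.sqrt(
            (nx / lx) ** 2 + (ny / ly) ** 2 + (nz / lz) ** 2
        )
        modes.append((f, (nx, ny, nz)))
    return sorted(modes)


# Hypothetical living room, 6.0 x 4.0 x 2.5 meters.
for freq, indices in room_modes(6.0, 4.0, 2.5)[:8]:
    print(f"{freq:6.1f} Hz  mode {indices}")
```

Modes that bunch together or leave wide gaps in the list are the usual suspects when bass response is uneven. Real rooms, with their openings and non-rigid walls, will depart from these idealized figures.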

Recordings themselves are a major problem in attaining the ultimate listening experience, since there is such variation between them in terms of spatial relationships and integration of hall (studio) sound with direct sound. A recurring theme throughout Toole's presentation was "to close the loop" between the recording (input) and playback (output) processes by involving the recording engineer and producer in analytical evaluations of the playback process.

Daniel Queen, of Daniel Queen Associates, was chairman of the third session, which was titled "Subjective Evaluations of the Sound of Audio." Floyd Toole presented a paper on identifying and controlling the variables in loudspeaker subjective testing. The following physical variables are significant: the listening room itself, loudspeaker position, the listener position, relative loudness, absolute loudness, program material, electronic imperfections, electro-acoustical imperfections, and whether the music is presented in stereo or mono (both are important). He further cited the following psychological and physiological variables: knowledge of the products, familiarity with the program material, familiarity with the room, familiarity with the task at hand, judgment ability or aptitude, experience, and listener interaction and group pressure. Obviously, the experimental setup must take all these variables into account and somehow neutralize them so that they do not significantly bias the tests.

Continuing in the same vein, Soren Bech, of the Technical University of Denmark, outlined in great detail the statistical methods used in structuring loudspeaker listening tests so that all undesired variables were equalized out of the tests.

At the conclusion of this session, consultant Tom Nousaine and Stanley Lipshitz, of the University of Waterloo in Canada, gave the audience their reflections on the "Great Debate" of the past decade: the presumed audibility of differences between electronics and the inaudibility of the same differences when subjected to double-blind tests.

Double-blind testing is a procedure in which neither the listener nor the person administering the test knows which of two amplifiers is which. In the normal testing setup, the two amplifiers appear as A and B on a switchbox. The listener can hear A and B as often as necessary to form a judgment. Then, at the moment of truth, the listener presses a button marked X. X is either A or B, and the task for the listener is to identify which it is. If there truly is an audible difference between A and B, then the task of identifying X should be quite easy. But when two amplifiers are carefully adjusted to precisely the same gain, and both operated within their power limits, it is amazing how little real difference there is.
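How many correct identifications of X does it take to be convincing? One common way to answer is to ask how likely the score would be from pure guessing, which is a simple binomial calculation. The sketch below is only that calculation; the function name and the trial counts are illustrative and were not part of the presentations.

```python
from math import comb


def abx_p_value(correct, trials):
    """Probability of scoring at least `correct` out of `trials` by guessing.

    Each trial is a coin flip (X is A or B), so this is the upper tail
    of a binomial distribution with p = 0.5.
    """
    tail = sum(comb(trials, k) for k in range(correct, trials + 1))
    return tail / 2 ** trials


# Hypothetical test of 16 trials.
for score in (9, 12, 14):
    print(f"{score}/16 correct: chance probability {abx_p_value(score, 16):.4f}")
```

A score only slightly above half is entirely consistent with guessing, which is why a handful of trials proves little either way.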

Ron Streicher, of Pacific Audio-Visual Enterprises, chaired the next session, titled "Recording and Reproducing the Sound of Audio." Sean Olive, of the National Research Council of Canada, spoke on the preservation of timbre, microphones, loudspeakers, sound sources, and acoustical spaces. Olive described the range of aberrations that are to be found in even the best studio microphones and monitor loudspeakers. Taking into account the characteristics of sound sources and the recording space itself, a not-so-pretty picture of the total transfer process emerges. The ear/brain combination is mercifully forgiving of many things gone wrong, and we should be thankful for that. When you consider that your grandfather listened to acoustical recordings in severely band-limited and distorted mono, we have made great strides. But there is room for improvement still.

I chaired the next session, titled "Recording and Reproducing the Space of Audio: 'Conventional' Stereophony." The aim of this session was to present descriptions of current two-channel recording practice as applied to the mass media: Compact Disc, the cassette, and FM radio.

The first paper was jointly given by Ron Streicher and me, and it dealt with current practice in commercial classical recording. Essentially, classical recording employs fundamental stereo microphone arrays to preserve essential spatial cues. To this are added various accent microphones to correct imbalances and certain acoustical and musical problems. Contrary to what many people believe, both conductors and artists heartily endorse these hybrid techniques, when used with good taste and judgment.

David Moulton, of the Berklee College of Music, described the many techniques that are used in the pop/rock studio to produce music intended for presentation over loudspeakers. Here, there is no acoustical frame of reference, and the studio recording represents the initial creative act.

George Augspurger, of Perception Inc., discussed the many problems of monitoring the recording process in the normal work spaces used by engineers and producers, relating them to typical problems in the consumers' listening environments. He discussed the differences between high-end cone and dome systems and the usual compression driver and horn combinations used in most control rooms. The phantom center image was discussed and compared to the sound that would be produced by a discrete center loudspeaker. Depending on the precise listening angle, a phantom center will exhibit a pronounced null in response at the ears somewhere in the range of 2 kHz! This is inherent in the slightly differing delay paths from each loudspeaker to both ears. A discrete center speaker does not have this problem.
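The 2-kHz figure can be roughly reproduced with a very simple model. Each ear hears the center signal twice, once from the nearer loudspeaker and once, slightly later, from the farther one, and two equal signals offset by a delay cancel at the frequency whose half-period equals that delay. The Python sketch below assumes distant loudspeakers, an ear spacing of 0.17 meter, and no shadowing by the head; those values and the function name are assumptions for illustration, not figures from Augspurger's talk.

```python
import math


def first_null_hz(speaker_angle_deg, ear_spacing_m=0.17, c=343.0):
    """First cancellation frequency at one ear for a phantom center image.

    speaker_angle_deg is each loudspeaker's angle from straight ahead.
    The extra path from the far loudspeaker is ear_spacing * sin(angle).
    """
    delay = ear_spacing_m * math.sin(math.radians(speaker_angle_deg)) / c
    return 1.0 / (2.0 * delay)


for angle in (20, 30, 40):
    print(f"loudspeakers at +/-{angle} degrees: null near {first_null_hz(angle):.0f} Hz")
```

In practice the head shadows the far loudspeaker, so the two arrivals are not equal in level and the dip is a notch rather than a complete cancellation.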

David Griesinger, of Lexicon Inc., chaired the next session, titled "Recording and Reproducing the Space of Audio: 'Surround' Sound." Roger Furness, of Minim Electronics, described the Ambisonic system of recording and playback, in which the four outputs of a Soundfield microphone can be encoded in two, three, or four channels and decoded into a variety of loudspeaker configurations for accurate reproduction of spatial information. He cited many currently available stereo recordings that have been so encoded.

Tomlinson Holman, of Lucasfilm, presented details of the Dolby Stereo and Dolby Surround systems as currently used in motion picture theaters and in home theater systems. Essentially, the technique encodes center channel information in phase between the two transmission channels, while surround channel information is encoded in opposite polarity. The better matrix decoding systems do a remarkably good job of determining dominant signals and sorting them out with a minimum of artifacts.
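The in-phase and opposite-polarity idea can be illustrated with a deliberately stripped-down matrix in Python. This sketch is not the Dolby specification: it leaves out the phase shifting, band-limiting, and noise reduction applied to the surround channel, as well as the steering logic of the better decoders, and the 0.707 gain and all names are assumptions for the example.

```python
import math

G = math.sqrt(0.5)  # assumed -3 dB mixing gain


def encode(left, right, center, surround):
    """Fold four sample values into two transmission channels (Lt, Rt)."""
    lt = left + G * center + G * surround  # surround added in phase
    rt = right + G * center - G * surround  # surround added inverted
    return lt, rt


def passive_decode(lt, rt):
    """Recover four outputs with a plain sum/difference matrix."""
    return {
        "left": lt,
        "right": rt,
        "center": G * (lt + rt),    # in-phase content reinforces
        "surround": G * (lt - rt),  # opposite-polarity content reinforces
    }


print("center only:  ", passive_decode(*encode(0.0, 0.0, 1.0, 0.0)))
print("surround only:", passive_decode(*encode(0.0, 0.0, 0.0, 1.0)))
```

Note that a center-only signal still appears in the left and right outputs at reduced level. That leakage is the kind of artifact that decoders with dominance detection are designed to suppress.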

Griesinger then gave a paper on continuing experiments in reproducing binaural recordings naturally over loudspeakers. While the problem has been made to look simple over the years, it is in fact quite complex. Binaural recordings are normally made with an artificial head, and when played over headphones the effect is pleasant but not always completely natural sounding. There are often ambiguities between front and back, and up/down cues may be missing altogether. Some listeners experience "in the head" localization effects. What is missing in the recording are the pinna cues, which are unique to each of us. An ideal, but impractical, binaural recording setup would be tailored to each person. The artificial head would have exactly the pinnae convolutions of the person being modeled, and the headphones would be carefully equalized via probe microphones at the eardrums. The transformation from binaural to stereo loudspeaker presentation involves crosstalk cancellation, so that a listener on the median plane of the loudspeakers will receive the left and right signals primarily at the left and right ears, respectively. This must be done for a given subtended angle of the loudspeakers as measured from the listening position.

Griesinger further described Lexicon's efforts to solve the basic problems of binaural presentation over loudspeakers so that it will be effective for a significant fraction of listeners.

Gary Kendall and Martin Wilde, of Auris Perceptual Engineering, Inc., then described their work in developing a spatial sound processor that takes monophonic sound sources and processes them for three-dimensional presentation over stereo loudspeakers or headphones. Their system is designed for use in music, video, and film production.

The final session of the conference was titled "Frontiers in Sound Reproduction" and was chaired by Marshall Buck, of Cerwin Vega Corp. The first paper, by Wulf Pompetzki and Jens Blauert of Ruhr University, discussed further the ideas of binaural recording for both headphone and loudspeaker presentation. Details were given for signal processing of multiple microphone inputs for binaural presentation.

Jeffrey Borish, of EuPhonics, discussed methods of enhancing normal stereo recordings through simulation of the reflection patterns naturally occurring in concert halls. The characteristics of a hall can be measured, or they can be modeled via an image modeling program. The advantages of the image modeling approach are that the model can be easily changed, or else the listener can "change seats" in the hall.

David Clark, of DLC Design, then discussed in detail the evolution of the auto-stereo system that had been on demonstration during the conference.

The final paper of the conference was given by consultant Ronald Genereaux on adaptive equalization of loudspeaker systems. The method he described measures the transfer characteristic between a loudspeaker and a given listening position. An inverse filter is then calculated and inserted in the audio path so that many of the adverse effects of the room are cancelled. The technique has wide application in both consumer and professional audio.
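The idea of an inverse filter is easiest to see in a toy case. Suppose the "room" adds just one reflection, a weaker copy of the signal arriving a fixed number of samples late. That response can be undone exactly by a recursive filter that subtracts its own delayed output. The Python sketch below shows only this toy case, with hypothetical names and values. It is not Genereaux's method, and a real room requires a measured response and a far more careful inversion.

```python
def add_reflection(signal, delay, gain):
    """Toy 'room': direct sound plus one reflection `delay` samples late."""
    out = list(signal)
    for n in range(delay, len(signal)):
        out[n] += gain * signal[n - delay]
    return out


def inverse_filter(signal, delay, gain):
    """Undo add_reflection with y[n] = x[n] - gain * y[n - delay].

    Stable only when abs(gain) < 1, i.e. the reflection is weaker
    than the direct sound.
    """
    out = []
    for n, x in enumerate(signal):
        echo = gain * out[n - delay] if n >= delay else 0.0
        out.append(x - echo)
    return out


dry = [1.0, 0.5, -0.25, 0.0, 0.75, 0.1]
wet = add_reflection(dry, delay=2, gain=0.6)
restored = inverse_filter(wet, delay=2, gain=0.6)
print("with reflection:", wet)
print("after inverse:  ", restored)
```

The correction is exact only for the one path that was modeled, which hints at why such equalization is tied to a particular listening position.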

The highlight of the conference was the broad subject of binaural presentation over loudspeakers and the impact that digital signal processing (DSP) will have on it. The big problem with that technology is the restriction on listener location. For that reason, the technology will probably find its first broad application in TV stereo, where close stereo loudspeaker placement will work to its advantage. Other applications include the automobile, where speaker and listener positions are fixed.

(adapted from Audio magazine, Sept. 1990)
