The Great Phase-Coherency Bandwagon (High Fidelity, Oct. 1977)

---or, What's All This about Time Dispersion in Loudspeakers?

by Peter Mitchell

[Sometime astrophysicist Peter W. Mitchell is a well-known audio designer, consultant, and commentator.]

IF YOU ENTER the premises of an audio dealer and admit that you are shopping for loudspeakers, the likely outcome is that you will be shown at least one model with its drivers offset in space to achieve "phase coherency," "linear phase," "time compensation," or some other abstruse-sounding desideratum. Actually, the reality underlying these terms (manufacturers can't agree on one) represents one of the most interesting and controversial topics in loudspeaker design today. What it involves is constructing the speaker so that the original phase relationships of all the frequency components in the input signal are maintained in the acoustic output. To put it another way, the goal is to have the waveform of the input signal (as seen on an oscilloscope) emerge unchanged in the acoustic output.

The worthiness of this endeavor is a matter of some debate. It is represented by its partisans as a real improvement in performance, while its detractors see it as a costly waste of time at best, and more likely as a ploy designed to provide a marketing advantage. Who is right? The answer to that question requires careful consideration of the way the resulting waveforms are heard by the human ear, which in turn demands answers to numerous additional questions, all of which must be formulated with precision.

What Is Phase?

Phase is a physicist's or engineer's way of relating a continuous signal (one that theoretically has always existed and will always exist) to an arbitrary point in time. Take our old friend the sine wave. If we examine it at some freely chosen instant of reference and see how far along it is in executing one complete cycle, we can call that fraction of a cycle its phase. Now, if we know the frequency of the wave and its amplitude, the wave is defined for all time. (Times earlier than the reference instant will be considered negative.) Since one complete cycle of a wave can be interpreted as an angle of 360 degrees, we will consider any fraction of the total one-cycle waveform as a phase angle equal to the appropriate fraction of 360 degrees. From a mathematical point of view such a definition of phase leaves much to be desired, but it is good enough for present purposes.

For a speaker to be linear with respect to phase, we require that, if one frequency is reproduced with its original phase angle (this can be ensured by a suitable choice of reference instant), all other frequencies must be as well. Our first key question is: Will a speaker that has this characteristic sound any different from one that does not? In trying to answer it meaningfully, we must use test signals long enough and steady enough to avoid transient effects. (Remember the signal is assumed to have existed infinitely.)

The Effect of Phase on Tone Quality

In recent years many controlled experiments aimed at defining the phase sensitivity of the human ear have been performed. Typically these tests are conducted with electrostatic headphones, or with loudspeakers in an anechoic chamber, to provide the cleanest possible signals with the least interfering noise and the fewest extraneous factors to confuse the interpretation of the results.

One type of experiment involves the use of an "all-pass" filter, a circuit that delays each frequency by a different amount but does not alter the frequency response. Despite amounts of phase shift similar to or far exceeding that found in normal audio components, listeners have consistently reported that the all-pass filter produces no perceptible change in the quality of continuous musical sounds or test tones. Mark Davis at MIT has demonstrated an extreme form of this experiment in public: A square wave, with harmonics spanning the audio spectrum, is sent through thirty one-third-octave filters wired in parallel, covering the audible frequency range. Each filter (because of its sharp cutoff) causes phase shifts much greater than any conventional audio component would yield. In this test, the filter set functions as an all-pass filter, preserving the amplitude of the square wave's harmonics while scrambling them in time and thus severely altering the waveform. Listeners, in A/B comparisons, describe the difference in sound between the undistorted and scrambled versions as barely perceptible or nonexistent.
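The distinction between an unchanged spectrum and a changed waveform can be checked numerically. The sketch below is only an illustration, not a model of the filter set used in the demonstration: it builds a square wave from its odd harmonics and then rebuilds it with each harmonic given a random phase, which stands in for an idealized all-pass network. The sample rate, the 100-Hz fundamental, and the random phases are all assumptions.

```python
import numpy as np

fs = 48_000                    # sample rate in Hz (assumed)
f0 = 100                       # square-wave fundamental in Hz (assumed)
t = np.arange(fs) / fs         # exactly one second, so every harmonic fits the window

rng = np.random.default_rng(0)
harmonics = np.arange(1, 200, 2)       # odd harmonics, 100 Hz up to 19.9 kHz
amps = 4 / (np.pi * harmonics)         # Fourier-series amplitudes of a square wave

def synthesize(phases):
    """Sum the harmonics, each with its own phase offset in radians."""
    return sum(a * np.sin(2 * np.pi * k * f0 * t + p)
               for a, k, p in zip(amps, harmonics, phases))

original = synthesize(np.zeros(len(harmonics)))
scrambled = synthesize(rng.uniform(0, 2 * np.pi, len(harmonics)))

mag_original = np.abs(np.fft.rfft(original))
mag_scrambled = np.abs(np.fft.rfft(scrambled))

# Largest difference between the two amplitude spectra, relative to the peak.
spectrum_error = np.max(np.abs(mag_original - mag_scrambled)) / mag_original.max()
# RMS difference between the two waveforms, relative to the original's RMS.
waveform_error = (np.sqrt(np.mean((original - scrambled) ** 2))
                  / np.sqrt(np.mean(original ** 2)))

print(f"relative spectrum difference: {spectrum_error:.1e}")
print(f"relative waveform difference: {waveform_error:.2f}")
```

The amplitude spectra agree to within rounding error while the waveforms differ grossly, which is the condition under which listeners reported little or no audible change.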

Another experiment involves playing a square wave and periodically phase-shifting the third harmonic by 180 degrees. This, too, produces a radical alteration of the waveform, yet listeners find it extremely difficult to identify any change in the sound. Many researchers have arrived independently at the same conclusion: The human ear does not operate as a waveform detector for continuous tones; amplitude and time information are detected separately.

Consider, however, an experiment that has confused some audiophiles because it seems at first to be similar but yields startlingly contrary results. Begin with a continuous tone at 1,000 Hz containing a harmonic overtone at 3,000 Hz (and perhaps other harmonics as well). Add a second tone at 1,500 Hz with similar harmonics at 3,000 Hz, 4,500 Hz, etc. Now shift the phase of the second tone.

One can hear a clear difference in the quality of the combined tone. You might be tempted to conclude from this that the ear is sensitive to the phase of harmonically related frequencies. In fact, what the ear is responding to is an amplitude difference: the varying level of the 3,000-Hz component. As the tones are varied in relative phase, their 3,000-Hz overtones mutually reinforce or cancel. But when the composite of the two tones undergoes phase shift (as by means of an all-pass filter), the difference in perceived sound disappears because there no longer is reinforcement or cancellation of component overtones.
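The reinforcement and cancellation of the two 3,000-Hz overtones follows from the ordinary rule for adding two sinusoids of the same frequency. A minimal sketch, assuming for simplicity that the two overtones have equal amplitude (the function name and the chosen phase angles are illustrative):

```python
import math

def combined_overtone_level_db(phase_deg, a1=1.0, a2=1.0):
    """Level, in dB relative to a1, of two same-frequency sinusoids
    added together with a relative phase of phase_deg degrees."""
    phi = math.radians(phase_deg)
    amplitude = math.sqrt(a1 ** 2 + a2 ** 2 + 2 * a1 * a2 * math.cos(phi))
    # Clamp so that complete cancellation does not produce log10(0).
    return 20 * math.log10(max(amplitude, 1e-12) / a1)

for phase in (0, 90, 120, 180):
    print(f"{phase:3d} degrees: {combined_overtone_level_db(phase):7.1f} dB")
```

With equal amplitudes the level of the combined overtone runs from about 6 dB above either overtone alone (in phase) down to complete cancellation (180 degrees apart), a change in spectral content that is easily heard.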

Yet there are certain special continuous signals to which the ear genuinely appears to be phase-sensitive. The phenomenon requires complex tones containing closely spaced frequencies. For instance, when presented with pure tones of 950, 1,000, and 1,050 Hz simultaneously, and when the phase of the 1,000-Hz component is changed by as little as 30 degrees, the ear hears a change in the texture of the composite. Even so, these effects are reported to be fairly subtle and can be masked by normal amounts of room reverberation, background noise, or the presence of other frequencies in the sound.

As a practical matter, then, we are left with the conclusion that Hermann von Helmholtz announced nearly a century ago: The subjective quality of a continuous tone depends only on its spectral (frequency) content and is independent of phase. And since, as we have noted, "phase" has meaning only for continuous tones, the answer to the old question "Is phase audible?" must be "No."

----------

It's not really a new concern

New discoveries are sometimes just old ideas dusted off. In 1935 motion-picture sound engineer John K. Hilliard found that recorded tap dancing had a falsely resonant quality when reproduced via the finest laboratory reference loudspeaker of the era. This giant two-way speaker system had a folded-horn woofer with an air column length of 11 feet, plus a horn tweeter 3 feet in length. MGM recording director Douglas Shearer, brother of actress Norma Shearer and the creator of the "Munchkin sound" for The Wizard of Oz, suggested that the time delay due to the woofer's extra 8-foot depth might cause the echo. Realignment of the woofer and tweeter yielded accurate reproduction of the transients in Eleanor Powell's tap dancing.

---------------

Transients and Nonuniform Delay

Helmholtz has ofttimes been blamed for the widespread neglect of time and phase effects in sound reproduction, but in fact he was careful not to apply his conclusion to transients (the beginnings and ends of sounds). When transients are involved, the notion of phase is no longer applicable or convenient, so we must refer to time directly. The reason is not hard to see.

Consider a loudspeaker linear in phase along with a second similar speaker in which each frequency is delayed by an amount equal to its period; that is, shifted in phase by 360 degrees. In any steady-state test these two speakers are indistinguishable, though in the first there is no relative time delay between frequencies while in the second the delay will be proportional to wavelength; that is, higher frequencies will emerge sooner than low ones. We must conclude, therefore, that phase linearity does not uniquely define the time response of a loudspeaker.
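The two-speaker thought experiment can be reproduced with sampled sine waves. In the sketch below (the sample rate and the three test frequencies are assumed values), each tone is delayed by exactly one period. The delayed and undelayed tones are identical in the steady state, yet the delays themselves differ by a factor of 100 across the band.

```python
import numpy as np

fs = 48_000                  # sample rate in Hz (assumed)
t = np.arange(fs) / fs       # one second of steady-state signal

def delayed_by_one_period(freq_hz):
    """Return (delay in seconds, undelayed tone, tone delayed by one period)."""
    delay = 1.0 / freq_hz
    reference = np.sin(2 * np.pi * freq_hz * t)
    delayed = np.sin(2 * np.pi * freq_hz * (t - delay))
    return delay, reference, delayed

for f in (100, 1_000, 10_000):
    delay, ref, dly = delayed_by_one_period(f)
    print(f"{f:6d} Hz: delay {delay * 1e3:6.2f} ms, "
          f"largest steady-state difference {np.max(np.abs(ref - dly)):.1e}")
```

A steady-state measurement sees nothing, but a transient containing both 100 Hz and 10,000 Hz would have its low-frequency content arrive nearly 10 milliseconds after its high-frequency content.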

But what happens when the frequency components of a transient are subject to a nonuniform delay? Can the effect be heard? Imagine an extreme situation, where the bass frequencies of a Beethoven symphony emerge from the loudspeaker now and the higher frequencies emerge several seconds or minutes later. The music will be distorted beyond recognition. So the question now is not "Can time dispersion be heard?" but rather "What is its threshold of audibility?" One way to find out is to create the sharpest, shortest transient sound possible: an instantaneous pulse lasting only a few millionths of a second. Alternate it with a second signal composed of two simultaneous weaker pulses whose combined strength equals the intensity of the previous pulse.

The single pulse and the equally intense double pulse sound identical, as indeed they are. Now start spreading the two smaller pulses apart in time, and compare their combined sound with that of the single pulse. When this is done it is found that they can be separated by as much as 0.001 second (1 millisecond) and still sound identical to the single pulse. With larger time separations the character of the sound begins to change; the "tick" becomes a "thud." When the delay between the two pulses becomes large (30 to 60 milliseconds, depending on the listener), they can be heard separately.
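For anyone wishing to try the double-pulse comparison, the test signal is simple to construct. The sketch below is one possible construction, not the procedure used in the original experiments: the sample rate, the signal length, and the idealization of each pulse as a single sample are assumptions, and the second function merely restates the approximate thresholds given above.

```python
import numpy as np

def pulse_pair(separation_s, fs=192_000, length_s=0.1):
    """Two half-amplitude single-sample pulses, separation_s seconds apart.
    With zero separation they merge into one full-amplitude pulse."""
    signal = np.zeros(int(round(length_s * fs)))
    offset = int(round(separation_s * fs))
    signal[0] += 0.5
    signal[offset] += 0.5
    return signal

def reported_percept(separation_s):
    """Restate the outcomes described in the text (thresholds approximate)."""
    if separation_s <= 1e-3:
        return "sounds identical to the single pulse"
    if separation_s < 30e-3:
        return "character changes: tick becomes thud"
    return "may be heard as two separate pulses (listener-dependent)"

for ms in (0, 1, 5, 60):
    pair = pulse_pair(ms / 1000)
    print(f"{ms:3d} ms: pulses at samples {np.flatnonzero(pair).tolist()}, "
          f"{reported_percept(ms / 1000)}")
```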

With the sharpest, shortest transients generated by laboratory equipment, the audibility threshold for time smear is about 1 millisecond. With less demanding test signals (including spoken voice and most music) the threshold increases to several milliseconds. In the early 1930s, telephone engineers found that when voice frequencies were dispersed in time by 10 to 15 milliseconds (due to nonuniform delays in long-distance lines), speech began to sound garbled; so compensation to hold time smear to less than 5 milliseconds became standard in telephone systems. In Hollywood, listening tests with a variety of music and special sound effects indicated that a 2-millisecond delay between woofer and tweeter could not be detected, and that limit was adopted in 1938 as the maximum permissible time spread in theater speakers. In recent years numerous experiments have confirmed that delays of 1 to 2 milliseconds between woofer and tweeter produce no clearly identifiable effect in monophonic sound systems.

In conventional two-way and three-way dynamic loudspeaker systems the maximum woofer/tweeter delay is about 1 millisecond. In a few large horn systems the woofer is 2 to 3 milliseconds behind the midrange and tweeter. Of course in a biamplified system in which the woofer, midrange, and tweeter are mounted in separate cabinets in different locations, larger amounts of time smear can occur.

When a large time dispersion is present, its audible effect on transients is quite dramatic. For example, if Mark Davis' experiment with the one-third-octave filter set (which showed that gross waveform distortion of a continuous square wave is inaudible) is repeated with sharp pulses, one actually can hear a rapidly descending frequency sweep due to the time dispersion of the thirty narrow filters. And synthesizer engineer Dennis Colin has shown that an all-pass filter with a steep phase shift at about 600 Hz markedly alters the reproduction of piano sound. The sustained portions of the notes are unaffected, but the attack is softened and muddled.

Since the maximum time dispersion of a conventional loudspeaker is typically about 1 millisecond, one would expect the effect on the sound of transients to be at the threshold of audibility. In the "live vs. recorded" demonstrations that Acoustic Research conducts before various audiences, listeners have reported that most of the time it was impossible to tell when percussionist Neil Grover was playing and when he was pantomiming while the speakers played an anechoic recording that he had taped earlier. The striking fact is that, although the AR-10v speakers used in these demonstrations are not conspicuously of the "linear phase" type, the attack transients of the percussion instruments are accurately reproduced with no apparent softening or veiling, not even in the case of the brilliant bell-like cymbal included in the percussion set.

The conclusion that a time smear of 1 millisecond or less has no audible effect on the reproduction of musical transients is widely accepted by psychoacousticians and speaker designers. But some designers believe otherwise, and credit for this probably belongs to V. Hansen and E. R. Madsen of Bang & Olufsen in Denmark. About five years ago they conducted a series of experiments suggesting that substantially smaller time shifts can be heard. The key to this result is the discovery of a special test signal. It consists of a sine wave with a DC offset (making it asymmetric) and with every second cycle switched off. As the sine wave is switched off and on, its DC offset is also switched. If the sine-wave frequency is 1,000 Hz, then each second of the test contains 500 1-millisecond segments of offset sine wave alternating with 500 1-millisecond gaps. Mathematical analysis shows that this signal is equivalent to three continuous sine waves at 500, 1,000, and 1,500 Hz, "misaligned" in phase so that their phase angles are proportional to the amount of DC offset. When listeners heard this signal they found that, as the DC offset changed, they could hear a change in timbre. Hansen and Madsen interpreted this to mean that the ear can hear phase shifts of less than 10 degrees at 1,000 Hz, corresponding to time shifts on the order of 30 microseconds (0.03 millisecond). At higher frequencies the threshold is about the same, while at lower frequencies the permissible time shift increases rapidly.
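The correspondence between 10 degrees at 1,000 Hz and roughly 30 microseconds is plain arithmetic: a phase angle is a fraction of one period. A small helper (the function name is illustrative) makes the conversion explicit:

```python
def phase_to_time_us(phase_deg, freq_hz):
    """Time shift, in microseconds, equal to phase_deg degrees of one cycle at freq_hz."""
    period_us = 1e6 / freq_hz
    return phase_deg / 360.0 * period_us

print(round(phase_to_time_us(10, 1_000), 1))   # 10 degrees at 1,000 Hz
print(round(phase_to_time_us(10, 100), 1))     # the same angle at 100 Hz
```

Ten degrees at 1,000 Hz works out to a little under 28 microseconds, and the same phase angle at a lower frequency represents a proportionally longer time.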

This experiment has not been repeated or verified by other researchers, but despite criticism for its methodology, interpretation, and use of a test signal unlike anything encountered in music, it appears to have been one of the principal stimuli for the recent surge of "linear phase" loudspeakers.

=======

Time Compensation: Some Pros and Cons

ADVANTAGES

Depth. Stereo, as originally conceived, implied a sound field in which voices or instruments could be localized at different apparent distances from the listener as well as at various lateral positions. Listeners to time-compensated speakers consistently report hearing a stereo image with unusual (sometimes startling) depth.

Resolution. The stereo image is reproduced precisely, each voice or instrument having its proper place and width. In complex sound sources, such as a symphony orchestra, individual instruments can be resolved with unexpected clarity.

Separation of ambience. With loudspeakers whose stereo image is slightly blended due to time smear, any hall ambience or reverberation in the recording tends to become slightly mixed with the instrumental sounds, causing coloration of those sounds. Consequently, with such speakers, the high definition of closely microphoned recordings tends to add a needed clarity to the sound, particularly where the forward image is concerned. With time-corrected loudspeakers, the extra clarity of the front image "dries up" the ambience a bit, allowing greater enjoyment of recordings made in highly reverberant spaces.

DISADVANTAGES

Restriction of listening position. In order to get the maximum benefit of a time-corrected system, listeners must be located close to the stereo axis, equidistant from both speakers. With some systems an optimum height is also specified. In the KEF 105, for instance, an alignment light is built in; you and the speaker are optimally aligned only when you can see the light in the speaker.

Exaggerated depth. For a typical listener in a concert hall, the instruments onstage are at only slightly different distances. But when recording microphones are placed near the front of the orchestra, the relative distances of the instruments at the back of the orchestra are exaggerated, as is the relative prominence of various annoyances such as bow scrape and the clatter of bassoon keys. Loudspeakers that reproduce this perspective do not necessarily provide a realistic sound. Some listeners will prefer speakers whose time smear flattens the exaggerated depth.

Poor recordings ruthlessly exposed. A time-corrected speaker may be too analytical: It may reveal that many recordings do not contain a genuine stereo image. Many actually sound more pleasant with a less analytical speaker, especially one that adds its own spaciousness to the sound. (Of course the overly dry sound of many recordings as heard through time-corrected speakers can be improved with the aid of a time-delay ambience-synthesis system.)

========

The Effect of Time Smear in Stereo Imaging

In examining the psychoacoustic evidence regarding the effects of time smear on perceived tonal quality and on transient response, we have reached essentially negative conclusions. With continuous tones (even square waves) waveform fidelity appears to be irrelevant. The phase sensitivity of the ear is subtle at best and can be demonstrated only with very specialized test signals.

But these conclusions are essentially applicable to monophonic reproduction. In stereo, time dispersion is no longer trivial. Consider the familiar fact that when the phase of one loudspeaker in a stereo pair is shifted by 180 degrees (by reversing its leads), stereo localization becomes next to impossible. Clearly the relative phase of the speakers of a stereo pair influences stereo localization.

Localization depends essentially on the relative intensity of signals arriving at the ears and the relative timing of those signals. An imbalance of 1/2 dB in level produces a perceptible image shift, and the dependence on relative timing is amazingly critical. According to N. V. Franssen of Philips, trained listeners in laboratory conditions have detected image shifts due to an interaural time difference as small as 1 microsecond. A more typical value for the average listener is 30 microseconds, corresponding to a spatial angular resolution of about 3 degrees. So, although we have established that a frequency-dependent time smear of up to 1 millisecond is probably inaudible in a single speaker, the two speakers of a stereo pair must be identical in time shift to within tolerances some thirty times smaller. The two speakers must be synchronized at all frequencies if the finest details of the stereo field are to be preserved.
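The link between 30 microseconds and about 3 degrees can be checked with a simple path-difference model, in which a distant source at a small angle off center reaches one ear slightly before the other. This is a rough sketch only: the 0.20-meter effective ear spacing and the straight-line geometry (which ignores diffraction around the head) are assumptions, not figures from the research cited.

```python
import math

SPEED_OF_SOUND = 343.0   # meters per second in room-temperature air
EAR_SPACING = 0.20       # meters; assumed effective distance between the ears

def itd_to_angle_deg(itd_s, spacing=EAR_SPACING, c=SPEED_OF_SOUND):
    """Angle off center (degrees) at which a distant source produces
    an interaural time difference of itd_s seconds."""
    return math.degrees(math.asin(itd_s * c / spacing))

print(round(itd_to_angle_deg(30e-6), 1))   # angle for a 30-microsecond difference
print(round(1e-3 / 30e-6))                 # 1 millisecond expressed in 30-microsecond units
```

Under these assumptions 30 microseconds corresponds to just under 3 degrees, and the 1-millisecond single-speaker tolerance is roughly thirty-three times larger than the interaural figure.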

Here, perhaps, is the principal advantage to be gained from "linear-phase" or "time-corrected" speakers. The effort to reduce the time dispersion to zero also makes it likely that there will be no significant differences in timing between the two speakers in a stereo pair. The details of a stereo recording are thus accurately retained and transmitted to the listener unaltered.

If this conclusion is accepted, then a useful corollary may immediately be drawn. Since stereo localization is primarily a mid- and high-frequency phenomenon, it follows that there is little to be gained by eliminating time smear at low frequencies. If the woofer crossover is fairly low in frequency (around 500 Hz or below), it is sufficient to design just the midrange and treble sections for minimum time dispersion and unit-to-unit consistency.

This was confirmed recently by Henning Moller of B&K, a leading proponent of linear-phase design. In experimenting with staggered-driver speaker systems, he found that midrange/tweeter alignment dramatically affects the sound but that changes in woofer alignment are very hard to hear.

So, as a practical matter, a time-compensated loudspeaker need not look like a pregnant kangaroo.

What Causes Time Dispersion?

If we are going to suggest that time-dispersion differences in loudspeakers should be minimized, it may be useful to review some of their causes.

In general, nearly every departure from flat response, in any audio component, has an associated time shift. Examples include the rolloffs at the low-frequency and high-frequency limits of microphones, tape recorders, phono pickups, tuners, amplifiers, and speakers, as well as all frequency-response alterations with tone controls, filters, and equalizers. To the extent that these are identical in both stereo channels, they should not degrade the stereo imaging. In general, if a frequency-response error in one component is compensated by equalization, the associated time shift is as well.

Any sharp resonance, too, will yield substantial time shifts within the octave band above and below the resonance frequency. Examples include the fundamental driver resonances and resonances associated with cone breakup.

Loudspeaker crossover networks are also filters, and they can cause time smear, sometimes amounting to hundreds of microseconds. Some manufacturers do not control crossover tolerances very precisely, so it is not uncommon to find that two successive speakers off a production line have crossover frequencies that differ by 20% to 30%. The differences in the resulting time shifts (about 100 microseconds in the crossover region) may degrade stereo imaging if these two speakers are used as a stereo pair. On the other hand, tight tolerances for crossovers and driver resonances may be expected to improve stereo imaging.

Driver placement is the aspect of loudspeaker time compensation that has received the most obvious attention. When all drivers are mounted on one conventional baffle, the woofer's sound emerges after that of the tweeter because of the depth of the woofer cone. Typically the delay is a few hundred microseconds, comparable to that caused by the crossover. It is the desire to compensate for this delay that has led to speakers with tilted front panels and other cabinet shapes intended to make the effective acoustic positions of all drivers equidistant from the listener. Incidentally, a biamplified system can be time-corrected, using electronic time-delay circuits, without physical realignment of the drivers. If the woofer is flat rather than conical, it will have little geometrical time shift anyway; similarly, flat-panel radiators such as full-range electrostatics and planar magnetics have little woofer offset, though they may still require crossover correction.
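The geometrical delay is simply the driver offset divided by the speed of sound. The sketch below converts an offset to a delay; the 10-centimeter cone depth is an assumed, illustrative figure, while the 8-foot offset is the one quoted in the sidebar on the 1935 theater system.

```python
SPEED_OF_SOUND = 343.0   # meters per second in room-temperature air
FEET_TO_METERS = 0.3048

def offset_delay_us(offset_m, c=SPEED_OF_SOUND):
    """Delay, in microseconds, of sound from a driver set back offset_m meters."""
    return offset_m / c * 1e6

# An assumed 10-cm woofer cone depth on a flat baffle.
print(round(offset_delay_us(0.10)))
# The 8-foot woofer offset of the 1935 two-way horn system, in milliseconds.
print(round(offset_delay_us(8 * FEET_TO_METERS) / 1000, 1))
```

A 10-centimeter setback gives a delay of roughly 290 microseconds, in line with the few hundred microseconds cited above, whereas the 8-foot horn offset amounts to about 7 milliseconds, well beyond the thresholds discussed earlier.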

If the drivers are side by side in the cabinet, then the images of voices or instruments usually are broadened in proportion to the spacing of the drivers. So the sharpest stereo image usually is obtained with the drivers aligned vertically rather than horizontally. (As noted earlier, it is the alignment of midrange and tweeter that counts most, since low frequencies contribute little to stereo localization.) Likewise, a loudspeaker in which two or more laterally spaced drivers operate in the same frequency range may not produce as sharp and detailed a stereo image as a system employing only one driver for each range. The same problem can occur when a strong reflection is produced by placing a single driver close to a reflecting surface.

These considerations lead to some interesting conclusions. One is that a loudspeaker with staggered drivers in a bulging cabinet may not in fact be accurately time-compensated. Elimination of time smear requires close control over crossover circuits, driver resonances, and lateral driver geometry. And since these factors yield time differences comparable to those resulting from woofer depth (and since low-frequency time shift is relatively unimportant anyway), it is likely that some conventional rectangular-box loudspeakers are better aligned in time than some staggered-driver systems. Support for this view is found in reports that certain conventional-looking speakers image with unusual depth and resolution.

Is It Worth the Bother?

As with many issues in sound reproduction, it's up to you to decide whether the reduction of time dispersion in loudspeakers is worthwhile. The prosaic fact is that a speaker's sound still depends mainly on its frequency response and its angular dispersion; time-dispersion effects are subtle by comparison. However, speakers having similarly good frequency response and angular dispersion characteristics differ considerably in time smear.

You will have to depend mainly on your ears, at least until standard lab tests for loudspeaker time smear (and for sample-to-sample uniformity in time behavior) are generally adopted.

-------------

(High Fidelity, Oct. 1977)
