Sounds and Hearing



Here’s a beginner’s approach that explains some of the sounds that you hear, how you hear them, and how the ear works.

Considering all the time and effort audiophiles spend on such a wide variety of amplifiers, speakers, and other devices, the hearing process is just as important as any other part of the music reproduction chain, which runs from the recording to the ear mechanism and the nerve impulses sent to the brain to be interpreted. Depending on an individual’s conditioning, a sound can produce an entire spectrum of emotions and physical reactions, such as the quickening of a heartbeat, laughter, and tears. Sound triggers movement and speech, stirs the memory, and brings forth long-forgotten mental pictures. To accomplish this, many different ways of making sounds and music have evolved over the years, and now, sound recording and reproduction are ways to store and play these valuable sounds.


You can think of a simple stereo recording as being made with two microphones spaced apart and played back through two speakers that are similarly spaced apart. That is all there is to it! However, because there are no standards for stereo recordings, this is not the case for every recording, particularly popular music. Recordings of individual instruments and vocalists can even be made at different times and different places and combined later. The distinct sound of the instruments can be recorded through the use of a microphone or two within a few feet of each instrument or group of instruments. This might explain why some recordings sound much brighter than what you might hear in the audience, unless that live sound, too, is altered to account for the difference.

Each instrument or group can be panned with the mixing console, as mono or stereo, to appear at different places between the speakers. This can bring the instruments into your room and produce the effect of being there. Mixers can add artificial reverberation and alter the response. The only people who truly hear the final production exactly the way it was intended are, perhaps, the recording engineer and producer.

Stereo recordings have several limitations when played through speakers. One occurs when an audience is involved and the microphones are located between the orchestra and the audience. With music played through speakers, a cough in the audience, for example, may sound as though it’s coming from a spot in the orchestra instead of behind you.

Many years ago, I made a stereo recording at the edge of the road in front of Sonotone, where I used to work. When I played the recording back on a pair of speakers, a car that had passed by continued to remain in the right speaker and did not sound like it had gone to the extreme right as it really had done when it went farther down the road.

In addition, there is no vertical component. This became obvious when I made a stereo recording of a Wurlitzer pipe organ in an old movie theater. The pipe lofts were located to the left and right and above the height of the stage. When played back on speakers, the sound appeared to come only from the speakers and the space between them, but not up in the air as the organ lofts actually were.


It was my hope that 4-channel stereo of the 1970s would enable us to be surrounded with stereo, at least in the two dimensions of the horizontal plane. In this configuration, speakers were typically located in four corners of the room and the listener sat in the center. The sound occasionally accomplished some interesting effects. However, technology was limited and fidelity was not always the greatest.

Although mentioned in an article in Hi-Fi News from the UK in the 1960s, tetrahedral stereo never caught on. This consists of two front speakers and one centered at the rear. These form an equilateral triangle. The listener sits in the center. A fourth speaker is located above the center of the triangle forming a tetrahedron. If recordings were made with corresponding microphone positions, a 3-dimensional sound space could be produced. Despite some limitations, very convincing sound could be reproduced. Even more interesting effects could be made with electronic enhancements, particularly for very creative composers.

This process was referred to as Ambisonics. I remember seeing the closely spaced tetrahedral microphone array when visiting with Mark Davis at MIT in Cambridge, Mass., in the 1970s. Calrec was the first manufacturer to develop this array. The idea never caught on, although today, with high-quality multi-channel capability, it could be easily accomplished.

Photo 1 shows a way to produce a 3-dimensional environment. “One extreme was carried out by the Air Force at Wright-Patterson Air Force Base. To create 3-D sound, researchers used a 14’ geodesic sphere of high-grade aluminum alloy, incorporating 272 speakers at regular intervals over the surface. Up to 15 speakers work at a time, emitting the same or different sounds. Sound can move around the sphere at up to two revolutions per second—the same rate that a pilot can roll an F-16 or T-38. In entertainment, 3-D sound achieves greater realism; in the same way that stereo improves on the mono system of the 1950s. Sound is generated in many locations, and when you turn your head, sound changes just as it does in real life.”

PHOTO 1: Sound system for pilots.

Despite the methods for recording stereo, even a simple stereo recording has problems with playback through speakers. First, a straightforward monophonic drum recording arrangement in Fig. 1 can be used to play back through a single speaker without any difficulty. However, when two microphones are used to record stereo, as in Fig. 2, stereo crosstalk from each speaker to both ears interferes with what would be heard if the drum were heard directly. I will discuss the mechanics of this later. A partial solution is listening with headphones, but this is not entirely satisfactory.

Above: Fig. 1: Monophonic recording.

Above: Fig. 2: Stereo recording.


Binaural recording and playback are very different approaches. After all, you really hear binaurally. It enables an even greater degree of perception and eliminates speakers, room acoustics, and stereo speaker crosstalk. Sounds can be located in all directions similar to how you really hear. On the other hand, it presents other problems such as headphone quality, outer ear compatibility, and skin perception.

In order to present binaural sound correctly to the ears, you need only a two-channel system and headphones. Ideally, a binaural recording should be able to let you perceive sound as being outside and from any direction. To achieve this, several recording methods are used with a microphone spacing of about 7” to simulate the distance between the ears. A dummy head is commonly utilized to provide the necessary conditions for binaural recordings.

The Kunstkopf and Neumann KU 81i artificial heads have a small microphone located in each ear canal and include an artificial pinna (outer ear). The head and ears are made of appropriate acoustic materials. Another method is to process signals electronically to simulate head-related transfer functions (HRTF).

A further approach is to incorporate a pair of tiny microphones that can be fitted inside the ears of the person making the recording. If recordings are made this way or with an artificial head that also includes the pinna, playback with headphones involves hearing through a double set of pinna. However, eliminating the pinna in the dummy heads or using earbuds to bypass some of the outer ear could help. Of course, there is still a compromise because the pinna on the artificial head are not the same as yours and affect sound differently from what you hear in the real world.

Another problem is that the listener is not free to turn his/her head to locate sounds more accurately. In addition, particularly for loud sounds, body perception comes into play. Skin perception provides additional information and is useful to detect the impact of a shock wave, such as a drum strike or other percussive sounds. Even loud infrasonic sounds can be felt by the whole body.

Despite the drawbacks of the various recording methods, stereo and binaural sounds both produce some very pleasing effects. In addition, the recording medium has improved tremendously from what it was 50 or 60 years ago, but there is still a lot of room for improvement to reproduce a truly realistic experience in which you can move around and hear sound like you do in real life.

How do I know what good sound is? I played the violin for many years in grade school and then in the high school orchestra. I attended many live concerts, the kind without speakers to enhance the sound. These examples could be convincing, but memory fades over time, and playing in an orchestra with instruments surrounding you is not the same as sitting in the audience.

Better yet are the sounds heard around us every day. You normally take hearing for granted, and awareness of sound location and frequencies is only in the background of your attention. The worst instance for me was when earwax had built up in one of my ears and I could hear only with the other ear (monaural sound, which is bad). It’s at times like this when you can appreciate all that is missing from normal binaural hearing. It was unbearable. I went to the emergency room and had the wax removed. It was like being reborn!


The Haas Effect, also known as the Precedence Effect, plays an important role in how you hear. When two identical sounds come from two different sources, but one is delayed to simulate an echo, the interpretation of what you hear is a unique characteristic of the ear/brain combination. After hearing the initial sound, the brain will suppress any later sound, such as an echo, for up to 30 milliseconds (some sources report up to 40 milliseconds). In effect, this causes you to literally not hear the delayed sound for that short time.

This inhibition is called time or temporal masking. The first sound to arrive defines the direction. If the arrival time of the delayed sound is longer, then two distinct sounds are heard, even if the second arrival is as much as 10dB higher. In addition, if the initial burst is from one speaker but continues as a steady tone from another speaker in a different location, the source will still be identified as coming from the speaker with the initial burst.

Even in an enclosed environment, such as a living room that has several reflective surfaces, you can reliably localize the initial source from the signals that reach your ears first. However, localizing a steady tone in a room is nearly hopeless.

Reflected sounds from the wall, floor, and ceiling form many different reinforcements and cancellations throughout the room. You can hear them as peaks and nulls as you move around. There are no clues for direction. When viewed on a survival basis, this is very important to locate any danger, and not be confused with immediate echoes that can come from other directions.

One of the classic demonstrations of temporal masking is to stand in front of a hard surface such as a brick wall, away from other reflecting surfaces that might interfere with what you hear. By clapping your hands and gradually moving back, you will reach a point where you can begin to discern two separate sounds—your initial handclap and the echo.

When I tried this, I measured 17’ to the wall. The total distance, representing the time it takes for the sound to travel to the wall and back again, was 34’ or 30.6 milliseconds. The temperature was 53° F. You can even tell whether the distance is great or small depending on how long the reflected sound takes to return. The longer the delay is, the greater the distance you are from the wall.
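The arithmetic behind the handclap measurement can be checked with a short sketch. It is only an illustration: the function names are made up for this example, and the speed of sound uses a common linear approximation for dry air (331.3 m/s plus 0.606 m/s per degree Celsius), which is an assumption and not something taken from the measurement itself.

```python
def speed_of_sound_ft_per_s(temp_f):
    """Approximate speed of sound in dry air, in feet per second."""
    temp_c = (temp_f - 32.0) * 5.0 / 9.0
    # Assumed linear approximation: 331.3 m/s at 0 C, plus 0.606 m/s per degree C.
    metres_per_s = 331.3 + 0.606 * temp_c
    return metres_per_s / 0.3048  # convert metres to feet


def echo_delay_ms(distance_to_wall_ft, temp_f):
    """Time for a handclap to travel to the wall and back, in milliseconds."""
    round_trip_ft = 2.0 * distance_to_wall_ft
    return 1000.0 * round_trip_ft / speed_of_sound_ft_per_s(temp_f)


if __name__ == "__main__":
    # 17 feet to the wall at 53 degrees F, as in the handclap test.
    print(f"{echo_delay_ms(17.0, 53.0):.1f} ms")
```

With 17’ to the wall at 53° F, this works out to about 30.6 milliseconds, right at the edge of the 30-millisecond suppression window.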

Accurate onset of transients is essential for initial perception and is of utmost importance for locating sounds. The leading edge or wavefront of a transient provides higher-frequency content that permits localization. In this way, the ear effectively acts as a waveform analyzer. One author suggests that this method of detection may be the way human hearing has evolved, what our continued existence is all about, and sine waves may not be what you respond to at all.


Sound originating from one point in space, such as a single speaker that handles the entire frequency range, can be called monophonic sound. It has no dimensions at all and requires only one channel. There is a difference between monophonic and monaural sound. Monophonic means only one channel as in a recording, and monaural means hearing with only one ear. Binaural is hearing with both ears. With both ears, you can easily locate the source of a monophonic sound (Fig. 3).

Above: Fig. 3: Monophonic source diagram.

Above: Fig. 4: Stereo crosstalk diagram.

When two stereo speakers (Fig. 4) are separated by several feet and fed with the same monophonic signal, a phantom image appears to come from straight ahead. However, the phantom image is not the same as the separate monophonic center channel in Fig. 3 because the sound from each speaker travels to both ears, resulting in a crosstalk of sounds from the left speaker into the right ear and vice versa. Crosstalk alters the amplitude and phase response of the phantom center image and is not the way you would hear a sound from a single-point source.

The resulting response for a monophonic phantom image is referred to as a comb filter because the nulls resemble the teeth of a comb. The nulls shift to lower frequencies as you move farther to one side. The curve in Fig. 5 was made with two identical speakers placed about 2’ apart and fed the same signal. The microphone was located 30° off to the left side of the left speaker. Although each speaker by itself had a very flat response, comb filtering can drastically alter the combined response.
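To see where the nulls of a comb filter fall, here is a rough sketch. It assumes a distant microphone, so that the extra path from the farther speaker is simply the speaker spacing times the sine of the off-axis angle, and it assumes a speed of sound of 1125 feet per second. It also takes the microphone angle to be 30 degrees. The function names are hypothetical, and the real measurement geometry may differ from this simplification.

```python
import math

SPEED_OF_SOUND_FT_PER_S = 1125.0  # assumed, roughly room temperature


def path_difference_ft(spacing_ft, angle_deg):
    """Far-field estimate of the extra distance from the farther speaker."""
    return spacing_ft * math.sin(math.radians(angle_deg))


def comb_null_frequencies(path_diff_ft, count=3):
    """First few null frequencies (Hz) when two equal signals are summed
    with one delayed by the time it takes to cover path_diff_ft."""
    if path_diff_ft <= 0.0:
        raise ValueError("no path difference, so no comb-filter nulls")
    delay_s = path_diff_ft / SPEED_OF_SOUND_FT_PER_S
    # Cancellation occurs where the delay equals an odd number of half periods.
    return [(2 * k + 1) / (2.0 * delay_s) for k in range(count)]


if __name__ == "__main__":
    for angle in (15.0, 30.0, 60.0):
        diff = path_difference_ft(2.0, angle)
        nulls = ", ".join(f"{f:.0f}" for f in comb_null_frequencies(diff))
        print(f"{angle:.0f} degrees off axis: nulls near {nulls} Hz")
```

Under these assumptions, a larger angle gives a longer path difference and therefore lower null frequencies, which agrees with the nulls shifting downward as you move to one side.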

Comb filtering can be easily heard by playing FM inter-station hiss or pink noise in the mono mode. As you move your head from side to side from the center position, you can hear the hiss increase to a maximum and then decrease again. This effect may be more audible with some speakers than others depending on their directional characteristics and surrounding room reflections. However, when using random phased pink noise, which means a separate generator for each channel, this effect is eliminated. The speakers then essentially don’t radiate the same frequency at the same time.

With home theater material, voice is almost always in mono and is directed to the center channel speaker. However, with only two speakers playing home theater and the listener sitting at the center position, the mono voice essentially appears as a center phantom in spite of the missing frequency effects of comb filtering. Nevertheless, the image will exhibit the same problems with waveform (amplitudes and phase) and follow you as you move all the way to one side. On the other hand, a separate mono center channel does not suffer from comb filtering or waveform distortion, and the sound continues to appear to be coming from the center speaker even when you move to the side.

Some recordings may have a mix of stereo and mono, such as a vocalist, piano, or other instruments that appear in the exact center. Listening to a piano that appears to have no width or several instruments at the exact same location is not becoming for good stereo sound, but if a single channel and speaker are used for each instrument, comb filtering and the apparent narrow-width effect are not present. Some stereo recordings with much random spatial content and no mono added won’t exhibit this characteristic nearly as much and might be more like the behavior of random phased noise. Monophonic recordings would seem to be more accurately played using only one speaker.


Human survival still depends on the ability to locate sounds accurately, and two ears do hear more effectively than one. Most investigations for sound perception have been limited to the horizontal dimension. This includes not only the sound source but also assumes that the listener’s ears are in the same horizontal plane. Tests to locate direct sounds are normally conducted in a reflection-free environment.

The ability to locate a sound in real life is determined by the path length difference between the ears. The path length is the actual distance the sound must travel around the head, including obstructions such as the contours of the pinna. The time difference to reach each eardrum is referred to as the inter-aural time difference, which is significant because the path to the farther ear is not only longer but also curved. Higher frequencies tend to propagate in straight lines and don’t follow curvatures as well. Sound reaching the farther ear is reduced in amplitude, producing an inter-aural amplitude difference. The results are shown in Fig. 6 for several different angles.

The ear closest to the source determines the side the sound comes from, and the amount of inter-aural time difference defines the direction. Sounds that arrive at both ears at the same time are interpreted as coming from straight ahead, but you can easily tell whether the sound comes from a different direction. Sounds near the front can be located with an accuracy of 2 or 3°, but accuracy decreases as the angle to the source increases. This invites turning the head to more accurately locate the source.
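A rough feel for the size of the inter-aural time difference can be had from a simple rigid-sphere model of the head. The sketch below uses the Woodworth approximation, which adds the straight path and the curved path around the sphere. This model is brought in here only for illustration. The 3.5” head radius (half of a 7” ear spacing) and the speed of sound are assumptions, and a real head with pinnae will not follow the formula exactly.

```python
import math

HEAD_RADIUS_IN = 3.5               # assumed: half of a 7-inch ear spacing
SPEED_OF_SOUND_IN_PER_S = 13500.0  # assumed: about 1125 feet per second


def interaural_time_difference_ms(angle_deg):
    """Woodworth-style estimate of the inter-aural time difference, in
    milliseconds, for a source angle_deg away from straight ahead."""
    if not 0.0 <= angle_deg <= 90.0:
        raise ValueError("angle must be between 0 and 90 degrees")
    theta = math.radians(angle_deg)
    # Extra path to the farther ear: a straight part (r * sin(theta))
    # plus a part that curves around the sphere (r * theta).
    extra_path_in = HEAD_RADIUS_IN * (theta + math.sin(theta))
    return 1000.0 * extra_path_in / SPEED_OF_SOUND_IN_PER_S


if __name__ == "__main__":
    for angle in (0, 2, 3, 30, 60, 90):
        itd = interaural_time_difference_ms(angle)
        print(f"{angle:2d} degrees: {itd:.3f} ms")
```

With these assumed numbers, a source directly to one side gives a difference of roughly two-thirds of a millisecond, while a shift of 2 or 3° near the front changes it by only about 0.02 to 0.03 ms, which suggests how fine the ear’s timing resolution must be.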

The head is a significant influence for frequencies above 600Hz but with a discontinuity in the area of 2kHz. The same area of the brain also processes sounds that come from the rear or are inclined above or below but are modified differently by different portions of the head and pinna. The shape of the pinna is different for different arrival directions. The brain can interpret all these clues to locate the origin of the sound.

Above: Fig. 5: Comb filtering response.

Above: Fig. 6: Inter-aural hearing curves.


Referring again to Fig. 4 and the arrangement of two speakers, the right ear hears some of the sound coming from the left speaker, and vice versa. To compensate for stereo crosstalk and to locate an image at a particular place, you can add electronic compensation to the left and right channels so the speakers radiate the sound with the inter-aural time delay that the brain would expect if the sound were actually coming from that place directly to your ears. Normally, if the spot is to the listener’s right, the sound from the right speaker should reach the right ear before the left speaker sound reaches the left ear. But the sound from the right speaker typically travels around the head and enters the left ear before the left speaker sound can arrive.

The problem is further complicated by the Haas effect. After hearing a signal, the brain suppresses any similar signals that occur within about 30ms. In this example, the delayed sound from the left speaker is blocked if it arrives too soon after the same sound from the right speaker.

The crosstalk cancellation concept was patented by Atal and Schroeder at the Bell Telephone Laboratories in 1966. It was based on reproducing binaural recordings through speakers that are located relatively close together. Several different commercial versions were developed over the years. One sophisticated device (Photo 2) was called the Sound Retrieval System (SRS), which was refined by Arnold Klayman at Hughes in the 1980s.

The SRS system does not have any time delay processing, artificial reverberation, harmonic generation, artificial phase correction or alteration, and does not require any prior encoding of program material. It was designed to simulate inter-aural differences between the ears and to be played through closely spaced stereo speakers. (Inter-aural effects and hearing are discussed further on.)

Although Klayman intended it to be for use in 747 airplanes and was awarded four patents, it never went into production. Later, the design and patents were purchased from Hughes by SRS Labs headed by Klayman and Jim Lucas. It was produced and sold as the model AK-100 Sound Retrieval System.

A demonstration at McIntosh Lab by the Hughes representatives utilized two small speakers placed about 12” apart on the floor. A stereo recording was played through the AK-100 and then through a power amplifier and speakers. It sounded surprisingly spacious and pleasing. The center control can be adjusted to fill in the hole-in-the-middle as desired and the spaciousness can also be adjusted.

Although the effects are sometimes interesting, they vary from recording to recording requiring readjustment of the controls and there is no accurate setting other than personal preference. The arrangement assumes you will be at the center between the speakers and facing toward the front. Sounds can be heard well beyond the space between the speakers. However, when the AK-100 is used with binaural recordings, a better sense of space and direction can be achieved, again depending on what is in the recording. Sometimes the spacing between the speakers needs to be adjusted as well.

For speakers with wide spacing, the effect is obscured and response appears to be altered when seated back in the listening area. In addition, it’s claimed that the AK-100 can add interest to monophonic recordings. When you are close to the wall where the speakers are, you can hear sounds coming from the sides as well as the center, but as soon as you turn your head to locate the sounds at the side, the illusion disappears.

The concept still exists today. It was refined by the Cooper-Bauck technology to make the best listening area about the same as you would expect with a normal stereo system with a relatively narrow sweet spot. But the claims are that this sound envelops the listener in acoustic space. It’s used by several different companies and modified for use with surround sound, car audio, computers, games, and so on. One such company is Harman International.

Ambiophonics (not to be confused with Ambisonics) claims a different approach. An out-of-phase component is injected into each channel to cancel out the inter-aural time delay. However, the cancellation component from each speaker is heard by both ears.

Ambiophonics claims that further computer processing produces very realistic effects even with two widely spaced front speakers. In surround recordings, side and rear channels can exhibit similar crosstalk, and it was found that simulated algorithms for concert halls were a satisfactory substitute. These have no time relationship to the front channels, and many surround speaker locations around the sides and back are preferred.

PHOTO 2: Hughes AK-100 Sound Retrieval System.

Although these concepts produce an exciting and engaging sound field, they still don’t reproduce a real life experience, only different or alternate realities. In addition, a real concert hall provides other sensory input such as viewing the musicians on the stage that can help locate the sources of sound and at the same time offer a visual distraction as well.


If the ear is exposed to two different sounds at the same time and one of the sounds is very loud, the second sound is “drowned” and cannot be heard. This is termed masking: the very loud sound masks the other sound. Figure 7 shows the threshold for various sound levels at a frequency of 1200Hz. The threshold tends to follow a class III one-third octave shape but is skewed toward the upper frequencies.

This effect was particularly noticeable for me when listening for distortion. When doing a sine-wave sweep for distortion in a tweeter, I was able to hear a third harmonic distortion product just above the masking threshold very clearly, but if the distortion was only a dB or so less, it was clearly not audible.


The ear responds very differently from many of the test signals and test equipment used today. For instance, the Fletcher-Munson curves of perceived equal loudness in Fig. 8 show that the ear is much less sensitive to lower frequencies at lower levels. If you have ever tried to make a recording outdoors on a windy day, you will know one good reason: the wind noise easily overdrives the recorder because the microphone responds to low frequencies at low levels, something you barely hear and which, again, benefits your survival.

Above: Fig. 7: Masking effect curves.

Above: Fig. 8: Loudness contours.

The inverse of the loudness curves indicates that the most sensitive area of hearing at lower levels is in the range of about 400 to 8000Hz and is greatest in the 3000 to 4500Hz range.

Part 2 takes a look at how our hearing mechanism works.

+ + + +

Part 2

We often take our hearing for granted. Find out how the ear works and what we can do to keep it functioning properly.

The idea of a mechanism that took hundreds of millions of years to develop is staggering. It was accomplished so slowly that you cannot observe any changes in your lifetime. In its present-day form, the ear, with its resulting construction, is so complex as to be overwhelming. To make matters more complicated, medical terminology can make a somewhat simple understanding even more intimidating.


The hearing mechanism is very elaborate, and the brain’s processing of information is more like a science-fiction story with an unfinished ending. Although volumes have been written about the hearing process, perhaps a few simpler explanations may help to understand some of the intricacies of the ear and how you hear.

The brain has a sound memory center that begins accumulating sounds almost at birth, perhaps even before. In normal adults, the brain has been able to establish a library of about 400,000 sound patterns to relate to the outside world. You can play tunes any time from your mental library, even complete symphonies with all the instruments playing together. In a similar way, nerve impulses from other sense organs, such as visual images or impulses from other parts of the body, can also be stored for comparison with new sensory input.

The ear is an incredibly sensitive device. The threshold of audibility corresponds to a pressure variation of less than one billionth of one atmosphere. This is commonly referenced as 0dB sound pressure level at 1000Hz, but varies with frequency.

At the other extreme, +130dB is considered the threshold of pain, corresponding to a voltage ratio of more than 3,000,000 to 1 (some sources quote +120dB as the pain threshold). Even a loud noise only causes microscopic movements of the eardrum. For a high-frequency sound, the motion may be only one-tenth of the diameter of a hydrogen molecule. (Hydrogen is in a diatomic form.)
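These figures can be checked with a few lines of arithmetic. The sketch below assumes the conventional reference pressure of 20 micropascals for 0dB sound pressure level and a standard atmosphere of 101,325 pascals. The function names are invented for this illustration.

```python
REFERENCE_PRESSURE_PA = 20e-6   # conventional reference for 0 dB SPL (assumed)
ONE_ATMOSPHERE_PA = 101325.0    # standard atmosphere


def amplitude_ratio(level_db):
    """Pressure (or equivalent voltage) ratio for a level in decibels."""
    return 10.0 ** (level_db / 20.0)


def spl_to_pascals(level_db):
    """Sound pressure in pascals for a given sound pressure level."""
    return REFERENCE_PRESSURE_PA * amplitude_ratio(level_db)


if __name__ == "__main__":
    fraction = spl_to_pascals(0.0) / ONE_ATMOSPHERE_PA
    print(f"0 dB SPL is {fraction:.2e} of one atmosphere")
    for level in (120.0, 130.0):
        print(f"+{level:.0f} dB is a ratio of {amplitude_ratio(level):,.0f} to 1")
```

On these assumptions, 0dB works out to about two ten-billionths of an atmosphere, and +130dB to a ratio of a little over 3,000,000 to 1, while +120dB is exactly 1,000,000 to 1.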

Although the ear magnifies a wide range of sound intensities, the transmitting equipment is too stiff to respond to the very weakest tones, and thus they are not heard. If the range were not limited, you would be assailed by such sounds as your own body’s muscle contractions and bone movements.

Above: Fig. 9: Cross-section of the ear (labels: pinna, auditory canal, Eustachian tube, outer ear).

A microphone can be called a transducer because it transforms energy from an acoustical system through a diaphragm to a mechanical system that is then converted to an electrical system. The ear functions as a biological transducer.

Figure 9 shows a cross-section of the ear. There are three main sections: the outer, middle, and inner ear. The outer and middle ear function as an impedance transformer, converting energy from the low impedance of the air to the high impedance of the fluid in the cochlea.

Acoustic resonances in the auditory canal of the outer ear can double the sound vibration force. The mechanical advantage of the bone-lever system of the middle ear can triple it. Pressure then transmitted to the cochlea at the inner ear can increase it 30 times. The total result can be an amplification of up to 180 times before a sound wave sets the fluid of the inner ear in motion.
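The 180-times figure is simply the product of the three stages. The short sketch below multiplies them and also expresses the result in decibels, on the assumption that each factor is an amplitude (force or pressure) ratio, so that 20 times the logarithm applies. The stage names are just labels for this example.

```python
import math

# Maximum gain of each stage, as given in the text above.
STAGE_GAINS = {
    "auditory canal resonance": 2.0,
    "middle-ear bone lever": 3.0,
    "pressure transfer to the cochlea": 30.0,
}


def total_gain(stage_gains):
    """Overall gain of stages in series: the product of the stage gains."""
    result = 1.0
    for gain in stage_gains.values():
        result *= gain
    return result


def gain_in_db(ratio):
    """Decibel value of an amplitude ratio (assumes force or pressure)."""
    return 20.0 * math.log10(ratio)


if __name__ == "__main__":
    overall = total_gain(STAGE_GAINS)
    print(f"total gain: {overall:.0f} times, about {gain_in_db(overall):.1f} dB")
```

Treated this way, the three stages together amount to roughly 45dB of gain ahead of the inner ear.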

The principal parts of the ear can be seen in the three-dimensional model of Photo 3. The sensitive parts of the inner ear are well protected inside the skull bone, shown as the porous structure. Sound travels from the auditory canal at the bottom left and ends at the eardrum near the bottom center. This converts the acoustic energy to mechanical energy that is transferred from the eardrum to the mechanical system of the middle ear consisting of three bones: the hammer, anvil, and stirrup (known collectively as the ossicles). These are shown just above the eardrum and are the smallest bones in the human body.

Notice the stirrup as a U-shaped bone at the center of the picture. The three loops at the top center of the picture are the semicircular canals that provide our sense of balance. They are part of the same bone structure as the cochlea, which resembles a snail shell and can be seen at the right side.

PHOTO 3: Inner ear model.

Above: Fig. 10: Cross section, cochlea.

The stirrup transmits vibrations to the fluid-filled cochlea, which converts the energy into electrical pulses that are sent to the brain. The auditory nerves are shown just above the cochlea. The Eustachian tube is at the bottom right and is used to equalize air pressure on the eardrum. To put size in perspective, the cochlea is about the size of a pea.

The inside of the cochlea is hollow and is the most complex part of the hearing mechanism. A cross section is shown in Fig. 10. It’s divided into three fluid-filled parts—the tympanic (lower) canal, the vestibular (upper) canal, and the cochlear duct (middle canal). They are separated by thin membranes. The upper and lower canals are filled with a fluid called perilymph, and the middle canal is filled with endolymph. These fluids have different chemical compositions and electrical charge.

The stirrup transmits vibrations to the oval window of the upper canal. Vibrations travel through this canal to the tip of the cochlea at its center, where the cross section is the smallest. The waves then enter the lower canal until they reach the membrane-covered round window at the large end of the lower canal. This window dampens the vibration.

The small rectangle enclosing the organ of Corti in Fig. 10 is shown magnified in Fig. 11. The organ of Corti is a gelatinous mass located in the middle canal. It performs the most complex and interesting of the transformations and contains about 7,500 interrelated parts.

There are four longitudinal rows of auditory sensor hair cells. Three of the rows are outer hair cells and one row contains inner hair cells. They are embedded in supporting cells of the basilar membrane that forms the floor of the middle canal. Bundles of stiff stereocilia (much smaller hairs) project from the hair cells. Incidentally, stereocilia and stereo sound are not directly related.

The tectorial membrane shown in Fig. 11 is suspended from the bone surrounding the cochlea and projects over the hair cells and stereocilia like a shelf.

It’s coupled to the organ of Corti by the stereocilia bundles attached to the hair cells. There are over a million stereocilia in each ear. The fluid pressure waves produce motion of the basilar membrane that bends, twists, pulls, and pushes the hairs. This motion causes the stereocilia to move against the tectorial membrane.

The basilar membrane, also shown in Fig. 11, is light and taut near the wider stirrup end and thick and loose at the smaller end. Waves induce a ripple in the membrane. High tones produce their greatest crests where the membrane is tight; lower tones crest where the membrane is slack. The membrane can also pick up vibrations from the skull. When hearing is impaired, a bone conduction transmitter can sometimes be used to couple audio vibrations to the skull and restore partial hearing.

The chain of events for hearing then becomes even more involved with the conversion to electrical signals. The most intricate process occurs when motion of the stereocilia causes a chemical process to take place that alters the permeability of the membranes of the hair cells and allows positive ions to enter the corresponding cells. As a result, the respective cells develop a receptor potential and release more neurotransmitter molecules at their synapses with sensory neurons. The resulting increase causes electrical discharges that are generated by the nerve cells. The cochlea-to-brain transmission system contains 30,000 nerve fibers issuing from the organ of Corti. The sound frequency interpretation depends on which fibers are activated.

Although there has been some correlation between nerve-firing rate and audio frequency, there is a recovery time for each nerve that limits the maximum firing rate to 1000 firings per second. This is a much lower rate than our upper limit of hearing at 20,000Hz. Nerves can take turns and increase the effective rate to 3000 firings per second, but information about most sound frequencies depends on which fibers are activated and not on the rate of firing.
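The arithmetic behind these limits can be sketched in a few lines. The two rate limits come from the paragraph above; the function names and the three-way classification are only an illustration and a simplification of what the ear actually does.

```python
import math

MAX_RATE_PER_FIBER = 1000   # firings per second for one nerve fiber (from the text)
MAX_TURN_TAKING_RATE = 3000  # effective rate when fibers take turns (from the text)


def fibers_needed(tone_hz: float) -> int:
    """Fibers that would have to take turns to fire once per cycle of the tone."""
    return math.ceil(tone_hz / MAX_RATE_PER_FIBER)


def rate_can_follow(tone_hz: float) -> str:
    """Say whether firing rate alone could, in principle, keep up with the tone."""
    if tone_hz <= MAX_RATE_PER_FIBER:
        return "one fiber can keep up"
    if tone_hz <= MAX_TURN_TAKING_RATE:
        return "fibers taking turns can keep up"
    return "too fast for firing rate; depends on which fibers are activated"


if __name__ == "__main__":
    for tone in (440, 2500, 20000):
        print(tone, "Hz:", fibers_needed(tone), "fiber(s);", rate_can_follow(tone))
```

A 20,000Hz tone would call for 20 fibers firing in perfect rotation, far beyond the 3000-per-second limit given above, which is why most frequency information has to come from which fibers respond.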

Auditory nerve axons convey signals to the thalamus and cerebral cortex of the brain, where the auditory signals are interpreted as sound. Some signals are even fed back. The enormous amount of information transmitted to the brain can be selected and interpreted in an unbelievable number of ways. One that is commonly known is our ability to filter out a single conversation in a room full of people who are all talking.


The middle ear has inherent safety devices that help to protect the inner ear from loud noises and large changes in ear pressure. Loud noise in excess of 80 to 85dB triggers two sets of muscles. One tightens the eardrum and restricts its ability to vibrate, while the other pulls the stirrup away from its link to the inner ear. Sounds that are most damaging are those above 500Hz, and protection mainly affects the range below 2kHz. The protective system is relatively slow to act, however, taking from 10 to 30 ms for very loud sounds and up to 150 ms for levels near the trigger threshold. There is a release time for the muscles as well.

Typically, the protection response is more effective for younger people. Nevertheless, hearing damage can still occur because there is a tendency to play music even louder, overriding the efforts of our ears to protect themselves. Hearing damage can occur at any age from very short, high-level sounds such as a rifle shot or other acoustic spikes. The muscles don’t respond fast enough.

You may have noticed that as you play music louder, the sound changes. Not only is there an apparent frequency shift, but the quality of the sound also seems to change. This is very apparent for some people, but others may not have noticed the effect at all. I attribute it to the action of the muscles of the middle ear at a listening level above 80 to 85dB.

The second safety device is the Eustachian tube, which connects the air-filled middle ear with the mouth cavity and serves as a pressure equalizer. This is why I was advised, when firing an anti-tank weapon, to open my mouth so that pressure would then be equalized at the eardrum and possible hearing damage could be avoided. This is similar to the operation of a noise-canceling microphone in which equal pressure at both sides of the diaphragm reduces diaphragm displacement. You may more commonly experience this when ascending or descending in an airplane.


Normal hearing loss due to aging consists of a gradual decrease in high-frequency sensitivity over time. This means that for a man at age 35, hearing sensitivity could be reduced as much as 11dB at 8kHz. In comparison, for a woman at that age the reduction might be only as much as 5dB. This varies, of course, from person to person, but you can infer that sensitivity would be down even further at 20kHz.
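To get a feel for what these decibel figures mean, they can be converted to sound-pressure ratios. This is a minimal sketch that assumes the figures are ordinary sound-pressure-level decibels (20 times the base-10 logarithm of the pressure ratio); the function name is only illustrative.

```python
def db_to_pressure_ratio(loss_db: float) -> float:
    """Sound-pressure ratio corresponding to a level difference in dB (20*log10 rule)."""
    return 10 ** (loss_db / 20)


if __name__ == "__main__":
    # Example losses at 8kHz quoted in the text for age 35.
    for label, loss_db in (("man, 11dB", 11), ("woman, 5dB", 5)):
        ratio = db_to_pressure_ratio(loss_db)
        print(f"{label}: needs about {ratio:.2f} times the sound pressure")
```

Under that assumption, an 11dB loss means a tone at 8kHz needs roughly 3.5 times the sound pressure to seem as loud as before, while a 5dB loss needs roughly 1.8 times.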

On the other hand, hearing damage is an acquired symptom. Despite safety devices built into the human hearing mechanism, survival did not depend on tolerance to loud sounds. On the contrary, it depended on man’s ability to hear and locate faint sounds. Humans are not equipped to tolerate exposure to so many loud sounds encountered in our present environment. Whether it’s machinery, engines, or places of entertainment, the danger is there.

When it comes to car audio, home audio, and even headphones, you can be inadvertently attracted to the sensation of hearing and even feeling loud sounds and the sensation that the louder it is, the better. Unfortunately, prolonged exposure to excessively loud sounds or even short-term peaks can damage or destroy part or all of your delicate hair cells.

Above: Fig. 11: Cross-section of the organ of Corti, with the tectorial membrane labeled.

The hair cells can become twisted, bent, and/or fused and are no longer able to respond properly to the incoming sonic vibrations. The ear’s performance is degraded even when only a relatively small number of hair cells is damaged. The damage can be permanent in part or all of the frequency range for the rest of your life.

Hearing aids, which simply amplify frequency areas, might help to restore hearing to some degree depending on the severity of hearing loss. In cases of total hearing loss, cochlear implant surgery can be performed. Often, it’s the stereocilia that are damaged and not the nerves. An implant can be effective in some cases and is becoming more common.

A typical operation involves inserting a tiny 22-channel electrode under the basilar membrane so that it curls around inside the cochlea. The electrodes stimulate the auditory nerves, bypassing the damaged hairs. An internal receiver is added near the cochlea with an antenna loop. An external microphone and amplifier are used outside the body.

The amplifier converts the audio signals to pulses, which are inductively coupled by placing the transmitting coil at the side of the head next to the receiver antenna implant. This operation can restore varying degrees of hearing, but not total hearing ability. Even if it restores only partial hearing, it’s a tremendous help in lessening the feeling of isolation that total or near total deafness can cause.

A symptom of exposure to extra loud sound is ringing in the ears. This may go away after a period of rest, but repeated occurrences may someday cause permanent ringing.

For example, at one place of entertainment that used a loud public address system, I came prepared with a sound level meter and earplugs. It didn’t take long for the sound to become very unpleasant, so I put my earplugs in and measured 85-90dB on the A scale with slow response. On the C scale with fast response, I measured 90-100dB. The PA system was used much of the time for about an hour.

I noticed later that one of the foam earplugs didn’t fit tightly in my ear. Although I thought it cut down the intensity, when I removed both of them at the end of the performance, I could only hear well out of the ear with the completely sealed plug. Fortunately, this was only a temporary loss and my hearing recovered after 15 minutes or so. No warning was given to us that we would be exposed to uncomfortable or damaging sound levels. To quote from ASHA, “Sounds louder than 80 decibels are considered potentially dangerous. Both the amount of noise and the length of time of exposure determine the amount of damage.”
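The trade-off between level and exposure time in that quote can be sketched with a simple rule. The example below assumes the widely used occupational guideline of 85dB (A scale) for 8 hours, with the allowed time halving for every 3dB increase; that criterion is not taken from this article or from ASHA, and the function names are only illustrative.

```python
CRITERION_LEVEL_DB = 85.0  # assumed reference level, A-weighted (not from the article)
CRITERION_HOURS = 8.0      # assumed allowed time at the reference level
EXCHANGE_RATE_DB = 3.0     # assumed: allowed time halves for each 3dB increase


def permissible_hours(level_db: float) -> float:
    """Allowed daily exposure time in hours at a steady A-weighted level."""
    return CRITERION_HOURS / 2 ** ((level_db - CRITERION_LEVEL_DB) / EXCHANGE_RATE_DB)


def dose_percent(level_db: float, hours: float) -> float:
    """Share of the allowed daily exposure used up, as a percentage."""
    return 100.0 * hours / permissible_hours(level_db)


if __name__ == "__main__":
    # One hour at the levels measured on the A scale in the example above.
    for level in (85, 90, 100):
        print(f"{level}dB: {permissible_hours(level):.2f} h allowed, "
              f"1 h uses {dose_percent(level, 1.0):.0f}% of the daily dose")
```

Under these assumptions, one hour at 90dB uses roughly 40% of a day's allowance, and at 100dB the allowance is only 15 minutes.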

Hearing damage is not confined to adults. It can be even more prevalent among younger people, particularly those who use iPods with earbuds that are inserted directly into the ear canal. In a paper presented before the Audio Engineering Society, one researcher cited several audiological studies of students. In one case, it was found that 3.8% of the sixth graders failed a high-frequency hearing test, while 11% of 9th graders and 10.6% of high school seniors failed. A survey of incoming college freshmen yielded a 33% failure rate. The next year, 60.7% of the new incoming class failed. These figures clearly imply an accumulation of damage.


Several different protective devices could be used, but they can take away from the physical and emotional sensation of loud sound. To quote one disco enthusiast, “I like to feel the sound.” The reasons people prefer louder sound have received little attention. Loudness commands our attention. It’s a survival trait. Louder sounds are usually associated with danger or excitement. They make the pulse go faster. They take your attention away from other lesser stimuli. In a sense, it’s like a drug, providing an escape to a different space controlled by sound like the rhythm or the inventiveness of the music. The louder the sound, the more immersion there is in this other space. This immediate gratification would seem to be without penalty because the gradual increase in damage is not at first apparent. Warnings are then easy to ignore, and, like drugs, loud sound can even be habit forming.


In my opinion, sound in several dimensions has a long way to go to capture real acoustic events. However, so many improvements have been made in the last 50 years that music reproduction has become thoroughly enjoyable. For me, I only wish it had occurred at an earlier time.

There are two sides to pleasing sound. There is the re-production of reality and then there is the production of many alternate realities, all of which offer a different path to sonic ecstasies and all of which can provide many varied experiences. After all, the pleasure of music in various forms is what mankind has sought and expressed for many thousands of years. However, combined visual effects distract from a sonic experience and limit your imagination.

In addition, after this brief introduction to some of the delicate and very complicated functions of the ear, I hope that you will take care of your hearing and your children’s hearing so that you may all continue to enjoy music and sound throughout your lifetimes.


1. “O, O, O, That Concert Hall Realism Rag,” Craig Stark, Stereo Review, Nov. 1971, p. 75.

2. “Sound System for Pilots Turns Heads,” Design News, Dec. 17, 1990.

3. “The Influence of a Single Echo on the Audibility of Speech,” Helmut Haas, Journal of the Audio Engineering Society, Vol. 20 (Mar. 1972), pp. 145-159.

4. “Sineward Distortion,” John W. Campbell, Jr., Audio, Nov. 1961, p. 52.

5. “Stereo Hearing,” Floyd Toole, Stereo Review, Jan. 1973, p. 75.

6. “Two Speakers are Better than 5.1,” Alan Kramer, IEEE Spectrum, May 2001, pp. 71-74.

7. “Acoustic Noise Measurements,” Bruel & Kjaer, Copenhagen, Denmark, Jan. 1971.

8. “Sound and Hearing,” Stevens and Warshofsky, Time, Inc., 1967.

9. “The Human Ear: In Anatomical Transparencies,” Polyak, McHugh & Judd, Sonotone Corporation, Elmsford, NY, 1946.

10. Ear Model, 3B Scientific products, Rudorffweg 8, 21031 Hamburg, Germany.

11. “Hearing and Listening,” Floyd Toole, Audio Scene, D 1978, pp. 34-43.

12. Olson, Harry F., Modern Sound Reproduction, 1972 (Van Nostrand Reinhold Co.), p. 325.

13. “Recreational Deafness—How Can Audio Engineers Stem It?”, Daniel R. Raichel, Audio Engineering Society, 1979 (preprint 1535-1-6).

14. “Early detection of hearing damage in young listeners resulting from exposure to amplified music,” P.D. West and E.F. Evans, British Journal of Audiology, 24:2: 89-103, April 1990.

15. “The Seductive (Yet Destructive) Appeal of Loud Music,” Barry Blesser, Ph.D.


Updated: Wednesday, 2014-10-01 0:00 PST