ENDING THE REIGN OF ERROR
Error protection is one of the great opportunities of digital audio storage
media such as the Compact Disc. It offers something that was never possible
with analog media-the chance to correct errors and repair damage.
When you scratch an LP record, the grooves are irrevocably damaged, along
with the information contained in them. Forever after, there will be a click
or pop as the damaged part of the groove passes underneath the phono graph
stylus. But when you scratch a CD, the nature of the data on the disc and the
player's design offer you a second chance. The data on the disc includes a
special error-correction code, and your player uses this to correct for damaged
data. Thus, it strives to deliver the original information in tact, and it
performs this error correction each time the disc is played.
Let's examine some fundamentals of error protection. To start off, consider
these two messages:
1. Turn to page 345 in your hymnals.
2. Let's party! Let's party!
The first message is like the information in an LP groove, and the second
is like the data in a CD pit track. Now, lay a finger vertically across the
two messages, to represent a scratch. You'll observe that part of the first
message is irrevocably gone, whereas the second message can be reconstructed
because we have used redundant data to repeat it. In fact, redundancy is the
essence of error protection. This is extra information, derived from the original
data, which is not essential (in the absence of errors) to the message itself.
In general, the greater the redundancy, the greater the error protection.
By comparing redundant messages, you can overcome the effect of the error.
Note that we haven't prevented the error; we have simply ensured against its
effect. Clever, eh? Of course, even the most clever of ideas has its limitations.
As you lay more and more fingers over the message, reconstructing it becomes
more and more difficult. Thus, the severity of the error is an important factor
in our ability to make corrections. Also, if you lay your finger horizontally
across the page, you might completely obscure the entire line, destroying both
the message and its redundancy. There fore, the nature of the error plays a
role too. In addition, we should note that error correction demands a price;
in this case, including an error-correction code doubles the amount of storage
space required.
On the negative side, error protection is not only an opportunity but an obligation.
Digital audio on a Compact Disc might require storage of 15 billion bits, which
amounts to about 100 mil lion bits per square centimeter. With such great data
density, even the smallest speck of dust would wipe out a considerable number
of bits. This points out the fact that, in reality, the type of error correction
utilized has to be a great deal more sophisticated than our simple redundancy
scheme.
We should also consider the vulnerability of digital data. Even one bad bit
could wreak havoc. For example, if the digital word 0000000000000000 (representing
silence) was misread as 1000000000000000 (representing a pretty loud level),
a loud click would result. Highly unacceptable! Thus we have no choice; for
satisfactory digital audio storage, error protection is essential. Data on
a CD must be error-protected before the disc is made, and a CD player must
be able to correct errors during playback.
But before a CD player can correct errors, it must detect them. While this
might sound obvious, the problem can be considerable. If presented with a data
word, could you tell whether or not it contained an error? For example, does
1100101000011110 contain an error? Unless you are psychic, there's no way to
tell. I could transmit the message twice:
1100101000011110 1100101100011110
and close examination would reveal a difference between them. Obviously, both
cannot be correct, but which message is right? I could transmit the message
three times, with two identical data words and one slightly different, and
you might have a good suspicion of which was correct. But would you be sure?
What if I repeated the message three times and all three versions were different?
Obviously, simple repetition is an in efficient way of going about error detection
and correction; we need to de vise a better method. Redundancy is the best
way, but we must optimize it to limit its inherent penalty of overhead.
This leads to the development of elaborate error-correction codes which make
very efficient use of redundancy
by coding the information in certain ways. Error-correction codes use special
algorithms to create redundant in formation, balancing the information between
the twin tasks of detection and correction.
What is a code? A code is simply a way of representing information. Common
examples include Morse code, ASCII (American Standard Code for In formation
Interchange), and the ISBN (International Standard Book Number) code. An ISBN
number is more than just a series of digits. For example, consider the ISBN
number 0-672 22388-0. The first digit designates the country of origin; for
example, 0 is for the U.S. and some other English speaking countries. The next
three dig its comprise a publisher code, and the next six make up the book
number code. The last digit is particularly interesting; it is a check digit
with a value derived from the values of the previous digits. This allows us
to verify the correctness of any ISBN by adding together the first nine digits,
in modulo 11 arithmetic, and comparing them to the last digit.
In practice, the redundancy con tamed in digital audio protection codes acts
very much like this last digit of the ISBN. Specifically, the redundancy of
ten takes the form of a parity bit, an extra bit chosen according to a predefined
rule, using the message as the basis. For example, in an odd-parity scheme,
the parity bit is chosen so that the number of ones in the message, including
the parity bit, is odd. A word such as 11010100 would have a parity bit of
1 appended. When the word is read from a storage medium, we can use the parity
bit to perform a simple check on the word's validity. In this case, if the
number of ones in the received message were even, we would know an error had
occurred, though not which bit was wrong.
To increase their detection reliability, parity codes can be made more complex.
In some codes, data is divided into blocks and parity bits are added to each
block. Consider the example in Fig. 1. In addition to the 16 data values, nine
extra parity values are created and appended. They are placed at the end of
each row and column, and each represents the sum of that row or column. Note
that a parity value is also given for the parity row and the parity column.
If an error occurs in any data (or parity) value, the error can be easily
located and the correct value can be easily calculated, using the other data
present. For example, suppose that the data block shown in Fig. 2 is received.
As we calculate each parity value at the receiving end, and check it against
each transmitted parity value, we observe a disagreement. In fact, there is
a disagreement in both a row and a column parity value. The intersection of
the row and column points to the error. Furthermore, we can now substitute
the correct value, thanks to the transmitted parity values.
The data in the block is thus made substantially more reliable than it had
been. Of course, instead of 16 values, 25 are now required in order to accommodate
the error code.
Using parity, the system can- detect and correct errors, and supply good as-new
information. However, as I've mentioned, the performance of the error-protection
system depends on the nature of the error. What if a large error obliterated
both the data and its parity? There would be nothing left from which to reconstruct
the message. It is thus important to understand the nature of the errors and
devise protection accordingly.
Fig. 1-Original data block (va and block same data with parity values added (8).
Fig.2-Error detection and correction, using parity values.
Fig. 3-Without interleaving (A), large dropouts (burst errors) cause irretrievable
data losses.
With interleaving (8). the effects of dropouts are distributed, allowing errors
to be corrected or concealed.
Fig. 4-A more detailed look at how interleaving works.
Errors can occur in large groups, called burst errors, or in isolated instances,
called random errors. For ex ample, a badly formed pit would cause a random
error, whereas a greasy fingerprint would cause burst errors. The CD must guard
against both kinds.
Interleaving is employed to protect against the all-too-likely occurrence
of burst errors. Interleaving might be thought of as shuffling a deck of cards;
data words are redistributed in the bit stream during recording so that consecutive
words are never adjacent in the medium. An error occurring in the medium (such
as a dust particle on the disc) might prevent the successful reading of a number
of adjacent values. ues. However, upon de interleaving, the shuffled words
are placed back in their original and rightful positions in the stream, and
the errors are scattered in time. Thus isolated, they are much easier to correct.
Figure 3 shows what happens to a data dropout without and with interleaving.
In the latter case, a large defect in the medium is distributed following de-interleaving.
Because consecutive data bits are stored far apart, any errors will be scattered,
becoming more like random errors-which are more easily corrected.
Figure 4 shows how interleaving and de-interleaving works. It appears complicated,
but simply delaying the data with differing delay times during re cording,
and delaying it again (in a complementary manner) upon play back, accomplishes
the job. Cross-interleaving carries the idea one step farther; data is interleaved,
then inter leaved again. This adds extra robustness, providing correctability
for larger errors. The nice part about cross-interleaving is that while error
protection is enhanced, the amount of redundancy is not increased.
While total error correction is theoretically possible (short of catastrophic
failure of the system), it would be impractical to implement. With real-life
digital audio systems, some errors are too massive for the error-correction
scheme, and we must fall back on error concealment. Using error-detection results
to pinpoint massive errors, interpolation methods utilize valid data surrounding
the error to calculate a new value to substitute for the error.
Unlike correction, in which the new value is identical to the original, interpolation
is only a best guess. But with a good interpolation scheme, an uncorrected
error can be made virtually in audible.
In worst-case scenarios, where the error is so massive that even error concealment
would fail, we choose to mute the audio signal. (After all, brief silence is
preferable to a burst of distortion.) By swiftly attenuating the signal before
and after the mute, even these catastrophic errors are often made inaudible
to most listeners.
As we have seen, error protection consists of several kinds of processing
tasks. Errors must be detected and then corrected (if possible), concealed,
or muted. To accomplish those chores, the signal must be carefully encoded
with detection and correction codes, and interleaved. It's a lot of work. Next
time you're eating some thing greasy and listening to CDs, I hope you'll appreciate
it.
(adapted from Audio magazine, May 1987)
= = = =
|