Digital recording and transmission principles (part 2)


13. Channel coding

In summary, it’s not practicable simply to serialize raw data in a shift register for the purpose of recording or for transmission except over relatively short distances. Practical systems require the use of a modulation scheme, known as a channel code, which expresses the data as waveforms which are self-clocking in order to reject jitter, separate the received bits and to avoid skew on separate clock lines. The coded waveforms should further be DC-free or nearly so to enable slicing in the presence of losses and have a narrower spectrum than the raw data to make equalization possible.

Jitter causes uncertainty about the time at which a particular event occurred. The frequency response of the channel then places an overall limit on the spacing of events in the channel. Particular emphasis must be placed on the interplay of bandwidth, jitter and noise, which will be shown here to be the key to the design of a successful channel code.

Fig. 29 shows that a channel coder is necessary prior to the record stage, and that a decoder, known as a data separator, is necessary after the replay stage. The output of the channel coder is generally a logic level signal which contains a 'high' state when a transition is to be generated.

The waveform generator produces the transitions in a signal whose level and impedance is suitable for driving the medium or channel. The signal may be bipolar or unipolar as appropriate.

Some codes eliminate DC entirely, which is advantageous for optical media and for rotary head recording. Some codes can reduce the channel bandwidth needed by lowering the upper spectral limit. This permits higher linear density, usually at the expense of jitter rejection. Other codes narrow the spectrum by raising the lower limit. A code with a narrow spectrum has a number of advantages. The reduction in asymmetry will reduce peak shift and data separators can lock more readily because the range of frequencies in the code is smaller. In theory the narrower the spectrum, the less noise will be suffered, but this is only achieved if filtering is employed. Filters can easily cause phase errors which will nullify any gain.

A convenient definition of a channel code (for there are certainly others) is: 'A method of modulating real data such that they can be reliably received despite the shortcomings of a real channel, while making maximum economic use of the channel capacity.' The basic time periods of a channel-coded waveform are called positions or detents, in which the transmitted voltage will be reversed or stay the same. The symbol used for the units of channel time is Td.

There are many ways of creating such a waveform, but the most convenient is to convert the raw data bits to a larger number of channel bits which are output from a shift register to the waveform generator at the detent rate. The coded waveform will then be high or low according to the state of a channel bit which describes the detent.

Channel coding is the art of converting real data into channel bits. It’s important to appreciate that the convention most commonly used in coding is one in which a channel-bit one represents a voltage change, whereas a zero represents no change. This convention is used because it’s possible to assemble sequential groups of channel bits together without worrying about whether the polarity of the end of the last group matches the beginning of the next. The polarity is unimportant in most codes and all that matters is the length of time between transitions. It should be stressed that channel bits are not recorded. They exist only in a circuit technique used to control the waveform generator. In many media, for example CD, the channel bit rate is beyond the frequency response of the channel and so it cannot be recorded.
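
As an illustration of this convention, here is a minimal Python sketch which converts a string of channel bits into a two-level waveform. The function name and the example bits are purely illustrative.

def channel_bits_to_waveform(channel_bits, start_level=0):
    """Convert transition-coded channel bits to a two-level waveform.
    A channel-bit one toggles the output level at that detent; a zero
    leaves it unchanged. The absolute starting polarity is arbitrary,
    which is why most codes can ignore it."""
    level = start_level
    waveform = []
    for bit in channel_bits:
        if bit:                  # one = reverse the transmitted voltage
            level ^= 1
        waveform.append(level)
    return waveform

bits = [1, 0, 0, 1, 0, 1, 1, 0]
print(channel_bits_to_waveform(bits, 0))   # [1, 1, 1, 0, 0, 1, 0, 0]
print(channel_bits_to_waveform(bits, 1))   # inverted, but same run lengths

Either starting polarity gives an equally valid waveform, which is exactly why only the spacing of the transitions carries the data.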

One of the fundamental parameters of a channel code is the density ratio (DR). One definition of density ratio is that it’s the worst-case ratio of the number of data bits recorded to the number of transitions in the channel. It can also be thought of as the ratio between the Nyquist rate of the data (one-half the bit rate) and the frequency response required in the channel. The storage density of data recorders has steadily increased due to improvements in medium and transducer technology, but modern storage densities are also a function of improvements in channel coding.

Fig. 30(a) shows how density ratio has improved as more sophisticated codes have been developed.

Fig. 30 (a) Comparison of codes by density ratio; (b) comparison of codes by figure of merit. Note how 4/5, 2/3, 8/10 and RNRZ move up because of good jitter performance; HDM-3 moves down because of jitter sensitivity.

As jitter is such an important issue in digital recording and transmission, a parameter has been introduced to quantify the ability of a channel code to reject time instability. This parameter, the jitter margin, also known as the window margin or phase margin (Tw), is defined as the permitted range of time over which a transition can still be received correctly, divided by the data bit-cell period (T).

Since equalization is often difficult in practice, a code which has a large jitter margin will sometimes be used because it resists the effects of intersymbol interference well. Such a code may achieve a better performance in practice than a code with a higher density ratio but poor jitter performance.

A more realistic comparison of code performance will be obtained by taking into account both density ratio and jitter margin. This is the purpose of the figure of merit (FoM), which is defined as DR × Tw. Fig. 30(b) shows a number of codes compared by FoM.

Fig. 31 Early magnetic recording codes. RZ shown at (a) had poor signal-to-noise ratio and poor overwrite capability. NRZ at (b) overcame these problems but suffered error propagation. NRZI at (c) was the final result where a transition represented a one.

NRZI is not self-clocking.

14. Recording-oriented codes

Many channel codes are sufficiently versatile that they have been used in recording, electrical or optical cable transmission and radio transmission.

Others are more specialized and are intended for only one of these categories. Channel coding has roots in computers, in telemetry and in Telex services, but has for some time been considered a single subject.

These starting points will be considered here.

In magnetic recording, the first digital recordings were developed for early computers and used very simple techniques. Fig. 31(a) shows that in Return to Zero (RZ) recording, the record current has a zero state between bits and flows in one direction to record a one and in the opposite direction to record a zero. Thus every bit contains two flux changes which replay as a pair of pulses, one positive and one negative.

The signal is self-clocking because pulses always occur. The order in which they occur determines the state of the bit. RZ recording cannot erase by overwrite because there are times when no record current flows.

Additionally the signal amplitude is only one half of what is possible.

These problems were overcome in the Non-Return to Zero code shown in Fig. 31(b). As the name suggests, the record current does not cease between bits, but flows at all times in one direction or the other depending on the state of the bit to be recorded. This results in a replay pulse only when the data bits change from one state to another. As a result, if one pulse were missed, the subsequent bits would be inverted. This was avoided by adapting the coding such that the record current would change state or invert whenever a data one occurred, leading to the term Non-Return to Zero Invert or NRZI shown in Fig. 31(c). In NRZI a replay pulse occurs whenever there is a data one. Clearly neither NRZ nor NRZI is self-clocking; both require a separate clock track. Skew between tracks can only be avoided by working at low density and so the system cannot be used for digital audio. However, virtually all the codes used for magnetic recording are based on the principle of reversing the record current to produce a transition.

Fig. 32 Various communications oriented codes are shown here: at (a) frequency shift keying (FSK), at (b) phase encoding and at (c) differential quadrature phase shift keying (DQPSK).

15. Transmission-oriented codes

In cable transmission, also known as line signaling, and in telemetry, the starting point was often the speech bandwidth available in existing telephone lines and radio links. There was no DC response, just a range of frequencies available. Fig. 32(a) shows that a pair of frequencies can be used, one for each state of a data bit. The result is frequency shift keying (FSK), which is the same as would be obtained from an analog frequency modulator fed with a two-level signal. This is exactly what happens when two-level pseudo-video from a PCM adaptor is fed to a VCR and is the technique used in units such as the PCM-F1 and the PCM-1630. PCM adaptors have also been used to carry digital audio over a video landline or microwave link. Clearly FSK is DC-free and self-clocking.

Instead of modulating the frequency of the signal, the phase can be modulated or shifted instead, leading to the generic term of phase shift keying or PSK. This method is highly suited to broadcast as it’s easily applied to a radio frequency carrier. The simplest technique is selectively to invert the carrier phase according to the data bit as in Fig. 32(b).

There can be many cycles of carrier in each bit period. This technique is known as phase encoding (PE) and is used in GPS (Global Positioning System) broadcasts. The receiver in a PE system is a well-damped phase locked loop which runs at the average phase of the transmission. Phase changes will then result in phase errors in the loop and so the phase error is the demodulated signal.

16. General-purpose codes

Despite the different origins of codes, there are many similarities between them. If the two frequencies in an FSK system are one octave apart, the limiting case in which the highest data rate is obtained is when there is one half-cycle of the lower frequency or a whole cycle of the higher frequency in one bit period. This gives rise to frequency modulation (FM). In the same way, the limiting case of phase encoding is where there is only one cycle of carrier per bit. In recording, this is what is meant by phase encoding. These approaches are contrasted in Fig. 33.

Fig. 33 FM and PE contrasted. In (a) are the FM waveform and the channel bits which may be used to describe transitions in it. The FM coder is shown in (b). The PE waveform is shown in (c). As PE is polarity conscious, the channel bits must describe the signal level rather than the transitions. The coder is shown in (d).

The FM code, also known as Manchester code or bi-phase mark code, shown in Fig. 33(a) was the first practical self-clocking binary code and it’s suitable for both transmission and recording. It’s DC-free and very easy to encode and decode. It’s the code specified for the AES/EBU digital audio interconnect standard which will be described in Section 8.

In the field of recording it remains in use today only where density is not of prime importance, for example in SMPTE/EBU timecode for professional audio and video recorders and in floppy disks.

In FM there is always a transition at the bit-cell boundary which acts as a clock. For a data one, there is an additional transition at the bit-cell centre. Fig. 33(a) shows that each data bit can be represented by two channel bits. For a data zero, they will be 10, and for a data one they will be 11. Since the first bit is always one, it conveys no information, and is responsible for the density ratio of only one-half. Since there can be two transitions for each data bit, the jitter margin can only be half a bit, and the resulting FoM is only 0.25. The high clock content of FM does, however, mean that data recovery is possible over a wide range of speeds; hence the use for timecode. The lowest frequency in FM is due to a stream of zeros and is equal to half the bit rate. The highest frequency is due to a stream of ones, and is equal to the bit rate. Thus the fundamentals of FM are within a band of one octave. Effective equalization is generally possible over such a band. FM is not polarity conscious and can be inverted without changing the data.

Fig. 33(b) shows how an FM coder works. Data words are loaded into the input shift register which is clocked at the data bit rate. Each data bit is converted to two channel bits in the code guide or look-up table. These channel bits are loaded into the output register. The output register is clocked twice as fast as the input register because there are twice as many channel bits as data bits. The ratio of the two clocks is called the code rate, in this case it’s a rate one-half code. Ones in the serial channel bit output represent transitions whereas zeros represent no change. The channel bits are fed to the waveform generator which is a one-bit delay, clocked at the channel bit rate, and an exclusive-OR gate. This changes state when a channel bit one is input. The result is a coded FM waveform where there is always a transition at the beginning of the data bit period, and a second optional transition whose presence indicates a one.
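
The same process can be modelled in software. The following is a minimal Python sketch of the rate one-half mapping just described, not a description of any particular coder chip; the table and function names are illustrative.

FM_TABLE = {0: (1, 0), 1: (1, 1)}    # data bit -> (clock bit, data bit)

def fm_encode(data_bits):
    """Always a transition at the bit-cell boundary (the leading one), plus
    a second transition at the cell centre for a data one."""
    channel_bits = []
    for bit in data_bits:
        channel_bits.extend(FM_TABLE[bit])
    return channel_bits

def fm_decode(channel_bits):
    """Discard the clock bit of each pair; the second channel bit is the data."""
    return [channel_bits[i + 1] for i in range(0, len(channel_bits), 2)]

data = [0, 1, 1, 0, 1]
coded = fm_encode(data)              # [1, 0, 1, 1, 1, 1, 1, 0, 1, 1]
assert fm_decode(coded) == data

The channel bits would then drive a transition generator of the kind shown in section 13 to produce the actual FM waveform.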

Fig. 34 MFM or Miller code is generated as shown here. The minimum transition spacing is twice that of FM or PE. MFM is not always DC free as shown at (b). This can be overcome by the modification of (c) which results in the Miller2 code.

In PE there is always a transition in the centre of the bit but Fig. 33(c) shows that the transition between bits is dependent on the data values. Although its origins were in line coding, phase encoding can be used for optical and magnetic recording as it’s DC-free and self-clocking.

It has the same DR and Tw as FM, and the waveform can also be described using channel bits, but with a different notation. As PE is polarity sensitive, the channel bits determine the level of the encoded signal rather than causing a transition. Fig. 33(d) shows that the allowable channel bit patterns are now 10 and 01.

In modified frequency modulation (MFM), also known as Miller code [5], the highly redundant clock content of FM was reduced by the use of a phase-locked loop in the receiver which could flywheel over missing clock transitions. This technique is implicit in all the more advanced codes. Fig. 34(a) shows that the bit-cell centre transition on a data one was retained, but the bit-cell boundary transition is now only required between successive zeros. There are still two channel bits for every data bit, but adjacent channel bits will never be one, doubling the minimum time between transitions, and giving a DR of 1. Clearly the coding of the current bit is now influenced by the preceding bit. The maximum number of prior bits which affect the current bit is known as the constraint length Lc, measured in data-bit periods. For MFM Lc = T. Another way of considering the constraint length is that it assesses the number of data bits which may be corrupted if the receiver misplaces one transition. If Lc is long, all errors will be burst errors.
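
The MFM rule can likewise be sketched in a few lines of Python. This illustrates the encoding rule only; the choice of an assumed preceding data bit for the first cell is an assumption.

def mfm_encode(data_bits, prev_bit=1):
    """MFM: a data one keeps its bit-cell centre transition ('01'); a zero
    gets a bit-cell boundary transition ('10') only if the preceding data
    bit was also zero. prev_bit is the bit assumed to precede the block."""
    channel_bits = []
    for bit in data_bits:
        if bit:
            channel_bits.extend((0, 1))
        elif prev_bit == 0:
            channel_bits.extend((1, 0))
        else:
            channel_bits.extend((0, 0))
        prev_bit = bit
    return channel_bits

coded = mfm_encode([1, 0, 0, 1, 1, 0])
# no two adjacent channel bits are ever one, so the minimum time between
# transitions is two detents and DR = 1
assert all(not (a and b) for a, b in zip(coded, coded[1:]))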

MFM doubled the density ratio compared to FM and PE without changing the jitter performance; thus the FoM also doubles, becoming 0.5.

It was adopted for many rigid disks at the time of its development, and remains in use on double-density floppy disks. It’s not, however, DC free. Fig. 34(b) shows how MFM can have DC content under certain conditions.

17. Miller2 code

The Miller2 code is derived from MFM, and Fig. 34(c) shows that the DC content is eliminated by a slight increase in complexity.

Wherever an even number of ones occurs between zeros, the transition at the last one is omitted. This creates two additional, longer run lengths and increases the Tmax of the code. The decoder can detect these longer run lengths in order to re-insert the suppressed ones. The FoM of Miller2 is 0.5, as for MFM. Miller2 was used in early 3M stationary-head digital audio recorders, in high-rate instrumentation recorders and in the D-2 DVTR format.

18. Group codes

Further improvements in coding rely on converting patterns of real data to patterns of channel bits with more desirable characteristics using a conversion table known as a codebook. If a data symbol of m bits is considered, it can have 2^m different combinations. As it's intended to discard undesirable patterns to improve the code, it follows that the number of channel bits n must be greater than m. The number of patterns which can be discarded is:

2^n - 2^m

One name for the principle is group code recording (GCR), and an important parameter is the code rate, defined as:

R = m/n

It will be evident that the jitter margin Tw is numerically equal to the code rate, and so a code rate near to unity is desirable. The choice of patterns which are used in the code guide will be those which give the desired balance between clock content, bandwidth and DC content.

Fig. 35 shows that the upper spectral limit can be made to be some fraction of the channel bit rate according to the minimum distance between ones in the channel bits. This is known as Tmin, also referred to as the minimum transition parameter M, and in both cases it is measured in data bits T. It can be obtained by multiplying the number of channel detent periods between transitions by the code rate.

Unfortunately, codes are measured by the number of consecutive zeros in the channel bits, given the symbol d, which is always one less than the number of detent periods. In fact Tmin is numerically equal to the density ratio:

Tmin = M = DR = (d + 1) × m/n

Fig. 35 also shows that the lower spectral limit is influenced by the maximum distance between transitions Tmax. This is also obtained by multiplying the maximum number of detent periods between transitions by the code rate. Again, codes are measured by the maximum number of zeros between channel ones, k, and so:

Tmax = (k + 1) × m/n

The length of time between channel transitions is known as the run length. Another name for this class is the run-length-limited (RLL) codes.

Since m data bits are considered as one symbol, the constraint length Lc will be increased in RLL codes to at least m. It is, however, possible for a code to have run-length limits without it being a group code.

Fig. 35 A channel code can control its spectrum by placing limits on Tmin (M) and Tmax which define upper and lower frequencies. The ratio of Tmax/Tmin determines the asymmetry of waveform and predicts DC content and peak shift. Example shown is EFM.

In practice, the junction of two adjacent channel symbols may violate run-length limits, and it may be necessary to create a further code guide of symbol size 2n which converts violating code pairs to acceptable patterns.

This is known as merging and follows the golden rule that the substitute 2n symbol must finish with a pattern which eliminates the possibility of a subsequent violation. These patterns must also differ from all other symbols.

Substitution may also be used to different degrees in the same nominal code in order to allow a choice of maximum run length, e.g. 3PM. The maximum number of symbols involved in a substitution is denoted by r.

There are many RLL codes and the parameters d,k,m,n, and r are a way of comparing them.

Sometimes the code rate forms the name of the code, as in 2/3, 8/10 and EFM; at other times the code may be named after the d,k parameters, as in 2,7 code. Various examples of group codes will be given to illustrate the principles involved.
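
The relationships given above can be collected into a small calculation. The Python sketch below (with illustrative parameter sets) computes DR, Tw, Tmax and FoM directly from d, k, m and n for some of the codes discussed in this section.

def code_metrics(d, k, m, n):
    """DR = Tmin = (d + 1) * m / n, Tw = m / n, Tmax = (k + 1) * m / n,
    FoM = DR * Tw. All values are in data-bit periods T."""
    rate = m / n
    dr = (d + 1) * rate
    return dr, rate, (k + 1) * rate, dr * rate

for name, params in [("FM", (0, 1, 1, 2)), ("4/5", (0, 3, 4, 5)), ("2/3", (1, 7, 2, 3))]:
    dr, tw, tmax, fom = code_metrics(*params)
    print(f"{name}: DR={dr:.2f} Tw={tw:.2f} Tmax={tmax:.2f} FoM={fom:.2f}")

The printed values agree with the figures quoted in the text: DR = Tw = 0.5 and FoM = 0.25 for FM, 0.8/0.8/0.64 for the 4/5 code of MADI, and 1.33/0.67/0.89 for 2/3.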

Fig. 36 The code guide of the 4/5 code of MADI. Note that a one represents a transition in the channel.

19. 4/5 code of MADI

In the MADI (multi-channel audio interface) standard, a four-fifths rate code is used where groups of four data bits are represented by groups of five channel bits.

Four bits have 16 combinations whereas five bits have 32 combinations.

Clearly only 16 out of these 32 are necessary to convey all the possible data. Fig. 36 shows that the 16 channel bit patterns chosen are those which have the least DC component combined with a high clock content.

Adjacent ones are permitted in the channel bits, so there can be no violation of Tmin at the boundary of two symbols. Tmax is determined by the worst case run of zeros at a symbol boundary and as k = 3, Tmax is 16/5 = 3.2T. The code is thus described as 0,3,4,5,1 and Lc = 4T.
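
The boundary behaviour can be checked mechanically. The Python sketch below tests a candidate set of five-bit patterns against the k limit for every possible concatenation; the three patterns shown are invented for illustration and are not taken from the Fig. 36 table.

def max_zero_run(bits):
    """Longest run of consecutive zeros in a channel-bit sequence."""
    run = best = 0
    for b in bits:
        run = run + 1 if b == 0 else 0
        best = max(best, run)
    return best

def check_k(symbols, k=3):
    """With d = 0 adjacent ones are always legal, so only the zero runs
    within and across symbol boundaries need checking."""
    return all(max_zero_run(a + b) <= k for a in symbols for b in symbols)

candidates = [(1, 0, 1, 1, 0), (0, 1, 0, 1, 1), (1, 1, 1, 0, 1)]
print(check_k(candidates))   # True: no concatenation exceeds three zeros

The same check, run over the actual Fig. 36 table, would confirm k = 3 and hence Tmax = 3.2T.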

The jitter resistance of a group code is equal to the code rate. For example, in 4/5 transitions cannot be closer than 0.8 of a data bit apart and so this represents the peak to peak jitter which can be rejected. The density ratio is also 0.8 so the FoM is 0.64; an improvement over FM.

A further advantage of group coding is that it’s possible to have codes which have no data meaning. In MADI further channel bit patterns are used for packing and synchronizing. Packing is where dummy data are sent when the real data rate is low in order to keep the channel frequencies constant. This is necessary so that fixed equalization can be used. The packing pattern does not decode to data and so it can be easily discarded at the receiver. Further details of MADI can be found in Section 8.

Fig. 37 2/3 code. In (a) two data bits (m) are expressed as three channel bits (n) without adjacent transitions (d = 1). Violations are dealt with by substitution. Adjacent data pairs can break the encoding rule; in these cases substitutions are made, as shown in (b).

20. 2/3 code

Fig. 37(a) shows the code guide of an optimized code which illustrates one merging technique. This is a 1,7,2,3,2 code known as 2/3. It’s designed to have a good jitter window in order to resist peak shift distortion in disk drives, but it also has a good density ratio.

In 2/3 code, pairs of data bits create symbols of three channel bits. For bandwidth reduction, codes having adjacent ones are eliminated so that d = 1. This halves the upper spectral limit and the DR is improved accordingly:

DR = (d + 1) × m/n = 2 × 2/3 = 1.33

In Fig. 37(b) it will be seen that some group combinations cause violations. To avoid this, pairs of three-channel-bit symbols are replaced with a new six-channel-bit symbol. Lc is thus 4T, the same as for the 4/5 code. The jitter window is given by:

Tw = m/n = 2/3 of a data bit

so the FoM is 1.33 × 0.67 = 0.89.

This is an extremely good figure for an RLL code, and is some 10 percent better than the FoM of 3PM and 2,7; as a result 2/3 has been highly successful in Winchester disk drives.

Fig. 38 EFM code: d = 2, k = 10. Eight data bits produce 14 channel bits plus three packing bits. Code rate is 8/17. DR = (3 × 8)/17 = 1.41.

Fig. 39 (a) Digital sum value example calculated from EFM waveform. (b) Two successive 14T symbols without DC control (upper) give DSV of -16. Additional transition (*) results in DSV of +2, anticipating negative content of next symbol.

21. EFM code in CD

This section is concerned solely with the channel coding of CD. A more comprehensive discussion of how the coding is designed to suit the specific characteristics of an optical disk is given in Section 12. Fig. 38 shows the 8,14 code (EFM) used in the Compact Disc. Here eight bit symbols are represented by 14-bit channel symbols.

There are 256 combinations of eight data bits, whereas 14 bits have 16K combinations. Of these only 267 satisfy the criteria that the maximum run length shall not exceed 11 channel bits (k = 10) nor be less than three channel bits (d = 2). A section of the code guide is shown in the figure.

In fact 258 of the 267 possible codes are used because two unique patterns are used to synchronize the subcode blocks (see Section 12). It’s not possible to prevent violations between adjacent symbols by substitution, and extra merging bits having no data meaning are placed between the symbols. Two merging bits would be adequate to prevent violations, but in practice three are used because a further task of the merging bits is to control the DC content of the waveform. The merging bits are selected by computing the digital sum value (DSV) of the waveform. The DSV is computed as shown in Fig. 39(a). One is added to a count for every channel bit period where the waveform is in a high state, and one is subtracted for every channel bit period spent in a low state. Fig. 39(b) shows that if two successive channel symbols have the same sense of DC offset, these can be made to cancel one another by placing an extra transition in the merging period. This has the effect of inverting the second pattern and reversing its DC content. The DC-free code can be high-pass filtered on replay and the lower-frequency signals are then used by the focus and tracking servos without noise due to the DC content of the audio data. Encoding EFM is complex, but was acceptable when CD was launched because only a few encoders are necessary in comparison with the number of players.
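
The DSV bookkeeping lends itself to a short sketch. The Python below accumulates the DSV of a transition-coded waveform and then picks, from the four possible three-bit merging patterns, the one that drives the running DSV towards zero; the run-length checks that a real EFM encoder must also apply across the join are deliberately omitted here.

def accumulate_dsv(channel_bits, level=-1, dsv=0):
    """Channel bits are transition coded: a one inverts the level (+1/-1)
    before it is added to the running digital sum value."""
    for bit in channel_bits:
        if bit:
            level = -level
        dsv += level
    return dsv, level

def choose_merging(prev_dsv, prev_level, next_symbol,
                   candidates=((0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1))):
    """Return the merging pattern giving the smallest |DSV| once the next
    14-bit symbol has been sent (d and k checks omitted for brevity)."""
    def final_dsv(merge):
        d, lv = accumulate_dsv(merge, prev_level, prev_dsv)
        d, _ = accumulate_dsv(next_symbol, lv, d)
        return abs(d)
    return min(candidates, key=final_dsv)

Putting a one in the merging period inverts the following symbol and so reverses its contribution to the DSV, which is the mechanism shown in Fig. 39(b).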

Decoding is simpler as no DC content decisions are needed and a look up table can be used. The code guide was computer optimized to permit the implementation of a programmable logic array (PLA) decoder with the minimum complexity.

Owing to the inclusion of merging bits, the code rate becomes 8/17, and the density ratio becomes:

DR = (d + 1) × m/n = 3 × 8/17 = 1.41

The code is thus a 2,10,8,17,r system where r has meaning only in the context of DC control.

The constraints d and k can still be met with r = 1 because of the merging bits. The figure of merit is less useful for optical media because the straight-line frequency response does not produce peak shift and the rigid, non-contact medium has good speed stability.

The density ratio and the freedom from DC are the most important factors.

Fig. 40 Some of the 8/10 code guide for non-zero DSV symbols (two entries) and zero DSV symbols (one entry).

22. The 8/10 group code of DAT

The essential feature of the channel code of DAT is that it must be able to work well in an azimuth recording system. There are many channel codes available, but few of them are suitable for azimuth recording because of the large amount of crosstalk. The crosstalk cancellation of azimuth recording fails at low frequencies, so a suitable channel code must not only be free of DC, but it must suppress low frequencies as well. A further issue is that erasure is by overwriting, and as the heads are optimized for short-wavelength working, best erasure will be when the ratio between the longest and shortest wavelengths in the recording is small.

In Fig. 40, some examples from the 8/10 group code of DAT are shown.

Clearly a channel waveform which spends as much time high as low has no net DC content, and so all ten-bit patterns which meet this criterion of zero disparity can be found. As was seen in section 21, the term used to measure DC content is called the digital sum value (DSV).

For every bit the channel spends high, the DSV will increase by one; for every bit the channel spends low, the DSV will decrease by one. As adjacent channel ones are permitted, the window margin and DR will be 0.8, comparing favorably with the figure of 0.5 for MFM, giving an FoM of 0.64. Unfortunately there are not enough DC-free combinations in ten channel bits to provide the 256 patterns necessary to record eight data bits. A further constraint is that it’s desirable to restrict the maximum run length to improve overwrite capability and reduce peak shift. In the 8/10 code of DAT, no more than three channel zeros are permitted between channel ones, which makes the longest wavelength only four times the shortest. There are only 153 ten-bit patterns which are within this maximum run length and which have a DSV of zero.

The remaining 103 data combinations are recorded using channel patterns that have non-zero DSV. Two channel patterns are allocated to each of the 103 data patterns. One of these has a DSV of +2, the other has a DSV of -2. For simplicity, the only difference between them is that the first channel bit is inverted. The choice of which channel-bit pattern to use is based on the DSV due to the previous code.

For example, if several bytes have been recorded with some of the 153 DC-free patterns, the DSV of the code will be zero. The first data byte is then found which has no zero disparity pattern. If the +2 DSV pattern is used, the code at the end of the pattern will also become +2 DSV. When the next pattern of this kind is found, the code having the DSV of -2 will automatically be selected to return the channel DSV to zero. In this way the code is kept DC-free, but the maximum distance between transitions can be shortened. A code of this kind is known as a low-disparity code.
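
The selection mechanism can be sketched as follows. The two-entry codebook fragment is invented purely to exercise the logic (the real table is in Fig. 40), and the DSV figures attached to the invented patterns are likewise assumptions.

def encode_8_10(data_bytes, codebook):
    """codebook maps a data byte to a list of (ten_bit_pattern, dsv) pairs:
    a single zero-DSV entry, or a +2/-2 pair for the 103 bytes that have no
    zero-disparity pattern. The running DSV picks whichever alternative
    cancels the accumulated offset, keeping the recording DC-free overall."""
    running_dsv = 0
    out = []
    for byte in data_bytes:
        pattern, dsv = min(codebook[byte],
                           key=lambda entry: abs(running_dsv + entry[1]))
        running_dsv += dsv
        out.append(pattern)
    return out, running_dsv

demo_book = {
    0x00: [((1, 0, 1, 0, 1, 0, 1, 0, 1, 0), 0)],
    0x01: [((1, 1, 0, 1, 0, 1, 0, 0, 1, 0), +2),
           ((0, 1, 0, 1, 0, 1, 0, 0, 1, 0), -2)],   # first bit inverted
}
patterns, final_dsv = encode_8_10([0x01, 0x00, 0x01], demo_book)
print(final_dsv)   # 0: the second 0x01 automatically takes the -2 pattern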

Fig. 41 In (a) the truth table of the symbol encoding prior to DSV control. In (b) this circuit controls code disparity by remembering non-zero DSV in the latch and selecting a subsequent symbol with opposite DSV.

In order to reduce the complexity of encoding logic, it's usual in group codes to computer-optimize the relationship between data patterns and code patterns. This has been done for 8/10 so that the conversion can be performed in a programmed logic array. The Boolean expressions for calculating the channel bits from data can be seen in Fig. 41(a). Only DC-free or DSV = +2 patterns are produced by the logic, since the DSV = -2 pattern can be obtained by reversing the first bit. The assessment of DSV is performed in an interesting manner. If in a pair of channel bits the second bit is one, the pair must be DC-free because each detent has a different value. If the five even channel bits in a ten-bit pattern are checked for parity and the result is one, the pattern could have a DSV of 0, ±4 or ±8. If the result is zero, the DSV could be ±2, ±6 or ±10. However, the codes used are known to be either zero or +2 DSV, so the state of the parity bit discriminates between them. Fig. 41(b) shows the encoding circuit. The lower set of XOR gates calculates parity on the latest pattern to be recorded, and stores the DSV bit in the latch. The next data byte to be recorded is fed to the PLA, which outputs a ten-bit pattern. If this is a zero-disparity code, it passes to the output unchanged. If it's a DSV = +2 code, this will be detected by the upper XOR gates. If the latch is set, this means that a previous pattern had been +2 DSV, and so the first bit of the channel pattern is inverted by the XOR gate in that line, and the latch will be cleared because the DSV of the code has been returned to zero.

Decoding is simpler, because there is a direct relationship between ten bit codes and eight-bit data.

23. Tracking signals

Many recorders use track following systems to help keep the head(s) aligned with the narrow tracks used in digital media. These can operate by sensing low-frequency tones which are recorded along with the data.

Whilst this can be done by linearly adding the tones to the coder output, this requires a linear record amplifier. An alternative is to use the DC content of group codes. A code is devised where, for each data pattern, several code patterns exist having a range of DC components. By choosing groups with a suitable sequence of DC offsets, a low frequency can be added to the coded signal. This can be filtered from the data waveform on replay.

24. Convolutional RLL codes

It has been mentioned that a code can be run-length limited without being a group code. An example of this is the HDM-1 code used in DASH format (digital audio stationary head - see Section 9) recorders. The coding is best described as convolutional, and is rather complex, as Fig. 42 shows.

The DR of 1.5 is achieved by treating the input sequence of 0,1 as a single symbol which has a transition recorded at the centre of the one. The code then depends upon whether the data continue with ones or revert to zeros. The shorter run lengths are used to describe sequential ones; the longer run lengths describe sequential zeros, up to a maximum run length of 4.5 T, with a constraint length of 5.5 T. In HDM-2, a derivative, the maximum run length is reduced to 4 T with the penalty that Lc becomes 7.5 T.

The 2/4M code used by the Mitsubishi ProDigi quarter-inch format recorders is also convolutional, and has an identical density ratio and window margin to HDM-1. Tmax is eight bits. Neither HDM-1 nor 2/4M are DC-free, but this is less important in stationary head recorders and an adaptive slicer as shown in section 11 can be used. The encoding of 2/4M is just as complex as that of HDM-1 and is shown in Fig. 43.

Fig. 42 HDM-1 code of the DASH format is encoded according to the above rules.

Transitions will never be closer than 1.5 bits, nor further apart than 4.5 bits.

Two data bits form a group, and result in four channel bits where there are always two channel zeros between ones, to obtain a DR of 1.5. There are numerous exceptions required to the coding to prevent violation of the run-length limits and this requires a running sample of ten data bits to be examined. Thus the code is convolutional although it has many of the features of a substituting group code.

Fig. 43 Coding rules for 2/4M code. In (a) a running sample is made of two data bits DD and earlier and later bits. In (b) the two data bits become the four channel bits shown except when the substitutions specified are made.

25. Graceful degradation

In all the channel codes described here all data bits are assumed to be equally important and if the characteristics of the channel degrade, there is an equal probability of corruption of any bit. In digital audio samples the bits are not equally important. Loss of a high-order bit causes greater degradation than loss of a low-order bit. For applications where the bandwidth of the channel is unpredictable, or where it may improve as technology matures, a different form of channel coding has been proposed [20] where the probability of corruption of bits is not equal. The channel spectrum is divided in such a way that the least significant bits occupy the highest frequencies and the most significant bits occupy the lower frequencies. When the bandwidth of the channel is reduced, the eye pattern is degraded such that certain eyes are indeterminate, but others remain open, guaranteeing reception and clocking of high-order bits. In PCM audio the result would be sensibly the same waveform but with an increased noise level. Any error correction techniques would need to consider the unequal probability of error, possibly by assembling code words from bits of the same significance.

Fig. 44 When randomizing is used, the same pseudo-random sequence must be provided at both ends of the channel with bit synchronism.

26. Randomizing

NRZ has a DR of 1 and a jitter window of 1, and so has an FoM of 1, which is better than the group codes. It does, however, suffer from an unconstrained spectrum and poor clock content. This can be overcome using randomizing. At the encoder, a pseudo-random sequence (see Section 3) is added modulo 2 to the serial data and the resulting ones generate transitions in the channel. This process drastically reduces Tmax and reduces DC content. Fig. 44 shows that at the receiver the transitions are converted back to a serial bitstream to which the same pseudo-random sequence is again added modulo 2. As a result the random signal cancels itself out to leave only the serial data, provided that the two pseudo-random sequences are synchronized to bit accuracy.
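
A round trip through such a randomizer is easily sketched in Python. The generator polynomial and preset below are example choices, not a claim about any particular format; the only requirement is that both ends run the identical sequence in bit synchronism.

from itertools import islice

def prbs9():
    """Pseudo-random generator using the recurrence s[n] = s[n-5] XOR s[n-9]
    (a common PRBS-9 choice; an example polynomial, not the sequence of any
    particular standard). Nine-bit register preset to all ones."""
    history = [1] * 9                     # s[n-9] ... s[n-1]
    while True:
        new = history[-9] ^ history[-5]   # s[n]
        yield new
        history = history[1:] + [new]

def randomize(bits, sequence):
    """Modulo-2 addition of the data and the pseudo-random sequence."""
    return [b ^ p for b, p in zip(bits, sequence)]

# The sequence is maximal length: it repeats every 2**9 - 1 = 511 bits.
seq = list(islice(prbs9(), 1022))
assert seq[:511] == seq[511:]

data = [1] + [0] * 12 + [1, 1, 1]         # long runs: poor clock content as NRZ
sent = randomize(data, prbs9())           # ones now appear pseudo-randomly
back = randomize(sent, prbs9())           # identical generator, identical preset
assert back == data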

Randomizing with NRZI (RNRZI) is used in the D-1 DVTR. Randomizing can also be used in addition to any other channel coding or modulation scheme. It’s employed in NICAM 728 and in DAB as will be seen in the next section.

Fig. 45 A DQPSK coder conveys two bits for each modulation period. See text for details.

27. Communications codes

Since the original FSK and PSK codes were developed, advances in circuit techniques have allowed more complex signaling techniques to be used.

The common goal of all of these is to minimize the channel bandwidth needed for a given bit rate whilst offering immunity from multipath reception and interference. This is the equivalent of the DR in recording, but is measured in bits/s/Hz.

In PSK it’s possible to use more than two distinct phases. When four phases in quadrature are used, the result is quadrature phase shift keying or QPSK. Each period of the transmitted waveform can have one of four phases and therefore conveys the value of two data bits. In order to resist reflections in broadcasting, QPSK can be modified so that a knowledge of absolute phase is not needed at the receiver. Instead of encoding the signal phase, the data determine the magnitude of a phase shift. This is known as differential quadrature phase shift keying or DQPSK and is the modulation scheme used for NICAM 728 digital TV sound. A DQPSK coder is shown in Fig. 45 and as before two bits are conveyed for each transmitted period. It will be seen that one bit pattern results in no phase change. If this pattern is sustained the entire transmitter power will be concentrated in the carrier. This can cause patterning on the associated TV pictures. The randomizing technique of section 26 is used to overcome the problem. The effect is to spread the signal energy uniformly throughout the allowable channel bandwidth so that it has less energy at a given frequency. This reduces patterning on the analog video signal in addition to making the signal more resistant to multipath reception which tends to remove notches from the spectrum.

A pseudo-random sequence generator as described in Section 3 is used to generate the randomizing sequence used in NICAM. A nine-bit device has a sequence length of 511, and is preset to a standard value of all ones at the beginning of each frame. The serialized data are XORed with the LSB of the Galois field, which randomizes the output which then goes to the modulator. The spectrum of the transmission is now determined by the spectrum of the pseudo-random sequence. This was shown in Section 3 to have a spiky sinx/x envelope. The frequencies beyond the first nulls are filtered out at the transmitter, leaving the characteristic 'dead hedgehog' shape seen on a spectrum analyzer.

On reception, the de-randomizer must contain the identical ring counter which must also be set to the starting condition to bit accuracy. Its output is then added to the data stream from the demodulator. The randomizing will effectively then have been added twice to the data in modulo 2, and as a result is cancelled out leaving the original serial data.

Fig. 46 DSIS information within the TV line sync pulse.

Fig. 47 In 64-QUAM, two carriers are generated with a quadrature relationship. These are independently amplitude-modulated to eight discrete levels in four-quadrant multipliers. Adding the signals produces a QUAM signal having 64 unique combinations of amplitude and phase. Decoding requires the waveform to be sampled in quadrature like a colour TV subcarrier.

Where an existing wide-band channel having a DC response and a good SNR is being used for digital signaling, an increase in data rate can be had using multi-level signaling or m-ary coding instead of binary. This is the basis of the sound-in-syncs technique used by broadcasters to convey PCM audio along baseband video routes by inserting data bursts in the analog video sync pulses. Fig. 46 shows the four-level waveform of the UK DSIS (Dual Channel Sound in Syncs) system which is used to carry stereo audio to NICAM-equipped transmitters. Clearly the data separator must have a two-bit ADC which can resolve the four signal levels. The gain and offset of the signal must be precisely set so that the quantizing levels register precisely with the centers of the eyes.

Where the maximum data rate is needed for economic reasons, as in Digital Audio Broadcasting (DAB) or digital television broadcasts, multi-level signaling can be combined with PSK to obtain multi-level Quadrature Amplitude Modulation (QUAM). Fig. 47 shows the example of 64-QUAM. Incoming six-bit data words are split into two three-bit words and each is used to amplitude-modulate one of a pair of sinusoidal carriers which are generated in quadrature. The modulators are four-quadrant devices such that 2^3 amplitudes are available, four of which are in phase with the carrier and four antiphase. The two AM carriers are linearly added and the result is a signal which has 2^6 or 64 combinations of amplitude and phase. There is a great deal of similarity between QUAM and the colour subcarrier used in analog television in which the two colour difference signals are encoded into one amplitude- and phase-modulated waveform. On reception, the waveform is sampled twice per cycle in phase with the two original carriers and the result is a pair of eight-level signals.
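
The splitting just described can be sketched directly. Six bits become two three-bit words, each mapped to one of eight equally spaced amplitudes on the in-phase and quadrature carriers, giving one of 64 constellation points; the particular level mapping used here (offset binary to ±7 in steps of 2) is an assumption for illustration.

def three_bits_to_level(bits):
    """Map a three-bit word to one of eight symmetrical levels:
    000 -> -7, 001 -> -5, ... 111 -> +7 (an assumed, illustrative mapping)."""
    value = (bits[0] << 2) | (bits[1] << 1) | bits[2]
    return 2 * value - 7

def quam64_map(six_bits):
    """Split six bits into I and Q three-bit words and return the complex
    constellation point (I + jQ): 64 unique amplitude/phase combinations."""
    i = three_bits_to_level(six_bits[:3])
    q = three_bits_to_level(six_bits[3:])
    return complex(i, q)

points = {quam64_map([(w >> b) & 1 for b in range(5, -1, -1)]) for w in range(64)}
assert len(points) == 64   # every six-bit word lands on a distinct point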

The data are randomized by addition to a pseudo-random sequence before being fed to the modulator. The resultant spectrum has once again the sinx/x shape with nulls at multiples of the randomizer clock rate. As a result, a large number of carriers can be spaced at multiples of the randomizer clock frequency such that each carrier centre frequency coincides with the nulls of all the adjacent carriers. The result is referred to as COFDM or coded orthogonal frequency division multiplexing.

Fig. 48 (a) Modulo-2 addition with a pseudo-random code removes unconstrained runs in real data. Identical process must be provided on replay. (b) Convolutional randomizing encoder, at top, transmits exclusive-OR of three bits at a fixed spacing in the data. One-bit delay, far right, produces channel transitions from data ones. Decoder, below, has opposing one-bit delay to return from transitions to data levels, followed by an opposing shift register which exactly reverses the coding process.

28. Convolutional randomizing

The randomizing in NICAM is block based, since this matches the one millisecond block structure of the transmission. Where there is no obvious block structure, convolutional, or endless randomizing can be used. This is the approach used in the Scrambled Serial digital video interconnect which allows composite or component video of up to ten-bit wordlength to be sent serially along with digital audio channels.

In convolutional randomizing, the signal sent down the channel is the serial data waveform which has been convolved with the impulse response of a digital filter. On reception the signal is deconvolved to restore the original data. Fig. 48(a) shows that the filter is an infinite impulse response (IIR) filter which has recursive paths from the output back to the input. As it’s a one-bit filter its output cannot decay, and once excited, it runs indefinitely. The filter is followed by a transition generator which consists of a one-bit delay and an exclusive-OR gate. An input 1 results in an output transition on the next clock edge. An input 0 results in no transition.

A result of the infinite impulse response of the filter is that frequent transitions are generated in the channel which result in sufficient clock content for the phase-locked loop in the receiver.

Transitions are converted back to 1s by a differentiator in the receiver.

This consists of a one-bit delay with an exclusive-OR gate comparing the input and the output. When a transition passes through the delay, the input and the output will be different and the gate outputs a 1 which enters the deconvolution circuit.

Fig. 48(b) shows that in the deconvolution circuit a data bit is simply the exclusive-OR of a number of channel bits at a fixed spacing.

The deconvolution is implemented with a shift register having the exclusive-OR gates connected in a reverse pattern to that in the encoder.
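
The structure of Fig. 48(b) can be sketched in Python: a recursive (IIR) scrambler whose output feeds a one-bit-delay transition generator, and a receiver which differentiates the transitions and applies the matching feed-forward shift register. The tap spacings used here (4 and 9) are an assumed example; the point is that the decoder uses the same spacings feed-forward, so no separate sequence needs synchronizing.

def conv_randomize(data_bits, taps=(4, 9)):
    """Recursive (IIR) scrambler followed by a one-bit-delay transition
    generator, after the structure of Fig. 48(b); taps are an assumption."""
    history = [0] * max(taps)            # previous scrambler output bits
    level = 0                            # state of the one-bit delay
    channel = []
    for d in data_bits:
        s = d ^ history[-taps[0]] ^ history[-taps[1]]   # feedback from output
        history.append(s)
        level ^= s                       # a scrambled one produces a transition
        channel.append(level)
    return channel

def conv_derandomize(channel_levels, taps=(4, 9)):
    """Differentiate transitions back to ones, then undo the scrambling with
    a feed-forward shift register using the same tap spacings."""
    history = [0] * max(taps)
    prev_level = 0
    data = []
    for lv in channel_levels:
        s = lv ^ prev_level              # transition -> one
        prev_level = lv
        data.append(s ^ history[-taps[0]] ^ history[-taps[1]])
        history.append(s)
    return data

data = [1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0]
assert conv_derandomize(conv_randomize(data)) == data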

The same effect as block randomizing is obtained, in that long runs are broken up and the DC content is reduced, but it has the advantage over block randomizing that no synchronizing is required to remove the randomizing, although it will still be necessary for deserialization.

Clearly the system will take a few clock periods to produce valid data after commencement of transmission, but this is no problem on a permanent wired connection where the transmission is continuous.

29. Synchronizing

Once the PLL in the data separator has locked to the clock content of the transmission, a serial channel bitstream and a channel bit clock will emerge from the sampler. In a group code, it’s essential to know where a group of channel bits begins in order to assemble groups for decoding to data bit groups. In a randomizing system it’s equally vital to know at what point in the serial data stream the words or samples commence. In serial transmission and in recording, channel bit groups or randomized data words are sent one after the other, one bit at a time, with no spaces in between, so that although the designer knows that a data block contains, say, 128 bytes, the receiver simply finds 1024 bits in a row. If the exact position of the first bit is not known, then it’s not possible to put all the bits in the right places in the right bytes; a process known as deserializing. The effect of sync slippage is devastating, because a one-bit disparity between the bit count and the bitstream will corrupt every symbol in the block.

Fig. 49 Concatenation of two words can result in the accidental generation of a word which is reserved for synchronizing.

The synchronization of the data separator and the synchronization to the block format are two distinct problems, which are often solved by the same sync pattern. Deserializing requires a shift register which is fed with serial data and read out once per word. The sync detector is simply a set of logic gates which are arranged to recognize a specific pattern in the register. The sync pattern is either identical for every block or has a restricted number of versions and it will be recognized by the replay circuitry and used to reset the bit count through the block. Then by counting channel bits and dividing by the group size, groups can be deserialized and decoded to data groups. In a randomized system, the pseudo-random sequence generator is also reset. Then counting derandomized bits from the sync pattern and dividing by the wordlength enables the replay circuitry to deserialize the data words.
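
The deserializer just described amounts to a shift register watched by a recognizer for a fixed sync pattern; when the pattern is seen, the bit count is reset and subsequent bits are grouped into words. In the Python sketch below the 16-bit sync word is a made-up value, not any standard's pattern, and the false-sync protection discussed later is omitted.

SYNC = (1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1)   # made-up 16-bit pattern

def deserialize(bitstream, wordlength=8):
    """Watch the serial stream for SYNC, then cut the following bits into
    words of the given length."""
    window = []
    words, current = [], []
    synced = False
    for bit in bitstream:
        if not synced:
            window = (window + [bit])[-len(SYNC):]   # shift register
            synced = tuple(window) == SYNC           # recognizer resets the count
            continue
        current.append(bit)
        if len(current) == wordlength:
            words.append(tuple(current))
            current = []
    return words

payload = [0, 1, 0, 1, 1, 0, 1, 0] * 2
print(deserialize([0, 1, 1] + list(SYNC) + payload))   # two eight-bit words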

In digital audio the two's complement coding scheme is universal and traditionally no codes have been reserved for synchronizing; they are all available for sample values. It would in any case be impossible to reserve all ones or all zeros as these are in the centre of the range in two's complement. Even if a specific code were excluded from the recorded data so it could be used for synchronizing, this cannot ensure that the same pattern cannot be falsely created at the junction between two allowable data words. Fig. 49 shows how false synchronizing can occur due to concatenation. It’s thus not practical to use a bit pattern which is a data code value in a simple synchronizing recognizer. The problem is overcome in NICAM 728 by using the fact that sync patterns occur exactly once per millisecond or 728 bits. The sync pattern of NICAM 728 is just a bit pattern and no steps are taken to prevent it from appearing in the randomized data. If the pattern is seen by the recognizer, the recognizer is disabled for the rest of the frame and only enabled when the next sync pattern is expected. If the same pattern recurs every millisecond, a genuine sync condition exists. If it does not, there was a false sync and the recognizer will be enabled again. As a result it will take a few milliseconds before sync is achieved, but once achieved it should not be lost unless the transmission is interrupted. This is fine for the application and no-one objects to the short mute of the NICAM sound during a channel switch. The principle cannot, however, be used for recording because channel interruptions are more frequent due to head switches and dropouts and loss of several blocks of data due to a single dropout is unacceptable.

In run-length-limited codes this is not a problem. The sync pattern is no longer a data bit pattern but is a specific waveform. If the sync waveform contains run lengths which violate the normal coding limits, there is no way that these run lengths can occur in encoded data, nor any possibility that they will be interpreted as data. They can, however, be readily detected by the replay circuitry. The sync patterns of the AES/EBU interface are shown in Fig. 50. It will be seen from Fig. 33 that the maximum run length in FM coded data is one bit. The sync pattern begins with a run length of one and a half bits which is unique. There are three types of sync pattern in the AES/EBU interface, as will be seen in Section 8. These are distinguished by the position of a second pulse after the run length violation. Note that the sync patterns are also DC-free like the FM code.

Fig. 50 Sync patterns in various applications. In (a) the sync pattern of CD violates EFM coding rules, and is uniquely identifiable. In (b) the sync pattern of DASH stays within the run length of HDM-1. (c) The sync patterns of AES/EBU interconnect.

In a group code there are many more combinations of channel bits than there are combinations of data bits. Thus after all data bit patterns have been allocated group patterns, there are still many unused group patterns which cannot occur in the data. With care, group patterns can be found which cannot occur due to the concatenation of any pair of groups representing data. These are then unique and can be used for synchronizing.

In MADI, this approach is used as will be seen in Section 8. A similar approach is used in CD. Here the sync pattern does not violate a run length limit, but consists of two sequential maximum run lengths of 11 channel bit periods each as in Fig. 50(a). This pattern cannot occur in the data because the data symbols are only 14 channel bits long and the packing bit generator can be programmed to exclude accidental sync pattern generation due to concatenation.
