Oversampling
Oversampling means using a sampling rate which is greater (generally substantially
greater) than the Nyquist rate. Neither sampling theory nor quantizing theory
require oversampling to be used to obtain a given signal quality, but Nyquist
rate conversion places extremely high demands on component accuracy when a
convertor is implemented. Oversampling allows a given signal quality to be
reached without requiring very close tolerance, and therefore expensive, components.
Although it can be used alone, the advantages of oversampling are better realized
when it’s used in conjunction with noise shaping. Thus in practice the two
processes are generally used together, and the terms are often used loosely
as if they were synonymous. For a detailed and quantitative analysis of
oversampling, with exhaustive references, the serious reader is referred to
Hauser.
In section 4, where dynamic element matching was described, it was seen that
component accuracy was traded for accuracy in the time domain. Oversampling
is another example of the same principle.
FGR. 44 shows the main advantages of oversampling. At (a) it will be seen
that the use of a sampling rate considerably above the Nyquist rate allows
the anti-aliasing and reconstruction filters to be realized with a much more
gentle cut-off slope. There is then less likelihood of phase linearity and
ripple problems in the audio passband.
FGR. 44(b) shows that information in an analog signal is two dimensional and
can be depicted as an area which is the product of bandwidth and the linearly
expressed signal-to-noise ratio. The figure also shows that the same amount
of information can be conveyed down a channel with a SNR of half as much (6
dB less) if the bandwidth used is doubled, with 12 dB less SNR if bandwidth
is quadrupled, and so on, provided that the modulation scheme used is perfect.
==
FGR. 45 Information rate can be held constant when frequency doubles by removing
one bit from each word. In all cases here it’s 16F. Note the bit rate of (c) is
double that of (a). Data storage in oversampled form is inefficient.
FGR. 46 The amount of information per bit increases disproportionately as
wordlength increases. It’s always more efficient to use the longest words possible
at the lowest word rate. It will be evident that sixteen-bit PCM is 2048 times
as efficient as delta modulation. Oversampled data are also inefficient for
storage.
==
The information in an analog signal can be conveyed using some analog modulation
scheme in any combination of bandwidth and SNR which yields the appropriate
channel capacity. If bandwidth is replaced by sampling rate and SNR is replaced
by a function of wordlength, the same must be true for a digital signal as
it’s no more than a numerical analog. Thus raising the sampling rate potentially
allows the wordlength of each sample to be reduced without information loss.
Oversampling permits the use of a convertor element of shorter wordlength,
making it possible to use a flash convertor. The flash convertor is capable
of working at very high frequency and so large oversampling factors are easily
realized. The flash convertor needs no track-hold system as it works instantaneously.
The drawbacks of track hold set out in section 6 are thus eliminated. If the
sigma-DPCM convertor structure of FGR. 43 is realized with a flash convertor
element, it can be used with a high oversampling factor. FGR. 44(c) shows that
this class of convertor has a rising noise floor. If the highly oversampled
output is fed to a digital low-pass filter which has the same frequency response
as an analog anti-aliasing filter used for Nyquist rate sampling, the result
is a disproportionate reduction in noise because the majority of the noise
was outside the audio band. A high-resolution convertor can be obtained using
this technology without requiring unattainable component tolerances.
Information theory predicts that if an audio signal is spread over a much
wider bandwidth by, for example, the use of an FM broadcast transmitter, the
SNR of the demodulated signal can be higher than that of the channel it passes
through, and this is also the case in digital systems.
The concept is illustrated in FGR. 45. At (a) four-bit samples are delivered
at sampling rate F. As four bits have sixteen combinations, the information
rate is 16 F. At (b) the same information rate is obtained with three-bit samples
by raising the sampling rate to 2 F, and at (c) two-bit samples having four
combinations must be delivered at a rate of 4 F.
Whilst the information rate has been maintained, it will be noticed that the
bit-rate of (c) is twice that of (a). The reason for this is shown in FGR.
46. A single binary digit can only have two states; thus it can only convey
two pieces of information, perhaps 'yes' or 'no'. Two binary digits together
can have four states, and can thus convey four pieces of information, perhaps
'spring, summer, autumn or winter', which is two pieces of information per bit.
Three binary digits grouped together can have eight combinations, and convey
eight pieces of information, perhaps 'doh, re, mi, fah, soh, lah, te or doh', which
is nearly three pieces of information per digit. Clearly the further this principle
is taken, the greater the benefit. In a sixteen-bit system, each bit is worth
4K pieces of information. It’s always more efficient, in information-capacity
terms, to use the combinations of long binary words than to send single bits
for every piece of information. The greatest efficiency is reached when the
longest words are sent at the slowest rate which must be the Nyquist rate.
This is one reason why PCM recording is more common than delta modulation,
despite the simplicity of implementation of the latter type of convertor. PCM
simply makes more efficient use of the capacity of the binary channel.
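The arithmetic behind FGR. 45 and FGR. 46 is easily checked. The short Python sketch below is purely illustrative (it is not part of any convertor design): it confirms that the information rate stays at 16F as the wordlength falls, and that sixteen-bit PCM conveys 4096 pieces of information per bit against the two of a one-bit delta-modulation channel, a ratio of 2048:1.

    # Illustrative check of the figures quoted above.
    for bits, rate in ((4, 1), (3, 2), (2, 4)):            # FGR. 45 (a), (b), (c)
        print(bits, "bits at", rate, "F :", 2 ** bits * rate, "F")   # always 16 F

    for bits in (1, 2, 3, 16):                             # FGR. 46
        print(bits, "bits :", 2 ** bits / bits, "pieces of information per bit")

    print("PCM vs delta modulation :", (2 ** 16 / 16) / (2 ** 1 / 1))   # 2048.0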
==
FGR. 47 A recorder using oversampling in the convertors overcomes the shortcomings
of analog anti-aliasing and reconstruction filters and the convertor elements
are easier to construct; the recording is made with Nyquist rate PCM which
minimizes tape consumption.
===
FGR. 48 A conventional ADC performs each step in an identifiable location
as in (a).
With oversampling, many of the steps are distributed as shown in (b).
==
As a result, oversampling is confined to convertor technology where it gives
specific advantages in implementation. The storage or transmission system will
usually employ PCM, where the sampling rate is a little more than twice the
audio bandwidth. FGR. 47 shows a digital audio tape recorder such as DAT using
oversampling convertors. The ADC runs at n times the Nyquist rate, but once
in the digital domain the rate needs to be reduced in a type of digital filter
called a decimator. The output of this is conventional Nyquist rate PCM, according
to the tape format, which is then recorded. On replay the sampling rate is
raised once more in a further type of digital filter called an interpolator.
The system now has the best of both worlds: using oversampling in the convertors
overcomes the shortcomings of analog anti-aliasing and reconstruction filters
and the wordlength of the convertor elements is reduced making them easier
to construct; the recording is made with Nyquist rate PCM which minimizes tape
consumption. Digital filters have the characteristic that their frequency response
is proportional to the sampling rate. If a digital recorder is played at a
reduced speed, the response of the digital filter will reduce automatically
and prevent images passing the reconstruction process.
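The point can be illustrated with a short Python sketch; the filter, tap count and frequencies here are arbitrary assumptions chosen only to make the behaviour visible. A fixed set of FIR coefficients always cuts off at a fixed fraction of the sampling rate, so halving the replay rate halves the absolute cutoff frequency.

    import numpy as np

    k = np.arange(63) - 31
    h = np.sinc(k / 4) * np.hamming(63)   # cutoff at one-eighth of the sampling rate
    h /= h.sum()

    def gain_at(f_hz, fs):
        w = 2 * np.pi * f_hz / fs
        return abs(np.sum(h * np.exp(-1j * w * np.arange(h.size))))

    print(gain_at(4000, 48000))   # ~1: 4 kHz lies inside the passband at normal speed
    print(gain_at(4000, 24000))   # <<1: at half speed the same filter rejects 4 kHz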
Oversampling is a method of overcoming practical implementation problems by
replacing a single critical element or bottleneck by a number of elements whose
overall performance is what counts. As Hauser [28] properly observed, oversampling
tends to overlap the operations which are quite distinct in a conventional
convertor. Earlier in this section, the vital subjects of filtering,
sampling, quantizing and dither have been treated almost independently. FGR.
48(a) shows that it’s possible to construct an ADC of predictable performance
by taking a suitable anti-aliasing filter, a sampler, a dither source and a
quantizer and assembling them like building bricks. The bricks are effectively
in series and so the performance of each stage can only limit the overall performance.
In contrast, FGR. 48(b) shows that with oversampling the overlap of operations
allows different processes to augment one another allowing a synergy which
is absent in the conventional approach.
If the oversampling factor is n, the analog input must be bandwidth limited
to n.Fs/2 by the analog anti-aliasing filter. This unit need only have flat
frequency response and phase linearity within the audio band.
Analog dither of an amplitude compatible with the quantizing interval size
is added prior to sampling at n.Fs and quantizing.
Next, the anti-aliasing function is completed in the digital domain by a low-pass
filter which cuts off at Fs/2. Using an appropriate architecture this filter
can be absolutely phase linear and implemented to arbitrary accuracy. Such
filters are discussed in Section 3. The filter can be considered to be the
demodulator of FGR. 44 where the SNR improves as the bandwidth is reduced.
The wordlength can be expected to increase.
As Section 3 illustrated, the multiplications taking place within the filter
extend the wordlength considerably more than the bandwidth reduction alone
would indicate. The analog filter serves only to prevent aliasing into the
audio band at the oversampling rate; the audio spectrum is determined with
greater precision by the digital filter.
With the audio information spectrum now Nyquist limited, the sampling process
is completed when the rate is reduced in the decimator.
One sample in n is retained.
==
FGR. 49 A conventional DAC in (a) is compared with the oversampling implementation
in (b).
==
The excess wordlength extension due to the anti-aliasing filter arithmetic
must then be removed. Digital dither is added, completing the dither process,
and the quantizing process is completed by requantizing the dithered samples
to the appropriate wordlength which will be greater than the wordlength of
the first quantizer. Alternatively noise shaping may be employed.
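The sequence of operations can be modelled numerically. The following Python sketch is only an illustration of the steps described above; the oversampling factor, filter, dither amplitudes and wordlengths are assumptions, not those of any particular design.

    import numpy as np

    rng = np.random.default_rng(0)
    n, Fs = 4, 48000                        # oversampling factor and Nyquist-rate Fs

    def quantize(x, step):
        return np.round(x / step) * step

    # 1 Sample at n.Fs, with analog dither of about one quantizing interval
    #   added first (the analog band limit to n.Fs/2 is assumed already applied)
    t = np.arange(4096) / (n * Fs)
    analog = 0.5 * np.sin(2 * np.pi * 1000 * t)
    coarse_step = 2 / 2 ** 8                # a deliberately coarse quantizer
    coarse = quantize(analog + (rng.random(t.size) - 0.5) * coarse_step, coarse_step)

    # 2 Complete the anti-aliasing in the digital domain: a phase-linear FIR
    #   low-pass cutting off at Fs/2; its multiplications extend the wordlength
    k = np.arange(255) - 127
    h = np.sinc(k / n) * np.hamming(255)
    h /= h.sum()
    filtered = np.convolve(coarse, h, mode="same")

    # 3 Decimate: one sample in n is retained
    nyquist_rate = filtered[::n]

    # 4 Add digital dither and requantize to the final, longer output wordlength
    out_step = 2 / 2 ** 16
    out = quantize(nyquist_rate + (rng.random(nyquist_rate.size) - 0.5) * out_step,
                   out_step)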
FGR. 49(a) shows the building-brick approach of a conventional DAC. The Nyquist
rate samples are converted to analog voltages and then a steep-cut analog low-pass
filter is needed to reject the sidebands of the sampled spectrum.
FGR. 49(b) shows the oversampling approach. The sampling rate is raised in
an interpolator which contains a low-pass filter which restricts the baseband
spectrum to the audio bandwidth shown. A large frequency gap now exists between
the baseband and the lower sideband. The multiplications in the interpolator
extend the wordlength considerably and this must be reduced within the capacity
of the DAC element by the addition of digital dither prior to requantizing.
Again noise shaping may be used as an alternative.
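The interpolator itself is easily sketched. In the minimal Python illustration below the 4x factor, filter length and window are arbitrary assumptions: the rate is raised by inserting zeros between the Nyquist-rate samples, and the images are then removed by a phase-linear low-pass filter, leaving the wide gap between baseband and lower sideband described above.

    import numpy as np

    def interpolate(x, n=4, taps=63):
        up = np.zeros(x.size * n)
        up[::n] = x                              # insert n - 1 zeros per sample
        k = np.arange(taps) - (taps - 1) / 2
        h = np.sinc(k / n) * np.hamming(taps)    # low-pass at the original Fs/2
        h *= n / h.sum()                         # restore the gain lost in zero-stuffing
        return np.convolve(up, h, mode="same")   # wordlength grows in these multiplications

    x = np.sin(2 * np.pi * 1000 * np.arange(200) / 48000)   # Nyquist-rate samples
    oversampled = interpolate(x)                 # drives the DAC element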
Oversampling without noise shaping
If an oversampling convertor is considered which makes no attempt to shape
the noise spectrum, it will be clear that if it contains a perfect quantizer,
no amount of oversampling will increase the resolution of the system, since
a perfect quantizer is blind to all changes of input within one quantizing
interval, and looking more often is of no help. It was shown earlier that the
use of dither would linearize a quantizer, so that input changes much smaller
than the quantizing interval would be reflected in the output and this remains
true for this class of convertor.
FGR. 50 shows the example of a white-noise-dithered quantizer, oversampled
by a factor of four. Since dither is correctly employed, it’s valid to speak
of the unwanted signal as noise. The noise power extends over the whole baseband
up to the Nyquist limit. If the base bandwidth is reduced by the oversampling
factor of four back to the bandwidth of the original analog input, the noise
bandwidth will also be reduced by a factor of four, and the noise power will
be one-quarter of that produced at the quantizer. One-quarter noise power implies
one-half the noise voltage, so the SNR of this example has been increased by
6 dB, the equivalent of one extra bit in the quantizer. Information theory
predicts that an oversampling factor of four would allow an extension by two
bits.
This method is suboptimal in that very large oversampling factors would be
needed to obtain useful resolution extension, but it would still realize some
advantages, particularly the elimination of the steep-cut analog filter.
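In general terms, with correct dither and no noise shaping, oversampling by a factor n spreads a fixed noise power over n times the bandwidth, so filtering back to the audio band improves the SNR by 10 log10 n dB, only half a bit per octave of oversampling. The short calculation below is illustrative only.

    import math

    for n in (4, 16, 256):
        gain_db = 10 * math.log10(n)
        print(n, round(gain_db, 1), "dB, about", round(gain_db / 6.02, 2), "bits")
    # n = 4 gives 6 dB (one bit); useful extension needs very large factors.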
==
FGR. 50 In this simple oversampled convertor, 4x oversampling is used. When
the convertor output is low-pass filtered, the noise power is reduced to one-quarter,
which in voltage terms is 6 dB. This is a suboptimal method and is not used.
==
The division of the noise by a larger factor is the only route left open,
since all the other parameters are fixed by the signal bandwidth required.
The reduction of noise power resulting from a reduction in bandwidth is only
proportional if the noise is white, i.e. it has uniform power spectral density
(PSD). If the noise from the quantizer is made spectrally non-uniform, the
oversampling factor will no longer be the factor by which the noise power is
reduced. The goal is to concentrate noise power at high frequencies, so that
after low-pass filtering in the digital domain down to the audio input bandwidth,
the noise power will be reduced by more than the oversampling factor.
Noise shaping
Noise shaping dates from the work of Cutler in the 1950s. It’s a feedback
technique applicable to quantizers and requantizers in which the quantizing
process of the current sample is modified in some way by the quantizing error
of the previous sample.
When used with requantizing, noise shaping is an entirely digital process
which is used, for example, following word extension due to the arithmetic
in digital mixers or filters in order to return to the required wordlength.
It will be found in this form in oversampling DACs. When used with quantizing,
part of the noise-shaping circuitry will be analog.
As the feedback loop is placed around an ADC it must contain a DAC.
When used in convertors, noise shaping is primarily an implementation technology.
It allows processes which are conveniently available in integrated circuits
to be put to use in audio conversion. Once integrated circuits can be employed,
complexity ceases to be a drawback and low cost mass production is possible.
It has been stressed throughout this section that a series of numerical values
or samples is just another analog of an audio waveform. Section 3 showed that
analog processes such as mixing, attenuation or integration all have exact
numerical parallels. It has been demonstrated that digitally dithered requantizing
is no more than a digital simulation of analog quantizing. It should be no
surprise that in this section noise shaping will be treated in the same way.
Noise shaping can be performed by manipulating analog voltages or numbers representing
them or both.
If the reader is content to make a conceptual switch between the two, many
obstacles to understanding fall, not just in this topic, but in digital audio
in general.
The term noise shaping is idiomatic and in some respects unsatisfactory because
not all devices which are called noise shapers produce true noise. The caution
which was given when treating quantizing error as noise is also relevant in
this context. Whilst 'quantizing-error-spectrum shaping' is a bit of a mouthful,
it’s useful to keep in mind that noise shaping means just that in order to
avoid some pitfalls. Some noise shaper architectures don’t produce a
signal-decorrelated quantizing error and need to be dithered.
FGR. 51(a) shows a requantizer using a simple form of noise shaping. The low-order
bits which are lost in requantizing are the quantizing error. If the value
of these bits is added to the next sample before it’s requantized, the quantizing
error will be reduced. The process is somewhat like the use of negative feedback
in an operational amplifier except that it’s not instantaneous, but encounters
a one-sample delay.
With a constant input, the mean or average quantizing error will be brought
to zero over a number of samples, achieving one of the goals of additive dither.
The more rapidly the input changes, the greater the effect of the delay and
the less effective the error feedback will be. FGR. 51(b) shows the equivalent
circuit seen by the quantizing error, which is created at the requantizer and
subtracted from itself one sample period later. As a result the quantizing
error spectrum is not uniform, but has the shape of a raised sinewave shown
at (c), hence the term noise shaping. The noise is very small at DC and rises
with frequency, peaking at the Nyquist frequency at a level determined by the
size of the quantizing step. If used with oversampling, the noise peak can
be moved outside the audio band.
==
FGR. 51 (a) A simple requantizer which feeds back the quantizing error to
reduce the error of subsequent samples. The one-sample delay causes the quantizing
error to see the equivalent circuit shown in (b) which results in a sinusoidal
quantizing error spectrum shown in (c).
FGR. 52 By adding the error caused by truncation to the next value, the resolution
of the lost bits is maintained in the duty cycle of the output. Here, truncation
of 011 by 2 bits would give continuous zeros, but the system repeats 0111,
0111, which, after filtering, will produce a level of three-quarters of a bit.
FGR. 53 The noise-shaping system of the first generation of Philips CD players.
==
FGR. 52 shows a simple example in which two low-order bits need to be removed
from each sample. The accumulated error is controlled by using the bits which
were neglected in the truncation, and adding them to the next sample. In this
example, with a steady input, the roundoff mechanism will produce an output
of 01110111 . . . If this is low-pass filtered, the three ones and one zero
result in a level of three-quarters of a quantizing interval, which is precisely
the level which would have been obtained by direct conversion of the full digital
input. Thus the resolution is maintained even though two bits have been removed.
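The mechanism is easily reproduced in a few lines of code. The Python sketch below is an illustration, not a description of any real device: it removes two low-order bits with the error feedback of FGR. 51(a), and with a steady input of binary 011 the output repeats 0, 1, 1, 1, whose filtered value is three-quarters of the larger quantizing interval, exactly as in FGR. 52.

    def error_feedback_truncate(samples, lost_bits=2):
        error = 0
        out = []
        for s in samples:
            s += error                       # add the previous quantizing error
            q = s >> lost_bits               # truncate the low-order bits
            error = s - (q << lost_bits)     # the value of the bits just lost
            out.append(q)
        return out

    print(error_feedback_truncate([0b011] * 8))    # [0, 1, 1, 1, 0, 1, 1, 1]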
The noise-shaping technique was used in the first-generation Philips CD players
which oversampled by a factor of four. Starting with sixteen-bit PCM from the
disc, the 4x oversampling will in theory permit the use of an ideal fourteen-bit
convertor, but only if the wordlength is reduced optimally. The oversampling
DAC system used is shown in FGR. 53.
The interpolator arithmetic extends the wordlength to 28 bits, and this is
reduced to 14 bits using the error feedback loop of FGR. 51. The noise floor
rises slightly towards the edge of the audio band, but remains below the noise
level of a conventional sixteen-bit DAC which is shown for comparison.
The fourteen-bit samples then drive a DAC using dynamic element matching.
The aperture effect in the DAC is used as part of the reconstruction filter
response, in conjunction with a third-order Bessel filter which has a response
3 dB down at 30 kHz. Equalization of the aperture effect within the audio passband
is achieved by giving the digital filter which produces the oversampled data
a rising response. The use of a digital interpolator as part of the reconstruction
filter results in extremely good phase linearity.
Noise shaping can also be used without oversampling. In this case the noise
cannot be pushed outside the audio band. Instead the noise floor is shaped
or weighted to complement the unequal spectral sensitivity of the ear to noise.
Unless we wish to violate Shannon's theory, this psychoacoustically optimal
noise shaping can only reduce the noise power at certain frequencies by increasing
it at others. Thus the average log PSD over the audio band remains the same,
although it may be raised slightly by noise induced by imperfect processing.
==
FGR. 54 Perceptual filtering in a requantizer gives a subjectively improved
SNR.
===
FGR. 54 shows noise shaping applied to a digitally dithered requantizer.
Such a device might be used when, for example, making a CD master from a twenty-bit
recording format. The input to the dithered requantizer is subtracted from
the output to give the error due to requantizing. This error is filtered (and
inevitably delayed) before being subtracted from the system input. The filter
is not designed to be the exact inverse of the perceptual weighting curve because
this would cause extreme noise levels at the ends of the band. Instead the
perceptual curve is leveled off such that it cannot fall more than e.g. 40
dB below the peak.
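A behavioural sketch of the FGR. 54 structure is given below as an illustration only; the two-tap feedback filter and the wordlengths merely stand in for the real levelled-off perceptual filter and formats, whose details are not given here.

    import numpy as np

    rng = np.random.default_rng(1)

    def shaped_requantize(x, step, fir=(0.75, 0.25)):     # placeholder feedback filter
        hist = [0.0] * len(fir)                           # delayed error memory
        out = np.empty_like(x)
        for i, s in enumerate(x):
            s_mod = s - sum(c * e for c, e in zip(fir, hist))      # subtract filtered error
            d = (rng.random() - 0.5 + rng.random() - 0.5) * step   # digital (TPDF) dither
            out[i] = np.round((s_mod + d) / step) * step           # dithered requantizer
            hist = [out[i] - s_mod] + hist[:-1]                    # error due to requantizing
        return out

    x = 0.25 * np.sin(2 * np.pi * np.arange(1024) / 64)   # stands in for a longer-wordlength master
    y = shaped_requantize(x, step=2 / 2 ** 16)            # requantized to sixteen-bit steps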
Psychoacoustically optimal noise shaping can offer nearly three bits of increased
dynamic range when compared with optimal spectrally flat dither. Enhanced Compact
Discs recorded using these techniques are now available.
Noise-shaping ADCs
==
FGR. 55 The sigma DPCM convertor of FGR. 43 is shown here in more detail.
FGR. 56 In a sigma-DPCM or sigma-delta convertor, noise amplitude increases by
6 dB/octave, noise power by 12 dB/octave. In this 4x oversampling convertor, the digital
filter reduces bandwidth by four, but noise power is reduced by a factor of
16. Noise voltage falls by a factor of four or 12 dB.
==
FGR. 57 The enhancement of SNR possible with various filter orders and oversampling
factors in noise-shaping convertors.
==
FGR. 58 Stabilizing the loop filter in a noise-shaping convertor can be assisted
by the incorporation of feedforward paths as shown here.
==
The sigma DPCM convertor introduced in FGR. 43 has a natural application here
and is shown in more detail in FGR. 55. The current digital sample from the
quantizer is converted back to analog in the embedded DAC. The DAC output differs
from the ADC input by the quantizing error. The DAC output is subtracted from
the analog input to produce an error which is integrated to drive the quantizer
in such a way that the error is reduced. With a constant input voltage the
average error will be zero because the loop gain is infinite at DC. If the
average error is zero, the mean or average of the DAC outputs must be equal
to the analog input. The instantaneous output will deviate from the average
in what is called an idling pattern. The presence of the integrator in the
error feedback loop makes the loop gain fall with rising frequency. With the
feedback falling at 6 dB per octave, the noise floor will rise at the same
rate.
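A behavioural model of this loop needs only a few lines. The Python sketch below is illustrative; the four-bit quantizer and the signal level are assumptions. It integrates the error between the analog input and the embedded DAC output, and shows the output settling into an idling pattern whose average equals the constant input.

    import numpy as np

    def sigma_dpcm(analog, levels=16, full_scale=1.0):
        step = 2 * full_scale / (levels - 1)          # embedded DAC step size
        integ, dac, codes = 0.0, 0.0, []
        for x in analog:
            integ += x - dac                          # integrate the error
            code = int(np.clip(round(integ / step),
                               -(levels // 2), levels // 2 - 1))
            dac = code * step                         # DAC output fed back
            codes.append(code)
        return np.array(codes)

    codes = sigma_dpcm(np.full(64, 0.3))              # constant analog input
    print(codes[:12])                                 # idling pattern around 2.25
    print(codes.mean() * 2 / 15)                      # mean DAC level is ~0.3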
FGR. 56 shows a simple oversampling system using a sigma-DPCM convertor and
an oversampling factor of only four. The sampling spectrum shows that the noise
is concentrated at frequencies outside the audio part of the oversampling baseband.
Since the scale used here means that noise power is represented by the area
under the graph, the area left under the graph after the filter shows the noise-power
reduction. Using the relative areas of similar triangles shows that the reduction
has been by a factor of sixteen. The corresponding noise-voltage reduction
would be a factor of four, or 12 dB, which corresponds to an additional two
bits in wordlength. These bits will be available in the wordlength extension
which takes place in the decimating filter. Owing to the rise of 6 dB per octave
in the PSD of the noise, the SNR will be 3 dB worse at the edge of the audio
band.
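The similar-triangles argument can be put in numbers. Under the simplified model of FGR. 56, in which the shaped noise rises linearly to the edge of the oversampled baseband and noise power is taken as the area under that line, keeping only the bottom quarter of the band keeps a triangle with a quarter of the base and a quarter of the height; the arithmetic below is only a restatement of that model.

    import math

    n = 4                                            # oversampling factor
    power_reduction = n ** 2                         # similar triangles: 16
    print(10 * math.log10(power_reduction), "dB")                            # ~12 dB
    print(math.sqrt(power_reduction), "x reduction in noise voltage")        # 4
    print(round(10 * math.log10(power_reduction) / 6.02, 1), "extra bits")   # ~2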
One way in which the operation of the system can be understood is to consider
that the coarse DAC in the loop defines fixed points in the audio transfer
function. The time averaging which takes place in the decimator then allows
the transfer function to be interpolated between the fixed points. True signal-independent
noise of sufficient amplitude will allow this to be done to infinite resolution,
but by making the noise primarily outside the audio band the resolution is
maintained while the audio-band signal-to-noise ratio is extended. A first-order
noise shaping ADC of the kind shown can produce signal-dependent quantizing
error and requires analog dither. However, this can be outside the audio band
and so need not reduce the SNR achieved.
A greater improvement in dynamic range can be obtained if the integrator is
replaced by a higher-order loop filter.
==
FGR. 59 An example of a high-order noise-shaping ADC. See text for details.
==
The filter is in the feedback loop and so the noise will have the opposite
response to the filter and will therefore rise more steeply to allow a greater
SNR enhancement after decimation. FGR. 57 shows the theoretical SNR enhancement
possible for various loop filter orders and oversampling factors. A further
advantage of high-order loop filters is that the quantizing noise can be decorrelated
from the signal, making dither unnecessary. High-order loop filters were at
one time thought to be impossible to stabilize, but this is no longer the case,
although care is necessary. One technique which may be used is to include some
feedforward paths as shown in FGR. 58.
An ADC with high-order noise shaping was disclosed by Adams and a simplified
diagram is shown in FGR. 59. The comparator outputs of the 128 times oversampled
four-bit flash ADC are directly fed to the DAC which consists of fifteen equal
resistors fed by CMOS switches. As with all feedback loops, the transfer characteristic
cannot be more accurate than the feedback, and in this case the feedback accuracy
is determined by the precision of the DAC.
Driving the DAC directly from the ADC comparators is more accurate because
each input has equal weighting.
The stringent MSB tolerance of the conventional binary weighted DAC is then
avoided. The comparators also drive a 16 to 4 priority encoder to provide the
four-bit PCM output to the decimator. The DAC output is subtracted from the
analog input at the integrator. The integrator is followed by a pair of conventional
analog operational amplifiers having frequency-dependent feedback and a passive
network which gives the loop a fourth-order response overall. The noise floor
is thus shaped to rise at 24 dB per octave beyond the audio band. The time
constants of the loop filter are optimized to minimize the amplitude of the
idling pattern as this is an indicator of the loop stability. The four-bit
PCM output is low-pass filtered and decimated to the Nyquist frequency. The
high oversampling factor and high-order noise shaping extend the dynamic range
of the four-bit flash ADC to 108 dB at the output.
FGR. 60 In (a) the operation of a one-bit DAC relies on switched capacitors.
The switching waveforms are shown in (b).
===