CD-i  
The Compact Disc Interactive (CD-i) standard was devised as a product-specific
  application of the CD-ROM format.  
CD-i permits storage of a simultaneous combination of audio, video, graphics,
  and text, and defines specific data formats for these. In addition, titles
  can function with real time interactivity. For example, a CD-i dictionary might
  contain a word and its definitions, as well as spoken pronunciation, pictures,
  and translations into foreign languages. The CD-i standard, codified in the
  Green Book (issued in 1986), defines how each type of information is encoded
  as well as logical layout of files on the disc. It also specifies how hardware
  reads discs and decodes information.  
The CD-i data format is derived from the CD-ROM Mode 2 format. CD-i data is
  arranged in 2352-byte blocks, as in the CD-ROM/XA format. The CD-i format accepts
  either PCM or ADPCM (adaptive differential pulse-code modulation) data. The
  full-motion video (FMV) extension allows storage of 74 minutes of full-motion
  digital video and stereo audio. The MPEG-1 coding standard is used to reduce
  the video bit rate to 1.15 Mbps and the audio rate to 0.22 Mbps; lower rates
  can also be used. CD-i players can also play Video CDs coded with MPEG-1. MPEG-1
  audio is described in Section 11 and MPEG-1 video in Section 16.  
To ensure universal compatibility, dedicated hardware and interfaces are defined.
  The CD-Bridge format adds information to a CD-ROM/XA disc so it can be played
  on a CD-i player. Bridge tracks use Mode 2 data, are listed in the TOC
  as CD-ROM/XA tracks, and use a block layout identical to that of CD-i and CD-ROM/XA.
  The Photo CD is an example of a Bridge disc. The CD-i format did not enjoy
  success among its targeted consumers.  
Photo CD 
The Photo CD is used to professionally store, manipulate, and display photographic
  images. Photographs can be viewed or reproduced as high-quality prints
  using a color printer. The 35-mm version of the Photo CD provides three to
  four times the resolution required in any high-definition television (HDTV)
  standard. Conventional photographic images can be scanned to the Photo CD,
  with 2048 scan lines across the short dimension of a 35 mm frame, with 3072
  pixels on each line to yield a 3:2 aspect ratio. Data compression and decomposition
  are used to increase storage efficiency. During authoring, high-resolution
  image files are subjected to a 4:1 data reduction.  
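The scan dimensions above can be checked with a little arithmetic. The following Python sketch computes the pixel count and aspect ratio of a full-resolution scan and estimates the stored size after the 4:1 reduction. The figure of 3 bytes per pixel (24-bit color) is an assumption made only for illustration; it is not stated in the text, and the variable names are hypothetical.

```python
from fractions import Fraction

SCAN_LINES = 2048        # lines across the short dimension (from the text)
PIXELS_PER_LINE = 3072   # pixels on each line (from the text)
BYTES_PER_PIXEL = 3      # ASSUMPTION: 24-bit color, not given in the text
DATA_REDUCTION = 4       # 4:1 reduction applied during authoring (from the text)

pixels = SCAN_LINES * PIXELS_PER_LINE
aspect = Fraction(PIXELS_PER_LINE, SCAN_LINES)   # reduces to lowest terms
raw_bytes = pixels * BYTES_PER_PIXEL
stored_bytes = raw_bytes // DATA_REDUCTION

print(f"pixels per image:    {pixels:,}")
print(f"aspect ratio:        {aspect.numerator}:{aspect.denominator}")
print(f"uncompressed size:   {raw_bytes / 1e6:.1f} Mbytes (assuming 24-bit color)")
print(f"after 4:1 reduction: {stored_bytes / 1e6:.1f} Mbytes")
```

Under the 24-bit assumption, a single full-resolution image occupies several megabytes even after reduction, which is why compression and decomposition matter for storage efficiency.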
In addition, file sizes can be reduced without significant visual loss by
  using chroma sub-sampling to take advantage of limitations in human visual
  perception. The Photo CD was developed by Kodak and is defined in the Beige
  Book.  
Photo CD discs conform to the Orange Book Part II standard and are physically
  identical to CD-R audio discs; however, different data headers make them incompatible.  
Data blocks are written according to the CD-ROM/XA, Mode 2, Form 1 standard.
  Because discs use the CD Bridge format, they are playable on CD-ROM/XA players.  
Because the Orange Book Part II permits additional multisession recording
  to a disc, images can be added over time. Pacs initially recorded on a disc
  are structured as a file using the ISO 9660 structure. Subsequently recorded
  Pacs use a CD-R Volume and File Structures format, using the multisession method.
  All Pacs are addressed through the block-addressing method used by CD-ROM discs
  and defined by the ISO/IEC 10149 standard. Because the Photo CD adheres to
  the CD-ROM/XA format, audio and video data can be interleaved; in this way,
  a soundtrack can accompany visuals. The Picture CD consumer format similarly
  stores photographic files on a CD-R disc; it provides 1024 × 1536 resolution
  using JPEG compression. The disc also contains software used to view and edit
  the photographs.  
CD + G and CD + MIDI 
The CD + G and CD + MIDI formats were devised to encode graphics or MIDI software
  on CDs, in addition to regular audio data. Special hardware or software is
  required to access this data. Eight subcode channels are accumulated over 98
  frames; thus, each 98-bit subcode word is output at a 75-Hz rate. Subcode synchronization
  occupies the first two frames; thus, a subcode block contains eight channels,
  each with 96 data bits. This data block is called a packet, and each quarter of
  a packet is called a pack. A pack is generated every 3.3 ms. Only P and Q are
  reserved for audio control information. Over the length of a CD, the remaining
  channels, R to W, provide about 25 Mbytes of 8-bit data. Utilization of that
  capacity has been promoted as CD + G or CD + Graphics, and CD + MIDI, sometimes
  known as CD + G/M. The player decodes the graphics or MIDI data separately
  from the audio data. In CD + G discs, data is collected over thousands of CD
  frames to form video images or other data fields. For example, a CD + G audio
  disc can contain video images, liner notes, librettos, or other information.
  Because video images require a large amount of data for storage, CD + G images
  provide limited resolution. 
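The subcode figures quoted above follow from the frame structure, as the short Python sketch below shows. The CD frame rate of 7350 frames per second is the standard figure but is not stated in this section, and the 74-minute disc length is an assumption chosen for the capacity estimate; the variable names are illustrative.

```python
FRAMES_PER_SECOND = 7350   # standard CD frame rate (not stated in this section)
FRAMES_PER_BLOCK = 98      # frames accumulated per subcode block (from the text)
SYNC_FRAMES = 2            # frames occupied by subcode synchronization
PACKS_PER_PACKET = 4       # each quarter of a packet is a pack
GRAPHICS_CHANNELS = 6      # channels R through W
DISC_MINUTES = 74          # ASSUMPTION: full-length disc

blocks_per_second = FRAMES_PER_SECOND // FRAMES_PER_BLOCK
data_bits_per_channel = FRAMES_PER_BLOCK - SYNC_FRAMES
pack_period_ms = 1000 / (blocks_per_second * PACKS_PER_PACKET)

# Bits carried by channels R to W each second, converted to bytes.
rw_bytes_per_second = GRAPHICS_CHANNELS * data_bits_per_channel * blocks_per_second // 8
rw_bytes_per_disc = rw_bytes_per_second * DISC_MINUTES * 60

print(f"subcode block rate:    {blocks_per_second} Hz")
print(f"data bits per channel: {data_bits_per_channel}")
print(f"pack period:           {pack_period_ms:.1f} ms")
print(f"R-W capacity:          {rw_bytes_per_disc / 1e6:.1f} Mbytes over {DISC_MINUTES} minutes")
```

With these assumptions the R to W capacity comes to roughly 24 million bytes, in line with the approximate figure quoted in the text.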
In the CD + MIDI application, MIDI (Musical Instrument Digital Interface)
  information is stored in the subcode field, and output synchronously with the
  audio playback. External MIDI instruments can synchronize to the melody or
  other musical parameters of an encoded disc. The subcode capacity is sufficient
  to store up to 16 channels of MIDI information. MIDI information can be supplemented
  with graphics information; for example, music notation could be supplied. Another
  variation can encode music notation in the subcode area to allow printout
  of sheet music. CD + G/M discs are compatible with any CD player, but only
  players equipped with CD + G/M output ports can retrieve the information from
  the disc. Alternatively, an external decoder can be connected to any CD player
  with a digital output port, provided that the full subcode data is available
  from the port. CD + G is sometimes used for karaoke applications.  
CD-3 
In addition to regular 120-mm-diameter CD discs, the CD-3 format describes
  80-mm-diameter discs. The name derives from the approximately 3-in diameter.
  This small size promotes greater portability and the format is useful for short
  audio programs. A CD-3 disc holds a maximum of 20 minutes of music. Because
  a CD data track begins at the innermost radius, CD-3 discs are compatible with
  regular discs and players. Some players have concentric rings in their disc
  drawers to center discs of either diameter over the spindle. The CD-3 format is
  also used to hold over 200 Mbytes of CD-ROM data, and for CD-R and CD-RW discs.  
Video CD 
The Video CD format is an outgrowth of the CD-i standard; full-motion video
  was added to the original CD-i standard and that feature was subsequently revised
  in 1992 to form the Video CD standard. The Video CD uses the MPEG-1 coding
  standard for audio and video. The audio signal is coded with the Layer II standard
  at 44.1 kHz. A disc stores about 74 minutes of full-motion digital video and
  audio; a feature film is placed on two discs. The video decoder chip permits
  full-motion video (FMV) to be shown at either 29.97 (NTSC) or 25 (PAL/SECAM)
  frames per second at 352 pixels by 240 lines and 352 pixels by 288 lines, respectively,
  one-fourth the resolution of DVD's normal mode. The Video CD may be shown as
  a quarter-screen image. The video bit rate is 1.15 Mbps and the audio bit rate
  is 0.22 Mbps. The Video CD format is a CD-ROM/XA Bridge disc, Mode 2, Form
  2; this allows a Video CD to play on a CD-ROM drive. A Video CD disc will not
  play on a CD-Audio player, but will play in many DVD players. 
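As a rough consistency check on these bit rates, the Python sketch below compares the combined MPEG-1 stream against the user-data rate of a Mode 2, Form 2 track played at normal speed. The 2324 user bytes per Form 2 sector and the 75 sectors per second are standard CD-ROM/XA figures that are not stated in this section; multiplexing overhead is ignored, so the numbers are only approximate.

```python
VIDEO_MBPS = 1.15          # video bit rate (from the text)
AUDIO_MBPS = 0.22          # audio bit rate (from the text)
FORM2_USER_BYTES = 2324    # user bytes per Mode 2, Form 2 sector (not stated here)
SECTORS_PER_SECOND = 75    # normal (1x) playback speed (not stated here)
PLAY_MINUTES = 74          # playing time (from the text)

stream_mbps = VIDEO_MBPS + AUDIO_MBPS
channel_mbps = FORM2_USER_BYTES * SECTORS_PER_SECOND * 8 / 1e6
program_mbytes = stream_mbps * 1e6 / 8 * PLAY_MINUTES * 60 / 1e6

print(f"combined stream rate: {stream_mbps:.2f} Mbps")
print(f"Form 2 channel rate:  {channel_mbps:.4f} Mbps")
print(f"program size:         {program_mbytes:.0f} Mbytes for {PLAY_MINUTES} minutes")
```

The combined stream fits just inside the Form 2 channel rate, which suggests why these particular bit rates suit a disc played at normal speed.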
Video CD is described in the White Book; version 1.0 of this specification
  was originally developed in 1992 for karaoke discs and in 1995 it was extended
  to version 2.0, which supported interactive video. The Video CD is different
  from the CD-Video format, now abandoned. The MPEG-1 video algorithm is discussed
  in Section 16. MPEG-1 audio is discussed in Section 11.  
The Super Video CD (SVCD) is an enhanced version of the Video CD designed
  primarily for higher-quality movie playback. SVCD uses MPEG-2 coding for video
  compression to store about 70 minutes on a disc. The NTSC resolution is 480
  × 480, and PAL resolution is 480 × 576, about three-fourths that of DVD's normal
  mode. Dual mono, stereo or 5.1-channel soundtracks can be used at bit rates
  ranging from 32 kbps to 384 kbps using MPEG-1 Layer II or MPEG-2 multichannel
  codecs. Uncompressed audio cannot be stored. The maximum data rate is 2.2 Mbps
  by virtue of a 2 × drive. However, at the higher data rate, playback time is
  halved to about 35 minutes; a movie might occupy three discs. Copy Generation
  Management System (CGMS) copy protection can be enabled. SVCD's development
  was sponsored by the Chinese government as a low-cost alternative to DVD. Other
  technical aspects were derived from the Video CD format and the China Video
  CD (CVD). The SVCD specification was ratified by the China National Committee
  of Recording Standards in September 1998. SVCD is also standardized in the
  IEC 62107 document. A similar specification, the Chao-Ji ("Super")
  VCD standard, was developed to support both China Video CD and SVCD; many SVCD
  players and changers support the Chao-Ji standard and most discs use the SVCD
  format. The DSVCD (Double SVCD) format uses a smaller track pitch to permit
  longer high-quality playing times of about 60 minutes.  
Super Audio CD 
When the Compact Disc was launched in 1982, it was rightly heralded as a data
  carrier of immense storage capacity. However, over time the CD seemed increasingly
  small. Moreover, some audiophiles argued that its specifications constrained
  audio fidelity. In particular, the CD was insufficient for the large file sizes
  and high bit rates required by surround sound and high sampling frequency audio.
  In 1999, Philips and Sony introduced the high-density Super Audio CD standard,
  known as SACD. The SACD format supports discrete-channel (two-channel and multichannel)
  audio recordings, using the proprietary one-bit Direct Stream Digital (DSD)
  coding method. DSD uses a high sampling frequency and achieves a flat frequency
  response to 100 kHz and a dynamic range of 120 dB in the 0- to 20-kHz band.  
SACD players can play both SACD and CD discs. SACD is not compatible with
  the DVD or Blu-ray formats.  
The mechanical and optical properties of an SACD disc are similar to those
  of a DVD-5 disc; however, the logical layout of content, the data format, and
  the copy protection measures are different. DSD data is not playable in standard
  DVD or Blu-ray drives, but a CD layer, if present on an SACD disc, is playable.
  Some players may include decoders to accommodate multiple disc formats. Other
  data such as text and graphics (but not video) can be included on an SACD disc;
  this content follows the Blue Book "Enhanced CD" standard. The SACD
  standard is sometimes known as the Scarlet Book, published in March 1999.  
  
FIG. 31 A hybrid SACD disc contains two data layers (high-density and
  CD). The two layers are bonded together to form a disc with a thickness of
  1.2 mm. (Verbakel et al., 1998)  
  
FIG. 32 Both the high-density layer and CD layer in a hybrid SACD are
  read from one side by a laser. The high-density layer is semi-reflective, while
  the CD layer is fully reflective.  
Disc Design  
SACD discs use the same dimensions as a CD: 12-cm diameter and 1.2-mm thickness.
  The laser wavelength is 650 nm, the lens NA is 0.60, the minimum pit/land length
  is 0.40 µm, and the track pitch is 0.74 µm. (The pertinent CD figures are 780
  nm, 0.45, 0.83 µm, and 1.6 µm.) Software providers may choose from three disc
  types specified in the SACD format: single-layer, dual-layer, and hybrid disc
  construction. The single-layer disc contains one layer of high-density DSD
  content (4.7 Gbytes); for two-channel stereo, this provides about 110 minutes
  of playing time.  
The dual-layer disc contains two layers of high-density content (8.5 Gbytes
  total). The hybrid disc is a dual-layer disc that contains one layer of high-density
  DSD content (4.7 Gbytes) and one layer of Red Book compatible stereo content
  (680 Mbytes), as shown in FIG. 31. The semi-reflective high-density layer
  must be reflective (readable) at the 650-nm wavelength of SACD, and transparent
  at the 780-nm wavelength used by conventional CD players; in other words, it
  acts as a color filter. The high-density layer is 0.6 mm from the readout surface
  and the CD layer is 1.2 mm from the surface. An SACD player can read both layers,
  and a CD player can read the CD layer.  
In dual-layer discs, two 0.6-mm substrates are bonded together. In all implementations,
  there is only one data side.  
A semi-reflective layer (20 to 40% reflective and approximately 0.05 µm in
  thickness) is used on the embedded inner data layer; in some cases, a silicon-based
  dielectric film is used. A fully reflective top metal layer (at least
  70% reflective and approximately 0.05 µm in thickness) is used on the outer
  data surface. This surface is protected by an acrylic layer (approximately
  10 µm in thickness) and a printed label. Care must be taken to seal a hybrid
  disc to limit water absorption and evaporation from the substrate; unequal
  absorption between the two disc sides could cause disc warpage. The back side
  is inherently protected by a metal layer and a lacquer layer while the front
  side is nominally unprotected; thus, a transparent silicon-based coating
  (10 nm to 15 nm) is needed on the front side. A hybrid disc in which a dual pickup (650
  nm and 780 nm) is used to read both SACD and CD data is shown in FIG. 32.  
  
FIG. 33 The SACD high-density data layer is designed to carry both two-channel
  and multichannel audio data. 
The data on an SACD disc is grouped into sectors of 2064 bytes. This comprises:
  Identification Data (ID) of 4 bytes, ID Error Detection (IED) of 2 bytes, Reserved
  of 6 bytes, Main Data of 2048 bytes, and Error Detection Code (EDC) of 4 bytes.
  During encoding, following scrambling, 16 sectors form an error-correction
  code block, which is processed with a scheme using a Reed-Solomon Product Code.
  Rows of ECC blocks are interleaved and grouped into recording frames. Frames
  undergo EFMPlus modulation. Data is then placed in Physical Sectors and recorded
  to disc.  
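The sector layout described above can be tabulated to confirm that the fields sum to 2064 bytes. This Python sketch uses only the field sizes given in the text; the dictionary and variable names are illustrative.

```python
# Field sizes in bytes, in the order listed in the text.
SECTOR_FIELDS = {
    "Identification Data (ID)": 4,
    "ID Error Detection (IED)": 2,
    "Reserved": 6,
    "Main Data": 2048,
    "Error Detection Code (EDC)": 4,
}
SECTORS_PER_ECC_BLOCK = 16   # sectors forming one error-correction code block

sector_bytes = sum(SECTOR_FIELDS.values())
main_data_per_ecc_block = SECTOR_FIELDS["Main Data"] * SECTORS_PER_ECC_BLOCK
overhead_fraction = 1 - SECTOR_FIELDS["Main Data"] / sector_bytes

for name, size in SECTOR_FIELDS.items():
    print(f"{name:<28} {size:>5} bytes")
print(f"{'Total':<28} {sector_bytes:>5} bytes")
print(f"main data per ECC block: {main_data_per_ecc_block} bytes")
print(f"header/EDC overhead:     {overhead_fraction:.2%} (before Reed-Solomon parity)")
```

Each ECC block thus carries 32,768 bytes of main data before the Reed-Solomon parity is added.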
The radius of the high-density layer is segmented for different kinds of data,
  as shown in FIG. 33. The innermost radius contains the disc lead-in area,
  followed by the data area. The data area is divided into several areas, including a Master
  Table of Contents (Master TOC) containing information on tracks and timing,
  as well as text data on the title and artist. The Master TOC is stored in three
  places (sectors 510, 520, and 530) to ensure readability. The next two radial
  areas are given to two-channel and multichannel recordings (up to six channels).
  The two-channel and multichannel areas use the same basic structure. The Area
  TOC for each audio area is placed at the beginning and end of each area. They
  contain track, sampling frequency, timing, and text information about the tracks
  included in that section. The SACD standard permits up to 255 tracks. 
Audio tracks contain two types of streams: audio elementary stream and supplementary
  data elementary stream; they are multiplexed. In addition, there are sequences
  of audio frames each with a timecode, and supplementary data frames for pictures,
  text, and graphics; each frame represents 1/75 second. Following the audio
  tracks, there is an area for optional data such as text, graphics, and video.
  This data can only be accessed by a file system; its format is not specified
  in the SACD specification. The outermost radius holds the disc lead-out. 
SACD discs can be read using a hierarchical TOC, or by optionally using a
  UDF or ISO 9660 file system.  
FIG. 34 In principle, DSD coding is based on a one-bit quantization method.
  A. A one-bit quantizer produces a square wave output. B. The output square
  wave from a one-bit quantizer yields a large difference signal.  
All SACD discs incorporate an invisible watermark that is physically embedded
  in the substrate of the disc. Virtually impossible to copy, the watermark is
  used to conduct mutual authentication of the player and the disc. SACD players
  read the watermark and will reject any discs that do not bear an authentic
  watermark. Visible watermarks on the signal side of the disc in the form of
  faint images or letters may also be employed. A process called Pit Signal Processing
  (PSP) uses a controlled array of pit widths to create both invisible and visible
  watermarks; user data stored as pit/land lengths is unaffected by this watermarking.  
DSD Modulation 
Whereas all CD discs carry PCM data, all SACD discs carry Direct Stream Digital
  (DSD) data, in which audio signals are coded in one-bit pulse density form
  using sigma-delta modulation. Most conventional analog-to-digital (A/D) converters
  use sigma-delta techniques in which the input signal is upsampled to a high
  sampling frequency. The signal is passed through a decimation filter and also
  quantized for output as a PCM signal at a nominal sampling frequency of 44.1
  kHz (for CD) and up to 192 kHz (for DVD-Audio or Blu-ray). Likewise, many D/A
  converters use oversampling to increase the sampling frequency of the output
  signal, to move the image spectra away from the audio band. As in PCM systems, DSD
  begins with a high sampling frequency, but unlike PCM systems, DSD does not
  require decimation filtering and PCM quantization in the recording process;
  instead, the original sampling frequency of 2.8224 MHz is retained. One-bit
  data is recorded directly on the disc. Unlike PCM, DSD does not employ interpolation
  (oversampling) filtering in the playback process. In other words, the basic
  DSD specification is based on the direct output of a typical sigma-delta A/D
  converter at 64 × 44.1 kHz.  
  
FIG. 35 DSD coding uses a sigma-delta coding technique. A. A sigma-delta
  modulator uses negative feedback to subtract a compensation signal from the
  input. B. The output signal from a sigma-delta modulator is a pulse-density
  waveform. 
DSD uses sigma-delta modulation and noise shaping. A simple one-bit quantizer
  is shown in FIG. 34A, and the output waveform resulting from a sine-wave
  input is shown in FIG. 34B. The shaded portion shows the difference error
  between the input waveform and the quantized output waveform. An example of
  a simple sigma-delta encoder is shown in FIG. 35A. The one-bit output signal
  is also used as an error signal and delayed by one sample and subtracted from
  the input analog signal. If the input waveform, accumulated over one sampling
  period, rises above the value accumulated in the negative feedback loop during
  previous samples, the converter outputs a 1 value.  
Similarly, if the waveform falls relative to the accumulated value, a 0 value
  is output. Fully positive waveforms will generate all 1 values and fully negative
  waveforms will generate all 0 values. This method of returning output error
  data to the input signal to be subtracted as compensation data is called negative
  feedback.  
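The negative-feedback loop described above can be sketched as an idealized first-order sigma-delta modulator. This Python example is a minimal discrete-time illustration, not the modulator used in any actual DSD encoder (which would use higher-order noise shaping). It assumes an input normalized to the range -1 to +1, a comparator threshold of zero, and feedback levels of +1 and -1; the function names are hypothetical.

```python
def sigma_delta_encode(samples):
    """Encode samples in the range -1..+1 as a one-bit stream (first-order sketch)."""
    integrator = 0.0   # running sum of (input - fed-back output)
    feedback = 0.0     # previous output, delayed by one sample
    bits = []
    for x in samples:
        # Negative feedback: subtract the previous output from the input,
        # then accumulate the difference.
        integrator += x - feedback
        # One-bit quantizer: output 1 if the accumulated value is non-negative.
        bit = 1 if integrator >= 0.0 else 0
        feedback = 1.0 if bit else -1.0
        bits.append(bit)
    return bits


def pulse_density(bits):
    """Fraction of output samples that are 1 values."""
    return sum(bits) / len(bits)


for level in (1.0, 0.5, 0.0, -1.0):
    bits = sigma_delta_encode([level] * 64)
    print(f"input {level:+.1f} -> density of 1 values {pulse_density(bits):.2f}")
```

In this sketch a fully positive input yields all 1 values and a fully negative input yields all 0 values, as the text describes; intermediate levels produce a proportional density of 1 values.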
  
FIG. 36 Noise-shaping algorithms are designed to reduce the low-frequency
  (in-band) quantization error, but also increase high-frequency (out-of-band)
  content.  
FIG. 35B shows an input sine wave applied to a sigma-delta encoder and
  the resulting output signal. The pulses of the output signal reflect the magnitude
  of the input signal; this is a pulse density modulation representation in which
  a 0 value has no pulse output while a 1 value does.  
The shaded portion shows the difference error; analysis shows that the volume
  of error is the same as in a simple quantizer; however, because the integrator
  (sigma) in the sigma-delta encoder acts as a lowpass filter, the amount of
  low-frequency error is reduced while the amount of high-frequency error is
  increased, as shown in FIG. 36. The system's designers note that the ear
  is sensitive to very high-frequency signals only if they are correlated to
  lower in-band signals. At frequencies higher than 20 kHz, they state that signal-to-noise
  ratios become less important.  
Thus, they argue that the uncorrelated high-frequency shaped noise is perceptually
  unimportant. This noise shaping property can be developed with higher-order
  (perhaps 5th order) noise shaping feedback filters to further decrease error
  in the audible range of frequencies. In principle, a lowpass filter can decode
  sigma-delta signals.  
Such a low-pass filter would also remove high-frequency noise resulting from
  noise shaping. The principles of sigma-delta modulation and noise shaping are
  discussed more fully in Section 18.  
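To illustrate the idea that a lowpass filter can decode a pulse-density signal, the Python sketch below applies a moving average, about the crudest possible lowpass filter, to a one-bit pattern. This is only an illustration of the principle; actual players use analog filters, and the function name, window length, and test pattern are assumptions chosen for the example.

```python
def moving_average_decode(bits, window):
    """Average the one-bit stream over a sliding window (a crude lowpass filter).

    Each 1 value is treated as +1 and each 0 value as -1.
    """
    levels = [1.0 if b else -1.0 for b in bits]
    return [
        sum(levels[i:i + window]) / window
        for i in range(len(levels) - window + 1)
    ]


# A pulse-density pattern with three 1 values in every four samples.
bits = [1, 1, 0, 1] * 8
decoded = moving_average_decode(bits, window=4)
print(decoded[:5])
```

Because three of every four pulses are 1 values, the averaged output settles at a constant level halfway between zero and positive full scale; a longer window or a sharper filter would be needed to recover a time-varying signal cleanly.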
  
FIG. 37 DSD coding used in the SACD format requires significant noise
  shaping to reduce low-frequency noise. However, this significantly increases
  high-frequency noise above 20 kHz.  
The DSD modulation used in the SACD format uses a sampling frequency of
  2.8224 MHz. In other words, the analog signal is sampled at a 2.8224 MHz
  rate and each sample is quantized as a one-bit word. Overall, the bit rate
  is thus four times higher than on a CD. In principle, the Nyquist frequency
  is thus 1.4112 MHz. However, in practice, to remove high-frequency noise introduced
  by high-order noise shaping, the high-frequency response is limited to 100
  kHz or less by analog filters. As shown in FIG. 37, a significant noise-shaping
  component is present in the 100-kHz band, as anticipated by the SACD standard. 
The SACD standard specifies that noise power in the 100-kHz band should be
  20 dB below the standard reference level. When a 100-kHz lowpass filter is
  used, at a volume level that achieves a 100-watt output, this noise component
  is thus 1 watt or less. However, at higher volume levels, the SACD standard
  recommends that SACD players incorporate a lowpass filter with a corner frequency
  of 50 kHz and a minimum 30-dB/octave slope for use with most conventional power
  amplifiers and speakers. 
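The 1-watt figure follows directly from the decibel relationship for power. The Python sketch below works through that conversion and, as a rough guide only, estimates the attenuation of the recommended 50-kHz filter using its asymptotic slope. The asymptotic model ignores the filter's behavior near the corner frequency, and the function names are illustrative.

```python
import math


def attenuate_watts(power_watts, attenuation_db):
    """Power remaining after the given attenuation, using 10*log10 power ratios."""
    return power_watts * 10 ** (-attenuation_db / 10)


def asymptotic_attenuation_db(freq_hz, corner_hz, slope_db_per_octave):
    """Idealized attenuation above the corner: slope times octaves above corner."""
    if freq_hz <= corner_hz:
        return 0.0
    return slope_db_per_octave * math.log2(freq_hz / corner_hz)


noise_watts = attenuate_watts(100.0, 20.0)
print(f"noise 20 dB below a 100-watt level: {noise_watts:.2f} W")

for freq in (100e3, 200e3):
    att = asymptotic_attenuation_db(freq, corner_hz=50e3, slope_db_per_octave=30.0)
    print(f"idealized attenuation at {freq / 1e3:.0f} kHz: {att:.0f} dB")
```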
When making audio measurements of the SACD, a 20-kHz lowpass filter (such
  as the 3344A filter by NF Electronic Instruments with 60 dB of attenuation
  above 24.1 kHz) is recommended to avoid the effects of the shaped components
  in the higher frequency range.  
The 2.8224 MHz (64 × 44.1 kHz) sampling frequency of the one-bit DSD signal
  can be converted to a variety of standard PCM sampling frequencies with integer
  computation. Division by 64 and 32 yields 44.1 and 88.2 kHz. Following multiplication
  by 5, division by 441, 294, and 147 yields 32, 48, and 96 kHz, respectively.
  Also, an extended sampling frequency of 128 × 44.1 kHz is possible.  
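The integer relationships above can be verified with exact arithmetic. This Python sketch checks each division and raises an error if any ratio is not a whole number; the helper name is illustrative.

```python
DSD_RATE_HZ = 64 * 44_100   # 2,822,400 Hz


def divide_exact(numerator, divisor):
    """Integer division that fails loudly if there is a remainder."""
    quotient, remainder = divmod(numerator, divisor)
    if remainder:
        raise ValueError(f"{numerator} is not divisible by {divisor}")
    return quotient


# Direct division of the DSD rate.
direct = {d: divide_exact(DSD_RATE_HZ, d) for d in (64, 32)}

# Multiplication by 5 followed by division.
via_times_5 = {d: divide_exact(DSD_RATE_HZ * 5, d) for d in (441, 294, 147)}

# Extended sampling frequency.
extended = 128 * 44_100

for divisor, rate in direct.items():
    print(f"{DSD_RATE_HZ} / {divisor} = {rate} Hz")
for divisor, rate in via_times_5.items():
    print(f"{DSD_RATE_HZ} * 5 / {divisor} = {rate} Hz")
print(f"128 x 44.1 kHz = {extended} Hz")
```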
DST Lossless Coding 
A lossless coding algorithm known as Direct Stream Transfer (DST) is employed
  in the SACD format to more than double effective disc capacity. Without compression,
  eight DSD channels (six multichannel plus a stereo mix) on a 4.7-Gbyte data layer
  would allow a playing time of only 27 minutes, 45 seconds. 
With DST, a 74-minute playing time is accommodated, effectively increasing
  storage capacity to about 12 Gbytes.  
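These playing times can be reproduced from the DSD bit rate. The Python sketch below assumes that 4.7 Gbytes means 4.7 × 10^9 bytes and ignores sector, TOC, and other overhead, so the results are approximate; the function and variable names are illustrative.

```python
DSD_RATE_HZ = 2_822_400   # one-bit samples per second, per channel
LAYER_BYTES = 4.7e9       # ASSUMPTION: decimal Gbytes, overhead ignored


def playing_time_seconds(channels, capacity_bytes=LAYER_BYTES):
    """Uncompressed DSD playing time for the given channel count."""
    bytes_per_second = channels * DSD_RATE_HZ / 8
    return capacity_bytes / bytes_per_second


uncompressed_s = playing_time_seconds(8)
minutes, seconds = divmod(int(uncompressed_s), 60)

target_s = 74 * 60                          # playing time with DST (from the text)
required_gain = target_s / uncompressed_s   # coding gain needed to reach 74 minutes
effective_gbytes = LAYER_BYTES * required_gain / 1e9

print(f"8 channels, uncompressed: {minutes} min {seconds} s")
print(f"coding gain needed for 74 minutes: {required_gain:.2f}")
print(f"effective capacity: {effective_gbytes:.1f} Gbytes")
print(f"2 channels, uncompressed: {playing_time_seconds(2) / 60:.0f} min")
```

The same function gives roughly 111 minutes for two channels, consistent with the playing time quoted earlier for a single-layer stereo disc.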
As with other lossless compression methods, the compression achieved by DST
  depends on the audio signal itself. In one survey, DST yielded a coding gain
  of 2.4 to 2.5 for pop music, and 2.6 to 2.7 for classical music.  
  
FIG. 38 Direct Stream Transfer (DST) can be used for lossless coding of
  DSD data using an adaptive prediction filter and entropy (arithmetic) coding.
  A. DST encoder. B. DST decoder.  
The DST encoder and decoder are shown in FIG. 38.  
DST uses data framing, an adaptive prediction filter and entropy coding. The
  use of lossless coding can be decided on a frame-by-frame basis; the flag information
  for the decoder is contained in each frame header. An area without any DST
  frames can be marked accordingly in the area TOC. DST coding yields variably
  sized frames; a buffer model is used to output a fixed bit rate. The theory
  of lossless coding is discussed in Section 10.  
Player Design 
SACD players play back both SACD and CD discs. Their design is similar to
  that of CD players. Dual laser pickups are required to operate at both the
  SACD 650-nm wavelength and the CD 780-nm wavelength. In some player designs,
  a single processor accepts the amplified RF signal from the dual pickup and
  performs clock signal extraction and synchronization, as well as demodulation
  and error correction for both CD and SACD signals. A servo chip controls the
  pickup and motor systems. CD data is passed along to the digital filter. SACD
  data is applied to the DSD decoder; this circuit first reads the invisible
  watermark, then intermittent data is rearranged and ordered in a buffer memory
  according to a master clock.  
This chip also reads subcode data, including TOC information such as track
  number, time, and text data.  
DSD data is output as a one-bit signal at a frequency of 2.8224 MHz and applied
  to a pulse-density modulation processor in which the data signal is converted
  to a complementary signal in which each 1 value creates a wide pulse and each
  0 value creates a narrow pulse. A current pulse D/A converter converts the
  voltage pulse output into a current pulse. This current pulse signal is passed
  through an analog lowpass filter to create the analog audio waveform. In some
  designs, this filter's response measures -3 dB at 50 kHz. 