IBM's PC filing system (Oct. 1986)


This description of PC DOS, a version of MS-DOS, complements last year's series on floppy disc filing systems for microcomputers.

by Frances Stubbs, Ph.D.

Disc operating systems insulate the user from the way the data is stored on the disc, and allow the manipulation of sets of data (files) simply by referring to their names and to common English words such as LIST, COPY and ERASE. Floppy discs are divided into concentric tracks, each divided into radial slices (sectors). A sector is of a fixed size for a particular system; for example, a PC DOS sector contains 512 bytes of useful data. The sector also contains the track number and sector number for identification, the sector size, and cyclic redundancy checking information, used to check whether data has been read or written correctly. A file may occupy one or a number of sectors, which may be contiguous or scattered over the disc. The user does not have to worry about this, as the DOS keeps a directory of all the files on the disc, and of which sectors each occupies.

IBM's PC DOS associates corresponding tracks on opposite sides of the disc to form a cylinder. This allows two tracks to be read from the disc without any movement of the read/write head, with a consequent increase in speed over systems which use the two sides of the disc separately, for example CP/M. A PC DOS disc contains 40 cylinders, each consisting of one track on each side of the disc, or of one track only if the disc is single-sided. Each track is divided into nine 512-byte sectors. (Earlier versions had only eight sectors per track.)
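The geometry just described fixes the raw capacity of a disc. The short Python sketch below simply multiplies the figures out; the function name `disc_capacity` is invented for illustration, and the totals include the system area as well as the file space.

```python
BYTES_PER_SECTOR = 512
CYLINDERS = 40


def disc_capacity(sides: int, sectors_per_track: int) -> int:
    """Total bytes on the disc, system area included."""
    tracks = CYLINDERS * sides          # one track per side in each cylinder
    return tracks * sectors_per_track * BYTES_PER_SECTOR


for sides in (1, 2):
    for sectors in (8, 9):
        size = disc_capacity(sides, sectors)
        print(f"{sides} side(s), {sectors} sectors/track: "
              f"{size} bytes ({size // 1024}K)")
```

For a double-sided, nine-sector disc this gives 368,640 bytes, the familiar 360K format.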

The directory

A disc operating system keeps a directory on the disc of all the files together with information about which sectors each occupies. This is analogous to the index of a book which contains the chapter titles and the page number where the start of each chapter is to be found. Usually a chapter in a book occupies a number of consecutive pages, but this is not always the case on a computer disc; sectors occupied by a file may be scattered randomly over the disc.

How PC DOS uses the file allocation table (FAT) to overcome this difficulty is shown later.

In PC DOS the directory information (and some other system information) is always found in a fixed place on the first cylinder, and the rest of the disc is available for file data. This file space is divided into clusters, analogous to pages in the book. A cluster consists of two consecutive sectors. (On a single-sided disc a cluster contains only one sector.) The clusters are numbered, starting at cluster number 2, which starts immediately after the directory.

Directory format

The directory occupies seven sectors, starting with sector 6 of the first cylinder (sector 4 on eight sector/track discs). Each entry occupies 32 bytes, made up of the fields described below. Each of the seven directory sectors can contain up to 16 entries, thus a disc may contain up to 112 files.
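The arithmetic behind these figures, and the position of any cluster, can be sketched in a few lines of Python. The sketch assumes that sectors are counted from 1 and that the count simply continues from the system area into the file space; the function names are invented for illustration, and the defaults describe a double-sided, nine-sector disc.

```python
BYTES_PER_SECTOR = 512
ENTRY_SIZE = 32        # bytes per directory entry
DIR_SECTORS = 7        # sectors occupied by the directory


def entries_per_sector() -> int:
    return BYTES_PER_SECTOR // ENTRY_SIZE


def max_root_entries() -> int:
    return DIR_SECTORS * entries_per_sector()


def cluster_first_sector(cluster: int, dir_start: int = 6,
                         sectors_per_cluster: int = 2) -> int:
    """Sector (counted from 1) at which the given cluster begins."""
    if cluster < 2:
        raise ValueError("clusters are numbered from 2")
    # Assumption: file space begins in the sector after the directory.
    first_data_sector = dir_start + DIR_SECTORS
    return first_data_sector + (cluster - 2) * sectors_per_cluster


print(entries_per_sector(), max_root_entries())   # 16 112
print(cluster_first_sector(2))                    # 13
print(cluster_first_sector(2, dir_start=4))       # 11 (eight-sector disc)
```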

File name and extension: The file name is supplied by the user when the file is written on the disc, for example by using the SAVE command, and consists of up to eight characters padded by spaces, followed optionally by an extension of up to three characters, also padded by spaces. The extension is sometimes used to denote special types of files such as BAS for Basic files, ASM for assembly language source files, and EXE for executable machine code files. Each character is represented by one byte of data, according to the ASCII code.

Attribute byte: Enables the DOS to identify files which must be protected in some way from user interference. Not all of the attribute byte is used. Counting from bit 0 as the least significant bit, if bit 1 is set (=1) the file is a hidden file and will not appear if DOS is asked to list the directory. Bit 2 set denotes a system file as distinct from a user file. A system file may also be hidden.

Time and date: These entries contain the date and time at which the file was created in coded form, for reference.


[Figure: File allocation table]

File size: Contains the size of the file in bytes, and is used to find the actual end of the file if the last cluster is not full.

First cluster number: Number of the first cluster occupied by the file. DOS then consults the file allocation table (FAT) to find where on the disc the rest of the file is stored.
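The fields described above can be picked out of a 32-byte entry as in the Python sketch below. The article does not give byte offsets or the coding of the time and date, so those used here (name at byte 0, extension at 8, attribute at 11, time at 22, date at 24, first cluster at 26, size at 28, all numbers stored low byte first) are assumptions taken from the commonly documented PC DOS layout. The attribute masks are likewise assumed: hidden is taken as bit 1 and system as bit 2, counting from bit 0. The names `ENTRY`, `DirEntry` and `parse_entry` are invented for illustration.

```python
import struct
from typing import NamedTuple

# name, extension, attribute, 10 reserved bytes, time, date, cluster, size
ENTRY = struct.Struct("<8s3sB10sHHHI")   # 32 bytes in all

ATTR_HIDDEN = 0x02   # bit 1 (assumed mask)
ATTR_SYSTEM = 0x04   # bit 2 (assumed mask)


class DirEntry(NamedTuple):
    name: str
    ext: str
    hidden: bool
    system: bool
    time: tuple          # (hour, minute, second)
    date: tuple          # (year, month, day)
    first_cluster: int
    size: int


def parse_entry(raw: bytes) -> DirEntry:
    name, ext, attr, _reserved, t, d, cluster, size = ENTRY.unpack(raw)
    # Assumed coding: 5 bits hour, 6 bits minute, 5 bits seconds/2.
    time = (t >> 11, (t >> 5) & 0x3F, (t & 0x1F) * 2)
    # Assumed coding: 7 bits years since 1980, 4 bits month, 5 bits day.
    date = (1980 + (d >> 9), (d >> 5) & 0x0F, d & 0x1F)
    return DirEntry(name.decode("ascii").rstrip(),
                    ext.decode("ascii").rstrip(),
                    bool(attr & ATTR_HIDDEN), bool(attr & ATTR_SYSTEM),
                    time, date, cluster, size)


# A made-up entry: LETTER.TXT, 1500 bytes, starting at cluster 2.
raw = ENTRY.pack(b"LETTER  ", b"TXT", 0x00, bytes(10),
                 (10 << 11) | (30 << 5), (6 << 9) | (10 << 5) | 1, 2, 1500)
print(parse_entry(raw))
```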

File allocation table

The disc contains two copies of the file allocation table (FAT), starting at sector 2 of the first cylinder, each copy occupying two sectors (only one sector is needed for eight sector/track discs). The FAT is a map of the total disc space available for files, and contains an entry of three hexadecimal digits for each cluster on the disc. The directory contains the number of the first cluster occupied by a file; the FAT entry for that cluster contains a pointer to the next cluster in the file, and so on.

If the cluster is the last one in the file, the entry is FFF. In this way the DOS can trace the sequence of clusters which make up a file, no matter where they are on the disc. The entry for unoccupied clusters is 000. If a cluster has been damaged in some way so that the DOS cannot successfully read or write to it, the entry is FF7, and PC DOS does not use clusters marked in this way.
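Tracing a file through the table can be sketched as follows. The sketch assumes the FAT has already been decoded into a Python list indexed by cluster number, so that positions 0 and 1 are unused placeholders; it tests only for the end marker FFF given above, and the name `cluster_chain` is invented for illustration.

```python
FREE = 0x000
BAD = 0xFF7
END = 0xFFF


def cluster_chain(fat: list, first_cluster: int) -> list:
    """Return the clusters of a file, in order, starting from the
    first cluster number held in its directory entry."""
    chain = []
    seen = set()
    cluster = first_cluster
    while True:
        if cluster in seen:
            raise ValueError("FAT chain loops back on itself")
        seen.add(cluster)
        chain.append(cluster)
        entry = fat[cluster]
        if entry == END:
            return chain
        if entry in (FREE, BAD):
            raise ValueError(f"chain broken at cluster {cluster}")
        cluster = entry


# Made-up table: a file occupying clusters 2, 3 and 7; cluster 5 is damaged.
fat = [0, 0, 0x003, 0x007, FREE, BAD, FREE, END]
print(cluster_chain(fat, 2))   # [2, 3, 7]
```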

If the FAT is displayed on a v.d.u. it is not immediately obvious what it means, because a byte is displayed with the most significant digit in the left-hand position. For example, hexadecimal 12 means 2 in the 'ones' column and 1 in the 'sixteens' column. This is the 'obvious' way for human beings, as it is the way we write decimal numbers. But on a floppy disc or in a computer memory the least significant digit is written first. Taking the example of file allocation table entries shown, the entries for the first file (003, 004, 005 and FFF) would be written as 300 400 500 FFF, or rather 300400500FFF, as the spaces are not there.

Now, as data is usually written to a v.d.u. as bytes (two hex digits), with the most significant digit of each byte appearing first for convenience, this would then appear as 03 40 00 05 F0 FF.
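The digit shuffling can be checked with a short Python sketch. `as_displayed` follows the description literally (reverse the digits of each entry, run them together, then show each pair most significant digit first), while `unpack_fat12` goes the other way, recovering the three-digit entries from the bytes as they would be displayed, here taken as 03 40 00 05 F0 FF. Both function names are invented for illustration, and `as_displayed` assumes an even number of entries so that the digits pair up exactly.

```python
def as_displayed(entries: list) -> str:
    """Bytes as a v.d.u. would show them for a run of FAT entries."""
    # Each entry is written least significant digit first, with no spaces.
    digits = "".join(f"{e:03X}"[::-1] for e in entries)
    # Each pair of digits is a byte, shown most significant digit first.
    return " ".join(digits[i + 1] + digits[i]
                    for i in range(0, len(digits), 2))


def unpack_fat12(raw: bytes) -> list:
    """Recover the three-digit entries from the raw bytes."""
    entries = []
    for i in range(0, len(raw) - 2, 3):
        b0, b1, b2 = raw[i], raw[i + 1], raw[i + 2]
        entries.append(b0 | ((b1 & 0x0F) << 8))   # first entry of the pair
        entries.append((b1 >> 4) | (b2 << 4))     # second entry of the pair
    return entries


print(as_displayed([0x003, 0x004, 0x005, 0xFFF]))
raw = bytes.fromhex("03 40 00 05 F0 FF")
print([f"{e:03X}" for e in unpack_fat12(raw)])
```

Every three bytes hold two entries, which is why the middle byte of each group carries one digit from each.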

Erasing files

When a file is erased from the disc the first character of the filename in the directory is changed to E5 (hexadecimal), and the FAT entries for all the clusters which it occupied are changed to zero. The directory space and the file space which the file occupied are now available for further use. (Because the FAT entries are changed to zero, the information about where the file was is lost. Thus, unlike CP/M, where this information is retained when a file is erased, PC DOS files cannot easily be 'un-erased'.)
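The two steps of an erase can be sketched in Python on in-memory copies of the directory and the FAT. The sketch assumes that the first cluster number sits at bytes 26 and 27 of the 32-byte entry, low byte first (an offset from the commonly documented layout, not given in the article), and that the FAT is a list indexed by cluster number; the name `erase` is invented for illustration.

```python
DELETED = 0xE5
FREE = 0x000
ENTRY_SIZE = 32


def erase(directory: bytearray, entry_index: int, fat: list) -> None:
    """Mark a directory entry as erased and free its clusters."""
    offset = entry_index * ENTRY_SIZE
    # Assumed offset of the first cluster number within the entry.
    cluster = int.from_bytes(directory[offset + 26:offset + 28], "little")
    directory[offset] = DELETED          # first character of the name
    # Follow the chain, zeroing each entry; the end marker FFF (or any
    # other value that is not a valid cluster number) stops the loop.
    while 2 <= cluster < len(fat):
        next_cluster = fat[cluster]
        fat[cluster] = FREE
        cluster = next_cluster


# Made-up example: one file occupying clusters 2, 3 and 7.
directory = bytearray(ENTRY_SIZE)
directory[0:11] = b"LETTER  TXT"
directory[26:28] = (2).to_bytes(2, "little")
fat = [0, 0, 0x003, 0x007, FREE, 0xFF7, FREE, 0xFFF]
erase(directory, 0, fat)
print(hex(directory[0]), fat)
```

After the call the pointers 003, 007 and FFF have all become zero, which is exactly why the chain cannot be followed again to recover the file.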

Tree directory

More recent versions of PC DOS also support a tree directory structure. In this system the directory is called the root directory, and may contain files which are themselves directories (subdirectories). An entry in the directory is flagged as a subdirectory name by having bit 4 of the attribute byte set. The subdirectory is now a file in normal file space.

It has the same structure as the root directory but can be any length, and it may itself contain further subdirectory names.

Each subdirectory contains two entries created by the DOS to allow it to determine the position of the subdirectory in the 'tree'. The first of these is an entry whose name is ".", and which points to the first cluster of the subdirectory itself, and the second is an entry whose name is "..", and which points to the first cluster of the parent directory. These two entries also have bit 4 of the attribute byte set.
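How the ".." entries locate a subdirectory in the tree can be illustrated with a small in-memory model in Python. This is only a sketch under stated assumptions: each directory is modelled as a dictionary mapping entry names to an (attribute, first cluster) pair, directories are looked up by their first cluster number, and the root directory is represented by cluster number 0, a convention the article does not state. The names `path_of` and `dirs` are invented for illustration.

```python
ATTR_SUBDIR = 0x10   # bit 4 of the attribute byte
ROOT = 0             # assumption: ".." uses 0 to stand for the root


def path_of(dirs: dict, cluster: int) -> str:
    """Work back from a subdirectory to the root by following '..'."""
    parts = []
    while cluster != ROOT:
        parent = dirs[cluster][".."][1]
        # Find the entry in the parent that points at this subdirectory.
        for name, (attr, first) in dirs[parent].items():
            if (name not in (".", "..") and attr & ATTR_SUBDIR
                    and first == cluster):
                parts.append(name)
                break
        else:
            raise ValueError("subdirectory not listed in its parent")
        cluster = parent
    return "\\" + "\\".join(reversed(parts))


# Made-up tree: \LETTERS starts at cluster 20, \LETTERS\1986 at cluster 31.
dirs = {
    0:  {"LETTERS": (0x10, 20), "README": (0x00, 2)},
    20: {".": (0x10, 20), "..": (0x10, 0), "1986": (0x10, 31)},
    31: {".": (0x10, 31), "..": (0x10, 20)},
}
print(path_of(dirs, 31))
```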

The tree directory structure allows any number of files to be stored on a disc, limited only by the space available, and is particularly useful where hard discs with their much larger storage capacity are used.

[Frances Stubbs is a freelance programmer/analyst and microcomputer enthusiast, having previously worked in Geneva at CERN. His degrees are in physics from Durham University.]


(adapted from: Wireless World, Dec. 1986)
