previous
section   upper level

Section 14: The Unix File System

Most Unix machines store their files on magnetic disk drives. A disk drive is a device that can store information by making electrical imprints on a magnetic surface. One or more heads skim close to the spinning magnetic plate, and can detect, or change, the magnetic state of a given spot on the disk. The drives use disk controllers to position the head at the correct place at the correct time to read from, or write to, the magnetic surface of the plate. It is often possible to partition a single disk drive into more than one logical storage area. This section describes how the Unix operating system deals with a raw storage device like a disk drive, and how it manages to make organized use of the space.

How the Unix file system works

Every item in a Unix file system can be defined as belonging to one of four possible types:
Ordinary files
Ordinary files can contain text, data, or program information. An ordinary file cannot contain another file, or directory. An ordinary file can be thought of as a one-dimensional array of bytes.
Directories
In a previous section, we described directories as containers that can hold files, and other directories. A directory is actually implemented as a file that has one line for each item contained within the directory. Each line in a directory file contains only the name of the item, and a numerical reference to the location of the item. The reference is called an i-number, and is an index to a table known as the i-list. The i-list is a complete list of all the storage space available to the file system.
Special files
Special files represent input/output (i/o) devices, like a tty (terminal), a disk drive, or a printer. Because Unix treats such devices as files, a degree of compatibility can be achieved between device i/o, and ordinary file i/o, allowing for the more efficient use of software. Special files can be either character special files, that deal with streams of characters, or block special files, that operate on larger blocks of data. Typical block sizes are 512 bytes, 1024 bytes, and 2048 bytes.
Links
A link is a pointer to another file. Remember that a directory is nothing more than a list of the names and i-numbers of files. A directory entry can be a hard link, in which the i-number points directly to another file. A hard link to a file is indistinguishable from the file itself. When a hard link is made, then the i-numbers of two different directory file entries point to the same inode. For that reason, hard links cannot span across file systems. A soft link (or symbolic link) provides an indirect pointer to a file. A soft link is implemented as a directory file entry containing a pathname. Soft links are distinguishable from files, and can span across file systems. Not all versions of Unix support soft links.

The I-List

When we speak of a Unix file system, we are actually referring to an area of physical memory represented by a single i-list. A Unix machine may be connected to several file systems, each with its own i-list. One of those i-lists points to a special storage area, known as the root file system. The root file system contains the files for the operating system itself, and must be available at all times. Other file systems are removable. Removable file systems can be attached, or mounted, to the root file system. Typically, an empty directory is created on the root file system as a mount point, and a removable file system is attached there. When you issue a cd command to access the files and directories of a mounted removable file system, your file operations will be controlled through the i-list of the removable file system.

The purpose of the i-list is to provide the operating system with a map into the memory of some physical storage device. The map is continually being revised, as the files are created and removed, and as they shrink and grow in size. Thus, the mechanism of mapping must be very flexible to accommodate drastic changes in the number and size of files. The i-list is stored in a known location, on the same memory storage device that it maps.

Each entry in an i-list is called an i-node. An i-node is a complex structure that provides the necessary flexibility to track the changing file system. The i-nodes contain the information necessary to get information from the storage device, which typically communicates in fixed-size disk blocks. An i-node contains 10 direct pointers, which point to disk blocks on the storage device. In addition, each i-node also contains one indirect pointer, one double indirect pointer, and one triple indirect pointer. The indirect pointer points to a block of direct pointers. The double indirect pointer points to a block of indirect pointers, and the triple indirect pointer points to a block of double indirect pointers. By structuring the pointers in a geometric fashion, a single i-node can represent a very large file.

It now makes a little more sense to view a Unix directory as a list of i-numbers, each i-number referencing a specific i-node on a specific i-list. The operating system traces its way through a file path by following the i-nodes until it reaches the direct pointers that contain the actual location of the file on the storage device.

The file system table

Each file system that is mounted on a Unix machine is accessed through its own block special file. The information on each of the block special files is kept in a system database called the file system table, and is usually located in /etc/fstab. It includes information about the name of the device, the directory name under which it will be mounted, and the read and write privileges for the device. It is possible to mount a file system as "read-only," to prevent users from changing anything.

File system quotas

Although not originally part of the Unix filesystem, quotas quickly became a widely-used tool. Quotas allow the system administrator to place limits on the amount of space the users can allocate. Quotas usually place restrictions on the amount of space, and the number of files, that a user can take. The limit can be a soft limit, where only a warning is generated, or a hard limit, where no further operations that create files will be allowed.

The command

quota
will let you know if you're over your soft limit. Adding the -v option will provide statistics about your disk usage.

File system related commands

Here are some commands related to file system usage, and other topics discussed in this section:
bdf
On HP-UX systems, reports file system usage statistics

df
On HP-UX systems, reports on free disk blocks, and i-nodes

du
Summarizes disk usage in a specified directory hierarchy

ln
Creates a hard link (default), or a soft link (with -s option)

mount, umount
Attaches, or detaches, a file system (super user only)

mkfs
Constructs a new file system (super user only)

fsck
Evaluates the integrity of a file system (super user only)

A brief tour of the Unix filesystem

The actual locations and names of certain system configuration files will differ under different implementations of Unix. Here are some examples of important files and directories under version 9 of the HP-UX operating system:
/hp-ux
The kernel program

/dev/
Where special files are kept

/bin/
Executable system utilities, like sh, cp, rm

/etc/
System configuration files and databases

/lib/
Operating system and programming libraries

/tmp/
System scratch files (all users can write here)

/lost+found/
Where the file system checker puts detached files

/usr/bin/
Additional user commands

/usr/include/
Standard system header files

/usr/lib/
More programming and system call libraries

/usr/local/
Typically a place where local utilities go

/usr/man
The manual pages are kept here

Other places to look for useful stuff

If you get an account on an unfamiliar Unix system, take a tour of the directories listed above, and familiarize yourself with their contents. Another way to find out what is available is to look at the contents of your PATH environment variable:
echo $PATH
You can use the ls command to list the contents of each directory in your path, and the man command to get help on unfamiliar utilities. A good systems administrator will ensure that manual pages are provided for the utilities installed on the system.


previous
section   upper level