Linux System Administrator's Survival Guide
Chapter 18
Filesystems and Disks
One of a system administrator's most important tasks is managing the Linux system's hard disks and filesystems. Keeping both in proper order helps the Linux operating system perform at its best. This task involves doing a set of actions regularly. This chapter describes the actions involved in keeping the Linux filesystems and the hard disks they reside on in peak condition. (This chapter does not look at the steps involved in adding new hard disks to your Linux system, as this was covered in Chapter 8, "Hard Disks.")
The general actions a system administrator must perform to keep filesystems performing smoothly are the following:
- Check filesystems for corrupt sectors
- Check filesystems for integrity and correct i-node tables
- Check file permissions and ownerships to ensure proper access
- Make filesystems (local and remote) available to users as necessary
- Manage the Linux system's disk space
- Perform regular backups for data security
Although some of these actions are performed automatically every time Linux boots (such as checking the filesystem for corruption), you should know how to force these processes manually, as well as know what they do and how to correct problems that may arise. With the exception of performing backups for data security (covered in more detail in Chapter 22, "Backup, Backup, Backup!") and checking file permissions (covered in Chapter 17), this chapter looks at all these actions.
Mounting and Unmounting Filesystems
To understand why filesystems must be mounted, you have to know how Linux organizes the disks and filesystems that make up the entire directory structure. Linux uses a single directory structure, regardless of how many disks and disk partitions are involved. Each partition's filesystem must be part of the larger directory structure. The entire directory tree has only one root directory, and other filesystems are attached at lower levels.
To visualize this concept, imagine a standard Linux filesystem with the root partition (/) at the top; all the other partitions branch off from the root partition. The root partition is on a partition of the first hard disk. Usually, that disk also has other directories on it, such as /dev, /lib, /etc, and so on. Essentially, all the directories needed to start a minimal Linux operating system have to be on the primary partition.
However, suppose you want to have a very large /usr filesystem because you intend to support a lot of users with very large database files. Your primary disk partition may not be able to contain all the files you want to save, so you can use another partition (on the same or a different hard disk) and format it as a Linux filesystem, and then attach it to the root filesystem at the /usr directory point. Whenever a user moves from the /bin directory (on the first partition) to /usr/tparker, for example, the user moves to another partition or disk. The move across partitions is completely unnoticeable to the user because the two partitions look like a single directory tree. The /usr directory is said to be mounted on the root directory.
More accurately, the partition that holds the /usr filesystem is mounted on the root filesystem at the /usr location. It could just as easily have been mounted at the /home location. Linux doesn't care where you mount a filesystem as long as the mount point is a directory that already exists in the root filesystem (so /usr or /home, depending on where you mount the filesystem, would have to be an empty directory on the root filesystem) and no conflict exists between directory names. If the partition were mounted at /home, the user would access /home/tparker instead of /usr/tparker.
You can stretch this concept even further. Suppose one user, such as /usr/tparker, has to access a very large library of pictures stored on a CD-ROM drive. You can attach the filesystem on the CD-ROM to the existing filesystem as /usr/tparker/cd_rom, for example, with the operating system knowing to move to the CD-ROM whenever the user accesses that directory. Again, this transition is unnoticeable to the user. This example shows that you can mount a filesystem onto another mounted filesystem.
Linux lets you mount partitions anywhere from any source, as long as they fit into the overall filesystem structure. The only place you cannot mount a filesystem is at the root directory location, which must exist on the root filesystem.
Linux also allows you to mount some other operating system filesystems, such as a DOS or OS/2 filesystem, onto your Linux filesystem. Essentially, you let Linux know where to access the filesystem (/dos, /usr/dos, or some other directory name) and tell Linux the type of filesystem, and it lets you move through that filesystem's directories and files as you would any Linux directory. You can mount a filesystem in only one location at a time; you cannot mount one filesystem (of any kind) as both /usr and /home, for example.
All these filesystem mounting options make Linux very versatile. If a friend has a hard drive full of data you want to access and the data is a filesystem Linux can understand, your friend can bring the hard drive to your machine and attach it to your controller, and then you can mount your friend's filesystem anywhere that is available on your existing filesystem. You can mount any device that can hold a filesystem, including CD-ROMs, floppy disks, magneto-optical drives, removable cartridges, and so on.
To mount a filesystem, you use the mount command. The general syntax of the mount command is
mount device_name mount_point
where device_name is the name of the device (partition, hard disk, CD-ROM, and so on) and mount_point is the name of the directory to which you want to mount the device. For example, to mount the partition /dev/sda4 (fourth partition on the first SCSI hard disk) to the /usr directory, issue the following command:
mount /dev/sda4 /usr
To mount a CD-ROM filesystem (such as /dev/cdrom) on the directory /cdrom (assuming the directory exists), use the following command:
mount /dev/cdrom /cdrom
Alternatively, you can use the following command to mount a CD-ROM filesystem, because you can mount a filesystem anywhere:
mount /dev/cdrom /usr/tparker/data/pictures/cd-rom
<NOTE>Only root can mount and unmount filesystems. Although it's possible to enable users to mount filesystems, this practice can lead to security problems and is therefore generally discouraged. Log in as the superuser to mount or unmount filesystems.<NOTE>
You can mount a filesystem as read-only so that any attempt to write to the filesystem generates an error message. This feature is useful to prevent frustrated users of a mounted CD-ROM filesystem, for example, or if you want to make sure nobody writes to a mounted filesystem on another partition (which may contain data you don't want to be corrupted). To mount a filesystem as read-only, use the -r option:
mount -r /dev/cdrom /cdrom
Some older versions of UNIX and Linux allow the -r option to be at the end of the command line:
mount /dev/cdrom /cdrom -r
When one of the mounted filesystems is disconnected (so users cannot access the directories), the filesystem has been unmounted. Any mounted filesystem can be unmounted except for the root filesystem, which is always active. To unmount a filesystem, use the umount command. (One of the most common errors for system administrators is typing this command as unmount instead of umount). The umount command takes the name of either the device or the mount point. To unmount the CD-ROM mounted in the last example, you can use either of the following two commands:
umount /dev/cdrom
umount /cdrom
You don't have to unmount all filesystems before you shut down the system, as Linux can handle the unmounting as part of the shutdown process.
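Before mounting or unmounting by hand, it can help to confirm whether a directory is currently in use as a mount point. The following is only a minimal sketch, not a standard utility: the function name is_mounted is made up for illustration, and it assumes the current mount table is readable at /proc/mounts (some systems keep the same information in /etc/mtab). The table is passed as a parameter so that the sketch can be tried on sample data without root access.

```shell
# is_mounted DIR [TABLE]
# Succeeds (exit status 0) if DIR is listed as a mount point in TABLE.
# TABLE defaults to /proc/mounts (an assumption; /etc/mtab uses the same layout).
is_mounted() {
    awk -v dir="$1" '$2 == dir { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Try it on a small sample table in the "device mount_point type options" layout.
sample=$(mktemp)
cat > "$sample" <<'EOF'
/dev/sda1 / ext2 rw 0 0
/dev/sda4 /usr ext2 rw 0 0
/dev/cdrom /cdrom iso9660 ro 0 0
EOF

for d in /usr /cdrom /home; do
    if is_mounted "$d" "$sample"; then
        echo "$d is a mount point"
    else
        echo "$d is not a mount point"
    fi
done
```

On a live system, calling is_mounted /cdrom with no second argument checks the real mount table.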
Mounting Filesystems Automatically with the /etc/fstab File
Previously mounted filesystems are not necessarily mounted again automatically when the system restarts (other than root, which is always mounted automatically when the system starts). When Linux boots, it must know where to find the filesystems to be mounted. Linux uses the /etc/rc initialization file (run when Linux boots) to execute the command:
mount -av
When Linux executes this command, it knows to read the file /etc/fstab to find out which filesystems have to be mounted and where they should be mounted.
<NOTE>You also can use the following command to mount all the filesystems in the /etc/fstab file:
mountall
Not all versions of Linux support the mountall command, but all should support the mount command line.<NOTE>
Each line in the /etc/fstab file follows this format:
device mount_location filesystem_type options dump_frequency pass_number
This section looks at a few of these parameters in more detail and lists their valid values. In practice, the /etc/fstab file is an ASCII file composed of several columns. The following is a sample /etc/fstab file:
/dev/sda1 / ext2 defaults 1 1
/dev/sda2 /usr ext2 defaults 1 1
/dev/sda3 /usr/data ext2 defaults 1 1
/dev/cdrom /cdrom iso9660 ro 1 1
/dev/sda4 /dos msdos defaults 1 1
/dev/sdb1 /data ext2 defaults 1 1
This rather complex-looking table is quite easy to understand. The first column gives the device name, followed by the mount point, the type of filesystem, and instructions about how to treat the filesystem. For example, the root filesystem in the above table is /dev/sda1 and is a typical ext2 Linux filesystem. The CD-ROM device is mounted as /cdrom; it is an ISO 9660 (CD-ROM) filesystem and is mounted as read-only. The DOS filesystem is mounted as /dos.
Linux mounts the filesystems in the order they are given in /etc/fstab. Note in the preceding sample file that the entry that mounts /usr/data follows the entry that mounts /usr. If the /usr/data entry came before the /usr entry, the mount wouldn't work because the /usr directory wouldn't yet exist. If one mount fails, Linux ignores it and executes the rest of the entries. If a mount of a directory that is used further down the file fails, the dependent mounts fail too. For example, if the mount of /usr fails for some reason, the mount of /usr/data fails too, as /usr doesn't exist.
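Because the order of entries matters, a quick check of a file in /etc/fstab format can catch a nested mount point that is listed too early. The following is a rough sketch under stated assumptions: check_fstab_order is a hypothetical helper name, and it looks only at the mount point column of whichever file it is given.

```shell
# check_fstab_order FILE
# Prints a warning for each mount point that is listed before the mount
# point it sits under. Prints nothing if the order is safe.
check_fstab_order() {
    awk '
        $1 ~ /^#/ || NF < 2 { next }   # skip comments and blank lines
        $2 !~ /^\// { next }           # skip entries with no directory (swap)
        {
            parent = ($2 == "/") ? "/" : $2 "/"
            for (i = 1; i <= n; i++)
                if (seen[i] != $2 && index(seen[i], parent) == 1)
                    print seen[i] " is listed before " $2
            seen[++n] = $2
        }
    ' "$1"
}

# Sample file with /usr/data wrongly placed ahead of /usr.
bad=$(mktemp)
cat > "$bad" <<'EOF'
/dev/sda1 /         ext2 defaults 1 1
/dev/sda3 /usr/data ext2 defaults 1 1
/dev/sda2 /usr      ext2 defaults 1 1
EOF
check_fstab_order "$bad"
```

Run against the sample /etc/fstab shown earlier, the function prints nothing, because /usr comes before /usr/data there.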
The last two numbers on each line in /etc/fstab show the dump frequency and the pass number. These two numbers do not mean anything with some versions of Linux, so check the fstab man page for more information. The dump frequency tells Linux how often the filesystem should be backed up. One means the backup should occur daily, two means the backup should be every other day, and so on. This number is used for automated backup routines that can parse the /etc/fstab file for this information.
The pass number indicates the order in which the fsck utility should check the filesystem. One means the filesystem should be checked first, two means it should be checked second, and so on. If more than one filesystem has a pass number of one, the filesystems are checked in the order they occur in the /etc/fstab file. The root filesystem must have a value of one, and the convention is to set other partitions higher. However, because most Linux versions don't use the pass number, all filesystems usually have this number set to one. If your version of Linux does use this number and you have more than one disk drive on your Linux filesystem, set the numbers on each disk in order (1, 2, 3 and so on to match the mount order), and then use a parallel scheme for each additional disk drive. This way, fsck checks filesystems on each disk in parallel.
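To see the checking order that the pass numbers imply, the entries can be sorted on the sixth column. This is only an illustrative sketch: fsck_order is a made-up name, the sample entries are invented, and whether fsck on a given system honors the field depends on the Linux version, as noted above.

```shell
# fsck_order FILE
# Lists "pass_number device" for every entry with a non-zero pass number,
# lowest pass first; ties keep their order of appearance in the file.
fsck_order() {
    awk '$1 !~ /^#/ && NF >= 6 && $6 > 0 { print $6, NR, $1 }' "$1" |
        sort -k1,1n -k2,2n |
        awk '{ print $1, $3 }'
}

# Hypothetical two-disk layout: root first, then one filesystem per disk.
tab=$(mktemp)
cat > "$tab" <<'EOF'
/dev/sdb1 /data ext2 defaults 1 2
/dev/sda1 /     ext2 defaults 1 1
/dev/sda2 /usr  ext2 defaults 1 2
/dev/sda3 none  swap sw       0 0
EOF
fsck_order "$tab"
```

The swap entry is skipped because its pass number is zero.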
You can include swap partitions in the /etc/fstab file as well. List these partitions as type swap, with the mount directory set to none and the dump frequency and pass number set to zero, as shown in the following example:
/dev/sda2 none swap sw 0 0
When you include a swap partition in /etc/fstab, you can activate it using the swapon command. When you execute the command
swapon -a
Linux reads the /etc/fstab file and activates all swap partitions. This command is usually embedded in the /etc/rc file (so it is executed automatically when Linux boots), although it can be run from the command line just as easily.
Filesystem Types
The types of filesystems that Linux supports vary depending on the version of Linux you are using. Most versions support the following filesystem types, though. You can use them in the /etc/fstab file:
ext2 | This type is the second extended filesystem, which is the most common type of Linux partition. |
ext | This type is the original Linux extended filesystem, which has been replaced by ext2. |
minix | This type is the original Minix filesystem, which is rarely used but is still supported because it was the first Linux filesystem format. |
xia | This type is the Xia filesystem, which is rarely used because it has been superseded by ext2. |
umsdos | This type is the UMS-DOS filesystem, which is used to install Linux on a DOS partition (with no dedicated Linux partition). |
proc | This type is the filesystem based on /proc, which is used for some processes that use system information. |
iso9660 | This type is the ISO 9660 filesystem, which is used on most CD-ROM disks. |
xenix | This type is the SCO Xenix filesystem, which provides support for Xenix under Linux. |
sysv | This type is the UNIX System V filesystem, which provides support for System V drives under Linux. |
coherent | This type is Mark Williams' Coherent UNIX version, which provides support for Coherent filesystems under Linux. |
msdos | This type is a DOS partition that Linux can access. |
hpfs | This type is the High Performance filesystem, which provides support for HPFS under Linux. |
Some versions of Linux do not include support for all the filesystems listed above, especially lesser-used ones such as coherent and minix. Recent Linux versions also support a type called nfs, the Network Filesystem, which is examined later in this book.
Options Values
The options field in the /etc/fstab file can have several different values, depending on the version of Linux. For most versions of Linux (which follow BSD UNIX conventions for this file), you can use the following options to describe the filesystem characteristics:
defaults | Varies depending on version, but normally read-write, suid, and quota |
rw | Read-write |
ro | Read-only |
suid | Access in SUID mode allowed |
nosuid | Access in SUID mode not allowed |
quota | Quotas may be in effect |
noquota | Quotas may not be in effect |
If the filesystem type is nfs, many more options are supported. The defaults option tends to be the best choice for typical filesystems mounted on a local hard drive.
<NOTE>You will see the term SUID often in system administration. SUID stands for Set User ID and is a permission bit that can be set on any file or directory. There is also a bit called SGID, for Set Group ID. A program with one of these bits set runs as though it had been started by the file's owner (or group) rather than by the user who ran it. For example, you could be logged in as a normal user and execute a binary file that is owned by root and has SUID set; the binary executes as though root had run it. Both SUID and SGID are dangerous bits to work with because they can create security problems.<NOTE>
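Because these bits carry security risk, it is worth knowing which files have them set. The following is a small sketch using find; the function name list_setid and the scratch files are made up for the example, and the octal values 4000 and 2000 are the standard SUID and SGID permission bits.

```shell
# list_setid DIR
# Prints regular files under DIR that have the SUID (4000) or SGID (2000) bit set.
list_setid() {
    find "$1" -type f \( -perm -4000 -o -perm -2000 \) -print
}

# Scratch-directory demonstration: only setuid_prog is reported.
work=$(mktemp -d)
touch "$work/plain" "$work/setuid_prog"
chmod 755 "$work/plain"
chmod 4755 "$work/setuid_prog"
list_setid "$work"
```

Pointing the function at a directory such as /usr takes longer and may report permission errors for directories an ordinary user cannot read.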
Managing Disk Space
UNIX system administrators have a saying: no matter how much disk space you have, it's not enough. This maxim is as true for Linux as it is for UNIX. Disk space has a way of being gobbled up, especially when several users are sharing a system. By the time you have loaded your operating system, favorite applications, compilers, and user files, your disk space is probably close to full. If it isn't, wait six months and it will be.
Disk drives are very inexpensive now, so many system administrators prefer to battle the disk space problem by adding larger hard disks or extra disk drives. This option is certainly valid and prevents a lot of hassle cleaning up files, but you still should force some kind of disk space usage policy on yourself and other users to make sure disk space is not wasted. To create such a policy, you have to know how to determine disk space usage, manage disk space effectively, and clean up disks.
Checking Filesystems
Part of Linux's startup routine (driven by an entry in /etc/rc) is to check all mounted filesystems to make sure that they are not damaged or corrupted. This check is performed with every reboot. However, if your machine is not rebooted often or you are experiencing disk errors, start a filesystem check manually.
In general, you use the utility fsck (filesystem check) to check filesystems. Linux uses some special versions of fsck to check Linux-dependent filesystems, though, so you may not have direct access to fsck. For example, many Linux versions have a dedicated fsck version called e2fsck to check the ext2 filesystem.
When fsck does exist on a Linux version, it is often just a front-end search engine that looks in the /bin, /sbin, /etc/fs, and /etc directories for one of the proper filesystem fsck versions, and then executes that version. The search and execution processes are transparent to you in most cases.
The fsck utility does several tasks. As part of its operation, it scans the entire filesystem for any of the following problems:
- A block shared by many files (cross-linked)
- Blocks in use but marked as free
- Inconsistent entries among files and i-node tables
- Incorrect link counts
- Illegal entries in the i-node tables
- Inconsistencies between i-node table size values and the disk space used by a file
- Illegal values in files
- Lost files that don't appear in the i-node table
The entire process occurs quickly, so there is no reason not to run fsck regularly. If fsck does report errors, bring the system down to single-user (superuser-only) mode and rerun fsck. The problem may have been caused by a user application, and rerunning the check with no users on the system identifies that type of problem. If the disk still has problems, you can correct them in single-user mode.
<NOTE>In most cases, fsck runs only on unmounted filesystems (except root). If you want to check a filesystem, unmount it, and then run fsck. To check root, switch the system to single-user mode, and then run fsck. Although some versions of Linux don't require these steps, they are good safety precautions to prevent accidental changes to the disk or i-node tables.<NOTE>
The fsck command takes the name of either the device or the mount point of the filesystem you want to check. For example, both of these command lines invoke fsck properly:
fsck /dev/sda1
fsck /usr
If fsck is working on several disk drives (because of mounting), it tries to work in parallel whenever possible to reduce the amount of time required for the disk checking.
A number of options are useful with fsck or its filesystem-specific versions. The options supported by most Linux systems that are commonly used by system administrators are as follows:
-a | This option automatically repairs the filesystem without prompting you (use this option with care). |
-r | This option interactively repairs the filesystem (the system asks for instructions). Use it only when checking a single filesystem. |
-t <type> | This option specifies the type of filesystems to check. If type is preceded by no, only the other types of filesystems are checked. This option uses the filesystem types from the /etc/fstab file. |
-v | This option provides verbose output. |
Many other options are supported by fsck and its versions (like e2fsck), but a system administrator seldom (if ever) needs these other options. The fsck man page summarizes all the available options for you.
Get in the habit of running fsck occasionally, just to check the filesystem integrity. If you reboot often, the automated fsck checks the filesystem for you. But if you ever get disk error messages, fsck is the first place to turn.
Displaying Filesystem Statistics
Two commands are frequently used to check filesystem statistics (such as space used, space available, and so on). They are df (disk free) and du (disk usage). Both commands are included with practically all versions of Linux.
The df command is the most widely used statistics generator for filesystems. It displays information about all the filesystems on the system, their total capacities, the amount of free space available on each, and the current mount locations. The following is an example of output from a df command:
merlin$ df
Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/sda3 478792 94598 359465 21% /
/dev/sda1 511712 44288 467424 9% /dos
/dev/scd0 663516 663516 0 100% /cdrom
This system has a single SCSI hard disk with two partitions, one Linux and one DOS. The Linux partition /dev/sda3 has 478,792K total on the disk, of which 94,598K are used. The amount of disk space available is 359,465K. The Linux partition is 21 percent used. (The sizes are counts of 1,024-byte blocks, so the numbers shown in the output are kilobytes.) Similarly, the DOS partition /dev/sda1 has only 9 percent of its 511,712K capacity used. The CD-ROM has 100 percent of its 663,516K used. It's mounted as /cdrom.
This command gives you a compact display of the capacity and usage of all the mounted partitions on your system. If you are comfortable with utilities like awk, you can total the capacities and used space by adding the columns, which makes a useful one-line shell utility. Some system administrators like to run this type of summary command in the background every day and post the information to themselves through mail or a broadcast when they log in.
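One way to total the columns with awk is sketched here. The function name df_totals is invented for the example, and the sketch assumes each filesystem occupies a single line of df output with sizes in 1K blocks.

```shell
# df_totals
# Reads df output on standard input (1K blocks, one filesystem per line)
# and prints the combined size, space used, and overall percentage used.
df_totals() {
    awk 'NR > 1 { total += $2; used += $3 }
         END {
             pct = (total > 0) ? used * 100 / total : 0
             printf "total=%dK used=%dK capacity=%d%%\n", total, used, pct
         }'
}

# The sample df listing from this chapter.
df_totals <<'EOF'
Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/sda3 478792 94598 359465 21% /
/dev/sda1 511712 44288 467424 9% /dos
/dev/scd0 663516 663516 0 100% /cdrom
EOF
```

For the sample listing this prints total=1654020K used=802402K capacity=48%. On a live system the function can be fed directly, as in df -k | df_totals.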
<NOTE>You may occasionally see disk capacities in excess of 100 percent. This is caused by Linux holding back about 10 percent of the disk for the superuser's use exclusively, which means about 110 percent of the displayed capacity is available to root. Whenever the capacity approaches 100 percent, though, it's time to clear off the disk!<NOTE>
A handy option of the df command shows similar information about the i-node tables:
merlin$ df -i
Filesystem Inodes IUsed IFree %IUsed Mounted on
/dev/sda3 123952 8224 115728 7% /
/dev/sda1 0 0 0 0% /dos
/dev/scd0 0 0 0 0% /cdrom
This display, from the same system as the df output above, shows the number of i-nodes available, how many are used, the number that remain free, and the percentage used. No correlation exists between disk space usage and i-node table usage, so you should display both sets of information. An i-node is used for every file that is created. If many small files are saved, the i-node table can fill up even though plenty of disk space remains, and no new files can be created until i-nodes are freed. Check both disk space usage and i-node table usage for maximum information.
The df command ignores any filesystems that have zero blocks in them unless you specify the -a or --all option. Filesystems with zero blocks are used occasionally for special purposes such as automounting particular devices. The df command also ignores any filesystems that have the filesystem options set to ignore in the /etc/fstab file (usually only swap files have this setting). By default, the df command displays all filesystems mounted on the system, unless you specify one particular filesystem, as in the following example:
merlin$ df /dev/sda3
Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/sda3 478792 94598 359465 21% /
<NOTE>The df utility displays disk space in 1K blocks unless you set the environment variable POSIXLY_CORRECT in the system startup files. If this variable is set, 512-byte blocks are used to report information. This setting is helpful if you use an older filesystem type that uses disk sectors of 512 bytes.<NOTE>
The df command provides a number of command-line options, most of which are supported in all Linux versions. The available options for df are in the following list:
-a, --all | This option includes all filesystems with zero blocks (usually special filesystems). |
--help | This option displays help information. |
-i, --inodes | This option displays i-node information. |
-k, --kilobytes | This option displays disk space in 1K increments. (This option is used to override the environment variable that selects 512-byte blocks; see the preceding note.) |
-P | This option uses POSIX format to display all information about a filesystem on one line with no wrapping. If a filesystem name is longer than 20 characters, this option forces the columns to be misaligned. |
-T | This option displays the type of filesystem in addition to disk usage information. |
-t<type> | This option displays only filesystems whose type matches the one you specify. |
-v | This option displays the version number. |
-x<type> | This option displays all filesystems not of the type you list. |
You can use most of these options in combination as you need them. You can embed the most frequently run commands in a shell script to be run whenever you want.
The du command also displays useful disk usage statistics. When run by itself, the du command displays the amount of disk space used by all files and subdirectories under any specified directory or the current directory if none other is listed (these excerpts have been edited to reduce space):
merlin$ du
125 /info/a_temp
4 /info/data
265 /info/data/book
726 /info/data/book/chap_1
2 /info/zookeeper
...
273263 /info
merlin$ du /usr/tparker
35 /usr/tparker/bin
2736 /usr/tparker/book
3 /usr/tparker/source
...
7326 /usr/tparker
The output from du shows each directory's disk usage in blocks in the first column and name of the directory in the second. You can usually convert the blocks in the first column directly to kilobytes used because most Linux filesystems use 1K blocks. (As with df, the du utility displays disk space in 1K blocks unless you set the environment variable POSIXLY_CORRECT in the system startup files.)
If you run du on a large directory tree, the output can be very long (and boring to read). You can summarize the information using the -s (summarize) option:
merlin$ du -s /usr/ychow
3315 /usr/ychow
The output with this option includes all subdirectories and the directory being reported. This output is useful for determining the amount of disk space each user on the system uses.
<NOTE>You can easily combine the du command with other commands to generate lists of disk usage by directory. For example, to show a complete list of all directories in order of size, issue the command<NOTE>
du / | sort -rn
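On a large system the sorted list is long, so it is often convenient to keep only the top few entries. The following is a minimal sketch; top_dirs is a name chosen for this example, and -k is passed so that the figures are in kilobytes whatever the environment settings.

```shell
# top_dirs DIR [COUNT]
# Prints the COUNT largest directories under DIR (default 10), largest first.
top_dirs() {
    du -k "$1" | sort -rn | head -n "${2:-10}"
}

# Show the five largest directories under the current directory.
top_dirs . 5
```

The first line is always the starting directory itself, because its figure includes everything beneath it.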
The du command has several useful options. Most Linux versions support the following options:
-a, --all | This option displays counts for all files, not just directories. |
-b, --bytes | This option displays size in bytes. |
-c, --total | This option displays a grand total. |
-k | This option displays the sizes in kilobytes, overriding any environment variable set to 512 bytes. |
-l | This option counts the size of every file, including hard links to files that have already been counted. |
-s | This option displays only totals. |
-v | This option displays version information. |
-x | This option ignores directories on another filesystem (mounted into the current filesystem, of course). |
The du command may take a while to generate output if there are a lot of entries to process, especially when run, for example, from the root filesystem on a heavily loaded system. The best use for the du command is in scripts or cronjobs that are run when the system is not heavily loaded.
Making the Most of Your Disk Space
When you're running out of disk space, the easiest solutions are to buy another disk, create another Linux partition, or add a remote disk to your system. Presumably, if you can do any of these solutions you will, but sometimes expanding the total amount of disk space is not practical or desirable. Instead, the solution is to manage what you have.
As a general rule, disk performance starts to degrade when the system hits 90 percent capacity or more. This degradation is primarily due to fragmentation of the disk and the heads having to move farther to access and save files. Many system administrators use about 75 percent capacity as the first warning sign to do something about disk space. You'll develop your own guidelines, but try to avoid running out of disk space; you can find yourself in very awkward circumstances if you do.
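A warning threshold like this is easy to check automatically from df output. The following sketch is illustrative only: over_threshold is a made-up name, 75 is the warning level mentioned above, and the sample data is the df listing shown earlier in this chapter.

```shell
# over_threshold LIMIT
# Reads df output on standard input and prints the mount point and capacity
# of every filesystem whose Capacity column is at or above LIMIT percent.
over_threshold() {
    awk -v limit="$1" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)
        if (pct + 0 >= limit + 0) print $6, $5
    }'
}

over_threshold 75 <<'EOF'
Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/sda3 478792 94598 359465 21% /
/dev/sda1 511712 44288 467424 9% /dos
/dev/scd0 663516 663516 0 100% /cdrom
EOF
```

With the sample data only the CD-ROM is reported, which is expected for a read-only disc. On a live system, df -k | over_threshold 75 could be run from a daily cron job.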
A good first step to reducing disk space usage is to examine all the applications and software sets loaded on your system and remove the ones you don't use. For example, if you are not using the C compilers you loaded when you installed Linux, you can remove them and free up over 50M of disk space.
Another good practice is to scan user areas for users with large disk usage. Tell those users to clean up their areas by deleting or archiving material they don't want or need. In many cases, users keep multiple copies of files around, just in case. Remove the old ones! Get rid of automatic backup files, and clean out large log files. Just cleaning out the system logs can free up 30M on some systems. The primary log files you should look at are the following:
/usr/spool/lp/log | printing log |
/usr/lib/cron.log | cron log file |
/usr/spool/uucp/LOGFILE | UUCP log file |
These three files can grow to amazing sizes. There are also log files for all the printers, many communications packages, the system, compilers, and other utilities. Check your filesystem for files that grow unreasonably large. Also check mailboxes, which can collect error messages (such as from a bad cron job) and grow to many megabytes in size.
If you want to keep some of the lines in the log files instead of just deleting them all, use the tail command with the number of lines you want to keep. For example, the following series of commands keeps the last 100 lines of the log file (the most recent entries) and discards the rest:
cd /usr/spool/lp
tail -n 100 log > tmp
mv tmp log
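The same idea can be wrapped in a small function for use on several log files. This is a sketch under assumptions: trim_log is a hypothetical name, and it copies the shortened text back into the original file rather than renaming a new file over it, so that the log keeps its i-node, owner, and permissions.

```shell
# trim_log FILE [LINES]
# Keeps only the last LINES lines of FILE (default 100). The shortened text
# is copied back into the original file, so its i-node, owner, and
# permissions are unchanged.
trim_log() {
    tmp=$(mktemp) || return 1
    tail -n "${2:-100}" "$1" > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Scratch demonstration: a 250-line file shrinks to its last 100 lines.
log=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 250; i++) print "entry " i }' > "$log"
trim_log "$log" 100
wc -l < "$log"
head -n 1 "$log"
```

After the call the file holds 100 lines, beginning with "entry 151".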
Next, get in the habit of routinely backing up material you don't need except as archive material. Use floppy disks, a tape drive, or other archive media and stick the data on the shelf instead of on your hard drive. You'll save lots of room by regularly going through your system and cleaning up files. If you really need to keep them on your hard disk, use compress or gzip to shrink the file size noticeably. To find all files that haven't been accessed (read or written) in a certain number of days, use the find command. This command searches for all files that have not been accessed for more than 120 days and displays them on-screen:
find / -atime +120 -print
When you have the list of old files, you can consider archiving them.
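The search can be limited to one directory tree and to regular files, which keeps the list manageable. The following is a minimal sketch; stale_files is a name invented here, and the scratch files exist only to show the effect of the access time.

```shell
# stale_files DIR [DAYS]
# Prints regular files under DIR that were last accessed more than DAYS
# days ago (default 120).
stale_files() {
    find "$1" -type f -atime +"${2:-120}" -print
}

# Scratch demonstration: one file is given an access time in 1995.
work=$(mktemp -d)
echo "current" > "$work/recent"
echo "archive me" > "$work/old"
touch -a -t 199501010000 "$work/old"
stale_files "$work" 120
```

Only the file named old is listed, because its access time is far more than 120 days in the past.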
You can write a shell script that searches for and deletes unwanted files, such as core files, .bak files (and similar backups for editors and word processors), .log files, .error files, and so on. You can create a list of the files you want to regularly remove from your system, embed them in a find command such as the following one, and execute the command to clean out disk space. The following command looks for all files called core and deletes them:
find / -name core -exec rm {} \;
The find command locates all core files and passes the path to the rm command. The trailing backslash and semicolon are necessary to execute the command properly. There are more elegant (and less CPU-intensive) methods of doing the same task, but this command is a solid, reliable method.
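One less CPU-intensive variant is sketched here. It ends the -exec clause with a plus sign instead of the backslash and semicolon, which lets find pass many paths to a single rm process; this form is part of POSIX find but may be missing from very old versions. The function name clean_junk and the two file patterns are assumptions chosen for the example.

```shell
# clean_junk DIR
# Removes regular files named core or ending in .bak under DIR.
# "-exec ... {} +" hands many paths to each rm invocation instead of
# starting one rm per file.
clean_junk() {
    find "$1" -type f \( -name core -o -name '*.bak' \) -exec rm -f {} +
}

# Scratch demonstration: only notes.txt survives.
work=$(mktemp -d)
mkdir "$work/src"
touch "$work/core" "$work/src/notes.bak" "$work/src/notes.txt"
clean_junk "$work"
find "$work" -type f -print
```

Replacing the -exec clause with -print first shows what would be deleted, which is a sensible precaution before running any cleanup against a real filesystem.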
Understanding Links
File links are an oft-misunderstood aspect of filesystems, despite their simplicity. A link, in its simplest form, creates a second filename for a file. For example, if you have the file /usr/bill/testfile and want to have the same file in the /usr/tim directory, you don't have to copy it. Just create a link with the following command:
ln /usr/bill/testfile /usr/tim/testfile
The format of the command is always the current filename followed by an additional filename, just as with the cp or mv commands.
The main reason for using a link in this example is that /usr/bill/testfile and /usr/tim/testfile refer to exactly the same file, so any changes made by bill or tim are reflected immediately under the other name (removing the need to copy the file every time). Both bill and tim can modify the file, as long as they don't make changes to the file at the same time.
A hard link does not, however, get around file permission and ownership problems. Ownership and permissions are stored in the i-node, not in the directory entry, so both names always show the same owner, group, and permissions, and changing them through one name changes them for the other. If bill owns the file /usr/bill/testfile and is the only one who can write to it, tim needs write access through the group or other permissions (for example, by placing bill and tim in a common group) before he can work on the file. If set correctly, the ownerships and permissions can still prevent anyone other than bill and tim from reading or writing to the file.
The ln command, as used in the preceding example, creates hard links. A hard link is a second directory entry in the same filesystem that points to the same i-node, and therefore to the same physical contents, as the original name (both names show the same i-node number). If you want to see the effect of a link on the i-node table, display the i-node number for a file in a directory, for example:
$ ls -i testfile
14253 testfile
Then, create a link to another filename and display the i-node entries again:
$ ln testfile test2
$ ls -i testfile test2
14253 testfile 14253 test2
As you can see, both i-node numbers are the same. A directory listing of the two files shows identical permissions and ownerships, because both names refer to one i-node. The only thing indicating a link is the second column of the ls -l output, which shows a two for the number of links. Deleting a linked filename doesn't delete the file until there are no more links to it.
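The behavior of the link count can be tried safely in a scratch directory. The following is a small sketch; the helper names inode_of and links_of are made up for the example and simply pick columns out of ls output.

```shell
work=$(mktemp -d)

# Helper names invented for this sketch.
inode_of() { ls -i "$1" | awk '{ print $1 }'; }   # i-node number
links_of() { ls -l "$1" | awk '{ print $2 }'; }   # link count

echo "shared contents" > "$work/testfile"
ln "$work/testfile" "$work/test2"

# Both names report the same i-node number and a link count of 2.
echo "i-nodes: $(inode_of "$work/testfile") $(inode_of "$work/test2")"
echo "link count: $(links_of "$work/testfile")"

# Removing one name leaves the data reachable through the other.
rm "$work/testfile"
cat "$work/test2"
echo "link count: $(links_of "$work/test2")"
```

The final link count is 1, and the file's data is released only when that last name is removed as well.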
A symbolic link is another type of link. Instead of sharing the original file's i-node, it has an i-node of its own that simply records the path of the file it points to. You used these links when you were creating alternate device names, such as /dev/modem instead of /dev/cua1. The -s option to the ln command creates a symbolic link. For example, you can recreate the preceding example with a symbolic link:
$ ls -i bigfile
6253 bigfile
$ ln -s bigfile anotherfile
$ ls -i bigfile anotherfile
6253 bigfile 8358 anotherfile
As you can see, the i-node table entries are different. A directory listing shows the symbolic link as an arrow:
lrwxrwxrwx 1 root root 6 Sep 16:35 anotherfile -> bigfile
-rw-rw-r-- 1 root root 2 Sep 17:23 bigfile
The file permissions for a symbolic link are always set to lrwxrwxrwx. Permissions for access to the symbolic link name are determined by the permissions and ownership of the file it is symbolically linked to (bigfile in this case).
The difference between hard links and symbolic links is more than just i-node table entries. You can create symbolic links to files that don't exist yet, which you can't do with hard links. You can also follow symbolic links to find out what they point to, which is an almost impossible task with hard links. The kernel processes the two types of links differently, too.
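These differences can be seen in a scratch directory. The sketch below is an illustration only; it relies on the readlink utility, which is not mentioned in this chapter but is found on most current systems, to print where a link points.

```shell
work=$(mktemp -d)

# A symbolic link can be created before its target exists.
ln -s "$work/bigfile" "$work/anotherfile"
[ -L "$work/anotherfile" ] && echo "anotherfile is a symbolic link"
[ -e "$work/anotherfile" ] || echo "its target does not exist yet"

# readlink shows where the link points without following it.
readlink "$work/anotherfile"

# Once the target exists, reading through the link works.
echo "now it exists" > "$work/bigfile"
cat "$work/anotherfile"
```

Trying the same sequence with a hard link fails at the first step, because ln without -s needs an existing file to link to.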
Summary
This chapter examined the common disk and filesystem utilities you have available for checking the integrity of the filesystem. It also looked at the basic methods you should use to keep down disk space usage. This chapter also briefly examined links and how you can use both symbolic and hard links to help provide access to some files. Following these simple steps can make your life a lot easier.