Disk fragmentation (and how to get rid of it).
==============================================

Unix disk partitions are organized around "cylinders". A cylinder is the combined set of disk tracks read by all heads on all platters of a disk drive. This makes disk read operations more efficient because larger amounts of data can be read without having to reposition, or "seek", the heads.

Unix file systems (and there are many kinds of file systems for Unix) reside on all the cylinders within a given partition. They allocate files in the blocks that make up these cylinders, trying to make all the blocks contiguous and within the same cylinder. File I/O is more efficient as a result, because the heads do not need to seek to different locations on the disk to read or write all the blocks within a given cylinder. Many files can reside in the same cylinder, and the heads will not need to seek to read or write any of them.

(Note that disk I/O is also sped up by file buffer caching in RAM, which lets write operations be scheduled around other file I/O requests and CPU utilization. File reads and writes that are not in, or are too large for, the cache are still limited by the raw file system speed.)

Primitive operating systems like DOS, on the other hand, allocate files on a sector-by-sector basis and tend towards a lot of head seeking as files extend to other tracks on the disk. They often do not do any buffering or caching (although newer versions of DOS, add-on programs, and some disk drive controllers do add some buffering).

As file systems fill, fewer free blocks are available within each cylinder. When a new file is then created, it may not fit within the free blocks remaining in any one cylinder and must be broken up into file extents that fall within different cylinders.
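
The effect of splitting a file across cylinders can be illustrated with a small Python sketch. It is only a toy model: the geometry (8 blocks per cylinder), the block numbers, and the function names are made up for illustration and do not correspond to any real file system or drive.

```python
# Toy model of file extents and cylinder changes.
# BLOCKS_PER_CYLINDER is a hypothetical geometry chosen for illustration.
BLOCKS_PER_CYLINDER = 8


def extents(blocks):
    """Group a file's block numbers (in file order) into contiguous runs.

    Returns a list of (start_block, length) tuples.
    """
    runs = []
    for block in blocks:
        if runs and block == runs[-1][0] + runs[-1][1]:
            # This block directly follows the current run, so extend it.
            start, length = runs[-1]
            runs[-1] = (start, length + 1)
        else:
            # Otherwise a new extent starts here.
            runs.append((block, 1))
    return runs


def cylinder_seeks(blocks, blocks_per_cylinder=BLOCKS_PER_CYLINDER):
    """Count how often consecutive blocks of the file lie in different cylinders."""
    cylinders = [block // blocks_per_cylinder for block in blocks]
    return sum(1 for prev, cur in zip(cylinders, cylinders[1:]) if prev != cur)


contiguous = list(range(16, 22))       # six blocks, all inside one cylinder
fragmented = [16, 17, 40, 41, 3, 90]   # six blocks scattered over the partition

for name, blocks in (("contiguous", contiguous), ("fragmented", fragmented)):
    print(name, "extents:", extents(blocks), "seeks:", cylinder_seeks(blocks))
```

In this model the contiguous file is a single extent read with no cylinder change, while the fragmented file of the same size is four extents and forces the heads to move three times.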
When file extents are split up across cylinders, Unix file systems start to behave like DOS and force head seeking from cylinder to cylinder to read or write the blocks of the file. This situation is referred to as "disk fragmentation".

Microsoft DOS, which is a single-user, single-tasking operating system, allows a single program to take full control of the system and thus ensure that there are no buffers of unwritten file data. Such a program can go through the disk drive, sector by sector, and recreate files so that they are organized contiguously on disk again. This is called "defragmentation", and it results in greatly sped up disk access.

Newer versions of Microsoft Windows (other than Windows NT, which is a true multi-tasking operating system) allow a minimal amount of multi-tasking and file system buffering. They therefore require that you disable buffering and allow the defragmentation utility to take full control of the disk, to ensure that no corruption of the file system occurs during defragmentation. This is similar to the "single user" mode of Unix systems (where nobody other than the system administrator is allowed to use the system, and no services are running that might try to create files).

There are many DOS and Windows defragmentation utilities to handle this common and frequent problem of disk fragmentation with DOS file systems.

Unix, on the other hand, is much less likely to reach a level of disk fragmentation that impacts I/O performance. ("Better" is not the same as "perfect". Some file systems, like the Berkeley Fast File System, try to allocate blocks in at least a rotationally efficient, if not perfectly contiguous, manner. You can also get some improvements in fragment allocation via "tunefs". See "man tunefs" and _The Design and Implementation of the 4.3BSD UNIX Operating System_, listed in the Library section.)

The most likely time for fragmentation problems to crop up is when a disk partition nears 100% capacity.
This is a good reason to monitor disk usage and ensure that plenty of free space is generally available.

A common way to defragment Unix file systems is to do a backup, remake the file system, then restore the files. Note that you must do the backup using a program that operates at the level of directories and files (e.g., "cpio", "tar", "dump"), rather than one that deals with the raw partitions themselves (e.g., "dd"). Backup utilities that operate on raw partitions will maintain the existing blocking, and thus the fragmentation, when used to restore partitions. Utilities that back up files one at a time effectively re-consolidate the fragmented blocks while writing them to tape, and then lay them back down in contiguous blocks in the fresh file system. "dump" (and its inverse, "restore") are the more popular utilities for backups, for several reasons.

For more information, see:

ftp://sunsite.unc.edu/pub/Linux/docs/HOWTO/mini/Partition

Besides this backup/new file system/restore method, there are options for using defragmentation utilities in Unix. Some of them are listed here.

Digital Unix
============

Polycenter Advanced File System (AdvFS) has its own defragmenter named, appropriately, "defragment".

From the "advfs" man page:

The POLYCENTER Advanced File System (AdvFS) is a file system option on the DEC OSF/1 operating system. The Advanced File System provides rapid crash recovery, high performance, and a flexible structure that enables you to manage your file system while it is on line.

An optional set of utilities is available for AdvFS that expands the capabilities of the file system. The POLYCENTER Advanced File System Utilities provide functions such as adding volumes without reconfiguring the directory hierarchy of the file system, cloning filesets, and improving system performance with file defragmentation, domain balancing, and file striping. A graphical user interface (GUI) that simplifies file system management is available with the AdvFS Utilities.
The Advanced File System component is licensed with the DEC OSF/1 operating system and is available as an optional subset during installation. The POLYCENTER Advanced File System Utilities is available as a separately licensed layered product.

Linux
=====

From: Stephen Tweedie
Subject: defrag-0.70 - Linux defragmenter
Date: Thu, 21 Aug 1997 14:10:51 GMT

=====BEGIN PGP SIGNED MESSAGE=====

Announcing defrag-0.70

The latest release of the Linux filesystem defragmenter, defrag-0.70, is now available at

    linux.dcs.ed.ac.uk:/pub/linux/defrag/defrag-0.70.tar.gz

(to appear at)

    sunsite.unc.edu:/pub/Linux/system/filesystems/defrag-0.70.tar.gz

This release includes all of the patches I have been sent against the 0.6x defrags. Please let me know if there are any other changes outstanding.

New in 0.70:
************

Tidied up colour support and attributes for fragmented blocks.

Added 64 bit device access support to allow use on filesystems >2GB.

Added a new e2defrag.static target with no graphic display support for use on root floppies (for those who want to defragment their root filesystem).

New in 0.62:
************

Thanks to Ulrich E. Habel (espero@b31.hadiko.de) for this update:

Picture mode is now colorized.

Now ext2 V2 Inode-informations are read from the Linux-Includes.

A bug fixed in valid-check of filesystem.

Stephen Tweedie

HP-UX
=====

Commercial products exist to handle defragmentation. One such product is described here:

EAGLE Software, Inc. has announced Version 3.00 of DISK_PAK for UNIX. DISK_PAK can safely eliminate file system fragmentation as well as cluster frequently accessed files for peak file system responsiveness.
IRIX
====

"fsr" (File System Reorganizer)
Run nightly via "cron".

From an SGI administrators list email message:

http://www.sgi.com/Archive/comp.sys.sgi/admin/1993/Apr/0131.html

--------------------------
In article <1qcchqINN6bt@srvr1.engin.umich.edu> hillig@U.Chem.LSA.UMich.EDU (Kurt Hillig) writes:
>Can anyone translate the output of the "-v" option of fsr?  The man
>page says:
>
>    -v   Verbose.  Print cryptic information about each file being
>         reorganized.

The following line:

    movex() i1152 0+3676 -> 23768

means: for the file whose inode number (ls -i) is 1152 move 3676 blocks starting at logical block 0 to file system block 23768.

This line:

    slidex() i65 0+160 -> 7834 (1)

means: "slide" the 160 block extent starting at logical block 0 to file system block 7834, one block at a time "(1)".

(Try fsr -vv to see even more output :-)

>The reason I'm asking is that I run fsr weekly (Saturday nights) but
>this afternoon my /usr/users filesystem was giving the error:
>
>Apr 12 12:01:41 Uranium unix: lv1 (/usr/users): Out of contiguous space

Allocations for indirect extents require contiguous space, but only up to 32 blocks. Sounds like a large file is trying to grow in a very fragmented file system. Sure would be nice if the above error identified the file trying to grow, huh? :-)

Try doing ``fsr -s /dev/rwhatever'' (just print frag statistics) then run ``fsr /dev/rwhatever'' (drop the -v unless you really want to see absolutely every block as it's moved) and afterwards run fsr -s again. You might need to run ``fsr /dev/rwhatever'' a couple of times.

Since fsr works extra hard to blow out the page cache and does all I/O raw and synchronously, you'll see a reasonable perf hit on your system, so you might want to run this off peak. However, if clearing up the fragmentation is more important, it's perfectly safe to run fsr no matter how busy the file system or system.
The only issue is that fsr can't reorganize files which are currently open (fsr -v will say ``ino=XXXX in use...''), so things like /usr/adm/SYSLOG will probably never be touched, but the idea is that enough other files are not being used so that they can be moved to defragment free space.

Bent
--------------------------
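
The two "fsr -v" line shapes explained in the message above can be pulled apart mechanically. The following Python sketch is based only on those two example lines. The real output of fsr may well contain other kinds of lines, which this parser simply skips, and the function and field names here are hypothetical.

```python
import re

# Pattern inferred from the two example lines quoted in the message:
#   movex() i1152 0+3676 -> 23768
#   slidex() i65 0+160 -> 7834 (1)
# This is an assumption about the format, not a documented specification.
LINE_RE = re.compile(
    r"^(?P<op>movex|slidex)\(\)\s+i(?P<inode>\d+)\s+"
    r"(?P<start>\d+)\+(?P<count>\d+)\s+->\s+(?P<dest>\d+)"
    r"(?:\s+\((?P<step>\d+)\))?\s*$"
)


def parse_fsr_line(line):
    """Return a dict describing one movex()/slidex() line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = {"op": match.group("op")}
    for key in ("inode", "start", "count", "dest", "step"):
        value = match.group(key)
        # "step" is absent on movex() lines, so it may be None.
        record[key] = int(value) if value is not None else None
    return record


def describe(record):
    """Turn a parsed record into the kind of sentence used in the explanation above."""
    text = (
        "inode {inode}: {op} {count} blocks starting at logical block {start} "
        "to file system block {dest}"
    ).format(**record)
    if record["step"] is not None:
        text += ", {step} block(s) at a time".format(**record)
    return text


for sample in ("movex() i1152 0+3676 -> 23768", "slidex() i65 0+160 -> 7834 (1)"):
    print(describe(parse_fsr_line(sample)))
```

Lines that do not match, such as the ``ino=XXXX in use...'' messages, come back as None, so a caller can filter them out before summarizing.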