Newsgroups: comp.unix.sco.misc
From: bill@wjv.com.REMOVEME (Bill Vermillion)
Subject: Re: Defrag freeware ?
Date: Tue, 16 Nov 1999 02:03:46 GMT

In article <80ovid$3n4$1@soap.pipex.net>,
Marc Redmile-Gordon <marc@carsplus.co.uk> wrote:

>Tom has informed me of my erroneous thinking, but, the way I "see"
>disk writes & deletions ( just as displayed in a "FAT" windows
>defrag tool ) - it seems inevitable that after a period of time the
>disk will become fragmented.

>What makes Unix's file system so different from the FAT system ?

[This started off to be a short reply - but got a bit long winded, so if you aren't interested in filesystems you may hit 'n' now. It's long, but it isn't HTML - wjv]

One way I would describe this is that Unix has a file 'system', and DOS has a file handler. There is a world of difference between the strategies behind the two.
The FAT file-system is really nothing more than the original file system that came out with CP/M - in the mid 1970s - with changes along the way to handle newer/larger devices, but still hewing to most of the original design.

On a FAT disk there is an area of the disk reserved for file names, and a block of bytes associated with each name which points to the blocks on the disk that are allocated for the file. The only way it knows how to add more data is to find the first free block on the disk and start filling it up from there.

Let's say file "A" looks like this on the disk:

AAAAAAAAAA

Then we create file B, so the disk looks like:

AAAAAAAAAABBBB

Add more to A and it's now:

AAAAAAAAAABBBBAAAA

This continues along as you delete files and add files until it looks like alphabet soup. (Of course if you never delete any file you won't have fragmentation.)

The advent of DOS 2.0 brought forth the hierarchical file system, as up to that point you could not store more than 512 files on a disk. The hierarchy made it possible to increase the number of files available for storage, which was needed to handle the new 'hot' items - the $2000 5MB hard disks that were just coming out.

However, if you look at the IBM floppies, or the hard disk, with a binary editor you can see a pattern of e5e5e5e5, and if you delete a file the first letter of the file name is replaced with e5. Where did this come from? That is the worst case pattern for single density disks. By 'worst case pattern' I mean that if you write this pattern and there is any problem with the disk, then this is the pattern that is most likely to fail first. Double density uses a different pattern but the MSDOS world still kept the old convention. That meant that testing with less than the worst-case pattern could pass disks that would later fail. (Now you know why you had problems with floppies in that OS!)

It has such other 'fun' things (until the NT file system or the 32 bit system) as allocating up to 16Kbytes for a 1 byte file.
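The first-free-block strategy described above is easy to play with. What follows is only a toy sketch in Python, not how any real FAT driver is written: the disk is a plain list of blocks, and the names allocate, delete, show and extents are made up for this illustration.

```python
# Toy model of "always take the first free block" allocation.
# The disk is a list; None marks a free block, a letter marks its owner.

def allocate(disk, name, count):
    """Give `name` the `count` lowest-numbered free blocks."""
    free = [i for i, owner in enumerate(disk) if owner is None]
    if len(free) < count:
        raise RuntimeError("disk full")
    for i in free[:count]:
        disk[i] = name

def delete(disk, name):
    """Mark every block owned by `name` as free again."""
    for i, owner in enumerate(disk):
        if owner == name:
            disk[i] = None

def show(disk):
    """Render the disk as a string, '.' for a free block."""
    return "".join(owner or "." for owner in disk)

def extents(disk, name):
    """Count the separate contiguous runs that hold `name`."""
    runs, prev = 0, None
    for owner in disk:
        if owner == name and prev != name:
            runs += 1
        prev = owner
    return runs

disk = [None] * 24
allocate(disk, "A", 10)
allocate(disk, "B", 4)
allocate(disk, "A", 4)              # appending to A lands after B
layout_after_append = show(disk)    # AAAAAAAAAABBBBAAAA......

delete(disk, "B")
allocate(disk, "C", 6)              # C fills B's hole, then spills past A
layout_after_reuse = show(disk)     # AAAAAAAAAACCCCAAAACC....
```

After these few operations both A and C sit in two separate pieces, which is the alphabet soup in miniature.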
It has been hamstrung by the old design. And this is called a 'system'. Hah.

The System V file system also had problems earlier, but advances over the past few years have eliminated most of them. Originally Sys V had a free list which contained a list of free blocks. This was not a list of ALL the free blocks but only the first group of them (ISTR it was 100 but may have been more). When the last block on this list was used, the system would gather more free blocks to add to the list. But if you deleted files, the just-deleted blocks were added to the free list as well. That meant that if the free list had blocks 100 to 200 on it, and you deleted a file that had been allocated blocks 0 thru 99, you would be seeking to a lower number and then a higher number when a new file was created - and you have just started fragmenting the drive.

Those of us using these in the old days used to have cron run fsck -S on the filesystem overnight. The capital S option says rebuild the freelist IF and ONLY IF the rest of the file system is OK. That kept the fragmentation at bay awhile longer, as the data was put on the lower numbered blocks first.

In the meantime the Berkeley Software Distribution (BSD) tried to overcome this with their Fast File System. This system organized the hard drive into zones, each of which had several cylinders in it. (A cylinder is a track on the top surface of the top platter plus all the tracks physically underneath it: the one on the bottom of the top platter, and those on both sides of all lower platters.) This meant that the only delay when moving from one sector to another in a single cylinder would essentially be the time it took to switch the data from one head to another. This also took disk rotation into account: the time needed for the data circuit to switch to the next head, once the last sector on a given head was written, was computed.
There was a certain amount of time that this required, so a delay was added so that when the head was ready to write again, it would start with the 'first' sector on that track, which was rotationally further along than the 'first' sector on the previous track. But it essentially eliminated all mechanical delay and, with bigger blocks and only head-switching delay, improved things drastically.

The system also allocated files across the disk into each cylinder group. That means that data would be scattered across the disk, but kept together from the very first usage. It was attempting to keep all data for a given file contiguous, but leave enough space for all new files to be contiguous if at all possible.

This goes counter to the DOS implementation where all files are kept close together. This is because DOS based systems are 'synchronous' operating systems - which means they can only handle one task at a time. The computer can do nothing else while it is completing each task. Writing a large file to disk slows things down. Having to seek across a disk to find the pieces also slows things down. That makes defragmentation and drives with fast seek times almost mandatory. That's where a local company made good by designing a disk controller with enough memory so that it looked like a hard-drive to the OS and the OS could go back to work. That company is DPT (since acquired by Adaptec - it's in the process now).

In the modern Unix systems - which vary among the installations and vendors - the files are written in an entirely different manner. The Unix systems are asynchronous file systems (for the most part). That means when Unix tells you the file is written - by giving you your prompt back, or returning program control to you - it probably still is sitting in cache in the OS. Every little bit the OS gathers up the data in cache and writes it to disk.
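The write-behind idea just described can be sketched in a few lines. This is a toy model in Python under loud assumptions: the class name WriteBehindCache and its methods are invented for illustration, a dictionary stands in for the drive, and a real kernel buffer cache is far more involved.

```python
class WriteBehindCache:
    """Toy model: write() returns at once, flush() does the real I/O later."""

    def __init__(self):
        self.dirty = {}        # block number -> data not yet on disk
        self.disk = {}         # stands in for the physical drive
        self.disk_writes = 0   # how many physical writes were issued

    def write(self, block, data):
        # Only memory is touched, so the caller gets control back immediately.
        self.dirty[block] = data

    def read(self, block):
        # The newest data may still be in cache, so look there first.
        if block in self.dirty:
            return self.dirty[block]
        return self.disk.get(block)

    def flush(self):
        # What the OS does "every little bit": push cached data to disk.
        for block, data in self.dirty.items():
            self.disk[block] = data
            self.disk_writes += 1
        self.dirty.clear()


cache = WriteBehindCache()
cache.write(7, b"old")
cache.write(7, b"new")       # overwrites the cached copy, no disk I/O yet
cache.write(3, b"other")
on_disk_before = dict(cache.disk)   # still empty: nothing has been flushed
cache.flush()                       # two physical writes cover three requests
```

Note that the rewritten block 7 costs only one physical write, which is part of why returning control before the data hits the platter pays off.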
Different implementations - depending a lot on the controllers - will start at the first of a disk and just go through the cylinders in numerical order instead of seeking all over.

The fast file systems (a name started by BSD that now covers many) allocate larger blocks - typically 8192 bytes at a time - and try to keep the blocks of a file contiguous. Any data that doesn't fill a full block is written to a 'fragment'. When more data is added to a file that has data in a fragment and it reaches 8192 bytes (or more), a new 8K segment is added. This means that all reads are in 8K segments. (Some Unix variants targeted at multi-media and broadcast streaming video allocate over 1MB at a time.)

According to the last white paper I read, SCO's fast file implementation differs mainly in the fact that BSD used 8K blocks, while the EAFS system kept the block size but allocates up to 32 contiguous blocks at once. This overcomes the BSD problem of reallocating out of the fragments. This gets away from the original block at a time method.

The current Sys5 file systems also gained speed by using a bit-map instead of the free list. Other things, such as sorting synchronous file requests ahead of asynchronous requests, also permitted return of control to the user faster than in other systems.

To sum it up: the Unix file systems have progressed in design over the past 20 or so years, with new methods of implementing the system to speed performance, organize storage, and cache items. The standard FAT system on DOS has really only undergone changes to be able to handle increasingly larger disk sizes, by increasing the minimum number of bytes that are allocated to a file.

You did ask:

>What makes Unix's file system so different from the FAT system ?

That's just a brief overview. Since my home machine has been *ix based since 1983 - after spending 9 slow months with DOS 2 on a jenyouine IBM peecee - I'm slightly prejudiced.

Bill

--
Bill Vermillion   bv @ wjv.com