HOWTO: Multi Disk System Tuning Stein Gjoen, v0



Root

/boot.

DOS etc.

At the risk of sounding heretical I have included this little section about something many readers of this document have strong feelings about. Unfortunately many hardware items come with setup and maintenance tools based around those systems, so here goes.

Explanation of Terms

Naturally the faster the better, but often the happy installer of Linux has several disks of varying speed and reliability. So even though this document describes performance as 'fast' and 'slow', it is just a rough guide since no finer granularity is feasible. Even so there are a few details that should be kept in mind:

Speed

This is really a rather woolly mix of several terms: CPU load, transfer setup overhead, disk seek time and transfer rate. It is in the very nature of tuning that there is no fixed optimum, and in most cases price is the dictating factor.

CPU load is only significant for IDE systems, where the CPU does the transfer itself, but is generally low for SCSI; see SCSI documentation for actual numbers.

Disk seek time is also small, usually in the millisecond range. Even this is not a problem if you use command queueing on SCSI, where you overlap commands and keep the bus busy all the time. News spools are a special case consisting of a huge number of normally small files, so in this case seek time can become more significant.

For rotational speed there are two main values of interest here: 7200 and 10000 RPM (revolutions per minute). Higher RPM shortens the wait for the right sector to come around (rotational latency) and raises the transfer rate, but at a substantial cost. Drives working at high RPM have also been known to be noisy and to generate a lot of heat, a factor that should be kept in mind if you are building a large array or “disk farm”. Very recently drives working at 15000 RPM have entered the market, and here the cooling requirements are even stricter and minimum figures for air flow are given.

It is therefore important to read the specifications for the drives very carefully, and to note that the maximum transfer speed is quite often quoted for transfers out of the on board cache (burst speed) and not directly from the platter (sustained speed). See also section on .

Reliability

Naturally no-one would want low reliability disks, but one might be better off regarding old disks as unreliable. Also for RAID purposes (See the relevant information) it is suggested to use a mixed set of disks so that simultaneous disk crashes become less likely.

So far I have had only one report of total file system failure, but here unstable hardware seemed to be the cause of the problems.

Disks are cheap these days, yet people still underestimate the value of the contents of the drives. If you need higher reliability make sure you replace old drives and keep spares. It is not unusual for drives to work more or less continuously for years and years; what often kills a drive in the end is power cycling.

Files

The average file size is important in order to decide the most suitable drive parameters. A large number of small files makes the average seek time important, whereas for big files the transfer speed is more important. The command queueing in SCSI devices is very handy for handling large numbers of small files, but for raw transfer speed EIDE is not too far behind SCSI and is normally much cheaper.

File Systems
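Returning to the point about average file size made under Files: before deciding whether seek time or transfer speed matters more for a given directory tree, it can help to measure the files you actually have. The following is only a minimal sketch in Python; the function name `file_size_summary` is made up for this illustration and is not part of any tool mentioned in this document.

```python
import os
import stat
import statistics


def file_size_summary(root):
    """Return (count, mean_bytes, median_bytes) for regular files under root."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat() so that symbolic links are not followed
                info = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable, skip it
            if stat.S_ISREG(info.st_mode):
                sizes.append(info.st_size)
    if not sizes:
        return 0, 0.0, 0.0
    return len(sizes), statistics.mean(sizes), statistics.median(sizes)


if __name__ == "__main__":
    import sys
    tree = sys.argv[1] if len(sys.argv) > 1 else "."
    count, mean, median = file_size_summary(tree)
    print(f"{tree}: {count} files, mean {mean:.0f} bytes, median {median:.0f} bytes")
```

A median far below the mean suggests many small files with a few large ones, in which case seek time is likely to dominate for most accesses.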

Over time the requirements for file systems have increased, and the demand for large structures, large files, long file names and more has prompted ever more advanced file systems. A file system is the system that accesses and organises the data on mass storage. Today there is a large number of file systems to choose from and this section will describe these in detail. The emphasis is on Linux but with more input I will be happy to add information for a wider audience.

General Purpose File Systems

Most operating systems have a general purpose file system for everyday use with most kinds of files, reflecting available features in the OS such as permission flags, protection and recovery.

minix: This was the original fs for Linux, back in the days when Linux was hosted on minix machines. It is simple but limited in features and hardly ever used these days other than in some rescue disks, as it is rather compact.

xiafs and extfs: These are also old, have fallen into disuse and are no longer recommended.

ext2fs: This is the established standard for general purpose use in the Linux world. It is fast, efficient and mature, and is under continuous development; features such as ACL and transparent compression are on the horizon. For more information check the home page.

ext3fs: This is the name for the successor to ... Metadata, data that describes the structure of the files, is written in a journal to the disk. Note that there are 3 modes of operations for ...

ufs: This is the fs used by BSD and variants thereof. It is mature but also developed for older types of disk drives where geometries were known. The fs uses a number of tricks to optimise performance, but as disk geometries are now translated in a number of ways the net effect is no longer so optimal.

efs: The Extent File System (efs) is Silicon Graphics' early file system, widely used on IRIX before version 6.0, after which xfs has taken over. While migration to xfs is encouraged, efs is still supported and much used on CDs. There is a Linux driver available in early beta stage, available at home page.

xfs: Silicon Graphics has started porting its mainframe grade file system to Linux. Source is not yet available as they are busily cleaning out legal encumbrances, but once that is done they will provide the source code under GPL. More information is already available on the ...

The Enhanced File System project is now dead.

Tux2 fs: The Tux2 File System project is now dead.

Microsoft File Systems

This company is responsible for a lot, including a number of file systems that have at the very least caused confusion.

fat: Actually there are 2 ...

fat32: After about 10 years Microsoft realised ...

vfat: At the same time as Microsoft launched ...

ntfs: This is the native fs of Win-NT, but as complete information is not available there is limited support for other OSes.

Logging and Journaling File Systems

These take a radically different approach to file updates by logging modifications for files in a log and later, at some time, checkpointing the logs.

Reading is roughly as fast as with traditional file systems that always update the files directly. Writing is much faster as updates are only appended to a log. All this is transparent to the user. It is in reliability, and particularly in checking file system integrity, that these file systems really shine. Since the data before the last checkpoint is known to be good, only the log has to be checked, and this is much faster than for traditional file systems.

Note that while ... Adam Richter from Yggdrasil posted some time ago that they have been working on a compressed log file based system, but this project is currently on hold. Nevertheless a non-working version is available on their FTP server. Check out where special patched versions of the kernel can be found.

Another project is the (GFS

The Global File System (GFS) is a file system designed for storage across a wide area network.

Special File Systems

In addition to the general file systems there are also a number of more specific ones, usually providing higher performance or other features at the price of a tradeoff in other respects.

tmpfs and swapfs: For short term fast file storage SunOS offers ... /tmp area but not /var/tmp, which is where temporary data that must survive a reboot is placed. SunOS offers very limited tuning for ... Linux now features ...

userfs, arcfs and docfs: The user file system ( ... for more information.

devfs: When disks are added, removed or just fail it is likely that disk device names of the remaining disks will change. For instance if ... directory, info on the numbering and allocation can be found in ... for more information ... directory a kernel file system in the same way as ... is. More information will appear as it becomes available.

For a number of reasons it is currently difficult to have files bigger than 2 GB. One file system that tries to overcome this limit is smugfs, and while it worked with kernel version 2.1.85 it is quite possible some work is required to make it fit into newer kernels. Also the low version number (0.0) suggests extra care is required.

File System Recommendations

There is a jungle of choices but generally it is recommended to use the general file system that comes with your distribution.

If you use ... page which has been superseded by and the article . That guide is being superseded by a HOWTO which is underway and a link will be added when it is ready.

To avoid total havoc with device renaming if a drive fails, check out the scanning order of your system and try to keep your root system on ...

Technologies

In order to decide how to get the most out of your devices you need to know what technologies are available and their implications. As always there can be some tradeoffs with respect to speed, reliability, power, flexibility, ease of use and complexity.

Many of the techniques described below can be stacked in a number of ways to maximise performance and reliability, though at the cost of added complexity.

RAID

This is a method of increasing reliability, speed or both by using multiple disks in parallel, thereby decreasing access time and increasing transfer speed. A checksum or mirroring system can be used to increase reliability. Large servers can take advantage of such a setup but it might be overkill for a single user system unless you already have a large number of disks available. See other documents and FAQs for more information.

For Linux one can set up a RAID system using either software (the md module in the kernel), a Linux compatible controller card (PCI-to-SCSI) or a SCSI-to-SCSI controller. Check the documentation for what controllers can be used. A hardware solution is usually faster, and perhaps also safer, but comes at a significant cost.

A summary of available hardware RAID solutions for Linux is available at .

SCSI-to-SCSI

SCSI-to-SCSI controllers are usually implemented as complete cabinets with drives and a controller that connects to the computer with a second SCSI bus. This makes the entire cabinet of drives look like a single large, fast SCSI drive and requires no special RAID driver. The disadvantage is that the SCSI bus connecting the cabinet to the computer becomes a bottleneck.

A significant problem for people with large disk farms is that there is a limit to how many SCSI entries there can be in the directory. In these cases using SCSI-to-SCSI will conserve entries.

Usually they are configured via the front panel or with a terminal connected to their on-board serial interface.

One manufacturer of such systems is .

PCI-to-SCSI

PCI-to-SCSI controllers are, as the name suggests, connected to the high speed PCI bus and therefore do not suffer from the same bottleneck as the SCSI-to-SCSI controllers. These controllers require special drivers but you also get the means of controlling the RAID configuration over the network, which simplifies management.

Currently only a few families of PCI-to-SCSI host adapters are supported under Linux.

... including SmartCache I/III/IV and SmartRAID I/III/IV controller families. These controllers are supported by the EATA-DMA driver in the standard kernel. More information from the author of the DPT controller drivers (EATA* drivers) can be found at his pages on and . These are not the fastest but have a good track record of proven reliability. Note that the maintenance tools for DPT controllers currently run under DOS/Win only, so you will need a small DOS/Win partition for some of the software. This also means you have to boot the system into Windows in order to maintain your RAID system.

... featuring up to 5 independent channels and very fast hardware based on intelligent controllers. The Linux driver was written by the company itself, which shows they support Linux. As ICP-Vortex supplies the maintenance software for Linux, no reboot to another operating system is necessary for the setup and maintenance of your RAID system. This also saves you extra downtime.

... drivers are available from for Linux.

ATARAID

Hardware RAID support for ATA drives is now available from a number of companies, notably Promise and Highpoint. These are supported in kernel 2.4 and later. For more information check out the .

Software RAID

A number of operating systems offer software RAID using ordinary disks and controllers. Cost is low and performance for raw disk IO can be very high. As this can be very CPU intensive it increases the load noticeably, so if the machine is CPU bound rather than IO bound you might be better off with a hardware PCI-to-RAID controller.

Real cost, performance and especially reliability of software vs. hardware RAID is a very controversial topic. Reliability on Linux systems has been very good so far.

The current software RAID project on Linux is the ...

RAID Levels

RAID comes in many levels and flavours, of which I will give a brief overview here. Much has been written about it and the interested reader is recommended to read more about this in the .

There are also hybrids available based on RAID 0 or 1 and one other level. Many combinations are possible but I have only seen a few referred to. These are more complex than the above mentioned RAID levels.

RAID 01 combines striping with duplication as mirrored arrays of striped arrays, which gives very high transfers combined with fast seeks as well as redundancy. The disadvantage is high disk consumption as well as the above mentioned complexity. Also a single disk failure turns the array into RAID 0.

RAID 1+0 combines striping with duplication as striped arrays of mirrored arrays, which gives very high transfers combined with fast seeks as well as redundancy. The disadvantage is high disk consumption as well as the above mentioned complexity.

RAID 1/5 combines the speed and redundancy benefits of RAID 5 with the fast seek of RAID 1. Redundancy is improved compared to RAID 0/1 but disk consumption is still substantial. Implementing such a system would typically involve more than 6 drives, perhaps even several controllers or SCSI channels.

Volume Management

Volume management is a way of overcoming the constraints of fixed sized partitions and disks while still keeping control of where various parts of the file space reside. With such a system you can add new disks to your system and add space from these drives to the parts of the file space where it is needed, as well as migrate data out from a disk developing faults to other drives before catastrophic failure occurs.

The system developed by has become the de facto standard for logical volume management.

Volume management is for the time being an area where Linux is lacking.

One is the virtual partition system project ...

Hint: if you cannot get it to work properly you have forgotten to set the ...

Compression

Disk compression versus file compression is a hotly debated topic, especially regarding the added danger of file corruption. Nevertheless there are several options available for the adventurous administrator. These take on many forms, from kernel modules and patches to extra libraries, but note that most suffer from various limitations such as being read-only. As development takes place at breakneck speed the specs have undoubtedly changed by the time you read this. As always: check the latest updates yourself. Here only a few references are given.

DouBle features file compression with some limitations.

Zlibc adds transparent on-the-fly decompression of files as they load.

There are many modules available for reading compressed files or partitions that are native to various other operating systems, though currently most of these are read-only.

Physical Track Positioning

This trick used to be very important when drives were slow and small, and some file systems used to take the varying characteristics into account when placing files. Higher overall speed, on board drive and controller caches and intelligence have reduced the effect of this.

Nevertheless there is still a little to be gained even today. As we know, ” . To understand the strategy we need to recall this near ancient piece of knowledge and the properties of the various track locations. This is based on the fact that transfer speeds generally increase for tracks further away from the spindle, as well as the fact that it is faster to seek to or from the central tracks than to or from the inner or outer tracks.

Most drives use disks running at constant angular velocity but use (fairly) constant data density across all tracks. This means that you will get much higher transfer rates on the outer tracks than on the inner tracks, a characteristic which fits the requirements for large libraries well.

Newer disks use a logical geometry mapping which differs from the actual physical mapping and which is transparently handled by the drive itself. This makes the estimation of the “middle” tracks a little harder.

In most cases track 0 is at the outermost track and this is the general assumption most people use. Still, it should be kept in mind that there are no guarantees this is so.


... , /tmp and /var/tmp.

Hence seek time reduction can be achieved by positioning frequently accessed tracks in the middle so that the average seek distance, and therefore the seek time, is short. This can be done either by using fdisk or cfdisk to make a partition on the middle tracks or by first making a file (using ...

Disk Speed Values

The same mechanical head disk assembly (HDA) is often available with a number of interfaces (IDE, SCSI etc) and the mechanical parameters are therefore often comparable. The mechanics are today often the limiting factor but development is improving things steadily. There are two main parameters, usually quoted in milliseconds (ms):

  - Head movement: the speed at which the read-write head is able to move from one track to the next, called access time. If you do the mathematics and doubly integrate the seek distance, first across all possible starting tracks and then across all possible target tracks, you will find that the average seek is equivalent to a stroke across a third of all tracks.

  - Rotational speed: this determines the time taken to get to the right sector, called latency.

After voice coils replaced stepper motors for the head movement the improvements seem to have levelled off, and more energy is now spent (literally) on improving rotational speed. This has the secondary benefit of also improving transfer rates.

Some typical values:

                          Drive type
Access time (ms)   | Fast   Typical   Old
-----------------------------------------
Track-to-track

This shows that the very high end drives offer only marginally better access times than the average drives, but that the old drives based on stepper motors are significantly worse.

Rotational speed (RPM) | 3600 | 4500 | 4800 | 5400 | 7200 | 10000
-------------------------------------------------------------------
Latency (ms)           |   17 |   13 | 12.5 | 11.1 |  8.3 |   6.0

The latency figures in this table are the time for one full revolution, that is the longest you may have to wait for a given sector, so the formula is quite simply

latency (ms) = 60000 / speed (RPM)

The average wait for a given sector is half of this.

Clearly this too is an example of diminishing returns for the efforts put into development. However, what really takes off here is the power consumption, heat and noise.

Yoke
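Looking back at the disk speed values, both the latency formula and the "third of all tracks" claim are easy to check numerically. The sketch below is in Python and rests on two assumptions of mine that are not in the original text: that 60000 / RPM is read as the time for one full revolution (so the mean wait is taken as half of it), and that start and target tracks are uniformly distributed. All function names are made up for the illustration.

```python
def revolution_ms(rpm):
    """Time for one full platter revolution, in milliseconds."""
    return 60000.0 / rpm


def average_latency_ms(rpm):
    """Mean wait for a sector, assumed to be half a revolution."""
    return revolution_ms(rpm) / 2.0


def mean_seek_fraction(tracks):
    """Mean |start - target| over all ordered track pairs, divided by `tracks`.

    There are 2 * (tracks - d) ordered pairs at distance d, so the sum
    below is the total seek distance over all tracks * tracks pairs.
    """
    total = sum(2 * (tracks - d) * d for d in range(1, tracks))
    return total / (tracks * tracks) / tracks


if __name__ == "__main__":
    for rpm in (3600, 4500, 4800, 5400, 7200, 10000):
        print(f"{rpm:6d} RPM: revolution {revolution_ms(rpm):5.1f} ms, "
              f"average wait {average_latency_ms(rpm):4.1f} ms")
    print(f"mean seek, 1000 tracks: {mean_seek_fraction(1000):.4f} of full stroke")
```

For a large number of tracks the last figure approaches one third, which is the discrete counterpart of the double integral mentioned in the text.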

There is also a ...

Trivia: There is a movie also called Solaris, a science fiction movie that is very, very long, slow and incomprehensible. This was often pointed out at the time Solaris (the OS) appeared...

BeOS

This operating system is one of the more recent ones to arrive and it features a file system that has some database like features.

There is a BFS file system driver being developed for Linux which is available in alpha stage. For more information check the where patches also are available.

Clusters

In this section I will briefly touch on the ways machines can be connected together, but this is so big a topic it could be a separate HOWTO in its own right, hint, hint. Also, strictly speaking, this section lies outside the scope of this HOWTO, so if you feel like getting fame etc. ...

  - news
  - mail
  - web proxy
  - printer server
  - modem server (PPP, SLIP, FAX, Voice mail)

You can also ... /usr and /var/spool and possibly /usr/local but probably not /var/spool/lpd. Most of the time even slow disks will deliver sufficient performance. On the other hand, if you do processing directly on the disks on the server or have very fast networking, you might want to rethink your strategy and use faster drives. Searching features on a web server or news database searches are two examples of this.

Such a network can be an excellent way of learning system administration and building up your own toaster network, as it often is called. You can get more information on this in other HOWTOs but there are two important things you should keep in mind:

  - Do not pull IP numbers out of thin air. Configure your inside net using IP numbers reserved for private use, and use your network server as a router that handles this IP masquerading.

  - Remember that if you additionally configure the router as a firewall you might not be able to get to your own data from the outside, depending on the firewall configuration.

The ...

There are also some more advanced clustering projects going, notably ...

There is also an article in Byte called ...

Disk Layout

With all this in mind we are now ready to embark on the layout. I have based this on my own method, developed when I got hold of 3 old SCSI disks and boggled over the possibilities.

The tables in the appendices are designed to simplify the mapping process. They have been designed to help you go through the process of optimization as well as to make a useful log in case of system repair. A few examples are also given.

Selection for Partitioning

Determine your needs and set up a list of all the parts of the file system you want to be on separate partitions, and sort them in descending order of speed requirement and how much space you want to give each partition.

The table in section is a useful tool to select what directories you should put on different partitions. It is sorted in a logical order with space for your own additions and notes about mounting points and additional systems. It is therefore NOT sorted in order of speed; instead the speed requirements are indicated by bullets ('o').

If you plan to use RAID, make a note of the disks you want to use and what partitions you want to RAID. Remember that various RAID solutions offer different speeds and degrees of reliability.

(Just to make it simple I'll assume we have a set of identical SCSI disks and no RAID)

Mapping Partitions to Drives

Then we want to place the partitions onto physical disks. The point of the following algorithm is to maximise parallelism and bus capacity. In this example the drives are A, B and C and the partitions are 987654321, where 9 is the partition with the highest speed requirement. Starting at one drive we 'meander' the partition line back and forth over the drives in this way:

        A : 9 4 3
        B : 8 5 2
        C : 7 6 1

This makes the 'sum of speed requirements' as equal as possible across the drives.

Use the table in section to select what drives to use for each partition in order to optimize for parallelism.

Note the speed characteristics of your drives and note each directory under the appropriate column. Be prepared to shuffle directories, partitions and drives around a few times before you are satisfied.

Sorting Partitions on Drives
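The 'meander' placement described under Mapping Partitions to Drives can be sketched in a few lines of Python. This is only an illustration of the idea as I read it from the A/B/C example; the function name `meander` and the dictionary it returns are assumptions of this sketch, not something defined by the HOWTO.

```python
def meander(partitions, drives):
    """Distribute partitions over drives in a back-and-forth sweep.

    `partitions` must already be sorted by descending speed requirement.
    Even sweeps run from the first drive to the last, odd sweeps run back.
    """
    layout = {drive: [] for drive in drives}
    width = len(drives)
    for index, partition in enumerate(partitions):
        sweep, position = divmod(index, width)
        if sweep % 2 == 1:
            position = width - 1 - position  # reverse direction on odd sweeps
        layout[drives[position]].append(partition)
    return layout


if __name__ == "__main__":
    plan = meander([9, 8, 7, 6, 5, 4, 3, 2, 1], ["A", "B", "C"])
    for drive, parts in plan.items():
        print(drive, ":", " ".join(str(p) for p in parts))
```

Run as a script, this reproduces the A : 9 4 3, B : 8 5 2, C : 7 6 1 table from the text.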

After that it is recommended to select partition numbering for each drive.

Use the table in section to select partition numbers in order to optimize for track characteristics. At the end of this you should have a table sorted in ascending partition number. Fill these numbers back into the tables in appendix A and B.

You will find these tables useful when running the partitioning program (fdisk or cfdisk) and when doing the installation.

Optimizing

After this there are usually a few partitions that have to be 'shuffled' over the drives, either to make them fit or because there are special considerations regarding speed, reliability, special file systems etc. Nevertheless this gives what this author believes is a good starting point for the complete setup of the drives and the partitions. In the end it is actual use that will determine the real needs after we have made so many assumptions. After commencing operations one should assume a time comes when a repartitioning will be beneficial.

For instance if one of the 3 drives in the above mentioned example is very slow compared to the two others a better plan would be as follows:

        A : 9 6 5
        B : 8 7 4
        C : 3 2 1

Optimizing by Characteristics
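The slow-drive plan from the Optimizing discussion (A : 9 6 5, B : 8 7 4, C : 3 2 1) can be read as the same back-and-forth sweep applied to the fast drives only, with the least demanding partitions set aside for the slow drive. The Python sketch below follows that reading; how many partitions go to the slow drive (`slow_count`) is a choice made here for illustration, and the function name is likewise made up.

```python
def layout_with_slow_drive(partitions, fast_drives, slow_drive, slow_count):
    """Sweep the demanding partitions over the fast drives, park the rest.

    `partitions` is sorted by descending speed requirement; the last
    `slow_count` entries are placed on `slow_drive`.
    """
    split = len(partitions) - slow_count
    demanding, undemanding = partitions[:split], partitions[split:]

    layout = {drive: [] for drive in fast_drives}
    width = len(fast_drives)
    for index, partition in enumerate(demanding):
        sweep, position = divmod(index, width)
        if sweep % 2 == 1:
            position = width - 1 - position  # reverse direction on odd sweeps
        layout[fast_drives[position]].append(partition)

    layout[slow_drive] = list(undemanding)
    return layout


if __name__ == "__main__":
    plan = layout_with_slow_drive([9, 8, 7, 6, 5, 4, 3, 2, 1], ["A", "B"], "C", 3)
    for drive, parts in plan.items():
        print(drive, ":", " ".join(str(p) for p in parts))
```

With three partitions reserved for drive C this yields the table given in the text; in practice the split would depend on how slow the odd drive really is.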

Often drives can be similar in apparent overall speed but some advantage can be gained by matching drives to the file size distribution and frequency of access. Thus binaries are suited to drives with fast access that offer command queueing, and libraries are better suited to drives with larger transfer speeds, where IDE offers good performance for the money.

Optimizing by Drive Parallelising

Avoid drive contention by looking at tasks: for instance if you are accessing /usr/local/bin chances are you will soon also need files from /usr/local/lib, so placing these on separate drives allows less seeking and possible parallel operation and drive caching. It is quite possible that choosing what may appear to be less than ideal drive characteristics will still be advantageous if you can gain parallel operations. Identify common tasks and what partitions they use, and try to keep these on separate physical drives.

Just to illustrate my point I will give a few examples of task analysis here.

... .overview files on separate drives for larger installations.

... /usr/src and project files on the same drive as the home directories.

... and .

Compromises

One way to avoid the aforementioned is to set aside fixed partitions only for directories with a fairly well known size, such as swap, /tmp and /var/tmp, and to group the remainder together into the remaining partitions using symbolic links.

Example: a slow disk (slowdisk), a fast disk (fastdisk) and an assortment of files. Having set up /home and root on slowdisk we have (the fictitious) directories /a/slow, /a/fast, /b/slow and /b/fast left to allocate on the partitions /mnt.slowdisk and /mnt.fastdisk, which represent the remaining partitions of the two drives.

Putting /a or /b directly on either drive gives the same properties to the subdirectories. We could make all 4 directories separate partitions but would lose some flexibility in managing the size of each directory. A better solution is to make these 4 directories symbolic links to appropriate directories on the respective drives.

Thus we make

        /a/fast point to /mnt.fastdisk/a/fast or /mnt.fastdisk/a.fast
        /a/slow point to /mnt.slowdisk/a/slow or /mnt.slowdisk/a.slow
        /b/fast point to /mnt.fastdisk/b/fast or /mnt.fastdisk/b.fast
        /b/slow point to /mnt.slowdisk/b/slow or /mnt.slowdisk/b.slow

and we get all fast directories on the fast drive without having to set up a partition for all 4 directories. The second (right hand) alternative gives us a flatter file system, which in this case can make it simpler to keep an overview of the structure.

The disadvantage is that it is a complicated scheme to set up and plan in the first place and that all mount points and partitions have to be defined before the system installation.

... /usr partition must be mounted directly onto root and not via an indirect link as described above. The reason for this is the long backward links used extensively in X11 that go from deep within /usr all the way to root and then down into /etc directories.

Implementation
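The symbolic link scheme from the Compromises discussion, in its flatter right hand variant, can be tried out safely before touching a real system. The Python sketch below builds the whole structure under a scratch directory passed in as `base`; that scratch root, the function name `link_directories` and the `FLAT_SCHEME` table are assumptions made for this illustration only.

```python
import os

# Link name -> target directory, using the flat ("a.fast") naming variant.
FLAT_SCHEME = {
    "/a/fast": "/mnt.fastdisk/a.fast",
    "/a/slow": "/mnt.slowdisk/a.slow",
    "/b/fast": "/mnt.fastdisk/b.fast",
    "/b/slow": "/mnt.slowdisk/b.slow",
}


def link_directories(base, scheme):
    """Create every target directory under `base` and link to it.

    All paths in `scheme` are taken relative to `base`, so nothing
    outside the scratch directory is touched.
    """
    for link, target in scheme.items():
        link_path = os.path.join(base, link.lstrip("/"))
        target_path = os.path.join(base, target.lstrip("/"))
        os.makedirs(target_path, exist_ok=True)
        os.makedirs(os.path.dirname(link_path), exist_ok=True)
        if not os.path.lexists(link_path):
            os.symlink(target_path, link_path)


if __name__ == "__main__":
    import tempfile
    with tempfile.TemporaryDirectory() as scratch:
        link_directories(scratch, FLAT_SCHEME)
        for link in FLAT_SCHEME:
            path = os.path.join(scratch, link.lstrip("/"))
            print(link, "->", os.readlink(path))
```

On a real system the /mnt.fastdisk and /mnt.slowdisk mount points would of course have to exist and be mounted before the links are of any use.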

Having done the layout you should now have a detailed description of what goes where. Most likely this will be on paper but hopefully someone will make a more automated system that can deal with everything from the design, through partitioning to formatting and installation. This is the route one will have to take to realise the design.

Modern distributions come with installation tools that will guide you through partitioning and formatting and also set up /etc/fstab for you automatically. For later modifications, however, you will need to understand the underlying mechanisms.

Checklist

Before starting make sure you have the following:

  - Written notes of what goes where, your design
  - A functioning, tested rescue disk
  - A fresh backup of your precious data
  - At least two formatted, tested and empty floppies
  - Read and understood the man page for fdisk or equivalent
  - Patience, concentration and elbow grease

Drives and Partitions

When you start DOS or the like you will find all partitions labeled ...

        Dec 6 23:45:18 demos kernel: Partition check:
        Dec 6 23:45:18 demos kernel: sda: sda1
        Dec 6 23:45:18 demos kernel: hda: hda1 hda2

SCSI drives are labelled ... /dev/MAKEDEV and /usr/src/linux/Documentation/devices.txt.

Partitions are labelled numerically for each drive ... /etc/fstab before they appear as a part of the file system.

Partitioning

        It feels so good / It’s a marginal risk / when I clear off / windows with fdisk!
        (the Dustbunny in an of in the song “Refund this”)

First you have to partition each drive into a number of separate partitions. Under Linux there are two main methods, ... that also offers numerous features. Also the GNU project offers a partitioning tool called ...

The ... , a Tcl/Tk-based file system mounter, and , an editing tool for KDE, as well as , another editing tool which is part of KDE.

Briefly, the fields are partition name, where to mount the partition, type of file system, mount options, when to dump for backup and when to do ...

Mount options

Mounting, either by hand or using the fstab, allows for a number of options that offer extra protection. Below are some of the more useful options.

For more information and cautions refer to the man page for ...

Recommendations

Having constructed and implemented your clever scheme you are well advised to make a complete record of it all, on paper. After all, having all the necessary information on disk is no use if the machine is down.

Partition tables can be damaged or lost, in which case it is excruciatingly important that you enter the exact same numbers into ... which will generate a summary of your disk configurations.

For checking your hard disks you can use the Disk Advisor boot disk available . The disk builder requires Windows to run. This system is useful to diagnose failed disks.

You are strongly recommended to make a rescue disk and test it. Most distributions make one available, and it is often part of the installation disks. For some, such as the one for Redhat 6.1, the way to invoke the disk as a rescue disk is to type linux rescue at the boot prompt.

There are also specialised rescue disk distributions available on the net.

When the need for it comes you will need to know where your root and boot partitions reside, which you need to write down and keep safe.

Note: the difference between a boot disk and a rescue disk is that a boot disk will fail if it cannot mount the file system, typically on your hard disk. A rescue disk is self contained and will work even if there are no hard disks.

Maintenance

It is the duty of the system manager to keep an eye on the drives and partitions. Should any of the partitions overflow, the system is likely to stop working properly, no matter how much space is available on other partitions, until space is reclaimed.

Partitions and disks are easily monitored using df, and this should be done frequently, perhaps using a cron job or some other general system management tool.

Do not forget the swap partitions; these are best monitored using one of the memory statistics programs such as free, procinfo or top.

Drive usage monitoring is more difficult, but it is important for the sake of performance to avoid contention, that is, placing too much demand on a single drive while others are available and idle.

It is important when installing software packages to have a clear idea where the various files go. As previously mentioned GCC keeps binaries in a library directory, and there are also other programs that for historical reasons are hard to figure out; X11 for instance has an unusually complex structure.

When your system is about to fill up it is time to check and prune old logging messages as well as hunt down core files. Proper use of

Backup
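Looking back at the Maintenance advice on checking partitions with df from a cron job, the following is a minimal sketch of such a check, not a finished monitoring tool. The function name full_partitions and the threshold of 90 percent are assumptions made for illustration. It reads df -P style output on standard input and assumes that mount points contain no spaces.

```shell
#!/bin/sh
# full_partitions LIMIT
# Reads "df -P" style output on stdin and prints every mount point whose
# usage is at or above LIMIT percent, one per line.
full_partitions() {
    awk -v limit="$1" '
        NR > 1 {                      # skip the df header line
            use = $5                  # fifth column is the capacity, e.g. "57%"
            sub(/%/, "", use)         # strip the percent sign
            if (use + 0 >= limit + 0)
                print $6, use "%"     # sixth column is the mount point
        }'
}
```

From a cron job this could be called as df -P | full_partitions 90, mailing the output to the system manager whenever it is not empty.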

The observant reader might have noticed a few hints about the usefulness of making backups. Horror stories are legion about accidents and what happened to the person responsible when the backup turned out to be non-functional or even nonexistent. You might find it simpler to invest in proper backups than a second, secret identity.

There are many options and also a mini-HOWTO ( There are both free and commercial backup systems available for Linux. One commercial example is the disk image level backup system from offering a full function 30 day Linux demo available online.

Defragmentation

This is very dependent on the file system design; some suffer fast and nearly debilitating fragmentation. Fortunately for us,

Deletions

Quite often disk space shortages can be remedied simply by deleting unnecessary files that accumulate around the system.

Programs that terminate abnormally often leave all kinds of mess lying around in the oddest places. Normally a core dump results after such an incident, and unless you are going to debug it you can simply delete it. These can be found everywhere, so you are advised to do a global search for them now and then. The locate command is useful for this.

Unexpected termination can also leave all sorts of temporary files behind in places like /tmp or /var/tmp, files that are automatically removed when the program ends normally. Rebooting cleans up some of these areas but not necessarily all, and if you have a long uptime you could end up with a lot of old junk. If space is short you have to delete with care; make sure the file is not in active use first. Utilities like /var/log area. In particular the file /var/log/messages tends to grow until deleted. It is a good idea to keep a small archive of old log files around for comparison should the system start to behave oddly.

If the mail or news system is not working properly you could have excessive growth in their spool areas, /var/spool/mail and /var/spool/news respectively. Beware of the overview files as these have a leading dot which makes them invisible to /etc/motd file to tell users when space is short. Setting the default shell settings to prevent core files being dumped can save you a lot of work too.

Certain kinds of people try to hide files around the system, usually trying to take advantage of the fact that files with a leading dot in the name are invisible to the

Upgrades
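As a sketch of the global search for core files and stale temporary files suggested under Deletions, the two helpers below only list candidates and delete nothing. They use find rather than locate, since find does not depend on an index that may be out of date. The function names and the age given in days are assumptions made for illustration.

```shell
#!/bin/sh
# find_core_files DIR
# Lists regular files named "core" below DIR, staying on one file system.
find_core_files() {
    find "$1" -xdev -type f -name core -print
}

# find_stale_files DIR DAYS
# Lists regular files below DIR that were last modified more than DAYS days ago.
find_stale_files() {
    find "$1" -xdev -type f -mtime +"$2" -print
}
```

For instance find_stale_files /tmp 7 would list files in /tmp untouched for over a week. Check that a file is not in active use before removing it.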

No matter how large your drives, the time will come when you find you need more. As technology progresses you can get ever more for your money. At the time of writing, it appears that 6.4 GB drives give you the most bang for your buck.

Note that with IDE drives you might have to remove an old drive, as the maximum number supported on your motherboard is normally only 2 or sometimes 4. With SCSI you can have up to 7 for narrow (8-bit) SCSI or up to 15 for wide (16-bit) SCSI, per channel. Some host adapters can support more than a single channel and in any case you can have more than one host adapter per system. My personal recommendation is that you will most likely be better off with SCSI in the long run.

The question comes, where should you put this new drive? In many cases the reason for expansion is that you want a larger spool area, and in that case the fast, simple solution is to mount the drive somewhere under /var/spool. On the other hand newer drives are likely to be faster than older ones, so in the long run you might find it worth your time to do a full reorganization, possibly using your old design sheets.

If the upgrade is forced by running out of space in partitions used for things like /usr or /var, the upgrade is a little more involved. You might consider the option of a full re-installation from your favourite (and hopefully upgraded) distribution. In this case you will have to be careful not to overwrite your essential setups. Usually these things are in the /etc directory. Proceed with care, fresh backups and working rescue disks. The other possibility is to simply copy the old directory over to the new directory, which is mounted on a temporary mount point, edit your /etc/fstab file, reboot with your new partition in place and check that it works. Should it fail you can reboot with your rescue disk, re-edit /etc/fstab and try again.
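The copy step described above can be rehearsed on scratch directories before it is tried on a real partition. The following is a minimal sketch under stated assumptions: the function name copy_tree is made up for illustration, cp -a is the archive option of GNU cp, and file names are assumed not to contain newlines. After copying, it checks that every path in the source also exists in the destination. Extra entries in the destination, such as lost+found on a fresh ext2 partition, are ignored.

```shell
#!/bin/sh
# copy_tree SRC DEST
# Copies the contents of directory SRC into the existing directory DEST,
# preserving links, permissions and timestamps, then verifies that every
# path found under SRC is present under DEST.
copy_tree() {
    src="$1"
    dest="$2"
    cp -a "$src/." "$dest/" || return 1
    # Collect the source paths that have no counterpart in the copy.
    missing=$( (cd "$src" && find . -print) | while IFS= read -r p; do
        [ -e "$dest/$p" ] || [ -L "$dest/$p" ] || printf '%s\n' "$p"
    done )
    if [ -n "$missing" ]; then
        printf 'missing in copy:\n%s\n' "$missing" >&2
        return 1
    fi
}
```

With the new partition mounted on a temporary mount point, a call such as copy_tree /usr /mnt/newusr (both paths hypothetical) would do the copy and return a non-zero status if anything was left behind.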
Until volume management becomes available to Linux this is both complicated and dangerous. Do not get too surprised if you discover you need to restore your system from a backup.

The Tips-HOWTO gives the following example on how to move an entire directory structure across:

    (cd /source/directory; tar cf - . ) | (cd /dest/directory; tar xvfp -)

While this approach to moving directory trees is portable among many Unix systems, it is inconvenient to remember. Also, it fails for deeply nested directory trees when pathnames become too long for tar to handle (GNU tar has special provisions to deal with long pathnames).

If you have access to GNU cp (which is always the case on Linux systems), you could as well use

    cp -av /source/directory /dest/directory

GNU cp knows specifically about symbolic links, hard links, FIFOs and device files and will copy them correctly.

Remember that it might not be a good idea to try to transfer /dev or /proc.

There is also a that gives you a step by step guide on migrating an entire Linux system, including LILO, from one hard disk to another.

Recovery

System crashes come in many and entertaining flavours, and partition table corruption always guarantees plenty of excitement. A recent and undoubtedly useful tool for those of us who are happy with the normal level of excitement, is which means “Guess PC-Type hard disk partitions”. Useful. In addition there are some

Further Information

There is a wealth of information one should go through when setting up a major system, for instance for a news or general Internet service provider. The FAQs in the following groups are useful:

News groups

Some of the most interesting news groups are: . . . . .

Most newsgroups have their own FAQ, designed to answer most of your questions, as the name Frequently Asked Questions indicates. Fresh versions should be posted regularly to the relevant newsgroups. If you cannot find it in your news spool you could go directly to the . The WWW versions can be browsed at

Mailing Lists

These are low noise channels mainly for developers. Think twice before asking questions there, as noise delays the development. Some relevant lists are vger.rutgers.edu server but this is notoriously overloaded, so try to find a mirror. There are some lists mirrored at . Many lists are also accessible at , and the rest of the web site is a gold mine of useful information.

If you want to find out more about the lists available you can send a message with the line ). If you need help on how to use the mail server just send the line ) and the Intelligent IO list .

Mailing lists are in a state of flux but you can find links to a number of interesting lists from the .

HOWTO

These are intended as the primary starting points to get the background information as well as show you how to solve a specific problem. Some relevant HOWTOs are . There is a new HOWTO out that deals with setting up a DPT RAID system, check out the .

Mini-HOWTO

These are the smaller free text relatives of the HOWTOs. Some relevant mini-HOWTOs are /usr/src/linux/drivers/block/README.ide or /usr/src/linux/Documentation/ide.txt.

Local Resources

In most distributions of Linux there is a document directory installed, have a look in the directory where most packages store their main documentation and README files etc. Also you will here find the HOWTO archive ( ) of ready formatted HOWTOs and also the mini-HOWTO archive ( ) of plain text documents.

Many of the configuration files mentioned earlier can be found in the directory. In particular you will want to work with the file that sets up the mounting of partitions and possibly also file that is used for the is, of course, the ultimate documentation. In other words, use the source, Luke.

It should also be pointed out that the kernel comes not only with source code which is even commented (well, partially at least) but also an informative . If you are about to ask any questions about the kernel you should read this first; it will save you and many others a lot of time and possibly embarrassment.

Also have a look in your system log file ( ) to see what is going on and in particular how the booting went if too much scrolled off your screen. Using tail -f /var/log/messages in a separate window or screen will give you a continuous update of what is going on in your system.

You can also take advantage of the file system that is a window into the inner workings of your system. Use

Web Pages

There is a huge number of informative web pages out there and by their very nature they change quickly, so don’t be too surprised if these links become quickly outdated.

A good starting point is of course . that is an information central for documentation, project pages and much, much more.

Mike Neuffer, the author of the DPT caching RAID controller drivers, has some interesting pages on and .

Software RAID development information can be found at along with patches and utilities.

Disk related information on benchmarking, RAID, reliability and much, much more can be found at project page.

There is also information available on how to and what software packages are needed to achieve this.

In depth documentation on is also available.

People who are looking for information on VFAT, FAT32 and Joliet could have a look at the . These drivers are in the 2.1.x kernel development series as well as in 2.0.34 and later.

For diagrams and information on all sorts of disk drives, controllers etc. both for current and discontinued lines is the site you need. There is a lot of useful information here, a real treasure trove.

Please let me know if you have any other leads that can be of interest.

Search Engines

When all else fails try the Internet search engines. There is a huge number of them, all a little different from each other. It falls outside the scope of this HOWTO to describe how best to use them. Instead you could turn to the Troubleshooting on the Internet mini-HOWTO, and the Updated mini-HOWTO.

If you have to ask for help you are most likely to get help in the news group. Due to large workload and a slow network connection I am not able to follow that newsgroup, so if you want to contact me you have to do so by e-mail.

Getting Help

In the end you might find yourself unable to solve your problems and need help from someone else. The most efficient way is to ask someone local or in your nearest Linux user group; search the web for the nearest one.

Another possibility is to ask on Usenet News in one of the many, many newsgroups available. The problem is that these have such a high volume and noise (called low signal-to-noise ratio) that your question can easily fall through unanswered.

No matter where you ask it is important to ask well or you will not be taken seriously. Saying just

  Processor
  DMA
  IRQ
  Chip set (LX, BX etc)
  Bus (ISA, VESA, PCI etc)
  Expansion cards used (Disk controllers, video, IO etc)
  BIOS (On motherboard and possibly SCSI host adapters)
  LILO, if used
  Linux kernel version as well as possible modifications and patches
  Kernel parameters, if any
  Software that shows the error (with version number or date)
  Type of disk drives with manufacturer name, version and type
  Other relevant peripherals connected to the same busses
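Several items on the checklist above can be gathered from a running Linux system in one go. The following is a minimal sketch, assuming a kernel that exposes the usual files under /proc; the function names report_section and collect_sysinfo are made up for illustration. Files that are missing on a given system are skipped silently, and items such as BIOS version or expansion cards still have to be written down by hand.

```shell
#!/bin/sh
# report_section TITLE FILE
# Prints FILE under a heading, or nothing at all if FILE is not readable.
report_section() {
    if [ -r "$2" ]; then
        printf '== %s (%s) ==\n' "$1" "$2"
        cat "$2"
        printf '\n'
    fi
}

# collect_sysinfo
# Prints a plain-text report suitable for pasting into a request for help.
collect_sysinfo() {
    printf '== Kernel version ==\n'
    uname -sr
    printf '\n'
    report_section 'Kernel parameters' /proc/cmdline
    report_section 'Processor' /proc/cpuinfo
    report_section 'IRQ' /proc/interrupts
    report_section 'DMA' /proc/dma
    report_section 'SCSI devices' /proc/scsi/scsi
    report_section 'Partitions' /proc/partitions
}
```

Running collect_sysinfo > report.txt (the file name is arbitrary) leaves a report that can be trimmed and included in a posting.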

As an example of how interrelated these problems are: an old chip set caused problems with a certain combination of video controller and SCSI host adapter.

Remember that booting text is logged to /var/log/messages which can answer most of the questions above. Obviously if the drives fail you might not be able to get the log saved to disk but you can at least scroll back up the screen using the

Concluding Remarks

Disk tuning and partition decisions are difficult to make, and there are no hard rules here. Nevertheless it is a good idea to work more on this as the payoffs can be considerable. Maximizing usage on one drive only while the others are idle is unlikely to be optimal; watch the drive lights, they are not there just for decoration. For a properly set up system the lights should look like Christmas in a disco.

Linux offers software RAID but also support for some hardware-based SCSI RAID controllers. Check what is available. As your system and experiences evolve you are likely to repartition and you might look on this document again. Additions are always welcome.

Finally I’d like to sum up my recommendations:

  Disks are cheap but the data they contain could be much more valuable; use and test your backup system.
  Work is also expensive; make sure you get large enough disks, as refitting new or repartitioning old disks takes time.
  Think reliability; replace old disks before they fail.
  Keep a paper copy of your setup; having it all on disk when the machine is down will not help you much.
  Start out with a simple design with a minimum of fancy technology and rather fit it in later. In general adding is easier than replacing, be it disks, technology or other features.

Coming Soon

There are a few more important things that are about to appear here. In particular I will add more example tables as I am about to set up two fairly large and general systems, one at work and one at home. These should give some general feeling on how a system can be set up for either of these two purposes. Examples of smooth running existing systems are also welcome.

There is also a fair bit of work left to do on the various kinds of file systems and utilities.

There will be a big addition on drive technologies coming soon as well as a more in depth description on using fdisk, cfdisk and sfdisk. The file systems will be beefed up as more features become available, as well as more on RAID and what directories can benefit from what RAID level.

There is some minor overlapping with the Linux Filesystem Structure Standard and FHS that I hope to integrate better soon, which will probably mean a big reworking of all the tables at the end of this document.

As more people start reading this I should get some more comments and feedback. I am also thinking of making a program that can automate a fair bit of this decision making process, and although it is unlikely to be optimal it should provide a simpler, more complete starting point.

Request for Information

It has taken a fair bit of time to write this document and although most pieces are beginning to come together there is still some information needed before we are out of the beta stage.

  More information on swap sizing policies is needed, as well as information on the largest swap size possible under the various kernel versions.
  How common is drive or file system corruption? So far I have only heard of problems caused by flaky hardware.
  References to speed and drives are needed.
  Are any other Linux compatible RAID controllers available?
  What relevant monitoring, management and maintenance tools are available?
  General references to information sources are needed, perhaps this should be a separate document?
  Usage of /tmp and /var/tmp has been hard to determine; in fact what programs use which directory is not well defined and more information here is required. Still, it seems at least clear that these should reside on different physical drives in order to increase parallelism.

Suggested Project Work

Now and then people post on comp.os.linux.*, looking for good project ideas. Here I will list a few that come to mind that are relevant to this document. Plans about big projects such as new file systems should still be posted in order to either find co-workers or see if someone is already working on it.

Questions and Answers

This is just a collection of what I believe are the most common questions people might have. Give me more feedback and I will turn this section into a proper FAQ.

Q: How many physical disk drives (spindles) does a Linux system need?

A: Linux can run just fine on one drive (spindle). Having enough RAM (around 32 MB, and up to 64 MB) to support swapping is a better price/performance choice than getting a second disk. An (E)IDE disk is usually cheaper (but a little slower) than SCSI.

Q: I have a single drive, will this HOWTO help me?

A: Yes, although only to a minor degree. Still, section will offer you some gains.

Q: Are there any disadvantages in this scheme?

A: There is only a minor snag: if even a single partition overflows the system might stop working properly. The severity depends of course on what partition is affected. Still this is not hard to monitor, the command

Q: OK, so should I split the system into as many partitions as possible for a single drive?

A: No, there are several disadvantages to that. First of all maintenance becomes needlessly complex and you gain very little in this. In fact if your partitions are too big you will seek across larger areas than needed. This is a balance and dependent on the number of physical drives you have.

Q: Does that mean more drives allow more partitions?

A: To some degree, yes. Still, some directories should not be split off from root; check out the file system standards for more details.

Q: What if I have many drives I want to use?

A: If you have more than 3-4 drives you should consider using RAID of some form. Still, it is a good idea to keep your root partition on a simple partition without RAID, see section for more details.

Q: I have installed the latest Windows95 but cannot access this partition from within the Linux system, what is wrong?

A: Most likely you are using

Q: I cannot get the disk size and partition sizes to match, something is missing. What has happened?

A: It is possible you have mounted a partition onto a mount point that was not an empty directory. Mount points are directories, and if one is not empty the mounting will mask its contents. If you do the sums you will see that the amount of disk space used in this directory is missing from the observed total.

To solve this you can boot from a rescue disk and see what is hiding behind your mount points, then remove or transfer the contents by mounting the offending partition on a temporary mount point. You might find it useful to have “spare” emergency mount points ready made.

Q: It doesn’t look like my swap partition is in use, how come?

A: It is possible that it has not been necessary to swap out, especially if you have plenty of RAM. Check your log files to see if you ran out of memory at one point or another; in that case your swap space should have been put to use. If not it is possible that either the swap partition was not assigned the right number, that you did not prepare it with file.

Q: What is this Nyx that is mentioned several times here?

A: It is a large free Unix system with currently about 10000 users. I use it for my web pages for this HOWTO as well as a source of ideas for a setup of large Unix systems. It has been running for many years and has a quite stable setup. For more information you can view the which also gives you information on how to get your own free account.

Bits and Pieces

This is basically a section where I stuff all the bits that I have not yet decided where to put but that I feel are worth knowing about. It is a kind of transient area.

Power and Heating

Not many years ago a machine with the equivalent power of a modern PC required 3-phase power and cooling, usually by air conditioning the machine room, sometimes also by water cooling. Technology has progressed very quickly, giving not only high speed but also low power components. Still, there is a definite limit to the technology, something one should keep in mind as the system is expanded with yet another disk drive or PCI card.

When the power supply is running at full rated power, keep in mind that all this energy is going somewhere, mostly into heat. Unless this is dissipated using fans you will get serious heating inside the cabinet, followed by reduced reliability and life time of the electronics. Manufacturers state minimum cooling requirements for their drives, usually in terms of cubic feet per minute (CFM). You are well advised to take this seriously.

Keep air flow passages open, clean out dust and check the temperature of your system while it is running. If it is too hot to touch it is probably running too hot.

If possible use sequential spin up for the drives. It is during spin up, when the drive platters accelerate up to normal speed, that a drive consumes maximum power, and if all drives start up simultaneously you could go beyond the rated power maximum of your power supply.

Deja

This was an Internet system that was acquired by Google and is now available as Google Groups. It searches and serves for more information. It changed name from Dejanews.

What perhaps is less known is that they used about 120 Linux SMP computers many of which use the For the production systems (which are up 365 days a year) the downtime attributable to disk errors is less than 0.25 % (that is a quarter of 1%, not 25%).

Just in case: this is not an advertisement, it is stated as an example of how much is required for what is a major Internet service.

Crash Recovery

Occasionally hard disks crash. A crash causing data scrambling can often be at least partially recovered from and there are already HOWTOs describing this.

In case of hardware failure things are far more serious, and you have two options: either send the drive to a professional data recovery company, or try recovering yourself. The latter is of course high risk and can cause more damage.

If a disk stops rotating or fails to spin up, the number one advice is first to turn off the system as fast as safely possible.

Next you could try disconnecting the drives and powering up the machine, just to check with a multimeter that power is present. Quite often connectors can get unseated and cause all sorts of problems.

If you decide to risk trying it yourself you could check all connectors and then reapply power and see if the drive spins up and responds. If it is still dead turn off power quickly, preferably before the operating system boots. Make sure that delayed spinup is not deceiving you here.

If you decide to progress even further (and take higher risks) you could remove the drive and give it a firm tap on the side so that the disk moves a little with respect to the casing. This can help in unsticking the head from the surface, allowing the platter to move freely, as the motor power is not sufficient to unstick a stuck head on its own.

Also if a drive has been turned off for a while after running for long periods of time, or if it has overheated, the lubricant can harden or drain out of the bearings. In this case warming the drive slowly and gently up to normal operating temperature may cure the lubrication problems.

If after this the drive still does not respond, the last possible and highest risk suggestion is to replace the circuit board of the drive with a board from an identical model drive.
Often the contents of a drive are worth far more than the media itself, so do consider professional help. These companies have advanced equipment and know-how obtained from the manufacturers on how to recover a damaged drive, far beyond that of a hobbyist.

Appendix A: Partitioning Layout Table: Mounting and Linking

The following table is designed to make layout a simpler paper and pencil exercise. It is probably best to print it out (using NON PROPORTIONAL fonts) and adjust the numbers until you are happy with them.

Mount point is what directory you wish to mount a partition on or the actual device. This is also a good place to note how you plan to use symbolic links.

The size given corresponds to a fairly big Debian 1.2.6 installation. Other examples are coming later.

Mainly you use this table to select what structure and drives you will use; the partition numbers and letters will come from the next two tables.

Directory        Mount point  speed  seek   transfer  size  SIZE

swap             __________   ooooo  ooooo  ooooo     32    ____
/                __________   o      o      o         20    ____
/tmp             __________   oooo   oooo   oooo            ____
/var             __________   oo     oo     oo        25    ____
/var/tmp         __________   oooo   oooo   oooo            ____
/var/spool       __________                                 ____
/var/spool/mail  __________   o      o      o               ____
/var/spool/news  __________   ooo    ooo    oo              ____
/var/spool/____  __________   ____   ____   ____            ____
/home            __________   oo     oo     oo              ____
/usr             __________                           500   ____
/usr/bin         __________   o      oo     o         250   ____
/usr/lib         __________   oo     oo     ooo       200   ____
/usr/local       __________                                 ____
/usr/local/bin   __________   o      oo     o               ____
/usr/local/lib   __________   oo     oo     ooo             ____
/usr/local/____  __________                                 ____
/usr/src         __________   o      oo     o         50    ____
DOS              __________   o      o      o               ____
Win              __________   oo     oo     oo              ____
NT               __________   ooo    ooo    ooo             ____
/mnt._________   __________   ____   ____   ____            ____
/mnt._________   __________   ____   ____   ____            ____
/mnt._________   __________   ____   ____   ____            ____
/_____________   __________   ____   ____   ____            ____
/_____________   __________   ____   ____   ____            ____
/_____________   __________   ____   ____   ____            ____
Total capacity:

Appendix B: Partitioning Layout Table: Numbering and Sizing

This table follows the same logical structure as the table above where you decided what disk to use. Here you select the physical tracking, keeping in mind the effect of track positioning mentioned earlier in . The final partition number will come out of the table after this.

Drive              sda    sdb    sdc    hda    hdb    hdc    ___
SCSI ID          | __   | __   | __   |

Directory
swap             |      |      |      |      |      |      |
/                |      |      |      |      |      |      |
/tmp             |      |      |      |      |      |      |
/var             :      :      :      :      :      :      :
/var/tmp         |      |      |      |      |      |      |
/var/spool       :      :      :      :      :      :      :
/var/spool/mail  |      |      |      |      |      |      |
/var/spool/news  :      :      :      :      :      :      :
/var/spool/____  |      |      |      |      |      |      |
/home            |      |      |      |      |      |      |
/usr             |      |      |      |      |      |      |
/usr/bin         :      :      :      :      :      :      :
/usr/lib         |      |      |      |      |      |      |
/usr/local       :      :      :      :      :      :      :
/usr/local/bin   |      |      |      |      |      |      |
/usr/local/lib   :      :      :      :      :      :      :
/usr/local/____  |      |      |      |      |      |      |
/usr/src         :      :      :      :      :      :      :
DOS              |      |      |      |      |      |      |
Win              :      :      :      :      :      :      :
NT               |      |      |      |      |      |      |
/mnt.___/_____   |      |      |      |      |      |      |
/mnt.___/_____   :      :      :      :      :      :      :
/mnt.___/_____   |      |      |      |      |      |      |
/_____________   :      :      :      :      :      :      :
/_____________   |      |      |      |      |      |      |
/_____________   :      :      :      :      :      :      :
Total capacity:

Appendix C: Partitioning Layout Table: Partition Placement

This is just to sort the partition numbers in ascending order ready to input to fdisk or cfdisk. Here you take physical track positioning into account when finalizing your design. Unless you get specific information otherwise, you can assume track 0 is the outermost track.

These numbers and letters are then used to update the previous tables, all of which you will find very useful in later maintenance.

In case of disk crash you might find it handy to know what SCSI id belongs to which drive; consider keeping a paper copy of this.

Drive :            sda    sdb    sdc    hda    hdb    hdc    ___
Total capacity:  | ___  | ___  | ___  | ___  | ___  | ___  | ___
SCSI ID          | __   | __   | __   |

Partition
 1               |      |      |      |      |      |      |
 2               :      :      :      :      :      :      :
 3               |      |      |      |      |      |      |
 4               :      :      :      :      :      :      :
 5               |      |      |      |      |      |      |
 6               :      :      :      :      :      :      :
 7               |      |      |      |      |      |      |
 8               :      :      :      :      :      :      :
 9               |      |      |      |      |      |      |
10               :      :      :      :      :      :      :
11               |      |      |      |      |      |      |
12               :      :      :      :      :      :      :
13               |      |      |      |      |      |      |
14               :      :      :      :      :      :      :
15               |      |      |      |      |      |      |
16               :      :      :      :      :      :      :

Appendix D: Example: Multipurpose Server

The following table is from the setup of a medium sized multipurpose server where I once worked. Aside from being a general Linux machine it will also be a network related server (DNS, mail, FTP, news, printers etc.), X server for various CAD programs, CD ROM burner and many other things. The files reside on 3 SCSI drives with a capacity of 600, 1000 and 1300 MB.

Some further speed could possibly be gained by splitting /usr/local from the rest of the /usr system but we deemed the further added complexity would not be worth it. With another couple of drives this could be more worthwhile. In this setup drive sda is old and slow and could just as well be replaced by an IDE drive. The other two drives are both rather fast. Basically we split most of the load between these two. To reduce dangers of imbalance in partition sizing we have decided to keep /usr/bin and /usr/local/bin on one drive and /usr/lib and /usr/local/lib on another separate drive, which also affords us some drive parallelizing.

Even more could be gained by using RAID but we felt that as a server we needed more reliability than was then afforded by the

Appendix E: Example: Mounting and Linking

Directory        Mount point      speed  seek   transfer  size  SIZE

swap             sdb2, sdc2       ooooo  ooooo  ooooo     32    2×64
/                sda2             o      o      o         20    100
/tmp             sdb3             oooo   oooo   oooo            300
/var             __________       oo     oo     oo              ____
/var/tmp         sdc3             oooo   oooo   oooo            300
/var/spool       sdb1                                           436
/var/spool/mail  __________       o      o      o               ____
/var/spool/news  __________       ooo    ooo    oo              ____
/var/spool/____  __________       ____   ____   ____            ____
/home            sda3             oo     oo     oo              400
/usr             sdb4                                     230   200
/usr/bin         __________       o      oo     o         30    ____
/usr/lib         -> libdisk       oo     oo     ooo       70    ____
/usr/local       __________                                     ____
/usr/local/bin   __________       o      oo     o               ____
/usr/local/lib   -> libdisk       oo     oo     ooo             ____
/usr/local/____  __________                                     ____
/usr/src         ->/home/usr.src  o      oo     o         10    ____
DOS              sda1             o      o      o               100
Win              __________       oo     oo     oo              ____
NT               __________       ooo    ooo    ooo             ____
/mnt.libdisk     sdc4             oo     oo     ooo             226
/mnt.cd          sdc1             o      o      oo              710
Total capacity: 2900 MB

Appendix F: Example: Numbering and Sizing

Here we do the adjustment of sizes and positioning.

Directory           sda      sdb      sdc

swap             |        |    64  |    64  |
/                |   100  |        |        |
/tmp             |        |   300  |        |
/var             :        :        :        :
/var/tmp         |        |        |   300  |
/var/spool       :        :   436  :        :
/var/spool/mail  |        |        |        |
/var/spool/news  :        :        :        :
/var/spool/____  |        |        |        |
/home            |   400  |        |        |
/usr             |        |   200  |        |
/usr/bin         :        :        :        :
/usr/lib         |        |        |        |
/usr/local       :        :        :        :
/usr/local/bin   |        |        |        |
/usr/local/lib   :        :        :        :
/usr/local/____  |        |        |        |
/usr/src         :        :        :        :
DOS              |   100  |        |        |
Win              :        :        :        :
NT               |        |        |        |
/mnt.libdisk     |        |        |   226  |
/mnt.cd          :        :        :   710  :
/mnt.___/_____   |        |        |        |
Total capacity:  |   600  |  1000  |  1300  |

Appendix G: Example: Partition Placement

This is just to sort the partition numbers in ascending order ready to input to fdisk or cfdisk. Remember to optimize for physical track positioning (not done here).

Drive :              sda     sdb     sdc
Total capacity:   |  600  | 1000  | 1300  |

Partition 1       |  100  |  436  |  710  |
          2       :  100  :   64  :   64  :
          3       |  400  |  300  |  300  |
          4       :       :  200  :  226  :

Appendix H: Example II

The following is an example of a server setup in an academic setting, and is contributed by

/var/spool/delegate is a directory for storing logs and cache files of a WWW proxy server program, “delegated”. Since I have not publicized it widely, there are currently 1000–1500 requests/day, and average disk usage is 15–30% with caches expired each day.

/mnt.archive is used for data files which are big and not frequently referenced, such as experimental data (especially graphic ones), various source archives, and Win95 backups (growing very fast…).

/mnt.root is a backup root file system containing rescue utilities. A boot floppy is also prepared to boot with this partition.

=================================================
Directory               sda     sdb     hda

swap                 |   64  |   64  |       |
/                    |       |       |   20  |
/tmp                 |       |       |  180  |
/var                 :  300  :       :       :
/var/tmp             |       |  300  |       |
/var/spool/delegate  |  300  |       |       |
/home                |       |       |  850  |
/usr                 |  360  |       |       |
/usr/lib             -> /mnt.lib/usr.lib
/usr/local/lib       -> /mnt.lib/usr.local.lib
/mnt.lib             |       |  350  |       |
/mnt.archive         :       : 1300  :       :
/mnt.root            |       |   20  |       |
Total capacity:        1024    2034    1050
=================================================
Drive :                 sda     sdb     hda
Total capacity:      | 1024  | 2034  | 1050  |

Partition 1          |  300  |   20  |   20  |
          2          :   64  : 1300  :  180  :
          3          |  300  |   64  |  850  |
          4          :  360  :  ext  :       :
          5          |       |  300  |       |
          6          :       :  350  :       :

Filesystem  1024-blocks    Used  Available  Capacity  Mounted on
/dev/hda1         19485   10534       7945       57%  /
/dev/hda2        178598      13     169362        0%  /tmp
/dev/hda3        826640  440814     343138       56%  /home
/dev/sda1        306088   33580     256700       12%  /var
/dev/sda3        297925   47730     234807       17%  /var/spool/delegate
/dev/sda4        363272  170872     173640       50%  /usr
/dev/sdb5        297598       2     282228        0%  /var/tmp
/dev/sdb2       1339248  302564     967520       24%  /mnt.archive
/dev/sdb6        323716   78792     228208       26%  /mnt.lib

Apparently /tmp and /var/tmp are too big. These directories will be merged into one partition when disk space runs short. /mnt.lib also seems to be too big, but I plan to install newer TeX and ghostscript archives, so /usr/local/lib may grow by about 100 MB or so (since we must use Japanese fonts!).

The whole system is backed up by a Seagate Tapestore 8000 (Travan TR-4, 4G/8G).

Appendix I: Example III: SPARC Solaris

The following section is the basic design used at work for a number of Sun SPARC servers running Solaris 2.5.1 in an industrial development environment. It serves a number of database and CAD applications in addition to the normal services such as mail. Simplicity is emphasized here so /usr/lib has not been split off from /usr.

This is the basic layout, planned for about 100 users.

Drive:      SCSI 0                    SCSI 1

Partition   Size (MB)   Mount point   Size (MB)   Mount point

0           160         swap          160         swap
1           100         /tmp          100         /var/tmp
2           400         /usr
3           100         /
4            50         /var
5
6           remainder   /local0       remainder   /local1

Due to specific requirements at this place it is at times necessary to have large partitions available at short notice. Therefore drive 0 is given as many tasks as feasible, leaving a large /local1 partition.

This setup has been in use for some time now and has been found satisfactory.

For a more general and balanced system it would be better to swap /tmp and /var/tmp and then move /var to drive 1.

Appendix J: Example IV: Server with 4 Drives

This gives an example of using all techniques described earlier, short of RAID. It is admittedly rather complicated but offers in return high performance from modest hardware. Dimensioning is skipped but reasonable figures can be found in previous examples.

Partition   sda         sdb          sdc         sdd
            ----        ----         ----        ----
1           root        overview     lib         news
2           swap        swap         swap        swap
3           home        /usr         /var/tmp    /tmp
4                       spare root   mail        /var

Setup is optimised with respect to track positioning but also for minimising drive seeks. If you want DOS or Windows too you will have to use

Partition   sda         sdb          sdc         sdd
            ----        ----         ----        ----
1           boot        overview     news        news
2           overview    swap         swap        swap
3           swap        lib          lib         lib
4           lib         overview     /tmp        /tmp
5           /var/tmp    /var/tmp     mail        /usr
6           /home       /usr         /usr        mail
7           /usr        /home        /var
8           / (root)    spare root

Here all duplicates are parts of a RAID 0 set with two exceptions: swap, which is interleaved, and home and mail, which are implemented as RAID 1 for safety.

Note that boot and root are separated: only the boot file with the kernel has to reside within the 1023 cylinder limit. The rest of the root files can be anywhere and here they are placed on the slowest outermost partition. For simplicity and safety the root partition is not on a RAID system.

With such a complicated comes an equally complicated

/dev/sda8   /          ?   ?        1 1   (a)
/dev/sdb8   /          ?   noauto   1 2   (b)
/dev/sda1   boot       ?   ?        1 2   (a)
/dev/sdc7   /var       ?   ?        1 2   (c)
/dev/md1    news       ?   ?        1 3   (c+d)
/dev/md2    /var/tmp   ?   ?        1 3   (a+b)
/dev/md3    mail       ?   ?        1 4   (c+d)
/dev/md4    /home      ?   ?        1 4   (a+b)
/dev/md5    /tmp       ?   ?        1 5   (c+d)
/dev/md6    /usr       ?   ?        1 6   (a+b+c+d)
/dev/md7    /lib       ?   ?        1 7   (a+b+c+d)

The letters in the brackets indicate what drives will be active for each

Appendix K: Example V: Dual Drive System

A dual drive system offers less opportunity for clever schemes but the following should provide a simple starting point.

Partition   sda         sdb
            ----        ----
1           boot        lib
2           swap        news
3           /tmp        swap
4           /usr        /var/tmp
5           /var        /home
6           / (root)

If you use a dual OS system you have to keep in mind that many other systems must boot from the first partition on the first drive. A simple DOS / Linux system could look like this:

Partition   sda         sdb
            ----        ----
1           DOS         lib
2           boot        news
3           swap        swap
4           /tmp        /var/tmp
5           /usr        /home
6           /var        DOSTEMP
7           / (root)

Also remember that DOS and Windows prefer there to be just a single primary partition, which has to be the first one and the one they boot from. As Linux can happily exist in logical partitions this is not a big problem.

Appendix L: Example VI: Single Drive System

Although this falls somewhat outside the scope of this HOWTO it cannot be denied that recently some rather large drives have become very affordable. Drives with 100 – 250 GB are becoming common and the question often is how best to partition such monsters. Interestingly enough very few seem to have any problems in filling up such drives and the future looks generally quite rosy for manufacturers planning on even bigger drives.

Opportunities for optimisations are of course even smaller than for 2 drive systems but some tricks can still be used to optimise track positions while minimising head movements.

Partition   hda         Size estimate (MB)
            ----        ------------------
1           DOS         500
2           boot        20
3           Winswap     200
4           data        The bulk of the drive
5           lib         50 – 500
6           news        300+
7           swap        128 (Maximum size for 32-bit CPU)
8           tmp         300+ (/tmp and /var/tmp)
9           /usr        50 – 500
10          /home       300+
11          /var        50 – 300
12          mail        300+
13          / (root)    30
14          dosdata     10 (Windows bug workaround!)

Remember that the

Appendix M: Disk System Documenter

This shell script was very kindly provided by Steffen Hulegaard. Run it as root (superuser) and it will generate a summary of your disk setup. Run it after you have implemented your design and compare it with what you designed to check for mistakes. Should your system develop defects this document will also be a useful starting point for recovery.

#!/bin/bash
#$Header$
#
# makediskdoc   Collects storage/disk info via df, mount,
#               /etc/fstab and fdisk. Creates a single
#               reference file - /root/sysop/doc/README.diskdoc
#               Especially good for documenting storage
#               config/partitioning
#
# 11/11/1999 SC Hulegaard  Created just before RedHat 5.2 to
#                          RedHat 6.1 upgrade
# 12/31/1999 SC Hulegaard  Added sfdisk -glx usage just prior to
#                          collapse of my Quantum Grand Prix (4.3 Gb)
#
# SEE ALSO      Other /root/bin/make*doc commands to produce other
#               /root/sysop/doc/README.* files. For example,
#               /root/bin/makenetdoc.
#
FILE=/root/sysop/doc/README.diskdoc

# Make sure the target directory exists before writing to it
mkdir -p /root/sysop/doc

# Report banner: file name, program name, date and RCS header
echo Creating $FILE ...
echo ' ' > $FILE
echo $FILE >> $FILE
echo Produced By $0 >> $FILE
echo `date` >> $FILE
echo ' ' >> $FILE
# Single quotes keep the shell from expanding the RCS keyword
echo '$Header$' >> $FILE
echo ' ' >> $FILE

# Mounted file systems: block usage, inode usage, mount options
echo DESCRIPTION: df -a >> $FILE
df -a >> $FILE 2>&1
echo ' ' >> $FILE
echo DESCRIPTION: df -ia >> $FILE
df -ia >> $FILE 2>&1
echo ' ' >> $FILE
echo DESCRIPTION: mount >> $FILE
mount >> $FILE 2>&1
echo ' ' >> $FILE
echo DESCRIPTION: /etc/fstab >> $FILE
cat /etc/fstab >> $FILE
echo ' ' >> $FILE
echo DESCRIPTION: sfdisk -s disk device size summary >> $FILE
sfdisk -s >> $FILE
echo ' ' >> $FILE

# Partition tables for every /dev/sd? and /dev/hd? disk named in
# /etc/fstab. cut counts from 1, so columns 1-8 give e.g. /dev/sda;
# sort -u also removes duplicates that are not on adjacent lines.
echo DESCRIPTION: sfdisk -glx info for all disks listed in /etc/fstab >> $FILE
for x in `grep -E '/dev/[sh]' /etc/fstab | cut -c 1-8 | sort -u`; do
    echo ' ' >> $FILE
    echo $x ============================= >> $FILE
    sfdisk -glx $x >> $FILE
done
echo ' ' >> $FILE
echo DESCRIPTION: fdisk -l info for all disks listed in /etc/fstab >> $FILE
for x in `grep -E '/dev/[sh]' /etc/fstab | cut -c 1-8 | sort -u`; do
    echo ' ' >> $FILE
    echo $x ============================= >> $FILE
    fdisk -l $x >> $FILE
done
echo ' ' >> $FILE

# Kernel boot messages mentioning SCSI (sd) or IDE (hd) drives
echo DESCRIPTION: dmesg info on both sd and hd drives >> $FILE
dmesg | grep -E '[hs]d[a-z]' >> $FILE
echo '' >> $FILE
echo Done >> $FILE
echo Done
exit
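One way to make the comparison suggested above routine is to keep a copy of an earlier report and compare each new report against it. The following is only a sketch: the function name compare_diskdoc and the .previous file name are made up for illustration, and the path /root/bin/makediskdoc is an assumption based on the script's own header comments.

```shell
#!/bin/bash
# compare_diskdoc OLD NEW
# Report whether two saved disk reports differ and, if so, show how.
# Hypothetical helper, not part of the original script.
compare_diskdoc() {
    old=$1
    new=$2
    if diff "$old" "$new" > /dev/null; then
        echo "No differences between $old and $new"
    else
        echo "Differences between $old and $new:"
        diff "$old" "$new"
        return 1
    fi
}

# Example use (file names and script path are assumptions):
#   cp /root/sysop/doc/README.diskdoc /root/sysop/doc/README.diskdoc.previous
#   /root/bin/makediskdoc
#   compare_diskdoc /root/sysop/doc/README.diskdoc.previous \
#                   /root/sysop/doc/README.diskdoc
```

The date line and the usage figures from df will normally differ between runs, so expect some output every time. The lines worth reading closely are those from /etc/fstab, mount, sfdisk and fdisk.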


