A three-pronged attack on performance

By Federico Kereki

As with all optimizations, you won’t be able to tell whether you are really getting better results without doing some simple benchmarking. Many processes run on a normal Linux PC, and they can affect performance measurements. To minimize their impact, we will work in runlevel 1, the single-user mode in which only minimal processes run. Switch to a text console (CTRL-ALT-F1 will get you there from a graphical session), log in as root, and execute the init 1 command. This will shut down most services and applications and let you get consistent results.

Even in runlevel 1, you should use the ps xaf command to check whether something is running that shouldn’t be; in my case, I discovered that the ddclient program was running (actually sleeping) and might have affected my results, so I ran killall ddclient to get rid of it.

Optimizing hard drive speed

Our first optimization targets the hard drive. To learn what hard drives you have, use the cat /etc/fstab and mount commands. In my case, the first command produces:

/dev/hda3   /                   reiserfs   acl,user_xattr,noatime   1 1
/dev/hda1   /boot               ext2       acl,user_xattr           1 2
/dev/hda2   swap                swap       defaults                 0 0
proc        /proc               proc       defaults                 0 0
sysfs       /sys                sysfs      noauto                   0 0
debugfs     /sys/kernel/debug   debugfs    noauto                   0 0
usbfs       /proc/bus/usb       usbfs      noauto                   0 0
devpts      /dev/pts            devpts     mode=0620,gid=5          0 0
/dev/fd0    /media/floppy       auto       noauto,user,sync         0 0
/dev/hdd1   /media/disk2        reiserfs   defaults,noatime         1 2

while the second one says:

/dev/hda3 on / type reiserfs (rw,noatime,acl,user_xattr)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
debugfs on /sys/kernel/debug type debugfs (rw)
udev on /dev type tmpfs (rw)
devpts on /dev/pts type devpts (rw,mode=0620,gid=5)
/dev/hda1 on /boot type ext2 (rw,acl,user_xattr)
/dev/hdd1 on /media/disk2 type reiserfs (rw,noatime)
securityfs on /sys/kernel/security type securityfs (rw)

This shows my main disk is /dev/hda, with three partitions — /dev/hda1, /dev/hda2, and /dev/hda3 — and I also have a secondary disk /dev/hdd with a single /dev/hdd1 partition. Let’s optimize the first drive.

The hdparm command (“hdparm” stands for “hard disk parameters”) allows you to examine and modify drive configuration. Not all modifications are good: some will lower the performance, and some can even be wildly destructive, leading to data loss. The man hdparm command shows you all the options, and highlights the dangerous ones.

Let’s start by viewing the current performance. The command hdparm -t /dev/hda does a test of the transfer speed, and produces a result like:

/dev/hda:
 Timing buffered disk reads:   10 MB in  3.14 seconds =   3.18 MB/sec

That indicates a slow disk. I usually run this command a dozen times, discard the lowest and highest values, and average the rest. To do this, you can use a shell loop:

for ((i=0;i<12;i++)); do hdparm -t /dev/hda; done

will repeat the test 12 times. You could also use script commands to do the discarding and averaging, but a simple calculator is enough.
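For readers who would rather script the discarding and averaging, here is a minimal sketch. The function names extract_mbps and trimmed_mean are my own, not part of hdparm, and the parsing assumes the output format shown above, where the speed is the next-to-last field of the "Timing buffered disk reads" line.

```shell
#!/bin/bash
# extract_mbps: pull the MB/sec figure out of "hdparm -t" output.
# Assumes the speed is the next-to-last field of the timing line.
extract_mbps() {
    awk '/Timing buffered disk reads/ { print $(NF-1) }'
}

# trimmed_mean: read one number per line on stdin, drop the single
# lowest and the single highest value, and print the mean of the rest.
trimmed_mean() {
    LC_ALL=C sort -n | awk '
        { v[NR] = $1 }
        END {
            if (NR < 3) {
                print "need at least 3 values" > "/dev/stderr"
                exit 1
            }
            # After sorting, v[1] is the lowest and v[NR] the highest.
            for (i = 2; i < NR; i++) sum += v[i]
            printf "%.2f\n", sum / (NR - 2)
        }'
}
```

With these defined, the whole measurement might be run (as root) as: for ((i=0;i<12;i++)); do hdparm -t /dev/hda; done | extract_mbps | trimmed_mean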

Now, let’s see the current disk parameters by using hdparm -v /dev/hda:

/dev/hda:
 multcount    =  0 (off)
 IO_support   =  0 (default 16-bit)
 unmaskirq    =  0 (off)
 using_dma    =  0 (off)
 keepsettings =  0 (off)
 readonly     =  0 (off)
 readahead    =  0 (off)
 geometry     = 16383/255/63, sectors = 156301488, start = 0

Normally, the first optimization to try is using DMA (Direct Memory Access, which means that the drive can directly store data in memory, for a speedier transfer), which can produce impressive results by itself. In my case, after setting the drive to use DMA by executing hdparm -d1 /dev/hda (the -d0 option would have turned DMA off; bad for performance!) I measured the speed again and got an increase to 16.25 MB/sec: more than five times the original speed!

We can try more options. We can change the IO_support value with the -c3 option, as in hdparm -c3 /dev/hda. On my system this produced just a tiny enhancement, reaching a speed of about 16.4 MB/sec, but it’s worth keeping.

The multcount parameter shows how many sectors can be read in a single operation. The command hdparm -i /dev/hda produces somewhat confusing output which includes maxMultSect=16, which indicates we should run hdparm -m16 /dev/hda to allow the drive to read at its maximum rate.
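If you want to pick that value out of the hdparm -i output without reading it by eye, a small filter will do. This is only a sketch: max_mult_sect is a name I made up, and it assumes the value appears as a "MaxMultSect=N" token (matched here without regard to case).

```shell
#!/bin/bash
# max_mult_sect: read "hdparm -i" output on stdin and print the number
# that follows the first MaxMultSect= token (case-insensitive match).
max_mult_sect() {
    grep -io 'maxmultsect=[0-9][0-9]*' | head -n 1 | cut -d= -f2
}
```

A possible use, as root: hdparm -m"$(hdparm -i /dev/hda | max_mult_sect)" /dev/hda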

Another parameter that has to do with reading more sectors is readahead. To get the best results, you have to experiment with different values; in my case, using hdparm -a1024 /dev/hda worked best. The combination of these two enhancements led to a speed around 33 MB/sec. To get there, I tried different combinations, starting with -a128 and going up through -a256, -a512, -a1024, and -a2048, but the speed peaked at -a1024; your results may vary. Of course, I ran my dozen tests after each parameter change.
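The trial-and-error over readahead values can be put in a loop. The sketch below is one way to do it, not the exact procedure I followed: sweep_readahead and the HDPARM variable are hypothetical names, and HDPARM exists only so you can substitute echo to preview the commands without touching the drive.

```shell
#!/bin/bash
# HDPARM names the program to run; set HDPARM=echo for a harmless preview.
HDPARM=${HDPARM:-hdparm}

# sweep_readahead DEVICE VALUE...
# For each readahead value, apply it with -a and run one -t timing test.
sweep_readahead() {
    local dev=$1
    shift
    local ra
    for ra in "$@"; do
        echo "== readahead $ra =="
        "$HDPARM" -a"$ra" "$dev"
        "$HDPARM" -t "$dev"
    done
}
```

For example, sweep_readahead /dev/hda 128 256 512 1024 2048 (as root) runs one timing test per value; for figures you can trust, repeat the timing a dozen times per value as described earlier.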

I also tried different multcount values, from -m1 to -m16, and opted for the latter; trying -m32 gave me an error, warning me that the drive couldn’t handle that value.

With all those changes, I managed to speed up the disk more than ten times (from about 3.2 MB/sec to about 33 MB/sec); not too shabby! There are a couple more options you can try, but they could be risky. For instance, you could unmask interrupts with hdparm -u1 /dev/hda or change the transfer mode with the hdparm -X command. After testing them, I did not get any further speed-ups, so I left things as they were.
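The speed-up factors are simple ratios of the measured speeds. As a quick check of the arithmetic, using the figures quoted in this article (speedup is just an illustrative helper name):

```shell
#!/bin/bash
# speedup BEFORE AFTER: print AFTER/BEFORE to two decimals (same units).
speedup() {
    awk -v before="$1" -v after="$2" 'BEGIN { printf "%.2f\n", after / before }'
}

speedup 3.18 16.25   # after enabling DMA: prints 5.11
speedup 3.18 33      # after all the changes: prints 10.38
```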

Running hdparm -k1 /dev/hda tells the driver to keep your settings when the drive is reset, but do not do this until you are really sure that they are correct and optimal. The settings do not survive a reboot, so to apply them at every startup you can include your hdparm commands in /etc/init.d/boot.local, a file of commands that are run at startup time, at least on my openSUSE system; the startup command file may vary on other distributions.
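As an illustration, a startup fragment along these lines would reapply the values that tested best on my drive. Treat it as a sketch: the DEV variable and the guard are my own additions, and the option values are only right for a drive where you have measured them.

```shell
# Fragment for /etc/init.d/boot.local (openSUSE; other distributions differ).
# The values below tested best on one particular /dev/hda; measure your
# own drive before copying them.
DEV=/dev/hda

# Only run if the device exists and hdparm is installed.
if [ -b "$DEV" ] && command -v hdparm >/dev/null 2>&1; then
    hdparm -d1 -c3 -m16 -a1024 "$DEV"
fi
```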

When you reach this point, your drive is working at its best speed. Now let’s work at a somewhat higher level, and optimize file access.

Optimizing filesystem access

Linux records the times when each file’s contents were last modified, when its status (inode) was last changed, and when it was last accessed. The last of these implies a penalty on file access: even if you only read a file, the system must write to the file’s inode to record the new access time. Since writes can be somewhat slow, doing away with this update should result in performance gains.

To achieve this speedup, you must change the way the filesystem is mounted. Still as root, you can cat /etc/fstab to get:

/dev/hda3   /                   reiserfs   acl,user_xattr           1 1
/dev/hda1   /boot               ext2       acl,user_xattr           1 2
/dev/hda2   swap                swap       defaults                 0 0
proc        /proc               proc       defaults                 0 0
sysfs       /sys                sysfs      noauto                   0 0
debugfs     /sys/kernel/debug   debugfs    noauto                   0 0
usbfs       /proc/bus/usb       usbfs      noauto                   0 0
devpts      /dev/pts            devpts     mode=0620,gid=5          0 0
/dev/fd0    /media/floppy       auto       noauto,user,sync         0 0
/dev/hdd1   /media/disk2        reiserfs   defaults                 1 2

The / partition (/dev/hda3) and /media/disk2 (/dev/hdd1) are the best candidates for the optimization, since /boot is used only at boot time, the swap partition is out of bounds (Linux uses it for its own needs), and the others are not hard disks.

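The change is easy: using any text editor, add “,noatime” to the options in the fourth column, then remount each changed filesystem, for example with mount -o remount,noatime / (the mount -a command alone skips filesystems that are already mounted).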
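If you prefer not to edit the file by hand, the same change can be sketched with awk. The add_noatime function below is a hypothetical helper, not a standard tool: it prints a modified copy to standard output rather than touching the file, and it rewrites changed lines with single spaces between columns.

```shell
#!/bin/bash
# add_noatime FSTAB MOUNTPOINT...
# Print FSTAB with ",noatime" appended to the options (4th column) of
# each listed mount point. Comments, short lines, other mount points,
# and entries that already have noatime are printed unchanged.
add_noatime() {
    local file=$1
    shift
    awk -v targets="$*" '
        BEGIN {
            n = split(targets, t, " ")
            for (i = 1; i <= n; i++) want[t[i]] = 1
        }
        /^[ \t]*#/ || NF < 4 || !($2 in want) || $4 ~ /(^|,)noatime(,|$)/ {
            print
            next
        }
        { $4 = $4 ",noatime"; print }
    ' "$file"
}
```

A cautious way to use it: add_noatime /etc/fstab / /media/disk2 > /tmp/fstab.new, compare the result with the original, and only then copy it into place and remount.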

How to test the performance gain? I first tried using the bonnie++ package, but the results weren’t conclusive, since its tests are not specifically oriented to file access.

Instead, I opted for a more “do-it-yourself” test. I created a thousand files, named 0 through 999, and copied their contents to /dev/null, timing the copy. I did the former with a loop that redirects some output into each file:

for ((i=0;i<1000;i++)); do ... >$i ; done

and the timing by

time cat * >/dev/null

both with and without the noatime option. The results showed a small performance enhancement, which is logical, because now the file access time isn’t updated after every access.
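A self-contained version of this test might look like the sketch below. The function names are mine, and the content written to each file (its own number) is an assumption made for illustration; any small content would serve. The directory should live on the filesystem whose mount options you are testing.

```shell
#!/bin/bash
# make_test_files DIR [COUNT]
# Create COUNT small files (default 1000) named 0, 1, 2, ... in DIR.
make_test_files() {
    local dir=$1 count=${2:-1000} i=0
    mkdir -p "$dir" || return 1
    while [ "$i" -lt "$count" ]; do
        echo "$i" > "$dir/$i"   # assumed content: the file's own number
        i=$((i + 1))
    done
}

# read_test_files DIR
# Read every file in DIR and throw the data away.
read_test_files() {
    ( cd "$1" && cat -- * > /dev/null )
}
```

For example, run make_test_files /media/disk2/atime-test once (the path is only an example), then time read_test_files /media/disk2/atime-test with and without noatime in effect.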

Now that the drive is working as fast as possible, and that we optimized access to files, it’s time for the last optimization: getting commands to load faster.

Optimizing application loading time

Most Linux programs are dynamically linked ELF (Executable and Linkable Format) binaries. They are usually smaller than you might expect, because they do not include the libraries they need, only references to them, which are resolved (linked) when the code is loaded for execution.

This is a classic time vs. space compromise; the program file is smaller, but the loading time is higher. Since many programs use the same libraries, having only one copy of the library saves space. Also, most small programs require few libraries, so this linking is usually quick. However, for larger programs that use many libraries, the linking can take some time.

If you are willing to use a bit more disk space, you can run the prelink command to do the linking phase in advance and store the results within the program file, so it will be ready to execute as soon as it is loaded. (This is technically not quite true: when a program is loaded, the libraries are checked to verify they haven’t changed since the program was prelinked. However, that check is speedy.)

However, you cannot just jump in and start prelinking programs; you must first set up a configuration file named /etc/prelink.conf. This file tells prelink where to search for shared libraries and also for programs to prelink, should you decide to use the -a option to prelink everything possible.

The format of this file is simple: aside from comments (lines starting with a # sign) and blank lines, you can have several lines like:

-l someDirectory
-h someOtherDirectory
-b someFilePatternToAvoid

The -l lines specify directories to be processed; the directory walk stays within a single filesystem. The -h lines also specify directories, but symlinks found during the walk will be followed. The -b lines list “blacklisted” programs (patterns can also be used) that shouldn’t be processed. If you happen to know that certain programs cannot be prelinked, you can avoid a runtime error message by including an appropriate -b line.
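To make the format concrete, the sketch below writes a tiny sample configuration to a file of your choosing. Everything in it is illustrative: write_sample_prelink_conf is a made-up helper, and the directories and the blacklisted path are placeholders, not recommendations for any particular distribution.

```shell
#!/bin/bash
# write_sample_prelink_conf FILE
# Write a small example configuration in the prelink.conf format.
# All paths are placeholders chosen for illustration only.
write_sample_prelink_conf() {
    cat > "$1" <<'EOF'
# Sample prelink configuration (illustrative paths only)
-l /usr/bin
-h /usr/lib
-b /usr/bin/some-unprelinkable-program
EOF
}
```

You could write it to a scratch location first, for instance write_sample_prelink_conf /tmp/prelink-sample.conf, and adapt it before installing anything as /etc/prelink.conf.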

To prelink a single program, just run prelink someprogram; if the program can be prelinked (there are cases when this isn’t possible), the original binary file will be overwritten with the new, larger version. You can opt to prelink everything that can be prelinked by executing prelink -a, which will go through every -l and -h directory in the configuration file, prelinking all programs it finds. Add the -n option to do a “dry run”: all changes will be reported but not made, so you can see what would have happened in an actual run.

A warning: if you have many libraries in your system, and not very much memory, you will need to add the -m option in order to make prelink conserve memory.

If you dislike the prelinked files (or get tired of having to update them every time you get newer libraries) you can use the -u parameter to undo the prelinking: prelink -u someprelinkedprogram will revert someprelinkedprogram to its previous format with no problems.

You can execute the prelinked version just like the normal one, but it will usually load faster, giving a snappier feel to your desktop. As to results, there are some conflicting opinions, but most references I’ve found show a definite speedup. I tried timing some simple commands like:

prelink diff
time diff
prelink -u diff
time diff

and the results were encouraging. (I picked a program that would exit as soon as it loaded, to avoid any “human” times.) Of course, this is just a very small sample, but general results should be similar.
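A single timing of such a short command is noisy, so it may help to repeat it. The following is a bash-specific sketch (it relies on the shell’s time keyword and the TIMEFORMAT variable); time_runs is a hypothetical helper name.

```shell
#!/bin/bash
# time_runs N COMMAND [ARGS...]
# Run COMMAND N times, discarding its output, and print the elapsed
# wall-clock time of each run in seconds, one per line.
time_runs() {
    local n=$1
    shift
    local i
    local TIMEFORMAT=%R   # make the time keyword print only real time
    for ((i = 0; i < n; i++)); do
        # The time report goes to the shell's stderr; send it to stdout.
        { time "$@" > /dev/null 2>&1; } 2>&1
    done
}
```

You might run time_runs 12 diff, then prelink diff, then time_runs 12 diff again, and compare the two sets of figures.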

Summary

Getting better performance is always a good practical goal. The three suggestions in this article provide separate but related enhancements that can provide better overall speed for your Linux PC. It is always nice getting a bit more “oomph” out of your hardware!