
Problem with very slow Bootup

Booting the slug can sometimes take so long that you think it will never come up again. Then, after 4 to 7 minutes or so, hard disk activity suddenly resumes and the system finally comes up.

Apart from an ext3 filesystem check, this can also be caused by quota calculation during startup.

1. Remove all quota in the web-interface.

2. Put a diversion-script called rc.quota in /unslung containing the following:

exit 0

3. Make the file executable (mode 755):

 chmod 755 /unslung/rc.quota

4. Reboot and check whether the boot is fast again.
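
Steps 2 and 3 above can be combined into a small shell function. This is only a sketch: the function name install_quota_diversion and its optional directory argument are made up for illustration, and it assumes the target directory already exists.

```shell
# Hypothetical helper (not part of Unslung): writes the rc.quota diversion
# script from step 2 and makes it executable as in step 3.
# The directory argument is an assumption for illustration; default is /unslung.
install_quota_diversion() {
    dir="${1:-/unslung}"
    # The script body is the single line given in step 2.
    printf 'exit 0\n' > "$dir/rc.quota" || return 1
    chmod 755 "$dir/rc.quota"
}
```

Running install_quota_diversion without arguments on the slug should leave an executable /unslung/rc.quota behind. Step 1 (removing the quotas in the web interface) still has to be done by hand.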

Great, these changes were enough to fix my slow bootup! :) - Lurch

Apparently, the solution above didn't solve my problem completely. There was one quick boot and then a couple of slow boots. I could not determine any logic behind it!

Today I experimented a little with HDD spindown and added the following script (originally found in SetSpinDownTimeOnMaxtorOneTouch, slightly changed) to /unslung/rc.local:

# /unslung/rc.local
# A diversion script to remount the drive(s) without access
# times being recorded (the update of access times can
# prevent drives sleeping)

# Usually it is enough just to do /dev/sda1, as this is usually
# the one that holds the system.
/bin/echo "Remounting /dev/sda1 with noatime"
/bin/mount -o remount,rw,noatime /dev/sda1
/bin/echo  "Remounting /dev/sda2 with noatime"
/bin/mount -o remount,rw,noatime /dev/sda2

return 1
# EOF - include this line

This remounts /dev/sda1 and /dev/sda2 with the noatime option.
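
To confirm that the remount took effect, the mount options can be read back from the kernel's mount table. The following is a minimal sketch; the function name has_noatime and its optional second argument are hypothetical, and it assumes /proc/mounts is available on the slug.

```shell
# Hypothetical helper: succeeds if the given device is mounted with noatime.
# The second argument (path of the mount table) is an assumption for
# illustration; it defaults to /proc/mounts.
has_noatime() {
    dev="$1"
    mounts="${2:-/proc/mounts}"
    # Mount table fields: device, mount point, fs type, options, dump, pass.
    while read -r d mp fs opts rest; do
        if [ "$d" = "$dev" ]; then
            # Pad with commas so only the whole option name matches.
            case ",$opts," in
                *,noatime,*) return 0 ;;
            esac
        fi
    done < "$mounts"
    return 1
}
```

For example, has_noatime /dev/sda1 && echo "noatime is active" could be run from a shell after the boot has finished.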

Since I have had this script running, there have been no more slow boots! I don't know why, but who cares, as long as it works!

[LJR 20080729]

About eight times out of ten my SLUG was taking over five minutes to boot. I attached a monitor to the serial port and discovered that the delay occurs when /sbin/rc.bootbin hangs for four minutes and 27 seconds. You can see this by waiting for the 'Starting CGI_ds.conf' message and running ps while the boot process hangs there.

A utility called strace is available from the optware library. It can attach to a running process and provide a view of any system calls that the attached process is making.

I installed and attached strace to rc.bootbin via its process id and discovered that it is spinning in a loop waiting for a file named /tmp/log.lck to disappear. Apparently some prior part of the boot creates this file and doesn't always delete it when finished. Occasionally it does delete the file and that explains why the SLUG doesn't always hang up.

The command is strace -p <pid>, where <pid> is found by running ps while the boot is hung in the loop.
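
Looking up the process id by hand can be scripted. The sketch below assumes that ps prints the process id in the first column and that the command column contains the text rc.bootbin; the function name bootbin_pid is made up for illustration.

```shell
# Hypothetical helper: reads ps output on stdin and prints the PID of the
# first process whose command line mentions rc.bootbin.
# Assumes the PID is the first whitespace-separated column.
bootbin_pid() {
    while read -r pid rest; do
        case "$rest" in
            *rc.bootbin*) echo "$pid"; return 0 ;;
        esac
    done
    return 1
}

# Usage on the slug while the boot is hung (not run here):
#   pid=$(ps | bootbin_pid) && strace -p "$pid"
```

If no matching process is found, the function returns a non-zero status and strace is not started.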

By creating /unslung/rc.bootbin with a line to rm /tmp/log.lck, the file is deleted before rc.bootbin is invoked and the delay is reliably gone.


# delete lock file that hangs rc.bootbin
# -f keeps rm quiet on boots where the file is already gone
rm -f /tmp/log.lck
return 1

I haven't had to do anything like what the contributor above describes and after twenty boots I'm confident this file removal technique fixes my problem. Removing a log lock file from a tmp directory is safe because no one else should be writing to the log file at this point in the boot process. If anything bad comes of this fix I'll post additional information.

It's very easy to see if /sbin/rc.bootbin is causing the problem if you have access to the serial line and can monitor the boot.

If I were more curious I would pepper the boot process with lines checking for the existence of /tmp/log.lck to find out who's the bad boy. But it seems most likely that whoever set the lock is long gone by the time rc.bootbin runs.
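
Such a check could look roughly like the following. This is an untested sketch, not something that was run on the slug: the function name check_lock, the label argument and the log file /tmp/lockcheck.log are all assumptions for illustration.

```shell
# Hypothetical probe: appends one line to a log saying whether the lock file
# exists at the point in the boot where it is called.
# The label, log path and lock path arguments are assumptions for illustration.
check_lock() {
    label="$1"
    log="${2:-/tmp/lockcheck.log}"
    lock="${3:-/tmp/log.lck}"
    if [ -e "$lock" ]; then
        echo "$label: $lock present" >> "$log"
    else
        echo "$label: $lock absent" >> "$log"
    fi
}
```

Calling check_lock with a different label from several diversion scripts (for example check_lock rc.local) would show after which stage the lock file first appears.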

Last edited by Lurch.
Based on work by arthur92710, LJR, and Armin.
Originally by Armin.
Page last modified on January 30, 2009, at 08:44 AM