How to check your disk for errors (and repair them)
The scandisk function of the Linksys web interface does not work when the slug is unslung to disk, because the disk is then in use and cannot be unmounted. The following steps can be used instead.
(If you get an error that no device is mounted, use "mount" to find the name of the device; look for the devices that are mounted as /share/hdd/data and /share/hdd/conf.)
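A minimal sketch of that lookup, assuming the shell on your firmware provides grep with the -e option and that mount prints lines of the form "device on mountpoint type ...". The helper name slug_disk_mounts is made up for illustration and is not part of any firmware:

```shell
# Hypothetical helper: keep only the mount lines for the two
# Linksys data mount points named above. Reads mount output on stdin.
slug_disk_mounts() {
    grep -e ' on /share/hdd/data ' -e ' on /share/hdd/conf '
}
```

Run it as "mount | slug_disk_mounts"; the first field of each line that comes back is the device name to pass to fsck.ext3.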
The web interface may still not allow you to scan. The log may say:
"Warning: Out of disk space, scandisk cannot proceed."
/sbin/fsck.ext3 -f /dev/sda1
/sbin/fsck.ext3 -f /dev/sda2
This checks the disk. The -f flag forces a check even if the file system appears clean at first glance.
You can also add the -y flag to the command, which answers all questions with yes automatically (generally the best choice anyway).
WARNING: be sure to reboot your slug after doing this, so that you do not accidentally fill up the flash filesystem.
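The exit status of fsck.ext3 is a bit mask (documented in the e2fsck manual page), so a small wrapper can tell you whether a run found nothing, fixed something, or left errors behind. The following is only a sketch: the function names are invented here, and the device names passed to check_partitions are whatever "mount" reports on your slug.

```shell
# Translate an e2fsck/fsck.ext3 exit status (a bit mask) into words.
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    out=""
    for pair in "1:errors corrected" "2:reboot recommended" \
                "4:errors left uncorrected" "8:operational error" \
                "16:usage error" "32:cancelled by user" \
                "128:shared library error"; do
        bit=${pair%%:*}      # number before the colon
        text=${pair#*:}      # description after the colon
        if [ $((status & bit)) -ne 0 ]; then
            out="${out:+$out, }$text"
        fi
    done
    echo "$out"
}

# Hypothetical wrapper: force a check of each device given as an
# argument, answer yes to every question, and report the result.
check_partitions() {
    for dev in "$@"; do
        status=0
        /sbin/fsck.ext3 -f -y "$dev" || status=$?
        echo "$dev: $(describe_fsck_status "$status")"
    done
}
```

For the two partitions used above this would be called as "check_partitions /dev/sda1 /dev/sda2".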
Alternative: you can also remove the USB hard drive or flash drive from the slug, connect it to a Linux box and run fsck from there. On ext2/ext3 file systems fsck and e2fsck do the same job (fsck hands the work over to e2fsck), so use either. This is probably more convenient for users who run the stock Linksys firmware rather than Unslung or another telnet-enabled firmware. Your PC should have USB 2; otherwise you would have to remove the drive from the enclosure. If you don't have a Linux box somewhere, you can boot your normal PC or laptop with one of the several Live CD distros, for instance:
CD to create a live USB drive, if your BIOS supports USB boot devices.
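On the PC, the file system must not be mounted while it is being checked (many desktop distros auto-mount USB disks). A minimal sketch of a guard around the check, assuming a Linux PC with awk and e2fsck installed; the function names and the optional second argument of is_mounted are inventions for this example, and /dev/sdX1 stands for whatever device name your PC gives the slug disk:

```shell
# Succeeds if the device appears in the mount table.
# The second argument (mount table file) exists only to make testing easy.
is_mounted() {
    dev=$1
    mounts=${2:-/proc/mounts}
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$mounts"
}

# Hypothetical wrapper: refuse to check a mounted file system.
check_on_pc() {
    dev=$1
    if is_mounted "$dev"; then
        echo "$dev is mounted; unmount it first (umount $dev)" >&2
        return 1
    fi
    e2fsck -f "$dev"
}
```

Usage, as root: "check_on_pc /dev/sdX1".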
WARNING (infoball): Being somewhat adventurous, I run an HD case (Argosy HD360U) and a disk size (Seagate 400GB) that do not seem to be completely compatible with the slug. Linksys firmware 2.3R29 refuses to finish formatting. I don't know whether 2.3R63 is any better at formatting, but it won't finish scandisk. 5.5-beta works, sort of: at least it finishes formatting the disk. The above procedure results in "Bus error" (apparently just as it finishes), which is then the immediate response to all subsequent commands, and the slug has to be reset by yanking the power cord. I do not yet know whether there are any odd side effects, as it seems to work well afterwards, but be warned. I have also experienced other strange occurrences with this (brand new) disk when it is used by the slug, so the warning about making sure your hardware works with the stock firmware is very valid. I made a conscious decision not to care in this case. We'll see if it's fixable ;-)
Update: Not having enough time to investigate, I got a smaller disk (Hitachi 250GB) and an identical HD case. I also flashed the slug with the 2.3R63 firmware and lo: it worked! Formatting, scandisk and all. I have since flashed Unslung 5.5-beta again, and after a while I tried the fsck again, which worked: no more "Bus error". Several errors were discovered, and I let them be corrected. I ran fsck several times, and each time seemingly the same, or at least as many, errors were discovered and "fixed". Finally I just connected the disk to a Linux box and did the same thing, and it worked right away: no errors left to fix on the second fsck run. As a first impression things also seem more stable (just two "network name not available" messages while transferring several large files to and from the slug in the hours since I connected it again). Therefore I'd like to suggest that fsck primarily be run on a PC and not on the slug, as the slug's fsck.ext3 binary seems unreliable, and also that anyone experiencing Samba trouble try this as well!
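Re-running fsck until a pass comes back clean, as described above, can be sketched as a loop. This is an illustration only: fsck_until_clean and the FSCK_CMD override are names invented here, the default of five passes is arbitrary, and the loop gives up as soon as fsck reports an operational failure (exit status 8 or higher) rather than file system errors.

```shell
# Hypothetical helper: run a forced, auto-answered check up to max times
# and stop at the first pass that reports no errors.
fsck_until_clean() {
    dev=$1
    max=${2:-5}
    checker=${FSCK_CMD:-e2fsck}   # FSCK_CMD is an assumption, for substituting another checker
    pass=1
    while [ "$pass" -le "$max" ]; do
        status=0
        "$checker" -f -y "$dev" || status=$?
        if [ "$status" -eq 0 ]; then
            echo "clean after $pass pass(es)"
            return 0
        fi
        # Status 8 and above means fsck itself failed, not just the file system.
        if [ "$status" -ge 8 ]; then
            echo "fsck failed with status $status" >&2
            return "$status"
        fi
        pass=$((pass + 1))
    done
    echo "still reporting errors after $max passes" >&2
    return 1
}
```

If the same errors keep coming back on every pass, as happened here on the slug, that is a hint to move the disk to a PC rather than to raise the pass count.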
Yesterday I had an HD crash (I suspect the cause was a mobile phone that received a call just while a write operation was being carried out). I ran the above procedure and lots of errors scrolled across the screen... I had to run fsck.ext3 several times until no errors were found. But even after that the NSLU2 refused to mount the drive. From the syslogs I see that at some point fsck decided to erase the ext3 journal, so now my fs is just ext2, and the slug does not like that. I will try to modify the fstab file and mount it manually. In any case my data is gone: I attached the drive to a PC and it's empty (luckily it was a test setup and I only had some programs there, no data)... but beware of mobile phones near hard drives!
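Whether a file system still has its ext3 journal can be read from the "Filesystem features" line printed by tune2fs -l, and tune2fs -j adds a journal to an ext2 file system. The sketch below assumes the disk is attached to a Linux PC with e2fsprogs installed (it is not confirmed that the slug's firmware ships tune2fs); has_journal is a name made up for this example and /dev/sdX1 is a placeholder for the real device.

```shell
# Hypothetical helper: reads "tune2fs -l" output on stdin and succeeds
# if the feature list contains has_journal (i.e. the fs is ext3).
has_journal() {
    grep '^Filesystem features:' | grep -qw 'has_journal'
}

# Usage on the PC, as root, with the file system unmounted:
#   tune2fs -l /dev/sdX1 | has_journal || tune2fs -j /dev/sdX1
```

This only restores the journal; it does not bring back files that fsck already discarded.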
-- I had a problem with an HD crash too (three times, in fact; I'm desperate).
1. I tried both Debian debootstrapped (slugos/le) and, twice, debian installer rc1.
2. The only things running: proftpd (no traffic aside from me uploading some stuff once), sshd, screen, rtorrent (the first 2 times), transmission (a lightweight bittorrent client, the third time).
3. It all goes smoothly (very smoothly) for somewhere between 12, 16, 24 or maybe a few more hours, then it crashes (no ssh or ftp access, but ping still responds).
4. During this time, RAM has about 4 MB free and swap (a 128 MB swap partition) usage is between 6 and 16 MB (downloading only a 7 GB torrent).
5. NO errors in dmesg, the messages log, or any other log that I know of.
6. No program crashes; everything is fine until it crashes.
After the crash, I took my hard drive and put it into an Ubuntu machine. fsck found lots of errors. The first time, after fsck, I ended up with all my data in the lost+found folder. The second time, after some fixes, it worked. This morning I fsck-ed it again after the crash and again got lots of errors (bad allocation, I think, and "file has filetype set"). After running with -y for 10 minutes (and listing almost every single file on that hard drive) I still have some system data, but the directory where the torrents were is now a file, not a directory, /var/log/messages is missing, and who knows what other strange things have happened.
I found the cause, which is the @#%!@## USB -> ATA adapter (some El Cheapo from China). Here it is:
So it seems my USB -> IDE adapter works just fine under Windows XP, but under Linux 2.6, after a variable number of hours, it trashes the hard drive, the file system, everything.
I am now testing it with Unslung (2.4 kernel; it has worked all right so far, for 20 hours), and then I will test another adapter (from another vendor) with 2.6 and try to post the results here.
UPDATE 2: kernel 2.4 seems to be working OK with this adapter, so I will stick with it:
17:21:24 up 3 days, 18:20, 2 users, load average: 1.48, 1.73, 1.67
3-5 torrents all this time (transmissioncli), amuled, many ftp transfers and local copy from dir to dir (big files for testing purposes)
UPDATE 3: it crashed after more than 7 days, but it crashed nevertheless. So don't use this controller. --