I had another 1 terabyte SATA hard drive die. :( This one was only about a year and a half old. Thankfully pretty much all of the drive was backed up before it bit the dust. I don't know exactly how much data I lost, but I don't think it was much.
So that means both my new 1TB SATA drives have died and all my much older IDE (PATA) drives are carrying on. That's a bit of a concern.
I had another drive do wacky things last night too -- a 300GB IDE drive. I switched everything off (including at the power source, because ATX machines don't really switch off when they say they're off), then painstakingly went through all the power and data connections, unplugging and replugging them. I also disconnected some of the extra fans, which were of doubtful help because I leave my machine open all the time anyway -- they just seem to make a lot of noise and move the dust around. When I turned the computer back on I ran SMART checks with smartctl and found the faulty block, which luckily was in the filesystem journal. I fixed it by removing the journal, zeroing the block, then recreating the journal. No more errors detected by smartctl. Whew! I'll do a badblocks test later too, but I think I'm okay for now.
This brings me to a possible solution to a nagging problem. I've long worried that, in the frantic backup that happens when problems show up, I may be copying faulty files over good ones. The drive that gave me a scare last night is my peer-to-peer drive. After I had done most of my checks on the drive I started up my p2p program again and got it to verify all the data. This is really neat. The p2p program works through each file, breaking it into pieces and verifying each piece against its own checksum. This is much better than the standard way of verifying files, because an md5sum, for instance, covers the entire file, whereas the piece-by-piece nature of p2p downloads means there's a checksum per piece, so only the damaged parts of a file need to be downloaded again. Very nice self-correcting technique. I wish I had something like this for the rest of my filesystems.
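The piece-wise idea is easy to sketch. This isn't the p2p program's actual code, just a minimal illustration (with a made-up 1 MiB piece size) of hashing a file piece by piece so a later pass can point at exactly which piece went bad:

```python
import hashlib

PIECE_SIZE = 1024 * 1024  # 1 MiB per piece -- an assumed figure; real clients vary

def piece_hashes(path, piece_size=PIECE_SIZE):
    """Return a list of SHA-1 digests, one per piece of the file."""
    hashes = []
    with open(path, "rb") as f:
        while True:
            piece = f.read(piece_size)
            if not piece:
                break
            hashes.append(hashlib.sha1(piece).hexdigest())
    return hashes

def damaged_pieces(path, expected, piece_size=PIECE_SIZE):
    """Compare a file against previously recorded piece hashes;
    return the indices of pieces that no longer match."""
    actual = piece_hashes(path, piece_size)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]
```

With a whole-file md5sum a single flipped bit condemns the entire file; here you get back the indices of the mismatching pieces, so only those ranges need fetching again.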
Meanwhile I'll think about how I could maintain a database of checksums (md5sum, or perhaps sha256sum, which is more collision-resistant). At least then I'd have an easy way to tell if a file was damaged. Maybe Linux's ext3 filesystem already has something like this. Surely I'm not the only person to have thought of it.
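A first cut at that checksum database could be as simple as this sketch: walk a directory, record a SHA-256 for every file, and on a later run report anything whose checksum has changed or which has vanished. The function names and manifest layout here are my own invention, not any standard tool's:

```python
import hashlib
import os

def sha256_of(path, bufsize=1 << 20):
    """SHA-256 of a file, read in chunks so big files don't eat RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            buf = f.read(bufsize)
            if not buf:
                break
            h.update(buf)
    return h.hexdigest()

def build_manifest(root):
    """Map relative path -> sha256 for every regular file under root."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = sha256_of(full)
    return manifest

def verify(root, manifest):
    """Return the relative paths whose checksum changed or which are missing."""
    bad = []
    for rel, digest in manifest.items():
        full = os.path.join(root, rel)
        if not os.path.isfile(full) or sha256_of(full) != digest:
            bad.append(rel)
    return bad
```

The idea would be to save build_manifest()'s result (e.g. with json.dump) right after a known-good backup, then run verify() before the next frantic copy: anything it lists shouldn't be allowed to overwrite the backup.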