Disk Crash, Recovering Files and Doing Backups

About one and a half weeks ago I had a disk crash. I didn’t lose anything, but I came pretty close, mainly because I deleted an important file by hand :-(.

It is interesting to talk about my disk crash because I faced many problems bringing my computer back. Luckily my notebook has two hard drives, so I’m up and running with the secondary one now, but to do so I had to clone the recovery partition. Then I copied my backups to my home directory, and in the process I deleted an important file, which took me days to recover. And finally, I wrote a small script to back things up. This script is not a polished program (there is no error checking or anything like that), but it shows how to keep an encrypted backup.

Cloning the restore partition

When I switched the two disks, I couldn’t install Windows 7 because the recovery partition didn’t exist anymore. Sadly, the recovery media generated with Dell’s software didn’t work without the partition either. The good thing is that I could still read the old disk: painfully slowly, but it was doable.

My disk had 4 partitions: a small diagnostics partition (fat32, 39.19MB), the recovery partition (ntfs, 14.65GB), and then my two operating systems, Windows and GNU/Linux.

What I did was pretty simple: I just created two partitions at the beginning of my new disk with the corresponding sizes (39.19MB and 14.65GB) plus a little delta, just in case.
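For the partitioning step any tool will do; here is a rough sketch with parted (the device name /dev/sda and the rounded-up sizes are assumptions, adjust them to your disk):

sudo parted /dev/sda -- mklabel msdos
sudo parted /dev/sda -- mkpart primary fat32 1MiB 42MiB
sudo parted /dev/sda -- mkpart primary ntfs 42MiB 15.1GiB

Then, using dd, I copied the contents over: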

dd if=/dev/sdb1 of=/dev/sda1
dd if=/dev/sdb2 of=/dev/sda2

This worked okay, and then I could install Windows, and later GNU/Linux (Ubuntu 10.10). You can play with the parameters of dd; in fact, setting the block size to something around 16MB worked quite well for me. In my case the disk was not failing when reading the first two partitions, so it was okay to use dd; otherwise, ddrescue might be a better idea.
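For reference, those variants look like this; the 16MB block size is what worked for me, and the ddrescue line is only a sketch (rescue.map is a name I made up for the mapfile):

dd if=/dev/sdb1 of=/dev/sda1 bs=16M
sudo ddrescue -f /dev/sdb1 /dev/sda1 rescue.map

ddrescue keeps its progress in the mapfile, so it can be interrupted and resumed, and it retries bad areas instead of giving up on the first read error.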

The remove-file nightmare

I had all my files backed up, but I managed to erase an important one. The worst part is that I did it by hand! I was cleaning a folder using find:

find . -name "*.bz2" -exec rm {} \;

What I didn’t notice is that my file mails_20090814.bz2 was in there. That file was a copy of all my mails from 2004 up to the date in the file name (in mbox format), and, worst of all, it was my only copy! I started a quest to recover that precious file.

The file was in an ntfs partition and ntfsundelete didn’t find anything, maybe because I used rm instead of deleting it from Windows. I tried many different open-source options, but none of them found anything. So I moved to Windows and started trying every recovery tool I could find (even if one of them found the file and I had to pay to recover it, I would do it).

I finally got a winner! Recuva was able to find my file. Well, not exactly, but it gave me enough information to recover it. It found a bunch of files without a name (things I had erased), and in list mode I could sort them by size. The software shows you the header of each file, so I had to find something starting with 0x42 0x5a 0x68 (the magic bytes of a bz2 file, “BZh” in ASCII) that was close to 2GB. Lucky me, there weren’t that many files bigger than 2GB, so I got to it quickly :-). The bad news: the program could not recover the file (some random error about permissions), but it also provided the following:

Filename:
Path: G:?
Size: 2.09 GB (2,245,093,270)
State: Excellent
Creation time: Unknown
Last modification time: Unknown
Last access time: Unknown
Comment: No overwritten clusters detected.
548119 cluster(s) allocated at offset 220851280

To recover it, I just went back to my Ubuntu and did:

dd if=/dev/sdc1 bs=4096 skip=220851280 count=548119 > mails.bz2.tmp
head -c 2245093270 mails.bz2.tmp > mails.bz2
rm mails.bz2.tmp

Let me quickly explain the numbers there. The block size (bs) is 4096, the standard cluster size in ntfs. The skip is how many blocks we want to skip (not read) starting from the beginning of /dev/sdc1. Finally, we know we have to read 548119 blocks, and we tell that to dd with the count parameter. Since 548119 × 4096 = 2,245,095,424 bytes is slightly more than the real file size of 2,245,093,270 (the last cluster is only partially used), we take just the right prefix using head. After that, we are done.
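Before deleting the temporary file it is worth checking the result: the first three bytes should spell the bz2 magic, and bzip2 -t tests the whole archive without extracting it:

head -c 3 mails.bz2 | xxd
bzip2 -t mails.bz2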

I was really lucky that the file was contiguous, but if it isn’t you can do the same thing: read the cluster runs in order, using >> instead of > for every dd after the first.
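For example, if the tool had reported two cluster runs instead of one, the recovery would look like this (both the second offset and the per-run counts here are made up):

dd if=/dev/sdc1 bs=4096 skip=220851280 count=300000 > mails.bz2.tmp
dd if=/dev/sdc1 bs=4096 skip=250000000 count=248119 >> mails.bz2.tmp
head -c 2245093270 mails.bz2.tmp > mails.bz2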

Backing up

I have two external hard drives, a portable one and another on my desk (which I just bought after this awful experience). Both have an ntfs partition, since I want to be able to read them in Windows too. One thing I don’t like is that losing an external hard drive would mean that someone else has all my information. Most of the things on the hard drive are not important, but my home directory is: it has personal information I don’t want anybody to have. So, this time, since I was spending so much time sorting my files, I decided to write a short script that would keep my files encrypted.

I created an encrypted copy of my files in the ntfs partition using encfs. Its use is quite simple: just create two folders, in my case enc_home and plain_home. The first one will contain the encrypted data, and the second one, when mounted, will give me access to the plain data. To initialize it I just did:

encfs /path/to/enc_home /path/to/plain_home

In my case I used the option ‘p’ (for paranoid) and was then asked for a password. From now on, when I mount the folder using the same command, I have to enter the password. The files copied into plain_home are encrypted and stored in enc_home. To stop the plain view you have to unmount the folder:

fusermount -u /path/to/plain_home
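The whole cycle, from creation to unmounting, looks like this (the folder names are just my convention; encfs wants absolute paths, hence the $HOME):

mkdir -p $HOME/enc_home $HOME/plain_home
encfs $HOME/enc_home $HOME/plain_home
cp secret.txt $HOME/plain_home/
ls $HOME/enc_home
fusermount -u $HOME/plain_home

After the fusermount, secret.txt only exists in its encrypted form inside enc_home.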

There was still one thing I didn’t like about this way of doing backups. In GNU/Linux you have permissions, and when moving to ntfs, those are lost. This is especially annoying when you have source code with executable files: if you restore from there, you have to set those permissions by hand. To fix that I created an image file containing an ext4 filesystem (I know I could partition the external hard drive, but then the sizes are kind of fixed; with the image I can make it bigger later on). To do so I did:

dd if=/dev/zero of=/path/to/img bs=1024M count=100
mkfs.ext4 -F /path/to/img
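(The -F just tells mkfs.ext4 not to complain that the target is a regular file rather than a block device.) A variant worth knowing: the image can be created as a sparse file, so it only takes space on the external drive as data is actually written into it. The seek trick and the /mnt/backup mount point below are just one way of doing it:

dd if=/dev/zero of=/path/to/img bs=1 count=0 seek=100G
mkfs.ext4 -F /path/to/img
sudo mkdir -p /mnt/backup
sudo mount -o loop /path/to/img /mnt/backup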

Then I was ready to mount the image (100GB, as you can deduce from the dd command) and copy files into it. So without further explanation, I’ll just leave you with the script that does the backup (both to ntfs and to the image). This script can be improved, there are many things that could fail, but right now I don’t have the time to make it better (maybe for a later post, where I could improve it and also explain how I synchronize my files over the network). If you want to use the script you have to update the paths, which are spread all across the script and not defined as variables (shame on me :p).

#!/bin/bash

echo "Backing up"
sleep 5

### Backup to an encrypted image file
# mount image
mkdir -p $HOME/virtual_bkp
sudo mount -o loop /media/recoded-backup/bkp_home/backup.ext4 $HOME/virtual_bkp
# link Backup in plain form to Documents
sleep 5
encfs $HOME/virtual_bkp/encrypted $HOME/virtual_bkp/Documents
# Copy files
sleep 5
rsync -av $HOME/Documents/ $HOME/virtual_bkp/Documents
# unlink Documents
sudo sync
sleep 5
fusermount -u $HOME/virtual_bkp/Documents/
# umount image
sudo umount $HOME/virtual_bkp
rmdir $HOME/virtual_bkp

### Paranoid, we keep another copy, this one in ntfs
###

# link encrypted folder to /media/recoded-backup/bkp_home/plain_home
encfs /media/recoded-backup/bkp_home/enc_home/ \
    /media/recoded-backup/bkp_home/plain_home/
# copy files
sleep 5
rsync -av $HOME/ /media/recoded-backup/bkp_home/plain_home
# unlink encrypted folder
sudo sync
sleep 5
fusermount -u /media/recoded-backup/bkp_home/plain_home/
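And since I already confessed about the hard-coded paths, this is roughly the shape the cleanup would take; just a sketch of the image half of the script (the variable names are mine, and set -e is the only error handling):

#!/bin/bash
set -e                                   # stop at the first failing command

BKP_ROOT=/media/recoded-backup/bkp_home  # adjust to your external drive
IMG=$BKP_ROOT/backup.ext4
MNT=$HOME/virtual_bkp

mkdir -p "$MNT"
sudo mount -o loop "$IMG" "$MNT"
encfs "$MNT/encrypted" "$MNT/Documents"
rsync -av "$HOME/Documents/" "$MNT/Documents"
sudo sync
fusermount -u "$MNT/Documents"
sudo umount "$MNT"
rmdir "$MNT"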

