No More Twist

Raising the Dead


Sometimes you can ‘undelete’ files, usually if they weren’t deleted properly in the first place. You can drag things out of the wastebasket. On Linux there is the concept of a deleted file that is still usable – in other words one that a program holds open while a user deletes it. This file is supposed to work like a normal file from the program’s point of view, while remaining inaccessible to the rest of the system.

It’s not always like that though. You can use tools like ‘lsof’ to list the files that a running program is using. If one of those has been deleted and you have access, you can copy it back out of the /proc filesystem (/proc/PID/fd/FD) to a new location. That gives you a snapshot of the deleted file.
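As a rough illustration of that snapshot approach, here is a minimal Python sketch. It assumes a Linux /proc filesystem where the symlink target of a deleted open file ends in " (deleted)"; the helper names find_deleted_fds and snapshot_fd are made up for this example and are not part of any tool.

```python
import os
import shutil

# Marker the Linux kernel appends to the link target of an unlinked file.
# Assumption: a live file whose real name ends this way would be misreported.
DELETED_SUFFIX = " (deleted)"


def find_deleted_fds(pid):
    """Return {fd: original_path} for open files of `pid` whose name is gone."""
    fd_dir = f"/proc/{pid}/fd"
    deleted = {}
    for entry in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, entry))
        except OSError:
            continue  # descriptor was closed while we were scanning
        if target.startswith("/") and target.endswith(DELETED_SUFFIX):
            deleted[int(entry)] = target[: -len(DELETED_SUFFIX)]
    return deleted


def snapshot_fd(pid, fd, dest):
    """Copy the current contents of an open descriptor into a new file `dest`."""
    with open(f"/proc/{pid}/fd/{fd}", "rb") as src, open(dest, "wb") as out:
        shutil.copyfileobj(src, out)
```

Calling find_deleted_fds(1234) and then snapshot_fd(1234, fd, "/tmp/rescued") for one of the reported descriptors would produce the copy described above, given permission to read that process's /proc entries. The result is an independent file, not the one the program still has open.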

That isn’t the same as restoring the file though. The snapshot is a separate new file. The program goes on using the old file. What if the file is being changed all the time? You would have an outdated copy. If it is a database file it might not even be in a consistent state when you copy it. That is the challenge I was facing.

I have to digress and talk about links for a while. There are two main types of links or shortcuts to files. The soft link is a pointer that tells you an alternative filename for something. The hard link is like another filename for the same file. Each file can have a number of names, most usually one. But when a hard link is created the number of names for that file goes up. When someone deletes the file, the name goes but the file can remain in existence if there are other names for it, or if a program holds the file open.
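The name counting can be seen from user space. The sketch below is only an illustration, assuming a POSIX filesystem that supports hard links; the function names and file names are invented for the example.

```python
import os


def names_and_inode(path):
    """Return (link_count, inode_number) for the file that `path` names."""
    st = os.stat(path)
    return st.st_nlink, st.st_ino


def demo_hard_link(workdir):
    """Create a file, give it a second name, then delete the first name."""
    first = os.path.join(workdir, "first.txt")
    second = os.path.join(workdir, "second.txt")
    with open(first, "w") as f:
        f.write("same file\n")

    before = names_and_inode(first)       # one name so far
    os.link(first, second)                # add a hard link: a second name
    after = names_and_inode(second)       # same inode, count has gone up
    os.unlink(first)                      # "delete the file" by its first name
    remaining = names_and_inode(second)   # the file lives on under the other name

    with open(second) as f:
        content = f.read()
    return before, after, remaining, content
```

Run against an empty scratch directory, the three tuples should share one inode number while the link count goes from 1 to 2 and back to 1, and the content read through the second name is unchanged.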

So, a proper undelete would find this file-with-no-name, known only by an inode number (that’s the index used to organise files). It would create a new ‘hard link’ for that file which would make the same open file visible again under the previous name. Then the program with the open file could continue normally…
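To make the file-with-no-name concrete, the following sketch shows what a program holding such a file can still see. It is a user-space illustration only, assuming a typical Linux filesystem where the link count of an unlinked open file reads as zero; it does not perform the undelete, and orphan_info is a hypothetical helper name.

```python
import os


def orphan_info(workdir):
    """Open a file, delete its only name, and inspect it through the descriptor."""
    path = os.path.join(workdir, "orphan.dat")
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        os.write(fd, b"still here")
        inode_before = os.fstat(fd).st_ino
        os.unlink(path)                   # remove the only name
        st = os.fstat(fd)                 # the open descriptor still works
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, 100)
        return {
            "name_exists": os.path.exists(path),
            "links": st.st_nlink,         # number of names left for the inode
            "same_inode": st.st_ino == inode_before,
            "data": data,
        }
    finally:
        os.close(fd)                      # last reference gone: storage is freed
```

The inode number stays the same and the data remains readable even though no name points at the file any more. That inode number is the handle a proper undelete would have to work from.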

Looking at the kernel source for making a hard link (sys_link), it seems to look up the existing file, check that the new name is available, and call vfs_link with a directory entry (dentry) for the existing file, along with an inode and dentry for the new name. The only thing that is used from the existing file’s dentry is the inode number. So by rewriting sys_link to take an inode number and make a fake dentry with it, we should be able to undelete the file properly.

The remaining problem is how to do this on a running system, which is possible with a loadable kernel module. The functions we need in the kernel are either exported or small enough to rewrite. Finally, if there’s journalling going on (ext3 filesystem or similar) we need to call something like the ext3_orphan_del function to clean up the orphan record. Unfortunately this isn’t exported from the ext3 code, so it has to be rewritten. Without doing this the filesystem will not be able to unmount cleanly.

This was one of the more unusual things I worked on. I think it shows a strength of open source systems – that by having all the sources to hand you can code your way out of a difficult situation, going beyond what the system was originally designed to do.

Written by Import Robot

May 19th, 2007 at 10:12 am

Posted in UP
