Ext4magic


Ext4magic: Inode - Directory - Journal - Install - Time_Options - Histogram - Scenarios - Tips&Tricks - Manpage - Expert-Mode



ext4magic


Basic Data

Developer:

robi

Current Release:

0.3.2

Release Date:

2014-09-12

Operating System:

Linux

Dependencies:

Category:

License:

GNU General Public License

Documentation:

English German

Project Site:

Ext4magic

Summary

ext4magic is a disk utility for recovering files from ext3 or ext4 partitions.

It is based on ext3grep and extundelete, but was rewritten from scratch. In addition to the tools just mentioned ext4magic






Introduction

Sooner or later it happens to everybody who uses a Linux system: you hit Enter and immediately realize that you have just started to delete important data. Unfortunately you have no backup, or only a very old and outdated one. On ext2 deleted data can be recovered, but ext3 and ext4 reset the block pointers to NULL on deletion (see Linux ext3 FAQ and ext4), so the classic undelete approach no longer works on these filesystems. There are scan tools (see HOWTO recover deleted files on an ext3 file system) which may be able to recover the data. Unfortunately, most Linux distributions nowadays use ext3 or ext4.


ext3 and ext4 are journaling file systems; the journal lets them recover from crashes and disk failures. The journal keeps copies of internal file system information and file data and can also be used to recover data deleted by accident. ext4magic can use the information in the journal to recover most or even all of the lost data, provided that this information is still available in the journal. There is no guarantee that the data can be recovered: many factors influence the recovery capabilities of ext4magic, and the tool may even fail to recover the important data. But there is a reasonable chance that ext4magic can recover it.

The amount of changes made to the filesystem after the accidental deletion affects the recovery capabilities of ext4magic. Therefore it's important to stop working on the filesystem immediately, or as soon as possible.





Installation

Some distributions provide community packages.

The ext4magic developers also build full-featured packages with the openSUSE Build Service.

If you cannot find a package for your distribution, ext4magic can be compiled from source.
Follow the instructions on the install page.

Release note: currently available versions:

ext4magic-0.2.4 (Beta) includes the old Magic function, for ext3 only
ext4magic-0.3.2 (Beta) includes the new Magic function, currently for ext4 only





Invocation

Attention:

It's important to work on an unmounted or read-only mounted partition or, even better, to create a copy of the partition while it is read-only and run ext4magic on that copy. Otherwise important information in the journal can be overwritten, which reduces the probability of recovering the data successfully.





Create a device copy

The most important invocation parameter of ext4magic is the target to recover from. It is either a block device (e.g. /dev/loop0), a partition (e.g. /dev/sda1) or a file system image. ext4magic does not modify the target, but it's strongly recommended to create a copy of the target and run ext4magic on this copy. That way, if anything fails during the recovery, you can create a fresh copy and start the recovery procedure again.

The copy can be stored on another partition of the same disk or on another disk, or as a file on another filesystem.

# dd if=/dev/DEVICE of=/BACKUPPATH/DEVICE.img

The saved image of a partition can be used directly by ext4magic. If you created an image of a whole disk, you have to use a loop device to get access to the partition (for details see below).

Be aware that this will take some time, depending on the size of the disk you copy. Double-check that the filesystem is mounted read-only just before you start to create the copy!
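
The read-only check can be scripted before the image is taken. The following is a minimal sketch and not part of ext4magic: the helper names mount_options and safe_to_image are made up for this illustration, and it assumes that the device is listed in /proc/mounts under the same name that is passed in (device-mapper or UUID names may differ).

```shell
# Print the mount options of DEVICE for every entry in a mounts table.
# $1 = device, $2 = mounts table (default: /proc/mounts).
# Prints nothing if the device is not mounted.
mount_options() {
    awk -v dev="$1" '$1 == dev { print $4 }' "${2:-/proc/mounts}"
}

# Succeed only if DEVICE is unmounted or all of its mounts are read-only.
# (Hypothetical helper, assumption: device names match the mounts table.)
safe_to_image() {
    for opts in $(mount_options "$1" "${2:-/proc/mounts}"); do
        case ",$opts," in
            *,ro,*) ;;          # read-only mount: acceptable
            *) return 1 ;;      # mounted read-write: do not copy yet
        esac
    done
    return 0
}

# Possible use, with the placeholders from the dd command above:
#   safe_to_image /dev/DEVICE && dd if=/dev/DEVICE of=/BACKUPPATH/DEVICE.img bs=4M
```

The check is deliberately conservative: a single read-write mount of the device is enough to refuse the copy.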



Save journal only

You may also save only the journal of the target instead of the whole target. The journal is the repository of the information needed for the recovery. A simple new mount, the next write, and even read or find commands on the read-write mounted file system will destroy some journal data. With a copy of the journal taken immediately after the accident, you have a good chance of recovering files later if you cannot use ext4magic immediately.

It's important to create this journal copy before the file system is mounted again. Otherwise some journal data will be destroyed and lost. Also create it before the first use of ext4magic.

# debugfs -R "dump <8> /PATH/journal.copy" /dev/DEVICE

"/PATH/journal.copy" is the name of the file which will receive the journal; it must be on a different filesystem. "/dev/DEVICE" is the block device or partition to be recovered.
This snapshot of the current journal can now be used by ext4magic for recovery analysis instead of the existing journal on the target device.

# ext4magic /dev/DEVICE -j /PATH/journal.copy .......... 

Because the journal is now saved, you can continue to work on the filesystem (should it really be necessary), but keep in mind that filesystem blocks of deleted files may now be overwritten. Therefore it's strongly recommended to stop working on the filesystem immediately.





Mount saved image of a whole disk with loop device

Images of partitions can be used directly by ext4magic. Images created from a whole disk need a loop device in order to access a partition. In most cases the partition to recover is not the first one on the disk. An offset is required which points to the start of the partition in the image.
Here you can find a script that calculates this offset, together with instructions.
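
The offset calculation itself is simple arithmetic: the start sector of the partition multiplied by the sector size. The following is a minimal sketch, not the script referred to above. The function name partition_offset, the start sector 2048 and the image name DISK.img are assumptions made for illustration; the real start sector has to be read from the partition table of the image, for example with fdisk -l.

```shell
# Byte offset of a partition inside a whole-disk image.
# $1 = start sector of the partition (as listed by: fdisk -l IMAGE)
# $2 = sector size in bytes (default: 512)
partition_offset() {
    echo $(( $1 * ${2:-512} ))
}

# Possible use with a hypothetical partition that starts at sector 2048:
#   offset=$(partition_offset 2048)
#   loopdev=$(losetup --find --show --read-only --offset "$offset" /BACKUPPATH/DISK.img)
#   ext4magic "$loopdev" -S
```

Attaching the loop device read-only keeps the image itself unchanged, in line with the advice to work only on a copy.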





Parameters

Summary

ext4magic {-M|-m} [-j <journal_file>] [-d <target_dir>] <filesystem>
ext4magic [-S|-J|-H|-V|-T] [-x] [-j <journal_file>] 
          [-B n|-I n|-f <file_name>|-i <input_list>] 
          [-t n|[[-a n][-b n]]] 
          [-d <target_dir>][-R|-r|-L|-l] [-Q] <filesystem>

Options

ext4magic has a huge number of options to control its processing. There exist four different modes:

Filter options allow fine-grained control of the time range, directories, inodes and transactions used for recovery. Input/output options define the input sources used for recovery processing and where the recovered files are written.
One argument must always be specified: the file system. It can be given as a partition, a virtual block device or a filesystem image.



Magic options

These options recover files, in particular when files were deleted recursively or the whole file system was deleted at once. They start a powerful multi-stage recovery process that combines several different methods.

Attention:

Note: At the moment this function is fully supported for ext3 only in version 0.2.x and for ext4 only in version 0.3.x. Later versions are intended to support both.



-M
Recovery of all files on the filesystem. This option is useful if the whole filesystem was deleted.
-m
Recovery of deleted files. This option is useful if a limited number of files was deleted.


This function assumes:


Under these conditions ext4magic determines the optimal time parameters itself, and the command line needs only the option "-M" or "-m".
The function works well even if the deletion happened many days ago.

ext4magic /dev/sda3 -M -d /home/recoverdir


However, if the deletion was not the last action on the file system, or the deletion ran very slowly and took a long time (more than 5 minutes), or many individual files were deleted over a long period, then the option -a is additionally required, with a timestamp immediately before the beginning of the deletion. Otherwise the function will not work, or only a few files will be recovered.

ext4magic /dev/sda3 -M -a 1330042429 -d /home/recoverdir

You can determine this timestamp with the histogram function (see also the Histogram page, which still needs to be translated).
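
If the approximate time of the deletion is known as a calendar date, it can be converted to the epoch value expected by -a, and a value from an ext4magic report can be converted back for checking. The following is a minimal sketch that assumes GNU date; the function names are made up for this illustration.

```shell
# Convert a UTC date string to seconds since the epoch (GNU date assumed).
utc_to_epoch() {
    date -u -d "$1" +%s
}

# Convert seconds since the epoch back to a readable UTC date.
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

# The timestamp used in the example above:
#   epoch_to_utc 1330042429            -> 2012-02-24 00:13:49
#   utc_to_epoch '2012-02-24 00:13:49' -> 1330042429
```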



Recovery options

These options control the recovery algorithm used and the way recoverable files are displayed. Processing works recursively on directories and is influenced by the time options. The start directory for the search is given either as a directory name or as the inode number of a directory. The default is the root directory of the filesystem to recover. Files and directories can only be restored if an undeleted inode copy is found in the file system journal. The journal is designed for a different task, namely quickly restoring the consistency of the file system after a crash or similar problems, and not for recovering deleted files. So there is no guarantee that the journal contains such a copy for every file. Many factors determine whether enough data for a recovery is available (see the Journal page, not yet translated).
To find these copies, ext4magic needs a time window which defines the period of interest. The default time window is the last 24 hours. This means that, unless another time window is specified, only files deleted in the past 24 hours are recovered. If files were deleted longer ago, a different time window must be set; otherwise these files cannot be found, even if suitable inode copies are available in the journal. Background information and examples are on the Time Options page (currently only partially translated).



-r
All files below the start directory that have no conflicts are recovered with their existing data blocks. The list of files is identical to the files displayed with option -l. Symlinks and empty files have no blocks and therefore no conflicts with used blocks. They are therefore all recovered here, even though they were not necessarily deleted. It's not a bug, it's a feature ;-)

All recovered files get their old filenames and, if possible, their old properties. -f defines the start directory for the search, or -I defines the inode number of the start directory. If an inode number is given, the recovered directory gets the inode number as its directory name. If a file is recovered multiple times, # characters are appended to the filename; a filename gets at most five #. Individual files can also be recovered with time options and transaction numbers. If this function is run from the root directory, the first stage of the multi-stage Magic function additionally runs automatically. It searches for directories and files that were not found because some of the required directory information is missing. These files are stored in the MAGIC-1 and MAGIC-2 directories.



-R
Mostly the same functionality as -r, with two major differences: this option recovers all inodes even if their data blocks are in use, and the owner, access rights and timestamps of the directories are recovered. Empty directories are also recovered. If a directory was deleted completely as the last action, the behavior of -R and -r is identical, except that the directory attributes are also recovered.
If the file system has been written to after the files were deleted, some data blocks may have been re-used for new files. -r recovers only files of which no data blocks have been re-used; these are the undamaged files.
-R recovers the files with their original data blocks, whether or not some of those blocks are already used by new files. This may produce some corrupted files.



In the recovery output, each successfully recovered file is marked by

--------

in front of its filename. If access rights for writing the file are missing, some "x" characters show up.

At the end of the recovery, results from the hard link database may be displayed. A positive number in front of a filename means that hard links are missing. A negative number means that too many hard links were found: more filenames were found for this recovered inode than its link counter indicates. The causes can vary. If a consistent state of the hard links is important for your files, you should evaluate the output of the hard link database.

It's impossible to detect whether data blocks were re-used after the deletion of a file and the re-using file was later deleted as well. This can lead to files with defective data, so every recovered file should be checked before it's used again.





Display options

These options generate status information of the filesystem and the journal.

-S
Display the filesystem superblock. With option -x the group descriptor table is displayed as well.



-J
Display the content of the journal superblock.



-H
Display a histogram of the inode timestamps. This makes it possible to locate the exact time when mass changes happened on the filesystem. A directory or inode number can be passed to restrict the display to this directory or inode. For every inode the last change or deletion time is used. For ext4 a second histogram displays the creation time of the inodes. -x gives a finer resolution of the time interval.



-V
Displays only the ext4magic and libext2fs version used, and indicates whether the expert mode is active.



-T
Display the whole transaction list of all data block copies in the journal. -B, -I and -f can be used to restrict the display to the corresponding data blocks. With the additional option -x the transaction time is also displayed for inode blocks. The entries are displayed in the order in which they were written to the journal.



-x
Changes the output format and output content of most of the list options. See these options for more details about format and contents.



-L
Output all filenames with their inode numbers, recursively from the selected directory, including both deleted and undeleted files and directories. But remember that it looks in the journal data and is controlled by the time window of the time options. -L -x outputs the filenames in double quotes so that they can be used as input_list for option -i.



-l
Output the file names and directory names which refer to unused data blocks. Each line starts with the percentage of the file's data blocks that are unused. This output lists all files which can be recovered. If an earlier time limit is used with -l, files may be listed whose data blocks have already been overwritten by other files which were deleted in the meantime as well. The list also contains files without any data blocks, i.e. symbolic links, empty files and special files.

100% means that no data block of the deleted file is currently in use. The same command line with the option -r instead will recover all these files. This option can therefore easily be used to check whether the time options are chosen correctly.





Filter options

These options can be used to select the file, directory or the data block to process. These options cannot be combined in one command.

-B n
n is the number of the filesystem data block. The default is to create a "One-Byte Hex+ASCII" dump of the data block like the output of hexdump -C. -x creates a "4 Byte Hex+ASCII" dump. -t n displays the copy of this block in the journal from the transaction number given with option -t, and -T searches for all copies of this block in the journal.
Example:
# ext4magic /dir/filesystem.iso -B 97 -t 22
Creates a hexdump of the copy of file system block 97 which was written to the journal in transaction 22.
To display all transactions for block 97 with their transaction number use option -T.
Example:
# ext4magic /dir/filesystem.iso -B 97 -T



-I n
n is the inode number. The default is to display the contents of the current inode in the filesystem. -x additionally displays the list of all data blocks referred to by this inode. If the inode is a directory, the contents of the directory entries are displayed as well. If -T or -J is used, the output is not the real inode of the filesystem; instead all inode copies in the journal are displayed. Option -t n displays only the contents of the inode copy from transaction n.
Searching for specific missing files or file versions is often easier without using the file name. You can find out the inode number and then list all copies of this inode. Once you have found the correct inode copy for the file or directory version, you can recover it in a highly targeted way by specifying the inode number together with either the transaction number or time options that match this inode copy to the second.



-f filename
This option has the same selection function as -I n. ext4magic tries to locate the corresponding inode number in the filesystem. filename can be the name of a file or of a directory. It is relative to the root of the filesystem, not to the Linux root directory.
Example:
The mount point of the file system is /home and the full Linux path name is /home/usr1/Document. The filename for ext4magic then has to be given as
# ext4magic /dev/sda3 -f usr1/Document
The root directory of the filesystem can be specified as -f / or -f "". All other filenames should have no leading or trailing /.
To find the inode number for a file name, ext4magic looks within the time window. Files and directories can use different inode numbers at different times. In addition, all directories from the filesystem root down to the desired file must be found with the same time options. That will not always succeed. If -f filename does not work, you can try to find out the inode number yourself and then recover by inode number.



Time options

The following options define the time interval which is scanned in the journal and used to list or recover data. The interval start is defined by -a (AFTER) and the interval end by -b (BEFORE). In order to be listed or recovered, files must have existed in an undeleted state between AFTER and BEFORE. AFTER prevents the restoring of files deleted long ago, and all inode changes after BEFORE are ignored. The time has to be given as the number of seconds since the UNIX epoch. All inode copies which were deleted before AFTER or changed after BEFORE are ignored during processing.

If these options are not used, the last 24 hours are used as the time window. If the data loss is detected more than one day later, these options have to be used, because otherwise nothing will be recovered. (Exception: Magic options -M and -m)

-a n
Defines the start time limit (AFTER). Default is one day before the current time.
-b n
Defines the end time limit (BEFORE). Default is the current time.
n is the number of seconds since 01.01.1970 00:00:00 (UTC). This time format is used by ext4magic in many reports and can be taken from there. The date command can be used to convert a standard date into this format.

Example:

Set the interval start to three days ago and the interval end to two days ago
-a $(date -d "-3day" +%s) -b $(date -d "-2day" +%s)
-t n
This option can be used together with -B, -I and -f. n is the transaction number; it selects the inode copy with this transaction number for recovering data. Transaction numbers are listed by option -T or in the report of the inode contents.

See the Time Options page for more details and examples.
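
Time windows that are not whole days, for example from 36 hours ago to 24 hours ago, can be computed with plain shell arithmetic. The following is a minimal sketch; the function name time_window is made up for this illustration. Both limits are derived from a single reading of the clock so that the width of the window is exact.

```shell
# Print "-a START -b END" for a window that reaches from
# $1 hours ago (AFTER) to $2 hours ago (BEFORE).
time_window() {
    now=$(date +%s)
    echo "-a $(( now - $1 * 3600 )) -b $(( now - $2 * 3600 ))"
}

# Possible use: list what could be recovered from that window.
#   ext4magic /dev/sda3 -l $(time_window 36 24)
```

The command substitution is left unquoted on purpose, so that the shell splits the result into the four separate arguments.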





Input-/Output options

The following options define the input and output.

filesystem
Defines the input filesystem and is a mandatory parameter. It can be either a block device or partition containing an ext3 or ext4 filesystem, or an uncompressed image of a partition.



-j journal_file
This option defines the journal file to use for recovery. Default is to use the internal journal or an external journal on a block device as it's defined in the filesystem superblock. With this option it is possible to use a journal copy.



-d target_dir
This option defines the output directory of the recovered files. The directory will be created if it does not exist. Default is to use RECOVERDIR in the directory in which ext4magic was started. IMPORTANT: This directory shouldn't be located on the filesystem which is being recovered. The filesystem of the output directory should also be ext3 or ext4; otherwise there may be problems with the access rights and properties of the recovered files. And last but not least, this directory should have enough free space for all found files to be written there.



-i input_list
input_list is an input file which contains file names in double quotes. All these files will be recovered if option -r or -R is used and a suitable time window is set.
Empty lines, unquoted lines, incorrectly quoted filenames and any leading or trailing characters outside the quotes are ignored.
Options -l -x or -L -x create output in this format, which can be edited by hand or by script and later passed as input_list for recovery. Use the same time options for restoring as for creating the list.





Expert Options

Expert mode has to be enabled at compile time by passing --enable-expert-mode to configure. It allows corrupted filesystems to be opened for data recovery: superblock backups can be used to recreate a corrupted journal inode and to recover files from a partially corrupted filesystem.



-s blocksize -n blocknumber
These options select a superblock backup. Valid block sizes are 1024, 2048 and 4096. blocknumber is the filesystem block number to use. The block numbers of these backups depend on the block size. Use the same values for blocksize and blocknumber as you would with fsck or debugfs. Both options have to be given in the order -s, then -n.



-c
This option forces ext4magic to extract the journal inode from the superblock and helps to recover data if the first inode block of the filesystem is corrupt.



-D
All data from a corrupted filesystem will be recovered if possible. The combination of all these expert options is useful if the superblock and the beginning of the filesystem are corrupted. This works only if the filesystem hasn't already been repaired with e2fsck.
Example:
The first megabytes of the filesystem were overwritten. The following command tries to recover all undamaged files into the target directory /tmp/recoverdir.
# ext4magic /dev/sda1 -s 4096 -n 32768 -c -D -d /tmp/recoverdir



-Q
This option works only in addition to the recovery options -r and -R, and only if correlated directory data exists in the journal. It places special requirements on the quality of the journal data and requires very precise time settings, so it should be used carefully. If you are not sure, do not use this option.


More information about these options will follow on the Expert-Mode page in the near future.
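
For the options -s and -n, the block number of the first superblock backup follows from the block size, as long as the filesystem was created with the default of 8 × blocksize blocks per group. The following is a minimal sketch under that assumption; the function name first_backup_superblock is made up for this illustration, and the values are the ones the e2fsck manual page gives for its -b option. A filesystem created with non-default parameters has its backups elsewhere.

```shell
# Block number of the first backup superblock for a given block size,
# assuming the default number of blocks per group (8 * blocksize).
first_backup_superblock() {
    case $1 in
        1024) echo 8193 ;;
        2048) echo 16384 ;;
        4096) echo 32768 ;;
        *) echo "unsupported block size: $1" >&2; return 1 ;;
    esac
}

# Possible use, matching the -D example above:
#   ext4magic /dev/sda1 -s 4096 -n "$(first_backup_superblock 4096)" -c -D -d /tmp/recoverdir
```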





Examples

Some examples of using ext4magic. Typical scenarios with detailed descriptions will follow on a separate page in the near future.



Display examples

# ext4magic /dev/sda3 -f /
Displays the contents of the inode of the filesystem root directory.
# ext4magic /dev/sda3 -I 2
Displays the current root inode of the filesystem. The first example uses the pathname and the second uses the inode number; both produce the same output.



# ext4magic /tmp/filesystem.iso -f / -T -x
Display all transactions of block copies which belong to the root inode. In addition, every distinct inode copy is displayed with the list of data blocks it refers to. Because the inode is a directory inode, the existing block copies of the directory contents are displayed as well. The filesystem image /tmp/filesystem.iso is used. If there have been many changes in the directory and they are still in the journal, this output is a piece of the history of this directory.



# ext4magic /tmp/filesystem.iso -j /tmp/journal.backup -I 8195 -t 182
Uses the filesystem image /tmp/filesystem.iso and reads the journal from an external copy, /tmp/journal.backup. The output contains the contents of inode number 8195 from transaction number 182.
This is a very specific request for one particular inode copy in the journal.



# ext4magic /dev/sda3 -f user1/Documents -a $(date -d "-3 day" +%s) -b $(date -d "-2 day" +%s)
Outputs the contents of an existing inode copy for the pathname user1/Documents which was written to the journal between three and two days ago. If it is a directory, the contents of the directory are listed as well. If there are copies of directory blocks which match the inode copy in the journal, the directory contents are taken from them. If no such data blocks are found, the directory contents shown are the current contents in the filesystem.



Recovery examples

Attention:

Don't use the journal of a filesystem mounted read/write, because this can lead to incorrect recovery results. If for some reason the filesystem has to be kept in read/write mode, create a copy of the journal and use this copy with option -j during recovery. See the "Save journal only" paragraph above for details on how to create a copy of the journal.



# ext4magic /dev/sda3 -r -f user1/picture/cim01234.jpg -d /tmp
Recovery of the file /home/user1/picture/cim01234.jpg which was deleted a couple of minutes ago. The filesystem is mounted on /home. Note: The path is relative to the root of the recovered filesystem, not to the root of the Linux system on which the filesystem is mounted. If possible, unmount the filesystem before recovery. The recovered file will be stored at /tmp/user1/picture/cim01234.jpg



# ext4magic /dev/sda3 -r
Recover all files which were deleted in the last 24 hours and which have undeleted inode copies in the journal. The recovered files will be stored in ./RECOVERDIR/



# ext4magic /dev/sda3 -R -a $(date -d "-5day" +%s)
Recover all files by inode copies, even if parts of the deleted files are already overwritten or are still used by undeleted files, starting 5 days ago. Files that were never deleted from the file system are restored here as well. The option "-R" should therefore preferably be used on a completely cleared file system.



# ext4magic /dev/sda3 -M -d /home/recover
Try to recover all files in a multi-step process with different methods. Typical use case: the whole content of a filesystem was deleted with rm -rf *. Recovered files are stored in /home/recover
Note: The last step is currently included for ext3 and for ext4 only in different versions; see also the release note.



# ext4magic /home/filesystem.iso -Lx -f user1 | grep "jpg" > ./tmpfile
# ext4magic /home/filesystem.iso -i ./tmpfile -r -d /mnt/testrecover
Recovers all files deleted in the last 24 hours from directory user1/ which have the string jpg in their filename. The recovered files are stored at /mnt/testrecover. A temporary file ./tmpfile is used to hold the list of filenames to recover.
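
A plain grep "jpg" also matches names that merely contain the string, such as a directory called jpg-notes. The following is a minimal sketch of a stricter filter; the function name filter_jpeg is made up for this illustration. It relies only on the double-quoted filenames described for -L -x, and the exact layout of the rest of each line is an assumption.

```shell
# Keep only lines whose double-quoted filename ends in .jpg or .jpeg
# (upper or lower case). Reads the list on stdin, writes it to stdout.
filter_jpeg() {
    grep -iE '"[^"]*\.jpe?g"'
}

# Possible use in place of the grep in the example above:
#   ext4magic /home/filesystem.iso -Lx -f user1 | filter_jpeg > ./tmpfile
#   ext4magic /home/filesystem.iso -i ./tmpfile -r -d /mnt/testrecover
```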


