Hex-Edit your way through the volume
This how-to document explains how to find a file manually in an NTFS volume. You will need
Finding the $MFT
Open the volume using your favourite hex-editor. The hex listings in this document were generated with the standard hexdump utility. At the very beginning (offset 0), you should see something similar to:
# hexdump -Cn 16 /dev/hda1
00000000  eb 52 90 4e 54 46 53 20  20 20 20 00 02 08 00 00  |.R.NTFS    .....|
The first 3 bytes are a jump instruction in x86 assembly. It is the first instruction executed after the boot sector receives control from the MBR (let's say, from GRUB or LILO). Right after that you find the 8-byte magic: “NTFS” followed by four spaces.
Since this is the boot sector, let's open the NTFS Technical documentation on the $Boot entry (in the files chapter), and see what information we can extract here. Well, it does say that offset 0 has a 3-byte jump and an 8-byte magic, so I'm right this far. But it also says something about “LCN of VCN 0 of the $MFT”. What is that?
The MFT is the Master File Table. It is a table (ya right) of entries (that's what a table is about), where each entry is called a “file record”. This is the table that lists every file, directory and piece of meta-data on the volume. We are obviously interested in that (unless all you want is to find the volume serial number or some other trivial stuff).
And what are those LCN/VCN abbreviations? These are Logical/Virtual Cluster Numbers. If you don't know what a cluster is, look it up before going on.
Let's return to our example. We need to find the “LCN of VCN 0 of the $MFT”:
# hexdump -C -s 0x30 -n 8 /dev/hda1
00000030  0a 00 00 00 00 00 00 00                           |........|
(Almost) all the NTFS structures are in Little-Endian format. In simple and practical terms, the least significant byte comes first, so numbers are read backwards. Therefore, the LCN we need is 0x0A, or in decimal: 10.
The value at offset 0x0B in $Boot is 0x0200 (see the first listing, and don't forget that it is in Little-Endian format), and the value at offset 0x0D is 0x08. That means a sector is 512 bytes and a cluster is 8 sectors, so each cluster (on my system) is 512*8=4096 (=0x1000) bytes. Now we know where to find the $MFT: at offset 4096*10=40960 (=0xA000). Let's dump the very beginning of it:
# hexdump -C -s 0xa000 -n 32 /dev/hda1
0000a000  46 49 4c 45 2a 00 03 00  41 c7 b4 c6 00 00 00 00  |FILE*...A.......|
0000a010  01 00 01 00 30 00 01 00  c0 01 00 00 00 04 00 00  |....0...........|
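The arithmetic above can also be sketched in a few lines of Python. This is only an illustration, not part of the original how-to: the function name parse_boot_sector and the sample buffer are my own, and the buffer is rebuilt from the bytes shown in the listings rather than read from a real device.

```python
import struct

def parse_boot_sector(boot: bytes) -> dict:
    """Pull out the $Boot fields used in this how-to (hypothetical helper)."""
    # Bytes 0-2 are the jump; bytes 3-10 are the 8-byte magic.
    if boot[3:11] != b"NTFS    ":
        raise ValueError("magic not found, this does not look like NTFS")
    # "<" means Little-Endian: H = 2 bytes at 0x0B, B = 1 byte at 0x0D.
    bytes_per_sector, sectors_per_cluster = struct.unpack_from("<HB", boot, 0x0B)
    # 8-byte LCN of VCN 0 of the $MFT at offset 0x30.
    (mft_lcn,) = struct.unpack_from("<Q", boot, 0x30)
    cluster_size = bytes_per_sector * sectors_per_cluster
    return {
        "cluster_size": cluster_size,
        "mft_lcn": mft_lcn,
        "mft_offset": mft_lcn * cluster_size,
    }

# Sample buffer assembled from the two boot-sector listings in this document.
sample = bytearray(0x38)
sample[0:16] = bytes.fromhex("eb 52 90 4e 54 46 53 20 20 20 20 00 02 08 00 00")
sample[0x30:0x38] = bytes.fromhex("0a 00 00 00 00 00 00 00")

info = parse_boot_sector(bytes(sample))
print(info["cluster_size"], info["mft_lcn"], hex(info["mft_offset"]))
```

With the sample bytes this gives a cluster size of 4096, LCN 10 and an $MFT offset of 0xa000, the same values worked out by hand. On a real volume you would read the first sector from the device instead of building the buffer.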
The dump starts with the magic “FILE”. Congratulations, you (me) have found the (first entry in the) $MFT file.
Finding a specific MFT record
The $MFT is a stream of MFT records, a.k.a. file records.
The size of each MFT record is stored in $Boot at offset 0x40:
# hexdump -C -s 0x40 -n 4 /dev/hda1
00000040  f6 00 00 00                                       |....|
On my volume, this is -10: the first byte, 0xF6, read as a signed 8-bit number (if you don't know how to recognize negative numbers, this is the time to learn). Looking again at the $Boot entry, we find out that -10 means that each MFT record contains exactly 2^10 = 1024 bytes.
Let's suppose we want to find MFT record #1234. We can find it at offset 1234 * 1024 = 1263616 in the $MFT. A regular guy would try to add the $MFT offset in the volume and would conclude that the record is at 1234 * 1024 + 40960 = 1304576 from the start of the volume. This is wrong because the MFT can become fragmented. The VCNs of the $MFT are then not laid out sequentially on the volume and, as a result, the LCN of VCN x is not always x + 10 (10 comes from the example). The real question is “Where is $MFT offset 1234 * 1024 on the volume?”
Seeking an offset in a file
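The first half of any seek is plain arithmetic, so here is a small Python sketch of it for record #1234. The function names are made up for this example. The negative case follows the -10 example above; the positive case (a plain count of clusters) is my understanding of the $Boot field and is not covered by this how-to, so treat it as an assumption.

```python
def mft_record_size(raw_byte: int, cluster_size: int) -> int:
    """Decode the byte at $Boot offset 0x40 into a record size in bytes."""
    # Reinterpret the unsigned byte as a signed 8-bit number (0xF6 -> -10).
    signed = raw_byte - 0x100 if raw_byte >= 0x80 else raw_byte
    if signed < 0:
        return 1 << -signed          # -10 -> 2**10 = 1024 bytes
    return signed * cluster_size     # assumption: positive = clusters per record

def locate_in_file(record_number: int, record_size: int, cluster_size: int):
    """Return (byte offset in $MFT, VCN, byte offset inside that VCN)."""
    offset = record_number * record_size
    vcn, within = divmod(offset, cluster_size)
    return offset, vcn, within

record_size = mft_record_size(0xF6, 4096)
print(record_size, locate_in_file(1234, record_size, 4096))
```

For the example volume this reports a record size of 1024 and places record #1234 at $MFT offset 1263616, which is 2048 bytes into VCN 308. Which LCN holds VCN 308 is the part that still has to come from the run-list.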
Suppose you have a
Most attributes are resident, however, some can become nonresident. A resident attribute is an attribute that stores its data inside the file record itself. A nonresident attribute stores its data elsewhere on the volume, in clusters described by a run-list.
To find an offset in the file, we need to find which VCN this offset is in. In the example, this is in the middle of VCN 308 (1234 * 1024 / 4096 = 308.5). Then we look at the run-list to map it to an LCN. Don't forget to remove the fixup before searching through the run-list - this is one of the most common mistakes.
Finding the run-list
Look at the MFT record, read offset 0x14 (Offset to the first Attribute), and go through that list by using offset 0x4 of each attribute (Length) until you find the one you are looking for (usually 0x80).
Removing the fixup
There's a long description of that in the NTFS Technical documentation. No need to duplicate it here.
Finding a specific MFT record (cont)
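Putting the pieces together, the two lookups described so far (walking the attributes of a record, and mapping a VCN to an LCN) might look like the Python sketch below. Several things here are assumptions of mine and not part of this how-to: the function names, the 0xFFFFFFFF end-of-attributes marker, and the run-list being given as an already decoded list of (starting LCN, length in clusters) pairs. Decoding the on-disk run-list bytes and removing the fixup are left to the NTFS Technical documentation.

```python
import struct

END_OF_ATTRIBUTES = 0xFFFFFFFF  # assumption: marker that ends the attribute list

def find_attribute(record: bytes, wanted_type: int = 0x80) -> int:
    """Return the offset of the first attribute of wanted_type, or -1.

    The record is assumed to have had its fixup removed already.
    """
    (pos,) = struct.unpack_from("<H", record, 0x14)   # offset to first attribute
    while pos + 8 <= len(record):
        # Each attribute starts with its type (4 bytes) and length (4 bytes).
        attr_type, length = struct.unpack_from("<II", record, pos)
        if attr_type == END_OF_ATTRIBUTES or length == 0:
            break
        if attr_type == wanted_type:
            return pos
        pos += length
    return -1

def vcn_to_lcn(runs, vcn: int) -> int:
    """Map a VCN to an LCN using decoded (start_lcn, length) runs in VCN order."""
    first_vcn = 0
    for start_lcn, length in runs:
        if vcn < first_vcn + length:
            return start_lcn + (vcn - first_vcn)
        first_vcn += length
    raise ValueError("VCN is beyond the end of the run-list")

def volume_offset(runs, file_offset: int, cluster_size: int) -> int:
    """Translate a byte offset inside a file to a byte offset on the volume."""
    vcn, within = divmod(file_offset, cluster_size)
    return vcn_to_lcn(runs, vcn) * cluster_size + within

# Hypothetical run-lists for the $MFT: one contiguous, one fragmented.
contiguous = [(10, 1000)]
fragmented = [(10, 200), (5000, 300)]
print(volume_offset(contiguous, 1234 * 1024, 4096))
print(volume_offset(fragmented, 1234 * 1024, 4096))
```

With the contiguous run-list the result is 1304576, which is exactly the naive "add the $MFT offset" answer. With the made-up fragmented run-list, VCN 308 lands in the second run and the record sits at 20924416 instead, which is why the run-list has to be consulted.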
So by now, we know where the $MFT starts. We have the first file record. By looking in the
And since we have the
Notice that this is almost cheating: to find the file record of $MFT we don't look it up like every other file record (using the run-list). This is because a cluster is guaranteed to be larger than a file record.
Of course, we know how to find an offset inside the $MFT because we know how to use that run-list, and we know how to calculate that offset, so we're set. Now all that we need is to find the MFT record number of the file we need.
Translating a filename to its MFT record number
MFT record #5 is called . (dot). This is the root directory. You will need to split the path leading to your file into its sub-directories. Then you will need to scan each directory, starting with ”.”, until you find your file. But since this how-to has become large enough, and has fulfilled its goal of helping you get started, you will have to use another how-to.
Conclusion
You now have experience in using a hex-viewer/editor and have looked (logically) at your hard-disk from a very low level. You also have experience in reading the NTFS Technical documentation. That means you now have the tools (and are starting to develop the skill) to read every little detail that resides on NTFS volumes.
Congratulations, you have reached the end of this how-to. I hope you enjoyed your flight.
The Linux-NTFS team.