Writing File Signatures for the File Carver
⚠️ The FileSignature is still a work in progress and definitely has room for improvement but it does the job. This write-up explains how it is currently implemented.
The file carver checks for known file types at fixed offsets, stepping by a configurable increment known as the file carver interval. For example, if the interval is set to 0x800, the file carver tests for every known file type once every 0x800 bytes. We use a constant increment because file data is always stored aligned to some block size. On FATX, this block size is typically 0x4000, so that is the default increment. In this example we will assume an increment that is not aligned to 0x4000, such as 0x800. In the image below we've identified a file that exists at 0x2000. This is a made-up file that follows the "Magic" format.
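To make the effect of the interval concrete, here is a minimal sketch of which offsets get visited. It is not the FATXTools implementation; the `CarverOffsets` class and its method names are made up for illustration, and it assumes scanning starts at offset 0.

```csharp
using System.Collections.Generic;

// Hypothetical helper, not part of FATXTools.
static class CarverOffsets
{
    // Yields every offset that would be visited in [0, length) for a given interval.
    public static IEnumerable<long> Candidates(long length, long interval)
    {
        for (long offset = 0; offset < length; offset += interval)
        {
            yield return offset;
        }
    }

    // True if a file starting at fileOffset lands exactly on a visited offset.
    public static bool IsVisited(long fileOffset, long interval)
    {
        return fileOffset % interval == 0;
    }
}
```

With an interval of 0x800, `CarverOffsets.IsVisited(0x2000, 0x800)` is true, so the file at 0x2000 is tested. With the default 0x4000 it is false, which is why this example uses a smaller interval.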
The file types are implemented in C# and live in the `FATXTools/FATX/Analyzers/Signatures/` folder. At each increment, the file carver runs every one of these signatures against the current offset to determine whether that file type exists there.
❌ PNGSignature
❌ BMPSignature
❌ XBESignature
✔️ MagicSignature
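The per-offset check above can be sketched roughly as follows. This is a simplified stand-in that works on an in-memory byte array rather than a volume. `SignatureSketch`, `CarverSketch`, and their members are hypothetical names, and each signature is reduced to a single test delegate.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for a signature: a name plus a test over the bytes at an offset.
sealed class SignatureSketch
{
    public string Name { get; }
    public Func<byte[], long, bool> Test { get; }

    public SignatureSketch(string name, Func<byte[], long, bool> test)
    {
        Name = name;
        Test = test;
    }
}

static class CarverSketch
{
    // True if the ASCII string "magic" is found at the given offset.
    public static bool HasAsciiMagic(byte[] data, long offset, string magic)
    {
        if (offset < 0 || offset + magic.Length > data.Length)
        {
            return false;
        }
        return Encoding.ASCII.GetString(data, (int)offset, magic.Length) == magic;
    }

    // Runs every signature at every interval-aligned offset and records the matches.
    public static List<(long Offset, string Name)> Scan(
        byte[] data, long interval, IReadOnlyList<SignatureSketch> signatures)
    {
        var hits = new List<(long Offset, string Name)>();
        for (long offset = 0; offset < data.Length; offset += interval)
        {
            foreach (var signature in signatures)
            {
                if (signature.Test(data, offset))
                {
                    hits.Add((offset, signature.Name));
                }
            }
        }
        return hits;
    }
}
```

Given a buffer with "MGIC" written at 0x2000 and an interval of 0x800, only the Magic test matches, and only at 0x2000, mirroring the checklist above.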
Let's take a closer look at the Magic file type as shown below. The file starts with the identifier "MGIC", which we can use to tell the file carver that we found a valid "Magic" file. The size of the file is written next, likely as a 32-bit big-endian integer ("00 00 00 38"), and the format conveniently also stores the file name for us as "MyMagicFile.mg". Finally, there is an end-of-file marker, "ENDF".
Color | Offset | Description |
---|---|---|
🔴 | 0x0 | Identifier |
🟢 | 0x4 | File Size (int32) |
🔵 | 0xC | File Name (c-string) |
🟡 | 0x34 | End Identifier |
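As a rough illustration of this layout, the sketch below builds a 0x38-byte sample in memory and reads the fields back at the offsets from the table. It is independent of FATXTools. The `MagicLayoutSketch` class is hypothetical, and the bytes the table does not describe (0x8 to 0xB and the padding after the name) are assumed to be zero.

```csharp
using System;
using System.Text;

// Hypothetical helper that mirrors the table above; not part of FATXTools.
static class MagicLayoutSketch
{
    public const int IdentifierOffset = 0x0;
    public const int FileSizeOffset = 0x4;
    public const int FileNameOffset = 0xC;
    public const int EndIdentifierOffset = 0x34;

    // Builds a sample "Magic" file. Undescribed bytes are left as zero (assumption).
    public static byte[] BuildSample()
    {
        var data = new byte[0x38];
        Encoding.ASCII.GetBytes("MGIC").CopyTo(data, IdentifierOffset);
        data[FileSizeOffset + 3] = 0x38; // 00 00 00 38 in big-endian order
        Encoding.ASCII.GetBytes("MyMagicFile.mg").CopyTo(data, FileNameOffset);
        Encoding.ASCII.GetBytes("ENDF").CopyTo(data, EndIdentifierOffset);
        return data;
    }

    // Reads a 32-bit unsigned integer stored most significant byte first.
    public static uint ReadUInt32BigEndian(byte[] data, int offset)
    {
        return ((uint)data[offset] << 24)
             | ((uint)data[offset + 1] << 16)
             | ((uint)data[offset + 2] << 8)
             | data[offset + 3];
    }

    // Reads ASCII characters up to (not including) the first null byte.
    public static string ReadCString(byte[] data, int offset)
    {
        int end = Array.IndexOf(data, (byte)0, offset);
        if (end < 0)
        {
            end = data.Length;
        }
        return Encoding.ASCII.GetString(data, offset, end - offset);
    }
}
```

Reading the sample back with `ReadUInt32BigEndian(data, 0x4)` and `ReadCString(data, 0xC)` recovers the size 0x38 and the name "MyMagicFile.mg".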
Each file type is implemented as a class derived from `FileSignature`. All it should do is read the data, determine whether the data at a given offset is a specific file type, and if it is, parse the data to recover some of the file's properties, such as the name and, most importantly, the size. The file carver saves those properties, along with the offset at which the file was found, and lets the user dump the files at the end.
To create a new `FileSignature` class, we need to implement the `Test()` and `Parse()` methods. The `Test()` method simply checks if the file type exists at the offset given in the constructor. The `Parse()` method does the work of recovering as much information as possible from the file format.
The `FileSignature` class has utility methods to read binary data from the current offset, as well as a `SetByteOrder(ByteOrder)` method that lets you switch to either little or big endian. See the `FileSignature` class for implementation details.
Method | Description |
---|---|
Seek | Seeks to an offset relative to the base offset given in the constructor. |
SetByteOrder | Allows you to switch to little or big endian read modes. |
ReadBytes | Reads and returns an array of bytes. |
ReadUInt16 | Reads and returns a 16-bit unsigned integer. |
ReadUInt32 | Reads and returns a 32-bit unsigned integer. |
ReadCString | Reads and returns a null-terminated string. |
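To see why the byte order matters, the sketch below decodes the same four bytes both ways. It uses the standard `System.Buffers.Binary.BinaryPrimitives` helpers rather than the `FileSignature` read methods, and it assumes a runtime that ships that type (for example .NET Core 2.1 or later). The `ByteOrderSketch` class is a made-up name.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helper for comparing byte orders; not part of FATXTools.
static class ByteOrderSketch
{
    // Interprets the first four bytes as most-significant-byte first.
    public static uint AsBigEndian(byte[] bytes)
    {
        return BinaryPrimitives.ReadUInt32BigEndian(bytes);
    }

    // Interprets the first four bytes as least-significant-byte first.
    public static uint AsLittleEndian(byte[] bytes)
    {
        return BinaryPrimitives.ReadUInt32LittleEndian(bytes);
    }
}
```

For the bytes `00 00 00 38`, the big-endian reading is 0x38 while the little-endian reading is 0x38000000, so reading a big-endian size field in the wrong mode yields a wildly wrong file size.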
Property | Description |
---|---|
FileName | [Optional] Set this if you can recover the file name. Default file name is the file signature name (e.g. MagicSignature) with an index appended to it. |
FileSize | [Optional] Set this if you can recover the file size. Default is 0, which means no data is recovered. |
For this Magic file type, we can implement it like so:

    using System.Text; // Encoding.ASCII

    class MagicSignature : FileSignature
    {
        private const string Identifier = "MGIC";

        /// <summary>
        /// The constructor for this signature class. We usually don't do much in here except initialize the base class.
        /// </summary>
        /// <param name="volume">The volume contains file system information.</param>
        /// <param name="offset">The offset tells you what offset the file carver is currently at.</param>
        public MagicSignature(Volume volume, long offset) : base(volume, offset)
        {
        }

        /// <summary>
        /// Returns whether or not the file type exists at the offset given in the constructor.
        /// </summary>
        public override bool Test()
        {
            byte[] identifier = ReadBytes(4);

            // If the offset is at 0x2000, then this should return true.
            if (Encoding.ASCII.GetString(identifier) == Identifier)
            {
                return true;
            }

            // If the offset is not at 0x2000 and the identifier does not match, we must return false.
            // Returning false prevents the Parse() method from being called and tells the file carver that
            // this file type does not exist at the given offset.
            return false;
        }

        /// <summary>
        /// Read the file properties if any are available.
        /// </summary>
        public override void Parse()
        {
            // Default endian is little, but we need big endian for this file type.
            SetByteOrder(ByteOrder.Big);

            // Seek to offset 0x4. Note that this is relative to the offset given in the constructor.
            Seek(0x4);

            // Read the 32-bit file size we noted earlier.
            this.FileSize = ReadUInt32();

            // Seek to offset 0xC (12 in decimal).
            Seek(0xC);

            // Read the file name we noted earlier.
            this.FileName = ReadCString();
        }
    }

We don't really need the ENDF identifier since we have the size of the file already, but it can still be useful to let the user know that the file may be corrupted if we don't find what we expect to find.

    private const string EndIdentifier = "ENDF";

    public override void Parse()
    {
        ...
        // We do this sometime after reading the FileSize.
        // Seek to the end of the file minus 4 to try and read the end identifier.
        Seek(this.FileSize - 4);
        var endIdentifier = ReadBytes(4);

        // Compare against the EndIdentifier constant, not the byte array we just read.
        if (Encoding.ASCII.GetString(endIdentifier) != EndIdentifier)
        {
            Console.WriteLine("End identifier does not exist at end of file! File may be corrupted!");
        }
        ...
    }

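One thing to keep in mind is that a corrupted size field can point past the end of the available data. The sketch below shows one way to guard the end-marker check against that. It works on a plain byte array instead of the `FileSignature` read methods, and `EndMarkerSketch.HasEndIdentifier` is a hypothetical helper, not something FATXTools provides.

```csharp
using System.Text;

// Hypothetical helper, not part of FATXTools.
static class EndMarkerSketch
{
    // Returns true only if the end identifier sits in the last bytes of the file,
    // and the claimed file size actually fits inside the available data.
    public static bool HasEndIdentifier(
        byte[] data, long baseOffset, long fileSize, string endIdentifier = "ENDF")
    {
        // A file smaller than the marker itself cannot contain it.
        if (fileSize < endIdentifier.Length)
        {
            return false;
        }

        long markerOffset = baseOffset + fileSize - endIdentifier.Length;

        // Reject sizes that would make us read outside the buffer.
        if (baseOffset < 0 || markerOffset + endIdentifier.Length > data.Length)
        {
            return false;
        }

        string found = Encoding.ASCII.GetString(data, (int)markerOffset, endIdentifier.Length);
        return found == endIdentifier;
    }
}
```

For the example file, `HasEndIdentifier(data, 0x2000, 0x38)` looks at offset 0x2034, which is where "ENDF" should be.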
The output in FATXTools should then list the recovered file:
File Type | Offset | File Name | File Size |
---|---|---|---|
MagicSignature | 0x2000 | MyMagicFile.mg | 0x38 |