ReFS File System structure

A ReFS filesystem can be identified with the following signature at the very start of the partition:

    00 00 00 52  65 46 53 00  00 00 00 00  00 00 00 00 ...ReFS.........
    46 53 52 53  XX XX XX XX  XX XX XX XX  XX XX XX XX FSRS
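
A minimal sketch of checking for this signature, assuming the first 20 bytes of the partition have already been read. The function and constant names are illustrative only.

```python
# Bytes 0..7 and 16..19 of the partition, as shown in the dump above.
REFS_MAGIC = b"\x00\x00\x00ReFS\x00"
FSRS_MAGIC = b"FSRS"

def looks_like_refs(boot: bytes) -> bool:
    """Return True if the buffer starts with the ReFS signature."""
    return boot[0:8] == REFS_MAGIC and boot[16:20] == FSRS_MAGIC

# The known bytes from the dump; bytes 8..15 are zero there.
sample = bytes.fromhex("0000005265465300" "0000000000000000" "46535253")
print(looks_like_refs(sample))
```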

ReFS pages are 0x4000 bytes in length.

On all inspected systems, the first page number is 0x1e (0x78000 bytes after the start of the partition containing the filesystem). This is in line with Microsoft documentation, which states that the first metadata dir is at a fixed offset on the disk.
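
The page-number arithmetic can be sketched as follows. This assumes the partition is available as a seekable binary stream; the helper names are hypothetical, and the demo image is synthetic.

```python
import io
from typing import BinaryIO

PAGE_SIZE = 0x4000   # page length stated above
FIRST_PAGE = 0x1E    # first page number seen on the inspected systems

def page_offset(page_number: int) -> int:
    """Byte offset of a page from the start of the partition."""
    return page_number * PAGE_SIZE

def read_page(partition: BinaryIO, page_number: int) -> bytes:
    """Read one whole page from a seekable partition image."""
    partition.seek(page_offset(page_number))
    return partition.read(PAGE_SIZE)

print(hex(page_offset(FIRST_PAGE)))  # 0x78000

# Synthetic two-page image, only to show the call.
image = io.BytesIO(bytes(PAGE_SIZE) + b"\x01" * PAGE_SIZE)
second_page = read_page(image, 1)
```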

Other pages contain various system, directory, and volume structures and tables, as well as journaled versions of each page (shadow-written upon regular disk writes).

The first dword of each page is its Page Number.

The first 0x30 bytes of every metadata page (dubbed the Page Header) seem to follow a certain pattern:


    byte  0: XX XX 00 00   00 00 00 00   YY 00 00 00   00 00 00 00
    byte 16: 00 00 00 00   00 00 00 00   ZZ ZZ 00 00   00 00 00 00
    byte 32: 01 00 00 00   00 00 00 00   00 00 00 00   00 00 00 00

  • dword 0 (XX XX) is the page number which is sequential and corresponds to the 0x4000 offset of the page
  • dword 2 (YY) is the journal number or sequence number
  • dword 6 (ZZ ZZ) is the "Virtual Page Number", which is non-sequential (i.e. values are in no apparent order) and seems to tie related pages together.
  • dword 8 is always 01, perhaps an "allocated" flag or other

Multiple pages may share a virtual page number (byte 24/dword 6) but usually don't appear in sequence.
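
As a rough sketch, the three identified header fields can be pulled out with `struct`. The field widths (4-byte little-endian dwords) follow the pattern above; the class and function names are made up for illustration.

```python
import struct
from typing import NamedTuple

class PageHeader(NamedTuple):
    page_number: int          # dword 0, byte offset 0
    sequence: int             # dword 2, byte offset 8
    virtual_page_number: int  # dword 6, byte offset 24

def parse_page_header(page: bytes) -> PageHeader:
    """Decode the identified fields of the 0x30-byte Page Header."""
    (page_number,) = struct.unpack_from("<I", page, 0)
    (sequence,) = struct.unpack_from("<I", page, 8)
    (vpn,) = struct.unpack_from("<I", page, 24)
    return PageHeader(page_number, sequence, vpn)
```

The dword at byte 32 (always 01 so far) is left undecoded since its meaning is still unknown.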

The Object Table (virtual page number 0x02) associates object IDs with the pages on which they reside. Here we see an AttributeList consisting of Records of key/value pairs (see below for the specifics of these data structures). We can look up the object ID of the root directory (0x600, stored little-endian in the second qword of the key) to retrieve the page on which it resides:

   50 00 00 00 10 00 10 00 00 00 20 00 30 00 00 00 - total length / key & value boundaries
   00 00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 - object id
   F4 0A 00 00 00 00 00 00 00 00 02 08 08 00 00 00 - page id / flags
   CE 0F 85 14 83 01 DC 39 00 00 00 00 00 00 00 00 - checksum
   08 00 00 00 08 00 00 00 04 00 00 00 00 00 00 00

^ The object table entry for the root dir, containing its page (0xAF4)
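
A worked decode of the entry above, as a sketch. It assumes the Record header layout described further down, that the object ID is the second little-endian qword of the key, and that the page ID is the first qword of the value; the widths are guesses that happen to fit these bytes.

```python
import struct

entry = bytes.fromhex(
    "50000000 10001000 00002000 30000000"
    "00000000 00000000 00060000 00000000"
    "F40A0000 00000000 00000208 08000000"
    "CE0F8514 8301DC39 00000000 00000000"
    "08000000 08000000 04000000 00000000"
)

# total length, key offset, key length, flags, value offset, value length
total, key_off, key_len, flags, val_off, val_len = struct.unpack_from(
    "<IHHHHI", entry, 0
)
key = entry[key_off:key_off + key_len]
value = entry[val_off:val_off + val_len]

object_id = int.from_bytes(key[8:16], "little")   # second qword of the key
page_id = int.from_bytes(value[0:8], "little")    # first qword of the value
print(hex(object_id), hex(page_id))  # 0x600 0xaf4
```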

When retrieving pages by id or virtual page number, look for the ones with the highest sequence number as those are the latest copies of the shadow-write mechanism.
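
A small sketch of that selection step, assuming the page headers have already been scanned into (page number, sequence number, virtual page number) tuples. The tuple layout and the sample values are made up.

```python
from typing import Iterable, Optional, Tuple

# (page_number, sequence, virtual_page_number)
Header = Tuple[int, int, int]

def latest_copy(headers: Iterable[Header], virtual_page_number: int) -> Optional[Header]:
    """Pick the copy with the highest sequence number for one virtual page."""
    candidates = [h for h in headers if h[2] == virtual_page_number]
    return max(candidates, key=lambda h: h[1], default=None)

# Made-up scan results: two copies of virtual page 0x02, one other page.
scanned = [(0x1E, 1, 0x02), (0x20, 3, 0x02), (0x21, 2, 0x0D)]
newest = latest_copy(scanned, 0x02)
```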

Directories, from the root dir down, follow a consistent pattern. They consist of sequential lists of data structures whose length is given by the first word value (Attributes and Attribute Lists).

Lists are often prefixed with a Header Attribute defining the total length of the Attributes that follow and constitute the list. This is not a hard rule, though: a list may also reside in the body of another Attribute, without a header (more on that below).

In either case, Attributes may be parsed by iterating over the bytes after the directory page header: read the first word to determine how many more bytes to read (the length minus the size of that first word), process them, and repeat until null (0000) is encountered, being sure to skip any specified padding along the way.
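
The iteration can be sketched like this. It assumes the length prefix is a 4-byte little-endian value that counts itself (which matches the examples in these notes) and does not attempt to handle list padding; `iter_attributes` is an illustrative name.

```python
import struct
from typing import Iterator, Optional, Tuple

def iter_attributes(buf: bytes, start: int = 0,
                    end: Optional[int] = None) -> Iterator[Tuple[int, bytes]]:
    """Yield (offset, attribute bytes) until a null length or `end`."""
    pos = start
    end = len(buf) if end is None else end
    while pos + 4 <= end:
        (length,) = struct.unpack_from("<I", buf, pos)
        if length == 0:
            break  # null terminator
        if length < 4 or pos + length > end:
            raise ValueError(f"bad attribute length {length:#x} at {pos:#x}")
        yield pos, buf[pos:pos + length]
        pos += length

# Synthetic buffer: an 8-byte attribute, a 6-byte attribute, then nulls.
demo = struct.pack("<I", 8) + b"AAAA" + struct.pack("<I", 6) + b"BB" + bytes(4)
found = list(iter_attributes(demo))
```

Passing `end` lets the same loop walk a headerless run of Attributes when the enclosing length is already known.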

Various Attributes take on different semantics including references to subdirs and files as well as branches to additional pages containing more directory contents (for large directories); though not all Attributes have been identified.

The structures in a directory listing always seem to be of one of the following formats:

Base Attribute

The simplest / base attribute consisting of a block whose length is given at the very start.

An example of a typical Attribute follows:

      a8 00 00 00  28 00 01 00  00 00 00 00  10 01 00 00
      10 01 00 00  02 00 00 00  00 00 00 00  00 00 00 00
      00 00 00 00  00 00 00 00  a9 d3 a4 c3  27 dd d2 01
      5f a0 58 f3  27 dd d2 01  5f a0 58 f3  27 dd d2 01
      a9 d3 a4 c3  27 dd d2 01  20 00 00 00  00 00 00 00
      00 06 00 00  00 00 00 00  03 00 00 00  00 00 00 00
      5c 9a 07 ac  01 00 00 00  19 00 00 00  00 00 00 00
      00 00 01 00  00 00 00 00  00 00 00 00  00 00 00 00
      00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00
      00 00 00 00  00 00 00 00  01 00 00 00  00 00 00 00
      00 00 00 00  00 00 00 00

Here we see a section of length 0xA8 containing the following four file timestamps (more on this conversion below):

       a9 d3 a4 c3  27 dd d2 01 - 2017-06-04 07:43:20
       5f a0 58 f3  27 dd d2 01 - 2017-06-04 07:44:40
       5f a0 58 f3  27 dd d2 01 - 2017-06-04 07:44:40
       a9 d3 a4 c3  27 dd d2 01 - 2017-06-04 07:43:20

It is safe to assume that one of the following holds:

  • one of the first fields in any given Attribute contains an identifier detailing how the attribute should be parsed,
  • the context is given by the Attribute's position in the list, or
  • attributes with a given meaning are referenced by address or identifier elsewhere.

Records

Key / Value pairs whose total length and key / value offsets and lengths are given in the first 0x10 bytes of the attribute. These are used to associate metadata sections with files: the names are recorded in the keys and the contents in the values.

An example of a typical Record follows:

    40 04 00 00   10 00 1A 00   08 00 30 00   10 04 00 00   @.........0.....
    30 00 01 00   6D 00 6F 00   66 00 69 00   6C 00 65 00   0...m.o.f.i.l.e.
    31 00 2E 00   74 00 78 00   74 00 00 00   00 00 00 00   1...t.x.t.......
    A8 00 00 00   28 00 01 00   00 00 00 00   10 01 00 00   ¨...(...........
    10 01 00 00   02 00 00 00   00 00 00 00   00 00 00 00   ................
    00 00 00 00   00 00 00 00   A9 D3 A4 C3   27 DD D2 01   ........©Ó¤Ã'ÝÒ.
    5F A0 58 F3   27 DD D2 01   5F A0 58 F3   27 DD D2 01   _ Xó'ÝÒ._ Xó'ÝÒ.
    A9 D3 A4 C3   27 DD D2 01   20 00 00 00   00 00 00 00   ©Ó¤Ã'ÝÒ. .......
    00 06 00 00   00 00 00 00   03 00 00 00   00 00 00 00   ................
    5C 9A 07 AC   01 00 00 00   19 00 00 00   00 00 00 00   \..¬............
    00 00 01 00   00 00 00 00   00 00 00 00   00 00 00 00   ................
    00 00 00 00   00 00 00 00   00 00 00 00   00 00 00 00   ................
    00 00 00 00   00 00 00 00   01 00 00 00   00 00 00 00   ................
    00 00 00 00   00 00 00 00   20 00 00 00   A0 01 00 00   ........ ... ...
    D4 00 00 00   00 02 00 00   74 02 00 00   01 00 00 00   Ô.......t.......
    78 02 00 00   00 00 00 00 ...(cutoff)                   x.......

Here we see the Record parameters given by the first row:

  • total length - 4 bytes = 0x440
  • key offset - 2 bytes = 0x10
  • key length - 2 bytes = 0x1A
  • flags / identifier - 2 bytes = 0x08
  • value offset - 2 bytes = 0x30
  • value length - 4 bytes = 0x410 (the upper two bytes are zero here, so this may instead be a 2-byte field followed by padding)

Naturally, the Record finishes after the value, 0x410 bytes after the value start at 0x30, or 0x440 bytes after the start of the Record (which lines up with the total length).

We also see that this Record corresponds to a file I created on disk as the key is the File Metadata flag (0x10030) followed by the filename (mofile1.txt).

Here the first attribute in the Record value is the simple attribute we discussed above, containing the file timestamps. The File Reference Attribute List Header follows (more on that below).

From observation, Records with flag values of '0' or '8' are the ones we are looking for. The value '4' also occurs often, but it almost always seems to indicate a Historical Record, i.e. a Record that has since been replaced by another.

Since Records are prefixed with their total length, they can be thought of as a subclass of Attribute.
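
A sketch of a Record parser based on the header fields listed above. The value length is read as 4 bytes here, which is an assumption; `Record`, `parse_record` and `is_current` are illustrative names. Only the first 0x30 bytes of the example are fed in, so the value slice comes back empty.

```python
import struct
from typing import NamedTuple

class Record(NamedTuple):
    total_length: int
    flags: int
    key: bytes
    value: bytes

def parse_record(buf: bytes, offset: int = 0) -> Record:
    """Split a Record into key and value using its 0x10-byte header."""
    total, key_off, key_len, flags, val_off, val_len = struct.unpack_from(
        "<IHHHHI", buf, offset
    )
    key = buf[offset + key_off:offset + key_off + key_len]
    value = buf[offset + val_off:offset + val_off + val_len]
    return Record(total, flags, key, value)

def is_current(rec: Record) -> bool:
    """Flags 0 and 8 were the live Records seen so far; 4 looked historical."""
    return rec.flags in (0, 8)

head = bytes.fromhex(
    "40040000 10001A00 08003000 10040000"
    "30000100 6D006F00 66006900 6C006500"
    "31002E00 74007800 74000000 00000000"
)
rec = parse_record(head)
key_type = int.from_bytes(rec.key[:4], "little")
name = rec.key[4:].decode("utf-16-le")
print(hex(key_type), name)  # 0x10030 mofile1.txt
```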

AttributeList

These are more complicated but interesting. At first glance they are simple Attributes of length 0x20, but upon further inspection we consistently see that they contain the length of a larger block of Attributes (this length is inclusive, as it counts this first Attribute). After parsing this Attribute, dubbed the 'List Header', we should read the remaining bytes in the List as well as the padding before arriving at the next Attribute.

   20 00 00 00   A0 01 00 00   D4 00 00 00   00 02 00 00 <- list header specifying total length (0x1A0) and padding (0xD4)
   74 02 00 00   01 00 00 00   78 02 00 00   00 00 00 00
   80 01 00 00   10 00 0E 00   08 00 20 00   60 01 00 00
   60 01 00 00   00 00 00 00   80 00 00 00   00 00 00 00
   88 00 00 00  ... (cutoff)

Here we see an Attribute of length 0x20 that contains a reference to a larger block size (0x1A0) in its third word (byte offset 4).

This can be confirmed by the next Attribute, whose size (0x180) is the larger block size minus the length of the header (0x1A0 - 0x20). In this case the list only contains one item/child attribute.

A simple strategy to parse the general case would be to:

  • Parse Attributes individually as normal
  • If we encounter a List Header Attribute, we calculate the size of the list (total length minus header length)
  • Then continue parsing Attributes, adding them to the list until the total length is completed.

It also seems that:

  • the padding that occurs after the list is given by header word number 5 (byte offset 8, in this case 0xD4). After the list is parsed, we consistently see this many null bytes before the next Attribute begins (which is not part of & unrelated to the list).
  • the type of list is given by its 7th word (byte offset 12); directory contents correspond to 0x200 while directory branches are indicated with 0x301
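
The strategy and the two observations above could be sketched as follows. The offsets come from the example header; reading total length and padding as 4-byte values is an assumption, and all names are illustrative.

```python
import struct
from typing import List, NamedTuple, Tuple

class ListHeader(NamedTuple):
    header_length: int  # byte offset 0 (0x20 in the example)
    total_length: int   # byte offset 4, includes the header itself
    padding: int        # byte offset 8, null bytes following the list
    list_type: int      # 16-bit word at byte offset 12 (0x200 or 0x301)

def parse_list_header(buf: bytes, offset: int = 0) -> ListHeader:
    header_length, total_length, padding = struct.unpack_from("<III", buf, offset)
    (list_type,) = struct.unpack_from("<H", buf, offset + 12)
    return ListHeader(header_length, total_length, padding, list_type)

def read_list(buf: bytes, offset: int = 0) -> Tuple[ListHeader, List[bytes], int]:
    """Return (header, child attributes, offset of the next Attribute)."""
    header = parse_list_header(buf, offset)
    pos = offset + header.header_length
    end = offset + header.total_length
    children = []
    while pos < end:
        (length,) = struct.unpack_from("<I", buf, pos)
        if length == 0:
            break  # truncated or malformed list
        children.append(buf[pos:pos + length])
        pos += length
    return header, children, end + header.padding

example = parse_list_header(bytes.fromhex(
    "20000000 A0010000 D4000000 00020000"
    "74020000 01000000 78020000 00000000"
))
print(hex(example.total_length - example.header_length))  # 0x180
```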

Directory Tree Branches

These are Attribute Lists where each Attribute corresponds to a record whose value references a page which contains more directory contents.

Upon encountering an AttributeList header with flag value 0x301, we should

  • iterate over the Attributes in the list,
  • parse them as Records,
  • use the first dword in each value as the page to repeat the directory traversal process (recursively).

Additional files and subdirs found on the referenced pages should be appended to the list of current directory contents.
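
The recursion might look roughly like this. To keep the sketch short, pages are modelled as already-parsed lists: `load_lists` is a hypothetical callback returning `(list_type, [(key, value), ...])` tuples for a page, and the demo pages are synthetic.

```python
import struct
from typing import Callable, List, Optional, Set, Tuple

DIR_CONTENTS = 0x200
DIR_BRANCH = 0x301

Entry = Tuple[bytes, bytes]                  # (record key, record value)
ParsedList = Tuple[int, List[Entry]]         # (list type, records)

def collect_entries(page_id: int,
                    load_lists: Callable[[int], List[ParsedList]],
                    seen: Optional[Set[int]] = None) -> List[Entry]:
    """Gather directory entries from a page and every page it branches to."""
    seen = set() if seen is None else seen
    if page_id in seen:          # guard against reference loops
        return []
    seen.add(page_id)
    entries: List[Entry] = []
    for list_type, records in load_lists(page_id):
        if list_type == DIR_BRANCH:
            for _key, value in records:
                # first dword of the value is the referenced page
                (child_page,) = struct.unpack_from("<I", value, 0)
                entries.extend(collect_entries(child_page, load_lists, seen))
        elif list_type == DIR_CONTENTS:
            entries.extend(records)
    return entries

# Synthetic layout: page 1 holds one entry and branches to pages 2 and 3.
pages = {
    1: [(DIR_CONTENTS, [(b"a", b"")]),
        (DIR_BRANCH, [(b"k", struct.pack("<I", 2)), (b"m", struct.pack("<I", 3))])],
    2: [(DIR_CONTENTS, [(b"b", b"")])],
    3: [(DIR_CONTENTS, [(b"c", b"")])],
}
listing = collect_entries(1, pages.__getitem__)
```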

Note that this is the (an?) implementation of the B-tree structure in ReFS described by Microsoft, as the record keys contain the tree leaf identifiers (based on file and subdirectory names).

This can be used for quick / efficient file and subdir lookup by name (see 'optimization' in 'next steps' below)

SubDirectories

These are simply Records in the directory's Attribute List whose key contains the Directory Metadata flag (0x20030) as well as the subdir name.

The value of this Record is the corresponding object id which can be used to lookup the page containing the subdir in the object table.

A typical subdirectory Record:

    70 00 00 00  10 00 12 00  00 00 28 00  48 00 00 00
    30 00 02 00  73 00 75 00  62 00 64 00  69 00 72 00  <- here we see the key containing the flag (30 00 02 00) followed by the dir name ("subdir2")
    32 00 00 00  00 00 00 00  03 07 00 00  00 00 00 00  <- here we see the object id as the first qword in the value (0x703, little-endian)
    00 00 00 00  00 00 00 00  14 69 60 05  28 dd d2 01  <- here we see the directory timestamps (more on those below)
    cc 87 ce 52  28 dd d2 01  cc 87 ce 52  28 dd d2 01
    cc 87 ce 52  28 dd d2 01  00 00 00 00  00 00 00 00
    00 00 00 00  00 00 00 00  00 00 00 10  00 00 00 00
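
Decoding the Record above as a quick check. This is only a sketch: it assumes the Record header layout from the Records section and reads the object id as a little-endian qword, which gives 0x703 for the bytes 03 07 00 00 00 00 00 00.

```python
import struct

subdir = bytes.fromhex(
    "70000000 10001200 00002800 48000000"
    "30000200 73007500 62006400 69007200"
    "32000000 00000000 03070000 00000000"
    "00000000 00000000 14696005 28ddd201"
    "cc87ce52 28ddd201 cc87ce52 28ddd201"
    "cc87ce52 28ddd201 00000000 00000000"
    "00000000 00000000 00000010 00000000"
)

total, key_off, key_len, flags, val_off, val_len = struct.unpack_from(
    "<IHHHHI", subdir, 0
)
key = subdir[key_off:key_off + key_len]
value = subdir[val_off:val_off + val_len]

key_type = int.from_bytes(key[:4], "little")       # Directory Metadata flag
dir_name = key[4:].decode("utf-16-le")
dir_object_id = int.from_bytes(value[:8], "little")
print(hex(key_type), dir_name, hex(dir_object_id))  # 0x20030 subdir2 0x703
```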
 

Files

Like directories, files are Records whose key contains a flag (0x10030) followed by the filename.

The value is far more complicated, though. While we've discovered some basic Attributes allowing us to pull timestamps and content from the fs, there is still more to be deduced about the semantics of this Record's value.

  • The File Record value consists of multiple attributes, though they just appear one after each other, without a List Header. We can still parse them sequentially given that all Attributes are individually prefixed with their lengths and the File Record value length gives us the total size of the block.

  • The first attribute contains 4 file timestamps at an offset given by the fifth byte of the attribute (0x28 in the example; this may be coincidental, and the timestamps could just reside at a fixed location in this attribute).

In the first attribute example above we see the first timestamp is

a9 d3 a4 c3 27 dd d2 01

This corresponds to the following date

2017-06-04 07:43:20

Timestamps are stored as 64-bit little-endian counts of 100-nanosecond intervals since the Windows epoch (Jan 1, 1601 UTC), which lies 11644473600 seconds before the Unix epoch.
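
A minimal conversion sketch, treating the 8 bytes as a little-endian count of 100-nanosecond ticks since 1601-01-01 UTC (the usual Windows FILETIME unit). Decoded as UTC, the example comes out at 11:43:20 on 2017-06-04, so the 07:43:20 quoted above appears to be a local time at UTC-4.

```python
from datetime import datetime, timedelta, timezone

WINDOWS_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

def filetime_to_datetime(raw: bytes) -> datetime:
    """Convert 8 little-endian FILETIME bytes to an aware UTC datetime."""
    ticks = int.from_bytes(raw[:8], "little")      # 100 ns units
    return WINDOWS_EPOCH + timedelta(microseconds=ticks // 10)

first = filetime_to_datetime(bytes.fromhex("a9d3a4c327ddd201"))
second = filetime_to_datetime(bytes.fromhex("5fa058f327ddd201"))
print(first.isoformat(), second.isoformat())
```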

The second Attribute seems to be the Header of an Attribute List containing the 'File Reference' semantics. These are the Attributes that encapsulate the file length and content pointers.

I'm assuming this is an Attribute List so as to contain many of these types of Attributes for large files. What is not apparent are the full semantics of all of these fields.

Here is where it gets complicated: this List only contains a single attribute, which in turn holds a few child Attributes. This encapsulation seems to work in the same manner as the Attributes stored in the File Record value above, i.e. a simple sequential collection without a Header.

In this single attribute (dubbed the 'File Reference Body') the first Attribute contains the length of the file, while the second is the Header for yet another List, this one containing a Record whose value contains a reference to the page on which the file contents actually reside.

      ----------------------------------------
      | ...                                  |
      ----------------------------------------
      | File Entry Record                    |
      | Key: 0x10030 [FileName]              |
      | Value:                               |
      | Attribute1: Timestamps               |
      | Attribute2:                          |
      |   File Reference List Header         |
      |   File Reference List Body(Record)   |
      |     Record Key: ?                    |
      |     Record Value:                    |
      |       File Length Attribute          |
      |       File Content List Header       |
      |       File Content Record(s)         |
      | Padding                              |
      ----------------------------------------
      | ...                                  |
      ----------------------------------------

While complicated, each level can be parsed in a similar manner to all other Attributes & Records, taking care to parse Attributes into their correct levels & structures.

As far as actual values,

  • the file length is always seen at a fixed offset within its attribute (0x3c) and
  • the content pointer seems to always reside in the second qword of the Record value. This pointer is simply a reference to the page from which the file contents can be read verbatim.
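
Putting those two values together, a very rough sketch follows. The width of the file length (8 bytes here), the assumption that the whole file sits contiguously from the referenced page, and all helper names are guesses; the demo data is synthetic, not taken from a real volume.

```python
import io
import struct

PAGE_SIZE = 0x4000
FILE_LENGTH_OFFSET = 0x3C   # fixed offset noted above

def file_length(length_attr: bytes) -> int:
    # Assumption: the length is an 8-byte little-endian integer.
    return struct.unpack_from("<Q", length_attr, FILE_LENGTH_OFFSET)[0]

def content_page(record_value: bytes) -> int:
    # Second qword of the File Content Record value.
    return struct.unpack_from("<Q", record_value, 8)[0]

def read_file(partition, length_attr: bytes, record_value: bytes) -> bytes:
    # Assumption: contents are contiguous starting at the referenced page.
    partition.seek(content_page(record_value) * PAGE_SIZE)
    return partition.read(file_length(length_attr))

# Synthetic demo: a 5-byte file stored on page 2.
attr = bytearray(0x48)
struct.pack_into("<Q", attr, FILE_LENGTH_OFFSET, 5)
value = struct.pack("<QQ", 0, 2)
image = io.BytesIO(bytes(2 * PAGE_SIZE) + b"hello" + bytes(PAGE_SIZE - 5))
contents = read_file(image, bytes(attr), value)
```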