Data Pages

Data pages store records. A record is simply an array of bytes with a given length. How its contents are interpreted depends on the type of the data page it is stored on (a table row page or a long value page).

Every record can be uniquely identified in the database file using a record pointer.

Data Page Header

The page header is identical for all data pages.

Name Length Type Description
Page Signature 2 bytes UINT 16 LE Always 0x0101. Identifies this as a data page.
Free Space 2 bytes UINT 16 LE number of available bytes on this data page.
Owner 4 bytes UINT 32 LE
  • page number of the containing table's table definition page for table row pages
  • 'LVAL' for long value pages
Unknown 4 bytes ??? Present in Jet 4 and later only.
Record Count 2 bytes UINT 16 LE number of records stored on this page

repeated for every record (see Record Count):

Record Offset 2 bytes UINT 16 LE

The offset of the record from the beginning of this page (including the header). Records are stored starting from the end of the page: the first record extends to the end of the page, and every other record extends to the beginning of the previous record.
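As a rough illustration, the header layout above could be read with a Python sketch like the one below. The names `DataPageHeader` and `parse_data_page_header` are invented for this example. The `jet4` switch assumes that the 4-byte unknown field is the only difference between versions, and the `is_lval` check assumes that 'LVAL' is stored as four ASCII bytes in reading order.

```python
import struct
from typing import NamedTuple, Tuple


class DataPageHeader(NamedTuple):
    free_space: int
    owner: int                    # table definition page number, or 'LVAL' read as an integer
    is_lval: bool                 # True if the owner bytes spell b"LVAL" (assumed encoding)
    record_count: int
    raw_offsets: Tuple[int, ...]  # record offsets with the flag bits still included


def parse_data_page_header(page: bytes, jet4: bool = True) -> DataPageHeader:
    # Page signature, free space and owner: all little-endian.
    signature, free_space, owner = struct.unpack_from("<HHI", page, 0)
    if signature != 0x0101:
        raise ValueError("not a data page: signature 0x%04x" % signature)

    pos = 8
    if jet4:
        pos += 4  # skip the unknown field (assumed absent before Jet 4)

    (record_count,) = struct.unpack_from("<H", page, pos)
    pos += 2

    # One 16-bit offset entry per record, directly after the record count.
    raw_offsets = struct.unpack_from("<%dH" % record_count, page, pos)
    return DataPageHeader(free_space, owner, page[4:8] == b"LVAL",
                          record_count, raw_offsets)
```

The offsets are returned unmasked on purpose, so that the caller can still inspect the flag bits.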

Since only 12 bits are needed to address all locations on a page, the 4 high-order bits are used for flags:

  • 0x4000 means that the record contains a 4-byte record pointer to an overflow page where the record is actually stored. I think the record on the overflow page is marked with 0x8000.
  • 0x8000 means ignore this record. It may be referenced from somewhere else, or it may be deleted.

When iterating through rows, ignore every record that has 0x8000 set, and always follow the record pointer in records that have 0x4000 set.
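The iteration rule can be sketched in Python as follows. This is only an illustration: `record_spans` and `live_records` are hypothetical names, the 12-bit mask is taken from the statement above, and `raw_offsets` is assumed to be the list of record offset entries already read from the page header.

```python
FLAG_IGNORE = 0x8000    # skip this record
FLAG_OVERFLOW = 0x4000  # record body is a 4-byte pointer to an overflow page
OFFSET_MASK = 0x0FFF    # low 12 bits hold the offset, as described in the text


def record_spans(raw_offsets, page_size):
    """Return (index, start, end, flags) for every offset entry on a page."""
    spans = []
    end = page_size  # the first record extends to the end of the page
    for index, raw in enumerate(raw_offsets):
        start = raw & OFFSET_MASK
        flags = raw & 0xF000
        spans.append((index, start, end, flags))
        end = start  # the next record ends where this one begins
    return spans


def live_records(page, raw_offsets):
    """Yield (index, data, is_overflow_pointer) for records that are not ignored."""
    for index, start, end, flags in record_spans(raw_offsets, len(page)):
        if flags & FLAG_IGNORE:
            continue
        yield index, page[start:end], bool(flags & FLAG_OVERFLOW)
```

An ignored record still contributes its start offset, because the following record ends where the ignored one begins.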

Records in a Table Row Page

As stated before, every record in a table row page corresponds to a row in a table. The record first contains the contents of all fixed length columns, and then the contents of all variable length columns (if there are variable length columns in the table). The necessary metadata for the variable length columns (offsets etc.) are stored near the end of the record, so to read the variable length columns, one must start at the end.

Name Length Type Description
Field Count 1 byte (Jet 3) or 2 bytes (Jet 4) The number of fields in this row
Fixed Length Fields n bytes The contents of the fixed length fields

only if there are variable length columns:

Variable Length Fields n bytes The contents of variable length fields that are not NULL
Total Data Length 1 byte (Jet 3) or 2 bytes (Jet 4) The offset of the byte after the last variable length field. Used to determine the length of the last variable length field.
Variable Length Field Offsets 1 byte per field (Jet 3) or 2 bytes per field (Jet 4)

The offset of every variable length field, measured from the beginning of the row.

Stored in reverse order.

Variable Length Jump Table floor((length of row - 1) / 256) bytes UINT 8

Jet 3 only.

Byte i of this field contains the number of the first variable length field that has an offset ≥ (i+1)*256.

Stored in reverse order.

Required because the offsets are only stored as single byte integers in Jet 3.

Variable Length Field Count 1 byte (Jet 3) or 2 bytes (Jet 4) The number of variable length fields.

Field Bitmap floor((number of columns + 7) / 8) bytes bitmap

A bitmap for every column in the table, starting with the least significant bit for the first column.

  • 0 for fields that are NULL
  • 1 otherwise

Boolean fields cannot be NULL and are encoded in this bitmap.
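Reading a row from the end, as described above, might look like the following Python sketch for Jet 4. The function name `parse_jet4_row` and the `has_var_columns` parameter are invented for this example. It assumes that all integers are little-endian, that the caller knows from the table definition whether the table has variable length columns, and that the Total Data Length entry sits directly before the reversed offset list. The Jet 3 jump table is not handled.

```python
import struct


def parse_jet4_row(record: bytes, has_var_columns: bool = True) -> dict:
    # The field count is at the very start of the record (2 bytes in Jet 4).
    (field_count,) = struct.unpack_from("<H", record, 0)

    # The field bitmap is the last item: one bit per column, least
    # significant bit first; a set bit means "not NULL" (or true for booleans).
    bitmap_len = (field_count + 7) // 8
    bitmap_start = len(record) - bitmap_len
    bitmap = record[bitmap_start:]
    bits = [bool(bitmap[i // 8] & (1 << (i % 8))) for i in range(field_count)]

    if not has_var_columns:
        # Without variable length columns the rest is fixed length data.
        return {"field_count": field_count, "bitmap": bits,
                "fixed_data": record[2:bitmap_start], "var_fields": []}

    # Working backwards: the variable length field count precedes the bitmap.
    (var_count,) = struct.unpack_from("<H", record, bitmap_start - 2)

    # Before it: total data length, then the offsets in reverse order.
    table_start = bitmap_start - 2 - 2 * (var_count + 1)
    entries = struct.unpack_from("<%dH" % (var_count + 1), record, table_start)
    total_length = entries[0]
    offsets = list(reversed(entries[1:]))

    # Each field runs up to the next offset; the last one runs up to
    # the total data length.
    bounds = offsets + [total_length]
    var_fields = [record[bounds[i]:bounds[i + 1]] for i in range(var_count)]

    return {"field_count": field_count, "bitmap": bits,
            "fixed_data": record[2:bounds[0]], "var_fields": var_fields}
```

Splitting `fixed_data` into individual columns is left out, since that needs the column definitions from the table definition page.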

Memo Fields

The total number of characters in a row is limited by the database page size. Memo and OLE fields exist to store longer texts and files. Small amounts of data are stored inline; larger amounts are stored in other pages. Memo and OLE fields can contain up to 1 GB of data (see Microsoft Access 2003 Specifications).

Name Length Type Description
Field Length 4 bytes UINT 32 LE

The number of bytes in this field. Since the maximum size is 1GB, only 30 bits are needed for the size (bit mask 0x3FFFFFFF). This leaves the two high order bits for flags. These flags determine how the MEMO (or OLE) field is stored:

  • 0x80000000: the data is inline right after the header
  • 0x40000000: the data is in a single LVAL type 1 record
  • 0x00000000: (no flag set) the data is stored in multiple LVAL type 2 records
Data Location 4 bytes UINT 32 LE A record pointer to some LVAL page where the data is actually stored.
Unknown 4 bytes ??? Usually 0x0000
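Decoding this 12-byte header could be sketched in Python as below. The name `parse_memo_header` and the storage labels are made up for this example. The record pointer is returned as a raw integer, because its internal layout is not covered here.

```python
import struct

LENGTH_MASK = 0x3FFFFFFF       # low 30 bits: number of bytes in the field
FLAG_INLINE = 0x80000000       # data follows the header directly
FLAG_SINGLE_LVAL = 0x40000000  # data is in a single LVAL type 1 record


def parse_memo_header(field: bytes):
    """Return (length, storage, location, inline_data) for a Memo/OLE field."""
    raw_length, location, _unknown = struct.unpack_from("<III", field, 0)
    length = raw_length & LENGTH_MASK

    if raw_length & FLAG_INLINE:
        storage = "inline"
    elif raw_length & FLAG_SINGLE_LVAL:
        storage = "lval_type_1"
    else:
        storage = "lval_type_2"  # no flag set: chained LVAL type 2 records

    # Inline data starts right after the 12-byte header.
    inline_data = field[12:12 + length] if storage == "inline" else None
    return length, storage, location, inline_data
```

For the two LVAL cases, `location` is the record pointer to follow and `inline_data` is `None`.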

Records in an LVAL page

Records in an LVAL page come in two types. Type 1 records contain data only. Type 2 records contain part of the data and, like nodes in a linked list, point to the next LVAL record.

Name Length Type Description
Next Record 4 bytes (type 2 only) A record pointer to the next LVAL record.

Data remaining bytes The data stored in this record (or parts of it for type 2 records)
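Following a chain of type 2 records could be sketched in Python like this. The function `read_lval_chain` and the `read_record` callback are hypothetical: `read_record` stands for whatever code resolves a record pointer to the raw bytes of that record. The sketch assumes the Next Record pointer is little-endian, like the other record pointers, and it stops once the field length taken from the Memo field header has been collected, because how the last record of a chain is marked is not described here.

```python
import struct
from typing import Callable


def read_lval_chain(first_pointer: int, total_length: int,
                    read_record: Callable[[int], bytes]) -> bytes:
    """Collect total_length bytes by walking a chain of LVAL type 2 records."""
    chunks = []
    collected = 0
    pointer = first_pointer
    while collected < total_length:
        record = read_record(pointer)
        # The first 4 bytes point to the next record; the rest is data.
        (pointer,) = struct.unpack_from("<I", record, 0)
        data = record[4:]
        if not data:
            raise ValueError("empty LVAL record before the field was complete")
        chunks.append(data)
        collected += len(data)
    return b"".join(chunks)[:total_length]
```

A dictionary of fake records is enough to try it out, for example `read_lval_chain(1, 11, store.__getitem__)` where `store` maps pointers to record bytes.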