PAK file format

From Arx Libertatis Wiki
Revision as of 05:07, 11 February 2012 by Ds (talk | contribs) (Add a description of the PAK file format used by Arx Fatalis.)

This page describes the PAK format used by Arx Fatalis for game data archives. PAK files are a very simple archive format with a file table that contains offsets to the file data, which can be stored at arbitrary positions and in arbitrary order but must be contiguous for each individual file in the archive.

All integers are encoded in little-endian byte order. Type names follow platform/Platform.h: s (signed integer), u (unsigned integer) or f (float), followed by the number of bits. Arrays are denoted by brackets as in C++.

Header

The file header starts at byte position 0; there is no magic number.

Type Description
u32 File table offset

The file table offset is the location of the file table (in bytes) relative to the start of the file.

File table

Header

The file table begins with the file table header:

Type Description
u32 File table size

The file table size is the number of bytes in the file table (not including the header).
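As an illustration of the two headers described so far, the following minimal Python sketch locates the (still encrypted) file table inside a PAK image that has been read into memory. The function name read_file_table is hypothetical and not taken from the Arx Libertatis source; the bounds check is an added safety assumption.

```python
import struct


def read_file_table(pak):
    """Return the still-encrypted file table body of an in-memory PAK image."""
    # File header: u32 little-endian offset of the file table, at byte 0.
    (table_offset,) = struct.unpack_from("<I", pak, 0)
    # File table header: u32 size of the table, not counting this field.
    (table_size,) = struct.unpack_from("<I", pak, table_offset)
    start = table_offset + 4
    end = start + table_size
    if end > len(pak):
        raise ValueError("file table extends past the end of the archive")
    return pak[start:end]
```

The returned bytes still have to be decrypted before they can be parsed.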

"Encryption"

The remaining data of the file table is encrypted with a repeating-key XOR cipher, in effect the very secure and modern chiffre indéchiffrable (Vigenère cipher) using 1-bit characters: a fixed bitstring, the key, is repeated to match the length of the file table data and then bitwise XORed with the unencrypted data.
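Because XOR is its own inverse, the same operation both encrypts and decrypts. A minimal Python sketch of the repeating-key XOR, with the hypothetical helper name xor_with_key:

```python
from itertools import cycle


def xor_with_key(data, key):
    """XOR data with key, repeating the key as often as needed."""
    if not key:
        raise ValueError("key must not be empty")
    # cycle() repeats the key bytes endlessly; zip() stops at len(data).
    return bytes(b ^ k for b, k in zip(data, cycle(key)))
```

Applying xor_with_key twice with the same key returns the original data.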

The key differs for demo data and full game data (here given as 8-bit ASCII characters):

Release type Key First bytes
Demo NSIARKPRQPHBTE50GRIH3AYXJP2AMF3FCEYAVQO5QGA0JGIIH2AYXKVOA1VOGGU5GSQKKYEOIAQG1XRX0J4F5OEAEFI4DD3LL45VJTVOA1VOGGUKE50GRI 4E 53 49 41
Full game AVQF3FCKE50GRIAYXJP2AMEYO5QGA0JGIIH2NHBTVOA1VOGGU5H3GSSIARKPRQPQKKYEOIAQG1XRX0J4F5OEAEFI4DD3LL45VJTVOA1VOGGUKE50GRIAYX 41 56 51 46

These keys are included in the Arx Fatalis source release. Whitespace and terminating null-bytes are not part of the keys.

There is no field in the PAK file specifying which key is in use, but it is safe to assume that the unencrypted file table starts with at least four null bytes (an empty directory with an empty name). Since XORing a key byte with zero leaves it unchanged, the first four bytes of the encrypted file table are then the same as the first four bytes of the key in use. Comparing these four bytes is also the best way to distinguish the demo from the full game if that information is requested by scripts.
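The key selection described above could be sketched in Python as follows. The key constants are copied from the table on this page; the names DEMO_KEY, FULL_KEY and detect_key are hypothetical, and raising an error for an unknown prefix is an assumption about how a reader might want to handle unexpected archives.

```python
# Keys as listed on this page, without whitespace or terminating null bytes.
DEMO_KEY = (
    b"NSIARKPRQPHBTE50GRIH3AYXJP2AMF3FCEYAVQO5QGA0JGIIH2AYXKVOA1VOGGU5"
    b"GSQKKYEOIAQG1XRX0J4F5OEAEFI4DD3LL45VJTVOA1VOGGUKE50GRI"
)
FULL_KEY = (
    b"AVQF3FCKE50GRIAYXJP2AMEYO5QGA0JGIIH2NHBTVOA1VOGGU5H3GSSIARKPRQPQ"
    b"KKYEOIAQG1XRX0J4F5OEAEFI4DD3LL45VJTVOA1VOGGUKE50GRIAYX"
)


def detect_key(encrypted_table):
    """Pick the key whose first four bytes match the encrypted table."""
    prefix = encrypted_table[:4]
    for key in (DEMO_KEY, FULL_KEY):
        # Relies on the plaintext starting with four null bytes.
        if prefix == key[:4]:
            return key
    raise ValueError("file table does not start with a known key prefix")
```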

Directory entries

The decrypted file table is made up of multiple directory and file entries, starting with a directory entry:

Type Description
null-terminated string Directory path
u32 File count

The directory path is always given relative to the root of the archive. Individual directory names are separated by backslashes (\). Directory and file names are case-insensitive. Arx Libertatis converts all file and directory names to lower case when loading PAK files.

Each directory entry is followed directly by as many file entries as its file count specifies.
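A directory entry could be read from the decrypted file table roughly as in the Python sketch below. The helper names are hypothetical. Two assumptions go beyond this page: names are decoded as Latin-1 (the page does not specify a character encoding), and empty path components are dropped so that paths with or without a trailing backslash are handled the same way.

```python
import struct


def read_cstring(table, pos):
    """Read a null-terminated string; return it and the position after it."""
    end = table.index(b"\x00", pos)
    # Assumption: Latin-1, which maps every byte to a character.
    return table[pos:end].decode("latin-1"), end + 1


def read_directory_entry(table, pos):
    """Return (path components, file count, position after the entry)."""
    path, pos = read_cstring(table, pos)
    (file_count,) = struct.unpack_from("<I", table, pos)
    # Lower-case the path and split it on backslashes, skipping empty parts.
    parts = [p for p in path.lower().split("\\") if p]
    return parts, file_count, pos + 4
```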

File entries

Type Description
null-terminated string Filename
u32 Offset
u32 Flags
u32 Uncompressed size
u32 Size

Filenames are relative to the path in the last directory entry and should not contain any backslashes. Directory and file names are case-insensitive. Arx Libertatis converts all file and directory names to lower case when loading PAK files.

Offset gives the position of the file data, in bytes, relative to the start of the PAK file.

Size gives the number of bytes stored for this file at the given offset.
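Putting the directory and file entry layouts together, a decrypted file table could be walked as in this Python sketch. FileEntry and parse_file_table are hypothetical names. The sketch assumes that the table consists of nothing but directory and file entries back to back, and it decodes names as Latin-1 because the page does not specify an encoding.

```python
import struct
from collections import namedtuple

FileEntry = namedtuple(
    "FileEntry", "directory name offset flags uncompressed_size size"
)


def _cstring(table, pos):
    """Read a null-terminated string, lower-cased, plus the next position."""
    end = table.index(b"\x00", pos)
    return table[pos:end].decode("latin-1").lower(), end + 1


def parse_file_table(table):
    """Return a flat list of FileEntry records from a decrypted file table."""
    entries = []
    pos = 0
    while pos < len(table):
        directory, pos = _cstring(table, pos)
        (file_count,) = struct.unpack_from("<I", table, pos)
        pos += 4
        for _ in range(file_count):
            name, pos = _cstring(table, pos)
            # Field order: offset, flags, uncompressed size, size.
            fields = struct.unpack_from("<4I", table, pos)
            pos += 16
            entries.append(FileEntry(directory, name, *fields))
    return entries
```

The directory path and the file name are kept as separate fields rather than joined, since how the two combine is left to the caller.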

If the flags have bit 1 set, the file is compressed using the implode algorithm of the PKWARE Data Compression Library. Code to decompress PKWARE implode-encoded data can be found in blast.c in the contrib/blast directory of the zlib source. In this case, uncompressed size gives the original size of the file before compression.

Otherwise the file is stored as-is and uncompressed size is undefined. Other flags are not defined.
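Reading the data for one file entry could then look like the Python sketch below. It rests on several assumptions: "bit 1" is interpreted here as the least-significant bit (mask 0x1), which this page does not state explicitly; the Python standard library has no PKWARE implode decoder, so the caller must pass one in as the hypothetical explode parameter (for example a binding to blast.c); and the length checks are added for safety.

```python
# Assumption: "bit 1" means the least-significant bit of the flags field.
COMPRESSED_MASK = 0x1


def read_file_data(pak, offset, flags, uncompressed_size, size, explode=None):
    """Return the contents of one archived file from an in-memory PAK image."""
    raw = pak[offset:offset + size]
    if len(raw) != size:
        raise ValueError("file data extends past the end of the archive")
    if flags & COMPRESSED_MASK:
        if explode is None:
            raise NotImplementedError("no PKWARE implode decoder supplied")
        # explode is a caller-supplied callable: compressed bytes -> bytes.
        data = explode(raw)
        if len(data) != uncompressed_size:
            raise ValueError("unexpected size after decompression")
        return data
    # Stored as-is; uncompressed_size is undefined here and is ignored.
    return raw
```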