PAK file format
Latest revision as of 17:34, 18 October 2022
This page describes the PAK format used by Arx Fatalis for game data archives. PAK files are a very simple archive format with a file table that contains offsets to the file data, which can be stored at arbitrary positions and in arbitrary order but must be contiguous for each individual file in the archive.
See Common file format types for a description of the type names used here.
There are several tools to explore and extract Arx Fatalis .pak archives.
Header
The file header starts at byte position 0; there is no magic number.
Type | Description |
---|---|
u32 | File table offset |
The file table offset is the location of the file table (in bytes) relative to the start of the file.
File table
Header
The file table begins with the file table header:
Type | Description |
---|---|
u32 | File table size |
The file table size is the number of bytes in the file table (not including the header).
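As a rough sketch, locating the file table could look like this in Python. The little-endian byte order for u32 values is an assumption here (see Common file format types for the actual definition), and the function name read_file_table is made up for illustration.

```python
import struct

def read_file_table(archive: bytes) -> bytes:
    """Return the (still encrypted) file table of a PAK archive held in memory."""
    # Assumption: u32 values are stored little-endian.
    # The archive header is a single u32 at byte position 0.
    (table_offset,) = struct.unpack_from("<I", archive, 0)
    # The file table header is a single u32 holding the table size.
    (table_size,) = struct.unpack_from("<I", archive, table_offset)
    start = table_offset + 4  # the size does not include the 4-byte table header
    table = archive[start:start + table_size]
    if len(table) != table_size:
        raise ValueError("file table extends past the end of the archive")
    return table
```

The returned bytes are everything after the file table header, which is the part covered by the "encryption" described next.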
"Encryption"
The remaining data of the file table is encrypted using the very secure and modern chiffre indéchiffrable (the Vigenère cipher) with 1-bit characters: a fixed bitstring, the key, is repeated to match the data length and then bitwise XORed with the unencrypted data.
The key differs for demo data and full game data (here given as 8-bit ASCII characters):
Release type | Key | First bytes |
---|---|---|
Demo | NSIARKPRQPHBTE50GRIH3AYXJP2AMF3FCEYAVQO5QGA0JGIIH2AYXKVOA1V OGGU5GSQKKYEOIAQG1XRX0J4F5OEAEFI4DD3LL45VJTVOA1VOGGUKE50GRI | 4E 53 49 41 |
Full game | AVQF3FCKE50GRIAYXJP2AMEYO5QGA0JGIIH2NHBTVOA1VOGGU5H3GSSIARK PRQPQKKYEOIAQG1XRX0J4F5OEAEFI4DD3LL45VJTVOA1VOGGUKE50GRIAYX | 41 56 51 46 |
These keys are included in the Arx Fatalis source release. Whitespace and terminating nul-bytes are not part of the keys.
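The repeating-key XOR can be sketched in a few lines of Python. The constant and function names are chosen for illustration only; the key strings are the two keys listed above with the whitespace removed. Because XOR is its own inverse, the same function both encrypts and decrypts.

```python
from itertools import cycle

# Keys from the table above, joined without the whitespace.
DEMO_KEY = (b"NSIARKPRQPHBTE50GRIH3AYXJP2AMF3FCEYAVQO5QGA0JGIIH2AYXKVOA1V"
            b"OGGU5GSQKKYEOIAQG1XRX0J4F5OEAEFI4DD3LL45VJTVOA1VOGGUKE50GRI")
FULL_KEY = (b"AVQF3FCKE50GRIAYXJP2AMEYO5QGA0JGIIH2NHBTVOA1VOGGU5H3GSSIARK"
            b"PRQPQKKYEOIAQG1XRX0J4F5OEAEFI4DD3LL45VJTVOA1VOGGUKE50GRIAYX")

def xor_with_key(data: bytes, key: bytes) -> bytes:
    """XOR data with the key, repeating the key to match the data length."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))
```

Working on whole bytes gives the same result as the bitwise description, since each key consists of whole 8-bit characters.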
There is no field in the PAK files specifying which key is in use, but it is safe to assume that the unencrypted file table starts with at least four nul bytes (an empty directory with an empty name). Because XOR with zero leaves the key bits unchanged, the first four bytes of the encrypted file table are then the same as the first four bytes of the key in use. Checking these four bytes is also the best way to distinguish demo and release versions of the game if that information is requested by scripts.
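A minimal sketch of that check, assuming the encrypted file table (without its size header) is already in memory. The function name and the returned strings are made up for illustration.

```python
def detect_release(encrypted_table: bytes) -> str:
    """Guess the release type from the first four bytes of the encrypted table."""
    prefix = encrypted_table[:4]
    if prefix == bytes.fromhex("4E 53 49 41"):  # "NSIA", start of the demo key
        return "demo"
    if prefix == bytes.fromhex("41 56 51 46"):  # "AVQF", start of the full game key
        return "full"
    # Neither key matches, or the table does not start with four nul bytes.
    raise ValueError("unrecognised file table prefix: " + prefix.hex())
```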
Directory entries
The decrypted file table is made up of multiple directory and file entries, starting with a directory entry:
Type | Description |
---|---|
c string | Directory path |
u32 | File count |
The directory path is always given relative to the root of the archive. Individual directory names are separated by backslashes (\). Directory and file names are case-insensitive and use the ISO-8859-15 encoding. Arx Libertatis converts all file and directory names to lower case when loading PAK files.
Each directory entry is followed directly by file count file entries.
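Reading one directory entry from the decrypted table might look like the following sketch. It assumes that a c string is a nul-terminated byte string and that u32 values are little-endian (see Common file format types); the helper names are hypothetical.

```python
import struct

def read_c_string(table: bytes, pos: int):
    """Read a nul-terminated ISO-8859-15 string; return (text, next position)."""
    end = table.index(b"\0", pos)  # raises ValueError if no terminator is found
    return table[pos:end].decode("iso8859_15"), end + 1

def read_directory_entry(table: bytes, pos: int):
    """Return (directory path, file count, position of the first file entry)."""
    path, pos = read_c_string(table, pos)
    # Assumption: little-endian u32.
    (file_count,) = struct.unpack_from("<I", table, pos)
    return path, file_count, pos + 4
```

The path is returned exactly as stored; a loader that wants case-insensitive lookups would still need to normalise it.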
File entries
Type | Description |
---|---|
c string | Filename |
u32 | Offset |
u32 | Flags |
u32 | Uncompressed size |
u32 | Size |
Filenames are relative to the path in the last directory entry and should not contain any backslashes. Directory and file names are case-insensitive and use the ISO-8859-15 encoding. It is not safe to assume that filenames only use ASCII characters. Arx Libertatis converts all file and directory names to lower case when loading PAK files.
Offset gives the position of the file data, in bytes from the start of the PAK file.
Size gives the number of bytes stored for this file at the given offset.
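The file entry layout from the table above can be read as in this sketch, again assuming nul-terminated strings and little-endian u32 values. FileEntry, read_file_entry and stored_bytes are illustrative names, not part of any existing tool.

```python
import struct
from collections import namedtuple

FileEntry = namedtuple("FileEntry", "name offset flags uncompressed_size size")

def read_file_entry(table: bytes, pos: int):
    """Return (FileEntry, position of the next entry) from a decrypted table."""
    end = table.index(b"\0", pos)
    name = table[pos:end].decode("iso8859_15")  # names are not guaranteed to be ASCII
    # Four consecutive u32 fields follow the name (little-endian is an assumption).
    offset, flags, uncompressed_size, size = struct.unpack_from("<4I", table, end + 1)
    return FileEntry(name, offset, flags, uncompressed_size, size), end + 1 + 16

def stored_bytes(archive: bytes, entry) -> bytes:
    """Return the bytes stored for a file, exactly as they appear in the archive."""
    return archive[entry.offset:entry.offset + entry.size]
```

Note that stored_bytes returns the data as stored and does not look at the flags field.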
If the flags have bit 1 set, the file is compressed using the PKWARE implode library, and uncompressed size gives the size of the file after decompression. Code to decompress PKWARE implode-encoded data can be found in blast.c in the contrib directory of the zlib source (https://github.com/madler/zlib/tree/master/contrib/blast).
Otherwise the file is stored as-is and uncompressed size is undefined. Other flags are not defined.
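Putting the directory and file entries together, a decrypted file table could be walked roughly as follows. This is a sketch under several assumptions: u32 values are little-endian, "bit 1" is taken to mean the mask 0x1, and the table is assumed to end exactly after the last entry with no padding. The names are made up for illustration, and no decompression is attempted.

```python
import struct

COMPRESSED_FLAG = 0x1  # assumption: "bit 1" read as the least significant bit

def _c_string(table, pos):
    # Nul-terminated ISO-8859-15 string; returns (text, next position).
    end = table.index(b"\0", pos)
    return table[pos:end].decode("iso8859_15"), end + 1

def list_files(table: bytes):
    """Yield (directory, filename, offset, size, uncompressed size or None)."""
    pos = 0
    while pos < len(table):  # assumes no trailing padding after the last entry
        directory, pos = _c_string(table, pos)
        (file_count,) = struct.unpack_from("<I", table, pos)
        pos += 4
        for _ in range(file_count):
            name, pos = _c_string(table, pos)
            offset, flags, uncompressed_size, size = struct.unpack_from("<4I", table, pos)
            pos += 16
            compressed = bool(flags & COMPRESSED_FLAG)
            # The uncompressed size is undefined for files stored as-is.
            yield directory, name, offset, size, (uncompressed_size if compressed else None)
```

Entries whose last element is not None would still have to be run through an implode decoder such as blast.c before use.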