docs: revise pack file internals based on PR #9669 review, refs #8572

remove forward-looking N>1 references, hardcoded offsets, and stale "currently"; use borgstore vocabulary, medium-sized index files, and simplified recovery prose
2026-06-11 01:41:57 -04:00 · 2026-05-29 00:45:12 +05:30 · 2026-05-29 00:45:12 +05:30 · 6bba477ae2
commit 6bba477ae2
parent 5d6cafe4b8
5 changed files with 87 additions and 201 deletions
--- a/docs/borg_theme/css/borg.css
+++ b/docs/borg_theme/css/borg.css
@ -178,11 +178,6 @@ cite {
 #common-options .option {
    white-space: nowrap;
 }
-/* Extra vertical breathing room for figures that need it. */
-.figure-padded {
-    margin-top: 1.5em;
-    margin-bottom: 1.5em;
-}

 /* Remove the right-column max-width cap so content fills the full available width. */
 #right-column {
--- a/docs/internals/pack-layout.png
+++ b/docs/internals/pack-layout.png
--- a/docs/internals/pack-objheader.png
+++ b/docs/internals/pack-objheader.png
--- a/docs/internals/pack-write-order.png
+++ b/docs/internals/pack-write-order.png
--- a/docs/internals/packs.rst
+++ b/docs/internals/packs.rst
@ -6,15 +6,15 @@
 Pack files
 ==========

-Borg currently stores each repository object (chunk) as a separate object in the
-borgstore.  For large repositories this means millions of individual objects, each
-requiring its own I/O round trip to read or write.  On high-latency backends (SFTP,
-cloud object storage) this overhead dominates backup and restore times.
+Without pack files, each repository chunk is stored as a separate borgstore object.
+For large repositories this means millions of individual objects, each requiring its
+own I/O round trip to read or write. On high-latency backends (SFTP, cloud object
+storage) this overhead dominates backup and restore times.

-Pack files fix this by grouping multiple repo objects into a single store
-object.  A reader that needs one chunk does a partial read (range request) at
-a known offset instead of fetching a separate file.  Store object count drops
-from one-per-chunk to one-per-pack.
+Pack files address this by grouping multiple chunks into a single store object. A
+reader that needs one chunk does a partial read (range request) at a known offset
+instead of fetching a separate file. Store object count drops from one-per-chunk to
+one-per-pack.


 .. _pack-format:
@ -22,189 +22,110 @@ from one-per-chunk to one-per-pack.
 Pack File Format
 ----------------

-Every pack file begins with a fixed 9-byte file header, followed by one or more
-length-prefixed repo object blobs.  The format is designed for forward scanning:
-given only the file bytes and no external index, a reader can locate every blob
-boundary by reading the 4-byte length prefix before each blob.
-
-File header
-~~~~~~~~~~~
-
-::
-
-    Offset  Size  Type    Field
-    ------  ----  ------  -----
-    0       8     bytes   Magic: ASCII b"BORGPACK"
-    8       1     uint8   Format version: 0x01
-
-Any reader can check the magic bytes to confirm a file in the ``packs/``
-namespace is a valid Borg pack file and reject misplaced or truncated files
-before parsing further.  The version byte lets future incompatible layout changes
-be detected without bumping the repository version.
+There is no separate file header. Each blob starts with an 8-byte ``BORGPACK``
+magic, so a forward scanner can locate blob boundaries and identify each chunk
+using only the pack file bytes with no external index.

 Per-blob layout
 ~~~~~~~~~~~~~~~

-Each blob is stored as a 4-byte length prefix followed by the full repo object::
+Each blob is a self-contained unit::

    Offset (relative to blob start)  Size        Type     Field
    --------------------------------  ----------  -------  -----
-    0                                 4           uint32le blob_len
-    4                                 24          bytes    ObjHeader
-    4 + 24                            meta_size   bytes    encrypted_meta
-    4 + 24 + meta_size                data_size   bytes    encrypted_data
+    0                                 8           bytes    Magic: ASCII b"BORGPACK"
+    8                                 1           uint8    Format version: 0x01
+    9                                 32          bytes    chunk_id
+    41                                4           uint32le meta_size
+    45                                4           uint32le data_size
+    49                                meta_size   bytes    encrypted_meta
+    49 + meta_size                    data_size   bytes    encrypted_data

-``blob_len`` is the total byte count of the repo object that follows: ObjHeader
-(always 24 bytes) plus ``encrypted_meta`` plus ``encrypted_data``.  It satisfies::
+``chunk_id`` is the keyed MAC of the plaintext data (``id_hash(plaintext_data)``).
+Storing it in the unencrypted header lets a scanner rebuild the
+``chunk_id → location`` index without decrypting any blob.

-    blob_len == 24 + meta_size + data_size
+A reader locates the next blob by advancing::

-The ObjHeader is the existing 24-byte structure shared by all Borg repo objects::
+    next_blob_offset = current_blob_offset + 49 + meta_size + data_size

-    Offset  Size  Type     Field
-    ------  ----  -------  -----
-    0       4     uint32le meta_size
-    4       4     uint32le data_size
-    8       8     bytes    xxh64(encrypted_meta)
-    16      8     bytes    xxh64(encrypted_data)
+The per-blob magic limits corruption blast radius: a damaged blob causes the
+scanner to lose at most one chunk. Once it finds the next ``BORGPACK`` sequence
+it resumes.

-The ObjHeader is stored unmodified inside the pack blob; the pack layer treats blobs
-as opaque bytes and does not rewrite the header.  The SHA256 content-addressed
-``pack_id`` handles pack-level integrity; the xxh64 fields come from the existing
-RepoObj wire format and are left as-is.  They remain useful for the keyless recovery
-scan (see :ref:`pack-recovery`) where AEAD decryption is not available.
+Blobs follow one another contiguously with no padding::

-.. figure:: pack-objheader.png
-    :figwidth: 100%
-    :width: 100%
-    :figclass: figure-padded
-
-    ObjHeader structure: 24 bytes encoding the sizes and xxh64 integrity hashes for
-    the encrypted meta and data sections of each blob.
-
-Blobs follow one another contiguously with no padding between them::
-
-    [file header: 9 B]
-    [blob_len_0: 4 B][ObjHeader_0: 24 B][encrypted_meta_0][encrypted_data_0]
-    [blob_len_1: 4 B][ObjHeader_1: 24 B][encrypted_meta_1][encrypted_data_1]
+    [BORGPACK_0: 8B][0x01: 1B][chunk_id_0: 32B][meta_size_0: 4B][data_size_0: 4B][encrypted_meta_0][encrypted_data_0]
+    [BORGPACK_1: 8B][0x01: 1B][chunk_id_1: 32B][meta_size_1: 4B][data_size_1: 4B][encrypted_meta_1][encrypted_data_1]
    ...
-    [blob_len_N-1: 4 B][ObjHeader_N-1: 24 B][encrypted_meta_N-1][encrypted_data_N-1]
-
-.. figure:: pack-layout.png
-    :figwidth: 100%
-    :width: 100%
-    :figclass: figure-padded
-
-    Pack file binary layout: 9-byte file header followed by contiguous
-    length-prefixed blobs, each containing an ObjHeader and the encrypted payload.
-
-There is no trailing table of contents.  The ``index/`` namespace (see
-:ref:`pack-index-namespace`) is the sole authoritative source of chunk-to-location
-mappings for normal operation.

 Pack ID
 ~~~~~~~

-For packs containing more than one blob, the pack ID is the SHA-256 digest of the
-entire pack file content (file header plus all blobs)::
+The pack ID equals the ``chunk_id`` of the blob it contains::

-    pack_id = SHA256(pack_file_bytes)
+    pack_id = chunk_id

-This makes pack files content-addressed: the stored filename is a commitment to the
-content.  ``borg check`` can detect silent corruption by recomputing the digest and
-comparing it to the filename without decrypting any blob.
+Since ``chunk_id`` is a keyed MAC of the plaintext, the filename commits to the
+content. ``borg check`` can detect silent corruption without decrypting any blob.

 Namespace
 ~~~~~~~~~

-Pack files are stored under the ``packs/`` namespace in borgstore, using a two-level
-directory nesting on the first two bytes of the pack ID (hex-encoded)::
+Pack files are stored under the ``packs/`` namespace in borgstore, using a
+single directory level keyed on the first byte of the pack ID (hex-encoded)::

    packs/
      00/ .. ff/
-        00/ .. ff/
-          <pack_id_hex>
+        <pack_id_hex>

 The nesting depth is controlled by the ``packs/`` entry in the repository's
 ``levels_config``, the same mechanism used by the ``data/`` namespace.


-.. _pack-phase1:
+.. _pack-index-entry:

-Phase 1 Implementation (N=1)
-----------------------------
+Pack Index Entry
+----------------

-The initial implementation puts one blob per pack file.  Assembly is simpler: no
-multi-chunk buffering, and the PackIndex lookup follows directly from the chunk ID.
-
-Under this phase the pack ID is set equal to the chunk ID of the single blob it
-contains::
-
-    pack_id = chunk_id   # Phase 1 only
-
-Computing SHA-256 over the pack content is therefore unnecessary.  The pack for a
-given chunk is always at::
+Each pack contains one blob. The pack for a given chunk is always at::

    packs/<hex(chunk_id)>

-where the chunk ID is the same keyed MAC (``id_hash(plaintext_data)``) used today
-for objects in ``data/``.
+A PackIndex entry records the pack and byte range needed to read a blob::

-The ``BORGPACK`` header and the 4-byte ``blob_len`` prefix are written regardless.
-Phase 1 packs are structurally identical to multi-blob packs; readers require no
-special case for N=1.
-
-A PackIndex entry for Phase 1 packs is::
-
-    chunk_id  →  (pack_id = chunk_id,
-                  offset  = 13,        # 9-byte file header + 4-byte blob_len
-                  length  = blob_len)  # value read from bytes 9-12 of the pack file
-
-.. note::
-
-    The N value is configurable at ``borg repo-create`` time.  Expanding to N>1 in
-    a follow-on change requires no modification to the pack file format: the file
-    header and per-blob layout are identical.  Only the pack assembly, PackIndex
-    update, and pack ID computation logic changes.
+    chunk_id  →  (pack_id = chunk_id, offset, length)

+``offset`` is the position of the blob's size fields within the pack file.
+``length`` covers those size fields plus the encrypted meta and data payloads.

 .. _pack-write-order:

 Write Order and Crash Safety
 -----------------------------

-Pack data must reach stable storage before any index or manifest entry references
-it.  The required write order is:
+Pack data must be stored before any archive pointer references it.
+The required write order is:

-1. Write the pack file to ``packs/<pack_id>``.
-2. ``fsync`` the pack file and its containing directory.
-3. Write the index piece file to ``index/<index_id>`` (see :ref:`pack-index-namespace`).
-4. ``fsync`` the index piece file.
-5. Update and write the manifest.  This is the sole commit point.
+1. Store the pack file to ``packs/<pack_id>`` via borgstore.
+2. Store the partial index file to ``index/<index_id>`` (see :ref:`pack-index-namespace`).
+3. Write the archive and archive pointer. This is the sole commit point.

-.. figure:: pack-write-order.png
-    :figwidth: 40%
-    :width: 100%
-    :align: center
-
-    Write order and crash safety: the manifest write at step 5 is the sole commit
-    point; partial failures at earlier steps leave only harmless orphan data.
-
-A crash between steps 1 and 3 leaves orphan pack files in ``packs/``.  No archive
+A crash between steps 1 and 2 leaves orphan pack files in ``packs/``. No archive
 references these chunks; ``borg compact`` removes them on the next run.

-A crash between steps 3 and 5 leaves an index piece covering packs not yet committed
-to any archive.  The extra index entries point to valid, fully-written pack data; they
-are harmless and will be cleaned up by the next ``borg compact``.
+A crash between steps 2 and 3 leaves a partial index file covering packs not yet
+committed to any archive. The extra index entries point to valid, fully-written pack
+data; they are harmless and will be cleaned up by the next ``borg compact``.

-A crash after step 5 cannot leave the repository in an inconsistent state.  The
-manifest is the commit point: data that the manifest does not reference is unreachable
-and treated as garbage by ``borg compact``.
+A crash after step 3 cannot leave the repository in an inconsistent state. The
+archive pointer write is the commit point: data not referenced by any archive pointer
+is unreachable and treated as garbage by ``borg compact``.

-Deletion is soft: ``repository.delete()`` does not remove the pack file.  The pack
-stays on disk until ``borg compact`` confirms via mark-and-sweep that none of its
-blobs appear in any archive, then removes the whole file.  The same approach carries to N>1: there is no way to remove one blob from a pack
-without rewriting the whole file, so soft-delete is the only option.
+Only ``borg compact`` and ``borg check --repair`` delete pack files. When compact
+determines via mark-and-sweep that none of a pack's blobs are referenced by any
+archive, it removes the whole file. Individual blobs cannot be removed without
+rewriting the entire pack, so deletion always operates at pack granularity.


 .. _pack-index-namespace:
@ -212,28 +133,31 @@ without rewriting the whole file, so soft-delete is the only option.
 Index Namespace
 ---------------

-Borg does not embed a table of contents inside each pack file.  Chunk-to-location
-mappings are stored as a separate set of encrypted piece files under the ``index/``
-namespace.
+Chunk-to-location mappings are stored as a separate set of encrypted partial index
+files under the ``index/`` namespace.

-Each piece file covers the packs written in one backup session.  Its name is the
-SHA-256 digest of its own content::
+Each partial index file covers the packs written in one backup session. Its name is
+the SHA-256 digest of its own content::

    index/
      <sha256_of_content_hex>

-Content-addressed naming makes each piece file self-verifying and idempotent: writing
-the same index data twice produces the same filename, so a repeated write is a no-op.
+Content-addressed naming makes each partial index file self-verifying and idempotent:
+writing the same index data twice produces the same filename, so a repeated write is
+a no-op.

-Piece files are write-once.  A session appends new piece files; existing files are
-never modified.  On repository open, the client downloads all files under ``index/``,
-decrypts them, and merges the results into the in-memory PackIndex (a ``borghash``
-``HashTableNT`` keyed on ``chunk_id``).  The merge is commutative and idempotent;
-piece file order does not matter.
+Partial index files are write-once. A session stores new partial index files via
+borgstore; existing files are never modified. On repository open all files under
+``index/`` are loaded via borgstore, decrypted, and merged into the in-memory PackIndex
+(a ``borghash`` ``HashTableNT`` keyed on ``chunk_id``). The merge is commutative and
+idempotent; order does not matter.

-``borg compact`` consolidates all existing piece files into a single replacement file
-that covers only live chunks, writes it to ``index/``, and removes the files it
-supersedes.  This keeps the namespace small and open-time merge cost bounded.
+``borg compact`` rewrites the ``index/`` namespace: it identifies live chunks via
+mark-and-sweep, consolidates the surviving mappings into medium-sized replacement
+files (targeting roughly 10–100 packs per file), and removes the files it supersedes.
+Medium-sized files keep the open-time merge cost bounded while avoiding the
+cache-invalidation traffic on other clients that a single all-in-one index would
+cause.

 If the entire ``index/`` namespace is lost or corrupt, the PackIndex can be rebuilt
 by scanning pack files directly; see :ref:`pack-recovery`.
@ -247,43 +171,10 @@ Recovery Path
 When ``borg check --repair`` detects a missing or incomplete PackIndex it rebuilds
 it by forward-scanning all pack files in ``packs/``.

-The 4-byte ``blob_len`` prefix before each blob makes the scan self-contained: no
-prior knowledge of blob sizes or count is required.  The algorithm for one pack file::
-
-    verify magic   = first 8 bytes are b"BORGPACK"
-    verify version = byte 8 is 0x01
-
-    pos = 9
-    while pos < file_size:
-        if pos + 4 > file_size:
-            raise CorruptPackError(pack_id, pos)
-        blob_len = uint32le(file[pos : pos + 4])
-        if blob_len == 0 or pos + 4 + blob_len > file_size:
-            raise CorruptPackError(pack_id, pos)
-
-        obj_header          = file[pos + 4 : pos + 4 + 24]
-        meta_size, data_size = uint32le(obj_header[0:4]), uint32le(obj_header[4:8])
-
-        # Verify ObjHeader integrity without the key.
-        assert xxh64(file[pos+28 : pos+28+meta_size])            == obj_header[8:16]
-        assert xxh64(file[pos+28+meta_size : pos+4+blob_len])    == obj_header[16:24]
-
-        # Reconstruct the chunk_id for this blob (requires the repository key).
-        chunk_id = derive_chunk_id(pack_id, pos + 4, blob_len)
-
-        record_index_entry(chunk_id,
-                           pack_id = pack_id,
-                           offset  = pos + 4,
-                           length  = blob_len)
-        pos += 4 + blob_len
-
-The ``offset`` recorded in the rebuilt index points past the ``blob_len`` prefix,
-directly at the ObjHeader, consistent with normal PackIndex entries.
-
-Reconstructing ``chunk_id`` values requires the repository key because the chunk ID
-is a keyed MAC of the plaintext data (``id_hash(plaintext_data)``).  Without the key,
-a structural scan can still verify magic bytes, version, blob boundaries, and
-ObjHeader xxh64 hashes, but cannot produce a usable ``chunk_id → location`` mapping.
+Each blob's unencrypted header supplies the ``BORGPACK`` magic (for re-sync after
+corruption), the ``chunk_id``, and the size fields needed to locate the next blob.
+The scan produces a complete ``chunk_id → (pack_id, offset, length)`` mapping
+without decrypting any blob and without the repository key.


 .. _pack-repo-version:
@ -291,11 +182,11 @@ ObjHeader xxh64 hashes, but cannot produce a usable ``chunk_id → location`` ma
 Repository Version and Feature Flags
 --------------------------------------

-Repositories using pack files require repository version **4**.  Clients that only
+Repositories using pack files require repository version **4**. Clients that only
 accept version 3 refuse to open a version 4 repository with an unsupported-version
 error before any data is read.

-In addition, the manifest's ``config.feature_flags`` must include ``pack_files`` in
+In addition, the repository ``config.feature_flags`` must include ``pack_files`` in
 the mandatory set for all access modes:

 .. code-block:: python
@ -310,9 +201,9 @@ the mandatory set for all access modes:

 A client that does not recognise the ``pack_files`` feature flag will refuse to open
 the repository with a ``MandatoryFeatureUnsupported`` error regardless of the version
-number.  The two guards cover different failure modes: the version bump stops clients
+number. The two guards cover different failure modes: the version bump stops clients
 that predate feature-flag support entirely; the feature flag gives a clearer error
 message to clients that understand feature flags but don't know about packs yet.

-There is no migration path from version 3 repositories to version 4.  Users of the
+There is no migration path from version 3 repositories to version 4. Users of the
 version 3 beta format must create a new repository with ``borg repo-create``.