mirror of
https://github.com/borgbackup/borg.git
synced 2026-06-11 01:41:57 -04:00
Merge pull request #9750 from ThomasWaldmann/remove-xxh64
remove xxhash / xxh64 requirement, mentions
commit
d2cdeaff57
5 changed files with 8 additions and 30 deletions
@@ -846,8 +846,7 @@ and disk space on subsequent runs. Here what Borg does when you run ``borg creat
 - Transmits to repo. If the repo is remote, this usually involves an SSH connection
 (does its own encryption / authentication).
 - Stores the chunk into a key/value store (the key is the chunk id, the value
-is the data). While doing that, it computes XXH64 of the data (repo low-level
-checksum, used by borg check --repository).
+is the data).

 Subsequent backups are usually very fast if most files are unchanged and only
 a few are new or modified. The high performance on unchanged files primarily depends

@@ -14,7 +14,6 @@
 .. _ACL: https://en.wikipedia.org/wiki/Access_control_list
 .. _libacl: https://savannah.nongnu.org/projects/acl/
 .. _libattr: https://savannah.nongnu.org/projects/attr/
-.. _libxxhash: https://github.com/Cyan4973/xxHash
 .. _liblz4: https://github.com/Cyan4973/lz4
 .. _libzstd: https://github.com/facebook/zstd
 .. _OpenSSL: https://www.openssl.org/
@@ -28,4 +27,3 @@
 .. _userspace filesystems: https://en.wikipedia.org/wiki/Filesystem_in_Userspace
 .. _Cython: https://cython.org/
 .. _virtualenv: https://pypi.org/project/virtualenv/
-.. _python-xxhash: https://github.com/ifduyue/python-xxhash/

@@ -81,14 +81,9 @@ A repo object has a structure like this:

 * 32-bit meta size
 * 32-bit data size
-* 64-bit xxh64(meta)
-* 64-bit xxh64(data)
 * meta
 * data

-The size and xxh64 hashes can be used for server-side corruption checks without
-needing to decrypt anything (which would require the borg key).
-
 The overall size of repository objects varies from very small (a small source
 file will be stored as a single repository object) to medium (big source files will
 be cut into medium-sized chunks of some MB).
@@ -897,8 +892,7 @@ Data corruption in the files cache could create incorrect archives, e.g. due
 to wrong object IDs or sizes in the files cache.

 Therefore, Borg calculates checksums when writing these files and tests checksums
-when reading them. Checksums are generally 64-bit XXH64 hashes.
-The canonical xxHash representation is used, i.e. big-endian.
+when reading them. Checksums are generally 256-bit sha256 hashes.
 Checksums are stored as hexadecimal ASCII strings.

 For compatibility, checksums are not required and absent checksums do not trigger errors.
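
The checksum rules stated in the hunk above (SHA-256, stored as a hexadecimal ASCII string, absent checksums tolerated) can be sketched as follows. This is a minimal illustration only, not Borg's code: the function names ``hex_checksum`` and ``verify_checksum`` are hypothetical, and representing an absent checksum as ``None`` is an assumption made for this example.

```python
import hashlib
from typing import Optional


def hex_checksum(data: bytes) -> str:
    """Return the SHA-256 digest of data as a 64-character hex ASCII string."""
    return hashlib.sha256(data).hexdigest()


def verify_checksum(data: bytes, stored: Optional[str]) -> bool:
    """Check data against a stored hex checksum.

    Assumption for this sketch: a missing checksum is passed as None and
    is accepted, mirroring "absent checksums do not trigger errors".
    """
    if stored is None:
        return True
    # A plain comparison is enough here: the checksum is a data safety
    # mechanism, not a security mechanism.
    return hex_checksum(data) == stored.lower()
```

A caller would compute ``hex_checksum`` when writing a file and pass the stored value to ``verify_checksum`` when reading it back.
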
@@ -909,19 +903,7 @@ Checksums are a data safety mechanism. They are not a security mechanism.

 .. rubric:: Choice of algorithm

-XXH64 has been chosen for its high speed on all platforms, which avoids performance
-degradation in CPU-limited parts (e.g. cache synchronization).
-Unlike CRC32, it neither requires hardware support (crc32c or CLMUL)
-nor vectorized code nor large, cache-unfriendly lookup tables to achieve good performance.
-This simplifies deployment of it considerably (cf. src/borg/algorithms/crc32...).
-
-Further, XXH64 is a non-linear hash function and thus has a "more or less" good
-chance to detect larger burst errors, unlike linear CRCs where the probability
-of detection decreases with error size.
-
-The 64-bit checksum length is considered sufficient for the file sizes typically
-checksummed (individual files up to a few GB, usually less).
-xxHash was expressly designed for data blocks of these sizes.
+sha256 has been chosen for its wide availability on all platforms and hw acceleration on some.

 Lower layer — file_integrity
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -959,10 +941,10 @@ All checksums are compiled into a simple JSON structure called *integrity data*:
 .. code-block:: json

 {
-"algorithm": "XXH64",
+"algorithm": "SHA256",
 "digests": {
-"HashHeader": "eab6802590ba39e3",
-"final": "e2a7f132fc2e8b24"
+"HashHeader": "eab6802590ba39e3...",
+"final": "e2a7f132fc2e8b24..."
 }
 }

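
To make the shape of the new *integrity data* concrete, here is a hedged sketch of a reader for it. It relies only on what the hunk above shows: an ``algorithm`` field set to ``SHA256`` and a ``digests`` mapping from part names to hex strings. The function name ``parse_integrity_data`` and the strict 64-character length check are assumptions of this example and are not taken from Borg's source.

```python
import json
import string

HEX_DIGITS = set(string.hexdigits)
SHA256_HEX_LENGTH = 64  # 256 bits written as hexadecimal ASCII


def parse_integrity_data(text: str) -> dict:
    """Parse integrity data JSON and return its digests mapping.

    Raises ValueError if the algorithm is not SHA256 or if a digest is
    not a 64-character hexadecimal string.
    """
    data = json.loads(text)
    if data.get("algorithm") != "SHA256":
        raise ValueError("unsupported algorithm: %r" % data.get("algorithm"))
    digests = data.get("digests")
    if not isinstance(digests, dict):
        raise ValueError("digests must be a JSON object")
    for part, digest in digests.items():
        if not isinstance(digest, str):
            raise ValueError("digest for %r is not a string" % part)
        if len(digest) != SHA256_HEX_LENGTH or not set(digest) <= HEX_DIGITS:
            raise ValueError("malformed digest for part %r" % part)
    return digests
```

With this check, integrity data written with the old ``XXH64`` algorithm name is rejected with a ``ValueError`` instead of being compared against SHA-256 digests.
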
@@ -996,7 +978,7 @@ The ``[integrity]`` section is used:

 [integrity]
 manifest = 10e...21c
-files = {"algorithm": "XXH64", "digests": {"HashHeader": "eab...39e3", "final": "e2a...b24"}}
+files = {"algorithm": "SHA256", "digests": {"HashHeader": "eab...39e3", "final": "e2a...b24"}}

 The manifest ID is duplicated in the integrity section due to the way all Borg
 versions handle the config file. Instead of creating a "new" config file from

@@ -39,7 +39,6 @@ dependencies = [
 "argon2-cffi",
 "shtab>=1.8.0",
 "backports-zstd; python_version < '3.14'", # for python < 3.14.
-"xxhash>=2.0.0",
 "jsonargparse>=4.47.0",
 "PyYAML>=6.0.2", # we need to register our types with yaml, jsonargparse uses yaml for config files
 "blake3>=1.0.0",

@@ -1,6 +1,6 @@
 #!/bin/bash

-pacman -S --needed --noconfirm git mingw-w64-ucrt-x86_64-{toolchain,pkgconf,lz4,xxhash,openssl,rclone,python-msgpack,python-argon2_cffi,python-platformdirs,python,cython,python-setuptools,python-wheel,python-build,python-pkgconfig,python-packaging,python-pip,python-paramiko,rust,python-maturin}
+pacman -S --needed --noconfirm git mingw-w64-ucrt-x86_64-{toolchain,pkgconf,lz4,openssl,rclone,python-msgpack,python-argon2_cffi,python-platformdirs,python,cython,python-setuptools,python-wheel,python-build,python-pkgconfig,python-packaging,python-pip,python-paramiko,rust,python-maturin}

 if [ "$1" = "development" ]; then
 pacman -S --needed --noconfirm mingw-w64-ucrt-x86_64-python-{pytest,pytest-benchmark,pytest-cov,pytest-xdist}