It's easy enough to verify exhaustively for any plausible chunker params
that Padmé always produces at most a 12% overhead. Checking that again
at runtime is pointless.
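
A minimal sketch of such an exhaustive check, assuming the Padmé padding rule as published for PURBs (round the size up so that only the top bits of its binary representation can be non-zero). The function names are illustrative and not taken from borg's code:

```python
def padme_padded_size(size: int) -> int:
    """Return the Padmé-padded size for a payload of `size` bytes (size >= 2)."""
    e = size.bit_length() - 1   # floor(log2(size))
    s = e.bit_length()          # floor(log2(e)) + 1
    mask = (1 << (e - s)) - 1   # low bits that get rounded up
    return (size + mask) & ~mask


def max_overhead(max_size: int) -> float:
    """Exhaustively compute the worst relative overhead for sizes 2..max_size."""
    return max((padme_padded_size(n) - n) / n for n in range(2, max_size + 1))
```

Running `max_overhead` once over the range of sizes the chunker can produce is enough to confirm the bound offline.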
This only happened when:
- using borg extract --numeric-ids
- processing NFS4 ACLs
It didn't affect POSIX ACL processing.
This is rather old code, so it looks like either nobody used that
code or the bug was never reported.
The bug was discovered by PyCharm's "Junie" AI. \o/
Sometimes, usually for file content chunks, it makes sense to
generate all-zero replacement chunks on-the-fly.
But for e.g. an archive's items metadata stream, this does not
make sense (because the consumer wants to msgpack.unpack the data), so
we rather want None. In that case, we do not have the size
information anyway.
preloading: always use raise_missing=False, because
the behaviour is defined at preloading time.
fetch_many: use get_many with raise_missing=False.
if get_many yields None instead of the expected chunk
cdata bytes, on-the-fly create an all-zero replacement
chunk of the correct size (if the size is known) and
emit an error msg about the missing chunk id / size.
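
The fetch_many behaviour described above could look roughly like this. It is a standalone sketch, not borg's implementation: `fetch_many_sketch`, the `(chunk id, size)` tuple layout and the `get_many` callable signature are assumptions made for illustration.

```python
import logging
from typing import Callable, Iterable, Iterator, Optional, Sequence, Tuple

logger = logging.getLogger(__name__)

ChunkRef = Tuple[bytes, Optional[int]]  # (chunk id, size if known) - assumed layout


def fetch_many_sketch(
    chunks: Sequence[ChunkRef],
    get_many: Callable[[Iterable[bytes]], Iterable[Optional[bytes]]],
) -> Iterator[Optional[bytes]]:
    """Yield chunk data; replace missing chunks by zeros if the size is known."""
    ids = [chunk_id for chunk_id, _ in chunks]
    # get_many is assumed to yield None for a missing chunk instead of raising
    for (chunk_id, size), data in zip(chunks, get_many(ids)):
        if data is None:
            logger.error("missing chunk %s (size: %s)", chunk_id.hex(), size)
            # all-zero replacement only if the size is known, else pass None on
            data = bytes(size) if size is not None else None
        yield data
```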
note: for borg recreate with re-chunking this is a bit
ugly, because it will transform a missing chunk into
a range of zero bytes in the target file in the recreated
archive. it will emit an error message at recreate time,
but afterwards the recreated archive will not "know"
about the problem any more and will just have that
zero-patched file.
so borg recreate with re-chunking should probably
only be used on repos that have no missing chunks.
Well, it's not totally removed, some code in Item, Archive and
borg transfer --from-borg1 needs to stay in place, so that we
can pick the CORRECT chunks list that is in .chunks_healthy
for all-zero-replacement-chunk-patched items when transferring
archives from borg1 to borg2 repos.
transfer: do not transfer replacement chunks, deal with missing chunks in other_repo
FUSE fs read: IOError or all-zero result
Improve handling when defining a passphrase or debugging passphrase issues, fixes #8496
Setting `BORG_DEBUG_PASSPHRASE=YES` enables passphrase debug logging to stderr, showing passphrase, hex utf-8 byte sequence and related env vars if a wrong passphrase was encountered.
Setting `BORG_DISPLAY_PASSPHRASE=YES` now always shows passphrase and its hex utf-8 byte sequence.
if retry is True, it will just retry to get a valid answer.
if retry is False, it will return the default.
the code can be tested by entering "error" (without the quotes).
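
A minimal sketch of the retry semantics, assuming a simple yes/no prompt. `ask`, the accepted answer sets and the injectable `read` callable are hypothetical and only illustrate the two branches:

```python
from typing import Callable

TRUISH = {"y", "yes"}
FALSISH = {"n", "no"}


def ask(prompt: str, default: bool, retry: bool,
        read: Callable[[str], str] = input) -> bool:
    """Ask until a valid answer is given (retry=True) or fall back to default."""
    while True:
        answer = read(prompt).strip().lower()
        if answer in TRUISH:
            return True
        if answer in FALSISH:
            return False
        if not retry:
            # invalid answer and no retry wanted: use the default
            return default
        # invalid answer and retry wanted: loop and ask again
```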
It needs to be possible to iterate over all items in an archive,
do some output (e.g. if an item is included / excluded) and then
only preload content data chunks for the included items.
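
One way to meet this requirement is to make preloading a per-item step that runs after the include/exclude decision. The sketch below assumes items are plain dicts with `path` and `chunks` keys and that `is_included`, `report` and `preload` are caller-supplied callables; none of these names come from borg:

```python
from typing import Callable, Dict, Iterable, Iterator, List


def iter_included(
    items: Iterable[Dict],
    is_included: Callable[[Dict], bool],
    report: Callable[[str, bool], None],
    preload: Callable[[List[bytes]], None],
) -> Iterator[Dict]:
    """Visit every item, report it, and preload chunks only for included ones."""
    for item in items:
        included = is_included(item)
        report(item["path"], included)  # output happens for all items
        if included:
            preload(item["chunks"])     # content chunks only for included items
            yield item
```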
We do not want urllib3 to clutter test output with LibreSSL-related
warnings on OpenBSD.
`NotOpenSSLWarning: urllib3 v2 only supports OpenSSL 1.1.1+, currently
the 'ssl' module is compiled with 'LibreSSL 3.8.2'`.
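
A hedged sketch of one way to suppress this warning using only the standard library `warnings` module, matching on the message text. How the test suite actually silences it is not stated here; the function name is hypothetical, and the filter presumably has to be installed before urllib3 is imported:

```python
import warnings


def silence_libressl_warning() -> None:
    """Ignore warnings whose message starts with the urllib3 OpenSSL notice."""
    # `message` is a regex matched against the start of the warning text
    warnings.filterwarnings("ignore", message=r"urllib3 v2 only supports OpenSSL")
```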
Worst (but frequent) case here is that all or most of the chunks
in the repo need to get recompressed, thus storing all chunk ids
in a python list would need significant amounts of memory for
large repositories.
We already have all chunk ids stored in cache.chunks, so we now just
flag the ones needing re-compression by setting the F_COMPRESS flag
(that does not need any additional memory).
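
The idea can be sketched with a plain dict that maps chunk IDs to a flags integer. The bit value of `F_COMPRESS` and the helper names are assumptions for illustration, not the real ChunkIndex API:

```python
from typing import Dict, Iterator

F_COMPRESS = 1 << 0  # hypothetical bit value


def flag_for_recompression(chunks: Dict[bytes, int], chunk_id: bytes) -> None:
    """Mark an already-indexed chunk in place; no second ID collection is built."""
    chunks[chunk_id] |= F_COMPRESS


def iter_flagged(chunks: Dict[bytes, int], flag: int) -> Iterator[bytes]:
    """Lazily yield the IDs of all chunks that have `flag` set."""
    return (chunk_id for chunk_id, flags in chunks.items() if flags & flag)
```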
- ChunkIndex: implement system flags
- ChunkIndex: F_NEW flag as 1st system flag for newly added chunks
- incrementally write only NEW chunks to repo/cache/chunks.*
- merge all chunks.* when loading the ChunkIndex from the repo
Also: the cached ChunkIndex only has the chunk IDs. All values are just dummies.
The ChunkIndexEntry value can be used to set flags and track size, but we
intentionally do not persist flags and size to the cache.
The size information gets set when borg loads the files cache and "compresses"
the chunks lists in the files cache entries. After that, all chunks referenced
by the files cache will have a valid size as long as the ChunkIndex is in memory.
This is needed so that "uncompress" can work.
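
The incremental scheme from the list above (flag new chunks with F_NEW, write only those, merge all parts on load) can be sketched as follows. `ChunkIndexSketch`, its methods and the flag's bit value are hypothetical stand-ins; the real ChunkIndex and the on-disk format of `chunks.*` are not shown here:

```python
from typing import Dict, Iterable, List

F_NEW = 1 << 0  # hypothetical bit value


class ChunkIndexSketch:
    def __init__(self) -> None:
        self.flags: Dict[bytes, int] = {}

    def add(self, chunk_id: bytes) -> None:
        """Add a chunk; only chunks not seen before get the F_NEW flag."""
        if chunk_id not in self.flags:
            self.flags[chunk_id] = F_NEW

    def write_new(self) -> List[bytes]:
        """Return the IDs to persist as one partial index and clear F_NEW."""
        new_ids = sorted(cid for cid, f in self.flags.items() if f & F_NEW)
        for cid in new_ids:
            self.flags[cid] &= ~F_NEW
        return new_ids

    @classmethod
    def merge(cls, parts: Iterable[Iterable[bytes]]) -> "ChunkIndexSketch":
        """Rebuild a full index from all previously written partial indexes."""
        index = cls()
        for part in parts:
            for cid in part:
                index.flags.setdefault(cid, 0)  # loaded chunks are not new
        return index
```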
- doesn't need a separate file for the hash
- we can later write multiple partial chunkindexes to the cache
also:
add upgrade code that renames the cache from previous borg versions.
Consider soft-deleted archives/ directory entries, but only create a new
archives/ directory entry if:
- there is no entry for that archive ID
- there is no soft-deleted entry for that archive ID either
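
The two conditions combine into a single check. In this sketch the sets of existing and soft-deleted archive IDs are assumed to be available as plain Python sets; the function name is illustrative:

```python
from typing import Set


def needs_new_entry(archive_id: bytes, entries: Set[bytes],
                    soft_deleted: Set[bytes]) -> bool:
    """True only if the archive ID has neither a live nor a soft-deleted entry."""
    return archive_id not in entries and archive_id not in soft_deleted
```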
Support running with or without --repair.
Without --repair, it can be used to detect such inconsistencies and return with rc != 0.
--repository-only contradicts --find-lost-archives.
We are only interested in archive metadata objects here, thus for most repo objects
it is enough to read the repoobj's metadata and determine the object's type.
Only if the object is of the right type do we need to read the full object
(metadata and data).
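
A minimal sketch of this two-step read. The `read_meta` and `read_full` callables, the `"type"` metadata key and the `ARCHIVE_TYPE` tag are assumptions for illustration, not the actual repository object API:

```python
from typing import Callable, Dict, Iterable, Iterator, Tuple

ARCHIVE_TYPE = "archive"  # hypothetical type tag


def find_archive_objects(
    ids: Iterable[bytes],
    read_meta: Callable[[bytes], Dict],
    read_full: Callable[[bytes], bytes],
) -> Iterator[Tuple[bytes, bytes]]:
    """Yield (id, full object) for archive metadata objects only."""
    for obj_id in ids:
        meta = read_meta(obj_id)          # cheap: metadata only
        if meta.get("type") != ARCHIVE_TYPE:
            continue                      # wrong type: skip the full read
        yield obj_id, read_full(obj_id)   # expensive: metadata and data
```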