Commit graph

1570 commits

Author SHA1 Message Date
Thomas Waldmann
3d0c61a184 revert incorrect fix for put updating shadow_index, fixes #5661
A) the compaction code needs the shadow index only for this case:

segment A: PUT x, segment B: DEL x, with A < B  (DEL shadows the PUT).

B) for the following case, we have no shadowing DEL (or rather: it does not matter,
because there is a PUT right after the DEL) and x is in the repo index,
thus the shadow_index is not needed for the special case in the compaction code:

segment A: PUT x, segment B: DEL x PUT x

see also PR #5636.

reverts f079a83fed
and clarifies the code by more comments.

we keep the code deduplication of 5f32b5666a
and just add an update_shadow_index param to make it not look like there was
something accidentally forgotten, which was the whole reason for the reverted
"fix".
2021-02-04 02:29:43 +01:00
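
Cases A) and B) from this commit message can be illustrated with a toy model. This is only a sketch, not borg's repository code: `ToyRepo`, its `log`, `index` and `commit()` are hypothetical names, and only the `update_shadow_index` parameter name is taken from the message above.

```python
class ToyRepo:
    """Toy append-only log, loosely modelled on the commit message (hypothetical)."""

    def __init__(self):
        self.segment = 0          # number of the segment currently written to
        self.log = []             # (segment, op, key) entries in write order
        self.index = {}           # key -> segment holding the current PUT
        self.shadow_index = {}    # key -> segments whose PUT is shadowed by a DEL

    def commit(self):
        # Assumption for this sketch: later writes go to a new segment.
        self.segment += 1

    def _delete(self, key, update_shadow_index):
        old_segment = self.index.pop(key)
        if update_shadow_index:
            self.shadow_index.setdefault(key, []).append(old_segment)
        self.log.append((self.segment, "DEL", key))

    def put(self, key):
        if key in self.index:
            # Case B: the implicit DEL is followed by a PUT right away,
            # so the key stays in the index and no shadow entry is recorded.
            self._delete(key, update_shadow_index=False)
        self.log.append((self.segment, "PUT", key))
        self.index[key] = self.segment

    def delete(self, key):
        # Case A: an explicit DEL shadows the older PUT.
        self._delete(key, update_shadow_index=True)


# Case A: segment 0: PUT x, segment 1: DEL x
a = ToyRepo()
a.put("x")
a.commit()
a.delete("x")

# Case B: segment 0: PUT x, segment 1: DEL x PUT x
b = ToyRepo()
b.put("x")
b.commit()
b.put("x")
```

In this model only case A leaves an entry in `shadow_index`; in case B the key is still present in `index`, which is the situation the message describes as not needing the shadow index.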
Thomas Waldmann
a77db94b01 fix bad default: manifest.archives.list(consider_checkpoints=False), fixes #5668
also, add a comment about it, to avoid future similar mistakes.
2021-02-03 14:13:27 +01:00
Thomas Waldmann
649603f247 only print stats if not Ctrl-C'ed
if the user interrupts by ctrl-c, we do not save the archive,
thus we cannot show stats because archive.id will not be in
the chunks index.
2021-02-03 01:56:52 +01:00
Thomas Waldmann
12d9110882 also accept msgpack up to 1.0.2
exclude 1.0.1 though, which had some issues (not sure whether they affect borg).
2021-01-30 21:31:33 +01:00
Thomas Waldmann
f079a83fed fix updating shadow_index also in put
The shadow_index should be in the same state after both of these sequences
(let's assume that A is not in repo yet for simplicity, but it does not matter):

a) explicit delete: put(A), delete(A), put(A), resulting in: PUT A, DEL A, PUT A repo contents

b) implicit delete: put(A), put(A), resulting in: PUT A, DEL A, PUT A repo contents
2021-01-29 17:05:01 +01:00
Thomas Waldmann
5f32b5666a deduplicate code of put and delete, no functional change 2021-01-29 17:05:01 +01:00
Thomas Waldmann
6f00b025d8 remove empty shadowed_segments lists, fixes #5275
also:
- add test for removed empty shadowed_segments list
- add some comments
- add repo_dump test debug tool
2021-01-29 15:44:49 +01:00
Thomas Waldmann
a83fdd7de9 implement borg debug dump-hints 2021-01-29 14:11:32 +01:00
Thomas Waldmann
c8e9131158 remove bundled blake2 code, usage of libb2
we just use it via python3 now.
2021-01-28 18:00:00 +01:00
Thomas Waldmann
1dbe86a14e use blake2b from hashlib 2021-01-28 18:00:00 +01:00
Thomas Waldmann
6fa5bb4630 add tests for blake2b_128 2021-01-28 17:59:46 +01:00
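
A 128-bit BLAKE2b digest can be computed with the standard library alone. The sketch below is an assumption about what a `blake2b_128` helper might look like, not borg's actual function; `hashlib.blake2b` and its `digest_size` parameter are standard Python.

```python
import hashlib


def blake2b_128(data: bytes) -> bytes:
    """Return a 16-byte (128-bit) BLAKE2b digest of data (illustrative helper)."""
    return hashlib.blake2b(data, digest_size=16).digest()


digest = blake2b_128(b"some chunk data")
```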
TW
699256edbd
Merge pull request #5620 from ThomasWaldmann/sparse-file-integr2
Sparse file support (integration)
2021-01-17 17:45:53 +01:00
Thomas Waldmann
6dc334422e fixup: improve comment about assumptions in the item metadata stream chunker 2021-01-15 21:51:15 +01:00
Thomas Waldmann
2391d160a8 add all-zero detection to buzhash chunk data processing 2021-01-15 21:27:29 +01:00
Thomas Waldmann
2d76365214 cosmetic: directly set allocation instead of going via is_zero 2021-01-15 21:10:07 +01:00
Thomas Waldmann
8162e2e67b cached_hash is only used in archive, move it there 2021-01-14 20:50:12 +01:00
Thomas Waldmann
e41dc6e96f use zeros for benchmarks 2021-01-14 20:19:10 +01:00
Thomas Waldmann
be257728ca move zeros to constants module 2021-01-14 20:02:18 +01:00
Thomas Waldmann
3b9798cffc remove max_chunk_size (unused) 2021-01-14 19:56:39 +01:00
TW
4041bdf169
Merge pull request #5606 from ThomasWaldmann/fix-5603-master
do not recurse into duplicate roots, fixes #5603 (master)
2021-01-11 16:51:49 +01:00
Thomas Waldmann
4e3be1db5e reuse zeros also in fixed-size chunker for all-zero chunk detection
also: zeros.startswith() is faster
2021-01-08 23:39:53 +01:00
Thomas Waldmann
ef19d937ed use cached_hash also to generate all-zero replacement chunks
at least for major amounts of fixed-size replacement hashes,
this will be much faster. also less memory management overhead.
2021-01-08 23:39:53 +01:00
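
The idea of reusing hashes for all-zero replacement chunks can be sketched with a memoized function. This is not borg's `cached_hash`: the name `zero_chunk_id` is hypothetical, and plain SHA-256 stands in for whatever id hash the repository actually uses.

```python
import hashlib
from functools import lru_cache


@lru_cache(maxsize=None)
def zero_chunk_id(size: int) -> bytes:
    """Hash `size` zero bytes once and reuse the result for later calls."""
    # Assumption: SHA-256 as a stand-in id hash for this sketch.
    return hashlib.sha256(bytes(size)).digest()
```

With fixed-size chunks the same `size` comes up over and over, so after the first call the digest is served from the cache instead of allocating and hashing a fresh zero buffer.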
Thomas Waldmann
f3088a9893 rename chunk_to_id_data to cached_hash 2021-01-08 23:39:53 +01:00
Thomas Waldmann
92f221075a refactor recreate to use chunk_to_id_data 2021-01-08 23:39:53 +01:00
Thomas Waldmann
b3659e0b8c reuse chunker.zeros for sparse extraction 2021-01-08 23:39:53 +01:00
Thomas Waldmann
9fd284ce1a refactor new zero chunk handling to be reusable 2021-01-08 23:39:53 +01:00
Thomas Waldmann
6d0f9a52eb detect all-zero chunks, avoid hashing them
comparing zeros is quicker than hashing them.
the comparison should fail quickly inside non-zero data.
2021-01-08 17:40:06 +01:00
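
A minimal sketch of the comparison-based zero check, assuming a preallocated zero buffer. `ZEROS`, its 1 MiB size and `is_all_zero` are illustrative names and values, not borg's code.

```python
ZEROS = bytes(1024 * 1024)  # assumed buffer size for this sketch


def is_all_zero(data: bytes) -> bool:
    """Return True if data consists only of zero bytes."""
    if len(data) > len(ZEROS):
        # Fallback for inputs larger than the preallocated buffer.
        return data == bytes(len(data))
    # Prefix comparison stops at the first non-zero byte.
    return ZEROS.startswith(data)
```

A chunker could call such a check before hashing and skip the hash for chunks that turn out to be all zeros.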
Thomas Waldmann
52bd55b29a integrate Chunk type, avoid hashing holes 2021-01-08 17:39:51 +01:00
Thomas Waldmann
7319f85b54 adapt the existing chunker tests 2021-01-08 17:33:25 +01:00
Thomas Waldmann
8c299696aa Chunker: yield Chunk namedtuple instead of bytes/memoryview 2021-01-08 01:10:44 +01:00
Thomas Waldmann
7a3a49e99b reverted changes to 3rd party code
all algorithms/* stuff needs to be fixed upstream.

we just copy the files from there now and then.

https://github.com/lz4/lz4
https://github.com/facebook/zstd
https://github.com/Cyan4973/xxHash
2021-01-07 18:18:15 +01:00
Andrea Gelmini
72e7c46fa7 Fix typos 2021-01-07 17:54:33 +01:00
Thomas Waldmann
f2cb17d66c check: debug log segment filename 2021-01-03 18:23:52 +01:00
Thomas Waldmann
73c04398f3 move requires_hardlinks upwards 2021-01-03 17:55:45 +01:00
axapaxa
b291b91962
Add remote upload buffer (--remote-buffer) (#5574)
add remote upload buffer (--remote-buffer)

- added new option --remote-buffer
- allow to_send to grow to selected size
- don't grow if wait is specified
- fill pipe on any command (including 'async_response')
- add new option to docs
- create EfficientBytesQueue to prevent recreation of buffer each time we send something
- add tests for EfficientBytesQueue
2021-01-03 17:37:16 +01:00
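
One way to avoid rebuilding a send buffer on every partial write is to keep the pending data as a queue of chunks plus an offset into the first one. The class below is a hypothetical sketch of that idea; it is not the `EfficientBytesQueue` added in this commit, and all method names are assumptions.

```python
from collections import deque


class BytesQueueSketch:
    """FIFO of bytes that supports partial consumption without re-copying."""

    def __init__(self):
        self._chunks = deque()   # pending bytes objects, oldest first
        self._offset = 0         # bytes already consumed from the first chunk
        self._size = 0           # total number of pending bytes

    def __len__(self):
        return self._size

    def push_back(self, data: bytes) -> None:
        if data:
            self._chunks.append(bytes(data))
            self._size += len(data)

    def peek_front(self) -> memoryview:
        """Return the unsent part of the first chunk without copying it."""
        if not self._chunks:
            return memoryview(b"")
        return memoryview(self._chunks[0])[self._offset:]

    def pop_front(self, n: int) -> None:
        """Drop n bytes from the front, e.g. after a partial send."""
        if n > self._size:
            raise ValueError("cannot pop more bytes than are queued")
        self._size -= n
        while n:
            available = len(self._chunks[0]) - self._offset
            if n >= available:
                self._chunks.popleft()
                self._offset = 0
                n -= available
            else:
                self._offset += n
                n = 0
```

A sender would write `peek_front()` to the pipe and then call `pop_front()` with the number of bytes the write actually accepted.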
Thomas Waldmann
806dc5084d do not recurse into duplicate roots, fixes #5603 2021-01-02 23:01:31 +01:00
Thomas Waldmann
257791274f add a test whether a duplicate root is skipped, see #5603 2021-01-02 23:00:52 +01:00
Thomas Waldmann
58c0a0186f add a test for hardlink extraction issue, see #5603 2021-01-02 23:00:38 +01:00
Thomas Waldmann
37a7436ff9 detect sparse support by fs 2020-12-28 19:53:52 +01:00
Thomas Waldmann
c0c0da9c76 skip sparse tests if has_seek_hole is False
also: do the os.SEEK_(HOLE|DATA) check only once
2020-12-27 22:06:08 +01:00
Thomas Waldmann
b8bb0494f6 create --sparse, file map support for the "fixed" chunker, see #14
a file map can be:

- created internally inside chunkify by calling sparsemap, which uses
  SEEK_DATA / SEEK_HOLE to determine data and hole ranges inside a
  seekable sparse file.
  Usage: borg create --sparse --chunker-params=fixed,BLOCKSIZE ...
  BLOCKSIZE is the chunker blocksize here, not the filesystem blocksize!

- made by some other means and given to the chunkify function.
  this is not used yet, but in future this could be used to only read
  the changed parts and seek over the (known) unchanged parts of a file.

sparsemap: the generated range sizes are multiples of the fs block size.
           the tests assume a 4 KiB fs block size.
2020-12-27 22:06:08 +01:00
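
The SEEK_DATA / SEEK_HOLE approach can be sketched as follows. This is an illustration only, not borg's `sparsemap`: the function name `sparse_ranges` and its tuple layout are assumptions, and it needs a platform where `os.SEEK_DATA` and `os.SEEK_HOLE` exist (for example Linux).

```python
import errno
import os


def sparse_ranges(fd):
    """Yield (start, length, is_data) ranges covering the whole file."""
    size = os.fstat(fd).st_size
    pos = 0
    while pos < size:
        try:
            data_start = os.lseek(fd, pos, os.SEEK_DATA)
        except OSError as e:
            if e.errno != errno.ENXIO:
                raise
            data_start = size  # no more data: the rest of the file is a hole
        if data_start > pos:
            yield pos, data_start - pos, False  # hole range
        if data_start >= size:
            break
        # There is always an implicit hole at end of file, so this succeeds.
        hole_start = os.lseek(fd, data_start, os.SEEK_HOLE)
        yield data_start, hole_start - data_start, True  # data range
        pos = hole_start
```

Whether holes are actually reported depends on the filesystem. Where it does not track them, the whole file may come back as a single data range.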
Thomas Waldmann
227dccdfdc use strerror(e.errno) to get verbose error msg
otherwise it is just like: [Errno NN] Exxxxx
2020-12-25 19:36:37 +01:00
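
The difference between the two message styles can be seen with the standard library. `verbose_error` is a hypothetical helper name for this sketch; `os.strerror` is standard Python.

```python
import errno
import os


def verbose_error(e: OSError) -> str:
    """Return the system's description for e.errno, falling back to str(e)."""
    if e.errno is None:
        return str(e)
    return os.strerror(e.errno)


err = OSError(errno.EACCES, "EACCES")
terse = str(err)            # has the "[Errno NN] Exxxxx" shape
verbose = verbose_error(err)  # human-readable text from the C library
```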
Thomas Waldmann
2dbdaebd8a fix tests for new xattr exception handler, see #5583 2020-12-25 19:35:27 +01:00
Thomas Waldmann
d986114e5e refactor/dedup xattr exception handler 2020-12-25 19:30:05 +01:00
Thomas Waldmann
ecae0841b1 extract: add generic exception handler when setting xattrs, fixes #5092
emit a warning message giving the path, xattr key and error message.

also: continue trying to restore other xattrs and bsdflags afterwards
(it did not continue with this before this fix).
2020-12-25 19:24:49 +01:00
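
The behaviour described here (warn and carry on with the remaining xattrs) can be sketched like this. It is not borg's extraction code: `restore_xattrs` and its injectable `setxattr` parameter are assumptions made so the example runs without touching real extended attributes. `os.setxattr` itself is only available on Linux.

```python
import errno
import os


def restore_xattrs(path, xattrs, setxattr=None):
    """Try to set every xattr; return one warning string per failure."""
    if setxattr is None:
        setxattr = os.setxattr  # Linux-only in the standard library
    warnings = []
    for key, value in xattrs.items():
        try:
            setxattr(path, key, value)
        except OSError as e:
            reason = os.strerror(e.errno) if e.errno is not None else str(e)
            # The warning names the path, the xattr key and the error message.
            warnings.append(f"{path}: when setting extended attribute {key}: {reason}")
            # No re-raise: the loop continues with the remaining xattrs.
    return warnings


# Usage with a fake setter that rejects one key:
applied = {}


def fake_setxattr(path, key, value):
    if key == "security.blocked":
        raise OSError(errno.EPERM, "EPERM")
    applied[key] = value


problems = restore_xattrs(
    "some/file",
    {"user.a": b"1", "security.blocked": b"2", "user.b": b"3"},
    setxattr=fake_setxattr,
)
```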
TW
2b992fe078
Merge pull request #5332 from amikula/keep-oldest-when-retention-target-not-met
Keep oldest when retention target not met
2020-12-25 19:00:19 +01:00
TW
f3b90cc5c7
Merge pull request #5576 from ypid/feature/https-everywhere
Use HTTPS everywhere (mechanical edit using util from https-everywhere)
2020-12-22 21:24:41 +01:00
Robin Schneider
0742fe7ab7
Comply with editorconfig insert_final_newline in paperkey.html 2020-12-22 17:31:00 +01:00
Robin Schneider
fb38ba579f
Use HTTPS everywhere (mechanical edit using util from https-everywhere)
Ref: https://github.com/EFForg/https-everywhere/tree/master/utils/rewriter

```Shell
~/src/EFForg/https-everywhere/utils/rewriter/rewriter.js .
```

A few changes were reset/fixed manually before the commit.
2020-12-22 16:36:40 +01:00
Thomas Waldmann
dc2a57af47 use pytest.fixture instead of yield_fixture, fixes #5575
/vagrant/borg/borg/.tox/py36-none/lib/python3.6/site-packages/borg/testsuite/remote.py:73:
    PytestDeprecationWarning: @pytest.yield_fixture is deprecated.
Use @pytest.fixture instead; they are the same.
Docs: https://docs.pytest.org/en/stable/warnings.html
2020-12-20 00:11:04 +01:00