90 commits

Author SHA1 Message Date
Thomas Waldmann
3c173cc03b wrap msgpack, fixes #3632, fixes #2738
wrap msgpack so that future upstream api changes do not cause trouble
and we do not have to clutter our code globally with extra params.

make sure the packing is always with use_bin_type=False,
thus generating "old" msgpack format (as borg always did) from
bytes objects.

make sure the unpacking is always with raw=True,
thus generating bytes objects.

note:

safe unicode encoding/decoding for some kinds of data types is done in Item
class (see item.pyx), so it is enough if we care for bytes objects on the
msgpack level.

also wrap exception handling, so borg code can catch msgpack specific
exceptions even if the upstream msgpack code raises way too generic
exceptions typed Exception, TypeError or ValueError.
We use our own exception classes for this; the upstream classes are deprecated.
2018-08-06 17:32:55 +02:00
Thomas Waldmann
89c11f45ce cache lock: use lock_wait everywhere to fix infinite wait
also: clarify docs
(cherry picked from commit 2f3e60d9d5)
2018-07-16 23:50:04 +02:00
Thomas Waldmann
de113bab23 move capacity calculation to IndexBase, fixes #2646
we just give how many "usable" hashtable entries we want and it computes
the hashtable capacity internally via int(usable / MAX_LOAD_FACTOR).
2018-06-12 22:25:27 +02:00
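
The formula quoted in this commit message can be sketched as below. The value of MAX_LOAD_FACTOR and the helper name capacity_for are assumptions made for illustration; the real IndexBase code may choose or round sizes differently.

```python
# Assumed value, for illustration only; the real constant is defined in borg's hashindex code.
MAX_LOAD_FACTOR = 0.75


def capacity_for(usable: int) -> int:
    """Return the hashtable capacity needed to hold `usable` entries.

    Uses the formula from the commit message: int(usable / MAX_LOAD_FACTOR).
    (capacity_for is a hypothetical name, not a borg API.)
    """
    if usable < 0:
        raise ValueError("usable must be non-negative")
    return int(usable / MAX_LOAD_FACTOR)
```

With the assumed load factor, asking for 75 usable entries yields a capacity of 100 slots.
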
Thomas Waldmann
e2f71b5dc3 cleanup: get rid of ignore_inode, replace with cache_mode
ignore_inode == ('i' not in cache_mode)  # i)node
2018-03-24 17:04:20 -07:00
Thomas Waldmann
b1e7e7f90a cleanup: get rid of Cache.do_files, replace with cache_mode
not do_files == (cache_mode == 'd')  # d)isabled
2018-03-24 17:04:20 -07:00
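
The equivalences stated in the two cleanup commits above (ignore_inode and do_files expressed through cache_mode) can be written as two small helpers. The function names are hypothetical, and the mode strings used in the comments (single letters combined into one string) are an assumption based on the 'i' and 'd' letters the messages mention.

```python
def ignore_inode(cache_mode: str) -> bool:
    # 'i' stands for i)node: the inode is ignored when the letter is absent.
    return 'i' not in cache_mode


def do_files(cache_mode: str) -> bool:
    # 'd' stands for d)isabled: the files cache is used unless the mode is exactly 'd'.
    return cache_mode != 'd'
```
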
Thomas Waldmann
91e5e231f1 read files cache early, init checkpoint timer after that, see #3394
reading the files cache can take a considerable amount of time (a user
reported 1h 42min for a 700MB files cache for a repo with 8M files and
15TB total), so we must init the checkpoint timer after that or borg
will create the checkpoint too early.

creating a checkpoint means (among other things) saving the files cache,
which also takes a lot of time in such a case, so a premature checkpoint
adds one expensive save too many.

doing this in a clean way required some refactoring:
- cache_mode is now given to Cache initializer and stored in instance
- the files cache is loaded early in _do_open (if needed)
2018-03-24 17:04:13 -07:00
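
The ordering this commit describes (load the files cache first, start the checkpoint timer afterwards) might look roughly like the sketch below. The Cache class here is a toy stand-in, and start_backup, load_files_cache and clock are hypothetical names introduced only to make the ordering visible.

```python
import time


class Cache:
    """Toy stand-in for the real Cache class; only the ordering matters here."""

    def __init__(self, cache_mode, load_files_cache):
        self.cache_mode = cache_mode            # stored on the instance, as the commit describes
        self.files = None
        if cache_mode != 'd':                   # load early, and only if the files cache is in use
            self.files = load_files_cache()     # may take a long time for a big cache


def start_backup(cache_mode, load_files_cache, clock=time.monotonic):
    cache = Cache(cache_mode, load_files_cache)   # the slow part happens first
    last_checkpoint = clock()                     # the timer starts only after loading
    return cache, last_checkpoint
```

Because the timer is read after the cache has been loaded, the time spent loading does not count towards the first checkpoint interval.
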
Milkey Mouse
4098f0f05c Set previous_location on load instead of save
This caused a really stupid bug with borg config --cache, see
https://github.com/borgbackup/borg/issues/3304#issuecomment-371896766
2018-03-10 16:52:42 -08:00
Thomas Waldmann
4e0f369d0a fix borg create never showing M status
the problem was that the upper layer code did not have enough information
about the file, whether it is known or not - and thus, could not decide
correctly whether status should be M)odified or A)dded.

now, file_known_and_unchanged method returns an additional "known"
boolean to fix this.

also: add comment about files cache loading in cache_mode='r'
2018-02-26 11:07:20 +01:00
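
A minimal sketch of the idea in this fix: once the lookup also reports whether the file is known, the caller can tell M)odified from A)dded. The files cache is reduced to a plain dict, the signature is simplified, and the 'U' status for unchanged files plus the status_for helper are assumptions added for illustration.

```python
def file_known_and_unchanged(files_cache, path, stat_key):
    """Return (known, unchanged) for `path`.

    `files_cache` is a plain dict (path -> stat_key) here, a simplification
    of the real files cache and of the real method's return value.
    """
    if path not in files_cache:
        return False, False
    return True, files_cache[path] == stat_key


def status_for(files_cache, path, stat_key):
    # Hypothetical caller: it needs `known` to choose between 'A' and 'M'.
    known, unchanged = file_known_and_unchanged(files_cache, path, stat_key)
    if not known:
        return 'A'                       # A)dded: never seen before
    return 'U' if unchanged else 'M'     # M)odified: seen before, but changed
```
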
Thomas Waldmann
2493598eef files cache: improve exception handling, fixes #3553
now deals with:
- corrupted files cache (truncated or modified not by borg)
- inaccessible/unreadable files cache
- missing files cache

The fix for the last case alone is not sufficient: the cache transaction
processing would still stumble over files it expects in the cache but
that are missing.

(cherry picked from commit 423ec4ba1e)
2018-01-20 06:39:48 +01:00
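
The three failure cases listed in this commit (corrupted, unreadable, missing) could be handled along the following lines. This is a sketch, not borg's code: load_files_cache and the decode callable are hypothetical, and the assumption is that decoding raises ValueError on corrupted input.

```python
import logging

logger = logging.getLogger(__name__)


def load_files_cache(path, decode):
    """Return the decoded files cache, or an empty dict if it cannot be used.

    `decode` is a hypothetical callable that turns bytes into a dict and
    raises ValueError on corrupted input.
    """
    try:
        with open(path, 'rb') as f:
            return decode(f.read())
    except FileNotFoundError:
        logger.warning('files cache %s is missing, starting with an empty one', path)
    except OSError as exc:
        # Covers permission problems and other I/O errors.
        logger.warning('files cache %s is not readable (%s), ignoring it', path, exc)
    except ValueError as exc:
        logger.warning('files cache %s is corrupted (%s), ignoring it', path, exc)
    return {}
```

In every failure case the caller gets an empty cache and a warning, so a backup can proceed (more slowly) instead of aborting.
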
Thomas Waldmann
8f772437f2 also delete security dir when deleting a repo, fixes #3427 2017-12-16 22:59:47 +01:00
Aidan Woods
21a553b1ae Highlight that information is obtained from security dir
(deleting the cache will not bypass this error in the
event the user knows this is a legitimate repo).
2017-11-20 01:08:33 +00:00
Thomas Waldmann
0190abff81 cache: use SaveFile for more safety, fixes #3158
Looks like under unfortunate circumstances, these files could become
0 byte files (see #3158). SaveFile usage should prevent that.
2017-10-15 00:39:39 +02:00
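
For context, a helper of this kind commonly avoids 0 byte files by writing to a temporary file and renaming it over the target only once the data is on disk. The sketch below shows that general technique; it is not borg's SaveFile implementation, and save_atomically is a hypothetical name.

```python
import os
import tempfile


def save_atomically(path: str, data: bytes) -> None:
    """Write `data` to `path` so readers see either the old or the new content."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'wb') as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())        # make sure the bytes reached the disk
        os.replace(tmp_path, path)      # atomic replacement of the target
    except BaseException:
        # On any failure, remove the temporary file and leave the target untouched.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

If the process dies before os.replace runs, the previous file is still intact; a truncated target cannot result from an interrupted write.
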
Thomas Waldmann
5e2de8ba67 implement files cache mode control, fixes #911
You can now control the files cache mode using this option:

--files-cache={ctime,mtime,size,inode,rechunk,disabled}*

(only some combinations are supported)

Previously, only these modes were supported:
- mtime,size,inode (default of borg < 1.1.0rc4)
- mtime,size (by using --ignore-inode)
- disabled (by using --no-files-cache)

Now, you additionally get:
- ctime alternatively to mtime (more safe), e.g.:
  ctime,size,inode (this is the new default of borg >= 1.1.0rc4)
- rechunk (consider all files as changed, rechunk them)

Deprecated:
- --ignore-inode (use modes without "inode")
- --no-files-cache (use "disabled" mode)

The tests needed some changes:
- previously, we used os.utime() to set a file's mtime (atime) to specific
  values, but that does not work for ctime.
- now use time.sleep() to create the "latest file" that usually does
  not end up in the files cache (see FAQ)
2017-10-01 00:52:32 +02:00
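
A rough sketch of how a comma-separated --files-cache value could be validated. The mode names come from the commit message; the function name is hypothetical, and the one combination rule shown (ctime and mtime being mutually exclusive) is an assumption derived from "ctime alternatively to mtime". The message does not spell out which combinations are actually supported.

```python
KNOWN_MODES = frozenset({'ctime', 'mtime', 'size', 'inode', 'rechunk', 'disabled'})


def parse_files_cache(value: str) -> frozenset:
    """Parse e.g. 'ctime,size,inode' into a set of mode names (hypothetical helper)."""
    modes = frozenset(part.strip() for part in value.split(',') if part.strip())
    if not modes:
        raise ValueError('no files cache mode given')
    unknown = modes - KNOWN_MODES
    if unknown:
        raise ValueError('unknown files cache mode(s): ' + ', '.join(sorted(unknown)))
    # Assumed rule: ctime is an alternative to mtime, so only one of them may be given.
    if {'ctime', 'mtime'} <= modes:
        raise ValueError('ctime and mtime are alternatives, pick one')
    return modes
```
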
enkore
8dc79aea7a Merge pull request #2950 from enkore/f/mt-1.5b
cache: adjust AdHocCache API
2017-08-21 12:19:20 +02:00
Marian Beermann
c5a154985f cache: adjust AdHocCache API (test_create_no_cache_sync) 2017-08-20 21:23:55 +02:00
Thomas Waldmann
c8920fa2e6 ignore corrupt files cache, fixes #2939
ignore the files cache when corrupt and emit a warning message
so the user notices that there is a problem.

(cherry picked from commit 4eadb59c10)
2017-08-19 01:13:12 +02:00
Marian Beermann
2623e330a4 cache: write_archive_index: truncate_and_unlink on error 2017-07-24 10:45:57 +02:00
Thomas Waldmann
2edbcd7703 chunk_incref: compute "_size or size" only once 2017-07-23 13:53:48 +02:00
Thomas Waldmann
fc3498ac53 chunk_incref: use "size" for public api 2017-07-23 13:53:48 +02:00
Thomas Waldmann
663d3c544a chunk_incref size assertion: fail early 2017-07-23 13:53:48 +02:00
Thomas Waldmann
186123cb68 give known chunk size to chunk_incref, fixes #2853
chunk_incref was called when dealing with part files without giving the
known chunk size in the size_ parameter.

adjusted LocalCache.chunk_incref to have same signature.
2017-07-23 13:53:47 +02:00
Marian Beermann
55e1a54385 AdHocCache: avoid division by zero
0.01 ~ "one tick or less". ymmv.
2017-06-18 13:32:12 +02:00
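
The guard hinted at by "0.01 ~ one tick or less" can be sketched as follows: when the measured duration is zero, substitute a small positive value before dividing. The function name and the exact expression are assumptions; only the 0.01 fallback comes from the commit message.

```python
def per_second(count: float, elapsed_seconds: float) -> float:
    # A zero duration (timer resolution too coarse) falls back to 0.01 s,
    # which avoids ZeroDivisionError at the cost of an approximate rate.
    return count / (elapsed_seconds or 0.01)
```
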
Marian Beermann
4689fd0c22 cache: explain fetch_missing_csize cost 2017-06-18 02:04:31 +02:00
Marian Beermann
2cbff48fd3 AdHocCache: explicate chunk_incref assertion 2017-06-18 02:01:27 +02:00
Marian Beermann
5eeca3493b TestAdHocCache 2017-06-18 02:01:27 +02:00
Marian Beermann
3c8257432a cache sync: fetch_missing_csize don't check ids against empty idx
This is always the case if self.do_cache is False.
2017-06-18 02:01:27 +02:00
Marian Beermann
fc7c560345 AdHocCache: fix size not propagating to incref 2017-06-18 02:01:27 +02:00
Marian Beermann
8aa745ddbd create: --no-cache-sync 2017-06-18 02:01:26 +02:00
Marian Beermann
c9c227f2ca cache sync: check Operation.READ compatibility with manifest 2017-06-12 23:46:49 +02:00
textshell
86363dcd4b Merge pull request #2648 from textshell/feature/mandatory-features-master
Add minimal version of in repository mandatory feature flags. (master)
2017-06-10 17:50:28 +02:00
Martin Hostettler
b8ad8b84da Cache: Wipe cache if compatibility is not sure
Add detection of possibly incompatible combinations
of the borg versions maintaining the cache and the features used.
2017-06-10 11:42:48 +02:00
Marian Beermann
92a01f9d6c cache sync: fix incorrect .integrity location for .compact 2017-06-09 12:23:27 +02:00
Marian Beermann
3789459a41 cache sync: extract read_archive_index function 2017-06-09 12:23:26 +02:00
Marian Beermann
09a9d892cf cache sync: convert existing archive chunks idx to compact 2017-06-09 12:23:26 +02:00
Marian Beermann
6e011b9354 cache: compact hashindex before writing to chunks.archive.d 2017-06-09 12:23:26 +02:00
Marian Beermann
67b97f2223 cache sync: cleanup progress handling, unused parameters 2017-06-02 17:43:15 +02:00
Marian Beermann
835b0e5ee0 cache sync/remote: compressed, decrypted cache 2017-06-02 17:43:15 +02:00
Marian Beermann
c786a5941e CacheSynchronizer: redo as quasi FSM on top of unpack.h
This is a (relatively) simple state machine running in the
data callbacks invoked by the msgpack unpacking stack machine
(the same machine is used in msgpack-c and msgpack-python,
changes are minor and cosmetic, e.g. removal of msgpack_unpack_object,
removal of the C++ template thus porting to C and so on).

Compared to the previous solution this has multiple advantages:
- msgpack-c dependency is removed
- this approach is faster and requires fewer and smaller
  memory allocations

Testability of the two solutions does not differ in my
professional opinion(tm).

Two other changes were rolled up; _hashindex.c can be compiled
without Python.h again (handy for fuzzing and testing);
a "small" bug in the cache sync was fixed which allocated too
large archive indices, leading to excessive archive.chunks.d
disk usage (that actually gave me an idea).
2017-06-02 17:43:15 +02:00
Marian Beermann
167875b753 cache sync: fix n^2 behaviour in lookup_name 2017-06-02 17:43:14 +02:00
Marian Beermann
9f8b967a6f cache sync: initialize master index to known capacity 2017-06-02 17:43:14 +02:00
Marian Beermann
740898d83b CacheSynchronizer 2017-06-02 17:43:14 +02:00
Marian Beermann
9032aa062b testsuite: simplify ArchiverCorruptionTestCase 2017-05-31 18:08:20 +02:00
Marian Beermann
0a5d9b6f7c cache sync: close archive chunks file before renaming 2017-05-31 18:06:28 +02:00
Marian Beermann
d35d388d9c cache integrity: handle interference from old versions 2017-05-25 17:44:01 +02:00
Marian Beermann
50ac9d914d testsuite: add ArchiverCorruptionTestCase 2017-05-25 17:44:01 +02:00
Marian Beermann
f59affe585 cache: fix possible printf issue with archive names in sync 2017-05-25 17:44:01 +02:00
Marian Beermann
addd7addfe cache: chunks.archive.d: autofix corruption 2017-05-25 17:44:01 +02:00
Marian Beermann
1dfe693003 cache: integrity checking in archive.chunks.d 2017-05-25 16:28:46 +02:00
Marian Beermann
2b518b7188 cache: add integrity checking of chunks and files caches 2017-05-25 16:28:46 +02:00
Marian Beermann
c23e1e28c6 use assert_secure for all commands that use the manifest
This already excludes the debug commands that we wouldn't want this on.
2017-05-20 14:58:17 +02:00