Commit graph

24 commits

Author SHA1 Message Date
TW
adac324b6c Merge pull request #240 from ThomasWaldmann/cache-config-check
fix multiple issues with the cache config version check, fixes #234
2015-10-03 19:19:22 +02:00
Thomas Waldmann
893242ead4 fix multiple issues with the cache config version check, fixes #234
- issue #234: handle exception when config file is empty or is really not a borg cache config
- there was an unused %s in the Exception string
- error msg was wrong when version check failed - this IS a borg cache, but not of the expected version
2015-10-02 18:11:10 +02:00
Thomas Waldmann
8978515991 temporary hack to avoid using lots of disk space for chunks.archive.d
2015-10-02 16:56:31 +02:00
Thomas Waldmann
26bde96a3a Merge branch 'master' into faster-cache-sync
2015-09-10 23:12:55 +02:00
TW
70d97c4467 Merge pull request #180 from ThomasWaldmann/read-device
read special files as if they were regular files, update docs, closes #79
2015-09-06 21:38:31 +02:00
Thomas Waldmann
a912c02757 detect inconsistency / corruption / hash collision, closes #170
added a check that compares the size of the new chunk with the stored size of the
already existing chunk in storage that has the same id_hash value.
raise an exception if there is a size mismatch.

this could happen if:

- the stored size is somehow incorrect (corruption or software bug)
- we found a hash collision for the id_hash (for sha256, this is very unlikely)
2015-09-06 01:10:43 +02:00
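
The size check described in commit a912c02757 can be illustrated roughly as follows. This is only a minimal sketch, not the project's code: the plain dict used as chunk index and the names `add_chunk` and `IntegrityError` are assumptions made for illustration.

```python
class IntegrityError(Exception):
    """Hypothetical error type: stored size and new size disagree."""


def add_chunk(index, id_hash, data):
    """Record a chunk in `index` (a dict mapping id_hash -> size).

    Returns True if the chunk was new, False if it was already known.
    Raises IntegrityError if a known id_hash has a different stored size,
    which points to corruption, a software bug, or a hash collision.
    """
    size = len(data)
    if id_hash in index:
        stored_size = index[id_hash]
        if stored_size != size:
            raise IntegrityError(
                "size mismatch for chunk %s: stored %d, new %d"
                % (id_hash.hex(), stored_size, size)
            )
        return False
    index[id_hash] = size
    return True
```

Comparing sizes is cheap because the size is already stored next to the id, so the check adds no extra repository access in this sketch.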
Thomas Waldmann
0b1035746e read special files as if they were regular files, update docs, closes #79
do not use the files cache for such special files
2015-09-06 00:29:46 +02:00
Thomas Waldmann
54ccbc5ae2 chunks index resync: do all in one pass
if we do not have a cached archive index: fetch and build and merge it
if we have one: merge it
2015-08-30 15:15:15 +02:00
Thomas Waldmann
22dd925986 chunks index archive: remove all tar and compression related stuff and just use separate files in a directory
the compression was quite cpu intensive and didn't work that great anyway.
now the disk space usage is a bit higher, but it is much faster and less hard on the cpu.

disk space needs grow linearly with the amount and size of the archives, this
is a problem esp. if one has many and/or big archives (but this problem existed
before also because compression was not as effective as I believed).

the tar archive always needed a complete rebuild (and thus: decompression
and recompression) because deleting outdated archive indexes was not
possible in the tar file.

now we just have a directory chunks.archive.d and keep archive index files
there for all archives we already know.
if an archive does not exist any more in the repo, we just delete its index file.
if an archive is still unknown, we fetch the infos and build a new index file.

when merging, we avoid growing the hash table from zero, but just start
with the first archive's index as basis for merging.
2015-08-30 03:03:48 +02:00
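
The directory-based scheme from commit 22dd925986 might look roughly like the sketch below. It is a simplified illustration under stated assumptions: JSON files and plain dicts (chunk id -> reference count) stand in for the real index files, archive names are assumed to be safe file names, and `sync_chunk_index` and `fetch_index` are hypothetical names.

```python
import json
import os


def sync_chunk_index(index_dir, repo_archives, fetch_index):
    """Bring the per-archive index files in `index_dir` up to date, then merge.

    repo_archives: names of the archives currently in the repo.
    fetch_index:   callable(name) -> dict of chunk id -> refcount (assumed).
    """
    os.makedirs(index_dir, exist_ok=True)
    wanted = set(repo_archives)
    cached = set(os.listdir(index_dir))

    # Archive no longer exists in the repo: just delete its index file.
    for name in cached - wanted:
        os.remove(os.path.join(index_dir, name))

    # Archive still unknown: fetch its infos and build a new index file.
    for name in wanted - cached:
        with open(os.path.join(index_dir, name), "w") as f:
            json.dump(fetch_index(name), f)

    # Merge: start with the first archive's index instead of an empty table.
    merged = None
    for name in sorted(wanted):
        with open(os.path.join(index_dir, name)) as f:
            index = json.load(f)
        if merged is None:
            merged = index
        else:
            for chunk_id, count in index.items():
                merged[chunk_id] = merged.get(chunk_id, 0) + count
    return merged or {}
```

A second call with an unchanged archive list touches no index files and only performs the merge, which is the point of keeping one file per archive.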
Thomas Waldmann
f7210c749f remove cpu intensive compression methods for the chunks.archive
also remove the comment about how good xz compresses - while that was true for smaller index files,
it seems to be less effective with bigger ones. maybe just an issue with compression dict size.
2015-08-29 23:42:28 +02:00
Thomas Waldmann
69456e07c4 cache sync: change progress output to separate lines
printing without \n plus sys.stdout.flush() didn't work as expected.
2015-08-09 19:02:35 +02:00
Thomas Waldmann
35b0f38f5c cache sync: show progress indication
sync can take quite long, so show what we are doing.
2015-08-09 01:14:37 +02:00
Thomas Waldmann
a1e039ba21 reimplement the chunk index merging in C
the python code could take a rather long time and likely most of it was converting stuff from python to C and back.
2015-08-06 23:32:53 +02:00
Thomas Waldmann
e4a41c8981 fix Traceback when running check --repair, attic issue #232
This fix is maybe not perfect yet, but maybe better than nothing.

A comment by Ernest0x (see https://github.com/jborg/attic/issues/232 ):

@ThomasWaldmann your patch did the job.
attic check --repair did the repairing and attic delete deleted the archive.
Thanks.

That said, however, I am not sure if the best place to put the check is where
you put it in the patch. For example, the check operation uses a custom msgpack
unpacker class named "RobustUnpacker", which it does try to check for correct
format (see the comment: "Abort early if the data does not look like a
serialized dict"), but it seems it does not catch my case. The relevant code
in 'cache.py', on the other hand, uses msgpack's Unpacker class.
2015-07-15 13:32:05 +02:00
Thomas Waldmann
b2f460d591 fix filenames used for locking, update docs about locking
2015-07-13 23:20:46 +02:00
Thomas Waldmann
e4c519b1e9 new locking code
exclusive locking by atomic mkdir fs operation
on top of that, shared (read) locks and exclusive (write) locks using a json roster.
2015-07-13 13:55:28 +02:00
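
The "exclusive locking by atomic mkdir" idea from commit e4c519b1e9 can be sketched as below. This covers only the exclusive part; the shared/exclusive roster kept in JSON is not modeled. The class and exception names are assumptions for illustration, not necessarily those used in the locking module.

```python
import os


class AlreadyLocked(Exception):
    """Hypothetical error: somebody else holds the lock."""


class ExclusiveLock:
    """Exclusive lock based on the atomicity of os.mkdir().

    mkdir either creates the directory or fails because it already exists,
    so only one process can succeed in creating the lock directory.
    """

    def __init__(self, path):
        self.path = path

    def acquire(self):
        try:
            os.mkdir(self.path)
        except FileExistsError:
            raise AlreadyLocked(self.path) from None
        return self

    def release(self):
        os.rmdir(self.path)

    def __enter__(self):
        return self.acquire()

    def __exit__(self, *exc_info):
        self.release()
```

A caller would use it as `with ExclusiveLock(path): ...`. A real implementation would likely also need timeouts and handling of stale locks, which this sketch leaves out.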
Thomas Waldmann
434dac0e48 move locking code to own module, same for locking tests
fix imports, no other changes.
2015-07-12 23:41:52 +02:00
TW
4b81f380f8 Merge pull request #88 from ThomasWaldmann/py3style
style and cosmetic fixes, no semantic changes
2015-07-11 18:39:42 +02:00
Thomas Waldmann
0580f2b4eb style and cosmetic fixes, no semantic changes
use simpler super() syntax of python 3.x

remove fixed errors/warnings' codes from setup.cfg flake8 configuration

fix file exclusion list for flake8
2015-07-11 18:31:49 +02:00
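
For readers unfamiliar with the "simpler super() syntax of python 3.x" mentioned in commit 0580f2b4eb, here is a small illustration. The classes are made up for this example and do not come from the code base.

```python
class Base:
    def __init__(self, name):
        self.name = name


class OldStyle(Base):
    def __init__(self, name):
        # Python 2 compatible form: class and instance are spelled out.
        super(OldStyle, self).__init__(name)


class NewStyle(Base):
    def __init__(self, name):
        # Python 3 form: the arguments are filled in automatically.
        super().__init__(name)
```

Both forms behave the same here; the shorter one avoids repeating the class name, so it survives a class rename unchanged.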
Thomas Waldmann
a59211f295 use borg-tmp as prefix for temporary files / directories
also: remove some unused temp dir. code
2015-07-11 17:22:12 +02:00
Thomas Waldmann
a3f4d19515 speed up chunks cache sync, fixes #18
Re-synchronize chunks cache with repository.

If present, uses a compressed tar archive of known backup archive
indices, so it only needs to fetch infos from repo and build a chunk
index once per backup archive.

If out of sync, the tar gets rebuilt from known + fetched chunk infos,
so it has complete and current information about all backup archives.

Finally, it builds the master chunks index by merging all indices from
the tar.

Note: compression (esp. xz) is very effective in keeping the tar
relatively small compared to the files it contains.

Use python >= 3.3 to get better compression with xz,
there's a fallback to bz2 or gz when xz is not supported.
2015-05-31 19:17:01 +02:00
Thomas Waldmann
926454c0d8 explicitly specify binary mode to open binary files
on POSIX OSes, it doesn't make a difference, but it is cleaner and also good for portability.
2015-05-31 17:57:45 +02:00
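
As a small illustration of the point made in commit 926454c0d8, the sketch below writes and reads a byte string with explicit binary mode. The payload and the temporary file are made up for the example.

```python
import os
import tempfile

# Bytes that text mode could alter on some platforms (newline translation)
# or fail to decode at all (0xff is not valid UTF-8).
payload = b"\x00\r\n\xff"

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    # "wb" / "rb": no newline translation, no text decoding.
    with open(path, "wb") as f:
        f.write(payload)
    with open(path, "rb") as f:
        roundtrip = f.read()
finally:
    os.remove(path)
```

With binary mode the bytes read back are the bytes written, regardless of the platform's text-mode conventions.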
Thomas Waldmann
5e98400a5a fix all references to package name
use relative imports if possible
reorder imports (1. stdlib 2. dependencies 3. borg 4. borg.testsuite)
2015-05-22 19:21:41 +02:00
Thomas Waldmann
78bfc58b47 rename package directory to borg
2015-05-22 17:48:54 +02:00
Renamed from attic/cache.py