Commit graph

572 commits

Author SHA1 Message Date
TW
2dda41dc2a Merge pull request #294 from anarcat/x-option
add -x flag, common to GNU utilities
2015-10-17 20:06:02 +02:00
Antoine Beaupré
eacb0b9e83 Merge branch 'logging-refactor' into upstream 2015-10-17 12:29:52 -04:00
Antoine Beaupré
4fd06e2634 add -x flag, common to GNU utilities
it should also probably be --one-file-system for coherence with du(1)
2015-10-17 12:23:45 -04:00
Antoine Beaupré
34d0e0641c make sure hardlink copy doesn't break perms 2015-10-17 00:41:20 -04:00
Antoine Beaupré
4be9c29d0d os.link signature is the same as shutil.copy, use it directly 2015-10-17 00:38:14 -04:00
Antoine Beaupré
aaf72e3861 do not skip all attic tests, some work without now 2015-10-17 00:38:14 -04:00
Antoine Beaupré
6d457aed57 do not upgrade repositories in place by default
instead, we perform the equivalent of `cp -al` on the repository to
keep a backup, and then rewrite the files, breaking the hardlinks as
necessary.

it has to be confirmed that the rest of Borg will also break hardlinks
when operating on files in the repository. if Borg operates in place
on any files of the repository, it could jeopardize the backup, so
this needs to be verified. I believe that most files are written to a
temporary file and moved into place, however, so the backup should be
safe.

the rationale behind the backup copy is that we want to be extra
careful with users' data by default. the old behavior is retained
through the `--inplace`/`-i` commandline flag. plus, this way we don't
need to tell users to go through extra steps (`cp -a`, in particular)
before running the command.

also, it can take a long time to do the copy of the attic repository
we wish to work on. since `cp -a` doesn't provide progress
information, the new default behavior provides a nicer user experience
of giving an overall impression of the upgrade progress, while
retaining compatibility with Attic by default (in a separate
repository, of course).

this makes the upgrade command much less scary to use and hopefully
will convert drones to the borg collective.

the only place where the default inplace behavior is retained is in
the header_replace() function, to avoid breaking the cache conversion
code and to keep API stability and semantic coherence ("replace" by
default means in place).
2015-10-17 00:38:07 -04:00
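
The backup-then-rewrite idea described in this commit can be sketched roughly as follows. This is a minimal illustration, not the converter's actual code: the helper names `hardlink_copy` and `rewrite_breaking_hardlink` are hypothetical, and the sketch assumes a filesystem that supports hardlinks.

```python
import os
import shutil
import tempfile


def hardlink_copy(src, dst):
    """Recreate the tree under src at dst, hardlinking each file (like `cp -al`)."""
    for root, dirs, files in os.walk(src):
        target = os.path.normpath(os.path.join(dst, os.path.relpath(root, src)))
        os.makedirs(target, exist_ok=True)
        for name in files:
            # no data is copied: both names point at the same inode
            os.link(os.path.join(root, name), os.path.join(target, name))


def rewrite_breaking_hardlink(path, transform):
    """Write transform(old bytes) to a temporary file, then move it into place.

    The rename gives `path` a new inode, so a hardlinked backup keeps the
    old contents untouched.
    """
    with open(path, "rb") as f:
        data = transform(f.read())
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path))
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    shutil.copymode(path, tmp)  # keep the original permission bits
    os.replace(tmp, path)
```

Writing to a temporary file and renaming it is what breaks the link; opening `path` for writing in place would modify the backup as well, which is the risk the commit message asks to have verified.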
Thomas Waldmann
9b10e8a3f3 if borg.exe is not present, do not try to test it 2015-10-16 00:52:23 +02:00
Thomas Waldmann
1a248116db move cmd fixture to archiver test module 2015-10-16 00:18:46 +02:00
Thomas Waldmann
4b7c02775e benchmarks: test with both the binary and the python code
we use forking mode always and either execute python with the archiver module or the "borg.exe" binary.
the cmd fixture alternates between 'python' and 'binary' mode and calls exec_cmd accordingly.
2015-10-16 00:12:02 +02:00
Thomas Waldmann
7efab2f254 benchmark tests: improve comments 2015-10-14 01:00:25 +02:00
Thomas Waldmann
a5a6ba0d77 integrate pytest-benchmark, test create, extract, list, delete, info, check, help, fixes #146
Instead of "realistic data", I chose the test data to be either all-zero (all-ascii-zero to be precise)
or all-random and benchmark them separately.
So we can better determine the cause (deduplication or storage) in case we see some performance regression.

"help" is benchmarked to see the minimum runtime when it basically does nothing.

also:
- refactor archiver execution core functionality into exec_cmd() so it can be used more flexibly
- tox: usually we want to skip benchmarks, only run them if requested manually
- install pytest-benchmark - run tox with "-r" to have it installed into your .tox envs
2015-10-11 16:07:11 +02:00
Antoine Beaupré
dbc183f87f drop support for Python 2 2015-10-09 14:13:58 -04:00
Antoine Beaupré
bdbdbdde90 Merge remote-tracking branch 'origin/master' into logging-refactor
Conflicts:
	borg/archive.py
	borg/archiver.py
	borg/cache.py
	borg/key.py
2015-10-09 12:58:27 -04:00
Antoine Beaupré
fd41408de2 move progress information out of tests and back in converter
this was a remnant of when i was writing the converter/upgrader code,
and was destined to be a general progress message in the migration
process. i removed a more technical, internal debugging message in
exchange
2015-10-09 12:34:05 -04:00
Antoine Beaupré
efa88ef6c6 fix tests: they expect check to spew output
our default verbosity shows only warnings, we'd have to tweak tests to be verbose for this to work

This reverts commit 27be46a5ba.
2015-10-09 12:28:29 -04:00
Thomas Waldmann
60d3b24df4 tagged/signed 0.27.0
Merge tag '0.27.0' into multithreading

tagged/signed 0.27.0
2015-10-09 17:18:34 +02:00
Antoine Beaupré
1c61f87da3 remove debugging code and fix all have_cython calls 2015-10-08 17:20:52 -04:00
Antoine Beaupré
80e53fb66d it's a function, call it as such 2015-10-08 17:16:11 -04:00
Antoine Beaupré
a1dad8c9da try to mock msgpack altogether to fix RTD again
it seems that msgpack is a hard depends in archive...
2015-10-08 17:06:48 -04:00
Antoine Beaupré
a869ab0702 try to fix RTD build *again* *again* 2015-10-08 17:03:35 -04:00
Antoine Beaupré
e8ae96b54e try to fix RTD build *again* 2015-10-08 17:01:42 -04:00
Antoine Beaupré
f2c56fb890 try to fix build on RTD *again* 2015-10-08 16:57:36 -04:00
Antoine Beaupré
3f68399463 style: wrap multiline strings elegantly 2015-10-08 16:52:49 -04:00
Antoine Beaupré
a42a5e3e2d fix all logger pep8 warnings except long lines 2015-10-08 16:50:44 -04:00
Antoine Beaupré
8075a139b3 fix typo 2015-10-08 16:44:30 -04:00
Antoine Beaupré
b1eafe7833 remove status that is best reflected in VCS 2015-10-08 16:44:30 -04:00
Antoine Beaupré
2d4b735fed remove unintended changes 2015-10-08 16:38:53 -04:00
Antoine Beaupré
27be46a5ba tweak some levels 2015-10-08 16:31:22 -04:00
Antoine Beaupré
423ff45d81 rename cython detection function
"have" has clearer semantics than "detect"
2015-10-08 15:34:44 -04:00
Antoine Beaupré
ff483fe485 fix logical inversion in the semantics of detect_cython() 2015-10-08 15:31:46 -04:00
Antoine Beaupré
f98998f042 fix logical inversion in the semantics of detect_cython() 2015-10-08 15:26:40 -04:00
Antoine Beaupré
6f9e04bc21 generalise the cython check hack
instead of applying this only to usage generation, use it as a generic
mechanism to disable loading of Cython code.

it may be incomplete: there may be other places where Cython code is
loaded that are not checked, but that is sufficient to build the usage
docs. the environment variable used is documented as such in the
docs/usage.rst.

we also move the check to a helper function and document it
better. this has the unfortunate side effect of moving includes
around, but I can't think of a better way.
2015-10-08 08:56:02 -04:00
Antoine Beaupré
9cbc868764 make tests and build work again in usage using environment
this is such a crude hack it is totally embarrassing....

the proper solution would probably be to move the `build_parser()`
function out of `Archiver` completely, but this is such an undertaking
that i doubt it is worth doing since we're looking at switching to
click anyways.

the main problem in moving build_parser() out is that it references
`self` all the time, so it *needs* an archiver context that it can
reuse. we could make the function static and pass self in there by
hand, but it seems like almost a worse hack... and besides, we would
need to load the archiver in order to do that, which would break usage
all over again...
2015-10-07 22:26:59 -04:00
Antoine Beaupré
13d3568548 move usage generation to setup.py
this is an unfortunate rewrite of the manpage creation code mentioned
in #208. ideally, this would be rewritten into a class that can
generate both man pages and .rst files.
2015-10-07 21:07:15 -04:00
Thomas Waldmann
a4967ec582 ssh_cmd: fix wrong caller, fixes #255 2015-10-07 03:32:55 +02:00
Thomas Waldmann
8ddc448f41 make sure to always give segment and offset in repo IntegrityError exception messages
this was only handled correctly at one place, by adding the segment number afterwards.
now the segment number is always included.
2015-10-06 20:35:22 +02:00
Antoine Beaupré
d9e05946ac document logger module 2015-10-06 13:03:49 -04:00
Antoine Beaupré
42cc17caed use new logger object initialisation code 2015-10-06 12:57:27 -04:00
Antoine Beaupré
26561a7766 factor out logger object initialisation 2015-10-06 12:57:27 -04:00
Antoine Beaupré
2d0dae4e8b move logging setup to logger module 2015-10-06 12:57:26 -04:00
anarcat
7908c29180 Merge pull request #249 from anarcat/ssh-env
complete test coverage for SSH args parsing
2015-10-05 19:53:17 -04:00
Antoine Beaupré
a7b70d87cd complete test coverage for SSH args parsing 2015-10-05 19:22:33 -04:00
TW
745f9b89f8 Merge pull request #247 from anarcat/xdg
respect XDG_CACHE_HOME
2015-10-06 01:20:48 +02:00
TW
974dd58c23 Merge pull request #248 from anarcat/ssh-env
add support for arbitrary SSH commands
2015-10-06 01:20:23 +02:00
Antoine Beaupré
8f0de2cab7 fix tests on travis, which seem to set BORG_CACHE_DIR 2015-10-05 19:05:27 -04:00
Antoine Beaupré
a0ef4e25dd add support for arbitrary SSH commands (attic#99)
while SSH options can be specified through `~/.ssh/config`, some users
may want to use a completely different SSH command for their backups,
without overriding their $PATH variable. it may also be easier to do
ad-hoc configuration and tests that way.

plus, the POLA tells us that users expect something like this to be
supported by commands that talk to ssh. it is supported by rsync, git
and so on.
2015-10-05 18:54:00 -04:00
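
One way such an override could work is sketched below. The commit message does not name the environment variable; `BORG_RSH` is assumed here, and the function `ssh_command` is a hypothetical stand-in for the real argument-building code.

```python
import os
import shlex


def ssh_command(host, port=None, env=None):
    """Build the argv used to reach `host`, honouring a user-supplied command."""
    env = os.environ if env is None else env
    # BORG_RSH is an assumed variable name; it may hold a full command line
    args = shlex.split(env.get("BORG_RSH", "ssh"))
    if port:
        args += ["-p", str(port)]
    args.append(host)
    return args
```

Splitting with `shlex` lets the user pass options along with the command (for example an identity file) without touching `$PATH` or `~/.ssh/config`.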
Antoine Beaupré
43a65933f7 move ssh generation code to a stub, add unit test 2015-10-05 18:51:20 -04:00
Antoine Beaupré
de2a811606 move RemoteRepository defaults to the class
the reasoning behind this is that we may need to test a
RemoteRepository setup outside of the main archiver routines, which
the current default location makes impossible

by moving the umask and remote_path defaults into the RemoteRepository
the (reasonable) defaults are available regardless of the (currently
obscure) initialisation routine, and make unit tests easier to develop
and support
2015-10-05 18:45:57 -04:00
Antoine Beaupré
427ddd64a6 respect XDG_CACHE_HOME
fixes attic#181
2015-10-05 17:50:46 -04:00
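
A minimal sketch of the lookup order this implies, assuming that `BORG_CACHE_DIR` (mentioned in the travis fix above) takes precedence and that the cache lives in a `borg` subdirectory. The function name `get_cache_dir` is illustrative.

```python
import os


def get_cache_dir(env=None):
    """Return the cache directory, preferring explicit overrides."""
    env = os.environ if env is None else env
    # XDG base directory spec: fall back to ~/.cache when the variable is unset or empty
    xdg_cache = env.get("XDG_CACHE_HOME") or os.path.join(os.path.expanduser("~"), ".cache")
    # assumed precedence: an explicit BORG_CACHE_DIR wins over the XDG location
    return env.get("BORG_CACHE_DIR", os.path.join(xdg_cache, "borg"))
```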
Thomas Waldmann
c50f32426b do not crash on empty lock.roster, fixes #232 2015-10-05 23:23:59 +02:00
Thomas Waldmann
6f637bed2f LoggedIO: deduplicated code, improved checks and error handling in read()
Code shared by read() and iter_objects() was moved into _read().

Compared to read()'s previous state, this improved:
- fixed size check to avoid read with negative size
- exception handler for struct unpack
- checking for short read
- more precise exception messages
2015-10-05 02:27:24 +02:00
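
The checks listed in this commit might look roughly like the sketch below. The header layout (`crc32`, `size`, `tag`), the size limit, and the function name `read_entry` are assumptions for illustration; CRC verification is left out to keep the example short.

```python
import struct


class IntegrityError(Exception):
    """Raised when a segment entry cannot be read consistently."""


HEADER = struct.Struct("<IIB")       # assumed layout: crc32, size, tag
MAX_OBJECT_SIZE = 20 * 1024 * 1024   # assumed upper bound on an entry


def read_entry(fd, segment, offset):
    """Read one entry at `offset`, returning (tag, payload)."""
    where = "[segment %d, offset %d]" % (segment, offset)
    fd.seek(offset)
    header = fd.read(HEADER.size)
    try:
        crc, size, tag = HEADER.unpack(header)
    except struct.error as err:
        raise IntegrityError("invalid entry header %s: %s" % (where, err)) from None
    # size includes the header, so anything smaller would mean a negative read
    if size < HEADER.size or size > MAX_OBJECT_SIZE:
        raise IntegrityError("invalid entry size %d %s" % (size, where))
    length = size - HEADER.size
    data = fd.read(length)
    if len(data) != length:
        raise IntegrityError("short read %s: expected %d bytes, got %d"
                             % (where, length, len(data)))
    return tag, data
```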
Antoine Beaupré
97189dd25b add missing sys import, fixing build 2015-10-03 15:08:55 -04:00
Antoine Beaupré
e414203ce2 convert upgrade code to logger as well 2015-10-03 14:24:48 -04:00
Antoine Beaupré
24413136ee Merge remote-tracking branch 'origin/master' into logging-refactor
Conflicts:
	borg/archiver.py
2015-10-03 14:23:53 -04:00
Thomas Waldmann
51dc66d05f implement borg delete --cache-only repo, attic #123
it deletes just the local cache for the given repository, not the repo itself.
2015-10-03 19:29:45 +02:00
TW
adac324b6c Merge pull request #240 from ThomasWaldmann/cache-config-check
fix multiple issues with the cache config version check, fixes #234
2015-10-03 19:19:22 +02:00
TW
23bfe4d1bc Merge pull request #238 from ThomasWaldmann/index-archive-optional
temporary hack to avoid using lots of disk space for chunks.archive.d
2015-10-03 19:12:52 +02:00
Antoine Beaupré
5409cbaa67 also copy files cache verbatim
it seems the file cache does *not* have the ATTIC magic header (nor
does it have one in borg), so we don't need to edit the file - we just
copy it like a regular file.

while i'm here, simplify the cache conversion loop: it's no use
splitting the copy and the edition since the latter is so fast, just
do everything in one loop, which makes it much easier to read.
2015-10-03 12:56:03 -04:00
Antoine Beaupré
fded2219a8 mention borg delete borg
this makes it clear how to start from scratch, in case the chunk cache
failed to be copied and so on.
2015-10-03 12:46:23 -04:00
Antoine Beaupré
c91c5d0029 rename convert command to upgrade
convert is too generic for the Attic conversion: we may have other
converters, from other, more foreign systems that will require
different options and different upgrade mechanisms that convert could
never cover appropriately. we are more likely to use an approach
similar to "git fast-import" instead here, and have the conversion
tools be external tools that feed standard data into borg during
conversion.

upgrade seems like a more natural fit: Attic could be considered like
a pre-historic version of Borg that requires invasive changes for borg
to be able to use the repository. we may require such changes in the
future of borg as well: if we make backwards-incompatible changes to
the repository layout or data format, it is possible that we require
such changes to be performed on the repository before it is usable
again. instead of scattering those conversions all over the code, we
should simply have assertions that check the layout is correct and
point the user to upgrade if it is not.

upgrade should eventually automatically detect the repository format
or version and perform appropriate conversions. Attic is only the
first one. we still need to implement an adequate API for
auto-detection and upgrade, only the seeds of that are present for now.

of course, changes to the upgrade command should be thoroughly
documented in the release notes and an eventual upgrade manual.
2015-10-03 12:36:52 -04:00
Antoine Beaupré
48b7c8cea3 avoid checking for non-existent files
if there's no attic cache, it's no use checking for individual files

this also makes the code a little clearer

also added comments
2015-10-03 11:52:12 -04:00
Antoine Beaupré
690541264e style fixes (pep8, append, file builtin) 2015-10-03 11:49:01 -04:00
Antoine Beaupré
3773681f00 rewire cache copy mechanisms
we separate the conversion and the copy in order to be able to copy
arbitrary files from attic without converting them. this allows us to
copy the config file cleanly without attempting to rewrite its magic
number
2015-10-03 11:07:38 -04:00
Antoine Beaupré
2c66e7c233 make percentage a real percentage 2015-10-03 10:49:29 -04:00
Antoine Beaupré
f25888b27a restore box removed by mistake 2015-10-02 16:04:35 -04:00
Antoine Beaupré
35aaeef8bd remove spurious output 2015-10-02 16:00:42 -04:00
Antoine Beaupré
8193c414a9 add and use string representation of archive for stats 2015-10-02 15:56:21 -04:00
Antoine Beaupré
e5a0936a05 add formatters for Cache and Statistics objects
this greatly simplifies the display of those objects, as the
__format__() parameter allows for arbitrary display of the internal
fields of both objects

this will allow us to display those summaries without having to pass a
label to the string representation. we can also print the objects
directly without formatting at all.
2015-10-02 15:55:14 -04:00
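
As a rough illustration of the `__format__()` approach, here is a simplified stand-in for a statistics object. The field names and the default summary string are assumptions, not the real class.

```python
class Statistics:
    """Simplified stand-in; the real object tracks more fields."""

    def __init__(self, osize=0, csize=0, nfiles=0):
        self.osize = osize
        self.csize = csize
        self.nfiles = nfiles

    def __format__(self, format_spec):
        # an empty spec means "print the object directly"
        if not format_spec:
            return str(self)
        # otherwise the spec is a template that may reference any field
        return format_spec.format(stats=self)

    def __str__(self):
        return "{0.nfiles} files, {0.osize} bytes original, {0.csize} bytes compressed".format(self)
```

With this in place, `format(stats, "{stats.osize}/{stats.csize}")` picks arbitrary fields, while `print(stats)` needs no label or template at all.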
Thomas Waldmann
893242ead4 fix multiple issues with the cache config version check, fixes #234
- issue #234: handle exception when config file is empty or is not really a borg cache config
- there was an unused %s in the Exception string
- error msg was wrong when version check failed - this IS a borg cache, but not of expected version
2015-10-02 18:11:10 +02:00
Antoine Beaupré
b09cdb4a63 don't add module name on standard messages
too much clutter
2015-10-02 11:22:38 -04:00
Antoine Beaupré
2f78b8ad41 remove stray comment 2015-10-02 11:16:28 -04:00
Antoine Beaupré
1819adaf59 log level tweaking: stats should not be a warning
we should find other ways of forcing that to be shown, it seems
2015-10-02 11:16:21 -04:00
Antoine Beaupré
ca6c52610f restore some print statements
the heuristics i used are the following:

 1. if we are prompting the user, use print on stderr (input() may
    produce some stuff on stdout, but it's outside the scope of this
    patch). we do not want those prompts to end up on the standard
    output in case we are piping stuff around

 2. if the command is primarily producing output for the user on the
    console (`list`, `info`, `help`), we simply print on the default
    file descriptor.

 3. everywhere else, we use the logging module with varying levels of
    verbosity, as appropriate.
2015-10-02 11:13:01 -04:00
Antoine Beaupré
c9b11316ab use a module-specific logger instead of global one
that way we have one logger per module, and we can pick and choose
which module we want verbose, for example
2015-10-02 11:05:44 -04:00
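
The per-module pattern relies on the standard `logging` hierarchy. In the sketch below the dotted names are written out so the example is self-contained; inside a real module the usual idiom is `logging.getLogger(__name__)`. The helper `set_verbosity` is hypothetical.

```python
import logging

# one logger per module; in a module this would be logging.getLogger(__name__)
archive_logger = logging.getLogger("borg.archive")
repository_logger = logging.getLogger("borg.repository")


def set_verbosity(module_name, level):
    """Change the level of one module (and its children) only."""
    logging.getLogger(module_name).setLevel(level)


set_verbosity("borg", logging.WARNING)           # quiet default for the package
set_verbosity("borg.repository", logging.DEBUG)  # one module made verbose
```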
Thomas Waldmann
8978515991 temporary hack to avoid using lots of disk space for chunks.archive.d 2015-10-02 16:56:31 +02:00
Antoine Beaupré
ea5d00436c also document the cache locations 2015-10-02 10:12:13 -04:00
Antoine Beaupré
ad85f64842 whitespace 2015-10-02 10:10:50 -04:00
Antoine Beaupré
69040588cd update docs to reflect that cache is converted 2015-10-02 10:10:43 -04:00
Antoine Beaupré
d4d1b414b5 remove needless autouse 2015-10-02 09:44:53 -04:00
Antoine Beaupré
41e9942efe follow naming of tested module 2015-10-02 09:43:51 -04:00
Antoine Beaupré
081b91bea0 remove needless paren 2015-10-02 09:43:10 -04:00
Antoine Beaupré
3e7fa0d633 also copy the cache config file to workaround #234 2015-10-01 16:41:17 -04:00
Antoine Beaupré
8022e563a9 don't clobber existing borg cache 2015-10-01 16:27:19 -04:00
Antoine Beaupré
55f79b4999 complete cache conversion code
we need to create the borg cache directory

dry run was ignored, fixed.

process cache before segment, because we want to do the faster stuff first
2015-10-01 16:24:28 -04:00
Antoine Beaupré
28a033d1d3 remove debug output that clobbers segment spinner 2015-10-01 16:03:52 -04:00
Antoine Beaupré
4f9a411ad8 remove unneeded fixture decorator 2015-10-01 16:01:17 -04:00
Antoine Beaupré
022de5be47 untested file/chunks cache conversion
i couldn't figure out how to generate a cache set directly, Archiver is a pain...
2015-10-01 16:01:01 -04:00
Antoine Beaupré
7c32f555ac repository index conversion 2015-10-01 15:43:16 -04:00
Antoine Beaupré
a7902e5657 cosmetic: show 100% when done, not n-1/n% 2015-10-01 14:29:09 -04:00
Antoine Beaupré
35b219597f only write magic num if necessary
this could allow speeding up conversions resumed after interruption
2015-10-01 14:28:49 -04:00
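
A sketch of the "only write if necessary" check, as it stood before the later change to stop rewriting in place. The two magic values are assumed here, and `convert_magic` is a hypothetical name.

```python
ATTIC_MAGIC = b"ATTICSEG"  # assumed value of the old header
BORG_MAGIC = b"BORG_SEG"   # assumed value of the new header, same length


def convert_magic(path, dryrun=False):
    """Replace the magic string in place; return True if a change was needed."""
    with open(path, "r+b") as f:
        if f.read(len(ATTIC_MAGIC)) != ATTIC_MAGIC:
            # already converted (or not a segment): skip the write entirely
            return False
        if not dryrun:
            f.seek(0)
            f.write(BORG_MAGIC)
        return True
```

Because an already-converted file is detected by a short read, a conversion that was interrupted can be resumed without rewriting the files it had finished.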
Antoine Beaupré
180dfcb18f remove needless indentation 2015-10-01 14:23:43 -04:00
Antoine Beaupré
6a72252b69 release lock properly if segment conversion crashes 2015-10-01 14:22:29 -04:00
Antoine Beaupré
1b540d91a0 convert more print() statements to logging
we use logging.warning in info and list, but print() more usage bits.

we also now support logging.debug() and by default are more silent
2015-10-01 14:20:29 -04:00
Antoine Beaupré
09ffbb1d9d convert most print() calls to logging
the logging level varies: most is logging.info(), in some places
logging.warning() or logging.error() are used when the condition is
clearly an error or warning. in other cases, we keep using print, but
force writing to sys.stderr, unless we interact with the user.

there were 77 calls to print before this commit, now there are 7, most
of which in the archiver module, which interacts directly with the
user. in one case there, we still use print() only because logging is
not setup properly yet during argument parsing.

it could be argued that commands like info or list should use print
directly, but we have converted them anyways, without ill effects on
the unit tests

unit tests still use print() in some places

this switches all informational output to stderr, which should help
with, if not fix jborg/attic#312 directly
2015-10-01 13:41:45 -04:00
Antoine Beaupré
c996fd8366 just call print() once in the odd print_() function
this will help when we want to refactor output functions

this function should definitely be replaced by a __repr__() or
__str__() however
2015-10-01 12:39:13 -04:00
Antoine Beaupré
3bb3bd45fc add percentage progress 2015-10-01 12:36:53 -04:00
Antoine Beaupré
0d457bc846 clarify what to do about the cache warning 2015-10-01 11:25:12 -04:00
Antoine Beaupré
946aca97a1 avoid flooding the console
instead we add progress information
2015-10-01 11:25:02 -04:00
Antoine Beaupré
6c318a0f27 re-pep8 2015-10-01 11:12:23 -04:00
Antoine Beaupré
7f6fd1f306 add docs for all converter test code 2015-10-01 11:11:30 -04:00
Antoine Beaupré
a08bcb21ae refactor common code
we get rid of repo_open() which doesn't save much typing, and add a validator for keys
2015-10-01 11:10:56 -04:00
Antoine Beaupré
f5cb0f4e73 rewrite convert tests with pytest fixtures 2015-10-01 10:41:31 -04:00
Antoine Beaupré
98e4e6bc25 lock repository when converting segments 2015-10-01 09:35:17 -04:00
Antoine Beaupré
58815bc28a fix commandline dispatch for converter 2015-10-01 09:23:17 -04:00
Antoine Beaupré
79d9aebaf2 use permanently instead of irrevocably, which is less common 2015-10-01 09:00:49 -04:00
Antoine Beaupré
b9c474d187 pep8: put pytest skip marker after imports 2015-10-01 08:59:01 -04:00
Antoine Beaupré
4a85f2d0f5 fix most pep8 warnings
* limit all lines to 80 chars
* remove spaces around parameters
* missing blank lines
2015-10-01 08:58:01 -04:00
Antoine Beaupré
5f6eb87385 much nicer validation checking 2015-10-01 08:50:06 -04:00
Antoine Beaupré
d5198c551b split out depends in imports 2015-10-01 08:47:23 -04:00
Antoine Beaupré
d66516351f use builtin NotImplementedError instead of writing our own
NotImplemented didn't work with pytest.raises(), i didn't know about NotImplementedError, thanks tw
2015-10-01 08:46:30 -04:00
Antoine Beaupré
ef0ed409b6 fix typo 2015-10-01 08:44:17 -04:00
Antoine Beaupré
5b8cb63479 remove duplicate code with the unit test 2015-10-01 08:43:05 -04:00
Antoine Beaupré
dbd4ac7f8d add missing colon 2015-10-01 08:41:44 -04:00
Antoine Beaupré
c2913f5f10 style: don't use continue for nothing 2015-10-01 08:40:56 -04:00
Antoine Beaupré
efbad396f4 help text review: magic s/number/string/, s/can/must/ 2015-10-01 08:40:25 -04:00
Antoine Beaupré
a81755f1a9 use triple-double-quoted instead of single-double-quoted
at the request of TW, see #231
2015-10-01 08:34:18 -04:00
Antoine Beaupré
bcd94b96e0 split up keyfile, segments and overall testing in converter 2015-10-01 00:32:34 -04:00
Antoine Beaupré
1ba856d2b3 refactor: group test repo subroutine 2015-10-01 00:15:25 -04:00
Antoine Beaupré
1b29699403 cosmetic: reorder 2015-10-01 00:15:12 -04:00
Antoine Beaupré
a5f32b0a27 add convert command 2015-09-30 23:50:46 -04:00
Antoine Beaupré
f35e8e17f2 add dry run support to converter 2015-09-30 23:50:35 -04:00
Antoine Beaupré
e554365765 remove unused import 2015-09-30 23:28:07 -04:00
Antoine Beaupré
77ed6dec2b skip converter tests if attic isn't installed 2015-09-30 23:27:55 -04:00
Antoine Beaupré
c30df4e033 move converter code out of test suite 2015-09-30 23:18:03 -04:00
Antoine Beaupré
5a1680397c remove needless use of self 2015-09-30 23:02:21 -04:00
Antoine Beaupré
aa25a217a4 move conversion code to a separate class for clarity 2015-09-30 23:01:03 -04:00
Antoine Beaupré
312c3cf738 rewrite converter to avoid using attic code
the unit tests themselves still use attic to generate an attic
repository for testing, but the converter code should now be
standalone
2015-09-30 22:54:00 -04:00
Antoine Beaupré
c7af4c7f1d more debug 2015-09-30 22:43:08 -04:00
Antoine Beaupré
2d1988179e some debugging code 2015-09-30 22:41:38 -04:00
Antoine Beaupré
e88a994c8a reshuffle and document 2015-09-30 22:41:21 -04:00
Antoine Beaupré
9ab1e1961e keyfile conversion code 2015-09-30 22:37:39 -04:00
Antoine Beaupré
de54228809 first stab at an attic-borg converter
for now, just in the test suite, but will be migrated to a separate command
2015-09-30 21:08:47 -04:00
Thomas Waldmann
6aca4694fe fix segment entry header size check, attic issue #352
it only checked for too big sizes, but not for too small ones.
that made it die with a ValueError and not raise the appropriate IntegrityError
that gets handled in check() and triggers the repair attempt for the segment.
2015-09-30 16:10:50 +02:00
Thomas Waldmann
ab76176553 fix: patterns might be None 2015-09-19 18:38:44 +02:00
Thomas Waldmann
e0a08c5cae borg extract: warn if an include pattern never matched, fixes #209 2015-09-19 18:16:47 +02:00
Thomas Waldmann
15b003e344 add a string representation for Include/ExcludePattern
it just gives the original string that was used.
2015-09-19 18:03:53 +02:00
Thomas Waldmann
08417b52ec implement counters for Include/ExcludePatterns 2015-09-19 17:48:41 +02:00
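
The pattern commits above (match counters, a string representation, and the warning for unmatched include patterns) fit together roughly as in this sketch. The class is a simplified prefix matcher, and the attribute names and the `unmatched` helper are assumptions.

```python
class IncludePattern:
    """Simplified path-prefix pattern that counts its matches."""

    def __init__(self, pattern):
        self.pattern_orig = pattern          # kept for display
        self.pattern = pattern.rstrip("/")
        self.match_count = 0

    def match(self, path):
        matches = path == self.pattern or path.startswith(self.pattern + "/")
        if matches:
            self.match_count += 1
        return matches

    def __str__(self):
        # just the original string the user typed
        return self.pattern_orig


def unmatched(patterns):
    """Return the patterns that never matched, ready to be put in a warning."""
    return [str(p) for p in patterns if p.match_count == 0]
```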
Thomas Waldmann
aed6cc9446 be more clear about pruning, attic issue #132 2015-09-19 16:58:02 +02:00
Thomas Waldmann
cad0515178 archive names with slashes are invalid, attic issue #180
for borg mount's FUSE filesystem, we use the archive name as a directory name,
thus slashes are not allowed.
2015-09-19 16:09:20 +02:00
Thomas Waldmann
375717c095 tests: work around strange mtime granularity issue on netbsd, fixes #204
not sure where the problem is:
it seems to announce it supports st_mtime_ns, but if one uses it and
has a file with ...123ns, it gets restored as ...000ns.

Then I tried setting st_mtime_ns_round to -3, but it still failed with +1000ns difference.

Maybe rounding is incorrect and it should be truncating?
Issue with granularity could be in python, in netbsd (netbsd platform code), in ffs filesystem, ...
2015-09-18 00:02:44 +02:00
Thomas Waldmann
48634d4e96 tests: ignore st_rdev if file is not a block/char device, fixes #203 2015-09-17 22:41:49 +02:00
Thomas Waldmann
41860ef5f0 test setup: stay away from the setgid mode bit
for vagrant testing on misc. platforms, we can't know the group /
we can't have the same group everywhere.

but the OS won't let us set setgid bit if the file does not have our group.
on netbsd, the created file somehow happens to have group "wheel",
but vagrant is not in group wheel.
2015-09-15 23:52:17 +02:00
Thomas Waldmann
cf9ba87734 test setup: do not set the sticky bit on a regular file
sticky bit only has a function on directories.
openbsd does not let one set sticky on files.
other systems seem to just ignore it.
2015-09-15 00:41:32 +02:00
Thomas Waldmann
bc5949a7f4 chunker: add a check whether the POSIX_FADV_DONTNEED constant is defined
on openbsd, it isn't.
2015-09-14 17:36:04 +02:00
Thomas Waldmann
13ded3d5e7 xattr tests: ignore security.selinux attribute showing up 2015-09-14 01:26:20 +02:00
Thomas Waldmann
bc2cfdfc59 fix the other argparse import also 2015-09-13 01:01:48 +02:00
Thomas Waldmann
2b311846e0 add an argparse.py (from py 3.2.6) that is not broken
also: remove previous attempt to fix this, installing pypi argparse into virtualenv does not work.
2015-09-13 00:58:57 +02:00
Thomas Waldmann
7774d4f82c ext3 seems to need a bit more space for a sparse file
but it is still sparse, just needed some adjustment
2015-09-13 00:36:17 +02:00
Thomas Harold
03579ddb5a Obtaining 'char *' from temporary Python value
Old code causes a compile error on Mint 17.2
2015-09-12 17:21:49 -04:00
Thomas Waldmann
bc021d4ed7 do not test lzma level 9 compression
got a MemoryError in a vagrant VM, level 9 needs a lot of memory...
2015-09-12 19:16:45 +02:00
Thomas Waldmann
26bde96a3a Merge branch 'master' into faster-cache-sync 2015-09-10 23:12:55 +02:00
Thomas Waldmann
1eecb020e8 cython code: add some int types to get rid of unspecific python add / subtract operations
they somehow pull in some floating point error code that led to an undefined
symbol FPE_... when using the borgbackup wheel on some non-ubuntu/debian linux
platform.
2015-09-10 23:12:12 +02:00
Ed Blackman
13ddfdf4a3 Move pattern normalization decision into decorator
Using a decorator moves the duplicate code in the init methods into a
single decorator method, while still retaining the same runtime overhead
(zero for the non-OSX path, one extra function call plus the call to
unicodedata.normalize for OSX).  The pattern classes are visually much
cleaner, and duplicate code is limited to two lines normalizing the pattern
on OSX.

Because the decoration happens at class init time (vs instance init time
for the previous approach), the OSX and non-OSX test cases can no longer
be called in the same run, so I also removed the OSX test case monkey
patching and uncommented the platform skipif decorator.
2015-09-09 15:00:58 -04:00
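
A minimal sketch of a decorator that makes the platform decision once, when the class body is evaluated. The names `normalized` and `ExcludePattern` follow the commit's vocabulary but the bodies are illustrative only.

```python
import sys
import unicodedata


def normalized(func):
    """Wrap a pattern __init__ so the pattern is NFD-normalized on OS X only."""
    if sys.platform == "darwin":
        def wrapper(self, pattern):
            return func(self, unicodedata.normalize("NFD", pattern))
        return wrapper
    # elsewhere the original function is returned unchanged: zero overhead
    return func


class ExcludePattern:
    @normalized
    def __init__(self, pattern):
        self.pattern = pattern
```

Since the `if` runs at class definition time rather than per instance, the same process cannot exercise both branches, which matches the reason given for dropping the monkey-patched OSX test.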
Ed Blackman
cc13f3db97 Express non-ascii pattern platform skips better
including correcting thinko in the commented-out OSX-only test
2015-09-09 13:48:46 -04:00
Ed Blackman
d510ff7c63 Merge non-ascii Include and ExcludePattern tests
to parallel the OSX non-ascii tests
2015-09-09 13:41:34 -04:00
Ed Blackman
d9fb1d2b03 Normalize paths before pattern matching on OS X
The OS X file system HFS+ stores path names as Unicode, and converts
them to a variant of Unicode NFD for storage.  Because path names will
always be in this canonical form, it's not friendly to require users to
match this form exactly.  Convert paths from the repository and patterns
from the command line to NFD before comparing them.

Unix (and Windows, I think) file systems don't convert path names into a
canonical form, so users will continue to have to exactly match the path
name they want, because there could be two paths with the same character
visually that are actually composed of different byte sequences.
2015-09-08 23:33:34 -04:00
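
The mismatch this commit addresses can be seen with two spellings of the same visible name. The sketch uses standard NFD from `unicodedata`; HFS+ uses a variant of it, as the message notes, so this is an approximation.

```python
import unicodedata

composed = "caf\u00e9"     # "é" as a single code point
decomposed = "cafe\u0301"  # "e" followed by a combining acute accent


def nfd(text):
    """Normalize to canonical decomposed form before comparing."""
    return unicodedata.normalize("NFD", text)


same_raw = composed == decomposed                   # compared as typed
same_normalized = nfd(composed) == nfd(decomposed)  # compared after NFD
```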
Thomas Waldmann
322a87cbfd Merge branch 'master' into multithreading
Note: there is a failing archiver test on py33-only now.
It is somehow related to __del__ method usage in Cache
and/or locking code. Could not find out the exact reason
why it behaves like that.
2015-09-08 21:22:37 +02:00
Thomas Waldmann
1aacdda4a4 implement borg create --dry-run, attic issue #267
also: fix verbose mode display of stdin backup
2015-09-08 03:12:45 +02:00
Thomas Waldmann
13f20647dc use absolute path, attic issue #200, attic issue #137
the daemonize code changes the cwd, thus a relative repo path can't work.

borg mount repo mnt  # did not work
borg mount --foreground repo mnt  # did work
borg mount /abs/path/repo mnt  # did work
2015-09-06 23:26:47 +02:00
Thomas Waldmann
e244fe2f69 change 2 more chunker vars to off_t
so they get 64bit on 32bit platforms.
2015-09-06 22:06:52 +02:00
Thomas Waldmann
32e276c526 Merge branch 'chunker_small_fixes' of https://github.com/sourcejedi/borg into chunker_small_fixes 2015-09-06 22:03:42 +02:00
TW
947fc095d8 Merge pull request #183 from ThomasWaldmann/borg-repo-envvar
BORG_REPO env var support
2015-09-06 21:51:24 +02:00
TW
70d97c4467 Merge pull request #180 from ThomasWaldmann/read-device
read special files as if they were regular files, update docs, closes #79
2015-09-06 21:38:31 +02:00
TW
3ab068b834 Merge pull request #181 from ThomasWaldmann/hash-collision
detect inconsistency / corruption / hash collision, closes #170
2015-09-06 21:35:53 +02:00
Thomas Waldmann
f5069c4e81 fix reaction to "no" answer at delete repo prompt, fixes #182 2015-09-06 21:11:52 +02:00
Thomas Waldmann
817ce18bc6 fix repository arg default 2015-09-06 20:19:28 +02:00
Thomas Waldmann
b3f5231bac BORG_REPO env var support
sets the default repository to use, e.g. like:

export BORG_REPO=/mnt/backup/repo
borg init
borg create ::archive
borg list
borg mount :: /mnt
fusermount -u /mnt
borg delete ::archive
2015-09-06 18:18:24 +02:00
Thomas Waldmann
a912c02757 detect inconsistency / corruption / hash collision, closes #170
added a check that compares the size of the new chunk with the stored size of the
already existing chunk in storage that has the same id_hash value.
raise an exception if there is a size mismatch.

this could happen if:

- the stored size is somehow incorrect (corruption or software bug)
- we found a hash collision for the id_hash (for sha256, this is very unlikely)
2015-09-06 01:10:43 +02:00
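The size check described in a912c02757 can be sketched as follows. This is a toy in-memory index, assuming a mapping of chunk id to (refcount, size); the names `add_chunk` and `ChunkSizeMismatch` are hypothetical and not taken from Borg.

```python
import hashlib


class ChunkSizeMismatch(Exception):
    """A new chunk has the same id as a stored one, but a different size."""


def add_chunk(index, data):
    """Add data to a toy index keyed by SHA-256 id and return the id.

    index maps chunk id -> (refcount, size).
    """
    chunk_id = hashlib.sha256(data).digest()
    size = len(data)
    if chunk_id in index:
        refcount, stored_size = index[chunk_id]
        if stored_size != size:
            # Either the stored size is wrong (corruption or a software bug)
            # or two different chunks produced the same id (hash collision).
            raise ChunkSizeMismatch(
                "size mismatch: stored %d, new %d" % (stored_size, size))
        index[chunk_id] = (refcount + 1, stored_size)
    else:
        index[chunk_id] = (1, size)
    return chunk_id
```

A size comparison is cheap, but it can only catch collisions between chunks of different length.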
Thomas Waldmann
0b1035746e read special files as if they were regular files, update docs, closes #79
do not use the files cache for such special files
2015-09-06 00:29:46 +02:00
Thomas Waldmann
54ccbc5ae2 chunks index resync: do all in one pass
if we do not have a cached archive index: fetch and build and merge it
if we have one: merge it
2015-08-30 15:15:15 +02:00
Thomas Waldmann
22dd925986 chunks index archive: remove all tar and compression related stuff and just use separate files in a directory
the compression was quite cpu intensive and didn't work that great anyway.
now the disk space usage is a bit higher, but it is much faster and less hard on the cpu.

disk space needs grow linearly with the amount and size of the archives, this
is a problem esp. if one has many and/or big archives (but this problem existed
before also because compression was not as effective as I believed).

the tar archive always needed a complete rebuild (and thus: decompression
and recompression) because deleting outdated archive indexes was not
possible in the tar file.

now we just have a directory chunks.archive.d and keep archive index files
there for all archives we already know.
if an archive does not exist any more in the repo, we just delete its index file.
if an archive is still unknown, we fetch its info and build a new index file.

when merging, we avoid growing the hash table from zero, but just start
with the first archive's index as basis for merging.
2015-08-30 03:03:48 +02:00
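The bookkeeping described in 22dd925986 can be sketched in a few lines. Plain dicts stand in for the real hash index here, and both function names are hypothetical; the sketch only shows which index files become stale or missing and how a merge can start from the first archive's index.

```python
def plan_index_sync(cached_ids, repo_ids):
    """Return (stale, missing): index files to delete and index files to build."""
    cached_ids, repo_ids = set(cached_ids), set(repo_ids)
    return cached_ids - repo_ids, repo_ids - cached_ids


def merge_indexes(indexes):
    """Merge per-archive indexes that map chunk id -> (refcount, size)."""
    indexes = list(indexes)
    if not indexes:
        return {}
    merged = dict(indexes[0])  # start from the first index, not from empty
    for index in indexes[1:]:
        for chunk_id, (refcount, size) in index.items():
            if chunk_id in merged:
                merged[chunk_id] = (merged[chunk_id][0] + refcount, size)
            else:
                merged[chunk_id] = (refcount, size)
    return merged
```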
Thomas Waldmann
f7210c749f remove cpu intensive compression methods for the chunks.archive
also remove the comment about how good xz compresses - while that was true for smaller index files,
it seems to be less effective with bigger ones. maybe just an issue with compression dict size.
2015-08-29 23:42:28 +02:00
TW
17c4394896 Merge pull request #161 from RonnyPfannschmidt/setuptools-scm
replace versioneer with setuptools_scm
2015-08-29 16:46:41 +02:00
Thomas Waldmann
31e97d568b remove x bits from repository.py 2015-08-29 12:52:18 +02:00
Thomas Waldmann
d779057b79 fix issue with negative "all archives" size, fixes #165
This fixes an infrequent problem when (refcount * chunksize) overflowed an int32_t.
chunksize is always <= 8MiB and usually rather ~64KiB (with default chunker params).
Thus, this happened only for high refcounts and/or unusually big chunks.
2015-08-29 04:46:13 +02:00
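The overflow behind d779057b79 is easy to reproduce. The sketch below uses `ctypes` to emulate 32-bit and 64-bit signed storage; the function names are illustrative, and the wider type is only one possible remedy, since the commit message does not say how the fix was made.

```python
import ctypes


def total_size_int32(refcount, chunksize):
    """Product stored in a signed 32-bit integer (wraps around silently)."""
    return ctypes.c_int32(refcount * chunksize).value


def total_size_int64(refcount, chunksize):
    """Same product stored in a signed 64-bit integer."""
    return ctypes.c_int64(refcount * chunksize).value


# A 64KiB chunk referenced 40000 times exceeds 2**31 - 1 bytes.
wrapped = total_size_int32(40000, 64 * 1024)
correct = total_size_int64(40000, 64 * 1024)
```

Here `wrapped` is negative while `correct` is 2621440000, which matches the symptom of a negative "all archives" size.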
Thomas Waldmann
c823554b6b docs: usage: improved formatting, cosmetic changes 2015-08-29 04:00:22 +02:00
Thomas Waldmann
9ebc53ad77 restore_xattrs: ignore if setxattr fails with EACCES, fixes #162
e.g.:
- setting any security.* key is expected to fail with EACCES if one is not root.
- issue #162 on our issue tracker: user was root, but due to some specific scenario
  involving docker and selinux, setting security.selinux key fails even when running as root

not sure if it is the best solution to silently ignore this, but some lines below this change
failure to do a chown is also silently ignored (happens e.g. when restoring a file not owned
by the current user as a non-root user).
2015-08-29 00:11:04 +02:00
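A minimal sketch of the behaviour described in 9ebc53ad77: EACCES from setting an extended attribute is ignored, any other error still propagates. The `setxattr` callable is passed in as a parameter so the sketch stays platform independent; this signature is an assumption and not Borg's actual function.

```python
import errno


def restore_xattrs(path, xattrs, setxattr):
    """Apply xattrs through the given setxattr callable, skipping EACCES."""
    for name, value in xattrs.items():
        try:
            setxattr(path, name, value)
        except OSError as e:
            if e.errno != errno.EACCES:
                raise  # any other failure is still reported
```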
Thomas Waldmann
ea8f3bd7e7 restore_xattrs: minor cleanup / simplification
if we use {} as default for item.get(), we do not need the "if" as iteration over an empty dict won't do anything.
also fixes too deep indentation the original code had.
2015-08-28 23:22:26 +02:00
Ronny Pfannschmidt
8b6ca0d912 properly handle borg._version using setuptools_scm 2015-08-22 15:54:40 +02:00
Alan Jenkins
59a44296e4 chunker - cast from size_t to off_t can now be removed
Sorry, this should really have been part of the previous commit -
it's why I noticed a problem.
2015-08-20 17:48:59 +01:00
Thomas Waldmann
0a2bd8dad5 lock roster: catch file not found in remove() method and ignore it 2015-08-20 18:40:24 +02:00
Alan Jenkins
ce3e67cb96 chunker - fix 4GB files on 32-bit systems
From code inspection - effect not actually tested.
2015-08-20 17:23:50 +01:00
Alan Jenkins
7c6f3ece66 Initialize chunker fd to -1, so it's not equal to STDIN_FILENO (0) 2015-08-20 17:23:41 +01:00
Thomas Waldmann
d3d78f7ae3 call fadvise DONTNEED for the byterange we actually have read, fixes #158
avoid throwing away potential readahead data the OS might have read into the cache.
2015-08-20 05:33:51 +02:00
Thomas Waldmann
93a89d97fa ChunkerParams: fix parameter order
the parser for the --chunker-params argument had a wrong parameter order.
fixed the order so it conforms to the help text and the docs.
also added some tests for it and a text for the ValueError exception.
2015-08-17 11:50:47 +02:00
Thomas Waldmann
b180158876 generalize hashindex code for any key length
currently, we only use sha256 hashes as key, so key length is always 32.
but instead of hardcoding 32 everywhere, using key_length is just better
readable and also more flexible for the future.
2015-08-16 14:51:15 +02:00
Thomas Waldmann
03ee28b544 fix tests after merge - we must not care for file order 2015-08-15 21:50:13 +02:00
Thomas Waldmann
91d2cfa671 Merge branch 'master' into multithreading 2015-08-15 21:45:52 +02:00
Thomas Waldmann
608c0935e0 borg list --short, remove requirement for fakeroot, xfail a test
borg list --short just spills out the list of files / dirs - better for some tests
and also useful on the commandline for interactive use.

the tests previously needed fakeroot because in the test setup it always
made calls to mknod and chown, which require (fake)root.
now, the tests adapt to whether they detect (fake)root or not - to run the
tests completely, you still need fakeroot, but it won't fail all the archiver
tests just due to failing test setup.

also, a test not working correctly due to fakeroot was found:
it should detect whether a read-only repo is usable, but it failed to do that
because with (fake)root, there is no "read only" (at least not via taking away
 the w permission bits).
2015-08-15 20:52:14 +02:00
Thomas Waldmann
738ed5d91b 2 small archiver testsuite fixes
environment context manager: if an env var was not present before, it should not be present afterwards

teardown: cd out of the tmpdir before deleting it
2015-08-15 17:07:09 +02:00
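The environment fix in 738ed5d91b comes down to remembering which variables were absent and removing them again on exit. A sketch under that assumption (the name `environment_variable` is illustrative, not necessarily the test suite's helper):

```python
import os
from contextlib import contextmanager


@contextmanager
def environment_variable(**updates):
    """Temporarily set environment variables and restore the old state."""
    old = {name: os.environ.get(name) for name in updates}
    os.environ.update(updates)
    try:
        yield
    finally:
        for name, value in old.items():
            if value is None:
                # The variable did not exist before, so remove it again.
                os.environ.pop(name, None)
            else:
                os.environ[name] = value
```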
Thomas Waldmann
e5b647fbd1 minor lrucache test fix 2015-08-15 16:15:10 +02:00
Thomas Waldmann
986b70c189 Merge branch 'lrucache' of https://github.com/sourcejedi/borg 2015-08-15 16:06:09 +02:00
Thomas Waldmann
bf757738f7 Merge branch 'master' into compression 2015-08-14 23:24:04 +02:00
Thomas Waldmann
a6b6712d6a deprecate the numeric --compression argument, rename null compression to none, update CHANGES 2015-08-14 23:00:04 +02:00
Alan Jenkins
02b3fbb401 lrucache: change test case to py.test
I re-wrote lrucache (and it seems like no-one had looked at it much
before :).  I was told my test function would have been simpler in
native py.test, so let's have a go converting it all.

We can avoid any reference to unittest, because lrucache doesn't write
files so it doesn't need any of our custom assertion helpers.
2015-08-14 14:51:10 +01:00
Alan Jenkins
0ee78240ee lrucache: test added code
Tests saved my butt, so I'd better contribute :).

These tests have been tested - substituting a null dispose function
causes an immediate failure.
2015-08-14 12:03:23 +01:00
Alan Jenkins
5e0013c5db Merge branch 'master' into lrucache 2015-08-14 10:59:21 +01:00
Thomas Waldmann
3100fac361 fix archiver test to not expect backup of the UF_NODUMP file, try 2 2015-08-12 17:03:30 +02:00
Thomas Waldmann
0481424128 fix archiver test to not expect backup of the UF_NODUMP file 2015-08-12 16:41:30 +02:00
Thomas Waldmann
b512827b07 Merge branch 'honor_nodump' of https://github.com/jeffrizzo/attic 2015-08-12 15:57:54 +02:00
Thomas Waldmann
02ccf37766 Merge branch 'minor' of https://github.com/sourcejedi/attic 2015-08-12 15:16:44 +02:00
Thomas Waldmann
8300efb1db remote: pragma: no cover for the stuff we can't test 2015-08-12 04:28:31 +02:00
Thomas Waldmann
4d8949e66a archiver: more tests 2015-08-12 04:09:36 +02:00
Thomas Waldmann
b16dc03e36 tests for CompressionSpec 2015-08-12 02:27:41 +02:00
Thomas Waldmann
e06b0b3612 use C99's uintmax_t and %ju format
whatever size_t and off_t are, they should fit in there
2015-08-12 01:04:03 +02:00
Thomas Waldmann
8af3aa3397 merged master 2015-08-09 23:51:46 +02:00
Thomas Waldmann
69456e07c4 cache sync: change progress output to separate lines
printing without \n plus sys.stdout.flush() didn't work as expected.
2015-08-09 19:02:35 +02:00
Thomas Waldmann
197ca9c0d3 C merge code: cast to correct pointer type, silences warning 2015-08-09 16:19:53 +02:00
Thomas Waldmann
955ac9c44c get rid of testsuite.mock, directly import from mock
this was left over from times when we either used mock from stdlib
or pypi mock. but as we only use pypi mock now, the indirection is
not needed any more.
2015-08-09 14:26:54 +02:00
Thomas Waldmann
74e5860508 document that passphrase(-only) mode is deprecated 2015-08-09 13:47:36 +02:00
Thomas Waldmann
e74c87d5b5 update borg check help 2015-08-09 12:52:39 +02:00
Thomas Waldmann
80ee8b98af fix the repair mode
if one used --last (or, as of recently, gave an archive name), verify_chunks (old method name) was
not called because it requires all archives having been checked.

the problem was that the final manifest.write() and repository.commit() were also done in that method,
so all other repair work did not get committed in that case.

I moved these calls to a separate finish() method.
2015-08-09 12:43:57 +02:00
Thomas Waldmann
4f6c43baec document what borg check does, fixes #138 2015-08-09 01:15:05 +02:00
Thomas Waldmann
03f39c2663 borg check: give a named single archive to it, fixes #139 2015-08-09 01:14:53 +02:00
Thomas Waldmann
35b0f38f5c cache sync: show progress indication
sync can take quite long, so show what we are doing.
2015-08-09 01:14:37 +02:00
Thomas Waldmann
cce0d20dad test whether borg extract can process unusual filenames 2015-08-09 01:14:37 +02:00
Thomas Waldmann
616d16a9b0 add help string for --no-files-cache, fixes #140 2015-08-08 20:50:21 +02:00
Thomas Waldmann
40801d74a6 remove old unittest discover / runner code, we use py.test now 2015-08-08 19:03:37 +02:00
Thomas Waldmann
a1e039ba21 reimplement the chunk index merging in C
the python code could take a rather long time and likely most of it was converting stuff from python to C and back.
2015-08-06 23:32:53 +02:00
Thomas Waldmann
5b441f7801 some small Cython code improvements, thanks to Stefan Behnel 2015-08-04 13:30:35 +02:00
Thomas Waldmann
175a6d7b04 simplify umask code
in a similar way as the remote_path code was implemented:
just patch the RemoteRepository class object
2015-08-04 12:31:06 +02:00
Thomas Waldmann
71646249cb implement --remote-path to allow non-default-path borg locations 2015-08-04 09:53:26 +02:00
Thomas Waldmann
9f1d92c993 implement --umask M
affects local and remote umask, secure by default M == 077
2015-08-03 23:48:56 +02:00
Thomas Waldmann
4c0012bddf add lzma compression
needs python 3.3+, on 3.2 it won't be available.
2015-08-03 00:31:33 +02:00
Thomas Waldmann
8997766202 integrate compress code, new compression spec parser for commandline
New null and lz4 compression.
Giving -C 0 now uses null compression, not zlib level 0 any more
(null has almost zero overhead while zlib-level0 still had to package everything into zlib frames).
Giving -C 10 uses new lz4 compression, super fast compression and even faster decompression.
See borg create --help (and --compression argument).

fix some issues, clean up, optimize:
CNULL: always return bytes
LZ4: deal with getting memoryviews
Compressor: give bytes to detect(), avoid memoryviews
for lz4, always use same COMPR_BUFFER, avoid memory management costs.
check --chunker-params CHUNK_MAX_EXP upper limit
2015-08-02 18:10:30 +02:00
Thomas Waldmann
746984c33b compress: add tests, zlib and null compression, ID header and autodetection 2015-08-02 01:21:41 +02:00
Thomas Waldmann
27de1b0a43 add a wrapper around liblz4 2015-08-01 15:07:54 +02:00
Thomas Waldmann
3be55bedd3 chunker: n needs to be a signed size_t
... as it is also used for the read() return value, which can be negative in case of errors.
2015-07-30 15:21:13 +02:00
Thomas Waldmann
195545075a repo delete: add destroy to allowed rpc methods, fixes issue #114
also: add test, automate YES confirmation for testing
2015-07-26 17:38:16 +02:00
Thomas Waldmann
ed2548ca02 add a __main__.py so nuitka works 2015-07-20 16:16:32 +02:00
Thomas Waldmann
e4a41c8981 fix Traceback when running check --repair, attic issue #232
This fix is maybe not perfect yet, but maybe better than nothing.

A comment by Ernest0x (see https://github.com/jborg/attic/issues/232 ):

@ThomasWaldmann your patch did the job.
attic check --repair did the repairing and attic delete deleted the archive.
Thanks.

That said, however, I am not sure if the best place to put the check is where
you put it in the patch. For example, the check operation uses a custom msgpack
unpacker class named "RobustUnpacker", which it does try to check for correct
format (see the comment: "Abort early if the data does not look like a
serialized dict"), but it seems it does not catch my case. The relevant code
in 'cache.py', on the other hand, uses msgpack's Unpacker class.
2015-07-15 13:32:05 +02:00
Thomas Waldmann
9b9c808713 fixed some minor issues found by pycharm/pytest-flakes 2015-07-15 11:30:25 +02:00
Thomas Waldmann
cc88d174af fix typos 2015-07-15 11:14:53 +02:00
Thomas Waldmann
b644565546 repo key mode (and deprecate passphrase mode), fixes #85
see usage.rst change for a description and why this is needed
2015-07-15 00:01:07 +02:00
Thomas Waldmann
b2f460d591 fix filenames used for locking, update docs about locking 2015-07-13 23:20:46 +02:00
Thomas Waldmann
2deb520e67 locking code: extract timeout/sleep code into reusable TimeoutTimer class 2015-07-13 16:45:18 +02:00
Thomas Waldmann
e4c519b1e9 new locking code
exclusive locking by atomic mkdir fs operation
on top of that, shared (read) locks and exclusive (write) locks using a json roster.
2015-07-13 13:55:28 +02:00
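The exclusive part of the locking scheme in e4c519b1e9 relies on `mkdir` being atomic: only one process can create the directory. The sketch below shows just that part, with a simple timeout loop; the class and exception names, the default timings and the lock path are assumptions, and the JSON roster for shared locks is not modelled.

```python
import os
import tempfile
import time


class LockTimeout(Exception):
    """The lock directory could not be created within the timeout."""


class ExclusiveLock:
    def __init__(self, path, timeout=1.0, sleep=0.05):
        self.path, self.timeout, self.sleep = path, timeout, sleep

    def acquire(self):
        deadline = time.monotonic() + self.timeout
        while True:
            try:
                os.mkdir(self.path)  # atomic: succeeds for exactly one caller
                return self
            except FileExistsError:
                if time.monotonic() >= deadline:
                    raise LockTimeout(self.path)
                time.sleep(self.sleep)

    def release(self):
        os.rmdir(self.path)

    __enter__ = acquire

    def __exit__(self, *exc_info):
        self.release()


# Usage: a second acquire on the same path times out while the lock is held.
with tempfile.TemporaryDirectory() as tmp:
    lock_path = os.path.join(tmp, "lock.exclusive")
    with ExclusiveLock(lock_path):
        try:
            ExclusiveLock(lock_path, timeout=0.1, sleep=0.01).acquire()
            second_acquired = True
        except LockTimeout:
            second_acquired = False
    released = not os.path.exists(lock_path)
```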
Thomas Waldmann
434dac0e48 move locking code to own module, same for locking tests
fix imports, no other changes.
2015-07-12 23:41:52 +02:00
Thomas Waldmann
d8e9a9bf96 skip test_crash_before_compact test for RemoteRepository
it was silently failing until recently. and it can't work the way it is on RemoteRepository.
it's still active (and now even really working) for the (local) Repository tests.
2015-07-12 23:29:34 +02:00
Thomas Waldmann
414dba3de7 remove usage of evil / broken unittest.mock, use mock from pypi
see testsuite.mock docstring for more details.

one test shows brokenness right now that was hidden / silent until now.
2015-07-12 23:08:44 +02:00
Thomas Waldmann
bd354d7bb4 create a RepositoryCache implementation that can cope with any amount of data, fixes attic #326
the old code blows up with an integer OverflowError when the cache file goes beyond 2GiB size.
the new code just reuses the Repository implementation as a local temporary key/value store.

still an issue: if the place where the temporary RepositoryCache is stored (usually /tmp) can't
cope with the cache size and runs full.

if you copy data from a fuse mount, the cache size is the copied deduplicated data size.
so, if you have lots of data to extract (more than your /tmp can hold), rather do not use fuse!

besides fuse mounts, this also affects attic check and cache sync (in these cases, only the
metadata size counts, but even that can go beyond 2GiB for some people).
2015-07-12 00:18:49 +02:00
TW
4b81f380f8 Merge pull request #88 from ThomasWaldmann/py3style
style and cosmetic fixes, no semantic changes
2015-07-11 18:39:42 +02:00
Thomas Waldmann
0580f2b4eb style and cosmetic fixes, no semantic changes
use simpler super() syntax of python 3.x

remove fixed errors/warnings' codes from setup.cfg flake8 configuration

fix file exclusion list for flake8
2015-07-11 18:31:49 +02:00
Thomas Waldmann
a59211f295 use borg-tmp as prefix for temporary files / directories
also: remove some unused temp dir. code
2015-07-11 17:22:12 +02:00
Thomas Waldmann
4068fc1e31 clarify help text, fixes #73 2015-06-28 14:02:38 +02:00
Thomas Waldmann
bc2f2fc7d2 chunker: release the gil for long-running C sections and I/O
also: add some benchmarking output showing singlethread, multithread and
multithread-with-gil-releasing-chunker performance.

this changeset maybe improves multithreading performance a little, about 3%
(but that might be close to the measurement accuracy).
2015-06-28 13:57:30 +02:00
Thomas Waldmann
4736f5b9d0 Merge branch 'master' into multithreading 2015-06-27 22:24:51 +02:00
TW
562f3c7c33 Merge pull request #72 from ThomasWaldmann/loggedio-exceptions
Loggedio exceptions
2015-06-27 22:17:02 +02:00
Thomas Waldmann
08688fbc13 Merge branch 'master' into loggedio-exceptions
Conflicts:
	borg/repository.py
2015-06-27 22:02:26 +02:00
TW
3303619b5f Merge pull request #69 from ThomasWaldmann/fix-prune-options
the short prune options without "keep-" are deprecated, so do not sug…
2015-06-26 01:19:29 +02:00
Thomas Waldmann
b92dd1bab2 the short prune options without "keep-" are deprecated, so do not suggest them 2015-06-26 00:04:35 +02:00
Thomas Waldmann
89db9b8b9e improve at-end error logging
always use archiver.print_error, so it goes to sys.stderr

always say "Error: ..." for errors

for rc != 0 always say "Exiting with failure status ..."

catch all exceptions subclassing Exception, so we can log them in same way and set exit_code=1
2015-06-25 23:57:38 +02:00
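The rules listed in 89db9b8b9e can be sketched as a small wrapper. The function name `run_and_report` and the exact message formats are assumptions; only the "Error: ..." prefix, the "Exiting with failure status ..." line, the use of stderr and exit code 1 for exceptions come from the commit message.

```python
import io
import sys


def run_and_report(command, stream=None):
    """Run command() and report failures in a uniform way; return the exit code."""
    if stream is None:
        stream = sys.stderr
    exit_code = 0
    try:
        exit_code = command() or 0
    except Exception as e:  # everything subclassing Exception is logged alike
        print("Error: %s" % e, file=stream)
        exit_code = 1
    if exit_code != 0:
        print("Exiting with failure status %d" % exit_code, file=stream)
    return exit_code


# Usage: capture the output of a failing command in a buffer.
def failing_command():
    raise ValueError("repository not found")


log = io.StringIO()
rc = run_and_report(failing_command, stream=log)
```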
Thomas Waldmann
6964799d13 borg create --compression 0..9 for variable compression 2015-06-25 22:16:23 +02:00
Thomas Waldmann
54e8dd8419 misc chunker parameter changes
- use power-of-2 sizes / n bit hash mask so one can give them more easily
- chunker api: give seed first, so we can give *chunker_params after it
- fix some tests that aren't possible with 2^N
- make sparse file extraction zero detection flexible for variable chunk max size
2015-06-21 01:46:41 +02:00
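With power-of-2 parameters as in 54e8dd8419, sizes and the hash mask follow directly from small exponents. The sketch below assumes a function name and parameter names of its own; the exponents 23 and 16 match the 8MiB maximum and the roughly 64KiB typical chunk size mentioned in d779057b79, while the minimum exponent 10 is purely illustrative.

```python
def chunker_sizes(chunk_min_exp, chunk_max_exp, hash_mask_bits):
    """Translate exponents into (min size, max size, hash mask)."""
    chunk_min = 1 << chunk_min_exp
    chunk_max = 1 << chunk_max_exp
    # An n-bit mask triggers a cut on average roughly every 2**n bytes.
    hash_mask = (1 << hash_mask_bits) - 1
    return chunk_min, chunk_max, hash_mask


chunk_min, chunk_max, hash_mask = chunker_sizes(10, 23, 16)
```

Giving exponents instead of byte counts keeps the values short and guarantees they are powers of two.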
Thomas Waldmann
3b9b976f2a borg create --chunker-params=... 2015-06-20 01:20:46 +02:00
Thomas Waldmann
6d0a00496a determine and report chunk counts in chunks index
borg info repo::archive now reports unique chunks count, total chunks count

also: use index->key_size instead of hardcoded value
2015-06-19 23:53:23 +02:00
Thomas Waldmann
2743ab1593 better Exception msg if there is no Borg installed on the remote repository server
(still a bit ugly to get even 2 tracebacks)
2015-06-18 23:18:05 +02:00
Thomas Waldmann
dd78e1a56e improve docs, usage help, changelog 2015-06-11 22:18:12 +02:00
Thomas Waldmann
614261604e don't hardcode MAGIC length 2015-06-02 02:41:23 +02:00
Thomas Waldmann
3dce75306a LoggedIO: better error checks / exceptions / exception handling
It doesn't just say "error reading segment X", but also what went wrong and at what offset.
2015-06-02 02:30:07 +02:00
Thomas Waldmann
646cdca312 "extract" micro optimization: first check for regular files, then for directories, check for fifos late
regular files are most common, more than directories. fifos are rare.

was no big issue, the calls are cheap, but also no big issue to just fix the order.
2015-05-31 21:55:15 +02:00
Thomas Waldmann
ed1e5e9c13 "create" micro optimization: do not check for sockets early
they are rare, so it's pointless to check for them first.

seen stat.S_ISSOCK in profiling results with a high call count.
was no big issue, that call is cheap, but also no big issue to just fix the order.
2015-05-31 21:54:51 +02:00
Thomas Waldmann
a3f4d19515 speed up chunks cache sync, fixes #18
Re-synchronize chunks cache with repository.

If present, uses a compressed tar archive of known backup archive
indices, so it only needs to fetch infos from repo and build a chunk
index once per backup archive.

If out of sync, the tar gets rebuilt from known + fetched chunk infos,
so it has complete and current information about all backup archives.

Finally, it builds the master chunks index by merging all indices from
the tar.

Note: compression (esp. xz) is very effective in keeping the tar
relatively small compared to the files it contains.

Use python >= 3.3 to get better compression with xz,
there's a fallback to bz2 or gz when xz is not supported.
2015-05-31 19:17:01 +02:00
Thomas Waldmann
072326fef0 chunker: get rid of read_buf
if we have an OS file handle, we can directly read to the final destination - one memcpy less.
if we have a Python file object, we get a Python bytes object as read result (can't save the memcpy here).
2015-05-31 18:41:23 +02:00
Thomas Waldmann
926454c0d8 explicitly specify binary mode to open binary files
on POSIX OSes, it doesn't make a difference, but it is cleaner and also good for portability.
2015-05-31 17:57:45 +02:00
Thomas Waldmann
776bb9fabc hashindex: improve error messages 2015-05-31 17:48:19 +02:00
Thomas Waldmann
91e10fec5f Merge branch 'master' of github.com:jborg/attic 2015-05-31 17:37:02 +02:00
Thomas Waldmann
d067bc3178 efficient archive list from manifest
a lot of speedup for:
"list <repo>", "delete <repo>" list, "prune" - esp. for slow connections to remote repositories.

the previous method used metadata from the archive itself, which is (in total) rather large.
so if you had many archives and a slow (remote) connection, it was very slow.

but there is a much easier way: just use the archives list from the repository manifest - we already
have it anyway and it also has name, id and timestamp for all archives - and that's all we need.

I defined an ArchiveInfo namedtuple that has the same element names as the attribute names
of the Archive object, so as long as name, id, ts is enough, it can be used in its place.
2015-05-26 02:04:41 +02:00
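The approach in d067bc3178 can be sketched with a namedtuple built straight from the manifest's archive list. The element names `name`, `id` and `ts` come from the commit message; the layout of `manifest_archives` (a dict with "id" and "time" entries) and the function name are assumptions made for illustration.

```python
from collections import namedtuple

ArchiveInfo = namedtuple("ArchiveInfo", "name id ts")


def list_archive_infos(manifest_archives):
    """Build lightweight ArchiveInfo tuples without loading any archive metadata."""
    infos = [
        ArchiveInfo(name, entry["id"], entry["time"])
        for name, entry in manifest_archives.items()
    ]
    return sorted(infos, key=lambda info: info.ts)
```

Code that only reads `.name`, `.id` or `.ts` can accept either such a tuple or a full archive object.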
Thomas Waldmann
240a27a227 multithreaded "create" operation
Making much better use of the CPU by dispatching all CPU intensive stuff
(hashing, crypto, compression) to N crypter threads (N == logical cpu count ==
4 for a dual-core CPU with hyperthreading).

I/O intensive stuff also runs in separate threads: the MainThread does the
filesystem traversal, the reader thread reads and chunks the files, the writer
thread writes to the repo. This way, we don't need to sit idle waiting for I/O,
but the I/O thread will block and another thread will get dispatched and use
the time. This applies for read as well as for write/fsync I/O wait time
(access time + data transfer).

There's one more thread, the "delayer". We need it to handle a race condition
related to the computation of the compressed size (which is only possible after
hashing/compression/encryption has finished). This "csize" makes all this code
considerably more complicated than it would be if we did not need it.

Although there is the GIL issue for Python code, we can still make good use of
multithreading as I/O operations and C code (that releases the GIL) can run in
parallel.

All threads are connected via Python Queues (which are intended for this and
thread safe). The Cache.chunks datastructure is also updated by threadsafe
code.

A little benchmark
------------------

Both runs use compression (zlib level 6) and encryption on a haswell/ssd laptop:

Without multithreading code:

    Command being timed: "borg create /extra/attic/borg::1 /home/tw/Desktop/"
    User time (seconds): 13.78
    System time (seconds): 0.40
    Percent of CPU this job got: 83%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:16.98

With multithreading code:

    Command being timed: "borg create /extra/attic/borg::1 /home/tw/Desktop/"
    User time (seconds): 24.08
    System time (seconds): 1.16
    Percent of CPU this job got: 249%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:10.11

It's unclear to me why it uses much more "User time" (I'm not even sure that
measurement is correct). But the overall runtime "Elapsed" significantly
dropped and it makes better use of all cpu cores (not just 83% of one).
2015-05-25 22:37:15 +02:00
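The thread layout described in 240a27a227 can be illustrated with a much smaller pipeline: worker threads hash and compress chunks, a writer thread stores the results, and everything is connected via queues. This is a sketch under simplifying assumptions (no encryption, no reader or "delayer" thread, a dict instead of a repository); `run_pipeline` is a hypothetical name.

```python
import hashlib
import queue
import threading
import zlib


def run_pipeline(chunks, n_workers=4):
    """Hash and compress chunks in worker threads; store results in one writer thread."""
    work_q, out_q = queue.Queue(), queue.Queue()
    stored = {}

    def worker():
        while True:
            chunk = work_q.get()
            if chunk is None:  # sentinel: no more work
                break
            # hashlib and zlib are C code and can run in parallel across threads.
            out_q.put((hashlib.sha256(chunk).hexdigest(), zlib.compress(chunk, 6)))

    def writer():
        while True:
            item = out_q.get()
            if item is None:
                break
            chunk_id, data = item
            stored[chunk_id] = data  # only this thread touches the store

    workers = [threading.Thread(target=worker) for _ in range(n_workers)]
    writer_thread = threading.Thread(target=writer)
    for t in workers + [writer_thread]:
        t.start()
    for chunk in chunks:
        work_q.put(chunk)
    for _ in workers:
        work_q.put(None)
    for t in workers:
        t.join()
    out_q.put(None)  # all workers are done, so the writer can finish
    writer_thread.join()
    return stored
```

Results arrive at the writer in no particular order, which is one reason the real code needs extra care for values such as the compressed size.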
Thomas Waldmann
5e98400a5a fix all references to package name
use relative imports if possible
reorder imports (1. stdlib 2. dependencies 3. borg 4. borg.testsuite)
2015-05-22 19:21:41 +02:00
Thomas Waldmann
78bfc58b47 rename package directory to borg 2015-05-22 17:48:54 +02:00