borgbackup

mirror of https://github.com/borgbackup/borg.git synced 2026-06-08 16:23:42 -04:00

Author	SHA1	Message	Date
Thomas Waldmann	b512827b07	Merge branch 'honor_nodump' of https://github.com/jeffrizzo/attic	2015-08-12 15:57:54 +02:00
Thomas Waldmann	02ccf37766	Merge branch 'minor' of https://github.com/sourcejedi/attic	2015-08-12 15:16:44 +02:00
Thomas Waldmann	8300efb1db	remote: pragma: no cover for the stuff we can't test	2015-08-12 04:28:31 +02:00
Thomas Waldmann	4d8949e66a	archiver: more tests	2015-08-12 04:09:36 +02:00
Thomas Waldmann	b16dc03e36	tests for CompressionSpec	2015-08-12 02:27:41 +02:00
Thomas Waldmann	e06b0b3612	use C99's uintmax_t and %ju format whatever size_t and off_t is, should even fit in there	2015-08-12 01:04:03 +02:00
Thomas Waldmann	8af3aa3397	merged master	2015-08-09 23:51:46 +02:00
Thomas Waldmann	69456e07c4	cache sync: change progress output to separate lines printing without \n plus sys.stdout.flush() didn't work as expected.	2015-08-09 19:02:35 +02:00
Thomas Waldmann	197ca9c0d3	C merge code: cast to correct pointer type, silences warning	2015-08-09 16:19:53 +02:00
Thomas Waldmann	955ac9c44c	get rid of testsuite.mock, directly import from mock this was left over from times when we either used mock from stdlib or pypi mock. but as we only use pypi mock now, the indirection is not needed any more.	2015-08-09 14:26:54 +02:00
Thomas Waldmann	74e5860508	document that passphrase(-only) mode is deprecated	2015-08-09 13:47:36 +02:00
Thomas Waldmann	e74c87d5b5	update borg check help	2015-08-09 12:52:39 +02:00
Thomas Waldmann	80ee8b98af	fix the repair mode if one used --last (or since shortly: gave an archive name), verify_chunks (old method name) was not called because it requires all archives having been checked. the problem was that also the final manifest.write() and repository.commit() was done in that method, so all other repair work did not get committed in that case. I moved these calls that to a separate finish() method.	2015-08-09 12:43:57 +02:00
Thomas Waldmann	4f6c43baec	document what borg check does, fixes #138	2015-08-09 01:15:05 +02:00
Thomas Waldmann	03f39c2663	borg check: give a named single archive to it, fixes #139	2015-08-09 01:14:53 +02:00
Thomas Waldmann	35b0f38f5c	cache sync: show progress indication sync can take quite long, so show what we are doing.	2015-08-09 01:14:37 +02:00
Thomas Waldmann	cce0d20dad	test whether borg extract can process unusual filenames	2015-08-09 01:14:37 +02:00
Thomas Waldmann	616d16a9b0	add help string for --no-files-cache, fixes #140	2015-08-08 20:50:21 +02:00
Thomas Waldmann	40801d74a6	remove old unittest discover / runner code, we use py.test now	2015-08-08 19:03:37 +02:00
Thomas Waldmann	a1e039ba21	reimplement the chunk index merging in C the python code could take a rather long time and likely most of it was converting stuff from python to C and back.	2015-08-06 23:32:53 +02:00
Thomas Waldmann	5b441f7801	some small Cython code improvements, thanks to Stefan Behnel	2015-08-04 13:30:35 +02:00
Thomas Waldmann	175a6d7b04	simplify umask code in a similar way as the remote_path code was implemented: just patch the RemoteRepository class object	2015-08-04 12:31:06 +02:00
Thomas Waldmann	71646249cb	implement --remote-path to allow non-default-path borg locations	2015-08-04 09:53:26 +02:00
Thomas Waldmann	9f1d92c993	implement --umask M affects local and remote umask, secure by default M == 077	2015-08-03 23:48:56 +02:00
Thomas Waldmann	4c0012bddf	add lzma compression needs python 3.3+, on 3.2 it won't be available.	2015-08-03 00:31:33 +02:00
Thomas Waldmann	8997766202	integrate compress code, new compression spec parser for commandline New null and lz4 compression. Giving -C 0 now uses null compression, not zlib level 0 any more (null has almost zero overhead while zlib-level0 still had to package everything into zlib frames). Giving -C 10 uses new lz4 compression, super fast compression and even faster decompression. See borg create --help (and --compression argument). fix some issues, clean up, optimize: CNULL: always return bytes LZ4: deal with getting memoryviews Compressor: give bytes to detect(), avoid memoryviews for lz4, always use same COMPR_BUFFER, avoid memory management costs. check --chunker-params CHUNK_MAX_EXP upper limit	2015-08-02 18:10:30 +02:00
Thomas Waldmann	746984c33b	compress: add tests, zlib and null compression, ID header and autodetection	2015-08-02 01:21:41 +02:00
Thomas Waldmann	27de1b0a43	add a wrapper around liblz4	2015-08-01 15:07:54 +02:00
Thomas Waldmann	3be55bedd3	chunker: n needs to be a signed size_t ... as it is also used for the read() return value, which can be negative in case of errors.	2015-07-30 15:21:13 +02:00
Thomas Waldmann	195545075a	repo delete: add destroy to allowed rpc methods, fixes issue #114 also: add test, automate YES confirmation for testing	2015-07-26 17:38:16 +02:00
Thomas Waldmann	ed2548ca02	add a __main__.py to nuitka works	2015-07-20 16:16:32 +02:00
Thomas Waldmann	e4a41c8981	fix Traceback when running check --repair, attic issue #232 This fix is maybe not perfect yet, but maybe better than nothing. A comment by Ernest0x (see https://github.com/jborg/attic/issues/232 ): @ThomasWaldmann your patch did the job. attic check --repair did the repairing and attic delete deleted the archive. Thanks. That said, however, I am not sure if the best place to put the check is where you put it in the patch. For example, the check operation uses a custom msgpack unpacker class named "RobustUnpacker", which it does try to check for correct format (see the comment: "Abort early if the data does not look like a serialized dict"), but it seems it does not catch my case. The relevant code in 'cache.py', on the other hand, uses msgpack's Unpacker class.	2015-07-15 13:32:05 +02:00
Thomas Waldmann	9b9c808713	fixed some minor issues found by pycharm/pytest-flakes	2015-07-15 11:30:25 +02:00
Thomas Waldmann	cc88d174af	fix typos	2015-07-15 11:14:53 +02:00
Thomas Waldmann	b644565546	repo key mode (and deprecate passphrase mode), fixes #85 see usage.rst change for a description and why this is needed	2015-07-15 00:01:07 +02:00
Thomas Waldmann	b2f460d591	fix filenames used for locking, update docs about locking	2015-07-13 23:20:46 +02:00
Thomas Waldmann	2deb520e67	locking code: extract timeout/sleep code into reusable TimeoutTimer class	2015-07-13 16:45:18 +02:00
Thomas Waldmann	e4c519b1e9	new locking code exclusive locking by atomic mkdir fs operation on top of that, shared (read) locks and exclusive (write) locks using a json roster.	2015-07-13 13:55:28 +02:00
Thomas Waldmann	434dac0e48	move locking code to own module, same for locking tests fix imports, no other changes.	2015-07-12 23:41:52 +02:00
Thomas Waldmann	d8e9a9bf96	skip test_crash_before_compact test for RemoteRepository it was silently failing until recently. and it can't work the way it is on RemoteRepository. it's still active (and now even really working) for the (local) Repository tests.	2015-07-12 23:29:34 +02:00
Thomas Waldmann	414dba3de7	remove usage of evil / broken unittest.mock, use mock from pypi see testsuite.mock docstring for more details. one test shows brokenness right now that was hidden / silent until now.	2015-07-12 23:08:44 +02:00
Thomas Waldmann	bd354d7bb4	create a RepositoryCache implementation that can cope with any amount of data, fixes attic #326 the old code blows up with an integer OverflowError when the cache file goes beyond 2GiB size. the new code just reuses the Repository implementation as a local temporary key/value store. still an issue: if the place where the temporary RepositoryCache is stored (usually /tmp) can't cope with the cache size and runs full. if you copy data from a fuse mount, the cache size is the copied deduplicated data size. so, if you have lots of data to extract (more than your /tmp can hold), rather do not use fuse! besides fuse mounts, this also affects attic check and cache sync (in these cases, only the metadata size counts, but even that can go beyond 2GiB for some people).	2015-07-12 00:18:49 +02:00
TW	4b81f380f8	Merge pull request #88 from ThomasWaldmann/py3style style and cosmetic fixes, no semantic changes	2015-07-11 18:39:42 +02:00
Thomas Waldmann	0580f2b4eb	style and cosmetic fixes, no semantic changes use simpler super() syntax of python 3.x remove fixed errors/warnings' codes from setup.cfg flake8 configuration fix file exclusion list for flake8	2015-07-11 18:31:49 +02:00
Thomas Waldmann	a59211f295	use borg-tmp as prefix for temporary files / directories also: remove some unused temp dir. code	2015-07-11 17:22:12 +02:00
Thomas Waldmann	4068fc1e31	clarify help text, fixes #73	2015-06-28 14:02:38 +02:00
Thomas Waldmann	bc2f2fc7d2	chunker: release the gil for long-running C sections and I/O also: add some benchmarking output showing singlethread, multithread and multithread-with-gil-releasing-chunker performance. this changeset maybe improves multithreading performance a little, about 3% (but that might be close to the measurement accuracy).	2015-06-28 13:57:30 +02:00
Thomas Waldmann	4736f5b9d0	Merge branch 'master' into multithreading	2015-06-27 22:24:51 +02:00
TW	562f3c7c33	Merge pull request #72 from ThomasWaldmann/loggedio-exceptions Loggedio exceptions	2015-06-27 22:17:02 +02:00
Thomas Waldmann	08688fbc13	Merge branch 'master' into loggedio-exceptions Conflicts: borg/repository.py	2015-06-27 22:02:26 +02:00
TW	3303619b5f	Merge pull request #69 from ThomasWaldmann/fix-prune-options the short prune options without "keep-" are deprecated, so do not sug…	2015-06-26 01:19:29 +02:00
Thomas Waldmann	b92dd1bab2	the short prune options without "keep-" are deprecated, so do not suggest them	2015-06-26 00:04:35 +02:00
Thomas Waldmann	89db9b8b9e	improve at-end error logging always use archiver.print_error, so it goes to sys.stderr always say "Error: ..." for errors for rc != 0 always say "Exiting with failure status ..." catch all exceptions subclassing Exception, so we can log them in same way and set exit_code=1	2015-06-25 23:57:38 +02:00
Thomas Waldmann	6964799d13	borg create --compression 0..9 for variable compression	2015-06-25 22:16:23 +02:00
Thomas Waldmann	54e8dd8419	misc chunker parameter changes - use power-of-2 sizes / n bit hash mask so one can give them more easily - chunker api: give seed first, so we can give *chunker_params after it - fix some tests that aren't possible with 2^N - make sparse file extraction zero detection flexible for variable chunk max size	2015-06-21 01:46:41 +02:00
Thomas Waldmann	3b9b976f2a	borg create --chunker-params=...	2015-06-20 01:20:46 +02:00
Thomas Waldmann	6d0a00496a	determine and report chunk counts in chunks index borg info repo::archive now reports unique chunks count, total chunks count also: use index->key_size instead of hardcoded value	2015-06-19 23:53:23 +02:00
Thomas Waldmann	2743ab1593	better Exception msg if there is no Borg installed on the remote repository server (still a bit ugly to get even 2 tracebacks)	2015-06-18 23:18:05 +02:00
Thomas Waldmann	dd78e1a56e	improve docs, usage help, changelog	2015-06-11 22:18:12 +02:00
Thomas Waldmann	614261604e	don't hardcode MAGIC length	2015-06-02 02:41:23 +02:00
Thomas Waldmann	3dce75306a	LoggedIO: better error checks / exceptions / exception handling It doesn't just say "error reading segment X", but also what went wrong and at what offset.	2015-06-02 02:30:07 +02:00
Thomas Waldmann	646cdca312	"extract" micro optimization: first check for regular files, then for directories, check for fifos late regular files are most common, more than directories. fifos are rare. was no big issue, the calls are cheap, but also no big issue to just fix the order.	2015-05-31 21:55:15 +02:00
Thomas Waldmann	ed1e5e9c13	"create" micro optimization: do not check for sockets early they are rare, so it's pointless to check for them first. seen the stat..S_ISSOCK in profiling results with high call count. was no big issue, that call is cheap, but also no big issue to just fix the order.	2015-05-31 21:54:51 +02:00
Thomas Waldmann	a3f4d19515	speed up chunks cache sync, fixes #18 Re-synchronize chunks cache with repository. If present, uses a compressed tar archive of known backup archive indices, so it only needs to fetch infos from repo and build a chunk index once per backup archive. If out of sync, the tar gets rebuilt from known + fetched chunk infos, so it has complete and current information about all backup archives. Finally, it builds the master chunks index by merging all indices from the tar. Note: compression (esp. xz) is very effective in keeping the tar relatively small compared to the files it contains. Use python >= 3.3 to get better compression with xz, there's a fallback to bz2 or gz when xz is not supported.	2015-05-31 19:17:01 +02:00
Thomas Waldmann	072326fef0	chunker: get rid of read_buf if we have a OS file handle, we can directly read to the final destination - one memcpy less. if we have a Python file object, we get a Python bytes object as read result (can't save the memcpy here).	2015-05-31 18:41:23 +02:00
Thomas Waldmann	926454c0d8	explicitely specify binary mode to open binary files on POSIX OSes, it doesn't make a difference, but it is cleaner and also good for portability.	2015-05-31 17:57:45 +02:00
Thomas Waldmann	776bb9fabc	hashindex: improve error messages	2015-05-31 17:48:19 +02:00
Thomas Waldmann	91e10fec5f	Merge branch 'master' of github.com:jborg/attic	2015-05-31 17:37:02 +02:00
Thomas Waldmann	d067bc3178	efficient archive list from manifest a lot of speedup for: "list <repo>", "delete <repo>" list, "prune" - esp. for slow connections to remote repositories. the previous method used metadata from the archive itself, which is (in total) rather large. so if you had many archives and a slow (remote) connection, it was very slow. but there is a lot easier way: just use the archives list from the repository manifest - we already have it anyway and it also has name, id and timestamp for all archives - and that's all we need. I defined a ArchiveInfo namedtuple that has same element names as seen as attribute names of the Archive object, so as long as name, id, ts is enough, it can be used in its place.	2015-05-26 02:04:41 +02:00
Thomas Waldmann	240a27a227	multithreaded "create" operation Making much better use of the CPU by dispatching all CPU intensive stuff (hashing, crypto, compression) to N crypter threads (N == logical cpu count == 4 for a dual-core CPU with hyperthreading). I/O intensive stuff also runs in separate threads: the MainThread does the filesystem traversal, the reader thread reads and chunks the files, the writer thread writes to the repo. This way, we don't need to sit idle waiting for I/O, but the I/O thread will block and another thread will get dispatched and use the time. This applies for read as well as for write/fsync I/O wait time (access time + data transfer). There's one more thread, the "delayer". We need it to handle a race condition related to the computation of the compressed size (which is only possible after hashing/compression/encryption has finished). This "csize" makes all this code quite more complicated than if we would not need it. Although there is the GIL issue for Python code, we can still make good use of multithreading as I/O operations and C code (that releases the GIL) can run in parallel. All threads are connected via Python Queues (which are intended for this and thread safe). The Cache.chunks datastructure is also updated by threadsafe code. A little benchmark ------------------ Both is with compression (zlib level 6) and encryption on a haswell/ssd laptop: Without multithreading code: Command being timed: "borg create /extra/attic/borg::1 /home/tw/Desktop/" User time (seconds): 13.78 System time (seconds): 0.40 Percent of CPU this job got: 83% Elapsed (wall clock) time (h:mm:ss or m:ss): 0:16.98 With multithreading code: Command being timed: "borg create /extra/attic/borg::1 /home/tw/Desktop/" User time (seconds): 24.08 System time (seconds): 1.16 Percent of CPU this job got: 249% Elapsed (wall clock) time (h:mm:ss or m:ss): 0:10.11 It's unclear to me why it uses much more "User time" (I'm not even sure that measurement is correct). But the overall runtime "Elapsed" significantly dropped and it makes better use of all cpu cores (not just 83% of one).	2015-05-25 22:37:15 +02:00
Thomas Waldmann	5e98400a5a	fix all references to package name use relative imports if possible reorder imports (1. stdlib 2. dependencies 3. borg 4. borg.testsuite)	2015-05-22 19:21:41 +02:00
Thomas Waldmann	78bfc58b47	rename package directory to borg	2015-05-22 17:48:54 +02:00

572 commits