This fix may not be perfect yet, but it should be better than nothing.
A comment by Ernest0x (see https://github.com/jborg/attic/issues/232 ):
@ThomasWaldmann your patch did the job.
attic check --repair did the repairing and attic delete deleted the archive.
Thanks.
That said, however, I am not sure the place where you put the check in the
patch is the best one. For example, the check operation uses a custom msgpack
unpacker class named "RobustUnpacker", which does try to validate the
format (see the comment: "Abort early if the data does not look like a
serialized dict"), but it seems it does not catch my case. The relevant code
in 'cache.py', on the other hand, uses msgpack's plain Unpacker class.
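For illustration, such an "abort early" check can be sketched like this (a minimal sketch, not attic's actual RobustUnpacker code): per the msgpack spec, a serialized dict must begin with a map header byte, so anything else can be rejected immediately.

```python
# Minimal sketch (not attic's actual code): reject data that cannot be
# a msgpack-serialized dict by inspecting the first byte.
# msgpack map headers: fixmap 0x80..0x8f, map 16 = 0xde, map 32 = 0xdf.

def looks_like_serialized_dict(data: bytes) -> bool:
    """Return True if data could be a msgpack-serialized dict."""
    if not data:
        return False
    first = data[0]
    return 0x80 <= first <= 0x8f or first in (0xde, 0xdf)
```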
Re-synchronize the chunks cache with the repository.
If present, it uses a compressed tar archive of known backup archive
indices, so it only needs to fetch info from the repo and build a chunk
index once per backup archive.
If out of sync, the tar gets rebuilt from known + fetched chunk info,
so it has complete and current information about all backup archives.
Finally, it builds the master chunks index by merging all indices from
the tar.
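The merge step can be sketched roughly as follows (an illustrative sketch, not borg's actual ChunkIndex code; the assumption here is that each per-archive index maps chunk id to a reference count, and merging sums the counts):

```python
from collections import Counter

# Illustrative sketch: build a master chunks index by merging
# per-archive indices (chunk id -> reference count). Merging an
# archive's index simply adds its counts to the running totals.

def merge_chunk_indices(indices):
    """Merge per-archive chunk indices into one master index."""
    master = Counter()
    for idx in indices:
        master.update(idx)  # sums counts for chunk ids seen before
    return master
```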
Note: compression (esp. xz) is very effective in keeping the tar
relatively small compared to the files it contains.
Use Python >= 3.3 to get better compression with xz;
there's a fallback to bz2 or gz when xz is not supported.
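The compression fallback could look roughly like this (a minimal sketch with illustrative names, not the actual implementation): try to open the tar with xz, which needs the lzma module from Python >= 3.3, and fall back to bz2, then gzip.

```python
import os
import tarfile
import tempfile

# Illustrative sketch: open the cache tar with the best available
# compression. tarfile raises CompressionError when the needed
# compression module is missing from this Python build.

def open_cache_tar(path_base, mode="w"):
    """Try xz, then bz2, then gz compression for the cache tar."""
    for suffix in ("xz", "bz2", "gz"):
        try:
            return tarfile.open(path_base + "." + suffix,
                                mode + ":" + suffix)
        except tarfile.CompressionError:
            continue  # this compression is unavailable, try the next
    raise tarfile.CompressionError("no supported compression found")

# demo: create an (empty) cache tar in a temp dir
demo_tar = open_cache_tar(os.path.join(tempfile.mkdtemp(), "cache"))
chosen = demo_tar.name.rsplit(".", 1)[-1]
demo_tar.close()
```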
Make much better use of the CPU by dispatching all CPU-intensive work
(hashing, crypto, compression) to N crypter threads (N == logical CPU count ==
4 for a dual-core CPU with hyperthreading).
I/O-intensive work also runs in separate threads: the MainThread does the
filesystem traversal, the reader thread reads and chunks the files, and the
writer thread writes to the repo. This way, we don't sit idle waiting for I/O:
when an I/O thread blocks, another thread gets dispatched and uses the time.
This applies to read as well as to write/fsync I/O wait time
(access time + data transfer).
There's one more thread, the "delayer". We need it to handle a race condition
related to the computation of the compressed size (which is only possible after
hashing/compression/encryption has finished). This "csize" makes the code
considerably more complicated than it would be without it.
Although Python code is subject to the GIL, we can still make good use of
multithreading, as I/O operations and C code (which releases the GIL) can run
in parallel.
All threads are connected via Python Queues (which are made for this and are
thread-safe). The Cache.chunks data structure is also updated by thread-safe
code.
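The queue-wired pipeline above can be sketched roughly like this (a simplified stand-in, not borg's actual code): a feeder puts chunks on an input queue, N hasher threads pull from it, and results land on an output queue. hashlib releases the GIL while hashing large buffers, so the hasher threads genuinely run in parallel.

```python
import hashlib
import os
import queue
import threading

# Simplified sketch of the pipeline: chunks flow through thread-safe
# Queues to N hasher threads (standing in for the crypter threads).

N = os.cpu_count() or 2
in_q = queue.Queue(maxsize=100)
out_q = queue.Queue()

def hasher():
    while True:
        chunk = in_q.get()
        if chunk is None:          # sentinel: shut this worker down
            break
        # hashlib releases the GIL for large inputs, so workers overlap
        out_q.put(hashlib.sha256(chunk).hexdigest())

workers = [threading.Thread(target=hasher) for _ in range(N)]
for w in workers:
    w.start()

chunks = [bytes([i]) * 4096 for i in range(32)]  # stand-in for file chunks
for c in chunks:
    in_q.put(c)
for _ in workers:                  # one sentinel per worker
    in_q.put(None)
for w in workers:
    w.join()

digests = [out_q.get() for _ in chunks]
```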
A little benchmark
------------------
Both runs use compression (zlib level 6) and encryption, on a Haswell/SSD laptop:
Without multithreading code:
Command being timed: "borg create /extra/attic/borg::1 /home/tw/Desktop/"
User time (seconds): 13.78
System time (seconds): 0.40
Percent of CPU this job got: 83%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:16.98
With multithreading code:
Command being timed: "borg create /extra/attic/borg::1 /home/tw/Desktop/"
User time (seconds): 24.08
System time (seconds): 1.16
Percent of CPU this job got: 249%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:10.11
It's unclear to me why it uses much more "User time" (I'm not even sure that
measurement is correct), but the overall "Elapsed" runtime dropped
significantly, and it makes better use of all CPU cores (not just 83% of one).