borgbackup

mirror of https://github.com/borgbackup/borg.git synced 2026-02-20 00:10:35 -05:00

Author	SHA1	Message	Date
Thomas Waldmann	f1100f3c86	create: fix repo lock getting stale when processing lots of unchanged files, fixes #8442 as a side effect, maybe also better keeps the ssh / tcp connection alive, if there is a bit of traffic every 60s.	2024-10-02 12:49:39 +02:00
Thomas Waldmann	f082df7f33	allow -a / --match-archives multiple times, ANDed e.g.: borg delete -a home -a user:kenny -a host:kenny-pc	2024-09-27 00:19:15 +02:00
Thomas Waldmann	1436bbba1a	bugfix: remove superfluous repository.list() call Because it ended the loop only when .list() returned an empty result, this always needed one call more than necessary. We can also detect that we are finished, if .list() returns less than the limit we gave to it. Also: reduce code duplication by using repo_lister func.	2024-09-24 23:43:08 +02:00
Thomas Waldmann	36e3d63474	chunks index caching, fixes #8397 borg compact now uses ChunkIndex (a specialized, memory-efficient data structure), so it needs less memory now. Also, it saves that chunks index to cache/chunks in the repository. When the chunks index is needed, it is first tried to get it from cache/chunks. If that fails, fall back to building the chunks index via repository.list(), which can be rather slow and immediately cache the resulting ChunkIndex in the repo. borg check --repair currently just deletes the chunks cache, because it might have deleted some invalid chunks in the repo. cache.close now saves the chunks index to cache/chunks in repo if it was modified. thus, borg create will update the cached chunks index with new chunks. cache/chunks_hash can be used to validate cache/chunks (and also to validate / invalidate locally cached copies of that).	2024-09-24 22:25:00 +02:00
Thomas Waldmann	e5e685fd1f	cache: fix crash in _build_files_cache	2024-09-22 00:36:30 +02:00
Thomas Waldmann	ec9d412756	fix race condition with data loss potential, fixes #3536 we discard all files cache entries referring to files with timestamps AFTER we started the backup. so, even in case we would back up an inconsistent file that has been changed while we backed it up, we would not have a files cache entry for it and would fully read/chunk/hash it again in next backup.	2024-09-21 11:34:34 +02:00
Thomas Waldmann	c100e7b1f5	files cache: update ctime, mtime of known and "unchanged" files, fixes #4915	2024-09-20 00:44:55 +02:00
Thomas Waldmann	a891559578	files cache improvements, fixes #8385, fixes #5658 - changes to locally stored files cache: - store as files.<H(archive_name)> - user can manually control suffix via env var - if local files cache is not found, build from previous archive. - enable rebuilding the files cache via loading the previous archive's metadata from the repo (better than starting with empty files cache and needing to read/chunk/hash all files). previous archive == same archive name, latest timestamp in repo. - remove AdHocCache (not needed any more, slow) - remove BORG_CACHE_IMPL, we only have one - remove cache lock (this was blocking parallel backups to same repo from same machine/user). Cache entries now have ctime AND mtime. Note: TTL and age still needed for discarding removed files. But due to the separate files caches per series, the TTL was lowered to 2 (from 20).	2024-09-20 00:40:49 +02:00
Thomas Waldmann	e2aa9d56d0	build_chunkindex_from_repo: reduce code duplication	2024-09-07 22:04:53 +02:00
Thomas Waldmann	ccc84c7a4e	cache: renamed .chunk_incref -> .reuse_chunk, boolean .seen_chunk reuse_chunk is the complement of add_chunk for already existing chunks. It doesn't do refcounting anymore. .seen_chunk does not return the refcount anymore, but just whether the chunk exists. If we add a new chunk, it immediately sets its refcount to MAX_VALUE, so there is no difference anymore between previously existing chunks and new chunks added. This makes the stats even more useless, but we have less complexity.	2024-09-07 22:04:47 +02:00
Thomas Waldmann	ef47666627	cache/hashindex: remove decref method, don't try to remove chunks on exceptions When the AdhocCache(WithFiles) queries chunk IDs from the repo to build the chunks index, it won't know their refcount and thus all chunks in the index have their refcount at the MAX_VALUE (representing "infinite") and that would never decrease nor could that ever reach zero and get the chunk deleted from the repo. Only completely new chunks first written in the current borg run have a valid refcount. In some exception handlers, borg tried to clean up chunks that won't be used by an item by decref'ing them. That is either: - pointless due to refcount being at MAX_VALUE - inefficient, because the user might retry the backup and would need to transmit these chunks to the repo again. We'll just rely on borg compact ONLY to clean up any unused/orphan chunks.	2024-09-07 22:04:40 +02:00
Thomas Waldmann	d27b7a7981	cache: remove transactions, load files/chunks cache on demand	2024-09-07 22:04:38 +02:00
Thomas Waldmann	c67cf07522	Repository.list: return [(id, stored_size), ...] Note: LegacyRepository still returns [id, ...] and so does RemoteRepository.list, if the remote repo is a LegacyRepository. also: use LIST_SCAN_LIMIT	2024-09-07 22:03:56 +02:00
Thomas Waldmann	05739aaa65	refactor: rename repository/locking classes/modules Repository -> LegacyRepository RemoteRepository -> LegacyRemoteRepository borg.repository -> borg.legacyrepository borg.remote -> borg.legacyremote Repository3 -> Repository RemoteRepository3 -> RemoteRepository borg.repository3 -> borg.repository borg.remote3 -> borg.remote borg.locking -> borg.fslocking borg.locking3 -> borg.storelocking	2024-09-07 22:01:11 +02:00
Thomas Waldmann	5e3f2c04d5	remove archive checkpointing borg1 needed this due to its transactional / rollback behaviour: if there was uncommitted stuff in the repo, next repo opening automatically rolled back to last commit. thus we needed checkpoint archives to reference chunks and commit the repo. borg2 does not do that anymore, unused chunks are only removed when the user invokes borg compact. thus, if a borg create gets interrupted, the user can just run borg create again and it will find some chunks are already in the repo, making progress even if borg create gets frequently interrupted.	2024-09-07 22:00:54 +02:00
Thomas Waldmann	68e64adb9f	cache: add log msg to _load_chunks_from_repo For big repos, this might take a while, so at least have messages on debug level.	2024-09-07 22:00:49 +02:00
Thomas Waldmann	1231c961fb	blacken the code	2024-09-07 22:00:39 +02:00
Thomas Waldmann	dcde48490e	remove CacheStatsMixin	2024-09-07 22:00:36 +02:00
Thomas Waldmann	fc6d459875	cache: replace .stats() by a dummy Dummy returns all-zero stats from that call. Problem was that these values can't be computed from the chunks cache anymore. No correct refcounts, often no size information. Also removed hashindex.ChunkIndex.summarize (previously used by the above mentioned .stats() call) and .stats_against (unused) for same reason.	2024-09-07 22:00:35 +02:00
Thomas Waldmann	d6a70f48f2	remove LocalCache Note: this is the default cache implementation in borg 1.x, it worked well, but there were some issues: - if the local chunks cache got out of sync with the repository, it needed an expensive rebuild from the infos in all archives. - to optimize that, a local chunks.archive.d cache was used to speed that up, but at the price of quite significant space needs. AdhocCacheWithFiles replaced this with a non-persistent chunks cache, requesting all chunkids from the repository to initialize a simplified non-persistent chunks index, that does not do real refcounting and also initially does not have size information for pre-existing chunks. We want to move away from precise refcounting, LocalCache needs to die.	2024-09-07 22:00:31 +02:00
Thomas Waldmann	8b9c052acc	manifest: store archives separately one-by-one into archives/* repository: - api/rpc support for get/put manifest - api/rpc support to access the store	2024-09-07 22:00:21 +02:00
Thomas Waldmann	d30d5f4aec	Repository3 / RemoteRepository3: implement a borgstore based repository Simplify the repository a lot: No repository transactions, no log-like appending, no append-only, no segments, just using a key/value store for the individual chunks. No locking yet. Also: mypy: ignore missing import there are no library stubs for borgstore yet, so mypy errors without that option. pyproject.toml: install borgstore directly from github There is no pypi release yet. use pip install -e . rather than python setup.py develop The latter is deprecated and had issues installing the "borgstore from github" dependency.	2024-08-23 23:55:09 +02:00
Thomas Waldmann	619a06a5ba	BORG_CACHE_IMPL defaults to "adhocwithfiles" now Also: support a "cli" env var value, that does not determine the implementation from the env var, but rather from cli options (similar to as it was before adding BORG_CACHE_IMPL).	2024-07-18 22:51:17 +02:00
Thomas Waldmann	5a500cddf8	rename NewCache -> AdHocWithFilesCache	2024-07-18 22:14:00 +02:00
Thomas Waldmann	616af8daa8	BORG_CACHE_IMPL environment variable added BORG_CACHE_IMPL allows users to choose the client-side cache implementation from 'local', 'newcache' and 'adhoc'.	2024-07-15 12:45:16 +02:00
Thomas Waldmann	c7249583e7	fix test_cache_chunks - skip test_cache_chunks if there is no persistent chunks cache file - init self.chunks for AdHocCache - remove warning output from AdHocCache.__init__, it gets mixed with JSON output and fails the JSON decoder.	2024-07-15 12:45:13 +02:00
Thomas Waldmann	561dcc8abf	Refactor cache sync options and introduce new cache preference Add new borg create option '--prefer-adhoc-cache' to prefer the AdHocCache over the NewCache implementation. Adjust a test to match the previous default behaviour (== use the AdHocCache) with --no-cache-sync.	2024-07-15 12:45:12 +02:00
Thomas Waldmann	85688e7543	keep timestamp only in security dir removed some code borg had for backwards compatibility with old borg versions (that had timestamp only in the cache). now the manifest timestamp is only checked against the manifest-timestamp file in the security dir, simplifying the code.	2024-07-15 12:45:09 +02:00
Thomas Waldmann	89d867ea30	keep key_type only in security dir removed some code borg had for backwards compatibility with old borg versions (that had key_type only in the cache). now the repo key_type is only checked against the key-type file in the security dir, simplifying the code.	2024-07-15 12:45:08 +02:00
Thomas Waldmann	cf8c3a3ae7	keep previous repo location only in security dir removed some code borg had for backwards compatibility with old borg versions (that had previous_location only in the cache). now the repo location is only checked against the location file in the security dir, simplifying the code and also fixing a related test failure with NewCache. also improved test_repository_move to test for aborting in case the repo location changed unexpectedly.	2024-07-15 12:45:06 +02:00
Thomas Waldmann	e2a1999c59	implement NewCache Also: - move common code to ChunksMixin - always use ._txn_active (not .txn_active) Some tests are still failing.	2024-07-15 12:44:52 +02:00
Thomas Waldmann	d466005682	refactor files cache code into FilesCacheMixin class	2024-07-15 12:44:47 +02:00
Thomas Waldmann	98162fbb42	create --no-cache-sync-forced option when given, force using the AdHocCache.	2024-07-15 12:44:44 +02:00
Thomas Waldmann	de342581d6	fix AdHocCache.add_chunk signature (ctype, clevel kwargs)	2024-07-15 12:44:43 +02:00
Thomas Waldmann	17fce18b44	always give id and size to chunk_incref/chunk_decref incref: returns (id, size), so it needs the size if it can't get it from the chunks index. also needed for updating stats. decref: caller does not always have the chunk size (e.g. for metadata chunks), as we consider 0 to be an invalid size, we call with size == 1 in that case. thus, stats might be slightly off.	2024-07-15 12:44:41 +02:00
Thomas Waldmann	4488c077a7	files cache: add chunk size information the files cache used to have only the chunk ids, so it had to rely on the chunks index having the size information - which is problematic with e.g. the AdhocCache (has size==0 for all not new chunks) and blocked using the files cache there.	2024-07-15 12:44:34 +02:00
William Bonnaventure	fb7a8f2d85	Add BORG_USE_CHUNKS_ARCHIVE option	2024-07-13 21:26:13 +02:00
William Bonnaventure	c3fb27f463	Automatic rebuild cache on exception, fixes #5213 (#8257) Try to rebuild cache if an exception is raised, fixes #5213 For now, we catch FileNotFoundError and FileIntegrityError. Write cache config without manifest to prevent override of manifest_id. This is needed in order to have an empty manifest_id. This empty id triggers the re-syncing of the chunks cache by calling sync() inside LocalCache.__init__() Adapt and extend test_cache_chunks to new behaviour: - a cache wipe is expected now. - borg detects the corrupt cache and wipes/rebuilds the cache. - check if the in-memory and on-disk cache is as expected (a rebuilt chunks cache).	2024-07-06 18:05:01 +02:00
Thomas Waldmann	334fbab897	refactor: use less binascii our own hex_to_bin / bin_to_hex is more comfortable to use. also: optimize remaining binascii usage / imports.	2024-02-19 02:16:19 +01:00
Thomas Waldmann	9de07ebd46	update "modern" error RCs (docs and code)	2024-02-13 22:58:02 +01:00
Thomas Waldmann	6a68ad5cd6	remove archive TAMs	2023-09-24 20:10:51 +02:00
Thomas Waldmann	1b6f928917	ro_type: typed repo objects, see #7670 writing: put type into repoobj metadata reading: check wanted type against type we got repoobj metadata is encrypted and authenticated. repoobj data is encrypted and authenticated, also (separately). encryption and decryption of both metadata and data get the same "chunk ID" as AAD, so both are "bound" to that (same) ID. a repo-side attacker can neither see cleartext metadata/data, nor successfully tamper with it (AEAD decryption would fail). also, a repo-side attacker could not replace a repoobj A with a differently typed repoobj B without borg noticing: - the metadata/data is cryptographically bound to its ID. authentication/decryption would fail on mismatch. - the type check would fail. thus, the problem (see CVEs in changelog) solved in borg 1 by the manifest and archive TAMs is now already solved by the type check.	2023-09-24 20:10:50 +02:00
Thomas Waldmann	0fcd3e9479	add_chunk: remove overwrite parameter	2023-09-23 00:10:35 +02:00
Thomas Waldmann	2d78fa89a5	always implicitly require archive TAMs they must be there since the upgrade to borg 1.2.6 (or other borg versions that also have a fix for CVE-2023-36811).	2023-09-03 22:02:35 +02:00
Thomas Waldmann	5cd2060345	rebuild_refcounts: keep archive ID, if possible rebuild_refcounts verifies and recreates the TAM. Now it re-uses the salt, so that the archive ID does not change just because of a new salt if the archive has still the same data.	2023-08-30 01:13:52 +02:00
Thomas Waldmann	277b0b81a8	cache sync: check archive TAM	2023-08-30 00:58:00 +02:00
Thomas Waldmann	5013121bd8	fix E501	2023-07-26 01:24:20 +02:00
Thomas Waldmann	3017701958	simplify flake8 configuration we use black since a while, so some stuff does not need to be ignored any more.	2023-07-25 23:56:31 +02:00
Thomas Waldmann	ec1f2dfbf1	--files-cache=size: fix crash, fixes #7658	2023-06-29 23:09:24 +02:00
Thomas Waldmann	989b0a2847	use correct path for security dir when accessing legacy repos (v1) while on macOS the new and old security dir location is the same path, this is not the case on e.g. Linux, it could move from .config/borg/security to .local/share/borg/security . See #5760.	2023-05-19 21:12:59 +02:00
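
Commit 1436bbba1a above describes ending the listing loop as soon as .list() returns fewer entries than the requested limit, rather than waiting for an empty result. The following is only a rough sketch of that idea, not borg's actual code: `list_all`, `make_fake_repo` and the `list_page(limit, marker)` callable are hypothetical names made up for illustration, and the marker-based paging is an assumption about how such a listing call could be shaped.

```python
from typing import Callable, Iterator, List, Optional, Tuple

Entry = Tuple[bytes, int]  # (id, stored_size), as in the Repository.list commit
ListPage = Callable[[int, Optional[bytes]], List[Entry]]  # hypothetical signature


def list_all(list_page: ListPage, limit: int) -> Iterator[Entry]:
    """Yield all entries, stopping on the first short page (limit must be >= 1)."""
    marker: Optional[bytes] = None
    while True:
        page = list_page(limit, marker)
        yield from page
        if len(page) < limit:
            # A short (or empty) page means nothing is left, so no extra call.
            break
        marker = page[-1][0]  # continue after the last id we have seen


def make_fake_repo(ids: List[bytes]) -> Tuple[ListPage, List[Optional[bytes]]]:
    """Hypothetical in-memory stand-in for a repository, recording each call."""
    calls: List[Optional[bytes]] = []

    def list_page(limit: int, marker: Optional[bytes]) -> List[Entry]:
        calls.append(marker)
        start = 0 if marker is None else ids.index(marker) + 1
        return [(chunk_id, 1) for chunk_id in ids[start:start + limit]]

    return list_page, calls
```

With five ids and a limit of 2, this sketch issues three calls; a loop that only stops on an empty result would need a fourth. When the total is an exact multiple of the limit, one final empty page is still fetched.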


187 commits
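
Commit ec9d412756 in the list above describes discarding files cache entries that refer to files with timestamps after the backup started, so that a file modified during the backup is fully read again next time. The sketch below illustrates that rule under stated assumptions: `CacheEntry`, `discard_newer_entries` and the nanosecond fields are hypothetical, and comparing the newer of ctime and mtime against the start time is an assumption, not a confirmed detail of borg's implementation.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class CacheEntry:
    """Hypothetical files cache entry; real entries hold more fields."""
    ctime_ns: int
    mtime_ns: int


def discard_newer_entries(files: Dict[str, CacheEntry], start_ns: int) -> Dict[str, CacheEntry]:
    """Keep only entries whose timestamps are not after the backup start."""
    return {
        path: entry
        for path, entry in files.items()
        # Assumption: an entry is "after the start" if either timestamp is.
        if max(entry.ctime_ns, entry.mtime_ns) <= start_ns
    }
```

A file touched after `start_ns` loses its entry, so the next run cannot treat it as unchanged and has to read, chunk and hash it again.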