Consolidate key backup documentation into `borg key export` and reference
it from Quickstart and FAQ to avoid duplication and inconsistency.
Clarify that while `repokey` or `authenticated` mode stores the key in the
repo, a separate backup is still recommended to protect against repository
corruption or data loss.
Some features like append-only repositories rely on a server-side component
that enforces them (because that shall only be controllable server-side,
not client-side).
So, that can only work, if such a server-side component exists, which is the
case for borg 1.x ssh: repositories (but not for borg 1.x non-ssh: repositories).
For borg2, we currently have:
- fs repos
- sftp: repos
- rclone: repos (enabling many different cloud providers)
- s3/b3: repos
- ssh: repos using client/server rpc code similar as in borg 1.x
So, only for the last method we have a borg server-side process that could enforce some features, but not for any of the other repo types.
For append-only the current idea is that this should not be done within borg,
but solved by a missing repo object delete permission enforced by the storage.
borg create could then use credentials that miss permission to delete,
while borg compact would use credentials that include permission to delete.
- changes to locally stored files cache:
- store as files.<H(archive_name)>
- user can manually control suffix via env var
- if local files cache is not found, build from previous archive.
- enable rebuilding the files cache via loading the previous
archive's metadata from the repo (better than starting with
empty files cache and needing to read/chunk/hash all files).
previous archive == same archive name, latest timestamp in repo.
- remove AdHocCache (not needed any more, slow)
- remove BORG_CACHE_IMPL, we only have one
- remove cache lock (this was blocking parallel backups to same
repo from same machine/user).
Cache entries now have ctime AND mtime.
Note: TTL and age still needed for discarding removed files.
But due to the separate files caches per series, the TTL
was lowered to 2 (from 20).
in borg 1.x, we used to put a timestamp into the archive name to make
it unique, because borg1 required that.
borg2 does not require unique archive names, but it encourages you
to even use an identical archive name within the same SERIES of archives.
that makes matching (e.g. for prune, but also at other places) much
simpler and borg KNOWS which archives belong to the same series.
borg1 needed this due to its transactional / rollback behaviour:
if there was uncommitted stuff in the repo, next repo opening automatically
rolled back to last commit. thus we needed checkpoint archives to reference
chunks and commit the repo.
borg2 does not do that anymore, unused chunks are only removed when the
user invokes borg compact.
thus, if a borg create gets interrupted, the user can just run borg create
again and it will find some chunks are already in the repo, making progress
even if borg create gets frequently interrupted.
Note: this is the default cache implementation in borg 1.x,
it worked well, but there were some issues:
- if the local chunks cache got out of sync with the repository,
it needed an expensive rebuild from the infos in all archives.
- to optimize that, a local chunks.archive.d cache was used to
speed that up, but at the price of quite significant space needs.
AdhocCacheWithFiles replaced this with a non-persistent chunks cache,
requesting all chunkids from the repository to initialize a simplified
non-persistent chunks index, that does not do real refcounting and also
initially does not have size information for pre-existing chunks.
We want to move away from precise refcounting, LocalCache needs to die.
at some places, the docs were not updated yet.
for borg 1.x, -a (aka --glob-archives) expected
sh: style glob patterns ONLY (but one must not
give sh: explicitly).
for borg 2, -a (aka --match-archives) defaults
to id: style (identical match), so one must give
sh: if one wants shell-style globbing.
we now just treat that one .borg_part file we might have inside
checkpoint archives as a normal file.
people can recognize via the file name it is a partial file.
nobody cares for statistics of checkpoint files and the final
archive now does not contain any partial files any more, thus
no needs to maintain statistics about count and size of part
files.
One cannot "to not x", but one can "not to x".
Avoiding split infinitives gives the added bonus that machine
translation yields better results.
setup (n/adj) vs set(v) up. We don't "I setup it" but "I set it up".
Likewise for login(n/adj) and log(v) in, backup(n/adj) and back(v) up.
implemented by introducing one level of indirection, the limit is now
very high, so it is not practically relevant any more.
we always use the indirection (storing the metadata stream chunk ids list not
directly into the archive item, but into some repo objects referenced by the new
ArchiveItem.item_ptrs list).
thus, the code behaves the same for all archive sizes.
borg now has the chunks list in every item with content.
due to the symmetric way how borg now deals with hardlinks using
item.hlid, processing gets much simpler.
but some places where borg deals with other "sources" of hardlinks
still need to do some hardlink management:
borg uses the HardLinkManager there now (which is not much more
than a dict, but keeps documentation at one place and avoids some
code duplication we had before).
item.hlid is computed via hardlink_id function.
support hardlinked symlinks, fixes#2379
as we use item.hlid now to group hardlinks together,
there is no conflict with the item.source usage for
symlink targets any more.
2nd+ hardlinks now add to the files count as did the 1st one.
for borg, now all hardlinks are created equal.
so any hardlink item with chunks now adds to the "file" count.
ItemFormatter: support {hlid} instead of {source} for hardlinks
The previous sample for creating a ~/.borg-passphrase file creates it first and then chmod's it to 400 permissions. That's probably fine in practice, but means there's a tiny window where the passphrase file is sitting with default permissions (likely world readable, depending on the system umask).
It seems safer to first change the umask to remove all group & world bits (0077) _before_ creating the file. To be polite and avoid messing with the user's previous umask, we do this in a subshell. (Note that umask 0077 leads to a mode of 600 rather than the previous 400, because removing the owner write bit doesn't seem to buy much since the owner can just chmod the file anyway.)