borg 1.x encouraged users to put everything into the archive name:
- name of the dataset
- timestamp (usually used to make the archive name unique)
- maybe also hostname (when backing up to same repo from multiple hosts)
- maybe also username (when backing up to same repo from multiple users)
borg2 now discourages users from putting the timestamp into the name,
because we would rather have the same name for all archives within a
series - thus, the field width for the name can be narrower.
the ID of the archive is now the only unique identifier, thus it is
moved to the leftmost place.
256 bits (64 hex digits) was a bit much and as borg can also deal with
abbreviated IDs, we only show 32 bits (8 hex digits) by default.
the ID is followed by the timestamp (also quite "interesting", because
it usually differs for different archives).
then follow: archive name, user name, host name - these might
always be the same if there is only one series of archives in a repo.
use 2 blanks to separate the fields for better readability.
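
A minimal sketch of the column order described above (abbreviated ID, timestamp, name, user, host, joined by 2 blanks). The function name and the timestamp format are assumptions for illustration, not borg's actual formatter:

```python
from datetime import datetime


def format_archive_line(archive_id: bytes, ts: datetime, name: str, user: str, host: str) -> str:
    """Render one listing line: short ID first, then timestamp, name, user, host."""
    short_id = archive_id.hex()[:8]  # 32 bits == 8 hex digits of the 256-bit ID
    fields = [short_id, ts.strftime("%Y-%m-%d %H:%M:%S"), name, user, host]
    return "  ".join(fields)  # 2 blanks between fields


line = format_archive_line(bytes(range(32)), datetime(2024, 1, 2, 3, 4, 5), "docs", "alice", "laptop")
print(line)  # 00010203  2024-01-02 03:04:05  docs  alice  laptop
```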
- changes to locally stored files cache:
  - store as files.<H(archive_name)>
  - user can manually control suffix via env var
  - if local files cache is not found, build from previous archive.
- enable rebuilding the files cache via loading the previous
  archive's metadata from the repo (better than starting with
  empty files cache and needing to read/chunk/hash all files).
  previous archive == same archive name, latest timestamp in repo.
- remove AdHocCache (not needed any more, slow)
- remove BORG_CACHE_IMPL, we only have one
- remove cache lock (this was blocking parallel backups to same
  repo from same machine/user).
Cache entries now have ctime AND mtime.
Note: TTL and age are still needed for discarding removed files.
But due to the separate files caches per series, the TTL
was lowered to 2 (from 20).
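
The naming scheme for the per-series files cache could look roughly like this. This is a sketch under assumptions: H is taken to be sha256 and the env var is assumed to be called BORG_FILES_CACHE_SUFFIX; neither is stated in the notes above.

```python
import hashlib
import os


def files_cache_name(archive_name: str, env=os.environ) -> str:
    """Return the files cache file name for one series of archives."""
    # assumption: the manual override comes from this env var
    suffix = env.get("BORG_FILES_CACHE_SUFFIX")
    if not suffix:
        # assumption: H(archive_name) is a sha256 hex digest of the name
        suffix = hashlib.sha256(archive_name.encode("utf-8")).hexdigest()
    return "files." + suffix


print(files_cache_name("docs", env={}))
print(files_cache_name("docs", env={"BORG_FILES_CACHE_SUFFIX": "mysuffix"}))  # files.mysuffix
```

Because every series gets its own cache file, two series backed up from the same machine do not evict each other's entries.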
in borg 1.x, we used to put a timestamp into the archive name to make
it unique, because borg1 required that.
borg2 does not require unique archive names; it even encourages you
to use an identical archive name within the same SERIES of archives.
that makes matching (e.g. for prune, but also in other places) much
simpler and borg KNOWS which archives belong to the same series.
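
To illustrate why identical names simplify matching: with the name as the series key, grouping is a plain dict lookup and the latest member of a series is simply the one with the newest timestamp. The Archive tuple and both helpers below are hypothetical, not borg's data model:

```python
from collections import defaultdict
from datetime import datetime
from typing import NamedTuple, Optional


class Archive(NamedTuple):  # hypothetical stand-in for an archive record
    id: str
    name: str
    ts: datetime


def group_by_series(archives):
    """Map archive name -> list of archives in that series, oldest first."""
    series = defaultdict(list)
    for archive in archives:
        series[archive.name].append(archive)
    for members in series.values():
        members.sort(key=lambda a: a.ts)
    return dict(series)


def latest_in_series(archives, name) -> Optional[Archive]:
    """Return the newest archive with the given name, or None if there is none."""
    members = group_by_series(archives).get(name)
    return members[-1] if members else None


archives = [
    Archive("aaaa1111", "docs", datetime(2024, 1, 1)),
    Archive("bbbb2222", "mail", datetime(2024, 1, 2)),
    Archive("cccc3333", "docs", datetime(2024, 1, 3)),
]
print(latest_in_series(archives, "docs").id)  # cccc3333
```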
Note: this is the default cache implementation in borg 1.x,
it worked well, but there were some issues:
- if the local chunks cache got out of sync with the repository,
  it needed an expensive rebuild from the info in all archives.
- to speed that up, a local chunks.archive.d cache was used,
  but at the price of quite significant space needs.
AdhocCacheWithFiles replaced this with a non-persistent chunks cache,
requesting all chunkids from the repository to initialize a simplified
non-persistent chunks index, that does not do real refcounting and also
initially does not have size information for pre-existing chunks.
We want to move away from precise refcounting, LocalCache needs to die.
Also: support a "cli" env var value, that does not determine
the implementation from the env var, but rather from cli options
(similar to how it was before BORG_CACHE_IMPL was added).
Move the explanation below the general explanation of the `--keep-*` option
behavior and rephrase the last sentence to make it clear that it works like
the other options that were explained in the previous paragraph.
Resolves #7687
- pattern needs to start with +, - or !
- first match wins
- the default is to list everything, thus a 2nd pattern
  is needed to exclude everything not matched by the 1st pattern.
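
The three rules above can be sketched as follows. This is a simplified model: it uses fnmatch-style globs instead of borg's own pattern styles and treats "!" like "-", so it only illustrates the first-match-wins logic and the include-by-default fallback.

```python
from fnmatch import fnmatchcase


def is_listed(path: str, patterns) -> bool:
    """Decide whether path is listed; patterns look like '+ glob', '- glob' or '! glob'."""
    for pattern in patterns:
        prefix, glob = pattern[0], pattern[1:].strip()
        if prefix not in ("+", "-", "!"):
            raise ValueError(f"pattern must start with +, - or !: {pattern!r}")
        if fnmatchcase(path, glob):
            return prefix == "+"  # first match wins
    return True  # nothing matched: the default is to list everything


only_include = ["+ src/*.py"]
include_then_exclude = ["+ src/*.py", "- *"]

print(is_listed("README", only_include))          # True (default applies)
print(is_listed("README", include_then_exclude))  # False (2nd pattern excludes it)
print(is_listed("src/a.py", include_then_exclude))  # True (1st pattern wins)
```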
in some places, the docs were not updated yet.
for borg 1.x, -a (aka --glob-archives) expected
sh: style glob patterns ONLY (but one must not
give sh: explicitly).
for borg 2, -a (aka --match-archives) defaults
to id: style (identical match), so one must give
sh: if one wants shell-style globbing.
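
A sketch of the borg 2 behavior described above, covering only the two styles mentioned here (id: and sh:). The function is hypothetical and uses fnmatch as an approximation of shell-style globbing:

```python
from fnmatch import fnmatchcase


def match_archive(name: str, pattern: str) -> bool:
    """Match an archive name; no style prefix means id: (identical match)."""
    style, sep, rest = pattern.partition(":")
    if sep and style == "sh":
        return fnmatchcase(name, rest)  # shell-style globbing only when asked for
    if sep and style == "id":
        return name == rest
    return name == pattern  # default: identical match


print(match_archive("docs-2024", "docs-*"))     # False: the glob is taken literally
print(match_archive("docs-2024", "sh:docs-*"))  # True
print(match_archive("docs", "docs"))            # True
```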
One cannot "to not x", but one can "not to x".
Avoiding split infinitives gives the added bonus that machine
translation yields better results.
setup (n/adj) vs set (v) up. We don't say "I setup it" but "I set it up".
Likewise for login (n/adj) and log (v) in, backup (n/adj) and back (v) up.
\n is automatically converted on write to the platform-dependent os.linesep.
Using os.linesep instead of \n means that on Windows, the line ending becomes "\r\r\n".
Also switches mentions of {LF} to {NL} in code and docs.
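
The doubled carriage return can be reproduced on any platform by forcing the newline translation that text mode applies on Windows. The helper name is made up for this illustration:

```python
import io


def written_bytes(text: str) -> bytes:
    """Return the bytes a text stream emits when "\n" is translated to "\r\n" on write."""
    raw = io.BytesIO()
    # newline="\r\n" mimics what text mode does by default on Windows
    stream = io.TextIOWrapper(raw, encoding="ascii", newline="\r\n")
    stream.write(text)
    stream.flush()
    return raw.getvalue()


print(written_bytes("line\n"))    # b'line\r\n'
print(written_bytes("line\r\n"))  # b'line\r\r\n' (what writing os.linesep causes on Windows)
```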
borg now has the chunks list in every item with content.
due to the symmetric way in which borg now deals with hardlinks using
item.hlid, processing gets much simpler.
but some places where borg deals with other "sources" of hardlinks
still need to do some hardlink management:
borg uses the HardLinkManager there now (which is not much more
than a dict, but keeps documentation in one place and avoids some
code duplication we had before).
item.hlid is computed via hardlink_id function.
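
As an illustration of "not much more than a dict": a place that has to recreate hardlinks (extraction is assumed here as one such place) only needs to remember what it did for the first item of each hlid. The function below is a hypothetical sketch, not the HardLinkManager API:

```python
def plan_extraction(items):
    """items: iterable of (path, hlid); hlid is None for items that are not hardlinks."""
    first_path_by_hlid = {}  # the dict that does the actual hardlink bookkeeping
    actions = []
    for path, hlid in items:
        if hlid is not None and hlid in first_path_by_hlid:
            # same hlid seen before: link to the path written first
            actions.append(("link", path, first_path_by_hlid[hlid]))
        else:
            if hlid is not None:
                first_path_by_hlid[hlid] = path
            actions.append(("write", path, None))
    return actions


plan = plan_extraction([("a", b"\x01"), ("b", None), ("c", b"\x01")])
print(plan)  # [('write', 'a', None), ('write', 'b', None), ('link', 'c', 'a')]
```

Since every hardlinked item carries its own chunks list and hlid, no item is special; whichever one comes first is written and the rest are linked to it.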
support hardlinked symlinks, fixes #2379
as we use item.hlid now to group hardlinks together,
there is no conflict with the item.source usage for
symlink targets any more.
2nd+ hardlinks now add to the files count as did the 1st one.
for borg, now all hardlinks are created equal.
so any hardlink item with chunks now adds to the "file" count.
ItemFormatter: support {hlid} instead of {source} for hardlinks
export-tar: just msgpack and b64encode all item metadata and
put that into a BORG specific PAX header.
this is *in addition* to the standard tar metadata.
import-tar: when detecting the BORG specific PAX header, just get
all metadata from there (and ignore the standard tar
metadata).
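
The round trip can be sketched with the standard tarfile module. Two things here are assumptions: json is swapped in for msgpack to keep the example stdlib-only, and the header key is a made-up placeholder, not the key borg uses.

```python
import base64
import io
import json
import tarfile

HEADER = "EXAMPLE.item.meta"  # hypothetical PAX header key


def export_member(tar, path, data, meta):
    """Add one file, carrying all item metadata in an extra PAX header."""
    info = tarfile.TarInfo(name=path)
    info.size = len(data)
    packed = json.dumps(meta).encode("utf-8")  # stand-in for msgpack packing
    info.pax_headers = {HEADER: base64.b64encode(packed).decode("ascii")}
    tar.addfile(info, io.BytesIO(data))


def import_meta(info):
    """Return the embedded metadata, or None if the header is absent."""
    encoded = info.pax_headers.get(HEADER)
    if encoded is None:
        return None  # caller would fall back to the standard tar metadata
    return json.loads(base64.b64decode(encoded))


buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
    export_member(tar, "a.txt", b"hello", {"path": "a.txt", "mode": 0o100644, "user": "alice"})
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tar:
    restored = [import_meta(member) for member in tar.getmembers()]
print(restored)
```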
"passphrase" encryption mode repos can not be created since borg 1.0.
back then, users were advised to switch existing repos of that type
to repokey mode using the "borg key migrate-to-repokey" command.
that command is still available in borg 1.0, 1.1 and 1.2, but not
any more in borg >= 1.3.
while we still might see the PassphraseKey.TYPE byte in old repos,
it is handled by the RepoKey code since borg 1.0.