update docs about separated compaction

Thomas Waldmann 2018-07-12 20:53:03 +02:00
parent a221ca16ad
commit e6fcf4ea42
6 changed files with 95 additions and 25 deletions


@ -158,9 +158,11 @@ such obsolete entries is called sparse, while a segment containing no such entri
Since writing a ``DELETE`` tag does not actually delete any data and
thus does not free disk space any log-based data store will need a
compaction strategy (somewhat analogous to a garbage collector).
Borg uses a simple forward compacting algorithm,
which avoids modifying existing segments.
Compaction runs when a commit is issued (unless the :ref:`append_only_mode` is active).
Compaction runs when a commit is issued with the ``compact=True`` parameter, e.g.
by the ``borg compact`` command (unless the :ref:`append_only_mode` is active).
One client transaction can manifest as multiple physical transactions,
since compaction is transacted, too, and Borg does not distinguish between the two::
@ -197,9 +199,9 @@ The 1.1.x series writes version 2 of the format and reads either version.
When reading a version 1 hints file, Borg 1.1.x will
read all sparse segments to determine their sparsity.
This process may take some time if a repository is kept in the append-only mode,
which causes the number of sparse segments to grow. Repositories not in append-only
mode have no sparse segments in 1.0.x, since compaction is unconditional.
This process may take some time if a repository has been kept in append-only mode
or ``borg compact`` has not been used for a long time, since both cause
the number of sparse segments to grow.
Compaction processes sparse segments from oldest to newest; sparse segments
which don't contain enough deleted data to justify compaction are skipped. This


@ -59,7 +59,7 @@ Also helpful:
- if you use LVM: use a LV + a filesystem that you can resize later and have
some unallocated PEs you can add to the LV.
- consider using quotas
- use `prune` regularly
- use `prune` and `compact` regularly
.. [1] This failsafe can fail in these circumstances:
@ -105,8 +105,10 @@ Some files which aren't necessarily needed in this backup are excluded. See
:ref:`borg_patterns` on how to add more exclude options.
After the backup this script also uses the :ref:`borg_prune` subcommand to keep
only a certain number of old archives and deletes the others in order to preserve
disk space.
only a certain number of old archives and deletes the others.
Finally, it uses the :ref:`borg_compact` subcommand to remove deleted objects
from the segment files in the repository to preserve disk space.
Before running, make sure that the repository is initialized as documented in
:ref:`remote_repos` and that the script has the correct permissions to be executable
@ -176,17 +178,24 @@ backed up and that the ``prune`` command is keeping and deleting the correct bac
prune_exit=$?
# actually free repo disk space by compacting segments
borg compact
compact_exit=$?
# use highest exit code as global exit code
global_exit=$(( backup_exit > prune_exit ? backup_exit : prune_exit ))
global_exit=$(( compact_exit > global_exit ? compact_exit : global_exit ))
if [ ${global_exit} -eq 1 ];
then
info "Backup and/or Prune finished with a warning"
info "Backup, Prune and/or Compact finished with a warning"
fi
if [ ${global_exit} -gt 1 ];
then
info "Backup and/or Prune finished with an error"
info "Backup, Prune and/or Compact finished with an error"
fi
exit ${global_exit}
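The exit code handling above can also be sketched as a small helper. This is only an illustration: ``max_exit`` is a hypothetical function name that does not appear in the original script, and the three values passed to it stand in for ``backup_exit``, ``prune_exit`` and ``compact_exit``.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script): print the
# highest of the exit codes passed as arguments.
max_exit() {
    max=0
    for code in "$@"; do
        if [ "$code" -gt "$max" ]; then
            max=$code
        fi
    done
    echo "$max"
}

# Example values standing in for backup_exit, prune_exit and compact_exit.
global_exit=$(max_exit 0 1 0)
echo "global exit code: ${global_exit}"
```

A helper like this keeps the script readable if more steps are added later, at the cost of one extra function.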


@ -6,6 +6,8 @@ Examples
# delete a single backup archive:
$ borg delete /path/to/repo::Monday
# actually free disk space:
$ borg compact /path/to/repo
# delete all archives whose names begin with the machine's hostname followed by "-"
$ borg delete --prefix '{hostname}-' /path/to/repo


@ -148,16 +148,51 @@ Now, let's see how to restore some LVs from such a backup. ::
$ borg extract --stdout /path/to/repo::arch dev/vg0/home-snapshot > /dev/vg0/home
.. _separate_compaction:
Separate compaction
~~~~~~~~~~~~~~~~~~~
Borg does not auto-compact the segment files in the repository at commit time
(at the end of each repository-writing command) any more.
This is new since borg 1.2.0 and requires borg >= 1.2.0 on client and server.
As a result, the repository behaves as if it were in append-only mode (see
below) most of the time (until ``borg compact`` is invoked or an old client
triggers auto-compaction).
This has some notable consequences:
- repository space is not freed immediately when deleting / pruning archives
- commands finish quicker
- repository is more robust and might be easier to recover after damage (as
it contains data in a more sequential manner, historic manifests, multiple
commits - until you run ``borg compact``)
- user can choose when to run compaction (it should be done regularly, but not
necessarily after each single borg command)
- user can choose from where to invoke ``borg compact`` to do the compaction
(from client or from server, it does not need a key)
- less repo sync data traffic in case you create a copy of your repository by
using a sync tool (like rsync, rclone, ...)
You can manually run compaction by invoking the ``borg compact`` command.
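Since compaction should run regularly but not necessarily after every command, one possible approach is to compact on a fixed schedule. The following is a minimal sketch under stated assumptions: the "Sundays only" policy, the function names ``should_compact`` and ``compact_if_due``, and the ``BORG`` and ``REPO`` variables are all hypothetical and chosen for illustration. Only the ``borg compact /path/to/repo`` invocation itself comes from the examples in these docs.

```shell
#!/bin/sh
# Assumption: BORG and REPO are placeholder variables; the default
# repository path is the example path used elsewhere in these docs.
BORG=${BORG:-borg}
REPO=${REPO:-/path/to/repo}

# Hypothetical policy: compaction is due only on Sundays.
# Expects the ISO weekday number (1 = Monday ... 7 = Sunday).
should_compact() {
    [ "$1" -eq 7 ]
}

# Run "borg compact" if the policy says so, otherwise just report.
compact_if_due() {
    if should_compact "$1"; then
        "$BORG" compact "$REPO"
    else
        echo "skipping compaction today"
    fi
}

# Usage, e.g. from a daily maintenance job:
#   compact_if_due "$(date +%u)"
```

Any other trigger (free space below a threshold, every N-th run) would work equally well; the point is only that the decision is now made by the user rather than by each commit.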
.. _append_only_mode:
Append-only mode
~~~~~~~~~~~~~~~~
Append-only mode (forbid compaction)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A repository can be made "append-only", which means that Borg will never overwrite or
delete committed data (append-only refers to the segment files, but borg will also
reject to delete the repository completely). This is useful for scenarios where a
backup client machine backups remotely to a backup server using ``borg serve``, since
a hacked client machine cannot delete backups on the server permanently.
A repository can be made "append-only", which means that Borg will never
overwrite or delete committed data (append-only refers to the segment files,
but borg will also refuse to delete the repository completely).
If the ``borg compact`` command is used on a repo in append-only mode, there
will be no warning or error, but no compaction will happen.
Append-only mode is useful for scenarios where a backup client machine backs up
remotely to a backup server using ``borg serve``, since a hacked client machine
cannot delete backups on the server permanently.
To activate append-only mode, set ``append_only`` to 1 in the repository config::


@ -23,6 +23,8 @@ first so you will see what it would do without it actually doing anything.
# Same as above but only apply to archive names starting with the hostname
# of the machine followed by a "-" character:
$ borg prune -v --list --keep-daily=7 --keep-weekly=4 --prefix='{hostname}-' /path/to/repo
# actually free disk space:
$ borg compact /path/to/repo
# Keep 7 end of day, 4 additional end of week archives,
# and an end of month archive for every month:


@ -2311,6 +2311,7 @@ class Archiver:
# It will replace the entire :ref:`foo` verbatim.
rst_plain_text_references = {
'a_status_oddity': '"I am seeing A (added) status for a unchanged file!?"',
'separate_compaction': '"Separate compaction"',
}
def process_epilog(epilog):
@ -3220,9 +3221,13 @@ class Archiver:
delete_epilog = process_epilog("""
This command deletes an archive from the repository or the complete repository.
Disk space is reclaimed accordingly. If you delete the complete repository, the
local cache for it (if any) is also deleted. Alternatively, you can delete just
the local cache with the ``--cache-only`` option.
Important: When deleting archives, repository disk space is **not** freed until
you run ``borg compact``.
If you delete the complete repository, the local cache for it (if any) is
also deleted. Alternatively, you can delete just the local cache with the
``--cache-only`` option.
When using ``--stats``, you will get some statistics about how much data was
deleted - the "Deleted data" deduplicated size there is most interesting as
@ -3376,8 +3381,12 @@ class Archiver:
prune_epilog = process_epilog("""
The prune command prunes a repository by deleting all archives not matching
any of the specified retention options. This command is normally used by
automated backup scripts wanting to keep a certain number of historic backups.
any of the specified retention options.
Important: Repository disk space is **not** freed until you run ``borg compact``.
This command is normally used by automated backup scripts wanting to keep a
certain number of historic backups.
Also, prune automatically removes checkpoint archives (incomplete archives left
behind by interrupted backup runs) except if the checkpoint is the latest
@ -3564,6 +3573,8 @@ class Archiver:
This is an *experimental* feature. Do *not* use this on your only backup.
Important: Repository disk space is **not** freed until you run ``borg compact``.
``--exclude``, ``--exclude-from``, ``--exclude-if-present``, ``--keep-exclude-tags``, and PATH
have the exact same semantics as in "borg create". If PATHs are specified the
resulting archive will only contain files from these PATHs.
@ -3592,10 +3603,9 @@ class Archiver:
With ``--target`` the original archive is not replaced, instead a new archive is created.
When rechunking space usage can be substantial, expect at least the entire
deduplicated size of the archives using the previous chunker params.
When recompressing expect approx. (throughput / checkpoint-interval) in space usage,
assuming all chunks are recompressed.
When rechunking (or recompressing), space usage can be substantial - expect
at least the entire deduplicated size of the archives using the previous
chunker (or compression) params.
If you recently ran borg check --repair and it had to fix lost chunks with all-zero
replacement chunks, please first run another backup for the same data and re-run
@ -3697,6 +3707,16 @@ class Archiver:
compact_epilog = process_epilog("""
This command frees repository space by compacting segments.
Use this regularly to avoid running out of space - you do not need to use this
after each borg command though.
borg compact does not need a key, so it is possible to invoke it from the
client or from the server.
Depending on the number of segments that need compaction, it may take a while.
See :ref:`separate_compaction` in Additional Notes for more details.
""")
subparser = subparsers.add_parser('compact', parents=[common_parser], add_help=False,
description=self.do_compact.__doc__,