mirror of
https://github.com/borgbackup/borg.git
synced 2026-05-28 04:03:21 -04:00
remove archive checkpointing
borg1 needed this due to its transactional / rollback behaviour: if there was uncommitted data in the repo, the next repo opening automatically rolled back to the last commit. Thus, we needed checkpoint archives to reference chunks and commit the repo. borg2 does not do that anymore: unused chunks are only removed when the user invokes borg compact. Thus, if a borg create gets interrupted, the user can just run borg create again; it will find that some chunks are already in the repo, so it makes progress even if borg create gets interrupted frequently.
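The reasoning in the commit message can be illustrated with a toy model, assuming a content-addressed chunk store in which a chunk's id is the hash of its data. `ChunkStore`, `backup`, and `Interrupted` are hypothetical names made up for this sketch; this is not borg's repository or cache API, only a minimal picture of why a re-run skips chunks that an interrupted run already stored.

```python
import hashlib


class Interrupted(Exception):
    """Stands in for a backup run that gets cut off (hypothetical)."""


class ChunkStore:
    """Toy content-addressed store: chunk id -> data (not borg's API)."""

    def __init__(self):
        self.chunks = {}
        self.stored = 0  # number of chunks actually written

    def put(self, data: bytes) -> str:
        chunk_id = hashlib.sha256(data).hexdigest()
        if chunk_id not in self.chunks:  # already present: nothing to store
            self.chunks[chunk_id] = data
            self.stored += 1
        return chunk_id


def backup(store, chunks, fail_after=None):
    """Store all chunks and return their ids; optionally abort part-way."""
    ids = []
    for n, data in enumerate(chunks):
        if fail_after is not None and n == fail_after:
            raise Interrupted("simulated interruption")
        ids.append(store.put(data))
    return ids


store = ChunkStore()
data = [b"alpha", b"beta", b"gamma", b"delta"]

# First run is interrupted after two chunks; nothing is rolled back.
try:
    backup(store, data, fail_after=2)
except Interrupted:
    pass
first_run_stored = store.stored

# Second run starts from the beginning, but only stores the missing chunks.
archive_ids = backup(store, data)
second_run_stored = store.stored - first_run_stored
print(first_run_stored, second_run_stored, len(archive_ids))
```

In this model the chunks left behind by the interrupted run are unreferenced until the second run completes, which is why a cleanup step that removes unused chunks should not run in between.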
parent c2890efdd1
commit 5e3f2c04d5
25 changed files with 44 additions and 420 deletions
@@ -105,7 +105,7 @@ modify it to suit your needs (e.g. more backup sets, dumping databases etc.).
 #
 
 # Options for borg create
-BORG_OPTS="--stats --one-file-system --compression lz4 --checkpoint-interval 86400"
+BORG_OPTS="--stats --one-file-system --compression lz4"
 
 # Set BORG_PASSPHRASE or BORG_PASSCOMMAND somewhere around here, using export,
 # if encryption is used.
43
docs/faq.rst
@@ -124,23 +124,14 @@ Are there other known limitations?
 remove files which are in the destination, but not in the archive.
 See :issue:`4598` for a workaround and more details.
 
-.. _checkpoints_parts:
+.. _interrupted_backup:
 
 If a backup stops mid-way, does the already-backed-up data stay there?
 ----------------------------------------------------------------------
 
-Yes, Borg supports resuming backups.
-
-During a backup, a special checkpoint archive named ``<archive-name>.checkpoint``
-is saved at every checkpoint interval (the default value for this is 30
-minutes) containing all the data backed-up until that point.
-
-This checkpoint archive is a valid archive, but it is only a partial backup
-(not all files that you wanted to back up are contained in it and the last file
-in it might be a partial file). Having it in the repo until a successful, full
-backup is completed is useful because it references all the transmitted chunks up
-to the checkpoint. This means that in case of an interruption, you only need to
-retransfer the data since the last checkpoint.
+Yes, the data transferred into the repo stays there - just avoid running
+``borg compact`` before you completed the backup, because that would remove
+unused chunks.
 
 If a backup was interrupted, you normally do not need to do anything special,
 just invoke ``borg create`` as you always do. If the repository is still locked,
@@ -150,24 +141,14 @@ include the current datetime), it does not matter.
 
 Borg always does full single-pass backups, so it will start again
 from the beginning - but it will be much faster, because some of the data was
-already stored into the repo (and is still referenced by the checkpoint
-archive), so it does not need to get transmitted and stored again.
-
-Once your backup has finished successfully, you can delete all
-``<archive-name>.checkpoint`` archives. If you run ``borg prune``, it will
-also care for deleting unneeded checkpoints.
-
-Note: the checkpointing mechanism may create a partial (truncated) last file
-in a checkpoint archive named ``<filename>.borg_part``. Such partial files
-won't be contained in the final archive.
-This is done so that checkpoints work cleanly and promptly while a big
-file is being processed.
+already stored into the repo, so it does not need to get transmitted and stored
+again.
 
 
 How can I back up huge file(s) over a unstable connection?
 ----------------------------------------------------------
 
-Yes. For more details, see :ref:`checkpoints_parts`.
+Yes. For more details, see :ref:`interrupted_backup`.
 
 How can I restore huge file(s) over an unstable connection?
 -----------------------------------------------------------
@@ -794,10 +775,9 @@ If you feel your Borg backup is too slow somehow, here is what you can do:
 - Don't use any expensive compression. The default is lz4 and super fast.
   Uncompressed is often slower than lz4.
 - Just wait. You can also interrupt it and start it again as often as you like,
-  it will converge against a valid "completed" state (see ``--checkpoint-interval``,
-  maybe use the default, but in any case don't make it too short). It is starting
+  it will converge against a valid "completed" state. It is starting
   from the beginning each time, but it is still faster then as it does not store
-  data into the repo which it already has there from last checkpoint.
+  data into the repo which it already has there.
 - If you don’t need additional file attributes, you can disable them with ``--noflags``,
   ``--noacls``, ``--noxattrs``. This can lead to noticeable performance improvements
   when your backup consists of many small files.
@@ -1127,11 +1107,6 @@ conditions, but generally this should be avoided. If your backup disk is already
 full when Borg starts a write command like `borg create`, it will abort
 immediately and the repository will stay as-is.
 
-If you run a backup that stops due to a disk running full, Borg will roll back,
-delete the new segment file and thus freeing disk space automatically. There
-may be a checkpoint archive left that has been saved before the disk got full.
-You can keep it to speed up the next backup or delete it to get back more disk
-space.
 
 Miscellaneous
 #############
|
|
@ -44,7 +44,6 @@ from .helpers import ellipsis_truncate, ProgressIndicatorPercent, log_multi
|
|||
from .helpers import os_open, flags_normal, flags_dir
|
||||
from .helpers import os_stat
|
||||
from .helpers import msgpack
|
||||
from .helpers import sig_int
|
||||
from .helpers.lrucache import LRUCache
|
||||
from .manifest import Manifest
|
||||
from .patterns import PathPrefixPattern, FnmatchPattern, IECommand
|
||||
|
|
@ -357,18 +356,6 @@ class ChunkBuffer:
|
|||
def is_full(self):
|
||||
return self.buffer.tell() > self.BUFFER_SIZE
|
||||
|
||||
def save_chunks_state(self):
|
||||
# as we only append to self.chunks, remembering the current length is good enough
|
||||
self.saved_chunks_len = len(self.chunks)
|
||||
|
||||
def restore_chunks_state(self):
|
||||
scl = self.saved_chunks_len
|
||||
assert scl is not None, "forgot to call save_chunks_state?"
|
||||
tail_chunks = self.chunks[scl:]
|
||||
del self.chunks[scl:]
|
||||
self.saved_chunks_len = None
|
||||
return tail_chunks
|
||||
|
||||
|
||||
class CacheChunkBuffer(ChunkBuffer):
|
||||
def __init__(self, cache, key, stats, chunker_params=ITEMS_CHUNKER_PARAMS):
|
||||
|
|
@ -509,12 +496,6 @@ class Archive:
|
|||
self.items_buffer = CacheChunkBuffer(self.cache, self.key, self.stats)
|
||||
if name in manifest.archives:
|
||||
raise self.AlreadyExists(name)
|
||||
i = 0
|
||||
while True:
|
||||
self.checkpoint_name = "{}.checkpoint{}".format(name, i and (".%d" % i) or "")
|
||||
if self.checkpoint_name not in manifest.archives:
|
||||
break
|
||||
i += 1
|
||||
else:
|
||||
info = self.manifest.archives.get(name)
|
||||
if info is None:
|
||||
|
|
@ -629,32 +610,6 @@ Duration: {0.duration}
|
|||
stats.show_progress(item=item, dt=0.2)
|
||||
self.items_buffer.add(item)
|
||||
|
||||
def prepare_checkpoint(self):
|
||||
# we need to flush the archive metadata stream to repo chunks, so that
|
||||
# we have the metadata stream chunks WITHOUT the part file item we add later.
|
||||
# The part file item will then get into its own metadata stream chunk, which we
|
||||
# can easily NOT include into the next checkpoint or the final archive.
|
||||
self.items_buffer.flush(flush=True)
|
||||
# remember the current state of self.chunks, which corresponds to the flushed chunks
|
||||
self.items_buffer.save_chunks_state()
|
||||
|
||||
def write_checkpoint(self):
|
||||
metadata = self.save(self.checkpoint_name)
|
||||
# that .save() has committed the repo.
|
||||
# at next commit, we won't need this checkpoint archive any more because we will then
|
||||
# have either a newer checkpoint archive or the final archive.
|
||||
# so we can already remove it here, the next .save() will then commit this cleanup.
|
||||
# remove its manifest entry, remove its ArchiveItem chunk, remove its item_ptrs chunks:
|
||||
del self.manifest.archives[self.checkpoint_name]
|
||||
self.cache.chunk_decref(self.id, 1, self.stats)
|
||||
for id in metadata.item_ptrs:
|
||||
self.cache.chunk_decref(id, 1, self.stats)
|
||||
# also get rid of that part item, we do not want to have it in next checkpoint or final archive
|
||||
tail_chunks = self.items_buffer.restore_chunks_state()
|
||||
# tail_chunks contain the tail of the archive items metadata stream, not needed for next commit.
|
||||
for id in tail_chunks:
|
||||
self.cache.chunk_decref(id, 1, self.stats) # TODO can we have real size here?
|
||||
|
||||
def save(self, name=None, comment=None, timestamp=None, stats=None, additional_metadata=None):
|
||||
name = name or self.name
|
||||
if name in self.manifest.archives:
|
||||
|
|
@ -1163,60 +1118,11 @@ def cached_hash(chunk, id_hash):
|
|||
class ChunksProcessor:
|
||||
# Processes an iterator of chunks for an Item
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
*,
|
||||
key,
|
||||
cache,
|
||||
add_item,
|
||||
prepare_checkpoint,
|
||||
write_checkpoint,
|
||||
checkpoint_interval,
|
||||
checkpoint_volume,
|
||||
rechunkify,
|
||||
):
|
||||
def __init__(self, *, key, cache, add_item, rechunkify):
|
||||
self.key = key
|
||||
self.cache = cache
|
||||
self.add_item = add_item
|
||||
self.prepare_checkpoint = prepare_checkpoint
|
||||
self.write_checkpoint = write_checkpoint
|
||||
self.rechunkify = rechunkify
|
||||
# time interval based checkpointing
|
||||
self.checkpoint_interval = checkpoint_interval
|
||||
self.last_checkpoint = time.monotonic()
|
||||
# file content volume based checkpointing
|
||||
self.checkpoint_volume = checkpoint_volume
|
||||
self.current_volume = 0
|
||||
self.last_volume_checkpoint = 0
|
||||
|
||||
def write_part_file(self, item):
|
||||
self.prepare_checkpoint()
|
||||
item = Item(internal_dict=item.as_dict())
|
||||
# for borg recreate, we already have a size member in the source item (giving the total file size),
|
||||
# but we consider only a part of the file here, thus we must recompute the size from the chunks:
|
||||
item.get_size(memorize=True, from_chunks=True)
|
||||
item.path += ".borg_part"
|
||||
self.add_item(item, show_progress=False)
|
||||
self.write_checkpoint()
|
||||
|
||||
def maybe_checkpoint(self, item):
|
||||
checkpoint_done = False
|
||||
sig_int_triggered = sig_int and sig_int.action_triggered()
|
||||
if (
|
||||
sig_int_triggered
|
||||
or (self.checkpoint_interval and time.monotonic() - self.last_checkpoint > self.checkpoint_interval)
|
||||
or (self.checkpoint_volume and self.current_volume - self.last_volume_checkpoint >= self.checkpoint_volume)
|
||||
):
|
||||
if sig_int_triggered:
|
||||
logger.info("checkpoint requested: starting checkpoint creation...")
|
||||
self.write_part_file(item)
|
||||
checkpoint_done = True
|
||||
self.last_checkpoint = time.monotonic()
|
||||
self.last_volume_checkpoint = self.current_volume
|
||||
if sig_int_triggered:
|
||||
sig_int.action_completed()
|
||||
logger.info("checkpoint requested: finished checkpoint creation!")
|
||||
return checkpoint_done # whether a checkpoint archive was created
|
||||
|
||||
def process_file_chunks(self, item, cache, stats, show_progress, chunk_iter, chunk_processor=None):
|
||||
if not chunk_processor:
|
||||
|
|
@ -1237,16 +1143,13 @@ class ChunksProcessor:
|
|||
for chunk in chunk_iter:
|
||||
chunk_entry = chunk_processor(chunk)
|
||||
item.chunks.append(chunk_entry)
|
||||
self.current_volume += chunk_entry[1]
|
||||
if show_progress:
|
||||
stats.show_progress(item=item, dt=0.2)
|
||||
self.maybe_checkpoint(item)
|
||||
|
||||
|
||||
class FilesystemObjectProcessors:
|
||||
# When ported to threading, then this doesn't need chunker, cache, key any more.
|
||||
# write_checkpoint should then be in the item buffer,
|
||||
# and process_file becomes a callback passed to __init__.
|
||||
# process_file becomes a callback passed to __init__.
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
|
|
@ -2195,7 +2098,7 @@ class ArchiveChecker:
|
|||
if last and len(archive_infos) < last:
|
||||
logger.warning("--last %d archives: only found %d archives", last, len(archive_infos))
|
||||
else:
|
||||
archive_infos = self.manifest.archives.list(sort_by=sort_by, consider_checkpoints=True)
|
||||
archive_infos = self.manifest.archives.list(sort_by=sort_by)
|
||||
num_archives = len(archive_infos)
|
||||
|
||||
pi = ProgressIndicatorPercent(
|
||||
|
|
@ -2279,8 +2182,6 @@ class ArchiveRecreater:
|
|||
progress=False,
|
||||
file_status_printer=None,
|
||||
timestamp=None,
|
||||
checkpoint_interval=1800,
|
||||
checkpoint_volume=0,
|
||||
):
|
||||
self.manifest = manifest
|
||||
self.repository = manifest.repository
|
||||
|
|
@ -2305,8 +2206,6 @@ class ArchiveRecreater:
|
|||
self.stats = stats
|
||||
self.progress = progress
|
||||
self.print_file_status = file_status_printer or (lambda *args: None)
|
||||
self.checkpoint_interval = None if dry_run else checkpoint_interval
|
||||
self.checkpoint_volume = None if dry_run else checkpoint_volume
|
||||
|
||||
def recreate(self, archive_name, comment=None, target_name=None):
|
||||
assert not self.is_temporary_archive(archive_name)
|
||||
|
|
@ -2452,14 +2351,7 @@ class ArchiveRecreater:
|
|||
"Rechunking archive from %s to %s", source_chunker_params or "(unknown)", target.chunker_params
|
||||
)
|
||||
target.process_file_chunks = ChunksProcessor(
|
||||
cache=self.cache,
|
||||
key=self.key,
|
||||
add_item=target.add_item,
|
||||
prepare_checkpoint=target.prepare_checkpoint,
|
||||
write_checkpoint=target.write_checkpoint,
|
||||
checkpoint_interval=self.checkpoint_interval,
|
||||
checkpoint_volume=self.checkpoint_volume,
|
||||
rechunkify=target.recreate_rechunkify,
|
||||
cache=self.cache, key=self.key, add_item=target.add_item, rechunkify=target.recreate_rechunkify
|
||||
).process_file_chunks
|
||||
target.chunker = get_chunker(*target.chunker_params, seed=self.key.chunk_seed, sparse=False)
|
||||
return target
|
||||
|
|
|
|||
|
|
@ -14,7 +14,6 @@ try:
|
|||
import os
|
||||
import shlex
|
||||
import signal
|
||||
import time
|
||||
from datetime import datetime, timezone
|
||||
|
||||
from ..logger import create_logger, setup_logging
|
||||
|
|
@ -124,7 +123,6 @@ class Archiver(
|
|||
def __init__(self, lock_wait=None, prog=None):
|
||||
self.lock_wait = lock_wait
|
||||
self.prog = prog
|
||||
self.last_checkpoint = time.monotonic()
|
||||
|
||||
def print_warning(self, msg, *args, **kw):
|
||||
warning_code = kw.get("wc", EXIT_WARNING) # note: wc=None can be used to not influence exit code
|
||||
|
|
@ -455,20 +453,6 @@ class Archiver(
|
|||
logger.debug("Enabling debug topic %s", topic)
|
||||
logging.getLogger(topic).setLevel("DEBUG")
|
||||
|
||||
def maybe_checkpoint(self, *, checkpoint_func, checkpoint_interval):
|
||||
checkpointed = False
|
||||
sig_int_triggered = sig_int and sig_int.action_triggered()
|
||||
if sig_int_triggered or checkpoint_interval and time.monotonic() - self.last_checkpoint > checkpoint_interval:
|
||||
if sig_int_triggered:
|
||||
logger.info("checkpoint requested: starting checkpoint creation...")
|
||||
checkpoint_func()
|
||||
checkpointed = True
|
||||
self.last_checkpoint = time.monotonic()
|
||||
if sig_int_triggered:
|
||||
sig_int.action_completed()
|
||||
logger.info("checkpoint requested: finished checkpoint creation!")
|
||||
return checkpointed
|
||||
|
||||
def run(self, args):
|
||||
os.umask(args.umask) # early, before opening files
|
||||
self.lock_wait = args.lock_wait
|
||||
|
|
|
|||
|
|
@ -25,7 +25,7 @@ class ArchiveGarbageCollector:
|
|||
self.wanted_chunks = None # chunks that would be nice to have for next borg check --repair
|
||||
self.total_files = None # overall number of source files written to all archives in this repo
|
||||
self.total_size = None # overall size of source file content data written to all archives
|
||||
self.archives_count = None # number of archives (including checkpoint archives)
|
||||
self.archives_count = None # number of archives
|
||||
|
||||
def garbage_collect(self):
|
||||
"""Removes unused chunks from a repository."""
|
||||
|
|
@ -60,7 +60,7 @@ class ArchiveGarbageCollector:
|
|||
"""Iterate over all items in all archives, create the dicts id -> size of all used/wanted chunks."""
|
||||
used_chunks = {} # chunks referenced by item.chunks
|
||||
wanted_chunks = {} # additional "wanted" chunks seen in item.chunks_healthy
|
||||
archive_infos = self.manifest.archives.list(consider_checkpoints=True)
|
||||
archive_infos = self.manifest.archives.list()
|
||||
num_archives = len(archive_infos)
|
||||
pi = ProgressIndicatorPercent(
|
||||
total=num_archives, msg="Computing used/wanted chunks %3.1f%%", step=0.1, msgid="compact.analyze_archives"
|
||||
|
|
|
|||
|
|
@ -196,8 +196,7 @@ class CreateMixIn:
|
|||
archive.stats.rx_bytes = getattr(repository, "rx_bytes", 0)
|
||||
archive.stats.tx_bytes = getattr(repository, "tx_bytes", 0)
|
||||
if sig_int:
|
||||
# do not save the archive if the user ctrl-c-ed - it is valid, but incomplete.
|
||||
# we already have a checkpoint archive in this case.
|
||||
# do not save the archive if the user ctrl-c-ed.
|
||||
raise Error("Got Ctrl-C / SIGINT.")
|
||||
else:
|
||||
archive.save(comment=args.comment, timestamp=args.timestamp)
|
||||
|
|
@ -252,16 +251,7 @@ class CreateMixIn:
|
|||
numeric_ids=args.numeric_ids,
|
||||
nobirthtime=args.nobirthtime,
|
||||
)
|
||||
cp = ChunksProcessor(
|
||||
cache=cache,
|
||||
key=key,
|
||||
add_item=archive.add_item,
|
||||
prepare_checkpoint=archive.prepare_checkpoint,
|
||||
write_checkpoint=archive.write_checkpoint,
|
||||
checkpoint_interval=args.checkpoint_interval,
|
||||
checkpoint_volume=args.checkpoint_volume,
|
||||
rechunkify=False,
|
||||
)
|
||||
cp = ChunksProcessor(cache=cache, key=key, add_item=archive.add_item, rechunkify=False)
|
||||
fso = FilesystemObjectProcessors(
|
||||
metadata_collector=metadata_collector,
|
||||
cache=cache,
|
||||
|
|
@ -585,9 +575,7 @@ class CreateMixIn:
|
|||
The archive will consume almost no disk space for files or parts of files that
|
||||
have already been stored in other archives.
|
||||
|
||||
The archive name needs to be unique. It must not end in '.checkpoint' or
|
||||
'.checkpoint.N' (with N being a number), because these names are used for
|
||||
checkpoints and treated in special ways.
|
||||
The archive name needs to be unique.
|
||||
|
||||
In the archive name, you may use the following placeholders:
|
||||
{now}, {utcnow}, {fqdn}, {hostname}, {user} and some others.
|
||||
|
|
@ -942,25 +930,6 @@ class CreateMixIn:
|
|||
help="manually specify the archive creation date/time (yyyy-mm-ddThh:mm:ss[(+|-)HH:MM] format, "
|
||||
"(+|-)HH:MM is the UTC offset, default: local time zone). Alternatively, give a reference file/directory.",
|
||||
)
|
||||
archive_group.add_argument(
|
||||
"-c",
|
||||
"--checkpoint-interval",
|
||||
metavar="SECONDS",
|
||||
dest="checkpoint_interval",
|
||||
type=int,
|
||||
default=1800,
|
||||
action=Highlander,
|
||||
help="write checkpoint every SECONDS seconds (Default: 1800)",
|
||||
)
|
||||
archive_group.add_argument(
|
||||
"--checkpoint-volume",
|
||||
metavar="BYTES",
|
||||
dest="checkpoint_volume",
|
||||
type=int,
|
||||
default=0,
|
||||
action=Highlander,
|
||||
help="write checkpoint every BYTES bytes (Default: 0, meaning no volume based checkpointing)",
|
||||
)
|
||||
archive_group.add_argument(
|
||||
"--chunker-params",
|
||||
metavar="PARAMS",
|
||||
|
|
|
|||
|
|
@ -82,10 +82,4 @@ class DeleteMixIn:
|
|||
subparser.add_argument(
|
||||
"--list", dest="output_list", action="store_true", help="output verbose list of archives"
|
||||
)
|
||||
subparser.add_argument(
|
||||
"--consider-checkpoints",
|
||||
action="store_true",
|
||||
dest="consider_checkpoints",
|
||||
help="consider checkpoint archives for deletion (default: not considered).",
|
||||
)
|
||||
define_archive_filters_group(subparser)
|
||||
|
|
|
|||
|
|
@ -18,7 +18,6 @@ class InfoMixIn:
|
|||
def do_info(self, args, repository, manifest, cache):
|
||||
"""Show archive details such as disk space used"""
|
||||
|
||||
args.consider_checkpoints = True
|
||||
archive_names = tuple(x.name for x in manifest.archives.list_considering(args))
|
||||
|
||||
output_data = []
|
||||
|
|
|
|||
|
|
@ -158,12 +158,6 @@ class MountMixIn:
|
|||
from ._common import define_exclusion_group, define_archive_filters_group
|
||||
|
||||
parser.set_defaults(func=self.do_mount)
|
||||
parser.add_argument(
|
||||
"--consider-checkpoints",
|
||||
action="store_true",
|
||||
dest="consider_checkpoints",
|
||||
help="Show checkpoint archives in the repository contents list (default: hidden).",
|
||||
)
|
||||
parser.add_argument("mountpoint", metavar="MOUNTPOINT", type=str, help="where to mount filesystem")
|
||||
parser.add_argument(
|
||||
"-f", "--foreground", dest="foreground", action="store_true", help="stay in foreground, do not daemonize"
|
||||
|
|
|
|||
|
|
@ -4,7 +4,6 @@ from datetime import datetime, timezone, timedelta
|
|||
import logging
|
||||
from operator import attrgetter
|
||||
import os
|
||||
import re
|
||||
|
||||
from ._common import with_repository, Highlander
|
||||
from ..archive import Archive
|
||||
|
|
@ -91,25 +90,7 @@ class PruneMixIn:
|
|||
format = os.environ.get("BORG_PRUNE_FORMAT", "{archive:<36} {time} [{id}]")
|
||||
formatter = ArchiveFormatter(format, repository, manifest, manifest.key, iec=args.iec)
|
||||
|
||||
checkpoint_re = r"\.checkpoint(\.\d+)?"
|
||||
archives_checkpoints = manifest.archives.list(
|
||||
match=args.match_archives,
|
||||
consider_checkpoints=True,
|
||||
match_end=r"(%s)?\Z" % checkpoint_re,
|
||||
sort_by=["ts"],
|
||||
reverse=True,
|
||||
)
|
||||
is_checkpoint = re.compile(r"(%s)\Z" % checkpoint_re).search
|
||||
checkpoints = [arch for arch in archives_checkpoints if is_checkpoint(arch.name)]
|
||||
# keep the latest checkpoint, if there is no later non-checkpoint archive
|
||||
if archives_checkpoints and checkpoints and archives_checkpoints[0] is checkpoints[0]:
|
||||
keep_checkpoints = checkpoints[:1]
|
||||
else:
|
||||
keep_checkpoints = []
|
||||
checkpoints = set(checkpoints)
|
||||
# ignore all checkpoint archives to avoid keeping one (which is an incomplete backup)
|
||||
# that is newer than a successfully completed backup - and killing the successful backup.
|
||||
archives = [arch for arch in archives_checkpoints if arch not in checkpoints]
|
||||
archives = manifest.archives.list(match=args.match_archives, sort_by=["ts"], reverse=True)
|
||||
keep = []
|
||||
# collect the rule responsible for the keeping of each archive in this dict
|
||||
# keys are archive ids, values are a tuple
|
||||
|
|
@ -126,7 +107,7 @@ class PruneMixIn:
|
|||
if num is not None:
|
||||
keep += prune_split(archives, rule, num, kept_because)
|
||||
|
||||
to_delete = (set(archives) | checkpoints) - (set(keep) | set(keep_checkpoints))
|
||||
to_delete = set(archives) - set(keep)
|
||||
with Cache(repository, manifest, lock_wait=self.lock_wait, iec=args.iec) as cache:
|
||||
list_logger = logging.getLogger("borg.output.list")
|
||||
# set up counters for the progress display
|
||||
|
|
@ -134,7 +115,7 @@ class PruneMixIn:
|
|||
archives_deleted = 0
|
||||
uncommitted_deletes = 0
|
||||
pi = ProgressIndicatorPercent(total=len(to_delete), msg="Pruning archives %3.0f%%", msgid="prune")
|
||||
for archive in archives_checkpoints:
|
||||
for archive in archives:
|
||||
if sig_int and sig_int.action_done():
|
||||
break
|
||||
if archive in to_delete:
|
||||
|
|
@ -148,12 +129,9 @@ class PruneMixIn:
|
|||
archive.delete()
|
||||
uncommitted_deletes += 1
|
||||
else:
|
||||
if is_checkpoint(archive.name):
|
||||
log_message = "Keeping checkpoint archive:"
|
||||
else:
|
||||
log_message = "Keeping archive (rule: {rule} #{num}):".format(
|
||||
rule=kept_because[archive.id][0], num=kept_because[archive.id][1]
|
||||
)
|
||||
log_message = "Keeping archive (rule: {rule} #{num}):".format(
|
||||
rule=kept_because[archive.id][0], num=kept_because[archive.id][1]
|
||||
)
|
||||
if (
|
||||
args.output_list
|
||||
or (args.list_pruned and archive in to_delete)
|
||||
|
|
@ -184,11 +162,6 @@ class PruneMixIn:
|
|||
`GFS <https://en.wikipedia.org/wiki/Backup_rotation_scheme#Grandfather-father-son>`_
|
||||
(Grandfather-father-son) backup rotation scheme.
|
||||
|
||||
Also, prune automatically removes checkpoint archives (incomplete archives left
|
||||
behind by interrupted backup runs) except if the checkpoint is the latest
|
||||
archive (and thus still needed). Checkpoint archives are not considered when
|
||||
comparing archive counts against the retention limits (``--keep-X``).
|
||||
|
||||
If you use --match-archives (-a), then only archives that match the pattern are
|
||||
considered for deletion and only those archives count towards the totals
|
||||
specified by the rules.
|
||||
|
|
|
|||
|
|
@ -110,23 +110,16 @@ class RCompressMixIn:
|
|||
repo_objs = manifest.repo_objs
|
||||
ctype, clevel, olevel = get_csettings(repo_objs.compressor) # desired compression set by --compression
|
||||
|
||||
def checkpoint_func():
|
||||
while repository.async_response(wait=True) is not None:
|
||||
pass
|
||||
repository.commit(compact=True)
|
||||
|
||||
stats_find = defaultdict(int)
|
||||
stats_process = defaultdict(int)
|
||||
recompress_ids = find_chunks(repository, repo_objs, stats_find, ctype, clevel, olevel)
|
||||
recompress_candidate_count = len(recompress_ids)
|
||||
chunks_limit = min(1000, max(100, recompress_candidate_count // 1000))
|
||||
uncommitted_chunks = 0
|
||||
|
||||
if not isinstance(repository, (Repository3, RemoteRepository3)):
|
||||
# start a new transaction
|
||||
data = repository.get_manifest()
|
||||
repository.put_manifest(data)
|
||||
uncommitted_chunks += 1
|
||||
|
||||
pi = ProgressIndicatorPercent(
|
||||
total=len(recompress_ids), msg="Recompressing %3.1f%%", step=0.1, msgid="rcompress.process_chunks"
|
||||
|
|
@ -137,16 +130,14 @@ class RCompressMixIn:
|
|||
ids, recompress_ids = recompress_ids[:chunks_limit], recompress_ids[chunks_limit:]
|
||||
process_chunks(repository, repo_objs, stats_process, ids, olevel)
|
||||
pi.show(increase=len(ids))
|
||||
checkpointed = self.maybe_checkpoint(
|
||||
checkpoint_func=checkpoint_func, checkpoint_interval=args.checkpoint_interval
|
||||
)
|
||||
uncommitted_chunks = 0 if checkpointed else (uncommitted_chunks + len(ids))
|
||||
pi.finish()
|
||||
if sig_int:
|
||||
# Ctrl-C / SIGINT: do not checkpoint (commit) again, we already have a checkpoint in this case.
|
||||
# Ctrl-C / SIGINT: do not commit
|
||||
raise Error("Got Ctrl-C / SIGINT.")
|
||||
elif uncommitted_chunks > 0:
|
||||
checkpoint_func()
|
||||
else:
|
||||
while repository.async_response(wait=True) is not None:
|
||||
pass
|
||||
repository.commit(compact=True)
|
||||
if args.stats:
|
||||
print()
|
||||
print("Recompression stats:")
|
||||
|
|
@ -188,11 +179,6 @@ class RCompressMixIn:
|
|||
Please note that the outcome might not always be the desired compression
|
||||
type/level - if no compression gives a shorter output, that might be chosen.
|
||||
|
||||
Every ``--checkpoint-interval``, progress is committed to the repository and
|
||||
the repository is compacted (this is to keep temporary repo space usage in bounds).
|
||||
A lower checkpoint interval means lower temporary repo space usage, but also
|
||||
slower progress due to higher overhead (and vice versa).
|
||||
|
||||
Please note that this command can not work in low (or zero) free disk space
|
||||
conditions.
|
||||
|
||||
|
|
@ -228,14 +214,3 @@ class RCompressMixIn:
|
|||
)
|
||||
|
||||
subparser.add_argument("-s", "--stats", dest="stats", action="store_true", help="print statistics")
|
||||
|
||||
subparser.add_argument(
|
||||
"-c",
|
||||
"--checkpoint-interval",
|
||||
metavar="SECONDS",
|
||||
dest="checkpoint_interval",
|
||||
type=int,
|
||||
default=1800,
|
||||
action=Highlander,
|
||||
help="write checkpoint every SECONDS seconds (Default: 1800)",
|
||||
)
|
||||
|
|
|
|||
|
|
@ -34,8 +34,6 @@ class RecreateMixIn:
|
|||
progress=args.progress,
|
||||
stats=args.stats,
|
||||
file_status_printer=self.print_file_status,
|
||||
checkpoint_interval=args.checkpoint_interval,
|
||||
checkpoint_volume=args.checkpoint_volume,
|
||||
dry_run=args.dry_run,
|
||||
timestamp=args.timestamp,
|
||||
)
|
||||
|
|
@ -142,25 +140,6 @@ class RecreateMixIn:
|
|||
help="create a new archive with the name ARCHIVE, do not replace existing archive "
|
||||
"(only applies for a single archive)",
|
||||
)
|
||||
archive_group.add_argument(
|
||||
"-c",
|
||||
"--checkpoint-interval",
|
||||
dest="checkpoint_interval",
|
||||
type=int,
|
||||
default=1800,
|
||||
action=Highlander,
|
||||
metavar="SECONDS",
|
||||
help="write checkpoint every SECONDS seconds (Default: 1800)",
|
||||
)
|
||||
archive_group.add_argument(
|
||||
"--checkpoint-volume",
|
||||
metavar="BYTES",
|
||||
dest="checkpoint_volume",
|
||||
type=int,
|
||||
default=0,
|
||||
action=Highlander,
|
||||
help="write checkpoint every BYTES bytes (Default: 0, meaning no volume based checkpointing)",
|
||||
)
|
||||
archive_group.add_argument(
|
||||
"--comment",
|
||||
metavar="COMMENT",
|
||||
|
|
|
|||
|
|
@ -92,12 +92,6 @@ class RListMixIn:
|
|||
help="list repository contents",
|
||||
)
|
||||
subparser.set_defaults(func=self.do_rlist)
|
||||
subparser.add_argument(
|
||||
"--consider-checkpoints",
|
||||
action="store_true",
|
||||
dest="consider_checkpoints",
|
||||
help="Show checkpoint archives in the repository contents list (default: hidden).",
|
||||
)
|
||||
subparser.add_argument(
|
||||
"--short", dest="short", action="store_true", help="only print the archive names, nothing else"
|
||||
)
|
||||
|
|
|
|||
|
|
@ -269,16 +269,7 @@ class TarMixIn:
|
|||
start_monotonic=t0_monotonic,
|
||||
log_json=args.log_json,
|
||||
)
|
||||
cp = ChunksProcessor(
|
||||
cache=cache,
|
||||
key=key,
|
||||
add_item=archive.add_item,
|
||||
prepare_checkpoint=archive.prepare_checkpoint,
|
||||
write_checkpoint=archive.write_checkpoint,
|
||||
checkpoint_interval=args.checkpoint_interval,
|
||||
checkpoint_volume=args.checkpoint_volume,
|
||||
rechunkify=False,
|
||||
)
|
||||
cp = ChunksProcessor(cache=cache, key=key, add_item=archive.add_item, rechunkify=False)
|
||||
tfo = TarfileObjectProcessors(
|
||||
cache=cache,
|
||||
key=key,
|
||||
|
|
@ -524,25 +515,6 @@ class TarMixIn:
|
|||
help="manually specify the archive creation date/time (yyyy-mm-ddThh:mm:ss[(+|-)HH:MM] format, "
|
||||
"(+|-)HH:MM is the UTC offset, default: local time zone). Alternatively, give a reference file/directory.",
|
||||
)
|
||||
archive_group.add_argument(
|
||||
"-c",
|
||||
"--checkpoint-interval",
|
||||
dest="checkpoint_interval",
|
||||
type=int,
|
||||
default=1800,
|
||||
action=Highlander,
|
||||
metavar="SECONDS",
|
||||
help="write checkpoint every SECONDS seconds (Default: 1800)",
|
||||
)
|
||||
archive_group.add_argument(
|
||||
"--checkpoint-volume",
|
||||
metavar="BYTES",
|
||||
dest="checkpoint_volume",
|
||||
type=int,
|
||||
default=0,
|
||||
action=Highlander,
|
||||
help="write checkpoint every BYTES bytes (Default: 0, meaning no volume based checkpointing)",
|
||||
)
|
||||
archive_group.add_argument(
|
||||
"--chunker-params",
|
||||
dest="chunker_params",
|
||||
|
|
|
|||
|
|
@ -33,7 +33,6 @@ class TransferMixIn:
|
|||
)
|
||||
|
||||
dry_run = args.dry_run
|
||||
args.consider_checkpoints = True
|
||||
archive_names = tuple(x.name for x in other_manifest.archives.list_considering(args))
|
||||
if not archive_names:
|
||||
return
|
||||
|
|
@ -193,7 +192,7 @@ class TransferMixIn:
|
|||
If you want to globally change compression while transferring archives to the DST_REPO,
|
||||
give ``--compress=WANTED_COMPRESSION --recompress=always``.
|
||||
|
||||
The default is to transfer all archives, including checkpoint archives.
|
||||
The default is to transfer all archives.
|
||||
|
||||
You could use the misc. archive filter options to limit which archives it will
|
||||
transfer, e.g. using the ``-a`` option. This is recommended for big
|
||||
|
|
|
|||
|
|
@ -642,7 +642,7 @@ class ChunksMixin:
|
|||
marker = result[-1]
|
||||
# All chunks from the repository have a refcount of MAX_VALUE, which is sticky,
|
||||
# therefore we can't/won't delete them. Chunks we added ourselves in this transaction
|
||||
# (e.g. checkpoint archives) are tracked correctly.
|
||||
# are tracked correctly.
|
||||
init_entry = ChunkIndexEntry(refcount=ChunkIndex.MAX_VALUE, size=0)
|
||||
for id_ in result:
|
||||
num_chunks += 1
|
||||
|
|
|
|||
|
|
@ -244,8 +244,7 @@ class SigIntManager:
|
|||
self.ctx = None
|
||||
|
||||
|
||||
# global flag which might trigger some special behaviour on first ctrl-c / SIGINT,
|
||||
# e.g. if this is interrupting "borg create", it shall try to create a checkpoint.
|
||||
# global flag which might trigger some special behaviour on first ctrl-c / SIGINT.
|
||||
sig_int = SigIntManager()
|
||||
|
||||
|
||||
|
|
|
|||
|
|
@ -110,7 +110,6 @@ class Archives(abc.MutableMapping):
|
|||
def list(
|
||||
self,
|
||||
*,
|
||||
consider_checkpoints=True,
|
||||
match=None,
|
||||
match_end=r"\Z",
|
||||
sort_by=(),
|
||||
|
|
@ -149,8 +148,6 @@ class Archives(abc.MutableMapping):
|
|||
|
||||
if any([oldest, newest, older, newer]):
|
||||
archives = filter_archives_by_date(archives, oldest=oldest, newest=newest, newer=newer, older=older)
|
||||
if not consider_checkpoints:
|
||||
archives = [x for x in archives if ".checkpoint" not in x.name]
|
||||
for sortkey in reversed(sort_by):
|
||||
archives.sort(key=attrgetter(sortkey))
|
||||
if first:
|
||||
|
|
@ -163,18 +160,15 @@ class Archives(abc.MutableMapping):
|
|||
|
||||
def list_considering(self, args):
|
||||
"""
|
||||
get a list of archives, considering --first/last/prefix/match-archives/sort/consider-checkpoints cmdline args
|
||||
get a list of archives, considering --first/last/prefix/match-archives/sort cmdline args
|
||||
"""
|
||||
name = getattr(args, "name", None)
|
||||
consider_checkpoints = getattr(args, "consider_checkpoints", None)
|
||||
if name is not None:
|
||||
raise Error(
|
||||
"Giving a specific name is incompatible with options --first, --last, "
|
||||
"-a / --match-archives, and --consider-checkpoints."
|
||||
"Giving a specific name is incompatible with options --first, --last " "and -a / --match-archives."
|
||||
)
|
||||
return self.list(
|
||||
sort_by=args.sort_by.split(","),
|
||||
consider_checkpoints=consider_checkpoints,
|
||||
match=args.match_archives,
|
||||
first=getattr(args, "first", None),
|
||||
last=getattr(args, "last", None),
|
||||
|
|
|
|||
|
|
@ -584,7 +584,6 @@ class RemoteRepository:
|
|||
borg_cmd = self.ssh_cmd(location) + borg_cmd
|
||||
logger.debug("SSH command line: %s", borg_cmd)
|
||||
# we do not want the ssh getting killed by Ctrl-C/SIGINT because it is needed for clean shutdown of borg.
|
||||
# borg's SIGINT handler tries to write a checkpoint and requires the remote repo connection.
|
||||
self.p = Popen(borg_cmd, bufsize=0, stdin=PIPE, stdout=PIPE, stderr=PIPE, env=env, preexec_fn=ignore_sigint)
|
||||
self.stdin_fd = self.p.stdin.fileno()
|
||||
self.stdout_fd = self.p.stdout.fileno()
|
||||
|
|
|
|||
|
|
@ -623,7 +623,6 @@ class RemoteRepository3:
|
|||
borg_cmd = self.ssh_cmd(location) + borg_cmd
|
||||
logger.debug("SSH command line: %s", borg_cmd)
|
||||
# we do not want the ssh getting killed by Ctrl-C/SIGINT because it is needed for clean shutdown of borg.
|
||||
# borg's SIGINT handler tries to write a checkpoint and requires the remote repo connection.
|
||||
self.p = Popen(borg_cmd, bufsize=0, stdin=PIPE, stdout=PIPE, stderr=PIPE, env=env, preexec_fn=ignore_sigint)
|
||||
self.stdin_fd = self.p.stdin.fileno()
|
||||
self.stdout_fd = self.p.stdout.fileno()
|
||||
|
|
|
|||
|
|
@ -236,33 +236,9 @@ def test_create_stdin(archivers, request):
|
|||
assert extracted_data == input_data
|
||||
|
||||
|
||||
def test_create_stdin_checkpointing(archivers, request):
|
||||
archiver = request.getfixturevalue(archivers)
|
||||
chunk_size = 1000 # fixed chunker with this size, also volume based checkpointing after that volume
|
||||
cmd(archiver, "rcreate", RK_ENCRYPTION)
|
||||
input_data = b"X" * (chunk_size * 2 - 1) # one full and one partial chunk
|
||||
cmd(
|
||||
archiver,
|
||||
"create",
|
||||
f"--chunker-params=fixed,{chunk_size}",
|
||||
f"--checkpoint-volume={chunk_size}",
|
||||
"test",
|
||||
"-",
|
||||
input=input_data,
|
||||
)
|
||||
# repo looking good overall? checks for rc == 0.
|
||||
cmd(archiver, "check", "--debug")
|
||||
# verify that there are no part files in final archive
|
||||
out = cmd(archiver, "list", "test")
|
||||
assert "stdin.borg_part" not in out
|
||||
# verify full file
|
||||
out = cmd(archiver, "extract", "test", "stdin", "--stdout", binary_output=True)
|
||||
assert out == input_data
|
||||
|
||||
|
||||
def test_create_erroneous_file(archivers, request):
|
||||
archiver = request.getfixturevalue(archivers)
|
||||
chunk_size = 1000 # fixed chunker with this size, also volume based checkpointing after that volume
|
||||
chunk_size = 1000 # fixed chunker with this size
|
||||
create_regular_file(archiver.input_path, os.path.join(archiver.input_path, "file1"), size=chunk_size * 2)
|
||||
create_regular_file(archiver.input_path, os.path.join(archiver.input_path, "file2"), size=chunk_size * 2)
|
||||
create_regular_file(archiver.input_path, os.path.join(archiver.input_path, "file3"), size=chunk_size * 2)
|
||||
|
|
|
|||
|
|
@ -23,39 +23,18 @@ def test_prune_repository(archivers, request):
|
|||
cmd(archiver, "rcreate", RK_ENCRYPTION)
|
||||
cmd(archiver, "create", "test1", src_dir)
|
||||
cmd(archiver, "create", "test2", src_dir)
|
||||
# these are not really a checkpoints, but they look like some:
|
||||
cmd(archiver, "create", "test3.checkpoint", src_dir)
|
||||
cmd(archiver, "create", "test3.checkpoint.1", src_dir)
|
||||
cmd(archiver, "create", "test4.checkpoint", src_dir)
|
||||
output = cmd(archiver, "prune", "--list", "--dry-run", "--keep-daily=1")
|
||||
assert re.search(r"Would prune:\s+test1", output)
|
||||
# must keep the latest non-checkpoint archive:
|
||||
# must keep the latest archive:
|
||||
assert re.search(r"Keeping archive \(rule: daily #1\):\s+test2", output)
|
||||
# must keep the latest checkpoint archive:
|
||||
assert re.search(r"Keeping checkpoint archive:\s+test4.checkpoint", output)
|
||||
output = cmd(archiver, "rlist", "--consider-checkpoints")
|
||||
output = cmd(archiver, "rlist")
|
||||
assert "test1" in output
|
||||
assert "test2" in output
|
||||
assert "test3.checkpoint" in output
|
||||
assert "test3.checkpoint.1" in output
|
||||
assert "test4.checkpoint" in output
|
||||
cmd(archiver, "prune", "--keep-daily=1")
|
||||
output = cmd(archiver, "rlist", "--consider-checkpoints")
|
||||
output = cmd(archiver, "rlist")
|
||||
assert "test1" not in output
|
||||
# the latest non-checkpoint archive must be still there:
|
||||
# the latest archive must be still there:
|
||||
assert "test2" in output
|
||||
# only the latest checkpoint archive must still be there:
|
||||
assert "test3.checkpoint" not in output
|
||||
assert "test3.checkpoint.1" not in output
|
||||
assert "test4.checkpoint" in output
|
||||
# now we supersede the latest checkpoint by a successful backup:
|
||||
cmd(archiver, "create", "test5", src_dir)
|
||||
cmd(archiver, "prune", "--keep-daily=2")
|
||||
output = cmd(archiver, "rlist", "--consider-checkpoints")
|
||||
# all checkpoints should be gone now:
|
||||
assert "checkpoint" not in output
|
||||
# the latest archive must be still there
|
||||
assert "test5" in output
|
||||
|
||||
|
||||
# This test must match docs/misc/prune-example.txt
|
||||
|
|
|
|||
|
|
@ -80,26 +80,6 @@ def test_date_matching(archivers, request):
|
|||
assert "archive3" not in output
|
||||
|
||||
|
||||
def test_rlist_consider_checkpoints(archivers, request):
|
||||
archiver = request.getfixturevalue(archivers)
|
||||
|
||||
cmd(archiver, "rcreate", RK_ENCRYPTION)
|
||||
cmd(archiver, "create", "test1", src_dir)
|
||||
# these are not really a checkpoints, but they look like some:
|
||||
cmd(archiver, "create", "test2.checkpoint", src_dir)
|
||||
cmd(archiver, "create", "test3.checkpoint.1", src_dir)
|
||||
|
||||
output = cmd(archiver, "rlist")
|
||||
assert "test1" in output
|
||||
assert "test2.checkpoint" not in output
|
||||
assert "test3.checkpoint.1" not in output
|
||||
|
||||
output = cmd(archiver, "rlist", "--consider-checkpoints")
|
||||
assert "test1" in output
|
||||
assert "test2.checkpoint" in output
|
||||
assert "test3.checkpoint.1" in output
|
||||
|
||||
|
||||
def test_rlist_json(archivers, request):
|
||||
archiver = request.getfixturevalue(archivers)
|
||||
create_regular_file(archiver.input_path, "file1", size=1024 * 80)
|
||||
|
|
|
|||
|
|
@ -47,7 +47,6 @@ class TestAdHocCache:
|
|||
assert cache.add_chunk(H(1), {}, b"5678", stats=Statistics()) == (H(1), 4)
|
||||
|
||||
def test_deletes_chunks_during_lifetime(self, cache, repository):
|
||||
"""E.g. checkpoint archives"""
|
||||
cache.add_chunk(H(5), {}, b"1010", stats=Statistics())
|
||||
assert cache.seen_chunk(H(5)) == 1
|
||||
cache.chunk_decref(H(5), 1, Statistics())
|
||||
|
|
|
|||
|
|
@ -124,9 +124,9 @@ def test_mismatch(path, patterns):
|
|||
def test_match_end():
|
||||
regex = shellpattern.translate("*-home") # default is match_end == string end
|
||||
assert re.match(regex, "2017-07-03-home")
|
||||
assert not re.match(regex, "2017-07-03-home.checkpoint")
|
||||
assert not re.match(regex, "2017-07-03-home.xxx")
|
||||
|
||||
match_end = r"(%s)?\Z" % r"\.checkpoint(\.\d+)?" # with/without checkpoint ending
|
||||
match_end = r"(\.xxx)?\Z" # with/without .xxx ending
|
||||
regex = shellpattern.translate("*-home", match_end=match_end)
|
||||
assert re.match(regex, "2017-07-03-home")
|
||||
assert re.match(regex, "2017-07-03-home.checkpoint")
|
||||
assert re.match(regex, "2017-07-03-home.xxx")
|
||||
|
|
|
|||