remove archive checkpointing

borg1 needed this due to its transactional / rollback behaviour:
if there was uncommitted stuff in the repo, next repo opening automatically
rolled back to last commit. thus we needed checkpoint archives to reference
chunks and commit the repo.

borg2 does not do that anymore: unused chunks are only removed when the
user invokes borg compact.

thus, if a borg create gets interrupted, the user can just run borg create
again and it will find some chunks are already in the repo, making progress
even if borg create gets frequently interrupted.
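
To illustrate why no checkpoint archive is needed once uncommitted chunks are
no longer rolled back, here is a minimal toy sketch. It is not borg's actual
code: ``ToyRepo``, ``toy_create``, ``Interrupted`` and the ``fail_after``
parameter are hypothetical names made up for this example, and the fixed-size
chunking is a simplification.

```python
import hashlib


class Interrupted(Exception):
    """Hypothetical stand-in for an interrupted backup run."""


class ToyRepo:
    """Hypothetical content-addressed chunk store (not borg's repository API)."""

    def __init__(self):
        self.chunks = {}    # chunk id -> chunk data
        self.archives = {}  # archive name -> list of chunk ids


def toy_create(repo, name, data, chunk_size=4, fail_after=None):
    """Store data as fixed-size chunks and return how many chunks were newly stored.

    If fail_after is given, raise Interrupted before processing that chunk index.
    Chunks stored before the interruption stay in the repo (no rollback).
    """
    ids = []
    newly_stored = 0
    for n, offset in enumerate(range(0, len(data), chunk_size)):
        if fail_after is not None and n >= fail_after:
            raise Interrupted("simulated interruption")
        chunk = data[offset:offset + chunk_size]
        chunk_id = hashlib.sha256(chunk).hexdigest()
        if chunk_id not in repo.chunks:
            # only chunks the repo does not have yet need to be transferred
            repo.chunks[chunk_id] = chunk
            newly_stored += 1
        ids.append(chunk_id)
    # the archive is only recorded if the whole run completed
    repo.archives[name] = ids
    return newly_stored


repo = ToyRepo()
data = b"aaaabbbbccccdddd"  # four distinct 4-byte chunks

try:
    toy_create(repo, "test", data, fail_after=3)  # interrupted before the 4th chunk
except Interrupted:
    pass
stored_after_interrupt = len(repo.chunks)

# second run starts from the beginning, but only stores the missing chunk
new_chunks = toy_create(repo, "test", data)
```

In this sketch the second run still walks all the data from the start, but it
only has to store the one chunk the interrupted run did not reach. That only
holds as long as nothing removes the not-yet-referenced chunks in between.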
Thomas Waldmann 2024-08-22 17:52:23 +02:00
parent c2890efdd1
commit 5e3f2c04d5
GPG key ID: 243ACFA951F78E01
25 changed files with 44 additions and 420 deletions

View file

@ -105,7 +105,7 @@ modify it to suit your needs (e.g. more backup sets, dumping databases etc.).
#
# Options for borg create
BORG_OPTS="--stats --one-file-system --compression lz4 --checkpoint-interval 86400"
BORG_OPTS="--stats --one-file-system --compression lz4"
# Set BORG_PASSPHRASE or BORG_PASSCOMMAND somewhere around here, using export,
# if encryption is used.

View file

@ -124,23 +124,14 @@ Are there other known limitations?
remove files which are in the destination, but not in the archive.
See :issue:`4598` for a workaround and more details.
.. _checkpoints_parts:
.. _interrupted_backup:
If a backup stops mid-way, does the already-backed-up data stay there?
----------------------------------------------------------------------
Yes, Borg supports resuming backups.
During a backup, a special checkpoint archive named ``<archive-name>.checkpoint``
is saved at every checkpoint interval (the default value for this is 30
minutes) containing all the data backed-up until that point.
This checkpoint archive is a valid archive, but it is only a partial backup
(not all files that you wanted to back up are contained in it and the last file
in it might be a partial file). Having it in the repo until a successful, full
backup is completed is useful because it references all the transmitted chunks up
to the checkpoint. This means that in case of an interruption, you only need to
retransfer the data since the last checkpoint.
Yes, the data transferred into the repo stays there - just avoid running
``borg compact`` before you completed the backup, because that would remove
unused chunks.
If a backup was interrupted, you normally do not need to do anything special,
just invoke ``borg create`` as you always do. If the repository is still locked,
@ -150,24 +141,14 @@ include the current datetime), it does not matter.
Borg always does full single-pass backups, so it will start again
from the beginning - but it will be much faster, because some of the data was
already stored into the repo (and is still referenced by the checkpoint
archive), so it does not need to get transmitted and stored again.
Once your backup has finished successfully, you can delete all
``<archive-name>.checkpoint`` archives. If you run ``borg prune``, it will
also care for deleting unneeded checkpoints.
Note: the checkpointing mechanism may create a partial (truncated) last file
in a checkpoint archive named ``<filename>.borg_part``. Such partial files
won't be contained in the final archive.
This is done so that checkpoints work cleanly and promptly while a big
file is being processed.
already stored into the repo, so it does not need to get transmitted and stored
again.
How can I back up huge file(s) over a unstable connection?
----------------------------------------------------------
Yes. For more details, see :ref:`checkpoints_parts`.
Yes. For more details, see :ref:`interrupted_backup`.
How can I restore huge file(s) over an unstable connection?
-----------------------------------------------------------
@ -794,10 +775,9 @@ If you feel your Borg backup is too slow somehow, here is what you can do:
- Don't use any expensive compression. The default is lz4 and super fast.
Uncompressed is often slower than lz4.
- Just wait. You can also interrupt it and start it again as often as you like,
it will converge against a valid "completed" state (see ``--checkpoint-interval``,
maybe use the default, but in any case don't make it too short). It is starting
it will converge against a valid "completed" state. It is starting
from the beginning each time, but it is still faster then as it does not store
data into the repo which it already has there from last checkpoint.
data into the repo which it already has there.
- If you dont need additional file attributes, you can disable them with ``--noflags``,
``--noacls``, ``--noxattrs``. This can lead to noticeable performance improvements
when your backup consists of many small files.
@ -1127,11 +1107,6 @@ conditions, but generally this should be avoided. If your backup disk is already
full when Borg starts a write command like `borg create`, it will abort
immediately and the repository will stay as-is.
If you run a backup that stops due to a disk running full, Borg will roll back,
delete the new segment file and thus freeing disk space automatically. There
may be a checkpoint archive left that has been saved before the disk got full.
You can keep it to speed up the next backup or delete it to get back more disk
space.
Miscellaneous
#############

View file

@ -44,7 +44,6 @@ from .helpers import ellipsis_truncate, ProgressIndicatorPercent, log_multi
from .helpers import os_open, flags_normal, flags_dir
from .helpers import os_stat
from .helpers import msgpack
from .helpers import sig_int
from .helpers.lrucache import LRUCache
from .manifest import Manifest
from .patterns import PathPrefixPattern, FnmatchPattern, IECommand
@ -357,18 +356,6 @@ class ChunkBuffer:
def is_full(self):
return self.buffer.tell() > self.BUFFER_SIZE
def save_chunks_state(self):
# as we only append to self.chunks, remembering the current length is good enough
self.saved_chunks_len = len(self.chunks)
def restore_chunks_state(self):
scl = self.saved_chunks_len
assert scl is not None, "forgot to call save_chunks_state?"
tail_chunks = self.chunks[scl:]
del self.chunks[scl:]
self.saved_chunks_len = None
return tail_chunks
class CacheChunkBuffer(ChunkBuffer):
def __init__(self, cache, key, stats, chunker_params=ITEMS_CHUNKER_PARAMS):
@ -509,12 +496,6 @@ class Archive:
self.items_buffer = CacheChunkBuffer(self.cache, self.key, self.stats)
if name in manifest.archives:
raise self.AlreadyExists(name)
i = 0
while True:
self.checkpoint_name = "{}.checkpoint{}".format(name, i and (".%d" % i) or "")
if self.checkpoint_name not in manifest.archives:
break
i += 1
else:
info = self.manifest.archives.get(name)
if info is None:
@ -629,32 +610,6 @@ Duration: {0.duration}
stats.show_progress(item=item, dt=0.2)
self.items_buffer.add(item)
def prepare_checkpoint(self):
# we need to flush the archive metadata stream to repo chunks, so that
# we have the metadata stream chunks WITHOUT the part file item we add later.
# The part file item will then get into its own metadata stream chunk, which we
# can easily NOT include into the next checkpoint or the final archive.
self.items_buffer.flush(flush=True)
# remember the current state of self.chunks, which corresponds to the flushed chunks
self.items_buffer.save_chunks_state()
def write_checkpoint(self):
metadata = self.save(self.checkpoint_name)
# that .save() has committed the repo.
# at next commit, we won't need this checkpoint archive any more because we will then
# have either a newer checkpoint archive or the final archive.
# so we can already remove it here, the next .save() will then commit this cleanup.
# remove its manifest entry, remove its ArchiveItem chunk, remove its item_ptrs chunks:
del self.manifest.archives[self.checkpoint_name]
self.cache.chunk_decref(self.id, 1, self.stats)
for id in metadata.item_ptrs:
self.cache.chunk_decref(id, 1, self.stats)
# also get rid of that part item, we do not want to have it in next checkpoint or final archive
tail_chunks = self.items_buffer.restore_chunks_state()
# tail_chunks contain the tail of the archive items metadata stream, not needed for next commit.
for id in tail_chunks:
self.cache.chunk_decref(id, 1, self.stats) # TODO can we have real size here?
def save(self, name=None, comment=None, timestamp=None, stats=None, additional_metadata=None):
name = name or self.name
if name in self.manifest.archives:
@ -1163,60 +1118,11 @@ def cached_hash(chunk, id_hash):
class ChunksProcessor:
# Processes an iterator of chunks for an Item
def __init__(
self,
*,
key,
cache,
add_item,
prepare_checkpoint,
write_checkpoint,
checkpoint_interval,
checkpoint_volume,
rechunkify,
):
def __init__(self, *, key, cache, add_item, rechunkify):
self.key = key
self.cache = cache
self.add_item = add_item
self.prepare_checkpoint = prepare_checkpoint
self.write_checkpoint = write_checkpoint
self.rechunkify = rechunkify
# time interval based checkpointing
self.checkpoint_interval = checkpoint_interval
self.last_checkpoint = time.monotonic()
# file content volume based checkpointing
self.checkpoint_volume = checkpoint_volume
self.current_volume = 0
self.last_volume_checkpoint = 0
def write_part_file(self, item):
self.prepare_checkpoint()
item = Item(internal_dict=item.as_dict())
# for borg recreate, we already have a size member in the source item (giving the total file size),
# but we consider only a part of the file here, thus we must recompute the size from the chunks:
item.get_size(memorize=True, from_chunks=True)
item.path += ".borg_part"
self.add_item(item, show_progress=False)
self.write_checkpoint()
def maybe_checkpoint(self, item):
checkpoint_done = False
sig_int_triggered = sig_int and sig_int.action_triggered()
if (
sig_int_triggered
or (self.checkpoint_interval and time.monotonic() - self.last_checkpoint > self.checkpoint_interval)
or (self.checkpoint_volume and self.current_volume - self.last_volume_checkpoint >= self.checkpoint_volume)
):
if sig_int_triggered:
logger.info("checkpoint requested: starting checkpoint creation...")
self.write_part_file(item)
checkpoint_done = True
self.last_checkpoint = time.monotonic()
self.last_volume_checkpoint = self.current_volume
if sig_int_triggered:
sig_int.action_completed()
logger.info("checkpoint requested: finished checkpoint creation!")
return checkpoint_done # whether a checkpoint archive was created
def process_file_chunks(self, item, cache, stats, show_progress, chunk_iter, chunk_processor=None):
if not chunk_processor:
@ -1237,16 +1143,13 @@ class ChunksProcessor:
for chunk in chunk_iter:
chunk_entry = chunk_processor(chunk)
item.chunks.append(chunk_entry)
self.current_volume += chunk_entry[1]
if show_progress:
stats.show_progress(item=item, dt=0.2)
self.maybe_checkpoint(item)
class FilesystemObjectProcessors:
# When ported to threading, then this doesn't need chunker, cache, key any more.
# write_checkpoint should then be in the item buffer,
# and process_file becomes a callback passed to __init__.
# process_file becomes a callback passed to __init__.
def __init__(
self,
@ -2195,7 +2098,7 @@ class ArchiveChecker:
if last and len(archive_infos) < last:
logger.warning("--last %d archives: only found %d archives", last, len(archive_infos))
else:
archive_infos = self.manifest.archives.list(sort_by=sort_by, consider_checkpoints=True)
archive_infos = self.manifest.archives.list(sort_by=sort_by)
num_archives = len(archive_infos)
pi = ProgressIndicatorPercent(
@ -2279,8 +2182,6 @@ class ArchiveRecreater:
progress=False,
file_status_printer=None,
timestamp=None,
checkpoint_interval=1800,
checkpoint_volume=0,
):
self.manifest = manifest
self.repository = manifest.repository
@ -2305,8 +2206,6 @@ class ArchiveRecreater:
self.stats = stats
self.progress = progress
self.print_file_status = file_status_printer or (lambda *args: None)
self.checkpoint_interval = None if dry_run else checkpoint_interval
self.checkpoint_volume = None if dry_run else checkpoint_volume
def recreate(self, archive_name, comment=None, target_name=None):
assert not self.is_temporary_archive(archive_name)
@ -2452,14 +2351,7 @@ class ArchiveRecreater:
"Rechunking archive from %s to %s", source_chunker_params or "(unknown)", target.chunker_params
)
target.process_file_chunks = ChunksProcessor(
cache=self.cache,
key=self.key,
add_item=target.add_item,
prepare_checkpoint=target.prepare_checkpoint,
write_checkpoint=target.write_checkpoint,
checkpoint_interval=self.checkpoint_interval,
checkpoint_volume=self.checkpoint_volume,
rechunkify=target.recreate_rechunkify,
cache=self.cache, key=self.key, add_item=target.add_item, rechunkify=target.recreate_rechunkify
).process_file_chunks
target.chunker = get_chunker(*target.chunker_params, seed=self.key.chunk_seed, sparse=False)
return target

View file

@ -14,7 +14,6 @@ try:
import os
import shlex
import signal
import time
from datetime import datetime, timezone
from ..logger import create_logger, setup_logging
@ -124,7 +123,6 @@ class Archiver(
def __init__(self, lock_wait=None, prog=None):
self.lock_wait = lock_wait
self.prog = prog
self.last_checkpoint = time.monotonic()
def print_warning(self, msg, *args, **kw):
warning_code = kw.get("wc", EXIT_WARNING) # note: wc=None can be used to not influence exit code
@ -455,20 +453,6 @@ class Archiver(
logger.debug("Enabling debug topic %s", topic)
logging.getLogger(topic).setLevel("DEBUG")
def maybe_checkpoint(self, *, checkpoint_func, checkpoint_interval):
checkpointed = False
sig_int_triggered = sig_int and sig_int.action_triggered()
if sig_int_triggered or checkpoint_interval and time.monotonic() - self.last_checkpoint > checkpoint_interval:
if sig_int_triggered:
logger.info("checkpoint requested: starting checkpoint creation...")
checkpoint_func()
checkpointed = True
self.last_checkpoint = time.monotonic()
if sig_int_triggered:
sig_int.action_completed()
logger.info("checkpoint requested: finished checkpoint creation!")
return checkpointed
def run(self, args):
os.umask(args.umask) # early, before opening files
self.lock_wait = args.lock_wait

View file

@ -25,7 +25,7 @@ class ArchiveGarbageCollector:
self.wanted_chunks = None # chunks that would be nice to have for next borg check --repair
self.total_files = None # overall number of source files written to all archives in this repo
self.total_size = None # overall size of source file content data written to all archives
self.archives_count = None # number of archives (including checkpoint archives)
self.archives_count = None # number of archives
def garbage_collect(self):
"""Removes unused chunks from a repository."""
@ -60,7 +60,7 @@ class ArchiveGarbageCollector:
"""Iterate over all items in all archives, create the dicts id -> size of all used/wanted chunks."""
used_chunks = {} # chunks referenced by item.chunks
wanted_chunks = {} # additional "wanted" chunks seen in item.chunks_healthy
archive_infos = self.manifest.archives.list(consider_checkpoints=True)
archive_infos = self.manifest.archives.list()
num_archives = len(archive_infos)
pi = ProgressIndicatorPercent(
total=num_archives, msg="Computing used/wanted chunks %3.1f%%", step=0.1, msgid="compact.analyze_archives"

View file

@ -196,8 +196,7 @@ class CreateMixIn:
archive.stats.rx_bytes = getattr(repository, "rx_bytes", 0)
archive.stats.tx_bytes = getattr(repository, "tx_bytes", 0)
if sig_int:
# do not save the archive if the user ctrl-c-ed - it is valid, but incomplete.
# we already have a checkpoint archive in this case.
# do not save the archive if the user ctrl-c-ed.
raise Error("Got Ctrl-C / SIGINT.")
else:
archive.save(comment=args.comment, timestamp=args.timestamp)
@ -252,16 +251,7 @@ class CreateMixIn:
numeric_ids=args.numeric_ids,
nobirthtime=args.nobirthtime,
)
cp = ChunksProcessor(
cache=cache,
key=key,
add_item=archive.add_item,
prepare_checkpoint=archive.prepare_checkpoint,
write_checkpoint=archive.write_checkpoint,
checkpoint_interval=args.checkpoint_interval,
checkpoint_volume=args.checkpoint_volume,
rechunkify=False,
)
cp = ChunksProcessor(cache=cache, key=key, add_item=archive.add_item, rechunkify=False)
fso = FilesystemObjectProcessors(
metadata_collector=metadata_collector,
cache=cache,
@ -585,9 +575,7 @@ class CreateMixIn:
The archive will consume almost no disk space for files or parts of files that
have already been stored in other archives.
The archive name needs to be unique. It must not end in '.checkpoint' or
'.checkpoint.N' (with N being a number), because these names are used for
checkpoints and treated in special ways.
The archive name needs to be unique.
In the archive name, you may use the following placeholders:
{now}, {utcnow}, {fqdn}, {hostname}, {user} and some others.
@ -942,25 +930,6 @@ class CreateMixIn:
help="manually specify the archive creation date/time (yyyy-mm-ddThh:mm:ss[(+|-)HH:MM] format, "
"(+|-)HH:MM is the UTC offset, default: local time zone). Alternatively, give a reference file/directory.",
)
archive_group.add_argument(
"-c",
"--checkpoint-interval",
metavar="SECONDS",
dest="checkpoint_interval",
type=int,
default=1800,
action=Highlander,
help="write checkpoint every SECONDS seconds (Default: 1800)",
)
archive_group.add_argument(
"--checkpoint-volume",
metavar="BYTES",
dest="checkpoint_volume",
type=int,
default=0,
action=Highlander,
help="write checkpoint every BYTES bytes (Default: 0, meaning no volume based checkpointing)",
)
archive_group.add_argument(
"--chunker-params",
metavar="PARAMS",

View file

@ -82,10 +82,4 @@ class DeleteMixIn:
subparser.add_argument(
"--list", dest="output_list", action="store_true", help="output verbose list of archives"
)
subparser.add_argument(
"--consider-checkpoints",
action="store_true",
dest="consider_checkpoints",
help="consider checkpoint archives for deletion (default: not considered).",
)
define_archive_filters_group(subparser)

View file

@ -18,7 +18,6 @@ class InfoMixIn:
def do_info(self, args, repository, manifest, cache):
"""Show archive details such as disk space used"""
args.consider_checkpoints = True
archive_names = tuple(x.name for x in manifest.archives.list_considering(args))
output_data = []

View file

@ -158,12 +158,6 @@ class MountMixIn:
from ._common import define_exclusion_group, define_archive_filters_group
parser.set_defaults(func=self.do_mount)
parser.add_argument(
"--consider-checkpoints",
action="store_true",
dest="consider_checkpoints",
help="Show checkpoint archives in the repository contents list (default: hidden).",
)
parser.add_argument("mountpoint", metavar="MOUNTPOINT", type=str, help="where to mount filesystem")
parser.add_argument(
"-f", "--foreground", dest="foreground", action="store_true", help="stay in foreground, do not daemonize"

View file

@ -4,7 +4,6 @@ from datetime import datetime, timezone, timedelta
import logging
from operator import attrgetter
import os
import re
from ._common import with_repository, Highlander
from ..archive import Archive
@ -91,25 +90,7 @@ class PruneMixIn:
format = os.environ.get("BORG_PRUNE_FORMAT", "{archive:<36} {time} [{id}]")
formatter = ArchiveFormatter(format, repository, manifest, manifest.key, iec=args.iec)
checkpoint_re = r"\.checkpoint(\.\d+)?"
archives_checkpoints = manifest.archives.list(
match=args.match_archives,
consider_checkpoints=True,
match_end=r"(%s)?\Z" % checkpoint_re,
sort_by=["ts"],
reverse=True,
)
is_checkpoint = re.compile(r"(%s)\Z" % checkpoint_re).search
checkpoints = [arch for arch in archives_checkpoints if is_checkpoint(arch.name)]
# keep the latest checkpoint, if there is no later non-checkpoint archive
if archives_checkpoints and checkpoints and archives_checkpoints[0] is checkpoints[0]:
keep_checkpoints = checkpoints[:1]
else:
keep_checkpoints = []
checkpoints = set(checkpoints)
# ignore all checkpoint archives to avoid keeping one (which is an incomplete backup)
# that is newer than a successfully completed backup - and killing the successful backup.
archives = [arch for arch in archives_checkpoints if arch not in checkpoints]
archives = manifest.archives.list(match=args.match_archives, sort_by=["ts"], reverse=True)
keep = []
# collect the rule responsible for the keeping of each archive in this dict
# keys are archive ids, values are a tuple
@ -126,7 +107,7 @@ class PruneMixIn:
if num is not None:
keep += prune_split(archives, rule, num, kept_because)
to_delete = (set(archives) | checkpoints) - (set(keep) | set(keep_checkpoints))
to_delete = set(archives) - set(keep)
with Cache(repository, manifest, lock_wait=self.lock_wait, iec=args.iec) as cache:
list_logger = logging.getLogger("borg.output.list")
# set up counters for the progress display
@ -134,7 +115,7 @@ class PruneMixIn:
archives_deleted = 0
uncommitted_deletes = 0
pi = ProgressIndicatorPercent(total=len(to_delete), msg="Pruning archives %3.0f%%", msgid="prune")
for archive in archives_checkpoints:
for archive in archives:
if sig_int and sig_int.action_done():
break
if archive in to_delete:
@ -148,12 +129,9 @@ class PruneMixIn:
archive.delete()
uncommitted_deletes += 1
else:
if is_checkpoint(archive.name):
log_message = "Keeping checkpoint archive:"
else:
log_message = "Keeping archive (rule: {rule} #{num}):".format(
rule=kept_because[archive.id][0], num=kept_because[archive.id][1]
)
log_message = "Keeping archive (rule: {rule} #{num}):".format(
rule=kept_because[archive.id][0], num=kept_because[archive.id][1]
)
if (
args.output_list
or (args.list_pruned and archive in to_delete)
@ -184,11 +162,6 @@ class PruneMixIn:
`GFS <https://en.wikipedia.org/wiki/Backup_rotation_scheme#Grandfather-father-son>`_
(Grandfather-father-son) backup rotation scheme.
Also, prune automatically removes checkpoint archives (incomplete archives left
behind by interrupted backup runs) except if the checkpoint is the latest
archive (and thus still needed). Checkpoint archives are not considered when
comparing archive counts against the retention limits (``--keep-X``).
If you use --match-archives (-a), then only archives that match the pattern are
considered for deletion and only those archives count towards the totals
specified by the rules.

View file

@ -110,23 +110,16 @@ class RCompressMixIn:
repo_objs = manifest.repo_objs
ctype, clevel, olevel = get_csettings(repo_objs.compressor) # desired compression set by --compression
def checkpoint_func():
while repository.async_response(wait=True) is not None:
pass
repository.commit(compact=True)
stats_find = defaultdict(int)
stats_process = defaultdict(int)
recompress_ids = find_chunks(repository, repo_objs, stats_find, ctype, clevel, olevel)
recompress_candidate_count = len(recompress_ids)
chunks_limit = min(1000, max(100, recompress_candidate_count // 1000))
uncommitted_chunks = 0
if not isinstance(repository, (Repository3, RemoteRepository3)):
# start a new transaction
data = repository.get_manifest()
repository.put_manifest(data)
uncommitted_chunks += 1
pi = ProgressIndicatorPercent(
total=len(recompress_ids), msg="Recompressing %3.1f%%", step=0.1, msgid="rcompress.process_chunks"
@ -137,16 +130,14 @@ class RCompressMixIn:
ids, recompress_ids = recompress_ids[:chunks_limit], recompress_ids[chunks_limit:]
process_chunks(repository, repo_objs, stats_process, ids, olevel)
pi.show(increase=len(ids))
checkpointed = self.maybe_checkpoint(
checkpoint_func=checkpoint_func, checkpoint_interval=args.checkpoint_interval
)
uncommitted_chunks = 0 if checkpointed else (uncommitted_chunks + len(ids))
pi.finish()
if sig_int:
# Ctrl-C / SIGINT: do not checkpoint (commit) again, we already have a checkpoint in this case.
# Ctrl-C / SIGINT: do not commit
raise Error("Got Ctrl-C / SIGINT.")
elif uncommitted_chunks > 0:
checkpoint_func()
else:
while repository.async_response(wait=True) is not None:
pass
repository.commit(compact=True)
if args.stats:
print()
print("Recompression stats:")
@ -188,11 +179,6 @@ class RCompressMixIn:
Please note that the outcome might not always be the desired compression
type/level - if no compression gives a shorter output, that might be chosen.
Every ``--checkpoint-interval``, progress is committed to the repository and
the repository is compacted (this is to keep temporary repo space usage in bounds).
A lower checkpoint interval means lower temporary repo space usage, but also
slower progress due to higher overhead (and vice versa).
Please note that this command can not work in low (or zero) free disk space
conditions.
@ -228,14 +214,3 @@ class RCompressMixIn:
)
subparser.add_argument("-s", "--stats", dest="stats", action="store_true", help="print statistics")
subparser.add_argument(
"-c",
"--checkpoint-interval",
metavar="SECONDS",
dest="checkpoint_interval",
type=int,
default=1800,
action=Highlander,
help="write checkpoint every SECONDS seconds (Default: 1800)",
)

View file

@ -34,8 +34,6 @@ class RecreateMixIn:
progress=args.progress,
stats=args.stats,
file_status_printer=self.print_file_status,
checkpoint_interval=args.checkpoint_interval,
checkpoint_volume=args.checkpoint_volume,
dry_run=args.dry_run,
timestamp=args.timestamp,
)
@ -142,25 +140,6 @@ class RecreateMixIn:
help="create a new archive with the name ARCHIVE, do not replace existing archive "
"(only applies for a single archive)",
)
archive_group.add_argument(
"-c",
"--checkpoint-interval",
dest="checkpoint_interval",
type=int,
default=1800,
action=Highlander,
metavar="SECONDS",
help="write checkpoint every SECONDS seconds (Default: 1800)",
)
archive_group.add_argument(
"--checkpoint-volume",
metavar="BYTES",
dest="checkpoint_volume",
type=int,
default=0,
action=Highlander,
help="write checkpoint every BYTES bytes (Default: 0, meaning no volume based checkpointing)",
)
archive_group.add_argument(
"--comment",
metavar="COMMENT",

View file

@ -92,12 +92,6 @@ class RListMixIn:
help="list repository contents",
)
subparser.set_defaults(func=self.do_rlist)
subparser.add_argument(
"--consider-checkpoints",
action="store_true",
dest="consider_checkpoints",
help="Show checkpoint archives in the repository contents list (default: hidden).",
)
subparser.add_argument(
"--short", dest="short", action="store_true", help="only print the archive names, nothing else"
)

View file

@ -269,16 +269,7 @@ class TarMixIn:
start_monotonic=t0_monotonic,
log_json=args.log_json,
)
cp = ChunksProcessor(
cache=cache,
key=key,
add_item=archive.add_item,
prepare_checkpoint=archive.prepare_checkpoint,
write_checkpoint=archive.write_checkpoint,
checkpoint_interval=args.checkpoint_interval,
checkpoint_volume=args.checkpoint_volume,
rechunkify=False,
)
cp = ChunksProcessor(cache=cache, key=key, add_item=archive.add_item, rechunkify=False)
tfo = TarfileObjectProcessors(
cache=cache,
key=key,
@ -524,25 +515,6 @@ class TarMixIn:
help="manually specify the archive creation date/time (yyyy-mm-ddThh:mm:ss[(+|-)HH:MM] format, "
"(+|-)HH:MM is the UTC offset, default: local time zone). Alternatively, give a reference file/directory.",
)
archive_group.add_argument(
"-c",
"--checkpoint-interval",
dest="checkpoint_interval",
type=int,
default=1800,
action=Highlander,
metavar="SECONDS",
help="write checkpoint every SECONDS seconds (Default: 1800)",
)
archive_group.add_argument(
"--checkpoint-volume",
metavar="BYTES",
dest="checkpoint_volume",
type=int,
default=0,
action=Highlander,
help="write checkpoint every BYTES bytes (Default: 0, meaning no volume based checkpointing)",
)
archive_group.add_argument(
"--chunker-params",
dest="chunker_params",

View file

@ -33,7 +33,6 @@ class TransferMixIn:
)
dry_run = args.dry_run
args.consider_checkpoints = True
archive_names = tuple(x.name for x in other_manifest.archives.list_considering(args))
if not archive_names:
return
@ -193,7 +192,7 @@ class TransferMixIn:
If you want to globally change compression while transferring archives to the DST_REPO,
give ``--compress=WANTED_COMPRESSION --recompress=always``.
The default is to transfer all archives, including checkpoint archives.
The default is to transfer all archives.
You could use the misc. archive filter options to limit which archives it will
transfer, e.g. using the ``-a`` option. This is recommended for big

View file

@ -642,7 +642,7 @@ class ChunksMixin:
marker = result[-1]
# All chunks from the repository have a refcount of MAX_VALUE, which is sticky,
# therefore we can't/won't delete them. Chunks we added ourselves in this transaction
# (e.g. checkpoint archives) are tracked correctly.
# are tracked correctly.
init_entry = ChunkIndexEntry(refcount=ChunkIndex.MAX_VALUE, size=0)
for id_ in result:
num_chunks += 1

View file

@ -244,8 +244,7 @@ class SigIntManager:
self.ctx = None
# global flag which might trigger some special behaviour on first ctrl-c / SIGINT,
# e.g. if this is interrupting "borg create", it shall try to create a checkpoint.
# global flag which might trigger some special behaviour on first ctrl-c / SIGINT.
sig_int = SigIntManager()

View file

@ -110,7 +110,6 @@ class Archives(abc.MutableMapping):
def list(
self,
*,
consider_checkpoints=True,
match=None,
match_end=r"\Z",
sort_by=(),
@ -149,8 +148,6 @@ class Archives(abc.MutableMapping):
if any([oldest, newest, older, newer]):
archives = filter_archives_by_date(archives, oldest=oldest, newest=newest, newer=newer, older=older)
if not consider_checkpoints:
archives = [x for x in archives if ".checkpoint" not in x.name]
for sortkey in reversed(sort_by):
archives.sort(key=attrgetter(sortkey))
if first:
@ -163,18 +160,15 @@ class Archives(abc.MutableMapping):
def list_considering(self, args):
"""
get a list of archives, considering --first/last/prefix/match-archives/sort/consider-checkpoints cmdline args
get a list of archives, considering --first/last/prefix/match-archives/sort cmdline args
"""
name = getattr(args, "name", None)
consider_checkpoints = getattr(args, "consider_checkpoints", None)
if name is not None:
raise Error(
"Giving a specific name is incompatible with options --first, --last, "
"-a / --match-archives, and --consider-checkpoints."
"Giving a specific name is incompatible with options --first, --last " "and -a / --match-archives."
)
return self.list(
sort_by=args.sort_by.split(","),
consider_checkpoints=consider_checkpoints,
match=args.match_archives,
first=getattr(args, "first", None),
last=getattr(args, "last", None),

View file

@ -584,7 +584,6 @@ class RemoteRepository:
borg_cmd = self.ssh_cmd(location) + borg_cmd
logger.debug("SSH command line: %s", borg_cmd)
# we do not want the ssh getting killed by Ctrl-C/SIGINT because it is needed for clean shutdown of borg.
# borg's SIGINT handler tries to write a checkpoint and requires the remote repo connection.
self.p = Popen(borg_cmd, bufsize=0, stdin=PIPE, stdout=PIPE, stderr=PIPE, env=env, preexec_fn=ignore_sigint)
self.stdin_fd = self.p.stdin.fileno()
self.stdout_fd = self.p.stdout.fileno()

View file

@ -623,7 +623,6 @@ class RemoteRepository3:
borg_cmd = self.ssh_cmd(location) + borg_cmd
logger.debug("SSH command line: %s", borg_cmd)
# we do not want the ssh getting killed by Ctrl-C/SIGINT because it is needed for clean shutdown of borg.
# borg's SIGINT handler tries to write a checkpoint and requires the remote repo connection.
self.p = Popen(borg_cmd, bufsize=0, stdin=PIPE, stdout=PIPE, stderr=PIPE, env=env, preexec_fn=ignore_sigint)
self.stdin_fd = self.p.stdin.fileno()
self.stdout_fd = self.p.stdout.fileno()

View file

@ -236,33 +236,9 @@ def test_create_stdin(archivers, request):
assert extracted_data == input_data
def test_create_stdin_checkpointing(archivers, request):
archiver = request.getfixturevalue(archivers)
chunk_size = 1000 # fixed chunker with this size, also volume based checkpointing after that volume
cmd(archiver, "rcreate", RK_ENCRYPTION)
input_data = b"X" * (chunk_size * 2 - 1) # one full and one partial chunk
cmd(
archiver,
"create",
f"--chunker-params=fixed,{chunk_size}",
f"--checkpoint-volume={chunk_size}",
"test",
"-",
input=input_data,
)
# repo looking good overall? checks for rc == 0.
cmd(archiver, "check", "--debug")
# verify that there are no part files in final archive
out = cmd(archiver, "list", "test")
assert "stdin.borg_part" not in out
# verify full file
out = cmd(archiver, "extract", "test", "stdin", "--stdout", binary_output=True)
assert out == input_data
def test_create_erroneous_file(archivers, request):
archiver = request.getfixturevalue(archivers)
chunk_size = 1000 # fixed chunker with this size, also volume based checkpointing after that volume
chunk_size = 1000 # fixed chunker with this size
create_regular_file(archiver.input_path, os.path.join(archiver.input_path, "file1"), size=chunk_size * 2)
create_regular_file(archiver.input_path, os.path.join(archiver.input_path, "file2"), size=chunk_size * 2)
create_regular_file(archiver.input_path, os.path.join(archiver.input_path, "file3"), size=chunk_size * 2)

@ -23,39 +23,18 @@ def test_prune_repository(archivers, request):
cmd(archiver, "rcreate", RK_ENCRYPTION)
cmd(archiver, "create", "test1", src_dir)
cmd(archiver, "create", "test2", src_dir)
# these are not really a checkpoints, but they look like some:
cmd(archiver, "create", "test3.checkpoint", src_dir)
cmd(archiver, "create", "test3.checkpoint.1", src_dir)
cmd(archiver, "create", "test4.checkpoint", src_dir)
output = cmd(archiver, "prune", "--list", "--dry-run", "--keep-daily=1")
assert re.search(r"Would prune:\s+test1", output)
# must keep the latest non-checkpoint archive:
# must keep the latest archive:
assert re.search(r"Keeping archive \(rule: daily #1\):\s+test2", output)
# must keep the latest checkpoint archive:
assert re.search(r"Keeping checkpoint archive:\s+test4.checkpoint", output)
output = cmd(archiver, "rlist", "--consider-checkpoints")
output = cmd(archiver, "rlist")
assert "test1" in output
assert "test2" in output
assert "test3.checkpoint" in output
assert "test3.checkpoint.1" in output
assert "test4.checkpoint" in output
cmd(archiver, "prune", "--keep-daily=1")
output = cmd(archiver, "rlist", "--consider-checkpoints")
output = cmd(archiver, "rlist")
assert "test1" not in output
# the latest non-checkpoint archive must be still there:
# the latest archive must be still there:
assert "test2" in output
# only the latest checkpoint archive must still be there:
assert "test3.checkpoint" not in output
assert "test3.checkpoint.1" not in output
assert "test4.checkpoint" in output
# now we supersede the latest checkpoint by a successful backup:
cmd(archiver, "create", "test5", src_dir)
cmd(archiver, "prune", "--keep-daily=2")
output = cmd(archiver, "rlist", "--consider-checkpoints")
# all checkpoints should be gone now:
assert "checkpoint" not in output
# the latest archive must be still there
assert "test5" in output
# This test must match docs/misc/prune-example.txt

@ -80,26 +80,6 @@ def test_date_matching(archivers, request):
assert "archive3" not in output
def test_rlist_consider_checkpoints(archivers, request):
archiver = request.getfixturevalue(archivers)
cmd(archiver, "rcreate", RK_ENCRYPTION)
cmd(archiver, "create", "test1", src_dir)
# these are not really a checkpoints, but they look like some:
cmd(archiver, "create", "test2.checkpoint", src_dir)
cmd(archiver, "create", "test3.checkpoint.1", src_dir)
output = cmd(archiver, "rlist")
assert "test1" in output
assert "test2.checkpoint" not in output
assert "test3.checkpoint.1" not in output
output = cmd(archiver, "rlist", "--consider-checkpoints")
assert "test1" in output
assert "test2.checkpoint" in output
assert "test3.checkpoint.1" in output
def test_rlist_json(archivers, request):
archiver = request.getfixturevalue(archivers)
create_regular_file(archiver.input_path, "file1", size=1024 * 80)

@ -47,7 +47,6 @@ class TestAdHocCache:
assert cache.add_chunk(H(1), {}, b"5678", stats=Statistics()) == (H(1), 4)
def test_deletes_chunks_during_lifetime(self, cache, repository):
"""E.g. checkpoint archives"""
cache.add_chunk(H(5), {}, b"1010", stats=Statistics())
assert cache.seen_chunk(H(5)) == 1
cache.chunk_decref(H(5), 1, Statistics())

@ -124,9 +124,9 @@ def test_mismatch(path, patterns):
def test_match_end():
regex = shellpattern.translate("*-home") # default is match_end == string end
assert re.match(regex, "2017-07-03-home")
assert not re.match(regex, "2017-07-03-home.checkpoint")
assert not re.match(regex, "2017-07-03-home.xxx")
match_end = r"(%s)?\Z" % r"\.checkpoint(\.\d+)?" # with/without checkpoint ending
match_end = r"(\.xxx)?\Z" # with/without .xxx ending
regex = shellpattern.translate("*-home", match_end=match_end)
assert re.match(regex, "2017-07-03-home")
assert re.match(regex, "2017-07-03-home.checkpoint")
assert re.match(regex, "2017-07-03-home.xxx")