Commit graph

10 commits

Author SHA1 Message Date
Michael Eischer
0d71f70a22 minor cleanups and typos 2026-01-31 22:01:23 +01:00
Winfried Plappert
ee154ce0ab restic rewrite --include
added function Count() to the *TreeWriter methods
2026-01-31 19:58:29 +00:00
Winfried Plappert
5148608c39 restic rewrite include - based on restic 0.18.1
cmd/restic/cmd_rewrite.go:
introduction of include filters for this command:
- add include filters, add error checking code
- add new parameter 'keepEmptyDirectoryFunc' to 'walker.NewSnapshotSizeRewriter()',
  so empty directories have to be kept to keep the directory structure intact
- add parameter 'keepEmptySnapshot' to 'filterAndReplaceSnapshot()' to keep snapshots
  intact when nothing is to be included
- introduce helper function 'gatherIncludeFilters()' and 'gatherExcludeFilters()' to
  keep code flow clean

cmd/restic/cmd_rewrite_integration_test.go:
add several new tests around the 'include' functionality

internal/filter/include.go:
this is where is include filter is defined

internal/walker/rewriter.go:
- struct RewriteOpts gains field 'KeepEmtpyDirectory', which is a 'NodeKeepEmptyDirectoryFunc()'
  which defaults to nil, so that al subdirectories are kept
- function 'NewSnapshotSizeRewriter()' gains the parameter 'keepEmptyDirecoryFilter' which
  controls the management of empty subdirectories in case of include filters active

internal/data/tree.go:
gains a function Count() for checking the number if node elements in a newly built tree

internal/walker/rewriter_test.go:
function 'NewSnapshotSizeRewriter()' gets an additional parameter nil to keeps things happy

cmd/restic/cmd_repair_snapshots.go:
function 'filterAndReplaceSnapshot()' gets an additional parameter 'keepEmptySnapshot=nil'

doc/045_working_with_repos.rst:
gets to mention include filters

changelog/unreleased/issue-4278:
the usual announcement file

git rebase master -i produced this

restic rewrite include - keep linter happy

cmd/restic/cmd_rewrite_integration_test.go:
linter likes strings.Contain() better than my strings.Index() >= 0
2026-01-31 19:42:56 +00:00
Michael Eischer
ce7c144aac data: add support for unknown keys to treeIterator
While not planned, it's also not completely impossible that a tree node
might get additional top-level fields. As the tree iterator is built
with a strict expectation of the top-level fields, this would result in
a parsing error. Future-proof the code by simply skipping unknown
fields.
2026-01-31 20:03:38 +01:00
Michael Eischer
6de64911fb data: test TreeFinder 2026-01-31 20:03:38 +01:00
Michael Eischer
24d56fe2a6 diff: switch to efficient DualTreeIterator
The previous implementation stored the whole tree in a map and used it
for checking overlap between trees. This is now replaced with the
DualTreeIterator, which iterates over two trees in parallel and returns
the merge stream in order. In case of overlap between both trees, it
returns both nodes at the same time. Otherwise, only a single node is
returned.
2026-01-31 20:03:38 +01:00
Michael Eischer
350f29d921 data: replace Tree with TreeNodeIterator
The TreeNodeIterator decodes nodes while iterating over a tree blob.
This should reduce peak memory usage as now only the serialized tree
blob and a single node have to alive at the same time. Using the
iterator has implications for the error handling however. Now it is
necessary that all loops that iterate through a tree check for errors
before using the node returned by the iterator.

The other change is that it is no longer possible to iterate over a tree
multiple times. Instead it must be loaded a second time. This only
affects the tree rewriting code.
2026-01-31 20:03:38 +01:00
Michael Eischer
278e457e1f data: use data.TreeWriter to serialize&write data.Tree
Always serialize trees via TreeJSONBuilder. Add a wrapper called
TreeWriter which combines serialization and saving the tree blob in the
repository. In the future, TreeJSONBuilder will have to upload tree
chunks while the tree is still serialized. This will a wrapper like
TreeWriter, so add it right now already.

The archiver.treeSaver still directly uses the TreeJSONBuilder as it
requires special handling.
2026-01-31 19:18:36 +01:00
Michael Eischer
d82ea53735 data: fix invalid trees used in test cases
data.TestCreateSnapshot which is used in particular by TestFindUsedBlobs
and TestFindUsedBlobs could generate trees with duplicate file names.
This is invalid and going forward will result in an error.
2026-01-31 19:18:36 +01:00
Michael Eischer
56ac8360c7 data: split node and snapshot code from restic package 2025-10-03 19:10:39 +02:00
Renamed from internal/restic/tree.go (Browse further)