Commit graph

49 commits

Author SHA1 Message Date
Michael Eischer
67c13c643d
Merge pull request #5691 from MichaelEischer/fix-rewriter-error 2026-02-01 12:09:52 +01:00
Michael Eischer
ef1d525f22 rewriter: return correct error if tree iteration fails 2026-01-31 22:07:07 +01:00
Michael Eischer
74d60ad223 rewriter: test KeepEmptyDirectory option 2026-01-31 22:01:23 +01:00
Michael Eischer
0d71f70a22 minor cleanups and typos 2026-01-31 22:01:23 +01:00
Winfried Plappert
5148608c39 restic rewrite include - based on restic 0.18.1
cmd/restic/cmd_rewrite.go:
introduction of include filters for this command:
- add include filters, add error checking code
- add new parameter 'keepEmptyDirectoryFunc' to 'walker.NewSnapshotSizeRewriter()',
  so empty directories have to be kept to keep the directory structure intact
- add parameter 'keepEmptySnapshot' to 'filterAndReplaceSnapshot()' to keep snapshots
  intact when nothing is to be included
- introduce helper function 'gatherIncludeFilters()' and 'gatherExcludeFilters()' to
  keep code flow clean

cmd/restic/cmd_rewrite_integration_test.go:
add several new tests around the 'include' functionality

internal/filter/include.go:
this is where is include filter is defined

internal/walker/rewriter.go:
- struct RewriteOpts gains field 'KeepEmtpyDirectory', which is a 'NodeKeepEmptyDirectoryFunc()'
  which defaults to nil, so that al subdirectories are kept
- function 'NewSnapshotSizeRewriter()' gains the parameter 'keepEmptyDirecoryFilter' which
  controls the management of empty subdirectories in case of include filters active

internal/data/tree.go:
gains a function Count() for checking the number if node elements in a newly built tree

internal/walker/rewriter_test.go:
function 'NewSnapshotSizeRewriter()' gets an additional parameter nil to keeps things happy

cmd/restic/cmd_repair_snapshots.go:
function 'filterAndReplaceSnapshot()' gets an additional parameter 'keepEmptySnapshot=nil'

doc/045_working_with_repos.rst:
gets to mention include filters

changelog/unreleased/issue-4278:
the usual announcement file

git rebase master -i produced this

restic rewrite include - keep linter happy

cmd/restic/cmd_rewrite_integration_test.go:
linter likes strings.Contain() better than my strings.Index() >= 0
2026-01-31 19:42:56 +00:00
Michael Eischer
17688c2313 data: move TestTreeMap to data package to allow reuse 2026-01-31 20:03:38 +01:00
Michael Eischer
350f29d921 data: replace Tree with TreeNodeIterator
The TreeNodeIterator decodes nodes while iterating over a tree blob.
This should reduce peak memory usage as now only the serialized tree
blob and a single node have to alive at the same time. Using the
iterator has implications for the error handling however. Now it is
necessary that all loops that iterate through a tree check for errors
before using the node returned by the iterator.

The other change is that it is no longer possible to iterate over a tree
multiple times. Instead it must be loaded a second time. This only
affects the tree rewriting code.
2026-01-31 20:03:38 +01:00
Michael Eischer
278e457e1f data: use data.TreeWriter to serialize&write data.Tree
Always serialize trees via TreeJSONBuilder. Add a wrapper called
TreeWriter which combines serialization and saving the tree blob in the
repository. In the future, TreeJSONBuilder will have to upload tree
chunks while the tree is still serialized. This will a wrapper like
TreeWriter, so add it right now already.

The archiver.treeSaver still directly uses the TreeJSONBuilder as it
requires special handling.
2026-01-31 19:18:36 +01:00
Michael Eischer
c6e33c3954 repository: enforce that SaveBlob is called within WithBlobUploader
This is achieved by removing SaveBlob from the public API and only
returning it via a uploader object that is passed in by
WithBlobUploader.
2025-10-12 18:26:26 +02:00
Michael Eischer
56ac8360c7 data: split node and snapshot code from restic package 2025-10-03 19:10:39 +02:00
Michael Eischer
ca5b0c0249 get rid of fmt.Print* usages 2025-10-03 18:55:46 +02:00
Michael Eischer
10cfe96cd4 walker: fix error handling if tree cannot be loaded
A tree that cannot be loaded is a fatal error when walking the tree.
Thus, return the error and exit the tree walk.
2025-06-02 20:04:26 +02:00
Michael Eischer
d3c3390a51 ls: proper error handling if output is not possible 2024-11-01 17:07:43 +01:00
Michael Eischer
ca1e5e10b6 add proper constants for node type 2024-08-31 18:20:01 +02:00
Michael Eischer
ae1cb889dd Add more checks for canceled contexts 2024-07-31 19:30:47 +02:00
Alex Johnson
3bf2927006 Update snapshot summary on rewrite
Signed-off-by: Alex Johnson <hello@alex-johnson.net>
2024-07-16 12:06:50 -05:00
coderwander
a82ed71de7 Fix struct names
Signed-off-by: coderwander <770732124@qq.com>
2024-04-18 10:02:09 +08:00
Alexander Neumann
c0514dd8ba Fix linter errors (except for tests) 2024-02-10 22:58:10 +01:00
Michael Eischer
d4ed7c8858 walker: add tests for leaveDir 2024-01-27 13:17:33 +01:00
Michael Eischer
2c80cfa4a5 walker: fix missing leaveDir if directory is partially skipped 2024-01-27 13:17:33 +01:00
Michael Eischer
9ecbda059c walker: add callback to inform about leaving a directory 2024-01-27 13:17:32 +01:00
Michael Eischer
fdcbb53017 walker: test skipping for root node 2024-01-19 21:16:06 +01:00
Michael Eischer
0b39940fdb walker: Remove ignoreTrees functionality
It was only used in two places:
- stats: apparently as a minor performance optimization, which is
  unlikely to be important
- find: filtered directories would be ignored. However, this
  optimization missed that it is possible that two directories have the
  exact same content. Such directories would be incorrectly ignored too.
  Example:
```
mkdir test test/a test/b
restic backup test
restic find latest test/b
-> incorrectly does not return anything
```

Thus, remove the functionality as it's apparently too complex to use
correctly.
2024-01-19 21:16:06 +01:00
Andrea Gelmini
241916d55b
Fix typos 2023-12-06 13:11:55 +01:00
Michael Eischer
5e4e268bdc Use _ as parameter name for unused Context
The context is required by the implemented interface.
2023-05-18 21:15:45 +02:00
Michael Eischer
1bd1f3008d walker: extend TreeRewriter to support snapshot repairing
This adds support for caching already rewritten trees, handling of load
errors and disabling the check that the serialization doesn't lead to
data loss.
2023-05-01 15:20:24 +02:00
Michael Eischer
38dac78180 walker: restructure FilterTree into TreeRewriter
The more generic RewriteNode callback replaces the SelectByName and
PrintExclude functions. The main part of this change is a preparation to
allow using the TreeRewriter for the `repair snapshots` command.
2023-05-01 15:20:12 +02:00
Michael Eischer
bc2399fbd9 walker: recurse into directory based on node type
A broken directory might also not have a subtree.
2023-05-01 15:20:00 +02:00
Michael Eischer
1a9705fc95 walker: Simplify change detection in FilterTree
Now the rewritten tree is always serialized which makes sure that we
don't accidentally miss any relevant changes.
2023-05-01 15:19:48 +02:00
Michael Eischer
cdb0fb9c06 tweak debug logs 2023-04-23 11:38:06 +02:00
Michael Eischer
f88acd4503 rewrite: Fail if a tree contains an unknown field
In principle, the JSON format of Tree objects is extensible without
requiring a format change. In order to not loose information just play
it safe and reject rewriting trees for which we could loose data.
2022-11-12 19:55:22 +01:00
Michael Eischer
0224e276ec walker: Add tests for FilterTree 2022-11-12 19:55:22 +01:00
Michael Eischer
ad14d6e4ac rewrite: use SelectByName like in the backup command 2022-11-12 19:55:22 +01:00
Michael Eischer
f6339b88af rewrite: extract tree filtering 2022-11-12 19:55:22 +01:00
Michael Eischer
7cf042118f walker: Convert tests to use TreeJSONBuilder
The old code marshalled the tree blobs different than other places in
restic. The hashed tree blob did not contain a final newline character.
2022-10-15 11:04:13 +02:00
Michael Eischer
fbcbd5318c repository: extract LoadTree/SaveTree
The repository has no real idea what a Tree is. So these methods never
belonged there.
2022-07-17 13:11:28 +02:00
Michael Eischer
6f53ecc1ae adapt workers based on whether an operation is CPU or IO-bound
Use runtime.GOMAXPROCS(0) as worker count for CPU-bound tasks,
repo.Connections() for IO-bound task and a combination if a task can be
both. Streaming packs is treated as IO-bound as adding more worker
cannot provide a speedup.

Typical IO-bound tasks are download / uploading / deleting files.
Decoding / Encoding / Verifying are usually CPU-bound. Several tasks are
a combination of both, e.g. for combined download and decode functions.
In the latter case add both limits together. As the backends have their
own concurrency limits restic still won't download more than
repo.Connections() files in parallel, but the additional workers can
decode already downloaded data in parallel.
2022-07-03 12:19:26 +02:00
Michael Eischer
254c8743fc Limit number of large tree blobs loaded in parallel by StreamTrees
Load tree blobs with more than 50MB only from a single goroutine. Very
large tree blobs with for example 400 MB size can otherwise require
roughly 1GB * streamTreeParallelism memory.
2022-02-19 12:26:09 +01:00
greatroar
c892c0bab9 internal/restic: Don't allocate in Tree.Insert
name         old time/op    new time/op    delta
BuildTree-8    34.6µs ± 4%     7.0µs ± 3%  -79.68%  (p=0.000 n=18+19)

name         old alloc/op   new alloc/op   delta
BuildTree-8    34.0kB ± 0%     0.9kB ± 0%  -97.37%  (p=0.000 n=20+20)

name         old allocs/op  new allocs/op  delta
BuildTree-8       108 ± 0%         1 ± 0%  -99.07%  (p=0.000 n=20+20)
2021-09-26 18:08:48 +02:00
Alexander Neumann
3c753c071c errcheck: More error handling 2021-01-30 20:02:37 +01:00
Michael Eischer
efbb850d92 Remove a few redundant type specifiers
This is the result of running `gofmt -s -w **/*.go`
2020-10-06 14:55:13 +02:00
Michael Eischer
1ede018ea6 error variable names should start with 'Err' 2020-09-05 10:07:17 +02:00
Michael Eischer
460e2ffbf6 Collapse a few boolean operations 2020-09-05 10:07:14 +02:00
Michael Eischer
a178e5628e Remove duplicate TreeLoader interface 2020-08-01 12:29:16 +02:00
Alexander Neumann
15ad0e5bc7 walk: Pass parent tree ID to WalkFunc 2018-08-19 23:28:04 +02:00
Mikael Berthe
1f27d17c0d walker.Walk: Pass parent tree-id to WalkFunc 2018-08-19 23:28:04 +02:00
Alexander Neumann
c2c06ae2c9 walker: Don't ignore empty trees by default
Closes #1849
2018-06-17 09:49:03 +02:00
Alexander Neumann
8f26fe271c ls: Use walker for ls 2018-06-09 23:35:20 +02:00
Alexander Neumann
3a86f4852b Add walker for trees in the repo 2018-06-09 23:35:20 +02:00