Note: although I hopefully fixed all the conflicts,
some tests are quite broken.
Conflicts:
borg/_chunker.c
borg/archive.py
borg/archiver.py
borg/cache.py
borg/helpers.py
borg/testsuite/archiver.py
every chunk has the encryption key type as first byte and we do not want to rewrite the whole repo
to change the passphrase type to repokey type. thus we simply dispatch this type to repokey
handler.
if there is a repokey that contains the same secrets as those derived from the passphrase, it will just work.
if there is none yet, one needs to run migrate-to-repokey command to create it.
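A minimal sketch of the dispatch idea described above. The type byte values and the names KEY_TYPE_PASSPHRASE, KEY_TYPE_REPOKEY, RepoKeyHandler and handler_for are hypothetical and only illustrate mapping two type bytes to one handler; they are not taken from the actual code.

```python
# Hypothetical type bytes and handler names, chosen only for illustration.
KEY_TYPE_PASSPHRASE = 0x01  # legacy type byte stored in already written chunks
KEY_TYPE_REPOKEY = 0x03


class RepoKeyHandler:
    """Stand-in for whatever handles repokey-encrypted chunks."""
    name = "repokey"


# Both type bytes map to the same handler, so old chunks stay readable
# without rewriting the repository.
HANDLERS = {
    KEY_TYPE_PASSPHRASE: RepoKeyHandler,
    KEY_TYPE_REPOKEY: RepoKeyHandler,
}


def handler_for(chunk):
    """Pick a handler class from the first byte of a chunk."""
    if not chunk:
        raise ValueError("empty chunk")
    try:
        return HANDLERS[chunk[0]]
    except KeyError:
        raise ValueError("unknown key type 0x%02x" % chunk[0]) from None
```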
refactorings:
- introduced concept of default answer:
if the answer string is in the defaultish sequence, the return value of yes() will be the default.
e.g. if just pressing <enter> when asked on the console or if an empty string or "default" is
in the environment variable for overriding.
if an environment var has an invalid value and no retries are enabled: return default
if retries are enabled, the next retry won't use the env var again, but will ask via input() instead.
- simplify:
only one default - this should be a SAFE default as it is used in some special conditions
like EOF or invalid input with retries disallowed.
no isatty() magic, the "yes" shell command exists, so we could receive input even if it is not from a tty.
- clean:
separate retry flag from retry_msg
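The behaviour listed above could look roughly like the following sketch. The parameter names (ask, env_var_override, truish, falsish, defaultish) and the accepted answer strings are assumptions made for this example and may differ from the real yes() helper.

```python
import os


def yes(msg="", default=False, retry=True, retry_msg=None,
        env_var_override=None, ask=input,
        truish=("yes", "y", "1"), falsish=("no", "n", "0"),
        defaultish=("", "default")):
    """Return True/False for a yes/no question, or the (safe) default."""
    answer = None
    if env_var_override is not None:
        # The environment variable is consulted once, before any prompting.
        answer = os.environ.get(env_var_override)
    while True:
        if answer is None:
            try:
                answer = ask(msg)
            except EOFError:
                return default  # EOF: fall back to the safe default
        answer = answer.strip().lower()
        if answer in defaultish:
            return default
        if answer in truish:
            return True
        if answer in falsish:
            return False
        if not retry:
            return default  # invalid input and retries disallowed
        # Retry: prompt again; the environment variable is not reused.
        msg = retry_msg if retry_msg is not None else msg
        answer = None
```

Passing the prompt function as a parameter keeps the sketch testable without a tty, which matches the "no isatty() magic" point.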
this is available in python 3.4+.
note:
before removing the pbkdf tests, i ran them with the pbkdf from stdlib to make sure it gives same result.
long term testing of this now belongs in the stdlib tests, not in borg.
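A cross-check of this kind can be sketched as follows. It assumes that the stdlib function in question is hashlib.pbkdf2_hmac (added in Python 3.4) and that SHA-256 is the hash; the reference implementation and both function names are written for this example only and are not the removed borg code.

```python
import hashlib
import hmac
import struct


def pbkdf2_sha256_reference(password, salt, iterations, size):
    """Plain-Python PBKDF2-HMAC-SHA256, used only as a cross-check."""
    derived = b""
    block_index = 1
    while len(derived) < size:
        # U_1 = HMAC(password, salt || big-endian 32 bit block index)
        u = hmac.new(password, salt + struct.pack(">I", block_index),
                     hashlib.sha256).digest()
        block = int.from_bytes(u, "big")
        for _ in range(iterations - 1):
            # U_n = HMAC(password, U_{n-1}); the block is the XOR of all U_i
            u = hmac.new(password, u, hashlib.sha256).digest()
            block ^= int.from_bytes(u, "big")
        derived += block.to_bytes(32, "big")
        block_index += 1
    return derived[:size]


def stdlib_matches_reference(password, salt, iterations, size):
    """Compare the stdlib result with the reference implementation."""
    expected = pbkdf2_sha256_reference(password, salt, iterations, size)
    actual = hashlib.pbkdf2_hmac("sha256", password, salt, iterations, size)
    return hmac.compare_digest(expected, actual)
```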
The fnmatch module in Python's standard library implements a pattern
format for paths which is similar to shell patterns. However, “*”
matches any character including path separators. This newly introduced
pattern syntax with the selector “sh” no longer matches the path
separator with “*”. Instead “**/” can be used to match zero or more
directory levels.
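The difference can be sketched with a small translator to regular expressions. The function sh_to_regex is a hypothetical illustration, not the implementation from this change; it only handles "**/", "*" and "?" and ignores character classes.

```python
import re


def sh_to_regex(pattern):
    """Translate a shell-like pattern in which "*" stops at "/"."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append(r"(?:[^/]+/)*")  # zero or more directory levels
            i += 3
        elif pattern[i] == "*":
            parts.append(r"[^/]*")        # any characters except "/"
            i += 1
        elif pattern[i] == "?":
            parts.append(r"[^/]")         # one character except "/"
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts) + r"\Z")
```

With this sketch, "home/*/.thumbnails" matches "home/user/.thumbnails" but not "home/user/img/.thumbnails", while "home/**/.thumbnails" matches both.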
This change implements the functionality requested in issue #361:
extracting files with a given extension. It does so by permitting
patterns to be used instead of plain prefix paths. The pattern styles
supported are the same as for exclusions.
The “extract” command supports extracting all files underneath a given
set of prefix paths. The forthcoming support for extracting files using
a pattern (e.g. only files ending in “.zip”) requires the introduction
of path prefixes as a third pattern style, making it also available for
exclusions.
The utility functions “adjust_patterns” and “exclude_path” respectively
produce and consume a standard list object containing pattern objects.
With the forthcoming introduction of patterns for filtering files
to be extracted it's better to move the logic of these classes into
a single class.
The wrapper allows adding any number of patterns to an internal list
together with a value to be returned if a match function finds that
one of the patterns matches. A fallback value is returned otherwise.
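A wrapper along these lines might look like the sketch below. The class and method names are assumptions, and compiled regular expressions stand in for the real pattern objects.

```python
import re


class PatternMatcher:
    """Hold (pattern, value) pairs and return the value of the first match."""

    def __init__(self, fallback=None):
        self._items = []
        self.fallback = fallback

    def add(self, patterns, value):
        """Register patterns that should all yield the given value."""
        self._items.extend((pattern, value) for pattern in patterns)

    def match(self, path):
        """Return the value of the first matching pattern, else the fallback."""
        for pattern, value in self._items:
            if pattern.search(path):
                return value
        return self.fallback
```

For exclusions one could add patterns with the value False and use True as the fallback, so that match() answers "should this path be processed?".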
- Stop using “adjust_pattern” and “exclude_path” as they're utility
functions not relevant to testing pattern classes
- Cover a few more cases, especially with more than one path separator
and relative paths
- At least one dedicated test function for each pattern style as opposed
to a single, big test mixing styles
- Use positive instead of negative matching (i.e. the expected list of
resulting items is a list of items matching a pattern)
The class names “IncludePattern” and “ExcludePattern” may have been
appropriate when they were the only styles. With the recent addition of
regular expression support and with at least one more style being added
in forthcoming changes these classes should be renamed to be more
descriptive. “ExcludeRegex” is also renamed to match the new names.
The unit tests for Unicode in path patterns contained a lot of
unnecessary duplication. One set of duplication was for Mac OS X (also
known as Darwin) as it normalizes Unicode in paths to NFD. Then each
test case was repeated for every type of pattern.
With this change the tests become parametrized using py.test. The
duplicated code has been removed.
Patterns to exclude files can be loaded from a text file using the
“--exclude-from” option. Whitespace at the beginning or end of lines was
not stripped. Indented comments would be interpreted as a pattern and
a misplaced space at the end of a line--some text editors don't strip
them--could cause an exclusion pattern to not match as desired. With the
recent addition of regular expression support for exclusions the spaces
can be matched if necessary (“^\s” or “\s$”), though it's highly
unlikely that there are many paths deliberately starting or ending with
whitespace.
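The stripping behaviour can be sketched as below. The function name load_excludes is hypothetical, and "#" as the comment marker is an assumption of this example. Taking a file-like object instead of a file name keeps the parsing easy to test.

```python
import io


def load_excludes(fh):
    """Return the patterns from a file-like object, one per line."""
    patterns = []
    for line in fh:
        line = line.strip()  # drop leading and trailing whitespace
        if not line or line.startswith("#"):
            continue         # skip blank lines and (indented) comments
        patterns.append(line)
    return patterns


example = io.StringIO("# caches\n   # indented comment\n*.tmp  \n\nre:\\.bak$\n")
patterns = load_excludes(example)
```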
The existing option to exclude files and directories, “--exclude”, is
implemented using fnmatch[1]. fnmatch matches the slash (“/”) with “*”
and thus makes it impossible to write patterns where a directory with
a given name should be excluded at a specific depth in the directory
hierarchy, but not anywhere else. Consider this structure:
home/
home/aaa
home/aaa/.thumbnails
home/user
home/user/img
home/user/img/.thumbnails
fnmatch incorrectly excludes “home/user/img/.thumbnails” with a pattern
of “home/*/.thumbnails” when the intention is to exclude “.thumbnails”
in all home directories while retaining directories with the same name
in all other locations.
With this change regular expressions are introduced as an additional
pattern syntax. The syntax is selected using a prefix on “--exclude”'s
value. “re:” is for regular expression and “fm:”, the default, selects
fnmatch. Selecting the syntax is necessary when regular expressions are
desired or when the desired fnmatch pattern starts with two alphanumeric
characters followed by a colon (e.g. “aa:something/*”). The exclusion
described above can be implemented as follows:
--exclude 're:^home/[^/]+/\.thumbnails$'
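The difference between the two styles can be reproduced with the standard library alone. This sketch applies both patterns to the directory structure listed earlier; it only illustrates the matching rules and does not use borg's own pattern classes.

```python
import fnmatch
import re

paths = [
    "home/",
    "home/aaa",
    "home/aaa/.thumbnails",
    "home/user",
    "home/user/img",
    "home/user/img/.thumbnails",
]

# fnmatch: "*" also matches "/", so the nested directory is hit as well.
fnmatch_hits = [p for p in paths if fnmatch.fnmatchcase(p, "home/*/.thumbnails")]

# Regular expression: "[^/]+" restricts the match to one directory level.
regex = re.compile(r"^home/[^/]+/\.thumbnails$")
regex_hits = [p for p in paths if regex.search(p)]
```

Here fnmatch_hits contains both ".thumbnails" directories, while regex_hits contains only "home/aaa/.thumbnails".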
The “--exclude-from” option permits loading exclusions from a text file
where the same prefixes can now be used, e.g. “re:\.tmp$”.
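Splitting the prefix from the pattern might be sketched as follows. The function name split_style is hypothetical, and raising an error for an unknown two-character prefix is an assumption of this example rather than documented behaviour.

```python
import re

# Two alphanumeric characters followed by a colon select the style.
_PREFIX = re.compile(r"^(?P<style>[A-Za-z0-9]{2}):(?P<pattern>.*)$", re.DOTALL)
_KNOWN_STYLES = ("fm", "re")


def split_style(value, default="fm"):
    """Return (style, pattern) for an --exclude value."""
    match = _PREFIX.match(value)
    if match is None:
        return default, value  # no prefix: fnmatch is the default
    style = match.group("style")
    if style not in _KNOWN_STYLES:
        # Assumption: unknown prefixes are rejected instead of guessed at.
        raise ValueError("unknown pattern style: %s" % style)
    return style, match.group("pattern")
```

Under this sketch, "aa:something/*" has to be written as "fm:aa:something/*" to be treated as an fnmatch pattern.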
The documentation has been extended and now not only describes the two
pattern styles, but also the file format supported by “--exclude-from”.
This change has been discussed in issue #43 and in change request #497.
[1] https://docs.python.org/3/library/fnmatch.html
Signed-off-by: Michael Hanselmann <public@hansmi.ch>
The parsing code for exclude files (given via `--exclude-from`) was not
tested. Its core is factorized into a separate function to facilitate an
easier test. The observable behaviour is unchanged.
For 0.29 we worked towards a "silent by default" behaviour, so interactive usage will include -v more frequently in future.
But I noticed that this conflicts with the progress display. This would be no problem if users willingly decide which one
of --verbose or --progress they want to see, but before this fix, the progress display was activated magically when
a tty was detected. So, to counteract this magic, users would need to use --no-progress.
That's backwards imho, so I removed the magic again and users have to give --progress when they want
to see a progress indicator. Or (alternatively) they give --verbose when they want to see the long file list.
as soon as one target segment is full, it is a good time to commit it and remove the source segments
that are already completely unused (because they were transferred into the target segment).
so, for compact_segments(save_space=True), the additional space needed should be about 1 segment size.
note: we can't just do that at the end of one source segment as this might create very small
target segments, which is not wanted.
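The commit strategy can be illustrated with a toy simulation. Everything here is hypothetical (the function name, the data layout, treating every entry as size 1); it only models when commits and deletions happen and is not the repository code.

```python
def simulate_compaction(sources, segment_size):
    """Return the ordered list of commit and delete actions.

    sources maps a segment id to the list of entries still in use.
    """
    actions = []
    target = []       # entries in the current, not yet committed target
    transferred = []  # sources fully copied, but not yet deleted

    def commit():
        # Commit the target first, then drop sources that are fully
        # covered by committed data. The final commit may be empty here.
        actions.append(("commit", tuple(target)))
        for seg_id in transferred:
            actions.append(("delete", seg_id))
        target.clear()
        transferred.clear()

    for seg_id in sorted(sources):
        for entry in sources[seg_id]:
            target.append(entry)
            if len(target) == segment_size:
                commit()  # target segment is full: a good time to commit
        transferred.append(seg_id)
    if target or transferred:
        commit()
    return actions
```

Because deletions follow every full target segment, at most about one segment of extra space is pending at any time in this model.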
removed --log-level due to overlap with how --verbose works now.
for consistency, added --info as alias to --verbose (as the effect is
setting INFO log level).
also added --debug which sets DEBUG log level.
note: there are no messages emitted at DEBUG level yet.
WARNING is the default (because we want mostly silent behaviour,
except if something serious happens), so we don't need --warning
as an option.
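The option layout can be sketched with argparse. The parser below is a stand-alone illustration with an assumed program name and dest; it is not the actual argument parser.

```python
import argparse
import logging


def build_parser():
    """Build a parser where --verbose/--info and --debug set a log level."""
    parser = argparse.ArgumentParser(prog="demo")
    # --info is an alias of --verbose: both select the INFO level.
    parser.add_argument("-v", "--verbose", "--info", dest="log_level",
                        action="store_const", const="info", default="warning")
    parser.add_argument("--debug", dest="log_level",
                        action="store_const", const="debug", default="warning")
    return parser


def level_from_args(argv):
    """Map command line arguments to a numeric logging level."""
    args = build_parser().parse_args(argv)
    return getattr(logging, args.log_level.upper())
```

WARNING needs no option of its own in this sketch because it is the default of both actions.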
the problem here was that we do not just have changed and unchanged items,
but also a lot of items besides regular files which we just back up "as is" without
determining whether they are changed or not. thus, we can't support changed/unchanged
in a way users would expect them to work.
the A/M/U status only applies to the data content of regular files (compared to the index).
for all items, we ALWAYS save the metadata, there is no changed / not changed detection there.
thus, I replaced this with a --filter option where you can just specify which
status chars you want to see listed in the output.
E.g. --filter AM will only show regular files with A(dded) or M(odified) state, but nothing else.
Not giving --filter defaults to showing all items no matter what status they have.
Output is emitted via logger at info level, so it won't show up except if the logger is at that level.
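The filtering rule can be sketched in a few lines. The function name, the logger name and the return value are assumptions added for illustration.

```python
import logging

logger = logging.getLogger("demo.status")


def print_file_status(status, path, status_filter=None):
    """Log "status path" at info level if the status passes the filter.

    Returns True if the item was listed, which makes the rule easy to test.
    """
    if status_filter is None or status in status_filter:
        logger.info("%s %s", status, path)
        return True
    return False
```

With status_filter="AM" only items with status A or M are listed; without a filter every item is listed, though nothing is visible unless the logger is at INFO level.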