Compare commits

No commits in common. "master" and "1.0.3" have entirely different histories.

539 changed files with 23507 additions and 98905 deletions

23
.coveragerc Normal file

@ -0,0 +1,23 @@
[run]
branch = True
source = borg
omit =
*/borg/__init__.py
*/borg/__main__.py
*/borg/_version.py
*/borg/fuse.py
*/borg/support/*
*/borg/testsuite/*
*/borg/hash_sizes.py
[report]
exclude_lines =
pragma: no cover
pragma: freebsd only
pragma: unknown platform only
def __repr__
raise AssertionError
raise NotImplementedError
if 0:
if __name__ == .__main__.:
ignore_errors = True
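
The `exclude_lines` entries above are regular expressions, which is why the last one is written `.__main__.` with a dot in place of each quote character: the dot matches either `'` or `"`. The following is a minimal sketch of that matching, assuming each pattern is searched against a single source line; the helper name `is_excluded` is hypothetical and this is not coverage.py's actual implementation.

```python
import re

# Patterns copied from the exclude_lines list in the .coveragerc above.
EXCLUDE_PATTERNS = [
    r"pragma: no cover",
    r"pragma: freebsd only",
    r"pragma: unknown platform only",
    r"def __repr__",
    r"raise AssertionError",
    r"raise NotImplementedError",
    r"if 0:",
    r"if __name__ == .__main__.:",
]


def is_excluded(source_line: str) -> bool:
    """Return True if any exclusion pattern is found anywhere in the line.

    Hypothetical helper for illustration only.
    """
    return any(re.search(pattern, source_line) for pattern in EXCLUDE_PATTERNS)


if __name__ == "__main__":
    for line in (
        "if __name__ == '__main__':",
        'if __name__ == "__main__":',
        "    raise NotImplementedError",
        "x = compute()",
    ):
        print(f"{line!r}: excluded={is_excluded(line)}")
```

Writing the quotes as `.` avoids quoting problems in the INI file while still matching both quoting styles.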


@ -1,11 +0,0 @@
# EditorConfig is awesome: https://editorconfig.org/
root = true
[*]
end_of_line = lf
charset = utf-8
indent_style = space
indent_size = 4
insert_final_newline = true
trim_trailing_whitespace = true


@ -1,2 +0,0 @@
# Migrate code style to Black
7957af562d5ce8266b177039783be4dc8bdd7898

4
.gitattributes vendored

@ -1,5 +1 @@
borg/_version.py export-subst
*.py diff=python
docs/usage/*.rst.inc merge=ours
docs/man/* merge=ours

6
.github/FUNDING.yml vendored

@ -1,6 +0,0 @@
# These are supported funding model platforms
github: borgbackup
liberapay: borgbackup
open_collective: borgbackup
custom: ['https://www.borgbackup.org/support/fund.html']


@ -1,54 +0,0 @@
<!--
Thank you for reporting an issue.
*IMPORTANT* Before creating a new issue, please look around:
- BorgBackup documentation: https://borgbackup.readthedocs.io/en/stable/index.html
- FAQ: https://borgbackup.readthedocs.io/en/stable/faq.html
- Open issues in the GitHub tracker: https://github.com/borgbackup/borg/issues
If you cannot find a similar problem, then create a new issue.
Please fill in as much of the template as possible.
-->
## Have you checked the BorgBackup docs, FAQ, and open GitHub issues?
No
## Is this a bug/issue report or a question?
Bug/Issue/Question
## System information. For client/server mode, post info for both machines.
#### Your Borg version (borg -V).
#### Operating system (distribution) and version.
#### Hardware/network configuration and filesystems used.
#### How much data is handled by Borg?
#### Full Borg command line that led to the problem (leave out excludes and passwords).
## Describe the problem you're observing.
#### Can you reproduce the problem? If so, describe how. If not, describe troubleshooting steps you took before opening the issue.
#### Include any warnings/errors/backtraces from the system logs
<!--
If this complaint relates to Borg performance, please include CRUD benchmark
results and any steps you took to troubleshoot.
How to run the benchmark: https://borgbackup.readthedocs.io/en/stable/usage/benchmark.html
*IMPORTANT* Please mark logs and terminal command output, otherwise GitHub will not display them correctly.
An example is provided below.
Example:
```
this is an example of how log text should be marked (wrap it with ```)
```
-->


@ -1,18 +0,0 @@
<!--
Thank you for contributing to BorgBackup!
Please make sure your PR complies with our contribution guidelines:
https://borgbackup.readthedocs.io/en/latest/development.html#contributions
-->
## Description
<!-- What does this PR do? Reference any related issues with "fixes #XXXX". -->
## Checklist
- [ ] PR is against `master` (or maintenance branch if only applicable there)
- [ ] New code has tests and docs where appropriate
- [ ] Tests pass (run `tox` or the relevant test subset)
- [ ] Commit messages are clean and reference related issues


@ -1,24 +0,0 @@
version: 2
updates:
- package-ecosystem: "github-actions"
directory: "/"
schedule:
interval: "weekly"
groups:
actions:
patterns:
- "*"
- package-ecosystem: "pip"
directory: "/requirements.d"
ignore:
- dependency-name: "black"
schedule:
interval: "weekly"
cooldown:
semver-major-days: 90
semver-minor-days: 30
groups:
pip-dependencies:
patterns:
- "*"


@ -1,38 +0,0 @@
name: Backport pull request
on:
pull_request_target:
types: [closed]
issue_comment:
types: [created]
permissions:
contents: write # so it can comment
pull-requests: write # so it can create pull requests
jobs:
backport:
name: Backport pull request
runs-on: ubuntu-24.04
timeout-minutes: 5
# Only run when a pull request is merged,
# or when a comment starting with `/backport` is created by someone other than the
# https://github.com/backport-action bot user (user id: 97796249). Note that if you use your
# own PAT as `github_token`, you should replace this id with yours.
if: >
(
github.event_name == 'pull_request_target' &&
github.event.pull_request.merged
) || (
github.event_name == 'issue_comment' &&
github.event.issue.pull_request &&
github.event.comment.user.id != 97796249 &&
startsWith(github.event.comment.body, '/backport')
)
steps:
- uses: actions/checkout@v6
- name: Create backport pull requests
uses: korthout/backport-action@v4
with:
label_pattern: '^port/(.+)$'
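
The `if:` expression of the backport job above can be read as a plain boolean function. The following is a rough sketch of it, assuming the event payload is available as a nested dict with the same field names the expression uses; the function name `should_run_backport` and the dict shape are illustrative assumptions. GitHub documents `startsWith` as case-insensitive, so the sketch lowercases the comment body before comparing.

```python
# User id of the backport-action bot, as given in the workflow comment.
BACKPORT_BOT_USER_ID = 97796249


def should_run_backport(event_name: str, event: dict) -> bool:
    """Mirror the job-level `if:` condition of the backport workflow.

    Hypothetical helper; `event` stands in for `github.event`.
    """
    if event_name == "pull_request_target":
        # Run only when the closed pull request was actually merged.
        return bool(event.get("pull_request", {}).get("merged"))
    if event_name == "issue_comment":
        issue = event.get("issue", {})
        comment = event.get("comment", {})
        return (
            # The comment must be on a pull request, not on a plain issue.
            bool(issue.get("pull_request"))
            # Ignore comments written by the backport bot itself.
            and comment.get("user", {}).get("id") != BACKPORT_BOT_USER_ID
            # Assumption: startsWith is case-insensitive, so compare lowercased.
            and comment.get("body", "").lower().startswith("/backport")
        )
    return False
```

The bot-id check is what keeps the workflow from re-triggering itself on comments that the action posts.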


@ -1,30 +0,0 @@
# https://black.readthedocs.io/en/stable/integrations/github_actions.html#usage
# See also what we use locally in requirements.d/codestyle.txt — this should be the same version here.
name: Lint
on:
push:
paths:
- '**.py'
- 'pyproject.toml'
- '.github/workflows/black.yaml'
pull_request:
paths:
- '**.py'
- 'pyproject.toml'
- '.github/workflows/black.yaml'
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.ref }}
cancel-in-progress: ${{ github.event_name == 'pull_request' }}
jobs:
lint:
runs-on: ubuntu-24.04
timeout-minutes: 5
steps:
- uses: actions/checkout@v6
- uses: psf/black@87928e6d6761a4a6d22250e1fee5601b3998086e # 26.5.1
with:
version: "~= 24.0"


@ -1,151 +0,0 @@
name: Canary (Unlocked Requirements)
on:
schedule:
- cron: '0 7 * * *' # Run at 07:00 UTC
workflow_dispatch: # Allow manual trigger
permissions:
contents: read
jobs:
canary_tests:
name: Canary (${{ matrix.os }}, ${{ matrix.python-version }}, ${{ matrix.toxenv }})
runs-on: ${{ matrix.os }}
timeout-minutes: 360
strategy:
fail-fast: false
matrix:
include:
# A representative subset of environments
- os: ubuntu-24.04
python-version: '3.11'
toxenv: py311-llfuse
- os: ubuntu-24.04
python-version: '3.12'
toxenv: py312-pyfuse3
- os: ubuntu-24.04
python-version: '3.14'
toxenv: py314-mfusepy
- os: macos-15
python-version: '3.14'
toxenv: py314-none
steps:
- uses: actions/checkout@v6
with:
fetch-depth: 0
fetch-tags: true
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v6
with:
python-version: ${{ matrix.python-version }}
- name: Install Linux packages
if: ${{ runner.os == 'Linux' }}
shell: bash
run: |
sudo apt-get update
sudo apt-get install -y pkg-config build-essential
sudo apt-get install -y libssl-dev libacl1-dev liblz4-dev
if [[ "${{ matrix.toxenv }}" == *"llfuse"* ]]; then
sudo apt-get install -y libfuse-dev fuse
elif [[ "${{ matrix.toxenv }}" == *"pyfuse3"* || "${{ matrix.toxenv }}" == *"mfusepy"* ]]; then
sudo apt-get install -y libfuse3-dev fuse3
fi
- name: Install macOS packages
if: ${{ runner.os == 'macOS' }}
shell: bash
run: |
brew bundle install || true
- name: Install Python requirements (UNLOCKED)
shell: bash
run: |
python -m pip install --upgrade pip setuptools wheel
# Use UNLOCKED requirements to catch upstream breakages
pip install -r requirements.d/development.txt
- name: Install borgbackup
shell: bash
run: |
if [[ "${{ matrix.toxenv }}" == *"llfuse"* ]]; then
pip install -e ".[llfuse,cockpit]"
elif [[ "${{ matrix.toxenv }}" == *"pyfuse3"* ]]; then
pip install -e ".[pyfuse3,cockpit]"
elif [[ "${{ matrix.toxenv }}" == *"mfusepy"* ]]; then
pip install -e ".[mfusepy,cockpit]"
else
pip install -e ".[cockpit]"
fi
- name: Run tests (Canary)
shell: bash
run: |
if [[ "${{ matrix.toxenv }}" == *"-windows" ]]; then
python -m pytest -n4 --benchmark-skip -vv -rs -k "not remote" --cov=borg --cov-config=pyproject.toml --cov-report=xml --junitxml=test-results.xml
else
# tox creates its own virtual environment, so the unlocked packages installed
# above are not used by it. Override the deps of the tox environment so that
# it also installs from the unlocked development.txt.
tox -e ${{ matrix.toxenv }} --override "env_run_base.deps=[-rrequirements.d/development.txt]"
fi
windows_canary:
if: true # can be used to temporarily disable the build
name: Canary (Windows)
runs-on: windows-latest
timeout-minutes: 180
env:
PY_COLORS: 1
MSYS2_ARG_CONV_EXCL: "*"
MSYS2_ENV_CONV_EXCL: "*"
defaults:
run:
shell: msys2 {0}
steps:
- uses: actions/checkout@v6
with:
fetch-depth: 0
- uses: msys2/setup-msys2@v2
with:
msystem: UCRT64
update: true
- name: Install system packages
run: ./scripts/msys2-install-deps development
- name: Build python venv
run: |
# building cffi / argon2-cffi in the venv fails, so we try to use the system packages
python -m venv --system-site-packages env
. env/bin/activate
# python -m pip install --upgrade pip
# pip install --upgrade setuptools build wheel
pip install -r requirements.d/pyinstaller.txt
- name: Build
run: |
# build borg.exe
. env/bin/activate
pip install -e ".[cockpit,s3,sftp,rclone]"
mkdir -p dist/binary
pyinstaller -y --clean --distpath=dist/binary scripts/borg.exe.spec
# build sdist and wheel in dist/...
python -m build
- name: Run tests
run: |
# Ensure locally built binary in ./dist/binary/borg-dir is found during tests
export PATH="$GITHUB_WORKSPACE/dist/binary/borg-dir:$PATH"
borg.exe -V
. env/bin/activate
python -m pytest -n4 --benchmark-skip -vv -rs -k "not remote" --cov=borg --cov-config=pyproject.toml --cov-report=xml --junitxml=test-results.xml
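
The canary job above derives both the system packages and the pip extras from substring matches on the tox environment name. The following is a small sketch of that mapping, assuming the same first-match order as the shell `if`/`elif` chains; the function names are hypothetical.

```python
# FUSE backends in the order the workflow's if/elif chain tests them.
FUSE_BACKENDS = ("llfuse", "pyfuse3", "mfusepy")


def apt_fuse_packages(toxenv: str) -> list[str]:
    """Return the FUSE-related apt packages for a tox environment name."""
    if "llfuse" in toxenv:
        return ["libfuse-dev", "fuse"]  # FUSE 2
    if "pyfuse3" in toxenv or "mfusepy" in toxenv:
        return ["libfuse3-dev", "fuse3"]  # FUSE 3
    return []


def pip_install_target(toxenv: str) -> str:
    """Return the editable-install target, e.g. '.[llfuse,cockpit]'."""
    for backend in FUSE_BACKENDS:
        if backend in toxenv:
            return f".[{backend},cockpit]"
    return ".[cockpit]"


if __name__ == "__main__":
    for env in ("py311-llfuse", "py312-pyfuse3", "py314-mfusepy", "py314-none"):
        print(env, apt_fuse_packages(env), pip_install_target(env))
```

Environments without a FUSE backend in their name, such as `py314-none`, fall through to the plain `cockpit` extra.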


@ -1,705 +0,0 @@
# badge: https://github.com/borgbackup/borg/workflows/CI/badge.svg?branch=master
name: CI
on:
push:
branches: [ master ]
tags:
- '2.*'
pull_request:
branches: [ master ]
paths:
- '**.py'
- '**.pyx'
- '**.c'
- '**.h'
- '**.yml'
- '**.toml'
- '**.cfg'
- '**.ini'
- 'requirements.d/*'
- '!docs/**'
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.ref }}
cancel-in-progress: ${{ github.event_name == 'pull_request' }}
permissions:
contents: read
jobs:
lint:
runs-on: ubuntu-24.04
timeout-minutes: 5
steps:
- uses: actions/checkout@v6
- uses: astral-sh/ruff-action@v3
security:
runs-on: ubuntu-24.04
timeout-minutes: 5
steps:
- uses: actions/checkout@v6
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: '3.11'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install bandit[toml]
- name: Run Bandit
run: |
bandit -r src/borg -c pyproject.toml
asan_ubsan:
runs-on: ubuntu-24.04
timeout-minutes: 25
needs: [lint]
steps:
- uses: actions/checkout@v6
with:
# Just fetching one commit is not enough for setuptools-scm, so we fetch all.
fetch-depth: 0
fetch-tags: true
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: '3.12'
- name: Install system packages
run: |
sudo apt-get update
sudo apt-get install -y pkg-config build-essential
sudo apt-get install -y libssl-dev libacl1-dev liblz4-dev
- name: Install Python dependencies
run: |
python -m pip install --upgrade pip
pip install -r requirements.d/development.lock.txt
- name: Build Borg with ASan/UBSan
# Build the C/Cython extensions with AddressSanitizer and UndefinedBehaviorSanitizer enabled.
# How this works:
# - The -fsanitize=address,undefined flags inject runtime checks into our native code. If a bug is hit
# (e.g., buffer overflow, use-after-free, out-of-bounds, or undefined behavior), the sanitizer prints
# a detailed error report to stderr, including a stack trace, and forces the process to exit with
# non-zero status. In CI, this will fail the step/job so you will notice.
# - ASAN_OPTIONS/UBSAN_OPTIONS configure the sanitizers' runtime behavior (see below for meanings).
env:
CFLAGS: "-O1 -g -fno-omit-frame-pointer -fsanitize=address,undefined"
CXXFLAGS: "-O1 -g -fno-omit-frame-pointer -fsanitize=address,undefined"
LDFLAGS: "-fsanitize=address,undefined"
# ASAN_OPTIONS controls AddressSanitizer runtime tweaks:
# - detect_leaks=0: Disable LeakSanitizer to avoid false positives with CPython/pymalloc in short-lived tests.
# - strict_string_checks=1: Make invalid string operations (e.g., over-reads) more likely to be detected.
# - check_initialization_order=1: Catch uses that depend on static initialization order (C++).
# - detect_stack_use_after_return=1: Detect stack-use-after-return via stack poisoning (may increase overhead).
ASAN_OPTIONS: "detect_leaks=0:strict_string_checks=1:check_initialization_order=1:detect_stack_use_after_return=1"
# UBSAN_OPTIONS controls UndefinedBehaviorSanitizer runtime:
# - print_stacktrace=1: Include a stack trace for UB reports to ease debugging.
# Note: UBSan is recoverable by default (process may continue after reporting). If you want CI to
# abort immediately and fail on the first UB, add `halt_on_error=1` (e.g., UBSAN_OPTIONS="print_stacktrace=1:halt_on_error=1").
UBSAN_OPTIONS: "print_stacktrace=1"
# PYTHONDEVMODE enables additional Python runtime checks and warnings.
PYTHONDEVMODE: "1"
run: pip install -e .
- name: Run tests under sanitizers
env:
ASAN_OPTIONS: "detect_leaks=0:strict_string_checks=1:check_initialization_order=1:detect_stack_use_after_return=1"
UBSAN_OPTIONS: "print_stacktrace=1"
PYTHONDEVMODE: "1"
# Ensure the ASan runtime is loaded first to avoid "ASan runtime does not come first" warnings.
# We discover libasan/libubsan paths via gcc and preload them for the Python test process.
# The remote tests are slow and unlikely to find anything useful, so they are deselected.
run: |
set -euo pipefail
export LD_PRELOAD="$(gcc -print-file-name=libasan.so):$(gcc -print-file-name=libubsan.so)"
echo "Using LD_PRELOAD=$LD_PRELOAD"
pytest -v --benchmark-skip -k "not remote"
native_tests:
needs: [lint]
permissions:
contents: read
id-token: write
attestations: write
strategy:
fail-fast: true
# noinspection YAMLSchemaValidation
matrix: >-
${{ fromJSON(
github.event_name == 'pull_request' && '{
"include": [
{"os": "ubuntu-24.04", "python-version": "3.11", "toxenv": "mypy"},
{"os": "ubuntu-24.04", "python-version": "3.11", "toxenv": "docs"},
{"os": "ubuntu-24.04", "python-version": "3.11", "toxenv": "py311-llfuse"},
{"os": "ubuntu-24.04", "python-version": "3.12", "toxenv": "py312-pyfuse3"},
{"os": "ubuntu-24.04", "python-version": "3.14", "toxenv": "py314-mfusepy"}
]
}' || '{
"include": [
{"os": "ubuntu-24.04", "python-version": "3.11", "toxenv": "py311-llfuse"},
{"os": "ubuntu-24.04", "python-version": "3.12", "toxenv": "py312-pyfuse3"},
{"os": "ubuntu-24.04", "python-version": "3.13", "toxenv": "py313-mfusepy"},
{"os": "ubuntu-24.04", "python-version": "3.14", "toxenv": "py314-pyfuse3", "binary": "borg-linux-glibc239-x86_64-gh"},
{"os": "ubuntu-24.04-arm", "python-version": "3.14", "toxenv": "py314-pyfuse3", "binary": "borg-linux-glibc239-arm64-gh"},
{"os": "macos-15", "python-version": "3.14", "toxenv": "py314-none", "binary": "borg-macos-15-arm64-gh"},
{"os": "macos-15-intel", "python-version": "3.14", "toxenv": "py314-none", "binary": "borg-macos-15-x86_64-gh"}
]
}'
) }}
env:
TOXENV: ${{ matrix.toxenv }}
runs-on: ${{ matrix.os }}
# macOS machines can be slow, if overloaded.
timeout-minutes: 360
steps:
- uses: actions/checkout@v6
with:
# Just fetching one commit is not enough for setuptools-scm, so we fetch all.
fetch-depth: 0
fetch-tags: true
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v6
with:
python-version: ${{ matrix.python-version }}
- name: Cache pip
uses: actions/cache@v5
with:
path: ~/.cache/pip
key: ${{ runner.os }}-${{ runner.arch }}-pip-${{ hashFiles('requirements.d/development.lock.txt') }}
restore-keys: |
${{ runner.os }}-${{ runner.arch }}-pip-
${{ runner.os }}-${{ runner.arch }}-
- name: Cache tox environments
uses: actions/cache@v5
with:
path: .tox
key: ${{ runner.os }}-${{ runner.arch }}-tox-${{ matrix.toxenv }}-${{ hashFiles('requirements.d/development.lock.txt', 'pyproject.toml') }}
restore-keys: |
${{ runner.os }}-${{ runner.arch }}-tox-${{ matrix.toxenv }}-
${{ runner.os }}-${{ runner.arch }}-tox-
- name: Install Linux packages
if: ${{ runner.os == 'Linux' }}
run: |
sudo apt-get update
sudo apt-get install -y pkg-config build-essential
sudo apt-get install -y libssl-dev libacl1-dev liblz4-dev
sudo apt-get install -y bash zsh fish # for shell completion tests
sudo apt-get install -y rclone openssh-server curl
if [[ "$TOXENV" == *"llfuse"* ]]; then
sudo apt-get install -y libfuse-dev fuse # Required for Python llfuse module
elif [[ "$TOXENV" == *"pyfuse3"* || "$TOXENV" == *"mfusepy"* ]]; then
sudo apt-get install -y libfuse3-dev fuse3 # Required for the Python pyfuse3 and mfusepy modules
fi
- name: Install macOS packages
if: ${{ runner.os == 'macOS' }}
run: |
brew unlink pkg-config@0.29.2 || true
brew bundle install
- name: Configure OpenSSH SFTP server (test only)
if: ${{ runner.os == 'Linux' && !contains(matrix.toxenv, 'mypy') && !contains(matrix.toxenv, 'docs') }}
run: |
sudo mkdir -p /run/sshd
sudo useradd -m -s /bin/bash sftpuser || true
# Create SSH key for the CI user and authorize it for sftpuser
mkdir -p ~/.ssh
chmod 700 ~/.ssh
test -f ~/.ssh/id_ed25519 || ssh-keygen -t ed25519 -N '' -f ~/.ssh/id_ed25519
sudo mkdir -p /home/sftpuser/.ssh
sudo chmod 700 /home/sftpuser/.ssh
sudo cp ~/.ssh/id_ed25519.pub /home/sftpuser/.ssh/authorized_keys
sudo chown -R sftpuser:sftpuser /home/sftpuser/.ssh
sudo chmod 600 /home/sftpuser/.ssh/authorized_keys
# Allow publickey auth and enable Subsystem sftp
sudo sed -i 's/^#\?PasswordAuthentication .*/PasswordAuthentication no/' /etc/ssh/sshd_config
sudo sed -i 's/^#\?PubkeyAuthentication .*/PubkeyAuthentication yes/' /etc/ssh/sshd_config
if ! grep -q '^Subsystem sftp' /etc/ssh/sshd_config; then echo 'Subsystem sftp /usr/lib/openssh/sftp-server' | sudo tee -a /etc/ssh/sshd_config; fi
# Ensure host keys exist to avoid slow generation on first sshd start
sudo ssh-keygen -A
# Start sshd (listen on default 22 inside runner)
sudo /usr/sbin/sshd -D &
# Add host key to known_hosts so paramiko trusts it
ssh-keyscan -H localhost 127.0.0.1 | tee -a ~/.ssh/known_hosts
# Start ssh-agent and add our key so paramiko can use the agent
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_ed25519
# The REST test starts "borg serve --rest" over ssh as sftpuser, which runs the borg
# under test from the tox venv under $HOME. Allow sftpuser to traverse into the runner
# home so it can reach that borg (the venv dirs/files are created world-r/x by tox/pip).
sudo chmod o+x "$HOME"
# Export SFTP test URL for tox via GITHUB_ENV
echo "BORG_TEST_SFTP_REPO=sftp://sftpuser@localhost:22/borg/sftp-repo" >> $GITHUB_ENV
echo "BORG_TEST_REST_REPO=rest://sftpuser@localhost:22/borg/rest-repo" >> $GITHUB_ENV
- name: Install and configure MinIO S3 server (test only)
if: ${{ runner.os == 'Linux' && !contains(matrix.toxenv, 'mypy') && !contains(matrix.toxenv, 'docs') }}
run: |
set -e
arch=$(uname -m)
case "$arch" in
x86_64|amd64) srv_url=https://dl.min.io/server/minio/release/linux-amd64/minio; cli_url=https://dl.min.io/client/mc/release/linux-amd64/mc ;;
aarch64|arm64) srv_url=https://dl.min.io/server/minio/release/linux-arm64/minio; cli_url=https://dl.min.io/client/mc/release/linux-arm64/mc ;;
*) echo "Unsupported arch: $arch"; exit 1 ;;
esac
curl -fsSL -o /usr/local/bin/minio "$srv_url"
curl -fsSL -o /usr/local/bin/mc "$cli_url"
sudo chmod +x /usr/local/bin/minio /usr/local/bin/mc
export PATH=/usr/local/bin:$PATH
# Start MinIO on :9000 with default credentials (minioadmin/minioadmin)
MINIO_DIR="$GITHUB_WORKSPACE/.minio-data"
MINIO_LOG="$GITHUB_WORKSPACE/.minio.log"
mkdir -p "$MINIO_DIR"
nohup minio server "$MINIO_DIR" --address ":9000" >"$MINIO_LOG" 2>&1 &
# Wait for MinIO port to be ready
for i in $(seq 1 60); do (echo > /dev/tcp/127.0.0.1/9000) >/dev/null 2>&1 && break; sleep 1; done
# Configure client and create bucket
mc alias set local http://127.0.0.1:9000 minioadmin minioadmin
mc mb --ignore-existing local/borg
# Export S3 test URL for tox via GITHUB_ENV
echo "BORG_TEST_S3_REPO=s3:minioadmin:minioadmin@http://127.0.0.1:9000/borg/s3-repo" >> $GITHUB_ENV
- name: Install Python requirements
run: |
python -m pip install --upgrade pip setuptools wheel
pip install -r requirements.d/development.lock.txt
- name: Install borgbackup
run: |
if [[ "$TOXENV" == *"llfuse"* ]]; then
pip install -ve ".[llfuse,cockpit,s3,sftp,rclone]"
elif [[ "$TOXENV" == *"pyfuse3"* ]]; then
pip install -ve ".[pyfuse3,cockpit,s3,sftp,rclone]"
elif [[ "$TOXENV" == *"mfusepy"* ]]; then
pip install -ve ".[mfusepy,cockpit,s3,sftp,rclone]"
else
pip install -ve ".[cockpit,s3,sftp,rclone]"
fi
- name: Build Borg fat binaries (${{ matrix.binary }})
if: ${{ matrix.binary && startsWith(github.ref, 'refs/tags/') }}
run: |
pip install -r requirements.d/pyinstaller.txt
./scripts/build-borg-using-pyinstaller.sh
- name: Smoke-test the built binary (${{ matrix.binary }})
if: ${{ matrix.binary && startsWith(github.ref, 'refs/tags/') }}
run: |
pushd dist/binary
echo "single-file binary"
chmod +x borg.exe
./borg.exe -V
echo "single-directory binary"
chmod +x borg-dir/borg.exe
./borg-dir/borg.exe -V
tar czf borg.tgz borg-dir
popd
# Ensure locally built binary in ./dist/binary/borg-dir is found during tests
export PATH="$GITHUB_WORKSPACE/dist/binary/borg-dir:$PATH"
echo "borg.exe binary in PATH"
borg.exe -V
- name: Prepare binaries (${{ matrix.binary }})
if: ${{ matrix.binary && startsWith(github.ref, 'refs/tags/') }}
run: |
mkdir -p artifacts
if [ -f dist/binary/borg.exe ]; then
cp dist/binary/borg.exe artifacts/${{ matrix.binary }}
fi
if [ -f dist/binary/borg.tgz ]; then
cp dist/binary/borg.tgz artifacts/${{ matrix.binary }}.tgz
fi
echo "binary files"
ls -l artifacts/
- name: Attest binaries provenance (${{ matrix.binary }})
if: ${{ matrix.binary && startsWith(github.ref, 'refs/tags/') }}
uses: actions/attest-build-provenance@v4
with:
subject-path: 'artifacts/*'
- name: Upload binaries (${{ matrix.binary }})
if: ${{ matrix.binary && startsWith(github.ref, 'refs/tags/') }}
uses: actions/upload-artifact@v7
with:
name: ${{ matrix.binary }}
path: artifacts/*
if-no-files-found: error
- name: run tox env
run: |
# do not use fakeroot, but run as root. avoids the dreaded EISDIR sporadic failures. see #2482.
#sudo -E bash -c "tox -e py"
# Ensure locally built binary in ./dist/binary/borg-dir is found during tests
export PATH="$GITHUB_WORKSPACE/dist/binary/borg-dir:$PATH"
tox --skip-missing-interpreters
- name: Upload test results to Codecov
if: ${{ !cancelled() && !contains(matrix.toxenv, 'mypy') && !contains(matrix.toxenv, 'docs') }}
uses: codecov/codecov-action@v7
env:
OS: ${{ runner.os }}
python: ${{ matrix.python-version }}
with:
token: ${{ secrets.CODECOV_TOKEN }}
report_type: test_results
env_vars: OS,python
files: test-results.xml
- name: Upload coverage to Codecov
if: ${{ !cancelled() && !contains(matrix.toxenv, 'mypy') && !contains(matrix.toxenv, 'docs') }}
uses: codecov/codecov-action@v7
env:
OS: ${{ runner.os }}
python: ${{ matrix.python-version }}
with:
token: ${{ secrets.CODECOV_TOKEN }}
report_type: coverage
env_vars: OS,python
files: coverage.xml
vm_tests:
permissions:
contents: read
id-token: write
attestations: write
runs-on: ubuntu-24.04
timeout-minutes: 180
needs: [lint]
continue-on-error: true
strategy:
fail-fast: false
matrix:
include:
- os: freebsd
version: '14.3'
display_name: FreeBSD
# Controls binary build and provenance attestation on tags
do_binaries: true
artifact_prefix: borg-freebsd-14-x86_64-gh
- os: netbsd
version: '10.1'
display_name: NetBSD
do_binaries: false
- os: openbsd
version: '7.8'
display_name: OpenBSD
do_binaries: false
- os: omnios
version: 'r151056'
display_name: OmniOS
do_binaries: false
steps:
- name: Check out repository
uses: actions/checkout@v6
with:
fetch-depth: 0
fetch-tags: true
- name: Test on ${{ matrix.display_name }}
id: cross_os
uses: cross-platform-actions/action@v1.2.0
env:
DO_BINARIES: ${{ matrix.do_binaries }}
with:
operating_system: ${{ matrix.os }}
version: ${{ matrix.version }}
shell: bash
run: |
set -euxo pipefail
case "${{ matrix.os }}" in
freebsd)
export IGNORE_OSVERSION=yes
sudo -E pkg update -f
sudo -E pkg install -y liblz4 pkgconf
sudo -E pkg install -y fusefs-libs
sudo -E kldload fusefs
sudo -E sysctl vfs.usermount=1
sudo -E chmod 666 /dev/fuse
sudo -E pkg install -y rust
sudo -E pkg install -y gmake
sudo -E pkg install -y git
sudo -E pkg install -y python311 py311-sqlite3 py311-pip py311-virtualenv
sudo ln -sf /usr/local/bin/python3.11 /usr/local/bin/python3
sudo ln -sf /usr/local/bin/python3.11 /usr/local/bin/python
sudo ln -sf /usr/local/bin/pip3.11 /usr/local/bin/pip3
sudo ln -sf /usr/local/bin/pip3.11 /usr/local/bin/pip
# required for libsodium/pynacl build
export MAKE=gmake
python -m venv .venv
. .venv/bin/activate
python -V
pip -V
python -m pip install --upgrade pip wheel
pip install -r requirements.d/development.lock.txt
pip install -e ".[mfusepy,cockpit,s3,sftp,rclone]"
tox -e py311-mfusepy
if [[ "${{ matrix.do_binaries }}" == "true" && "${{ startsWith(github.ref, 'refs/tags/') }}" == "true" ]]; then
python -m pip install -r requirements.d/pyinstaller.txt
./scripts/build-borg-using-pyinstaller.sh
pushd dist/binary
echo "single-file binary"
chmod +x borg.exe
./borg.exe -V
echo "single-directory binary"
chmod +x borg-dir/borg.exe
./borg-dir/borg.exe -V
tar czf borg.tgz borg-dir
popd
mkdir -p artifacts
if [ -f dist/binary/borg.exe ]; then
cp -v dist/binary/borg.exe artifacts/${{ matrix.artifact_prefix }}
fi
if [ -f dist/binary/borg.tgz ]; then
cp -v dist/binary/borg.tgz artifacts/${{ matrix.artifact_prefix }}.tgz
fi
fi
;;
netbsd)
arch="$(uname -m)"
sudo -E mkdir -p /usr/pkg/etc/pkgin
echo "https://ftp.NetBSD.org/pub/pkgsrc/packages/NetBSD/${arch}/10.1/All" | sudo tee /usr/pkg/etc/pkgin/repositories.conf > /dev/null
sudo -E pkgin update
sudo -E pkgin -y upgrade
sudo -E pkgin -y install lz4 git
sudo -E pkgin -y install rust
sudo -E pkgin -y install pkg-config
sudo -E pkgin -y install py311-pip py311-virtualenv py311-tox
sudo -E ln -sf /usr/pkg/bin/python3.11 /usr/pkg/bin/python3
sudo -E ln -sf /usr/pkg/bin/pip3.11 /usr/pkg/bin/pip3
sudo -E ln -sf /usr/pkg/bin/virtualenv-3.11 /usr/pkg/bin/virtualenv3
sudo -E ln -sf /usr/pkg/bin/tox-3.11 /usr/pkg/bin/tox3
# Ensure base system admin tools are on PATH for the non-root shell
export PATH="/sbin:/usr/sbin:$PATH"
echo "--- Preparing an extattr-enabled filesystem ---"
# On many NetBSD setups /tmp is tmpfs without extended attributes.
# Create an FFS image with extended attributes enabled and use it for TMPDIR.
VNDDEV="vnd0"
IMGFILE="/tmp/fs.img"
sudo -E dd if=/dev/zero of=${IMGFILE} bs=1m count=1024
sudo -E vndconfig -c "${VNDDEV}" "${IMGFILE}"
sudo -E newfs -O 2ea /dev/r${VNDDEV}a
MNT="/mnt/eafs"
sudo -E mkdir -p ${MNT}
sudo -E mount -t ffs -o extattr /dev/${VNDDEV}a $MNT
export TMPDIR="${MNT}/tmp"
sudo -E mkdir -p ${TMPDIR}
sudo -E chmod 1777 ${TMPDIR}
touch ${TMPDIR}/testfile
lsextattr user ${TMPDIR}/testfile && echo "[xattr] *** xattrs SUPPORTED on ${TMPDIR}! ***"
tox3 -e py311-none
;;
openbsd)
sudo -E pkg_add lz4 git
sudo -E pkg_add rust
sudo -E pkg_add openssl%3.5
sudo -E pkg_add py3-pip py3-virtualenv py3-tox
export BORG_OPENSSL_NAME=eopenssl35
tox -e py312-none
;;
omnios)
sudo pkg install gcc14 git pkg-config python-313 gnu-make gnu-coreutils rust
sudo ln -sf /usr/bin/python3.13 /usr/bin/python3
sudo ln -sf /usr/bin/python3.13-config /usr/bin/python3-config
sudo python3 -m ensurepip
sudo python3 -m pip install virtualenv
# On omniOS /tmp is swap-backed tmpfs (small, RAM-bound), so the pip/cargo
# build temps and the pytest temp tree quickly exhaust it ("no space left on
# device"). /var/tmp is disk-backed (ZFS), so redirect TMPDIR there.
export TMPDIR=/var/tmp/borg-ci
mkdir -p "$TMPDIR"
python3 -m venv .venv
. .venv/bin/activate
python -V
pip -V
python -m pip install --upgrade pip wheel
pip install -r requirements.d/development.lock.txt
# no fuse support on omnios in our tests usually
pip install -e .
tox -e py313-none
;;
haiku)
pkgman refresh
pkgman install -y git pkgconfig lz4
pkgman install -y openssl3
pkgman install -y rust_bin
pkgman install -y python3.11
pkgman install -y cffi
pkgman install -y lz4_devel openssl3_devel libffi_devel
# there is no pkgman package for tox, so we install it into a venv
python3 -m ensurepip --upgrade
python3 -m pip install --upgrade pip wheel
python3 -m venv .venv
. .venv/bin/activate
export PKG_CONFIG_PATH="/system/develop/lib/pkgconfig:/system/lib/pkgconfig:${PKG_CONFIG_PATH:-}"
export BORG_LIBLZ4_PREFIX=/system/develop
export BORG_OPENSSL_PREFIX=/system/develop
pip install -r requirements.d/development.lock.txt
pip install -e .
# troubles with either tox or pytest xdist, so we run pytest manually:
pytest -v -n auto -rs --cov=borg --cov-config=pyproject.toml --cov-report=xml --junitxml=test-results.xml --benchmark-skip -k "not remote and not socket"
;;
esac
- name: Upload artifacts
if: startsWith(github.ref, 'refs/tags/') && matrix.do_binaries
uses: actions/upload-artifact@v7
with:
name: ${{ matrix.artifact_prefix }}
path: artifacts/*
if-no-files-found: ignore
- name: Attest provenance
if: startsWith(github.ref, 'refs/tags/') && matrix.do_binaries
uses: actions/attest-build-provenance@v4
with:
subject-path: 'artifacts/*'
- name: Upload test results to Codecov
if: ${{ !cancelled() }}
uses: codecov/codecov-action@v7
env:
OS: ${{ matrix.os }}
with:
token: ${{ secrets.CODECOV_TOKEN }}
report_type: test_results
env_vars: OS
files: test-results.xml
- name: Upload coverage to Codecov
if: ${{ !cancelled() }}
uses: codecov/codecov-action@v7
env:
OS: ${{ matrix.os }}
with:
token: ${{ secrets.CODECOV_TOKEN }}
report_type: coverage
env_vars: OS
files: coverage.xml
windows_tests:
if: true # can be used to temporarily disable the build
runs-on: windows-latest
timeout-minutes: 90
needs: [lint]
env:
PY_COLORS: 1
MSYS2_ARG_CONV_EXCL: "*"
MSYS2_ENV_CONV_EXCL: "*"
defaults:
run:
shell: msys2 {0}
steps:
- uses: actions/checkout@v6
with:
fetch-depth: 0
- uses: msys2/setup-msys2@v2
with:
msystem: UCRT64
update: true
- name: Install system packages
run: ./scripts/msys2-install-deps development
- name: Build python venv
run: |
# building cffi / argon2-cffi in the venv fails, so we try to use the system packages
python -m venv --system-site-packages env
. env/bin/activate
# python -m pip install --upgrade pip
# pip install --upgrade setuptools build wheel
pip install -r requirements.d/pyinstaller.txt
- name: Build
run: |
# build borg.exe
. env/bin/activate
pip install -e ".[cockpit,s3,sftp,rclone]"
./scripts/build-borg-using-pyinstaller.sh
# build sdist and wheel in dist/...
python -m build
- uses: actions/upload-artifact@v7
with:
name: borg-windows
path: dist/binary/borg.exe
- name: Run tests
run: |
# Ensure locally built binary in ./dist/binary/borg-dir is found during tests
export PATH="$GITHUB_WORKSPACE/dist/binary/borg-dir:$PATH"
borg.exe -V
. env/bin/activate
python -m pytest -n4 --benchmark-skip -vv -rs -k "not remote" --cov=borg --cov-config=pyproject.toml --cov-report=xml --junitxml=test-results.xml
- name: Upload test results to Codecov
if: ${{ !cancelled() }}
uses: codecov/codecov-action@v7
env:
OS: ${{ runner.os }}
python: '3.11'
with:
token: ${{ secrets.CODECOV_TOKEN }}
report_type: test_results
env_vars: OS,python
files: test-results.xml
- name: Upload coverage to Codecov
if: ${{ !cancelled() }}
uses: codecov/codecov-action@v7
env:
OS: ${{ runner.os }}
python: '3.11'
with:
token: ${{ secrets.CODECOV_TOKEN }}
report_type: coverage
env_vars: OS,python
files: coverage.xml
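
The `native_tests` job in the CI workflow above selects its matrix with the `condition && A || B` idiom wrapped in `fromJSON`. The following is a minimal sketch of that selection, assuming Python's `and`/`or` behave like the expression operators here. The matrix strings are copied from the workflow and the function name `select_matrix` is hypothetical. The idiom falls through to `B` whenever `A` is falsy, which is safe in this case only because the pull-request matrix is a non-empty string.

```python
import json

# Reduced matrix used for pull requests (copied from the workflow).
PR_MATRIX = """{
  "include": [
    {"os": "ubuntu-24.04", "python-version": "3.11", "toxenv": "mypy"},
    {"os": "ubuntu-24.04", "python-version": "3.11", "toxenv": "docs"},
    {"os": "ubuntu-24.04", "python-version": "3.11", "toxenv": "py311-llfuse"},
    {"os": "ubuntu-24.04", "python-version": "3.12", "toxenv": "py312-pyfuse3"},
    {"os": "ubuntu-24.04", "python-version": "3.14", "toxenv": "py314-mfusepy"}
  ]
}"""

# Full matrix used for pushes and tags (copied from the workflow).
FULL_MATRIX = """{
  "include": [
    {"os": "ubuntu-24.04", "python-version": "3.11", "toxenv": "py311-llfuse"},
    {"os": "ubuntu-24.04", "python-version": "3.12", "toxenv": "py312-pyfuse3"},
    {"os": "ubuntu-24.04", "python-version": "3.13", "toxenv": "py313-mfusepy"},
    {"os": "ubuntu-24.04", "python-version": "3.14", "toxenv": "py314-pyfuse3", "binary": "borg-linux-glibc239-x86_64-gh"},
    {"os": "ubuntu-24.04-arm", "python-version": "3.14", "toxenv": "py314-pyfuse3", "binary": "borg-linux-glibc239-arm64-gh"},
    {"os": "macos-15", "python-version": "3.14", "toxenv": "py314-none", "binary": "borg-macos-15-arm64-gh"},
    {"os": "macos-15-intel", "python-version": "3.14", "toxenv": "py314-none", "binary": "borg-macos-15-x86_64-gh"}
  ]
}"""


def select_matrix(event_name: str) -> dict:
    """Pick the matrix the same way `cond && A || B` does, then parse it."""
    chosen = (event_name == "pull_request" and PR_MATRIX) or FULL_MATRIX
    return json.loads(chosen)


if __name__ == "__main__":
    for event in ("pull_request", "push"):
        envs = [entry["toxenv"] for entry in select_matrix(event)["include"]]
        print(event, envs)
```

Only entries in the full matrix carry a `binary` key, which matches the workflow building binaries only for tag pushes.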


@ -1,86 +0,0 @@
# CodeQL semantic code analysis engine
name: "CodeQL"
on:
push:
branches: [ master ]
paths:
- '**.py'
- '**.pyx'
- '**.c'
- '**.h'
- '.github/workflows/codeql-analysis.yml'
pull_request:
# The branches below must be a subset of the branches above
branches: [ master ]
paths:
- '**.py'
- '**.pyx'
- '**.c'
- '**.h'
- '.github/workflows/codeql-analysis.yml'
schedule:
- cron: '39 2 * * 5'
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.ref }}
cancel-in-progress: ${{ github.event_name == 'pull_request' }}
jobs:
analyze:
name: Analyze
runs-on: ubuntu-24.04
timeout-minutes: 20
permissions:
actions: read
contents: read
security-events: write
strategy:
fail-fast: false
matrix:
language: [ 'cpp', 'python' ]
# CodeQL supports [ 'cpp', 'csharp', 'go', 'java', 'javascript', 'python', 'ruby' ]
# Learn more about CodeQL language support at https://codeql.github.com/docs/codeql-overview/supported-languages-and-frameworks/
steps:
- name: Checkout repository
uses: actions/checkout@v6
with:
# Just fetching one commit is not enough for setuptools-scm, so we fetch all.
fetch-depth: 0
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: 3.11
- name: Cache pip
uses: actions/cache@v5
with:
path: ~/.cache/pip
key: ${{ runner.os }}-pip-${{ hashFiles('requirements.d/development.txt') }}
restore-keys: |
${{ runner.os }}-pip-
${{ runner.os }}-
- name: Install requirements
run: |
sudo apt-get update
sudo apt-get install -y pkg-config build-essential
sudo apt-get install -y libssl-dev libacl1-dev liblz4-dev
# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
uses: github/codeql-action/init@v4
with:
languages: ${{ matrix.language }}
# If you wish to specify custom queries, you can do so here or in a config file.
# By default, queries listed here will override any specified in a config file.
# Prefix the list here with "+" to use these queries and those in the config file.
# queries: ./path/to/local/query, your-org/your-repo/queries@main
- name: Build and install Borg
run: |
python3 -m venv ../borg-env
source ../borg-env/bin/activate
pip3 install -r requirements.d/development.txt
pip3 install -ve .
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v4

.gitignore vendored

@ -2,34 +2,24 @@ MANIFEST
docs/_build
build
dist
borg-env
.tox
src/borg/compress.c
src/borg/hashindex.c
src/borg/crypto/low_level.c
src/borg/item.c
src/borg/chunkers/buzhash.c
src/borg/chunkers/buzhash64.c
src/borg/chunkers/reader.c
src/borg/checksums.c
src/borg/platform/darwin.c
src/borg/platform/freebsd.c
src/borg/platform/netbsd.c
src/borg/platform/linux.c
src/borg/platform/syncfilerange.c
src/borg/platform/posix.c
src/borg/platform/windows.c
src/borg/_version.py
hashindex.c
chunker.c
compress.c
crypto.c
platform_darwin.c
platform_freebsd.c
platform_linux.c
*.egg-info
*.pyc
*.pyd
*.pyo
*.so
.idea/
.junie/
.vscode/
.cache/
borg/_version.py
borg.build/
borg.dist/
borg.exe
.coverage
.coverage.*
coverage.xml
test-results.xml
.vagrant
.DS_Store


@ -1,14 +0,0 @@
Abdel-Rahman <abodyxplay1@gmail.com>
Brian Johnson <brian@sherbang.com>
Carlo Teubner <carlo.teubner@gmail.com>
Mark Edgington <edgimar@gmail.com>
Leo Famulari <leo@famulari.name>
Marian Beermann <public@enkore.de>
Thomas Waldmann <tw@waldmann-edv.de>
Dan Christensen <jdc@uwo.ca> <jdc+github@uwo.ca>
Antoine Beaupré <anarcat@koumbit.org> <anarcat@debian.org> <anarcat@users.noreply.github.com>
Hartmut Goebel <h.goebel@crazy-compilers.com> <htgoebel@users.noreply.github.com>
Michael Gajda <michaelg@speciesm.net> <michael.gajda@tu-dortmund.de>
Milkey Mouse <milkeymouse@meme.institute> <milkey-mouse@users.noreply.github.com>
Ronny Pfannschmidt <opensource@ronnypfannschmidt.de> <ronny.pfannschmidt@redhat.com>
Stefan Tatschner <rumpelsepp@sevenbyte.org> <stefan@sevenbyte.org>


@ -1,9 +0,0 @@
repos:
- repo: https://github.com/psf/black
rev: 24.8.0
hooks:
- id: black
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.15.0
hooks:
- id: ruff


@ -1,32 +0,0 @@
# .readthedocs.yaml - Read the Docs configuration file.
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details.
version: 2
build:
os: ubuntu-22.04
tools:
python: "3.11"
jobs:
post_checkout:
- git fetch --unshallow
apt_packages:
- build-essential
- pkg-config
- libacl1-dev
- libssl-dev
- liblz4-dev
python:
install:
- requirements: requirements.d/development.lock.txt
- requirements: requirements.d/docs.txt
- method: pip
path: .
sphinx:
configuration: docs/conf.py
formats:
- htmlzip
- pdf

.travis.yml Normal file

@ -0,0 +1,43 @@
sudo: required
language: python
cache:
directories:
- $HOME/.cache/pip
matrix:
include:
- python: 3.4
os: linux
env: TOXENV=py34
- python: 3.5
os: linux
env: TOXENV=py35
- python: 3.5
os: linux
env: TOXENV=flake8
- language: generic
os: osx
osx_image: xcode6.4
env: TOXENV=py34
- language: generic
os: osx
osx_image: xcode6.4
env: TOXENV=py35
install:
- ./.travis/install.sh
script:
- ./.travis/run.sh
after_success:
- ./.travis/upload_coverage.sh
notifications:
irc:
channels:
- "irc.freenode.org#borgbackup"
use_notice: true
skip_join: true

.travis/install.sh Executable file

@ -0,0 +1,44 @@
#!/bin/bash
set -e
set -x
if [[ "$(uname -s)" == 'Darwin' ]]; then
brew update || brew update
if [[ "${OPENSSL}" != "0.9.8" ]]; then
brew outdated openssl || brew upgrade openssl
fi
if which pyenv > /dev/null; then
eval "$(pyenv init -)"
fi
brew install lz4
brew outdated pyenv || brew upgrade pyenv
case "${TOXENV}" in
py34)
pyenv install 3.4.3
pyenv global 3.4.3
;;
py35)
pyenv install 3.5.1
pyenv global 3.5.1
;;
esac
pyenv rehash
python -m pip install --user 'virtualenv<14.0'
else
pip install 'virtualenv<14.0'
sudo add-apt-repository -y ppa:gezakovacs/lz4
sudo apt-get update
sudo apt-get install -y liblz4-dev
sudo apt-get install -y libacl1-dev
fi
python -m virtualenv ~/.venv
source ~/.venv/bin/activate
pip install -r requirements.d/development.txt
pip install codecov
pip install -e .

.travis/run.sh Executable file

@ -0,0 +1,23 @@
#!/bin/bash
set -e
set -x
if [[ "$(uname -s)" == "Darwin" ]]; then
eval "$(pyenv init -)"
if [[ "${OPENSSL}" != "0.9.8" ]]; then
# set our flags to use homebrew openssl
export ARCHFLAGS="-arch x86_64"
export LDFLAGS="-L/usr/local/opt/openssl/lib"
export CFLAGS="-I/usr/local/opt/openssl/include"
fi
fi
source ~/.venv/bin/activate
if [[ "$(uname -s)" == "Darwin" ]]; then
# no fakeroot on OS X
sudo tox -e $TOXENV -r
else
fakeroot -u tox -r
fi

.travis/upload_coverage.sh Executable file

@ -0,0 +1,13 @@
#!/bin/bash
set -e
set -x
NO_COVERAGE_TOXENVS=(pep8)
if ! [[ "${NO_COVERAGE_TOXENVS[*]}" =~ "${TOXENV}" ]]; then
source ~/.venv/bin/activate
ln .tox/.coverage .coverage
# on osx, tests run as root, need access to .coverage
sudo chmod 666 .coverage
codecov -e TRAVIS_OS_NAME TOXENV
fi

AUTHORS

@ -1,28 +1,12 @@
Email addresses listed here are not intended for support.
Please see the `support section`_ instead.
.. _support section: https://borgbackup.readthedocs.io/en/stable/support.html
Borg authors ("The Borg Collective")
------------------------------------
Borg Contributors ("The Borg Collective")
=========================================
- Thomas Waldmann <tw@waldmann-edv.de>
- Antoine Beaupré <anarcat@debian.org>
- Radek Podgorny <radek@podgorny.cz>
- Yuri D'Elia
- Michael Hanselmann <public@hansmi.ch>
- Teemu Toivanen <public@profnetti.fi>
- Marian Beermann <public@enkore.de>
- Martin Hostettler <textshell@uchuujin.de>
- Daniel Reichelt <hacking@nachtgeist.net>
- Lauri Niskanen <ape@ape3000.com>
- Abdel-Rahman A. (Abogical)
- Gu1nness <guinness@crans.org>
- Andrey Andreyevich Bienkowski <hexagon-recursion@posteo.net>
Retired
```````
- Antoine Beaupré <anarcat@debian.org>
Borg is a fork of Attic.


@ -1,10 +0,0 @@
brew 'pkgconf'
brew 'lz4'
brew 'openssl@3'
# osxfuse (aka macFUSE) is only required for "borg mount",
# but won't work on GitHub Actions' workers.
# It requires installing a kernel extension, so some users
# may want it and some won't.
#cask 'osxfuse'


@ -1,65 +0,0 @@
# Contributing to BorgBackup
First of all, thank you for considering contributing to Borg!
This guide provides a brief overview of how to contribute.
For the full, detailed development documentation, please refer to the
[Development Docs](https://borgbackup.readthedocs.io/en/master/development.html).
## How to Contribute
1. **Discuss Changes:** Before starting major work, please discuss your proposed changes on the [GitHub issue tracker](https://github.com/borgbackup/borg/issues). Smaller changes can also be discussed in the comments of the pull request.
2. **Branching Model:** Most Pull Requests should be made against the `master` branch. Maintenance branches (e.g., `1.4-maint`) are generally reserved for bug fixes and smaller changes.
3. **Pull Requests:**
- Create a feature branch for your changes.
- Keep changesets clean and focused on a single topic.
- Reference any related issues in your commit messages.
- Ensure your PR includes tests and documentation for new features.
- Proof read your PR yourself, fix typos and other obvious issues.
## Responsible AI Usage
You are welcome to use AI tools, but we require that a human is always "in the loop".
AI-generated content must not be submitted without active critical review, modification, and integration by the human contributor. We require that the final contribution is a product of human creative control and that AI is only used as a supportive tool to assist the human author.
As the contributor, you are responsible for the entire content of your pull request.
This includes:
- Verifying the correctness and security of any AI-generated code.
- Ensuring that new or modified code is covered by correct tests.
- Proofreading and refining any AI-generated documentation or comments.
- Being able to explain, debug, and maintain the code you submit.
Always be aware of the limitations and the ecological footprint of AI tools and act accordingly:
- Do not just believe what AI tells you, but verify it critically. AI is known to hallucinate, to be over-confident and to always tell you that you are right, even when you are not.
- Do not use AI tools for tasks that can be done more efficiently manually or by simpler tools.
- Learn how to use AI tools efficiently.
## Development Setup
Borg is written in Python with some Cython/C. To set up a development environment:
1. Create and activate a virtual environment.
2. Install development dependencies: `pip install -r requirements.d/development.lock.txt`
3. Install borg in editable mode: `pip install -e .`
4. Install pre-commit hooks: `pre-commit install`
## Code Style
We use [Black](https://black.readthedocs.io/) for automated code formatting.
- Install black: `pip install -r requirements.d/codestyle.txt`
- Check formatting: `black --check .`
- Apply formatting: `black .`
## Running Tests
We use `tox` and `pytest` for testing.
- Run all tests: `tox`
For more advanced testing options (including Vagrant and Podman), see the full [Development documentation](https://borgbackup.readthedocs.io/en/master/development.html).
## Security
If you discover a security vulnerability, please follow our [Security Policy](SECURITY.md) for reporting it.

LICENSE

@ -1,4 +1,4 @@
Copyright (C) 2015-2026 The Borg Collective (see AUTHORS file)
Copyright (C) 2015-2016 The Borg Collective (see AUTHORS file)
Copyright (C) 2010-2014 Jonas Borgström <jonas@borgstrom.se>
All rights reserved.
@ -16,14 +16,14 @@ are met:
products derived from this software without specific prior
written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN
IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


@ -1,8 +1,9 @@
# The files we need to include in the sdist are handled automatically by
# setuptools_scm - it includes all git-committed files.
# But we want to exclude some committed files/directories not needed in the sdist:
exclude .editorconfig .gitattributes .gitignore .mailmap Vagrantfile
prune .github
include src/borg/platform/darwin.c src/borg/platform/freebsd.c src/borg/platform/linux.c src/borg/platform/posix.c
include src/borg/platform/syncfilerange.c
include src/borg/platform/windows.c
include README.rst AUTHORS LICENSE CHANGES.rst MANIFEST.in
recursive-include borg *.pyx
recursive-include docs *
recursive-exclude docs *.pyc
recursive-exclude docs *.pyo
prune docs/_build
prune .travis
exclude .coveragerc .gitattributes .gitignore .travis.yml Vagrantfile
include borg/_version.py


@ -1,215 +1,182 @@
This is borg2!
--------------
Please note that this is the README for borg2 / master branch.
For the stable version's docs, please see here:
https://borgbackup.readthedocs.io/en/stable/
Borg2 is currently in beta testing and might get major and/or
breaking changes between beta releases (and there is no beta to
next-beta upgrade code, so you will have to delete and re-create repos).
Thus, **DO NOT USE BORG2 FOR YOUR PRODUCTION BACKUPS!** Please help with
testing it, but set it up *additionally* to your production backups.
TODO: the screencasts need a remake using borg2, see here:
https://github.com/borgbackup/borg/issues/6303
|screencast|
What is BorgBackup?
-------------------
===================
BorgBackup (short: Borg) is a deduplicating backup program.
Optionally, it supports compression and authenticated encryption.
The main goal of Borg is to provide an efficient and secure way to back up data.
The main goal of Borg is to provide an efficient and secure way to backup data.
The data deduplication technique used makes Borg suitable for daily backups
since only changes are stored.
The authenticated encryption technique makes it suitable for backups to targets not
fully trusted.
The authenticated encryption technique makes it suitable for backups to not
fully trusted targets.
See the `installation manual`_ or, if you have already
downloaded Borg, ``docs/installation.rst`` to get started with Borg.
There is also an `offline documentation`_ available, in multiple formats.
.. _installation manual: https://borgbackup.readthedocs.io/en/master/installation.html
.. _offline documentation: https://readthedocs.org/projects/borgbackup/downloads
.. _installation manual: https://borgbackup.readthedocs.org/en/stable/installation.html
Main features
~~~~~~~~~~~~~
-------------
**Space efficient storage**
Deduplication based on content-defined chunking is used to reduce the number
of bytes stored: each file is split into a number of variable length chunks
and only chunks that have never been seen before are added to the repository.
A chunk is considered duplicate if its id_hash value is identical.
A cryptographically strong hash or MAC function is used as id_hash, e.g.
(hmac-)sha256.
To deduplicate, all the chunks in the same repository are considered, no
matter whether they come from different machines, from previous backups,
from the same backup or even from the same single file.
Compared to other deduplication approaches, this method does NOT depend on:
* file/directory names staying the same: So you can move your stuff around
* file/directory names staying the same: So you can move your stuff around
without killing the deduplication, even between machines sharing a repo.
* complete files or time stamps staying the same: If a big file changes a
little, only a few new chunks need to be stored - this is great for VMs or
* complete files or time stamps staying the same: If a big file changes a
little, only a few new chunks need to be stored - this is great for VMs or
raw disks.
* The absolute position of a data chunk inside a file: Stuff may get shifted
* The absolute position of a data chunk inside a file: Stuff may get shifted
and will still be found by the deduplication algorithm.
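A minimal sketch of the chunk-level deduplication described above, for illustration only. It assumes a toy rolling byte sum as the boundary detector and plain SHA-256 as the id_hash; Borg's real chunker and key handling are different, and the names ``split_chunks``, ``store``, ``WINDOW``, ``MASK`` and ``MIN_SIZE`` are hypothetical.

```python
import hashlib

WINDOW = 16           # bytes covered by the rolling sum (illustrative value)
MASK = (1 << 6) - 1   # a cut candidate occurs roughly every 64 bytes (illustrative)
MIN_SIZE = 32         # never emit chunks shorter than this, except the last one


def split_chunks(data: bytes) -> list[bytes]:
    """Split data at positions chosen from the content, not from fixed offsets."""
    chunks = []
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= WINDOW:
            rolling -= data[i - WINDOW]  # drop the byte that left the window
        if i + 1 - start >= MIN_SIZE and (rolling & MASK) == MASK:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing remainder
    return chunks


def store(repo: dict[bytes, bytes], data: bytes) -> int:
    """Add only unseen chunks to repo; return the number of newly stored bytes."""
    new_bytes = 0
    for chunk in split_chunks(data):
        chunk_id = hashlib.sha256(chunk).digest()  # stand-in for the id_hash
        if chunk_id not in repo:
            repo[chunk_id] = chunk
            new_bytes += len(chunk)
    return new_bytes


# Deterministic pseudo-random sample data, then the same data shifted by 7 bytes.
DATA = b"".join(hashlib.sha256(i.to_bytes(4, "big")).digest() for i in range(1000))
repo: dict[bytes, bytes] = {}
first = store(repo, DATA)
second = store(repo, b"X" * 7 + DATA)
print(f"first backup stored {first} bytes, shifted copy stored {second} new bytes")
```

Because cut points depend on a window of content rather than on absolute offsets, the shifted copy realigns with the existing chunks after the first boundary, so only a small part of it has to be stored again.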
**Speed**
* performance-critical code (chunking, compression, encryption) is
* performance critical code (chunking, compression, encryption) is
implemented in C/Cython
* local caching
* local caching of files/chunks index data
* quick detection of unmodified files
**Data encryption**
All data can be protected client-side using 256-bit authenticated encryption
(AES-OCB or chacha20-poly1305), ensuring data confidentiality, integrity and
authenticity.
**Obfuscation**
Optionally, Borg can actively obfuscate, e.g., the size of files/chunks to
make fingerprinting attacks more difficult.
All data can be protected using 256-bit AES encryption, data integrity and
authenticity is verified using HMAC-SHA256. Data is encrypted clientside.
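A minimal sketch of how an HMAC-SHA256 tag can be used to check integrity and authenticity, using only the Python standard library. It is not Borg's on-disk format: encryption is omitted, and ``seal``, ``verify`` and the tag-then-payload layout are assumptions made for the example.

```python
import hashlib
import hmac
import os

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes


def seal(key: bytes, payload: bytes) -> bytes:
    """Prefix the payload with its HMAC-SHA256 tag (hypothetical layout)."""
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def verify(key: bytes, blob: bytes) -> bytes:
    """Return the payload if the tag matches, otherwise raise ValueError."""
    tag, payload = blob[:TAG_LEN], blob[TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("integrity check failed")
    return payload


key = os.urandom(32)
blob = seal(key, b"chunk data")
print(verify(key, blob))
```

Any modification of the stored bytes, or a check with a different key, makes ``verify`` raise instead of returning data.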
**Compression**
All data can be optionally compressed:
* lz4 (super fast, low compression)
* zstd (wide range from high speed and low compression to high compression
and lower speed)
* zlib (medium speed and compression)
* lzma (low speed, high compression)
All data can be compressed by lz4 (super fast, low compression), zlib
(medium speed and compression) or lzma (low speed, high compression).
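A small sketch for comparing two of the listed algorithms with Python's standard library. It assumes level 6 for both and a made-up, highly repetitive sample; lz4 and zstd are left out because they are not in the standard library, and ``compressed_sizes`` is a hypothetical helper, not part of Borg.

```python
import lzma
import zlib

SAMPLE = b"backup data tends to contain repeated structure. " * 200


def compressed_sizes(data: bytes) -> dict[str, int]:
    """Compress data with zlib and lzma and return the resulting sizes in bytes."""
    codecs = {
        "zlib": (lambda d: zlib.compress(d, 6), zlib.decompress),
        "lzma": (lambda d: lzma.compress(d, preset=6), lzma.decompress),
    }
    sizes = {}
    for name, (compress, decompress) in codecs.items():
        packed = compress(data)
        if decompress(packed) != data:  # round-trip check
            raise RuntimeError(f"{name} round trip failed")
        sizes[name] = len(packed)
    return sizes


for name, size in compressed_sizes(SAMPLE).items():
    print(f"{name}: {len(SAMPLE)} -> {size} bytes")
```

Results depend heavily on the input, so it is worth measuring with data that resembles the files being backed up.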
**Off-site backups**
Borg can store data on any remote host accessible over SSH. If Borg is
installed on the remote host, significant performance gains can be achieved
compared to using a network file system (sshfs, NFS, ...).
Borg can store data on any remote host accessible over SSH. If Borg is
installed on the remote host, big performance gains can be achieved
compared to using a network filesystem (sshfs, nfs, ...).
**Backups mountable as file systems**
Backup archives are mountable as user-space file systems for easy interactive
backup examination and restores (e.g., by using a regular file manager).
**Backups mountable as filesystems**
Backup archives are mountable as userspace filesystems for easy interactive
backup examination and restores (e.g. by using a regular file manager).
**Easy installation on multiple platforms**
We offer single-file binaries that do not require installing anything -
you can just run them on these platforms:
* Linux
* macOS
* Mac OS X
* FreeBSD
* OpenBSD and NetBSD (no xattrs/ACLs support or binaries yet)
* Cygwin (experimental, no binaries yet)
* Windows Subsystem for Linux (WSL) on Windows 10/11 (experimental)
* Cygwin (not supported, no binaries yet)
**Free and Open Source Software**
* security and functionality can be audited independently
* licensed under the BSD (3-clause) license, see `License`_ for the
complete license
* licensed under the BSD (3-clause) license
Easy to use
~~~~~~~~~~~
-----------
For ease of use, set the BORG_REPO environment variable::
Initialize a new backup repository and create a backup archive::
$ export BORG_REPO=/path/to/repo
$ borg init /path/to/repo
$ borg create /path/to/repo::Saturday1 ~/Documents
Create a new backup repository (see ``borg repo-create --help`` for encryption options)::
Now doing another backup, just to show off the great deduplication::
$ borg repo-create -e repokey-aes-ocb
$ borg create -v --stats /path/to/repo::Saturday2 ~/Documents
-----------------------------------------------------------------------------
Archive name: Saturday2
Archive fingerprint: 622b7c53c...
Time (start): Sat, 2016-02-27 14:48:13
Time (end): Sat, 2016-02-27 14:48:14
Duration: 0.88 seconds
Number of files: 163
-----------------------------------------------------------------------------
Original size Compressed size Deduplicated size
This archive: 6.85 MB 6.85 MB 30.79 kB <-- !
All archives: 13.69 MB 13.71 MB 6.88 MB
Create a new backup archive::
$ borg create Monday1 ~/Documents
Now do another backup, just to show off the great deduplication::
$ borg create -v --stats Monday2 ~/Documents
Repository: /path/to/repo
Archive name: Monday2
Archive fingerprint: 7714aef97c1a24539cc3dc73f79b060f14af04e2541da33d54c7ee8e81a00089
Time (start): Mon, 2022-10-03 19:57:35 +0200
Time (end): Mon, 2022-10-03 19:57:35 +0200
Duration: 0.01 seconds
Number of files: 24
Original size: 29.73 MB
Deduplicated size: 520 B
Unique chunks Total chunks
Chunk index: 167 330
-----------------------------------------------------------------------------
Helping, donations and bounties, becoming a Patron
--------------------------------------------------
Your help is always welcome!
Spread the word, give feedback, help with documentation, testing or development.
You can also give monetary support to the project, see here for details:
https://www.borgbackup.org/support/fund.html
For a graphical frontend refer to our complementary project `BorgWeb <https://borgweb.readthedocs.io/>`_.
Links
=====
* `Main Web Site <https://borgbackup.readthedocs.org/>`_
* `Releases <https://github.com/borgbackup/borg/releases>`_
* `PyPI packages <https://pypi.python.org/pypi/borgbackup>`_
* `ChangeLog <https://github.com/borgbackup/borg/blob/master/docs/changes.rst>`_
* `GitHub <https://github.com/borgbackup/borg>`_
* `Issue Tracker <https://github.com/borgbackup/borg/issues>`_
* `Bounties & Fundraisers <https://www.bountysource.com/teams/borgbackup>`_
* `Mailing List <https://mail.python.org/mailman/listinfo/borgbackup>`_
* `License <https://borgbackup.readthedocs.org/en/stable/authors.html#license>`_
Notes
-----
* `Main website <https://borgbackup.readthedocs.io/>`_
* `Releases <https://github.com/borgbackup/borg/releases>`_,
`PyPI packages <https://pypi.org/project/borgbackup/>`_ and
`Changelog <https://github.com/borgbackup/borg/blob/master/docs/changes.rst>`_
* `Offline documentation <https://readthedocs.org/projects/borgbackup/downloads>`_
* `GitHub <https://github.com/borgbackup/borg>`_ and
`Issue tracker <https://github.com/borgbackup/borg/issues>`_.
* `Web chat (IRC) <https://web.libera.chat/#borgbackup>`_ and
`Mailing list <https://mail.python.org/mailman/listinfo/borgbackup>`_
* `License <https://borgbackup.readthedocs.io/en/master/authors.html#license>`_
* `Security contact <https://borgbackup.readthedocs.io/en/master/support.html#security-contact>`_
Borg is a fork of `Attic`_ and maintained by "`The Borg collective`_".
Compatibility notes
-------------------
.. _Attic: https://github.com/jborg/attic
.. _The Borg collective: https://borgbackup.readthedocs.org/en/latest/authors.html
Differences between Attic and Borg
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Here's a (incomplete) list of some major changes:
* more open, faster paced development (see `issue #1 <https://github.com/borgbackup/borg/issues/1>`_)
* lots of attic issues fixed (see `issue #5 <https://github.com/borgbackup/borg/issues/5>`_)
* less chunk management overhead (less memory and disk usage for chunks index)
* faster remote cache resync (useful when backing up multiple machines into same repo)
* compression: no, lz4, zlib or lzma compression, adjustable compression levels
* repokey replaces problematic passphrase mode (you can't change the passphrase nor the pbkdf2 iteration count in "passphrase" mode)
* simple sparse file support, great for virtual machine disk files
* can read special files (e.g. block devices) or from stdin, write to stdout
* mkdir-based locking is more compatible than attic's posix locking
* uses fadvise to not spoil / blow up the fs cache
* better error messages / exception handling
* better logging, screen output, progress indication
* tested on misc. Linux systems, 32 and 64bit, FreeBSD, OpenBSD, NetBSD, Mac OS X
Please read the `ChangeLog`_ (or ``docs/changes.rst`` in the source distribution) for more
information.
BORG IS NOT COMPATIBLE WITH ORIGINAL ATTIC (but there is a one-way conversion).
EXPECT THAT WE WILL BREAK COMPATIBILITY REPEATEDLY WHEN MAJOR RELEASE NUMBER
CHANGES (like when going from 0.x.y to 1.0.0 or from 1.x.y to 2.0.0).
NOT RELEASED DEVELOPMENT VERSIONS HAVE UNKNOWN COMPATIBILITY PROPERTIES.
THIS IS SOFTWARE IN DEVELOPMENT, DECIDE FOR YOURSELF WHETHER IT FITS YOUR NEEDS.
THIS IS SOFTWARE IN DEVELOPMENT, DECIDE YOURSELF WHETHER IT FITS YOUR NEEDS.
Security issues should be reported to the `Security contact`_ (or
see ``docs/support.rst`` in the source distribution).
Borg is distributed under a 3-clause BSD license, see `License`_ for the complete license.
.. start-badges
|doc| |build| |coverage|
|doc| |build| |coverage| |bestpractices|
.. |doc| image:: https://readthedocs.org/projects/borgbackup/badge/?version=master
.. |doc| image:: https://readthedocs.org/projects/borgbackup/badge/?version=stable
:alt: Documentation
:target: https://borgbackup.readthedocs.io/en/master/
:target: https://borgbackup.readthedocs.org/en/stable/
.. |build| image:: https://github.com/borgbackup/borg/workflows/CI/badge.svg?branch=master
:alt: Build Status (master)
:target: https://github.com/borgbackup/borg/actions
.. |build| image:: https://api.travis-ci.org/borgbackup/borg.svg
:alt: Build Status
:target: https://travis-ci.org/borgbackup/borg
.. |coverage| image:: https://codecov.io/github/borgbackup/borg/coverage.svg?branch=master
:alt: Test Coverage
:target: https://codecov.io/github/borgbackup/borg?branch=master
.. |screencast_basic| image:: https://asciinema.org/a/133292.png
:alt: BorgBackup Basic Usage
:target: https://asciinema.org/a/133292?autoplay=1&speed=1
:width: 100%
.. _installation: https://asciinema.org/a/133291?autoplay=1&speed=1
.. _advanced usage: https://asciinema.org/a/133293?autoplay=1&speed=1
.. |bestpractices| image:: https://bestpractices.coreinfrastructure.org/projects/271/badge
:alt: Best Practices Score
:target: https://bestpractices.coreinfrastructure.org/projects/271
.. end-badges
.. |screencast| image:: https://asciinema.org/a/28691.png
:alt: BorgBackup Installation and Basic Usage
:target: https://asciinema.org/a/28691?autoplay=1&speed=2


@ -1,19 +0,0 @@
# Security Policy
## Supported Versions
These Borg releases are currently supported with security updates.
| Version | Supported |
|---------|--------------------|
| 2.0.x | :x: (beta) |
| 1.4.x | :white_check_mark: |
| 1.2.x | :x: (no new releases, critical fixes may still be backported) |
| 1.1.x | :x: |
| < 1.1 | :x: |
## Reporting a Vulnerability
See here:
https://borgbackup.readthedocs.io/en/latest/support.html#security-contact

Vagrantfile vendored

@ -1,164 +1,172 @@
# -*- mode: ruby -*-
# vi: set ft=ruby :
# Automated creation of testing environments/binaries on miscellaneous platforms
# Automated creation of testing environments / binaries on misc. platforms
$cpus = Integer(ENV.fetch('VMCPUS', '8')) # create VMs with that many cpus
$xdistn = Integer(ENV.fetch('XDISTN', '8')) # dispatch tests to that many pytest workers
$wmem = $xdistn * 256 # give the VM additional memory for workers [MB]
def packages_debianoid(user)
def packages_prepare_wheezy
return <<-EOF
export DEBIAN_FRONTEND=noninteractive
# this is to avoid grub asking about which device it should install to:
echo "set grub-pc/install_devices /dev/sda" | debconf-communicate
apt-get -y -qq update
apt-get -y -qq dist-upgrade
# debian 7 wheezy does not have lz4, but it is available from wheezy-backports:
echo "deb http://http.debian.net/debian wheezy-backports main" > /etc/apt/sources.list.d/wheezy-backports.list
EOF
end
def packages_debianoid
return <<-EOF
apt-get update
# install all the (security and other) updates
apt-get dist-upgrade -y
# for building borgbackup and dependencies:
apt install -y pkg-config
apt install -y libssl-dev libacl1-dev liblz4-dev || true
apt install -y libfuse-dev fuse || true
apt install -y libfuse3-dev fuse3 || true
apt install -y locales || true
sed -i '/en_US.UTF-8/s/^# //g' /etc/locale.gen && locale-gen
usermod -a -G fuse #{user}
chgrp fuse /dev/fuse
chmod 666 /dev/fuse
apt install -y fakeroot build-essential git curl
apt install -y python3-dev python3-setuptools virtualenv
apt-get install -y libssl-dev libacl1-dev liblz4-dev libfuse-dev fuse pkg-config
usermod -a -G fuse vagrant
apt-get install -y fakeroot build-essential git
apt-get install -y python3-dev python3-setuptools
# for building python:
apt install -y zlib1g-dev libbz2-dev libncurses5-dev libreadline-dev liblzma-dev libsqlite3-dev libffi-dev
apt-get install -y zlib1g-dev libbz2-dev libncurses5-dev libreadline-dev liblzma-dev libsqlite3-dev
# this way it works on older dists (like ubuntu 12.04) also:
# for python 3.2 on ubuntu 12.04 we need pip<8 and virtualenv<14 as
# newer versions are not compatible with py 3.2 any more.
easy_install3 'pip<8.0'
pip3 install 'virtualenv<14.0'
touch ~vagrant/.bash_profile ; chown vagrant ~vagrant/.bash_profile
EOF
end
def packages_redhatted
return <<-EOF
yum install -y epel-release
yum update -y
# for building borgbackup and dependencies:
yum install -y openssl-devel openssl libacl-devel libacl lz4-devel fuse-devel fuse pkgconfig
usermod -a -G fuse vagrant
yum install -y fakeroot gcc git patch
# needed to compile msgpack-python (otherwise it will use slow fallback code):
yum install -y gcc-c++
# for building python:
yum install -y zlib-devel bzip2-devel ncurses-devel readline-devel xz-devel sqlite-devel
#yum install -y python-pip
#pip install virtualenv
touch ~vagrant/.bash_profile ; chown vagrant ~vagrant/.bash_profile
EOF
end
def packages_darwin
return <<-EOF
# install all the (security and other) updates
sudo softwareupdate --install --all
# get osxfuse 3.0.x pre-release code from github:
curl -s -L https://github.com/osxfuse/osxfuse/releases/download/osxfuse-3.2.0/osxfuse-3.2.0.dmg >osxfuse.dmg
MOUNTDIR=$(echo `hdiutil mount osxfuse.dmg | tail -1 | awk '{$1="" ; print $0}'` | xargs -0 echo) \
&& sudo installer -pkg "${MOUNTDIR}/Extras/FUSE for OS X 3.2.0.pkg" -target /
sudo chown -R vagrant /usr/local # brew must be able to create stuff here
ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
brew update
brew install openssl
brew install lz4
brew install xz # required for python lzma module
brew install fakeroot
brew install git
brew install pkgconfig
touch ~vagrant/.bash_profile ; chown vagrant ~vagrant/.bash_profile
EOF
end
def packages_freebsd
return <<-EOF
# in case the VM has no hostname set
hostname freebsd
# install all the (security and other) updates, base system
freebsd-update --not-running-from-cron fetch install
# for building borgbackup and dependencies:
pkg install -y liblz4 pkgconf
pkg install -y fusefs-libs || true
pkg install -y fusefs-libs3 || true
pkg install -y rust
pkg install -y git bash # fakeroot causes lots of troubles on freebsd
pkg install -y python311 py311-sqlite3 py311-pip py311-virtualenv
# make sure there is a python3/pip3/virtualenv command
ln -sf /usr/local/bin/python3.11 /usr/local/bin/python3
ln -sf /usr/local/bin/pip-3.11 /usr/local/bin/pip3
ln -sf /usr/local/bin/virtualenv-3.11 /usr/local/bin/virtualenv
pkg install -y openssl liblz4 fusefs-libs pkgconf
pkg install -y fakeroot git bash
# for building python:
pkg install -y sqlite3
# make bash default / work:
chsh -s bash vagrant
mount -t fdescfs fdesc /dev/fd
echo 'fdesc /dev/fd fdescfs rw 0 0' >> /etc/fstab
echo 'fdesc /dev/fd fdescfs rw 0 0' >> /etc/fstab
# make FUSE work
echo 'fuse_load="YES"' >> /boot/loader.conf
echo 'vfs.usermount=1' >> /etc/sysctl.conf
kldload fusefs
kldload fuse
sysctl vfs.usermount=1
pw groupmod operator -M vagrant
# /dev/fuse has group operator
chmod 666 /dev/fuse
touch ~vagrant/.bash_profile ; chown vagrant ~vagrant/.bash_profile
# install all the (security and other) updates, packages
pkg update
yes | pkg upgrade
echo 'export BORG_OPENSSL_PREFIX=/usr' >> ~vagrant/.bash_profile
# (re)mount / with acls
mount -o acls /
EOF
end
def packages_openbsd
return <<-EOF
hostname "openbsd77.localdomain"
echo "$(hostname)" > /etc/myname
echo "127.0.0.1 localhost" > /etc/hosts
echo "::1 localhost" >> /etc/hosts
echo "127.0.0.1 $(hostname) $(hostname -s)" >> /etc/hosts
echo "https://ftp.eu.openbsd.org/pub/OpenBSD" > /etc/installurl
ftp https://cdn.openbsd.org/pub/OpenBSD/$(uname -r)/$(uname -m)/comp$(uname -r | tr -d .).tgz
tar -C / -xzphf comp$(uname -r | tr -d .).tgz
rm comp$(uname -r | tr -d .).tgz
. ~/.profile
mkdir -p /home/vagrant/borg
rsync -aH /vagrant/borg/ /home/vagrant/borg/
rm -rf /vagrant/borg
ln -sf /home/vagrant/borg /vagrant/
pkg_add bash
chsh -s bash vagrant
chsh -s /usr/local/bin/bash vagrant
pkg_add openssl
pkg_add lz4
# pkg_add fuse # does not install, sdl dependency missing
pkg_add git # no fakeroot
pkg_add rust
pkg_add openssl%3.4
pkg_add py3-pip
pkg_add py3-virtualenv
echo 'export BORG_OPENSSL_NAME=eopenssl30' >> ~vagrant/.bash_profile
pkg_add python-3.4.2
pkg_add py3-setuptools
ln -sf /usr/local/bin/python3.4 /usr/local/bin/python3
ln -sf /usr/local/bin/python3.4 /usr/local/bin/python
easy_install-3.4 pip
pip3 install virtualenv
touch ~vagrant/.bash_profile ; chown vagrant ~vagrant/.bash_profile
EOF
end
def packages_netbsd
return <<-EOF
echo 'https://ftp.NetBSD.org/pub/pkgsrc/packages/NetBSD/$arch/9.3/All' > /usr/pkg/etc/pkgin/repositories.conf
pkgin update
pkgin -y upgrade
pkg_add lz4 git
pkg_add rust
pkg_add bash
hostname netbsd # the box we use has an invalid hostname
PKG_PATH="ftp://ftp.NetBSD.org/pub/pkgsrc/packages/NetBSD/amd64/6.1.5/All/"
export PKG_PATH
pkg_add mozilla-rootcerts lz4 git bash
chsh -s bash vagrant
echo "export PROMPT_COMMAND=" >> ~vagrant/.bash_profile # bug in netbsd 9.3, .bash_profile broken for screen
echo "export PROMPT_COMMAND=" >> ~root/.bash_profile # bug in netbsd 9.3, .bash_profile broken for screen
pkg_add pkg-config
mkdir -p /usr/local/opt/lz4/include
mkdir -p /usr/local/opt/lz4/lib
ln -s /usr/pkg/include/lz4*.h /usr/local/opt/lz4/include/
ln -s /usr/pkg/lib/liblz4* /usr/local/opt/lz4/lib/
touch /etc/openssl/openssl.cnf # avoids a flood of "can't open ..."
mozilla-rootcerts install
pkg_add pkg-config # avoids some "pkg-config missing" error msg, even without fuse
# pkg_add fuse # llfuse supports netbsd, but is still buggy.
# https://bitbucket.org/nikratio/python-llfuse/issues/70/perfuse_open-setsockopt-no-buffer-space
pkg_add py311-sqlite3 py311-pip py311-virtualenv py311-expat
ln -s /usr/pkg/bin/python3.11 /usr/pkg/bin/python
ln -s /usr/pkg/bin/python3.11 /usr/pkg/bin/python3
ln -s /usr/pkg/bin/pip3.11 /usr/pkg/bin/pip
ln -s /usr/pkg/bin/pip3.11 /usr/pkg/bin/pip3
ln -s /usr/pkg/bin/virtualenv-3.11 /usr/pkg/bin/virtualenv
ln -s /usr/pkg/bin/virtualenv-3.11 /usr/pkg/bin/virtualenv3
EOF
end
def package_update_openindiana
return <<-EOF
echo "nameserver 1.1.1.1" > /etc/resolv.conf
# needs separate provisioning step + reboot to become effective:
pkg update
EOF
end
def packages_openindiana
return <<-EOF
pkg install gcc-13 git
pkg install pkg-config
pkg install python-313
ln -sf /usr/bin/python3.13 /usr/bin/python3
ln -sf /usr/bin/python3.13-config /usr/bin/python3-config
python3 -m ensurepip
ln -sf /usr/bin/pip3.13 /usr/bin/pip3
pip3 install virtualenv
# let borg's pkg-config find openssl:
pfexec pkg set-mediator -V 3 openssl
pkg_add python34 py34-setuptools
ln -s /usr/pkg/bin/python3.4 /usr/pkg/bin/python
ln -s /usr/pkg/bin/python3.4 /usr/pkg/bin/python3
easy_install-3.4 pip
pip install virtualenv
touch ~vagrant/.bash_profile ; chown vagrant ~vagrant/.bash_profile
EOF
end
def install_pyenv(boxname)
return <<-EOF
echo 'export PYTHON_CONFIGURE_OPTS="${PYTHON_CONFIGURE_OPTS} --enable-shared"' >> ~/.bash_profile
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bash_profile
echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bash_profile
. ~/.bash_profile
curl -s -L https://raw.githubusercontent.com/yyuu/pyenv-installer/master/bin/pyenv-installer | bash
echo 'eval "$(pyenv init --path)"' >> ~/.bash_profile
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
echo 'eval "$(pyenv init -)"' >> ~/.bashrc
echo 'eval "$(pyenv virtualenv-init -)"' >> ~/.bashrc
echo 'export PATH="$HOME/.pyenv/bin:/vagrant/borg:$PATH"' >> ~/.bash_profile
echo 'eval "$(pyenv init -)"' >> ~/.bash_profile
echo 'eval "$(pyenv virtualenv-init -)"' >> ~/.bash_profile
echo 'export PYTHON_CONFIGURE_OPTS="--enable-shared"' >> ~/.bash_profile
echo 'export LANG=en_US.UTF-8' >> ~/.bash_profile
EOF
end
def fix_pyenv_darwin(boxname)
return <<-EOF
echo 'export PYTHON_CONFIGURE_OPTS="--enable-framework"' >> ~/.bash_profile
EOF
end
def install_pythons(boxname)
return <<-EOF
. ~/.bash_profile
echo "PYTHON_CONFIGURE_OPTS: ${PYTHON_CONFIGURE_OPTS}"
pyenv install 3.13.8
pyenv install 3.4.0 # tests
pyenv install 3.5.0 # tests
pyenv install 3.5.1 # binary build, use latest 3.5.x release
pyenv rehash
EOF
end
@ -175,35 +183,55 @@ def build_pyenv_venv(boxname)
return <<-EOF
. ~/.bash_profile
cd /vagrant/borg
# use the latest 3.13 release
pyenv global 3.13.8
pyenv virtualenv 3.13.8 borg-env
# use the latest 3.5 release
pyenv global 3.5.1
pyenv virtualenv 3.5.1 borg-env
ln -s ~/.pyenv/versions/borg-env .
EOF
end
def install_borg(fuse)
def install_borg(boxname)
return <<-EOF
. ~/.bash_profile
cd /vagrant/borg
. borg-env/bin/activate
pip install -U wheel # upgrade wheel, might be too old
pip install -U wheel # upgrade wheel, too old for 3.5
cd borg
pip install -r requirements.d/development.lock.txt
python3 scripts/make.py clean
# install borgstore WITH all options, so it pulls in the needed
# requirements, so they will also get into the binaries built. #8574
pip install borgstore[sftp,s3]
pip install -e .[#{fuse}]
# clean up (wrong/outdated) stuff we likely got via rsync:
rm -f borg/*.so borg/*.cpy*
rm -f borg/{chunker,crypto,compress,hashindex,platform_linux}.c
rm -rf borg/__pycache__ borg/support/__pycache__ borg/testsuite/__pycache__
pip install -r requirements.d/development.txt
# by using [fuse], setup.py can handle different fuse requirements:
pip install -e .[fuse]
EOF
end
def install_pyinstaller()
def install_pyinstaller(boxname)
return <<-EOF
. ~/.bash_profile
cd /vagrant/borg
. borg-env/bin/activate
pip install -r requirements.d/pyinstaller.txt
git clone https://github.com/pyinstaller/pyinstaller.git
cd pyinstaller
git checkout v3.1.1
pip install -e .
EOF
end
def install_pyinstaller_bootloader(boxname)
return <<-EOF
. ~/.bash_profile
cd /vagrant/borg
. borg-env/bin/activate
git clone https://github.com/pyinstaller/pyinstaller.git
cd pyinstaller
git checkout v3.1.1
# build bootloader, if it is not included
cd bootloader
python ./waf all
cd ..
pip install -e .
EOF
end
@ -213,26 +241,21 @@ def build_binary_with_pyinstaller(boxname)
cd /vagrant/borg
. borg-env/bin/activate
cd borg
pyinstaller --clean --distpath=/vagrant/borg scripts/borg.exe.spec
echo 'export PATH="/vagrant/borg:$PATH"' >> ~/.bash_profile
cd .. && tar -czvf borg.tgz borg-dir
pyinstaller -F -n borg.exe --distpath=/vagrant/borg --clean borg/__main__.py
EOF
end
def run_tests(boxname, skip_env)
def run_tests(boxname)
return <<-EOF
. ~/.bash_profile
cd /vagrant/borg/borg
. ../borg-env/bin/activate
if which pyenv 2> /dev/null; then
if which pyenv > /dev/null; then
# for testing, use the earliest point releases of the supported python versions:
pyenv global 3.13.8
pyenv local 3.13.8
pyenv global 3.4.0 3.5.0
fi
# otherwise: just use the system python
# some OSes can only run specific test envs, e.g. because they miss FUSE support:
export TOX_SKIP_ENV='#{skip_env}'
if which fakeroot 2> /dev/null; then
if which fakeroot > /dev/null; then
echo "Running tox WITH fakeroot -u"
fakeroot -u tox --skip-missing-interpreters
else
@ -242,190 +265,165 @@ def run_tests(boxname, skip_env)
EOF
end
def fs_init(user)
def fix_perms
return <<-EOF
# clean up (wrong/outdated) stuff we likely got via rsync:
rm -rf /vagrant/borg/borg/.tox 2> /dev/null
rm -rf /vagrant/borg/borg/borgbackup.egg-info 2> /dev/null
rm -rf /vagrant/borg/borg/__pycache__ 2> /dev/null
find /vagrant/borg/borg/src -name '__pycache__' -exec rm -rf {} \\; 2> /dev/null
chown -R #{user} /vagrant/borg
touch ~#{user}/.bash_profile ; chown #{user} ~#{user}/.bash_profile
echo 'export LANG=en_US.UTF-8' >> ~#{user}/.bash_profile
echo 'export LC_CTYPE=en_US.UTF-8' >> ~#{user}/.bash_profile
echo 'export XDISTN=#{$xdistn}' >> ~#{user}/.bash_profile
# . ~/.profile
chown -R vagrant /vagrant/borg
EOF
end
Vagrant.configure(2) do |config|
# use rsync to copy content to the folder
config.vm.synced_folder ".", "/vagrant/borg/borg", :type => "rsync", :rsync__args => ["--verbose", "--archive", "--delete", "--exclude", ".python-version"], :rsync__chown => false
config.vm.synced_folder ".", "/vagrant/borg/borg", :type => "rsync", :rsync__args => ["--verbose", "--archive", "--delete", "-z"]
# do not let the VM access . on the host machine via the default shared folder!
config.vm.synced_folder ".", "/vagrant", disabled: true
# fix permissions on synced folder
config.vm.provision "fix perms", :type => :shell, :inline => fix_perms
config.vm.provider :virtualbox do |v|
#v.gui = true
v.cpus = $cpus
v.cpus = 1
end
config.vm.define "noble" do |b|
b.vm.box = "bento/ubuntu-24.04"
# Linux
config.vm.define "centos7_64" do |b|
b.vm.box = "centos/7"
b.vm.provider :virtualbox do |v|
v.memory = 1024 + $wmem
v.memory = 768
end
b.vm.provision "fs init", :type => :shell, :inline => fs_init("vagrant")
b.vm.provision "packages debianoid", :type => :shell, :inline => packages_debianoid("vagrant")
b.vm.provision "build env", :type => :shell, :privileged => false, :inline => build_sys_venv("noble")
b.vm.provision "install borg", :type => :shell, :privileged => false, :inline => install_borg("llfuse")
b.vm.provision "run tests", :type => :shell, :privileged => false, :inline => run_tests("noble", ".*none.*")
b.vm.provision "install system packages", :type => :shell, :inline => packages_redhatted
b.vm.provision "install pyenv", :type => :shell, :privileged => false, :inline => install_pyenv("centos7_64")
b.vm.provision "install pythons", :type => :shell, :privileged => false, :inline => install_pythons("centos7_64")
b.vm.provision "build env", :type => :shell, :privileged => false, :inline => build_pyenv_venv("centos7_64")
b.vm.provision "install borg", :type => :shell, :privileged => false, :inline => install_borg("centos7_64")
b.vm.provision "run tests", :type => :shell, :privileged => false, :inline => run_tests("centos7_64")
end
config.vm.define "jammy" do |b|
b.vm.box = "ubuntu/jammy64"
b.vm.provider :virtualbox do |v|
v.memory = 1024 + $wmem
end
b.vm.provision "fs init", :type => :shell, :inline => fs_init("vagrant")
b.vm.provision "packages debianoid", :type => :shell, :inline => packages_debianoid("vagrant")
b.vm.provision "build env", :type => :shell, :privileged => false, :inline => build_sys_venv("jammy")
b.vm.provision "install borg", :type => :shell, :privileged => false, :inline => install_borg("llfuse")
b.vm.provision "run tests", :type => :shell, :privileged => false, :inline => run_tests("jammy", ".*none.*")
config.vm.define "centos6_32" do |b|
b.vm.box = "centos6-32"
b.vm.provision "install system packages", :type => :shell, :inline => packages_redhatted
b.vm.provision "install pyenv", :type => :shell, :privileged => false, :inline => install_pyenv("centos6_32")
b.vm.provision "install pythons", :type => :shell, :privileged => false, :inline => install_pythons("centos6_32")
b.vm.provision "build env", :type => :shell, :privileged => false, :inline => build_pyenv_venv("centos6_32")
b.vm.provision "install borg", :type => :shell, :privileged => false, :inline => install_borg("centos6_32")
b.vm.provision "install pyinstaller", :type => :shell, :privileged => false, :inline => install_pyinstaller("centos6_32")
b.vm.provision "build binary with pyinstaller", :type => :shell, :privileged => false, :inline => build_binary_with_pyinstaller("centos6_32")
b.vm.provision "run tests", :type => :shell, :privileged => false, :inline => run_tests("centos6_32")
end
config.vm.define "trixie" do |b|
b.vm.box = "debian/testing64"
config.vm.define "centos6_64" do |b|
b.vm.box = "centos6-64"
b.vm.provider :virtualbox do |v|
v.memory = 1024 + $wmem
v.memory = 768
end
b.vm.provision "fs init", :type => :shell, :inline => fs_init("vagrant")
b.vm.provision "packages debianoid", :type => :shell, :inline => packages_debianoid("vagrant")
b.vm.provision "install pyenv", :type => :shell, :privileged => false, :inline => install_pyenv("trixie")
b.vm.provision "install pythons", :type => :shell, :privileged => false, :inline => install_pythons("trixie")
b.vm.provision "build env", :type => :shell, :privileged => false, :inline => build_pyenv_venv("trixie")
b.vm.provision "install borg", :type => :shell, :privileged => false, :inline => install_borg("llfuse")
b.vm.provision "install pyinstaller", :type => :shell, :privileged => false, :inline => install_pyinstaller()
b.vm.provision "build binary with pyinstaller", :type => :shell, :privileged => false, :inline => build_binary_with_pyinstaller("trixie")
b.vm.provision "run tests", :type => :shell, :privileged => false, :inline => run_tests("trixie", ".*none.*")
b.vm.provision "install system packages", :type => :shell, :inline => packages_redhatted
b.vm.provision "install pyenv", :type => :shell, :privileged => false, :inline => install_pyenv("centos6_64")
b.vm.provision "install pythons", :type => :shell, :privileged => false, :inline => install_pythons("centos6_64")
b.vm.provision "build env", :type => :shell, :privileged => false, :inline => build_pyenv_venv("centos6_64")
b.vm.provision "install borg", :type => :shell, :privileged => false, :inline => install_borg("centos6_64")
b.vm.provision "install pyinstaller", :type => :shell, :privileged => false, :inline => install_pyinstaller("centos6_64")
b.vm.provision "build binary with pyinstaller", :type => :shell, :privileged => false, :inline => build_binary_with_pyinstaller("centos6_64")
b.vm.provision "run tests", :type => :shell, :privileged => false, :inline => run_tests("centos6_64")
end
config.vm.define "bookworm32" do |b|
b.vm.box = "generic-x32/debian12"
config.vm.define "trusty64" do |b|
b.vm.box = "ubuntu/trusty64"
b.vm.provider :virtualbox do |v|
v.memory = 1024 + $wmem
v.memory = 768
end
b.vm.provision "fs init", :type => :shell, :inline => fs_init("vagrant")
b.vm.provision "packages debianoid", :type => :shell, :inline => packages_debianoid("vagrant")
b.vm.provision "install pyenv", :type => :shell, :privileged => false, :inline => install_pyenv("bookworm32")
b.vm.provision "install pythons", :type => :shell, :privileged => false, :inline => install_pythons("bookworm32")
b.vm.provision "build env", :type => :shell, :privileged => false, :inline => build_pyenv_venv("bookworm32")
b.vm.provision "install borg", :type => :shell, :privileged => false, :inline => install_borg("llfuse")
b.vm.provision "install pyinstaller", :type => :shell, :privileged => false, :inline => install_pyinstaller()
b.vm.provision "build binary with pyinstaller", :type => :shell, :privileged => false, :inline => build_binary_with_pyinstaller("bookworm32")
b.vm.provision "run tests", :type => :shell, :privileged => false, :inline => run_tests("bookworm32", ".*none.*")
b.vm.provision "packages debianoid", :type => :shell, :inline => packages_debianoid
b.vm.provision "build env", :type => :shell, :privileged => false, :inline => build_sys_venv("trusty64")
b.vm.provision "install borg", :type => :shell, :privileged => false, :inline => install_borg("trusty64")
b.vm.provision "run tests", :type => :shell, :privileged => false, :inline => run_tests("trusty64")
end
config.vm.define "bookworm" do |b|
b.vm.box = "debian/bookworm64"
config.vm.define "jessie64" do |b|
b.vm.box = "debian/jessie64"
b.vm.provider :virtualbox do |v|
v.memory = 1024 + $wmem
v.memory = 768
end
b.vm.provision "fs init", :type => :shell, :inline => fs_init("vagrant")
b.vm.provision "packages debianoid", :type => :shell, :inline => packages_debianoid("vagrant")
b.vm.provision "install pyenv", :type => :shell, :privileged => false, :inline => install_pyenv("bookworm")
b.vm.provision "install pythons", :type => :shell, :privileged => false, :inline => install_pythons("bookworm")
b.vm.provision "build env", :type => :shell, :privileged => false, :inline => build_pyenv_venv("bookworm")
b.vm.provision "install borg", :type => :shell, :privileged => false, :inline => install_borg("llfuse")
b.vm.provision "install pyinstaller", :type => :shell, :privileged => false, :inline => install_pyinstaller()
b.vm.provision "build binary with pyinstaller", :type => :shell, :privileged => false, :inline => build_binary_with_pyinstaller("bookworm")
b.vm.provision "run tests", :type => :shell, :privileged => false, :inline => run_tests("bookworm", ".*none.*")
b.vm.provision "packages debianoid", :type => :shell, :inline => packages_debianoid
b.vm.provision "build env", :type => :shell, :privileged => false, :inline => build_sys_venv("jessie64")
b.vm.provision "install borg", :type => :shell, :privileged => false, :inline => install_borg("jessie64")
b.vm.provision "run tests", :type => :shell, :privileged => false, :inline => run_tests("jessie64")
end
config.vm.define "bullseye" do |b|
b.vm.box = "debian/bullseye64"
b.vm.provider :virtualbox do |v|
v.memory = 1024 + $wmem
end
b.vm.provision "fs init", :type => :shell, :inline => fs_init("vagrant")
b.vm.provision "packages debianoid", :type => :shell, :inline => packages_debianoid("vagrant")
b.vm.provision "install pyenv", :type => :shell, :privileged => false, :inline => install_pyenv("bullseye")
b.vm.provision "install pythons", :type => :shell, :privileged => false, :inline => install_pythons("bullseye")
b.vm.provision "build env", :type => :shell, :privileged => false, :inline => build_pyenv_venv("bullseye")
b.vm.provision "install borg", :type => :shell, :privileged => false, :inline => install_borg("llfuse")
b.vm.provision "install pyinstaller", :type => :shell, :privileged => false, :inline => install_pyinstaller()
b.vm.provision "build binary with pyinstaller", :type => :shell, :privileged => false, :inline => build_binary_with_pyinstaller("bullseye")
b.vm.provision "run tests", :type => :shell, :privileged => false, :inline => run_tests("bullseye", ".*none.*")
config.vm.define "wheezy32" do |b|
b.vm.box = "boxcutter/debian79-i386"
b.vm.provision "packages prepare wheezy", :type => :shell, :inline => packages_prepare_wheezy
b.vm.provision "packages debianoid", :type => :shell, :inline => packages_debianoid
b.vm.provision "install pyenv", :type => :shell, :privileged => false, :inline => install_pyenv("wheezy32")
b.vm.provision "install pythons", :type => :shell, :privileged => false, :inline => install_pythons("wheezy32")
b.vm.provision "build env", :type => :shell, :privileged => false, :inline => build_pyenv_venv("wheezy32")
b.vm.provision "install borg", :type => :shell, :privileged => false, :inline => install_borg("wheezy32")
b.vm.provision "install pyinstaller", :type => :shell, :privileged => false, :inline => install_pyinstaller("wheezy32")
b.vm.provision "build binary with pyinstaller", :type => :shell, :privileged => false, :inline => build_binary_with_pyinstaller("wheezy32")
b.vm.provision "run tests", :type => :shell, :privileged => false, :inline => run_tests("wheezy32")
end
config.vm.define "freebsd13" do |b|
b.vm.box = "generic/freebsd13"
b.vm.provider :virtualbox do |v|
v.memory = 1024 + $wmem
end
b.ssh.shell = "sh"
b.vm.provision "fs init", :type => :shell, :inline => fs_init("vagrant")
b.vm.provision "packages freebsd", :type => :shell, :inline => packages_freebsd
b.vm.provision "install pyenv", :type => :shell, :privileged => false, :inline => install_pyenv("freebsd13")
b.vm.provision "install pythons", :type => :shell, :privileged => false, :inline => install_pythons("freebsd13")
b.vm.provision "build env", :type => :shell, :privileged => false, :inline => build_pyenv_venv("freebsd13")
b.vm.provision "install borg", :type => :shell, :privileged => false, :inline => install_borg("llfuse")
b.vm.provision "install pyinstaller", :type => :shell, :privileged => false, :inline => install_pyinstaller()
b.vm.provision "build binary with pyinstaller", :type => :shell, :privileged => false, :inline => build_binary_with_pyinstaller("freebsd13")
b.vm.provision "run tests", :type => :shell, :privileged => false, :inline => run_tests("freebsd13", ".*(pyfuse3|none).*")
config.vm.define "wheezy64" do |b|
b.vm.box = "boxcutter/debian79"
b.vm.provision "packages prepare wheezy", :type => :shell, :inline => packages_prepare_wheezy
b.vm.provision "packages debianoid", :type => :shell, :inline => packages_debianoid
b.vm.provision "install pyenv", :type => :shell, :privileged => false, :inline => install_pyenv("wheezy64")
b.vm.provision "install pythons", :type => :shell, :privileged => false, :inline => install_pythons("wheezy64")
b.vm.provision "build env", :type => :shell, :privileged => false, :inline => build_pyenv_venv("wheezy64")
b.vm.provision "install borg", :type => :shell, :privileged => false, :inline => install_borg("wheezy64")
b.vm.provision "install pyinstaller", :type => :shell, :privileged => false, :inline => install_pyinstaller("wheezy64")
b.vm.provision "build binary with pyinstaller", :type => :shell, :privileged => false, :inline => build_binary_with_pyinstaller("wheezy64")
b.vm.provision "run tests", :type => :shell, :privileged => false, :inline => run_tests("wheezy64")
end
config.vm.define "freebsd14" do |b|
b.vm.box = "generic/freebsd14"
b.vm.provider :virtualbox do |v|
v.memory = 1024 + $wmem
end
b.ssh.shell = "sh"
b.vm.provision "fs init", :type => :shell, :inline => fs_init("vagrant")
b.vm.provision "packages freebsd", :type => :shell, :inline => packages_freebsd
b.vm.provision "install pyenv", :type => :shell, :privileged => false, :inline => install_pyenv("freebsd14")
b.vm.provision "install pythons", :type => :shell, :privileged => false, :inline => install_pythons("freebsd14")
b.vm.provision "build env", :type => :shell, :privileged => false, :inline => build_pyenv_venv("freebsd14")
b.vm.provision "install borg", :type => :shell, :privileged => false, :inline => install_borg("llfuse")
b.vm.provision "install pyinstaller", :type => :shell, :privileged => false, :inline => install_pyinstaller()
b.vm.provision "build binary with pyinstaller", :type => :shell, :privileged => false, :inline => build_binary_with_pyinstaller("freebsd14")
b.vm.provision "run tests", :type => :shell, :privileged => false, :inline => run_tests("freebsd14", ".*(pyfuse3|none).*")
# OS X
config.vm.define "darwin64" do |b|
b.vm.box = "jhcook/yosemite-clitools"
b.vm.provision "packages darwin", :type => :shell, :privileged => false, :inline => packages_darwin
b.vm.provision "install pyenv", :type => :shell, :privileged => false, :inline => install_pyenv("darwin64")
b.vm.provision "fix pyenv", :type => :shell, :privileged => false, :inline => fix_pyenv_darwin("darwin64")
b.vm.provision "install pythons", :type => :shell, :privileged => false, :inline => install_pythons("darwin64")
b.vm.provision "build env", :type => :shell, :privileged => false, :inline => build_pyenv_venv("darwin64")
b.vm.provision "install borg", :type => :shell, :privileged => false, :inline => install_borg("darwin64")
b.vm.provision "install pyinstaller", :type => :shell, :privileged => false, :inline => install_pyinstaller("darwin64")
b.vm.provision "build binary with pyinstaller", :type => :shell, :privileged => false, :inline => build_binary_with_pyinstaller("darwin64")
b.vm.provision "run tests", :type => :shell, :privileged => false, :inline => run_tests("darwin64")
end
config.vm.define "openbsd7" do |b|
b.vm.box = "l3system/openbsd77-amd64"
# BSD
config.vm.define "freebsd64" do |b|
b.vm.box = "geoffgarside/freebsd-10.2"
b.vm.provider :virtualbox do |v|
v.memory = 1024 + $wmem
v.memory = 768
end
b.vm.provision "install system packages", :type => :shell, :inline => packages_freebsd
b.vm.provision "install pyenv", :type => :shell, :privileged => false, :inline => install_pyenv("freebsd")
b.vm.provision "install pythons", :type => :shell, :privileged => false, :inline => install_pythons("freebsd")
b.vm.provision "build env", :type => :shell, :privileged => false, :inline => build_pyenv_venv("freebsd")
b.vm.provision "install borg", :type => :shell, :privileged => false, :inline => install_borg("freebsd")
b.vm.provision "install pyinstaller", :type => :shell, :privileged => false, :inline => install_pyinstaller_bootloader("freebsd")
b.vm.provision "build binary with pyinstaller", :type => :shell, :privileged => false, :inline => build_binary_with_pyinstaller("freebsd")
b.vm.provision "run tests", :type => :shell, :privileged => false, :inline => run_tests("freebsd")
end
config.vm.define "openbsd64" do |b|
b.vm.box = "bodgit/openbsd-5.7-amd64"
b.vm.provider :virtualbox do |v|
v.memory = 768
end
b.vm.provision "fs init", :type => :shell, :inline => fs_init("vagrant")
b.vm.provision "packages openbsd", :type => :shell, :inline => packages_openbsd
b.vm.provision "build env", :type => :shell, :privileged => false, :inline => build_sys_venv("openbsd7")
b.vm.provision "install borg", :type => :shell, :privileged => false, :inline => install_borg("nofuse")
b.vm.provision "run tests", :type => :shell, :privileged => false, :inline => run_tests("openbsd7", ".*fuse.*")
b.vm.provision "build env", :type => :shell, :privileged => false, :inline => build_sys_venv("openbsd64")
b.vm.provision "install borg", :type => :shell, :privileged => false, :inline => install_borg("openbsd64")
b.vm.provision "run tests", :type => :shell, :privileged => false, :inline => run_tests("openbsd64")
end
config.vm.define "netbsd9" do |b|
b.vm.box = "generic/netbsd9"
config.vm.define "netbsd64" do |b|
b.vm.box = "alex-skimlinks/netbsd-6.1.5-amd64"
b.vm.provider :virtualbox do |v|
v.memory = 4096 + $wmem # need big /tmp tmpfs in RAM!
v.memory = 768
end
b.vm.provision "fs init", :type => :shell, :inline => fs_init("vagrant")
b.vm.provision "packages netbsd", :type => :shell, :inline => packages_netbsd
b.vm.provision "build env", :type => :shell, :privileged => false, :inline => build_sys_venv("netbsd9")
b.vm.provision "install borg", :type => :shell, :privileged => false, :inline => install_borg("nofuse")
b.vm.provision "run tests", :type => :shell, :privileged => false, :inline => run_tests("netbsd9", ".*fuse.*")
end
# rsync on OpenIndiana is troublesome: it does not set the correct owner for /vagrant/borg, which causes many
# permission errors. Manual fix inside the VM: sudo chown -R vagrant /vagrant/borg ; then run rsync again.
config.vm.define "openindiana" do |b|
b.vm.box = "openindiana/hipster"
b.vm.provider :virtualbox do |v|
v.memory = 2048 + $wmem
end
b.vm.provision "fs init", :type => :shell, :inline => fs_init("vagrant")
b.vm.provision "package update openindiana", :type => :shell, :inline => package_update_openindiana, :reboot => true
b.vm.provision "packages openindiana", :type => :shell, :inline => packages_openindiana
b.vm.provision "build env", :type => :shell, :privileged => false, :inline => build_sys_venv("openindiana")
b.vm.provision "install borg", :type => :shell, :privileged => false, :inline => install_borg("nofuse")
b.vm.provision "run tests", :type => :shell, :privileged => false, :inline => run_tests("openindiana", ".*fuse.*")
b.vm.provision "build env", :type => :shell, :privileged => false, :inline => build_sys_venv("netbsd64")
b.vm.provision "install borg", :type => :shell, :privileged => false, :inline => install_borg("netbsd64")
b.vm.provision "run tests", :type => :shell, :privileged => false, :inline => run_tests("netbsd64")
end
end

3
borg/__init__.py Normal file

@ -0,0 +1,3 @@
# This is a python package
from ._version import version as __version__

2
borg/__main__.py Normal file

@ -0,0 +1,2 @@
from borg.archiver import main
main()

270
borg/_chunker.c Normal file

@ -0,0 +1,270 @@
#include <Python.h>
#include <fcntl.h>
/* Cyclic polynomial / buzhash: https://en.wikipedia.org/wiki/Rolling_hash */
static uint32_t table_base[] =
{
0xe7f831ec, 0xf4026465, 0xafb50cae, 0x6d553c7a, 0xd639efe3, 0x19a7b895, 0x9aba5b21, 0x5417d6d4,
0x35fd2b84, 0xd1f6a159, 0x3f8e323f, 0xb419551c, 0xf444cebf, 0x21dc3b80, 0xde8d1e36, 0x84a32436,
0xbeb35a9d, 0xa36f24aa, 0xa4e60186, 0x98d18ffe, 0x3f042f9e, 0xdb228bcd, 0x096474b7, 0x5c20c2f7,
0xf9eec872, 0xe8625275, 0xb9d38f80, 0xd48eb716, 0x22a950b4, 0x3cbaaeaa, 0xc37cddd3, 0x8fea6f6a,
0x1d55d526, 0x7fd6d3b3, 0xdaa072ee, 0x4345ac40, 0xa077c642, 0x8f2bd45b, 0x28509110, 0x55557613,
0xffc17311, 0xd961ffef, 0xe532c287, 0xaab95937, 0x46d38365, 0xb065c703, 0xf2d91d0f, 0x92cd4bb0,
0x4007c712, 0xf35509dd, 0x505b2f69, 0x557ead81, 0x310f4563, 0xbddc5be8, 0x9760f38c, 0x701e0205,
0x00157244, 0x14912826, 0xdc4ca32b, 0x67b196de, 0x5db292e8, 0x8c1b406b, 0x01f34075, 0xfa2520f7,
0x73bc37ab, 0x1e18bc30, 0xfe2c6cb3, 0x20c522d0, 0x5639e3db, 0x942bda35, 0x899af9d1, 0xced44035,
0x98cc025b, 0x255f5771, 0x70fefa24, 0xe928fa4d, 0x2c030405, 0xb9325590, 0x20cb63bd, 0xa166305d,
0x80e52c0a, 0xa8fafe2f, 0x1ad13f7d, 0xcfaf3685, 0x6c83a199, 0x7d26718a, 0xde5dfcd9, 0x79cf7355,
0x8979d7fb, 0xebf8c55e, 0xebe408e4, 0xcd2affba, 0xe483be6e, 0xe239d6de, 0x5dc1e9e0, 0x0473931f,
0x851b097c, 0xac5db249, 0x09c0f9f2, 0xd8d2f134, 0xe6f38e41, 0xb1c71bf1, 0x52b6e4db, 0x07224424,
0x6cf73e85, 0x4f25d89c, 0x782a7d74, 0x10a68dcd, 0x3a868189, 0xd570d2dc, 0x69630745, 0x9542ed86,
0x331cd6b2, 0xa84b5b28, 0x07879c9d, 0x38372f64, 0x7185db11, 0x25ba7c83, 0x01061523, 0xe6792f9f,
0xe5df07d1, 0x4321b47f, 0x7d2469d8, 0x1a3a4f90, 0x48be29a3, 0x669071af, 0x8ec8dd31, 0x0810bfbf,
0x813a06b4, 0x68538345, 0x65865ddc, 0x43a71b8e, 0x78619a56, 0x5a34451d, 0x5bdaa3ed, 0x71edc7e9,
0x17ac9a20, 0x78d10bfa, 0x6c1e7f35, 0xd51839d9, 0x240cbc51, 0x33513cc1, 0xd2b4f795, 0xccaa8186,
0x0babe682, 0xa33cf164, 0x18c643ea, 0xc1ca105f, 0x9959147a, 0x6d3d94de, 0x0b654fbe, 0xed902ca0,
0x7d835cb5, 0x99ba1509, 0x6445c922, 0x495e76c2, 0xf07194bc, 0xa1631d7e, 0x677076a5, 0x89fffe35,
0x1a49bcf3, 0x8e6c948a, 0x0144c917, 0x8d93aea1, 0x16f87ddf, 0xc8f25d49, 0x1fb11297, 0x27e750cd,
0x2f422da1, 0xdee89a77, 0x1534c643, 0x457b7b8b, 0xaf172f7a, 0x6b9b09d6, 0x33573f7f, 0xf14e15c4,
0x526467d5, 0xaf488241, 0x87c3ee0d, 0x33be490c, 0x95aa6e52, 0x43ec242e, 0xd77de99b, 0xd018334f,
0x5b78d407, 0x498eb66b, 0xb1279fa8, 0xb38b0ea6, 0x90718376, 0xe325dee2, 0x8e2f2cba, 0xcaa5bdec,
0x9d652c56, 0xad68f5cb, 0xa77591af, 0x88e37ee8, 0xf8faa221, 0xfcbbbe47, 0x4f407786, 0xaf393889,
0xf444a1d9, 0x15ae1a2f, 0x40aa7097, 0x6f9486ac, 0x29d232a3, 0xe47609e9, 0xe8b631ff, 0xba8565f4,
0x11288749, 0x46c9a838, 0xeb1b7cd8, 0xf516bbb1, 0xfb74fda0, 0x010996e6, 0x4c994653, 0x1d889512,
0x53dcd9a3, 0xdd074697, 0x1e78e17c, 0x637c98bf, 0x930bb219, 0xcf7f75b0, 0xcb9355fb, 0x9e623009,
0xe466d82c, 0x28f968d3, 0xfeb385d9, 0x238e026c, 0xb8ed0560, 0x0c6a027a, 0x3d6fec4b, 0xbb4b2ec2,
0xe715031c, 0xeded011d, 0xcdc4d3b9, 0xc456fc96, 0xdd0eea20, 0xb3df8ec9, 0x12351993, 0xd9cbb01c,
0x603147a2, 0xcf37d17d, 0xf7fcd9dc, 0xd8556fa3, 0x104c8131, 0x13152774, 0xb4715811, 0x6a72c2c9,
0xc5ae37bb, 0xa76ce12a, 0x8150d8f3, 0x2ec29218, 0xa35f0984, 0x48c0647e, 0x0b5ff98c, 0x71893f7b
};
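/* Rotate left by shift bits; the right-shift count is masked so that shift == 0 never shifts by 32 (undefined behavior). */
#define BARREL_SHIFT(v, shift) ( ((v) << (shift)) | ((v) >> ((32 - (shift)) & 0x1f)) )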
size_t pagemask;
static uint32_t *
buzhash_init_table(uint32_t seed)
{
int i;
uint32_t *table = malloc(256 * sizeof(uint32_t));
for(i = 0; i < 256; i++)
{
table[i] = table_base[i] ^ seed;
}
return table;
}
static uint32_t
buzhash(const unsigned char *data, size_t len, const uint32_t *h)
{
uint32_t i;
uint32_t sum = 0, imod;
for(i = len - 1; i > 0; i--)
{
imod = i & 0x1f;
sum ^= BARREL_SHIFT(h[*data], imod);
data++;
}
return sum ^ h[*data];
}
static uint32_t
buzhash_update(uint32_t sum, unsigned char remove, unsigned char add, size_t len, const uint32_t *h)
{
uint32_t lenmod = len & 0x1f;
return BARREL_SHIFT(sum, 1) ^ BARREL_SHIFT(h[remove], lenmod) ^ h[add];
}
typedef struct {
uint32_t chunk_mask;
uint32_t *table;
uint8_t *data;
PyObject *fd;
int fh;
int done, eof;
size_t min_size, buf_size, window_size, remaining, position, last;
off_t bytes_read, bytes_yielded;
} Chunker;
static Chunker *
chunker_init(size_t window_size, uint32_t chunk_mask, size_t min_size, size_t max_size, uint32_t seed)
{
Chunker *c = calloc(1, sizeof(Chunker));
c->window_size = window_size;
c->chunk_mask = chunk_mask;
c->min_size = min_size;
c->table = buzhash_init_table(seed);
c->buf_size = max_size;
c->data = malloc(c->buf_size);
c->fh = -1;
return c;
}
static void
chunker_set_fd(Chunker *c, PyObject *fd, int fh)
{
Py_XDECREF(c->fd);
c->fd = fd;
Py_INCREF(fd);
c->fh = fh;
c->done = 0;
c->remaining = 0;
c->bytes_read = 0;
c->bytes_yielded = 0;
c->position = 0;
c->last = 0;
c->eof = 0;
}
static void
chunker_free(Chunker *c)
{
Py_XDECREF(c->fd);
free(c->table);
free(c->data);
free(c);
}
static int
chunker_fill(Chunker *c)
{
ssize_t n;
off_t offset, length;
int overshoot;
PyObject *data;
memmove(c->data, c->data + c->last, c->position + c->remaining - c->last);
c->position -= c->last;
c->last = 0;
n = c->buf_size - c->position - c->remaining;
if(c->eof || n == 0) {
return 1;
}
if(c->fh >= 0) {
offset = c->bytes_read;
// if we have an OS-level file descriptor, use the OS-level API
n = read(c->fh, c->data + c->position + c->remaining, n);
if(n > 0) {
c->remaining += n;
c->bytes_read += n;
}
else
if(n == 0) {
c->eof = 1;
}
else {
// some error happened
PyErr_SetFromErrno(PyExc_OSError);
return 0;
}
length = c->bytes_read - offset;
#if ( ( _XOPEN_SOURCE >= 600 || _POSIX_C_SOURCE >= 200112L ) && defined(POSIX_FADV_DONTNEED) )
// Only do it once per run.
if (pagemask == 0)
pagemask = getpagesize() - 1;
// Tell the OS that we no longer need the data we have just read (which it
// may be holding in the page cache). Otherwise data that is read only once
// would fill the cache and, because the cache size is limited, evict data
// that is still useful to the OS or to other processes.
// We round the start offset down to a page boundary so that the request
// does not begin with a partial page, which would be left in the cache.
if (length > 0) {
// Linux kernels prior to 4.7 have a bug where they truncate
// last partial page of POSIX_FADV_DONTNEED request, so we need
// to page-align it ourselves. We'll need the rest of this page
// on the next read (assuming this was not EOF)
overshoot = (offset + length) & pagemask;
} else {
// For length == 0 we set overshoot 0, so the below
// length - overshoot is 0, which means till end of file for
// fadvise. This will cancel the final page and is not part
// of the above workaround.
overshoot = 0;
}
posix_fadvise(c->fh, offset & ~pagemask, length - overshoot, POSIX_FADV_DONTNEED);
#endif
}
else {
// no OS-level file descriptor, use the Python file object API
// n is an ssize_t, so pass it with the "n" (Py_ssize_t) format, not "i" (int)
data = PyObject_CallMethod(c->fd, "read", "n", (Py_ssize_t) n);
if(!data) {
return 0;
}
n = PyBytes_Size(data);
if(PyErr_Occurred()) {
// we wanted bytes(), but got something else
Py_DECREF(data);
return 0;
}
if(n) {
memcpy(c->data + c->position + c->remaining, PyBytes_AsString(data), n);
c->remaining += n;
c->bytes_read += n;
}
else {
c->eof = 1;
}
Py_DECREF(data);
}
return 1;
}
static PyObject *
chunker_process(Chunker *c)
{
uint32_t sum, chunk_mask = c->chunk_mask;
size_t n = 0, old_last, min_size = c->min_size, window_size = c->window_size;
if(c->done) {
if(c->bytes_read == c->bytes_yielded)
PyErr_SetNone(PyExc_StopIteration);
else
PyErr_SetString(PyExc_Exception, "chunkifier byte count mismatch");
return NULL;
}
while(c->remaining <= window_size && !c->eof) {
if(!chunker_fill(c)) {
return NULL;
}
}
if(c->eof) {
c->done = 1;
if(c->remaining) {
c->bytes_yielded += c->remaining;
return PyMemoryView_FromMemory((char *)(c->data + c->position), c->remaining, PyBUF_READ);
}
else {
if(c->bytes_read == c->bytes_yielded)
PyErr_SetNone(PyExc_StopIteration);
else
PyErr_SetString(PyExc_Exception, "chunkifier byte count mismatch");
return NULL;
}
}
sum = buzhash(c->data + c->position, window_size, c->table);
while(c->remaining > c->window_size && ((sum & chunk_mask) || n < min_size)) {
sum = buzhash_update(sum, c->data[c->position],
c->data[c->position + window_size],
window_size, c->table);
c->position++;
c->remaining--;
n++;
if(c->remaining <= window_size) {
if(!chunker_fill(c)) {
return NULL;
}
}
}
if(c->remaining <= window_size) {
c->position += c->remaining;
c->remaining = 0;
}
old_last = c->last;
c->last = c->position;
n = c->last - old_last;
c->bytes_yielded += n;
return PyMemoryView_FromMemory((char *)(c->data + old_last), n, PyBUF_READ);
}
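
The rolling property that buzhash_update() relies on can be checked with a small Python sketch. This is only an illustration, not the project's code: the names rotl32 and make_table are made up here, and make_table fills the table with pseudo-random values as a stand-in for table_base. The sketch computes the hash of a window from scratch, then slides the window one byte at a time and confirms that the incremental update gives the same value.

```python
import random

MASK32 = 0xFFFFFFFF


def rotl32(v, shift):
    """Rotate a 32-bit value left; the shift count is reduced modulo 32."""
    shift &= 0x1F
    return ((v << shift) | (v >> ((32 - shift) & 0x1F))) & MASK32


def make_table(seed):
    # Hypothetical stand-in for table_base: 256 pseudo-random 32-bit values,
    # each XORed with the seed, as buzhash_init_table() does with the real table.
    rng = random.Random(0)
    return [rng.getrandbits(32) ^ seed for _ in range(256)]


def buzhash(data, table):
    """Hash a whole window: the byte at offset k is rotated by (len - 1 - k)."""
    n = len(data)
    total = 0
    for k, byte in enumerate(data):
        total ^= rotl32(table[byte], n - 1 - k)
    return total


def buzhash_update(total, remove, add, window, table):
    """Slide the window by one byte: drop `remove`, append `add`."""
    return rotl32(total, 1) ^ rotl32(table[remove], window) ^ table[add]


# Slide a 48-byte window over some test data and compare both computations.
data = bytes(random.Random(1).getrandbits(8) for _ in range(200))
window = 48
table = make_table(seed=0x12345678)
h = buzhash(data[:window], table)
for pos in range(len(data) - window):
    h = buzhash_update(h, data[pos], data[pos + window], window, table)
    assert h == buzhash(data[pos + 1:pos + 1 + window], table)
```

Rotating the old sum by one bit moves every byte's contribution to the rotation it needs in the new window. The removed byte ends up rotated by the window length, which is why that term is XORed out, and the new byte enters unrotated.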

441
borg/_hashindex.c Normal file

@ -0,0 +1,441 @@
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#if defined (__SVR4) && defined (__sun)
#include <sys/isa_defs.h>
#endif
#if (defined(BYTE_ORDER)&&(BYTE_ORDER == BIG_ENDIAN)) || \
(defined(_BIG_ENDIAN)&&defined(__SVR4)&&defined(__sun))
#define _le32toh(x) __builtin_bswap32(x)
#define _htole32(x) __builtin_bswap32(x)
#elif (defined(BYTE_ORDER)&&(BYTE_ORDER == LITTLE_ENDIAN)) || \
(defined(_LITTLE_ENDIAN)&&defined(__SVR4)&&defined(__sun))
#define _le32toh(x) (x)
#define _htole32(x) (x)
#else
#error Unknown byte order
#endif
#define MAGIC "BORG_IDX"
#define MAGIC_LEN 8
typedef struct {
char magic[MAGIC_LEN];
int32_t num_entries;
int32_t num_buckets;
int8_t key_size;
int8_t value_size;
} __attribute__((__packed__)) HashHeader;
typedef struct {
void *buckets;
int num_entries;
int num_buckets;
int key_size;
int value_size;
off_t bucket_size;
int lower_limit;
int upper_limit;
} HashIndex;
/* Hash table sizes: primes (or numbers with big prime factors).
* Primes may not be strictly needed for borg's usage (the keys come from a
* sha256-based hash function, so we can assume an even, seemingly random
* distribution of values), but they do no harm.
* The sizes first grow in fast 2x steps, then more and more slowly, down to
* about 1.1x. This avoids huge jumps in memory allocation, e.g. 4G -> 8G.
* These values are generated by hash_sizes.py.
*/
static int hash_sizes[] = {
1031, 2053, 4099, 8209, 16411, 32771, 65537, 131101, 262147, 445649,
757607, 1287917, 2189459, 3065243, 4291319, 6007867, 8410991,
11775359, 16485527, 23079703, 27695653, 33234787, 39881729, 47858071,
57429683, 68915617, 82698751, 99238507, 119086189, 144378011, 157223263,
173476439, 190253911, 209915011, 230493629, 253169431, 278728861,
306647623, 337318939, 370742809, 408229973, 449387209, 493428073,
543105119, 596976533, 657794869, 722676499, 795815791, 874066969,
962279771, 1057701643, 1164002657, 1280003147, 1407800297, 1548442699,
1703765389, 1873768367, 2062383853, /* 32bit int ends about here */
};
#define HASH_MIN_LOAD .25
#define HASH_MAX_LOAD .75 /* don't go higher than 0.75, otherwise performance severely suffers! */
#define MAX(x, y) ((x) > (y) ? (x): (y))
#define NELEMS(x) (sizeof(x) / sizeof((x)[0]))
#define EMPTY _htole32(0xffffffff)
#define DELETED _htole32(0xfffffffe)
#define BUCKET_ADDR(index, idx) (index->buckets + (idx * index->bucket_size))
#define BUCKET_MATCHES_KEY(index, idx, key) (memcmp(key, BUCKET_ADDR(index, idx), index->key_size) == 0)
#define BUCKET_IS_DELETED(index, idx) (*((uint32_t *)(BUCKET_ADDR(index, idx) + index->key_size)) == DELETED)
#define BUCKET_IS_EMPTY(index, idx) (*((uint32_t *)(BUCKET_ADDR(index, idx) + index->key_size)) == EMPTY)
#define BUCKET_MARK_DELETED(index, idx) (*((uint32_t *)(BUCKET_ADDR(index, idx) + index->key_size)) = DELETED)
#define BUCKET_MARK_EMPTY(index, idx) (*((uint32_t *)(BUCKET_ADDR(index, idx) + index->key_size)) = EMPTY)
#define EPRINTF_MSG(msg, ...) fprintf(stderr, "hashindex: " msg "\n", ##__VA_ARGS__)
#define EPRINTF_MSG_PATH(path, msg, ...) fprintf(stderr, "hashindex: %s: " msg "\n", path, ##__VA_ARGS__)
#define EPRINTF(msg, ...) fprintf(stderr, "hashindex: " msg " (%s)\n", ##__VA_ARGS__, strerror(errno))
#define EPRINTF_PATH(path, msg, ...) fprintf(stderr, "hashindex: %s: " msg " (%s)\n", path, ##__VA_ARGS__, strerror(errno))
static HashIndex *hashindex_read(const char *path);
static int hashindex_write(HashIndex *index, const char *path);
static HashIndex *hashindex_init(int capacity, int key_size, int value_size);
static const void *hashindex_get(HashIndex *index, const void *key);
static int hashindex_set(HashIndex *index, const void *key, const void *value);
static int hashindex_delete(HashIndex *index, const void *key);
static void *hashindex_next_key(HashIndex *index, const void *key);
/* Private API */
static int
hashindex_index(HashIndex *index, const void *key)
{
return _le32toh(*((uint32_t *)key)) % index->num_buckets;
}
static int
hashindex_lookup(HashIndex *index, const void *key)
{
int didx = -1;
int start = hashindex_index(index, key);
int idx = start;
for(;;) {
if(BUCKET_IS_EMPTY(index, idx))
{
return -1;
}
if(BUCKET_IS_DELETED(index, idx)) {
if(didx == -1) {
didx = idx;
}
}
else if(BUCKET_MATCHES_KEY(index, idx, key)) {
if (didx != -1) {
memcpy(BUCKET_ADDR(index, didx), BUCKET_ADDR(index, idx), index->bucket_size);
BUCKET_MARK_DELETED(index, idx);
idx = didx;
}
return idx;
}
idx = (idx + 1) % index->num_buckets;
if(idx == start) {
return -1;
}
}
}
static int
hashindex_resize(HashIndex *index, int capacity)
{
HashIndex *new;
void *key = NULL;
int32_t key_size = index->key_size;
if(!(new = hashindex_init(capacity, key_size, index->value_size))) {
return 0;
}
while((key = hashindex_next_key(index, key))) {
hashindex_set(new, key, key + key_size);
}
free(index->buckets);
index->buckets = new->buckets;
index->num_buckets = new->num_buckets;
index->lower_limit = new->lower_limit;
index->upper_limit = new->upper_limit;
free(new);
return 1;
}
int get_lower_limit(int num_buckets){
int min_buckets = hash_sizes[0];
if (num_buckets <= min_buckets)
return 0;
return (int)(num_buckets * HASH_MIN_LOAD);
}
int get_upper_limit(int num_buckets){
int max_buckets = hash_sizes[NELEMS(hash_sizes) - 1];
if (num_buckets >= max_buckets)
return num_buckets;
return (int)(num_buckets * HASH_MAX_LOAD);
}
int size_idx(int size){
/* find the hash_sizes index with entry >= size */
int elems = NELEMS(hash_sizes);
int entry, i=0;
do{
entry = hash_sizes[i++];
}while((entry < size) && (i < elems));
if (i >= elems)
return elems - 1;
i--;
return i;
}
int fit_size(int current){
int i = size_idx(current);
return hash_sizes[i];
}
int grow_size(int current){
int i = size_idx(current) + 1;
int elems = NELEMS(hash_sizes);
if (i >= elems)
return hash_sizes[elems - 1];
return hash_sizes[i];
}
int shrink_size(int current){
int i = size_idx(current) - 1;
if (i < 0)
return hash_sizes[0];
return hash_sizes[i];
}
/* Public API */
static HashIndex *
hashindex_read(const char *path)
{
FILE *fd;
off_t length, buckets_length, bytes_read;
HashHeader header;
HashIndex *index = NULL;
if((fd = fopen(path, "rb")) == NULL) {
EPRINTF_PATH(path, "fopen for reading failed");
return NULL;
}
bytes_read = fread(&header, 1, sizeof(HashHeader), fd);
if(bytes_read != sizeof(HashHeader)) {
if(ferror(fd)) {
EPRINTF_PATH(path, "fread header failed (expected %ju, got %ju)",
(uintmax_t) sizeof(HashHeader), (uintmax_t) bytes_read);
}
else {
EPRINTF_MSG_PATH(path, "fread header failed (expected %ju, got %ju)",
(uintmax_t) sizeof(HashHeader), (uintmax_t) bytes_read);
}
goto fail;
}
if(fseek(fd, 0, SEEK_END) < 0) {
EPRINTF_PATH(path, "fseek failed");
goto fail;
}
if((length = ftell(fd)) < 0) {
EPRINTF_PATH(path, "ftell failed");
goto fail;
}
if(fseek(fd, sizeof(HashHeader), SEEK_SET) < 0) {
EPRINTF_PATH(path, "fseek failed");
goto fail;
}
if(memcmp(header.magic, MAGIC, MAGIC_LEN)) {
EPRINTF_MSG_PATH(path, "Unknown MAGIC in header");
goto fail;
}
buckets_length = (off_t)_le32toh(header.num_buckets) * (header.key_size + header.value_size);
if((size_t) length != sizeof(HashHeader) + buckets_length) {
EPRINTF_MSG_PATH(path, "Incorrect file length (expected %ju, got %ju)",
(uintmax_t) sizeof(HashHeader) + buckets_length, (uintmax_t) length);
goto fail;
}
if(!(index = malloc(sizeof(HashIndex)))) {
EPRINTF_PATH(path, "malloc header failed");
goto fail;
}
if(!(index->buckets = malloc(buckets_length))) {
EPRINTF_PATH(path, "malloc buckets failed");
free(index);
index = NULL;
goto fail;
}
bytes_read = fread(index->buckets, 1, buckets_length, fd);
if(bytes_read != buckets_length) {
if(ferror(fd)) {
EPRINTF_PATH(path, "fread buckets failed (expected %ju, got %ju)",
(uintmax_t) buckets_length, (uintmax_t) bytes_read);
}
else {
EPRINTF_MSG_PATH(path, "fread buckets failed (expected %ju, got %ju)",
(uintmax_t) buckets_length, (uintmax_t) bytes_read);
}
free(index->buckets);
free(index);
index = NULL;
goto fail;
}
index->num_entries = _le32toh(header.num_entries);
index->num_buckets = _le32toh(header.num_buckets);
index->key_size = header.key_size;
index->value_size = header.value_size;
index->bucket_size = index->key_size + index->value_size;
index->lower_limit = get_lower_limit(index->num_buckets);
index->upper_limit = get_upper_limit(index->num_buckets);
fail:
if(fclose(fd) < 0) {
EPRINTF_PATH(path, "fclose failed");
}
return index;
}
static HashIndex *
hashindex_init(int capacity, int key_size, int value_size)
{
HashIndex *index;
int i;
capacity = fit_size(capacity);
if(!(index = malloc(sizeof(HashIndex)))) {
EPRINTF("malloc header failed");
return NULL;
}
if(!(index->buckets = calloc(capacity, key_size + value_size))) {
EPRINTF("malloc buckets failed");
free(index);
return NULL;
}
index->num_entries = 0;
index->key_size = key_size;
index->value_size = value_size;
index->num_buckets = capacity;
index->bucket_size = index->key_size + index->value_size;
index->lower_limit = get_lower_limit(index->num_buckets);
index->upper_limit = get_upper_limit(index->num_buckets);
for(i = 0; i < capacity; i++) {
BUCKET_MARK_EMPTY(index, i);
}
return index;
}
static void
hashindex_free(HashIndex *index)
{
free(index->buckets);
free(index);
}
static int
hashindex_write(HashIndex *index, const char *path)
{
off_t buckets_length = (off_t)index->num_buckets * index->bucket_size;
FILE *fd;
HashHeader header = {
.magic = MAGIC,
.num_entries = _htole32(index->num_entries),
.num_buckets = _htole32(index->num_buckets),
.key_size = index->key_size,
.value_size = index->value_size
};
int ret = 1;
if((fd = fopen(path, "wb")) == NULL) {
EPRINTF_PATH(path, "fopen for writing failed");
return 0;
}
if(fwrite(&header, 1, sizeof(header), fd) != sizeof(header)) {
EPRINTF_PATH(path, "fwrite header failed");
ret = 0;
}
if(fwrite(index->buckets, 1, buckets_length, fd) != (size_t) buckets_length) {
EPRINTF_PATH(path, "fwrite buckets failed");
ret = 0;
}
if(fclose(fd) < 0) {
EPRINTF_PATH(path, "fclose failed");
}
return ret;
}
static const void *
hashindex_get(HashIndex *index, const void *key)
{
int idx = hashindex_lookup(index, key);
if(idx < 0) {
return NULL;
}
return BUCKET_ADDR(index, idx) + index->key_size;
}
static int
hashindex_set(HashIndex *index, const void *key, const void *value)
{
int idx = hashindex_lookup(index, key);
uint8_t *ptr;
if(idx < 0)
{
if(index->num_entries > index->upper_limit) {
if(!hashindex_resize(index, grow_size(index->num_buckets))) {
return 0;
}
}
idx = hashindex_index(index, key);
while(!BUCKET_IS_EMPTY(index, idx) && !BUCKET_IS_DELETED(index, idx)) {
idx = (idx + 1) % index->num_buckets;
}
ptr = BUCKET_ADDR(index, idx);
memcpy(ptr, key, index->key_size);
memcpy(ptr + index->key_size, value, index->value_size);
index->num_entries += 1;
}
else
{
memcpy(BUCKET_ADDR(index, idx) + index->key_size, value, index->value_size);
}
return 1;
}
static int
hashindex_delete(HashIndex *index, const void *key)
{
int idx = hashindex_lookup(index, key);
if (idx < 0) {
return 1;
}
BUCKET_MARK_DELETED(index, idx);
index->num_entries -= 1;
if(index->num_entries < index->lower_limit) {
if(!hashindex_resize(index, shrink_size(index->num_buckets))) {
return 0;
}
}
return 1;
}
static void *
hashindex_next_key(HashIndex *index, const void *key)
{
int idx = 0;
if(key) {
idx = 1 + (key - index->buckets) / index->bucket_size;
}
if (idx == index->num_buckets) {
return NULL;
}
while(BUCKET_IS_EMPTY(index, idx) || BUCKET_IS_DELETED(index, idx)) {
idx ++;
if (idx == index->num_buckets) {
return NULL;
}
}
return BUCKET_ADDR(index, idx);
}
static int
hashindex_get_size(HashIndex *index)
{
return index->num_entries;
}
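
The lookup, set and delete logic above is open addressing with linear probing and "deleted" markers (tombstones). The following Python sketch shows the same idea under simplifying assumptions: the class name TinyIndex and its methods are invented for this illustration, the bucket count is fixed (there is no resizing), and buckets hold Python tuples instead of packed bytes.

```python
EMPTY = object()    # bucket never used: a probe sequence may stop here
DELETED = object()  # tombstone: probing must continue past it


class TinyIndex:
    """Illustrative fixed-size table; the C code resizes, this sketch does not."""

    def __init__(self, num_buckets=7):
        self.buckets = [EMPTY] * num_buckets
        self.num_entries = 0

    def _start(self, key):
        # Like hashindex_index(): first 4 key bytes, little-endian, modulo size.
        return int.from_bytes(key[:4], "little") % len(self.buckets)

    def _lookup(self, key):
        start = idx = self._start(key)
        first_deleted = -1
        while True:
            bucket = self.buckets[idx]
            if bucket is EMPTY:
                return -1
            if bucket is DELETED:
                if first_deleted == -1:
                    first_deleted = idx
            elif bucket[0] == key:
                if first_deleted != -1:
                    # Move the entry into the earlier tombstone so that
                    # later lookups of this key need fewer probes.
                    self.buckets[first_deleted] = bucket
                    self.buckets[idx] = DELETED
                    idx = first_deleted
                return idx
            idx = (idx + 1) % len(self.buckets)
            if idx == start:
                return -1

    def get(self, key):
        idx = self._lookup(key)
        return None if idx < 0 else self.buckets[idx][1]

    def set(self, key, value):
        idx = self._lookup(key)
        if idx < 0:
            if self.num_entries >= len(self.buckets):
                raise MemoryError("index full (the C code would resize instead)")
            idx = self._start(key)
            while self.buckets[idx] is not EMPTY and self.buckets[idx] is not DELETED:
                idx = (idx + 1) % len(self.buckets)
            self.num_entries += 1
        self.buckets[idx] = (key, value)

    def delete(self, key):
        idx = self._lookup(key)
        if idx >= 0:
            self.buckets[idx] = DELETED
            self.num_entries -= 1


# Three keys that all start probing at bucket 0 in a 7-bucket table.
a, b, c = ((n).to_bytes(4, "little") for n in (0, 7, 14))
index = TinyIndex()
for key, value in ((a, "a"), (b, "b"), (c, "c")):
    index.set(key, value)
index.delete(b)               # bucket 1 becomes a tombstone
assert index.get(c) == "c"    # found at bucket 2, then moved to bucket 1
assert index.buckets[1][0] == c
assert index.get(b) is None
```

A tombstone is needed because simply marking the bucket empty would cut the probe sequence and hide entries stored behind it, such as `c` in the example.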

888
borg/archive.py Normal file

@ -0,0 +1,888 @@
from binascii import hexlify
from datetime import datetime, timezone
from getpass import getuser
from itertools import groupby
import errno
from .logger import create_logger
logger = create_logger()
from .key import key_factory
from .remote import cache_if_remote
import os
import socket
import stat
import sys
import time
from io import BytesIO
from . import xattr
from .helpers import Error, uid2user, user2uid, gid2group, group2gid, \
parse_timestamp, to_localtime, format_time, format_timedelta, \
Manifest, Statistics, decode_dict, make_path_safe, StableDict, int_to_bigint, bigint_to_int, \
ProgressIndicatorPercent
from .platform import acl_get, acl_set
from .chunker import Chunker
from .hashindex import ChunkIndex
import msgpack
ITEMS_BUFFER = 1024 * 1024
CHUNK_MIN_EXP = 19 # 2**19 == 512KiB
CHUNK_MAX_EXP = 23 # 2**23 == 8MiB
HASH_WINDOW_SIZE = 0xfff # 4095B
HASH_MASK_BITS = 21 # results in ~2MiB chunks statistically
# defaults, use --chunker-params to override
CHUNKER_PARAMS = (CHUNK_MIN_EXP, CHUNK_MAX_EXP, HASH_MASK_BITS, HASH_WINDOW_SIZE)
# chunker params for the items metadata stream, finer granularity
ITEMS_CHUNKER_PARAMS = (12, 16, 14, HASH_WINDOW_SIZE)
has_lchmod = hasattr(os, 'lchmod')
has_lchflags = hasattr(os, 'lchflags')
flags_normal = os.O_RDONLY | getattr(os, 'O_BINARY', 0)
flags_noatime = flags_normal | getattr(os, 'O_NOATIME', 0)
class DownloadPipeline:
def __init__(self, repository, key):
self.repository = repository
self.key = key
def unpack_many(self, ids, filter=None, preload=False):
unpacker = msgpack.Unpacker(use_list=False)
for data in self.fetch_many(ids):
unpacker.feed(data)
items = [decode_dict(item, (b'path', b'source', b'user', b'group')) for item in unpacker]
if filter:
items = [item for item in items if filter(item)]
if preload:
for item in items:
if b'chunks' in item:
self.repository.preload([c[0] for c in item[b'chunks']])
for item in items:
yield item
def fetch_many(self, ids, is_preloaded=False):
for id_, data in zip(ids, self.repository.get_many(ids, is_preloaded=is_preloaded)):
yield self.key.decrypt(id_, data)
class ChunkBuffer:
BUFFER_SIZE = 1 * 1024 * 1024
def __init__(self, key, chunker_params=ITEMS_CHUNKER_PARAMS):
self.buffer = BytesIO()
self.packer = msgpack.Packer(unicode_errors='surrogateescape')
self.chunks = []
self.key = key
self.chunker = Chunker(self.key.chunk_seed, *chunker_params)
def add(self, item):
self.buffer.write(self.packer.pack(StableDict(item)))
if self.is_full():
self.flush()
def write_chunk(self, chunk):
raise NotImplementedError
def flush(self, flush=False):
if self.buffer.tell() == 0:
return
self.buffer.seek(0)
chunks = list(bytes(s) for s in self.chunker.chunkify(self.buffer))
self.buffer.seek(0)
self.buffer.truncate(0)
# Leave the last partial chunk in the buffer unless flush is True
end = None if flush or len(chunks) == 1 else -1
for chunk in chunks[:end]:
self.chunks.append(self.write_chunk(chunk))
if end == -1:
self.buffer.write(chunks[-1])
def is_full(self):
return self.buffer.tell() > self.BUFFER_SIZE
class CacheChunkBuffer(ChunkBuffer):
def __init__(self, cache, key, stats, chunker_params=ITEMS_CHUNKER_PARAMS):
super().__init__(key, chunker_params)
self.cache = cache
self.stats = stats
def write_chunk(self, chunk):
id_, _, _ = self.cache.add_chunk(self.key.id_hash(chunk), chunk, self.stats)
return id_
class Archive:
class DoesNotExist(Error):
"""Archive {} does not exist"""
class AlreadyExists(Error):
"""Archive {} already exists"""
class IncompatibleFilesystemEncodingError(Error):
"""Failed to encode filename "{}" into file system encoding "{}". Consider configuring the LANG environment variable."""
def __init__(self, repository, key, manifest, name, cache=None, create=False,
checkpoint_interval=300, numeric_owner=False, progress=False,
chunker_params=CHUNKER_PARAMS, start=None, end=None):
self.cwd = os.getcwd()
self.key = key
self.repository = repository
self.cache = cache
self.manifest = manifest
self.hard_links = {}
self.stats = Statistics()
self.show_progress = progress
self.name = name
self.checkpoint_interval = checkpoint_interval
self.numeric_owner = numeric_owner
if start is None:
start = datetime.utcnow()
self.start = start
if end is None:
end = datetime.utcnow()
self.end = end
self.pipeline = DownloadPipeline(self.repository, self.key)
if create:
self.items_buffer = CacheChunkBuffer(self.cache, self.key, self.stats)
self.chunker = Chunker(self.key.chunk_seed, *chunker_params)
if name in manifest.archives:
raise self.AlreadyExists(name)
self.last_checkpoint = time.time()
i = 0
while True:
self.checkpoint_name = '%s.checkpoint%s' % (name, i and ('.%d' % i) or '')
if self.checkpoint_name not in manifest.archives:
break
i += 1
else:
if name not in self.manifest.archives:
raise self.DoesNotExist(name)
info = self.manifest.archives[name]
self.load(info[b'id'])
self.zeros = b'\0' * (1 << chunker_params[1])
def _load_meta(self, id):
data = self.key.decrypt(id, self.repository.get(id))
metadata = msgpack.unpackb(data)
if metadata[b'version'] != 1:
raise Exception('Unknown archive metadata version')
return metadata
def load(self, id):
self.id = id
self.metadata = self._load_meta(self.id)
decode_dict(self.metadata, (b'name', b'hostname', b'username', b'time', b'time_end'))
self.metadata[b'cmdline'] = [arg.decode('utf-8', 'surrogateescape') for arg in self.metadata[b'cmdline']]
self.name = self.metadata[b'name']
@property
def ts(self):
"""Timestamp of archive creation (start) in UTC"""
ts = self.metadata[b'time']
return parse_timestamp(ts)
@property
def ts_end(self):
"""Timestamp of archive creation (end) in UTC"""
# fall back to time if there is no time_end present in metadata
ts = self.metadata.get(b'time_end') or self.metadata[b'time']
return parse_timestamp(ts)
@property
def fpr(self):
return hexlify(self.id).decode('ascii')
@property
def duration(self):
return format_timedelta(self.end - self.start)
def __str__(self):
return '''\
Archive name: {0.name}
Archive fingerprint: {0.fpr}
Time (start): {start}
Time (end): {end}
Duration: {0.duration}
Number of files: {0.stats.nfiles}'''.format(
self,
start=format_time(to_localtime(self.start.replace(tzinfo=timezone.utc))),
end=format_time(to_localtime(self.end.replace(tzinfo=timezone.utc))))
def __repr__(self):
return 'Archive(%r)' % self.name
def iter_items(self, filter=None, preload=False):
for item in self.pipeline.unpack_many(self.metadata[b'items'], filter=filter, preload=preload):
yield item
def add_item(self, item):
unknown_keys = set(item) - ITEM_KEYS
assert not unknown_keys, ('unknown item metadata keys detected, please update ITEM_KEYS: %s' %
','.join(k.decode('ascii') for k in unknown_keys))
if self.show_progress:
self.stats.show_progress(item=item, dt=0.2)
self.items_buffer.add(item)
if time.time() - self.last_checkpoint > self.checkpoint_interval:
self.write_checkpoint()
self.last_checkpoint = time.time()
def write_checkpoint(self):
self.save(self.checkpoint_name)
del self.manifest.archives[self.checkpoint_name]
self.cache.chunk_decref(self.id, self.stats)
def save(self, name=None, timestamp=None):
name = name or self.name
if name in self.manifest.archives:
raise self.AlreadyExists(name)
self.items_buffer.flush(flush=True)
if timestamp is None:
self.end = datetime.utcnow()
start = self.start
end = self.end
else:
self.end = timestamp
start = timestamp
end = timestamp # we only have 1 value
metadata = StableDict({
'version': 1,
'name': name,
'items': self.items_buffer.chunks,
'cmdline': sys.argv,
'hostname': socket.gethostname(),
'username': getuser(),
'time': start.isoformat(),
'time_end': end.isoformat(),
})
data = msgpack.packb(metadata, unicode_errors='surrogateescape')
self.id = self.key.id_hash(data)
self.cache.add_chunk(self.id, data, self.stats)
self.manifest.archives[name] = {'id': self.id, 'time': metadata['time']}
self.manifest.write()
self.repository.commit()
self.cache.commit()
def calc_stats(self, cache):
def add(id):
count, size, csize = cache.chunks[id]
stats.update(size, csize, count == 1)
cache.chunks[id] = count - 1, size, csize
def add_file_chunks(chunks):
for id, _, _ in chunks:
add(id)
# This function is a bit evil since it abuses the cache to calculate
# the stats. The cache transaction must be rolled back afterwards
unpacker = msgpack.Unpacker(use_list=False)
cache.begin_txn()
stats = Statistics()
add(self.id)
for id, chunk in zip(self.metadata[b'items'], self.repository.get_many(self.metadata[b'items'])):
add(id)
unpacker.feed(self.key.decrypt(id, chunk))
for item in unpacker:
if b'chunks' in item:
stats.nfiles += 1
add_file_chunks(item[b'chunks'])
cache.rollback()
return stats
def extract_item(self, item, restore_attrs=True, dry_run=False, stdout=False, sparse=False):
if dry_run or stdout:
if b'chunks' in item:
for data in self.pipeline.fetch_many([c[0] for c in item[b'chunks']], is_preloaded=True):
if stdout:
sys.stdout.buffer.write(data)
if stdout:
sys.stdout.buffer.flush()
return
dest = self.cwd
if item[b'path'].startswith('/') or item[b'path'].startswith('..'):
raise Exception('Path should be relative and local')
path = os.path.join(dest, item[b'path'])
# Attempt to remove existing files, ignore errors on failure
try:
st = os.lstat(path)
if stat.S_ISDIR(st.st_mode):
os.rmdir(path)
else:
os.unlink(path)
except UnicodeEncodeError:
raise self.IncompatibleFilesystemEncodingError(path, sys.getfilesystemencoding()) from None
except OSError:
pass
mode = item[b'mode']
if stat.S_ISREG(mode):
if not os.path.exists(os.path.dirname(path)):
os.makedirs(os.path.dirname(path))
# Hard link?
if b'source' in item:
source = os.path.join(dest, item[b'source'])
if os.path.exists(path):
os.unlink(path)
os.link(source, path)
else:
with open(path, 'wb') as fd:
ids = [c[0] for c in item[b'chunks']]
for data in self.pipeline.fetch_many(ids, is_preloaded=True):
if sparse and self.zeros.startswith(data):
# all-zero chunk: create a hole in a sparse file
fd.seek(len(data), 1)
else:
fd.write(data)
pos = fd.tell()
fd.truncate(pos)
fd.flush()
self.restore_attrs(path, item, fd=fd.fileno())
elif stat.S_ISDIR(mode):
if not os.path.exists(path):
os.makedirs(path)
if restore_attrs:
self.restore_attrs(path, item)
elif stat.S_ISLNK(mode):
if not os.path.exists(os.path.dirname(path)):
os.makedirs(os.path.dirname(path))
source = item[b'source']
if os.path.exists(path):
os.unlink(path)
try:
os.symlink(source, path)
except UnicodeEncodeError:
raise self.IncompatibleFilesystemEncodingError(source, sys.getfilesystemencoding()) from None
self.restore_attrs(path, item, symlink=True)
elif stat.S_ISFIFO(mode):
if not os.path.exists(os.path.dirname(path)):
os.makedirs(os.path.dirname(path))
os.mkfifo(path)
self.restore_attrs(path, item)
elif stat.S_ISCHR(mode) or stat.S_ISBLK(mode):
os.mknod(path, item[b'mode'], item[b'rdev'])
self.restore_attrs(path, item)
else:
raise Exception('Unknown archive item type %r' % item[b'mode'])
def restore_attrs(self, path, item, symlink=False, fd=None):
uid = gid = None
if not self.numeric_owner:
uid = user2uid(item[b'user'])
gid = group2gid(item[b'group'])
uid = item[b'uid'] if uid is None else uid
gid = item[b'gid'] if gid is None else gid
# This code is a bit of a mess due to os specific differences
try:
if fd:
os.fchown(fd, uid, gid)
else:
os.lchown(path, uid, gid)
except OSError:
pass
if fd:
os.fchmod(fd, item[b'mode'])
elif not symlink:
os.chmod(path, item[b'mode'])
elif has_lchmod: # Not available on Linux
os.lchmod(path, item[b'mode'])
mtime = bigint_to_int(item[b'mtime'])
if b'atime' in item:
atime = bigint_to_int(item[b'atime'])
else:
# old archives only had mtime in item metadata
atime = mtime
if fd:
os.utime(fd, None, ns=(atime, mtime))
else:
os.utime(path, None, ns=(atime, mtime), follow_symlinks=False)
acl_set(path, item, self.numeric_owner)
# Only available on OS X and FreeBSD
if has_lchflags and b'bsdflags' in item:
try:
os.lchflags(path, item[b'bsdflags'])
except OSError:
pass
# chown removes Linux capabilities, so set the extended attributes at the end, after chown, since they include
# the Linux capabilities in the "security.capability" attribute.
xattrs = item.get(b'xattrs', {})
for k, v in xattrs.items():
try:
xattr.setxattr(fd or path, k, v, follow_symlinks=False)
except OSError as e:
if e.errno not in (errno.ENOTSUP, errno.EACCES):
# only raise if the errno is not on our ignore list:
# ENOTSUP == xattrs not supported here
# EACCES == permission denied to set this specific xattr
# (this may happen related to security.* keys)
raise
def rename(self, name):
if name in self.manifest.archives:
raise self.AlreadyExists(name)
metadata = StableDict(self._load_meta(self.id))
metadata[b'name'] = name
data = msgpack.packb(metadata, unicode_errors='surrogateescape')
new_id = self.key.id_hash(data)
self.cache.add_chunk(new_id, data, self.stats)
self.manifest.archives[name] = {'id': new_id, 'time': metadata[b'time']}
self.cache.chunk_decref(self.id, self.stats)
del self.manifest.archives[self.name]
def delete(self, stats, progress=False):
unpacker = msgpack.Unpacker(use_list=False)
items_ids = self.metadata[b'items']
pi = ProgressIndicatorPercent(total=len(items_ids), msg="Decrementing references %3.0f%%", same_line=True)
for (i, (items_id, data)) in enumerate(zip(items_ids, self.repository.get_many(items_ids))):
if progress:
pi.show(i)
unpacker.feed(self.key.decrypt(items_id, data))
self.cache.chunk_decref(items_id, stats)
for item in unpacker:
if b'chunks' in item:
for chunk_id, size, csize in item[b'chunks']:
self.cache.chunk_decref(chunk_id, stats)
if progress:
pi.finish()
self.cache.chunk_decref(self.id, stats)
del self.manifest.archives[self.name]
def stat_attrs(self, st, path):
item = {
b'mode': st.st_mode,
b'uid': st.st_uid, b'user': uid2user(st.st_uid),
b'gid': st.st_gid, b'group': gid2group(st.st_gid),
b'atime': int_to_bigint(st.st_atime_ns),
b'ctime': int_to_bigint(st.st_ctime_ns),
b'mtime': int_to_bigint(st.st_mtime_ns),
}
if self.numeric_owner:
item[b'user'] = item[b'group'] = None
xattrs = xattr.get_all(path, follow_symlinks=False)
if xattrs:
item[b'xattrs'] = StableDict(xattrs)
if has_lchflags and st.st_flags:
item[b'bsdflags'] = st.st_flags
acl_get(path, item, st, self.numeric_owner)
return item
def process_dir(self, path, st):
item = {b'path': make_path_safe(path)}
item.update(self.stat_attrs(st, path))
self.add_item(item)
return 'd' # directory
def process_fifo(self, path, st):
item = {b'path': make_path_safe(path)}
item.update(self.stat_attrs(st, path))
self.add_item(item)
return 'f' # fifo
def process_dev(self, path, st):
item = {b'path': make_path_safe(path), b'rdev': st.st_rdev}
item.update(self.stat_attrs(st, path))
self.add_item(item)
if stat.S_ISCHR(st.st_mode):
return 'c' # char device
elif stat.S_ISBLK(st.st_mode):
return 'b' # block device
def process_symlink(self, path, st):
source = os.readlink(path)
item = {b'path': make_path_safe(path), b'source': source}
item.update(self.stat_attrs(st, path))
self.add_item(item)
return 's' # symlink
def process_stdin(self, path, cache):
uid, gid = 0, 0
fd = sys.stdin.buffer # binary
chunks = []
for chunk in self.chunker.chunkify(fd):
chunks.append(cache.add_chunk(self.key.id_hash(chunk), chunk, self.stats))
self.stats.nfiles += 1
t = int_to_bigint(int(time.time()) * 1000000000)
item = {
b'path': path,
b'chunks': chunks,
b'mode': 0o100660, # regular file, ug=rw
b'uid': uid, b'user': uid2user(uid),
b'gid': gid, b'group': gid2group(gid),
b'mtime': t, b'atime': t, b'ctime': t,
}
self.add_item(item)
return 'i' # stdin
def process_file(self, path, st, cache, ignore_inode=False):
status = None
safe_path = make_path_safe(path)
# Is it a hard link?
if st.st_nlink > 1:
source = self.hard_links.get((st.st_ino, st.st_dev))
if (st.st_ino, st.st_dev) in self.hard_links:
item = self.stat_attrs(st, path)
item.update({b'path': safe_path, b'source': source})
self.add_item(item)
status = 'h' # regular file, hardlink (to already seen inodes)
return status
else:
self.hard_links[st.st_ino, st.st_dev] = safe_path
path_hash = self.key.id_hash(os.path.join(self.cwd, path).encode('utf-8', 'surrogateescape'))
first_run = not cache.files
ids = cache.file_known_and_unchanged(path_hash, st, ignore_inode)
if first_run:
logger.debug('Processing files ...')
chunks = None
if ids is not None:
# Make sure all ids are available
for id_ in ids:
if not cache.seen_chunk(id_):
break
else:
chunks = [cache.chunk_incref(id_, self.stats) for id_ in ids]
status = 'U' # regular file, unchanged
else:
status = 'A' # regular file, added
item = {b'path': safe_path}
# Only chunkify the file if needed
if chunks is None:
fh = Archive._open_rb(path)
with os.fdopen(fh, 'rb') as fd:
chunks = []
for chunk in self.chunker.chunkify(fd, fh):
chunks.append(cache.add_chunk(self.key.id_hash(chunk), chunk, self.stats))
if self.show_progress:
self.stats.show_progress(item=item, dt=0.2)
cache.memorize_file(path_hash, st, [c[0] for c in chunks])
status = status or 'M' # regular file, modified (if not 'A' already)
item[b'chunks'] = chunks
item.update(self.stat_attrs(st, path))
self.stats.nfiles += 1
self.add_item(item)
return status
@staticmethod
def list_archives(repository, key, manifest, cache=None):
# expensive! see also Manifest.list_archive_infos.
for name, info in manifest.archives.items():
yield Archive(repository, key, manifest, name, cache=cache)
@staticmethod
def _open_rb(path):
try:
# if we have O_NOATIME, this likely will succeed if we are root or owner of file:
return os.open(path, flags_noatime)
except PermissionError:
if flags_noatime == flags_normal:
# we do not have O_NOATIME, no need to try again:
raise
# Was this EPERM due to the O_NOATIME flag? Try again without it:
return os.open(path, flags_normal)
# this set must be kept complete, otherwise the RobustUnpacker might malfunction:
ITEM_KEYS = set([b'path', b'source', b'rdev', b'chunks',
b'mode', b'user', b'group', b'uid', b'gid', b'mtime', b'atime', b'ctime',
b'xattrs', b'bsdflags', b'acl_nfs4', b'acl_access', b'acl_default', b'acl_extended', ])
class RobustUnpacker:
"""A restartable/robust version of the streaming msgpack unpacker
"""
def __init__(self, validator):
super().__init__()
self.item_keys = [msgpack.packb(name) for name in ITEM_KEYS]
self.validator = validator
self._buffered_data = []
self._resync = False
self._unpacker = msgpack.Unpacker(object_hook=StableDict)
def resync(self):
self._buffered_data = []
self._resync = True
def feed(self, data):
if self._resync:
self._buffered_data.append(data)
else:
self._unpacker.feed(data)
def __iter__(self):
return self
def __next__(self):
if self._resync:
data = b''.join(self._buffered_data)
while self._resync:
if not data:
raise StopIteration
# Abort early if the data does not look like a serialized dict
if len(data) < 2 or ((data[0] & 0xf0) != 0x80) or ((data[1] & 0xe0) != 0xa0):
data = data[1:]
continue
# Make sure it looks like an item dict
for pattern in self.item_keys:
if data[1:].startswith(pattern):
break
else:
data = data[1:]
continue
self._unpacker = msgpack.Unpacker(object_hook=StableDict)
self._unpacker.feed(data)
try:
item = next(self._unpacker)
if self.validator(item):
self._resync = False
return item
# Ignore exceptions that might be raised when feeding
# msgpack with invalid data
except (TypeError, ValueError, StopIteration):
pass
data = data[1:]
else:
return next(self._unpacker)
class ArchiveChecker:
def __init__(self):
self.error_found = False
self.possibly_superseded = set()
def check(self, repository, repair=False, archive=None, last=None, prefix=None, save_space=False):
logger.info('Starting archive consistency check...')
self.check_all = archive is None and last is None and prefix is None
self.repair = repair
self.repository = repository
self.init_chunks()
self.key = self.identify_key(repository)
if Manifest.MANIFEST_ID not in self.chunks:
logger.error("Repository manifest not found!")
self.error_found = True
self.manifest = self.rebuild_manifest()
else:
self.manifest, _ = Manifest.load(repository, key=self.key)
self.rebuild_refcounts(archive=archive, last=last, prefix=prefix)
self.orphan_chunks_check()
self.finish(save_space=save_space)
if self.error_found:
logger.error('Archive consistency check complete, problems found.')
else:
logger.info('Archive consistency check complete, no problems found.')
return self.repair or not self.error_found
def init_chunks(self):
"""Fetch a list of all object keys from repository
"""
# Explicitly set the initial hash table capacity to avoid performance issues
# due to hash table "resonance"
capacity = int(len(self.repository) * 1.2)
self.chunks = ChunkIndex(capacity)
marker = None
while True:
result = self.repository.list(limit=10000, marker=marker)
if not result:
break
marker = result[-1]
for id_ in result:
self.chunks[id_] = (0, 0, 0)
def identify_key(self, repository):
cdata = repository.get(next(self.chunks.iteritems())[0])
return key_factory(repository, cdata)
def rebuild_manifest(self):
"""Rebuild the manifest object if it is missing
Iterates through all objects in the repository looking for archive metadata blocks.
"""
logger.info('Rebuilding missing manifest, this might take some time...')
manifest = Manifest(self.key, self.repository)
for chunk_id, _ in self.chunks.iteritems():
cdata = self.repository.get(chunk_id)
data = self.key.decrypt(chunk_id, cdata)
# Some basic sanity checks of the payload before feeding it into msgpack
if len(data) < 2 or ((data[0] & 0xf0) != 0x80) or ((data[1] & 0xe0) != 0xa0):
continue
if b'cmdline' not in data or b'\xa7version\x01' not in data:
continue
try:
archive = msgpack.unpackb(data)
# Ignore exceptions that might be raised when feeding
# msgpack with invalid data
except (TypeError, ValueError, StopIteration):
continue
if isinstance(archive, dict) and b'items' in archive and b'cmdline' in archive:
logger.info('Found archive %s', archive[b'name'].decode('utf-8'))
manifest.archives[archive[b'name'].decode('utf-8')] = {b'id': chunk_id, b'time': archive[b'time']}
logger.info('Manifest rebuild complete.')
return manifest
def rebuild_refcounts(self, archive=None, last=None, prefix=None):
"""Rebuild object reference counts by walking the metadata
Missing and/or incorrect data is repaired when detected
"""
# Exclude the manifest from chunks
del self.chunks[Manifest.MANIFEST_ID]
def mark_as_possibly_superseded(id_):
if self.chunks.get(id_, (0,))[0] == 0:
self.possibly_superseded.add(id_)
def add_callback(chunk):
id_ = self.key.id_hash(chunk)
cdata = self.key.encrypt(chunk)
add_reference(id_, len(chunk), len(cdata), cdata)
return id_
def add_reference(id_, size, csize, cdata=None):
try:
self.chunks.incref(id_)
except KeyError:
assert cdata is not None
self.chunks[id_] = 1, size, csize
if self.repair:
self.repository.put(id_, cdata)
def verify_file_chunks(item):
"""Verifies that all file chunks are present
Missing file chunks will be replaced with new chunks of the same
length containing all zeros.
"""
offset = 0
chunk_list = []
for chunk_id, size, csize in item[b'chunks']:
if chunk_id not in self.chunks:
# If a file chunk is missing, create an all-zero replacement chunk of the same size
logger.error('{}: Missing file chunk detected (Byte {}-{})'.format(item[b'path'].decode('utf-8', 'surrogateescape'), offset, offset + size))
self.error_found = True
data = bytes(size)
chunk_id = self.key.id_hash(data)
cdata = self.key.encrypt(data)
csize = len(cdata)
add_reference(chunk_id, size, csize, cdata)
else:
add_reference(chunk_id, size, csize)
chunk_list.append((chunk_id, size, csize))
offset += size
item[b'chunks'] = chunk_list
def robust_iterator(archive):
"""Iterates through all archive items
Missing item chunks will be skipped and the msgpack stream will be restarted
"""
unpacker = RobustUnpacker(lambda item: isinstance(item, dict) and b'path' in item)
_state = 0
def missing_chunk_detector(chunk_id):
nonlocal _state
if _state % 2 != int(chunk_id not in self.chunks):
_state += 1
return _state
def report(msg, chunk_id, chunk_no):
cid = hexlify(chunk_id).decode('ascii')
msg += ' [chunk: %06d_%s]' % (chunk_no, cid) # see debug-dump-archive-items
self.error_found = True
logger.error(msg)
i = 0
for state, items in groupby(archive[b'items'], missing_chunk_detector):
items = list(items)
if state % 2:
for chunk_id in items:
report('item metadata chunk missing', chunk_id, i)
i += 1
continue
if state > 0:
unpacker.resync()
for chunk_id, cdata in zip(items, repository.get_many(items)):
unpacker.feed(self.key.decrypt(chunk_id, cdata))
try:
for item in unpacker:
if isinstance(item, dict):
yield item
else:
report('Did not get expected metadata dict when unpacking item metadata', chunk_id, i)
except Exception:
report('Exception while unpacking item metadata', chunk_id, i)
raise
i += 1
if archive is None:
# we need last N or all archives
archive_items = sorted(self.manifest.archives.items(), reverse=True,
key=lambda name_info: name_info[1][b'time'])
if prefix is not None:
archive_items = [item for item in archive_items if item[0].startswith(prefix)]
num_archives = len(archive_items)
end = None if last is None else min(num_archives, last)
else:
# we only want one specific archive
archive_items = [item for item in self.manifest.archives.items() if item[0] == archive]
num_archives = 1
end = 1
with cache_if_remote(self.repository) as repository:
for i, (name, info) in enumerate(archive_items[:end]):
logger.info('Analyzing archive {} ({}/{})'.format(name, num_archives - i, num_archives))
archive_id = info[b'id']
if archive_id not in self.chunks:
logger.error('Archive metadata block is missing!')
self.error_found = True
del self.manifest.archives[name]
continue
mark_as_possibly_superseded(archive_id)
cdata = self.repository.get(archive_id)
data = self.key.decrypt(archive_id, cdata)
archive = StableDict(msgpack.unpackb(data))
if archive[b'version'] != 1:
raise Exception('Unknown archive metadata version')
decode_dict(archive, (b'name', b'hostname', b'username', b'time', b'time_end'))
archive[b'cmdline'] = [arg.decode('utf-8', 'surrogateescape') for arg in archive[b'cmdline']]
items_buffer = ChunkBuffer(self.key)
items_buffer.write_chunk = add_callback
for item in robust_iterator(archive):
if b'chunks' in item:
verify_file_chunks(item)
items_buffer.add(item)
items_buffer.flush(flush=True)
for previous_item_id in archive[b'items']:
mark_as_possibly_superseded(previous_item_id)
archive[b'items'] = items_buffer.chunks
data = msgpack.packb(archive, unicode_errors='surrogateescape')
new_archive_id = self.key.id_hash(data)
cdata = self.key.encrypt(data)
add_reference(new_archive_id, len(data), len(cdata), cdata)
info[b'id'] = new_archive_id
def orphan_chunks_check(self):
if self.check_all:
unused = set()
for id_, (count, size, csize) in self.chunks.iteritems():
if count == 0:
unused.add(id_)
orphaned = unused - self.possibly_superseded
if orphaned:
logger.error('{} orphaned objects found!'.format(len(orphaned)))
self.error_found = True
if self.repair:
for id_ in unused:
self.repository.delete(id_)
else:
logger.info('Orphaned objects check skipped (needs all archives checked).')
def finish(self, save_space=False):
if self.repair:
self.manifest.write()
self.repository.commit(save_space=save_space)
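The resync heuristic used by RobustUnpacker.__next__ above can be sketched in plain Python without msgpack. This is only an illustration under stated assumptions: SKETCH_ITEM_KEYS, pack_fixstr, looks_like_item and find_resync_offset are hypothetical names that do not exist in borg, and only a few of the ITEM_KEYS are listed. It assumes short byte strings are serialized as a one-byte header (0xa0 | length) followed by the raw bytes, which is what the bit masks in the code above test for.

```python
# Hypothetical stand-in for the ITEM_KEYS set; only a few keys are listed here.
SKETCH_ITEM_KEYS = [b'path', b'source', b'chunks', b'mode']


def pack_fixstr(name):
    """Encode a short byte string with a one-byte header (0xa0 | length)."""
    assert len(name) < 32
    return bytes([0xa0 | len(name)]) + name


def looks_like_item(data, packed_keys):
    """Return True if data starts like a small map whose first key is a known item key."""
    if len(data) < 2:
        return False
    # 0x80..0x8f is a small-map header, 0xa0..0xbf is a short-string header
    if (data[0] & 0xf0) != 0x80 or (data[1] & 0xe0) != 0xa0:
        return False
    return any(data[1:].startswith(key) for key in packed_keys)


def find_resync_offset(data, packed_keys):
    """Return the first offset at which an item dict plausibly starts, or None."""
    for offset in range(len(data)):
        if looks_like_item(data[offset:], packed_keys):
            return offset
    return None


packed = [pack_fixstr(key) for key in SKETCH_ITEM_KEYS]
# A one-entry map {b'path': b'x'} preceded by three bytes of garbage.
stream = b'\x00\xff\x13' + b'\x81' + pack_fixstr(b'path') + pack_fixstr(b'x')
print(find_resync_offset(stream, packed))
```

The real class additionally feeds the candidate data to a fresh unpacker and only leaves resync mode once the validator accepts the decoded item, so a false positive from this byte-level test is not fatal.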

1555
borg/archiver.py Normal file

File diff suppressed because it is too large

421
borg/cache.py Normal file

@ -0,0 +1,421 @@
import configparser
from .remote import cache_if_remote
from collections import namedtuple
import os
import stat
from binascii import hexlify, unhexlify
import shutil
from .key import PlaintextKey
from .logger import create_logger
logger = create_logger()
from .helpers import Error, get_cache_dir, decode_dict, int_to_bigint, \
bigint_to_int, format_file_size, yes
from .locking import UpgradableLock
from .hashindex import ChunkIndex
import msgpack
class Cache:
"""Client Side cache
"""
class RepositoryReplay(Error):
"""Cache is newer than repository, refusing to continue"""
class CacheInitAbortedError(Error):
"""Cache initialization aborted"""
class RepositoryAccessAborted(Error):
"""Repository access aborted"""
class EncryptionMethodMismatch(Error):
"""Repository encryption method changed since last access, refusing to continue"""
@staticmethod
def break_lock(repository, path=None):
path = path or os.path.join(get_cache_dir(), hexlify(repository.id).decode('ascii'))
UpgradableLock(os.path.join(path, 'lock'), exclusive=True).break_lock()
@staticmethod
def destroy(repository, path=None):
"""destroy the cache for ``repository`` or at ``path``"""
path = path or os.path.join(get_cache_dir(), hexlify(repository.id).decode('ascii'))
config = os.path.join(path, 'config')
if os.path.exists(config):
os.remove(config) # kill config first
shutil.rmtree(path)
def __init__(self, repository, key, manifest, path=None, sync=True, do_files=False, warn_if_unencrypted=True,
lock_wait=None):
self.lock = None
self.timestamp = None
self.txn_active = False
self.repository = repository
self.key = key
self.manifest = manifest
self.path = path or os.path.join(get_cache_dir(), hexlify(repository.id).decode('ascii'))
self.do_files = do_files
# Warn user before sending data to a never seen before unencrypted repository
if not os.path.exists(self.path):
if warn_if_unencrypted and isinstance(key, PlaintextKey):
msg = ("Warning: Attempting to access a previously unknown unencrypted repository!" +
"\n" +
"Do you want to continue? [yN] ")
if not yes(msg, false_msg="Aborting.", env_var_override='BORG_UNKNOWN_UNENCRYPTED_REPO_ACCESS_IS_OK'):
raise self.CacheInitAbortedError()
self.create()
self.open(lock_wait=lock_wait)
try:
# Warn user before sending data to a relocated repository
if self.previous_location and self.previous_location != repository._location.canonical_path():
msg = ("Warning: The repository at location {} was previously located at {}".format(repository._location.canonical_path(), self.previous_location) +
"\n" +
"Do you want to continue? [yN] ")
if not yes(msg, false_msg="Aborting.", env_var_override='BORG_RELOCATED_REPO_ACCESS_IS_OK'):
raise self.RepositoryAccessAborted()
if sync and self.manifest.id != self.manifest_id:
# If repository is older than the cache something fishy is going on
if self.timestamp and self.timestamp > manifest.timestamp:
raise self.RepositoryReplay()
# Make sure an encrypted repository has not been swapped for an unencrypted repository
if self.key_type is not None and self.key_type != str(key.TYPE):
raise self.EncryptionMethodMismatch()
self.sync()
self.commit()
except:
self.close()
raise
def __enter__(self):
return self
def __exit__(self, exc_type, exc_val, exc_tb):
self.close()
def __str__(self):
fmt = """\
All archives: {0.total_size:>20s} {0.total_csize:>20s} {0.unique_csize:>20s}
Unique chunks Total chunks
Chunk index: {0.total_unique_chunks:20d} {0.total_chunks:20d}"""
return fmt.format(self.format_tuple())
def format_tuple(self):
# XXX: this should really be moved down to `hashindex.pyx`
Summary = namedtuple('Summary', ['total_size', 'total_csize', 'unique_size', 'unique_csize', 'total_unique_chunks', 'total_chunks'])
stats = Summary(*self.chunks.summarize())._asdict()
for field in ['total_size', 'total_csize', 'unique_csize']:
stats[field] = format_file_size(stats[field])
return Summary(**stats)
def create(self):
"""Create a new empty cache at `self.path`
"""
os.makedirs(self.path)
with open(os.path.join(self.path, 'README'), 'w') as fd:
fd.write('This is a Borg cache')
config = configparser.ConfigParser(interpolation=None)
config.add_section('cache')
config.set('cache', 'version', '1')
config.set('cache', 'repository', hexlify(self.repository.id).decode('ascii'))
config.set('cache', 'manifest', '')
with open(os.path.join(self.path, 'config'), 'w') as fd:
config.write(fd)
ChunkIndex().write(os.path.join(self.path, 'chunks').encode('utf-8'))
os.makedirs(os.path.join(self.path, 'chunks.archive.d'))
with open(os.path.join(self.path, 'files'), 'wb') as fd:
pass # empty file
def _do_open(self):
self.config = configparser.ConfigParser(interpolation=None)
config_path = os.path.join(self.path, 'config')
self.config.read(config_path)
try:
cache_version = self.config.getint('cache', 'version')
wanted_version = 1
if cache_version != wanted_version:
raise Exception('%s has unexpected cache version %d (wanted: %d).' % (
config_path, cache_version, wanted_version))
except configparser.NoSectionError:
raise Exception('%s does not look like a Borg cache.' % config_path) from None
self.id = self.config.get('cache', 'repository')
self.manifest_id = unhexlify(self.config.get('cache', 'manifest'))
self.timestamp = self.config.get('cache', 'timestamp', fallback=None)
self.key_type = self.config.get('cache', 'key_type', fallback=None)
self.previous_location = self.config.get('cache', 'previous_location', fallback=None)
self.chunks = ChunkIndex.read(os.path.join(self.path, 'chunks').encode('utf-8'))
self.files = None
def open(self, lock_wait=None):
if not os.path.isdir(self.path):
raise Exception('%s does not look like a Borg cache.' % self.path)
self.lock = UpgradableLock(os.path.join(self.path, 'lock'), exclusive=True, timeout=lock_wait).acquire()
self.rollback()
def close(self):
if self.lock is not None:
self.lock.release()
self.lock = None
def _read_files(self):
self.files = {}
self._newest_mtime = 0
logger.debug('Reading files cache ...')
with open(os.path.join(self.path, 'files'), 'rb') as fd:
u = msgpack.Unpacker(use_list=True)
while True:
data = fd.read(64 * 1024)
if not data:
break
u.feed(data)
for path_hash, item in u:
item[0] += 1
# in the end, this takes about 240 Bytes per file
self.files[path_hash] = msgpack.packb(item)
def begin_txn(self):
# Initialize transaction snapshot
txn_dir = os.path.join(self.path, 'txn.tmp')
os.mkdir(txn_dir)
shutil.copy(os.path.join(self.path, 'config'), txn_dir)
shutil.copy(os.path.join(self.path, 'chunks'), txn_dir)
shutil.copy(os.path.join(self.path, 'files'), txn_dir)
os.rename(os.path.join(self.path, 'txn.tmp'),
os.path.join(self.path, 'txn.active'))
self.txn_active = True
def commit(self):
"""Commit transaction
"""
if not self.txn_active:
return
if self.files is not None:
with open(os.path.join(self.path, 'files'), 'wb') as fd:
for path_hash, item in self.files.items():
# Discard cached files with the newest mtime to avoid
# issues with filesystem snapshots and mtime precision
item = msgpack.unpackb(item)
if item[0] < 10 and bigint_to_int(item[3]) < self._newest_mtime:
msgpack.pack((path_hash, item), fd)
self.config.set('cache', 'manifest', hexlify(self.manifest.id).decode('ascii'))
self.config.set('cache', 'timestamp', self.manifest.timestamp)
self.config.set('cache', 'key_type', str(self.key.TYPE))
self.config.set('cache', 'previous_location', self.repository._location.canonical_path())
with open(os.path.join(self.path, 'config'), 'w') as fd:
self.config.write(fd)
self.chunks.write(os.path.join(self.path, 'chunks').encode('utf-8'))
os.rename(os.path.join(self.path, 'txn.active'),
os.path.join(self.path, 'txn.tmp'))
shutil.rmtree(os.path.join(self.path, 'txn.tmp'))
self.txn_active = False
def rollback(self):
"""Roll back partial and aborted transactions
"""
# Remove partial transaction
if os.path.exists(os.path.join(self.path, 'txn.tmp')):
shutil.rmtree(os.path.join(self.path, 'txn.tmp'))
# Roll back active transaction
txn_dir = os.path.join(self.path, 'txn.active')
if os.path.exists(txn_dir):
shutil.copy(os.path.join(txn_dir, 'config'), self.path)
shutil.copy(os.path.join(txn_dir, 'chunks'), self.path)
shutil.copy(os.path.join(txn_dir, 'files'), self.path)
os.rename(txn_dir, os.path.join(self.path, 'txn.tmp'))
if os.path.exists(os.path.join(self.path, 'txn.tmp')):
shutil.rmtree(os.path.join(self.path, 'txn.tmp'))
self.txn_active = False
self._do_open()
def sync(self):
"""Re-synchronize chunks cache with repository.
Maintains a directory with known backup archive indexes, so it only
needs to fetch info from the repo and build a chunk index once per backup
archive.
If out of sync, missing archive indexes get added, outdated indexes
get removed and a new master chunks index is built by merging all
archive indexes.
"""
archive_path = os.path.join(self.path, 'chunks.archive.d')
def mkpath(id, suffix=''):
id_hex = hexlify(id).decode('ascii')
path = os.path.join(archive_path, id_hex + suffix)
return path.encode('utf-8')
def cached_archives():
if self.do_cache:
fns = os.listdir(archive_path)
# filenames with 64 hex digits == 256bit
return set(unhexlify(fn) for fn in fns if len(fn) == 64)
else:
return set()
def repo_archives():
return set(info[b'id'] for info in self.manifest.archives.values())
def cleanup_outdated(ids):
for id in ids:
os.unlink(mkpath(id))
def fetch_and_build_idx(archive_id, repository, key):
chunk_idx = ChunkIndex()
cdata = repository.get(archive_id)
data = key.decrypt(archive_id, cdata)
chunk_idx.add(archive_id, 1, len(data), len(cdata))
archive = msgpack.unpackb(data)
if archive[b'version'] != 1:
raise Exception('Unknown archive metadata version')
decode_dict(archive, (b'name',))
unpacker = msgpack.Unpacker()
for item_id, chunk in zip(archive[b'items'], repository.get_many(archive[b'items'])):
data = key.decrypt(item_id, chunk)
chunk_idx.add(item_id, 1, len(data), len(chunk))
unpacker.feed(data)
for item in unpacker:
if not isinstance(item, dict):
logger.error('Error: Did not get expected metadata dict - archive corrupted!')
continue
if b'chunks' in item:
for chunk_id, size, csize in item[b'chunks']:
chunk_idx.add(chunk_id, 1, size, csize)
if self.do_cache:
fn = mkpath(archive_id)
fn_tmp = mkpath(archive_id, suffix='.tmp')
try:
chunk_idx.write(fn_tmp)
except Exception:
os.unlink(fn_tmp)
else:
os.rename(fn_tmp, fn)
return chunk_idx
def lookup_name(archive_id):
for name, info in self.manifest.archives.items():
if info[b'id'] == archive_id:
return name
def create_master_idx(chunk_idx):
logger.info('Synchronizing chunks cache...')
cached_ids = cached_archives()
archive_ids = repo_archives()
logger.info('Archives: %d, w/ cached Idx: %d, w/ outdated Idx: %d, w/o cached Idx: %d.' % (
len(archive_ids), len(cached_ids),
len(cached_ids - archive_ids), len(archive_ids - cached_ids), ))
# deallocates old hashindex, creates empty hashindex:
chunk_idx.clear()
cleanup_outdated(cached_ids - archive_ids)
if archive_ids:
chunk_idx = None
for archive_id in archive_ids:
archive_name = lookup_name(archive_id)
if archive_id in cached_ids:
archive_chunk_idx_path = mkpath(archive_id)
logger.info("Reading cached archive chunk index for %s ..." % archive_name)
archive_chunk_idx = ChunkIndex.read(archive_chunk_idx_path)
else:
logger.info('Fetching and building archive index for %s ...' % archive_name)
archive_chunk_idx = fetch_and_build_idx(archive_id, repository, self.key)
logger.info("Merging into master chunks index ...")
if chunk_idx is None:
# we just use the first archive's idx as starting point,
# to avoid growing the hash table from 0 size and also
# to save 1 merge call.
chunk_idx = archive_chunk_idx
else:
chunk_idx.merge(archive_chunk_idx)
logger.info('Done.')
return chunk_idx
def legacy_cleanup():
"""bring old cache dirs into the desired state (cleanup and adapt)"""
try:
os.unlink(os.path.join(self.path, 'chunks.archive'))
except OSError:
pass
try:
os.unlink(os.path.join(self.path, 'chunks.archive.tmp'))
except OSError:
pass
try:
os.mkdir(archive_path)
except OSError:
pass
self.begin_txn()
with cache_if_remote(self.repository) as repository:
legacy_cleanup()
# TEMPORARY HACK: to avoid archive index caching, create a FILE named ~/.cache/borg/REPOID/chunks.archive.d -
# this is only recommended if you have a fast, low latency connection to your repo (e.g. if repo is local disk)
self.do_cache = os.path.isdir(archive_path)
self.chunks = create_master_idx(self.chunks)
def add_chunk(self, id, data, stats):
if not self.txn_active:
self.begin_txn()
size = len(data)
if self.seen_chunk(id, size):
return self.chunk_incref(id, stats)
data = self.key.encrypt(data)
csize = len(data)
self.repository.put(id, data, wait=False)
self.chunks[id] = (1, size, csize)
stats.update(size, csize, True)
return id, size, csize
def seen_chunk(self, id, size=None):
refcount, stored_size, _ = self.chunks.get(id, (0, None, None))
if size is not None and stored_size is not None and size != stored_size:
# we already have a chunk with that id, but different size.
# this is either a hash collision (unlikely) or corruption or a bug.
raise Exception("chunk has same id [%r], but different size (stored: %d new: %d)!" % (
id, stored_size, size))
return refcount
def chunk_incref(self, id, stats):
if not self.txn_active:
self.begin_txn()
count, size, csize = self.chunks.incref(id)
stats.update(size, csize, False)
return id, size, csize
def chunk_decref(self, id, stats):
if not self.txn_active:
self.begin_txn()
count, size, csize = self.chunks.decref(id)
if count == 0:
del self.chunks[id]
self.repository.delete(id, wait=False)
stats.update(-size, -csize, True)
else:
stats.update(-size, -csize, False)
def file_known_and_unchanged(self, path_hash, st, ignore_inode=False):
if not (self.do_files and stat.S_ISREG(st.st_mode)):
return None
if self.files is None:
self._read_files()
entry = self.files.get(path_hash)
if not entry:
return None
entry = msgpack.unpackb(entry)
if (entry[2] == st.st_size and bigint_to_int(entry[3]) == st.st_mtime_ns and
(ignore_inode or entry[1] == st.st_ino)):
# reset entry age
entry[0] = 0
self.files[path_hash] = msgpack.packb(entry)
return entry[4]
else:
return None
def memorize_file(self, path_hash, st, ids):
if not (self.do_files and stat.S_ISREG(st.st_mode)):
return
# Entry: Age, inode, size, mtime, chunk ids
mtime_ns = st.st_mtime_ns
self.files[path_hash] = msgpack.packb((0, st.st_ino, st.st_size, int_to_bigint(mtime_ns), ids))
self._newest_mtime = max(self._newest_mtime, mtime_ns)
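The files cache logic above (file_known_and_unchanged, memorize_file and the filter in commit) can be sketched with plain lists. This is a simplified illustration, not the cache's actual storage: the real code keeps msgpack-packed entries and big-int encoded mtimes, and FakeStat, known_and_unchanged and entries_to_keep are hypothetical names introduced here. The entry layout (age, inode, size, mtime, chunk ids) and the age limit of 10 are taken from the code above.

```python
from collections import namedtuple

# Hypothetical stand-in for an os.stat() result with only the fields used here.
FakeStat = namedtuple('FakeStat', 'st_ino st_size st_mtime_ns')
MAX_AGE = 10  # age limit used by the filter in commit()


def known_and_unchanged(files, path_hash, st, ignore_inode=False):
    """Return the cached chunk ids if size, mtime (and inode) still match, else None."""
    entry = files.get(path_hash)
    if entry is None:
        return None
    age, inode, size, mtime_ns, chunk_ids = entry
    if size == st.st_size and mtime_ns == st.st_mtime_ns and (ignore_inode or inode == st.st_ino):
        entry[0] = 0  # reset the age, the file was seen again
        return chunk_ids
    return None


def entries_to_keep(files, newest_mtime):
    """Mimic the commit() filter: drop aged-out entries and entries with the newest mtime."""
    return {h: e for h, e in files.items() if e[0] < MAX_AGE and e[3] < newest_mtime}


# Entry layout: [age, inode, size, mtime_ns, chunk_ids]
files = {
    b'a': [1, 11, 100, 5000, [b'c1']],
    b'b': [1, 12, 200, 9000, [b'c2']],   # carries the newest mtime
    b'c': [10, 13, 300, 1000, [b'c3']],  # not seen for too many runs
}
hit = known_and_unchanged(files, b'a', FakeStat(11, 100, 5000))
moved = known_and_unchanged(files, b'a', FakeStat(99, 100, 5000))
moved_ok = known_and_unchanged(files, b'a', FakeStat(99, 100, 5000), ignore_inode=True)
kept = entries_to_keep(files, newest_mtime=9000)
```

Dropping entries that carry the newest mtime means a file modified again within the same timestamp granularity is re-read on the next run instead of being wrongly treated as unchanged.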

65
borg/chunker.pyx Normal file

@ -0,0 +1,65 @@
# -*- coding: utf-8 -*-
API_VERSION = 2
from libc.stdlib cimport free
cdef extern from "_chunker.c":
ctypedef int uint32_t
ctypedef struct _Chunker "Chunker":
pass
_Chunker *chunker_init(int window_size, int chunk_mask, int min_size, int max_size, uint32_t seed)
void chunker_set_fd(_Chunker *chunker, object f, int fd)
void chunker_free(_Chunker *chunker)
object chunker_process(_Chunker *chunker)
uint32_t *buzhash_init_table(uint32_t seed)
uint32_t c_buzhash "buzhash"(unsigned char *data, size_t len, uint32_t *h)
uint32_t c_buzhash_update "buzhash_update"(uint32_t sum, unsigned char remove, unsigned char add, size_t len, uint32_t *h)
cdef class Chunker:
cdef _Chunker *chunker
def __cinit__(self, int seed, int chunk_min_exp, int chunk_max_exp, int hash_mask_bits, int hash_window_size):
min_size = 1 << chunk_min_exp
max_size = 1 << chunk_max_exp
hash_mask = (1 << hash_mask_bits) - 1
self.chunker = chunker_init(hash_window_size, hash_mask, min_size, max_size, seed & 0xffffffff)
def chunkify(self, fd, fh=-1):
"""
Cut a file into chunks.
:param fd: Python file object
:param fh: OS-level file handle (if available),
defaults to -1 which means not to use OS-level fd.
"""
chunker_set_fd(self.chunker, fd, fh)
return self
def __dealloc__(self):
if self.chunker:
chunker_free(self.chunker)
def __iter__(self):
return self
def __next__(self):
return chunker_process(self.chunker)
def buzhash(unsigned char *data, unsigned long seed):
cdef uint32_t *table
cdef uint32_t sum
table = buzhash_init_table(seed & 0xffffffff)
sum = c_buzhash(data, len(data), table)
free(table)
return sum
def buzhash_update(uint32_t sum, unsigned char remove, unsigned char add, size_t len, unsigned long seed):
cdef uint32_t *table
table = buzhash_init_table(seed & 0xffffffff)
sum = c_buzhash_update(sum, remove, add, len, table)
free(table)
return sum
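The idea behind the Chunker parameters above (a rolling hash over a window, a hash mask, and minimum and maximum chunk sizes given as exponents) can be sketched in pure Python. This is a generic cyclic-polynomial ("buzhash"-style) rolling hash written for illustration; it is an assumption that it resembles what _chunker.c does, and the table generation, the cut condition and all function names below are hypothetical rather than borg's.

```python
import random

MASK32 = 0xffffffff


def rol32(value, count):
    """Rotate a 32-bit value left by count bits."""
    count %= 32
    return ((value << count) | (value >> (32 - count))) & MASK32


def make_table(seed):
    """Hypothetical table: 256 pseudo-random 32-bit values derived from the seed."""
    rng = random.Random(seed)
    return [rng.getrandbits(32) for _ in range(256)]


def buzhash(window, table):
    """Hash a whole window from scratch."""
    h = 0
    for byte in window:
        h = rol32(h, 1) ^ table[byte]
    return h


def buzhash_update(h, remove, add, window_size, table):
    """Slide the window by one byte: drop `remove`, append `add`."""
    return rol32(h, 1) ^ rol32(table[remove], window_size) ^ table[add]


def chunkify(data, table, window_size=16, mask_bits=6, min_exp=5, max_exp=8):
    """Cut data where the masked rolling hash is zero, within the size limits."""
    min_size, max_size = 1 << min_exp, 1 << max_exp
    mask = (1 << mask_bits) - 1
    assert window_size <= min_size
    chunks, start = [], 0
    while start < len(data):
        end = min(start + max_size, len(data))
        cut = end  # fall back to the maximum size (or the end of the data)
        pos = start + min_size
        if pos < end:
            h = buzhash(data[pos - window_size:pos], table)
            while pos < end:
                if h & mask == 0:
                    cut = pos
                    break
                h = buzhash_update(h, data[pos - window_size], data[pos], window_size, table)
                pos += 1
        chunks.append(data[start:cut])
        start = cut
    return chunks


table = make_table(seed=0)
data = bytes(random.Random(1).getrandbits(8) for _ in range(2000))
chunks = chunkify(data, table)
```

Because cut points depend only on the bytes inside the window, inserting data near the start of a file tends to shift only nearby chunk boundaries, which is what makes this kind of chunking useful for deduplication.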

199
borg/compress.pyx Normal file

@ -0,0 +1,199 @@
import zlib
try:
import lzma
except ImportError:
lzma = None
cdef extern from "lz4.h":
int LZ4_compress_limitedOutput(const char* source, char* dest, int inputSize, int maxOutputSize) nogil
int LZ4_decompress_safe(const char* source, char* dest, int inputSize, int maxOutputSize) nogil
cdef class CompressorBase:
"""
base class for all (de)compression classes,
also handles compression format auto detection and
adding/stripping the ID header (which enables auto detection).
"""
ID = b'\xFF\xFF' # reserved and not used
# override with a unique 2-byte bytestring in subclasses
name = 'baseclass'
@classmethod
def detect(cls, data):
return data.startswith(cls.ID)
def __init__(self, **kwargs):
pass
def compress(self, data):
# add ID bytes
return self.ID + data
def decompress(self, data):
# strip ID bytes
return data[2:]
class CNONE(CompressorBase):
"""
none - no compression, just pass through data
"""
ID = b'\x00\x00'
name = 'none'
def compress(self, data):
return super().compress(data)
def decompress(self, data):
data = super().decompress(data)
if not isinstance(data, bytes):
data = bytes(data)
return data
cdef class LZ4(CompressorBase):
"""
raw LZ4 compression / decompression (liblz4).
Features:
- lz4 is super fast
- wrapper releases CPython's GIL to support multithreaded code
- buffer given by caller, avoiding frequent reallocation and buffer duplication
- uses safe lz4 methods that never go beyond the end of the output buffer
But beware:
- this is not very generic, the given buffer MUST be large enough to
handle all compression or decompression output (or it will fail).
- you must not do method calls to the same LZ4 instance from different
threads at the same time - create one LZ4 instance per thread!
"""
ID = b'\x01\x00'
name = 'lz4'
cdef char *buffer # helper buffer for (de)compression output
cdef int bufsize # size of this buffer
def __cinit__(self, **kwargs):
buffer = kwargs['buffer']
self.buffer = buffer
self.bufsize = len(buffer)
def compress(self, idata):
if not isinstance(idata, bytes):
idata = bytes(idata) # code below does not work with memoryview
cdef int isize = len(idata)
cdef int osize = self.bufsize
cdef char *source = idata
cdef char *dest = self.buffer
with nogil:
osize = LZ4_compress_limitedOutput(source, dest, isize, osize)
if not osize:
raise Exception('lz4 compress failed')
return super().compress(dest[:osize])
def decompress(self, idata):
if not isinstance(idata, bytes):
idata = bytes(idata) # code below does not work with memoryview
idata = super().decompress(idata)
cdef int isize = len(idata)
cdef int osize = self.bufsize
cdef char *source = idata
cdef char *dest = self.buffer
with nogil:
osize = LZ4_decompress_safe(source, dest, isize, osize)
if osize < 0:
# malformed input data, buffer too small, ...
raise Exception('lz4 decompress failed')
return dest[:osize]
class LZMA(CompressorBase):
"""
lzma compression / decompression
"""
ID = b'\x02\x00'
name = 'lzma'
def __init__(self, level=6, **kwargs):
super().__init__(**kwargs)
self.level = level
if lzma is None:
raise ValueError('No lzma support found.')
def compress(self, data):
# we do not need integrity checks in lzma, we do that already
data = lzma.compress(data, preset=self.level, check=lzma.CHECK_NONE)
return super().compress(data)
def decompress(self, data):
data = super().decompress(data)
return lzma.decompress(data)
class ZLIB(CompressorBase):
"""
zlib compression / decompression (python stdlib)
"""
ID = b'\x08\x00' # not used here, see detect()
# avoid all 0x.8.. IDs elsewhere!
name = 'zlib'
@classmethod
def detect(cls, data):
# matches misc. patterns 0x.8.. used by zlib
cmf, flg = data[:2]
is_deflate = cmf & 0x0f == 8
check_ok = (cmf * 256 + flg) % 31 == 0
return check_ok and is_deflate
def __init__(self, level=6, **kwargs):
super().__init__(**kwargs)
self.level = level
def compress(self, data):
# note: for compatibility no super call, do not add ID bytes
return zlib.compress(data, self.level)
def decompress(self, data):
# note: for compatibility no super call, do not strip ID bytes
return zlib.decompress(data)
COMPRESSOR_TABLE = {
CNONE.name: CNONE,
LZ4.name: LZ4,
ZLIB.name: ZLIB,
LZMA.name: LZMA,
}
COMPRESSOR_LIST = [LZ4, CNONE, ZLIB, LZMA, ] # check fast stuff first
def get_compressor(name, **kwargs):
cls = COMPRESSOR_TABLE[name]
return cls(**kwargs)
class Compressor:
"""
compresses using a compressor with given name and parameters
decompresses everything we can handle (autodetect)
"""
def __init__(self, name='none', **kwargs):
self.params = kwargs
self.compressor = get_compressor(name, **self.params)
def compress(self, data):
return self.compressor.compress(data)
def decompress(self, data):
hdr = bytes(data[:2]) # detect() does not work with memoryview
for cls in COMPRESSOR_LIST:
if cls.detect(hdr):
return cls(**self.params).decompress(data)
else:
raise ValueError('No decompressor for this data found: %r.' % bytes(data[:2]))
# a buffer used for (de)compression result, which can be slightly bigger
# than the chunk buffer in the worst (incompressible data) case, add 10%:
COMPR_BUFFER = bytes(int(1.1 * 2 ** 23)) # CHUNK_MAX_EXP == 23
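The two-byte header test in `ZLIB.detect()` can be tried in isolation. The following is a minimal sketch in plain Python using only the standard library; `looks_like_zlib` is a hypothetical helper name and not part of borg, and the length guard is an addition for the sketch.

```python
import zlib


def looks_like_zlib(data):
    """Return True if the first two bytes form a plausible zlib (RFC 1950) header.

    Hypothetical helper mirroring the checks in ZLIB.detect() above.
    """
    if len(data) < 2:
        return False  # guard added for this sketch only
    cmf, flg = data[0], data[1]
    is_deflate = cmf & 0x0f == 8            # low nibble of CMF: method 8 = deflate
    check_ok = (cmf * 256 + flg) % 31 == 0  # FCHECK makes the 16-bit value a multiple of 31
    return check_ok and is_deflate


if __name__ == '__main__':
    print(looks_like_zlib(zlib.compress(b'some data', 6)))
    print(looks_like_zlib(b'\x02\x00'))  # the LZMA ID bytes used above
```

Run as a script, the first call should report a match and the second should not, because `0x02` does not carry the deflate method nibble.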

138
borg/crypto.pyx Normal file

@ -0,0 +1,138 @@
"""A thin OpenSSL wrapper
This could be replaced by PyCrypto maybe?
"""
from libc.stdlib cimport malloc, free
API_VERSION = 2
cdef extern from "openssl/rand.h":
int RAND_bytes(unsigned char *buf, int num)
cdef extern from "openssl/evp.h":
ctypedef struct EVP_MD:
pass
ctypedef struct EVP_CIPHER:
pass
ctypedef struct EVP_CIPHER_CTX:
unsigned char *iv
pass
ctypedef struct ENGINE:
pass
const EVP_CIPHER *EVP_aes_256_ctr()
void EVP_CIPHER_CTX_init(EVP_CIPHER_CTX *a)
void EVP_CIPHER_CTX_cleanup(EVP_CIPHER_CTX *a)
int EVP_EncryptInit_ex(EVP_CIPHER_CTX *ctx, const EVP_CIPHER *cipher, ENGINE *impl,
const unsigned char *key, const unsigned char *iv)
int EVP_DecryptInit_ex(EVP_CIPHER_CTX *ctx, const EVP_CIPHER *cipher, ENGINE *impl,
const unsigned char *key, const unsigned char *iv)
int EVP_EncryptUpdate(EVP_CIPHER_CTX *ctx, unsigned char *out, int *outl,
const unsigned char *in_, int inl)
int EVP_DecryptUpdate(EVP_CIPHER_CTX *ctx, unsigned char *out, int *outl,
const unsigned char *in_, int inl)
int EVP_EncryptFinal_ex(EVP_CIPHER_CTX *ctx, unsigned char *out, int *outl)
int EVP_DecryptFinal_ex(EVP_CIPHER_CTX *ctx, unsigned char *out, int *outl)
import struct
_int = struct.Struct('>I')
_long = struct.Struct('>Q')
bytes_to_int = lambda x, offset=0: _int.unpack_from(x, offset)[0]
bytes_to_long = lambda x, offset=0: _long.unpack_from(x, offset)[0]
long_to_bytes = lambda x: _long.pack(x)
def num_aes_blocks(int length):
"""Return the number of AES blocks required to encrypt/decrypt *length* bytes of data.
Note: this is only correct for modes without padding, like AES-CTR.
"""
return (length + 15) // 16
cdef class AES:
"""A thin wrapper around the OpenSSL EVP cipher API
"""
cdef EVP_CIPHER_CTX ctx
cdef int is_encrypt
def __cinit__(self, is_encrypt, key, iv=None):
EVP_CIPHER_CTX_init(&self.ctx)
self.is_encrypt = is_encrypt
# Set cipher type and mode
cipher_mode = EVP_aes_256_ctr()
if self.is_encrypt:
if not EVP_EncryptInit_ex(&self.ctx, cipher_mode, NULL, NULL, NULL):
raise Exception('EVP_EncryptInit_ex failed')
else: # decrypt
if not EVP_DecryptInit_ex(&self.ctx, cipher_mode, NULL, NULL, NULL):
raise Exception('EVP_DecryptInit_ex failed')
self.reset(key, iv)
def __dealloc__(self):
EVP_CIPHER_CTX_cleanup(&self.ctx)
def reset(self, key=None, iv=None):
cdef const unsigned char *key2 = NULL
cdef const unsigned char *iv2 = NULL
if key:
key2 = key
if iv:
iv2 = iv
# Initialise key and IV
if self.is_encrypt:
if not EVP_EncryptInit_ex(&self.ctx, NULL, NULL, key2, iv2):
raise Exception('EVP_EncryptInit_ex failed')
else: # decrypt
if not EVP_DecryptInit_ex(&self.ctx, NULL, NULL, key2, iv2):
raise Exception('EVP_DecryptInit_ex failed')
@property
def iv(self):
return self.ctx.iv[:16]
def encrypt(self, data):
cdef int inl = len(data)
cdef int ctl = 0
cdef int outl = 0
# note: modes that use padding need up to one extra AES block (16 bytes)
cdef unsigned char *out = <unsigned char *>malloc(inl+16)
if not out:
raise MemoryError
try:
if not EVP_EncryptUpdate(&self.ctx, out, &outl, data, inl):
raise Exception('EVP_EncryptUpdate failed')
ctl = outl
if not EVP_EncryptFinal_ex(&self.ctx, out+ctl, &outl):
raise Exception('EVP_EncryptFinal_ex failed')
ctl += outl
return out[:ctl]
finally:
free(out)
def decrypt(self, data):
cdef int inl = len(data)
cdef int ptl = 0
cdef int outl = 0
# note: modes that use padding need up to one extra AES block (16 bytes).
# This is what the openssl docs say. I am not sure this is correct,
# but OTOH it will not cause any harm if our buffer is a little bigger.
cdef unsigned char *out = <unsigned char *>malloc(inl+16)
if not out:
raise MemoryError
try:
if not EVP_DecryptUpdate(&self.ctx, out, &outl, data, inl):
raise Exception('EVP_DecryptUpdate failed')
ptl = outl
if EVP_DecryptFinal_ex(&self.ctx, out+ptl, &outl) <= 0:
# this error check is very important for modes with padding or
# authentication. for them, a failure here means corrupted data.
# CTR mode does not use padding nor authentication.
raise Exception('EVP_DecryptFinal failed')
ptl += outl
return out[:ptl]
finally:
free(out)
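Because CTR mode advances its counter by one per 16-byte block, the block count from `num_aes_blocks()` tells a caller where the counter stands after a payload has been processed. A minimal sketch in plain Python, assuming the upper 8 bytes of the 16-byte IV stay zero as in the key-handling code of this diff; `next_iv` is a hypothetical helper name.

```python
import struct

_long = struct.Struct('>Q')  # big-endian unsigned 64-bit, as in the module above
PREFIX = b'\0' * 8           # upper half of the 16-byte IV, assumed to be all zeros


def num_aes_blocks(length):
    """Number of 16-byte AES blocks touched by `length` bytes (no padding)."""
    return (length + 15) // 16


def next_iv(nonce, length):
    """Hypothetical helper: IV to continue with after `length` bytes
    were processed starting at counter value `nonce`."""
    return PREFIX + _long.pack(nonce + num_aes_blocks(length))


if __name__ == '__main__':
    print(num_aes_blocks(17))     # 17 bytes span two blocks
    print(next_iv(0, 17).hex())
```

Rounding up matters here: a partially used final block still consumes a counter value, so it must not be reused for the next payload.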

268
borg/fuse.py Normal file

@ -0,0 +1,268 @@
from collections import defaultdict
import errno
import io
import llfuse
import os
import stat
import tempfile
import time
from .archive import Archive
from .helpers import daemonize, bigint_to_int
from distutils.version import LooseVersion
import msgpack
# Does this version of llfuse support ns precision?
have_fuse_xtime_ns = hasattr(llfuse.EntryAttributes, 'st_mtime_ns')
fuse_version = LooseVersion(getattr(llfuse, '__version__', '0.1'))
if fuse_version >= '0.42':
def fuse_main():
return llfuse.main(workers=1)
else:
def fuse_main():
llfuse.main(single=True)
return None
class ItemCache:
def __init__(self):
self.fd = tempfile.TemporaryFile(prefix='borg-tmp')
self.offset = 1000000
def add(self, item):
pos = self.fd.seek(0, io.SEEK_END)
self.fd.write(msgpack.packb(item))
return pos + self.offset
def get(self, inode):
self.fd.seek(inode - self.offset, io.SEEK_SET)
return next(msgpack.Unpacker(self.fd, read_size=1024))
class FuseOperations(llfuse.Operations):
"""Export archive as a fuse filesystem
"""
def __init__(self, key, repository, manifest, archive, cached_repo):
super().__init__()
self._inode_count = 0
self.key = key
self.repository = cached_repo
self.items = {}
self.parent = {}
self.contents = defaultdict(dict)
self.default_dir = {b'mode': 0o40755, b'mtime': int(time.time() * 1e9), b'uid': os.getuid(), b'gid': os.getgid()}
self.pending_archives = {}
self.accounted_chunks = {}
self.cache = ItemCache()
if archive:
self.process_archive(archive)
else:
# Create root inode
self.parent[1] = self.allocate_inode()
self.items[1] = self.default_dir
for archive_name in manifest.archives:
# Create archive placeholder inode
archive_inode = self.allocate_inode()
self.items[archive_inode] = self.default_dir
self.parent[archive_inode] = 1
self.contents[1][os.fsencode(archive_name)] = archive_inode
self.pending_archives[archive_inode] = Archive(repository, key, manifest, archive_name)
def process_archive(self, archive, prefix=[]):
"""Build fuse inode hierarchy from archive metadata
"""
unpacker = msgpack.Unpacker()
for key, chunk in zip(archive.metadata[b'items'], self.repository.get_many(archive.metadata[b'items'])):
data = self.key.decrypt(key, chunk)
unpacker.feed(data)
for item in unpacker:
segments = prefix + os.fsencode(os.path.normpath(item[b'path'])).split(b'/')
del item[b'path']
num_segments = len(segments)
parent = 1
for i, segment in enumerate(segments, 1):
# Insert a default root inode if needed
if self._inode_count == 0 and segment:
archive_inode = self.allocate_inode()
self.items[archive_inode] = self.default_dir
self.parent[archive_inode] = parent
# Leaf segment?
if i == num_segments:
if b'source' in item and stat.S_ISREG(item[b'mode']):
inode = self._find_inode(item[b'source'], prefix)
item = self.cache.get(inode)
item[b'nlink'] = item.get(b'nlink', 1) + 1
self.items[inode] = item
else:
inode = self.cache.add(item)
self.parent[inode] = parent
if segment:
self.contents[parent][segment] = inode
elif segment in self.contents[parent]:
parent = self.contents[parent][segment]
else:
inode = self.allocate_inode()
self.items[inode] = self.default_dir
self.parent[inode] = parent
if segment:
self.contents[parent][segment] = inode
parent = inode
def allocate_inode(self):
self._inode_count += 1
return self._inode_count
def statfs(self, ctx=None):
stat_ = llfuse.StatvfsData()
stat_.f_bsize = 512
stat_.f_frsize = 512
stat_.f_blocks = 0
stat_.f_bfree = 0
stat_.f_bavail = 0
stat_.f_files = 0
stat_.f_ffree = 0
stat_.f_favail = 0
return stat_
def get_item(self, inode):
try:
return self.items[inode]
except KeyError:
return self.cache.get(inode)
def _find_inode(self, path, prefix=[]):
segments = prefix + os.fsencode(os.path.normpath(path)).split(b'/')
inode = 1
for segment in segments:
inode = self.contents[inode][segment]
return inode
def getattr(self, inode, ctx=None):
item = self.get_item(inode)
size = 0
dsize = 0
try:
for key, chunksize, _ in item[b'chunks']:
size += chunksize
if self.accounted_chunks.get(key, inode) == inode:
self.accounted_chunks[key] = inode
dsize += chunksize
except KeyError:
pass
entry = llfuse.EntryAttributes()
entry.st_ino = inode
entry.generation = 0
entry.entry_timeout = 300
entry.attr_timeout = 300
entry.st_mode = item[b'mode']
entry.st_nlink = item.get(b'nlink', 1)
entry.st_uid = item[b'uid']
entry.st_gid = item[b'gid']
entry.st_rdev = item.get(b'rdev', 0)
entry.st_size = size
entry.st_blksize = 512
entry.st_blocks = dsize / 512
# note: older archives only have mtime (not atime nor ctime)
if have_fuse_xtime_ns:
entry.st_mtime_ns = bigint_to_int(item[b'mtime'])
if b'atime' in item:
entry.st_atime_ns = bigint_to_int(item[b'atime'])
else:
entry.st_atime_ns = bigint_to_int(item[b'mtime'])
if b'ctime' in item:
entry.st_ctime_ns = bigint_to_int(item[b'ctime'])
else:
entry.st_ctime_ns = bigint_to_int(item[b'mtime'])
else:
entry.st_mtime = bigint_to_int(item[b'mtime']) / 1e9
if b'atime' in item:
entry.st_atime = bigint_to_int(item[b'atime']) / 1e9
else:
entry.st_atime = bigint_to_int(item[b'mtime']) / 1e9
if b'ctime' in item:
entry.st_ctime = bigint_to_int(item[b'ctime']) / 1e9
else:
entry.st_ctime = bigint_to_int(item[b'mtime']) / 1e9
return entry
def listxattr(self, inode, ctx=None):
item = self.get_item(inode)
return item.get(b'xattrs', {}).keys()
def getxattr(self, inode, name, ctx=None):
item = self.get_item(inode)
try:
return item.get(b'xattrs', {})[name]
except KeyError:
raise llfuse.FUSEError(errno.ENODATA) from None
def _load_pending_archive(self, inode):
# Check if this is an archive we need to load
archive = self.pending_archives.pop(inode, None)
if archive:
self.process_archive(archive, [os.fsencode(archive.name)])
def lookup(self, parent_inode, name, ctx=None):
self._load_pending_archive(parent_inode)
if name == b'.':
inode = parent_inode
elif name == b'..':
inode = self.parent[parent_inode]
else:
inode = self.contents[parent_inode].get(name)
if not inode:
raise llfuse.FUSEError(errno.ENOENT)
return self.getattr(inode)
def open(self, inode, flags, ctx=None):
return inode
def opendir(self, inode, ctx=None):
self._load_pending_archive(inode)
return inode
def read(self, fh, offset, size):
parts = []
item = self.get_item(fh)
for id, s, csize in item[b'chunks']:
if s < offset:
offset -= s
continue
n = min(size, s - offset)
chunk = self.key.decrypt(id, self.repository.get(id))
parts.append(chunk[offset:offset + n])
offset = 0
size -= n
if not size:
break
return b''.join(parts)
def readdir(self, fh, off):
entries = [(b'.', fh), (b'..', self.parent[fh])]
entries.extend(self.contents[fh].items())
for i, (name, inode) in enumerate(entries[off:], off):
yield name, self.getattr(inode), i + 1
def readlink(self, inode, ctx=None):
item = self.get_item(inode)
return os.fsencode(item[b'source'])
def mount(self, mountpoint, extra_options, foreground=False):
options = ['fsname=borgfs', 'ro']
if extra_options:
options.extend(extra_options.split(','))
llfuse.init(self, mountpoint, options)
if not foreground:
daemonize()
# If the file system crashes, we do not want to umount because in that
# case the mountpoint suddenly appears to become empty. This can have
# nasty consequences, imagine the user has e.g. an active rsync mirror
# job - seeing the mountpoint empty, rsync would delete everything in the
# mirror.
umount = False
try:
signal = fuse_main()
umount = (signal is None) # no crash and no signal -> umount request
finally:
llfuse.close(umount)
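The chunk-walking loop in `FuseOperations.read()` can be exercised without a repository. A minimal sketch, assuming the chunks are already decrypted byte strings held in a list (the real code fetches and decrypts each chunk on demand); `read_range` is a hypothetical name.

```python
def read_range(chunks, offset, size):
    """Return up to `size` bytes starting at byte `offset` of the
    concatenation of `chunks`, without joining all chunks first."""
    parts = []
    for chunk in chunks:
        s = len(chunk)
        if s < offset:
            # the requested range starts after this chunk: skip it
            offset -= s
            continue
        n = min(size, s - offset)        # bytes available in this chunk
        parts.append(chunk[offset:offset + n])
        offset = 0                       # later chunks are read from their start
        size -= n
        if not size:
            break
    return b''.join(parts)


if __name__ == '__main__':
    chunks = [b'abc', b'defg', b'hi']
    print(read_range(chunks, 2, 4))
```

Only the first chunk that overlaps the range needs a non-zero start offset; every later chunk is consumed from its beginning until `size` is used up or the chunks run out.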

103
borg/hash_sizes.py Normal file

@ -0,0 +1,103 @@
"""
Compute hashtable sizes with nice properties
- prime sizes (for small to medium sizes)
- 2 prime-factor sizes (for big sizes)
- fast growth for small sizes
- slow growth for big sizes
Note:
this is just a tool for developers.
within borgbackup, it is just used to generate hash_sizes definition for _hashindex.c.
"""
from collections import namedtuple
K, M, G = 2**10, 2**20, 2**30
# hash table size (in number of buckets)
start, end_p1, end_p2 = 1 * K, 127 * M, 2 * G - 10 * M # stay well below 2^31 - 1
Policy = namedtuple("Policy", "upto grow")
policies = [
# which growth factor to use when growing a hashtable of size < upto
# grow fast (*2.0) at the start so we do not have to resize too often (expensive).
# grow slow (*1.1) for huge hash tables (do not jump too much in memory usage)
Policy(256*K, 2.0),
Policy(2*M, 1.7),
Policy(16*M, 1.4),
Policy(128*M, 1.2),
Policy(2*G-1, 1.1),
]
# slightly modified version of:
# http://www.macdevcenter.com/pub/a/python/excerpt/pythonckbk_chap1/index1.html?page=2
def eratosthenes():
"""Yields the sequence of prime numbers via the Sieve of Eratosthenes."""
D = {} # map each composite integer to its first-found prime factor
q = 2 # q gets 2, 3, 4, 5, ... ad infinitum
while True:
p = D.pop(q, None)
if p is None:
# q not a key in D, so q is prime, therefore, yield it
yield q
# mark q squared as not-prime (with q as first-found prime factor)
D[q * q] = q
else:
# let x <- smallest (N*p)+q which wasn't yet known to be composite
# we just learned x is composite, with p first-found prime factor,
# since p is the first-found prime factor of q -- find and mark it
x = p + q
while x in D:
x += p
D[x] = p
q += 1
def two_prime_factors(pfix=65537):
"""Yields numbers with 2 prime factors pfix and p."""
for p in eratosthenes():
yield pfix * p
def get_grow_factor(size):
for p in policies:
if size < p.upto:
return p.grow
def find_bigger_prime(gen, i):
while True:
p = next(gen)
if p >= i:
return p
def main():
sizes = []
i = start
gen = eratosthenes()
while i < end_p1:
grow_factor = get_grow_factor(i)
p = find_bigger_prime(gen, i)
sizes.append(p)
i = int(i * grow_factor)
gen = two_prime_factors() # for lower ram consumption
while i < end_p2:
grow_factor = get_grow_factor(i)
p = find_bigger_prime(gen, i)
sizes.append(p)
i = int(i * grow_factor)
print("""\
static int hash_sizes[] = {
%s
};
""" % ', '.join(str(size) for size in sizes))
if __name__ == '__main__':
main()
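The growth policy can be illustrated on a small range. This is a minimal sketch, not the generator script itself: it swaps the incremental sieve for plain trial division (fine for small numbers only), uses just the first two policy rows, and falls back to a factor of 1.1 above them as an assumption for the sketch. All function names are hypothetical.

```python
def is_prime(n):
    """Trial division; adequate for the small sizes used in this sketch."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def next_prime(n):
    """Smallest prime >= n."""
    while not is_prime(n):
        n += 1
    return n


# (upto, grow) pairs: first two rows of the policy table above
POLICIES = [(256 * 2**10, 2.0), (2 * 2**20, 1.7)]


def grow_factor(size):
    for upto, grow in POLICIES:
        if size < upto:
            return grow
    return 1.1  # assumption for this sketch


def table_sizes(start, end):
    """Prime table sizes for target sizes growing from `start` up to `end`."""
    sizes = []
    i = start
    while i < end:
        sizes.append(next_prime(i))
        i = int(i * grow_factor(i))  # grow the target, not the prime
    return sizes


if __name__ == '__main__':
    print(table_sizes(1024, 20000))
```

As in the script, the growth factor is applied to the running target `i` rather than to the prime that was picked, so rounding up to a prime does not accumulate across steps.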

345
borg/hashindex.pyx Normal file

@ -0,0 +1,345 @@
# -*- coding: utf-8 -*-
import os
cimport cython
from libc.stdint cimport uint32_t, UINT32_MAX, uint64_t
API_VERSION = 2
cdef extern from "_hashindex.c":
ctypedef struct HashIndex:
pass
HashIndex *hashindex_read(char *path)
HashIndex *hashindex_init(int capacity, int key_size, int value_size)
void hashindex_free(HashIndex *index)
void hashindex_merge(HashIndex *index, HashIndex *other)
void hashindex_add(HashIndex *index, void *key, void *value)
int hashindex_get_size(HashIndex *index)
int hashindex_write(HashIndex *index, char *path)
void *hashindex_get(HashIndex *index, void *key)
void *hashindex_next_key(HashIndex *index, void *key)
int hashindex_delete(HashIndex *index, void *key)
int hashindex_set(HashIndex *index, void *key, void *value)
uint32_t _htole32(uint32_t v)
uint32_t _le32toh(uint32_t v)
cdef _NoDefault = object()
"""
The HashIndex is *not* a general purpose data structure. The value size must be at least 4 bytes, and these
first bytes are used for in-band signalling in the data structure itself.
The constant MAX_VALUE defines the valid range for these 4 bytes when interpreted as a uint32_t, from 0
to MAX_VALUE (inclusive). The following reserved values beyond MAX_VALUE are currently in use
(byte order is LE)::
0xffffffff marks empty entries in the hashtable
0xfffffffe marks deleted entries in the hashtable
None of the publicly available classes in this module will accept or return a reserved value;
AssertionError is raised instead.
"""
assert UINT32_MAX == 2**32-1
# module-level constant because cdef's in classes can't have default values
cdef uint32_t _MAX_VALUE = 2**32-1025
MAX_VALUE = _MAX_VALUE
assert _MAX_VALUE % 2 == 1
@cython.internal
cdef class IndexBase:
cdef HashIndex *index
cdef int key_size
def __cinit__(self, capacity=0, path=None, key_size=32):
self.key_size = key_size
if path:
path = os.fsencode(path)
self.index = hashindex_read(path)
if not self.index:
raise Exception('hashindex_read failed')
else:
self.index = hashindex_init(capacity, self.key_size, self.value_size)
if not self.index:
raise Exception('hashindex_init failed')
def __dealloc__(self):
if self.index:
hashindex_free(self.index)
@classmethod
def read(cls, path):
return cls(path=path)
def write(self, path):
path = os.fsencode(path)
if not hashindex_write(self.index, path):
raise Exception('hashindex_write failed')
def clear(self):
hashindex_free(self.index)
self.index = hashindex_init(0, self.key_size, self.value_size)
if not self.index:
raise Exception('hashindex_init failed')
def setdefault(self, key, value):
if key not in self:
self[key] = value
def __delitem__(self, key):
assert len(key) == self.key_size
if not hashindex_delete(self.index, <char *>key):
raise Exception('hashindex_delete failed')
def get(self, key, default=None):
try:
return self[key]
except KeyError:
return default
def pop(self, key, default=_NoDefault):
try:
value = self[key]
del self[key]
return value
except KeyError:
if default is not _NoDefault:
return default
raise
def __len__(self):
return hashindex_get_size(self.index)
cdef class NSIndex(IndexBase):
value_size = 8
def __getitem__(self, key):
assert len(key) == self.key_size
data = <uint32_t *>hashindex_get(self.index, <char *>key)
if not data:
raise KeyError(key)
cdef uint32_t segment = _le32toh(data[0])
assert segment <= _MAX_VALUE, "maximum number of segments reached"
return segment, _le32toh(data[1])
def __setitem__(self, key, value):
assert len(key) == self.key_size
cdef uint32_t[2] data
cdef uint32_t segment = value[0]
assert segment <= _MAX_VALUE, "maximum number of segments reached"
data[0] = _htole32(segment)
data[1] = _htole32(value[1])
if not hashindex_set(self.index, <char *>key, data):
raise Exception('hashindex_set failed')
def __contains__(self, key):
cdef uint32_t segment
assert len(key) == self.key_size
data = <uint32_t *>hashindex_get(self.index, <char *>key)
if data != NULL:
segment = _le32toh(data[0])
assert segment <= _MAX_VALUE, "maximum number of segments reached"
return data != NULL
def iteritems(self, marker=None):
cdef const void *key
iter = NSKeyIterator(self.key_size)
iter.idx = self
iter.index = self.index
if marker:
key = hashindex_get(self.index, <char *>marker)
if key == NULL:
raise IndexError
iter.key = key - self.key_size
return iter
cdef class NSKeyIterator:
cdef NSIndex idx
cdef HashIndex *index
cdef const void *key
cdef int key_size
def __cinit__(self, key_size):
self.key = NULL
self.key_size = key_size
def __iter__(self):
return self
def __next__(self):
self.key = hashindex_next_key(self.index, <char *>self.key)
if not self.key:
raise StopIteration
cdef uint32_t *value = <uint32_t *>(self.key + self.key_size)
cdef uint32_t segment = _le32toh(value[0])
assert segment <= _MAX_VALUE, "maximum number of segments reached"
return (<char *>self.key)[:self.key_size], (segment, _le32toh(value[1]))
cdef class ChunkIndex(IndexBase):
"""
Mapping of 32 byte keys to (refcount, size, csize), which are all 32-bit unsigned.
The reference count cannot overflow. If an overflow would occur, the refcount
is fixed to MAX_VALUE and will neither increase nor decrease by incref(), decref()
or add().
Prior signed 32-bit overflow is handled correctly for most cases: All values
from UINT32_MAX (2**32-1, inclusive) to MAX_VALUE (exclusive) are reserved and either
cause silent data loss (-1, -2) or will raise an AssertionError when accessed.
Other values are handled correctly. Note that previously the refcount could also reach
0 by *increasing* it.
Assigning refcounts in this reserved range is an invalid operation and raises AssertionError.
"""
value_size = 12
def __getitem__(self, key):
assert len(key) == self.key_size
data = <uint32_t *>hashindex_get(self.index, <char *>key)
if not data:
raise KeyError(key)
cdef uint32_t refcount = _le32toh(data[0])
assert refcount <= _MAX_VALUE
return refcount, _le32toh(data[1]), _le32toh(data[2])
def __setitem__(self, key, value):
assert len(key) == self.key_size
cdef uint32_t[3] data
cdef uint32_t refcount = value[0]
assert refcount <= _MAX_VALUE, "invalid reference count"
data[0] = _htole32(refcount)
data[1] = _htole32(value[1])
data[2] = _htole32(value[2])
if not hashindex_set(self.index, <char *>key, data):
raise Exception('hashindex_set failed')
def __contains__(self, key):
assert len(key) == self.key_size
data = <uint32_t *>hashindex_get(self.index, <char *>key)
if data != NULL:
assert data[0] <= _MAX_VALUE
return data != NULL
def incref(self, key):
"""Increase refcount for 'key', return (refcount, size, csize)"""
assert len(key) == self.key_size
data = <uint32_t *>hashindex_get(self.index, <char *>key)
if not data:
raise KeyError(key)
cdef uint32_t refcount = _le32toh(data[0])
assert refcount <= _MAX_VALUE, "invalid reference count"
if refcount != _MAX_VALUE:
refcount += 1
data[0] = _htole32(refcount)
return refcount, _le32toh(data[1]), _le32toh(data[2])
def decref(self, key):
"""Decrease refcount for 'key', return (refcount, size, csize)"""
assert len(key) == self.key_size
data = <uint32_t *>hashindex_get(self.index, <char *>key)
if not data:
raise KeyError(key)
cdef uint32_t refcount = _le32toh(data[0])
# Never decrease a reference count of zero
assert 0 < refcount <= _MAX_VALUE, "invalid reference count"
if refcount != _MAX_VALUE:
refcount -= 1
data[0] = _htole32(refcount)
return refcount, _le32toh(data[1]), _le32toh(data[2])
def iteritems(self, marker=None):
cdef const void *key
iter = ChunkKeyIterator(self.key_size)
iter.idx = self
iter.index = self.index
if marker:
key = hashindex_get(self.index, <char *>marker)
if key == NULL:
raise IndexError
iter.key = key - self.key_size
return iter
def summarize(self):
cdef uint64_t size = 0, csize = 0, unique_size = 0, unique_csize = 0, chunks = 0, unique_chunks = 0
cdef uint32_t *values
cdef uint32_t refcount
cdef void *key = NULL
while True:
key = hashindex_next_key(self.index, key)
if not key:
break
unique_chunks += 1
values = <uint32_t*> (key + self.key_size)
refcount = _le32toh(values[0])
assert refcount <= _MAX_VALUE, "invalid reference count"
chunks += refcount
unique_size += _le32toh(values[1])
unique_csize += _le32toh(values[2])
size += <uint64_t> _le32toh(values[1]) * _le32toh(values[0])
csize += <uint64_t> _le32toh(values[2]) * _le32toh(values[0])
return size, csize, unique_size, unique_csize, unique_chunks, chunks
def add(self, key, refs, size, csize):
assert len(key) == self.key_size
cdef uint32_t[3] data
data[0] = _htole32(refs)
data[1] = _htole32(size)
data[2] = _htole32(csize)
self._add(<char*> key, data)
cdef _add(self, void *key, uint32_t *data):
cdef uint64_t refcount1, refcount2, result64
values = <uint32_t*> hashindex_get(self.index, key)
if values:
refcount1 = _le32toh(values[0])
refcount2 = _le32toh(data[0])
assert refcount1 <= _MAX_VALUE
assert refcount2 <= _MAX_VALUE
result64 = refcount1 + refcount2
values[0] = _htole32(min(result64, _MAX_VALUE))
else:
hashindex_set(self.index, key, data)
def merge(self, ChunkIndex other):
cdef void *key = NULL
while True:
key = hashindex_next_key(other.index, key)
if not key:
break
self._add(key, <uint32_t*> (key + self.key_size))
cdef class ChunkKeyIterator:
cdef ChunkIndex idx
cdef HashIndex *index
cdef const void *key
cdef int key_size
def __cinit__(self, key_size):
self.key = NULL
self.key_size = key_size
def __iter__(self):
return self
def __next__(self):
self.key = hashindex_next_key(self.index, <char *>self.key)
if not self.key:
raise StopIteration
cdef uint32_t *value = <uint32_t *>(self.key + self.key_size)
cdef uint32_t refcount = _le32toh(value[0])
assert refcount <= _MAX_VALUE, "invalid reference count"
return (<char *>self.key)[:self.key_size], (refcount, _le32toh(value[1]), _le32toh(value[2]))
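The saturating reference-count rules described in the `ChunkIndex` docstring can be restated on plain integers. A minimal sketch in pure Python, assuming the same `MAX_VALUE` as above; the free functions `incref`, `decref` and `add_refs` are hypothetical stand-ins for the methods that operate on the hash table entries.

```python
MAX_VALUE = 2**32 - 1025  # largest non-reserved value, as defined above


def incref(refcount):
    """Increase by one, but stay pinned once MAX_VALUE is reached."""
    assert refcount <= MAX_VALUE, "invalid reference count"
    if refcount != MAX_VALUE:
        refcount += 1
    return refcount


def decref(refcount):
    """Decrease by one; a pinned count stays pinned, zero is never decreased."""
    assert 0 < refcount <= MAX_VALUE, "invalid reference count"
    if refcount != MAX_VALUE:
        refcount -= 1
    return refcount


def add_refs(refcount1, refcount2):
    """Sum two counts, clamping the result to MAX_VALUE."""
    assert refcount1 <= MAX_VALUE
    assert refcount2 <= MAX_VALUE
    return min(refcount1 + refcount2, MAX_VALUE)


if __name__ == '__main__':
    print(incref(MAX_VALUE) == MAX_VALUE)
    print(add_refs(MAX_VALUE - 1, 5) == MAX_VALUE)
```

Once a count is pinned at `MAX_VALUE` the true number of references is no longer known, which is why `decref` leaves it untouched rather than risk dropping a chunk that is still in use.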

1077
borg/helpers.py Normal file

File diff suppressed because it is too large

463
borg/key.py Normal file

@ -0,0 +1,463 @@
from binascii import hexlify, a2b_base64, b2a_base64
import configparser
import getpass
import os
import sys
import textwrap
from hmac import HMAC, compare_digest
from hashlib import sha256, pbkdf2_hmac
from .helpers import IntegrityError, get_keys_dir, Error, yes
from .logger import create_logger
logger = create_logger()
from .crypto import AES, bytes_to_long, long_to_bytes, bytes_to_int, num_aes_blocks
from .compress import Compressor, COMPR_BUFFER
import msgpack
PREFIX = b'\0' * 8
class PassphraseWrong(Error):
"""passphrase supplied in BORG_PASSPHRASE is incorrect"""
class PasswordRetriesExceeded(Error):
"""exceeded the maximum password retries"""
class UnsupportedPayloadError(Error):
"""Unsupported payload type {}. A newer version is required to access this repository."""
class KeyfileNotFoundError(Error):
"""No key file for repository {} found in {}."""
class RepoKeyNotFoundError(Error):
"""No key entry found in the config of repository {}."""
def key_creator(repository, args):
if args.encryption == 'keyfile':
return KeyfileKey.create(repository, args)
elif args.encryption == 'repokey':
return RepoKey.create(repository, args)
else:
return PlaintextKey.create(repository, args)
def key_factory(repository, manifest_data):
key_type = manifest_data[0]
if key_type == KeyfileKey.TYPE:
return KeyfileKey.detect(repository, manifest_data)
elif key_type == RepoKey.TYPE:
return RepoKey.detect(repository, manifest_data)
elif key_type == PassphraseKey.TYPE:
# we just dispatch to repokey mode and assume the passphrase was migrated to a repokey.
# see also comment in PassphraseKey class.
return RepoKey.detect(repository, manifest_data)
elif key_type == PlaintextKey.TYPE:
return PlaintextKey.detect(repository, manifest_data)
else:
raise UnsupportedPayloadError(key_type)
class KeyBase:
TYPE = None # override in subclasses
def __init__(self, repository):
self.TYPE_STR = bytes([self.TYPE])
self.repository = repository
self.target = None # key location file path / repo obj
self.compressor = Compressor('none', buffer=COMPR_BUFFER)
def id_hash(self, data):
"""Return HMAC hash using the "id" HMAC key
"""
def encrypt(self, data):
pass
def decrypt(self, id, data):
pass
class PlaintextKey(KeyBase):
TYPE = 0x02
chunk_seed = 0
@classmethod
def create(cls, repository, args):
logger.info('Encryption NOT enabled.\nUse "--encryption=repokey|keyfile" to enable encryption.')
return cls(repository)
@classmethod
def detect(cls, repository, manifest_data):
return cls(repository)
def id_hash(self, data):
return sha256(data).digest()
def encrypt(self, data):
return b''.join([self.TYPE_STR, self.compressor.compress(data)])
def decrypt(self, id, data):
if data[0] != self.TYPE:
raise IntegrityError('Invalid encryption envelope')
data = self.compressor.decompress(memoryview(data)[1:])
if id and sha256(data).digest() != id:
raise IntegrityError('Chunk id verification failed')
return data
class AESKeyBase(KeyBase):
"""Common base class shared by KeyfileKey and PassphraseKey
Chunks are encrypted using 256bit AES in Counter Mode (CTR)
Payload layout: TYPE(1) + HMAC(32) + NONCE(8) + CIPHERTEXT
To reduce payload size, only 8 bytes of the 16-byte nonce are saved
in the payload; the first 8 bytes are always zero. This does not
affect security but limits the maximum repository capacity to
only 295 exabytes!
"""
PAYLOAD_OVERHEAD = 1 + 32 + 8 # TYPE + HMAC + NONCE
def id_hash(self, data):
"""Return HMAC hash using the "id" HMAC key
"""
return HMAC(self.id_key, data, sha256).digest()
def encrypt(self, data):
data = self.compressor.compress(data)
self.enc_cipher.reset()
data = b''.join((self.enc_cipher.iv[8:], self.enc_cipher.encrypt(data)))
hmac = HMAC(self.enc_hmac_key, data, sha256).digest()
return b''.join((self.TYPE_STR, hmac, data))
def decrypt(self, id, data):
if not (data[0] == self.TYPE or
data[0] == PassphraseKey.TYPE and isinstance(self, RepoKey)):
raise IntegrityError('Invalid encryption envelope')
hmac_given = memoryview(data)[1:33]
hmac_computed = memoryview(HMAC(self.enc_hmac_key, memoryview(data)[33:], sha256).digest())
if not compare_digest(hmac_computed, hmac_given):
raise IntegrityError('Encryption envelope checksum mismatch')
self.dec_cipher.reset(iv=PREFIX + data[33:41])
data = self.compressor.decompress(self.dec_cipher.decrypt(data[41:]))
if id:
hmac_given = id
hmac_computed = HMAC(self.id_key, data, sha256).digest()
if not compare_digest(hmac_computed, hmac_given):
raise IntegrityError('Chunk id verification failed')
return data
def extract_nonce(self, payload):
if not (payload[0] == self.TYPE or
payload[0] == PassphraseKey.TYPE and isinstance(self, RepoKey)):
raise IntegrityError('Invalid encryption envelope')
nonce = bytes_to_long(payload[33:41])
return nonce
def init_from_random_data(self, data):
self.enc_key = data[0:32]
self.enc_hmac_key = data[32:64]
self.id_key = data[64:96]
self.chunk_seed = bytes_to_int(data[96:100])
# Convert to signed int32
if self.chunk_seed & 0x80000000:
self.chunk_seed = self.chunk_seed - 0xffffffff - 1
def init_ciphers(self, enc_iv=b''):
self.enc_cipher = AES(is_encrypt=True, key=self.enc_key, iv=enc_iv)
self.dec_cipher = AES(is_encrypt=False, key=self.enc_key)
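The payload layout `TYPE(1) + HMAC(32) + NONCE(8) + CIPHERTEXT` and the order of checks in `decrypt()` can be sketched with the standard library alone. This is an illustration under assumptions: the ciphertext is treated as opaque bytes (no AES is performed), the type byte value is arbitrary, and `seal` / `open_envelope` are hypothetical names that raise `ValueError` instead of `IntegrityError`.

```python
import hmac
from hashlib import sha256


def seal(type_byte, enc_hmac_key, nonce8, ciphertext):
    """Build TYPE(1) + HMAC(32) + NONCE(8) + CIPHERTEXT.
    The HMAC covers the nonce and the ciphertext, not the type byte."""
    data = nonce8 + ciphertext
    mac = hmac.new(enc_hmac_key, data, sha256).digest()
    return bytes([type_byte]) + mac + data


def open_envelope(expected_type, enc_hmac_key, payload):
    """Verify type byte and HMAC, then return (nonce8, ciphertext)."""
    if payload[0] != expected_type:
        raise ValueError('Invalid encryption envelope')
    mac_given = payload[1:33]
    mac_computed = hmac.new(enc_hmac_key, payload[33:], sha256).digest()
    # constant-time comparison, as in the code above
    if not hmac.compare_digest(mac_computed, mac_given):
        raise ValueError('Encryption envelope checksum mismatch')
    return payload[33:41], payload[41:]


if __name__ == '__main__':
    key = b'k' * 32
    payload = seal(0x00, key, b'\0' * 8, b'opaque ciphertext')
    print(open_envelope(0x00, key, payload))
```

Checking the HMAC before touching the ciphertext means a tampered payload is rejected before any decryption or decompression takes place.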
class Passphrase(str):
@classmethod
def env_passphrase(cls, default=None):
passphrase = os.environ.get('BORG_PASSPHRASE', default)
if passphrase is not None:
return cls(passphrase)
@classmethod
def getpass(cls, prompt):
return cls(getpass.getpass(prompt))
@classmethod
def verification(cls, passphrase):
if yes('Do you want your passphrase to be displayed for verification? [yN]: ',
env_var_override='BORG_DISPLAY_PASSPHRASE'):
print('Your passphrase (between double-quotes): "%s"' % passphrase,
file=sys.stderr)
print('Make sure the passphrase displayed above is exactly what you wanted.',
file=sys.stderr)
try:
passphrase.encode('ascii')
except UnicodeEncodeError:
print('Your passphrase (UTF-8 encoding in hex): %s' %
hexlify(passphrase.encode('utf-8')).decode('ascii'),
file=sys.stderr)
print('As you have a non-ASCII passphrase, it is recommended to keep the UTF-8 encoding in hex together with the passphrase at a safe place.',
file=sys.stderr)
@classmethod
def new(cls, allow_empty=False):
passphrase = cls.env_passphrase()
if passphrase is not None:
return passphrase
for retry in range(1, 11):
passphrase = cls.getpass('Enter new passphrase: ')
if allow_empty or passphrase:
passphrase2 = cls.getpass('Enter same passphrase again: ')
if passphrase == passphrase2:
cls.verification(passphrase)
logger.info('Remember your passphrase. Your data will be inaccessible without it.')
return passphrase
else:
print('Passphrases do not match', file=sys.stderr)
else:
print('Passphrase must not be blank', file=sys.stderr)
else:
raise PasswordRetriesExceeded
def __repr__(self):
return '<Passphrase "***hidden***">'
def kdf(self, salt, iterations, length):
return pbkdf2_hmac('sha256', self.encode('utf-8'), salt, iterations, length)
class PassphraseKey(AESKeyBase):
# This mode was killed in borg 1.0, see: https://github.com/borgbackup/borg/issues/97
# Reasons:
# - you can never ever change your passphrase for existing repos.
# - you can never ever use a different iterations count for existing repos.
# "Killed" means:
# - there is no automatic dispatch to this class via type byte
# - --encryption=passphrase is an invalid argument now
# This class is kept for a while to support migration from passphrase to repokey mode.
TYPE = 0x01
iterations = 100000 # must not be changed ever!
@classmethod
def create(cls, repository, args):
key = cls(repository)
logger.warning('WARNING: "passphrase" mode is unsupported since borg 1.0.')
passphrase = Passphrase.new(allow_empty=False)
key.init(repository, passphrase)
return key
@classmethod
def detect(cls, repository, manifest_data):
prompt = 'Enter passphrase for %s: ' % repository._location.orig
key = cls(repository)
passphrase = Passphrase.env_passphrase()
if passphrase is None:
passphrase = Passphrase.getpass(prompt)
for retry in range(1, 3):
key.init(repository, passphrase)
try:
key.decrypt(None, manifest_data)
num_blocks = num_aes_blocks(len(manifest_data) - 41)
key.init_ciphers(PREFIX + long_to_bytes(key.extract_nonce(manifest_data) + num_blocks))
return key
except IntegrityError:
passphrase = Passphrase.getpass(prompt)
else:
raise PasswordRetriesExceeded
def change_passphrase(self):
class ImmutablePassphraseError(Error):
"""The passphrase for this encryption key type can't be changed."""
raise ImmutablePassphraseError
def init(self, repository, passphrase):
self.init_from_random_data(passphrase.kdf(repository.id, self.iterations, 100))
self.init_ciphers()
class KeyfileKeyBase(AESKeyBase):
@classmethod
def detect(cls, repository, manifest_data):
key = cls(repository)
target = key.find_key()
prompt = 'Enter passphrase for key %s: ' % target
passphrase = Passphrase.env_passphrase()
if passphrase is None:
passphrase = Passphrase()
if not key.load(target, passphrase):
for retry in range(0, 3):
passphrase = Passphrase.getpass(prompt)
if key.load(target, passphrase):
break
else:
raise PasswordRetriesExceeded
else:
if not key.load(target, passphrase):
raise PassphraseWrong
num_blocks = num_aes_blocks(len(manifest_data) - 41)
key.init_ciphers(PREFIX + long_to_bytes(key.extract_nonce(manifest_data) + num_blocks))
return key
def find_key(self):
raise NotImplementedError
def load(self, target, passphrase):
raise NotImplementedError
def _load(self, key_data, passphrase):
cdata = a2b_base64(key_data)
data = self.decrypt_key_file(cdata, passphrase)
if data:
key = msgpack.unpackb(data)
if key[b'version'] != 1:
raise IntegrityError('Invalid key file header')
self.repository_id = key[b'repository_id']
self.enc_key = key[b'enc_key']
self.enc_hmac_key = key[b'enc_hmac_key']
self.id_key = key[b'id_key']
self.chunk_seed = key[b'chunk_seed']
return True
return False
def decrypt_key_file(self, data, passphrase):
d = msgpack.unpackb(data)
assert d[b'version'] == 1
assert d[b'algorithm'] == b'sha256'
key = passphrase.kdf(d[b'salt'], d[b'iterations'], 32)
data = AES(is_encrypt=False, key=key).decrypt(d[b'data'])
if HMAC(key, data, sha256).digest() == d[b'hash']:
return data
def encrypt_key_file(self, data, passphrase):
salt = os.urandom(32)
iterations = 100000
key = passphrase.kdf(salt, iterations, 32)
hash = HMAC(key, data, sha256).digest()
cdata = AES(is_encrypt=True, key=key).encrypt(data)
d = {
'version': 1,
'salt': salt,
'iterations': iterations,
'algorithm': 'sha256',
'hash': hash,
'data': cdata,
}
return msgpack.packb(d)
def _save(self, passphrase):
key = {
'version': 1,
'repository_id': self.repository_id,
'enc_key': self.enc_key,
'enc_hmac_key': self.enc_hmac_key,
'id_key': self.id_key,
'chunk_seed': self.chunk_seed,
}
data = self.encrypt_key_file(msgpack.packb(key), passphrase)
key_data = '\n'.join(textwrap.wrap(b2a_base64(data).decode('ascii')))
return key_data
def change_passphrase(self):
passphrase = Passphrase.new(allow_empty=True)
self.save(self.target, passphrase)
logger.info('Key updated')
@classmethod
def create(cls, repository, args):
passphrase = Passphrase.new(allow_empty=True)
key = cls(repository)
key.repository_id = repository.id
key.init_from_random_data(os.urandom(100))
key.init_ciphers()
target = key.get_new_target(args)
key.save(target, passphrase)
logger.info('Key in "%s" created.' % target)
logger.info('Keep this key safe. Your data will be inaccessible without it.')
return key
def save(self, target, passphrase):
raise NotImplementedError
def get_new_target(self, args):
raise NotImplementedError
class KeyfileKey(KeyfileKeyBase):
TYPE = 0x00
FILE_ID = 'BORG_KEY'
def find_key(self):
file_id = self.FILE_ID.encode()
first_line = file_id + b' ' + hexlify(self.repository.id)
keys_dir = get_keys_dir()
for name in os.listdir(keys_dir):
filename = os.path.join(keys_dir, name)
# we do the magic / id check in binary mode to avoid stumbling over
# decoding errors if somebody has binary files in the keys dir for some reason.
with open(filename, 'rb') as fd:
if fd.read(len(first_line)) == first_line:
return filename
raise KeyfileNotFoundError(self.repository._location.canonical_path(), get_keys_dir())
def get_new_target(self, args):
filename = args.location.to_key_filename()
path = filename
i = 1
while os.path.exists(path):
i += 1
path = filename + '.%d' % i
return path
def load(self, target, passphrase):
with open(target, 'r') as fd:
key_data = ''.join(fd.readlines()[1:])
success = self._load(key_data, passphrase)
if success:
self.target = target
return success
def save(self, target, passphrase):
key_data = self._save(passphrase)
with open(target, 'w') as fd:
fd.write('%s %s\n' % (self.FILE_ID, hexlify(self.repository_id).decode('ascii')))
fd.write(key_data)
fd.write('\n')
self.target = target
class RepoKey(KeyfileKeyBase):
TYPE = 0x03
def find_key(self):
loc = self.repository._location.canonical_path()
try:
self.repository.load_key()
return loc
except configparser.NoOptionError:
raise RepoKeyNotFoundError(loc) from None
def get_new_target(self, args):
return self.repository
def load(self, target, passphrase):
# what we get in target is just a repo location, but we already have the repo obj:
target = self.repository
key_data = target.load_key()
key_data = key_data.decode('utf-8') # remote repo: msgpack issue #99, getting bytes
success = self._load(key_data, passphrase)
if success:
self.target = target
return success
def save(self, target, passphrase):
key_data = self._save(passphrase)
key_data = key_data.encode('utf-8') # remote repo: msgpack issue #99, giving bytes
target.save_key(key_data)
self.target = target
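
The passphrase handling above comes down to two steps: derive a key with PBKDF2-HMAC-SHA256, then use an HMAC over the plaintext key data to tell a correct passphrase from a wrong one. Below is a minimal, standard-library-only sketch of that check. It deliberately leaves out the AES step that `encrypt_key_file()` and `decrypt_key_file()` perform, and the names `derive_key`, `seal` and `passphrase_matches` are illustrative assumptions, not part of borg.

```python
import hmac
import os
from hashlib import pbkdf2_hmac, sha256


def derive_key(passphrase, salt, iterations=100000, length=32):
    """PBKDF2-HMAC-SHA256 over the UTF-8 encoded passphrase, as in Passphrase.kdf()."""
    return pbkdf2_hmac('sha256', passphrase.encode('utf-8'), salt, iterations, length)


def seal(passphrase, data, iterations=100000):
    """Return a record with salt, iteration count and an HMAC of the data.

    Hypothetical helper: the real key file also stores the AES-encrypted data.
    """
    salt = os.urandom(32)
    key = derive_key(passphrase, salt, iterations)
    return {
        'salt': salt,
        'iterations': iterations,
        'hash': hmac.new(key, data, sha256).digest(),
    }


def passphrase_matches(passphrase, data, record):
    """Re-derive the key from the stored salt and compare the HMACs."""
    key = derive_key(passphrase, record['salt'], record['iterations'])
    expected = hmac.new(key, data, sha256).digest()
    # compare_digest avoids leaking the mismatch position through timing
    return hmac.compare_digest(expected, record['hash'])


record = seal('correct horse', b'key material')
right = passphrase_matches('correct horse', b'key material', record)
wrong = passphrase_matches('battery staple', b'key material', record)
```

Because the salt and iteration count are stored next to the hash, a key file written this way can later be re-sealed with a new passphrase, which is what `change_passphrase()` relies on for the keyfile and repokey modes.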

307
borg/locking.py Normal file

@ -0,0 +1,307 @@
import json
import os
import socket
import time
from borg.helpers import Error, ErrorWithTraceback
ADD, REMOVE = 'add', 'remove'
SHARED, EXCLUSIVE = 'shared', 'exclusive'
# only determine the PID and hostname once.
# for FUSE mounts, we fork a child process that needs to release
# the lock made by the parent, so it needs to use the same PID for that.
_pid = os.getpid()
_hostname = socket.gethostname()
def get_id():
"""Get identification tuple for 'us'"""
thread_id = 0
return _hostname, _pid, thread_id
class TimeoutTimer:
"""
A timer for timeout checks (can also deal with no timeout, give timeout=None [default]).
It can also compute and optionally execute a reasonable sleep time (e.g. to avoid
polling too often or to support thread/process rescheduling).
"""
def __init__(self, timeout=None, sleep=None):
"""
Initialize a timer.
:param timeout: time out interval [s] or None (no timeout)
:param sleep: sleep interval [s] (>= 0: do sleep call, <0: don't call sleep)
or None (autocompute: use 10% of timeout [but not more than 60s],
or 1s for no timeout)
"""
if timeout is not None and timeout < 0:
raise ValueError("timeout must be >= 0")
self.timeout_interval = timeout
if sleep is None:
if timeout is None:
sleep = 1.0
else:
sleep = min(60.0, timeout / 10.0)
self.sleep_interval = sleep
self.start_time = None
self.end_time = None
def __repr__(self):
return "<%s: start=%r end=%r timeout=%r sleep=%r>" % (
self.__class__.__name__, self.start_time, self.end_time,
self.timeout_interval, self.sleep_interval)
def start(self):
self.start_time = time.time()
if self.timeout_interval is not None:
self.end_time = self.start_time + self.timeout_interval
return self
def sleep(self):
if self.sleep_interval >= 0:
time.sleep(self.sleep_interval)
def timed_out(self):
return self.end_time is not None and time.time() >= self.end_time
def timed_out_or_sleep(self):
if self.timed_out():
return True
else:
self.sleep()
return False
class LockError(Error):
"""Failed to acquire the lock {}."""
class LockErrorT(ErrorWithTraceback):
"""Failed to acquire the lock {}."""
class LockTimeout(LockError):
"""Failed to create/acquire the lock {} (timeout)."""
class LockFailed(LockErrorT):
"""Failed to create/acquire the lock {} ({})."""
class NotLocked(LockErrorT):
"""Failed to release the lock {} (was not locked)."""
class NotMyLock(LockErrorT):
"""Failed to release the lock {} (was/is locked, but not by me)."""
class ExclusiveLock:
"""An exclusive Lock based on mkdir fs operation being atomic.
If possible, try to use the contextmanager here like:
with ExclusiveLock(...) as lock:
...
This makes sure the lock is released again if the block is left, no
matter how (e.g. if an exception occurred).
"""
def __init__(self, path, timeout=None, sleep=None, id=None):
self.timeout = timeout
self.sleep = sleep
self.path = os.path.abspath(path)
self.id = id or get_id()
self.unique_name = os.path.join(self.path, "%s.%d-%x" % self.id)
def __enter__(self):
return self.acquire()
def __exit__(self, *exc):
self.release()
def __repr__(self):
return "<%s: %r>" % (self.__class__.__name__, self.unique_name)
def acquire(self, timeout=None, sleep=None):
if timeout is None:
timeout = self.timeout
if sleep is None:
sleep = self.sleep
timer = TimeoutTimer(timeout, sleep).start()
while True:
try:
os.mkdir(self.path)
except FileExistsError: # already locked
if self.by_me():
return self
if timer.timed_out_or_sleep():
raise LockTimeout(self.path)
except OSError as err:
raise LockFailed(self.path, str(err)) from None
else:
with open(self.unique_name, "wb"):
pass
return self
def release(self):
if not self.is_locked():
raise NotLocked(self.path)
if not self.by_me():
raise NotMyLock(self.path)
os.unlink(self.unique_name)
os.rmdir(self.path)
def is_locked(self):
return os.path.exists(self.path)
def by_me(self):
return os.path.exists(self.unique_name)
def break_lock(self):
if self.is_locked():
for name in os.listdir(self.path):
os.unlink(os.path.join(self.path, name))
os.rmdir(self.path)
class LockRoster:
"""
A Lock Roster to track shared/exclusive lockers.
Note: you usually should call the methods with an exclusive lock held,
to avoid conflicting access by multiple threads/processes/machines.
"""
def __init__(self, path, id=None):
self.path = path
self.id = id or get_id()
def load(self):
try:
with open(self.path) as f:
data = json.load(f)
except (FileNotFoundError, ValueError):
# no or corrupt/empty roster file?
data = {}
return data
def save(self, data):
with open(self.path, "w") as f:
json.dump(data, f)
def remove(self):
try:
os.unlink(self.path)
except FileNotFoundError:
pass
def get(self, key):
roster = self.load()
return set(tuple(e) for e in roster.get(key, []))
def modify(self, key, op):
roster = self.load()
try:
elements = set(tuple(e) for e in roster[key])
except KeyError:
elements = set()
if op == ADD:
elements.add(self.id)
elif op == REMOVE:
elements.remove(self.id)
else:
raise ValueError('Unknown LockRoster op %r' % op)
roster[key] = list(list(e) for e in elements)
self.save(roster)
class UpgradableLock:
"""
A Lock for a resource that can be accessed in a shared or exclusive way.
Typically, write access to a resource needs an exclusive lock (1 writer,
no one is allowed reading) and read access to a resource needs a shared
lock (multiple readers are allowed).
If possible, try to use the contextmanager here like:
with UpgradableLock(...) as lock:
...
This makes sure the lock is released again if the block is left, no
matter how (e.g. if an exception occurred).
"""
def __init__(self, path, exclusive=False, sleep=None, timeout=None, id=None):
self.path = path
self.is_exclusive = exclusive
self.sleep = sleep
self.timeout = timeout
self.id = id or get_id()
# globally keeping track of shared and exclusive lockers:
self._roster = LockRoster(path + '.roster', id=id)
# an exclusive lock, used for:
# - holding while doing roster queries / updates
# - holding while the UpgradableLock itself is exclusive
self._lock = ExclusiveLock(path + '.exclusive', id=id, timeout=timeout)
def __enter__(self):
return self.acquire()
def __exit__(self, *exc):
self.release()
def __repr__(self):
return "<%s: %r>" % (self.__class__.__name__, self.id)
def acquire(self, exclusive=None, remove=None, sleep=None):
if exclusive is None:
exclusive = self.is_exclusive
sleep = sleep or self.sleep or 0.2
if exclusive:
self._wait_for_readers_finishing(remove, sleep)
self._roster.modify(EXCLUSIVE, ADD)
else:
with self._lock:
if remove is not None:
self._roster.modify(remove, REMOVE)
self._roster.modify(SHARED, ADD)
self.is_exclusive = exclusive
return self
def _wait_for_readers_finishing(self, remove, sleep):
timer = TimeoutTimer(self.timeout, sleep).start()
while True:
self._lock.acquire()
try:
if remove is not None:
self._roster.modify(remove, REMOVE)
if len(self._roster.get(SHARED)) == 0:
return # we are the only one and we keep the lock!
# restore the roster state as before (undo the roster change):
if remove is not None:
self._roster.modify(remove, ADD)
except:
# avoid orphan lock when an exception happens here, e.g. Ctrl-C!
self._lock.release()
raise
else:
self._lock.release()
if timer.timed_out_or_sleep():
raise LockTimeout(self.path)
def release(self):
if self.is_exclusive:
self._roster.modify(EXCLUSIVE, REMOVE)
self._lock.release()
else:
with self._lock:
self._roster.modify(SHARED, REMOVE)
def upgrade(self):
if not self.is_exclusive:
self.acquire(exclusive=True, remove=SHARED)
def downgrade(self):
if self.is_exclusive:
self.acquire(exclusive=False, remove=EXCLUSIVE)
def break_lock(self):
self._roster.remove()
self._lock.break_lock()
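
`ExclusiveLock` relies on `os.mkdir()` being atomic: of several processes racing to create the same directory, exactly one succeeds and the others get `FileExistsError`. The following is a small sketch of just that primitive, assuming a local filesystem and a throwaway temporary directory. The helpers `try_lock` and `unlock` are hypothetical names and omit the timeout, ownership marker file and roster handling shown above.

```python
import os
import tempfile


def try_lock(path):
    """Return True if we created the lock directory, False if someone holds it."""
    try:
        os.mkdir(path)
    except FileExistsError:
        return False
    return True


def unlock(path):
    """Release the lock by removing the (empty) lock directory."""
    os.rmdir(path)


with tempfile.TemporaryDirectory() as base:
    lock_path = os.path.join(base, 'lock.exclusive')
    first = try_lock(lock_path)    # nobody holds the lock yet
    second = try_lock(lock_path)   # directory already exists
    unlock(lock_path)
    third = try_lock(lock_path)    # free again after release
    unlock(lock_path)
```

The real class additionally writes a file named after host, PID and thread id into the directory, so that `by_me()` can distinguish its own lock from a foreign one.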

178
borg/logger.py Normal file

@ -0,0 +1,178 @@
"""logging facilities
The way to use this is as follows:
* each module declares its own logger, using:
from .logger import create_logger
logger = create_logger()
* then each module uses logger.info/warning/debug/etc according to the
level it believes is appropriate:
logger.debug('debugging info for developers or power users')
logger.info('normal, informational output')
logger.warning('warn about a non-fatal error or something else')
logger.error('a fatal error')
... and so on. see the `logging documentation
<https://docs.python.org/3/howto/logging.html#when-to-use-logging>`_
for more information
* console interaction happens on stderr, that includes interactive
reporting functions like `help`, `info` and `list`
* ...except ``input()`` is special, because we can't control the
stream it is using, unfortunately. we assume that it won't clutter
stdout, because interaction would be broken then anyways
* what is output on INFO level is additionally controlled by commandline
flags
"""
import inspect
import logging
import logging.config
import logging.handlers # needed for handlers defined there being configurable in logging.conf file
import os
import warnings
configured = False
# use something like this to ignore warnings:
# warnings.filterwarnings('ignore', r'... regex for warning message to ignore ...')
def _log_warning(message, category, filename, lineno, file=None, line=None):
# for warnings, we just want to use the logging system, not stderr or other files
msg = "{0}:{1}: {2}: {3}".format(filename, lineno, category.__name__, message)
logger = create_logger(__name__)
# Note: the warning will look as if it came from here,
# but msg contains info about where it really comes from
logger.warning(msg)
def setup_logging(stream=None, conf_fname=None, env_var='BORG_LOGGING_CONF', level='info', is_serve=False):
"""setup logging module according to the arguments provided
if conf_fname is given (or the config file name can be determined via
the env_var, if given): load this logging configuration.
otherwise, set up a stream handler logger on stderr (by default, if no
stream is provided).
if is_serve == True, we configure a special log format as expected by
the borg client log message interceptor.
"""
global configured
err_msg = None
if env_var:
conf_fname = os.environ.get(env_var, conf_fname)
if conf_fname:
try:
conf_fname = os.path.abspath(conf_fname)
# we open the conf file here to be able to give a reasonable
# error message in case of failure (if we give the filename to
# fileConfig(), it silently ignores unreadable files and gives
# unhelpful error msgs like "No section: 'formatters'"):
with open(conf_fname) as f:
logging.config.fileConfig(f)
configured = True
logger = logging.getLogger(__name__)
logger.debug('using logging configuration read from "{0}"'.format(conf_fname))
warnings.showwarning = _log_warning
return None
except Exception as err: # XXX be more precise
err_msg = str(err)
# if we did not / not successfully load a logging configuration, fallback to this:
logger = logging.getLogger('')
handler = logging.StreamHandler(stream)
if is_serve:
fmt = '$LOG %(levelname)s Remote: %(message)s'
else:
fmt = '%(message)s'
handler.setFormatter(logging.Formatter(fmt))
logger.addHandler(handler)
logger.setLevel(level.upper())
configured = True
logger = logging.getLogger(__name__)
if err_msg:
logger.warning('setup_logging for "{0}" failed with "{1}".'.format(conf_fname, err_msg))
logger.debug('using builtin fallback logging configuration')
warnings.showwarning = _log_warning
return handler
def find_parent_module():
"""find the name of the first module calling this module
if we cannot find it, we return the current module's name
(__name__) instead.
"""
try:
frame = inspect.currentframe().f_back
module = inspect.getmodule(frame)
while module is None or module.__name__ == __name__:
frame = frame.f_back
module = inspect.getmodule(frame)
return module.__name__
except AttributeError:
# somehow we failed to find our module
# return the logger module name by default
return __name__
def create_logger(name=None):
"""lazily create a Logger object with the proper path, which is returned by
find_parent_module() by default, or is provided via the commandline
this is really a shortcut for:
logger = logging.getLogger(__name__)
we use it to avoid errors and provide a more standard API.
We must create the logger lazily, because this is usually called from
module level (and thus executed at import time - BEFORE setup_logging()
was called). By doing it lazily we can do the setup first, we just have to
be careful not to call any logger methods before the setup_logging() call.
If you try, you'll get an exception.
"""
class LazyLogger:
def __init__(self, name=None):
self.__name = name or find_parent_module()
self.__real_logger = None
@property
def __logger(self):
if self.__real_logger is None:
if not configured:
raise Exception("tried to call a logger before setup_logging() was called")
self.__real_logger = logging.getLogger(self.__name)
return self.__real_logger
def setLevel(self, *args, **kw):
return self.__logger.setLevel(*args, **kw)
def log(self, *args, **kw):
return self.__logger.log(*args, **kw)
def exception(self, *args, **kw):
return self.__logger.exception(*args, **kw)
def debug(self, *args, **kw):
return self.__logger.debug(*args, **kw)
def info(self, *args, **kw):
return self.__logger.info(*args, **kw)
def warning(self, *args, **kw):
return self.__logger.warning(*args, **kw)
def error(self, *args, **kw):
return self.__logger.error(*args, **kw)
def critical(self, *args, **kw):
return self.__logger.critical(*args, **kw)
return LazyLogger(name)
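
The lazy-creation idea described in the `create_logger()` docstring can be shown in isolation. This is a reduced sketch, not borg's implementation: the `setup()` function stands in for `setup_logging()`, only `info()` is forwarded, and single-underscore attribute names are used for readability.

```python
import logging

configured = False  # flipped by setup(), mirrors the module-level flag above


class LazyLogger:
    """Defer logging.getLogger() until the first real logging call."""

    def __init__(self, name):
        self._name = name
        self._real = None

    def _logger(self):
        if self._real is None:
            if not configured:
                raise RuntimeError("tried to call a logger before setup() was called")
            self._real = logging.getLogger(self._name)
        return self._real

    def info(self, *args, **kw):
        return self._logger().info(*args, **kw)


def setup():
    """Hypothetical stand-in for setup_logging(): attach a handler, set the flag."""
    global configured
    logging.getLogger('').addHandler(logging.NullHandler())
    configured = True


logger = LazyLogger('demo')   # safe at import time, nothing is looked up yet

try:
    logger.info('too early')
    early_call_failed = False
except RuntimeError:
    early_call_failed = True

setup()
logger.info('now it works')   # first successful call resolves the real logger
```

Creating the wrapper at module level is harmless. Only the first logging call touches the `logging` machinery, so that call is the one that must come after setup.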

41
borg/lrucache.py Normal file

@ -0,0 +1,41 @@
class LRUCache:
def __init__(self, capacity, dispose):
self._cache = {}
self._lru = []
self._capacity = capacity
self._dispose = dispose
def __setitem__(self, key, value):
assert key not in self._cache, (
"Unexpected attempt to replace a cached item,"
" without first deleting the old item.")
self._lru.append(key)
while len(self._lru) > self._capacity:
del self[self._lru[0]]
self._cache[key] = value
def __getitem__(self, key):
value = self._cache[key] # raise KeyError if not found
self._lru.remove(key)
self._lru.append(key)
return value
def __delitem__(self, key):
value = self._cache.pop(key) # raise KeyError if not found
self._dispose(value)
self._lru.remove(key)
def __contains__(self, key):
return key in self._cache
def clear(self):
for value in self._cache.values():
self._dispose(value)
self._cache.clear()
# useful for testing
def items(self):
return self._cache.items()
def __len__(self):
return len(self._cache)
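
A short usage sketch may help to see the eviction order and the role of the `dispose` callback. So that the example runs on its own, it repeats a condensed, re-indented copy of the class above (without `clear()` and `items()`). The `disposed` list and the keys used are made up for illustration.

```python
class LRUCache:
    """Condensed copy of the class above, re-indented for a runnable example."""

    def __init__(self, capacity, dispose):
        self._cache = {}
        self._lru = []
        self._capacity = capacity
        self._dispose = dispose

    def __setitem__(self, key, value):
        assert key not in self._cache
        self._lru.append(key)
        while len(self._lru) > self._capacity:
            del self[self._lru[0]]      # evict the least recently used key
        self._cache[key] = value

    def __getitem__(self, key):
        value = self._cache[key]        # raises KeyError if not found
        self._lru.remove(key)
        self._lru.append(key)           # mark as most recently used
        return value

    def __delitem__(self, key):
        value = self._cache.pop(key)    # raises KeyError if not found
        self._dispose(value)
        self._lru.remove(key)

    def __contains__(self, key):
        return key in self._cache

    def __len__(self):
        return len(self._cache)


disposed = []                           # collects every value handed to dispose()
cache = LRUCache(capacity=2, dispose=disposed.append)
cache['a'] = 1
cache['b'] = 2
_ = cache['a']                          # touching 'a' makes 'b' the oldest entry
cache['c'] = 3                          # over capacity: 'b' is evicted and disposed
```

The `dispose` callback is where a caller would release whatever resource the cached value holds, for example by closing a file descriptor.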

16
borg/platform.py Normal file

@ -0,0 +1,16 @@
import sys
if sys.platform.startswith('linux'): # pragma: linux only
    from .platform_linux import acl_get, acl_set, API_VERSION
elif sys.platform.startswith('freebsd'): # pragma: freebsd only
    from .platform_freebsd import acl_get, acl_set, API_VERSION
elif sys.platform == 'darwin': # pragma: darwin only
    from .platform_darwin import acl_get, acl_set, API_VERSION
else: # pragma: unknown platform only
    API_VERSION = 2
    def acl_get(path, item, st, numeric_owner=False):
        pass
    def acl_set(path, item, numeric_owner=False):
        pass

86
borg/platform_darwin.pyx Normal file

@ -0,0 +1,86 @@
import os
from .helpers import user2uid, group2gid, safe_decode, safe_encode
API_VERSION = 2
cdef extern from "sys/acl.h":
ctypedef struct _acl_t:
pass
ctypedef _acl_t *acl_t
int acl_free(void *obj)
acl_t acl_get_link_np(const char *path, int type)
acl_t acl_set_link_np(const char *path, int type, acl_t acl)
acl_t acl_from_text(const char *buf)
char *acl_to_text(acl_t acl, ssize_t *len_p)
int ACL_TYPE_EXTENDED
def _remove_numeric_id_if_possible(acl):
"""Replace the user/group field with the local uid/gid if possible
"""
entries = []
for entry in safe_decode(acl).split('\n'):
if entry:
fields = entry.split(':')
if fields[0] == 'user':
if user2uid(fields[2]) is not None:
fields[1] = fields[3] = ''
elif fields[0] == 'group':
if group2gid(fields[2]) is not None:
fields[1] = fields[3] = ''
entries.append(':'.join(fields))
return safe_encode('\n'.join(entries))
def _remove_non_numeric_identifier(acl):
"""Remove user and group names from the acl
"""
entries = []
for entry in safe_decode(acl).split('\n'):
if entry:
fields = entry.split(':')
if fields[0] in ('user', 'group'):
fields[2] = ''
entries.append(':'.join(fields))
else:
entries.append(entry)
return safe_encode('\n'.join(entries))
def acl_get(path, item, st, numeric_owner=False):
cdef acl_t acl = NULL
cdef char *text = NULL
try:
acl = acl_get_link_np(<bytes>os.fsencode(path), ACL_TYPE_EXTENDED)
if acl == NULL:
return
text = acl_to_text(acl, NULL)
if text == NULL:
return
if numeric_owner:
item[b'acl_extended'] = _remove_non_numeric_identifier(text)
else:
item[b'acl_extended'] = text
finally:
acl_free(text)
acl_free(acl)
def acl_set(path, item, numeric_owner=False):
cdef acl_t acl = NULL
try:
try:
if numeric_owner:
acl = acl_from_text(item[b'acl_extended'])
else:
acl = acl_from_text(<bytes>_remove_numeric_id_if_possible(item[b'acl_extended']))
except KeyError:
return
if acl == NULL:
return
if acl_set_link_np(<bytes>os.fsencode(path), ACL_TYPE_EXTENDED, acl):
return
finally:
acl_free(acl)

100
borg/platform_freebsd.pyx Normal file

@ -0,0 +1,100 @@
import os
from .helpers import posix_acl_use_stored_uid_gid, safe_encode, safe_decode
API_VERSION = 2
cdef extern from "errno.h":
int errno
int EINVAL
cdef extern from "sys/types.h":
int ACL_TYPE_ACCESS
int ACL_TYPE_DEFAULT
int ACL_TYPE_NFS4
cdef extern from "sys/acl.h":
ctypedef struct _acl_t:
pass
ctypedef _acl_t *acl_t
int acl_free(void *obj)
acl_t acl_get_link_np(const char *path, int type)
acl_t acl_set_link_np(const char *path, int type, acl_t acl)
acl_t acl_from_text(const char *buf)
char *acl_to_text_np(acl_t acl, ssize_t *len, int flags)
int ACL_TEXT_NUMERIC_IDS
int ACL_TEXT_APPEND_ID
cdef extern from "unistd.h":
long lpathconf(const char *path, int name)
int _PC_ACL_NFS4
cdef _get_acl(p, type, item, attribute, int flags):
cdef acl_t acl
cdef char *text
acl = acl_get_link_np(p, type)
if acl:
text = acl_to_text_np(acl, NULL, flags)
if text:
item[attribute] = text
acl_free(text)
acl_free(acl)
def acl_get(path, item, st, numeric_owner=False):
"""Saves ACL Entries
If `numeric_owner` is True, the user/group names are not preserved, only the uid/gid
"""
cdef int flags = ACL_TEXT_APPEND_ID
p = os.fsencode(path)
ret = lpathconf(p, _PC_ACL_NFS4)
if ret < 0 and errno == EINVAL:
return
flags |= ACL_TEXT_NUMERIC_IDS if numeric_owner else 0
if ret > 0:
_get_acl(p, ACL_TYPE_NFS4, item, b'acl_nfs4', flags)
else:
_get_acl(p, ACL_TYPE_ACCESS, item, b'acl_access', flags)
_get_acl(p, ACL_TYPE_DEFAULT, item, b'acl_default', flags)
cdef _set_acl(p, type, item, attribute, numeric_owner=False):
cdef acl_t acl
text = item.get(attribute)
if text:
if numeric_owner and type == ACL_TYPE_NFS4:
text = _nfs4_use_stored_uid_gid(text)
elif numeric_owner and type in(ACL_TYPE_ACCESS, ACL_TYPE_DEFAULT):
text = posix_acl_use_stored_uid_gid(text)
acl = acl_from_text(<bytes>text)
if acl:
acl_set_link_np(p, type, acl)
acl_free(acl)
cdef _nfs4_use_stored_uid_gid(acl):
"""Replace the user/group field with the stored uid/gid
"""
entries = []
for entry in safe_decode(acl).split('\n'):
if entry:
if entry.startswith('user:') or entry.startswith('group:'):
fields = entry.split(':')
entries.append(':'.join([fields[0], fields[5]] + fields[2:-1]))  # str.join() takes a single iterable
else:
entries.append(entry)
return safe_encode('\n'.join(entries))
def acl_set(path, item, numeric_owner=False):
"""Restore ACL Entries
If `numeric_owner` is True the stored uid/gid is used instead
of the user/group names
"""
p = os.fsencode(path)
_set_acl(p, ACL_TYPE_NFS4, item, b'acl_nfs4', numeric_owner)
_set_acl(p, ACL_TYPE_ACCESS, item, b'acl_access', numeric_owner)
_set_acl(p, ACL_TYPE_DEFAULT, item, b'acl_default', numeric_owner)

143
borg/platform_linux.pyx Normal file

@ -0,0 +1,143 @@
import os
import re
from stat import S_ISLNK
from .helpers import posix_acl_use_stored_uid_gid, user2uid, group2gid, safe_decode, safe_encode
API_VERSION = 2
cdef extern from "sys/types.h":
int ACL_TYPE_ACCESS
int ACL_TYPE_DEFAULT
cdef extern from "sys/acl.h":
ctypedef struct _acl_t:
pass
ctypedef _acl_t *acl_t
int acl_free(void *obj)
acl_t acl_get_file(const char *path, int type)
acl_t acl_set_file(const char *path, int type, acl_t acl)
acl_t acl_from_text(const char *buf)
char *acl_to_text(acl_t acl, ssize_t *len)
cdef extern from "acl/libacl.h":
int acl_extended_file(const char *path)
_comment_re = re.compile(' *#.*', re.M)
def acl_use_local_uid_gid(acl):
"""Replace the user/group field with the local uid/gid if possible
"""
entries = []
for entry in safe_decode(acl).split('\n'):
if entry:
fields = entry.split(':')
if fields[0] == 'user' and fields[1]:
fields[1] = str(user2uid(fields[1], fields[3]))
elif fields[0] == 'group' and fields[1]:
fields[1] = str(group2gid(fields[1], fields[3]))
entries.append(':'.join(fields[:3]))
return safe_encode('\n'.join(entries))
cdef acl_append_numeric_ids(acl):
"""Extend the "POSIX 1003.1e draft standard 17" format with an additional uid/gid field
"""
entries = []
for entry in _comment_re.sub('', safe_decode(acl)).split('\n'):
if entry:
type, name, permission = entry.split(':')
if name and type == 'user':
entries.append(':'.join([type, name, permission, str(user2uid(name, name))]))
elif name and type == 'group':
entries.append(':'.join([type, name, permission, str(group2gid(name, name))]))
else:
entries.append(entry)
return safe_encode('\n'.join(entries))
cdef acl_numeric_ids(acl):
"""Replace the "POSIX 1003.1e draft standard 17" user/group field with uid/gid
"""
entries = []
for entry in _comment_re.sub('', safe_decode(acl)).split('\n'):
if entry:
type, name, permission = entry.split(':')
if name and type == 'user':
uid = str(user2uid(name, name))
entries.append(':'.join([type, uid, permission, uid]))
elif name and type == 'group':
gid = str(group2gid(name, name))
entries.append(':'.join([type, gid, permission, gid]))
else:
entries.append(entry)
return safe_encode('\n'.join(entries))
def acl_get(path, item, st, numeric_owner=False):
"""Saves ACL Entries
If `numeric_owner` is True, the user/group names are not preserved, only the uid/gid
"""
cdef acl_t default_acl = NULL
cdef acl_t access_acl = NULL
cdef char *default_text = NULL
cdef char *access_text = NULL
p = <bytes>os.fsencode(path)
if S_ISLNK(st.st_mode) or acl_extended_file(p) <= 0:
return
if numeric_owner:
converter = acl_numeric_ids
else:
converter = acl_append_numeric_ids
try:
access_acl = acl_get_file(p, ACL_TYPE_ACCESS)
if access_acl:
access_text = acl_to_text(access_acl, NULL)
if access_text:
item[b'acl_access'] = converter(access_text)
default_acl = acl_get_file(p, ACL_TYPE_DEFAULT)
if default_acl:
default_text = acl_to_text(default_acl, NULL)
if default_text:
item[b'acl_default'] = converter(default_text)
finally:
acl_free(default_text)
acl_free(default_acl)
acl_free(access_text)
acl_free(access_acl)
def acl_set(path, item, numeric_owner=False):
"""Restore ACL Entries
If `numeric_owner` is True the stored uid/gid is used instead
of the user/group names
"""
cdef acl_t access_acl = NULL
cdef acl_t default_acl = NULL
p = <bytes>os.fsencode(path)
if numeric_owner:
converter = posix_acl_use_stored_uid_gid
else:
converter = acl_use_local_uid_gid
access_text = item.get(b'acl_access')
default_text = item.get(b'acl_default')
if access_text:
try:
access_acl = acl_from_text(<bytes>converter(access_text))
if access_acl:
acl_set_file(p, ACL_TYPE_ACCESS, access_acl)
finally:
acl_free(access_acl)
if default_text:
try:
default_acl = acl_from_text(<bytes>converter(default_text))
if default_acl:
acl_set_file(p, ACL_TYPE_DEFAULT, default_acl)
finally:
acl_free(default_acl)
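
The text transformation done by `acl_append_numeric_ids()` can be tried out without libacl. The sketch below is plain Python rather than Cython and works on `str` instead of bytes. The lookup dictionaries `uid_of` and `gid_of` are hypothetical stand-ins for the `user2uid()` and `group2gid()` helpers, and the sample ACL text is made up.

```python
import re

_comment_re = re.compile(' *#.*', re.M)   # same pattern as in the module above


def append_numeric_ids(acl_text, uid_of, gid_of):
    """Append a fourth uid/gid field to named user/group ACL entries."""
    entries = []
    for entry in _comment_re.sub('', acl_text).split('\n'):
        if entry:
            kind, name, permission = entry.split(':')
            if name and kind == 'user':
                entries.append(':'.join([kind, name, permission, str(uid_of[name])]))
            elif name and kind == 'group':
                entries.append(':'.join([kind, name, permission, str(gid_of[name])]))
            else:
                # owner/group/other/mask entries have no name and stay unchanged
                entries.append(entry)
    return '\n'.join(entries)


sample = (
    'user::rw-\n'
    'user:alice:r--  # example comment\n'
    'group::r--\n'
    'group:staff:r-x\n'
    'other::---\n'
)
converted = append_numeric_ids(sample, uid_of={'alice': 1000}, gid_of={'staff': 50})
```

Storing both the name and the numeric id lets `acl_set()` decide at restore time whether to map names to local ids or to reuse the stored ids (`numeric_owner=True`).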

453
borg/remote.py Normal file

@ -0,0 +1,453 @@
import errno
import fcntl
import logging
import os
import select
import shlex
from subprocess import Popen, PIPE
import sys
import tempfile
from . import __version__
from .helpers import Error, IntegrityError, sysinfo
from .repository import Repository
import msgpack
RPC_PROTOCOL_VERSION = 2
BUFSIZE = 10 * 1024 * 1024
class ConnectionClosed(Error):
"""Connection closed by remote host"""
class ConnectionClosedWithHint(ConnectionClosed):
"""Connection closed by remote host. {}"""
class PathNotAllowed(Error):
"""Repository path not allowed"""
class InvalidRPCMethod(Error):
"""RPC method {} is not valid"""
class RepositoryServer: # pragma: no cover
rpc_methods = (
'__len__',
'check',
'commit',
'delete',
'destroy',
'get',
'list',
'negotiate',
'open',
'put',
'rollback',
'save_key',
'load_key',
'break_lock',
)
def __init__(self, restrict_to_paths):
self.repository = None
self.restrict_to_paths = restrict_to_paths
def serve(self):
stdin_fd = sys.stdin.fileno()
stdout_fd = sys.stdout.fileno()
stderr_fd = sys.stderr.fileno()
# Make stdin non-blocking
fl = fcntl.fcntl(stdin_fd, fcntl.F_GETFL)
fcntl.fcntl(stdin_fd, fcntl.F_SETFL, fl | os.O_NONBLOCK)
# Make stdout blocking
fl = fcntl.fcntl(stdout_fd, fcntl.F_GETFL)
fcntl.fcntl(stdout_fd, fcntl.F_SETFL, fl & ~os.O_NONBLOCK)
# Make stderr blocking
fl = fcntl.fcntl(stderr_fd, fcntl.F_GETFL)
fcntl.fcntl(stderr_fd, fcntl.F_SETFL, fl & ~os.O_NONBLOCK)
unpacker = msgpack.Unpacker(use_list=False)
while True:
r, w, es = select.select([stdin_fd], [], [], 10)
if r:
data = os.read(stdin_fd, BUFSIZE)
if not data:
self.repository.close()
return
unpacker.feed(data)
for unpacked in unpacker:
if not (isinstance(unpacked, tuple) and len(unpacked) == 4):
self.repository.close()
raise Exception("Unexpected RPC data format.")
type, msgid, method, args = unpacked
method = method.decode('ascii')
try:
if method not in self.rpc_methods:
raise InvalidRPCMethod(method)
try:
f = getattr(self, method)
except AttributeError:
f = getattr(self.repository, method)
res = f(*args)
except BaseException as e:
# These exceptions are reconstructed on the client end in RemoteRepository.call_many(),
# and will be handled just like locally raised exceptions. Suppress the remote traceback
# for these, except ErrorWithTraceback, which should always display a traceback.
if not isinstance(e, (Repository.DoesNotExist, Repository.AlreadyExists, PathNotAllowed)):
logging.exception('Borg %s: exception in RPC call:', __version__)
logging.error(sysinfo())
exc = "Remote Exception (see remote log for the traceback)"
os.write(stdout_fd, msgpack.packb((1, msgid, e.__class__.__name__, exc)))
else:
os.write(stdout_fd, msgpack.packb((1, msgid, None, res)))
if es:
self.repository.close()
return
def negotiate(self, versions):
return RPC_PROTOCOL_VERSION
def open(self, path, create=False, lock_wait=None, lock=True):
path = os.fsdecode(path)
if path.startswith('/~'):
path = path[1:]
path = os.path.realpath(os.path.expanduser(path))
if self.restrict_to_paths:
for restrict_to_path in self.restrict_to_paths:
if path.startswith(os.path.realpath(restrict_to_path)):
break
else:
raise PathNotAllowed(path)
self.repository = Repository(path, create, lock_wait=lock_wait, lock=lock)
self.repository.__enter__() # clean exit handled by serve() method
return self.repository.id
class RemoteRepository:
extra_test_args = []
class RPCError(Exception):
def __init__(self, name):
self.name = name
def __init__(self, location, create=False, lock_wait=None, lock=True, args=None):
self.location = self._location = location
self.preload_ids = []
self.msgid = 0
self.to_send = b''
self.cache = {}
self.ignore_responses = set()
self.responses = {}
self.unpacker = msgpack.Unpacker(use_list=False)
self.p = None
testing = location.host == '__testsuite__'
borg_cmd = self.borg_cmd(args, testing)
env = dict(os.environ)
if not testing:
borg_cmd = self.ssh_cmd(location) + borg_cmd
# pyinstaller binary adds LD_LIBRARY_PATH=/tmp/_ME... but we do not want
# the system's ssh binary to pick up (non-matching) libraries from there
env.pop('LD_LIBRARY_PATH', None)
self.p = Popen(borg_cmd, bufsize=0, stdin=PIPE, stdout=PIPE, stderr=PIPE, env=env)
self.stdin_fd = self.p.stdin.fileno()
self.stdout_fd = self.p.stdout.fileno()
self.stderr_fd = self.p.stderr.fileno()
fcntl.fcntl(self.stdin_fd, fcntl.F_SETFL, fcntl.fcntl(self.stdin_fd, fcntl.F_GETFL) | os.O_NONBLOCK)
fcntl.fcntl(self.stdout_fd, fcntl.F_SETFL, fcntl.fcntl(self.stdout_fd, fcntl.F_GETFL) | os.O_NONBLOCK)
fcntl.fcntl(self.stderr_fd, fcntl.F_SETFL, fcntl.fcntl(self.stderr_fd, fcntl.F_GETFL) | os.O_NONBLOCK)
self.r_fds = [self.stdout_fd, self.stderr_fd]
self.x_fds = [self.stdin_fd, self.stdout_fd, self.stderr_fd]
try:
version = self.call('negotiate', RPC_PROTOCOL_VERSION)
except ConnectionClosed:
raise ConnectionClosedWithHint('Is borg working on the server?') from None
if version != RPC_PROTOCOL_VERSION:
raise Exception('Server insisted on using unsupported protocol version %d' % version)
try:
self.id = self.call('open', self.location.path, create, lock_wait, lock)
except Exception:
self.close()
raise
def __del__(self):
if self.p:
self.close()
assert False, "cleanup happened in RemoteRepository.__del__"
def __repr__(self):
return '<%s %s>' % (self.__class__.__name__, self.location.canonical_path())
def __enter__(self):
return self
def __exit__(self, exc_type, exc_val, exc_tb):
if exc_type is not None:
self.rollback()
self.close()
def borg_cmd(self, args, testing):
"""return a borg serve command line"""
# give some args/options to "borg serve" process as they were given to us
opts = []
if args is not None:
opts.append('--umask=%03o' % args.umask)
root_logger = logging.getLogger()
if root_logger.isEnabledFor(logging.DEBUG):
opts.append('--debug')
elif root_logger.isEnabledFor(logging.INFO):
opts.append('--info')
elif root_logger.isEnabledFor(logging.WARNING):
pass # warning is default
elif root_logger.isEnabledFor(logging.ERROR):
opts.append('--error')
elif root_logger.isEnabledFor(logging.CRITICAL):
opts.append('--critical')
else:
raise ValueError('log level missing, fix this code')
if testing:
return [sys.executable, '-m', 'borg.archiver', 'serve'] + opts + self.extra_test_args
else: # pragma: no cover
return [args.remote_path, 'serve'] + opts
def ssh_cmd(self, location):
"""return an ssh command line that can be prefixed to a borg command line"""
args = shlex.split(os.environ.get('BORG_RSH', 'ssh'))
if location.port:
args += ['-p', str(location.port)]
if location.user:
args.append('%s@%s' % (location.user, location.host))
else:
args.append('%s' % location.host)
return args
def call(self, cmd, *args, **kw):
for resp in self.call_many(cmd, [args], **kw):
return resp
def call_many(self, cmd, calls, wait=True, is_preloaded=False):
if not calls:
return
def fetch_from_cache(args):
msgid = self.cache[args].pop(0)
if not self.cache[args]:
del self.cache[args]
return msgid
calls = list(calls)
waiting_for = []
w_fds = [self.stdin_fd]
while wait or calls:
while waiting_for:
try:
error, res = self.responses.pop(waiting_for[0])
waiting_for.pop(0)
if error:
if error == b'DoesNotExist':
raise Repository.DoesNotExist(self.location.orig)
elif error == b'AlreadyExists':
raise Repository.AlreadyExists(self.location.orig)
elif error == b'CheckNeeded':
raise Repository.CheckNeeded(self.location.orig)
elif error == b'IntegrityError':
raise IntegrityError(res)
elif error == b'PathNotAllowed':
raise PathNotAllowed(*res)
elif error == b'ObjectNotFound':
raise Repository.ObjectNotFound(res[0], self.location.orig)
elif error == b'InvalidRPCMethod':
raise InvalidRPCMethod(*res)
else:
raise self.RPCError(res.decode('utf-8'))
else:
yield res
if not waiting_for and not calls:
return
except KeyError:
break
r, w, x = select.select(self.r_fds, w_fds, self.x_fds, 1)
if x:
raise Exception('FD exception occurred')
for fd in r:
if fd is self.stdout_fd:
data = os.read(fd, BUFSIZE)
if not data:
raise ConnectionClosed()
self.unpacker.feed(data)
for unpacked in self.unpacker:
if not (isinstance(unpacked, tuple) and len(unpacked) == 4):
raise Exception("Unexpected RPC data format.")
type, msgid, error, res = unpacked
if msgid in self.ignore_responses:
self.ignore_responses.remove(msgid)
else:
self.responses[msgid] = error, res
elif fd is self.stderr_fd:
data = os.read(fd, 32768)
if not data:
raise ConnectionClosed()
data = data.decode('utf-8')
for line in data.splitlines(keepends=True):
if line.startswith('$LOG '):
_, level, msg = line.split(' ', 2)
level = getattr(logging, level, logging.CRITICAL) # str -> int
logging.log(level, msg.rstrip())
else:
sys.stderr.write("Remote: " + line)
if w:
while not self.to_send and (calls or self.preload_ids) and len(waiting_for) < 100:
if calls:
if is_preloaded:
if calls[0] in self.cache:
waiting_for.append(fetch_from_cache(calls.pop(0)))
else:
args = calls.pop(0)
if cmd == 'get' and args in self.cache:
waiting_for.append(fetch_from_cache(args))
else:
self.msgid += 1
waiting_for.append(self.msgid)
self.to_send = msgpack.packb((1, self.msgid, cmd, args))
if not self.to_send and self.preload_ids:
args = (self.preload_ids.pop(0),)
self.msgid += 1
self.cache.setdefault(args, []).append(self.msgid)
self.to_send = msgpack.packb((1, self.msgid, cmd, args))
if self.to_send:
try:
self.to_send = self.to_send[os.write(self.stdin_fd, self.to_send):]
except OSError as e:
# os.write might raise EAGAIN even though select indicates
# that the fd should be writable
if e.errno != errno.EAGAIN:
raise
if not self.to_send and not (calls or self.preload_ids):
w_fds = []
self.ignore_responses |= set(waiting_for)
def check(self, repair=False, save_space=False):
return self.call('check', repair, save_space)
def commit(self, save_space=False):
return self.call('commit', save_space)
def rollback(self, *args):
return self.call('rollback')
def destroy(self):
return self.call('destroy')
def __len__(self):
return self.call('__len__')
def list(self, limit=None, marker=None):
return self.call('list', limit, marker)
def get(self, id_):
for resp in self.get_many([id_]):
return resp
def get_many(self, ids, is_preloaded=False):
for resp in self.call_many('get', [(id_,) for id_ in ids], is_preloaded=is_preloaded):
yield resp
def put(self, id_, data, wait=True):
return self.call('put', id_, data, wait=wait)
def delete(self, id_, wait=True):
return self.call('delete', id_, wait=wait)
def save_key(self, keydata):
return self.call('save_key', keydata)
def load_key(self):
return self.call('load_key')
def break_lock(self):
return self.call('break_lock')
def close(self):
if self.p:
self.p.stdin.close()
self.p.stdout.close()
self.p.wait()
self.p = None
def preload(self, ids):
self.preload_ids += ids
class RepositoryNoCache:
"""A non-caching Repository wrapper, passes through to repository.
Just to have same API (including the context manager) as RepositoryCache.
"""
def __init__(self, repository):
self.repository = repository
def close(self):
pass
def __enter__(self):
return self
def __exit__(self, exc_type, exc_val, exc_tb):
self.close()
def get(self, key):
return next(self.get_many([key]))
def get_many(self, keys):
for data in self.repository.get_many(keys):
yield data
class RepositoryCache(RepositoryNoCache):
"""A caching Repository wrapper
Caches Repository GET operations using a local temporary Repository.
"""
# maximum object size that will be cached, 64 kiB.
THRESHOLD = 2**16
def __init__(self, repository):
super().__init__(repository)
tmppath = tempfile.mkdtemp(prefix='borg-tmp')
self.caching_repo = Repository(tmppath, create=True, exclusive=True)
self.caching_repo.__enter__() # handled by context manager in base class
def close(self):
if self.caching_repo is not None:
self.caching_repo.destroy()
self.caching_repo = None
def get_many(self, keys):
unknown_keys = [key for key in keys if key not in self.caching_repo]
repository_iterator = zip(unknown_keys, self.repository.get_many(unknown_keys))
for key in keys:
try:
yield self.caching_repo.get(key)
except Repository.ObjectNotFound:
for key_, data in repository_iterator:
if key_ == key:
if len(data) <= self.THRESHOLD:
self.caching_repo.put(key, data)
yield data
break
# Consume any pending requests
for _ in repository_iterator:
pass
def cache_if_remote(repository):
if isinstance(repository, RemoteRepository):
return RepositoryCache(repository)
else:
return RepositoryNoCache(repository)
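
The way RemoteRepository.ssh_cmd() above assembles the remote shell command can be tried in isolation. The following is only a minimal sketch: `build_ssh_cmd`, its plain `host`/`user`/`port` arguments (instead of a Location object), the `environ` parameter, the host name and the key path are all assumptions made for illustration and are not part of borg.

```python
import shlex


def build_ssh_cmd(host, user=None, port=None, environ=None):
    """Hypothetical standalone variant of RemoteRepository.ssh_cmd()."""
    environ = environ if environ is not None else {}
    # BORG_RSH may contain extra options, so it is split like a shell command line
    args = shlex.split(environ.get('BORG_RSH', 'ssh'))
    if port:
        args += ['-p', str(port)]
    if user:
        args.append('%s@%s' % (user, host))
    else:
        args.append('%s' % host)
    return args


# illustrative values only
cmd = build_ssh_cmd('backup.example.org', user='borg', port=2222,
                    environ={'BORG_RSH': 'ssh -i /path/to/key'})
print(cmd)
```

The resulting list is what gets prefixed to the "borg serve" command line before it is passed to Popen; using a list rather than a single string means no further shell quoting is needed.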

739
borg/repository.py Normal file

@ -0,0 +1,739 @@
from configparser import ConfigParser
from binascii import hexlify, unhexlify
from datetime import datetime
from itertools import islice
import errno
import logging
logger = logging.getLogger(__name__)
import os
import shutil
import struct
from zlib import crc32
import msgpack
from .helpers import Error, ErrorWithTraceback, IntegrityError, Location, ProgressIndicatorPercent
from .hashindex import NSIndex
from .locking import UpgradableLock, LockError, LockErrorT
from .lrucache import LRUCache
MAX_OBJECT_SIZE = 20 * 1024 * 1024
MAGIC = b'BORG_SEG'
MAGIC_LEN = len(MAGIC)
TAG_PUT = 0
TAG_DELETE = 1
TAG_COMMIT = 2
class Repository:
"""Filesystem based transactional key value store
On disk layout:
dir/README
dir/config
dir/data/<X / SEGMENTS_PER_DIR>/<X>
dir/index.X
dir/hints.X
"""
DEFAULT_MAX_SEGMENT_SIZE = 5 * 1024 * 1024
DEFAULT_SEGMENTS_PER_DIR = 10000
class DoesNotExist(Error):
"""Repository {} does not exist."""
class AlreadyExists(Error):
"""Repository {} already exists."""
class InvalidRepository(Error):
"""{} is not a valid repository. Check repo config."""
class CheckNeeded(ErrorWithTraceback):
"""Inconsistency detected. Please run "borg check {}"."""
class ObjectNotFound(ErrorWithTraceback):
"""Object with key {} not found in repository {}."""
def __init__(self, path, create=False, exclusive=False, lock_wait=None, lock=True):
self.path = os.path.abspath(path)
self._location = Location('file://%s' % self.path)
self.io = None
self.lock = None
self.index = None
self._active_txn = False
self.lock_wait = lock_wait
self.do_lock = lock
self.do_create = create
self.exclusive = exclusive
def __del__(self):
if self.lock:
self.close()
assert False, "cleanup happened in Repository.__del__"
def __repr__(self):
return '<%s %s>' % (self.__class__.__name__, self.path)
def __enter__(self):
if self.do_create:
self.do_create = False
self.create(self.path)
self.open(self.path, self.exclusive, lock_wait=self.lock_wait, lock=self.do_lock)
return self
def __exit__(self, exc_type, exc_val, exc_tb):
if exc_type is not None:
self.rollback()
self.close()
def create(self, path):
"""Create a new empty repository at `path`
"""
if os.path.exists(path) and (not os.path.isdir(path) or os.listdir(path)):
raise self.AlreadyExists(path)
if not os.path.exists(path):
os.mkdir(path)
with open(os.path.join(path, 'README'), 'w') as fd:
fd.write('This is a Borg repository\n')
os.mkdir(os.path.join(path, 'data'))
config = ConfigParser(interpolation=None)
config.add_section('repository')
config.set('repository', 'version', '1')
config.set('repository', 'segments_per_dir', str(self.DEFAULT_SEGMENTS_PER_DIR))
config.set('repository', 'max_segment_size', str(self.DEFAULT_MAX_SEGMENT_SIZE))
config.set('repository', 'append_only', '0')
config.set('repository', 'id', hexlify(os.urandom(32)).decode('ascii'))
self.save_config(path, config)
def save_config(self, path, config):
config_path = os.path.join(path, 'config')
with open(config_path, 'w') as fd:
config.write(fd)
def save_key(self, keydata):
assert self.config
keydata = keydata.decode('utf-8') # remote repo: msgpack issue #99, getting bytes
self.config.set('repository', 'key', keydata)
self.save_config(self.path, self.config)
def load_key(self):
keydata = self.config.get('repository', 'key')
return keydata.encode('utf-8') # remote repo: msgpack issue #99, returning bytes
def destroy(self):
"""Destroy the repository at `self.path`
"""
if self.append_only:
raise ValueError(self.path + " is in append-only mode")
self.close()
os.remove(os.path.join(self.path, 'config')) # kill config first
shutil.rmtree(self.path)
def get_index_transaction_id(self):
indices = sorted((int(name[6:]) for name in os.listdir(self.path) if name.startswith('index.') and name[6:].isdigit()))
if indices:
return indices[-1]
else:
return None
def get_transaction_id(self):
index_transaction_id = self.get_index_transaction_id()
segments_transaction_id = self.io.get_segments_transaction_id()
if index_transaction_id is not None and segments_transaction_id is None:
raise self.CheckNeeded(self.path)
# Attempt to automatically rebuild index if we crashed between commit
# tag write and index save
if index_transaction_id != segments_transaction_id:
if index_transaction_id is not None and index_transaction_id > segments_transaction_id:
replay_from = None
else:
replay_from = index_transaction_id
self.replay_segments(replay_from, segments_transaction_id)
return self.get_index_transaction_id()
def break_lock(self):
UpgradableLock(os.path.join(self.path, 'lock')).break_lock()
def open(self, path, exclusive, lock_wait=None, lock=True):
self.path = path
if not os.path.isdir(path):
raise self.DoesNotExist(path)
if lock:
self.lock = UpgradableLock(os.path.join(path, 'lock'), exclusive, timeout=lock_wait).acquire()
else:
self.lock = None
self.config = ConfigParser(interpolation=None)
self.config.read(os.path.join(self.path, 'config'))
if 'repository' not in self.config.sections() or self.config.getint('repository', 'version') != 1:
raise self.InvalidRepository(path)
self.max_segment_size = self.config.getint('repository', 'max_segment_size')
self.segments_per_dir = self.config.getint('repository', 'segments_per_dir')
self.append_only = self.config.getboolean('repository', 'append_only', fallback=False)
self.id = unhexlify(self.config.get('repository', 'id').strip())
self.io = LoggedIO(self.path, self.max_segment_size, self.segments_per_dir)
def close(self):
if self.lock:
if self.io:
self.io.close()
self.io = None
self.lock.release()
self.lock = None
def commit(self, save_space=False):
"""Commit transaction
"""
self.io.write_commit()
if not self.append_only:
self.compact_segments(save_space=save_space)
self.write_index()
self.rollback()
def open_index(self, transaction_id):
if transaction_id is None:
return NSIndex()
return NSIndex.read((os.path.join(self.path, 'index.%d') % transaction_id).encode('utf-8'))
def prepare_txn(self, transaction_id, do_cleanup=True):
self._active_txn = True
try:
self.lock.upgrade()
except (LockError, LockErrorT):
# if upgrading the lock to exclusive fails, we do not have an
# active transaction. this is important for "serve" mode, where
# the repository instance lives on - even if exceptions happened.
self._active_txn = False
raise
if not self.index or transaction_id is None:
self.index = self.open_index(transaction_id)
if transaction_id is None:
self.segments = {} # XXX bad name: usage_count_of_segment_x = self.segments[x]
self.compact = set() # XXX bad name: segments_needing_compaction = self.compact
else:
if do_cleanup:
self.io.cleanup(transaction_id)
with open(os.path.join(self.path, 'hints.%d' % transaction_id), 'rb') as fd:
hints = msgpack.unpack(fd)
if hints[b'version'] != 1:
raise ValueError('Unknown hints file version: %d' % hints[b'version'])
self.segments = hints[b'segments']
self.compact = set(hints[b'compact'])
def write_index(self):
hints = {b'version': 1,
b'segments': self.segments,
b'compact': list(self.compact)}
transaction_id = self.io.get_segments_transaction_id()
hints_file = os.path.join(self.path, 'hints.%d' % transaction_id)
with open(hints_file + '.tmp', 'wb') as fd:
msgpack.pack(hints, fd)
fd.flush()
os.fsync(fd.fileno())
os.rename(hints_file + '.tmp', hints_file)
self.index.write(os.path.join(self.path, 'index.tmp'))
os.rename(os.path.join(self.path, 'index.tmp'),
os.path.join(self.path, 'index.%d' % transaction_id))
if self.append_only:
with open(os.path.join(self.path, 'transactions'), 'a') as log:
print('transaction %d, UTC time %s' % (transaction_id, datetime.utcnow().isoformat()), file=log)
# Remove old indices
current = '.%d' % transaction_id
for name in os.listdir(self.path):
if not name.startswith('index.') and not name.startswith('hints.'):
continue
if name.endswith(current):
continue
os.unlink(os.path.join(self.path, name))
self.index = None
def compact_segments(self, save_space=False):
"""Compact sparse segments by copying data into new segments
"""
if not self.compact:
return
index_transaction_id = self.get_index_transaction_id()
segments = self.segments
unused = [] # list of segments that are not used anymore
def complete_xfer():
# complete the transfer (usually exactly when some target segment
# is full, or at the very end when everything is processed)
nonlocal unused
# commit the new, compact, used segments
self.io.write_commit()
# get rid of the old, sparse, unused segments. free space.
for segment in unused:
assert self.segments.pop(segment) == 0
self.io.delete_segment(segment)
unused = []
for segment in sorted(self.compact):
if self.io.segment_exists(segment):
for tag, key, offset, data in self.io.iter_objects(segment, include_data=True):
if tag == TAG_PUT and self.index.get(key, (-1, -1)) == (segment, offset):
try:
new_segment, offset = self.io.write_put(key, data, raise_full=save_space)
except LoggedIO.SegmentFull:
complete_xfer()
new_segment, offset = self.io.write_put(key, data)
self.index[key] = new_segment, offset
segments.setdefault(new_segment, 0)
segments[new_segment] += 1
segments[segment] -= 1
elif tag == TAG_DELETE:
if index_transaction_id is None or segment > index_transaction_id:
try:
self.io.write_delete(key, raise_full=save_space)
except LoggedIO.SegmentFull:
complete_xfer()
self.io.write_delete(key)
assert segments[segment] == 0
unused.append(segment)
complete_xfer()
self.compact = set()
def replay_segments(self, index_transaction_id, segments_transaction_id):
self.prepare_txn(index_transaction_id, do_cleanup=False)
try:
segment_count = sum(1 for _ in self.io.segment_iterator())
pi = ProgressIndicatorPercent(total=segment_count, msg="Replaying segments %3.0f%%", same_line=True)
for i, (segment, filename) in enumerate(self.io.segment_iterator()):
pi.show(i)
if index_transaction_id is not None and segment <= index_transaction_id:
continue
if segment > segments_transaction_id:
break
objects = self.io.iter_objects(segment)
self._update_index(segment, objects)
pi.finish()
self.write_index()
finally:
self.rollback()
def _update_index(self, segment, objects, report=None):
"""some code shared between replay_segments and check"""
self.segments[segment] = 0
for tag, key, offset in objects:
if tag == TAG_PUT:
try:
s, _ = self.index[key]
self.compact.add(s)
self.segments[s] -= 1
except KeyError:
pass
self.index[key] = segment, offset
self.segments[segment] += 1
elif tag == TAG_DELETE:
try:
s, _ = self.index.pop(key)
self.segments[s] -= 1
self.compact.add(s)
except KeyError:
pass
self.compact.add(segment)
elif tag == TAG_COMMIT:
continue
else:
msg = 'Unexpected tag {} in segment {}'.format(tag, segment)
if report is None:
raise self.CheckNeeded(msg)
else:
report(msg)
if self.segments[segment] == 0:
self.compact.add(segment)
def check(self, repair=False, save_space=False):
"""Check repository consistency
This method verifies all segment checksums and makes sure
the index is consistent with the data stored in the segments.
"""
if self.append_only and repair:
raise ValueError(self.path + " is in append-only mode")
error_found = False
def report_error(msg):
nonlocal error_found
error_found = True
logger.error(msg)
logger.info('Starting repository check')
assert not self._active_txn
try:
transaction_id = self.get_transaction_id()
current_index = self.open_index(transaction_id)
except Exception:
transaction_id = self.io.get_segments_transaction_id()
current_index = None
if transaction_id is None:
transaction_id = self.get_index_transaction_id()
if transaction_id is None:
transaction_id = self.io.get_latest_segment()
if repair:
self.io.cleanup(transaction_id)
segments_transaction_id = self.io.get_segments_transaction_id()
self.prepare_txn(None) # self.index, self.compact, self.segments all empty now!
segment_count = sum(1 for _ in self.io.segment_iterator())
pi = ProgressIndicatorPercent(total=segment_count, msg="Checking segments %3.1f%%", step=0.1, same_line=True)
for i, (segment, filename) in enumerate(self.io.segment_iterator()):
pi.show(i)
if segment > transaction_id:
continue
try:
objects = list(self.io.iter_objects(segment))
except IntegrityError as err:
report_error(str(err))
objects = []
if repair:
self.io.recover_segment(segment, filename)
objects = list(self.io.iter_objects(segment))
self._update_index(segment, objects, report_error)
pi.finish()
# self.index, self.segments, self.compact now reflect the state of the segment files up to <transaction_id>
# We might need to add a commit tag if no committed segment is found
if repair and segments_transaction_id is None:
report_error('Adding commit tag to segment {}'.format(transaction_id))
self.io.segment = transaction_id + 1
self.io.write_commit()
if current_index and not repair:
# current_index = "as found on disk"
# self.index = "as rebuilt in-memory from segments"
if len(current_index) != len(self.index):
report_error('Index object count mismatch. {} != {}'.format(len(current_index), len(self.index)))
elif current_index:
for key, value in self.index.iteritems():
if current_index.get(key, (-1, -1)) != value:
report_error('Index mismatch for key {}. {} != {}'.format(key, value, current_index.get(key, (-1, -1))))
if repair:
self.compact_segments(save_space=save_space)
self.write_index()
self.rollback()
if error_found:
if repair:
logger.info('Completed repository check, errors found and repaired.')
else:
logger.error('Completed repository check, errors found.')
else:
logger.info('Completed repository check, no problems found.')
return not error_found or repair
def rollback(self):
"""Drop the in-memory index and end the active transaction
"""
self.index = None
self._active_txn = False
def __len__(self):
if not self.index:
self.index = self.open_index(self.get_transaction_id())
return len(self.index)
def __contains__(self, id):
if not self.index:
self.index = self.open_index(self.get_transaction_id())
return id in self.index
def list(self, limit=None, marker=None):
if not self.index:
self.index = self.open_index(self.get_transaction_id())
return [id_ for id_, _ in islice(self.index.iteritems(marker=marker), limit)]
def get(self, id_):
if not self.index:
self.index = self.open_index(self.get_transaction_id())
try:
segment, offset = self.index[id_]
return self.io.read(segment, offset, id_)
except KeyError:
raise self.ObjectNotFound(id_, self.path) from None
def get_many(self, ids, is_preloaded=False):
for id_ in ids:
yield self.get(id_)
def put(self, id, data, wait=True):
if not self._active_txn:
self.prepare_txn(self.get_transaction_id())
try:
segment, _ = self.index[id]
self.segments[segment] -= 1
self.compact.add(segment)
segment = self.io.write_delete(id)
self.segments.setdefault(segment, 0)
self.compact.add(segment)
except KeyError:
pass
segment, offset = self.io.write_put(id, data)
self.segments.setdefault(segment, 0)
self.segments[segment] += 1
self.index[id] = segment, offset
def delete(self, id, wait=True):
if not self._active_txn:
self.prepare_txn(self.get_transaction_id())
try:
segment, offset = self.index.pop(id)
except KeyError:
raise self.ObjectNotFound(id, self.path) from None
self.segments[segment] -= 1
self.compact.add(segment)
segment = self.io.write_delete(id)
self.compact.add(segment)
self.segments.setdefault(segment, 0)
def preload(self, ids):
"""Preload objects (only applies to remote repositories)
"""
class LoggedIO:
class SegmentFull(Exception):
"""raised when a segment is full, before opening next"""
header_fmt = struct.Struct('<IIB')
assert header_fmt.size == 9
put_header_fmt = struct.Struct('<IIB32s')
assert put_header_fmt.size == 41
header_no_crc_fmt = struct.Struct('<IB')
assert header_no_crc_fmt.size == 5
crc_fmt = struct.Struct('<I')
assert crc_fmt.size == 4
_commit = header_no_crc_fmt.pack(9, TAG_COMMIT)
COMMIT = crc_fmt.pack(crc32(_commit)) + _commit
def __init__(self, path, limit, segments_per_dir, capacity=90):
self.path = path
self.fds = LRUCache(capacity,
dispose=lambda fd: fd.close())
self.segment = 0
self.limit = limit
self.segments_per_dir = segments_per_dir
self.offset = 0
self._write_fd = None
def close(self):
self.close_segment()
self.fds.clear()
self.fds = None # Just to make sure we're disabled
def segment_iterator(self, reverse=False):
data_path = os.path.join(self.path, 'data')
dirs = sorted((dir for dir in os.listdir(data_path) if dir.isdigit()), key=int, reverse=reverse)
for dir in dirs:
filenames = os.listdir(os.path.join(data_path, dir))
sorted_filenames = sorted((filename for filename in filenames
if filename.isdigit()), key=int, reverse=reverse)
for filename in sorted_filenames:
yield int(filename), os.path.join(data_path, dir, filename)
def get_latest_segment(self):
for segment, filename in self.segment_iterator(reverse=True):
return segment
return None
def get_segments_transaction_id(self):
"""Return the id of the last committed segment (None if there is none)
"""
for segment, filename in self.segment_iterator(reverse=True):
if self.is_committed_segment(segment):
return segment
return None
def cleanup(self, transaction_id):
"""Delete segment files left by aborted transactions
"""
self.segment = transaction_id + 1
for segment, filename in self.segment_iterator(reverse=True):
if segment > transaction_id:
os.unlink(filename)
else:
break
def is_committed_segment(self, segment):
"""Check if segment ends with a TAG_COMMIT tag
"""
try:
iterator = self.iter_objects(segment)
except IntegrityError:
return False
with open(self.segment_filename(segment), 'rb') as fd:
try:
fd.seek(-self.header_fmt.size, os.SEEK_END)
except OSError as e:
# return False if segment file is empty or too small
if e.errno == errno.EINVAL:
return False
raise e
if fd.read(self.header_fmt.size) != self.COMMIT:
return False
seen_commit = False
while True:
try:
tag, key, offset = next(iterator)
except IntegrityError:
return False
except StopIteration:
break
if tag == TAG_COMMIT:
seen_commit = True
continue
if seen_commit:
return False
return seen_commit
def segment_filename(self, segment):
return os.path.join(self.path, 'data', str(segment // self.segments_per_dir), str(segment))
def get_write_fd(self, no_new=False, raise_full=False):
if not no_new and self.offset and self.offset > self.limit:
if raise_full:
raise self.SegmentFull
self.close_segment()
if not self._write_fd:
if self.segment % self.segments_per_dir == 0:
dirname = os.path.join(self.path, 'data', str(self.segment // self.segments_per_dir))
if not os.path.exists(dirname):
os.mkdir(dirname)
self._write_fd = open(self.segment_filename(self.segment), 'ab')
self._write_fd.write(MAGIC)
self.offset = MAGIC_LEN
return self._write_fd
def get_fd(self, segment):
try:
return self.fds[segment]
except KeyError:
fd = open(self.segment_filename(segment), 'rb')
self.fds[segment] = fd
return fd
def delete_segment(self, segment):
if segment in self.fds:
del self.fds[segment]
try:
os.unlink(self.segment_filename(segment))
except FileNotFoundError:
pass
def segment_exists(self, segment):
return os.path.exists(self.segment_filename(segment))
def iter_objects(self, segment, include_data=False):
fd = self.get_fd(segment)
fd.seek(0)
if fd.read(MAGIC_LEN) != MAGIC:
raise IntegrityError('Invalid segment magic [segment {}, offset {}]'.format(segment, 0))
offset = MAGIC_LEN
header = fd.read(self.header_fmt.size)
while header:
size, tag, key, data = self._read(fd, self.header_fmt, header, segment, offset,
(TAG_PUT, TAG_DELETE, TAG_COMMIT))
if include_data:
yield tag, key, offset, data
else:
yield tag, key, offset
offset += size
header = fd.read(self.header_fmt.size)
def recover_segment(self, segment, filename):
if segment in self.fds:
del self.fds[segment]
with open(filename, 'rb') as fd:
data = memoryview(fd.read())
os.rename(filename, filename + '.beforerecover')
logger.info('attempting to recover ' + filename)
with open(filename, 'wb') as fd:
fd.write(MAGIC)
while len(data) >= self.header_fmt.size:
crc, size, tag = self.header_fmt.unpack(data[:self.header_fmt.size])
if size < self.header_fmt.size or size > len(data):
data = data[1:]
continue
if crc32(data[4:size]) & 0xffffffff != crc:
data = data[1:]
continue
fd.write(data[:size])
data = data[size:]
def read(self, segment, offset, id):
if segment == self.segment and self._write_fd:
self._write_fd.flush()
fd = self.get_fd(segment)
fd.seek(offset)
header = fd.read(self.put_header_fmt.size)
size, tag, key, data = self._read(fd, self.put_header_fmt, header, segment, offset, (TAG_PUT, ))
if id != key:
raise IntegrityError('Invalid segment entry header, is not for wanted id [segment {}, offset {}]'.format(
segment, offset))
return data
def _read(self, fd, fmt, header, segment, offset, acceptable_tags):
# some code shared by read() and iter_objects()
try:
hdr_tuple = fmt.unpack(header)
except struct.error as err:
raise IntegrityError('Invalid segment entry header [segment {}, offset {}]: {}'.format(
segment, offset, err)) from None
if fmt is self.put_header_fmt:
crc, size, tag, key = hdr_tuple
elif fmt is self.header_fmt:
crc, size, tag = hdr_tuple
key = None
else:
raise TypeError("_read called with unsupported format")
if size > MAX_OBJECT_SIZE or size < fmt.size:
raise IntegrityError('Invalid segment entry size [segment {}, offset {}]'.format(
segment, offset))
length = size - fmt.size
data = fd.read(length)
if len(data) != length:
raise IntegrityError('Segment entry data short read [segment {}, offset {}]: expected {}, got {} bytes'.format(
segment, offset, length, len(data)))
if crc32(data, crc32(memoryview(header)[4:])) & 0xffffffff != crc:
raise IntegrityError('Segment entry checksum mismatch [segment {}, offset {}]'.format(
segment, offset))
if tag not in acceptable_tags:
raise IntegrityError('Invalid segment entry header, did not get acceptable tag [segment {}, offset {}]'.format(
segment, offset))
if key is None and tag in (TAG_PUT, TAG_DELETE):
key, data = data[:32], data[32:]
return size, tag, key, data
def write_put(self, id, data, raise_full=False):
fd = self.get_write_fd(raise_full=raise_full)
size = len(data) + self.put_header_fmt.size
offset = self.offset
header = self.header_no_crc_fmt.pack(size, TAG_PUT)
crc = self.crc_fmt.pack(crc32(data, crc32(id, crc32(header))) & 0xffffffff)
fd.write(b''.join((crc, header, id, data)))
self.offset += size
return self.segment, offset
def write_delete(self, id, raise_full=False):
fd = self.get_write_fd(raise_full=raise_full)
header = self.header_no_crc_fmt.pack(self.put_header_fmt.size, TAG_DELETE)
crc = self.crc_fmt.pack(crc32(id, crc32(header)) & 0xffffffff)
fd.write(b''.join((crc, header, id)))
self.offset += self.put_header_fmt.size
return self.segment
def write_commit(self):
fd = self.get_write_fd(no_new=True)
header = self.header_no_crc_fmt.pack(self.header_fmt.size, TAG_COMMIT)
crc = self.crc_fmt.pack(crc32(header) & 0xffffffff)
fd.write(b''.join((crc, header)))
self.close_segment()
def close_segment(self):
if self._write_fd:
self.segment += 1
self.offset = 0
self._write_fd.flush()
os.fsync(self._write_fd.fileno())
if hasattr(os, 'posix_fadvise'): # only on UNIX
# tell the OS that it does not need to cache what we just wrote,
# avoids spoiling the cache for the OS and other processes.
os.posix_fadvise(self._write_fd.fileno(), 0, 0, os.POSIX_FADV_DONTNEED)
self._write_fd.close()
self._write_fd = None
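
The write and read paths above agree on one entry layout: a CRC32, then size and tag, then (for PUT) a 32-byte key and the payload, with the CRC covering everything after itself. A minimal sketch of that round trip follows. The struct layouts and the `TAG_PUT` value are assumptions, since their definitions are not part of this hunk; `pack_put` and `unpack_put` are hypothetical helper names, not functions from the repository.

```python
import struct
from zlib import crc32

# Assumed layouts; the real definitions are outside the lines shown here.
CRC_FMT = struct.Struct('<I')              # crc32
HEADER_NO_CRC_FMT = struct.Struct('<IB')   # size, tag
PUT_HEADER_FMT = struct.Struct('<IIB32s')  # crc32, size, tag, key
TAG_PUT = 0                                # hypothetical tag value


def pack_put(key, data):
    """Build one PUT entry the way write_put() chains its checksums."""
    size = len(data) + PUT_HEADER_FMT.size
    header = HEADER_NO_CRC_FMT.pack(size, TAG_PUT)
    # Chaining crc32() calls equals the CRC of header + key + data.
    crc = CRC_FMT.pack(crc32(data, crc32(key, crc32(header))) & 0xffffffff)
    return b''.join((crc, header, key, data))


def unpack_put(entry):
    """Verify and split one PUT entry, mirroring the checks in _read()."""
    header = entry[:PUT_HEADER_FMT.size]
    crc, size, tag, key = PUT_HEADER_FMT.unpack(header)
    data = entry[PUT_HEADER_FMT.size:size]
    # Skip the 4 CRC bytes; the rest of the header is covered by the CRC.
    if crc32(data, crc32(memoryview(header)[4:])) & 0xffffffff != crc:
        raise ValueError('checksum mismatch')
    return tag, key, data
```

Because the key sits inside the checksummed region, a flipped bit in either the key or the payload is caught by the same comparison.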

62
borg/shellpattern.py Normal file

@ -0,0 +1,62 @@
import re
import os


def translate(pat):
    """Translate a shell-style pattern to a regular expression.

    The pattern may include "**<sep>" (<sep> stands for the platform-specific path separator; "/" on POSIX systems) for
    matching zero or more directory levels and "*" for matching zero or more arbitrary characters with the exception of
    any path separator. Wrap meta-characters in brackets for a literal match (e.g. "[?]" to match the literal character
    "?").

    This function is derived from the "fnmatch" module distributed with the Python standard library.

    Copyright (C) 2001-2016 Python Software Foundation. All rights reserved.

    TODO: support {alt1,alt2} shell-style alternatives

    """
    sep = os.path.sep
    n = len(pat)
    i = 0
    res = ""

    while i < n:
        c = pat[i]
        i += 1

        if c == "*":
            if i + 1 < n and pat[i] == "*" and pat[i + 1] == sep:
                # **/ == wildcard for 0+ full (relative) directory names with trailing slashes; the forward slash stands
                # for the platform-specific path separator
                res += r"(?:[^\%s]*\%s)*" % (sep, sep)
                i += 2
            else:
                # * == wildcard for name parts (does not cross path separator)
                res += r"[^\%s]*" % sep
        elif c == "?":
            # ? == any single character excluding path separator
            res += r"[^\%s]" % sep
        elif c == "[":
            j = i
            if j < n and pat[j] == "!":
                j += 1
            if j < n and pat[j] == "]":
                j += 1
            while j < n and pat[j] != "]":
                j += 1
            if j >= n:
                res += "\\["
            else:
                stuff = pat[i:j].replace("\\", "\\\\")
                i = j + 1
                if stuff[0] == "!":
                    stuff = "^" + stuff[1:]
                elif stuff[0] == "^":
                    stuff = "\\" + stuff
                res += "[%s]" % stuff
        else:
            res += re.escape(c)

    return "(?ms)" + res + r"\Z"  # global flags must lead the expression on Python 3.11+
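
To make the branches above concrete, here is a small sketch of the regular expressions that `translate()` should build for two patterns when the path separator is "/". The expressions were derived by hand from the `*`, `**/` and literal branches; the multiline and dotall flags are passed to `re.compile()` instead of being embedded in the pattern. The names `PY_ANYWHERE`, `PY_TOPLEVEL` and `matches` are illustrative only.

```python
import re

# Hand-derived equivalent of translate("**/*.py") on a POSIX system:
#   "**/" -> (?:[^\/]*\/)*   zero or more directory levels
#   "*"   -> [^\/]*          any run of characters without a separator
#   ".py" -> \.py            literals go through re.escape()
PY_ANYWHERE = re.compile(r"(?:[^\/]*\/)*[^\/]*\.py\Z", re.M | re.S)

# Hand-derived equivalent of translate("*.py"): no directory part allowed.
PY_TOPLEVEL = re.compile(r"[^\/]*\.py\Z", re.M | re.S)


def matches(regex, path):
    """Return True if the whole path matches (match() anchors the start, \\Z the end)."""
    return regex.match(path) is not None
```

With these, `matches(PY_ANYWHERE, "borg/testsuite/chunker.py")` holds, while `matches(PY_TOPLEVEL, "borg/shellpattern.py")` does not, because a single `*` never crosses a separator.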

151
borg/testsuite/__init__.py Normal file

@ -0,0 +1,151 @@
from contextlib import contextmanager
import filecmp
import os
import posix
import stat
import sys
import sysconfig
import time
import unittest
from ..xattr import get_all
from ..logger import setup_logging
try:
import llfuse
# Does this version of llfuse support ns precision?
have_fuse_mtime_ns = hasattr(llfuse.EntryAttributes, 'st_mtime_ns')
except ImportError:
have_fuse_mtime_ns = False
has_lchflags = hasattr(os, 'lchflags')
# The mtime get/set precision varies on different OS and Python versions
if 'HAVE_FUTIMENS' in getattr(posix, '_have_functions', []):
st_mtime_ns_round = 0
elif 'HAVE_UTIMES' in sysconfig.get_config_vars():
st_mtime_ns_round = -6
else:
st_mtime_ns_round = -9
if sys.platform.startswith('netbsd'):
st_mtime_ns_round = -4 # only >1 microsecond resolution here?
# Ensure that the loggers exist for all tests
setup_logging()
class BaseTestCase(unittest.TestCase):
"""
"""
assert_in = unittest.TestCase.assertIn
assert_not_in = unittest.TestCase.assertNotIn
assert_equal = unittest.TestCase.assertEqual
assert_not_equal = unittest.TestCase.assertNotEqual
assert_raises = unittest.TestCase.assertRaises
assert_true = unittest.TestCase.assertTrue
@contextmanager
def assert_creates_file(self, path):
self.assert_true(not os.path.exists(path), '{} should not exist'.format(path))
yield
self.assert_true(os.path.exists(path), '{} should exist'.format(path))
def assert_dirs_equal(self, dir1, dir2):
diff = filecmp.dircmp(dir1, dir2)
self._assert_dirs_equal_cmp(diff)
def _assert_dirs_equal_cmp(self, diff):
self.assert_equal(diff.left_only, [])
self.assert_equal(diff.right_only, [])
self.assert_equal(diff.diff_files, [])
self.assert_equal(diff.funny_files, [])
for filename in diff.common:
path1 = os.path.join(diff.left, filename)
path2 = os.path.join(diff.right, filename)
s1 = os.lstat(path1)
s2 = os.lstat(path2)
# Assume path2 is on FUSE if st_dev is different
fuse = s1.st_dev != s2.st_dev
attrs = ['st_mode', 'st_uid', 'st_gid', 'st_rdev']
if has_lchflags:
attrs.append('st_flags')
if not fuse or not os.path.isdir(path1):
# dir nlink is always 1 on our fuse filesystem
attrs.append('st_nlink')
d1 = [filename] + [getattr(s1, a) for a in attrs]
d2 = [filename] + [getattr(s2, a) for a in attrs]
# ignore st_rdev if file is not a block/char device, fixes #203
if not stat.S_ISCHR(d1[1]) and not stat.S_ISBLK(d1[1]):
d1[4] = None
if not stat.S_ISCHR(d2[1]) and not stat.S_ISBLK(d2[1]):
d2[4] = None
# Older versions of llfuse do not support ns precision properly
if fuse and not have_fuse_mtime_ns:
d1.append(round(s1.st_mtime_ns, -4))
d2.append(round(s2.st_mtime_ns, -4))
else:
d1.append(round(s1.st_mtime_ns, st_mtime_ns_round))
d2.append(round(s2.st_mtime_ns, st_mtime_ns_round))
d1.append(get_all(path1, follow_symlinks=False))
d2.append(get_all(path2, follow_symlinks=False))
self.assert_equal(d1, d2)
for sub_diff in diff.subdirs.values():
self._assert_dirs_equal_cmp(sub_diff)
def wait_for_mount(self, path, timeout=5):
"""Wait until a filesystem is mounted on `path`
"""
timeout += time.time()
while timeout > time.time():
if os.path.ismount(path):
return
time.sleep(.1)
raise Exception('wait_for_mount(%s) timeout' % path)
class changedir:
def __init__(self, dir):
self.dir = dir
def __enter__(self):
self.old = os.getcwd()
os.chdir(self.dir)
def __exit__(self, *args, **kw):
os.chdir(self.old)
class environment_variable:
def __init__(self, **values):
self.values = values
self.old_values = {}
def __enter__(self):
for k, v in self.values.items():
self.old_values[k] = os.environ.get(k)
if v is None:
os.environ.pop(k, None)
else:
os.environ[k] = v
def __exit__(self, *args, **kw):
for k, v in self.old_values.items():
if v is None:
os.environ.pop(k, None)
else:
os.environ[k] = v
class FakeInputs:
"""Simulate multiple user inputs, can be used as input() replacement"""
def __init__(self, inputs):
self.inputs = inputs
def __call__(self, prompt=None):
if prompt is not None:
print(prompt, end='')
try:
return self.inputs.pop(0)
except IndexError:
raise EOFError from None
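
The `st_mtime_ns_round` values chosen near the top of this file are used as the second argument to `round()`, where a negative digit count drops that many decimal places from a nanosecond timestamp. A short sketch of the effect, using a made-up timestamp and a hypothetical helper name:

```python
def round_mtime_ns(mtime_ns, ndigits):
    """Round a nanosecond timestamp; ndigits=-6 keeps millisecond precision, -9 whole seconds."""
    return round(mtime_ns, ndigits)


# Illustrative value only, not taken from the test suite.
ts = 1_462_000_000_123_456_789

full = round_mtime_ns(ts, 0)      # unchanged: full nanosecond precision
millis = round_mtime_ns(ts, -6)   # sub-millisecond digits dropped
seconds = round_mtime_ns(ts, -9)  # sub-second digits dropped
```

Two files whose timestamps differ only below the chosen precision therefore compare equal in `_assert_dirs_equal_cmp`.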

114
borg/testsuite/archive.py Normal file

@ -0,0 +1,114 @@
from datetime import datetime, timezone
from unittest.mock import Mock
import msgpack
from ..archive import Archive, CacheChunkBuffer, RobustUnpacker
from ..key import PlaintextKey
from ..helpers import Manifest
from . import BaseTestCase
class MockCache:
def __init__(self):
self.objects = {}
def add_chunk(self, id, data, stats=None):
self.objects[id] = data
return id, len(data), len(data)
class ArchiveTimestampTestCase(BaseTestCase):
def _test_timestamp_parsing(self, isoformat, expected):
repository = Mock()
key = PlaintextKey(repository)
manifest = Manifest(repository, key)
a = Archive(repository, key, manifest, 'test', create=True)
a.metadata = {b'time': isoformat}
self.assert_equal(a.ts, expected)
def test_with_microseconds(self):
self._test_timestamp_parsing(
'1970-01-01T00:00:01.000001',
datetime(1970, 1, 1, 0, 0, 1, 1, timezone.utc))
def test_without_microseconds(self):
self._test_timestamp_parsing(
'1970-01-01T00:00:01',
datetime(1970, 1, 1, 0, 0, 1, 0, timezone.utc))
class ChunkBufferTestCase(BaseTestCase):
def test(self):
data = [{b'foo': 1}, {b'bar': 2}]
cache = MockCache()
key = PlaintextKey(None)
chunks = CacheChunkBuffer(cache, key, None)
for d in data:
chunks.add(d)
chunks.flush()
chunks.flush(flush=True)
self.assert_equal(len(chunks.chunks), 2)
unpacker = msgpack.Unpacker()
for id in chunks.chunks:
unpacker.feed(cache.objects[id])
self.assert_equal(data, list(unpacker))
class RobustUnpackerTestCase(BaseTestCase):
def make_chunks(self, items):
return b''.join(msgpack.packb({'path': item}) for item in items)
def _validator(self, value):
return isinstance(value, dict) and value.get(b'path') in (b'foo', b'bar', b'boo', b'baz')
def process(self, input):
unpacker = RobustUnpacker(validator=self._validator)
result = []
for should_sync, chunks in input:
if should_sync:
unpacker.resync()
for data in chunks:
unpacker.feed(data)
for item in unpacker:
result.append(item)
return result
def test_extra_garbage_no_sync(self):
chunks = [(False, [self.make_chunks([b'foo', b'bar'])]),
(False, [b'garbage'] + [self.make_chunks([b'boo', b'baz'])])]
result = self.process(chunks)
self.assert_equal(result, [
{b'path': b'foo'}, {b'path': b'bar'},
103, 97, 114, 98, 97, 103, 101,
{b'path': b'boo'},
{b'path': b'baz'}])
def split(self, left, length):
parts = []
while left:
parts.append(left[:length])
left = left[length:]
return parts
def test_correct_stream(self):
chunks = self.split(self.make_chunks([b'foo', b'bar', b'boo', b'baz']), 2)
input = [(False, chunks)]
result = self.process(input)
self.assert_equal(result, [{b'path': b'foo'}, {b'path': b'bar'}, {b'path': b'boo'}, {b'path': b'baz'}])
def test_missing_chunk(self):
chunks = self.split(self.make_chunks([b'foo', b'bar', b'boo', b'baz']), 4)
input = [(False, chunks[:3]), (True, chunks[4:])]
result = self.process(input)
self.assert_equal(result, [{b'path': b'foo'}, {b'path': b'boo'}, {b'path': b'baz'}])
def test_corrupt_chunk(self):
chunks = self.split(self.make_chunks([b'foo', b'bar', b'boo', b'baz']), 4)
input = [(False, chunks[:3]), (True, [b'gar', b'bage'] + chunks[3:])]
result = self.process(input)
self.assert_equal(result, [{b'path': b'foo'}, {b'path': b'boo'}, {b'path': b'baz'}])

1203
borg/testsuite/archiver.py Normal file

File diff suppressed because it is too large.


@ -0,0 +1,99 @@
"""
Do benchmarks using pytest-benchmark.
Usage:
py.test --benchmark-only
"""
import os
import pytest
from .archiver import changedir, cmd
@pytest.yield_fixture
def repo_url(request, tmpdir):
os.environ['BORG_PASSPHRASE'] = '123456'
os.environ['BORG_CHECK_I_KNOW_WHAT_I_AM_DOING'] = 'YES'
os.environ['BORG_DELETE_I_KNOW_WHAT_I_AM_DOING'] = 'YES'
os.environ['BORG_UNKNOWN_UNENCRYPTED_REPO_ACCESS_IS_OK'] = 'yes'
os.environ['BORG_KEYS_DIR'] = str(tmpdir.join('keys'))
os.environ['BORG_CACHE_DIR'] = str(tmpdir.join('cache'))
yield str(tmpdir.join('repository'))
tmpdir.remove(rec=1)
@pytest.fixture(params=["none", "repokey"])
def repo(request, cmd, repo_url):
cmd('init', '--encryption', request.param, repo_url)
return repo_url
@pytest.yield_fixture(scope='session', params=["zeros", "random"])
def testdata(request, tmpdir_factory):
count, size = 10, 1000*1000
p = tmpdir_factory.mktemp('data')
data_type = request.param
if data_type == 'zeros':
# do not use a binary zero (\0) to avoid sparse detection
def data(size):
return b'0' * size
if data_type == 'random':
def data(size):
return os.urandom(size)
for i in range(count):
with open(str(p.join(str(i))), "wb") as f:
f.write(data(size))
yield str(p)
p.remove(rec=1)
@pytest.fixture(params=['none', 'lz4'])
def archive(request, cmd, repo, testdata):
archive_url = repo + '::test'
cmd('create', '--compression', request.param, archive_url, testdata)
return archive_url
def test_create_none(benchmark, cmd, repo, testdata):
result, out = benchmark.pedantic(cmd, ('create', '--compression', 'none', repo + '::test', testdata))
assert result == 0
def test_create_lz4(benchmark, cmd, repo, testdata):
result, out = benchmark.pedantic(cmd, ('create', '--compression', 'lz4', repo + '::test', testdata))
assert result == 0
def test_extract(benchmark, cmd, archive, tmpdir):
with changedir(str(tmpdir)):
result, out = benchmark.pedantic(cmd, ('extract', archive))
assert result == 0
def test_delete(benchmark, cmd, archive):
result, out = benchmark.pedantic(cmd, ('delete', archive))
assert result == 0
def test_list(benchmark, cmd, archive):
result, out = benchmark(cmd, 'list', archive)
assert result == 0
def test_info(benchmark, cmd, archive):
result, out = benchmark(cmd, 'info', archive)
assert result == 0
def test_check(benchmark, cmd, archive):
repo = archive.split('::')[0]
result, out = benchmark(cmd, 'check', repo)
assert result == 0
def test_help(benchmark, cmd):
result, out = benchmark(cmd, 'help')
assert result == 0

42
borg/testsuite/chunker.py Normal file

@ -0,0 +1,42 @@
from io import BytesIO

from ..chunker import Chunker, buzhash, buzhash_update
from ..archive import CHUNK_MAX_EXP, CHUNKER_PARAMS
from . import BaseTestCase


class ChunkerTestCase(BaseTestCase):

    def test_chunkify(self):
        data = b'0' * int(1.5 * (1 << CHUNK_MAX_EXP)) + b'Y'
        parts = [bytes(c) for c in Chunker(0, 1, CHUNK_MAX_EXP, 2, 2).chunkify(BytesIO(data))]
        self.assert_equal(len(parts), 2)
        self.assert_equal(b''.join(parts), data)
        self.assert_equal([bytes(c) for c in Chunker(0, 1, CHUNK_MAX_EXP, 2, 2).chunkify(BytesIO(b''))], [])
        self.assert_equal([bytes(c) for c in Chunker(0, 1, CHUNK_MAX_EXP, 2, 2).chunkify(BytesIO(b'foobarboobaz' * 3))], [b'fooba', b'rboobaz', b'fooba', b'rboobaz', b'fooba', b'rboobaz'])
        self.assert_equal([bytes(c) for c in Chunker(1, 1, CHUNK_MAX_EXP, 2, 2).chunkify(BytesIO(b'foobarboobaz' * 3))], [b'fo', b'obarb', b'oob', b'azf', b'oobarb', b'oob', b'azf', b'oobarb', b'oobaz'])
        self.assert_equal([bytes(c) for c in Chunker(2, 1, CHUNK_MAX_EXP, 2, 2).chunkify(BytesIO(b'foobarboobaz' * 3))], [b'foob', b'ar', b'boobazfoob', b'ar', b'boobazfoob', b'ar', b'boobaz'])
        self.assert_equal([bytes(c) for c in Chunker(0, 2, CHUNK_MAX_EXP, 2, 3).chunkify(BytesIO(b'foobarboobaz' * 3))], [b'foobarboobaz' * 3])
        self.assert_equal([bytes(c) for c in Chunker(1, 2, CHUNK_MAX_EXP, 2, 3).chunkify(BytesIO(b'foobarboobaz' * 3))], [b'foobar', b'boobazfo', b'obar', b'boobazfo', b'obar', b'boobaz'])
        self.assert_equal([bytes(c) for c in Chunker(2, 2, CHUNK_MAX_EXP, 2, 3).chunkify(BytesIO(b'foobarboobaz' * 3))], [b'foob', b'arboobaz', b'foob', b'arboobaz', b'foob', b'arboobaz'])
        self.assert_equal([bytes(c) for c in Chunker(0, 3, CHUNK_MAX_EXP, 2, 3).chunkify(BytesIO(b'foobarboobaz' * 3))], [b'foobarboobaz' * 3])
        self.assert_equal([bytes(c) for c in Chunker(1, 3, CHUNK_MAX_EXP, 2, 3).chunkify(BytesIO(b'foobarboobaz' * 3))], [b'foobarbo', b'obazfoobar', b'boobazfo', b'obarboobaz'])
        self.assert_equal([bytes(c) for c in Chunker(2, 3, CHUNK_MAX_EXP, 2, 3).chunkify(BytesIO(b'foobarboobaz' * 3))], [b'foobarboobaz', b'foobarboobaz', b'foobarboobaz'])

    def test_buzhash(self):
        self.assert_equal(buzhash(b'abcdefghijklmnop', 0), 3795437769)
        self.assert_equal(buzhash(b'abcdefghijklmnop', 1), 3795400502)
        self.assert_equal(buzhash(b'abcdefghijklmnop', 1), buzhash_update(buzhash(b'Xabcdefghijklmno', 1), ord('X'), ord('p'), 16, 1))
        # Test with more than 31 bytes to make sure our barrel_shift macro works correctly
        self.assert_equal(buzhash(b'abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz', 0), 566521248)

    def test_small_reads(self):
        class SmallReadFile:
            input = b'a' * (20 + 1)

            def read(self, nbytes):
                self.input = self.input[:-1]
                return self.input[:1]

        reconstructed = b''.join(Chunker(0, *CHUNKER_PARAMS).chunkify(SmallReadFile()))
        assert reconstructed == b'a' * 20
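
`test_buzhash` relies on the rolling property of the hash: sliding the window by one byte can be computed from the previous hash without rehashing the whole window. The sketch below shows that property with a generic 32-bit buzhash. The lookup table is randomly generated for illustration and there is no seed parameter, so the hash values will not match the constants in the test; only the relationship between the two functions is the point.

```python
import random

MASK32 = 0xffffffff

# Hypothetical substitution table; the real one is defined in the C chunker.
_rng = random.Random(0)
TABLE = [_rng.getrandbits(32) for _ in range(256)]


def rol32(value, count):
    """Rotate a 32-bit value left; the count wraps at 32 (barrel shift)."""
    count %= 32
    return ((value << count) | (value >> (32 - count))) & MASK32


def buzhash(window):
    """Hash a whole window: XOR of table entries, each rotated by its distance from the end."""
    n = len(window)
    h = 0
    for i, byte in enumerate(window):
        h ^= rol32(TABLE[byte], n - 1 - i)
    return h


def buzhash_update(h, remove, add, window_len):
    """Slide the window by one byte: drop `remove` from the front, append `add`."""
    return rol32(h, 1) ^ rol32(TABLE[remove], window_len) ^ TABLE[add]
```

Rotating the old hash by one shifts every byte's contribution; the outgoing byte then sits at rotation `window_len` and cancels under XOR, and the incoming byte enters at rotation zero.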

100
borg/testsuite/compress.py Normal file

@ -0,0 +1,100 @@
import zlib
try:
import lzma
except ImportError:
lzma = None
import pytest
from ..compress import get_compressor, Compressor, CNONE, ZLIB, LZ4
buffer = bytes(2**16)
data = b'fooooooooobaaaaaaaar' * 10
params = dict(name='zlib', level=6, buffer=buffer)
def test_get_compressor():
c = get_compressor(name='none')
assert isinstance(c, CNONE)
c = get_compressor(name='lz4', buffer=buffer)
assert isinstance(c, LZ4)
c = get_compressor(name='zlib')
assert isinstance(c, ZLIB)
with pytest.raises(KeyError):
get_compressor(name='foobar')
def test_cnull():
c = get_compressor(name='none')
cdata = c.compress(data)
assert len(cdata) > len(data)
assert data in cdata # it's not compressed and just in there 1:1
assert data == c.decompress(cdata)
assert data == Compressor(**params).decompress(cdata) # autodetect
def test_lz4():
c = get_compressor(name='lz4', buffer=buffer)
cdata = c.compress(data)
assert len(cdata) < len(data)
assert data == c.decompress(cdata)
assert data == Compressor(**params).decompress(cdata) # autodetect
def test_zlib():
c = get_compressor(name='zlib')
cdata = c.compress(data)
assert len(cdata) < len(data)
assert data == c.decompress(cdata)
assert data == Compressor(**params).decompress(cdata) # autodetect
def test_lzma():
if lzma is None:
pytest.skip("No lzma support found.")
c = get_compressor(name='lzma')
cdata = c.compress(data)
assert len(cdata) < len(data)
assert data == c.decompress(cdata)
assert data == Compressor(**params).decompress(cdata) # autodetect
def test_autodetect_invalid():
with pytest.raises(ValueError):
Compressor(**params).decompress(b'\xff\xfftotalcrap')
with pytest.raises(ValueError):
Compressor(**params).decompress(b'\x08\x00notreallyzlib')
def test_zlib_compat():
# for compatibility reasons, we do not add an extra header for zlib,
# nor do we expect one when decompressing / autodetecting
for level in range(10):
c = get_compressor(name='zlib', level=level)
cdata1 = c.compress(data)
cdata2 = zlib.compress(data, level)
assert cdata1 == cdata2
data2 = c.decompress(cdata2)
assert data == data2
data2 = Compressor(**params).decompress(cdata2)
assert data == data2
def test_compressor():
params_list = [
dict(name='none', buffer=buffer),
dict(name='lz4', buffer=buffer),
dict(name='zlib', level=0, buffer=buffer),
dict(name='zlib', level=6, buffer=buffer),
dict(name='zlib', level=9, buffer=buffer),
]
if lzma:
params_list += [
dict(name='lzma', level=0, buffer=buffer),
dict(name='lzma', level=6, buffer=buffer),
# we do not test lzma on level 9 because of the huge memory needs
]
for params in params_list:
c = Compressor(**params)
assert data == c.decompress(c.compress(data))
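
`test_zlib_compat` and `test_autodetect_invalid` together imply that zlib data is recognised by its own two-byte stream header instead of an added prefix. One way to do that is the header check from the zlib format specification (RFC 1950): the low nibble of the first byte must be 8 (deflate), and the first two bytes read as a big-endian number must be divisible by 31. This is a sketch under that assumption; `looks_like_zlib` is a hypothetical name and the module's real detection code is not shown in this diff.

```python
import zlib


def looks_like_zlib(data):
    """Check the RFC 1950 header: CM == 8 (deflate) and (CMF * 256 + FLG) % 31 == 0."""
    if len(data) < 2:
        return False
    cmf, flg = data[0], data[1]
    is_deflate = (cmf & 0x0f) == 8
    check_ok = (cmf * 256 + flg) % 31 == 0
    return is_deflate and check_ok


sample = b'fooooooooobaaaaaaaar' * 10
# Every compression level produces a header that passes the check.
all_levels_ok = all(looks_like_zlib(zlib.compress(sample, level)) for level in range(10))
```

The two byte strings rejected in `test_autodetect_invalid` fail this check for different reasons: `b'\xff\xff'` is not deflate, and `b'\x08\x00'` is not a multiple of 31.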

30
borg/testsuite/crypto.py Normal file

@ -0,0 +1,30 @@
from binascii import hexlify

from ..crypto import AES, bytes_to_long, bytes_to_int, long_to_bytes
from . import BaseTestCase


class CryptoTestCase(BaseTestCase):

    def test_bytes_to_int(self):
        self.assert_equal(bytes_to_int(b'\0\0\0\1'), 1)

    def test_bytes_to_long(self):
        self.assert_equal(bytes_to_long(b'\0\0\0\0\0\0\0\1'), 1)
        self.assert_equal(long_to_bytes(1), b'\0\0\0\0\0\0\0\1')

    def test_aes(self):
        key = b'X' * 32
        data = b'foo' * 10
        # encrypt
        aes = AES(is_encrypt=True, key=key)
        self.assert_equal(bytes_to_long(aes.iv, 8), 0)
        cdata = aes.encrypt(data)
        self.assert_equal(hexlify(cdata), b'c6efb702de12498f34a2c2bbc8149e759996d08bf6dc5c610aefc0c3a466')
        self.assert_equal(bytes_to_long(aes.iv, 8), 2)
        # decrypt
        aes = AES(is_encrypt=False, key=key)
        self.assert_equal(bytes_to_long(aes.iv, 8), 0)
        pdata = aes.decrypt(cdata)
        self.assert_equal(data, pdata)
        self.assert_equal(bytes_to_long(aes.iv, 8), 2)
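
The byte-conversion assertions above are consistent with big-endian, fixed-width integers, and `bytes_to_long(aes.iv, 8)` reads the low 64 bits of a 16-byte IV as a counter. The following stand-ins reproduce that behaviour with `struct`; they borrow the names of the imported helpers for readability, but they are an illustration and not the code in `borg/crypto`.

```python
import struct


def bytes_to_int(data, offset=0):
    """Read an unsigned 32-bit big-endian integer at the given byte offset."""
    return struct.unpack_from('>I', data, offset)[0]


def bytes_to_long(data, offset=0):
    """Read an unsigned 64-bit big-endian integer at the given byte offset."""
    return struct.unpack_from('>Q', data, offset)[0]


def long_to_bytes(value):
    """Encode an unsigned 64-bit integer as 8 big-endian bytes."""
    return struct.pack('>Q', value)


# A 16-byte IV whose low half holds a block counter of 2.
iv = b'\0' * 8 + long_to_bytes(2)
counter = bytes_to_long(iv, 8)
```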

278
borg/testsuite/hashindex.py Normal file

@ -0,0 +1,278 @@
import base64
import hashlib
import os
import struct
import tempfile
import zlib
import pytest
from ..hashindex import NSIndex, ChunkIndex
from .. import hashindex
from . import BaseTestCase
def H(x):
# make some 32byte long thing that depends on x
return bytes('%-0.32d' % x, 'ascii')
class HashIndexTestCase(BaseTestCase):
def _generic_test(self, cls, make_value, sha):
idx = cls()
self.assert_equal(len(idx), 0)
# Test set
for x in range(100):
idx[bytes('%-32d' % x, 'ascii')] = make_value(x)
self.assert_equal(len(idx), 100)
for x in range(100):
self.assert_equal(idx[bytes('%-32d' % x, 'ascii')], make_value(x))
# Test update
for x in range(100):
idx[bytes('%-32d' % x, 'ascii')] = make_value(x * 2)
self.assert_equal(len(idx), 100)
for x in range(100):
self.assert_equal(idx[bytes('%-32d' % x, 'ascii')], make_value(x * 2))
# Test delete
for x in range(50):
del idx[bytes('%-32d' % x, 'ascii')]
self.assert_equal(len(idx), 50)
idx_name = tempfile.NamedTemporaryFile()
idx.write(idx_name.name)
del idx
# Verify file contents
with open(idx_name.name, 'rb') as fd:
self.assert_equal(hashlib.sha256(fd.read()).hexdigest(), sha)
# Make sure we can open the file
idx = cls.read(idx_name.name)
self.assert_equal(len(idx), 50)
for x in range(50, 100):
self.assert_equal(idx[bytes('%-32d' % x, 'ascii')], make_value(x * 2))
idx.clear()
self.assert_equal(len(idx), 0)
idx.write(idx_name.name)
del idx
self.assert_equal(len(cls.read(idx_name.name)), 0)
def test_nsindex(self):
self._generic_test(NSIndex, lambda x: (x, x),
'80fba5b40f8cf12f1486f1ba33c9d852fb2b41a5b5961d3b9d1228cf2aa9c4c9')
def test_chunkindex(self):
self._generic_test(ChunkIndex, lambda x: (x, x, x),
'1d71865e72e3c3af18d3c7216b6fa7b014695eaa3ed7f14cf9cd02fba75d1c95')
def test_resize(self):
n = 2000 # Must be >= MIN_BUCKETS
idx_name = tempfile.NamedTemporaryFile()
idx = NSIndex()
idx.write(idx_name.name)
initial_size = os.path.getsize(idx_name.name)
self.assert_equal(len(idx), 0)
for x in range(n):
idx[bytes('%-32d' % x, 'ascii')] = x, x
idx.write(idx_name.name)
self.assert_true(initial_size < os.path.getsize(idx_name.name))
for x in range(n):
del idx[bytes('%-32d' % x, 'ascii')]
self.assert_equal(len(idx), 0)
idx.write(idx_name.name)
self.assert_equal(initial_size, os.path.getsize(idx_name.name))
def test_iteritems(self):
idx = NSIndex()
for x in range(100):
idx[bytes('%-0.32d' % x, 'ascii')] = x, x
all = list(idx.iteritems())
self.assert_equal(len(all), 100)
second_half = list(idx.iteritems(marker=all[49][0]))
self.assert_equal(len(second_half), 50)
self.assert_equal(second_half, all[50:])
def test_chunkindex_merge(self):
idx1 = ChunkIndex()
idx1[H(1)] = 1, 100, 100
idx1[H(2)] = 2, 200, 200
idx1[H(3)] = 3, 300, 300
# no H(4) entry
idx2 = ChunkIndex()
idx2[H(1)] = 4, 100, 100
idx2[H(2)] = 5, 200, 200
# no H(3) entry
idx2[H(4)] = 6, 400, 400
idx1.merge(idx2)
assert idx1[H(1)] == (5, 100, 100)
assert idx1[H(2)] == (7, 200, 200)
assert idx1[H(3)] == (3, 300, 300)
assert idx1[H(4)] == (6, 400, 400)
def test_chunkindex_summarize(self):
idx = ChunkIndex()
idx[H(1)] = 1, 1000, 100
idx[H(2)] = 2, 2000, 200
idx[H(3)] = 3, 3000, 300
size, csize, unique_size, unique_csize, unique_chunks, chunks = idx.summarize()
assert size == 1000 + 2 * 2000 + 3 * 3000
assert csize == 100 + 2 * 200 + 3 * 300
assert unique_size == 1000 + 2000 + 3000
assert unique_csize == 100 + 200 + 300
assert chunks == 1 + 2 + 3
assert unique_chunks == 3
class HashIndexRefcountingTestCase(BaseTestCase):
def test_chunkindex_limit(self):
idx = ChunkIndex()
idx[H(1)] = hashindex.MAX_VALUE - 1, 1, 2
# 5 is arbitrary, any number of incref/decrefs shouldn't move it once it's limited
for i in range(5):
# first incref to move it to the limit
refcount, *_ = idx.incref(H(1))
assert refcount == hashindex.MAX_VALUE
for i in range(5):
refcount, *_ = idx.decref(H(1))
assert refcount == hashindex.MAX_VALUE
def _merge(self, refcounta, refcountb):
def merge(refcount1, refcount2):
idx1 = ChunkIndex()
idx1[H(1)] = refcount1, 1, 2
idx2 = ChunkIndex()
idx2[H(1)] = refcount2, 1, 2
idx1.merge(idx2)
refcount, *_ = idx1[H(1)]
return refcount
result = merge(refcounta, refcountb)
# check for commutativity
assert result == merge(refcountb, refcounta)
return result
def test_chunkindex_merge_limit1(self):
# Check that it does *not* limit at MAX_VALUE - 1
# (MAX_VALUE is odd)
half = hashindex.MAX_VALUE // 2
assert self._merge(half, half) == hashindex.MAX_VALUE - 1
def test_chunkindex_merge_limit2(self):
# 3000000000 + 2000000000 > MAX_VALUE
assert self._merge(3000000000, 2000000000) == hashindex.MAX_VALUE
def test_chunkindex_merge_limit3(self):
# Crossover point: both addition and limit semantics will yield the same result
half = hashindex.MAX_VALUE // 2
assert self._merge(half + 1, half) == hashindex.MAX_VALUE
def test_chunkindex_merge_limit4(self):
# Beyond crossover, result of addition would be 2**31
half = hashindex.MAX_VALUE // 2
assert self._merge(half + 2, half) == hashindex.MAX_VALUE
assert self._merge(half + 1, half + 1) == hashindex.MAX_VALUE
def test_chunkindex_add(self):
idx1 = ChunkIndex()
idx1.add(H(1), 5, 6, 7)
assert idx1[H(1)] == (5, 6, 7)
idx1.add(H(1), 1, 0, 0)
assert idx1[H(1)] == (6, 6, 7)
def test_incref_limit(self):
idx1 = ChunkIndex()
idx1[H(1)] = (hashindex.MAX_VALUE, 6, 7)
idx1.incref(H(1))
refcount, *_ = idx1[H(1)]
assert refcount == hashindex.MAX_VALUE
def test_decref_limit(self):
idx1 = ChunkIndex()
idx1[H(1)] = hashindex.MAX_VALUE, 6, 7
idx1.decref(H(1))
refcount, *_ = idx1[H(1)]
assert refcount == hashindex.MAX_VALUE
def test_decref_zero(self):
idx1 = ChunkIndex()
idx1[H(1)] = 0, 0, 0
with pytest.raises(AssertionError):
idx1.decref(H(1))
def test_incref_decref(self):
idx1 = ChunkIndex()
idx1.add(H(1), 5, 6, 7)
assert idx1[H(1)] == (5, 6, 7)
idx1.incref(H(1))
assert idx1[H(1)] == (6, 6, 7)
idx1.decref(H(1))
assert idx1[H(1)] == (5, 6, 7)
def test_setitem_raises(self):
idx1 = ChunkIndex()
with pytest.raises(AssertionError):
idx1[H(1)] = hashindex.MAX_VALUE + 1, 0, 0
def test_keyerror(self):
idx = ChunkIndex()
with pytest.raises(KeyError):
idx.incref(H(1))
with pytest.raises(KeyError):
idx.decref(H(1))
with pytest.raises(KeyError):
idx[H(1)]
with pytest.raises(OverflowError):
idx.add(H(1), -1, 0, 0)
class HashIndexDataTestCase(BaseTestCase):
# This bytestring was created with 1.0-maint at c2f9533
HASHINDEX = b'eJzt0L0NgmAUhtHLT0LDEI6AuAEhMVYmVnSuYefC7AB3Aj9KNedJbnfyFne6P67P27w0EdG1Eac+Cm1ZybAsy7Isy7Isy7Isy7I' \
b'sy7Isy7Isy7Isy7Isy7Isy7Isy7Isy7Isy7Isy7Isy7Isy7Isy7Isy7Isy7Isy7Isy7Isy7Isy7Isy7Isy7Isy7LsL9nhc+cqTZ' \
b'3XlO2Ys++Du5fX+l1/YFmWZVmWZVmWZVmWZVmWZVmWZVmWZVmWZVmWZVmWZVmWZVmWZVmWZVmWZVmWZVmWZVn2/+0O2rYccw=='
def _serialize_hashindex(self, idx):
with tempfile.TemporaryDirectory() as tempdir:
file = os.path.join(tempdir, 'idx')
idx.write(file)
with open(file, 'rb') as f:
return self._pack(f.read())
def _deserialize_hashindex(self, bytestring):
with tempfile.TemporaryDirectory() as tempdir:
file = os.path.join(tempdir, 'idx')
with open(file, 'wb') as f:
f.write(self._unpack(bytestring))
return ChunkIndex.read(file)
def _pack(self, bytestring):
return base64.b64encode(zlib.compress(bytestring))
def _unpack(self, bytestring):
return zlib.decompress(base64.b64decode(bytestring))
def test_identical_creation(self):
idx1 = ChunkIndex()
idx1[H(1)] = 1, 2, 3
idx1[H(2)] = 2**31 - 1, 0, 0
idx1[H(3)] = 4294962296, 0, 0 # 4294962296 is -5000 interpreted as an uint32_t
assert self._serialize_hashindex(idx1) == self.HASHINDEX
def test_read_known_good(self):
idx1 = self._deserialize_hashindex(self.HASHINDEX)
assert idx1[H(1)] == (1, 2, 3)
assert idx1[H(2)] == (2**31 - 1, 0, 0)
assert idx1[H(3)] == (4294962296, 0, 0)
idx2 = ChunkIndex()
idx2[H(3)] = 2**32 - 123456, 6, 7
idx1.merge(idx2)
assert idx1[H(3)] == (hashindex.MAX_VALUE, 0, 0)
def test_nsindex_segment_limit():
idx = NSIndex()
with pytest.raises(AssertionError):
idx[H(1)] = hashindex.MAX_VALUE + 1, 0
assert H(1) not in idx
idx[H(2)] = hashindex.MAX_VALUE, 0
assert H(2) in idx
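
The refcounting tests above describe saturating arithmetic: merged reference counts are added but capped at `MAX_VALUE`, and a count that has reached the cap no longer moves on incref or decref. A pure-Python sketch of those rules follows. The concrete `MAX_VALUE` is an assumption (the tests only state that it is odd and below 2**32), and the function names are illustrative, not the index API.

```python
# Assumed limit; the real constant is exported by the hashindex module.
MAX_VALUE = 2**32 - 1025


def merge_refcount(a, b):
    """Add two reference counts, saturating at MAX_VALUE."""
    return min(a + b, MAX_VALUE)


def incref(refcount):
    """Increase a reference count unless it is already pinned at the limit."""
    return refcount if refcount >= MAX_VALUE else refcount + 1


def decref(refcount):
    """Decrease a reference count; a count pinned at the limit stays there."""
    assert refcount > 0, 'cannot decref a zero refcount'
    return refcount if refcount == MAX_VALUE else refcount - 1
```

Pinning the count at the limit loses the exact number of references, which is why a saturated count can never be decremented back down safely.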

879
borg/testsuite/helpers.py Normal file

@ -0,0 +1,879 @@
import hashlib
from time import mktime, strptime
from datetime import datetime, timezone, timedelta
from io import StringIO
import os
import pytest
import sys
import msgpack
import msgpack.fallback
import time
from ..helpers import Location, format_file_size, format_timedelta, make_path_safe, \
prune_within, prune_split, get_cache_dir, get_keys_dir, Statistics, is_slow_msgpack, \
yes, TRUISH, FALSISH, DEFAULTISH, \
StableDict, int_to_bigint, bigint_to_int, parse_timestamp, CompressionSpec, ChunkerParams, \
ProgressIndicatorPercent, ProgressIndicatorEndless, load_excludes, parse_pattern, \
PatternMatcher, RegexPattern, PathPrefixPattern, FnmatchPattern, ShellPattern
from . import BaseTestCase, environment_variable, FakeInputs
class BigIntTestCase(BaseTestCase):
def test_bigint(self):
self.assert_equal(int_to_bigint(0), 0)
self.assert_equal(int_to_bigint(2**63-1), 2**63-1)
self.assert_equal(int_to_bigint(-2**63+1), -2**63+1)
self.assert_equal(int_to_bigint(2**63), b'\x00\x00\x00\x00\x00\x00\x00\x80\x00')
self.assert_equal(int_to_bigint(-2**63), b'\x00\x00\x00\x00\x00\x00\x00\x80\xff')
self.assert_equal(bigint_to_int(int_to_bigint(-2**70)), -2**70)
self.assert_equal(bigint_to_int(int_to_bigint(2**70)), 2**70)
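
`test_bigint` pins down the encoding fairly precisely: values that fit a signed 64-bit integer pass through unchanged, and larger magnitudes become little-endian two's-complement byte strings. A sketch that satisfies every assertion in the test is shown below; it reuses the helper names for readability, but whether the implementation in `helpers` looks exactly like this is an assumption.

```python
def int_to_bigint(value):
    """Return value unchanged if it fits in 63 bits plus sign, else as little-endian signed bytes."""
    if value.bit_length() > 63:
        # One extra bit for the sign, rounded up to whole bytes.
        return value.to_bytes((value.bit_length() + 9) // 8, 'little', signed=True)
    return value


def bigint_to_int(value):
    """Inverse of int_to_bigint(): decode bytes, pass plain integers through."""
    if isinstance(value, bytes):
        return int.from_bytes(value, 'little', signed=True)
    return value
```

Keeping small values as plain integers means only the rare out-of-range value pays for the byte-string representation.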
class TestLocationWithoutEnv:
def test_ssh(self, monkeypatch):
monkeypatch.delenv('BORG_REPO', raising=False)
assert repr(Location('ssh://user@host:1234/some/path::archive')) == \
"Location(proto='ssh', user='user', host='host', port=1234, path='/some/path', archive='archive')"
assert repr(Location('ssh://user@host:1234/some/path')) == \
"Location(proto='ssh', user='user', host='host', port=1234, path='/some/path', archive=None)"
def test_file(self, monkeypatch):
monkeypatch.delenv('BORG_REPO', raising=False)
assert repr(Location('file:///some/path::archive')) == \
"Location(proto='file', user=None, host=None, port=None, path='/some/path', archive='archive')"
assert repr(Location('file:///some/path')) == \
"Location(proto='file', user=None, host=None, port=None, path='/some/path', archive=None)"
def test_scp(self, monkeypatch):
monkeypatch.delenv('BORG_REPO', raising=False)
assert repr(Location('user@host:/some/path::archive')) == \
"Location(proto='ssh', user='user', host='host', port=None, path='/some/path', archive='archive')"
assert repr(Location('user@host:/some/path')) == \
"Location(proto='ssh', user='user', host='host', port=None, path='/some/path', archive=None)"
def test_folder(self, monkeypatch):
monkeypatch.delenv('BORG_REPO', raising=False)
assert repr(Location('path::archive')) == \
"Location(proto='file', user=None, host=None, port=None, path='path', archive='archive')"
assert repr(Location('path')) == \
"Location(proto='file', user=None, host=None, port=None, path='path', archive=None)"
def test_abspath(self, monkeypatch):
monkeypatch.delenv('BORG_REPO', raising=False)
assert repr(Location('/some/absolute/path::archive')) == \
"Location(proto='file', user=None, host=None, port=None, path='/some/absolute/path', archive='archive')"
assert repr(Location('/some/absolute/path')) == \
"Location(proto='file', user=None, host=None, port=None, path='/some/absolute/path', archive=None)"
def test_relpath(self, monkeypatch):
monkeypatch.delenv('BORG_REPO', raising=False)
assert repr(Location('some/relative/path::archive')) == \
"Location(proto='file', user=None, host=None, port=None, path='some/relative/path', archive='archive')"
assert repr(Location('some/relative/path')) == \
"Location(proto='file', user=None, host=None, port=None, path='some/relative/path', archive=None)"
def test_underspecified(self, monkeypatch):
monkeypatch.delenv('BORG_REPO', raising=False)
with pytest.raises(ValueError):
Location('::archive')
with pytest.raises(ValueError):
Location('::')
with pytest.raises(ValueError):
Location()
def test_no_double_colon(self, monkeypatch):
monkeypatch.delenv('BORG_REPO', raising=False)
with pytest.raises(ValueError):
Location('ssh://localhost:22/path:archive')
def test_no_slashes(self, monkeypatch):
monkeypatch.delenv('BORG_REPO', raising=False)
with pytest.raises(ValueError):
Location('/some/path/to/repo::archive_name_with/slashes/is_invalid')
def test_canonical_path(self, monkeypatch):
monkeypatch.delenv('BORG_REPO', raising=False)
locations = ['some/path::archive', 'file://some/path::archive', 'host:some/path::archive',
'host:~user/some/path::archive', 'ssh://host/some/path::archive',
'ssh://user@host:1234/some/path::archive']
for location in locations:
assert Location(location).canonical_path() == \
Location(Location(location).canonical_path()).canonical_path()
def test_format_path(self, monkeypatch):
monkeypatch.delenv('BORG_REPO', raising=False)
test_pid = os.getpid()
assert repr(Location('/some/path::archive{pid}')) == \
"Location(proto='file', user=None, host=None, port=None, path='/some/path', archive='archive{}')".format(test_pid)
location_time1 = Location('/some/path::archive{now:%s}')
time.sleep(1.1)
location_time2 = Location('/some/path::archive{now:%s}')
assert location_time1.archive != location_time2.archive
class TestLocationWithEnv:
def test_ssh(self, monkeypatch):
monkeypatch.setenv('BORG_REPO', 'ssh://user@host:1234/some/path')
assert repr(Location('::archive')) == \
"Location(proto='ssh', user='user', host='host', port=1234, path='/some/path', archive='archive')"
assert repr(Location()) == \
"Location(proto='ssh', user='user', host='host', port=1234, path='/some/path', archive=None)"
def test_file(self, monkeypatch):
monkeypatch.setenv('BORG_REPO', 'file:///some/path')
assert repr(Location('::archive')) == \
"Location(proto='file', user=None, host=None, port=None, path='/some/path', archive='archive')"
assert repr(Location()) == \
"Location(proto='file', user=None, host=None, port=None, path='/some/path', archive=None)"
def test_scp(self, monkeypatch):
monkeypatch.setenv('BORG_REPO', 'user@host:/some/path')
assert repr(Location('::archive')) == \
"Location(proto='ssh', user='user', host='host', port=None, path='/some/path', archive='archive')"
assert repr(Location()) == \
"Location(proto='ssh', user='user', host='host', port=None, path='/some/path', archive=None)"
def test_folder(self, monkeypatch):
monkeypatch.setenv('BORG_REPO', 'path')
assert repr(Location('::archive')) == \
"Location(proto='file', user=None, host=None, port=None, path='path', archive='archive')"
assert repr(Location()) == \
"Location(proto='file', user=None, host=None, port=None, path='path', archive=None)"
def test_abspath(self, monkeypatch):
monkeypatch.setenv('BORG_REPO', '/some/absolute/path')
assert repr(Location('::archive')) == \
"Location(proto='file', user=None, host=None, port=None, path='/some/absolute/path', archive='archive')"
assert repr(Location()) == \
"Location(proto='file', user=None, host=None, port=None, path='/some/absolute/path', archive=None)"
def test_relpath(self, monkeypatch):
monkeypatch.setenv('BORG_REPO', 'some/relative/path')
assert repr(Location('::archive')) == \
"Location(proto='file', user=None, host=None, port=None, path='some/relative/path', archive='archive')"
assert repr(Location()) == \
"Location(proto='file', user=None, host=None, port=None, path='some/relative/path', archive=None)"
def test_no_slashes(self, monkeypatch):
monkeypatch.setenv('BORG_REPO', '/some/absolute/path')
with pytest.raises(ValueError):
Location('::archive_name_with/slashes/is_invalid')
class FormatTimedeltaTestCase(BaseTestCase):
def test(self):
t0 = datetime(2001, 1, 1, 10, 20, 3, 0)
t1 = datetime(2001, 1, 1, 12, 20, 4, 100000)
self.assert_equal(
format_timedelta(t1 - t0),
'2 hours 1.10 seconds'
)
def check_patterns(files, pattern, expected):
"""Utility for testing patterns."""
assert all([f == os.path.normpath(f) for f in files]), "Pattern matchers expect normalized input paths"
matched = [f for f in files if pattern.match(f)]
assert matched == (files if expected is None else expected)
@pytest.mark.parametrize("pattern, expected", [
# "None" means all files, i.e. all match the given pattern
("/", None),
("/./", None),
("", []),
("/home/u", []),
("/home/user", ["/home/user/.profile", "/home/user/.bashrc"]),
("/etc", ["/etc/server/config", "/etc/server/hosts"]),
("///etc//////", ["/etc/server/config", "/etc/server/hosts"]),
("/./home//..//home/user2", ["/home/user2/.profile", "/home/user2/public_html/index.html"]),
("/srv", ["/srv/messages", "/srv/dmesg"]),
])
def test_patterns_prefix(pattern, expected):
files = [
"/etc/server/config", "/etc/server/hosts", "/home", "/home/user/.profile", "/home/user/.bashrc",
"/home/user2/.profile", "/home/user2/public_html/index.html", "/srv/messages", "/srv/dmesg",
]
check_patterns(files, PathPrefixPattern(pattern), expected)
@pytest.mark.parametrize("pattern, expected", [
# "None" means all files, i.e. all match the given pattern
("", []),
("foo", []),
("relative", ["relative/path1", "relative/two"]),
("more", ["more/relative"]),
])
def test_patterns_prefix_relative(pattern, expected):
files = ["relative/path1", "relative/two", "more/relative"]
check_patterns(files, PathPrefixPattern(pattern), expected)
@pytest.mark.parametrize("pattern, expected", [
# "None" means all files, i.e. all match the given pattern
("/*", None),
("/./*", None),
("*", None),
("*/*", None),
("*///*", None),
("/home/u", []),
("/home/*",
["/home/user/.profile", "/home/user/.bashrc", "/home/user2/.profile", "/home/user2/public_html/index.html",
"/home/foo/.thumbnails", "/home/foo/bar/.thumbnails"]),
("/home/user/*", ["/home/user/.profile", "/home/user/.bashrc"]),
("/etc/*", ["/etc/server/config", "/etc/server/hosts"]),
("*/.pr????e", ["/home/user/.profile", "/home/user2/.profile"]),
("///etc//////*", ["/etc/server/config", "/etc/server/hosts"]),
("/./home//..//home/user2/*", ["/home/user2/.profile", "/home/user2/public_html/index.html"]),
("/srv*", ["/srv/messages", "/srv/dmesg"]),
("/home/*/.thumbnails", ["/home/foo/.thumbnails", "/home/foo/bar/.thumbnails"]),
])
def test_patterns_fnmatch(pattern, expected):
files = [
"/etc/server/config", "/etc/server/hosts", "/home", "/home/user/.profile", "/home/user/.bashrc",
"/home/user2/.profile", "/home/user2/public_html/index.html", "/srv/messages", "/srv/dmesg",
"/home/foo/.thumbnails", "/home/foo/bar/.thumbnails",
]
check_patterns(files, FnmatchPattern(pattern), expected)
@pytest.mark.parametrize("pattern, expected", [
# "None" means all files, i.e. all match the given pattern
("*", None),
("**/*", None),
("/**/*", None),
("/./*", None),
("*/*", None),
("*///*", None),
("/home/u", []),
("/home/*",
["/home/user/.profile", "/home/user/.bashrc", "/home/user2/.profile", "/home/user2/public_html/index.html",
"/home/foo/.thumbnails", "/home/foo/bar/.thumbnails"]),
("/home/user/*", ["/home/user/.profile", "/home/user/.bashrc"]),
("/etc/*/*", ["/etc/server/config", "/etc/server/hosts"]),
("/etc/**/*", ["/etc/server/config", "/etc/server/hosts"]),
("/etc/**/*/*", ["/etc/server/config", "/etc/server/hosts"]),
("*/.pr????e", []),
("**/.pr????e", ["/home/user/.profile", "/home/user2/.profile"]),
("///etc//////*", ["/etc/server/config", "/etc/server/hosts"]),
("/./home//..//home/user2/", ["/home/user2/.profile", "/home/user2/public_html/index.html"]),
("/./home//..//home/user2/**/*", ["/home/user2/.profile", "/home/user2/public_html/index.html"]),
("/srv*/", ["/srv/messages", "/srv/dmesg", "/srv2/blafasel"]),
("/srv*", ["/srv", "/srv/messages", "/srv/dmesg", "/srv2", "/srv2/blafasel"]),
("/srv/*", ["/srv/messages", "/srv/dmesg"]),
("/srv2/**", ["/srv2", "/srv2/blafasel"]),
("/srv2/**/", ["/srv2/blafasel"]),
("/home/*/.thumbnails", ["/home/foo/.thumbnails"]),
("/home/*/*/.thumbnails", ["/home/foo/bar/.thumbnails"]),
])
def test_patterns_shell(pattern, expected):
files = [
"/etc/server/config", "/etc/server/hosts", "/home", "/home/user/.profile", "/home/user/.bashrc",
"/home/user2/.profile", "/home/user2/public_html/index.html", "/srv", "/srv/messages", "/srv/dmesg",
"/srv2", "/srv2/blafasel", "/home/foo/.thumbnails", "/home/foo/bar/.thumbnails",
]
check_patterns(files, ShellPattern(pattern), expected)
@pytest.mark.parametrize("pattern, expected", [
# "None" means all files, i.e. all match the given pattern
("", None),
(".*", None),
("^/", None),
("^abc$", []),
("^[^/]", []),
("^(?!/srv|/foo|/opt)",
["/home", "/home/user/.profile", "/home/user/.bashrc", "/home/user2/.profile",
"/home/user2/public_html/index.html", "/home/foo/.thumbnails", "/home/foo/bar/.thumbnails", ]),
])
def test_patterns_regex(pattern, expected):
files = [
'/srv/data', '/foo/bar', '/home',
'/home/user/.profile', '/home/user/.bashrc',
'/home/user2/.profile', '/home/user2/public_html/index.html',
'/opt/log/messages.txt', '/opt/log/dmesg.txt',
"/home/foo/.thumbnails", "/home/foo/bar/.thumbnails",
]
obj = RegexPattern(pattern)
assert str(obj) == pattern
assert obj.pattern == pattern
check_patterns(files, obj, expected)
def test_regex_pattern():
# The forward slash must match the platform-specific path separator
assert RegexPattern("^/$").match("/")
assert RegexPattern("^/$").match(os.path.sep)
assert not RegexPattern(r"^\\$").match("/")
def use_normalized_unicode():
return sys.platform in ("darwin",)
def _make_test_patterns(pattern):
return [PathPrefixPattern(pattern),
FnmatchPattern(pattern),
RegexPattern("^{}/foo$".format(pattern)),
ShellPattern(pattern),
]
@pytest.mark.parametrize("pattern", _make_test_patterns("b\N{LATIN SMALL LETTER A WITH ACUTE}"))
def test_composed_unicode_pattern(pattern):
assert pattern.match("b\N{LATIN SMALL LETTER A WITH ACUTE}/foo")
assert pattern.match("ba\N{COMBINING ACUTE ACCENT}/foo") == use_normalized_unicode()
@pytest.mark.parametrize("pattern", _make_test_patterns("ba\N{COMBINING ACUTE ACCENT}"))
def test_decomposed_unicode_pattern(pattern):
assert pattern.match("b\N{LATIN SMALL LETTER A WITH ACUTE}/foo") == use_normalized_unicode()
assert pattern.match("ba\N{COMBINING ACUTE ACCENT}/foo")
@pytest.mark.parametrize("pattern", _make_test_patterns(str(b"ba\x80", "latin1")))
def test_invalid_unicode_pattern(pattern):
assert not pattern.match("ba/foo")
assert pattern.match(str(b"ba\x80/foo", "latin1"))
@pytest.mark.parametrize("lines, expected", [
# "None" means all files, i.e. none excluded
([], None),
(["# Comment only"], None),
(["*"], []),
(["# Comment",
"*/something00.txt",
" *whitespace* ",
# Whitespace before comment
" #/ws*",
# Empty line
"",
"# EOF"],
["/more/data", "/home", " #/wsfoobar"]),
(["re:.*"], []),
([r"re:\s"], ["/data/something00.txt", "/more/data", "/home"]),
([r"re:(.)(\1)"], ["/more/data", "/home", "\tstart/whitespace", "/whitespace/end\t"]),
(["", "", "",
"# This is a test with mixed pattern styles",
# Case-insensitive pattern
"re:(?i)BAR|ME$",
"",
"*whitespace*",
"fm:*/something00*"],
["/more/data"]),
([r" re:^\s "], ["/data/something00.txt", "/more/data", "/home", "/whitespace/end\t"]),
([r" re:\s$ "], ["/data/something00.txt", "/more/data", "/home", " #/wsfoobar", "\tstart/whitespace"]),
(["pp:./"], None),
(["pp:/"], [" #/wsfoobar", "\tstart/whitespace"]),
(["pp:aaabbb"], None),
(["pp:/data", "pp: #/", "pp:\tstart", "pp:/whitespace"], ["/more/data", "/home"]),
])
def test_patterns_from_file(tmpdir, lines, expected):
files = [
'/data/something00.txt', '/more/data', '/home',
' #/wsfoobar',
'\tstart/whitespace',
'/whitespace/end\t',
]
def evaluate(filename):
matcher = PatternMatcher(fallback=True)
matcher.add(load_excludes(open(filename, "rt")), False)
return [path for path in files if matcher.match(path)]
exclfile = tmpdir.join("exclude.txt")
with exclfile.open("wt") as fh:
fh.write("\n".join(lines))
assert evaluate(str(exclfile)) == (files if expected is None else expected)
@pytest.mark.parametrize("pattern, cls", [
("", FnmatchPattern),
# Default style
("*", FnmatchPattern),
("/data/*", FnmatchPattern),
# fnmatch style
("fm:", FnmatchPattern),
("fm:*", FnmatchPattern),
("fm:/data/*", FnmatchPattern),
("fm:fm:/data/*", FnmatchPattern),
# Regular expression
("re:", RegexPattern),
("re:.*", RegexPattern),
("re:^/something/", RegexPattern),
("re:re:^/something/", RegexPattern),
# Path prefix
("pp:", PathPrefixPattern),
("pp:/", PathPrefixPattern),
("pp:/data/", PathPrefixPattern),
("pp:pp:/data/", PathPrefixPattern),
# Shell-pattern style
("sh:", ShellPattern),
("sh:*", ShellPattern),
("sh:/data/*", ShellPattern),
("sh:sh:/data/*", ShellPattern),
])
def test_parse_pattern(pattern, cls):
assert isinstance(parse_pattern(pattern), cls)
@pytest.mark.parametrize("pattern", ["aa:", "fo:*", "00:", "x1:abc"])
def test_parse_pattern_error(pattern):
with pytest.raises(ValueError):
parse_pattern(pattern)
def test_pattern_matcher():
pm = PatternMatcher()
assert pm.fallback is None
for i in ["", "foo", "bar"]:
assert pm.match(i) is None
pm.add([RegexPattern("^a")], "A")
pm.add([RegexPattern("^b"), RegexPattern("^z")], "B")
pm.add([RegexPattern("^$")], "Empty")
pm.fallback = "FileNotFound"
assert pm.match("") == "Empty"
assert pm.match("aaa") == "A"
assert pm.match("bbb") == "B"
assert pm.match("ccc") == "FileNotFound"
assert pm.match("xyz") == "FileNotFound"
assert pm.match("z") == "B"
assert PatternMatcher(fallback="hey!").fallback == "hey!"
def test_compression_specs():
with pytest.raises(ValueError):
CompressionSpec('')
assert CompressionSpec('none') == dict(name='none')
assert CompressionSpec('lz4') == dict(name='lz4')
assert CompressionSpec('zlib') == dict(name='zlib', level=6)
assert CompressionSpec('zlib,0') == dict(name='zlib', level=0)
assert CompressionSpec('zlib,9') == dict(name='zlib', level=9)
with pytest.raises(ValueError):
CompressionSpec('zlib,9,invalid')
assert CompressionSpec('lzma') == dict(name='lzma', level=6)
assert CompressionSpec('lzma,0') == dict(name='lzma', level=0)
assert CompressionSpec('lzma,9') == dict(name='lzma', level=9)
with pytest.raises(ValueError):
CompressionSpec('lzma,9,invalid')
with pytest.raises(ValueError):
CompressionSpec('invalid')
def test_chunkerparams():
assert ChunkerParams('19,23,21,4095') == (19, 23, 21, 4095)
assert ChunkerParams('10,23,16,4095') == (10, 23, 16, 4095)
with pytest.raises(ValueError):
ChunkerParams('19,24,21,4095')
class MakePathSafeTestCase(BaseTestCase):
def test(self):
self.assert_equal(make_path_safe('/foo/bar'), 'foo/bar')
self.assert_equal(make_path_safe('/foo/bar'), 'foo/bar')
self.assert_equal(make_path_safe('/f/bar'), 'f/bar')
self.assert_equal(make_path_safe('fo/bar'), 'fo/bar')
self.assert_equal(make_path_safe('../foo/bar'), 'foo/bar')
self.assert_equal(make_path_safe('../../foo/bar'), 'foo/bar')
self.assert_equal(make_path_safe('/'), '.')
self.assert_equal(make_path_safe('/'), '.')
class MockArchive:
def __init__(self, ts):
self.ts = ts
def __repr__(self):
return repr(self.ts)
class PruneSplitTestCase(BaseTestCase):
def test(self):
def local_to_UTC(month, day):
"""Convert noon on the month and day in 2013 to UTC."""
seconds = mktime(strptime('2013-%02d-%02d 12:00' % (month, day), '%Y-%m-%d %H:%M'))
return datetime.fromtimestamp(seconds, tz=timezone.utc)
def subset(lst, indices):
return {lst[i] for i in indices}
def dotest(test_archives, n, skip, indices):
for ta in test_archives, reversed(test_archives):
self.assert_equal(set(prune_split(ta, '%Y-%m', n, skip)),
subset(test_archives, indices))
test_pairs = [(1, 1), (2, 1), (2, 28), (3, 1), (3, 2), (3, 31), (5, 1)]
test_dates = [local_to_UTC(month, day) for month, day in test_pairs]
test_archives = [MockArchive(date) for date in test_dates]
dotest(test_archives, 3, [], [6, 5, 2])
dotest(test_archives, -1, [], [6, 5, 2, 0])
dotest(test_archives, 3, [test_archives[6]], [5, 2, 0])
dotest(test_archives, 3, [test_archives[5]], [6, 2, 0])
dotest(test_archives, 3, [test_archives[4]], [6, 5, 2])
dotest(test_archives, 0, [], [])
class PruneWithinTestCase(BaseTestCase):
def test(self):
def subset(lst, indices):
return {lst[i] for i in indices}
def dotest(test_archives, within, indices):
for ta in test_archives, reversed(test_archives):
self.assert_equal(set(prune_within(ta, within)),
subset(test_archives, indices))
# 1 minute, 1.5 hours, 2.5 hours, 3.5 hours, 25 hours, 49 hours
test_offsets = [60, 90*60, 150*60, 210*60, 25*60*60, 49*60*60]
now = datetime.now(timezone.utc)
test_dates = [now - timedelta(seconds=s) for s in test_offsets]
test_archives = [MockArchive(date) for date in test_dates]
dotest(test_archives, '1H', [0])
dotest(test_archives, '2H', [0, 1])
dotest(test_archives, '3H', [0, 1, 2])
dotest(test_archives, '24H', [0, 1, 2, 3])
dotest(test_archives, '26H', [0, 1, 2, 3, 4])
dotest(test_archives, '2d', [0, 1, 2, 3, 4])
dotest(test_archives, '50H', [0, 1, 2, 3, 4, 5])
dotest(test_archives, '3d', [0, 1, 2, 3, 4, 5])
dotest(test_archives, '1w', [0, 1, 2, 3, 4, 5])
dotest(test_archives, '1m', [0, 1, 2, 3, 4, 5])
dotest(test_archives, '1y', [0, 1, 2, 3, 4, 5])
class StableDictTestCase(BaseTestCase):
def test(self):
d = StableDict(foo=1, bar=2, boo=3, baz=4)
self.assert_equal(list(d.items()), [('bar', 2), ('baz', 4), ('boo', 3), ('foo', 1)])
self.assert_equal(hashlib.md5(msgpack.packb(d)).hexdigest(), 'fc78df42cd60691b3ac3dd2a2b39903f')
class TestParseTimestamp(BaseTestCase):
def test(self):
self.assert_equal(parse_timestamp('2015-04-19T20:25:00.226410'), datetime(2015, 4, 19, 20, 25, 0, 226410, timezone.utc))
self.assert_equal(parse_timestamp('2015-04-19T20:25:00'), datetime(2015, 4, 19, 20, 25, 0, 0, timezone.utc))
def test_get_cache_dir():
"""test that get_cache_dir respects environment"""
# reset BORG_CACHE_DIR in order to test default
old_env = None
if os.environ.get('BORG_CACHE_DIR'):
old_env = os.environ['BORG_CACHE_DIR']
del os.environ['BORG_CACHE_DIR']
assert get_cache_dir() == os.path.join(os.path.expanduser('~'), '.cache', 'borg')
os.environ['XDG_CACHE_HOME'] = '/var/tmp/.cache'
assert get_cache_dir() == os.path.join('/var/tmp/.cache', 'borg')
os.environ['BORG_CACHE_DIR'] = '/var/tmp'
assert get_cache_dir() == '/var/tmp'
# reset old env
if old_env is not None:
os.environ['BORG_CACHE_DIR'] = old_env
def test_get_keys_dir():
"""test that get_keys_dir respects environment"""
# reset BORG_KEYS_DIR in order to test default
old_env = None
if os.environ.get('BORG_KEYS_DIR'):
old_env = os.environ['BORG_KEYS_DIR']
del os.environ['BORG_KEYS_DIR']
assert get_keys_dir() == os.path.join(os.path.expanduser('~'), '.config', 'borg', 'keys')
os.environ['XDG_CONFIG_HOME'] = '/var/tmp/.config'
assert get_keys_dir() == os.path.join('/var/tmp/.config', 'borg', 'keys')
os.environ['BORG_KEYS_DIR'] = '/var/tmp'
assert get_keys_dir() == '/var/tmp'
# reset old env
if old_env is not None:
os.environ['BORG_KEYS_DIR'] = old_env
@pytest.fixture()
def stats():
stats = Statistics()
stats.update(20, 10, unique=True)
return stats
def test_stats_basic(stats):
assert stats.osize == 20
assert stats.csize == stats.usize == 10
stats.update(20, 10, unique=False)
assert stats.osize == 40
assert stats.csize == 20
assert stats.usize == 10
def tests_stats_progress(stats, columns=80):
os.environ['COLUMNS'] = str(columns)
out = StringIO()
stats.show_progress(stream=out)
s = '20 B O 10 B C 10 B D 0 N '
buf = ' ' * (columns - len(s))
assert out.getvalue() == s + buf + "\r"
out = StringIO()
stats.update(10**3, 0, unique=False)
stats.show_progress(item={b'path': 'foo'}, final=False, stream=out)
s = '1.02 kB O 10 B C 10 B D 0 N foo'
buf = ' ' * (columns - len(s))
assert out.getvalue() == s + buf + "\r"
out = StringIO()
stats.show_progress(item={b'path': 'foo'*40}, final=False, stream=out)
s = '1.02 kB O 10 B C 10 B D 0 N foofoofoofoofoofoofoofo...oofoofoofoofoofoofoofoofoo'
buf = ' ' * (columns - len(s))
assert out.getvalue() == s + buf + "\r"
def test_stats_format(stats):
assert str(stats) == """\
Original size Compressed size Deduplicated size
This archive: 20 B 10 B 10 B"""
s = "{0.osize_fmt}".format(stats)
assert s == "20 B"
# kind of redundant, but id is variable so we can't match reliably
assert repr(stats) == '<Statistics object at {:#x} (20, 10, 10)>'.format(id(stats))
def test_file_size():
"""test the size formatting routines"""
si_size_map = {
0: '0 B', # no rounding necessary for those
1: '1 B',
142: '142 B',
999: '999 B',
1000: '1.00 kB', # rounding starts here
1001: '1.00 kB', # should be rounded away
1234: '1.23 kB', # should be rounded down
1235: '1.24 kB', # should be rounded up
1010: '1.01 kB', # rounded down as well
999990000: '999.99 MB', # rounded down
999990001: '999.99 MB', # rounded down
999995000: '1.00 GB', # rounded up to next unit
10**6: '1.00 MB', # and all the remaining units, megabytes
10**9: '1.00 GB', # gigabytes
10**12: '1.00 TB', # terabytes
10**15: '1.00 PB', # petabytes
10**18: '1.00 EB', # exabytes
10**21: '1.00 ZB', # zettabytes
10**24: '1.00 YB', # yottabytes
}
for size, fmt in si_size_map.items():
assert format_file_size(size) == fmt
def test_file_size_precision():
assert format_file_size(1234, precision=1) == '1.2 kB' # rounded down
assert format_file_size(1254, precision=1) == '1.3 kB' # rounded up
assert format_file_size(999990000, precision=1) == '1.0 GB' # and not 999.9 MB or 1000.0 MB
def test_is_slow_msgpack():
saved_packer = msgpack.Packer
try:
msgpack.Packer = msgpack.fallback.Packer
assert is_slow_msgpack()
finally:
msgpack.Packer = saved_packer
# this assumes that we have fast msgpack on test platform:
assert not is_slow_msgpack()
def test_yes_input():
inputs = list(TRUISH)
input = FakeInputs(inputs)
for i in inputs:
assert yes(input=input)
inputs = list(FALSISH)
input = FakeInputs(inputs)
for i in inputs:
assert not yes(input=input)
def test_yes_input_defaults():
inputs = list(DEFAULTISH)
input = FakeInputs(inputs)
for i in inputs:
assert yes(default=True, input=input)
input = FakeInputs(inputs)
for i in inputs:
assert not yes(default=False, input=input)
def test_yes_input_custom():
input = FakeInputs(['YES', 'SURE', 'NOPE', ])
assert yes(truish=('YES', ), input=input)
assert yes(truish=('SURE', ), input=input)
assert not yes(falsish=('NOPE', ), input=input)
def test_yes_env():
for value in TRUISH:
with environment_variable(OVERRIDE_THIS=value):
assert yes(env_var_override='OVERRIDE_THIS')
for value in FALSISH:
with environment_variable(OVERRIDE_THIS=value):
assert not yes(env_var_override='OVERRIDE_THIS')
def test_yes_env_default():
for value in DEFAULTISH:
with environment_variable(OVERRIDE_THIS=value):
assert yes(env_var_override='OVERRIDE_THIS', default=True)
with environment_variable(OVERRIDE_THIS=value):
assert not yes(env_var_override='OVERRIDE_THIS', default=False)
def test_yes_defaults():
input = FakeInputs(['invalid', '', ' '])
assert not yes(input=input) # default=False
assert not yes(input=input)
assert not yes(input=input)
input = FakeInputs(['invalid', '', ' '])
assert yes(default=True, input=input)
assert yes(default=True, input=input)
assert yes(default=True, input=input)
input = FakeInputs([])
assert yes(default=True, input=input)
assert not yes(default=False, input=input)
with pytest.raises(ValueError):
yes(default=None)
def test_yes_retry():
input = FakeInputs(['foo', 'bar', TRUISH[0], ])
assert yes(retry_msg='Retry: ', input=input)
input = FakeInputs(['foo', 'bar', FALSISH[0], ])
assert not yes(retry_msg='Retry: ', input=input)
def test_yes_no_retry():
input = FakeInputs(['foo', 'bar', TRUISH[0], ])
assert not yes(retry=False, default=False, input=input)
input = FakeInputs(['foo', 'bar', FALSISH[0], ])
assert yes(retry=False, default=True, input=input)
def test_yes_output(capfd):
input = FakeInputs(['invalid', 'y', 'n'])
assert yes(msg='intro-msg', false_msg='false-msg', true_msg='true-msg', retry_msg='retry-msg', input=input)
out, err = capfd.readouterr()
assert out == ''
assert 'intro-msg' in err
assert 'retry-msg' in err
assert 'true-msg' in err
assert not yes(msg='intro-msg', false_msg='false-msg', true_msg='true-msg', retry_msg='retry-msg', input=input)
out, err = capfd.readouterr()
assert out == ''
assert 'intro-msg' in err
assert 'retry-msg' not in err
assert 'false-msg' in err
def test_progress_percentage_multiline(capfd):
pi = ProgressIndicatorPercent(1000, step=5, start=0, same_line=False, msg="%3.0f%%", file=sys.stderr)
pi.show(0)
out, err = capfd.readouterr()
assert err == ' 0%\n'
pi.show(420)
out, err = capfd.readouterr()
assert err == ' 42%\n'
pi.show(1000)
out, err = capfd.readouterr()
assert err == '100%\n'
pi.finish()
out, err = capfd.readouterr()
assert err == ''
def test_progress_percentage_sameline(capfd):
pi = ProgressIndicatorPercent(1000, step=5, start=0, same_line=True, msg="%3.0f%%", file=sys.stderr)
pi.show(0)
out, err = capfd.readouterr()
assert err == ' 0%\r'
pi.show(420)
out, err = capfd.readouterr()
assert err == ' 42%\r'
pi.show(1000)
out, err = capfd.readouterr()
assert err == '100%\r'
pi.finish()
out, err = capfd.readouterr()
assert err == ' ' * 4 + '\r'
def test_progress_percentage_step(capfd):
pi = ProgressIndicatorPercent(100, step=2, start=0, same_line=False, msg="%3.0f%%", file=sys.stderr)
pi.show()
out, err = capfd.readouterr()
assert err == ' 0%\n'
pi.show()
out, err = capfd.readouterr()
assert err == '' # no output at 1% as we have step == 2
pi.show()
out, err = capfd.readouterr()
assert err == ' 2%\n'
def test_progress_endless(capfd):
pi = ProgressIndicatorEndless(step=1, file=sys.stderr)
pi.show()
out, err = capfd.readouterr()
assert err == '.'
pi.show()
out, err = capfd.readouterr()
assert err == '.'
pi.finish()
out, err = capfd.readouterr()
assert err == '\n'
def test_progress_endless_step(capfd):
pi = ProgressIndicatorEndless(step=2, file=sys.stderr)
pi.show()
out, err = capfd.readouterr()
assert err == '' # no output here as we have step == 2
pi.show()
out, err = capfd.readouterr()
assert err == '.'
pi.show()
out, err = capfd.readouterr()
assert err == '' # no output here as we have step == 2
pi.show()
out, err = capfd.readouterr()
assert err == '.'
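
The expectations in test_file_size and test_file_size_precision above can be read as a small algorithm: divide by 1000 until the rounded value drops below 1000, print plain byte counts without decimals, and print everything else with a fixed precision. The following is a minimal sketch of that idea only. The name format_size_si and its structure are assumptions made for illustration; this is not the format_file_size implementation from borg.helpers, and it is only checked here against values whose rounding is unambiguous.

```python
def format_size_si(num_bytes, precision=2):
    """Hypothetical helper: format a byte count using decimal (SI) units."""
    units = ['', 'k', 'M', 'G', 'T', 'P', 'E', 'Z', 'Y']
    value = num_bytes
    for unit in units[:-1]:
        # Compare the rounded value, so a number that rounds up to 1000
        # moves on to the next larger unit instead of printing "1000.00".
        if abs(round(value, precision)) < 1000:
            break
        value = value / 1000
    else:
        # Ran out of smaller units: use the largest one.
        unit = units[-1]
    if isinstance(value, int):
        # Values below 1000 bytes were never divided and stay integers.
        return '{} {}B'.format(value, unit)
    return '{:.{}f} {}B'.format(value, precision, unit)


print(format_size_si(999))      # 999 B
print(format_size_si(1234))     # 1.23 kB
print(format_size_si(10**9))    # 1.00 GB
```

Comparing the rounded value rather than the raw one is the design choice that the "rounded up to next unit" cases in the test table exercise.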

103
borg/testsuite/key.py Normal file

@ -0,0 +1,103 @@
import os
import re
import shutil
import tempfile
from binascii import hexlify, unhexlify
from ..crypto import bytes_to_long, num_aes_blocks
from ..key import PlaintextKey, PassphraseKey, KeyfileKey
from ..helpers import Location
from . import BaseTestCase
class KeyTestCase(BaseTestCase):
class MockArgs:
location = Location(tempfile.mkstemp()[1])
keyfile2_key_file = """
BORG_KEY 0000000000000000000000000000000000000000000000000000000000000000
hqppdGVyYXRpb25zzgABhqCkaGFzaNoAIMyonNI+7Cjv0qHi0AOBM6bLGxACJhfgzVD2oq
bIS9SFqWFsZ29yaXRobaZzaGEyNTakc2FsdNoAINNK5qqJc1JWSUjACwFEWGTdM7Nd0a5l
1uBGPEb+9XM9p3ZlcnNpb24BpGRhdGHaANAYDT5yfPpU099oBJwMomsxouKyx/OG4QIXK2
hQCG2L2L/9PUu4WIuKvGrsXoP7syemujNfcZws5jLp2UPva4PkQhQsrF1RYDEMLh2eF9Ol
rwtkThq1tnh7KjWMG9Ijt7/aoQtq0zDYP/xaFF8XXSJxiyP5zjH5+spB6RL0oQHvbsliSh
/cXJq7jrqmrJ1phd6dg4SHAM/i+hubadZoS6m25OQzYAW09wZD/phG8OVa698Z5ed3HTaT
SmrtgJL3EoOKgUI9d6BLE4dJdBqntifo""".strip()
keyfile2_cdata = unhexlify(re.sub(r'\W', '', """
0055f161493fcfc16276e8c31493c4641e1eb19a79d0326fad0291e5a9c98e5933
00000000000003e8d21eaf9b86c297a8cd56432e1915bb
"""))
keyfile2_id = unhexlify('c3fbf14bc001ebcc3cd86e696c13482ed071740927cd7cbe1b01b4bfcee49314')
def setUp(self):
self.tmppath = tempfile.mkdtemp()
os.environ['BORG_KEYS_DIR'] = self.tmppath
def tearDown(self):
shutil.rmtree(self.tmppath)
class MockRepository:
class _Location:
orig = '/some/place'
_location = _Location()
id = bytes(32)
def test_plaintext(self):
key = PlaintextKey.create(None, None)
data = b'foo'
self.assert_equal(hexlify(key.id_hash(data)), b'2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae')
self.assert_equal(data, key.decrypt(key.id_hash(data), key.encrypt(data)))
def test_keyfile(self):
os.environ['BORG_PASSPHRASE'] = 'test'
key = KeyfileKey.create(self.MockRepository(), self.MockArgs())
self.assert_equal(bytes_to_long(key.enc_cipher.iv, 8), 0)
manifest = key.encrypt(b'XXX')
self.assert_equal(key.extract_nonce(manifest), 0)
manifest2 = key.encrypt(b'XXX')
self.assert_not_equal(manifest, manifest2)
self.assert_equal(key.decrypt(None, manifest), key.decrypt(None, manifest2))
self.assert_equal(key.extract_nonce(manifest2), 1)
iv = key.extract_nonce(manifest)
key2 = KeyfileKey.detect(self.MockRepository(), manifest)
self.assert_equal(bytes_to_long(key2.enc_cipher.iv, 8), iv + num_aes_blocks(len(manifest) - KeyfileKey.PAYLOAD_OVERHEAD))
# Key data sanity check
self.assert_equal(len(set([key2.id_key, key2.enc_key, key2.enc_hmac_key])), 3)
self.assert_not_equal(key2.chunk_seed, 0)
data = b'foo'
self.assert_equal(data, key2.decrypt(key.id_hash(data), key.encrypt(data)))
def test_keyfile2(self):
with open(os.path.join(os.environ['BORG_KEYS_DIR'], 'keyfile'), 'w') as fd:
fd.write(self.keyfile2_key_file)
os.environ['BORG_PASSPHRASE'] = 'passphrase'
key = KeyfileKey.detect(self.MockRepository(), self.keyfile2_cdata)
self.assert_equal(key.decrypt(self.keyfile2_id, self.keyfile2_cdata), b'payload')
def test_passphrase(self):
os.environ['BORG_PASSPHRASE'] = 'test'
key = PassphraseKey.create(self.MockRepository(), None)
self.assert_equal(bytes_to_long(key.enc_cipher.iv, 8), 0)
self.assert_equal(hexlify(key.id_key), b'793b0717f9d8fb01c751a487e9b827897ceea62409870600013fbc6b4d8d7ca6')
self.assert_equal(hexlify(key.enc_hmac_key), b'b885a05d329a086627412a6142aaeb9f6c54ab7950f996dd65587251f6bc0901')
self.assert_equal(hexlify(key.enc_key), b'2ff3654c6daf7381dbbe718d2b20b4f1ea1e34caa6cc65f6bb3ac376b93fed2a')
self.assert_equal(key.chunk_seed, -775740477)
manifest = key.encrypt(b'XXX')
self.assert_equal(key.extract_nonce(manifest), 0)
manifest2 = key.encrypt(b'XXX')
self.assert_not_equal(manifest, manifest2)
self.assert_equal(key.decrypt(None, manifest), key.decrypt(None, manifest2))
self.assert_equal(key.extract_nonce(manifest2), 1)
iv = key.extract_nonce(manifest)
key2 = PassphraseKey.detect(self.MockRepository(), manifest)
self.assert_equal(bytes_to_long(key2.enc_cipher.iv, 8), iv + num_aes_blocks(len(manifest) - PassphraseKey.PAYLOAD_OVERHEAD))
self.assert_equal(key.id_key, key2.id_key)
self.assert_equal(key.enc_hmac_key, key2.enc_hmac_key)
self.assert_equal(key.enc_key, key2.enc_key)
self.assert_equal(key.chunk_seed, key2.chunk_seed)
data = b'foo'
self.assert_equal(hexlify(key.id_hash(data)), b'818217cf07d37efad3860766dcdf1d21e401650fed2d76ed1d797d3aae925990')
self.assert_equal(data, key2.decrypt(key2.id_hash(data), key.encrypt(data)))
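
Two of the checks in the key tests above can be reproduced with the standard library alone. The id hash expected in test_plaintext is the plain SHA-256 digest of the data, and the nonce assertions are consistent with a counter that advances by one per 16-byte AES block encrypted. The sketch below illustrates both. The helper names aes_blocks and next_counter are hypothetical, and the counter model is an assumption inferred from the assertions, not taken from borg.crypto or borg.key.

```python
import hashlib

AES_BLOCK_SIZE = 16  # bytes per AES block


def aes_blocks(length):
    """Hypothetical helper: blocks needed to cover `length` bytes (rounded up)."""
    return (length + AES_BLOCK_SIZE - 1) // AES_BLOCK_SIZE


def next_counter(counter, payload_length):
    """Assumed model: the counter advances by the number of blocks used."""
    return counter + aes_blocks(payload_length)


# The unkeyed id hash of b'foo', as expected by test_plaintext.
print(hashlib.sha256(b'foo').hexdigest())

# Encrypting the 3-byte payload b'XXX' uses one block, so a counter that
# starts at 0 is at 1 for the next message, matching the nonce assertions.
print(next_counter(0, len(b'XXX')))  # 1
```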

134
borg/testsuite/locking.py Normal file

@ -0,0 +1,134 @@
import time
import pytest
from ..locking import get_id, TimeoutTimer, ExclusiveLock, UpgradableLock, LockRoster, \
ADD, REMOVE, SHARED, EXCLUSIVE, LockTimeout
ID1 = "foo", 1, 1
ID2 = "bar", 2, 2
def test_id():
hostname, pid, tid = get_id()
assert isinstance(hostname, str)
assert isinstance(pid, int)
assert isinstance(tid, int)
assert len(hostname) > 0
assert pid > 0
class TestTimeoutTimer:
def test_timeout(self):
timeout = 0.5
t = TimeoutTimer(timeout).start()
assert not t.timed_out()
time.sleep(timeout * 1.5)
assert t.timed_out()
def test_notimeout_sleep(self):
timeout, sleep = None, 0.5
t = TimeoutTimer(timeout, sleep).start()
assert not t.timed_out_or_sleep()
assert time.time() >= t.start_time + 1 * sleep
assert not t.timed_out_or_sleep()
assert time.time() >= t.start_time + 2 * sleep
@pytest.fixture()
def lockpath(tmpdir):
return str(tmpdir.join('lock'))
class TestExclusiveLock:
def test_checks(self, lockpath):
with ExclusiveLock(lockpath, timeout=1) as lock:
assert lock.is_locked() and lock.by_me()
def test_acquire_break_reacquire(self, lockpath):
lock = ExclusiveLock(lockpath, id=ID1).acquire()
lock.break_lock()
with ExclusiveLock(lockpath, id=ID2):
pass
def test_timeout(self, lockpath):
with ExclusiveLock(lockpath, id=ID1):
with pytest.raises(LockTimeout):
ExclusiveLock(lockpath, id=ID2, timeout=0.1).acquire()
class TestUpgradableLock:
def test_shared(self, lockpath):
lock1 = UpgradableLock(lockpath, exclusive=False, id=ID1).acquire()
lock2 = UpgradableLock(lockpath, exclusive=False, id=ID2).acquire()
assert len(lock1._roster.get(SHARED)) == 2
assert len(lock1._roster.get(EXCLUSIVE)) == 0
lock1.release()
lock2.release()
def test_exclusive(self, lockpath):
with UpgradableLock(lockpath, exclusive=True, id=ID1) as lock:
assert len(lock._roster.get(SHARED)) == 0
assert len(lock._roster.get(EXCLUSIVE)) == 1
def test_upgrade(self, lockpath):
with UpgradableLock(lockpath, exclusive=False) as lock:
lock.upgrade()
lock.upgrade() # NOP
assert len(lock._roster.get(SHARED)) == 0
assert len(lock._roster.get(EXCLUSIVE)) == 1
def test_downgrade(self, lockpath):
with UpgradableLock(lockpath, exclusive=True) as lock:
lock.downgrade()
lock.downgrade() # NOP
assert len(lock._roster.get(SHARED)) == 1
assert len(lock._roster.get(EXCLUSIVE)) == 0
def test_break(self, lockpath):
lock = UpgradableLock(lockpath, exclusive=True, id=ID1).acquire()
lock.break_lock()
assert len(lock._roster.get(SHARED)) == 0
assert len(lock._roster.get(EXCLUSIVE)) == 0
with UpgradableLock(lockpath, exclusive=True, id=ID2):
pass
def test_timeout(self, lockpath):
with UpgradableLock(lockpath, exclusive=False, id=ID1):
with pytest.raises(LockTimeout):
UpgradableLock(lockpath, exclusive=True, id=ID2, timeout=0.1).acquire()
with UpgradableLock(lockpath, exclusive=True, id=ID1):
with pytest.raises(LockTimeout):
UpgradableLock(lockpath, exclusive=False, id=ID2, timeout=0.1).acquire()
with UpgradableLock(lockpath, exclusive=True, id=ID1):
with pytest.raises(LockTimeout):
UpgradableLock(lockpath, exclusive=True, id=ID2, timeout=0.1).acquire()
@pytest.fixture()
def rosterpath(tmpdir):
return str(tmpdir.join('roster'))
class TestLockRoster:
def test_empty(self, rosterpath):
roster = LockRoster(rosterpath)
empty = roster.load()
roster.save(empty)
assert empty == {}
def test_modify_get(self, rosterpath):
roster1 = LockRoster(rosterpath, id=ID1)
assert roster1.get(SHARED) == set()
roster1.modify(SHARED, ADD)
assert roster1.get(SHARED) == {ID1, }
roster2 = LockRoster(rosterpath, id=ID2)
roster2.modify(SHARED, ADD)
assert roster2.get(SHARED) == {ID1, ID2, }
roster1 = LockRoster(rosterpath, id=ID1)
roster1.modify(SHARED, REMOVE)
assert roster1.get(SHARED) == {ID2, }
roster2 = LockRoster(rosterpath, id=ID2)
roster2.modify(SHARED, REMOVE)
assert roster2.get(SHARED) == set()
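The roster tests above treat the roster as a small persistent mapping from lock type to the set of holder IDs. A minimal sketch of that idea follows. It assumes a JSON file as storage and IDs given as `(hostname, pid, tid)` tuples; `SketchRoster` and `demo` are hypothetical names for illustration, not borg's `LockRoster`.

```python
import json
import os
import tempfile

SHARED, EXCLUSIVE = "shared", "exclusive"
ADD, REMOVE = "add", "remove"


class SketchRoster:
    """Hypothetical roster: maps a lock type to the set of holder IDs."""

    def __init__(self, path, id):
        self.path = path
        self.id = id  # assumed to be a (hostname, pid, tid) tuple

    def load(self):
        # A missing file is treated as an empty roster.
        try:
            with open(self.path) as f:
                return json.load(f)
        except FileNotFoundError:
            return {}

    def save(self, data):
        with open(self.path, "w") as f:
            json.dump(data, f)

    def get(self, key):
        # JSON stores tuples as lists, so convert back before building the set.
        return {tuple(entry) for entry in self.load().get(key, [])}

    def modify(self, key, op):
        holders = self.get(key)
        if op == ADD:
            holders.add(self.id)
        elif op == REMOVE:
            holders.discard(self.id)
        else:
            raise ValueError("unknown roster operation: %r" % (op,))
        data = self.load()
        data[key] = [list(entry) for entry in holders]
        self.save(data)


def demo():
    """Two holders share one roster file, then the first one leaves."""
    path = os.path.join(tempfile.mkdtemp(), "roster")
    id1, id2 = ("foo", 1, 1), ("bar", 2, 2)
    SketchRoster(path, id1).modify(SHARED, ADD)
    SketchRoster(path, id2).modify(SHARED, ADD)
    SketchRoster(path, id1).modify(SHARED, REMOVE)
    return SketchRoster(path, id2).get(SHARED)
```

Because every call re-reads the file, two roster objects pointing at the same path see each other's changes, which is the property `test_modify_get` relies on. The sketch does no locking of the roster file itself.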

View file

@ -4,7 +4,6 @@ from io import StringIO
import pytest
from ..logger import find_parent_module, create_logger, setup_logging
logger = create_logger()
@ -12,30 +11,28 @@ logger = create_logger()
def io_logger():
io = StringIO()
handler = setup_logging(stream=io, env_var=None)
handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
handler.setFormatter(logging.Formatter('%(name)s: %(message)s'))
logger.setLevel(logging.DEBUG)
return io
def test_setup_logging(io_logger):
logger.info("hello world")
assert io_logger.getvalue() == "borg.testsuite.logger_test: hello world\n"
logger.info('hello world')
assert io_logger.getvalue() == "borg.testsuite.logger: hello world\n"
def test_multiple_loggers(io_logger):
logger = logging.getLogger(__name__)
logger.info("hello world 1")
assert io_logger.getvalue() == "borg.testsuite.logger_test: hello world 1\n"
logger = logging.getLogger("borg.testsuite.logger_test")
logger.info("hello world 2")
assert (
io_logger.getvalue() == "borg.testsuite.logger_test: hello world 1\nborg.testsuite.logger_test: hello world 2\n"
)
logger.info('hello world 1')
assert io_logger.getvalue() == "borg.testsuite.logger: hello world 1\n"
logger = logging.getLogger('borg.testsuite.logger')
logger.info('hello world 2')
assert io_logger.getvalue() == "borg.testsuite.logger: hello world 1\nborg.testsuite.logger: hello world 2\n"
io_logger.truncate(0)
io_logger.seek(0)
logger = logging.getLogger("borg.testsuite.logger_test")
logger.info("hello world 2")
assert io_logger.getvalue() == "borg.testsuite.logger_test: hello world 2\n"
logger = logging.getLogger('borg.testsuite.logger')
logger.info('hello world 2')
assert io_logger.getvalue() == "borg.testsuite.logger: hello world 2\n"
def test_parent_module():
@ -43,7 +40,7 @@ def test_parent_module():
def test_lazy_logger():
# Just calling all the methods of the proxy.
# just calling all the methods of the proxy
logger.setLevel(logging.DEBUG)
logger.debug("debug")
logger.info("info")

View file

@ -1,37 +1,33 @@
from tempfile import TemporaryFile
from ..lrucache import LRUCache
import pytest
from ...helpers.lrucache import LRUCache
from tempfile import TemporaryFile
class TestLRUCache:
def test_lrucache(self):
c = LRUCache(2)
c = LRUCache(2, dispose=lambda _: None)
assert len(c) == 0
assert c.items() == set()
for i, x in enumerate("abc"):
for i, x in enumerate('abc'):
c[x] = i
assert len(c) == 2
assert c.items() == {("b", 1), ("c", 2)}
assert "a" not in c
assert "b" in c
assert c.items() == set([('b', 1), ('c', 2)])
assert 'a' not in c
assert 'b' in c
with pytest.raises(KeyError):
c["a"]
assert c.get("a") is None
assert c.get("a", "foo") == "foo"
assert c["b"] == 1
assert c.get("b") == 1
assert c["c"] == 2
c["d"] = 3
c['a']
assert c['b'] == 1
assert c['c'] == 2
c['d'] = 3
assert len(c) == 2
assert c["c"] == 2
assert c["d"] == 3
del c["c"]
assert c['c'] == 2
assert c['d'] == 3
del c['c']
assert len(c) == 1
with pytest.raises(KeyError):
c["c"]
assert c["d"] == 3
c['c']
assert c['d'] == 3
c.clear()
assert c.items() == set()
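The assertions in `test_lrucache` pin down the expected behaviour: a fixed capacity, eviction of the least recently used entry, a `dispose` callback, and `items()` returning a set. The following is a minimal sketch that satisfies them, built on `collections.OrderedDict`. `SketchLRUCache` is a hypothetical stand-in and should not be read as the implementation in borg's `lrucache` module.

```python
from collections import OrderedDict


class SketchLRUCache:
    """Hypothetical LRU cache; the oldest entry sits at the front of the dict."""

    def __init__(self, capacity, dispose=lambda value: None):
        self._capacity = capacity
        self._dispose = dispose  # called with each value that leaves the cache
        self._data = OrderedDict()

    def __setitem__(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)  # mark as most recently used
        while len(self._data) > self._capacity:
            _, evicted = self._data.popitem(last=False)
            self._dispose(evicted)

    def __getitem__(self, key):
        value = self._data[key]  # raises KeyError for unknown keys
        self._data.move_to_end(key)
        return value

    def __delitem__(self, key):
        self._dispose(self._data.pop(key))

    def __contains__(self, key):
        return key in self._data

    def __len__(self):
        return len(self._data)

    def items(self):
        return set(self._data.items())

    def clear(self):
        for value in self._data.values():
            self._dispose(value)
        self._data.clear()
```

Reading a key counts as a use, so after `c['b']` and then `c['c']` the next insertion evicts `'b'`, which matches the order the test expects.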

140
borg/testsuite/platform.py Normal file
View file

@ -0,0 +1,140 @@
import os
import shutil
import sys
import tempfile
import unittest
from ..platform import acl_get, acl_set
from . import BaseTestCase
ACCESS_ACL = """
user::rw-
user:root:rw-:0
user:9999:r--:9999
group::r--
group:root:r--:0
group:9999:r--:9999
mask::rw-
other::r--
""".strip().encode('ascii')
DEFAULT_ACL = """
user::rw-
user:root:r--:0
user:8888:r--:8888
group::r--
group:root:r--:0
group:8888:r--:8888
mask::rw-
other::r--
""".strip().encode('ascii')
def fakeroot_detected():
return 'FAKEROOTKEY' in os.environ
@unittest.skipUnless(sys.platform.startswith('linux'), 'linux only test')
@unittest.skipIf(fakeroot_detected(), 'not compatible with fakeroot')
class PlatformLinuxTestCase(BaseTestCase):
def setUp(self):
self.tmpdir = tempfile.mkdtemp()
def tearDown(self):
shutil.rmtree(self.tmpdir)
def get_acl(self, path, numeric_owner=False):
item = {}
acl_get(path, item, os.stat(path), numeric_owner=numeric_owner)
return item
def set_acl(self, path, access=None, default=None, numeric_owner=False):
item = {b'acl_access': access, b'acl_default': default}
acl_set(path, item, numeric_owner=numeric_owner)
def test_access_acl(self):
file = tempfile.NamedTemporaryFile()
self.assert_equal(self.get_acl(file.name), {})
self.set_acl(file.name, access=b'user::rw-\ngroup::r--\nmask::rw-\nother::---\nuser:root:rw-:9999\ngroup:root:rw-:9999\n', numeric_owner=False)
self.assert_in(b'user:root:rw-:0', self.get_acl(file.name)[b'acl_access'])
self.assert_in(b'group:root:rw-:0', self.get_acl(file.name)[b'acl_access'])
self.assert_in(b'user:0:rw-:0', self.get_acl(file.name, numeric_owner=True)[b'acl_access'])
file2 = tempfile.NamedTemporaryFile()
self.set_acl(file2.name, access=b'user::rw-\ngroup::r--\nmask::rw-\nother::---\nuser:root:rw-:9999\ngroup:root:rw-:9999\n', numeric_owner=True)
self.assert_in(b'user:9999:rw-:9999', self.get_acl(file2.name)[b'acl_access'])
self.assert_in(b'group:9999:rw-:9999', self.get_acl(file2.name)[b'acl_access'])
def test_default_acl(self):
self.assert_equal(self.get_acl(self.tmpdir), {})
self.set_acl(self.tmpdir, access=ACCESS_ACL, default=DEFAULT_ACL)
self.assert_equal(self.get_acl(self.tmpdir)[b'acl_access'], ACCESS_ACL)
self.assert_equal(self.get_acl(self.tmpdir)[b'acl_default'], DEFAULT_ACL)
def test_non_ascii_acl(self):
# Testing non-ascii ACL processing to see whether our code is robust.
# I have no idea whether non-ascii ACLs are allowed by the standard,
# but in practice they seem to be out there and must not make our code explode.
file = tempfile.NamedTemporaryFile()
self.assert_equal(self.get_acl(file.name), {})
nothing_special = 'user::rw-\ngroup::r--\nmask::rw-\nother::---\n'.encode('ascii')
# TODO: can this be tested without having an existing system user übel with uid 666 gid 666?
user_entry = 'user:übel:rw-:666'.encode('utf-8')
user_entry_numeric = 'user:666:rw-:666'.encode('ascii')
group_entry = 'group:übel:rw-:666'.encode('utf-8')
group_entry_numeric = 'group:666:rw-:666'.encode('ascii')
acl = b'\n'.join([nothing_special, user_entry, group_entry])
self.set_acl(file.name, access=acl, numeric_owner=False)
acl_access = self.get_acl(file.name, numeric_owner=False)[b'acl_access']
self.assert_in(user_entry, acl_access)
self.assert_in(group_entry, acl_access)
acl_access_numeric = self.get_acl(file.name, numeric_owner=True)[b'acl_access']
self.assert_in(user_entry_numeric, acl_access_numeric)
self.assert_in(group_entry_numeric, acl_access_numeric)
file2 = tempfile.NamedTemporaryFile()
self.set_acl(file2.name, access=acl, numeric_owner=True)
acl_access = self.get_acl(file2.name, numeric_owner=False)[b'acl_access']
self.assert_in(user_entry, acl_access)
self.assert_in(group_entry, acl_access)
acl_access_numeric = self.get_acl(file2.name, numeric_owner=True)[b'acl_access']
self.assert_in(user_entry_numeric, acl_access_numeric)
self.assert_in(group_entry_numeric, acl_access_numeric)
def test_utils(self):
from ..platform_linux import acl_use_local_uid_gid
self.assert_equal(acl_use_local_uid_gid(b'user:nonexistent1234:rw-:1234'), b'user:1234:rw-')
self.assert_equal(acl_use_local_uid_gid(b'group:nonexistent1234:rw-:1234'), b'group:1234:rw-')
self.assert_equal(acl_use_local_uid_gid(b'user:root:rw-:0'), b'user:0:rw-')
self.assert_equal(acl_use_local_uid_gid(b'group:root:rw-:0'), b'group:0:rw-')
@unittest.skipUnless(sys.platform.startswith('darwin'), 'OS X only test')
@unittest.skipIf(fakeroot_detected(), 'not compatible with fakeroot')
class PlatformDarwinTestCase(BaseTestCase):
def setUp(self):
self.tmpdir = tempfile.mkdtemp()
def tearDown(self):
shutil.rmtree(self.tmpdir)
def get_acl(self, path, numeric_owner=False):
item = {}
acl_get(path, item, os.stat(path), numeric_owner=numeric_owner)
return item
def set_acl(self, path, acl, numeric_owner=False):
item = {b'acl_extended': acl}
acl_set(path, item, numeric_owner=numeric_owner)
def test_access_acl(self):
file = tempfile.NamedTemporaryFile()
file2 = tempfile.NamedTemporaryFile()
self.assert_equal(self.get_acl(file.name), {})
self.set_acl(file.name, b'!#acl 1\ngroup:ABCDEFAB-CDEF-ABCD-EFAB-CDEF00000000:staff:0:allow:read\nuser:FFFFEEEE-DDDD-CCCC-BBBB-AAAA00000000:root:0:allow:read\n', numeric_owner=False)
self.assert_in(b'group:ABCDEFAB-CDEF-ABCD-EFAB-CDEF00000014:staff:20:allow:read', self.get_acl(file.name)[b'acl_extended'])
self.assert_in(b'user:FFFFEEEE-DDDD-CCCC-BBBB-AAAA00000000:root:0:allow:read', self.get_acl(file.name)[b'acl_extended'])
self.set_acl(file2.name, b'!#acl 1\ngroup:ABCDEFAB-CDEF-ABCD-EFAB-CDEF00000000:staff:0:allow:read\nuser:FFFFEEEE-DDDD-CCCC-BBBB-AAAA00000000:root:0:allow:read\n', numeric_owner=True)
self.assert_in(b'group:ABCDEFAB-CDEF-ABCD-EFAB-CDEF00000000:wheel:0:allow:read', self.get_acl(file2.name)[b'acl_extended'])
self.assert_in(b'group:ABCDEFAB-CDEF-ABCD-EFAB-CDEF00000000::0:allow:read', self.get_acl(file2.name, numeric_owner=True)[b'acl_extended'])
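The Linux `test_utils` case above expects four-field ACL entries (`type:name:perms:id`) to be reduced to three fields, with the name replaced by a numeric ID. A sketch of that transformation follows. The name lookups are injected as callables so the example does not depend on the local user database; `use_local_ids`, `lookup_user`, and `lookup_group` are hypothetical names, and the fallback to the stored ID for unknown names is an assumption drawn from the test's expected values.

```python
def use_local_ids(acl, user2uid, group2gid):
    """Rewrite 'type:name:perms:id' entries as 'type:<local id>:perms'.

    user2uid / group2gid take (name, default) and return the local ID,
    or the default when the name is unknown on this system.
    """
    entries = []
    for entry in acl.decode('utf-8', 'surrogateescape').split('\n'):
        if not entry:
            continue
        fields = entry.split(':')
        # Entries such as 'user::rw-' have an empty name and stay unchanged.
        if fields[0] == 'user' and fields[1]:
            fields[1] = str(user2uid(fields[1], fields[3]))
        elif fields[0] == 'group' and fields[1]:
            fields[1] = str(group2gid(fields[1], fields[3]))
        entries.append(':'.join(fields[:3]))
    return '\n'.join(entries).encode('utf-8', 'surrogateescape')


# Stand-in lookup tables; a real implementation would consult the system.
KNOWN_USERS = {'root': 0}
KNOWN_GROUPS = {'root': 0}


def lookup_user(name, default):
    return KNOWN_USERS.get(name, default)


def lookup_group(name, default):
    return KNOWN_GROUPS.get(name, default)
```

With these stand-in tables, `use_local_ids(b'user:nonexistent1234:rw-:1234', lookup_user, lookup_group)` keeps the stored ID because the name is unknown, while `root` resolves through the table.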

View file

@ -0,0 +1,398 @@
import os
import shutil
import sys
import tempfile
from unittest.mock import patch
from ..hashindex import NSIndex
from ..helpers import Location, IntegrityError
from ..locking import UpgradableLock, LockFailed
from ..remote import RemoteRepository, InvalidRPCMethod
from ..repository import Repository, LoggedIO, TAG_COMMIT
from . import BaseTestCase
class RepositoryTestCaseBase(BaseTestCase):
key_size = 32
def open(self, create=False):
return Repository(os.path.join(self.tmppath, 'repository'), create=create)
def setUp(self):
self.tmppath = tempfile.mkdtemp()
self.repository = self.open(create=True)
self.repository.__enter__()
def tearDown(self):
self.repository.close()
shutil.rmtree(self.tmppath)
def reopen(self):
if self.repository:
self.repository.close()
self.repository = self.open()
class RepositoryTestCase(RepositoryTestCaseBase):
def test1(self):
for x in range(100):
self.repository.put(('%-32d' % x).encode('ascii'), b'SOMEDATA')
key50 = ('%-32d' % 50).encode('ascii')
self.assert_equal(self.repository.get(key50), b'SOMEDATA')
self.repository.delete(key50)
self.assert_raises(Repository.ObjectNotFound, lambda: self.repository.get(key50))
self.repository.commit()
self.repository.close()
with self.open() as repository2:
self.assert_raises(Repository.ObjectNotFound, lambda: repository2.get(key50))
for x in range(100):
if x == 50:
continue
self.assert_equal(repository2.get(('%-32d' % x).encode('ascii')), b'SOMEDATA')
def test2(self):
"""Test multiple sequential transactions
"""
self.repository.put(b'00000000000000000000000000000000', b'foo')
self.repository.put(b'00000000000000000000000000000001', b'foo')
self.repository.commit()
self.repository.delete(b'00000000000000000000000000000000')
self.repository.put(b'00000000000000000000000000000001', b'bar')
self.repository.commit()
self.assert_equal(self.repository.get(b'00000000000000000000000000000001'), b'bar')
def test_consistency(self):
"""Test cache consistency
"""
self.repository.put(b'00000000000000000000000000000000', b'foo')
self.assert_equal(self.repository.get(b'00000000000000000000000000000000'), b'foo')
self.repository.put(b'00000000000000000000000000000000', b'foo2')
self.assert_equal(self.repository.get(b'00000000000000000000000000000000'), b'foo2')
self.repository.put(b'00000000000000000000000000000000', b'bar')
self.assert_equal(self.repository.get(b'00000000000000000000000000000000'), b'bar')
self.repository.delete(b'00000000000000000000000000000000')
self.assert_raises(Repository.ObjectNotFound, lambda: self.repository.get(b'00000000000000000000000000000000'))
def test_consistency2(self):
"""Test cache consistency2
"""
self.repository.put(b'00000000000000000000000000000000', b'foo')
self.assert_equal(self.repository.get(b'00000000000000000000000000000000'), b'foo')
self.repository.commit()
self.repository.put(b'00000000000000000000000000000000', b'foo2')
self.assert_equal(self.repository.get(b'00000000000000000000000000000000'), b'foo2')
self.repository.rollback()
self.assert_equal(self.repository.get(b'00000000000000000000000000000000'), b'foo')
def test_overwrite_in_same_transaction(self):
"""Test overwriting a key within a single transaction
"""
self.repository.put(b'00000000000000000000000000000000', b'foo')
self.repository.put(b'00000000000000000000000000000000', b'foo2')
self.repository.commit()
self.assert_equal(self.repository.get(b'00000000000000000000000000000000'), b'foo2')
def test_single_kind_transactions(self):
# put
self.repository.put(b'00000000000000000000000000000000', b'foo')
self.repository.commit()
self.repository.close()
# replace
self.repository = self.open()
with self.repository:
self.repository.put(b'00000000000000000000000000000000', b'bar')
self.repository.commit()
# delete
self.repository = self.open()
with self.repository:
self.repository.delete(b'00000000000000000000000000000000')
self.repository.commit()
def test_list(self):
for x in range(100):
self.repository.put(('%-32d' % x).encode('ascii'), b'SOMEDATA')
all = self.repository.list()
self.assert_equal(len(all), 100)
first_half = self.repository.list(limit=50)
self.assert_equal(len(first_half), 50)
self.assert_equal(first_half, all[:50])
second_half = self.repository.list(marker=first_half[-1])
self.assert_equal(len(second_half), 50)
self.assert_equal(second_half, all[50:])
self.assert_equal(len(self.repository.list(limit=50)), 50)
class RepositoryCommitTestCase(RepositoryTestCaseBase):
def add_keys(self):
self.repository.put(b'00000000000000000000000000000000', b'foo')
self.repository.put(b'00000000000000000000000000000001', b'bar')
self.repository.put(b'00000000000000000000000000000003', b'bar')
self.repository.commit()
self.repository.put(b'00000000000000000000000000000001', b'bar2')
self.repository.put(b'00000000000000000000000000000002', b'boo')
self.repository.delete(b'00000000000000000000000000000003')
def test_replay_of_missing_index(self):
self.add_keys()
for name in os.listdir(self.repository.path):
if name.startswith('index.'):
os.unlink(os.path.join(self.repository.path, name))
self.reopen()
with self.repository:
self.assert_equal(len(self.repository), 3)
self.assert_equal(self.repository.check(), True)
def test_crash_before_compact_segments(self):
self.add_keys()
self.repository.compact_segments = None
try:
self.repository.commit()
except TypeError:
pass
self.reopen()
with self.repository:
self.assert_equal(len(self.repository), 3)
self.assert_equal(self.repository.check(), True)
def test_replay_of_readonly_repository(self):
self.add_keys()
for name in os.listdir(self.repository.path):
if name.startswith('index.'):
os.unlink(os.path.join(self.repository.path, name))
with patch.object(UpgradableLock, 'upgrade', side_effect=LockFailed) as upgrade:
self.reopen()
with self.repository:
self.assert_raises(LockFailed, lambda: len(self.repository))
upgrade.assert_called_once_with()
def test_crash_before_write_index(self):
self.add_keys()
self.repository.write_index = None
try:
self.repository.commit()
except TypeError:
pass
self.reopen()
with self.repository:
self.assert_equal(len(self.repository), 3)
self.assert_equal(self.repository.check(), True)
def test_crash_before_deleting_compacted_segments(self):
self.add_keys()
self.repository.io.delete_segment = None
try:
self.repository.commit()
except TypeError:
pass
self.reopen()
with self.repository:
self.assert_equal(len(self.repository), 3)
self.assert_equal(self.repository.check(), True)
self.assert_equal(len(self.repository), 3)
def test_ignores_commit_tag_in_data(self):
self.repository.put(b'0' * 32, LoggedIO.COMMIT)
self.reopen()
with self.repository:
io = self.repository.io
assert not io.is_committed_segment(io.get_latest_segment())
class RepositoryAppendOnlyTestCase(RepositoryTestCaseBase):
def test_destroy_append_only(self):
# Can't destroy append only repo (via the API)
self.repository.append_only = True
with self.assert_raises(ValueError):
self.repository.destroy()
def test_append_only(self):
def segments_in_repository():
return len(list(self.repository.io.segment_iterator()))
self.repository.put(b'00000000000000000000000000000000', b'foo')
self.repository.commit()
self.repository.append_only = False
assert segments_in_repository() == 1
self.repository.put(b'00000000000000000000000000000000', b'foo')
self.repository.commit()
# normal: compact squashes the data together, only one segment
assert segments_in_repository() == 1
self.repository.append_only = True
assert segments_in_repository() == 1
self.repository.put(b'00000000000000000000000000000000', b'foo')
self.repository.commit()
# append only: does not compact, only new segments written
assert segments_in_repository() == 2
class RepositoryCheckTestCase(RepositoryTestCaseBase):
def list_indices(self):
return [name for name in os.listdir(os.path.join(self.tmppath, 'repository')) if name.startswith('index.')]
def check(self, repair=False, status=True):
self.assert_equal(self.repository.check(repair=repair), status)
# Make sure no tmp files are left behind
self.assert_equal([name for name in os.listdir(os.path.join(self.tmppath, 'repository')) if 'tmp' in name], [], 'Found tmp files')
def get_objects(self, *ids):
for id_ in ids:
self.repository.get(('%032d' % id_).encode('ascii'))
def add_objects(self, segments):
for ids in segments:
for id_ in ids:
self.repository.put(('%032d' % id_).encode('ascii'), b'data')
self.repository.commit()
def get_head(self):
return sorted(int(n) for n in os.listdir(os.path.join(self.tmppath, 'repository', 'data', '0')) if n.isdigit())[-1]
def open_index(self):
return NSIndex.read(os.path.join(self.tmppath, 'repository', 'index.{}'.format(self.get_head())))
def corrupt_object(self, id_):
idx = self.open_index()
segment, offset = idx[('%032d' % id_).encode('ascii')]
with open(os.path.join(self.tmppath, 'repository', 'data', '0', str(segment)), 'r+b') as fd:
fd.seek(offset)
fd.write(b'BOOM')
def delete_segment(self, segment):
os.unlink(os.path.join(self.tmppath, 'repository', 'data', '0', str(segment)))
def delete_index(self):
os.unlink(os.path.join(self.tmppath, 'repository', 'index.{}'.format(self.get_head())))
def rename_index(self, new_name):
os.rename(os.path.join(self.tmppath, 'repository', 'index.{}'.format(self.get_head())),
os.path.join(self.tmppath, 'repository', new_name))
def list_objects(self):
return set(int(key) for key in self.repository.list())
def test_repair_corrupted_segment(self):
self.add_objects([[1, 2, 3], [4, 5], [6]])
self.assert_equal(set([1, 2, 3, 4, 5, 6]), self.list_objects())
self.check(status=True)
self.corrupt_object(5)
self.assert_raises(IntegrityError, lambda: self.get_objects(5))
self.repository.rollback()
# Make sure a regular check does not repair anything
self.check(status=False)
self.check(status=False)
# Make sure a repair actually repairs the repo
self.check(repair=True, status=True)
self.get_objects(4)
self.check(status=True)
self.assert_equal(set([1, 2, 3, 4, 6]), self.list_objects())
def test_repair_missing_segment(self):
self.add_objects([[1, 2, 3], [4, 5, 6]])
self.assert_equal(set([1, 2, 3, 4, 5, 6]), self.list_objects())
self.check(status=True)
self.delete_segment(1)
self.repository.rollback()
self.check(repair=True, status=True)
self.assert_equal(set([1, 2, 3]), self.list_objects())
def test_repair_missing_commit_segment(self):
self.add_objects([[1, 2, 3], [4, 5, 6]])
self.delete_segment(1)
self.assert_raises(Repository.ObjectNotFound, lambda: self.get_objects(4))
self.assert_equal(set([1, 2, 3]), self.list_objects())
def test_repair_corrupted_commit_segment(self):
self.add_objects([[1, 2, 3], [4, 5, 6]])
with open(os.path.join(self.tmppath, 'repository', 'data', '0', '1'), 'r+b') as fd:
fd.seek(-1, os.SEEK_END)
fd.write(b'X')
self.assert_raises(Repository.ObjectNotFound, lambda: self.get_objects(4))
self.check(status=True)
self.get_objects(3)
self.assert_equal(set([1, 2, 3]), self.list_objects())
def test_repair_no_commits(self):
self.add_objects([[1, 2, 3]])
with open(os.path.join(self.tmppath, 'repository', 'data', '0', '0'), 'r+b') as fd:
fd.seek(-1, os.SEEK_END)
fd.write(b'X')
self.assert_raises(Repository.CheckNeeded, lambda: self.get_objects(4))
self.check(status=False)
self.check(status=False)
self.assert_equal(self.list_indices(), ['index.0'])
self.check(repair=True, status=True)
self.assert_equal(self.list_indices(), ['index.1'])
self.check(status=True)
self.get_objects(3)
self.assert_equal(set([1, 2, 3]), self.list_objects())
def test_repair_missing_index(self):
self.add_objects([[1, 2, 3], [4, 5, 6]])
self.delete_index()
self.check(status=True)
self.get_objects(4)
self.assert_equal(set([1, 2, 3, 4, 5, 6]), self.list_objects())
def test_repair_index_too_new(self):
self.add_objects([[1, 2, 3], [4, 5, 6]])
self.assert_equal(self.list_indices(), ['index.1'])
self.rename_index('index.100')
self.check(status=True)
self.assert_equal(self.list_indices(), ['index.1'])
self.get_objects(4)
self.assert_equal(set([1, 2, 3, 4, 5, 6]), self.list_objects())
def test_crash_before_compact(self):
self.repository.put(bytes(32), b'data')
self.repository.put(bytes(32), b'data2')
# Simulate a crash before compact
with patch.object(Repository, 'compact_segments') as compact:
self.repository.commit()
compact.assert_called_once_with(save_space=False)
self.reopen()
with self.repository:
self.check(repair=True)
self.assert_equal(self.repository.get(bytes(32)), b'data2')
class RemoteRepositoryTestCase(RepositoryTestCase):
def open(self, create=False):
return RemoteRepository(Location('__testsuite__:' + os.path.join(self.tmppath, 'repository')), create=create)
def test_invalid_rpc(self):
self.assert_raises(InvalidRPCMethod, lambda: self.repository.call('__init__', None))
def test_ssh_cmd(self):
assert self.repository.ssh_cmd(Location('example.com:foo')) == ['ssh', 'example.com']
assert self.repository.ssh_cmd(Location('ssh://example.com/foo')) == ['ssh', 'example.com']
assert self.repository.ssh_cmd(Location('ssh://user@example.com/foo')) == ['ssh', 'user@example.com']
assert self.repository.ssh_cmd(Location('ssh://user@example.com:1234/foo')) == ['ssh', '-p', '1234', 'user@example.com']
os.environ['BORG_RSH'] = 'ssh --foo'
assert self.repository.ssh_cmd(Location('example.com:foo')) == ['ssh', '--foo', 'example.com']
def test_borg_cmd(self):
class MockArgs:
remote_path = 'borg'
umask = 0o077
assert self.repository.borg_cmd(None, testing=True) == [sys.executable, '-m', 'borg.archiver', 'serve']
args = MockArgs()
# note: test logger is on info log level, so --info gets added automagically
assert self.repository.borg_cmd(args, testing=False) == ['borg', 'serve', '--umask=077', '--info']
args.remote_path = 'borg-0.28.2'
assert self.repository.borg_cmd(args, testing=False) == ['borg-0.28.2', 'serve', '--umask=077', '--info']
class RemoteRepositoryCheckTestCase(RepositoryCheckTestCase):
def open(self, create=False):
return RemoteRepository(Location('__testsuite__:' + os.path.join(self.tmppath, 'repository')), create=create)
def test_crash_before_compact(self):
# skip this test, we can't mock-patch a Repository class in another process!
pass
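The consistency tests above rely on three rules: reads see uncommitted writes, `commit()` makes them durable, and `rollback()` discards them. A minimal in-memory sketch of those rules follows. `TinyStore` and its `ObjectNotFound` are hypothetical names for illustration; the sketch has no segments, index, or on-disk state and is not how borg's `Repository` is implemented.

```python
_DELETED = object()  # marker for a pending delete


class ObjectNotFound(KeyError):
    """Raised when a key is absent (hypothetical counterpart of the real exception)."""


class TinyStore:
    def __init__(self):
        self._committed = {}
        self._pending = {}  # key -> new value, or _DELETED

    def put(self, key, value):
        self._pending[key] = value

    def get(self, key):
        # Pending changes shadow the committed state.
        value = self._pending.get(key, self._committed.get(key, _DELETED))
        if value is _DELETED:
            raise ObjectNotFound(key)
        return value

    def delete(self, key):
        self.get(key)  # raises ObjectNotFound if there is nothing to delete
        self._pending[key] = _DELETED

    def commit(self):
        for key, value in self._pending.items():
            if value is _DELETED:
                self._committed.pop(key, None)
            else:
                self._committed[key] = value
        self._pending.clear()

    def rollback(self):
        self._pending.clear()
```

Following the shape of `test_consistency2`: put `foo`, commit, put `foo2`, roll back, and a subsequent `get` returns `foo` again.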

View file

@ -0,0 +1,113 @@
import re
import pytest
from .. import shellpattern
def check(path, pattern):
compiled = re.compile(shellpattern.translate(pattern))
return bool(compiled.match(path))
@pytest.mark.parametrize("path, patterns", [
# Literal string
("foo/bar", ["foo/bar"]),
("foo\\bar", ["foo\\bar"]),
# Non-ASCII
("foo/c/\u0152/e/bar", ["foo/*/\u0152/*/bar", "*/*/\u0152/*/*", "**/\u0152/*/*"]),
("\u00e4\u00f6\u00dc", ["???", "*", "\u00e4\u00f6\u00dc", "[\u00e4][\u00f6][\u00dc]"]),
# Question mark
("foo", ["fo?"]),
("foo", ["f?o"]),
("foo", ["f??"]),
("foo", ["?oo"]),
("foo", ["?o?"]),
("foo", ["??o"]),
("foo", ["???"]),
# Single asterisk
("", ["*"]),
("foo", ["*", "**", "***"]),
("foo", ["foo*"]),
("foobar", ["foo*"]),
("foobar", ["foo*bar"]),
("foobarbaz", ["foo*baz"]),
("bar", ["*bar"]),
("foobar", ["*bar"]),
("foo/bar", ["foo/*bar"]),
("foo/bar", ["foo/*ar"]),
("foo/bar", ["foo/*r"]),
("foo/bar", ["foo/*"]),
("foo/bar", ["foo*/bar"]),
("foo/bar", ["fo*/bar"]),
("foo/bar", ["f*/bar"]),
("foo/bar", ["*/bar"]),
# Double asterisk (matches 0..n directory layers)
("foo/bar", ["foo/**/bar"]),
("foo/1/bar", ["foo/**/bar"]),
("foo/1/22/333/bar", ["foo/**/bar"]),
("foo/", ["foo/**/"]),
("foo/1/", ["foo/**/"]),
("foo/1/22/333/", ["foo/**/"]),
("bar", ["**/bar"]),
("1/bar", ["**/bar"]),
("1/22/333/bar", ["**/bar"]),
("foo/bar/baz", ["foo/**/*"]),
# Set
("foo1", ["foo[12]"]),
("foo2", ["foo[12]"]),
("foo2/bar", ["foo[12]/*"]),
("f??f", ["f??f", "f[?][?]f"]),
("foo]", ["foo[]]"]),
# Inverted set
("foo3", ["foo[!12]"]),
("foo^", ["foo[^!]"]),
("foo!", ["foo[^!]"]),
])
def test_match(path, patterns):
for p in patterns:
assert check(path, p)
@pytest.mark.parametrize("path, patterns", [
("", ["?", "[]"]),
("foo", ["foo?"]),
("foo", ["?foo"]),
("foo", ["f?oo"]),
# do not match path separator
("foo/ar", ["foo?ar"]),
# do not match/cross over os.path.sep
("foo/bar", ["*"]),
("foo/bar", ["foo*bar"]),
("foo/bar", ["foo*ar"]),
("foo/bar", ["fo*bar"]),
("foo/bar", ["fo*ar"]),
# Double asterisk
("foobar", ["foo/**/bar"]),
# Two asterisks without slash do not match directory separator
("foo/bar", ["**"]),
# Double asterisk not matching filename
("foo/bar", ["**/"]),
# Set
("foo3", ["foo[12]"]),
# Inverted set
("foo1", ["foo[!12]"]),
("foo2", ["foo[!12]"]),
])
def test_mismatch(path, patterns):
for p in patterns:
assert not check(path, p)
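The two parametrized tables above encode the matching rules: `?` and `*` never cross a path separator, and `**/` matches zero or more whole directory levels. A simplified sketch of such a translation follows. It assumes `/` as the separator and deliberately omits character sets (`[...]`); `sketch_translate` and `sketch_match` are hypothetical names, not the functions in borg's `shellpattern` module.

```python
import re

SEP = "/"  # assumption: POSIX-style separator only


def sketch_translate(pattern):
    """Translate a shell-style pattern into a regular expression string."""
    out = []
    i, n = 0, len(pattern)
    while i < n:
        c = pattern[i]
        i += 1
        if c == "*":
            if i + 1 < n and pattern[i] == "*" and pattern[i + 1] == SEP:
                # '**/' matches zero or more complete directory names.
                out.append("(?:[^/]*/)*")
                i += 2
            else:
                # '*' matches within one path component only.
                out.append("[^/]*")
        elif c == "?":
            out.append("[^/]")
        else:
            out.append(re.escape(c))
    return "".join(out)


def sketch_match(path, pattern):
    return re.fullmatch(sketch_translate(pattern), path) is not None
```

Note that a bare `**` without a trailing separator falls through to two single-component wildcards, which is why `"foo/bar"` does not match `"**"` in the mismatch table.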

206
borg/testsuite/upgrader.py Normal file
View file

@ -0,0 +1,206 @@
import os
import pytest
try:
import attic.repository
import attic.key
import attic.helpers
except ImportError:
attic = None
from ..upgrader import AtticRepositoryUpgrader, AtticKeyfileKey
from ..helpers import get_keys_dir
from ..key import KeyfileKey
from ..archiver import UMASK_DEFAULT
from ..repository import Repository
def repo_valid(path):
"""
utility function to check if borg can open a repository
:param path: the path to the repository
:returns: True if the repository check succeeds
"""
with Repository(str(path), create=False) as repository:
# can't check raises() because check() handles the error
return repository.check()
def key_valid(path):
"""
check that the new keyfile is alright
:param path: the path to the key file
:returns: True if the file starts with the borg magic string
"""
keyfile = os.path.join(get_keys_dir(),
os.path.basename(path))
with open(keyfile, 'r') as f:
return f.read().startswith(KeyfileKey.FILE_ID)
@pytest.fixture()
def attic_repo(tmpdir):
"""
create an attic repo with some stuff in it
:param tmpdir: path to the repository to be created
:returns: an attic.repository.Repository object
"""
attic_repo = attic.repository.Repository(str(tmpdir), create=True)
# throw some stuff in that repo, copied from `RepositoryTestCase.test1`
for x in range(100):
attic_repo.put(('%-32d' % x).encode('ascii'), b'SOMEDATA')
attic_repo.commit()
attic_repo.close()
return attic_repo
@pytest.fixture(params=[True, False])
def inplace(request):
return request.param
@pytest.mark.skipif(attic is None, reason='cannot find an attic install')
def test_convert_segments(tmpdir, attic_repo, inplace):
"""test segment conversion
this will load the given attic repository, list all the segments
then convert them one at a time. we need to close the repo before
conversion otherwise we have errors from borg
:param tmpdir: a temporary directory to run the test in (builtin
fixture)
:param attic_repo: a populated attic repository (fixture)
"""
# check should fail because of magic number
assert not repo_valid(tmpdir)
repository = AtticRepositoryUpgrader(str(tmpdir), create=False)
with repository:
segments = [filename for i, filename in repository.io.segment_iterator()]
repository.convert_segments(segments, dryrun=False, inplace=inplace)
repository.convert_cache(dryrun=False)
assert repo_valid(tmpdir)
class MockArgs:
"""
mock attic location
this is used to simulate a key location with a properly loaded
repository object to create a key file
"""
def __init__(self, path):
self.repository = attic.helpers.Location(path)
@pytest.fixture()
def attic_key_file(attic_repo, tmpdir):
"""
create an attic key file from the given repo, in the keys
subdirectory of the given tmpdir
:param attic_repo: an attic.repository.Repository object (fixture
defined above)
:param tmpdir: a temporary directory (a builtin fixture)
:returns: the KeyfileKey object as returned by
attic.key.KeyfileKey.create()
"""
keys_dir = str(tmpdir.mkdir('keys'))
# we use the repo dir for the created keyfile, because we do
# not want to clutter existing keyfiles
os.environ['ATTIC_KEYS_DIR'] = keys_dir
# we use the same directory for the converted files, which
# will clutter the previously created one, which we don't care
# about anyway. in real runs, the original key will be retained.
os.environ['BORG_KEYS_DIR'] = keys_dir
os.environ['ATTIC_PASSPHRASE'] = 'test'
return attic.key.KeyfileKey.create(attic_repo,
MockArgs(keys_dir))
@pytest.mark.skipif(attic is None, reason='cannot find an attic install')
def test_keys(tmpdir, attic_repo, attic_key_file):
"""test key conversion
test that we can convert the given key to a properly formatted
borg key. assumes that the ATTIC_KEYS_DIR and BORG_KEYS_DIR have
been properly populated by the attic_key_file fixture.
:param tmpdir: a temporary directory (a builtin fixture)
:param attic_repo: an attic.repository.Repository object (fixture
defined above)
:param attic_key_file: an attic.key.KeyfileKey (fixture created above)
"""
with AtticRepositoryUpgrader(str(tmpdir), create=False) as repository:
keyfile = AtticKeyfileKey.find_key_file(repository)
AtticRepositoryUpgrader.convert_keyfiles(keyfile, dryrun=False)
assert key_valid(attic_key_file.path)
@pytest.mark.skipif(attic is None, reason='cannot find an attic install')
def test_convert_all(tmpdir, attic_repo, attic_key_file, inplace):
"""test all conversion steps
this runs everything. mostly redundant test, since each step is
already exercised by the tests above.
:param tmpdir: a temporary directory (a builtin fixture)
:param attic_repo: an attic.repository.Repository object (fixture
defined above)
:param attic_key_file: an attic.key.KeyfileKey (fixture created above)
"""
# check should fail because of magic number
assert not repo_valid(tmpdir)
def stat_segment(path):
return os.stat(os.path.join(path, 'data', '0', '0'))
def first_inode(path):
return stat_segment(path).st_ino
orig_inode = first_inode(attic_repo.path)
with AtticRepositoryUpgrader(str(tmpdir), create=False) as repository:
# replicate command dispatch, partly
os.umask(UMASK_DEFAULT)
backup = repository.upgrade(dryrun=False, inplace=inplace)
if inplace:
assert backup is None
assert first_inode(repository.path) == orig_inode
else:
assert backup
assert first_inode(repository.path) != first_inode(backup)
# i have seen cases where the copied tree has world-readable
# permissions, which is wrong
assert stat_segment(backup).st_mode & UMASK_DEFAULT == 0
assert key_valid(attic_key_file.path)
assert repo_valid(tmpdir)
def test_hardlink(tmpdir, inplace):
"""test that we handle hard links properly
that is, if we are in "inplace" mode, hardlinks should *not*
change (i.e. we write to the file directly, so we do not rewrite the
whole file, and we do not re-create the file).
if we are *not* in inplace mode, then the inode should change, as
we are supposed to leave the original inode alone."""
a = str(tmpdir.join('a'))
with open(a, 'wb') as tmp:
tmp.write(b'aXXX')
b = str(tmpdir.join('b'))
os.link(a, b)
AtticRepositoryUpgrader.header_replace(b, b'a', b'b', inplace=inplace)
if not inplace:
assert os.stat(a).st_ino != os.stat(b).st_ino
else:
assert os.stat(a).st_ino == os.stat(b).st_ino
with open(b, 'rb') as tmp:
assert tmp.read() == b'bXXX'

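The hard link behaviour that `test_hardlink` checks can be reproduced without Borg. The following is a minimal sketch, not the project's code: `replace_header` and `demo` are hypothetical names, and the non-inplace branch writes a temporary file and moves it over the path with `os.replace`, whereas `header_replace` in `borg/upgrader.py` renames the original first and then rewrites it. The visible effect on inodes should be the same under that assumption.

```python
import os
import tempfile


def replace_header(path, old, new, inplace):
    """Swap a leading magic string, in place or by rewriting the file."""
    with open(path, 'r+b') as f:
        if f.read(len(old)) != old:
            return  # header does not match, nothing to do
        if inplace:
            # overwrite the first bytes: same inode, so every hard link
            # to this file sees the new header
            f.seek(0)
            f.write(new)
            return
        rest = f.read()
    # write a fresh file and move it into place: `path` now points to a
    # new inode, other hard links keep the old content
    tmp = path + '.tmp'
    with open(tmp, 'wb') as out:
        out.write(new + rest)
    os.replace(tmp, path)


def demo(inplace):
    with tempfile.TemporaryDirectory() as d:
        a, b = os.path.join(d, 'a'), os.path.join(d, 'b')
        with open(a, 'wb') as f:
            f.write(b'aXXX')
        os.link(a, b)  # b is a second name for the same inode
        replace_header(b, b'a', b'b', inplace)
        same_inode = os.stat(a).st_ino == os.stat(b).st_ino
        with open(a, 'rb') as f:
            a_data = f.read()
        with open(b, 'rb') as f:
            b_data = f.read()
        return same_inode, a_data, b_data
```

With `inplace=True` both names still share one inode and both read the new header. With `inplace=False` the rewritten name gets its own inode and the other name keeps the original bytes, which is what protects the hard-linked backup copy.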
40
borg/testsuite/xattr.py Normal file
View file

@ -0,0 +1,40 @@
import os
import tempfile
import unittest
from ..xattr import is_enabled, getxattr, setxattr, listxattr
from . import BaseTestCase
@unittest.skipUnless(is_enabled(), 'xattr not enabled on filesystem')
class XattrTestCase(BaseTestCase):
def setUp(self):
self.tmpfile = tempfile.NamedTemporaryFile()
self.symlink = os.path.join(os.path.dirname(self.tmpfile.name), 'symlink')
os.symlink(self.tmpfile.name, self.symlink)
def tearDown(self):
os.unlink(self.symlink)
def assert_equal_se(self, is_x, want_x):
# check 2 xattr lists for equality, but ignore security.selinux attr
is_x = set(is_x) - {'security.selinux'}
want_x = set(want_x)
self.assert_equal(is_x, want_x)
def test(self):
self.assert_equal_se(listxattr(self.tmpfile.name), [])
self.assert_equal_se(listxattr(self.tmpfile.fileno()), [])
self.assert_equal_se(listxattr(self.symlink), [])
setxattr(self.tmpfile.name, 'user.foo', b'bar')
setxattr(self.tmpfile.fileno(), 'user.bar', b'foo')
setxattr(self.tmpfile.name, 'user.empty', None)
self.assert_equal_se(listxattr(self.tmpfile.name), ['user.foo', 'user.bar', 'user.empty'])
self.assert_equal_se(listxattr(self.tmpfile.fileno()), ['user.foo', 'user.bar', 'user.empty'])
self.assert_equal_se(listxattr(self.symlink), ['user.foo', 'user.bar', 'user.empty'])
self.assert_equal_se(listxattr(self.symlink, follow_symlinks=False), [])
self.assert_equal(getxattr(self.tmpfile.name, 'user.foo'), b'bar')
self.assert_equal(getxattr(self.tmpfile.fileno(), 'user.foo'), b'bar')
self.assert_equal(getxattr(self.symlink, 'user.foo'), b'bar')
self.assert_equal(getxattr(self.tmpfile.name, 'user.empty'), None)

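The `assert_equal_se` helper above compares attribute names as sets and ignores `security.selinux`, which SELinux-enabled systems may attach to any file. A standalone sketch of the same comparison follows; `same_xattr_names` and its `ignored` parameter are hypothetical names used only for illustration.

```python
def same_xattr_names(actual, expected, ignored=('security.selinux',)):
    """Compare two collections of xattr names, ignoring order.

    Names listed in `ignored` are dropped from `actual` before comparing.
    """
    return set(actual) - set(ignored) == set(expected)
```

Comparing sets rather than lists also keeps the test independent of the order in which the filesystem returns attribute names.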
327
borg/upgrader.py Normal file
View file

@ -0,0 +1,327 @@
from binascii import hexlify
import datetime
import logging
logger = logging.getLogger(__name__)
import os
import shutil
import time
from .helpers import get_keys_dir, get_cache_dir, ProgressIndicatorPercent
from .locking import UpgradableLock
from .repository import Repository, MAGIC
from .key import KeyfileKey, KeyfileNotFoundError
ATTIC_MAGIC = b'ATTICSEG'
class AtticRepositoryUpgrader(Repository):
def __init__(self, *args, **kw):
kw['lock'] = False # do not create borg lock files (now) in attic repo
super().__init__(*args, **kw)
def upgrade(self, dryrun=True, inplace=False, progress=False):
"""convert an attic repository to a borg repository
those are the files that need to be upgraded here, from most
important to least important: segments, key files, and various
caches, the latter being optional, as they will be rebuilt if
missing.
we nevertheless do the order in reverse, as we prefer to do
the fast stuff first, to improve interactivity.
"""
with self:
backup = None
if not inplace:
backup = '{}.upgrade-{:%Y-%m-%d-%H:%M:%S}'.format(self.path, datetime.datetime.now())
logger.info('making a hardlink copy in %s', backup)
if not dryrun:
shutil.copytree(self.path, backup, copy_function=os.link)
logger.info("opening attic repository with borg and converting")
# now lock the repo, after we have made the copy
self.lock = UpgradableLock(os.path.join(self.path, 'lock'), exclusive=True, timeout=1.0).acquire()
segments = [filename for i, filename in self.io.segment_iterator()]
try:
keyfile = self.find_attic_keyfile()
except KeyfileNotFoundError:
logger.warning("no key file found for repository")
else:
self.convert_keyfiles(keyfile, dryrun)
# partial open: just hold on to the lock
self.lock = UpgradableLock(os.path.join(self.path, 'lock'),
exclusive=True).acquire()
try:
self.convert_cache(dryrun)
self.convert_repo_index(dryrun=dryrun, inplace=inplace)
self.convert_segments(segments, dryrun=dryrun, inplace=inplace, progress=progress)
self.borg_readme()
finally:
self.lock.release()
self.lock = None
return backup
def borg_readme(self):
readme = os.path.join(self.path, 'README')
os.remove(readme)
with open(readme, 'w') as fd:
fd.write('This is a Borg repository\n')
@staticmethod
def convert_segments(segments, dryrun=True, inplace=False, progress=False):
"""convert repository segments from attic to borg
replacement pattern is `s/ATTICSEG/BORG_SEG/` in files in
`$ATTIC_REPO/data/**`.
luckily the magic string length didn't change so we can just
replace the 8 first bytes of all regular files in there."""
logger.info("converting %d segments..." % len(segments))
segment_count = len(segments)
pi = ProgressIndicatorPercent(total=segment_count, msg="Converting segments %3.0f%%", same_line=True)
for i, filename in enumerate(segments):
if progress:
pi.show(i)
if dryrun:
time.sleep(0.001)
else:
AtticRepositoryUpgrader.header_replace(filename, ATTIC_MAGIC, MAGIC, inplace=inplace)
if progress:
pi.finish()
@staticmethod
def header_replace(filename, old_magic, new_magic, inplace=True):
with open(filename, 'r+b') as segment:
segment.seek(0)
# only write if necessary
if segment.read(len(old_magic)) == old_magic:
if inplace:
segment.seek(0)
segment.write(new_magic)
else:
# rename the hardlink and rewrite the file. this works
# because the file is still open. so even though the file
# is renamed, we can still read it until it is closed.
os.rename(filename, filename + '.tmp')
with open(filename, 'wb') as new_segment:
new_segment.write(new_magic)
new_segment.write(segment.read())
# the little dance with the .tmp file is necessary
# because Windows won't allow overwriting an open file.
os.unlink(filename + '.tmp')
def find_attic_keyfile(self):
"""find the attic keyfiles
the keyfiles are loaded by `KeyfileKey.find_key_file()`. that
finds the keys with the right identifier for the repo.
this is expected to look into $HOME/.attic/keys or
$ATTIC_KEYS_DIR for key files matching the given Borg
repository.
it is expected to raise an exception (KeyfileNotFoundError) if
no key is found. whether that exception is from Borg or Attic
is unclear.
this is split in a separate function in case we want to use
the attic code here directly, instead of our local
implementation."""
return AtticKeyfileKey.find_key_file(self)
@staticmethod
def convert_keyfiles(keyfile, dryrun):
"""convert key files from attic to borg
replacement pattern is `s/ATTIC KEY/BORG_KEY/` in
`get_keys_dir()`, that is `$ATTIC_KEYS_DIR` or
`$HOME/.attic/keys`, and moved to `$BORG_KEYS_DIR` or
`$HOME/.config/borg/keys`.
no need to decrypt to convert. we need to rewrite the whole
key file because magic string length changed, but that's not a
problem because the keyfiles are small (compared to, say,
all the segments)."""
logger.info("converting keyfile %s" % keyfile)
with open(keyfile, 'r') as f:
data = f.read()
data = data.replace(AtticKeyfileKey.FILE_ID, KeyfileKey.FILE_ID, 1)
keyfile = os.path.join(get_keys_dir(), os.path.basename(keyfile))
logger.info("writing borg keyfile to %s" % keyfile)
if not dryrun:
with open(keyfile, 'w') as f:
f.write(data)
def convert_repo_index(self, dryrun, inplace):
"""convert some repo files
those are all hash indexes, so we need to
`s/ATTICIDX/BORG_IDX/` in a few locations:
* the repository index (in `$ATTIC_REPO/index.%d`, where `%d`
is the `Repository.get_index_transaction_id()`), which we
should probably update, with a lock, see
`Repository.open()`, which i'm not sure we should use
because it may write data on `Repository.close()`...
"""
transaction_id = self.get_index_transaction_id()
if transaction_id is None:
logger.warning('no index file found for repository %s' % self.path)
else:
index = os.path.join(self.path, 'index.%d' % transaction_id)
logger.info("converting repo index %s" % index)
if not dryrun:
AtticRepositoryUpgrader.header_replace(index, b'ATTICIDX', b'BORG_IDX', inplace=inplace)
def convert_cache(self, dryrun):
"""convert caches from attic to borg
those are all hash indexes, so we need to
`s/ATTICIDX/BORG_IDX/` in a few locations:
* the `files` and `chunks` cache (in `$ATTIC_CACHE_DIR` or
`$HOME/.cache/attic/<repoid>/`), which we could just drop,
but if we'd want to convert, we could open it with the
`Cache.open()`, edit in place and then `Cache.close()` to
make sure we have locking right
"""
# copy of attic's get_cache_dir()
attic_cache_dir = os.environ.get('ATTIC_CACHE_DIR',
os.path.join(os.path.expanduser('~'),
'.cache', 'attic'))
attic_cache_dir = os.path.join(attic_cache_dir, hexlify(self.id).decode('ascii'))
borg_cache_dir = os.path.join(get_cache_dir(), hexlify(self.id).decode('ascii'))
def copy_cache_file(path):
"""copy the given attic cache path into the borg directory
does nothing if dryrun is True. also expects
attic_cache_dir and borg_cache_dir to be set in the parent
scope, to the directories path including the repository
identifier.
:param path: the basename of the cache file to copy
(example: "files" or "chunks") as a string
:returns: the borg file that was created or None if no
Attic cache file was found.
"""
attic_file = os.path.join(attic_cache_dir, path)
if os.path.exists(attic_file):
borg_file = os.path.join(borg_cache_dir, path)
if os.path.exists(borg_file):
logger.warning("borg cache file already exists in %s, not copying from Attic", borg_file)
else:
logger.info("copying attic cache file from %s to %s" % (attic_file, borg_file))
if not dryrun:
shutil.copyfile(attic_file, borg_file)
return borg_file
else:
logger.warning("no %s cache file found in %s" % (path, attic_file))
return None
# XXX: untested, because generating cache files is a PITA, see
# Archiver.do_create() for proof
if os.path.exists(attic_cache_dir):
if not os.path.exists(borg_cache_dir):
os.makedirs(borg_cache_dir)
# files that have no header to convert, just copy them
for cache in ['config', 'files']:
copy_cache_file(cache)
# we need to convert the headers of those files, copy first
for cache in ['chunks']:
cache = copy_cache_file(cache)
logger.info("converting cache %s" % cache)
# copy_cache_file() returns None if there is no Attic cache file
if cache is not None and not dryrun:
AtticRepositoryUpgrader.header_replace(cache, b'ATTICIDX', b'BORG_IDX')
class AtticKeyfileKey(KeyfileKey):
"""backwards compatible Attic key file parser"""
FILE_ID = 'ATTIC KEY'
# verbatim copy from attic
@staticmethod
def get_keys_dir():
"""Determine where to store repository keys and cache"""
return os.environ.get('ATTIC_KEYS_DIR',
os.path.join(os.path.expanduser('~'), '.attic', 'keys'))
@classmethod
def find_key_file(cls, repository):
"""copy of attic's `find_key_file`_
this has two small modifications:
1. it uses the above `get_keys_dir`_ instead of the global one,
assumed to be borg's
2. it uses `repository.path`_ instead of
`repository._location.canonical_path`_ because we can't
assume the repository has been opened by the archiver yet
"""
get_keys_dir = cls.get_keys_dir
id = hexlify(repository.id).decode('ascii')
keys_dir = get_keys_dir()
if not os.path.exists(keys_dir):
raise KeyfileNotFoundError(repository.path, keys_dir)
for name in os.listdir(keys_dir):
filename = os.path.join(keys_dir, name)
with open(filename, 'r') as fd:
line = fd.readline().strip()
if line and line.startswith(cls.FILE_ID) and line[len(cls.FILE_ID) + 1:] == id:
return filename
raise KeyfileNotFoundError(repository.path, keys_dir)
class BorgRepositoryUpgrader(Repository):
def upgrade(self, dryrun=True, inplace=False, progress=False):
"""convert an old borg repository to a current borg repository
"""
logger.info("converting borg 0.xx to borg current")
with self:
try:
keyfile = self.find_borg0xx_keyfile()
except KeyfileNotFoundError:
logger.warning("no key file found for repository")
else:
self.move_keyfiles(keyfile, dryrun)
def find_borg0xx_keyfile(self):
return Borg0xxKeyfileKey.find_key_file(self)
def move_keyfiles(self, keyfile, dryrun):
filename = os.path.basename(keyfile)
new_keyfile = os.path.join(get_keys_dir(), filename)
if dryrun:
# dry run: leave the key file where it is
return
try:
os.rename(keyfile, new_keyfile)
except FileExistsError:
# likely the attic -> borg upgrader already put it in the final location
pass
class Borg0xxKeyfileKey(KeyfileKey):
"""backwards compatible borg 0.xx key file parser"""
@staticmethod
def get_keys_dir():
return os.environ.get('BORG_KEYS_DIR',
os.path.join(os.path.expanduser('~'), '.borg', 'keys'))
@classmethod
def find_key_file(cls, repository):
get_keys_dir = cls.get_keys_dir
id = hexlify(repository.id).decode('ascii')
keys_dir = get_keys_dir()
if not os.path.exists(keys_dir):
raise KeyfileNotFoundError(repository.path, keys_dir)
for name in os.listdir(keys_dir):
filename = os.path.join(keys_dir, name)
with open(filename, 'r') as fd:
line = fd.readline().strip()
if line and line.startswith(cls.FILE_ID) and line[len(cls.FILE_ID) + 1:] == id:
return filename
raise KeyfileNotFoundError(repository.path, keys_dir)

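Both `find_key_file` variants above identify a key file by its first line, which consists of the file identifier, one space and the hex-encoded repository id. `convert_keyfiles` then swaps only the first occurrence of the identifier. A minimal sketch of these two steps follows. The function names are hypothetical, and `'BORG_KEY'` is assumed from the `convert_keyfiles` docstring rather than read from `KeyfileKey.FILE_ID`.

```python
from binascii import hexlify

ATTIC_FILE_ID = 'ATTIC KEY'  # AtticKeyfileKey.FILE_ID in the file above
BORG_FILE_ID = 'BORG_KEY'    # assumption: value named in the convert_keyfiles docstring


def header_matches(first_line, file_id, repo_id):
    """Return True if the line reads '<file_id> <hex repo id>'."""
    wanted = hexlify(repo_id).decode('ascii')
    line = first_line.strip()
    # skip the identifier plus the single separating space
    return line.startswith(file_id) and line[len(file_id) + 1:] == wanted


def convert_key_text(data):
    """Swap only the first occurrence of the Attic identifier."""
    return data.replace(ATTIC_FILE_ID, BORG_FILE_ID, 1)
```

Slicing at `len(file_id) + 1` instead of a fixed offset lets the same check work for identifiers of different lengths.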
296
borg/xattr.py Normal file
View file

@ -0,0 +1,296 @@
"""A basic extended attributes (xattr) implementation for Linux, Mac OS X and FreeBSD
"""
import errno
import os
import subprocess
import sys
import tempfile
from ctypes import CDLL, create_string_buffer, c_ssize_t, c_size_t, c_char_p, c_int, c_uint32, get_errno
from ctypes.util import find_library
from distutils.version import LooseVersion
from .logger import create_logger
logger = create_logger()
def is_enabled(path=None):
"""Determine if xattr is enabled on the filesystem
"""
with tempfile.NamedTemporaryFile(dir=path, prefix='borg-tmp') as fd:
try:
setxattr(fd.fileno(), 'user.name', b'value')
except OSError:
return False
return getxattr(fd.fileno(), 'user.name') == b'value'
def get_all(path, follow_symlinks=True):
try:
return dict((name, getxattr(path, name, follow_symlinks=follow_symlinks))
for name in listxattr(path, follow_symlinks=follow_symlinks))
except OSError as e:
if e.errno in (errno.ENOTSUP, errno.EPERM):
return {}
libc_name = find_library('c')
if libc_name is None:
# find_library didn't work, maybe we are on some minimal system that misses essential
# tools used by find_library, like ldconfig, gcc/cc, objdump.
# so we can only try some "usual" names for the C library:
if sys.platform.startswith('linux'):
libc_name = 'libc.so.6'
elif sys.platform.startswith(('freebsd', 'netbsd')):
libc_name = 'libc.so'
elif sys.platform == 'darwin':
libc_name = 'libc.dylib'
else:
msg = "Can't find C library. No fallback known. Try installing ldconfig, gcc/cc or objdump."
logger.error(msg)
raise Exception(msg)
# If we are running with fakeroot on Linux, then use the xattr functions of fakeroot. This is needed by
# the 'test_extract_capabilities' test, but also allows xattrs to work with fakeroot on Linux in normal use.
# TODO: Check whether fakeroot supports xattrs on all platforms supported below.
# TODO: If that's the case then we can make Borg fakeroot-xattr-compatible on these as well.
LD_PRELOAD = os.environ.get('LD_PRELOAD', '')
XATTR_FAKEROOT = False
if sys.platform.startswith('linux') and 'fakeroot' in LD_PRELOAD:
fakeroot_version = LooseVersion(subprocess.check_output(['fakeroot', '-v']).decode('ascii').split()[-1])
if fakeroot_version >= LooseVersion("1.20.2"):
# 1.20.2 has been confirmed to have xattr support
# 1.18.2 has been confirmed not to have xattr support
# Versions in-between are unknown
libc_name = LD_PRELOAD
XATTR_FAKEROOT = True
try:
libc = CDLL(libc_name, use_errno=True)
except OSError as e:
msg = "Can't find C library [%s]. Try installing ldconfig, gcc/cc or objdump." % e
logger.error(msg)
raise Exception(msg)
def _check(rv, path=None):
if rv < 0:
raise OSError(get_errno(), path)
return rv
if sys.platform.startswith('linux'): # pragma: linux only
libc.llistxattr.argtypes = (c_char_p, c_char_p, c_size_t)
libc.llistxattr.restype = c_ssize_t
libc.flistxattr.argtypes = (c_int, c_char_p, c_size_t)
libc.flistxattr.restype = c_ssize_t
libc.lsetxattr.argtypes = (c_char_p, c_char_p, c_char_p, c_size_t, c_int)
libc.lsetxattr.restype = c_int
libc.fsetxattr.argtypes = (c_int, c_char_p, c_char_p, c_size_t, c_int)
libc.fsetxattr.restype = c_int
libc.lgetxattr.argtypes = (c_char_p, c_char_p, c_char_p, c_size_t)
libc.lgetxattr.restype = c_ssize_t
libc.fgetxattr.argtypes = (c_int, c_char_p, c_char_p, c_size_t)
libc.fgetxattr.restype = c_ssize_t
def listxattr(path, *, follow_symlinks=True):
if isinstance(path, str):
path = os.fsencode(path)
if isinstance(path, int):
func = libc.flistxattr
elif follow_symlinks:
func = libc.listxattr
else:
func = libc.llistxattr
n = _check(func(path, None, 0), path)
if n == 0:
return []
namebuf = create_string_buffer(n)
n2 = _check(func(path, namebuf, n), path)
if n2 != n:
raise Exception('listxattr failed')
return [os.fsdecode(name) for name in namebuf.raw.split(b'\0')[:-1] if not name.startswith(b'system.posix_acl_')]
def getxattr(path, name, *, follow_symlinks=True):
name = os.fsencode(name)
if isinstance(path, str):
path = os.fsencode(path)
if isinstance(path, int):
func = libc.fgetxattr
elif follow_symlinks:
func = libc.getxattr
else:
func = libc.lgetxattr
n = _check(func(path, name, None, 0), path)
if n == 0:
return
valuebuf = create_string_buffer(n)
n2 = _check(func(path, name, valuebuf, n), path)
if n2 != n:
raise Exception('getxattr failed')
return valuebuf.raw
def setxattr(path, name, value, *, follow_symlinks=True):
name = os.fsencode(name)
value = value and os.fsencode(value)
if isinstance(path, str):
path = os.fsencode(path)
if isinstance(path, int):
func = libc.fsetxattr
elif follow_symlinks:
func = libc.setxattr
else:
func = libc.lsetxattr
_check(func(path, name, value, len(value) if value else 0, 0), path)
elif sys.platform == 'darwin': # pragma: darwin only
libc.listxattr.argtypes = (c_char_p, c_char_p, c_size_t, c_int)
libc.listxattr.restype = c_ssize_t
libc.flistxattr.argtypes = (c_int, c_char_p, c_size_t)
libc.flistxattr.restype = c_ssize_t
libc.setxattr.argtypes = (c_char_p, c_char_p, c_char_p, c_size_t, c_uint32, c_int)
libc.setxattr.restype = c_int
libc.fsetxattr.argtypes = (c_int, c_char_p, c_char_p, c_size_t, c_uint32, c_int)
libc.fsetxattr.restype = c_int
libc.getxattr.argtypes = (c_char_p, c_char_p, c_char_p, c_size_t, c_uint32, c_int)
libc.getxattr.restype = c_ssize_t
libc.fgetxattr.argtypes = (c_int, c_char_p, c_char_p, c_size_t, c_uint32, c_int)
libc.fgetxattr.restype = c_ssize_t
XATTR_NOFOLLOW = 0x0001
def listxattr(path, *, follow_symlinks=True):
func = libc.listxattr
flags = 0
if isinstance(path, str):
path = os.fsencode(path)
if isinstance(path, int):
func = libc.flistxattr
elif not follow_symlinks:
flags = XATTR_NOFOLLOW
n = _check(func(path, None, 0, flags), path)
if n == 0:
return []
namebuf = create_string_buffer(n)
n2 = _check(func(path, namebuf, n, flags), path)
if n2 != n:
raise Exception('listxattr failed')
return [os.fsdecode(name) for name in namebuf.raw.split(b'\0')[:-1]]
def getxattr(path, name, *, follow_symlinks=True):
name = os.fsencode(name)
func = libc.getxattr
flags = 0
if isinstance(path, str):
path = os.fsencode(path)
if isinstance(path, int):
func = libc.fgetxattr
elif not follow_symlinks:
flags = XATTR_NOFOLLOW
n = _check(func(path, name, None, 0, 0, flags), path)
if n == 0:
return
valuebuf = create_string_buffer(n)
n2 = _check(func(path, name, valuebuf, n, 0, flags), path)
if n2 != n:
raise Exception('getxattr failed')
return valuebuf.raw
def setxattr(path, name, value, *, follow_symlinks=True):
name = os.fsencode(name)
value = value and os.fsencode(value)
func = libc.setxattr
flags = 0
if isinstance(path, str):
path = os.fsencode(path)
if isinstance(path, int):
func = libc.fsetxattr
elif not follow_symlinks:
flags = XATTR_NOFOLLOW
_check(func(path, name, value, len(value) if value else 0, 0, flags), path)
elif sys.platform.startswith('freebsd'): # pragma: freebsd only
EXTATTR_NAMESPACE_USER = 0x0001
libc.extattr_list_fd.argtypes = (c_int, c_int, c_char_p, c_size_t)
libc.extattr_list_fd.restype = c_ssize_t
libc.extattr_list_link.argtypes = (c_char_p, c_int, c_char_p, c_size_t)
libc.extattr_list_link.restype = c_ssize_t
libc.extattr_list_file.argtypes = (c_char_p, c_int, c_char_p, c_size_t)
libc.extattr_list_file.restype = c_ssize_t
libc.extattr_get_fd.argtypes = (c_int, c_int, c_char_p, c_char_p, c_size_t)
libc.extattr_get_fd.restype = c_ssize_t
libc.extattr_get_link.argtypes = (c_char_p, c_int, c_char_p, c_char_p, c_size_t)
libc.extattr_get_link.restype = c_ssize_t
libc.extattr_get_file.argtypes = (c_char_p, c_int, c_char_p, c_char_p, c_size_t)
libc.extattr_get_file.restype = c_ssize_t
libc.extattr_set_fd.argtypes = (c_int, c_int, c_char_p, c_char_p, c_size_t)
libc.extattr_set_fd.restype = c_int
libc.extattr_set_link.argtypes = (c_char_p, c_int, c_char_p, c_char_p, c_size_t)
libc.extattr_set_link.restype = c_int
libc.extattr_set_file.argtypes = (c_char_p, c_int, c_char_p, c_char_p, c_size_t)
libc.extattr_set_file.restype = c_int
def listxattr(path, *, follow_symlinks=True):
ns = EXTATTR_NAMESPACE_USER
if isinstance(path, str):
path = os.fsencode(path)
if isinstance(path, int):
func = libc.extattr_list_fd
elif follow_symlinks:
func = libc.extattr_list_file
else:
func = libc.extattr_list_link
n = _check(func(path, ns, None, 0), path)
if n == 0:
return []
namebuf = create_string_buffer(n)
n2 = _check(func(path, ns, namebuf, n), path)
if n2 != n:
raise Exception('listxattr failed')
names = []
mv = memoryview(namebuf.raw)
while mv:
length = mv[0]
names.append(os.fsdecode(bytes(mv[1:1 + length])))
mv = mv[1 + length:]
return names
def getxattr(path, name, *, follow_symlinks=True):
name = os.fsencode(name)
if isinstance(path, str):
path = os.fsencode(path)
if isinstance(path, int):
func = libc.extattr_get_fd
elif follow_symlinks:
func = libc.extattr_get_file
else:
func = libc.extattr_get_link
n = _check(func(path, EXTATTR_NAMESPACE_USER, name, None, 0), path)
if n == 0:
return
valuebuf = create_string_buffer(n)
n2 = _check(func(path, EXTATTR_NAMESPACE_USER, name, valuebuf, n), path)
if n2 != n:
raise Exception('getxattr failed')
return valuebuf.raw
def setxattr(path, name, value, *, follow_symlinks=True):
name = os.fsencode(name)
value = value and os.fsencode(value)
if isinstance(path, str):
path = os.fsencode(path)
if isinstance(path, int):
func = libc.extattr_set_fd
elif follow_symlinks:
func = libc.extattr_set_file
else:
func = libc.extattr_set_link
_check(func(path, EXTATTR_NAMESPACE_USER, name, value, len(value) if value else 0), path)
else: # pragma: unknown platform only
def listxattr(path, *, follow_symlinks=True):
return []
def getxattr(path, name, *, follow_symlinks=True):
pass
def setxattr(path, name, value, *, follow_symlinks=True):
pass

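The `listxattr` implementations above decode two different buffer layouts. The Linux and Mac OS X branches split a buffer of NUL-terminated names, while the FreeBSD branch walks a buffer in which each name is preceded by a one-byte length. A minimal sketch of both parsers on plain `bytes` follows. The function names are hypothetical, and the decoding uses `surrogateescape` explicitly as a stand-in for `os.fsdecode`.

```python
def split_nul_names(raw):
    """Linux / Mac OS X layout: every name is terminated by a NUL byte."""
    # the final NUL leaves one empty trailing element, which [:-1] drops
    return [name.decode('utf-8', 'surrogateescape')
            for name in raw.split(b'\0')[:-1]]


def split_length_prefixed_names(raw):
    """FreeBSD layout: one length byte, then that many bytes of name."""
    names = []
    view = memoryview(raw)
    while view:  # an empty memoryview is falsy
        length = view[0]  # indexing a bytes view yields an int
        names.append(bytes(view[1:1 + length]).decode('utf-8', 'surrogateescape'))
        view = view[1 + length:]
    return names
```

Keeping the parsing separate from the libc calls makes it possible to test the buffer handling on any platform.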
View file

@ -1,5 +0,0 @@
Here we store third-party documentation, licenses, etc.
Please note that all files inside the "borg" package directory (except those
excluded in setup.py) will be installed, so do not keep docs or licenses
there.

View file

@ -21,7 +21,7 @@ help:
@echo " singlehtml to make a single large HTML file"
@echo " pickle to make pickle files"
@echo " json to make JSON files"
@echo " htmlhelp to make HTML files and an HTML help project"
@echo " htmlhelp to make HTML files and a HTML help project"
@echo " qthelp to make HTML files and a qthelp project"
@echo " devhelp to make HTML files and a Devhelp project"
@echo " epub to make an epub"

11
docs/_static/Makefile vendored
View file

@ -1,11 +0,0 @@
all: logo.pdf logo.png
logo.pdf: logo.svg
inkscape logo.svg --export-pdf=logo.pdf
logo.png: logo.svg
inkscape logo.svg --export-png=logo.png --export-dpi=72,72
clean:
rm -f logo.pdf logo.png

72
docs/_static/logo.pdf vendored
View file

@ -1,72 +0,0 @@
%PDF-1.4
%µí®û
3 0 obj
<< /Length 4 0 R
/Filter /FlateDecode
>>
stream
(binary stream data not shown)
endstream
endobj
4 0 obj
430
endobj
2 0 obj
<<
/ExtGState <<
/a0 << /CA 1 /ca 1 >>
>>
>>
endobj
5 0 obj
<< /Type /Page
/Parent 1 0 R
/MediaBox [ 0 0 240 100 ]
/Contents 3 0 R
/Group <<
/Type /Group
/S /Transparency
/I true
/CS /DeviceRGB
>>
/Resources 2 0 R
>>
endobj
1 0 obj
<< /Type /Pages
/Kids [ 5 0 R ]
/Count 1
>>
endobj
6 0 obj
<< /Creator (cairo 1.14.8 (http://cairographics.org))
/Producer (cairo 1.14.8 (http://cairographics.org))
>>
endobj
7 0 obj
<< /Type /Catalog
/Pages 1 0 R
>>
endobj
xref
0 8
0000000000 65535 f
0000000830 00000 n
0000000544 00000 n
0000000015 00000 n
0000000522 00000 n
0000000616 00000 n
0000000895 00000 n
0000001022 00000 n
trailer
<< /Size 8
/Root 7 0 R
/Info 6 0 R
>>
startxref
1074
%%EOF

BIN
docs/_static/logo.png vendored

Binary file not shown.

10
docs/_static/logo.svg vendored
View file

@ -1,10 +0,0 @@
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN" "http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd">
<!-- Created using Karbon, part of Calligra: http://www.calligra.org/karbon -->
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="240pt" height="100pt">
<rect width="320" height="133.333" fill="#000200"/>
<path id="p1" transform="translate(20.9086, 32.2192)" fill="#00dd00" d="M43.75 13.8021L26.6667 13.8021L26.6667 0L53.3854 0L67.2396 13.8021L67.2396 27.8646L60.3125 34.7917L67.2396 41.7187L67.2396 55.3125L53.3854 69.1146L26.6667 69.1146L26.6667 55.3125L43.75 55.3125L43.75 40.5729L26.6667 40.5729L26.6667 28.5417L43.75 28.5417ZM0 0L23.0208 0L23.0208 69.1146L0 69.1146Z"/>
<path id="p2" transform="translate(97.6794, 46.0213)" fill="#00dd00" d="M62.1354 41.5104L48.3333 55.3125L32.9167 55.3125L32.9167 42.3958L38.6458 42.3958L38.6458 13.8021L32.9167 13.8021L32.9167 0L48.3333 0L62.1354 13.8021ZM23.2813 42.3958L29.2708 42.3958L29.2708 55.3125L13.8021 55.3125L0 41.5104L0 13.8021L13.8021 0L29.2708 0L29.2708 13.8021L23.2813 13.8021Z"/>
<path id="p3" transform="translate(170.231, 46.0213)" fill="#00dd00" d="M36.5104 13.8021L26.7187 13.8021L26.7187 7.76042L34.4271 0L48.3854 0L59.5833 12.9167L59.5833 27.2396L36.5104 27.2396ZM0 55.3125L0 7.10543e-15L23.0208 7.10543e-15L23.0208 55.3125Z"/>
<path id="p4" transform="translate(236.429, 46.0213)" fill="#00dd00" d="M36.875 13.8021L26.6667 13.8021L26.6667 7.10543e-15L46.0937 7.10543e-15L59.8958 13.8021L59.8958 60.7812L46.0937 74.6875L15.7292 74.6875L8.80208 67.7083L8.80208 62.6042L36.875 62.6042ZM33.2292 42.3958L33.2292 48.4896L26.3542 55.3125L13.8021 55.3125L0 41.5104L0 13.8021L13.8021 0L23.0208 0L23.0208 42.3958Z"/>
</svg>

View file

@ -1,27 +0,0 @@
<p class="borg-downloads" id="borg-downloads" style="display:none;">Downloads: <span id="borg-downloads-list"></span></p>
<script type="text/javascript">
// Populate offline download links (PDF, HTMLzip, ...) on a single line using ReadTheDocs data if available.
function borgInitDownloads(data) {
var downloads = data && data.versions && data.versions.current && data.versions.current.downloads;
if (!downloads) return;
var labels = {pdf: "PDF", htmlzip: "HTML", epub: "ePub"};
var span = document.getElementById("borg-downloads-list");
if (!span) return;
var first = true;
Object.keys(downloads).forEach(function(fmt) {
var url = downloads[fmt];
if (!url) return;
if (!first) span.appendChild(document.createTextNode(" | "));
var a = document.createElement("a");
a.href = url;
a.textContent = labels[fmt] || fmt;
span.appendChild(a);
first = false;
});
if (!first) document.getElementById("borg-downloads").style.display = "";
}
document.addEventListener("readthedocs-addons-data-ready", function(event) {
borgInitDownloads(event.detail.data());
});
</script>

View file

@ -1,20 +0,0 @@
<div class="sidebar-block">
<div class="sidebar-toc">
{# Restrict the sidebar ToC depth to two levels while generating command usage pages.
This avoids superfluous entries for each "Description" and "Examples" heading. #}
{% if pagename.startswith("usage/") and pagename not in (
"usage/general", "usage/help", "usage/debug", "usage/notes",
) %}
{% set maxdepth = 2 %}
{% else %}
{% set maxdepth = 3 %}
{% endif %}
{% set toctree = toctree(maxdepth=maxdepth, collapse=True) %}
{% if toctree %}
{{ toctree }}
{% else %}
{{ toc }}
{% endif %}
</div>
</div>

View file

@ -1,174 +0,0 @@
{%- extends "basic/layout.html" %}
{# Do this so that Bootstrap is included before the main CSS file. #}
{%- block htmltitle %}
{% set script_files = script_files + ["_static/myscript.js"] %}
<!-- Licensed under the Apache 2.0 License -->
<link rel="stylesheet" type="text/css" href="{{ pathto('_static/fonts/open-sans/stylesheet.css', 1) }}" />
<!-- Licensed under the SIL Open Font License -->
<link rel="stylesheet" type="text/css" href="{{ pathto('_static/fonts/source-serif-pro/source-serif-pro.css', 1) }}" />
<link rel="stylesheet" type="text/css" href="{{ pathto('_static/css/bootstrap.min.css', 1) }}" />
<link rel="stylesheet" type="text/css" href="{{ pathto('_static/css/bootstrap-theme.min.css', 1) }}" />
<meta name="viewport" content="width=device-width, initial-scale=1.0">
{{ super() }}
{%- endblock %}
{%- block extrahead %}
{% if theme_touch_icon %}
<link rel="apple-touch-icon" href="{{ pathto('_static/' ~ theme_touch_icon, 1) }}" />
{% endif %}
<meta name="readthedocs-addons-api-version" content="1" />
{{ super() }}
{% endblock %}
{# Displays the URL for the homepage if it's set, or the master_doc if it is not. #}
{% macro homepage() -%}
{%- if theme_homepage %}
{%- if hasdoc(theme_homepage) %}
{{ pathto(theme_homepage) }}
{%- else %}
{{ theme_homepage }}
{%- endif %}
{%- else %}
{{ pathto(master_doc) }}
{%- endif %}
{%- endmacro %}
{# Displays the URL for the tospage if it's set, or falls back to the homepage macro. #}
{% macro tospage() -%}
{%- if theme_tospage %}
{%- if hasdoc(theme_tospage) %}
{{ pathto(theme_tospage) }}
{%- else %}
{{ theme_tospage }}
{%- endif %}
{%- else %}
{{ homepage() }}
{%- endif %}
{%- endmacro %}
{# Displays the URL for the projectpage if it's set, or falls back to the homepage macro. #}
{% macro projectlink() -%}
{%- if theme_projectlink %}
{%- if hasdoc(theme_projectlink) %}
{{ pathto(theme_projectlink) }}
{%- else %}
{{ theme_projectlink }}
{%- endif %}
{%- else %}
{{ homepage() }}
{%- endif %}
{%- endmacro %}
{# Displays the next and previous links both before and after the content. #}
{% macro render_relations() -%}
{% if prev or next %}
<div class="footer-relations">
{% if prev %}
<div class="pull-left">
<a class="btn btn-default" href="{{ prev.link|e }}" title="{{ _('previous chapter')}} (use the left arrow)">{{ prev.title }}</a>
</div>
{% endif %}
{%- if next and next.title != '&lt;no title&gt;' %}
<div class="pull-right">
<a class="btn btn-default" href="{{ next.link|e }}" title="{{ _('next chapter')}} (use the right arrow)">{{ next.title }}</a>
</div>
{%- endif %}
</div>
<div class="clearer"></div>
{% endif %}
{%- endmacro %}
{%- macro guzzle_sidebar() %}
<div id="left-column">
<div class="sphinxsidebar">
{%- if sidebars != None %}
{#- New-style sidebar: explicitly include/exclude templates. #}
{%- for sidebartemplate in sidebars %}
{%- include sidebartemplate %}
{%- endfor %}
{% else %}
{% include "logo-text.html" %}
{% include "globaltoc.html" %}
{% include "searchbox.html" %}
{%- endif %}
</div>
</div>
{%- endmacro %}
{%- block content %}
{%- if pagename == 'index' and theme_index_template %}
{% include theme_index_template %}
{%- else %}
<div class="container-wrapper">
<div id="mobile-toggle">
<a href="#"><span class="glyphicon glyphicon-align-justify" aria-hidden="true"></span></a>
</div>
{%- block sidebar1 %}{{ guzzle_sidebar() }}{% endblock %}
{%- block document_wrapper %}
{%- block document %}
<div id="right-column">
{% block breadcrumbs %}
<div role="navigation" aria-label="breadcrumbs navigation">
<ol class="breadcrumb">
<li><a href="{{ pathto(master_doc) }}">Docs</a></li>
{% for doc in parents %}
<li><a href="{{ doc.link|e }}">{{ doc.title }}</a></li>
{% endfor %}
<li>{{ title }}</li>
</ol>
</div>
{% endblock %}
<div class="document clearer body" role="main">
{% block body %} {% endblock %}
</div>
{%- block bottom_rel_links %}
{{ render_relations() }}
{%- endblock %}
</div>
<div class="clearfix"></div>
{%- endblock %}
{%- endblock %}
{%- block comments -%}
{% if theme_disqus_comments_shortname %}
<div class="container comment-container">
{% include "comments.html" %}
</div>
{% endif %}
{%- endblock %}
</div>
{%- endif %}
{%- endblock %}
{%- block footer %}
<script type="text/javascript">
$("#mobile-toggle a").click(function () {
$("#left-column").toggle();
});
</script>
<script type="text/javascript" src="{{ pathto('_static/js/bootstrap.js', 1)}}"></script>
{%- block footer_wrapper %}
<div class="footer">
&copy; Copyright {{ copyright }}. Created using <a href="http://sphinx.pocoo.org/">Sphinx</a>.
</div>
{%- endblock %}
{%- block ga %}
{%- if theme_google_analytics_account %}
<script type="text/javascript">
var _gaq = _gaq || [];
_gaq.push(['_setAccount', '{{ theme_google_analytics_account }}']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>
{%- endif %}
{%- endblock %}
{%- endblock %}

View file

@ -1,5 +0,0 @@
<a href="{{ homepage() }}" class="text-logo">
<img src='{{ pathto('_static/logo.svg', 1) }}' width='100%'>
{{ theme_project_nav_name or shorttitle }}
</a>

View file

@ -1,29 +0,0 @@
<div class="version-selector" id="borg-version-selector" style="display:none;">
<label for="version-select">Select your Borg version:</label>
<select id="version-select"></select>
</div>
<script type="text/javascript">
// Populate the version selector using ReadTheDocs data if available.
function borgInitVersionSelector(data) {
var versions = data && data.versions && data.versions.active;
if (!versions || !versions.length) return;
var current = data.versions && data.versions.current && data.versions.current.slug;
var select = document.getElementById("version-select");
if (!select) return;
versions.forEach(function(v) {
var opt = document.createElement("option");
opt.value = v.urls.documentation;
opt.textContent = v.slug;
if (v.slug === current) opt.selected = true;
select.appendChild(opt);
});
select.addEventListener("change", function() {
window.location.href = this.value;
});
document.getElementById("borg-version-selector").style.display = "";
}
document.addEventListener("readthedocs-addons-data-ready", function(event) {
borgInitVersionSelector(event.detail.data());
});
</script>

95
docs/api.rst Normal file
View file

@ -0,0 +1,95 @@
API Documentation
=================
.. automodule:: borg.archiver
:members:
:undoc-members:
.. automodule:: borg.upgrader
:members:
:undoc-members:
.. automodule:: borg.archive
:members:
:undoc-members:
.. automodule:: borg.fuse
:members:
:undoc-members:
.. automodule:: borg.platform
:members:
:undoc-members:
.. automodule:: borg.locking
:members:
:undoc-members:
.. automodule:: borg.shellpattern
:members:
:undoc-members:
.. automodule:: borg.repository
:members:
:undoc-members:
.. automodule:: borg.lrucache
:members:
:undoc-members:
.. automodule:: borg.remote
:members:
:undoc-members:
.. automodule:: borg.hash_sizes
:members:
:undoc-members:
.. automodule:: borg.xattr
:members:
:undoc-members:
.. automodule:: borg.helpers
:members:
:undoc-members:
.. automodule:: borg.cache
:members:
:undoc-members:
.. automodule:: borg.key
:members:
:undoc-members:
.. automodule:: borg.logger
:members:
:undoc-members:
.. automodule:: borg.platform_darwin
:members:
:undoc-members:
.. automodule:: borg.platform_linux
:members:
:undoc-members:
.. automodule:: borg.hashindex
:members:
:undoc-members:
.. automodule:: borg.compress
:members:
:undoc-members:
.. automodule:: borg.chunker
:members:
:undoc-members:
.. automodule:: borg.crypto
:members:
:undoc-members:
.. automodule:: borg.platform_freebsd
:members:
:undoc-members:

View file

@ -1,8 +1,5 @@
.. include:: global.rst.inc
Authors
=======
.. include:: ../AUTHORS
License

View file

@ -1,117 +0,0 @@
Binary BorgBackup builds
========================
General notes
-------------
The binaries are supposed to work on the specified platform without installing anything else.
There are some limitations, though:
- for Linux, your system must have the same or newer glibc version as the one used for building
- for macOS, you need to have the same or newer macOS version as the one used for building
- for other OSes, there are likely similar limitations
If you don't find something working on your system, check the older borg releases.
*.asc are GnuPG signatures - only provided for locally built binaries.
*.exe (or no extension) is the single-file fat binary.
*.tgz is the single-directory fat binary (extract it once with tar -xzf).
Using the single-directory build is faster and does not require as much space
in the temporary directory as the self-extracting single-file build.
macOS: to avoid issues, download the file via the command line OR remove the
"quarantine" attribute after downloading:
$ xattr -dr com.apple.quarantine borg-macos1012.tgz
Download the correct files
--------------------------
Binaries built on GitHub servers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
borg-linux-glibc235-x86_64-gh Linux AMD/Intel (built on Ubuntu 22.04 LTS with glibc 2.35)
borg-linux-glibc235-arm64-gh Linux ARM (built on Ubuntu 22.04 LTS with glibc 2.35)
borg-macos-15-arm64-gh macOS Apple Silicon (built on macOS 15 w/o FUSE support)
borg-macos-15-x86_64-gh macOS Intel (built on macOS 15 w/o FUSE support)
borg-freebsd-14-x86_64-gh FreeBSD AMD/Intel (built on FreeBSD 14)
Binaries built locally
~~~~~~~~~~~~~~~~~~~~~~
borg-linux-glibc231-x86_64 Linux (built on Debian 11 "Bullseye" with glibc 2.31)
Note: if you don't find a specific binary here, check release 1.4.1 or 1.2.9.
Verifying your download
-----------------------
I provide GPG signatures for files which I have built locally on my machines.
To check the GPG signature, download both the file and the corresponding
signature (*.asc file) and then (on the shell) type, for example:
gpg --recv-keys 9F88FB52FAF7B393
gpg --verify borgbackup.tar.gz.asc borgbackup.tar.gz
The files are signed by:
Thomas Waldmann <tw@waldmann-edv.de>
GPG key fingerprint: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393
My fingerprint is also in the footer of all my BorgBackup mailing list posts.
Provenance attestations for GitHub-built binaries
-------------------------------------------------
For binaries built on GitHub (files with a "-gh" suffix in the name), we publish
an artifact provenance attestation that proves the binary was built by our
GitHub Actions workflow from a specific commit or tag. You can verify this using
the GitHub CLI (gh). Install it from https://cli.github.com/ and make sure you
use a recent version that supports "gh attestation".
Practical example (Linux, 2.0.0b20 tag):
curl -LO https://github.com/borgbackup/borg/releases/download/2.0.0b20/borg-linux-glibc235-x86_64-gh
gh attestation verify --repo borgbackup/borg --source-ref refs/tags/2.0.0b20 borg-linux-glibc235-x86_64-gh
If verification succeeds, gh prints a summary stating the subject (your file),
that it was attested by GitHub Actions, and the job/workflow reference.
Installing
----------
It is suggested that you rename or symlink the binary to just "borg".
If you need "borgfs", just also symlink it to the same binary; it will
detect internally under which name it was invoked.
On UNIX-like platforms, /usr/local/bin/ or ~/bin/ is a nice place for it,
but you can invoke it from anywhere by providing the full path to it.
Make sure the file is readable and executable (chmod +rx borg on UNIX-like
platforms).
Reporting issues
----------------
Please first check the FAQ and whether a GitHub issue already exists.
If you find a NEW issue, please open a ticket on our issue tracker:
https://github.com/borgbackup/borg/issues/
There, please give:
- the version number (it is displayed if you invoke borg -V)
- the sha256sum of the binary
- a good description of what the issue is
- a good description of how to reproduce your issue
- a traceback with system info (if you have one)
- your precise platform (CPU, 32/64-bit?), OS, distribution, release
- your Python and (g)libc versions
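
For the sha256sum item in the list above: on systems without a ``sha256sum`` tool, the same hex digest can be computed with Python's standard library. This is only a minimal sketch; the helper name ``sha256_of_file`` and the chunk size are illustrative choices, not part of borg.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of the file at `path`.

    The file is read in chunks so that large binaries do not have to
    fit into memory at once.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling ``sha256_of_file("borg")`` on the downloaded binary should give the same hex string that ``sha256sum borg`` prints in its first column.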

View file

@ -1,24 +0,0 @@
:orphan:
.. include:: global.rst.inc
Borg documentation
==================
.. When you add an element here, do not forget to add it to index.rst.
.. Note: Some things are in appendices (see latex_appendices in conf.py).
.. toctree::
:maxdepth: 2
introduction
installation
quickstart
usage
deployment
faq
support
changes
internals
development
authors

View file

@ -1,237 +1,18 @@
@import url("theme.css");
dt code {
font-weight: normal;
/* The Return of the Borg.
*
* Have a bit green and grey and darkness (and if only in the upper left corner).
*/
.wy-side-nav-search {
background-color: #000000 !important;
}
#internals .toctree-wrapper > ul {
column-count: 3;
-webkit-column-count: 3;
.wy-side-nav-search > a {
color: rgba(255, 255, 255, 0.5);
}
#internals .toctree-wrapper > ul > li {
display: inline-block;
font-weight: bold;
}
#internals .toctree-wrapper > ul > li > ul {
font-weight: normal;
}
/* bootstrap has a .container class which clashes with docutils' container class. */
.docutils.container {
width: auto;
margin: 0;
padding: 0;
}
/* the default (38px) produces a jumpy baseline in Firefox on Linux. */
h1 {
font-size: 36px;
}
.text-logo {
background-color: #000200;
color: #00dd00;
}
.text-logo:hover,
.text-logo:active,
.text-logo:focus {
color: #5afe57;
}
/* by default the top and bottom margins are unequal which looks a bit unbalanced. */
.sidebar-block {
padding: 0;
margin: 14px 0 24px 0;
}
#borg-documentation h1 + p .external img {
width: 100%;
}
.container.experimental,
#debugging-facilities {
/* don't change text dimensions */
margin: 0 -30px; /* padding below + border width */
padding: 0 10px; /* 10 px visual margin between edge of text and the border */
/* fallback for browsers that don't have repeating-linear-gradient: thick, red lines */
border-left: 20px solid red;
border-right: 20px solid red;
/* fancy red stripes */
border-image: repeating-linear-gradient(
-45deg,rgba(255,0,0,0.1) 0,rgba(255,0,0,0.75) 10px,rgba(0,0,0,0) 10px,rgba(0,0,0,0) 20px,rgba(255,0,0,0.75) 20px) 0 20 repeat;
}
.topic {
margin: 0 1em;
padding: 0 1em;
/* #4e4a4a = background of the ToC sidebar */
border-left: 2px solid #4e4a4a;
border-right: 2px solid #4e4a4a;
}
table.docutils:not(.footnote) td,
table.docutils:not(.footnote) th {
padding: .2em;
}
table.docutils:not(.footnote) {
border-collapse: collapse;
border: none;
}
table.docutils:not(.footnote) td,
table.docutils:not(.footnote) th {
border: 1px solid #ddd;
}
table.docutils:not(.footnote) tr:first-child th,
table.docutils:not(.footnote) tr:first-child td {
border-top: 0;
}
table.docutils:not(.footnote) tr:last-child td {
border-bottom: 0;
}
table.docutils:not(.footnote) tr td:first-child,
table.docutils:not(.footnote) tr th:first-child {
border-left: 0;
}
table.docutils:not(.footnote) tr td:last-child,
table.docutils:not(.footnote) tr th:last-child,
table.docutils.borg-options-table tr td {
border-right: 0;
}
table.docutils.option-list tr td,
table.docutils.borg-options-table tr td {
border-left: 0;
border-right: 0;
}
table.docutils.borg-options-table tr td:first-child:not([colspan="3"]) {
border-top: 0;
border-bottom: 0;
}
.borg-options-table td[colspan="3"] p {
margin: 0;
}
.borg-options-table {
width: 100%;
}
kbd, /* used in usage pages for options */
code,
.rst-content tt.literal,
.rst-content tt.literal,
.rst-content code.literal,
.rst-content tt,
.rst-content code,
p .literal,
p .literal span {
border: none;
padding: 0;
color: black; /* slight contrast with #404040 of regular text */
background: none;
}
kbd {
box-shadow: none;
line-height: 23px;
word-wrap: normal;
font-size: 15px;
font-family: Consolas, monospace;
}
.borg-options-table tr td:nth-child(2) .pre {
white-space: nowrap;
}
.borg-options-table tr td:first-child {
width: 2em;
}
cite {
white-space: nowrap;
color: black; /* slight contrast with #404040 of regular text */
font-family: Consolas, "Andale Mono WT", "Andale Mono", "Lucida Console", "Lucida Sans Typewriter",
"DejaVu Sans Mono", "Bitstream Vera Sans Mono", "Liberation Mono", "Nimbus Mono L", Monaco, "Courier New", Courier, monospace;
font-style: normal;
text-decoration: underline;
}
.borg-common-opt-ref {
font-weight: bold;
}
.sidebar-toc ul li.toctree-l2 a,
.sidebar-toc ul li.toctree-l3 a {
padding-right: 25px;
}
#common-options .option {
white-space: nowrap;
}
/* Remove the right-column max-width cap so content fills the full available width. */
#right-column {
max-width: none;
}
/* Hide the default RTD flyout since we show the version selector in the sidebar. */
readthedocs-flyout {
display: none !important;
}
/* Version selector in the sidebar. */
.version-selector {
padding: 0 22px;
margin: 7px 0 7px 0;
font-size: 14px;
}
.version-selector label {
display: block;
margin-bottom: 4px;
color: #000;
}
.version-selector select {
width: 100%;
padding: 4px;
background-color: #fafafa;
color: #000;
border: 1px solid #ccc;
border-radius: 3px;
}
.version-selector::after {
content: '';
display: block;
border-top: 1px solid #ccc;
margin: 7px 0 0 0;
}
/* Reduce top and bottom margin of searchbox block to 7px to match separator spacing. */
.sidebar-block:has(#main-search) {
margin-top: 7px;
margin-bottom: 7px;
}
/* Reduce the separator margin below the search block to 7px. */
.sphinxsidebar > .sidebar-block:has(#main-search):after {
margin: 7px 22px 0 22px;
}
/* Downloads line in the sidebar: a single line, left-aligned with the boxes
above it and the table of contents below it. */
.borg-downloads {
padding: 0 22px;
margin: 7px 0 7px 0;
font-size: 14px;
color: #000;
}
.borg-downloads:after {
content: '';
display: block;
border-top: 1px solid #ccc;
margin: 7px 0 0 0;
.wy-side-nav-search > div.version {
color: rgba(255, 255, 255, 0.5);
}

File diff suppressed because it is too large

View file

@ -1,807 +0,0 @@
.. _changelog_0x:
Change Log 0.x
==============
Version 0.30.0 (2016-01-23)
---------------------------
Compatibility notes:
- The new default logging level is WARNING. Previously, it was INFO, which was
more verbose. Use -v (or --info) to show once again log level INFO messages.
See the "general" section in the usage docs.
- For borg create, you need --list (in addition to -v) to see the long file
list (was needed so you can have e.g. --stats alone without the long list)
- See below about BORG_DELETE_I_KNOW_WHAT_I_AM_DOING (was:
BORG_CHECK_I_KNOW_WHAT_I_AM_DOING)
Bug fixes:
- fix crash when using borg create --dry-run --keep-tag-files, #570
- make sure teardown with cleanup happens for Cache and RepositoryCache,
avoiding leftover locks and TEMP dir contents, #285 (partially), #548
- fix locking KeyError, partial fix for #502
- log stats consistently, #526
- add abbreviated weekday to timestamp format, fixes #496
- strip whitespace when loading exclusions from file
- unset LD_LIBRARY_PATH before invoking ssh, fixes strange OpenSSL library
version warning when using the borg binary, #514
- add some error handling/fallback for C library loading, #494
- added BORG_DELETE_I_KNOW_WHAT_I_AM_DOING for check in "borg delete", #503
- remove unused "repair" rpc method name
New features:
- borg create: implement exclusions using regular expression patterns.
- borg create: implement inclusions using patterns.
- borg extract: support patterns, #361
- support different styles for patterns:
- fnmatch (`fm:` prefix, default when omitted), like borg <= 0.29.
- shell (`sh:` prefix) with `*` not matching directory separators and
`**/` matching 0..n directories
- path prefix (`pp:` prefix, for unifying borg create pp1 pp2 into the
patterns system), semantics like in borg <= 0.29
- regular expression (`re:`), new!
- --progress option for borg upgrade (#291) and borg delete <archive>
- update progress indication more often (e.g. for borg create within big
files or for borg check repo), #500
- finer chunker granularity for items metadata stream, #547, #487
- borg create --list is now used (in addition to -v) to enable the verbose
file list output
- display borg version below tracebacks, #532
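
The pattern styles listed above (``fm:``, ``sh:``, ``pp:``, ``re:``) can be pictured with a small dispatcher. This is a simplified sketch under stated assumptions, not borg's actual matcher: the function names are hypothetical, and details such as path normalization or borg's exact fnmatch and regex semantics are left out.

```python
import fnmatch
import re

# Style prefixes named in the changelog entry above.
STYLES = ("fm", "sh", "pp", "re")


def split_style(pattern, default="fm"):
    """Split 'xx:rest' into (style, rest); use the default style if no known prefix."""
    prefix, sep, rest = pattern.partition(":")
    if sep and prefix in STYLES:
        return prefix, rest
    return default, pattern


def shell_to_regex(pat):
    """Translate a shell-style pattern into a regex string.

    '*' and '?' stay within one path component, '**/' spans 0..n directories.
    """
    out = []
    i = 0
    while i < len(pat):
        if pat.startswith("**/", i):
            out.append(r"(?:[^/]+/)*")
            i += 3
        elif pat[i] == "*":
            out.append(r"[^/]*")
            i += 1
        elif pat[i] == "?":
            out.append(r"[^/]")
            i += 1
        else:
            out.append(re.escape(pat[i]))
            i += 1
    return "".join(out) + r"\Z"


def matches(pattern, path):
    """Return True if `path` matches `pattern` under the pattern's style."""
    style, body = split_style(pattern)
    if style == "fm":
        # fnmatch: '*' also matches directory separators.
        return fnmatch.fnmatchcase(path, body)
    if style == "sh":
        return re.match(shell_to_regex(body), path) is not None
    if style == "pp":
        body = body.rstrip("/")
        return path == body or path.startswith(body + "/")
    # 're': search anywhere in the path (an assumption of this sketch).
    return re.search(body, path) is not None
```

The visible difference between the first two styles is that ``home/*/file`` matches ``home/a/b/file`` with ``fm:`` but not with ``sh:``, while ``sh:home/**/file`` matches both ``home/file`` and ``home/a/b/file``.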
Other changes:
- hashtable size (and thus: RAM and disk consumption) follows a growth policy:
grows fast while small, grows slower when getting bigger, #527
- Vagrantfile: use pyinstaller 3.1 to build binaries, freebsd sqlite3 fix,
fixes #569
- no separate binaries for centos6 any more because the generic linux binaries
also work on centos6 (or in general: on systems with a slightly older glibc
than debian7)
- dev environment: require virtualenv<14.0 so we get a py32 compatible pip
- docs:
- add space-saving chunks.archive.d trick to FAQ
- important: clarify -v and log levels in usage -> general, please read!
- sphinx configuration: create a simple man page from usage docs
- add a repo server setup example
- disable unneeded SSH features in authorized_keys examples for security.
- borg prune only knows "--keep-within" and not "--within"
- add gource video to resources docs, #507
- add netbsd install instructions
- authors: make it more clear what refers to borg and what to attic
- document standalone binary requirements, #499
- rephrase the mailing list section
- development docs: run build_api and build_usage before tagging release
- internals docs: hash table max. load factor is 0.75 now
- markup, typo, grammar, phrasing, clarifications and other fixes.
- add gcc gcc-c++ to redhat/fedora/corora install docs, fixes #583
Version 0.29.0 (2015-12-13)
---------------------------
Compatibility notes:
- When upgrading to 0.29.0, you need to upgrade client as well as server
installations due to the locking and command-line interface changes; otherwise
you'll get an error message about an RPC protocol mismatch or a wrong command-line
option.
If you run a server that needs to support both old and new clients, it is
suggested that you have a "borg-0.28.2" and a "borg-0.29.0" command.
Clients can then choose via e.g. "borg --remote-path=borg-0.29.0 ...".
- The default waiting time for a lock changed from infinity to 1 second for a
better interactive user experience. If the repo you want to access is
currently locked, borg will now terminate after 1s with an error message.
If you have scripts that should wait for the lock for a longer time, use
--lock-wait N (with N being the maximum wait time in seconds).
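
The effect of a bounded lock wait can be sketched as follows. This is an illustration only, assuming a simple directory-based lock; ``acquire_lock``, ``release_lock`` and ``LockTimeout`` are hypothetical names and do not describe borg's locking code.

```python
import os
import time


class LockTimeout(Exception):
    """Raised when the lock could not be acquired within the wait time."""


def acquire_lock(path, wait=1.0, poll=0.05):
    """Try to create the lock directory, retrying for at most `wait` seconds."""
    deadline = time.monotonic() + wait
    while True:
        try:
            os.mkdir(path)  # atomic: fails if the lock already exists
            return
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise LockTimeout(path)
            time.sleep(poll)


def release_lock(path):
    """Remove the lock directory."""
    os.rmdir(path)
```

With ``wait=1.0`` a caller gives up after about one second, which corresponds to the new default described above; a script that should be more patient would pass a larger value, just as it would pass a larger N to --lock-wait.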
Bug fixes:
- hash table tuning (better chosen hashtable load factor 0.75 and prime initial
size of 1031 gave ~1000x speedup in some scenarios)
- avoid creation of an orphan lock for one case, #285
- --keep-tag-files: fix file mode and multiple tag files in one directory, #432
- fixes for "borg upgrade" (attic repo converter), #466
- remove --progress isatty magic (and also --no-progress option) again, #476
- borg init: display proper repo URL
- fix format of umask in help pages, #463
New features:
- implement --lock-wait, support timeout for UpgradableLock, #210
- implement borg break-lock command, #157
- include system info below traceback, #324
- sane remote logging, remote stderr, #461:
- remote log output: intercept it and log it via local logging system,
with "Remote: " prefixed to message. log remote tracebacks.
- remote stderr: output it to local stderr with "Remote: " prefixed.
- add --debug and --info (same as --verbose) to set the log level of the
builtin logging configuration (which otherwise defaults to warning), #426
note: there are few messages emitted at DEBUG level currently.
- optionally configure logging via env var BORG_LOGGING_CONF
- add --filter option for status characters: e.g. to show only the added
or modified files (and also errors), use "borg create -v --filter=AME ...".
- more progress indicators, #394
- use ISO-8601 date and time format, #375
- "borg check --prefix" to restrict archive checking to that name prefix, #206
Other changes:
- hashindex_add C implementation (speed up cache re-sync for new archives)
- increase FUSE read_size to 1024 (speed up metadata operations)
- check/delete/prune --save-space: free unused segments quickly, #239
- increase rpc protocol version to 2 (see also Compatibility notes), #458
- silence borg by default (via default log level WARNING)
- get rid of C compiler warnings, #391
- upgrade OS X FUSE to 3.0.9 on the OS X binary build system
- use python 3.5.1 to build binaries
- docs:
- new mailing list borgbackup@python.org, #468
- readthedocs: color and logo improvements
- load coverage icons over SSL (avoids mixed content)
- more precise binary installation steps
- update release procedure docs about OS X FUSE
- FAQ entry about unexpected 'A' status for unchanged file(s), #403
- add docs about 'E' file status
- add "borg upgrade" docs, #464
- add developer docs about output and logging
- clarify encryption, add note about client-side encryption
- add resources section, with videos, talks, presentations, #149
- Borg moved to Arch Linux [community]
- fix wrong installation instructions for archlinux
Version 0.28.2 (2015-11-15)
---------------------------
New features:
- borg create --exclude-if-present TAGFILE - exclude directories that have the
given file from the backup. You can additionally give --keep-tag-files to
preserve just the directory roots and the tag-files (but not back up other
directory contents), #395, attic #128, attic #142
Other changes:
- do not create docs sources at build time (just have them in the repo),
completely remove have_cython() hack, do not use the "mock" library at build
time, #384
- avoid hidden import, make it easier for PyInstaller, easier fix for #218
- docs:
- add description of item flags / status output, fixes #402
- explain how to regenerate usage and API files (build_api or
build_usage) and when to commit usage files directly into git, #384
- minor install docs improvements
Version 0.28.1 (2015-11-08)
---------------------------
Bug fixes:
- do not try to build api / usage docs for production install,
fixes unexpected "mock" build dependency, #384
Other changes:
- avoid using msgpack.packb at import time
- fix formatting issue in changes.rst
- fix build on readthedocs
Version 0.28.0 (2015-11-08)
---------------------------
Compatibility notes:
- changed return codes (exit codes), see docs. in short:
old: 0 = ok, 1 = error. now: 0 = ok, 1 = warning, 2 = error
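
Scripts that only checked for a nonzero return code may want to distinguish warnings from errors now. A minimal sketch, with hypothetical helper names; treating codes above 2 as errors is an assumption of this example, not something stated here.

```python
def classify_rc(rc):
    """Map a return code (0.28.0 scheme) to 'ok', 'warning' or 'error'."""
    if rc == 0:
        return "ok"
    if rc == 1:
        return "warning"
    # 2 = error per the note above; anything higher is treated the same
    # way here (assumption).
    return "error"


def backup_succeeded(rc, tolerate_warnings=True):
    """Decide whether a wrapper script should treat the run as successful."""
    outcome = classify_rc(rc)
    return outcome == "ok" or (tolerate_warnings and outcome == "warning")
```

A wrapper could pass the ``returncode`` of a ``subprocess.run([...])`` call to ``backup_succeeded`` and only alert when it returns False.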
New features:
- refactor return codes (exit codes), fixes #61
- add --show-rc option to enable "terminating with X status, rc N" output, fixes #58, #351
- borg create backs up atime and ctime in addition to mtime, fixes #317
- extract: support atime in addition to mtime
- FUSE: support ctime and atime in addition to mtime
- support borg --version
- emit a warning if we have a slow msgpack installed
- borg list --prefix=thishostname- REPO, fixes #205
- Debug commands (do not use unless you know what you are doing): debug-get-obj,
debug-put-obj, debug-delete-obj, debug-dump-archive-items.
Bug fixes:
- setup.py: fix bug related to BORG_LZ4_PREFIX processing
- fix "check" for repos that have incomplete chunks, fixes #364
- borg mount: fix unlocking of repository at umount time, fixes #331
- fix reading files without touching their atime, #334
- non-ascii ACL fixes for Linux, FreeBSD and OS X, #277
- fix acl_use_local_uid_gid() and add a test for it, attic #359
- borg upgrade: do not upgrade repositories in place by default, #299
- fix cascading failure with the index conversion code, #269
- borg check: implement 'cmdline' archive metadata value decoding, #311
- fix RobustUnpacker, it missed some metadata keys (new atime and ctime keys
were missing, but also bsdflags). add check for unknown metadata keys.
- create from stdin: also save atime, ctime (cosmetic)
- use default_notty=False for confirmations, fixes #345
- vagrant: fix msgpack installation on centos, fixes #342
- deal with unicode errors for symlinks in same way as for regular files and
have a helpful warning message about how to fix wrong locale setup, fixes #382
- add ACL keys the RobustUnpacker must know about
Other changes:
- improve file size displays, more flexible size formatters
- explicitly commit to the units standard, #289
- archiver: add E status (means that an error occurred when processing this
(single) item)
- do binary releases via "github releases", closes #214
- create: use -x and --one-file-system (was: --do-not-cross-mountpoints), #296
- a lot of changes related to using "logging" module and screen output, #233
- show progress display if on a tty, output more progress information, #303
- factor out status output so it is consistent, fix surrogates removal,
maybe fixes #309
- move away from RawConfigParser to ConfigParser
- archive checker: better error logging, give chunk_id and sequence numbers
(can be used together with borg debug-dump-archive-items).
- do not mention the deprecated passphrase mode
- emit a deprecation warning for --compression N (giving just a number)
- misc .coveragerc fixes (and coverage measurement improvements), fixes #319
- refactor confirmation code, reduce code duplication, add tests
- prettier error messages, fixes #307, #57
- tests:
- add a test to find disk-full issues, #327
- travis: also run tests on Python 3.5
- travis: use tox -r so it rebuilds the tox environments
- test the generated pyinstaller-based binary by archiver unit tests, #215
- vagrant: tests: announce whether fakeroot is used or not
- vagrant: add vagrant user to fuse group for debianoid systems also
- vagrant: llfuse install on darwin needs pkgconfig installed
- vagrant: use pyinstaller from develop branch, fixes #336
- benchmarks: test create, extract, list, delete, info, check, help, fixes #146
- benchmarks: test with both the binary and the python code
- archiver tests: test with both the binary and the python code, fixes #215
- make basic test more robust
- docs:
- moved docs to borgbackup.readthedocs.org, #155
- a lot of fixes and improvements, use mobile-friendly RTD standard theme
- use zlib,6 compression in some examples, fixes #275
- add missing rename usage to docs, closes #279
- include the help offered by borg help <topic> in the usage docs, fixes #293
- include a list of major changes compared to attic into README, fixes #224
- add OS X install instructions, #197
- more details about the release process, #260
- fix linux glibc requirement (binaries built on debian7 now)
- build: move usage and API generation to setup.py
- update docs about return codes, #61
- remove api docs (too much breakage on rtd)
- borgbackup install + basics presentation (asciinema)
- describe the current style guide in documentation
- add section about debug commands
- warn about not running out of space
- add example for rename
- improve chunker params docs, fixes #362
- minor development docs update
Version 0.27.0 (2015-10-07)
---------------------------
New features:
- "borg upgrade" command - attic -> borg one time converter / migration, #21
- temporary hack to avoid using lots of disk space for chunks.archive.d, #235:
To use it: rm -rf chunks.archive.d ; touch chunks.archive.d
- respect XDG_CACHE_HOME, attic #181
- add support for arbitrary SSH commands, attic #99
- borg delete --cache-only REPO (only delete cache, not REPO), attic #123
Bug fixes:
- use Debian 7 (wheezy) to build pyinstaller borgbackup binaries, fixes the
slowdown observed when running the Centos6-built binary on Ubuntu, #222
- do not crash on empty lock.roster, fixes #232
- fix multiple issues with the cache config version check, #234
- fix segment entry header size check, attic #352
plus other error handling improvements / code deduplication there.
- always give segment and offset in repo IntegrityErrors
Other changes:
- stop producing binary wheels, remove docs about it, #147
- docs:
- add warning about prune
- generate usage include files only as needed
- development docs: add Vagrant section
- update / improve / reformat FAQ
- hint to single-file pyinstaller binaries from README
Version 0.26.1 (2015-09-28)
---------------------------
This is a minor update, just docs and new pyinstaller binaries.
- docs update about python and binary requirements
- better docs for --read-special, fix #220
- re-built the binaries, fix #218 and #213 (glibc version issue)
- update web site about single-file pyinstaller binaries
Note: if you did a python-based installation, there is no need to upgrade.
Version 0.26.0 (2015-09-19)
---------------------------
New features:
- Faster cache sync (do all in one pass, remove tar/compression stuff), #163
- BORG_REPO env var to specify the default repo, #168
- read special files as if they were regular files, #79
- implement borg create --dry-run, attic issue #267
- Normalize paths before pattern matching on OS X, #143
- support OpenBSD and NetBSD (except xattrs/ACLs)
- support / run tests on Python 3.5
Bug fixes:
- borg mount repo: use absolute path, attic #200, attic #137
- chunker: use off_t to get 64bit on 32bit platform, #178
- initialize chunker fd to -1, so it's not equal to STDIN_FILENO (0)
- fix reaction to "no" answer at delete repo prompt, #182
- setup.py: detect lz4.h header file location
- to support python < 3.2.4, add less buggy argparse lib from 3.2.6 (#194)
- fix for obtaining ``char *`` from temporary Python value (old code causes
a compile error on Mint 17.2)
- llfuse 0.41 install troubles on some platforms, require < 0.41
(UnicodeDecodeError exception due to non-ascii llfuse setup.py)
- cython code: add some int types to get rid of unspecific python add /
subtract operations (avoid ``undefined symbol FPE_``... error on some platforms)
- fix verbose mode display of stdin backup
- extract: warn if an include pattern never matched, fixes #209,
implement counters for Include/ExcludePatterns
- archive names with slashes are invalid, attic issue #180
- chunker: add a check whether the POSIX_FADV_DONTNEED constant is defined -
fixes building on OpenBSD.
Other changes:
- detect inconsistency / corruption / hash collision, #170
- replace versioneer with setuptools_scm, #106
- docs:
- pkg-config is needed for llfuse installation
- be more clear about pruning, attic issue #132
- unit tests:
- xattr: ignore security.selinux attribute showing up
- ext3 seems to need a bit more space for a sparse file
- do not test lzma level 9 compression (avoid MemoryError)
- work around strange mtime granularity issue on netbsd, fixes #204
- ignore st_rdev if file is not a block/char device, fixes #203
- stay away from the setgid and sticky mode bits
- use Vagrant to do easy cross-platform testing (#196), currently:
- Debian 7 "wheezy" 32bit, Debian 8 "jessie" 64bit
- Ubuntu 12.04 32bit, Ubuntu 14.04 64bit
- Centos 7 64bit
- FreeBSD 10.2 64bit
- OpenBSD 5.7 64bit
- NetBSD 6.1.5 64bit
- Darwin (OS X Yosemite)
Version 0.25.0 (2015-08-29)
---------------------------
Compatibility notes:
- lz4 compression library (liblz4) is a new requirement (#156)
- the new compression code is very compatible: as long as you stay with zlib
compression, older borg releases will still be able to read data from a
repo/archive made with the new code (note: this is not the case for the
default "none" compression, use "zlib,0" if you want a "no compression" mode
that can be read by older borg). Also the new code is able to read repos and
archives made with older borg versions (for all zlib levels 0..9).
Deprecations:
- --compression N (with N being a number, as in 0.24) is deprecated.
We keep the --compression 0..9 for now not to break scripts, but it is
deprecated and will be removed later, so better fix your scripts now:
--compression 0 (as in 0.24) is the same as --compression zlib,0 (now).
BUT: if you do not want compression, use --compression none
(which is the default).
--compression 1 (in 0.24) is the same as --compression zlib,1 (now)
--compression 9 (in 0.24) is the same as --compression zlib,9 (now)
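For scripts that still pass a bare number, the mapping described above can be sketched as a small helper. This is a hypothetical illustration (the function name ``translate_compression`` is not part of borg), assuming only the equivalences listed in this changelog:

```python
# Hypothetical helper, not part of borg: translate a 0.24-style
# "--compression N" value into the 0.25 "zlib,N" syntax.
OLD_STYLE_LEVELS = {str(n) for n in range(10)}  # "0" .. "9"


def translate_compression(spec):
    """Return the 0.25-style compression spec for an old or new value."""
    spec = spec.strip()
    if spec in OLD_STYLE_LEVELS:
        # 0.24 levels were always zlib levels; note that "0" maps to
        # "zlib,0", which is NOT the same as the new default "none".
        return "zlib," + spec
    # Anything else (e.g. "none", "lz4", "zlib,6") is passed through as is.
    return spec


for old in ("0", "1", "9", "lz4"):
    print(old, "->", translate_compression(old))
```

If no compression is wanted at all, the changelog recommends ``--compression none`` rather than the translated ``zlib,0``.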
New features:
- create --compression none (default, means: do not compress, just pass through
data "as is". this is more efficient than zlib level 0 as used in borg 0.24)
- create --compression lz4 (super-fast, but not very high compression)
- create --compression zlib,N (slower, higher compression, default for N is 6)
- create --compression lzma,N (slowest, highest compression, default N is 6)
- honor the nodump flag (UF_NODUMP) and do not back up such items
- list --short just outputs a simple list of the files/directories in an archive
Bug fixes:
- fixed --chunker-params parameter order confusion / malfunction, fixes #154
- close fds of segments we delete (during compaction)
- close files which fell out of the lrucache
- fadvise DONTNEED now is only called for the byte range actually read, not for
the whole file, fixes #158.
- fix issue with negative "all archives" size, fixes #165
- restore_xattrs: ignore if setxattr fails with EACCES, fixes #162
Other changes:
- remove fakeroot requirement for tests, tests run faster without fakeroot
(test setup does not fail any more without fakeroot, so you can run with or
without fakeroot), fixes #151 and #91.
- more tests for archiver
- recover_segment(): don't assume we have an fd for segment
- lrucache refactoring / cleanup, add dispose function, py.test tests
- generalize hashindex code for any key length (less hardcoding)
- lock roster: catch file not found in remove() method and ignore it
- travis CI: use requirements file
- improved docs:
- replace hack for llfuse with proper solution (install libfuse-dev)
- update docs about compression
- update development docs about fakeroot
- internals: add some words about lock files / locking system
- support: mention BountySource and for what it can be used
- theme: use a lighter green
- add pypi, wheel, dist package based install docs
- split install docs into system-specific preparations and generic instructions
Version 0.24.0 (2015-08-09)
---------------------------
Incompatible changes (compared to 0.23):
- borg now always passes the --umask NNN option when invoking another borg via ssh
on the repository server. This ensures it uses the same umask
for remote repos as for local ones. Because of this, you must upgrade both
server and client(s) to 0.24.
- the default umask is 077 now (if you do not specify one via --umask), which might
differ from the one you used previously. The default umask prevents you from
accidentally giving access permissions for group and/or others to files
created by borg (e.g. the repository).
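The effect of a 077 umask can be illustrated with a short, generic POSIX sketch. It is not borg code; the helper name ``create_with_umask`` is made up for this example:

```python
import os
import stat
import tempfile


def create_with_umask(directory, name, umask=0o077):
    """Create a file under the given umask and return its permission bits."""
    old = os.umask(umask)  # os.umask returns the previous mask
    try:
        path = os.path.join(directory, name)
        # Ask for rw for everyone; the umask strips the masked bits.
        fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o666)
        os.close(fd)
    finally:
        os.umask(old)  # always restore the previous umask
    return stat.S_IMODE(os.stat(path).st_mode)


with tempfile.TemporaryDirectory() as tmp:
    mode = create_with_umask(tmp, "example")
    # With umask 077, no group/other permission bits should remain.
    print(oct(mode))
```

On a typical POSIX filesystem the printed mode should contain no group or other bits, which is the behavior the new default is meant to guarantee for files created by borg.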
Deprecations:
- "--encryption passphrase" mode is deprecated, see #85 and #97.
See the new "--encryption repokey" mode for a replacement.
New features:
- borg create --chunker-params ... to configure the chunker, fixes #16
(attic #302, attic #300, and somehow also #41).
This can be used to reduce memory usage caused by chunk management overhead,
so borg does not create a huge chunks index/repo index and eat all your RAM
if you back up lots of data in huge files (like VM disk images).
See docs/misc/create_chunker-params.txt for more information.
- borg info now reports chunk counts in the chunk index.
- borg create --compression 0..9 to select zlib compression level, fixes #66
(attic #295).
- borg init --encryption repokey (to store the encryption key into the repo),
fixes #85
- improve at-end error logging, always log exceptions and set exit_code=1
- LoggedIO: better error checks / exceptions / exception handling
- implement --remote-path to allow non-default-path borg locations, #125
- implement --umask M and use 077 as default umask for better security, #117
- borg check: give a named single archive to it, fixes #139
- cache sync: show progress indication
- cache sync: reimplement the chunk index merging in C
Bug fixes:
- fix segfault that happened for unreadable files (chunker: n needs to be a
signed size_t), #116
- fix the repair mode, #144
- repo delete: add destroy to allowed rpc methods, fixes issue #114
- more compatible repository locking code (based on mkdir), maybe fixes #92
(attic #317, attic #201).
- better Exception msg if no Borg is installed on the remote repo server, #56
- create a RepositoryCache implementation that can cope with >2GiB,
fixes attic #326.
- fix Traceback when running check --repair, attic #232
- clarify help text, fixes #73.
- add help string for --no-files-cache, fixes #140
Other changes:
- improved docs:
- added docs/misc directory for misc. writeups that won't be included
"as is" into the html docs.
- document environment variables and return codes (attic #324, attic #52)
- web site: add related projects, fix web site url, IRC #borgbackup
- Fedora/Fedora-based install instructions added to docs
- Cygwin-based install instructions added to docs
- updated AUTHORS
- add FAQ entries about redundancy / integrity
- clarify that borg extract uses the cwd as extraction target
- update internals doc about chunker params, memory usage and compression
- added docs about development
- add some words about resource usage in general
- document how to back up a raw disk
- add note about how to run borg from virtual env
- add solutions for (ll)fuse installation problems
- document what borg check does, fixes #138
- reorganize borgbackup.github.io sidebar, prev/next at top
- deduplicate and refactor the docs / README.rst
- use borg-tmp as prefix for temporary files / directories
- short prune options without "keep-" are deprecated, do not suggest them
- improved tox configuration
- remove usage of unittest.mock, always use mock from pypi
- use entrypoints instead of scripts, for better use of the wheel format and
modern installs
- add requirements.d/development.txt and modify tox.ini
- use travis-ci for testing based on Linux and (new) OS X
- use coverage.py, pytest-cov and codecov.io for test coverage support
I forgot to list some stuff already implemented in 0.23.0, here they are:
New features:
- efficient archive list from manifest, meaning a big speedup for slow
repo connections and "list <repo>", "delete <repo>", "prune" (attic #242,
attic #167)
- big speedup for chunks cache sync (esp. for slow repo connections), fixes #18
- hashindex: improve error messages
Other changes:
- explicitly specify binary mode to open binary files
- some easy micro optimizations
Version 0.23.0 (2015-06-11)
---------------------------
Incompatible changes (compared to attic, fork related):
- changed sw name and cli command to "borg", updated docs
- package name (and name in urls) uses "borgbackup" to have fewer collisions
- changed repo / cache internal magic strings from ATTIC* to BORG*,
changed cache location to .cache/borg/ - this means that it currently won't
accept attic repos (see issue #21 about improving that)
Bug fixes:
- avoid defective python-msgpack releases, fixes attic #171, fixes attic #185
- fix traceback when trying to do unsupported passphrase change, fixes attic #189
- datetime does not like the year 10.000, fixes attic #139
- fix "info" all archives stats, fixes attic #183
- fix parsing with missing microseconds, fixes attic #282
- fix misleading hint the fuse ImportError handler gave, fixes attic #237
- check unpacked data from RPC for tuple type and correct length, fixes attic #127
- fix Repository._active_txn state when lock upgrade fails
- give specific path to xattr.is_enabled(), disable symlink setattr call that
always fails
- fix test setup for 32bit platforms, partial fix for attic #196
- upgraded versioneer, PEP440 compliance, fixes attic #257
New features:
- less memory usage: add global option --no-cache-files
- check --last N (only check the last N archives)
- check: sort archives in reverse time order
- rename repo::oldname newname (rename repository)
- create -v output more informative
- create --progress (backup progress indicator)
- create --timestamp (utc string or reference file/dir)
- create: if "-" is given as path, read binary from stdin
- extract: if --stdout is given, write all extracted binary data to stdout
- extract --sparse (simple sparse file support)
- extra debug information for 'fread failed'
- delete <repo> (deletes whole repo + local cache)
- FUSE: reflect deduplication in allocated blocks
- only allow whitelisted RPC calls in server mode
- normalize source/exclude paths before matching
- use posix_fadvise not to spoil the OS cache, fixes attic #252
- toplevel error handler: show tracebacks for better error analysis
- sigusr1 / sigint handler to print current file infos - attic PR #286
- RPCError: include the exception args we get from remote
Other changes:
- source: misc. cleanups, pep8, style
- docs and faq improvements, fixes, updates
- cleanup crypto.pyx, make it easier to adapt to other AES modes
- do os.fsync as recommended in the python docs
- source: Let chunker optionally work with os-level file descriptor.
- source: Linux: remove duplicate os.fsencode calls
- source: refactor _open_rb code a bit, so it is more consistent / regular
- source: refactor indicator (status) and item processing
- source: use py.test for better testing, flake8 for code style checks
- source: fix tox >=2.0 compatibility (test runner)
- pypi package: add python version classifiers, add FreeBSD to platforms
Attic Changelog
---------------
Here you can see the full list of changes between each Attic release until Borg
forked from Attic:
Version 0.17
~~~~~~~~~~~~
(bugfix release, released on X)
- Fix hashindex ARM memory alignment issue (#309)
- Improve hashindex error messages (#298)
Version 0.16
~~~~~~~~~~~~
(bugfix release, released on May 16, 2015)
- Fix typo preventing the security confirmation prompt from working (#303)
- Improve handling of systems with improperly configured file system encoding (#289)
- Fix "All archives" output for attic info. (#183)
- More user friendly error message when repository key file is not found (#236)
- Fix parsing of iso 8601 timestamps with zero microseconds (#282)
Version 0.15
~~~~~~~~~~~~
(bugfix release, released on Apr 15, 2015)
- xattr: Be less strict about unknown/unsupported platforms (#239)
- Reduce repository listing memory usage (#163).
- Fix BrokenPipeError for remote repositories (#233)
- Fix incorrect behavior with two character directory names (#265, #268)
- Require approval before accessing relocated/moved repository (#271)
- Require approval before accessing previously unknown unencrypted repositories (#271)
- Fix issue with hash index files larger than 2GB.
- Fix Python 3.2 compatibility issue with noatime open() (#164)
- Include missing pyx files in dist files (#168)
Version 0.14
~~~~~~~~~~~~
(feature release, released on Dec 17, 2014)
- Added support for stripping leading path segments (#95)
"attic extract --strip-segments X"
- Add workaround for old Linux systems without acl_extended_file_no_follow (#96)
- Add MacPorts' path to the default openssl search path (#101)
- HashIndex improvements, eliminates unnecessary IO on low memory systems.
- Fix "Number of files" output for attic info. (#124)
- limit create file permissions so files aren't read while restoring
- Fix issue with empty xattr values (#106)
Version 0.13
~~~~~~~~~~~~
(feature release, released on Jun 29, 2014)
- Fix sporadic "Resource temporarily unavailable" when using remote repositories
- Reduce file cache memory usage (#90)
- Faster AES encryption (utilizing AES-NI when available)
- Experimental Linux, OS X and FreeBSD ACL support (#66)
- Added support for backup and restore of BSDFlags (OSX, FreeBSD) (#56)
- Fix bug where xattrs on symlinks were not correctly restored
- Added cachedir support. CACHEDIR.TAG compatible cache directories
can now be excluded using ``--exclude-caches`` (#74)
- Fix crash on extreme mtime timestamps (year 2400+) (#81)
- Fix Python 3.2 specific lockf issue (EDEADLK)
Version 0.12
~~~~~~~~~~~~
(feature release, released on April 7, 2014)
- Python 3.4 support (#62)
- Various documentation improvements and a new style
- ``attic mount`` now supports mounting an entire repository not only
individual archives (#59)
- Added option to restrict remote repository access to specific path(s):
``attic serve --restrict-to-path X`` (#51)
- Include "all archives" size information in "--stats" output. (#54)
- Added ``--stats`` option to ``attic delete`` and ``attic prune``
- Fixed bug where ``attic prune`` used UTC instead of the local time zone
when determining which archives to keep.
- Switch to SI units (powers of 1000 instead of 1024) when printing file sizes
Version 0.11
~~~~~~~~~~~~
(feature release, released on March 7, 2014)
- New "check" command for repository consistency checking (#24)
- Documentation improvements
- Fix exception during "attic create" with repeated files (#39)
- New "--exclude-from" option for attic create/extract/verify.
- Improved archive metadata deduplication.
- "attic verify" has been deprecated. Use "attic extract --dry-run" instead.
- "attic prune --hourly|daily|..." has been deprecated.
Use "attic prune --keep-hourly|daily|..." instead.
- Ignore xattr errors during "extract" if not supported by the filesystem. (#46)
Version 0.10
~~~~~~~~~~~~
(bugfix release, released on Jan 30, 2014)
- Fix deadlock when extracting 0 sized files from remote repositories
- "--exclude" wildcard patterns are now properly applied to the full path
not just the file name part (#5).
- Make source code endianness agnostic (#1)
Version 0.9
~~~~~~~~~~~
(feature release, released on Jan 23, 2014)
- Remote repository speed and reliability improvements.
- Fix sorting of segment names to ignore NFS left over files. (#17)
- Fix incorrect display of time (#13)
- Improved error handling / reporting. (#12)
- Use fcntl() instead of flock() when locking repository/cache. (#15)
- Let ssh figure out port/user if not specified so we don't override .ssh/config (#9)
- Improved libcrypto path detection (#23).
Version 0.8.1
~~~~~~~~~~~~~
(bugfix release, released on Oct 4, 2013)
- Fix segmentation fault issue.
Version 0.8
~~~~~~~~~~~
(feature release, released on Oct 3, 2013)
- Fix xattr issue when backing up sshfs filesystems (#4)
- Fix issue with excessive index file size (#6)
- Support access of read only repositories.
- New syntax to enable repository encryption:
attic init --encryption="none|passphrase|keyfile".
- Detect and abort if repository is older than the cache.
Version 0.7
~~~~~~~~~~~
(feature release, released on Aug 5, 2013)
- Ported to FreeBSD
- Improved documentation
- Experimental: Archives mountable as FUSE filesystems.
- The "user." prefix is no longer stripped from xattrs on Linux
Version 0.6.1
~~~~~~~~~~~~~
(bugfix release, released on July 19, 2013)
- Fixed an issue where mtime was not always correctly restored.
Version 0.6
~~~~~~~~~~~
First public release on July 9, 2013

File diff suppressed because it is too large

View file

@ -1,7 +1,9 @@
# Documentation build configuration file, created by
# -*- coding: utf-8 -*-
#
# documentation build configuration file, created by
# sphinx-quickstart on Sat Sep 10 18:18:25 2011.
#
# This file is execfile()d with the current directory set to its containing directory.
# This file is execfile()d with the current directory set to its containing dir.
#
# Note that not all possible configuration values are present in this
# autogenerated file.
@ -12,164 +14,155 @@
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
import sys
import os
sys.path.insert(0, os.path.abspath("../src"))
import sys, os
sys.path.insert(0, os.path.abspath('..'))
from borg import __version__ as sw_version
on_rtd = os.environ.get('READTHEDOCS', None) == 'True'
# -- General configuration -----------------------------------------------------
# If your documentation needs a minimal Sphinx version, state it here.
# needs_sphinx = '1.0'
#needs_sphinx = '1.0'
# Add any Sphinx extension module names here, as strings. They can be extensions
# coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
extensions = []
# Add any paths that contain templates here, relative to this directory.
templates_path = ["_templates"]
templates_path = ['_templates']
# The suffix of source filenames.
source_suffix = ".rst"
source_suffix = '.rst'
# The encoding of source files.
# source_encoding = 'utf-8-sig'
#source_encoding = 'utf-8-sig'
# The master toctree document.
master_doc = "index"
master_doc = 'index'
# General information about the project.
project = "Borg - Deduplicating Archiver"
copyright = "2010-2014 Jonas Borgström, 2015-2026 The Borg Collective (see AUTHORS file)"
project = 'Borg - Deduplicating Archiver'
copyright = '2010-2014 Jonas Borgström, 2015-2016 The Borg Collective (see AUTHORS file)'
# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The short X.Y version.
split_char = "+" if "+" in sw_version else "-"
version = sw_version.split(split_char)[0]
version = sw_version.split('-')[0]
# The full version, including alpha/beta/rc tags.
release = version
suppress_warnings = ["image.nonlocal_uri"]
# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
# language = None
#language = None
# There are two options for replacing |today|: either, you set today to some
# non-false value, then it is used:
# today = ''
#today = ''
# Else, today_fmt is used as the format for a strftime call.
today_fmt = "%Y-%m-%d"
today_fmt = '%Y-%m-%d'
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
exclude_patterns = ["_build"]
exclude_patterns = ['_build']
# The reST default role (used for this markup: `text`) to use for all documents.
# default_role = None
# The Borg docs contain no or very little Python docs.
# Thus, the primary domain is RST.
primary_domain = "rst"
#default_role = None
# If true, '()' will be appended to :func: etc. cross-reference text.
# add_function_parentheses = True
#add_function_parentheses = True
# If true, the current module name will be prepended to all description
# unit titles (such as .. function::).
# add_module_names = True
#add_module_names = True
# If true, sectionauthor and moduleauthor directives will be shown in the
# output. They are ignored by default.
# show_authors = False
#show_authors = False
# The name of the Pygments (syntax highlighting) style to use.
pygments_style = "sphinx"
pygments_style = 'sphinx'
# A list of ignored prefixes for module index sorting.
# modindex_common_prefix = []
#modindex_common_prefix = []
# -- Options for HTML output ---------------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of built-in themes.
import guzzle_sphinx_theme
html_theme_path = guzzle_sphinx_theme.html_theme_path()
html_theme = "guzzle_sphinx_theme"
def set_rst_settings(app):
app.env.settings.update({"field_name_limit": 0, "option_limit": 0})
def setup(app):
app.setup_extension("sphinxcontrib.jquery")
app.add_css_file("css/borg.css")
app.connect("builder-inited", set_rst_settings)
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#html_theme = ''
if not on_rtd: # only import and set the theme if we're building docs locally
import sphinx_rtd_theme
html_theme = 'sphinx_rtd_theme'
html_theme_path = [sphinx_rtd_theme.get_html_theme_path()]
html_style = 'css/borg.css'
else:
html_context = {
'css_files': [
'https://media.readthedocs.org/css/sphinx_rtd_theme.css',
'https://media.readthedocs.org/css/readthedocs-doc-embed.css',
'_static/css/borg.css',
],
}
# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
html_theme_options = {"project_nav_name": "Borg %s" % version}
#html_theme_options = {}
# Add any paths that contain custom themes here, relative to this directory.
# html_theme_path = ['_themes']
#html_theme_path = ['_themes']
# The name for this set of Sphinx documents. If None, it defaults to
# "<project> v<release> documentation".
# html_title = None
#html_title = None
# A shorter title for the navigation bar. Default is the same as html_title.
# html_short_title = None
#html_short_title = None
# The name of an image file (relative to this directory) to place at the top
# of the sidebar.
html_logo = "_static/logo.svg"
html_logo = '_static/logo.png'
# The name of an image file (within the static path) to use as favicon of the
# docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32
# pixels large.
html_favicon = "_static/favicon.ico"
html_favicon = '_static/favicon.ico'
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ["borg_theme"]
html_extra_path = ["../src/borg/paperkey.html"]
html_static_path = ['borg_theme']
# If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
# using the given strftime format.
html_last_updated_fmt = "%Y-%m-%d"
html_last_updated_fmt = '%Y-%m-%d'
# If true, SmartyPants will be used to convert quotes and dashes to
# typographically correct entities.
html_use_smartypants = True
smartquotes_action = "qe" # no D in there means "do not transform -- and ---"
#html_use_smartypants = True
# Custom sidebar templates, maps document names to template names.
html_sidebars = {"**": ["logo-text.html", "versionselector.html", "searchbox.html", "downloads.html", "globaltoc.html"]}
html_sidebars = {
'index': ['sidebarlogo.html', 'sidebarusefullinks.html', 'searchbox.html'],
'**': ['sidebarlogo.html', 'relations.html', 'searchbox.html', 'localtoc.html', 'sidebarusefullinks.html']
}
# Additional templates that should be rendered to pages, maps page names to
# template names.
# html_additional_pages = {}
#html_additional_pages = {}
# If false, no module index is generated.
# html_domain_indices = True
#html_domain_indices = True
# If false, no index is generated.
html_use_index = False
# If true, the index is split into individual pages for each letter.
# html_split_index = False
#html_split_index = False
# If true, links to the reST sources are added to the pages.
html_show_sourcelink = False
@ -183,45 +176,52 @@ html_show_copyright = False
# If true, an OpenSearch description file will be output, and all pages will
# contain a <link> tag referring to it. The value of this option must be the
# base URL from which the finished HTML is served.
# html_use_opensearch = ''
#html_use_opensearch = ''
# This is the file name suffix for HTML files (e.g. ".xhtml").
# html_file_suffix = None
#html_file_suffix = None
# Output file base name for HTML help builder.
htmlhelp_basename = "borgdoc"
htmlhelp_basename = 'borgdoc'
# -- Options for LaTeX output --------------------------------------------------
# The paper size ('letter' or 'a4').
#latex_paper_size = 'letter'
# The font size ('10pt', '11pt' or '12pt').
#latex_font_size = '10pt'
# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title, author, documentclass [howto/manual]).
latex_documents = [("book", "Borg.tex", "Borg Documentation", "The Borg Collective", "manual")]
latex_documents = [
('index', 'Borg.tex', 'Borg Documentation',
'see "AUTHORS" file', 'manual'),
]
# The name of an image file (relative to this directory) to place at the top of
# the title page.
latex_logo = "_static/logo.pdf"
latex_elements = {"papersize": "a4paper", "pointsize": "10pt", "figure_align": "H"}
#latex_logo = None
# For "manual" documents, if this is true, then toplevel headings are parts,
# not chapters.
# latex_use_parts = False
#latex_use_parts = False
# If true, show page references after internal links.
# latex_show_pagerefs = False
#latex_show_pagerefs = False
# If true, show URL addresses after external links.
latex_show_urls = "footnote"
#latex_show_urls = False
# Additional stuff for the LaTeX preamble.
# latex_preamble = ''
#latex_preamble = ''
# Documents to append as an appendix to all manuals.
latex_appendices = ["support", "changes", "authors"]
#latex_appendices = []
# If false, no module index is generated.
# latex_domain_indices = True
#latex_domain_indices = True
# -- Options for manual page output --------------------------------------------
@ -229,23 +229,15 @@ latex_appendices = ["support", "changes", "authors"]
# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
(
"usage",
"borg",
"BorgBackup is a deduplicating backup program with optional compression and authenticated encryption.",
["The Borg Collective (see AUTHORS file)"],
1,
)
('usage', 'borg',
'BorgBackup is a deduplicating backup program with optional compression and authenticated encryption.',
['The Borg Collective (see AUTHORS file)'],
1),
]
extensions = [
"sphinx.ext.extlinks",
"sphinx.ext.autodoc",
"sphinx.ext.todo",
"sphinx.ext.coverage",
"sphinx.ext.viewcode",
"sphinxcontrib.jquery", # jquery is not included anymore by default
"guzzle_sphinx_theme", # register the theme as an extension to generate a sitemap.xml
]
extensions = ['sphinx.ext.extlinks', 'sphinx.ext.autodoc', 'sphinx.ext.todo', 'sphinx.ext.coverage', 'sphinx.ext.viewcode']
extlinks = {"issue": ("https://github.com/borgbackup/borg/issues/%s", "#%s")}
extlinks = {
'issue': ('https://github.com/borgbackup/borg/issues/%s', '#'),
'targz_url': ('https://pypi.python.org/packages/source/b/borgbackup/%%s-%s.tar.gz' % version, None),
}

View file

@ -1,17 +1,166 @@
.. include:: global.rst.inc
.. highlight:: none
.. _deployment:
Deployment
==========
This chapter details deployment strategies for the following scenarios.
This chapter gives an example of how to set up a borg repository server for multiple
clients.
.. toctree::
:titlesonly:
Machines
--------
deployment/central-backup-server
deployment/hosting-repositories
deployment/automated-local
deployment/image-backup
deployment/pull-backup
deployment/non-root-user
Multiple machines are used in this chapter; they are referred to by their
respective fully qualified domain names (FQDN).
* The backup server: `backup01.srv.local`
* The clients:
- John Doe's desktop: `johndoe.clnt.local`
- Webserver 01: `web01.srv.local`
- Application server 01: `app01.srv.local`
User and group
--------------
The repository server needs to have only one UNIX user for all the clients.
Recommended user and group with additional settings:
* User: `backup`
* Group: `backup`
* Shell: `/bin/bash` (or another shell capable of running the `borg serve` command)
* Home: `/home/backup`
Most clients should initiate backups as the root user to capture all
users, groups and permissions (e.g. when backing up `/home`).
Folders
-------
The following folder tree layout is suggested on the repository server:
* User home directory, /home/backup
* Repositories path (storage pool): /home/backup/repos
* Clients restricted paths (`/home/backup/repos/<client fqdn>`):
- johndoe.clnt.local: `/home/backup/repos/johndoe.clnt.local`
- web01.srv.local: `/home/backup/repos/web01.srv.local`
- app01.srv.local: `/home/backup/repos/app01.srv.local`
Restrictions
------------
Borg is instructed to restrict clients into their own paths:
``borg serve --restrict-to-path /home/backup/repos/<client fqdn>``
Only one ssh key per client is allowed. Keys are added for ``johndoe.clnt.local``, ``web01.srv.local`` and
``app01.srv.local``, but all of them access the backup through a single UNIX user account:
``backup@backup01.srv.local``. Every key in ``$HOME/.ssh/authorized_keys`` has a
forced command and restrictions applied as shown below:
::
command="cd /home/backup/repos/<client fqdn>;
borg serve --restrict-to-path /home/backup/repos/<client fqdn>",
no-port-forwarding,no-X11-forwarding,no-pty,
no-agent-forwarding,no-user-rc <keytype> <key> <host>
.. note:: The text shown above needs to be written on a single line!
The options which are added to the key will perform the following:
1. Change working directory
2. Run ``borg serve`` restricted to the client base path
3. Restrict ssh and do not allow stuff which imposes a security risk
Because of the ``cd`` command, the server automatically changes the current
working directory. The client therefore doesn't need to know the absolute
or relative remote repository path and can directly access the repositories at
``<user>@<host>:<repo>``.
.. note:: The setup above ignores all client given commandline parameters
which are normally appended to the `borg serve` command.
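The single-line ``authorized_keys`` entry described above can be assembled programmatically. The following is a minimal sketch, assuming the folder layout from this chapter; the function name ``authorized_keys_entry`` and the public key string are placeholders invented for illustration:

```python
# Restrictions taken from the forced-command example in this chapter.
SSH_RESTRICTIONS = (
    "no-port-forwarding",
    "no-X11-forwarding",
    "no-pty",
    "no-agent-forwarding",
    "no-user-rc",
)


def authorized_keys_entry(pool, client_fqdn, public_key):
    """Build one authorized_keys line for a client (hypothetical helper)."""
    repo_dir = "%s/%s" % (pool.rstrip("/"), client_fqdn)
    # Forced command: change into the client's folder, then serve only it.
    command = "cd %s;borg serve --restrict-to-path %s" % (repo_dir, repo_dir)
    options = ['command="%s"' % command] + list(SSH_RESTRICTIONS)
    # Everything must end up on a single line.
    return "%s %s" % (",".join(options), public_key.strip())


print(authorized_keys_entry(
    "/home/backup/repos",
    "johndoe.clnt.local",
    "ssh-ed25519 AAAA... johndoe.clnt.local",  # placeholder, not a real key
))
```

Generating the line this way avoids the accidental line breaks that the note above warns about.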
Client
------
The client needs to initialize the `pictures` repository like this:
borg init backup@backup01.srv.local:pictures
Or with the full path (for demonstration purposes only; this should never be needed in practice).
The server should automatically change the current working directory to the `<client fqdn>` folder.
borg init backup@backup01.srv.local:/home/backup/repos/johndoe.clnt.local/pictures
When `johndoe.clnt.local` tries to access a path outside its restricted path, the following error is raised.
Here John Doe tries to back up into the Web 01 path:
borg init backup@backup01.srv.local:/home/backup/repos/web01.srv.local/pictures
::
~~~ SNIP ~~~
Remote: borg.remote.PathNotAllowed: /home/backup/repos/web01.srv.local/pictures
~~~ SNIP ~~~
Repository path not allowed
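The idea behind such a restriction can be sketched as a simple path containment check. This is an illustration only and not borg's actual implementation; the function name ``is_path_allowed`` is hypothetical, and the sketch does not resolve symlinks:

```python
import os.path


def is_path_allowed(requested, restricted_to, cwd):
    """Return True if the requested repo path lies inside restricted_to.

    Relative requests are resolved against cwd, mirroring the ``cd`` in
    the forced command. Illustration only; symlinks are not resolved.
    """
    full = os.path.normpath(os.path.join(cwd, requested))
    base = os.path.normpath(restricted_to)
    return os.path.commonpath([full, base]) == base


base = "/home/backup/repos/johndoe.clnt.local"
print(is_path_allowed("pictures", base, base))
print(is_path_allowed("/home/backup/repos/web01.srv.local/pictures", base, base))
```

The first call corresponds to John Doe's own ``pictures`` repository, the second to the rejected attempt to reach the Web 01 path.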
Ansible
-------
Ansible takes care of all the system-specific commands to add the user and create the
folders. Even when the configuration is changed, the repository server configuration stays
consistent and reproducible.
To automate setting up a repository server with the user, group, folders and
permissions, an Ansible playbook could be used. Keep in mind that the playbook
uses the Arch Linux `pacman <https://www.archlinux.org/pacman/pacman.8.html>`_
package manager to install and keep borg up-to-date.
::
- hosts: backup01.srv.local
vars:
user: backup
group: backup
home: /home/backup
pool: "{{ home }}/repos"
auth_users:
- host: johndoe.clnt.local
key: "{{ lookup('file', '/path/to/keys/johndoe.clnt.local.pub') }}"
- host: web01.srv.local
key: "{{ lookup('file', '/path/to/keys/web01.srv.local.pub') }}"
- host: app01.srv.local
key: "{{ lookup('file', '/path/to/keys/app01.srv.local.pub') }}"
tasks:
- pacman: name=borg state=latest update_cache=yes
- group: name="{{ group }}" state=present
- user: name="{{ user }}" shell=/bin/bash home="{{ home }}" createhome=yes group="{{ group }}" groups= state=present
- file: path="{{ home }}" owner="{{ user }}" group="{{ group }}" mode=0700 state=directory
- file: path="{{ home }}/.ssh" owner="{{ user }}" group="{{ group }}" mode=0700 state=directory
- file: path="{{ pool }}" owner="{{ user }}" group="{{ group }}" mode=0700 state=directory
- authorized_key: user="{{ user }}"
key="{{ item.key }}"
key_options='command="cd {{ pool }}/{{ item.host }};borg serve --restrict-to-path {{ pool }}/{{ item.host }}",no-port-forwarding,no-X11-forwarding,no-pty,no-agent-forwarding,no-user-rc'
with_items: auth_users
- file: path="{{ home }}/.ssh/authorized_keys" owner="{{ user }}" group="{{ group }}" mode=0600 state=file
- file: path="{{ pool }}/{{ item.host }}" owner="{{ user }}" group="{{ group }}" mode=0700 state=directory
with_items: auth_users
Enhancements
------------
As this chapter only describes a simple and effective setup, it could be further
enhanced by supporting a limited set of client-supplied commands. A wrapper
for starting `borg serve` could be written, or borg itself could be enhanced to
autodetect that it runs under SSH by checking the `SSH_ORIGINAL_COMMAND` environment
variable. This is left open for future improvements.
If SSH autodetection were added to borg itself, no external wrapper script would be necessary
and no other interpreter or application would have to be deployed.
See also
--------
* `SSH Daemon manpage <http://www.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man8/sshd.8>`_
* `Ansible <https://docs.ansible.com>`_

View file

@ -1,216 +0,0 @@
.. include:: ../global.rst.inc
.. highlight:: none
Automated backups to a local hard drive
=======================================
This guide shows how to automate backups to a hard drive directly connected
to your computer. When a backup hard drive is connected, backups start automatically,
and when they are done the drive is unmounted and spun down so that it can be disconnected.
This guide is written for a Linux-based operating system and makes use of
systemd and udev.
Overview
--------
A udev rule is created to trigger on the addition of block devices. The rule contains a tag
that triggers systemd to start a one-shot service. The one-shot service executes a script in
the standard systemd service environment, which automatically captures stdout/stderr and
logs it to the journal.
The script mounts the added block device if it is a registered backup drive and creates
backups on it. When done, it optionally unmounts the filesystem and spins the drive down,
so that it may be physically disconnected.
Configuring the system
----------------------
First, create the ``/etc/backups`` directory (as root).
All configuration goes into this directory.
Find out the ID of the partition table of your backup disk (here assumed to be /dev/sdz)::
lsblk --fs -o +PTUUID /dev/sdz
Then, create ``/etc/backups/80-backup.rules`` with the following content (all on one line)::
ACTION=="add", SUBSYSTEM=="block", ENV{ID_PART_TABLE_UUID}=="<the PTUUID you just noted>", TAG+="systemd", ENV{SYSTEMD_WANTS}+="automatic-backup.service"
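The rule above differs between machines only in the partition table UUID, so it can be generated instead of typed by hand. The following is a minimal sketch; the function name ``make_backup_rule`` and the example PTUUID are made up for illustration.

```bash
# Hypothetical helper (not part of Borg): print the udev rule for one
# partition table UUID, as a single line.
make_backup_rule() {
    ptuuid=$1
    printf 'ACTION=="add", SUBSYSTEM=="block", ENV{ID_PART_TABLE_UUID}=="%s", TAG+="systemd", ENV{SYSTEMD_WANTS}+="automatic-backup.service"\n' "$ptuuid"
}

# Example call with a made-up PTUUID. As root, redirect the output to
# /etc/backups/80-backup.rules to create the rule file.
make_backup_rule "0a1b2c3d"
```

Generating the line with ``printf`` keeps the rule on one line, which udev requires here.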
The "systemd" tag in conjunction with the SYSTEMD_WANTS environment variable has systemd
launch the "automatic-backup" service, which we will create next, as the
``/etc/backups/automatic-backup.service`` file:
.. code-block:: ini
[Service]
Type=oneshot
ExecStart=/etc/backups/run.sh
Now, create the main backup script, ``/etc/backups/run.sh``. Below is a template;
modify it to suit your needs (e.g., more backup sets, dumping databases, etc.).
.. code-block:: bash
#!/bin/bash -ue
# The udev rule is not terribly accurate and may trigger our service before
# the kernel has finished probing partitions. Sleep for a bit to ensure
# the kernel is done.
#
# This can be avoided by using a more precise udev rule, e.g. matching
# a specific hardware path and partition.
sleep 5
#
# Script configuration
#
# The backup partition is mounted there
MOUNTPOINT=/mnt/backup
# This is the location of the Borg repository
TARGET=$MOUNTPOINT/borg-backups/backup.borg
# Archive name schema
DATE=$(date --iso-8601)-$(hostname)
# This is the file that will later contain UUIDs of registered backup drives
DISKS=/etc/backups/backup.disks
# Find whether the connected block device is a backup drive
for uuid in $(lsblk --noheadings --list --output uuid)
do
if grep --quiet --fixed-strings "$uuid" "$DISKS"; then
break
fi
uuid=
done
# ${uuid:-} avoids an "unbound variable" error under "bash -u" when lsblk lists no UUIDs
if [ -z "${uuid:-}" ]; then
echo "No backup disk found, exiting"
exit 0
fi
echo "Disk $uuid is a backup disk"
partition_path=/dev/disk/by-uuid/$uuid
# Mount filesystem if not already done. This assumes that if something is already
# mounted at $MOUNTPOINT, it is the backup drive. It will not find the drive if
# it was mounted somewhere else.
findmnt $MOUNTPOINT >/dev/null || mount $partition_path $MOUNTPOINT
drive=$(lsblk --inverse --noheadings --list --paths --output name $partition_path | head --lines 1)
echo "Drive path: $drive"
#
# Create backups
#
# Options for borg create
BORG_OPTS="--stats --one-file-system --compression lz4"
# Set BORG_PASSPHRASE or BORG_PASSCOMMAND somewhere around here, using export,
# if encryption is used.
# Because no one can answer these questions non-interactively, it is better to
# fail quickly instead of hanging.
export BORG_RELOCATED_REPO_ACCESS_IS_OK=no
export BORG_UNKNOWN_UNENCRYPTED_REPO_ACCESS_IS_OK=no
# Log Borg version
borg --version
echo "Starting backup for $DATE"
# This is just an example, change it however you see fit
borg create $BORG_OPTS \
--exclude root/.cache \
--exclude var/lib/docker/devicemapper \
$TARGET::$DATE-$$-system \
/ /boot
# /home is often a separate partition/filesystem.
# Even if it is not (add --exclude /home above), it probably makes sense
# to have /home in a separate archive.
borg create $BORG_OPTS \
--exclude 'sh:home/*/.cache' \
$TARGET::$DATE-$$-home \
/home/
echo "Completed backup for $DATE"
# Just to be completely paranoid
sync
if [ -f /etc/backups/autoeject ]; then
umount $MOUNTPOINT
hdparm -Y $drive
fi
if [ -f /etc/backups/backup-suspend ]; then
systemctl suspend
fi
Create the ``/etc/backups/autoeject`` file to have the script automatically eject the drive
after creating the backup. Rename the file to something else (e.g., ``/etc/backups/autoeject-no``)
when you want to do something with the drive after creating backups (e.g., running checks).
Create the ``/etc/backups/backup-suspend`` file if the machine should suspend after completing
the backup. Don't forget to disconnect the device physically before resuming,
otherwise you'll enter a cycle. You can also add an option to power down instead.
Create an empty ``/etc/backups/backup.disks`` file, in which you will register your backup drives.
Finally, enable the udev rules and services:
.. code-block:: bash
ln -s /etc/backups/80-backup.rules /etc/udev/rules.d/80-backup.rules
ln -s /etc/backups/automatic-backup.service /etc/systemd/system/automatic-backup.service
systemctl daemon-reload
udevadm control --reload
Adding backup hard drives
-------------------------
Connect your backup hard drive. Format it, if not done already.
Find the UUID of the filesystem on which backups should be stored::
lsblk -o+uuid,label
Record the UUID in the ``/etc/backups/backup.disks`` file.
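Registering a drive amounts to adding one line to that file. A minimal sketch follows, assuming one UUID per line; the function name ``register_backup_disk`` is hypothetical and the UUID in the comment is made up.

```bash
# Hypothetical helper (not part of Borg): append a filesystem UUID to the
# registry file unless it is already listed. Assumes one UUID per line.
register_backup_disk() {
    uuid=$1
    disks=${2:-/etc/backups/backup.disks}
    touch "$disks"
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    if grep -qxF -- "$uuid" "$disks"; then
        echo "already registered: $uuid"
    else
        printf '%s\n' "$uuid" >> "$disks"
        echo "registered: $uuid"
    fi
}

# Usage (as root), with a made-up UUID:
#   register_backup_disk 2f6b1d0e-aaaa-bbbb-cccc-0123456789ab
```

Checking before appending keeps the file free of duplicates when the same drive is registered twice.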
Mount the drive at /mnt/backup.
Initialize a Borg repository at the location indicated by ``TARGET``::
borg init --encryption ... /mnt/backup/borg-backups/backup.borg
Unmount and reconnect the drive, or manually start the ``automatic-backup`` service
to start the first backup::
systemctl start --no-block automatic-backup
See backup logs using journalctl::
journalctl -fu automatic-backup [-n number-of-lines]
Security considerations
-----------------------
The script as shown above will mount any filesystem with a UUID listed in
``/etc/backups/backup.disks``. The UUID check is a safety/annoyance-reduction
mechanism to keep the script from blowing up whenever a random USB thumb drive is connected.
It is not meant as a security mechanism. Mounting filesystems and reading repository
data exposes additional attack surfaces (kernel filesystem drivers,
possibly userspace services, and Borg itself). On the other hand, someone
standing right next to your computer can attempt a lot of attacks, most of which
are easier to do than, e.g., exploiting filesystems (installing a physical keylogger,
DMA attacks, stealing the machine, ...).
Borg ensures that backups are not created on random drives that "just happen"
to contain a Borg repository. If an unknown unencrypted repository is encountered,
then the script aborts (BORG_UNKNOWN_UNENCRYPTED_REPO_ACCESS_IS_OK=no).
Backups are only created on hard drives that contain a Borg repository that is
either known (by ID) to your machine or, if you are using encryption, has a
passphrase that matches the passphrase supplied to Borg.

View file

@ -1,228 +0,0 @@
.. include:: ../global.rst.inc
.. highlight:: none
.. _central-backup-server:
Central repository server with Ansible or Salt
==============================================
This section gives an example of how to set up a Borg repository server for multiple
clients.
.. note::
This example predates Borg 2 and uses the legacy ``ssh://`` transport (served
by ``borg serve``) and ``borg init``. With Borg 2, the ``ssh://`` transport is
only used for legacy borg 1.x (v1) repositories; for current repositories use a
``rest://`` repository instead (Borg connects via ssh and runs a borgstore REST
server on the remote host), and use ``borg repo-create`` instead of ``borg init``.
Machines
--------
This section uses multiple machines, referred to by their
respective fully qualified domain names (FQDNs).
* The backup server: `backup01.srv.local`
* The clients:
- John Doe's desktop: `johndoe.clnt.local`
- Web server 01: `web01.srv.local`
- Application server 01: `app01.srv.local`
User and group
--------------
The repository server should have a single UNIX user for all the clients.
Recommended user and group with additional settings:
* User: `backup`
* Group: `backup`
* Shell: `/bin/bash` (or another shell capable of running the `borg serve` command)
* Home: `/home/backup`
Most clients should initiate a backup as the root user to capture all
users, groups, and permissions (e.g., when backing up `/home`).
Folders
-------
The following directory layout is suggested on the repository server:
* User home directory, /home/backup
* Repositories path (storage pool): /home/backup/repos
* Clients restricted paths (`/home/backup/repos/<client fqdn>`):
- johndoe.clnt.local: `/home/backup/repos/johndoe.clnt.local`
- web01.srv.local: `/home/backup/repos/web01.srv.local`
- app01.srv.local: `/home/backup/repos/app01.srv.local`
Restrictions
------------
Borg is instructed to restrict clients into their own paths:
``borg serve --restrict-to-path /home/backup/repos/<client fqdn>``
The client will be able to access any file or subdirectory inside of ``/home/backup/repos/<client fqdn>``
but no other directories. You can allow a client to access several separate directories by passing multiple
``--restrict-to-path`` flags, for instance: ``borg serve --restrict-to-path /home/backup/repos/<client fqdn> --restrict-to-path /home/backup/repos/<other client fqdn>``,
which could make sense if multiple machines belong to one person, who should then have access to all the
backups of their machines.
Only one SSH key per client is allowed. Keys are added for ``johndoe.clnt.local``, ``web01.srv.local`` and
``app01.srv.local``. They will access the backup under a single UNIX user account as
``backup@backup01.srv.local``. Every key in ``$HOME/.ssh/authorized_keys`` has a
forced command and restrictions applied, as shown below:
::
command="cd /home/backup/repos/<client fqdn>;
borg serve --restrict-to-path /home/backup/repos/<client fqdn>",
restrict <keytype> <key> <host>
.. note:: The text shown above needs to be written on a single line!
The options added to the key perform the following:
1. Change working directory
2. Run ``borg serve`` restricted to the client base path
3. Restrict SSH and disallow features that pose a security risk
Because of the ``cd`` command, the server automatically changes the current
working directory. The client then does not need to know the absolute
or relative remote repository path and can directly access the repositories at
``ssh://<user>@<host>/./<repo>``.
.. note:: The setup above ignores all client-given command line parameters
that are normally appended to the `borg serve` command.
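Because every entry follows the same pattern, the ``authorized_keys`` lines can be generated per client. The sketch below assumes the directory layout from this section; the function name ``authorized_keys_entry`` and the key material in the comment are placeholders, not real values.

```bash
# Hypothetical helper (not part of Borg): print one authorized_keys line.
# $1: client FQDN
# $2: contents of the client's public key file ("<keytype> <key> <host>")
# $3: storage pool, defaults to the path used in this section
authorized_keys_entry() {
    client=$1
    pubkey=$2
    pool=${3:-/home/backup/repos}
    printf 'command="cd %s/%s; borg serve --restrict-to-path %s/%s",restrict %s\n' \
        "$pool" "$client" "$pool" "$client" "$pubkey"
}

# Usage with a placeholder key:
#   authorized_keys_entry johndoe.clnt.local "ssh-ed25519 AAAA... johndoe@johndoe.clnt.local"
```

The output is always a single line, which is what ``sshd`` expects for each key.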
Client
------
The client needs to initialize the `pictures` repository like this:
::
borg init ssh://backup@backup01.srv.local/./pictures
Or with the full path (this should not be used in practice; it is only for demonstration purposes).
The server automatically changes the current working directory to the `<client fqdn>` directory.
::
borg init ssh://backup@backup01.srv.local/home/backup/repos/johndoe.clnt.local/pictures
When `johndoe.clnt.local` tries to access a path outside its restriction, the following error is raised.
John Doe tries to back up into the web01 path:
::
borg init ssh://backup@backup01.srv.local/home/backup/repos/web01.srv.local/pictures
::
~~~ SNIP ~~~
Remote: borg.remote.PathNotAllowed: /home/backup/repos/web01.srv.local/pictures
~~~ SNIP ~~~
Repository path not allowed
Ansible
-------
Ansible takes care of all the system-specific commands to add the user, create the
folders, and install and configure the software.
::
    - hosts: backup01.srv.local
      vars:
        user: backup
        group: backup
        home: /home/backup
        pool: "{{ home }}/repos"
        auth_users:
          - host: johndoe.clnt.local
            key: "{{ lookup('file', '/path/to/keys/johndoe.clnt.local.pub') }}"
          - host: web01.srv.local
            key: "{{ lookup('file', '/path/to/keys/web01.srv.local.pub') }}"
          - host: app01.srv.local
            key: "{{ lookup('file', '/path/to/keys/app01.srv.local.pub') }}"
      tasks:
      - package: name=borg state=present
      - group: name="{{ group }}" state=present
      - user: name="{{ user }}" shell=/bin/bash home="{{ home }}" createhome=yes group="{{ group }}" groups= state=present
      - file: path="{{ home }}" owner="{{ user }}" group="{{ group }}" mode=0700 state=directory
      - file: path="{{ home }}/.ssh" owner="{{ user }}" group="{{ group }}" mode=0700 state=directory
      - file: path="{{ pool }}" owner="{{ user }}" group="{{ group }}" mode=0700 state=directory
      - authorized_key: user="{{ user }}"
                        key="{{ item.key }}"
                        key_options='command="cd {{ pool }}/{{ item.host }};borg serve --restrict-to-path {{ pool }}/{{ item.host }}",restrict'
        with_items: "{{ auth_users }}"
      - file: path="{{ home }}/.ssh/authorized_keys" owner="{{ user }}" group="{{ group }}" mode=0600 state=file
      - file: path="{{ pool }}/{{ item.host }}" owner="{{ user }}" group="{{ group }}" mode=0700 state=directory
        with_items: "{{ auth_users }}"
Salt
----
This is a configuration similar to the one above, configured to be deployed with
Salt running on a Debian system.
::
    Install borg backup from pip:
      pkg.installed:
        - pkgs:
          - python3
          - python3-dev
          - python3-pip
          - python-virtualenv
          - libssl-dev
          - openssl
          - libacl1-dev
          - libacl1
          - build-essential
          - libfuse-dev
          - fuse
          - pkg-config
      pip.installed:
        - pkgs: ["borgbackup"]
        - bin_env: /usr/bin/pip3
    Setup backup user:
      user.present:
        - name: backup
        - fullname: Backup User
        - home: /home/backup
        - shell: /bin/bash
    # CAUTION!
    # If you change the ssh command= option below, it won't necessarily get pushed to the backup
    # server correctly unless you delete the ~/.ssh/authorized_keys file and re-create it!
    {% for host in backupclients %}
    Give backup access to {{host}}:
      ssh_auth.present:
        - user: backup
        - source: salt://conf/ssh-pubkeys/{{host}}-backup.id_ecdsa.pub
        - options:
          - command="cd /home/backup/repos/{{host}}; borg serve --restrict-to-path /home/backup/repos/{{host}}"
          - restrict
    {% endfor %}
Enhancements
------------
As this section only describes a simple and effective setup, it could be further
enhanced by supporting a limited set of client-supplied commands. A wrapper
for starting `borg serve` could be written, or borg itself could be enhanced to
autodetect that it runs under SSH by checking the `SSH_ORIGINAL_COMMAND` environment
variable. This is left open for future improvements.
If SSH autodetection were added to borg itself, no external wrapper script would be necessary
and no other interpreter or application would have to be deployed.
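The wrapper idea could look roughly like the following sketch. It is not part of borg: the function name ``choose_borg_command`` is hypothetical, and the sketch assumes that the client requests a command starting with ``borg serve``. ``sshd`` places the command requested by the client in ``SSH_ORIGINAL_COMMAND`` when a forced command is configured.

```bash
# Hypothetical wrapper logic (an assumption, not borg's behaviour):
# decide what to run based on the command the client asked for.
# $1: path the client is restricted to
choose_borg_command() {
    restrict_path=$1
    case "${SSH_ORIGINAL_COMMAND:-}" in
        "borg serve"|"borg serve "*)
            # Client-supplied options are dropped; only the restriction is kept.
            printf 'borg serve --restrict-to-path %s\n' "$restrict_path"
            ;;
        *)
            echo "rejected: only borg serve is allowed" >&2
            return 1
            ;;
    esac
}
```

The function only prints the command it would run, which keeps the decision easy to test. A real wrapper would ``exec`` the chosen command instead, and could pass through selected client options after validating them.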
See also
--------
* `SSH Daemon manpage <https://www.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man8/sshd.8>`_
* `Ansible <https://docs.ansible.com>`_
* `Salt <https://docs.saltstack.com/>`_

View file

@ -1,60 +0,0 @@
.. include:: ../global.rst.inc
.. highlight:: none
.. _hosting_repositories:
Hosting repositories
====================
This section shows how to provide repository storage securely for users.
Repositories are accessed through SSH. Each user of the service should
have their own login, which is only able to access that user's files.
Technically, it is possible to have multiple users share one login;
however, separating them is better. Separate logins increase isolation
and provide an additional layer of security and safety for both the
provider and the users.
For example, if a user manages to breach ``borg serve``, they can
only damage their own data (assuming that the system does not have further
vulnerabilities).
Use the standard directory structure of the operating system. Each user
is assigned a home directory, and that user's repositories reside in their
home directory.
The following ``~user/.ssh/authorized_keys`` file is the most important
piece for a correct deployment. It allows the user to log in via
their public key (which must be provided by the user), and restricts
SSH access to safe operations only.
::
command="borg serve --restrict-to-repository /home/<user>/repository",restrict
<key type> <key> <key host>
.. note:: The text shown above needs to be written on a **single** line!
.. warning::
If this file should be automatically updated (e.g. by a web console),
pay **utmost attention** to sanitizing user input. Strip all whitespace
around the user-supplied key, ensure that it **only** contains ASCII
with no control characters and that it consists of three parts separated
by a single space. Ensure that no newlines are contained within the key.
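The checks listed in the warning can be sketched as a small shell function. This is an illustration under stated assumptions, not a vetted sanitizer: the name ``sanitize_pubkey`` is hypothetical, and keys with embedded or leading newlines are rejected rather than repaired.

```bash
# Hypothetical helper (not part of Borg): print the cleaned key and return 0
# only if it has the form "<key type> <key> <key host>".
sanitize_pubkey() {
    # Strip leading and trailing whitespace.
    key=$(printf '%s' "$1" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')
    # Count bytes that are not printable ASCII (space through tilde).
    # Newlines, tabs and other control characters are counted here.
    bad=$(printf '%s' "$key" | LC_ALL=C tr -d ' -~' | wc -c | tr -d ' ')
    [ "$bad" -eq 0 ] || return 1
    # Require exactly three fields separated by single spaces.
    printf '%s\n' "$key" | LC_ALL=C grep -Eq '^[^ ]+ [^ ]+ [^ ]+$' || return 1
    printf '%s\n' "$key"
}
```

A caller would append the printed key to the ``authorized_keys`` line only when the function succeeds, and refuse the upload otherwise.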
The ``restrict`` keyword enables all restrictions, i.e. disables port, agent
and X11 forwarding, as well as disabling PTY allocation and execution of ~/.ssh/rc.
If any future restriction capabilities are added to authorized_keys
files they will be included in this set.
The ``command`` keyword forces execution of the specified command
upon login. This must be ``borg serve``. The ``--restrict-to-repository``
option permits access to exactly **one** repository. It can be given
multiple times to permit access to more than one repository.
The repository may not exist yet; it can be initialized by the user,
which allows for encryption.
Refer to the `sshd(8) <https://www.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man8/sshd.8>`_
man page for more details on SSH options.
See also :ref:`borg_serve`

View file

@ -1,154 +0,0 @@
.. include:: ../global.rst.inc
.. highlight:: none
Backing up entire disk images
=============================
Backing up disk images can still be efficient with Borg because its `deduplication`_
technique makes sure only the modified parts of the file are stored. Borg also has
optional simple sparse file support for extraction.
It is of utmost importance to pin down the disk you want to back up.
Use the disk's SERIAL for that.
Use:
.. code-block:: bash
# You can find the short disk serial by:
# udevadm info --query=property --name=nvme1n1 | grep ID_SERIAL_SHORT | cut -d '=' -f 2
export BORG_REPO=/path/to/repo
DISK_SERIAL="7VS0224F"
DISK_ID=$(readlink -f /dev/disk/by-id/*"${DISK_SERIAL}") # Returns /dev/nvme1n1
mapfile -t PARTITIONS < <(lsblk -o NAME,TYPE -p -n -l "$DISK_ID" | awk '$2 == "part" {print $1}')
echo "Partitions of $DISK_ID:"
echo "${PARTITIONS[@]}"
echo "Disk Identifier: $DISK_ID"
# Use the following line to perform a Borg backup for the full disk:
# borg create --read-special disk-backup "$DISK_ID"
# Use the following to perform a Borg backup for all partitions of the disk
# borg create --read-special partitions-backup "${PARTITIONS[@]}"
# Example output:
# Partitions of /dev/nvme1n1:
# /dev/nvme1n1p1
# /dev/nvme1n1p2
# /dev/nvme1n1p3
# Disk Identifier: /dev/nvme1n1
# borg create --read-special disk-backup /dev/nvme1n1
# borg create --read-special partitions-backup /dev/nvme1n1p1 /dev/nvme1n1p2 /dev/nvme1n1p3
Decreasing the size of image backups
------------------------------------
Disk images are as large as the full disk when uncompressed and might not get much
smaller post-deduplication after heavy use because virtually all filesystems do not
actually delete file data on disk but instead delete the filesystem entries referencing
the data. Therefore, if a disk nears capacity and files are deleted again, the change
will barely decrease the space it takes up when compressed and deduplicated. Depending
on the filesystem, there are several ways to decrease the size of a disk image:
Using ntfsclone (NTFS, i.e. Windows VMs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
``ntfsclone`` can only operate on filesystems with the journal cleared (i.e. turned-off
machines), which somewhat limits its utility in the case of VM snapshots. However, when
it can be used, its special image format is even more efficient than just zeroing and
deduplicating. For backup, save the disk header and the contents of each partition::
HEADER_SIZE=$(sfdisk -lo Start $DISK | grep -A1 -P 'Start$' | tail -n1 | xargs echo)
PARTITIONS=$(sfdisk -lo Device,Type $DISK | sed -e '1,/Device\s*Type/d')
dd if=$DISK count=$HEADER_SIZE | borg create --repo repo hostname-partinfo -
echo "$PARTITIONS" | grep NTFS | cut -d' ' -f1 | while read x; do
PARTNUM=$(echo $x | grep -Eo "[0-9]+$")
ntfsclone -so - $x | borg create --repo repo hostname-part$PARTNUM -
done
# to back up non-NTFS partitions as well:
echo "$PARTITIONS" | grep -v NTFS | cut -d' ' -f1 | while read x; do
PARTNUM=$(echo $x | grep -Eo "[0-9]+$")
borg create --read-special --repo repo hostname-part$PARTNUM $x
done
Restoration is a similar process::
borg extract --stdout --repo repo hostname-partinfo | dd of=$DISK && partprobe
PARTITIONS=$(sfdisk -lo Device,Type $DISK | sed -e '1,/Device\s*Type/d')
borg list --format {archive}{NL} repo | grep 'part[0-9]*$' | while read x; do
PARTNUM=$(echo $x | grep -Eo "[0-9]+$")
PARTITION=$(echo "$PARTITIONS" | grep -E "${DISK}p?$PARTNUM" | head -n1)
if echo "$PARTITION" | cut -d' ' -f2- | grep -q NTFS; then
borg extract --stdout --repo repo $x | ntfsclone -rO $(echo "$PARTITION" | cut -d' ' -f1) -
else
borg extract --stdout --repo repo $x | dd of=$(echo "$PARTITION" | cut -d' ' -f1)
fi
done
.. note::
When backing up a disk image (as opposed to a real block device), mount it as
a loopback image to use the above snippets::
DISK=$(losetup -Pf --show /path/to/disk/image)
# do backup as shown above
losetup -d $DISK
Using zerofree (ext2, ext3, ext4)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
``zerofree`` works similarly to ntfsclone in that it zeros out unused chunks of the FS,
except it works in place, zeroing the original partition. This makes the backup process
a bit simpler::
sfdisk -lo Device,Type $DISK | sed -e '1,/Device\s*Type/d' | grep Linux | cut -d' ' -f1 | xargs -n1 zerofree
borg create --read-special --repo repo hostname-disk $DISK
Because the partitions were zeroed in place, restoration is only one command::
borg extract --stdout --repo repo hostname-disk | dd of=$DISK
.. note:: The "traditional" way to zero out space on a partition, especially one already
mounted, is simply to ``dd`` from ``/dev/zero`` to a temporary file and delete
it. This is ill-advised for the reasons mentioned in the ``zerofree`` man page:
- it is slow.
- it makes the disk image (temporarily) grow to its maximal extent.
- it (temporarily) uses all free space on the disk, so other concurrent write actions may fail.
Virtual machines
----------------
If you use non-snapshotting backup tools like Borg to back up virtual machines, then
the VMs should be turned off for the duration of the backup. Backing up live VMs can
(and will) result in corrupted or inconsistent backup contents: a VM image is just a
regular file to Borg with the same issues as regular files when it comes to concurrent
reading and writing from the same file.
For backing up live VMs use filesystem snapshots on the VM host, which establishes
crash-consistency for the VM images. This means that with most filesystems (that
are journaling) the FS will always be fine in the backup (but may need a journal
replay to become accessible).
Usually this does not mean that file *contents* on the VM are consistent, since file
contents are normally not journaled. Notable exceptions are ext4 in data=journal mode,
ZFS and btrfs (unless nodatacow is used).
Applications designed with crash-consistency in mind (most relational databases like
PostgreSQL, SQLite etc. but also for example Borg repositories) should always be able
to recover to a consistent state from a backup created with crash-consistent snapshots
(even on ext4 with data=writeback or XFS). Other applications may require a lot of work
to reach application-consistency; it's a broad and complex issue that cannot be explained
in entirety here.
Hypervisor snapshots capturing most of the VM's state can also be used for backups and
can be a better alternative to pure filesystem-based snapshots of the VM's disk, since
no state is lost. Depending on the application this can be the easiest and most reliable
way to create application-consistent backups.
Borg does not intend to address these issues due to their huge complexity and
platform/software dependency. Combining Borg with the mechanisms provided by the platform
(snapshots, hypervisor features) will be the best approach to start tackling them.

View file

@ -1,66 +0,0 @@
.. include:: ../global.rst.inc
.. highlight:: none
.. _non_root_user:
================================
Backing up using a non-root user
================================
This section describes how to run Borg as a non-root user and still be able to
back up every file on the system.
Normally, Borg is run as the root user to bypass all filesystem permissions and
be able to read all files. However, in theory this also allows Borg to modify or
delete files on your system (for example, in case of a bug).
To eliminate this possibility, we can run Borg as a non-root user and give it read-only
permissions to all files on the system.
Using Linux capabilities inside a systemd service
=================================================
One way to do so is to use Linux `capabilities
<https://man7.org/linux/man-pages/man7/capabilities.7.html>`_ within a systemd
service.
Linux capabilities allow us to grant parts of the root user's privileges to
a non-root user. This works on a per-thread level and does not grant permissions
to the non-root user as a whole.
For this, we need to run the backup script from a systemd service and use the `AmbientCapabilities
<https://www.freedesktop.org/software/systemd/man/latest/systemd.exec.html#AmbientCapabilities=>`_
option added in systemd 229.
A very basic unit file would look like this:
::
[Unit]
Description=Borg Backup
[Service]
Type=oneshot
User=borg
ExecStart=/usr/local/sbin/backup.sh
AmbientCapabilities=CAP_DAC_READ_SEARCH
The ``CAP_DAC_READ_SEARCH`` capability gives Borg read-only access to all files and directories on the system.
This service can then be started manually using ``systemctl start``, a systemd timer or other methods.
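To confirm that the capability actually reaches the backup script, the capability masks in ``/proc/<pid>/status`` can be inspected. The following sketch tests bit 2, which is ``CAP_DAC_READ_SEARCH`` in the Linux capability numbering; the function name ``has_dac_read_search`` is hypothetical.

```bash
# Hypothetical helper (not part of Borg): succeed if bit 2
# (CAP_DAC_READ_SEARCH) is set in a hexadecimal capability mask,
# such as the CapEff or CapAmb value from /proc/<pid>/status.
has_dac_read_search() {
    mask=$1
    [ $(( 0x$mask & 4 )) -ne 0 ]
}

# Possible use at the top of the backup script (Linux only):
#   mask=$(awk '/^CapEff:/ {print $2}' /proc/self/status)
#   has_dac_read_search "$mask" || echo "CAP_DAC_READ_SEARCH is missing" >&2
```

Failing early with a clear message is easier to diagnose than a backup that silently skips unreadable files.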
Restore considerations
======================
Use the root user when restoring files. If you use the non-root user, ``borg extract`` will
change ownership of all restored files to the non-root user. Using ``borg mount`` will not allow the
non-root user to access files it would not be able to access on the system itself.
Other than that, you can use the same restore process you would use when running the backup as root.
.. warning::
When using a local repository and running Borg commands as root, make sure to use only commands that do not
modify the repository itself, such as extract or mount. Modifying the repository as root will break it for the
non-root user, since some files inside the repository will then be owned by root.

Some files were not shown because too many files have changed in this diff.