i checked it: copying the index.* and hints.* files in advance is not needed, open() and close() do not modify them.
also: fix unicode exception with encoded filename
because Repository.__init__ normally opens and locks the repo, and the upgrader just
inherited from (borg) Repository, it created a lock file there before the "backup copy"
was made.
No big problem, but a bit unclean.
Fixed it: do not lock at the beginning; make the copy first, then lock.
it seems the chunks files can end up copied but *not* converted. this
may have happened here because the conversion was interrupted,
although the exact scenario is still unclear (it did happen during
manual tests here). the problem is therefore hard to reproduce, hence
the lack of tests for this specific issue.
since we consider the header replacement code to be safe, always
running the conversion shouldn't pose any additional threat to the
existing borg chunk cache.
this resolves bug #something where the index file could not be
converted, completely breaking conversion.
it seems that, during some refactoring, the index conversion code was
completely dropped. this was missed by the unit tests because the repo
can still be opened by the constructor even though the index is
invalid, so tests need improvements there.
instead, we perform the equivalent of `cp -al` on the repository to
keep a backup, and then rewrite the files, breaking the hardlinks as
necessary.
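a minimal sketch of what "the equivalent of `cp -al`" could look like
in Python. the function name `hardlink_copy` is hypothetical and this
is not necessarily how the upgrader implements it; it only shows the
idea of recreating the directory tree and hardlinking every file:

```python
import os


def hardlink_copy(src, dst):
    """Recreate the directory tree of src under dst, hardlinking each file.

    Hypothetical helper, roughly equivalent to `cp -al src dst`.
    dst must be on the same filesystem as src and must not live inside src.
    """
    for root, dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target, exist_ok=True)
        for name in files:
            # no data is copied: both names point at the same inode
            os.link(os.path.join(root, name), os.path.join(target, name))
```

since only directory entries are created, the backup costs almost no
extra disk space until files are rewritten.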
it still has to be confirmed that the rest of Borg also breaks
hardlinks when operating on files in the repository. if Borg modifies
any repository file in place, it could jeopardize the backup, so this
needs to be verified. I believe most files are written to a temporary
file and then moved into place, however, so the backup should be
safe.
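to illustrate why "write to a temporary file and move into place"
keeps a hardlinked backup safe, here is a small sketch. the helper
name `rewrite_via_tempfile` is hypothetical and does not claim to
match what Borg does internally:

```python
import os
import tempfile


def rewrite_via_tempfile(path, data):
    """Write data to a temp file next to path, then move it into place.

    The rename gives `path` a new inode, so any other hardlink to the
    old inode (e.g. the backup copy) keeps the old content.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp, path)
    except BaseException:
        # the temp file still exists if writing or renaming failed
        os.unlink(tmp)
        raise
```

by contrast, opening `path` with mode `r+b` and writing to it would
change the shared inode, and with it the backup.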
the rationale behind the backup copy is that we want to be extra
careful with users' data by default. the old behavior is retained
through the `--inplace`/`-i` command-line flag. plus, this way we
don't need to tell users to go through extra steps (`cp -a`, in
particular) before running the command.
also, copying the attic repository we wish to work on can take a long
time. since `cp -a` doesn't provide progress information, the new
default behavior gives a nicer user experience: it shows the overall
progress of the upgrade, while by default keeping an Attic-compatible
repository around (as a separate copy, of course).
this makes the upgrade command much less scary to use and hopefully
will convert drones to the borg collective.
the only place where the default inplace behavior is retained is in
the header_replace() function, to avoid breaking the cache conversion
code and to keep API stability and semantic coherence ("replace" by
default means in place).
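a rough sketch of a header_replace() with an in-place default could
look like this. the signature, the return value and the magic values
used in the usage note are assumptions made for illustration, not a
copy of the actual code:

```python
import os
import shutil


def header_replace(filename, old_magic, new_magic, inplace=True):
    """Replace old_magic at the start of filename with new_magic.

    Returns True if the file was converted, False if it did not start
    with old_magic. (The return value is an assumption of this sketch.)
    """
    with open(filename, "r+b") as f:
        if f.read(len(old_magic)) != old_magic:
            return False  # nothing to convert
        if inplace:
            if len(old_magic) != len(new_magic):
                raise ValueError("in-place replacement needs equal-length magic")
            f.seek(0)
            f.write(new_magic)  # modifies the inode shared with any hardlink
            return True
        # not in place: write a new file, which breaks the hardlink
        tmp = filename + ".tmp"
        with open(tmp, "wb") as out:
            out.write(new_magic)
            shutil.copyfileobj(f, out)  # copy everything after the header
    os.replace(tmp, filename)
    return True
```

usage, with assumed magic values: `header_replace(path, b"ATTICSEG",
b"BORG_SEG", inplace=False)` leaves a hardlinked backup untouched,
while the default call rewrites the header directly in the file.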
this was a remnant of when i was writing the converter/upgrader code,
and was meant to be a general progress message in the migration
process. i removed a more technical, internal debugging message in
exchange.
it seems the file cache does *not* have the ATTIC magic header (nor
does it have one in borg), so we don't need to edit the file - we just
copy it like a regular file.
while i'm here, simplify the cache conversion loop: there is no point
in splitting the copy and the header edit since the latter is so fast;
just do everything in one loop, which makes it much easier to read.
convert is too generic for the Attic conversion: we may have other
converters, from other, more foreign systems, that will require
different options and different upgrade mechanisms that convert could
never cover appropriately. we are more likely to use an approach
similar to "git fast-import" here instead, and have the conversion
tools be external tools that feed standard data into borg during
conversion.
upgrade seems like a more natural fit: Attic could be considered a
prehistoric version of Borg that requires invasive changes for borg
to be able to use the repository. we may require such changes in the
future of borg as well: if we make backwards-incompatible changes to
the repository layout or data format, it is possible that we require
such changes to be performed on the repository before it is usable
again. instead of scattering those conversions all over the code, we
should simply have assertions that check the layout is correct and
point the user to upgrade if it is not.
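as a sketch of what such an assertion might look like: every name
below (`UpgradeRequired`, `CURRENT_LAYOUT_VERSION`, `assert_layout`)
is hypothetical, and none of this exists in the code yet:

```python
CURRENT_LAYOUT_VERSION = 1  # hypothetical version number


class UpgradeRequired(Exception):
    """Raised when a repository must be upgraded before use (hypothetical)."""


def assert_layout(path, version):
    """Refuse to work on a repository whose layout is not the current one."""
    if version != CURRENT_LAYOUT_VERSION:
        raise UpgradeRequired(
            "repository %s has layout version %r, expected %r; "
            "run `borg upgrade %s` first"
            % (path, version, CURRENT_LAYOUT_VERSION, path)
        )
```

the conversion logic itself would then live only in the upgrade
command, and the rest of the code would just call the check.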
upgrade should eventually detect the repository format or version
automatically and perform the appropriate conversions. Attic is only
the first one. we still need to implement an adequate API for
auto-detection and upgrade; only the seeds of that are present for now.
of course, changes to the upgrade command should be thoroughly
documented in the release notes and in a possible future upgrade
manual.