diff --git a/docs/faq.rst b/docs/faq.rst
index 812014143..c2ff2489d 100644
--- a/docs/faq.rst
+++ b/docs/faq.rst
@@ -302,6 +302,34 @@ Yes, if you want to detect accidental data damage (like bit rot), use the
 If you want to be able to detect malicious tampering also, use an encrypted
 repo. It will then be able to check using CRCs and HMACs.
 
+Can a previous bad backup spoil future backups?
+-----------------------------------------------
+
+In general, no. If a backup was interrupted or failed for some reason, Borg's
+transactional nature and journaling system ensure that the repository remains
+consistent. Data that was successfully stored in a partial backup
+(checkpoints) will even be reused to speed up the next attempt.
+
+However, there is one specific case where a past "bad" backup can affect
+future ones due to how deduplication works:
+
+Suppose that, after Borg computes the MAC (chunk id) of the correct chunk
+data, that data gets corrupted (e.g. due to a RAM issue) before it is stored.
+The issue will go unnoticed until the MAC is compared to the data again (e.g.
+when reading the data from the repo or doing a repo check with
+``--verify-data``). Because the chunk id of the correct data is already
+present in the repo, deduplication "thinks" it already has the correct data,
+while in fact it only has the corrupted version. In that case, the past bad
+backup affects the current or future backups due to deduplication.
+
+This is not a Borg-specific issue, but a general property of deduplicating
+storage systems. To avoid or detect such issues, you should:
+
+- Use reliable hardware (ECC RAM is recommended).
+- Periodically run ``borg check --verify-data REPO`` to verify that the
+  stored data still matches its MAC (chunk id). Note that this cannot detect
+  data that was already "garbage" when it was first processed and stored.
+
 Can I use Borg on SMR hard drives?
 ----------------------------------
 