revert incorrect fix for put updating shadow_index, fixes #5661

A) the compaction code needs the shadow index only for this case:

segment A: PUT x, segment B: DEL x, with A < B  (DEL shadows the PUT).

B) for the following case, we have no shadowing DEL (or rather: it does not matter,
because there is a PUT right after the DEL) and x is in the repo index,
thus the shadow_index is not needed for the special case in the compaction code:

segment A: PUT x, segment B: DEL x PUT x

see also PR #5636.

reverts f079a83fed
and clarifies the code with more comments.

we keep the code deduplication of 5f32b5666a
and just add an update_shadow_index param, so it no longer looks like
something was accidentally forgotten, which was the whole reason for the reverted
"fix".
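
The two cases can be sketched with a toy model. This is not borg's Repository: `ToyRepo`, its one-segment-per-operation counter, and the variables `case_a` / `case_b` are assumptions made for illustration, and only the index and shadow_index bookkeeping is modelled (no segment files, no quota).

```python
class ToyRepo:
    """Toy model of the index / shadow_index bookkeeping only."""

    def __init__(self):
        self.index = {}         # id -> segment holding the live PUT
        self.shadow_index = {}  # id -> segments holding PUTs shadowed by a DEL
        self.segment = 0        # assumption: every operation starts a new segment

    def put(self, id):
        if id in self.index:
            # case B: a PUT follows right away, the id stays in the index,
            # so no shadow_index entry is recorded.
            self._delete(id, self.index[id], update_shadow_index=False)
        self.segment += 1
        self.index[id] = self.segment

    def delete(self, id):
        segment = self.index.pop(id)  # raises KeyError for an unknown id
        # case A: the id leaves the index, so only the shadow_index
        # remembers which segment holds the shadowed PUT.
        self._delete(id, segment, update_shadow_index=True)
        self.segment += 1

    def _delete(self, id, segment, *, update_shadow_index):
        # common code used by put and delete
        if update_shadow_index:
            self.shadow_index.setdefault(id, []).append(segment)


repo = ToyRepo()
repo.put("x")                     # PUT x
repo.put("x")                     # case B: DEL x PUT x
case_b = dict(repo.shadow_index)  # still empty
repo.delete("x")                  # case A: DEL x
case_a = dict(repo.shadow_index)  # now records the shadowed PUT's segment
```

Making `update_shadow_index` keyword-only forces every caller to state its intent, so neither call site can look like it forgot the update.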
Thomas Waldmann 2021-02-04 02:29:43 +01:00
parent df11a67f96
commit 36503c4e37

@@ -1142,9 +1142,11 @@ class Repository:
         except KeyError:
             pass
         else:
-            # note: doing a delete first will do some bookkeeping,
-            # like updating the shadow_index, quota, ...
-            self._delete(id, segment, offset)
+            # note: doing a delete first will do some bookkeeping.
+            # we do not want to update the shadow_index here, because
+            # we know already that we will PUT to this id, so it will
+            # be in the repo index (and we won't need it in the shadow_index).
+            self._delete(id, segment, offset, update_shadow_index=False)
         segment, offset = self.io.write_put(id, data)
         self.storage_quota_use += len(data) + self.io.put_header_fmt.size
         self.segments.setdefault(segment, 0)
@@ -1167,11 +1169,16 @@ class Repository:
             segment, offset = self.index.pop(id)
         except KeyError:
             raise self.ObjectNotFound(id, self.path) from None
-        self._delete(id, segment, offset)
+        # if we get here, there is an object with this id in the repo,
+        # we write a DEL here that shadows the respective PUT.
+        # after the delete, the object is not in the repo index any more,
+        # for the compaction code, we need to update the shadow_index in this case.
+        self._delete(id, segment, offset, update_shadow_index=True)
 
-    def _delete(self, id, segment, offset):
+    def _delete(self, id, segment, offset, *, update_shadow_index):
         # common code used by put and delete
-        self.shadow_index.setdefault(id, []).append(segment)
+        if update_shadow_index:
+            self.shadow_index.setdefault(id, []).append(segment)
         self.segments[segment] -= 1
         size = self.io.read(segment, offset, id, read_data=False)
         self.storage_quota_use -= size