The daemonset controller already has handling for NotFound errors.
Right now if the statefulset controller is attempting to scale down a
statefulset and its informer cache is stale, it can get hard-blocked on a
missing pod.
This issue will eventually self-resolve once the informer cache "catches
up", but in the process of exploring this issue I realized that
404s during pod deletions don't strictly need to abort the entire sync;
we can continue.
This is especially impactful for large statefulsets with
podManagementPolicy: Parallel, where a single "phantom" pod (actually
missing, but still present in the informer cache) can block thousands of
other pods from being cleaned up.
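
A minimal sketch of the idea, not the controller's actual code: deletions that
fail with a not-found error are treated as already done, and the loop moves on
to the remaining pods. The names errNotFound, deleteFunc and deleteCondemned
are hypothetical; in the real controller the check would presumably be
apierrors.IsNotFound(err) from k8s.io/apimachinery/pkg/api/errors.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical stand-in for an API server 404.
var errNotFound = errors.New("pod not found")

// deleteFunc is a hypothetical signature for "delete the pod with this name".
type deleteFunc func(name string) error

// deleteCondemned attempts to delete every pod in names. A not-found error
// counts as handled, because the pod is already gone and only the stale
// informer cache still lists it. Any other error is collected and returned
// after all deletions have been attempted.
func deleteCondemned(names []string, del deleteFunc) (handled int, err error) {
	var errs []error
	for _, name := range names {
		if e := del(name); e != nil && !errors.Is(e, errNotFound) {
			errs = append(errs, fmt.Errorf("deleting %s: %w", name, e))
			continue
		}
		handled++
	}
	// errors.Join returns nil when errs is empty.
	return handled, errors.Join(errs...)
}

func main() {
	// web-1 is the "phantom": present in the cache, absent from the API.
	del := func(name string) error {
		if name == "web-1" {
			return errNotFound
		}
		return nil
	}
	handled, err := deleteCondemned([]string{"web-2", "web-1", "web-0"}, del)
	fmt.Println(handled, err)
}
```

With this shape, the phantom pod no longer prevents web-2 and web-0 from being
cleaned up in the same sync.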
* wire now (time) to the availability checks in the StatefulSet controller
- this helps to make the controller reconciliation consistent
* schedule pod availability checks at the correct time in StatefulSets
* replace "k8s.io/klog/v2/ktesting" with "k8s.io/kubernetes/test/utils/ktesting"
for advanced features (e.g. Eventually)
* add StatefulSetAvailabilityCheck test
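
The "wire now" and "schedule at the correct time" items above can be sketched
roughly as follows. This is an illustration under assumptions, not the
controller's implementation: availability, exampleNow and the parameter names
are hypothetical. The point is that one timestamp is captured per sync and
passed down, and the remaining wait is returned so the caller can requeue at
exactly that moment.

```go
package main

import (
	"fmt"
	"time"
)

// exampleNow is a fixed instant standing in for the timestamp a sync would
// capture once and pass to every availability check.
var exampleNow = time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)

// availability reports whether a pod that became Ready at readySince counts
// as available at the instant now, given minReadySeconds. If the pod is not
// yet available, it also returns how long to wait before checking again.
func availability(readySince, now time.Time, minReadySeconds int32) (bool, time.Duration) {
	availableAt := readySince.Add(time.Duration(minReadySeconds) * time.Second)
	if !now.Before(availableAt) {
		return true, 0
	}
	return false, availableAt.Sub(now)
}

func main() {
	readySince := exampleNow.Add(-4 * time.Second)
	ok, wait := availability(readySince, exampleNow, 10)
	fmt.Println(ok, wait) // false 6s
}
```

Because every check in a sync sees the same now, two pods with identical
readiness times cannot end up with different availability results.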
The "// import <path>" comment has been superseded by Go modules.
We don't have to remove them, but doing so has some advantages:
- They are used inconsistently, which is confusing.
- We can then also remove the (currently broken) hack/update-vanity-imports.sh.
- Last but not least, it would be a first step towards avoiding the k8s.io domain.
This commit was generated with
sed -i -e 's;^package \(.*\) // import.*;package \1;' $(git grep -l '^package.*// import' | grep -v 'vendor/')
Everything was included, except for
package labels // import k8s.io/kubernetes/pkg/util/labels
because that package is marked as "read-only".
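
For readers less familiar with sed, the substitution above is roughly
equivalent to the following Go sketch. The function name stripVanityImport and
the example.com import path are hypothetical and only serve as illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// vanityImport matches a package clause followed by a "// import" comment,
// mirroring the sed expression. (?m) makes ^ and $ match at line boundaries.
var vanityImport = regexp.MustCompile(`(?m)^package (.*) // import.*$`)

// stripVanityImport drops the import comment and keeps the package clause.
func stripVanityImport(src string) string {
	return vanityImport.ReplaceAllString(src, "package ${1}")
}

func main() {
	src := "package foo // import \"example.com/foo\"\n\nfunc F() {}\n"
	fmt.Print(stripVanityImport(src))
}
```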
* lock feature gate for PodIndexLabel and mark it GA
Signed-off-by: Alay Patel <alayp@nvidia.com>
* add emulated version if testing disabling of PodIndexLabel FG
* fix pods tracking and internal error checking in statefulset tests
* fix stateful set pod recreation and event spam
- do not emit events when pod reaches terminal phase
- do not try to recreate pod until the old pod has been removed from
etcd storage
* fix conflict race in statefulset rest update
The statefulset controller now makes fewer requests per sync and can therefore
reconcile status faster, which increases the chance of conflicts
- Increase the global level for broadcaster's logging to 3 so that users can ignore event messages by lowering the logging level. It reduces information noise.
- Make sure the context is properly injected into the broadcaster. This allows the -v flag value to also be used in that broadcaster, rather than the above global value.
- test: use cancellation from ktesting
- golangci-hints: checked error return value