The Memory Manager Metrics BeforeEach asserts that zero pods are
running on the node after a kubelet config update. This hard assertion
flakes when a preceding serial test's namespace deletion hasn't
completed yet: framework namespace cleanup is async, and the kubelet
restart in updateKubeletConfig can delay in-flight pod termination.
CI logs show leftover pods from MemoryQoS tests (memqos-burstable,
memqos-no-limit, etc.), Probe Stress tests (50-container pods), and
Summary API PSI tests (memory-pressure-pod), all still Running when
the assertion fires 4-7ms after the previous test finishes.
Replace the immediate Expect(count).To(BeZero()) with an Eventually
poll (2-minute timeout, 5-second interval) that gives pods time to
drain after the kubelet restart. The existing printAllPodsOnNode
diagnostic output is preserved inside the poll for debugging.
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
The expectation that StartTime changes on kubelet restart for static pods is no longer reliable due to faked init container status logic. This change updates the tests to assert on the specific behavior introduced by that logic.
For ADDED, we broadcast directly after receiving the pod info and successfully processing it for the first time from the API server, so this should always be the first thing broadcast for a given pod.
For DELETED, we broadcast after it has been processed in updateStatusInternal with podIsFinished, so this should always be the last thing broadcast for a given pod.
For MODIFIED, we broadcast once when we receive a new spec from the API server, and once when we finish processing in the status manager. This way the watchers can see the flow from when a spec is received to when it has been reconciled. To avoid status flapping when status updates are slow and spec updates are frequent, we always overlay the latest status from the status manager on every broadcast. This hybrid state is fine because the ObservedGeneration indicates how far the spec has drifted from the status, so watchers can act accordingly.
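A minimal sketch of the status overlay described for MODIFIED events, under simplified assumptions: the pod and podStatus types, overlayLatestStatus, and specLag are hypothetical stand-ins, not the kubelet's types or functions.

```go
package main

import "fmt"

// podStatus and pod are simplified, hypothetical stand-ins for the real
// API types; only the fields needed for the illustration are kept.
type podStatus struct {
	ObservedGeneration int64
	Phase              string
}

type pod struct {
	UID        string
	Generation int64 // generation of the spec last received from the API server
	Status     podStatus
}

// overlayLatestStatus returns a copy of p carrying the most recent status
// known to the status manager (modelled here as a map keyed by UID).
// If no status is known, the pod is returned unchanged.
func overlayLatestStatus(p pod, latest map[string]podStatus) pod {
	if s, ok := latest[p.UID]; ok {
		p.Status = s
	}
	return p
}

// specLag reports how many generations the spec is ahead of the status,
// which is what a watcher can use to judge the hybrid state.
func specLag(p pod) int64 {
	return p.Generation - p.Status.ObservedGeneration
}

func main() {
	latest := map[string]podStatus{
		"uid-1": {ObservedGeneration: 3, Phase: "Running"},
	}
	// A new spec (generation 5) arrives before the status has caught up.
	p := overlayLatestStatus(pod{UID: "uid-1", Generation: 5}, latest)
	fmt.Println("phase:", p.Status.Phase, "lag:", specLag(p))
}
```

A watcher that needs a fully reconciled view could wait for a broadcast where the lag is zero.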
- Fixed version in kube_features.go after rebase (1.35->1.36).
- Removed internal pod cache in PodsServer to reduce memory footprint and avoid duplication.
- Injected pod.Manager into PodsServer to serve as the single source of truth for pod data.
- Refactored WatchPods to broadcast UIDs and fetch fresh pod data from podManager, ensuring consistency.
- Updated convertWatchEventType to safely handle unknown event types.
- Refactored unit tests to use MockManager and added a test case for static pods.
- Updated e2e suite with static pod test.
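One way to handle unknown event types safely, as the list above mentions, is to return an explicit error from the default branch rather than passing an unexpected value through. The sketch below is an assumption about the general shape only: watchEventType, wireEventType, and convertEventTypeSketch are hypothetical names, not the actual convertWatchEventType signature.

```go
package main

import "fmt"

// watchEventType is a hypothetical internal event type.
type watchEventType string

const (
	watchAdded    watchEventType = "ADDED"
	watchModified watchEventType = "MODIFIED"
	watchDeleted  watchEventType = "DELETED"
)

// wireEventType is a hypothetical enum sent to watchers.
type wireEventType int

const (
	wireUnknown wireEventType = iota
	wireAdded
	wireModified
	wireDeleted
)

// convertEventTypeSketch maps known event types and reports an error for
// anything else, so callers can skip or log the event instead of
// forwarding a bogus value.
func convertEventTypeSketch(t watchEventType) (wireEventType, error) {
	switch t {
	case watchAdded:
		return wireAdded, nil
	case watchModified:
		return wireModified, nil
	case watchDeleted:
		return wireDeleted, nil
	default:
		return wireUnknown, fmt.Errorf("unsupported watch event type %q", t)
	}
}

func main() {
	for _, t := range []watchEventType{watchAdded, "BOOKMARK"} {
		wt, err := convertEventTypeSketch(t)
		fmt.Println(t, wt, err)
	}
}
```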
The tiered approach uses memory.min for Guaranteed pods (hard protection)
and memory.low for Burstable pods (soft protection). Rename the policy
value to reflect this design.
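The tiering can be illustrated as a mapping from QoS class to the cgroup v2 interface file that carries the protection. This is a sketch only: qosClass and memoryProtectionFile are hypothetical names, and treating BestEffort pods as unprotected is an assumption not stated above.

```go
package main

import "fmt"

// qosClass is a hypothetical stand-in for the pod QoS class.
type qosClass string

const (
	guaranteed qosClass = "Guaranteed"
	burstable  qosClass = "Burstable"
	bestEffort qosClass = "BestEffort"
)

// memoryProtectionFile returns the cgroup v2 file used to protect a
// pod's memory under the tiered policy: memory.min is a hard guarantee,
// memory.low is best-effort protection. An empty string means no
// protection is applied (assumed here for BestEffort).
func memoryProtectionFile(q qosClass) string {
	switch q {
	case guaranteed:
		return "memory.min"
	case burstable:
		return "memory.low"
	default:
		return ""
	}
}

func main() {
	for _, q := range []qosClass{guaranteed, burstable, bestEffort} {
		fmt.Printf("%s -> %q\n", q, memoryProtectionFile(q))
	}
}
```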
Add server-side streaming RPCs to bypass the gRPC 16MB message size
limit on nodes with many containers/pods. This implements KEP-5825.
New RuntimeService streaming RPCs:
- StreamPodSandboxes
- StreamContainers
- StreamContainerStats
- StreamPodSandboxStats
- StreamPodSandboxMetrics
New ImageService streaming RPC:
- StreamImages
Each streaming RPC accepts the same filter as its unary counterpart
and streams results one item at a time.
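For illustration, a client would drain such a stream by calling Recv until io.EOF, which is the usual pattern for server-side streaming in gRPC for Go. The sketch below avoids the generated CRI code: container, containerStream, sliceStream, and collect are hypothetical stand-ins, and the in-memory stream only simulates a server.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// container is a hypothetical, trimmed-down list item.
type container struct{ ID string }

// containerStream mimics the receive side of a server-streaming client:
// each Recv returns one item, and io.EOF marks the end of the stream.
type containerStream interface {
	Recv() (*container, error)
}

// sliceStream is an in-memory stream used only to simulate a server.
type sliceStream struct {
	items []container
	next  int
}

func (s *sliceStream) Recv() (*container, error) {
	if s.next >= len(s.items) {
		return nil, io.EOF
	}
	c := &s.items[s.next]
	s.next++
	return c, nil
}

// collect reads the stream one item at a time, so no single message has
// to hold the whole list.
func collect(stream containerStream) ([]container, error) {
	var out []container
	for {
		c, err := stream.Recv()
		if errors.Is(err, io.EOF) {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		out = append(out, *c)
	}
}

func main() {
	stream := &sliceStream{items: []container{{ID: "a"}, {ID: "b"}, {ID: "c"}}}
	got, err := collect(stream)
	fmt.Println(len(got), err)
}
```

Because each message carries a single item, the per-message size stays small no matter how many containers or pods the node runs.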
Feature gate: CRIListStreaming
KEP: https://kep.k8s.io/5825
Signed-off-by: Ayato Tokubi <atokubi@redhat.com>
When a pod has a sidecar (initContainer with restartPolicy: Always) with
a startupProbe, and one or more regular containers crash after a kubelet
restart, the kubelet fails to restart the regular containers. RestartCount
stays at 0 indefinitely.
When ChangeContainerStatusOnKubeletRestart is disabled (default in v1.35),
the prober worker skips seeding probe results for containers that predate
the kubelet restart. For a sidecar with a startupProbe this means
startupManager.Get() returns found=false permanently. In
computeInitContainerActions, the sidecar Running case breaks out early at
the !found check, leaving podHasInitialized=false. computePodActions then
returns early at the !hasInitialized guard without restarting the crashed
regular containers.
Fix: when the gate is off and a restartable init container's startup probe
is being seeded for the first time after a kubelet restart, check the
container's Started field in the pod status. If Started=true, the sidecar
had already passed startup before the restart, so seed the startup manager
with Success. This allows computeInitContainerActions to detect pod
initialization via the sidecar Running path without altering readiness or
liveness probe seeding behaviour.
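The seeding condition described in the fix can be summarised as a single predicate. This is a sketch, not the prober worker's code: seedInput and shouldSeedStartupSuccess are hypothetical names, and the fields flatten state that the real worker derives from the pod, the container, and the feature gate.

```go
package main

import "fmt"

// seedInput gathers the facts the fix depends on (hypothetical type).
type seedInput struct {
	GateEnabled            bool  // ChangeContainerStatusOnKubeletRestart
	RestartableInit        bool  // init container with restartPolicy: Always
	IsStartupProbe         bool  // the worker handles a startupProbe
	PredatesKubeletRestart bool  // container was started before the kubelet restarted
	Started                *bool // Started field from the container status
}

// shouldSeedStartupSuccess reports whether the startup manager should be
// seeded with Success: only when the gate is off, the container is a
// sidecar that predates the restart, and its status already says
// Started=true. Readiness and liveness probes are never affected.
func shouldSeedStartupSuccess(in seedInput) bool {
	return !in.GateEnabled &&
		in.RestartableInit &&
		in.IsStartupProbe &&
		in.PredatesKubeletRestart &&
		in.Started != nil && *in.Started
}

func main() {
	started := true
	in := seedInput{
		RestartableInit:        true,
		IsStartupProbe:         true,
		PredatesKubeletRestart: true,
		Started:                &started,
	}
	fmt.Println("seed Success:", shouldSeedStartupSuccess(in))
}
```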
Add and update tests to cover the fix:
- worker unit tests for sidecar startup/readiness/liveness restart behaviour
- e2e node regression test for sidecar with startupProbe across kubelet restart
Fixes: https://github.com/kubernetes/kubernetes/issues/136910