3483 commits

Author SHA1 Message Date
Qi Wang
2aaa5b654b skip MemoryQoS rollback test until implementation is resolved
skip MemoryQoS rollback test until we figure out the mechanism to rollback.

Signed-off-by: Qi Wang <qiwan@redhat.com>
2026-04-20 12:41:45 -04:00
yashsingh74
afdb5e5d1f
Update CNI plugins to v1.9.1
Signed-off-by: yashsingh74 <yashsingh1774@gmail.com>
2026-04-01 14:06:34 +05:30
Davanum Srinivas
10efa46fbb
e2e_node: wait for pod drain before asserting zero pods in Memory Manager Metrics
The Memory Manager Metrics BeforeEach asserts that zero pods are
running on the node after a kubelet config update. This hard assertion
flakes when a preceding serial test's namespace deletion hasn't
completed yet — framework namespace cleanup is async and the kubelet
restart in updateKubeletConfig can delay in-flight pod termination.

CI logs show leftover pods from MemoryQoS tests (memqos-burstable,
memqos-no-limit, etc.), Probe Stress tests (50-container pods), and
Summary API PSI tests (memory-pressure-pod), all still Running when
the assertion fires 4-7ms after the previous test finishes.

Replace the immediate Expect(count).To(BeZero()) with an Eventually
poll (2 minute timeout, 5 second interval) that gives pods time to
drain after the kubelet restart. The existing printAllPodsOnNode
diagnostic output is preserved inside the poll for debugging.

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2026-03-28 15:27:25 -04:00
Kubernetes Prow Robot
c6a95ffd4c
Merge pull request #137996 from pacoxu/inplace-disable
set InPlacePodLevelResourcesVerticalScaling to false if needed
2026-03-28 08:42:11 +05:30
Kubernetes Prow Robot
473b7635de
Merge pull request #138006 from tallclair/push-kooxxktxovkr
Flaky test fix for 'should restart failing container when pod restartPolicy is Always'
2026-03-25 02:18:16 +05:30
Tim Allclair
72ed617db1 Flaky test fix for 'should restart failing container when pod restartPolicy is Always' 2026-03-24 16:38:46 +00:00
Paco Xu
7c65919285 set InPlacePodLevelResourcesVerticalScaling to false if PodLevelResources is set to false 2026-03-24 16:57:46 +08:00
Kubernetes Prow Robot
eca347edbf
Merge pull request #137957 from tallclair/push-vxyyxtvrypxt
Fix user namespace test cleanup race
2026-03-24 10:24:21 +05:30
Tim Allclair
2e60e1407b Fix user namespace test cleanup race 2026-03-23 23:22:44 +00:00
Tim Allclair
be5285b46f Fix restartable init container startup race 2026-03-23 19:58:38 +00:00
Kubernetes Prow Robot
6c87b171ca
Merge pull request #137870 from tallclair/push-lplxytyovnkw
test: fix flaky static pod tests by asserting on termination message …
2026-03-20 00:30:30 +05:30
Tim Allclair
57eb99dbb8 test: fix flaky static pod tests by asserting on termination message and container ID instead of StartTime
The expectation that StartTime changes on kubelet restart for static pods is no longer reliable due to faked init container status logic. This change updates the tests to assert on the specific behavior introduced by that logic.
2026-03-19 16:45:14 +00:00
Kubernetes Prow Robot
b910026535
Merge pull request #137889 from sohankunkerkar/memqos-fix-reconcile
Remove reconcilePodMemoryProtection that resets pod cgroup values on systemd
2026-03-19 21:32:42 +05:30
Kubernetes Prow Robot
ac10370ad2
Merge pull request #136987 from bitoku/kep-5825-cri
[KEP-5825] cri-api: Add streaming RPCs for CRI list operations
2026-03-19 18:28:39 +05:30
Sohan Kunkerkar
e3849e2b55 Remove reconcilePodMemoryProtection that resets pod cgroup values on systemd 2026-03-19 07:53:34 -04:00
Kubernetes Prow Robot
88573def71
Merge pull request #137146 from george-angel/fix/container-restart-sidecar-exited
kubelet: fix containers not restarting when sidecar keeps running
2026-03-19 15:36:32 +05:30
Kubernetes Prow Robot
caecddc909
Merge pull request #134627 from briansonnenberg/brians-kubelet-pods-api
[KEP-4188] New Kubelet gRPC API returning node-local Pod info
2026-03-19 07:52:30 +05:30
Kubernetes Prow Robot
ca5c40e58a
Merge pull request #137764 from pacoxu/sig-node-standalone
e2e_node: fix pod StartTime assertion to compare time value
2026-03-19 05:42:59 +05:30
Kubernetes Prow Robot
c2a7819806
Merge pull request #137719 from sohankunkerkar/memqos-kernel-metrics-e2e
KEP-2570: Add tiered memory protection, metrics, rollback fix, and E2E tests for MemoryQoS
2026-03-19 05:42:44 +05:30
Kubernetes Prow Robot
87924903db
Merge pull request #137629 from stlaz/ensure-secret-images-allowlist-fix
Image Pulling Authorization: fix the allowlisting policy
2026-03-19 05:42:36 +05:30
Brian Sonnenberg
b6fbc88793 ci fixes 2026-03-18 23:07:37 +00:00
Brian Sonnenberg
69167c14bd Refactored broadcast logic, specifically:
For ADDED, we broadcast directly after receiving the pod info and successfully processing it for the first time from the API server, so this should always be the first thing broadcast for a given pod.

For DELETED, we broadcast after it has been processed in updateStatusInternal with podIsFinished, so this should always be the last thing broadcast for a given pod.

For MODIFIED, we broadcast once when we receive a new spec from the API server, and once when we finish processing in the status manager.  This way the watchers can see the flow from when a spec is received to when it has reconciled.  To avoid status flapping on slow status updates with many spec updates, we always overlay the latest status from the status manager on every broadcast.  This hybrid state is fine because the ObservedGeneration will indicate how far desynced the spec has become from the status, so watchers can act accordingly.
2026-03-18 23:07:36 +00:00
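The status overlay described in 69167c14bd might look roughly like the sketch below. All types and function names here (pod, podStatus, event, buildEvent, generationLag) are hypothetical and much simpler than the kubelet's; the only ideas taken from the commit message are that every broadcast carries the latest known status and that ObservedGeneration tells a watcher how far the status trails the spec.

```go
package main

import "fmt"

// podStatus and pod are hypothetical, trimmed-down stand-ins for the API types.
type podStatus struct {
	Phase              string
	ObservedGeneration int64
}

type pod struct {
	UID        string
	Generation int64
	Status     podStatus
}

type event struct {
	Type string // "ADDED", "MODIFIED", or "DELETED"
	Pod  pod
}

// buildEvent overlays the most recent status known for the pod onto the
// spec being broadcast, so slow status updates do not make the status flap.
func buildEvent(eventType string, p pod, latest map[string]podStatus) event {
	if s, ok := latest[p.UID]; ok {
		p.Status = s
	}
	return event{Type: eventType, Pod: p}
}

// generationLag reports how many spec generations the status is behind.
func generationLag(p pod) int64 {
	return p.Generation - p.Status.ObservedGeneration
}

func main() {
	// Latest status recorded by a (hypothetical) status manager.
	latest := map[string]podStatus{
		"uid-1": {Phase: "Running", ObservedGeneration: 3},
	}
	// A newer spec arrives before the status has caught up.
	incoming := pod{UID: "uid-1", Generation: 5}

	ev := buildEvent("MODIFIED", incoming, latest)
	fmt.Println(ev.Type, ev.Pod.Status.Phase, "lag:", generationLag(ev.Pod))
}
```

A watcher receiving this hybrid object can compare the two generation fields and decide whether to act on the status or wait for it to reconcile.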
Brian Sonnenberg
0809c4f37f linter fixes 2026-03-18 23:07:36 +00:00
Brian Sonnenberg
fd330c303d Refactor PodsServer to use PodManager as source of truth
- Fixed version in kube_features.go after rebase (1.35->1.36)
- Removed internal pod cache in PodsServer to reduce memory footprint and avoid duplication.
- Injected pod.Manager into PodsServer to serve as the single source of truth for pod data.
- Refactored WatchPods to broadcast UIDs and fetch fresh pod data from podManager, ensuring consistency.
- Updated convertWatchEventType to safely handle unknown event types.
- Refactored unit tests to use MockManager and added a test case for static pods.
- Updated e2e suite with static pod test
2026-03-18 23:07:36 +00:00
Brian Sonnenberg
044f65ca5c [KEP-4188] New Kubelet gRPC API with endpoint returning local Pod information 2026-03-18 23:07:36 +00:00
Sohan Kunkerkar
d2b77a8133 Rename HardReservation to TieredReservation for memoryReservationPolicy
The tiered approach uses memory.min for Guaranteed pods (hard protection)
and memory.low for Burstable pods (soft protection). Rename the policy
value to reflect this design.
2026-03-18 16:24:27 -04:00
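The tiering described in d2b77a8133 can be illustrated with a small mapping from QoS class to cgroup v2 memory file. The function name protectionFile is hypothetical, and returning no file for other classes (for example BestEffort) is an assumption of this sketch; the commit message only covers Guaranteed and Burstable.

```go
package main

import "fmt"

// protectionFile returns the cgroup v2 file used to protect a pod's memory
// under the tiered policy, and false when the sketch defines no protection.
func protectionFile(qosClass string) (string, bool) {
	switch qosClass {
	case "Guaranteed":
		return "memory.min", true // hard protection
	case "Burstable":
		return "memory.low", true // soft protection
	default:
		// Assumption: other classes get no reservation in this sketch.
		return "", false
	}
}

func main() {
	for _, class := range []string{"Guaranteed", "Burstable", "BestEffort"} {
		if file, ok := protectionFile(class); ok {
			fmt.Printf("%s -> %s\n", class, file)
		} else {
			fmt.Printf("%s -> no protection\n", class)
		}
	}
}
```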
Sohan Kunkerkar
dbd3f16787 Add node E2E tests for MemoryQoS
Signed-off-by: Sohan Kunkerkar <sohank2602@gmail.com>
2026-03-18 16:24:26 -04:00
Kubernetes Prow Robot
055254a882
Merge pull request #134179 from Priyankasaggu11929/kep-3085-pod-ready-to-start-containers
KEP 3085: PodReadyToStartContainersCondition: add test to verify sandbox condition for missing secret
2026-03-18 23:20:32 +05:30
Ayato Tokubi
3256f5175f cri-api: Add streaming RPCs for CRI list operations
Add server-side streaming RPCs to bypass the gRPC 16MB message size
limit on nodes with many containers/pods. This implements KEP-5825.

New RuntimeService streaming RPCs:
- StreamPodSandboxes
- StreamContainers
- StreamContainerStats
- StreamPodSandboxStats
- StreamPodSandboxMetrics

New ImageService streaming RPC:
- StreamImages

Each streaming RPC accepts the same filter as its unary counterpart
and streams results one item at a time.

Feature gate: CRIListStreaming
KEP: https://kep.k8s.io/5825

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Signed-off-by: Ayato Tokubi <atokubi@redhat.com>
2026-03-18 16:32:49 +00:00
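The difference between a unary list call and the streaming variants described in 3256f5175f can be sketched without gRPC. The types and functions below (container, listContainers, streamContainers, and the send callback standing in for a server stream's Send method) are hypothetical and are not the CRI API; they only show the same filter being applied while results are delivered one item per message instead of in a single large response.

```go
package main

import "fmt"

// container is a hypothetical, minimal stand-in for a CRI container record.
type container struct {
	ID    string
	State string
}

// matches applies the filter shared by the unary and streaming variants.
// An empty state matches every container.
func matches(c container, state string) bool {
	return state == "" || c.State == state
}

// listContainers models the unary call: every match goes into one response,
// whose total size is what can run into a message size limit.
func listContainers(all []container, state string) []container {
	var out []container
	for _, c := range all {
		if matches(c, state) {
			out = append(out, c)
		}
	}
	return out
}

// streamContainers models the streaming call: each match is sent as its own
// message, so no single message grows with the number of containers.
func streamContainers(all []container, state string, send func(container) error) error {
	for _, c := range all {
		if !matches(c, state) {
			continue
		}
		if err := send(c); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	all := []container{
		{ID: "a", State: "running"},
		{ID: "b", State: "exited"},
		{ID: "c", State: "running"},
	}

	fmt.Println("unary response items:", len(listContainers(all, "running")))

	err := streamContainers(all, "running", func(c container) error {
		fmt.Println("stream message:", c.ID)
		return nil
	})
	if err != nil {
		fmt.Println("stream failed:", err)
	}
}
```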
Stanislav Láznička
a212f52fba
NodeConformance e2e: tests all Ensure Secret Pulled Images policies
Signed-off-by: Stanislav Láznička <slznika@microsoft.com>
2026-03-18 12:28:55 +01:00
Paco Xu
593c6deea5 fix mirror pod starttime failing message 2026-03-18 10:44:35 +08:00
Benjamin Elder
f6d42d302b make VolumeMountStatus.VolumeStatus a pointer to preserve serialization compatibility 2026-03-17 17:37:23 -07:00
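One common reason a Go API field is changed to a pointer is how encoding/json treats omitempty: a struct value is never omitted, while a nil pointer is. Whether that is the motivation behind f6d42d302b is an assumption; the types below (volumeStatus, mountByValue, mountByPointer) and the Image field are hypothetical and only demonstrate the encoding difference.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// volumeStatus is a hypothetical nested status type.
type volumeStatus struct {
	Image string `json:"image,omitempty"`
}

// mountByValue embeds the status by value: omitempty has no effect on it.
type mountByValue struct {
	Name         string       `json:"name"`
	VolumeStatus volumeStatus `json:"volumeStatus,omitempty"`
}

// mountByPointer uses a pointer: a nil pointer is dropped from the output.
type mountByPointer struct {
	Name         string        `json:"name"`
	VolumeStatus *volumeStatus `json:"volumeStatus,omitempty"`
}

// encode marshals v to JSON and returns it as a string.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(encode(mountByValue{Name: "data"}))   // includes an empty volumeStatus object
	fmt.Println(encode(mountByPointer{Name: "data"})) // omits volumeStatus entirely
}
```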
Priyanka Saggu
0364ab662a PodReadyToStartContainersCondition: refactor test to verify same pod unblocks when volume is created
* volume created by (i) missing configmap, (ii) missing secret
2026-03-17 22:24:56 +05:30
Paco Xu
93c755bc33 e2e_node: fix pod StartTime assertion to compare using Equal 2026-03-17 12:02:01 +08:00
Priyanka Saggu
410efb048f add e2e tests verifying PodReadyToStartContainers condition set using criProxy to inject delay time 2026-03-16 10:14:02 +05:30
Kubernetes Prow Robot
95365ff478
Merge pull request #134768 from KevinTMtz/pod-level-resource-managers-5526
[PodLevelResourceManagers] Pod Level Resource Managers - Alpha
2026-03-14 08:45:35 +05:30
Kubernetes Prow Robot
0ad0cce87e
Merge pull request #137078 from saschagrunert/label-unlabeled-e2e-node-tests
Label unlabeled e2e node tests
2026-03-14 04:31:36 +05:30
Kubernetes Prow Robot
b5661be4ff
Merge pull request #137248 from SergeyKanzhelev/propagate-context-cri-client
add context to CRI API client and contextual logging per-call
2026-03-14 00:41:36 +05:30
George Angel
4413555fb8 kubelet: fix sidecar restart after kubelet restart
When a pod has a sidecar (initContainer with restartPolicy: Always) with
a startupProbe, and one or more regular containers crash after a kubelet
restart, the kubelet fails to restart the regular containers. RestartCount
stays at 0 indefinitely.

When ChangeContainerStatusOnKubeletRestart is disabled (default in v1.35),
the prober worker skips seeding probe results for containers that predate
the kubelet restart. For a sidecar with a startupProbe this means
startupManager.Get() returns found=false permanently. In
computeInitContainerActions, the sidecar Running case breaks out early at
the !found check, leaving podHasInitialized=false. computePodActions then
returns early at the !hasInitialized guard without restarting the crashed
regular containers.

Fix: when the gate is off and a restartable init container's startup probe
is being seeded for the first time after a kubelet restart, check the
container's Started field in the pod status. If Started=true, the sidecar
had already passed startup before the restart, so seed the startup manager
with Success. This allows computeInitContainerActions to detect pod
initialization via the sidecar Running path without altering readiness or
liveness probe seeding behaviour.

Add and update tests to cover the fix:
- worker unit tests for sidecar startup/readiness/liveness restart behaviour
- e2e node regression test for sidecar with startupProbe across kubelet restart

Fixes: https://github.com/kubernetes/kubernetes/issues/136910
2026-03-13 19:17:33 +10:00
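The seeding rule described in 4413555fb8 can be reduced to a small decision function. This is a sketch of the rule as stated in the commit message, not the prober worker's code: seedStartupSuccess is a hypothetical name, and the gate-on path is left out because the message does not describe it.

```go
package main

import "fmt"

// seedStartupSuccess reports whether the startup probe result for a container
// that predates a kubelet restart should be seeded with Success.
//
// gateEnabled stands for ChangeContainerStatusOnKubeletRestart.
// restartableInit is true for a sidecar (init container with restartPolicy: Always).
// started is the container's Started field from the pod status (nil if unset).
func seedStartupSuccess(gateEnabled, restartableInit bool, started *bool) bool {
	if gateEnabled {
		// This sketch models only the gate-off path described in the commit.
		return false
	}
	// A sidecar that had already passed startup before the restart is treated
	// as started, so pod initialization can be detected again.
	return restartableInit && started != nil && *started
}

func main() {
	yes, no := true, false

	fmt.Println("sidecar, Started=true: ", seedStartupSuccess(false, true, &yes))
	fmt.Println("sidecar, Started=false:", seedStartupSuccess(false, true, &no))
	fmt.Println("sidecar, Started unset:", seedStartupSuccess(false, true, nil))
	fmt.Println("regular container:     ", seedStartupSuccess(false, false, &yes))
}
```

Only the startup result is seeded in this sketch, which matches the commit's note that readiness and liveness seeding are left unchanged.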
Kubernetes Prow Robot
f7f694e5e0
Merge pull request #136792 from rata/userns-goes-ga
feature: Migrate UserNamespacesSupport to GA
2026-03-12 21:57:36 +05:30
Rodrigo Campos
f25830be53 test/e2e*: Remove references to UserNamespacesSupport feature gate
It's GA now.

Signed-off-by: Rodrigo Campos <rodrigo@amutable.com>
2026-03-12 15:20:09 +01:00
Sascha Grunert
d3919c7cef
Label unlabeled e2e node tests
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2026-03-12 09:02:24 +01:00
Kubernetes Prow Robot
a89519d791
Merge pull request #136728 from guptaNswati/kep-3695-FG-ga
KEP-3695: FG kubeletPodResources GA update
2026-03-12 07:15:34 +05:30
Kevin Torres
dec79e1fb2 E2E tests 2026-03-12 00:29:22 +00:00
Kubernetes Prow Robot
d729528df4
Merge pull request #136711 from saschagrunert/graduate-image-volume-ga
[KEP-4639]: Graduate ImageVolume to GA
2026-03-12 00:45:43 +05:30
Swati Gupta
9f9edb2525 remove featuregate in e2e_node test
Signed-off-by: Swati Gupta <swatig@nvidia.com>
2026-03-11 11:28:24 -07:00
Kubernetes Prow Robot
b16838370b
Merge pull request #136044 from SergeyKanzhelev/versioninconfigz
added API Version and Kind in /configz serialized objects
2026-03-11 15:09:36 +05:30
Kubernetes Prow Robot
4c162fe1f7
Merge pull request #136602 from guptaNswati/e2e-kep3695-podresources-test
add additional Get() tests for GA
2026-03-09 19:53:13 +05:30
Mads Jensen
1f2b70a043 Lint: Use modernize/rangeint in test/{e2e,e2e_node,images,soak} 2026-03-07 10:17:31 +01:00
Sergey Kanzhelev
f5efc1de14 add context to CRI API client and contextual logging per-call 2026-03-07 08:07:10 +00:00