Commit graph

7239 commits

Author SHA1 Message Date
Kubernetes Prow Robot
e0cae9a19e
Merge pull request #138451 from michaelasp/registerSTSMetric
Register Statefulset Metric for Reconcile skips
2026-04-24 02:58:45 +05:30
Kubernetes Prow Robot
b36864202b
Merge pull request #137755 from HirazawaUi/remove-SidecarContainers-feature-gate
Remove SidecarContainers feature gate
2026-04-23 08:16:45 +05:30
Kubernetes Prow Robot
1a22ad0fd2
Merge pull request #138408 from johnbelamaric/fix-dra-claim-flapping
Fix flapping pod.status.resourceClaimStatuses
2026-04-23 07:21:33 +05:30
Kubernetes Prow Robot
ad1c87b481
Merge pull request #138397 from omerap12/cleanup-hpa
HPA: Clean up duplicate unit tests
2026-04-23 07:21:19 +05:30
Kubernetes Prow Robot
74b206cc04
Merge pull request #138345 from soltysh/deployment_cleanup
Deployment controller cleanups
2026-04-23 06:08:19 +05:30
Kubernetes Prow Robot
82e8d2fe26
Merge pull request #138181 from michaelasp/dsExpectations
Update comments to explain why we delete expectations on all errors in DaemonSet
2026-04-23 06:06:52 +05:30
Kubernetes Prow Robot
ad36c93e0c
Merge pull request #138022 from michaelasp/svmResetMapper
Reset the rest mapper for recent discovery operations in SVM
2026-04-23 05:10:44 +05:30
MyoungHaSong
8e586c5071
Fix goroutine leaks in ephemeral volume controller test (#137970)
* Fix goroutine leaks in ephemeral volume controller test

Use context.WithCancel and properly shut down the informer factory
and workqueue in TestSyncHandler to prevent goroutine leaks.

Previously, the test used context.Background() which never cancels,
leaving informer and workqueue goroutines running after test completion.
Now that context support has been added to tools/cache (#126387),
the informers can be cleanly shut down via context cancellation.

Also add goleak.VerifyTestMain to detect goroutine leak regressions.

* Remove year from copyright header in main_test.go

* Drop main_test.go per review feedback
2026-04-23 04:15:52 +05:30
Kubernetes Prow Robot
f71292253d
Merge pull request #137744 from dims/dsrinivas/issue-137263-pv-controller-sync
persistentvolume: deflake TestControllerSync 5-2-3 startup race
2026-04-23 04:15:00 +05:30
Kubernetes Prow Robot
6bf148ce02
Merge pull request #137666 from soltysh/issue137409
Parallel pod management should not count old, broken pods for maxUnavailable budget
2026-04-23 04:14:45 +05:30
Jeffrey Ying
e5033d1fde
Simplify deployment controller deletePod logic, drop network call (#136639)
* Simplify deployment controller deletePod logic and avoid extra network call

* Fix tests
2026-04-23 03:20:01 +05:30
Keisuke Ishigami
4fd1a1c099
check the job owner reference in the cronjob reconcile loop (#133313)
* check the job owner reference in the cronjob reconcile loop

* use indexer to get jobs to be reconciled

* chore

* Update pkg/controller/cronjob/cronjob_controllerv2.go

Co-authored-by: Filip Křepinský <fkrepins@redhat.com>

* delete unnecessary comment

* move jobIndexer place

* Update pkg/controller/cronjob/cronjob_controllerv2.go

Co-authored-by: Maciej Szulik <soltysh@gmail.com>

* jobs -> jobsjobsToBeReconciled

* fix var name

---------

Co-authored-by: Filip Křepinský <fkrepins@redhat.com>
Co-authored-by: Maciej Szulik <soltysh@gmail.com>
2026-04-23 03:18:49 +05:30
Maciej Szulik
de88e7598c
Parallel pod management should accordingly count old unavailable
and terminated pods for maxUnavailable

This change ensures that Parallel pod management in the statefulset controller
counts old unavailable pods as candidates for rollouts, while leaving
terminating pods untouched. All disruptions should always ensure
that the statefulset stays within the defined maxUnavailable budget.

Signed-off-by: Maciej Szulik <soltysh@gmail.com>
2026-04-20 16:30:59 +02:00
Michael Aspinwall
0d78c8191d
Register statefulset metrics for skips
2026-04-17 19:04:29 +00:00
John Belamaric
57aae64982
Fix flapping pod.status.resourceClaimStatuses
resourceclaimcontroller: fix incorrect SSA apply in syncPod method

The ResourceClaimController's syncPod method only includes new
resource claims in the server-side apply, not existing claims. Since
this controller is the owning fieldManager, SSA removes the missing
existing keys. This results in flapping between claims when more than
one claim is assigned to the Pod.

This fix includes the existing claims in the SSA request.

Signed-off-by: John Belamaric <jbelamaric@google.com>
2026-04-17 14:56:18 +00:00
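The flapping described in the commit above can be reproduced with a toy model. This is a sketch under stated assumptions, not the controller's code or the real server-side apply implementation: `entry`, `apply`, and `withExisting` are hypothetical names, claim statuses are modelled as a plain map, and conflicts between managers are ignored.

```go
package main

import "fmt"

// entry records a value together with the field manager that owns it.
type entry struct {
	value   string
	manager string
}

// apply is a toy model of server-side apply for one field manager: keys the
// manager owned before but omitted from the request are removed, while keys
// owned by other managers are left untouched.
func apply(state map[string]entry, manager string, request map[string]string) map[string]entry {
	next := make(map[string]entry)
	for k, e := range state {
		if e.manager != manager {
			next[k] = e
		}
	}
	for k, v := range request {
		next[k] = entry{value: v, manager: manager}
	}
	return next
}

// withExisting builds a request that carries the manager's existing keys
// along with the newly added ones, which is the shape of the fix.
func withExisting(state map[string]entry, manager string, added map[string]string) map[string]string {
	request := make(map[string]string)
	for k, e := range state {
		if e.manager == manager {
			request[k] = e.value
		}
	}
	for k, v := range added {
		request[k] = v
	}
	return request
}

func main() {
	state := map[string]entry{"claim-a": {value: "a", manager: "controller"}}
	added := map[string]string{"claim-b": "b"}

	buggy := apply(state, "controller", added)
	fixed := apply(state, "controller", withExisting(state, "controller", added))

	fmt.Println(len(buggy), len(fixed)) // prints "1 2"
}
```

In the buggy request, claim-a disappears as soon as claim-b is applied. Applying claim-a alone on the next pass would then drop claim-b, which gives the alternation between claims that the commit message calls flapping.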
Omer Aplatony
62ce934202
removed TestScaleDownWithScalingRules (duplicate of TestScaleDown)
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2026-04-15 13:20:54 +00:00
Omer Aplatony
a6dce1229b
removed unused testThis var and its unit test case
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2026-04-15 13:09:46 +00:00
Omer Aplatony
8a887a11db
removed TestScaleUpHotCpuNoScaleWouldScaleDown (identical to TestScaleUpCMUnreadyandCpuHot)
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2026-04-15 13:08:14 +00:00
Omer Aplatony
f1f94dd037
removed duplicated unit test case (same as 'scaleDown with spec MinReplicas limitation with large pod policy')
2026-04-15 12:51:58 +00:00
Omer Aplatony
0455d9c0c3
removed TestScaleUpBothMetricsEmpty since we have TestConditionInvalidSourceType
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2026-04-15 12:19:53 +00:00
Maciej Szulik
83496be2cc
Drop outdated TODOs
Signed-off-by: Maciej Szulik <soltysh@gmail.com>
2026-04-13 16:08:32 +02:00
Maciej Szulik
1fa236c8f7
Drop extensions/v1beta1 from deployment controller
Signed-off-by: Maciej Szulik <soltysh@gmail.com>
2026-04-13 16:08:13 +02:00
Maciej Szulik
46eed6f82c
Move reading podMap to where it's actually needed
Signed-off-by: Maciej Szulik <soltysh@gmail.com>
2026-04-13 16:08:09 +02:00
Kubernetes Prow Robot
40007b6452
Merge pull request #138210 from Mujib-Ahasan/featuregate-WorkloadWithJob
Rename feature gate `EnableWorkloadWithJob` to `WorkloadWithJob`
2026-04-09 23:12:21 +05:30
Michael Aspinwall
1d78c7f2b1
Update comments to explain why we delete on all errors in DaemonSet
2026-04-07 16:56:58 +00:00
Omer Aplatony
cfe5b54d0a
Adds polling for HPA reconciliation_duration unit test (#138059)
* Adds polling for HPA reconciliation_duration unit test

Signed-off-by: Omer Aplatony <omerap12@gmail.com>

* using struct name

Signed-off-by: Omer Aplatony <omerap12@gmail.com>

---------

Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2026-04-07 21:45:33 +05:30
Mujib Ahasan
b9b0ff440d
remove accidentally committed file
Signed-off-by: Mujib Ahasan <ahasanmujib8@gmail.com>
2026-04-04 12:53:30 +05:30
Michael Aspinwall
fe88a9c6ff
Reset the rest mapper for recent discovery operations
2026-03-25 20:19:34 +00:00
Jon Huhn
d80f384b70
Workload API: PodGroup ResourceClaims (KEP-5729)
2026-03-22 14:52:45 -05:00
Nour
58cbde2aff
Pass individual informers, move DRA controllers to resource.go, simplify retry logic and metric tests
Signed-off-by: Nour <nurmn3m@gmail.com>
2026-03-19 16:50:03 +02:00
Nour
8b9159baa4
Drop CSR analogy, mark ObjectMeta +required, reduce limits (maxItems=500, maxLength=128) for etcd safety, add Errors printer column
Signed-off-by: Nour <nurmn3m@gmail.com>
2026-03-19 16:50:03 +02:00
Nour
4dffbf5b2a
Add tests for ResourcePoolStatusRequest
Add unit tests for handwritten and declarative validation, controller
logic, metrics, table printer output, controller-manager registration,
etcd storage round-trip, and an integration test for the full RPSR
lifecycle. Also add an e2e test exercising the DRA test driver with
RPSR and the example manifest.
2026-03-19 16:50:03 +02:00
Nour
30fe79df21
Add ResourcePoolStatusRequest controller, registry, and RBAC
Implement the RPSR controller that watches ResourcePoolStatusRequest
objects and aggregates pool status from DRA drivers. Add the API server
registry (strategy, storage), handwritten validation, RBAC bootstrap
policy for the controller, kube-controller-manager wiring, table
printer columns, and storage factory registration.
2026-03-19 16:50:02 +02:00
Kubernetes Prow Robot
9d02f5f918
Merge pull request #137032 from helayoty/helayoty/5547-workload-job-integration
KEP-5547: Implement Workload APIs integration with Job controller
2026-03-19 17:10:31 +05:30
HirazawaUi
964d79dd6e
Remove SidecarContainers feature gate
2026-03-19 15:56:47 +08:00
Kubernetes Prow Robot
98bb6823a8
Merge pull request #137862 from gnufied/pvc-unused-since-condition
Report PVC unused time via PVC condition
2026-03-19 07:08:49 +05:30
Kubernetes Prow Robot
b865748c1c
Merge pull request #135118 from johanneswuerbach/scaletozero
KEP-2021: HPA condition based scaling to zero
2026-03-19 03:36:30 +05:30
Roman Bednar
58f1520a03
update resize e2e tests to check only resize conditions
2026-03-18 17:08:11 -04:00
Roman Bednar
6c087b2724
add unused condition to persistent volume claims
2026-03-18 17:08:08 -04:00
helayoty
68e30095de
Implement Workload and PodGroup integration with Job controller
Signed-off-by: helayoty <heelayot@microsoft.com>
2026-03-18 20:32:37 +00:00
Kubernetes Prow Robot
1b5bcf309c
Merge pull request #137641 from helayoty/helayoty/protection-controller-podgroup
KEP-5832: Add protection controller for PodGroup
2026-03-19 01:03:00 +05:30
Kubernetes Prow Robot
92767f8e32
Merge pull request #137746 from dims/dsrinivas/issue-137740-nodeipam-resync
nodeipam: buffer TestNodeSyncResync report notifications
2026-03-18 23:21:21 +05:30
helayoty
1b90507cfa
move protectionutil pkg to controller/util
Signed-off-by: helayoty <heelayot@microsoft.com>
2026-03-18 15:27:56 +00:00
helayoty
0ef8d78d1d
Add new protection controller for PodGroup
Signed-off-by: helayoty <heelayot@microsoft.com>
2026-03-18 15:27:17 +00:00
Alay Patel
b9729e8197
kep-5304: add DeviceMetadata API
2026-03-18 08:29:42 -04:00
Johannes Würbach
6bebe8d3a2
KEP-2021: HPA condition based scaling to zero
2026-03-17 09:18:18 +01:00
Kubernetes Prow Robot
2d7979b985
Merge pull request #136367 from bhope/metrics-beta-job-controller
Promote job controller metrics to beta
2026-03-17 04:47:44 +05:30
Davanum Srinivas
72b6f1b590
persistentvolume: deflake TestControllerSync 5-2-3 startup race
Subtest 5-2-3 starts the controller with both the claim and the volume
already present and with the volume annotated as controller-bound. That
setup depends on watch/list ordering during controller startup: if the
initial objects are observed in a different order, the test can time out
even though the controller later converges to the expected state.

Model the scenario more explicitly. Start with the unbound volume only,
then inject the claim through AddClaimEvent after the controller is
running. Also drop the pre-set controller-bound annotation from the
initial volume since the binding is what the test wants the controller to
establish.

Tested:
go test -race ./pkg/controller/volume/persistentvolume -run TestControllerSync -count=50
2026-03-14 11:58:33 -04:00
Davanum Srinivas
5d3e8d9db6
nodeipam: buffer TestNodeSyncResync report notifications
TestNodeSyncResync closes opChan after observing the first resync and then
waits for the loop to exit. There is still a small window where the 1ms
resync timer fires again before the select notices the closed channel.
When that happens ReportResult sends a second notification on the
unbuffered reportChan, the loop blocks in the send, and the test waits
forever on doneChan.

Allow one queued notification so the loop can drain that race and reach
the closed opChan case. The test still validates that a resync happened;
it just stops depending on exact scheduling between two ready events.

Tested:
go test -race ./pkg/controller/nodeipam/ipam/sync -run TestNodeSyncResync -count=200
2026-03-14 11:58:31 -04:00
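The blocking send described in the commit above can be sketched as follows. This is a simplified model rather than the nodeipam test itself: `syncLoop` and `finishes` are hypothetical names, and the race is forced by having the loop always report two resyncs instead of relying on a timer.

```go
package main

import (
	"fmt"
	"time"
)

// syncLoop reports a fixed number of resyncs, then waits for opChan to be
// closed before exiting. Each report is a blocking send on reportChan.
func syncLoop(reportChan chan struct{}, opChan chan struct{}, doneChan chan struct{}, resyncs int) {
	defer close(doneChan)
	for i := 0; i < resyncs; i++ {
		reportChan <- struct{}{} // blocks when no reader is waiting and the buffer is full
	}
	<-opChan
}

// finishes mimics the test: observe the first resync, close opChan, and wait
// for the loop to exit. It reports whether the loop exited in time.
func finishes(buffer int) bool {
	reportChan := make(chan struct{}, buffer)
	opChan := make(chan struct{})
	doneChan := make(chan struct{})

	go syncLoop(reportChan, opChan, doneChan, 2) // a second resync fires: the racy case

	<-reportChan // the test only ever reads one notification
	close(opChan)

	select {
	case <-doneChan:
		return true
	case <-time.After(200 * time.Millisecond):
		return false
	}
}

func main() {
	fmt.Println("unbuffered:", finishes(0))
	fmt.Println("buffer of 1:", finishes(1))
}
```

With an unbuffered channel the second send never finds a reader, so the loop never reaches the closed opChan. A capacity of one lets that extra notification queue up and the loop exit.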
Kubernetes Prow Robot
4e2bbc78bf
Merge pull request #137170 from pohly/dra-device-taints-beta
DRA device taints: graduate to beta
2026-03-13 00:13:38 +05:30