* Fix goroutine leaks in ephemeral volume controller test
Use context.WithCancel and properly shut down the informer factory
and workqueue in TestSyncHandler to prevent goroutine leaks.
Previously, the test used context.Background() which never cancels,
leaving informer and workqueue goroutines running after test completion.
Now that context support has been added to tools/cache (#126387),
the informers can be cleanly shut down via context cancellation.
Also add goleak.VerifyTestMain to detect goroutine leak regressions.
* Remove year from copyright header in main_test.go
* Drop main_test.go per review feedback
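The shutdown pattern from the first item above can be sketched as follows. This is a minimal illustration, not the test's actual code: runWorker and startAndStop are hypothetical names, and the worker only stands in for an informer or workqueue goroutine.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// runWorker is a hypothetical stand-in for an informer or workqueue loop.
// It keeps ticking until ctx is cancelled, then returns.
func runWorker(ctx context.Context, wg *sync.WaitGroup) {
	defer wg.Done()
	ticker := time.NewTicker(time.Millisecond)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			// Periodic work would happen here.
		}
	}
}

// startAndStop starts n workers under a cancellable context, cancels it,
// and reports whether every worker exited within one second.
func startAndStop(n int) bool {
	ctx, cancel := context.WithCancel(context.Background())
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go runWorker(ctx, &wg)
	}
	cancel() // with context.Background() alone there would be nothing to cancel

	done := make(chan struct{})
	go func() {
		wg.Wait()
		close(done)
	}()
	select {
	case <-done:
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	fmt.Println("all workers stopped:", startAndStop(3))
}
```

In a real test the cancel call would typically be deferred so that it runs when the test function returns.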
* check the job owner reference in the cronjob reconcile loop
* use indexer to get jobs to be reconciled
* chore
* Update pkg/controller/cronjob/cronjob_controllerv2.go
Co-authored-by: Filip Křepinský <fkrepins@redhat.com>
* delete unnecessary comment
* move jobIndexer place
* Update pkg/controller/cronjob/cronjob_controllerv2.go
Co-authored-by: Maciej Szulik <soltysh@gmail.com>
* jobs -> jobsjobsToBeReconciled
* fix var name
---------
Co-authored-by: Filip Křepinský <fkrepins@redhat.com>
Co-authored-by: Maciej Szulik <soltysh@gmail.com>
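The idea of looking up jobs through an index keyed by their controlling owner can be sketched roughly as below. The types and the indexByController helper are simplified, hypothetical stand-ins; the real controller works with the Kubernetes API types and its own indexer.

```go
package main

import "fmt"

// ownerRef and job are hypothetical, stripped-down versions of the
// owner reference and Job objects.
type ownerRef struct {
	UID        string
	Controller bool
}

type job struct {
	Name   string
	Owners []ownerRef
}

// controllerUID returns the UID of the owner reference marked as the
// controller, if there is one.
func controllerUID(j job) (string, bool) {
	for _, ref := range j.Owners {
		if ref.Controller {
			return ref.UID, true
		}
	}
	return "", false
}

// indexByController groups job names by the UID of their controlling owner.
// Jobs without a controller reference are left out of the index.
func indexByController(jobs []job) map[string][]string {
	idx := make(map[string][]string)
	for _, j := range jobs {
		if uid, ok := controllerUID(j); ok {
			idx[uid] = append(idx[uid], j.Name)
		}
	}
	return idx
}

func main() {
	jobs := []job{
		{Name: "a", Owners: []ownerRef{{UID: "cron-1", Controller: true}}},
		{Name: "b", Owners: []ownerRef{{UID: "cron-2", Controller: true}}},
		{Name: "c"}, // no owner, never reconciled by a cronjob
	}
	idx := indexByController(jobs)
	fmt.Println(idx["cron-1"])
}
```

A reconcile loop using such an index only sees the jobs whose controller reference points at the cronjob being processed, rather than scanning every job in the namespace.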
and terminated pods for maxUnavailable
This change ensures that Parallel pod management in the statefulset controller
counts old unavailable pods as candidates for rollouts, while leaving
terminating pods untouched. All disruptions should always keep the
statefulset within the defined maxUnavailable budget.
Signed-off-by: Maciej Szulik <soltysh@gmail.com>
resourceclaimcontroller: fix incorrect SSA apply in syncPod method
The ResourceClaimController's syncPod method only includes new
resource claims in the server-side apply, not existing claims. Since
this controller is the owning fieldManager, SSA removes the missing
existing keys. This results in flapping between claims when more than
one claim is assigned to the Pod.
This fix includes the existing claims in the SSA request.
Signed-off-by: John Belamaric <jbelamaric@google.com>
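The shape of the fix can be sketched as follows: the list sent in the apply request is built from the existing entries plus the new ones, so nothing owned by the field manager is dropped. The claimStatus type and the buildApplyStatuses helper are hypothetical simplifications, not the controller's actual code.

```go
package main

import (
	"fmt"
	"sort"
)

// claimStatus is a hypothetical, simplified stand-in for one resource claim
// entry in a Pod's status, keyed by Name.
type claimStatus struct {
	Name              string
	ResourceClaimName string
}

// buildApplyStatuses returns the complete list to send in an apply request:
// every existing entry plus the new ones. A new entry replaces an existing
// entry with the same Name. The result is sorted by Name for stable output.
func buildApplyStatuses(existing, added []claimStatus) []claimStatus {
	byName := make(map[string]claimStatus, len(existing)+len(added))
	for _, s := range existing {
		byName[s.Name] = s
	}
	for _, s := range added {
		byName[s.Name] = s
	}
	out := make([]claimStatus, 0, len(byName))
	for _, s := range byName {
		out = append(out, s)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

func main() {
	existing := []claimStatus{{Name: "gpu", ResourceClaimName: "pod-gpu-abc"}}
	added := []claimStatus{{Name: "nic", ResourceClaimName: "pod-nic-def"}}
	// Sending only `added` would make the apply drop the "gpu" entry.
	fmt.Println(buildApplyStatuses(existing, added))
}
```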
* Adds polling for HPA reconciliation_duration unit test
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
* using struct name
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
---------
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
Add unit tests for handwritten and declarative validation, controller
logic, metrics, table printer output, controller-manager registration,
etcd storage round-trip, and an integration test for the full RPSR
lifecycle. Also add an e2e test exercising the DRA test driver with
RPSR and the example manifest.
Implement the RPSR controller that watches ResourcePoolStatusRequest
objects and aggregates pool status from DRA drivers. Add the API server
registry (strategy, storage), handwritten validation, RBAC bootstrap
policy for the controller, kube-controller-manager wiring, table
printer columns, and storage factory registration.
Subtest 5-2-3 starts the controller with both the claim and the volume
already present and with the volume annotated as controller-bound. That
setup depends on watch/list ordering during controller startup: if the
initial objects are observed in a different order, the test can time out
even though the controller later converges to the expected state.
Model the scenario more explicitly. Start with the unbound volume only,
then inject the claim through AddClaimEvent after the controller is
running. Also drop the pre-set controller-bound annotation from the
initial volume since the binding is what the test wants the controller to
establish.
Tested:
go test -race ./pkg/controller/volume/persistentvolume -run TestControllerSync -count=50
TestNodeSyncResync closes opChan after observing the first resync and then
waits for the loop to exit. There is still a small window where the 1ms
resync timer fires again before the select notices the closed channel.
When that happens ReportResult sends a second notification on the
unbuffered reportChan, the loop blocks in the send, and the test waits
forever on doneChan.
Allow one queued notification so the loop can drain that race and reach
the closed opChan case. The test still validates that a resync happened;
it just stops depending on exact scheduling between two ready events.
Tested:
go test -race ./pkg/controller/nodeipam/ipam/sync -run TestNodeSyncResync -count=200
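The blocking behaviour described above can be reproduced deterministically with a small sketch. It is not the test's code: runLoop and loopExits are hypothetical, and a manually driven tickChan replaces the 1ms resync timer so that the second resync always lands before the close is noticed.

```go
package main

import (
	"fmt"
	"time"
)

// runLoop is a hypothetical sync loop: each tick sends a notification on
// reportChan, and the loop exits once opChan is closed.
func runLoop(tickChan, opChan <-chan struct{}, reportChan chan<- struct{}, doneChan chan<- struct{}) {
	defer close(doneChan)
	for {
		select {
		case _, ok := <-opChan:
			if !ok {
				return
			}
		case <-tickChan:
			// Blocks when nobody is receiving and the buffer is full.
			reportChan <- struct{}{}
		}
	}
}

// loopExits replays the race with the given reportChan buffer size and
// reports whether the loop terminated. When it returns false, the loop
// goroutine is still blocked in the send.
func loopExits(buffer int) bool {
	tickChan := make(chan struct{})
	opChan := make(chan struct{})
	reportChan := make(chan struct{}, buffer)
	doneChan := make(chan struct{})
	go runLoop(tickChan, opChan, reportChan, doneChan)

	tickChan <- struct{}{} // first resync
	<-reportChan           // the test observes it
	tickChan <- struct{}{} // second resync fires before the close is seen
	close(opChan)

	select {
	case <-doneChan:
		return true
	case <-time.After(200 * time.Millisecond):
		return false
	}
}

func main() {
	fmt.Println("unbuffered exits:", loopExits(0)) // false
	fmt.Println("buffered exits:", loopExits(1))   // true
}
```

A buffer of one absorbs exactly one extra notification, which is enough for the single-resync race described above.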