* Fix goroutine leaks in ephemeral volume controller test
Use context.WithCancel and properly shut down the informer factory
and workqueue in TestSyncHandler to prevent goroutine leaks.
Previously, the test used context.Background() which never cancels,
leaving informer and workqueue goroutines running after test completion.
Now that context support has been added to tools/cache (#126387),
the informers can be cleanly shut down via context cancellation.
Also add goleak.VerifyTestMain to detect goroutine leak regressions.
* Remove year from copyright header in main_test.go
* Drop main_test.go per review feedback
Subtest 5-2-3 starts the controller with both the claim and the volume
already present and with the volume annotated as controller-bound. That
setup depends on watch/list ordering during controller startup: if the
initial objects are observed in a different order, the test can time out
even though the controller later converges to the expected state.
Model the scenario more explicitly. Start with the unbound volume only,
then inject the claim through AddClaimEvent after the controller is
running. Also drop the pre-set controller-bound annotation from the
initial volume since the binding is what the test wants the controller to
establish.
Tested:
go test -race ./pkg/controller/volume/persistentvolume -run TestControllerSync -count=50
Raname this because facing lint error, counter metrics should have
"_total" suffix. Add the test `volume_operation_errors_total`
Marked `volume_operation_total_errors` as deprecated
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
Added podToVolumes reverse index to optimize DeletePod.
Currently we simply iterate through all the volumes and remove the pod
being deleted from there. This is inefficient and takes longer the
longer the volume list becomes.
Keeping a map pod -> volumes makes removing a pod fast. We can just jump
to the relevant volumes directly and remove the pod from there.
When calling ControllerSELinuxTranslator.Conflicts(), the SELinux label
is repeatedly split into []string to detect conflicts. This causes a huge
number of allocations when there are many comparisons.
This is now made more efficient by pre-parsing the SELinux label and
storing it in podInfo as [4]string for fast comparison when needed.
Currently, we get each released PV every 15s, and in parallel. If there are a lot of released PV and we cannot finish all the get in 15s, it will starve other request by making the queue waiting for client-side throttling very long.
Even in a normal cluster, these requests are taking majority of all get requests from KCM (58% or 470 qps) in our stress test.
Replace all imports of k8s.io/apimachinery/pkg/util/dump with
k8s.io/utils/dump across the repo. The apimachinery dump package
now contains deprecated wrapper functions that delegate to
k8s.io/utils/dump for backwards compatibility.
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
Add unit test with a volume plugin that does not support SELinux. That
simulates a CSi driver whose spec.SELinuxMount is empty or false.
This requires a little refactoring, each unit test now has a flag if it
runs with a volume plugin that supports SELinux.
Reset SELinuxChangePolicy of Pods that have no SELinux label set to
Recursive. Kubelet cannot mount with `-o context=<label>`, if the label is
not known.
This fixes the e2e test error revealed by the previous commit - it changed the
e2e test to check for events when no events are expected and it found a
warning about a Pod with no label, but MountOption policy.
When a Pod reaches its final state (Succeeded or Failed), its volumes are
getting unmounted and therefore their SELinux mount option will not
conflict with any other pod.
Let the SELinux controller monitor "pod updated" events to see the pod is
finished