7061 commits

Author SHA1 Message Date
Kubernetes Prow Robot
59d65dad34
Merge pull request #134945 from tchap/kcm-controllers-check-threads
pkg/controller: Improve goroutine management (part 2)
2025-11-06 00:43:01 -08:00
Kubernetes Prow Robot
50b4bcbab5
Merge pull request #134210 from yliaog/admit_quota
DRA extended resource quota
2025-11-06 00:42:53 -08:00
Kubernetes Prow Robot
6723beac00
Merge pull request #135154 from kubernetes/revert-134840-ahmet/mini-cleanup
Revert "controller: duplicate utility method cleanup"
2025-11-05 22:49:04 -08:00
Kubernetes Prow Robot
ca03752ee7
Merge pull request #135104 from mimowo/mutable-job-directives
Allow mutable job scheduling directives on suspended Jobs
2025-11-05 21:57:11 -08:00
Kubernetes Prow Robot
f025bcace9
Merge pull request #135068 from pohly/dra-device-taints-1.35-full
DRA device taint eviction: several improvements
2025-11-05 18:52:58 -08:00
yliao
870062df4f adjusts DRA extended resource quota to include device usage from regular resource claims 2025-11-05 23:24:24 +00:00
Maciej Szulik
499bff4ca4
Revert "controller: duplicate utility method cleanup" 2025-11-05 21:06:09 +01:00
Michał Woźniak
5a7c90fb76 Allow mutable scheduling directives for suspended Jobs 2025-11-05 19:37:33 +00:00
Patrick Ohly
60744fc8b9 DRA device taint eviction: track evicting rules
This avoids having to call the rule lister (which theoretically, but not in
practice, can fail) and having to iterate over rules which can be ignored
(might be a small performance boost).
2025-11-05 20:03:17 +01:00
Patrick Ohly
9527987293 DRA device taint eviction: use NOP queue during simulation
It's slightly more efficient and a bit cleaner.
2025-11-05 20:03:17 +01:00
Patrick Ohly
eaee6b6bce DRA device taints: add separate feature gate for rules
Support for DeviceTaintRules depends on a significant amount of
additional code:
- ResourceSlice tracker is a NOP without it.
- Additional informers and corresponding permissions in scheduler and controller.
- Controller code for handling status.

Not all users necessarily need DeviceTaintRules, so adding a second feature
gate for that code makes it possible to limit the blast radius of bugs in that
code without having to turn off device taints and tolerations entirely.
2025-11-05 20:03:17 +01:00
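A minimal sketch of the two-gate idea described in eaee6b6bce, assuming the rules gate only takes effect when the base device taints gate is also on. The gate names, the `featureGates` type, and the component names are hypothetical illustrations, not the identifiers used in Kubernetes.

```go
package main

import "fmt"

// Hypothetical gate names, chosen only for this illustration.
const (
	gateDeviceTaints     = "DeviceTaints"
	gateDeviceTaintRules = "DeviceTaintRules"
)

// featureGates is a hypothetical stand-in for a feature gate registry.
type featureGates map[string]bool

// taintsEnabled reports whether the base feature is on.
func (g featureGates) taintsEnabled() bool { return g[gateDeviceTaints] }

// rulesEnabled reports whether the rule-specific code should run.
// Assumption: the second gate is only honored when the first is enabled.
func (g featureGates) rulesEnabled() bool {
	return g[gateDeviceTaints] && g[gateDeviceTaintRules]
}

// components lists which (hypothetical) pieces would be started.
func components(g featureGates) []string {
	var out []string
	if g.taintsEnabled() {
		out = append(out, "taint eviction")
	}
	if g.rulesEnabled() {
		out = append(out, "rule informer", "rule status updates")
	}
	return out
}

func main() {
	fmt.Println(components(featureGates{gateDeviceTaints: true}))
	fmt.Println(components(featureGates{gateDeviceTaints: true, gateDeviceTaintRules: true}))
}
```

With only the base gate on, the rule-specific pieces are never started, which is the "limit the blast radius" effect the commit message describes.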
Kubernetes Prow Robot
9ef1a14d68
Merge pull request #134840 from ahmetb/ahmet/mini-cleanup
controller: duplicate utility method cleanup
2025-11-05 08:06:58 -08:00
Kubernetes Prow Robot
9a192aa1c3
Merge pull request #134432 from Karthik-K-N/fix-sv-test
Fix storage version test flake
2025-11-05 06:56:52 -08:00
Ayato Tokubi
320987ead3 Addressed comments 2025-11-05 10:44:50 +00:00
Ayato Tokubi
5102591a6b Refactor resource claim metrics to use structured labels and add "source" dimension.
Signed-off-by: Ayato Tokubi <atokubi@redhat.com>
2025-11-05 09:52:47 +00:00
Kubernetes Prow Robot
c1a6a3ca71
Merge pull request #134152 from pohly/dra-device-taints-1.35
DRA: device taints: new ResourceSlice API, new features
2025-11-04 15:32:07 -08:00
Ondra Kupka
024382658b controller/volume/vacprotection: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
e08d03b1b5 controller/volume/selinuxwarning: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
1e6ad423bf controller/volume/pvprotection: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
0caae6f704 controller/volume/pvcprotection: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
ed74779a0f controller/volume/persistentvolume: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
8eab454e38 controller/volume/expand: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
27774052ab controller/volume/ephemeral: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
12205df76d controller/volume/attachdetach: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
9d4ff6ecf2 controller/tainteviction: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
d2a443db75 controller/serviceaccount: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
c641df792b controller/resourcequota: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Ondra Kupka
d908a470a5 controller/garbagecollector: Improve goroutine mgmt
Make sure all threads are terminated when Run returns.
2025-11-04 23:58:15 +01:00
Kubernetes Prow Robot
97cb47a913
Merge pull request #135080 from dejanzele/feat/promote-job-managedby-to-ga
KEP-4368: Job Managed By; Promote to GA
2025-11-04 13:42:12 -08:00
Patrick Ohly
bbf8bc766e DRA device taints: DeviceTaintRule status
To update the right statuses, the controller must collect more information
about why a pod is being evicted. Updating the DeviceTaintRule statuses then is
handled by the same work queue as evicting pods.

Both operations already share the same client instance and thus QPS+server-side
throttling, so they might as well share the same work queue. Deleting pods is
not necessarily more important than informing users or vice-versa, so there is
no strong argument for having different queues.

While at it, switch the unit tests to the same mock work queue as in
staging/src/k8s.io/dynamic-resource-allocation/internal/workqueue. Because
there is no time to add it properly to a staging repo, the implementation gets
copied.
2025-11-04 21:57:24 +01:00
Patrick Ohly
0689b628c7 generated files 2025-11-04 21:57:24 +01:00
Patrick Ohly
f4a453389d DRA device taint eviction: configurable number of workers
It might never be necessary to change the default, but it is hard to be sure.
It's better to have the option, just in case.
2025-11-04 21:57:24 +01:00
Kubernetes Prow Robot
a058cf788a
Merge pull request #134624 from yt2985/podcertificates-beta
Promote Pod Certificates feature to beta
2025-11-04 11:42:12 -08:00
Dejan Zele Pejchev
3dabd4417d
KEP-4368: Job Managed By; Promote to GA
Signed-off-by: Dejan Zele Pejchev <pejcev.dejan@gmail.com>
2025-11-04 10:59:45 +01:00
Kubernetes Prow Robot
d6aa2db57e
Merge pull request #135027 from omerap12/remove-reactor-hpa
Remove unused delete reactor
2025-11-04 01:30:10 -08:00
Kubernetes Prow Robot
48c56e04e0
Merge pull request #135017 from liggitt/stateful-set-noop-rollout
Fix spurious statefulset rollout from 1.33 → 1.34
2025-11-03 19:58:11 -08:00
Kubernetes Prow Robot
41673c7198
Merge pull request #134910 from tchap/kcm-controllers-thread-mgmt
pkg/controller: Improve goroutine management
2025-11-03 17:58:03 -08:00
Jordan Liggitt
979c442774
Fix spurious workload rollout due to null creationTimestamp in controller revisions 2025-11-03 17:11:06 -05:00
Jordan Liggitt
7d186d870f
Remove unused and fragile revision hash comparisons
This has been broken since 666a41c2ea, when the label value became non-integer encoded.
The chance of one controller revision hash label being int-parsable: (7/27)^8 = 0.00002041 = ~0
The chance of both being int-parsable: 0.00002041^2 = ~0

Hash comparison locks in differences in content failing EqualRevision
even when the semantic content is normalized to be equal.
2025-11-03 16:33:40 -05:00
Jordan Liggitt
94e085e15c
Add unit test detecting spurious statefulset rollout 2025-11-03 16:33:39 -05:00
Lukasz Szaszkiewicz
c832203707 pkg/controller/garbagecollector/garbagecollector_test: wrap kubeClient with a client that doesn't support WatchList semantics. 2025-11-03 10:41:49 +01:00
tinatingyu
59e075e8d3 Promote PodCertificateRequests to v1beta1 2025-11-02 05:33:44 +00:00
Omer Aplatony
264eab46db Remove unused delete reactor
Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2025-11-01 06:13:40 +00:00
Patrick Ohly
c69259cb71 DRA device taints: switch to workqueue in controller
The approach copied from node taint eviction was to fire off one goroutine per
pod at the intended time. This leads to the "thundering herd" problem: when a
single taint causes eviction of several pods and those all have no or the same
toleration grace period, then they all get deleted at the same time.

For node taint eviction that is limited by the number of pods per node, which
is typically ~100. In an integration test, that already led to problems with
watchers:

   cacher.go:855] cacher (pods): 100 objects queued in incoming channel.
   cache_watcher.go:203] Forcing pods watcher close due to unresponsiveness: key: "/pods/", labels: "", fields: "". len(c.input) = 10, len(c.result) = 10, graceful = false

It also causes spikes in memory consumption (mostly the 2KB stack per goroutine
plus closure) with no upper limit.

Using a workqueue makes concurrency more deterministic because there is an
upper limit. In the integration test, 10 workers kept the watch active.

Another advantage is that failures to evict the pod get retried with
exponential backoff per affected pod forever. Previously, evicting was tried a
few times with a fixed rate and then the controller gave up. If the apiserver
was down long enough, pods didn't get evicted.
2025-10-31 18:11:19 +01:00
Patrick Ohly
e5fcd20a26 DRA device taints: tighten controller test
We know how often the controller should get a pod, let's check it.
Must run before we do our own GET call.
2025-10-31 18:11:18 +01:00
Patrick Ohly
6ebd853f17 DRA: implementation of none taint effect
While at it, ensure that future unknown effects are treated like
the None effect.
2025-10-31 18:11:18 +01:00
Patrick Ohly
e4dda7b282 DRA device taints: fix DeviceTaintRule + missing slice case
When the ResourceSlice no longer exists, the ResourceSlice tracker didn't and
couldn't report the tainted devices even if they are allocated and in use. The
controller must keep track of DeviceTaintRules itself and handle this scenario.

In this scenario it is impossible to evaluate CEL expressions because the
necessary device attributes aren't available. We could:
- Copy them in the allocation result: too large, big change.
- Limit usage of CEL expressions to rules with no eviction: inconsistent.
- Remove the fields which cannot be supported well.

The last option is chosen.

The tracker is now no longer needed by the eviction controller. Reading
directly from the informer means that we cannot assume that pointers are
consistent. We have to track ResourceSlices by their name, not their pointer.
2025-10-31 18:11:18 +01:00
Patrick Ohly
2e543d151b DRA device taints: convert unit test to synctest
The immediate benefit is that the time required for running the package's unit
test goes down from ~10 seconds (because of required real-world delays) to ~0.5
seconds (depending on the CPU performance of the host). It can also make
writing tests easier because after a `Wait` there is no need for locking before
accessing internal state (all background goroutines are known to be blocked
waiting for the main goroutine).

What somewhat ruins the perfect determinism is the polling for informer cache
syncs: that can take an unknown number of loop iterations. Probably could be
fixed by making the waiting block on channels (requires work in client-go).

The only change required in the implementation is avoiding the sleep when
deleting a pod failed for the last time in the loop (a useful, albeit minor
improvement by itself): the test proceeds after having blocked that last Delete
call, in which case synctest expects the background goroutine to exit without
delay.
2025-10-30 17:29:58 +01:00
Kubernetes Prow Robot
808d320de1
Merge pull request #134956 from yliaog/blockowner
removed BlockOwnerDeletion
2025-10-30 01:26:11 -07:00
yliao
4f647b3f3d removed BlockOwnerDeletion 2025-10-29 22:41:10 +00:00