kubernetes

mirror of https://github.com/kubernetes/kubernetes.git synced 2026-05-23 10:29:27 -04:00

Author	SHA1	Message	Date
Ania Borowiec	48c4605408	Add logging error when UpdatePod finds no existing PodGroup with the pod to update	2026-01-27 11:42:03 +00:00
Kubernetes Prow Robot	efc15394a1	Merge pull request #135573 from brejman/issue-129733-score-update Update scoring function for balanced allocation to consider change to the node's balance	2026-01-26 21:49:52 +05:30
Kubernetes Prow Robot	53b29a3a2c	Merge pull request #136269 from pohly/dra-scheduler-double-allocation-fixes DRA scheduler: double allocation fixes	2026-01-26 20:59:50 +05:30
Patrick Ohly	581ee0a2ec	DRA scheduler: fix another root cause of double device allocation GatherAllocatedState and ListAllAllocatedDevices need to collect information from different sources (allocated devices, in-flight claims), potentially even multiple times (GatherAllocatedState first gets allocated devices, then the capacities). The underlying assumption that nothing bad happens in parallel is not always true. The following log snippet shows how an update of the assume cache (feeding the allocated devices tracker) and in-flight claims lands such that GatherAllocatedState doesn't see the device in that claim as allocated: dra_manager.go:263: I0115 15:11:04.407714 18778] scheduler: Starting GatherAllocatedState ... allocateddevices.go:189: I0115 15:11:04.407945 18066] scheduler: Observed device allocation device="testdra-all-usesallresources-hvs5d.driver/worker-5/worker-5-device-094" claim="testdra-all-usesallresources-hvs5d/claim-0553" dynamicresources.go:1150: I0115 15:11:04.407981 89109] scheduler: Claim stored in assume cache pod="testdra-all-usesallresources-hvs5d/my-pod-0553" claim="testdra-all-usesallresources-hvs5d/claim-0553" uid=<types.UID>: a84d3c4d-f752-4cfd-8993-f4ce58643685 resourceVersion="5680" dra_manager.go:201: I0115 15:11:04.408008 89109] scheduler: Removed in-flight claim claim="testdra-all-usesallresources-hvs5d/claim-0553" uid=<types.UID>: a84d3c4d-f752-4cfd-8993-f4ce58643685 version="1211" dynamicresources.go:1157: I0115 15:11:04.408044 89109] scheduler: Removed claim from in-flight claims pod="testdra-all-usesallresources-hvs5d/my-pod-0553" claim="testdra-all-usesallresources-hvs5d/claim-0553" uid=<types.UID>: a84d3c4d-f752-4cfd-8993-f4ce58643685 resourceVersion="5680" allocation=< { "devices": { "results": [ { "request": "req-1", "driver": "testdra-all-usesallresources-hvs5d.driver", "pool": "worker-5", "device": "worker-5-device-094" } ] }, "nodeSelector": { "nodeSelectorTerms": [ { "matchFields": [ { "key": "metadata.name", "operator": 
"In", "values": [ "worker-5" ] } ] } ] }, "allocationTimestamp": "2026-01-15T14:11:04Z" } > dra_manager.go:280: I0115 15:11:04.408085 18778] scheduler: Device is in flight for allocation device="testdra-all-usesallresources-hvs5d.driver/worker-5/worker-5-device-095" claim="testdra-all-usesallresources-hvs5d/claim-0086" dra_manager.go:280: I0115 15:11:04.408137 18778] scheduler: Device is in flight for allocation device="testdra-all-usesallresources-hvs5d.driver/worker-5/worker-5-device-096" claim="testdra-all-usesallresources-hvs5d/claim-0165" default_binder.go:69: I0115 15:11:04.408175 89109] scheduler: Attempting to bind pod to node pod="testdra-all-usesallresources-hvs5d/my-pod-0553" node="worker-5" dra_manager.go:265: I0115 15:11:04.408264 18778] scheduler: Finished GatherAllocatedState allocatedDevices=<map[string]interface {} \| len:2>: { Initial state: "worker-5-device-094" is in-flight, not in cache - goroutine #1: starts GatherAllocatedState, copies cache - goroutine #2: adds to assume cache, removes from in-flight - goroutine #1: checks in-flight => device never seen as allocated This is the second reason for double allocation of the same device in two different claims. The other was timing in the assume cache. Both were tracked down with an integration test (separate commit). It did not fail all the time, but enough that regressions should show up as flakes.	2026-01-26 15:44:48 +01:00
Kubernetes Prow Robot	584add12b6	Merge pull request #136457 from tosi3k/workload-helper Extract helper methods from gang scheduling plugin	2026-01-26 20:01:51 +05:30
Bartosz	56ca09911f	Refactor resource allocation tests to be more readable	2026-01-26 14:26:46 +00:00
Bartosz	8f5f69bc70	Change scoring function for balanced allocation	2026-01-26 14:22:46 +00:00
Antoni Zawodny	8b39544d60	Extract helper methods from gang scheduling plugin	2026-01-26 13:45:26 +01:00
Kubernetes Prow Robot	0af247eb14	Merge pull request #136344 from brejman/kep-5732-tas-rename-podgroupinfo Rename PodGroupInfo in preparation for Workload-aware scheduling changes	2026-01-23 17:37:29 +05:30
Bartosz	ae27a49a13	Rename PodGroupInfo to PodGroupState This is in preparation for PodGroupInfo struct with more pod group details	2026-01-22 14:45:40 +00:00
Kubernetes Prow Robot	cb077823fb	Merge pull request #136204 from romanbaron/remove-cache-ttl Remove cache expiration mechanism	2026-01-20 03:42:50 +05:30
carlory	c8fc0a1b98	remove CSIMigrationPortworx and InTreePluginPortworxUnregister feature gates Signed-off-by: carlory <baofa.fan@daocloud.io>	2026-01-19 11:35:29 +08:00
Roman Baron	74b7ff3c63	scheduler: Remove ttl parameter from cache.New signature	2026-01-13 17:04:06 +02:00
Antoni Zawodny	833b7205fc	Run PreBind plugins in parallel if feasible	2026-01-11 14:19:18 +01:00
Antoni Zawodny	16b375e4ef	Generalize ErrorChannel to other underlying types	2026-01-11 13:58:06 +01:00
Kubernetes Prow Robot	b54554b72d	Merge pull request #135955 from utam0k/async-metrics scheduler: align the meaning of victim metrics between async preemption and sync preemption	2026-01-08 20:39:41 +05:30
utam0k	44e0c79406	Align the meaning of victim metrics between async preemption and sync preemption Signed-off-by: utam0k <k0ma@utam0k.jp>	2026-01-08 21:02:17 +09:00
Kubernetes Prow Robot	8ab1bc1633	Merge pull request #135725 from bart0sh/PR211-add-extended-resources-test-cases Fix extended resource handling for DRA-backed resources on pod admission	2026-01-08 04:03:42 +05:30
Kubernetes Prow Robot	4e69edd0ee	Merge pull request #135392 from brejman/issue-134393-nominated-nodes Fix queue hint for plugins on change to pods with nominated nodes	2026-01-07 20:05:38 +05:30
Kubernetes Prow Robot	b2ac9e206f	Merge pull request #130231 from Barakmor1/updateimagelocality Update ImageLocality plugin to account ImageVolume images	2026-01-05 12:28:37 +05:30
Ed Bartosh	c2361491f5	Fix extended resource handling for DRA-backed resources In kubelet admission: - Remove extended resources from pod requirements if they are either backed by DRA or not present in node's allocatable resources In scheduler (fit.go): - Remove fallback logic that delegated all resources to DRA when draManager is nil These changes ensure that: - DRA-backed extended resources are properly handled during pod admission - DevicePlugin-backed extended resources still follow standard admission rules	2026-01-02 16:08:49 +02:00
Patrick Ohly	dfa6aa22b2	DRA scheduler: fix unit test flakes Test_isSchedulableAfterClaimChange was sensitive to system load because of the arbitrary delay when waiting for the assume cache to catch up. Running inside a synctest bubble avoids this. While at it, the unit tests get converted to ktesting (nicer failure output, no extra indention needed for tCtx.SyncTest). TestPlugin/prebind-fail-with-binding-timeout relied on setting up a claim with certain time stamps and then getting that test case tested within a certain real-world time window. It's surprising that this didn't flake more often because test execution order is random. Now the time stamp gets set right before the test case is about to be tested. Conversion to a synctest would be nicer, but synctests cannot have sub-tests, which are used here to track where log output and failures come from within the larger test case. Inside the plugin itself some log output gets added to explain why a claim is unavailable on a node in case of a binding timeout or error during Filter.	2025-12-30 11:45:02 +01:00
Kubernetes Prow Robot	3226fe520d	Merge pull request #135948 from pohly/dra-scheduler-resource-plugin-unit-test-fix DRA extended resources: fix flake in unit tests	2025-12-30 16:12:35 +05:30
Kubernetes Prow Robot	2a3a6605ac	Merge pull request #135330 from sujalshah-bit/fix-mem-leak scheduler: Fix memory leak in scheduler cache	2025-12-29 15:56:34 +05:30
Patrick Ohly	7a4d650125	DRA extended resources: fix flake in unit tests The tests assumed that instantiating a DRAManager followed by informerFactory.WaitForCacheSync would be enough to have the manager up-to-date, but that's not correct: the test only waits for informer caches to be synced, but syncing event handlers like the one in the manager may still be going on. The flake rate is low, though: $ GOPATH/bin/stress -p 256 ./noderesources.test 5s: 0 runs so far, 0 failures, 256 active 10s: 256 runs so far, 0 failures, 256 active 15s: 256 runs so far, 0 failures, 256 active 20s: 512 runs so far, 0 failures, 256 active 25s: 567 runs so far, 0 failures, 256 active 30s: 771 runs so far, 0 failures, 256 active /tmp/go-stress-20251226T181044-974980161 --- FAIL: TestCalculateResourceAllocatableRequest (0.81s) --- FAIL: TestCalculateResourceAllocatableRequest/DRA-backed-resource-with-shared-device-allocation (0.00s) extendedresourcecache.go:197: I1226 18:11:14.431337] Updated extended resource cache for explicit mapping extendedResource="extended.resource.dra.io/something" deviceClass="device-class-name" extendedresourcecache.go:204: I1226 18:11:14.431380] Updated extended resource cache for default mapping extendedResource="deviceclass.resource.kubernetes.io/device-class-name" deviceClass="device-class-name" extendedresourcecache.go:220: I1226 18:11:14.431394] Updated device class mapping deviceClass="device-class-name" extendedResource="extended.resource.dra.io/something" resource_allocation_test.go:595: Expected requested=2, but got requested=1 FAIL It becomes higher when changing WaitForCacheSync such that it doesn't poll and therefore returns more promptly, which is where this flake was first observed. The fix is to run the test in a synctest bubble where Wait can be used to wait for all background activity, including event handling, to be finished before proceeding with the test. synctest is less forgiving about lingering goroutines.
A synctest bubble must wait for goroutines to stop, which in this case means that there has to be a way to wait for the metric recorder shutdown. Event handlers have to be removed. This could be done with plain Go, but here test/utils/ktesting is used instead because it offers some advantages: - less boilerplate code - automatic cancellation of the context (i.e. less manual context.WithCancel) - tCtx.SyncTest is a direct substitute for t.Run, which avoids re-indenting sub-tests. synctest itself needs another anonymous function, which makes the line too long and forced re-indention: t.Run(... func(...) { synctest.Test(... func() { }) }) For the sake of consistency all tests get updated. While at it, some code gets improved: - t.Fatal(err) is not a good way to report an error because there is no additional markup in the test output that indicates that there was an unexpected error. It just logs err.Error(), which might not be very informative and/or obvious. - newTestDRAManager aborts in case of a failure instead of returning an error.	2025-12-27 09:47:56 +01:00
Bartosz	3b4f0be6e3	Check NominatedNodeName to decide if a pod is scheduled	2025-12-19 12:30:06 +00:00
Patrick Ohly	ad79e479c2	build: remove deprecated '// +build' tag This has been replaced by `//go:build ...` for a long time now. Removal of the old build tag was automated with: for i in $(git grep -l '^// +build' \| grep -v -e '^vendor/'); do if ! grep -q '^// Code generated' "$i"; then sed -i -e '/^\/\/ +build/d' "$i"; fi; done	2025-12-18 12:16:21 +01:00
Kubernetes Prow Robot	a504b1b4eb	Merge pull request #135755 from pohly/dra-logging DRA: log more information	2025-12-18 02:10:38 -08:00
bmordeha	6f57f1e95b	Update imageLocality plugin to account for ImageVolume images when scoring and prioritizing nodes with required pod images Signed-off-by: bmordeha <bmordeha@redhat.com>	2025-12-18 09:28:39 +02:00
Kubernetes Prow Robot	4a1cbabadd	Merge pull request #135495 from tosi3k/skip-last-pod-deletion Skip last victim in async preemption if any prior Pod preemption failed	2025-12-17 22:36:28 -08:00
Kubernetes Prow Robot	62db4db266	Merge pull request #135489 from ania-borowiec/update_comment Update async preemption comment to reflect the current state of the code	2025-12-17 22:36:13 -08:00
Kubernetes Prow Robot	c5a0c31294	Merge pull request #135484 from bart0sh/PR209-improve-balanced-allocation-coverage Extended resources unit tests: cover DRA resources	2025-12-17 22:36:06 -08:00
Kubernetes Prow Robot	1a3d8712f3	Merge pull request #135394 from brejman/adhoc-interpodaffinity-pending-pod-update Fix queue hint for interpodaffinity when target pod is updated	2025-12-17 21:42:46 -08:00
Kubernetes Prow Robot	285eb9fdba	Merge pull request #135325 from brejman/issue-134393 Fix queue hint for inter-pod anti-affinity	2025-12-17 20:01:02 -08:00
Bartosz	d6d8639349	Fix queue hint for interpod antiaffinity	2025-12-16 13:01:15 +00:00
Bartosz	145adcd522	Fix queue hint for interpodaffinity when target pod is updated	2025-12-16 12:57:50 +00:00
Patrick Ohly	5d536bfb8e	DRA: log more information For debugging double allocation of the same device (https://github.com/kubernetes/kubernetes/issues/133602) it is necessary to have information about pools, devices and in-flight claims. Log calls get extended and the config for DRA CI jobs updated to enable higher verbosity for relevant source files. Log output in such a cluster at verbosity 6 looks like this: I1215 10:28:54.166872 1 allocator_incubating.go:130] "Gathered pool information" logger="FilterWithNominatedPods.Filter.DynamicResources" pod="dra-8841/tester-3" node="kind-worker2" pools={"count":1,"devices":["dra-8841.k8s.io/kind-worker2/device-00"],"meta":[{"InvalidReason":"","id":"dra-8841.k8s.io/kind-worker2","isIncomplete":false,"isInvalid":false}]} I1215 10:28:54.166941 1 allocator_incubating.go:254] "Gathered information about devices" logger="FilterWithNominatedPods.Filter.DynamicResources" pod="dra-8841/tester-3" node="kind-worker2" allocatedDevices={"count":2,"devices":["dra-8841.k8s.io/kind-worker/device-00","dra-8841.k8s.io/kind-worker3/device-00"]} minDevicesToBeAllocated=1	2025-12-16 09:58:05 +01:00
Ed Bartosh	1820dc7535	Fit tests: add DRA-aware test cases	2025-12-12 15:48:18 +02:00
Ed Bartosh	7860effc2c	resourceAllocationScorer: add unit test for DRA nodeMatches	2025-12-12 15:48:13 +02:00
Ed Bartosh	02a39d6c1e	Balanced allocation tests: cover DRA resources - Added DRA-aware test cases - Pulled shared DRA setup out into helper to keep tests DRY - Added SignPod test	2025-12-12 13:51:19 +02:00
Antoni Zawodny	7577f84e79	Skip last victim in async preemption if any prior Pod preemption failed	2025-12-10 14:44:06 +01:00
Ania Borowiec	0cf3d0e20a	Update comment to reflect the current state of the code	2025-11-27 22:10:02 +00:00
Mohammad Varmazyar	4c2fff1934	Address comments, log level, test assertion consistency and remove unnecessary locks in TestFlushUnschedulablePodsLeftoverSetsFlag	2025-11-26 14:08:05 +01:00
Mohammad Varmazyar	4f455c9c0d	Refactor plugin clearing to use ClearRejectorPlugins method	2025-11-26 09:54:32 +01:00
Mohammad Varmazyar	bc632c72d0	scheduler: add metric for pods scheduled after flush Add counter metric to track pods that schedule immediately after being flushed from unschedulablePods due to timeout. Uses a boolean flag that is cleared when pods return to queue or move via events.	2025-11-24 09:38:41 +01:00
Mohammad Varmazyar	b2a399cf30	scheduler: add metric for pods scheduled after flush This metric tracks pods that successfully schedule after being flushed from unschedulablePods due to timeout. High values may indicate missing queue hint optimizations or event handling issues.	2025-11-24 09:38:40 +01:00
Ravi Sastry Kadali	9dc5683c56	scheduler: Fix memory leak in scheduler cache The `removeSlice` function was leaving behind references to the removed element, preventing it from being garbage-collected. This commit ensures that removed entries are fully cleared, eliminating the memory leak. Co-authored-by: ravisastryk <ravisastryk@gmail.com> Signed-off-by: Sujal Shah <sujalshah28092004@gmail.com>	2025-11-20 02:18:38 +05:30
bwsalmon	854e67bb51	KEP 5598: Opportunistic Batching (#135231) * First version of batching w/out signatures. * First version of pod signatures. * Integrate batching with signatures. * Fix merge conflicts. * Fixes from self-review. * Test fixes. * Fix a bug that limited batches to size 2 Also add some new high-level logging and simplify the pod affinity signature. * Re-enable batching on perf tests for now. * fwk.NewStatus(fwk.Success) * Review feedback. * Review feedback. * Comment fix. * Two plugin specific unit tests. * Add cycle state to the sign call, apply to topo spread. Also add unit tests for several plugin signature calls. * Review feedback. * Switch to distinct stats for hint and store calls. * Switch signature from string to []byte * Revert cyclestate in signs. Update node affinity. Node affinity now sorts all of the various nested arrays in the structure. CycleState no longer in signature; revert to signing fewer cases for pod spread. * hack/update-vendor.sh * Disable signatures when extenders are configured. * Update pkg/scheduler/framework/runtime/batch.go Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com> * Update staging/src/k8s.io/kube-scheduler/framework/interface.go Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com> * Review feedback. * Disable node resource signatures when extended DRA enabled. * Review feedback. * Update pkg/scheduler/framework/plugins/imagelocality/image_locality.go Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com> * Update pkg/scheduler/framework/interface.go Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com> * Update pkg/scheduler/framework/plugins/nodedeclaredfeatures/nodedeclaredfeatures.go Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com> * Update pkg/scheduler/framework/runtime/batch.go Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com> * Review feedback.
* Fixes for review suggestions. * Add integration tests. * Linter fixes, test fix. * Whitespace fix. * Remove broken test. * Unschedulable test. * Remove go.mod changes. --------- Co-authored-by: Maciej Skoczeń <87243939+macsko@users.noreply.github.com>	2025-11-12 21:51:37 -08:00
ndixita	5ac2ffcc1e	Enabling NodeDeclaredFeatures in unit tests Signed-off-by: ndixita <ndixita@google.com>	2025-11-12 08:26:15 +00:00
ndixita	7645eb70e9	Scheduler changes to support pod level resources in place resize	2025-11-11 18:15:22 +00:00
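
Commit 9dc5683c56 above describes a leak in which a slice-removal helper left stale references in the backing array. The sketch below is not the scheduler's actual `removeSlice` code; it only illustrates the general Go pattern under that description, with a hypothetical `node` type and a hypothetical `removeAt` helper. Shrinking a slice of pointers does not clear the vacated tail slot, so the slot must be set to nil explicitly before the garbage collector can reclaim what it points to.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a cached entry; it is not a scheduler type.
type node struct{ name string }

// removeAt (hypothetical helper) deletes the element at index i from a slice
// of pointers. After shifting the tail left, it sets the vacated last slot to
// nil so the backing array no longer holds a stale pointer.
func removeAt(s []*node, i int) []*node {
	copy(s[i:], s[i+1:])
	s[len(s)-1] = nil // without this line the backing array keeps the old pointer alive
	return s[:len(s)-1]
}

func main() {
	backing := []*node{{"a"}, {"b"}, {"c"}}

	// Remove the last element; the returned slice shares the backing array.
	trimmed := removeAt(backing, 2)

	fmt.Println(len(trimmed), trimmed[0].name, trimmed[1].name) // 2 a b
	fmt.Println(backing[2] == nil)                              // true
}
```

Since Go 1.22, the standard library's `slices.Delete` also zeroes the vacated elements, so it may be an alternative to a hand-written helper.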

2003 commits