kubernetes/pkg
Patrick Ohly eb7391688e DRA scheduler: fix another root cause of double device allocation
GatherAllocatedState and ListAllAllocatedDevices have to collect information
from different sources (allocated devices, in-flight claims), potentially even
in multiple passes (GatherAllocatedState first reads the allocated devices,
then the capacities).

The implicit assumption that nothing changes concurrently does not always
hold. The following log snippet shows how an update of the assume
cache (which feeds the allocated devices tracker) and of the in-flight claims
can land in the middle of GatherAllocatedState, so that the device in that
claim is never seen as allocated:

    dra_manager.go:263: I0115 15:11:04.407714      18778] scheduler: Starting GatherAllocatedState
    ...
    allocateddevices.go:189: I0115 15:11:04.407945      18066] scheduler: Observed device allocation device="testdra-all-usesallresources-hvs5d.driver/worker-5/worker-5-device-094" claim="testdra-all-usesallresources-hvs5d/claim-0553"
    dynamicresources.go:1150: I0115 15:11:04.407981      89109] scheduler: Claim stored in assume cache pod="testdra-all-usesallresources-hvs5d/my-pod-0553" claim="testdra-all-usesallresources-hvs5d/claim-0553" uid=<types.UID>: a84d3c4d-f752-4cfd-8993-f4ce58643685 resourceVersion="5680"
    dra_manager.go:201: I0115 15:11:04.408008      89109] scheduler: Removed in-flight claim claim="testdra-all-usesallresources-hvs5d/claim-0553" uid=<types.UID>: a84d3c4d-f752-4cfd-8993-f4ce58643685 version="1211"
    dynamicresources.go:1157: I0115 15:11:04.408044      89109] scheduler: Removed claim from in-flight claims pod="testdra-all-usesallresources-hvs5d/my-pod-0553" claim="testdra-all-usesallresources-hvs5d/claim-0553" uid=<types.UID>: a84d3c4d-f752-4cfd-8993-f4ce58643685 resourceVersion="5680" allocation=<
        	{
        	  "devices": {
        	    "results": [
        	      {
        	        "request": "req-1",
        	        "driver": "testdra-all-usesallresources-hvs5d.driver",
        	        "pool": "worker-5",
        	        "device": "worker-5-device-094"
        	      }
        	    ]
        	  },
        	  "nodeSelector": {
        	    "nodeSelectorTerms": [
        	      {
        	        "matchFields": [
        	          {
        	            "key": "metadata.name",
        	            "operator": "In",
        	            "values": [
        	              "worker-5"
        	            ]
        	          }
        	        ]
        	      }
        	    ]
        	  },
        	  "allocationTimestamp": "2026-01-15T14:11:04Z"
        	}
         >
    dra_manager.go:280: I0115 15:11:04.408085      18778] scheduler: Device is in flight for allocation device="testdra-all-usesallresources-hvs5d.driver/worker-5/worker-5-device-095" claim="testdra-all-usesallresources-hvs5d/claim-0086"
    dra_manager.go:280: I0115 15:11:04.408137      18778] scheduler: Device is in flight for allocation device="testdra-all-usesallresources-hvs5d.driver/worker-5/worker-5-device-096" claim="testdra-all-usesallresources-hvs5d/claim-0165"
    default_binder.go:69: I0115 15:11:04.408175      89109] scheduler: Attempting to bind pod to node pod="testdra-all-usesallresources-hvs5d/my-pod-0553" node="worker-5"
    dra_manager.go:265: I0115 15:11:04.408264      18778] scheduler: Finished GatherAllocatedState allocatedDevices=<map[string]interface {} | len:2>: {

Initial state: "worker-5-device-094" is in-flight, not in the cache
- goroutine #1: starts GatherAllocatedState, copies the cache
- goroutine #2: adds the claim to the assume cache, removes it from the in-flight claims
- goroutine #1: checks the in-flight claims

=> the device is never seen as allocated (see the sketch below)
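
The following is a minimal, self-contained Go sketch of that check-then-act
race, using purely illustrative names (tracker, gatherAllocatedState,
betweenReads) rather than the real scheduler types: the two sources are read
in two separate critical sections, so an update that lands between the reads
makes the device invisible to both checks.

    package main

    import (
    	"fmt"
    	"sync"
    )

    // tracker is a toy stand-in for the scheduler's allocated-device tracking.
    type tracker struct {
    	mu        sync.Mutex
    	allocated map[string]bool // devices recorded in the assume cache
    	inFlight  map[string]bool // devices held by in-flight claims

    	// betweenReads is a test hook that forces the interleaving above.
    	betweenReads func()
    }

    // gatherAllocatedState reads the two sources in separate critical sections.
    // A device moved from inFlight to allocated between the reads is seen in
    // neither, which is the race shown in the log snippet.
    func (t *tracker) gatherAllocatedState(dev string) bool {
    	t.mu.Lock()
    	fromCache := t.allocated[dev] // goroutine #1: copies the cache state
    	t.mu.Unlock()

    	if t.betweenReads != nil {
    		t.betweenReads() // goroutine #2 runs here in the failing schedule
    	}

    	t.mu.Lock()
    	fromInFlight := t.inFlight[dev] // goroutine #1: checks in-flight claims
    	t.mu.Unlock()

    	return fromCache || fromInFlight
    }

    func main() {
    	const dev = "worker-5-device-094"
    	t := &tracker{
    		allocated: map[string]bool{},
    		inFlight:  map[string]bool{dev: true},
    	}
    	t.betweenReads = func() {
    		// goroutine #2: the claim lands in the assume cache and is
    		// removed from the in-flight claims.
    		t.mu.Lock()
    		t.allocated[dev] = true
    		delete(t.inFlight, dev)
    		t.mu.Unlock()
    	}
    	// Prints "false": the device is never reported as allocated, so a
    	// second claim could be handed the same device.
    	fmt.Println(t.gatherAllocatedState(dev))
    }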

This is the second root cause of the same device being allocated to two
different claims; the other was timing in the assume cache. Both were
tracked down with an integration test (separate commit). The test did not
fail every time, but often enough that regressions should show up as flakes.
2026-01-27 14:34:56 +01:00
..
api fix(pod/util): typos in getting pod validation options 2025-02-28 22:19:00 +05:30
apis fix: allow job startTime updates on resume from suspended state 2025-11-05 09:52:53 +01:00
auth wire in ctx to rbac plugins 2024-09-17 20:04:02 +03:00
capabilities Add ut coverage for capabilities.Setup (#125395) 2024-10-17 18:23:03 +01:00
client Add test to confirm default content type used by core client 2024-10-23 11:35:32 -04:00
cluster/ports
controller mark QuotaMonitor as not running and invalidate monitors list 2026-01-08 13:49:59 +01:00
controlplane test: Add emulated-version flag verification in flagz test 2025-02-20 18:54:51 -08:00
credentialprovider credential provider config: detect typos 2024-10-14 12:23:43 -07:00
features Add the feature gate OrderedNamespaceDeletion for apiserver. 2025-03-03 13:40:33 -08:00
fieldpath
generated kubelet: use env vars in node log query PS command 2025-01-13 14:25:35 -08:00
kubeapiserver v1alpha2 LeaseCandidate API 2024-11-08 02:27:19 +00:00
kubectl DRA: bump API v1alpha2 -> v1alpha3 2024-07-21 17:28:13 +02:00
kubelet mark device manager as healthy before it started for the first time 2025-11-07 03:06:43 +00:00
kubemark remove runonce mode 2024-11-07 19:54:11 +08:00
printers v1alpha2 LeaseCandidate API 2024-11-08 02:27:19 +00:00
probe fix: enable nil-compare and error-nil rules from testifylint in module k8s.io/kubernetes 2024-09-25 06:02:47 +02:00
proxy kube-proxy/winkernel: fix stale RemoteEndpoints due to premature clearing of terminatedEndpoints map. 2025-11-06 07:59:08 +00:00
quota/v1 Merge pull request #128407 from ndixita/pod-level-resources 2024-11-08 07:10:50 +00:00
registry fix: allow job startTime updates on resume from suspended state 2025-11-05 09:52:53 +01:00
routes Move public key getter to interface 2024-06-25 18:10:08 -04:00
scheduler DRA scheduler: fix another root cause of double device allocation 2026-01-27 14:34:56 +01:00
security Copy limited pieces of code we use from runc's apparmor and utils packages 2024-10-22 09:56:22 -04:00
securitycontext Mask Linux thermal interrupt info in /proc and /sys. 2025-07-16 11:07:17 +02:00
serviceaccount Isolate mock signer for externaljwt tests 2024-12-12 09:32:11 -05:00
util Revert "Enforce the Minimum Kernel Version 6.3 for UserNamespacesSupport feature" 2025-05-15 12:26:08 +02:00
volume Merge pull request #135066 from eltrufas/automated-cherry-pick-of-#133599-upstream-release-1.32 2025-11-19 23:32:02 -08:00
windows/service Windows node graceful shutdown 2024-11-05 17:46:22 +00:00
.import-restrictions
OWNERS