The previous error message said the audience was "not found in pod
spec volume", which led users to mount a spurious projected service
account token volume in the pod spec to satisfy the check. That is
not the intended remedy: kubelets should be authorized via RBAC to
request tokens for the configured audience.
Reword the error to a generic "is not authorized to request tokens
for audience %q" so users are not pushed toward modifying pod specs.
The valid authorization paths (pod spec volume, CSIDriver tokenRequests,
or the request-serviceaccounts-token-audience verb) are documented
in the kubelet credential provider task page.
Update the unit and integration test expectations to match.
- Refine doc comment to note ExtendedResourceClaimStatus may change under
rare condition after a pod is bound to a node.
- Seed the initial pod with ExtendedResourceClaimStatus{extended-claim-0}
and RequestMappings so the test exercises the change scenario yliaog
described rather than nil-to-set.
- Update the inline comment and assertion messages to reflect the swap
from extended-claim-0 to extended-claim-1.
Pods using DRAExtendedResource (e.g. nvidia.com/gpu) have no
Spec.ResourceClaims, so ResourceClaimStatuses stays nil. The updatePod
fast-path skipped AddPod when only ExtendedResourceClaimStatus changed,
leaving the synthesized claim→pod→node edge out of the authorization graph.
Add PodExtendedStatusEqual to the fast-path guard, matching what the
scheduler event handler already does in events.go.
This commit introduces the DRAResourceClaimGranularStatusAuthorization
feature gate (Beta in 1.36) to enforce fine-grained authorization checks
on ResourceClaim status updates.
Previously, 'update' permission on 'resourceclaims/status' allowed modifying
the entire status. To enforce the principle of least privilege for DRA
drivers and the scheduler, this change introduces synthetic subresources and
verb prefixes:
- 'resourceclaims/binding': Required to update 'status.allocation' and
'status.reservedFor'.
- 'resourceclaims/driver': Required to update 'status.devices'. Evaluated
on a per-driver basis using 'associated-node:<verb>' (for node-local
ServiceAccounts) or 'arbitrary-node:<verb>' (for cluster-wide controllers).
Implement the RPSR controller that watches ResourcePoolStatusRequest
objects and aggregates pool status from DRA drivers. Add the API server
registry (strategy, storage), handwritten validation, RBAC bootstrap
policy for the controller, kube-controller-manager wiring, table
printer columns, and storage factory registration.
The fields become beta, enabled by default. DeviceTaintRule gets
added to the v1beta2 API, but support for it must remain off by default
because that API group is also off by default.
The v1beta1 API is left unchanged. No-one should be using it
anymore (deprecated in 1.33, could be removed now if it wasn't for
reading old objects and version emulation).
To achieve consistent validation, declarative validation must be enabled also
for v1alpha3 (was already enabled for other versions). Otherwise,
TestVersionedValidationByFuzzing fails:
--- FAIL: TestVersionedValidationByFuzzing (0.09s)
--- FAIL: TestVersionedValidationByFuzzing/resource.k8s.io/v1beta2,_Kind=DeviceTaintRule (0.00s)
validation_test.go:109: different error count (0 vs. 1)
resource.k8s.io/v1alpha3: <no errors>
resource.k8s.io/v1beta2: "spec.taint.effect: Unsupported value: \"幤HxÒQP¹¬永唂ȳ垞ş]嘨鶊\": supported values: \"NoExecute\", \"NoSchedule\", \"None\""
...
* Drop WorkloadRef field and introduce SchedulingGroup field in Pod API
* Introduce v1alpha2 Workload and PodGroup APIs, drop v1alpha1 Workload API
Co-authored-by: yongruilin <yongrlin@outlook.com>
* Run hack/update-codegen.sh
* Adjust kube-scheduler code and integration tests to v1alpha2 API
* Drop v1alpha1 scheduling API group and run make update
---------
Co-authored-by: yongruilin <yongrlin@outlook.com>