Commit graph

54502 commits

Author SHA1 Message Date
bmordeha
6f57f1e95b Update imageLocality plugin
to account for ImageVolume images when scoring
and prioritizing nodes with required pod images

Signed-off-by: bmordeha <bmordeha@redhat.com>
2025-12-18 09:28:39 +02:00
Kubernetes Prow Robot
53b29512ec
Merge pull request #135515 from dims/add-explicit-type-to-feature-gate-const-declarations
Add explicit type to feature gate const declarations
2025-12-17 22:36:59 -08:00
Kubernetes Prow Robot
1abe8c34e9
Merge pull request #135511 from aojea/conntrack_udp_filters
kube-proxy: optimize conntrack cleanup with O(n) flow filter
2025-12-17 22:36:43 -08:00
Kubernetes Prow Robot
321e0f69d8
Merge pull request #135504 from dims/bump=github.com/opencontainers/cgroups-to-v0.0.6
Bump github.com/opencontainers/cgroups to v0.0.6
2025-12-17 22:36:36 -08:00
Kubernetes Prow Robot
4a1cbabadd
Merge pull request #135495 from tosi3k/skip-last-pod-deletion
Skip last victim in async preemption if any prior Pod preemption failed
2025-12-17 22:36:28 -08:00
Kubernetes Prow Robot
62db4db266
Merge pull request #135489 from ania-borowiec/update_comment
Update async preemption comment to reflect the current state of the code
2025-12-17 22:36:13 -08:00
Kubernetes Prow Robot
c5a0c31294
Merge pull request #135484 from bart0sh/PR209-improve-balanced-allocation-coverage
Extended resources unit tests: cover DRA resources
2025-12-17 22:36:06 -08:00
Kubernetes Prow Robot
8fcb1fd4cf
Merge pull request #135455 from carlory/csiNodeIDMaxLength
cleanup csiNodeIDMaxLength
2025-12-17 22:35:44 -08:00
Kubernetes Prow Robot
43cfcac7cc
Merge pull request #135434 from yliaog/quota_abuse
Fixes the loophole that allows users to work around the resource quota set by the system admin
2025-12-17 22:35:28 -08:00
Filip Křepinský
7aa186fa0a
schedule pod availability checks at the correct time in StatefulSets (#135428)
* wire now (time) to the availability checks in the StatefulSet controller

- this helps to make the controller reconciliation consistent

* schedule pod availability checks at the correct time in StatefulSets

* replace "k8s.io/klog/v2/ktesting" with "k8s.io/kubernetes/test/utils/ktesting"

for advanced features (e.g. Eventually)

* add StatefulSetAvailabilityCheck test
2025-12-17 22:35:21 -08:00
Kubernetes Prow Robot
7795655410
Merge pull request #135402 from xigang/pv_controller
PV controller: Add rate-limiting queues and improve error handling
2025-12-17 21:43:02 -08:00
Kubernetes Prow Robot
1a3d8712f3
Merge pull request #135394 from brejman/adhoc-interpodaffinity-pending-pod-update
Fix queue hint for interpodaffinity when target pod is updated
2025-12-17 21:42:46 -08:00
Kubernetes Prow Robot
1757c6358b
Merge pull request #135368 from vshkrabkov/fix/scheduler-queue-metric-sync
Scheduler: Fix GatedPods metric desync in unschedulable queue
2025-12-17 21:42:00 -08:00
Kubernetes Prow Robot
9e055e5945
Merge pull request #135355 from lalitc375/basic
Add k8s:optional on Device.Basic
2025-12-17 21:41:37 -08:00
Kubernetes Prow Robot
285eb9fdba
Merge pull request #135325 from brejman/issue-134393
Fix queue hint for inter-pod anti-affinity
2025-12-17 20:01:02 -08:00
Kubernetes Prow Robot
907f9d26c7
Merge pull request #135302 from liyuerich/commentstartapidiscovery
enable commentstart check on apidiscovery API group
2025-12-17 20:00:46 -08:00
Kubernetes Prow Robot
811f4d30f9
Merge pull request #135251 from bart0sh/PR208-migrate-allocation-to-contextual-logging
kubelet: migrate allocation to contextual logging
2025-12-17 20:00:16 -08:00
Kubernetes Prow Robot
c0c81a4258
Merge pull request #135249 from bart0sh/PR207-migrate-container-manager-to-contextual-logging
kubelet: migrate container manager to contextual logging
2025-12-17 20:00:01 -08:00
Kubernetes Prow Robot
8f8bf5640b
Merge pull request #135217 from VijetaPriya47/fix-nodeipam-sync-test-goroutine-leak
Fix goroutine leak in TestNodeSyncResync
2025-12-17 19:59:39 -08:00
Kubernetes Prow Robot
05ae5a310c
Merge pull request #135126 from mrvarmazyar/add-pod-flush-metric
scheduler: add metric for pods scheduled after flush
2025-12-17 19:59:16 -08:00
Kubernetes Prow Robot
a31e6a115f
Merge pull request #132402 from astraw99/ftr-add-node-arch
Add node `arch` in the kubectl get node output
2025-12-17 18:33:14 -08:00
yliao
3e34de29c4 fixed the loophole that allows user to get around resource quota set by system admin 2025-12-18 00:56:20 +00:00
Kubernetes Prow Robot
8362ec56da
Merge pull request #134441 from humblec/kubelet-volume
Record proper orphaned pod cleanup error based on the system call
2025-12-17 16:26:18 -08:00
Kubernetes Prow Robot
cc4bccf6a1
Merge pull request #134422 from jaehanbyun/ingressclass-default-marker
ingressclass: show (default) marker for default IngressClass
2025-12-17 16:26:11 -08:00
Kubernetes Prow Robot
1078cf59b9
Merge pull request #133964 from K-Diger/fix/dra-plugin-unreachable-code
kubelet: refactor DRA plugin health client initialization
2025-12-17 16:26:03 -08:00
Kubernetes Prow Robot
1187749524
Merge pull request #133719 from carlory/removeMaxAttachLimit
clean up removeMaxAttachLimit
2025-12-17 16:25:40 -08:00
Kubernetes Prow Robot
97e95711c5
Merge pull request #133654 from kwohlfahrt/kubelet-cert
Fix kubelet certificate reload when connecting by IP address
2025-12-17 16:25:32 -08:00
Kubernetes Prow Robot
e14cdadc5a
Merge pull request #132807 from iholder101/feature/ImageVolumeWithDigest
[KEP-5365] Implement Image Volume with Digest
2025-12-17 16:25:17 -08:00
Bartosz
d6d8639349
Fix queue hint for interpod antiaffinity 2025-12-16 13:01:15 +00:00
Bartosz
145adcd522
Fix queue hint for interpodaffinity when target pod is updated 2025-12-16 12:57:50 +00:00
Vlad Shkrabkov
5be527b78e Scheduler: Fix GatedPods metric desync in unschedulable queue
Previously, when a Pod residing in the 'unschedulablePods' queue was updated and subsequently rejected by PreEnqueue plugins (returning 'Wait'), the logic in 'moveToActiveQ' would return early because the Pod was already present in the queue.

This caused the 'scheduler_gated_pods_total' metric to fail to increment, leading to metric inconsistencies (and potentially negative values upon Pod deletion).

This change adds a check to detect the transition from Ungated to Gated. If detected, the Pod is removed and re-added to the queue to ensure metrics are correctly swapped (Unschedulable-- and Gated++).

Added regression test 'TestSchedulingQueueMetrics_UngatedToGated' to verify the fix.

Signed-off-by: Vlad Shkrabkov <vshkrabkov@google.com>
2025-12-15 11:47:22 +00:00
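The remove-and-re-add approach described in this commit message can be illustrated with a toy model. This is only a sketch under stated assumptions, not the scheduler's code: `queue`, `queueMetrics`, `add`, `remove`, and `update` are hypothetical names, and the two integer counters stand in for the real Prometheus metrics.

```go
package main

// queueMetrics is a hypothetical stand-in for the scheduler's queue gauges.
type queueMetrics struct {
	unschedulable int
	gated         int
}

// queue is a toy model of a pod queue that tracks, per pod, whether the pod
// is currently gated. It is not the real scheduling queue.
type queue struct {
	gatedByPod map[string]bool
	metrics    queueMetrics
}

func newQueue() *queue {
	return &queue{gatedByPod: make(map[string]bool)}
}

// add stores the pod and increments the counter matching its gated state.
func (q *queue) add(pod string, gated bool) {
	q.gatedByPod[pod] = gated
	if gated {
		q.metrics.gated++
	} else {
		q.metrics.unschedulable++
	}
}

// remove deletes the pod and decrements the counter it was counted under.
func (q *queue) remove(pod string) {
	gated, ok := q.gatedByPod[pod]
	if !ok {
		return
	}
	delete(q.gatedByPod, pod)
	if gated {
		q.metrics.gated--
	} else {
		q.metrics.unschedulable--
	}
}

// update re-evaluates a pod. Returning early merely because the pod is
// already queued would leave the counters stale, so a change of gated state
// is handled by removing and re-adding the pod, which swaps the counters.
func (q *queue) update(pod string, gatedNow bool) {
	was, ok := q.gatedByPod[pod]
	if ok && was == gatedNow {
		return // no transition, counters are already correct
	}
	q.remove(pod)
	q.add(pod, gatedNow)
}
```

In this model, a pod added as ungated and then updated to gated ends up counted once under `gated` and zero times under `unschedulable`, so a later `remove` brings both counters back to zero rather than below it.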
Ed Bartosh
1820dc7535 Fit tests: add DRA-aware test cases 2025-12-12 15:48:18 +02:00
Ed Bartosh
7860effc2c resourceAllocationScorer: add unit test for DRA nodeMatches 2025-12-12 15:48:13 +02:00
Ed Bartosh
02a39d6c1e Balanced allocation tests: cover DRA resources
- Added DRA-aware test cases
- Pulled shared DRA setup out into helper to keep tests DRY
- Added SignPod test
2025-12-12 13:51:19 +02:00
Antoni Zawodny
7577f84e79 Skip last victim in async preemption if any prior Pod preemption failed 2025-12-10 14:44:06 +01:00
Taahir Ahmed
8d4237fde8 kubelet: Fix nil panic in podcertificatemanager
If the PCR that kubelet created gets deleted before it is issued, a
nil panic will be thrown while composing the error message.
2025-12-08 11:43:07 -08:00
k-diger
b255410b4f Remove duplicate connection management in DRA plugin Fixes 2025-12-05 01:34:44 +09:00
Antonio Ojea
51f614a156 ipallocator: handle errors correctly
The ipallocator was blindly assuming that all errors are retryable, which
causes the allocator to try to exhaust all the possibilities to
allocate an IP address.

If the error is not retryable, the allocator will generate as
many API calls as there are available IPs in the allocator, causing
CPU exhaustion since these requests come from inside the apiserver.

In addition to handling the error correctly, this patch also interprets the
error to return the right status code depending on the error type.

Co-authored-by: carlory <baofa.fan@daocloud.io>
2025-12-03 10:39:57 +00:00
Antonio Ojea
38e08c231c kube-proxy: optimize conntrack cleanup with O(n) flow filter
Previously, we created a separate filter for each stale flow,
resulting in O(n^2) complexity when deleting flows because the
netlink library iterates over all filters for each flow.

This change introduces a new filter backed by a `sets.Set` for O(1) lookup per flow.
This reduces the overall complexity of cleaning up stale entries to O(n).
2025-12-03 10:35:29 +00:00
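The complexity argument in this commit message can be sketched without the netlink library. The code below is an assumption-laden illustration: `flowKey`, `staleSetFilter`, and `countMatches` are hypothetical names, and a plain Go map stands in for the `sets.Set` mentioned in the message. One filter holds all stale flows, so each table entry costs a single hash lookup.

```go
package main

// flowKey is a hypothetical, simplified identity for a conntrack entry.
type flowKey struct {
	SrcIP, DstIP     string
	SrcPort, DstPort uint16
}

// staleSetFilter is a single filter backed by a set of stale flows.
// A map with empty struct values is used here as a stand-in for sets.Set.
type staleSetFilter struct {
	stale map[flowKey]struct{}
}

func newStaleSetFilter(stale []flowKey) *staleSetFilter {
	f := &staleSetFilter{stale: make(map[flowKey]struct{}, len(stale))}
	for _, k := range stale {
		f.stale[k] = struct{}{}
	}
	return f
}

// Match reports whether the flow is stale, using one O(1) set lookup.
func (f *staleSetFilter) Match(k flowKey) bool {
	_, ok := f.stale[k]
	return ok
}

// countMatches walks the table once and applies the single filter to each
// entry, so the total work is O(n) in the table size. With one filter per
// stale flow, each entry would instead be checked against every filter.
func countMatches(table []flowKey, f *staleSetFilter) int {
	matched := 0
	for _, k := range table {
		if f.Match(k) {
			matched++
		}
	}
	return matched
}
```

A struct key works here because all of its fields are comparable; a real flow identity would likely also need the protocol and address family.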
xigang
8f1ff1d8ce Refactor PV controller to use rate-limiting queues and improve error handling
Signed-off-by: xigang <wangxigang2014@gmail.com>
2025-12-01 19:11:52 +08:00
Davanum Srinivas
594ed6392b
Add explicit type to feature gate const declarations
Two feature gate constants were missing the explicit `featuregate.Feature`
type annotation, making them inconsistent with the rest of the file:
- ChangeContainerStatusOnKubeletRestart
- StatefulSetSemanticRevisionComparison

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2025-11-29 21:22:23 -05:00
Davanum Srinivas
1569ebc5a6
Bump github.com/opencontainers/cgroups to v0.0.6
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2025-11-28 16:22:46 -05:00
Ania Borowiec
0cf3d0e20a
Update comment to reflect the current state of the code 2025-11-27 22:10:02 +00:00
Adrian Moisey
dae1dbc1ff KEP-5311 - Revert RelaxedServiceNameValidation promote to beta 2025-11-27 20:52:35 +09:00
Mohammad Varmazyar
4c2fff1934 Address comments, log level, test assertion consistency and remove unnecessary locks in TestFlushUnschedulablePodsLeftoverSetsFlag 2025-11-26 14:08:05 +01:00
carlory
be6028b926
cleanup csiNodeIDMaxLength
Signed-off-by: carlory <baofa.fan@daocloud.io>
2025-11-26 17:56:56 +08:00
Mohammad Varmazyar
4f455c9c0d Refactor plugin clearing to use ClearRejectorPlugins method 2025-11-26 09:54:32 +01:00
Kai Wohlfahrt
2ba1b66b57 Fix kubelet certificate reload when connecting by IP
Currently, we set TLSConfig.Config.GetCertificate, but then also pass
certificate and key paths to http.Server.ListenAndServeTLS.

ListenAndServeTLS uses these paths to populate the TLS config Certificate
property. Then, when accepting connections, a non-nil Certificate is preferred
over GetCertificate if the ServerName is not set in ClientHelloInfo. Finally,
the Go TLS client doesn't set ServerName when connecting by IP. As a result,
when connecting to the kubelet by IP (e.g. to fetch pod logs), stale
certificates are served.

This patch passes empty certFile and keyFile arguments, to force the TLS
server to use the GetCertificate function.

This is done by clearing key/cert file config when setting GetCertificate as
suggested in PR review. This way, all downstream users of kubeDeps.TLSConfig
will do the right thing automatically.
2025-11-25 20:44:19 +01:00
kita456
950dfd612b test: add test for Ingress Update 2025-11-26 00:31:55 +09:00
Mohammad Varmazyar
d64e09c697 Clear plugins at handleSchedulingFailure and preserve both at Pop 2025-11-24 20:32:41 +01:00