Commit graph

135510 commits

Author SHA1 Message Date
Kevin Hannon
94cc00aebb set KEP-5440 to enabled by default and add two e2e tests confirming behavior 2026-02-11 23:17:21 -05:00
Kubernetes Prow Robot
311071d300
Merge pull request #133427 from natasha41575/admitHandler
[FG:InPlacePodVerticalScaling] refactor allocation feasibility check into its own admitHandler
2026-02-12 00:10:00 +05:30
Kubernetes Prow Robot
7b21ce7c9a
Merge pull request #136905 from bart0sh/PR222-e2e-fix-extended-resource-flake
Fix extended resource flake
2026-02-11 21:56:02 +05:30
Ed Bartosh
0fc147ed02 DRA: e2e: refactor device plugin deployment
Refactored deployDevicePlugin, added undeployDevicePlugin

- Add skipCleanup parameter to control cleanup behavior
- Wait for device plugin resources to become allocatable on deploy
- Add undeployDevicePlugin function to properly clean up and wait for
  resources to be removed from node allocatable capacity
- Update tests to use refactored functions and remove duplicate code

These changes reduce test flakiness by explicitly waiting for expected
node states before proceeding with test operations.
2026-02-11 16:11:24 +02:00
Ed Bartosh
02485d02ea DRA: e2e: move util_sampledevice.go to e2e/node/framework
- Moved sample device plugin constants and helper code to
test/e2e/node/framework, so that both deviceplugin and DRA tests can
use it without creating an e2e -> e2e_node dependency.

- Moved the SampleDevsAmount constant from
test/e2e_node/device_plugin_test.go
2026-02-11 16:07:17 +02:00
Kubernetes Prow Robot
8b09f925a7
Merge pull request #130918 from iPraveenParihar/e2e/add-snapshot-metadata
Add E2E tests for CSI Snapshot Metadata functionality
2026-02-11 19:10:01 +05:30
Kubernetes Prow Robot
45b924818c
Merge pull request #136719 from pohly/e2e-gce-speedups
e2e-gce: improve performance of cluster shutdown
2026-02-11 18:12:01 +05:30
Patrick Ohly
f7a0c01d60 cluster gce: parallelize cluster cleanup a bit more
This brings down the execution time for ./hack/e2e-internal/e2e-down.sh in
e2e-gce from ~7:30min to slightly less than 5 minutes.
2026-02-11 10:50:30 +01:00
Kubernetes Prow Robot
90a76aaa9a
Merge pull request #136846 from carlory/update-cri-losing-support
kubelet: defer the removal of the deprecated configuration flags (and the related fallback behavior) from 1.36 to 1.37 to align with containerd v1.7 support
2026-02-11 09:26:15 +05:30
Kubernetes Prow Robot
9dc55d7d9e
Merge pull request #135729 from yangjunmyfm192085/fixe2e2
test/e2e: fix the failing e2e test case `should support seccomp default, which is unconfined [LinuxOnly]`
2026-02-11 09:26:08 +05:30
Kubernetes Prow Robot
fce5bc2854
Merge pull request #134316 from xigang/node_controller_pod
node_lifecycle_controller: fix processing deleted pod events, which are currently missed
2026-02-11 09:26:00 +05:30
Kubernetes Prow Robot
a79d324427
Merge pull request #136927 from BenTheElder/clean-fix
add dockerized go cache chmod to `make clean`
2026-02-11 08:38:08 +05:30
Kubernetes Prow Robot
7b999810bf
Merge pull request #136925 from michaelasp/pipeFeatureGate
Pipe feature gate of unlockWhileProcessing
2026-02-11 08:38:01 +05:30
Kubernetes Prow Robot
b236451928
Merge pull request #136928 from BenTheElder/cleanup-unused-env
remove unused env from common.sh
2026-02-11 07:10:00 +05:30
Kubernetes Prow Robot
46ac9df8c8
Merge pull request #135675 from richabanker/merged-discovery
Peer-aggregated discovery: add GV Exclusion Manager
2026-02-11 06:12:07 +05:30
Kubernetes Prow Robot
eb09a3c23e
Merge pull request #135395 from pohly/apimachinery-wait-for-cache-sync
apimachinery + client-go + device taint eviction unit test: context-aware Start/WaitFor, waiting through channels
2026-02-11 06:11:59 +05:30
杨军10092085
d94808665c fix the failing e2e test case `should support seccomp default, which is unconfined [LinuxOnly]` 2026-02-11 08:17:31 +08:00
Kubernetes Prow Robot
99d4b4d426
Merge pull request #135256 from natasha41575/pod-gen-field
remove Pod Generation feature gate from field descriptions
2026-02-11 05:17:59 +05:30
Benjamin Elder
2257cdc413 remove unused env from common.sh
I searched the whole repo for usage with no hits, these are stale from the rsync code
2026-02-10 15:16:56 -08:00
Benjamin Elder
a15b558fd6 add dockerized go cache to make clean 2026-02-10 14:54:44 -08:00
Michael Aspinwall
718ebb6dfc Pipe feature gates 2026-02-10 22:39:18 +00:00
Natasha Sarkar
d5dabfcd65 remove Pod Generation feature gate from field descriptions 2026-02-10 21:33:09 +00:00
Kubernetes Prow Robot
870e2928bc
Merge pull request #136716 from yonizxz/concurrent-node-syncs-split
Split from concurrent-node-syncs a separate flag for node status updates
2026-02-11 03:00:10 +05:30
Kubernetes Prow Robot
c9534cbb2e
Merge pull request #134981 from haircommander/drop-cpu-load
kubelet: drop cpu load metrics from container metrics test
2026-02-11 03:00:01 +05:30
Natasha Sarkar
0010062c24 remove unused allocatedPod var 2026-02-10 20:28:40 +00:00
Natasha Sarkar
01d42901e9 remove allocationManager's unused dependency on nodeConfig 2026-02-10 20:26:49 +00:00
Natasha Sarkar
676be66993 move resize infeasibility check into an admitHandler 2026-02-10 20:26:49 +00:00
Natasha Sarkar
694c998fb1 add 'operation' field to lifecycle.PodAdmitAttributes 2026-02-10 20:26:48 +00:00
Kubernetes Prow Robot
3e15c6fe2f
Merge pull request #136911 from pohly/dra-e2e-data-race
DRA E2E: fix data race in test driver
2026-02-11 01:24:08 +05:30
Kubernetes Prow Robot
1bb2e12490
Merge pull request #136734 from sunya-ch/sunya-ch/fix-gather-shared-id
Fix missing GetSharedDeviceIDs bug in GatherAllocatedState
2026-02-11 01:24:00 +05:30
Kubernetes Release Robot
f42571572d CHANGELOG: Update directory for v1.32.12 release 2026-02-10 19:06:20 +00:00
Kubernetes Release Robot
5fbe09a2fc CHANGELOG: Update directory for v1.33.8 release 2026-02-10 19:05:28 +00:00
Kubernetes Release Robot
3403737ffe CHANGELOG: Update directory for v1.34.4 release 2026-02-10 19:04:53 +00:00
Kubernetes Release Robot
699d840eb8 CHANGELOG: Update directory for v1.35.1 release 2026-02-10 19:03:35 +00:00
Jonathan Yaniv
0dbf8667cc Split from concurrent-node-syncs a separate flag for node status updates 2026-02-10 18:57:52 +00:00
Kubernetes Prow Robot
65f09e605c
Merge pull request #136826 from alvaroaleman/bumpv0.32
Bump structured merge diff to v6.3.2
2026-02-11 00:08:13 +05:30
Kubernetes Prow Robot
e7f26c678a
Merge pull request #136339 from ffromani/deprecate-customcpucfsquota-fg-not-feature
move to GA the `CustomCPUCFSquota` feature gate (was: deprecate the FG, not the feature)
2026-02-11 00:08:04 +05:30
Kubernetes Prow Robot
7f890ab7ad
Merge pull request #136802 from pohly/fix-data-race-refs-populaterefs
Fix data race in PopulateRefs by copying Items and AdditionalProperties
2026-02-10 23:16:00 +05:30
Patrick Ohly
5ff323de79 client-go informers: context-aware Start + WaitForCacheSync
Passing a context to StartWithContext enables context-aware reflector
logging. This is the main remaining source of log spam (output to stderr
instead of per-test logger) in controller unit tests.

WaitForCacheSyncWithContext takes advantage of the new cache.WaitFor +
NamedHasSynced functionality to finish "immediately" (= no virtual time
passed) in a synctest bubble. While at it, the return type gets improved so
that a failure is easier to handle.
2026-02-10 17:06:47 +01:00
Patrick Ohly
fdcbb6cba9 client-go cache: wait for cache sync via channels, better logging
The main advantage is that waiting on channels creates a causal relationship
between goroutines which is visible to synctest. When a controller in a
synctest bubble does a WaitFor in a test's background goroutine for the
controller, the test can use synctest.Wait to wait for completion of cache
sync, without requiring any test specific "has controller synced" API. Without
this, the test had to poll or otherwise wait for the controller.

The polling in WaitForCacheSync moved the virtual clock forward by a random
amount, depending on how often it had to check in wait.Poll. Now tests can be
written such that all events during a test happen at a predictable time. This
will be demonstrated in a separate commit for the
pkg/controller/devicetainteviction unit test.

The benefit for normal production is immediate continuation when the last
informer is synced (not really a problem, but still...) and, more importantly,
nicer logging thanks to the names associated with the thing that is being
waited for. The caller decides whether logging is enabled or disabled and
describes what is being waited for (typically informer caches, but maybe also
event handlers or even something else entirely as long as it implements the
DoneChecker interface).

Before:

    Waiting for caches to sync
    Caches are synced

After:

    Waiting for="cache and event handler sync"
    Done waiting for="cache and event handler sync" instance="SharedIndexInformer *v1.Pod"
    Done waiting for="cache and event handler sync" instance="SharedIndexInformer *v1.ResourceClaim"
    Done waiting for="cache and event handler sync" instance="SharedIndexInformer *v1.ResourceSlice"
    Done waiting for="cache and event handler sync" instance="SharedIndexInformer *v1.DeviceClass"
    Done waiting for="cache and event handler sync" instance="SharedIndexInformer *v1alpha3.DeviceTaintRule"
    Done waiting for="cache and event handler sync" instance="SharedIndexInformer *v1.ResourceClaim + event handler k8s.io/kubernetes/pkg/controller/devicetainteviction.(*Controller).Run"
    Done waiting for="cache and event handler sync" instance="SharedIndexInformer *v1.Pod + event handler k8s.io/kubernetes/pkg/controller/devicetainteviction.(*Controller).Run"
    Done waiting for="cache and event handler sync" instance="SharedIndexInformer *v1alpha3.DeviceTaintRule + event handler k8s.io/kubernetes/pkg/controller/devicetainteviction.(*Controller).Run"
    Done waiting for="cache and event handler sync" instance="SharedIndexInformer *v1.ResourceSlice + event handler k8s.io/kubernetes/pkg/controller/devicetainteviction.(*Controller).Run"

The "SharedIndexInformer *v1.Pod" is also how this appears in metrics.
2026-02-10 17:05:51 +01:00
Kubernetes Prow Robot
2a9b8baab7
Merge pull request #136787 from ahmedtd/bump-ctb
Push ClusterTrustBundles v1beta1 deprecation to 1.37
2026-02-10 20:30:13 +05:30
Patrick Ohly
0511a75685 DRA E2E: fix data race in test driver
UnprepareResourceClaims was not locking the mutex and thus raced with
GetPreparedResources, as found when data race detection was enabled in
kubernetes-kind-dra-all.

The locking gets added at the same level as in PrepareResourceClaims, i.e. in
the underlying implementation right before doing the actual work.

    WARNING: DATA RACE
    Write at 0x00c0047e03c0 by goroutine 9325:
      runtime.mapdelete()
          runtime/map_swiss.go:139 +0x0
      k8s.io/kubernetes/test/e2e/dra/test-driver/app.(*ExamplePlugin).nodeUnprepareResource()
          k8s.io/kubernetes/test/e2e/dra/test-driver/app/kubeletplugin.go:483 +0x391
      k8s.io/kubernetes/test/e2e/dra/test-driver/app.(*ExamplePlugin).UnprepareResourceClaims()
          k8s.io/kubernetes/test/e2e/dra/test-driver/app/kubeletplugin.go:496 +0x19d
      k8s.io/dynamic-resource-allocation/kubeletplugin.(*nodePluginImplementation).NodeUnprepareResources()
          k8s.io/dynamic-resource-allocation/kubeletplugin/draplugin.go:941 +0x4c1
      k8s.io/kubelet/pkg/apis/dra/v1._DRAPlugin_NodeUnprepareResources_Handler.func1()
          k8s.io/kubelet/pkg/apis/dra/v1/api_grpc.pb.go:181 +0xbe
      k8s.io/kubernetes/test/e2e/dra/test-driver/app.(*ExamplePlugin).recordGRPCCall()
          k8s.io/kubernetes/test/e2e/dra/test-driver/app/kubeletplugin.go:523 +0x2a6
      k8s.io/kubernetes/test/e2e/dra/test-driver/app.(*ExamplePlugin).recordGRPCCall-fm()
          <autogenerated>:1 +0x8f
      google.golang.org/grpc.getChainUnaryHandler.func1()
          google.golang.org/grpc@v1.78.0/server.go:1241 +0x23c
      k8s.io/kubernetes/test/e2e/dra/utils.(*Driver).interceptor()
          k8s.io/kubernetes/test/e2e/dra/utils/deploy.go:1045 +0x37d
      k8s.io/kubernetes/test/e2e/dra/utils.(*Driver).SetUp.func8()
          k8s.io/kubernetes/test/e2e/dra/utils/deploy.go:657 +0xb2
      google.golang.org/grpc.getChainUnaryHandler.func1.getChainUnaryHandler.1()
          google.golang.org/grpc@v1.78.0/server.go:1241 +0x11e
      k8s.io/dynamic-resource-allocation/kubeletplugin.(*grpcServer).interceptor()
          k8s.io/dynamic-resource-allocation/kubeletplugin/nonblockinggrpcserver.go:157 +0x545
      k8s.io/dynamic-resource-allocation/kubeletplugin.(*grpcServer).interceptor-fm()
          <autogenerated>:1 +0x8f
      google.golang.org/grpc.getChainUnaryHandler.func1()
          google.golang.org/grpc@v1.78.0/server.go:1241 +0x23c
      k8s.io/dynamic-resource-allocation/kubeletplugin.startGRPCServer.unaryContextInterceptor.func2()
          k8s.io/dynamic-resource-allocation/kubeletplugin/nonblockinggrpcserver.go:99 +0x7e
      google.golang.org/grpc.NewServer.chainUnaryServerInterceptors.chainUnaryInterceptors.func1()
          google.golang.org/grpc@v1.78.0/server.go:1232 +0xe7
      k8s.io/kubelet/pkg/apis/dra/v1._DRAPlugin_NodeUnprepareResources_Handler()
          k8s.io/kubelet/pkg/apis/dra/v1/api_grpc.pb.go:183 +0x1e6
      google.golang.org/grpc.(*Server).processUnaryRPC()
          google.golang.org/grpc@v1.78.0/server.go:1428 +0x1a69
      google.golang.org/grpc.(*Server).handleStream()
          google.golang.org/grpc@v1.78.0/server.go:1832 +0x185b
      google.golang.org/grpc.(*Server).serveStreams.func2.1()
          google.golang.org/grpc@v1.78.0/server.go:1063 +0x149

    Previous read at 0x00c0047e03c0 by goroutine 9257:
      runtime.mapIterStart()
          runtime/map_swiss.go:160 +0x0
      k8s.io/kubernetes/test/e2e/dra/test-driver/app.(*ExamplePlugin).GetPreparedResources()
          k8s.io/kubernetes/test/e2e/dra/test-driver/app/kubeletplugin.go:506 +0x112
      k8s.io/kubernetes/test/e2e/dra/test-driver/app.(*ExamplePlugin).GetPreparedResources-fm()
          <autogenerated>:1 +0x33
      runtime.call16()
          runtime/asm_amd64.s:774 +0x42
      reflect.Value.Call()
          reflect/value.go:365 +0xb5
      github.com/onsi/gomega/internal.(*AsyncAssertion).buildActualPoller.func3()
          github.com/onsi/gomega@v1.39.0/internal/async_assertion.go:337 +0x244
      github.com/onsi/gomega/internal.(*AsyncAssertion).match()
          github.com/onsi/gomega@v1.39.0/internal/async_assertion.go:560 +0xe01
      github.com/onsi/gomega/internal.(*AsyncAssertion).Should()
          github.com/onsi/gomega@v1.39.0/internal/async_assertion.go:145 +0xc4
      k8s.io/kubernetes/test/e2e/dra.init.func1.2.4()
          k8s.io/kubernetes/test/e2e/dra/dra.go:314 +0x4fc
      k8s.io/kubernetes/test/e2e/dra.init.func1.2.4()
          k8s.io/kubernetes/test/e2e/dra/dra.go:304 +0x204
      github.com/onsi/ginkgo/v2/internal.extractBodyFunction.func2()
          github.com/onsi/ginkgo/v2@v2.27.4/internal/node.go:517 +0x5d
      github.com/onsi/ginkgo/v2/internal.(*Suite).runNode.func3()
          github.com/onsi/ginkgo/v2@v2.27.4/internal/suite.go:945 +0x6ed
2026-02-10 15:20:48 +01:00
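The fix described in 0511a75685, taking the mutex in the unprepare path so that it no longer races with readers, follows a common Go pattern. The sketch below is a simplified illustration with hypothetical names (`examplePlugin`, `prepare`, `unprepare`, `preparedClaims`); it is not the test driver's actual code.

```go
package main

import (
	"fmt"
	"sync"
)

// examplePlugin is a hypothetical, simplified stand-in for a plugin
// that tracks prepared claims in a map.
type examplePlugin struct {
	mu       sync.Mutex
	prepared map[string]struct{} // guarded by mu
}

// prepare records a claim under the lock.
func (p *examplePlugin) prepare(claimUID string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.prepared[claimUID] = struct{}{}
}

// unprepare takes the same lock as prepare. Without it, the map delete
// would race with the iteration in preparedClaims.
func (p *examplePlugin) unprepare(claimUID string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	delete(p.prepared, claimUID)
}

// preparedClaims returns a copy of the claim IDs so that callers
// never touch the map outside the lock.
func (p *examplePlugin) preparedClaims() []string {
	p.mu.Lock()
	defer p.mu.Unlock()
	out := make([]string, 0, len(p.prepared))
	for uid := range p.prepared {
		out = append(out, uid)
	}
	return out
}

func main() {
	p := &examplePlugin{prepared: map[string]struct{}{}}
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		uid := fmt.Sprintf("claim-%d", i)
		p.prepare(uid)
		wg.Add(2)
		// Concurrent writer and reader, as in the reported race.
		go func() { defer wg.Done(); p.unprepare(uid) }()
		go func() { defer wg.Done(); _ = p.preparedClaims() }()
	}
	wg.Wait()
	fmt.Println("remaining:", len(p.preparedClaims()))
}
```

Running a program like this with `go run -race` is one way to check that the writer and the reader are properly synchronized; removing the lock from `unprepare` should make the race detector report a conflict similar to the one quoted in the commit message.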
Kubernetes Prow Robot
01b283ad45
Merge pull request #136907 from aojea/ipaddress_flake
fix flake on ipaddress allocator integration test
2026-02-10 19:30:01 +05:30
Patrick Ohly
45251e5f65 client-go cache: allow passing name+logger to DeltaFIFO, RealFIFO and Reflector
This improves logging and enables more informative waiting for cache sync in a
following commit. It addresses one klog.TODO in the Reflector.

The RealFIFOOptions and InformerOptions structs get extended the same way as
DeltaFIFOOptions before: a logger may be set, but it's not required. This is
not an API break.

That the name has to be passed separately is a bit annoying at first glance
because it could also be set directly on the logger through WithName, but
keeping it separate is better:
- name can be set without providing a logger
- name can be defaulted
- less code in the caller when passing through a logger and adding
  the name only in the field
- last but not least, extracting the name is not supported in a portable
  manner by logr

All in-tree references in production code get updated.

While at it, logging in the fifos gets updated to follow best practices: if
some code encounters an abnormal situation and then continues, it should use
utilruntime.HandleErrorWithLogger instead of normal error logging.

Existing "logger" fields get moved to the top because that is a more common
place for such a read-only field.
2026-02-10 13:48:30 +01:00
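The options pattern described in 45251e5f65, an optional logger plus a separately passed name that can be defaulted, might look roughly like the following sketch. It uses the standard library's `log/slog` in place of klog/logr, and `queueOptions`, `queue`, `newQueue`, and the default name `"queue"` are hypothetical, not the client-go types.

```go
package main

import (
	"fmt"
	"log/slog"
	"os"
)

// queueOptions is a hypothetical options struct. Both fields are optional,
// so adding them does not break existing callers.
type queueOptions struct {
	// Logger is used for all output; a default is used if nil.
	Logger *slog.Logger
	// Name identifies the instance in logs; it can be set without a logger.
	Name string
}

// queue keeps the read-only logger field at the top.
type queue struct {
	logger *slog.Logger
	name   string
}

// newQueue applies defaults for unset options.
func newQueue(opts queueOptions) *queue {
	q := &queue{logger: opts.Logger, name: opts.Name}
	if q.logger == nil {
		q.logger = slog.Default()
	}
	if q.name == "" {
		q.name = "queue" // assumed default, for illustration only
	}
	return q
}

// reportUnexpected logs an abnormal situation together with the instance
// name, after which the caller is expected to continue.
func (q *queue) reportUnexpected(msg string) {
	q.logger.Error(msg, "name", q.name)
}

func main() {
	// Name only: the logger is defaulted.
	pods := newQueue(queueOptions{Name: "pods"})

	// Logger only: the name is defaulted.
	logger := slog.New(slog.NewTextHandler(os.Stderr, nil))
	anon := newQueue(queueOptions{Logger: logger})

	fmt.Println(pods.name, anon.name)
	anon.reportUnexpected("unexpected item in queue")
}
```

Keeping the name as its own field means it can be attached as a key/value pair at log time, without requiring the caller to build a derived logger first.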
Antonio Ojea
2cfc90672a
fix flake on ipaddress allocator integration test
The test needs to account for the time it takes the Delete operation to
propagate to the ipallocator informer; otherwise the allocator can fail
with a range-full error, failing the test.

Co-authored-by: hiirrxnn <hiren2004sharma@gmail.com>
2026-02-10 10:34:47 +00:00
Kubernetes Prow Robot
59cddedb04
Merge pull request #136901 from Phaow/vac-fix
test: bring back the VAC roll-forward test
2026-02-10 15:46:05 +05:30
Kubernetes Prow Robot
467099411d
Merge pull request #136898 from carlory/kubeadm-ContainerRuntimeVersion-1-37
kubeadm: bump the version in the ContainerRuntimeVersionCheck warning message from 1.36 to 1.37
2026-02-10 14:48:18 +05:30
Kubernetes Prow Robot
76b4a9019c
Merge pull request #136326 from bart0sh/PR218-migrate-kubelet_node_status-to-contextual-logging
Migrate kubelet_node_status* to contextual logging
2026-02-10 14:48:10 +05:30
Kubernetes Prow Robot
65b1000a7d
Merge pull request #135749 from novahe/fix-defer-latency
Fix: duration metric recording incorrect values
2026-02-10 14:48:01 +05:30
Kubernetes Prow Robot
44dc4cb68c
Merge pull request #136888 from neolit123/revert-136130-kubeadm_use_newclientset
Revert "kubeadm: switch tests to NewClientset"
2026-02-10 13:00:07 +05:30