Commit graph

2412 commits

Author SHA1 Message Date
Kubernetes Prow Robot
3520485f85
Merge pull request #139503 from princepereira/ppereira-create-lb-failure
Handling syscall failures when hns is not running.
2026-06-13 00:52:47 +05:30
Prince Pereira
bc9e4c12ef Handling syscall failures when hns is not running. 2026-06-12 17:10:32 +05:30
Kubernetes Prow Robot
dac71e780a
Merge pull request #139629 from Bafff/fix/conntrack-cleanup-no-endpoints
kube-proxy: clear stale conntrack entries for UDP services with no endpoints
2026-06-12 13:26:49 +05:30
Baf
fd81afe040 kube-proxy: clear stale conntrack entries for UDP services with no endpoints
The conntrack reconciler skips services without serving endpoints, so
conntrack entries established while endpoints existed are never removed
when a UDP service scales down to zero. The REJECT (iptables) / reject
(nftables) rule installed for such services does not cover those flows:
they are DNATed to the deleted endpoint IP before the rule, which
matches on the service IP, can be evaluated. One-way UDP senders (e.g.
statsd clients) refresh the 30s conntrack timeout with every packet, so
the stale flows blackhole traffic to the deleted pod IP indefinitely;
recovery only happens when the service gets an endpoint again.

This was handled before the reconciler rewrite (kubernetes#127318):
the event-based cleanup cleared entries for every deleted UDP endpoint
regardless of how many endpoints remained.

Process services with an empty endpoints set instead of skipping them,
so every entry directed to their ClusterIP, LoadBalancer IP and
ExternalIP frontends is treated as stale and deleted.

NodePort cleanup is still skipped for services without serving
endpoints: NodePort entries are matched on the destination port only,
and with an empty endpoints set that would also remove UDP flows not
owned by kube-proxy (e.g. traffic to an unrelated host on the same
port).
2026-06-11 19:01:48 +01:00
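The rule described in the commit above can be illustrated with a minimal sketch. This is not the kube-proxy reconciler: the `entry` and `service` types and the `isStale` function are hypothetical simplifications introduced here only to show why frontend IP entries become stale when the endpoints set is empty while NodePort entries are left alone.

```go
package main

import "fmt"

// entry is a hypothetical, simplified UDP conntrack entry.
type entry struct {
	origDstIP   string // destination the client sent to (service frontend)
	origDstPort int
	replySrcIP  string // DNAT target, i.e. the endpoint IP
}

// service is a hypothetical, simplified view of one UDP service port.
type service struct {
	frontendIPs map[string]bool // ClusterIP, LoadBalancer IPs, ExternalIPs
	port        int
	nodePort    int             // 0 means no NodePort
	endpoints   map[string]bool // serving endpoint IPs
}

// isStale reports whether e should be deleted on behalf of svc.
func isStale(e entry, svc service) bool {
	switch {
	case svc.frontendIPs[e.origDstIP] && e.origDstPort == svc.port:
		// Frontend entries are matched on IP and port, so with an empty
		// endpoints set every such entry is stale.
		return !svc.endpoints[e.replySrcIP]
	case svc.nodePort != 0 && e.origDstPort == svc.nodePort:
		// NodePort entries are matched on port only. With no endpoints we
		// cannot tell them apart from unrelated flows, so keep them.
		if len(svc.endpoints) == 0 {
			return false
		}
		return !svc.endpoints[e.replySrcIP]
	}
	return false
}

func main() {
	statsd := service{
		frontendIPs: map[string]bool{"10.96.0.10": true},
		port:        8125,
		nodePort:    30125,
		endpoints:   map[string]bool{}, // scaled down to zero
	}
	viaClusterIP := entry{origDstIP: "10.96.0.10", origDstPort: 8125, replySrcIP: "10.244.1.5"}
	unrelated := entry{origDstIP: "192.0.2.7", origDstPort: 30125, replySrcIP: "192.0.2.7"}
	fmt.Println(isStale(viaClusterIP, statsd)) // true
	fmt.Println(isStale(unrelated, statsd))    // false
}
```

The addresses and ports are made-up example values.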
Prince Pereira
c62228debd Handling syscall failures when hns is not running. 2026-06-11 21:28:50 +05:30
Prince Pereira
92c1071355 Handling syscall failures when hns is not running. 2026-06-11 21:28:45 +05:30
Prince Pereira
57cfc45762 Handling syscall failures when hns is not running. 2026-06-11 21:14:49 +05:30
Prince Pereira
ba9ce27fd7 Handling syscall failures when hns is not running. 2026-06-11 21:14:42 +05:30
Lukasz Wojciechowski
a800e7077f Fix TestClassifyLBError test case for nil error
The classifyLBError function returns lbErrNone when err is nil,
but the test was incorrectly expecting lbErrOther.

Also add lbErrNone to TestLBErrorTypeConstants verification.
2026-06-10 23:54:35 +02:00
Prince Pereira
62f1d9062c Add Prometheus metrics for KubeProxy failed loadbalancer operations 2026-06-09 20:30:36 +00:00
Vinayak Mohanty
4d306fc68e
fix: truncate service comments in nftables to prevent length limit violations
refactor: rename svcPortNameString to svcPortComment and update test validation in nftables proxier
2026-06-06 01:49:08 +05:30
Adrian Moisey
f7265100cb
KEP-5495: Add featuregate for IPVS 2026-06-03 21:04:16 +02:00
ytcisme
3616ffa284 proxy/ipvs: avoid per-interface RTM_GETADDR dump in GetAllLocalAddressesExcept
GetAllLocalAddressesExcept previously iterated over net.Interfaces() and
called iface.Addrs() for each interface. iface.Addrs() internally performs
a full RTM_GETADDR netlink dump for the entire node and then filters in
user space. With many interfaces and many addresses (for example tens of
thousands of ClusterIPs bound to kube-ipvs0) the cost is
O(N_interfaces * N_addresses) and dominates syncProxyRules latency.

This change replaces the per-interface loop with a single
netlink.AddrList(nil, unix.AF_UNSPEC) call that dumps all addresses on
the node in one RTM_GETADDR request, then filters by LinkIndex in user
space. This makes the call O(N_addresses) and avoids the per-interface
fan-out.

On a production node with 251 interfaces and 19757 addresses, this
reduces GetAllLocalAddressesExcept latency from 34.8s to 60ms (~705x).
2026-05-24 13:31:10 +08:00
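The single-pass filtering step from the commit above can be sketched as follows. In the real change the address list comes from `netlink.AddrList(nil, unix.AF_UNSPEC)`; here a hypothetical `addr` type and a hand-written slice stand in for that dump so the example needs only the standard library. The function name `localAddressesExcept` is also an illustrative assumption.

```go
package main

import "fmt"

// addr is a hypothetical stand-in for one entry of a node-wide address dump.
type addr struct {
	linkIndex int
	ip        string
}

// localAddressesExcept walks the dump once and drops addresses that belong
// to excluded links, so the cost is O(N_addresses) regardless of how many
// interfaces exist.
func localAddressesExcept(all []addr, excluded map[int]bool) []string {
	var out []string
	for _, a := range all {
		if excluded[a.linkIndex] {
			continue
		}
		out = append(out, a.ip)
	}
	return out
}

func main() {
	// Link 7 plays the role of kube-ipvs0 in this made-up dump.
	dump := []addr{
		{1, "127.0.0.1"},
		{2, "192.0.2.10"},
		{7, "10.96.0.1"},
		{7, "10.96.0.10"},
	}
	fmt.Println(localAddressesExcept(dump, map[int]bool{7: true}))
	// prints [127.0.0.1 192.0.2.10]
}
```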
Joe Betz
9d65aeabb6
Explicitly disable validation-gen where not needed 2026-05-12 12:49:01 -04:00
Joe Betz
119a1460c1
Generate deepcopy 2026-05-11 12:27:56 -04:00
Dan Winship
0af2c0a767 Pass complete KubeProxyConfiguration to NewNodeManager 2026-04-29 10:35:14 -04:00
Dan Winship
6492838d08 Pass complete KubeProxyConfiguration to NewProxier methods
Simplify the interface between cmd/kube-proxy and the backends by
passing the complete KubeProxyConfiguration to the backend rather than
having kube-proxy need to know specifically which fields each backend
cares about.
2026-04-29 10:35:12 -04:00
Dan Winship
fe50a9420a Consistently import pkg/proxy/apis/config as kubeproxyconfig 2026-04-29 08:51:50 -04:00
Kubernetes Prow Robot
7d3b347d20
Merge pull request #138571 from aojea/proxy_large_cluster_nosync
kube-proxy: don't do full periodic syncs on large cluster mode
2026-04-27 03:12:46 +05:30
Kubernetes Prow Robot
5a555755ba
Merge pull request #138479 from yashsingh74/meta_proxier-test
test: add unit coverage for pkg/proxy/metaproxier dual-stack dispatch
2026-04-25 02:50:53 +05:30
Antonio Ojea
9daabbd6c7
kube-proxy: don't do full periodic syncs on large cluster mode
Periodic full syncs are reconcile loops that run just in case the
dataplane has somehow drifted; however, they carry a significant cost on
large clusters.

We can avoid performing a full sync when kube-proxy is in "largecluster"
mode. That mode already applies some optimizations, so it is reasonable to
avoid the penalty of a full sync for a "just in case" operation.

Signed-off-by: Antonio Ojea <aojea@google.com>
2026-04-24 13:35:49 +00:00
yashsingh74
4b881b0494
Adds unit tests for pkg/proxy/metaproxier, which implements dual-stack kube-proxy
Signed-off-by: yashsingh74 <yashsingh1774@gmail.com>
2026-04-23 20:35:02 +05:30
Prince Pereira
e74d21b288 Delete remote endpoint if it has same ip as local endpoint in the system. 2026-04-09 11:16:49 +05:30
Prince Pereira
c8ffcf005c Delete remote endpoint if it has same ip as local endpoint in the system. 2026-04-09 11:16:49 +05:30
shwetha-s-poojary
467ce9c6cd Fix TestGetConntrackMax to align with capped conntrack max values 2026-03-18 15:25:21 +05:30
Kubernetes Prow Robot
547b17cc13
Merge pull request #137293 from adrianmoisey/adrian-kep-5707
KEP-5707: Deprecate Service.spec.externalIPs
2026-03-18 05:19:54 +05:30
Kubernetes Prow Robot
6175c54954
Merge pull request #137002 from kairosci/gc-notfound-136525
Cap nf_conntrack limits to prevent excessive memory usage on high-core machines
2026-03-18 04:19:39 +05:30
Alessio Attilio
65564e2077 Cap nf_conntrack limits to prevent excessive memory usage on high-core machines 2026-03-13 21:42:05 +01:00
Kubernetes Prow Robot
a3ac5144e7
Merge pull request #137501 from danwinship/nftables-list-redux
Fix kube-proxy on systems with nft 1.1.3 (take 2)
2026-03-12 19:41:45 +05:30
Kubernetes Prow Robot
958c10e37e
Merge pull request #137370 from Nordix/fix-double-bind
fix(kube-proxy): fix health check binding failure in case of dual-stack
2026-03-12 03:01:43 +05:30
Alessio Attilio
117df3de4d pkg/proxy/nftables: fix kube-proxy crash with newer nftables versions
Fixes kube-proxy's nftables mode to work on systems with nft 1.1.3.
2026-03-11 10:50:07 -04:00
Kubernetes Prow Robot
05c01a2e80
Merge pull request #136499 from danwinship/nftables-hairpin-2
further nftables masquerading improvements
2026-03-07 03:32:17 +05:30
Patrick Ohly
b895ce734f golangci-lint: bump to logtools v0.10.1
This fixes a bug that caused log calls involving `klog.Logger` to not be
checked.

As a result we have to fix some code that is now considered faulty:

    ERROR: pkg/controller/serviceaccount/tokens_controller.go:382:1: A function should accept either a context or a logger, but not both. Having both makes calling the function harder because it must be defined whether the context must contain the logger and callers have to follow that. (logcheck)
    ERROR: func (e *TokensController) generateTokenIfNeeded(ctx context.Context, logger klog.Logger, serviceAccount *v1.ServiceAccount, cachedSecret *v1.Secret) ( /* retry */ bool, error) {
    ERROR: ^
    ERROR: pkg/controller/storageversionmigrator/storageversionmigrator.go:299:1: A function should accept either a context or a logger, but not both. Having both makes calling the function harder because it must be defined whether the context must contain the logger and callers have to follow that. (logcheck)
    ERROR: func (svmc *SVMController) runMigration(ctx context.Context, logger klog.Logger, gvr schema.GroupVersionResource, resourceMonitor *garbagecollector.Monitor, toBeProcessedSVM *svmv1beta1.StorageVersionMigration, listResourceVersion string) (err error, failed bool) {
    ERROR: ^
    ERROR: pkg/proxy/node.go:121:3: logging function "Error" should not use format specifier "%q" (logcheck)
    ERROR: 		klog.FromContext(ctx).Error(nil, "Timed out waiting for node %q to exist", nodeName)
    ERROR: 		^
    ERROR: pkg/proxy/node.go:123:3: logging function "Error" should not use format specifier "%q" (logcheck)
    ERROR: 		klog.FromContext(ctx).Error(nil, "Timed out waiting for node %q to be assigned IPs", nodeName)
    ERROR: 		^
    ERROR: pkg/scheduler/backend/queue/scheduling_queue.go:610:1: A function should accept either a context or a logger, but not both. Having both makes calling the function harder because it must be defined whether the context must contain the logger and callers have to follow that. (logcheck)
    ERROR: func (p *PriorityQueue) runPreEnqueuePlugin(ctx context.Context, logger klog.Logger, pl fwk.PreEnqueuePlugin, pInfo *framework.QueuedPodInfo, shouldRecordMetric bool) *fwk.Status {
    ERROR: ^
    ERROR: pkg/scheduler/framework/plugins/dynamicresources/extendeddynamicresources.go:286:1: A function should accept either a context or a logger, but not both. Having both makes calling the function harder because it must be defined whether the context must contain the logger and callers have to follow that. (logcheck)
    ERROR: func (pl *DynamicResources) deleteClaim(ctx context.Context, claim *resourceapi.ResourceClaim, logger klog.Logger) error {
    ERROR: ^
    ERROR: pkg/scheduler/framework/plugins/dynamicresources/extendeddynamicresources.go:499:1: A function should accept either a context or a logger, but not both. Having both makes calling the function harder because it must be defined whether the context must contain the logger and callers have to follow that. (logcheck)
    ERROR: func (pl *DynamicResources) waitForExtendedClaimInAssumeCache(
    ERROR: ^
    ERROR: pkg/scheduler/framework/plugins/dynamicresources/extendeddynamicresources.go:528:1: A function should accept either a context or a logger, but not both. Having both makes calling the function harder because it must be defined whether the context must contain the logger and callers have to follow that. (logcheck)
    ERROR: func (pl *DynamicResources) createExtendedResourceClaimInAPI(
    ERROR: ^
    ERROR: pkg/scheduler/framework/plugins/dynamicresources/extendeddynamicresources.go:592:1: A function should accept either a context or a logger, but not both. Having both makes calling the function harder because it must be defined whether the context must contain the logger and callers have to follow that. (logcheck)
    ERROR: func (pl *DynamicResources) unreserveExtendedResourceClaim(ctx context.Context, logger klog.Logger, pod *v1.Pod, state *stateData) {
    ERROR: ^
    ERROR: pkg/scheduler/framework/runtime/batch.go:171:1: A function should accept either a context or a logger, but not both. Having both makes calling the function harder because it must be defined whether the context must contain the logger and callers have to follow that. (logcheck)
    ERROR: func (b *OpportunisticBatch) batchStateCompatible(ctx context.Context, logger klog.Logger, pod *v1.Pod, signature fwk.PodSignature, cycleCount int64, state fwk.CycleState, nodeInfos fwk.NodeInfoLister) bool {
    ERROR: ^
    ERROR: staging/src/k8s.io/component-base/featuregate/feature_gate.go:890:4: Additional arguments to Info should always be Key Value pairs. Please check if there is any key or value missing. (logcheck)
    ERROR: 			logger.Info("Warning: SetEmulationVersionAndMinCompatibilityVersion will change already queried feature", "featureGate", feature, "oldValue", oldVal, newVal)
    ERROR: 			^
    ERROR: test/images/sample-device-plugin/sampledeviceplugin.go:108:2: logging function "Info" should not use format specifier "%s" (logcheck)
    ERROR: 	logger.Info("pluginSocksDir: %s", pluginSocksDir)
    ERROR: 	^
    ERROR: test/images/sample-device-plugin/sampledeviceplugin.go:123:2: logging function "Info" should not use format specifier "%s" (logcheck)
    ERROR: 	logger.Info("CDI_ENABLED: %s", cdiEnabled)
    ERROR: 	^

While waiting for this to merge, another call was added which also doesn't
follow conventions:

    ERROR: pkg/kubelet/kubelet.go:2454:1: A function should accept either a context or a logger, but not both. Having both makes calling the function harder because it must be defined whether the context must contain the logger and callers have to follow that. (logcheck)
    ERROR: func (kl *Kubelet) deletePod(ctx context.Context, logger klog.Logger, pod *v1.Pod) error {
    ERROR: ^

Contextual logging has been beta and enabled by default for several releases
now. It's mostly just a matter of wrapping up and declaring it GA. Therefore
the calls which invoke WithName or WithValues directly (which always has an
effect) are left as-is instead of being converted to use the klog wrappers
(which support disabling the effect). To allow that, the linter is
reconfigured to no longer complain about this anywhere.

The calls which would have to be fixed otherwise are:

    ERROR: pkg/kubelet/cm/dra/claiminfo.go:170:11: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 	logger = logger.WithName("dra-claiminfo")
    ERROR: 	         ^
    ERROR: pkg/kubelet/cm/dra/healthinfo.go:45:11: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 	logger = logger.WithName("dra-healthinfo")
    ERROR: 	         ^
    ERROR: pkg/kubelet/cm/dra/healthinfo.go:89:11: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 	logger = logger.WithName("dra-healthinfo")
    ERROR: 	         ^
    ERROR: pkg/kubelet/cm/dra/healthinfo.go:157:11: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 	logger = logger.WithName("dra-healthinfo")
    ERROR: 	         ^
    ERROR: pkg/kubelet/cm/dra/manager.go:175:12: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 	logger := klog.FromContext(ctx).WithName("dra-manager")
    ERROR: 	          ^
    ERROR: pkg/kubelet/cm/dra/manager.go:239:12: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 	logger := klog.FromContext(ctx).WithName("dra-manager")
    ERROR: 	          ^
    ERROR: pkg/kubelet/cm/dra/manager.go:593:12: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 	logger := klog.FromContext(ctx).WithName("dra-manager")
    ERROR: 	          ^
    ERROR: pkg/kubelet/cm/dra/manager.go:781:12: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 	logger := klog.FromContext(context.Background()).WithName("dra-manager")
    ERROR: 	          ^
    ERROR: pkg/kubelet/cm/dra/manager.go:898:12: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 	logger := klog.FromContext(ctx).WithName("dra-manager")
    ERROR: 	          ^
    ERROR: pkg/kubelet/cm/dra/manager_test.go:1638:15: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 				logger := klog.FromContext(streamCtx).WithName(st.Name())
    ERROR: 				          ^
    ERROR: pkg/kubelet/cm/dra/plugin/dra_plugin.go:77:12: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 	logger := klog.FromContext(ctx).WithName("dra-plugin")
    ERROR: 	          ^
    ERROR: pkg/kubelet/cm/dra/plugin/dra_plugin.go:108:12: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 	logger := klog.FromContext(ctx).WithName("dra-plugin")
    ERROR: 	          ^
    ERROR: pkg/kubelet/cm/dra/plugin/dra_plugin.go:161:12: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 	logger := klog.FromContext(ctx).WithName("dra-plugin")
    ERROR: 	          ^
    ERROR: staging/src/k8s.io/dynamic-resource-allocation/resourceslice/tracker/tracker.go:695:14: function "WithValues" should be called through klogr.LoggerWithValues (logcheck)
    ERROR: 			logger := logger.WithValues("device", deviceID)
    ERROR: 			          ^
    ERROR: test/integration/apiserver/watchcache_test.go:42:54: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 	etcd0URL, stopEtcd0, err := framework.RunCustomEtcd(klog.FromContext(ctx).WithName("etcd0"), "etcd_watchcache0", etcdArgs)
    ERROR: 	                                                    ^
    ERROR: test/integration/apiserver/watchcache_test.go:47:54: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 	etcd1URL, stopEtcd1, err := framework.RunCustomEtcd(klog.FromContext(ctx).WithName("etcd1"), "etcd_watchcache1", etcdArgs)
    ERROR: 	                                                    ^
    ERROR: test/integration/scheduler_perf/scheduler_perf.go:1149:12: function "WithName" should be called through klogr.LoggerWithName (logcheck)
    ERROR: 		logger = logger.WithName(tCtx.Name())
    ERROR: 		         ^
2026-03-04 12:08:18 +01:00
Tero Kauppinen
0659a346ea
fix(kube-proxy): fix health check binding failure in case of dual-stack
In dual-stack mode, kube-proxy tries to bind both the IPv4 and the IPv6
health check instances to the same address and port pair, which causes
the following error message in the log: 'bind: address already in use'.

Fix the issue by binding the IPv4 instance to a 'tcp4' socket and the IPv6
instance to a 'tcp6' socket.

Signed-off-by: Tero Kauppinen <tero.kauppinen@est.tech>
2026-03-03 14:20:27 +02:00
Dan Winship
475f9622c8 Squash nftables endpoint chains into service vmap
We only need a separate chain for the endpoints if the service uses affinity.
2026-03-02 11:05:52 -05:00
Dan Winship
e17963cb99 Do nftables hairpin handling centrally rather than per-endpoint 2026-03-02 11:05:52 -05:00
Dan Winship
aa3a30d134 Do clusterIP masquerading centrally rather than per-service 2026-03-02 11:05:52 -05:00
Dan Winship
75aab220b4 Add NodeName to all EndpointSlices in nftables proxier unit tests
Previously it was leaving NodeName unset in many cases. Give all of
the endpoints an explicit NodeName, making them explicitly local in
all the test cases that don't care either way, and explicitly
non-local in those test cases that did care but were previously just
relying on the fact that a nil NodeName would be treated as remote.
2026-03-02 11:05:50 -05:00
Adrian Moisey
80c6507cce
KEP-5707: Deprecate Service.spec.externalIPs 2026-03-01 11:49:27 +02:00
Dan Winship
ea8bad22e6 Revert "pkg/proxy/nftables: fix kube-proxy crash with newer nftables versions"
This reverts commit 72ef5b34a8.
2026-02-20 08:28:25 -05:00
Mads Jensen
bbbc09fb11 proxy/utils: Use net.JoinHostPort to format address. 2026-02-15 16:53:17 +01:00
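For context on the commit above, a short sketch of why `net.JoinHostPort` is preferable to manual string formatting: it brackets IPv6 literals so the port separator stays unambiguous. The `hostPort` wrapper and the example addresses are illustrative assumptions, not code from the commit.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// hostPort formats host and port so that IPv6 hosts are bracketed.
func hostPort(host string, port int) string {
	return net.JoinHostPort(host, strconv.Itoa(port))
}

func main() {
	fmt.Println(hostPort("10.0.0.1", 10256)) // 10.0.0.1:10256
	fmt.Println(hostPort("fd00::1", 10256))  // [fd00::1]:10256
}
```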
Kubernetes Prow Robot
5b63a8c68e
Merge pull request #136921 from dims/dump-from-utils
Move dump package from apimachinery to k8s.io/utils
2026-02-12 22:28:10 +05:30
Davanum Srinivas
550cc8645b
Move dump package from apimachinery to k8s.io/utils
Replace all imports of k8s.io/apimachinery/pkg/util/dump with
k8s.io/utils/dump across the repo. The apimachinery dump package
now contains deprecated wrapper functions that delegate to
k8s.io/utils/dump for backwards compatibility.

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2026-02-12 07:34:19 -05:00
Alessio Attilio
72ef5b34a8 pkg/proxy/nftables: fix kube-proxy crash with newer nftables versions
Fixes kube-proxy's nftables mode to work on systems with nft 1.1.3.
2026-02-11 21:46:23 +01:00
ansilh
440cfca4ef refactor(kube-proxy): remove redundant empty endpoints check in topologyModeFromHints
The len(endpoints) == 0 check is now redundant since the hasReadyEndpoints
check handles this case: when the slice is empty, the loop executes zero
times, hasReadyEndpoints stays false, and the function returns "" via the
same path.
2026-02-06 21:56:58 +05:30
ansilh
18f56fa7c7 fix(kube-proxy): skip topology hints logging when no ready endpoints exist
When all endpoints are non-ready (ready=false, serving=false, terminating=false),
the topologyModeFromHints function was incorrectly logging "Ignoring same-zone
topology hints for service since no hints were provided for zone" because the
boolean flags remained at their initial values after the loop skipped all
non-ready endpoints.

This fix adds tracking for whether any ready endpoints were processed and
returns early if none exist, avoiding misleading log messages.

Also adds a test case covering this scenario.
2026-02-06 21:46:05 +05:30
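The control flow described in the two topologyModeFromHints commits above can be sketched roughly like this. The `endpoint` type, the `topologyMode` function, the returned warning strings, and the "same-zone" mode value are simplified assumptions for illustration, not the real kube-proxy signatures. The sketch shows how a `hasReady` flag lets the function return early, and why a separate empty-slice check becomes unnecessary.

```go
package main

import "fmt"

// endpoint is a hypothetical, simplified endpoint.
type endpoint struct {
	ready     bool
	zoneHints []string
}

// topologyMode returns the chosen mode and a warning to log (if any).
func topologyMode(endpoints []endpoint, zone string) (mode, warning string) {
	hasReady := false
	allHaveHints := true
	hasHintForZone := false
	for _, ep := range endpoints {
		if !ep.ready {
			continue // non-ready endpoints do not influence the flags
		}
		hasReady = true
		if len(ep.zoneHints) == 0 {
			allHaveHints = false
			continue
		}
		for _, z := range ep.zoneHints {
			if z == zone {
				hasHintForZone = true
			}
		}
	}
	switch {
	case !hasReady:
		// Covers both "all endpoints non-ready" and an empty slice:
		// the flags still hold initial values, so log nothing.
		return "", ""
	case !allHaveHints:
		return "", "ignoring hints: some endpoints have no zone hints"
	case !hasHintForZone:
		return "", "ignoring hints: no hints for zone " + zone
	}
	return "same-zone", ""
}

func main() {
	mode, warning := topologyMode([]endpoint{{ready: false}}, "zone-a")
	fmt.Printf("mode=%q warning=%q\n", mode, warning)
}
```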
Kubernetes Prow Robot
437184c055
Merge pull request #136292 from atombrella/feature/modernize_plusbuild
Remove obsolete `// +build` instruction.
2026-01-26 19:05:59 +05:30
Dan Winship
3c1ad42773 Add a helper for the operation-counting unit tests in nftables
(This will make it easier to keep the counts in sync when we change
things.)
2026-01-24 09:42:42 -05:00
Kubernetes Prow Robot
7cdeb11327
Merge pull request #135800 from danwinship/nftables-hairpin
rework nftables masquerading code, part 1
2026-01-24 10:33:39 +05:30
Prince Pereira
4198b789f5 Fix for preferred dualstack and required dualstack in winkernel proxier. 2026-01-21 00:57:09 +05:30