hack: switch local-up-cluster to ptp CNI for reliable DIND networking

The ci-kubernetes-local-e2e job has been flaky (~40-45% success rate)
due to intermittent DNS/service connectivity failures. Investigation
traced the root cause to the bridge CNI plugin's dependency on the
br_netfilter module and the bridge-nf-call-iptables sysctl, both of
which are unreliable in docker-in-docker (DIND) environments.

This change switches from the bridge CNI plugin to ptp (point-to-point),
which creates a veth pair between each pod and the host network
namespace. Because pod traffic no longer crosses a Linux bridge,
br_netfilter is not needed at all.

KIND (Kubernetes IN Docker) uses the same ptp CNI configuration and
runs reliably with it.

Changes:
- Switch CNI plugin from "bridge" to "ptp"
- Remove bridge-specific options (bridge, isGateway, promiscMode)
- Add kernel network parameter configuration for DIND:
  - route_localnet=1: allows routing to localhost addresses after NAT
  - arp_ignore=0: required for ptp CNI's /32 addresses
  - ip_forward=1: ensures pod-to-pod traffic is forwarded
  - IPv6 forwarding=1: set on a best-effort basis

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
Davanum Srinivas 2026-01-30 08:17:06 -05:00
parent f4938574d4
commit 0ee5729eff

@@ -1318,8 +1318,12 @@ function install_cni {
\) \
-delete
# containerd in kubekins supports CNI version 0.4.0
-echo "Configuring cni"
+# Configure CNI using ptp (point-to-point) plugin instead of bridge.
+# ptp creates direct veth pairs between pods and host namespace, which
+# avoids the need for br_netfilter and bridge-nf-call-iptables settings
+# that are unreliable in docker-in-docker environments.
+# This approach is proven to work reliably by KIND (Kubernetes IN Docker).
+echo "Configuring cni (ptp mode)"
sudo mkdir -p "$CNI_CONFIG_DIR"
cat << EOF | sudo tee "$CNI_CONFIG_DIR"/10-containerd-net.conflist
{
@@ -1327,11 +1331,8 @@ function install_cni {
"name": "containerd-net",
"plugins": [
{
-"type": "bridge",
-"bridge": "cni0",
-"isGateway": true,
+"type": "ptp",
"ipMasq": true,
-"promiscMode": true,
"ipam": {
"type": "host-local",
"ranges": [
@@ -1387,6 +1388,26 @@ if [[ "${KUBETEST_IN_DOCKER:-}" == "true" ]]; then
# configure shared mounts to prevent failure in DIND scenarios
mount --make-rshared /
+# Configure kernel network parameters for container networking.
+# These settings are required for ptp CNI and iptables-based kube-proxy
+# to work correctly in docker-in-docker environments.
+# See KIND's network configuration for reference.
+echo "Configuring kernel network parameters for DIND..."
+# Enable route_localnet - allows routing to localhost addresses after NAT.
+# Required for proper DNS resolution in containers.
+echo 1 > /proc/sys/net/ipv4/conf/all/route_localnet
+# Set arp_ignore=0 - required for ptp CNI which uses /32 addresses.
+# Ensures ARP replies are sent for all local addresses.
+echo 0 > /proc/sys/net/ipv4/conf/all/arp_ignore
+# Ensure IP forwarding is enabled for pod-to-pod traffic
+echo 1 > /proc/sys/net/ipv4/ip_forward
+echo 1 > /proc/sys/net/ipv6/conf/all/forwarding 2>/dev/null || true
+echo "Kernel network parameters configured"
# to use containerd as kubelet container runtime we need to install cni
install_cni
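
The kernel parameters written in the last hunk can be checked after the
fact. The following is a minimal sketch, not part of this commit: the
function name check_dind_sysctls and its optional root-directory argument
are hypothetical and exist only for illustration (the argument lets the
check run against a scratch directory instead of /proc/sys). It covers
the three IPv4 settings and skips the best-effort IPv6 one.

```shell
# Hypothetical helper (not in local-up-cluster.sh): report whether the
# IPv4 parameters set for DIND hold the values this commit writes.
# $1 is an optional root directory; it defaults to /proc/sys.
check_dind_sysctls() {
  local root rc entry path want got
  root="${1:-/proc/sys}"
  rc=0
  for entry in \
    "net/ipv4/conf/all/route_localnet=1" \
    "net/ipv4/conf/all/arp_ignore=0" \
    "net/ipv4/ip_forward=1"; do
    path="${entry%%=*}"   # text before the "="
    want="${entry##*=}"   # text after the "="
    if [ ! -r "${root}/${path}" ]; then
      echo "MISSING ${path}"
      rc=1
      continue
    fi
    got="$(cat "${root}/${path}")"
    if [ "${got}" = "${want}" ]; then
      echo "OK ${path}=${got}"
    else
      echo "MISMATCH ${path}=${got} (want ${want})"
      rc=1
    fi
  done
  return "${rc}"
}
```

Called with no argument, check_dind_sysctls reads the live values under
/proc/sys and returns non-zero if any of them differs from the expected
value or cannot be read.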