On a remount (e.g. CSIDriver.spec.requiresRepublish=true), the volume is
already published and the pod is observing the existing bind mount.
Removing the mount dir on a NodePublish error left the pod with stale
contents that subsequent successful republishes could not repair.
Thread the reconciler's existing isRemount signal into MounterArgs so
volume plugins can distinguish an initial publish from a republish
(e.g. CSIDriver.spec.requiresRepublish=true). No behavior change.
When a container using an NFS-backed volume (e.g. AWS EFS via the EFS
CSI driver) with subPathExpr is killed, kubelet may fail to restart it
with CreateContainerConfigError and "stale NFS file handle" because the
subpath bind mount target holds a cached NFS4 file handle that has been
invalidated server-side. IsMountPoint() calls lstat() on the target,
gets ESTALE, and prepareSubpathTarget() treats it as a hard failure. The
pod becomes permanently stuck and requires manual intervention.
The subpath teardown path was already hardened against stale NFS handles
in kubernetes/kubernetes#71804 (doCleanSubPaths), but the setup path in
prepareSubpathTarget() was never updated.
Detect corrupted mount points using mount.IsCorruptedMnt() and unmount
the stale bind mount before proceeding to re-create it.
Ref: https://github.com/kubernetes-sigs/aws-efs-csi-driver/issues/614
Ref: https://redhat.atlassian.net/browse/OCPBUGS-84229
mount.IsNotMountPoint has been deprecated and mounter.IsMountPoint
is now preferred.
This small refactor of prepareSubpathTarget() should not pose any
risk because IsNotMountPoint directly calls IsMountPoint and just
returns its negated value.
Update metrics-server addon from v0.8.0 to v0.8.1.
Also fix the ClusterRole resource-reader which still referenced the
deployment name from v0.7.1 (metrics-server-v0.7.1). The addon-resizer
nanny needs get/patch permission on the current deployment name to
function correctly; without this fix the nanny requests would be
denied by RBAC.
Additionally update the README link from the archived
kubernetes-incubator org to the current kubernetes-sigs org.
Signed-off-by: Humble Devassy Chirammal <humble.devassy@gmail.com>
Fix several incorrect error and log messages in volume plugins that
produce confusing or incomplete output during troubleshooting:
- Fix "MountMount.NodeExpandVolume" to "MountVolume.NodeExpandVolume"
in node_expander.go (2 occurrences)
- Fix "MountVolume.NodeExapndVolume" to "MountVolume.NodeExpandVolume"
in node_expander.go and operation_generator.go (2 occurrences)
- Fix iscsi mkdir error log that prints literal "error" instead of the
actual error value
- Fix "error ummounting" to "error unmounting" in subpath handling
- Fix malformed "with :" in teardown error messages in configmap,
secret, projected, and downwardapi volume plugins
- Fix duplicated "is is" in operation_generator.go comment
Signed-off-by: Humble Devassy Chirammal <humble.devassy@gmail.com>
Add MemoryReservationPolicy (None/HardReservation), which controls
memory.min. This allows the other memoryQoS settings to be used
independently of memory.min protection, giving operators more
granular control over memoryQoS behavior.
Signed-off-by: Qi Wang <qiwan@redhat.com>
FSWatcher.Run() spawned a goroutine with no exit mechanism, causing a
goroutine leak. Add a ctx context.Context parameter to Run() so the
goroutine can exit cleanly when the context is canceled, and
defer-close the underlying fsnotify watcher on exit.
For kube-proxy, the existing ctx from runLoop() is passed directly.
For the flexvolume prober, ctx is stored in flexVolumeProber at
construction time via GetDynamicPluginProber(), representing the
component lifetime (kubelet/controller-manager), which is the
appropriate scope for this long-running watcher.
- bump init backoff to Duration=30ms, Factor=8 (Steps=6) to yield ~140s total
- prevent kubelet restarts when DNS is blackholed and NSS must fall back to myhostname
- keep CSI/CSINode initialization alive long enough to complete in ARO DNS-failure scenarios
When CSI's AttachRequired changes from true to false after a successful
volume attach, MarkVolumeAsAttached fails because it attempts to look up
the plugin by spec, which fails verification.
This patch passes the VolumeName directly to MarkVolumeAsAttached.
This allows the function to skip the plugin lookup and correctly mark
the volume as attached in the Actual State of World, ensuring
VolumeAttachment cleanup can proceed.
Signed-off-by: hongkang <mzhkcj50@gmail.com>
This has been replaced by `//go:build ...` for a long time now.
Removal of the old build tag was automated with:
for i in $(git grep -l '^// +build' | grep -v -e '^vendor/'); do if ! grep -q '^// Code generated' "$i"; then sed -i -e '/^\/\/ +build/d' "$i"; fi; done
The code actually calls os.Remove(), not rmdir(). The error message
should accurately reflect the operation being performed.
os.Remove() can remove both files and directories, while rmdir()
only removes directories.
Signed-off-by: Humble Devassy Chirammal <humble.devassy@gmail.com>