kubernetes/test/e2e_node/node_container_manager_test.go

428 lines
17 KiB
Go
Raw Permalink Normal View History

//go:build linux
/*
Copyright 2017 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
2019-11-06 22:59:05 -05:00
package e2enode
import (
2020-02-07 21:16:47 -05:00
"context"
"fmt"
"os"
"path/filepath"
"strconv"
"strings"
2017-11-16 14:16:58 -05:00
"time"
v1 "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/api/resource"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
node: cm: migrate container manager to contextual logging This commit migrates the container manager package to use contextual logging. This follows the contextual logging migration pattern where: - Logger is passed from the boundary down to implementations - klog.TODO() is used at call sites where context is not yet available - Functions with context.Context extract logger via klog.FromContext(ctx) - klog.InfoS/ErrorS/V().InfoS calls are replaced with logger.Info/Error/V().Info - Sub-managers receive proper logger context from their callers Some call sites still use klog.TODO() as placeholders, with TODO comments indicating these will be replaced with proper contextual loggers as the migration continues upward through the call stack. Test/fake implementations use klog.Background() which is the appropriate choice for test code where no real context is available. Mock file updates (pkg/kubelet/cm/testing/mocks.go) were done manually due to mockery tool incompatibility with Go 1.25 and will be committed separately with an explanation. Key changes: 1. PodContainerManager and ContainerManager interface updates: - PodContainerManager.EnsureExists: add logger parameter - PodContainerManager.Destroy: add logger parameter - PodContainerManager.ReduceCPULimits: add logger parameter - PodContainerManager.SetPodCgroupConfig: add logger parameter - ContainerManager.UpdateQOSCgroups: add logger parameter - Update all implementations (Linux, Windows, stub, noop, fake) 2. Helper function updates: - GetKubeletContainer: add logger parameter (Linux and unsupported platforms) - Update cmd/kubelet/app/server.go to pass logger to GetKubeletContainer - Comment out type assertion in helpers.go due to signature change 3. Cgroup manager contextual logging: - CgroupManager interface methods updated to accept logger: * Destroy: add logger parameter * ReduceCPULimits: add logger parameter * SetCgroupConfig: add logger parameter - Update cgroupCommon and unsupportedCgroupManager implementations - Migrate klog.InfoS/V().InfoS calls to logger.Info/V().Info 4. Container manager implementation updates: - Extract logger from context in NewContainerManager() and Start() - Pass logger to sub-managers (deviceManager, topologyManager) - Update DRA manager initialization to use logger from context - Migrate klog.InfoS/ErrorS to logger.Info/Error throughout 5. Call sites updated with TODO comments: - pkg/kubelet/kubelet.go: UpdateQOSCgroups, EnsureExists - pkg/kubelet/kubelet_pods.go: UpdateQOSCgroups, ReduceCPULimits, Destroy - pkg/kubelet/kuberuntime/kuberuntime_manager.go: SetPodCgroupConfig - pkg/kubelet/kuberuntime/kuberuntime_manager_test.go: test expectations Most call sites currently use klog.TODO() as placeholders, with TODO comments indicating these will be replaced with proper contextual loggers as the migration continues upward through the call stack. Test/fake implementations use klog.Background() which is the appropriate choice for test code where no real context is available. Mock file updates (pkg/kubelet/cm/testing/mocks.go) were done manually due to mockery tool incompatibility with Go 1.25 and will be committed separately with an explanation. Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2025-10-30 09:13:29 -04:00
"k8s.io/klog/v2"
kubeletconfig "k8s.io/kubernetes/pkg/kubelet/apis/config"
"k8s.io/kubernetes/pkg/kubelet/cm"
"k8s.io/kubernetes/pkg/kubelet/stats/pidlimit"
admissionapi "k8s.io/pod-security-admission/api"
"k8s.io/kubernetes/test/e2e/feature"
"k8s.io/kubernetes/test/e2e/framework"
e2enodekubelet "k8s.io/kubernetes/test/e2e_node/kubeletconfig"
"github.com/onsi/ginkgo/v2"
"github.com/onsi/gomega"
)
func setDesiredConfiguration(initialConfig *kubeletconfig.KubeletConfiguration, cgroupManager cm.CgroupManager) {
initialConfig.EnforceNodeAllocatable = []string{"pods", kubeReservedCgroup, systemReservedCgroup}
Lift embedded structure out of eviction-related KubeletConfiguration fields - Changes the following KubeletConfiguration fields from `string` to `map[string]string`: - `EvictionHard` - `EvictionSoft` - `EvictionSoftGracePeriod` - `EvictionMinimumReclaim` - Adds flag parsing shims to maintain Kubelet's public flags API, while enabling structured input in the file API. - Also removes `kubeletconfig.ConfigurationMap`, which was an ad-hoc flag parsing shim living in the kubeletconfig API group, and replaces it with the `MapStringString` shim introduced in this PR. Flag parsing shims belong in a common place, not in the kubeletconfig API. I manually audited these to ensure that this wouldn't cause errors parsing the command line for syntax that would have previously been error free (`kubeletconfig.ConfigurationMap` was unique in that it allowed keys to be provided on the CLI without values. I believe this was done in `flags.ConfigurationMap` to facilitate the `--node-labels` flag, which rightfully accepts value-free keys, and that this shim was then just copied to `kubeletconfig`). Fortunately, the affected fields (`ExperimentalQOSReserved`, `SystemReserved`, and `KubeReserved`) expect non-empty strings in the values of the map, and as a result passing the empty string is already an error. Thus requiring keys shouldn't break anyone's scripts. - Updates code and tests accordingly. Regarding eviction operators, directionality is already implicit in the signal type (for a given signal, the decision to evict will be made when crossing the threshold from either above or below, never both). There is no need to expose an operator, such as `<`, in the API. By changing `EvictionHard` and `EvictionSoft` to `map[string]string`, this PR simplifies the experience of working with these fields via the `KubeletConfiguration` type. Again, flags stay the same. Other things: - There is another flag parsing shim, `flags.ConfigurationMap`, from the shared flag utility. The `NodeLabels` field still uses `flags.ConfigurationMap`. This PR moves the allocation of the `map[string]string` for the `NodeLabels` field from `AddKubeletConfigFlags` to the defaulter for the external `KubeletConfiguration` type. Flags are layered on top of an internal object that has undergone conversion from a defaulted external object, which means that previously the mere registration of flags would have overwritten any previously-defined defaults for `NodeLabels` (fortunately there were none).
2017-10-19 18:42:07 -04:00
initialConfig.SystemReserved = map[string]string{
string(v1.ResourceCPU): "100m",
string(v1.ResourceMemory): "100Mi",
string(pidlimit.PIDs): "1000",
}
Lift embedded structure out of eviction-related KubeletConfiguration fields - Changes the following KubeletConfiguration fields from `string` to `map[string]string`: - `EvictionHard` - `EvictionSoft` - `EvictionSoftGracePeriod` - `EvictionMinimumReclaim` - Adds flag parsing shims to maintain Kubelet's public flags API, while enabling structured input in the file API. - Also removes `kubeletconfig.ConfigurationMap`, which was an ad-hoc flag parsing shim living in the kubeletconfig API group, and replaces it with the `MapStringString` shim introduced in this PR. Flag parsing shims belong in a common place, not in the kubeletconfig API. I manually audited these to ensure that this wouldn't cause errors parsing the command line for syntax that would have previously been error free (`kubeletconfig.ConfigurationMap` was unique in that it allowed keys to be provided on the CLI without values. I believe this was done in `flags.ConfigurationMap` to facilitate the `--node-labels` flag, which rightfully accepts value-free keys, and that this shim was then just copied to `kubeletconfig`). Fortunately, the affected fields (`ExperimentalQOSReserved`, `SystemReserved`, and `KubeReserved`) expect non-empty strings in the values of the map, and as a result passing the empty string is already an error. Thus requiring keys shouldn't break anyone's scripts. - Updates code and tests accordingly. Regarding eviction operators, directionality is already implicit in the signal type (for a given signal, the decision to evict will be made when crossing the threshold from either above or below, never both). There is no need to expose an operator, such as `<`, in the API. By changing `EvictionHard` and `EvictionSoft` to `map[string]string`, this PR simplifies the experience of working with these fields via the `KubeletConfiguration` type. Again, flags stay the same. Other things: - There is another flag parsing shim, `flags.ConfigurationMap`, from the shared flag utility. The `NodeLabels` field still uses `flags.ConfigurationMap`. This PR moves the allocation of the `map[string]string` for the `NodeLabels` field from `AddKubeletConfigFlags` to the defaulter for the external `KubeletConfiguration` type. Flags are layered on top of an internal object that has undergone conversion from a defaulted external object, which means that previously the mere registration of flags would have overwritten any previously-defined defaults for `NodeLabels` (fortunately there were none).
2017-10-19 18:42:07 -04:00
initialConfig.KubeReserved = map[string]string{
string(v1.ResourceCPU): "100m",
string(v1.ResourceMemory): "100Mi",
string(pidlimit.PIDs): "738",
}
Lift embedded structure out of eviction-related KubeletConfiguration fields - Changes the following KubeletConfiguration fields from `string` to `map[string]string`: - `EvictionHard` - `EvictionSoft` - `EvictionSoftGracePeriod` - `EvictionMinimumReclaim` - Adds flag parsing shims to maintain Kubelet's public flags API, while enabling structured input in the file API. - Also removes `kubeletconfig.ConfigurationMap`, which was an ad-hoc flag parsing shim living in the kubeletconfig API group, and replaces it with the `MapStringString` shim introduced in this PR. Flag parsing shims belong in a common place, not in the kubeletconfig API. I manually audited these to ensure that this wouldn't cause errors parsing the command line for syntax that would have previously been error free (`kubeletconfig.ConfigurationMap` was unique in that it allowed keys to be provided on the CLI without values. I believe this was done in `flags.ConfigurationMap` to facilitate the `--node-labels` flag, which rightfully accepts value-free keys, and that this shim was then just copied to `kubeletconfig`). Fortunately, the affected fields (`ExperimentalQOSReserved`, `SystemReserved`, and `KubeReserved`) expect non-empty strings in the values of the map, and as a result passing the empty string is already an error. Thus requiring keys shouldn't break anyone's scripts. - Updates code and tests accordingly. Regarding eviction operators, directionality is already implicit in the signal type (for a given signal, the decision to evict will be made when crossing the threshold from either above or below, never both). There is no need to expose an operator, such as `<`, in the API. By changing `EvictionHard` and `EvictionSoft` to `map[string]string`, this PR simplifies the experience of working with these fields via the `KubeletConfiguration` type. Again, flags stay the same. Other things: - There is another flag parsing shim, `flags.ConfigurationMap`, from the shared flag utility. The `NodeLabels` field still uses `flags.ConfigurationMap`. This PR moves the allocation of the `map[string]string` for the `NodeLabels` field from `AddKubeletConfigFlags` to the defaulter for the external `KubeletConfiguration` type. Flags are layered on top of an internal object that has undergone conversion from a defaulted external object, which means that previously the mere registration of flags would have overwritten any previously-defined defaults for `NodeLabels` (fortunately there were none).
2017-10-19 18:42:07 -04:00
initialConfig.EvictionHard = map[string]string{"memory.available": "100Mi"}
// Necessary for allocatable cgroup creation.
initialConfig.CgroupsPerQOS = true
initialConfig.KubeReservedCgroup = kubeReservedCgroup
initialConfig.SystemReservedCgroup = systemReservedCgroup
if initialConfig.CgroupDriver == "systemd" {
initialConfig.KubeReservedCgroup = cm.NewCgroupName(cm.RootCgroupName, kubeReservedCgroup).ToSystemd()
initialConfig.SystemReservedCgroup = cm.NewCgroupName(cm.RootCgroupName, systemReservedCgroup).ToSystemd()
}
}
var _ = SIGDescribe("Node Container Manager", framework.WithSerial(), func() {
f := framework.NewDefaultFramework("node-container-manager")
f.NamespacePodSecurityLevel = admissionapi.LevelPrivileged
f.Describe("Validate Node Allocatable", feature.NodeAllocatable, func() {
ginkgo.It("sets up the node and runs the test", func(ctx context.Context) {
ginkgo.Skip("currently broken")
framework.ExpectNoError(runTest(ctx, f))
})
})
f.Describe("Validate CGroup management", func() {
// Regression test for https://issues.k8s.io/125923
// In this issue there's a race involved with systemd which seems to manifest most likely, or perhaps only
// (data gathered so far seems inconclusive) on the very first boot of the machine, so restarting the kubelet
// seems not sufficient. OTOH, the exact reproducer seems to require a dedicate lane with only this test, or
// to reboot the machine before to run this test. Both are practically unrealistic in CI.
// The closest approximation is this test in this current form, using a kubelet restart. This at least
// acts as non regression testing, so it still brings value.
ginkgo.It("should correctly start with cpumanager none policy in use with systemd", func(ctx context.Context) {
ginkgo.Skip("currently broken")
if !IsCgroup2UnifiedMode() {
ginkgo.Skip("this test requires cgroups v2")
}
var err error
var oldCfg *kubeletconfig.KubeletConfiguration
// Get current kubelet configuration
oldCfg, err = getCurrentKubeletConfig(ctx)
framework.ExpectNoError(err)
ginkgo.DeferCleanup(func(ctx context.Context) {
if oldCfg != nil {
// Update the Kubelet configuration.
framework.ExpectNoError(e2enodekubelet.WriteKubeletConfigFile(oldCfg))
ginkgo.By("Restarting the kubelet")
restartKubelet(ctx, true)
waitForKubeletToStart(ctx, f)
ginkgo.By("Started the kubelet")
}
})
newCfg := oldCfg.DeepCopy()
// Change existing kubelet configuration
newCfg.CPUManagerPolicy = "none"
newCfg.CgroupDriver = "systemd"
newCfg.FailCgroupV1 = true // extra safety. We want to avoid false negatives though, so we added the skip check earlier
// Update the Kubelet configuration.
framework.ExpectNoError(e2enodekubelet.WriteKubeletConfigFile(newCfg))
ginkgo.By("Restarting the kubelet")
restartKubelet(ctx, true)
waitForKubeletToStart(ctx, f)
ginkgo.By("Started the kubelet")
gomega.Consistently(ctx, func(ctx context.Context) bool {
return getNodeReadyStatus(ctx, f) && kubeletHealthCheck(kubeletHealthCheckURL)
}).WithTimeout(2 * time.Minute).WithPolling(2 * time.Second).Should(gomega.BeTrueBecause("node keeps reporting ready status"))
})
})
})
func expectFileValToEqual(filePath string, expectedValue, delta int64) error {
out, err := os.ReadFile(filePath)
if err != nil {
return fmt.Errorf("failed to read file %q", filePath)
}
actual, err := strconv.ParseInt(strings.TrimSpace(string(out)), 10, 64)
if err != nil {
return fmt.Errorf("failed to parse output %v", err)
}
// Ensure that values are within a delta range to work around rounding errors.
if (actual < (expectedValue - delta)) || (actual > (expectedValue + delta)) {
return fmt.Errorf("Expected value at %q to be between %d and %d. Got %d", filePath, (expectedValue - delta), (expectedValue + delta), actual)
}
return nil
}
func getAllocatableLimits(cpu, memory, pids string, capacity v1.ResourceList) (*resource.Quantity, *resource.Quantity, *resource.Quantity) {
var allocatableCPU, allocatableMemory, allocatablePIDs *resource.Quantity
// Total cpu reservation is 200m.
for k, v := range capacity {
if k == v1.ResourceCPU {
c := v.DeepCopy()
allocatableCPU = &c
allocatableCPU.Sub(resource.MustParse(cpu))
}
if k == v1.ResourceMemory {
c := v.DeepCopy()
allocatableMemory = &c
allocatableMemory.Sub(resource.MustParse(memory))
}
}
// Process IDs are not a node allocatable, so we have to do this ad hoc
pidlimits, err := pidlimit.Stats()
if err == nil && pidlimits != nil && pidlimits.MaxPID != nil {
allocatablePIDs = resource.NewQuantity(int64(*pidlimits.MaxPID), resource.DecimalSI)
allocatablePIDs.Sub(resource.MustParse(pids))
}
return allocatableCPU, allocatableMemory, allocatablePIDs
}
const (
kubeReservedCgroup = "kube-reserved"
systemReservedCgroup = "system-reserved"
nodeAllocatableCgroup = "kubepods"
)
func createIfNotExists(cm cm.CgroupManager, cgroupConfig *cm.CgroupConfig) error {
if !cm.Exists(cgroupConfig.Name) {
node: cm: migrate container manager to contextual logging This commit migrates the container manager package to use contextual logging. This follows the contextual logging migration pattern where: - Logger is passed from the boundary down to implementations - klog.TODO() is used at call sites where context is not yet available - Functions with context.Context extract logger via klog.FromContext(ctx) - klog.InfoS/ErrorS/V().InfoS calls are replaced with logger.Info/Error/V().Info - Sub-managers receive proper logger context from their callers Some call sites still use klog.TODO() as placeholders, with TODO comments indicating these will be replaced with proper contextual loggers as the migration continues upward through the call stack. Test/fake implementations use klog.Background() which is the appropriate choice for test code where no real context is available. Mock file updates (pkg/kubelet/cm/testing/mocks.go) were done manually due to mockery tool incompatibility with Go 1.25 and will be committed separately with an explanation. Key changes: 1. PodContainerManager and ContainerManager interface updates: - PodContainerManager.EnsureExists: add logger parameter - PodContainerManager.Destroy: add logger parameter - PodContainerManager.ReduceCPULimits: add logger parameter - PodContainerManager.SetPodCgroupConfig: add logger parameter - ContainerManager.UpdateQOSCgroups: add logger parameter - Update all implementations (Linux, Windows, stub, noop, fake) 2. Helper function updates: - GetKubeletContainer: add logger parameter (Linux and unsupported platforms) - Update cmd/kubelet/app/server.go to pass logger to GetKubeletContainer - Comment out type assertion in helpers.go due to signature change 3. Cgroup manager contextual logging: - CgroupManager interface methods updated to accept logger: * Destroy: add logger parameter * ReduceCPULimits: add logger parameter * SetCgroupConfig: add logger parameter - Update cgroupCommon and unsupportedCgroupManager implementations - Migrate klog.InfoS/V().InfoS calls to logger.Info/V().Info 4. Container manager implementation updates: - Extract logger from context in NewContainerManager() and Start() - Pass logger to sub-managers (deviceManager, topologyManager) - Update DRA manager initialization to use logger from context - Migrate klog.InfoS/ErrorS to logger.Info/Error throughout 5. Call sites updated with TODO comments: - pkg/kubelet/kubelet.go: UpdateQOSCgroups, EnsureExists - pkg/kubelet/kubelet_pods.go: UpdateQOSCgroups, ReduceCPULimits, Destroy - pkg/kubelet/kuberuntime/kuberuntime_manager.go: SetPodCgroupConfig - pkg/kubelet/kuberuntime/kuberuntime_manager_test.go: test expectations Most call sites currently use klog.TODO() as placeholders, with TODO comments indicating these will be replaced with proper contextual loggers as the migration continues upward through the call stack. Test/fake implementations use klog.Background() which is the appropriate choice for test code where no real context is available. Mock file updates (pkg/kubelet/cm/testing/mocks.go) were done manually due to mockery tool incompatibility with Go 1.25 and will be committed separately with an explanation. Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2025-10-30 09:13:29 -04:00
if err := cm.Create(klog.Background(), cgroupConfig); err != nil {
return err
}
}
return nil
}
func createTemporaryCgroupsForReservation(cgroupManager cm.CgroupManager) error {
// Create kube reserved cgroup
cgroupConfig := &cm.CgroupConfig{
Name: cm.NewCgroupName(cm.RootCgroupName, kubeReservedCgroup),
}
if err := createIfNotExists(cgroupManager, cgroupConfig); err != nil {
return err
}
// Create system reserved cgroup
cgroupConfig.Name = cm.NewCgroupName(cm.RootCgroupName, systemReservedCgroup)
return createIfNotExists(cgroupManager, cgroupConfig)
}
func destroyTemporaryCgroupsForReservation(cgroupManager cm.CgroupManager) error {
// Create kube reserved cgroup
cgroupConfig := &cm.CgroupConfig{
Name: cm.NewCgroupName(cm.RootCgroupName, kubeReservedCgroup),
}
node: cm: migrate container manager to contextual logging This commit migrates the container manager package to use contextual logging. This follows the contextual logging migration pattern where: - Logger is passed from the boundary down to implementations - klog.TODO() is used at call sites where context is not yet available - Functions with context.Context extract logger via klog.FromContext(ctx) - klog.InfoS/ErrorS/V().InfoS calls are replaced with logger.Info/Error/V().Info - Sub-managers receive proper logger context from their callers Some call sites still use klog.TODO() as placeholders, with TODO comments indicating these will be replaced with proper contextual loggers as the migration continues upward through the call stack. Test/fake implementations use klog.Background() which is the appropriate choice for test code where no real context is available. Mock file updates (pkg/kubelet/cm/testing/mocks.go) were done manually due to mockery tool incompatibility with Go 1.25 and will be committed separately with an explanation. Key changes: 1. PodContainerManager and ContainerManager interface updates: - PodContainerManager.EnsureExists: add logger parameter - PodContainerManager.Destroy: add logger parameter - PodContainerManager.ReduceCPULimits: add logger parameter - PodContainerManager.SetPodCgroupConfig: add logger parameter - ContainerManager.UpdateQOSCgroups: add logger parameter - Update all implementations (Linux, Windows, stub, noop, fake) 2. Helper function updates: - GetKubeletContainer: add logger parameter (Linux and unsupported platforms) - Update cmd/kubelet/app/server.go to pass logger to GetKubeletContainer - Comment out type assertion in helpers.go due to signature change 3. Cgroup manager contextual logging: - CgroupManager interface methods updated to accept logger: * Destroy: add logger parameter * ReduceCPULimits: add logger parameter * SetCgroupConfig: add logger parameter - Update cgroupCommon and unsupportedCgroupManager implementations - Migrate klog.InfoS/V().InfoS calls to logger.Info/V().Info 4. Container manager implementation updates: - Extract logger from context in NewContainerManager() and Start() - Pass logger to sub-managers (deviceManager, topologyManager) - Update DRA manager initialization to use logger from context - Migrate klog.InfoS/ErrorS to logger.Info/Error throughout 5. Call sites updated with TODO comments: - pkg/kubelet/kubelet.go: UpdateQOSCgroups, EnsureExists - pkg/kubelet/kubelet_pods.go: UpdateQOSCgroups, ReduceCPULimits, Destroy - pkg/kubelet/kuberuntime/kuberuntime_manager.go: SetPodCgroupConfig - pkg/kubelet/kuberuntime/kuberuntime_manager_test.go: test expectations Most call sites currently use klog.TODO() as placeholders, with TODO comments indicating these will be replaced with proper contextual loggers as the migration continues upward through the call stack. Test/fake implementations use klog.Background() which is the appropriate choice for test code where no real context is available. Mock file updates (pkg/kubelet/cm/testing/mocks.go) were done manually due to mockery tool incompatibility with Go 1.25 and will be committed separately with an explanation. Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2025-10-30 09:13:29 -04:00
if err := cgroupManager.Destroy(klog.Background(), cgroupConfig); err != nil {
return err
}
cgroupConfig.Name = cm.NewCgroupName(cm.RootCgroupName, systemReservedCgroup)
node: cm: migrate container manager to contextual logging This commit migrates the container manager package to use contextual logging. This follows the contextual logging migration pattern where: - Logger is passed from the boundary down to implementations - klog.TODO() is used at call sites where context is not yet available - Functions with context.Context extract logger via klog.FromContext(ctx) - klog.InfoS/ErrorS/V().InfoS calls are replaced with logger.Info/Error/V().Info - Sub-managers receive proper logger context from their callers Some call sites still use klog.TODO() as placeholders, with TODO comments indicating these will be replaced with proper contextual loggers as the migration continues upward through the call stack. Test/fake implementations use klog.Background() which is the appropriate choice for test code where no real context is available. Mock file updates (pkg/kubelet/cm/testing/mocks.go) were done manually due to mockery tool incompatibility with Go 1.25 and will be committed separately with an explanation. Key changes: 1. PodContainerManager and ContainerManager interface updates: - PodContainerManager.EnsureExists: add logger parameter - PodContainerManager.Destroy: add logger parameter - PodContainerManager.ReduceCPULimits: add logger parameter - PodContainerManager.SetPodCgroupConfig: add logger parameter - ContainerManager.UpdateQOSCgroups: add logger parameter - Update all implementations (Linux, Windows, stub, noop, fake) 2. Helper function updates: - GetKubeletContainer: add logger parameter (Linux and unsupported platforms) - Update cmd/kubelet/app/server.go to pass logger to GetKubeletContainer - Comment out type assertion in helpers.go due to signature change 3. Cgroup manager contextual logging: - CgroupManager interface methods updated to accept logger: * Destroy: add logger parameter * ReduceCPULimits: add logger parameter * SetCgroupConfig: add logger parameter - Update cgroupCommon and unsupportedCgroupManager implementations - Migrate klog.InfoS/V().InfoS calls to logger.Info/V().Info 4. Container manager implementation updates: - Extract logger from context in NewContainerManager() and Start() - Pass logger to sub-managers (deviceManager, topologyManager) - Update DRA manager initialization to use logger from context - Migrate klog.InfoS/ErrorS to logger.Info/Error throughout 5. Call sites updated with TODO comments: - pkg/kubelet/kubelet.go: UpdateQOSCgroups, EnsureExists - pkg/kubelet/kubelet_pods.go: UpdateQOSCgroups, ReduceCPULimits, Destroy - pkg/kubelet/kuberuntime/kuberuntime_manager.go: SetPodCgroupConfig - pkg/kubelet/kuberuntime/kuberuntime_manager_test.go: test expectations Most call sites currently use klog.TODO() as placeholders, with TODO comments indicating these will be replaced with proper contextual loggers as the migration continues upward through the call stack. Test/fake implementations use klog.Background() which is the appropriate choice for test code where no real context is available. Mock file updates (pkg/kubelet/cm/testing/mocks.go) were done manually due to mockery tool incompatibility with Go 1.25 and will be committed separately with an explanation. Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2025-10-30 09:13:29 -04:00
return cgroupManager.Destroy(klog.Background(), cgroupConfig)
}
// convertSharesToWeight converts from cgroup v1 cpu.shares to cgroup v2 cpu.weight
func convertSharesToWeight(shares int64) int64 {
return 1 + ((shares-2)*9999)/262142
}
func runTest(ctx context.Context, f *framework.Framework) error {
var oldCfg *kubeletconfig.KubeletConfiguration
subsystems, err := cm.GetCgroupSubsystems()
if err != nil {
return err
}
// Get current kubelet configuration
oldCfg, err = getCurrentKubeletConfig(ctx)
if err != nil {
return err
}
// Create a cgroup manager object for manipulating cgroups.
cgroupManager := cm.NewCgroupManager(klog.Background(), subsystems, oldCfg.CgroupDriver)
ginkgo.DeferCleanup(destroyTemporaryCgroupsForReservation, cgroupManager)
ginkgo.DeferCleanup(func(ctx context.Context) {
if oldCfg != nil {
// Update the Kubelet configuration.
ginkgo.By("Stopping the kubelet")
restartKubelet := mustStopKubelet(ctx, f)
// wait until the kubelet health check will fail
gomega.Eventually(ctx, func() bool {
return kubeletHealthCheck(kubeletHealthCheckURL)
2024-07-31 11:58:15 -04:00
}, time.Minute, time.Second).Should(gomega.BeFalseBecause("expected kubelet health check to be failed"))
framework.ExpectNoError(e2enodekubelet.WriteKubeletConfigFile(oldCfg))
ginkgo.By("Restarting the kubelet")
restartKubelet(ctx)
// wait until the kubelet health check will succeed
gomega.Eventually(ctx, func(ctx context.Context) bool {
return kubeletHealthCheck(kubeletHealthCheckURL)
2024-07-31 11:58:15 -04:00
}, 2*time.Minute, 5*time.Second).Should(gomega.BeTrueBecause("expected kubelet to be in healthy state"))
}
})
if err := createTemporaryCgroupsForReservation(cgroupManager); err != nil {
return err
}
newCfg := oldCfg.DeepCopy()
// Change existing kubelet configuration
setDesiredConfiguration(newCfg, cgroupManager)
// Set the new kubelet configuration.
// Update the Kubelet configuration.
ginkgo.By("Stopping the kubelet")
restartKubelet := mustStopKubelet(ctx, f)
expectedNAPodCgroup := cm.NewCgroupName(cm.RootCgroupName, nodeAllocatableCgroup)
// Cleanup from the previous kubelet, to verify the new one creates it correctly
node: cm: migrate container manager to contextual logging This commit migrates the container manager package to use contextual logging. This follows the contextual logging migration pattern where: - Logger is passed from the boundary down to implementations - klog.TODO() is used at call sites where context is not yet available - Functions with context.Context extract logger via klog.FromContext(ctx) - klog.InfoS/ErrorS/V().InfoS calls are replaced with logger.Info/Error/V().Info - Sub-managers receive proper logger context from their callers Some call sites still use klog.TODO() as placeholders, with TODO comments indicating these will be replaced with proper contextual loggers as the migration continues upward through the call stack. Test/fake implementations use klog.Background() which is the appropriate choice for test code where no real context is available. Mock file updates (pkg/kubelet/cm/testing/mocks.go) were done manually due to mockery tool incompatibility with Go 1.25 and will be committed separately with an explanation. Key changes: 1. PodContainerManager and ContainerManager interface updates: - PodContainerManager.EnsureExists: add logger parameter - PodContainerManager.Destroy: add logger parameter - PodContainerManager.ReduceCPULimits: add logger parameter - PodContainerManager.SetPodCgroupConfig: add logger parameter - ContainerManager.UpdateQOSCgroups: add logger parameter - Update all implementations (Linux, Windows, stub, noop, fake) 2. Helper function updates: - GetKubeletContainer: add logger parameter (Linux and unsupported platforms) - Update cmd/kubelet/app/server.go to pass logger to GetKubeletContainer - Comment out type assertion in helpers.go due to signature change 3. Cgroup manager contextual logging: - CgroupManager interface methods updated to accept logger: * Destroy: add logger parameter * ReduceCPULimits: add logger parameter * SetCgroupConfig: add logger parameter - Update cgroupCommon and unsupportedCgroupManager implementations - Migrate klog.InfoS/V().InfoS calls to logger.Info/V().Info 4. Container manager implementation updates: - Extract logger from context in NewContainerManager() and Start() - Pass logger to sub-managers (deviceManager, topologyManager) - Update DRA manager initialization to use logger from context - Migrate klog.InfoS/ErrorS to logger.Info/Error throughout 5. Call sites updated with TODO comments: - pkg/kubelet/kubelet.go: UpdateQOSCgroups, EnsureExists - pkg/kubelet/kubelet_pods.go: UpdateQOSCgroups, ReduceCPULimits, Destroy - pkg/kubelet/kuberuntime/kuberuntime_manager.go: SetPodCgroupConfig - pkg/kubelet/kuberuntime/kuberuntime_manager_test.go: test expectations Most call sites currently use klog.TODO() as placeholders, with TODO comments indicating these will be replaced with proper contextual loggers as the migration continues upward through the call stack. Test/fake implementations use klog.Background() which is the appropriate choice for test code where no real context is available. Mock file updates (pkg/kubelet/cm/testing/mocks.go) were done manually due to mockery tool incompatibility with Go 1.25 and will be committed separately with an explanation. Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2025-10-30 09:13:29 -04:00
if err := cgroupManager.Destroy(klog.Background(), &cm.CgroupConfig{
Name: cm.NewCgroupName(expectedNAPodCgroup),
}); err != nil {
return err
}
if cgroupManager.Exists(expectedNAPodCgroup) {
return fmt.Errorf("Expected Node Allocatable Cgroup %q not to exist", expectedNAPodCgroup)
}
framework.ExpectNoError(e2enodekubelet.WriteKubeletConfigFile(newCfg))
ginkgo.By("Starting the kubelet")
restartKubelet(ctx)
// wait until the kubelet health check will succeed
gomega.Eventually(ctx, func() bool {
return kubeletHealthCheck(kubeletHealthCheckURL)
2024-07-31 11:58:15 -04:00
}, 2*time.Minute, 5*time.Second).Should(gomega.BeTrueBecause("expected kubelet to be in healthy state"))
if err != nil {
return err
}
// Set new config and current config.
currentConfig := newCfg
if !cgroupManager.Exists(expectedNAPodCgroup) {
return fmt.Errorf("Expected Node Allocatable Cgroup %q to exist", expectedNAPodCgroup)
}
memoryLimitFile := "memory.limit_in_bytes"
if IsCgroup2UnifiedMode() {
memoryLimitFile = "memory.max"
}
// TODO: Update cgroupManager to expose a Status interface to get current Cgroup Settings.
2017-11-16 14:16:58 -05:00
// The node may not have updated capacity and allocatable yet, so check that it happens eventually.
gomega.Eventually(ctx, func(ctx context.Context) error {
nodeList, err := f.ClientSet.CoreV1().Nodes().List(ctx, metav1.ListOptions{})
2017-11-16 14:16:58 -05:00
if err != nil {
return err
}
if len(nodeList.Items) != 1 {
return fmt.Errorf("Unexpected number of node objects for node e2e. Expects only one node: %+v", nodeList)
}
cgroupName := nodeAllocatableCgroup
if currentConfig.CgroupDriver == "systemd" {
cgroupName = nodeAllocatableCgroup + ".slice"
}
2017-11-16 14:16:58 -05:00
node := nodeList.Items[0]
capacity := node.Status.Capacity
allocatableCPU, allocatableMemory, allocatablePIDs := getAllocatableLimits("200m", "200Mi", "1738", capacity)
2017-11-16 14:16:58 -05:00
// Total Memory reservation is 200Mi excluding eviction thresholds.
// Expect CPU shares on node allocatable cgroup to equal allocatable.
shares := int64(cm.MilliCPUToShares(allocatableCPU.MilliValue()))
if IsCgroup2UnifiedMode() {
// convert to the cgroup v2 cpu.weight value
if err := expectFileValToEqual(filepath.Join(subsystems.MountPoints["cpu"], cgroupName, "cpu.weight"), convertSharesToWeight(shares), 10); err != nil {
return err
}
} else {
if err := expectFileValToEqual(filepath.Join(subsystems.MountPoints["cpu"], cgroupName, "cpu.shares"), shares, 10); err != nil {
return err
}
2017-11-16 14:16:58 -05:00
}
// Expect Memory limit on node allocatable cgroup to equal allocatable.
if err := expectFileValToEqual(filepath.Join(subsystems.MountPoints["memory"], cgroupName, memoryLimitFile), allocatableMemory.Value(), 0); err != nil {
2017-11-16 14:16:58 -05:00
return err
}
// Expect PID limit on node allocatable cgroup to equal allocatable.
if err := expectFileValToEqual(filepath.Join(subsystems.MountPoints["pids"], cgroupName, "pids.max"), allocatablePIDs.Value(), 0); err != nil {
return err
}
2017-11-16 14:16:58 -05:00
// Check that Allocatable reported to scheduler includes eviction thresholds.
schedulerAllocatable := node.Status.Allocatable
// Memory allocatable should take into account eviction thresholds.
// Process IDs are not a scheduler resource and as such cannot be tested here.
allocatableCPU, allocatableMemory, _ = getAllocatableLimits("200m", "300Mi", "1738", capacity)
2017-11-16 14:16:58 -05:00
// Expect allocatable to include all resources in capacity.
if len(schedulerAllocatable) != len(capacity) {
return fmt.Errorf("Expected all resources in capacity to be found in allocatable")
}
// CPU based evictions are not supported.
if allocatableCPU.Cmp(schedulerAllocatable[v1.ResourceCPU]) != 0 {
return fmt.Errorf("Unexpected cpu allocatable value exposed by the node. Expected: %v, got: %v, capacity: %v", allocatableCPU, schedulerAllocatable[v1.ResourceCPU], capacity[v1.ResourceCPU])
}
if allocatableMemory.Cmp(schedulerAllocatable[v1.ResourceMemory]) != 0 {
return fmt.Errorf("Unexpected memory allocatable value exposed by the node. Expected: %v, got: %v, capacity: %v", allocatableMemory, schedulerAllocatable[v1.ResourceMemory], capacity[v1.ResourceMemory])
}
return nil
}, time.Minute, 5*time.Second).Should(gomega.Succeed())
cgroupPath := ""
if currentConfig.CgroupDriver == "systemd" {
cgroupPath = cm.NewCgroupName(cm.RootCgroupName, kubeReservedCgroup).ToSystemd()
} else {
cgroupPath = cgroupManager.Name(cm.NewCgroupName(cm.RootCgroupName, kubeReservedCgroup))
}
// Expect CPU shares on kube reserved cgroup to equal it's reservation which is `100m`.
kubeReservedCPU := resource.MustParse(currentConfig.KubeReserved[string(v1.ResourceCPU)])
shares := int64(cm.MilliCPUToShares(kubeReservedCPU.MilliValue()))
if IsCgroup2UnifiedMode() {
if err := expectFileValToEqual(filepath.Join(subsystems.MountPoints["cpu"], cgroupPath, "cpu.weight"), convertSharesToWeight(shares), 10); err != nil {
return err
}
} else {
if err := expectFileValToEqual(filepath.Join(subsystems.MountPoints["cpu"], cgroupPath, "cpu.shares"), shares, 10); err != nil {
return err
}
}
// Expect Memory limit kube reserved cgroup to equal configured value `100Mi`.
kubeReservedMemory := resource.MustParse(currentConfig.KubeReserved[string(v1.ResourceMemory)])
if err := expectFileValToEqual(filepath.Join(subsystems.MountPoints["memory"], cgroupPath, memoryLimitFile), kubeReservedMemory.Value(), 0); err != nil {
return err
}
// Expect process ID limit kube reserved cgroup to equal configured value `738`.
kubeReservedPIDs := resource.MustParse(currentConfig.KubeReserved[string(pidlimit.PIDs)])
if err := expectFileValToEqual(filepath.Join(subsystems.MountPoints["pids"], cgroupPath, "pids.max"), kubeReservedPIDs.Value(), 0); err != nil {
return err
}
if currentConfig.CgroupDriver == "systemd" {
cgroupPath = cm.NewCgroupName(cm.RootCgroupName, systemReservedCgroup).ToSystemd()
} else {
cgroupPath = cgroupManager.Name(cm.NewCgroupName(cm.RootCgroupName, systemReservedCgroup))
}
// Expect CPU shares on system reserved cgroup to equal it's reservation which is `100m`.
systemReservedCPU := resource.MustParse(currentConfig.SystemReserved[string(v1.ResourceCPU)])
shares = int64(cm.MilliCPUToShares(systemReservedCPU.MilliValue()))
if IsCgroup2UnifiedMode() {
if err := expectFileValToEqual(filepath.Join(subsystems.MountPoints["cpu"], cgroupPath, "cpu.weight"), convertSharesToWeight(shares), 10); err != nil {
return err
}
} else {
if err := expectFileValToEqual(filepath.Join(subsystems.MountPoints["cpu"], cgroupPath, "cpu.shares"), shares, 10); err != nil {
return err
}
}
// Expect Memory limit on node allocatable cgroup to equal allocatable.
systemReservedMemory := resource.MustParse(currentConfig.SystemReserved[string(v1.ResourceMemory)])
if err := expectFileValToEqual(filepath.Join(subsystems.MountPoints["memory"], cgroupPath, memoryLimitFile), systemReservedMemory.Value(), 0); err != nil {
return err
}
// Expect process ID limit system reserved cgroup to equal configured value `1000`.
systemReservedPIDs := resource.MustParse(currentConfig.SystemReserved[string(pidlimit.PIDs)])
if err := expectFileValToEqual(filepath.Join(subsystems.MountPoints["pids"], cgroupPath, "pids.max"), systemReservedPIDs.Value(), 0); err != nil {
return err
}
return nil
}