Commit graph

115407 commits

Author SHA1 Message Date
Miciah Masters
fc18ffe58d TopologyAwareHints: Take lock in HasPopulatedHints
Prevent potential concurrent map access by taking a lock before reading the
topology cache's hintsPopulatedByService map.

* staging/src/k8s.io/endpointslice/topologycache/topologycache.go
(setHintsLocked, hasPopulatedHintsLocked): New helper functions.  These are
the same as the existing SetHints and HasPopulatedHints methods except that
these helpers assume that a lock is already held.
(SetHints): Use setHintsLocked.
(HasPopulatedHints): Take a lock and use hasPopulatedHintsLocked.
(AddHints): Take a lock and use setHintsLocked and hasPopulatedHintsLocked.
* staging/src/k8s.io/endpointslice/topologycache/topologycache_test.go
(TestTopologyCacheRace): Add a goroutine that calls HasPopulatedHints.
2023-08-31 16:38:51 -04:00
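
The locking pattern described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the actual topologycache code: the `hintsCache` type, its simplified `map[string]bool` field and the `newHintsCache` constructor are assumptions made for the example. Only the names setHintsLocked, hasPopulatedHintsLocked, SetHints and HasPopulatedHints come from the commit message.

```go
package main

import (
	"fmt"
	"sync"
)

// hintsCache is a hypothetical, simplified stand-in for the topology cache.
type hintsCache struct {
	lock                    sync.Mutex
	hintsPopulatedByService map[string]bool
}

// newHintsCache is a hypothetical constructor used only by this sketch.
func newHintsCache() *hintsCache {
	return &hintsCache{hintsPopulatedByService: map[string]bool{}}
}

// setHintsLocked writes to the map. The caller must already hold c.lock.
func (c *hintsCache) setHintsLocked(serviceKey string, populated bool) {
	c.hintsPopulatedByService[serviceKey] = populated
}

// hasPopulatedHintsLocked reads the map. The caller must already hold c.lock.
func (c *hintsCache) hasPopulatedHintsLocked(serviceKey string) bool {
	return c.hintsPopulatedByService[serviceKey]
}

// SetHints takes the lock and delegates to the locked helper.
func (c *hintsCache) SetHints(serviceKey string, populated bool) {
	c.lock.Lock()
	defer c.lock.Unlock()
	c.setHintsLocked(serviceKey, populated)
}

// HasPopulatedHints takes the lock before reading, so concurrent
// writers cannot race with this read.
func (c *hintsCache) HasPopulatedHints(serviceKey string) bool {
	c.lock.Lock()
	defer c.lock.Unlock()
	return c.hasPopulatedHintsLocked(serviceKey)
}

func main() {
	c := newHintsCache()
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.SetHints("ns/svc", true)
			_ = c.HasPopulatedHints("ns/svc")
		}()
	}
	wg.Wait()
	fmt.Println(c.HasPopulatedHints("ns/svc"))
}
```

Splitting each method into a public wrapper and a "Locked" helper lets a method that needs both operations take the lock once and call the helpers, rather than re-acquiring a non-reentrant mutex.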
Kubernetes Prow Robot
32842f1d00
Merge pull request #120054 from ruiwen-zhao/automated-cherry-pick-of-#119986-upstream-release-1.27
Automated cherry pick of #119986: Pass Pinned field to kubecontainer.Image
2023-08-30 08:44:47 -07:00
Kubernetes Prow Robot
4737b87479
Merge pull request #119432 from ffromani/automated-cherry-pick-of-#118635-upstream-release-1.27-1689697264
[1.27]  kubelet: devices: skip allocation for running pods #118635
2023-08-30 07:42:47 -07:00
Kubernetes Prow Robot
7c74995186
Merge pull request #120209 from mimowo/automated-cherry-pick-of-#120204-upstream-release-1.27
Automated cherry pick of #120204: Mark Job onPodConditions as optional in pod failure policy
2023-08-28 11:36:13 -07:00
Michal Wozniak
3449ab21b4 Mark Job onPodConditions as optional in pod failure policy 2023-08-28 17:19:14 +02:00
Kubernetes Prow Robot
350fe2642a
Merge pull request #120067 from HirazawaUi/automated-cherry-pick-of-#116506-upstream-release-1.27
Automated cherry pick of #116506: generate ReportingInstance and ReportingController in Event
2023-08-28 01:07:23 -07:00
Kubernetes Prow Robot
e370904e28
Merge pull request #120020 from divyasri537/automated-cherry-pick-of-#119341-upstream-release-1.27
Automated cherry pick of #119341: Ignore context canceled from validate and mutate webhook
2023-08-25 09:48:52 -07:00
Kubernetes Release Robot
dd6bc2548e Update CHANGELOG/CHANGELOG-1.27.md for v1.27.5 2023-08-24 01:07:20 +00:00
Kubernetes Release Robot
93e0d7146f Release commit for Kubernetes v1.27.5 2023-08-24 00:42:11 +00:00
Divya Sri Sanaganapalli
94c19e36b7 Incorporating feedback on 119341 2023-08-23 20:28:32 +00:00
Kubernetes Prow Robot
38c97fa67e
Merge pull request #120135 from ritazh/cherry-pick-cve-2023-3955-1.27
Cherry pick of #120128 Use environment variables for parameters in Powershell
2023-08-23 12:35:57 -07:00
Kubernetes Prow Robot
8904833942
Merge pull request #120130 from ritazh/cherry-pick-cve-2023-3676-1.27
Cherry pick of #120127 Use env variables for passing path and subpath to Powershell
2023-08-23 10:51:29 -07:00
James Sturtevant
acc29048e6
Use environment variables for parameters in Powershell
As a defense in depth, pass parameters to powershell via environment variables.

Signed-off-by: James Sturtevant <jstur@microsoft.com>
2023-08-23 07:00:53 -07:00
James Sturtevant
172644fb55
Use env variables for passing path
The subpath could be passed a PowerShell subexpression, which would be executed by kubelet with privilege.  Switching to pass the arguments via environment variables means the subexpression won't be evaluated.

Signed-off-by: James Sturtevant <jstur@microsoft.com>
2023-08-23 06:39:13 -07:00
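
The idea behind the two PowerShell commits above can be sketched as follows. This is an illustration under stated assumptions, not the kubelet code: the `buildPathCheck` function, the `TARGET_PATH` variable name and the `Test-Path` script are hypothetical. The point is only that the script text stays constant and the untrusted value travels as environment data, so it is never parsed as script source.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// script is a constant. It never has user input concatenated into it.
// TARGET_PATH is a hypothetical variable name chosen for this sketch.
const script = "Test-Path -LiteralPath $env:TARGET_PATH"

// buildPathCheck (hypothetical helper) prepares a PowerShell invocation
// that receives the untrusted path through the environment only.
func buildPathCheck(untrustedPath string) *exec.Cmd {
	cmd := exec.Command("powershell", "-NoProfile", "-NonInteractive", "-Command", script)
	// The value is passed as data, so a subexpression such as $(...)
	// inside it is not part of the command text.
	cmd.Env = append(os.Environ(), "TARGET_PATH="+untrustedPath)
	return cmd
}

func main() {
	// The command is only constructed here, not executed.
	cmd := buildPathCheck(`C:\data\$(Start-Process calc)`)
	fmt.Println(cmd.Args)
}
```

By contrast, building the command with string concatenation (for example `"Test-Path " + untrustedPath`) would place the subexpression inside the script text itself.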
HirazawaUi
9c142a62f7 generate ReportingInstance and ReportingController in Event 2023-08-19 17:19:44 +08:00
ruiwen-zhao
78e14a2d80 Pass Pinned field to kubecontainer.Image
Signed-off-by: ruiwen-zhao <ruiwen@google.com>
2023-08-18 17:38:43 +00:00
Divya Sri Sanaganapalli
27d56a7332 Skip apiserver_admission_webhook_request_total during context-canceled 2023-08-17 13:35:50 +00:00
Divya Sri Sanaganapalli
6c8436b1f5 Ignore context canceled from validate and mutate webhook failopen metric 2023-08-17 13:35:50 +00:00
Kubernetes Prow Robot
00dfa0634b
Merge pull request #119868 from liggitt/automated-cherry-pick-of-#119835-upstream-release-1.27
Automated cherry pick of #119835: Avoid returning nil responseKind in v1beta1 aggregated
2023-08-10 07:13:27 -07:00
Jordan Liggitt
3b6bcaa0b9
Avoid returning nil responseKind in v1beta1 aggregated discovery 2023-08-09 14:43:56 -04:00
Kubernetes Prow Robot
bd722aa3ff
Merge pull request #119828 from jeremyrickard/go1207-1.27
[release-1.27] releng/go: Bump images, versions and deps to use Go 1.…
2023-08-08 16:41:50 -07:00
Jeremy Rickard
94b3e00eef
[release-1.27] releng/go: Bump images, versions and deps to use Go 1.20.7
Signed-off-by: Jeremy Rickard <jeremyrrickard@gmail.com>
2023-08-08 09:57:24 -06:00
Francesco Romani
e5512149e2 node: devicemgr: topomgr: add logs
One of the factors that made issues #118559 and #109595 hard to
debug and fix is that the devicemanager has very few logs in important
flows, so it's unnecessarily hard to reconstruct the state from logs.

We add minimal logs to be able to improve troubleshooting.
We add minimal logs to be backport-friendly, deferring a more
comprehensive review of logging to later PRs.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-08-08 13:36:40 +02:00
Francesco Romani
c3c0ce4b0b e2e: node: add test to check device-requiring pods are cleaned up
Make sure orphaned pods (pods deleted while kubelet is down) are
handled correctly.
Outline:
1. create a pod (not static pod)
2. stop kubelet
3. while kubelet is down, force delete the pod on API server
4. restart kubelet
the pod becomes an orphaned pod and is expected to be killed by HandlePodCleanups.

There is a similar test already, but here we want to check device
assignment.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-08-08 13:36:40 +02:00
Francesco Romani
5bad46a243 e2e: node: devices: improve the node reboot test
The recently added e2e device plugins test to cover node reboot
works fine if it runs every time on a clean environment (e.g. CI) but
doesn't correctly handle partial setup when run repeatedly on
the same instance (developer setup).

To accommodate both flows, we extend the error management, checking
more error conditions in the flow.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-08-08 13:36:40 +02:00
Francesco Romani
dd851f4880 e2e: node: devicemanager: update tests
Fix e2e device manager tests.
Most notably, the workload pods need to survive a kubelet
restart. Update tests to reflect that.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-08-08 13:36:39 +02:00
Francesco Romani
34f2a5803a kubelet: devices: skip allocation for running pods
When kubelet initializes, it runs admission for pods and possibly
allocates requested resources. We need to distinguish between
node reboot (no containers running) versus kubelet restart (containers
potentially running).

Running pods should always survive kubelet restart.
This means that device allocation on admission should not be attempted,
because if a container requires devices and is still running when kubelet
is restarting, that container already has devices allocated and working.

Thus, we need to properly detect this scenario in the allocation step
and handle it explicitly. We need to inform
the devicemanager about which pods are already running.

Note that if container runtime is down when kubelet restarts, the
approach implemented here won't work. In this scenario, on kubelet
restart containers will again fail admission, hitting
https://github.com/kubernetes/kubernetes/issues/118559 again.
This scenario should however be pretty rare.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-08-08 13:36:13 +02:00
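
The behaviour described in the commit above might look roughly like this. It is a sketch only: the `deviceManager` type, its fields and the `Allocate` signature are hypothetical and much simpler than the real devicemanager. It shows the single decision the message describes: a pod already known to be running keeps its devices and skips allocation at admission.

```go
package main

import "fmt"

// deviceManager is a hypothetical, simplified model for illustration.
type deviceManager struct {
	// runningPods holds the UIDs of pods reported as running when
	// kubelet started (assumed to come from the container runtime).
	runningPods map[string]bool
	// allocated maps pod UID to the devices assigned to it.
	allocated map[string][]string
}

// Allocate assigns devices to a pod at admission time. It reports
// skipped=true when the pod is already running, in which case the
// existing assignment is left untouched.
func (m *deviceManager) Allocate(podUID string, devices []string) (skipped bool) {
	if m.runningPods[podUID] {
		// Kubelet restart: the container already has working devices.
		return true
	}
	// Node reboot or new pod: perform a fresh allocation.
	m.allocated[podUID] = devices
	return false
}

func main() {
	m := &deviceManager{
		runningPods: map[string]bool{"pod-a": true},
		allocated:   map[string][]string{"pod-a": {"dev0"}},
	}
	fmt.Println(m.Allocate("pod-a", []string{"dev1"}), m.allocated["pod-a"])
	fmt.Println(m.Allocate("pod-b", []string{"dev1"}), m.allocated["pod-b"])
}
```

As the commit message notes, this only helps when the set of running pods can be determined, which is not the case if the container runtime is down during the kubelet restart.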
Kubernetes Prow Robot
de56018f04
Merge pull request #117269 from tnqn/automated-cherry-pick-of-#117245-#117249-upstream-release-1.27
Automated cherry pick of #117245: Fix TopologyAwareHint not working when zone label is added
#117249: Fix a data race in TopologyCache
2023-08-04 13:26:31 -07:00
Kubernetes Prow Robot
521580378a
Merge pull request #119363 from jsafrane/automated-cherry-pick-of-#117804-upstream-release-1.27
Automated cherry pick of #117804: Refactor FindAttachablePluginBySpec out of CSI code path
2023-08-04 11:58:08 -07:00
Kubernetes Prow Robot
d35a1c8a7a
Merge pull request #119620 from liggitt/automated-cherry-pick-of-#117710-upstream-release-1.27
Automated cherry pick of #117710: e2e_node: move getSampleDevicePluginPod to
2023-08-03 16:12:20 -07:00
Kubernetes Prow Robot
579208d961
Merge pull request #117486 from TommyStarK/automated-cherry-pick-of-#117449-upstream-release-1.27
Automated cherry pick of #117449: e2e: fix flaky test 'should contain OpenAPI V3 for Aggregated
2023-08-02 09:00:43 -07:00
Kubernetes Prow Robot
2ac615ccde
Merge pull request #117235 from cvvz/automated-cherry-pick-of-#116134-origin-release-1.27
Automated cherry pick of #116134: fix: After a Node goes down and takes some time to come back up, the mount points of the evicted Pods cannot be cleaned up successfully.
2023-08-02 05:32:44 -07:00
Kubernetes Prow Robot
559f43d49c
Merge pull request #119466 from mimowo/automated-cherry-pick-of-#119434-upstream-release-1.27
Automated cherry pick of #119434: Include ignored pods when computing backoff delay for Job pod
2023-08-02 04:36:54 -07:00
Kubernetes Prow Robot
382c283f33
Merge pull request #119113 from champtar/automated-cherry-pick-of-#118922-upstream-release-1.27
Automated cherry pick of #118922: kubeadm: backdate generated CAs
2023-08-02 04:36:42 -07:00
Kubernetes Prow Robot
05b64c6b5e
Merge pull request #119604 from a7i/automated-cherry-pick-of-#118549-upstream-release-1.27
Automated cherry pick of #118549: fix 'pod' in kubelet prober metrics
2023-07-28 01:35:56 -07:00
Kubernetes Prow Robot
ecd45047e4
Merge pull request #119572 from andrewsykim/automated-cherry-pick-of-#118601-origin-release-1.27
Automated cherry pick of #118601: priority & fairness: support dynamic max seats
2023-07-28 00:05:54 -07:00
Hana (Hyang-Ah) Kim
927dba2589
e2e_node: move getSampleDevicePluginPod to device_plugin_test.go
image_list.go is one of the files included in the non-test variant Go build list, but its getSampleDevicePluginPod function references readDaemonSetV1OrDie function defined in device_plugin_test.go which is included in the test variant Go build list only. (The file name is *_test.go).

As a result, "go build" fails with the undefined reference error.

In practice, that may not be an issue since k8s project contributors aren't meant to run go build on this package. However, tools that depend on go build to operate - e.g., gopls or govulncheck ./... - will report this as an error.

Fix this error and make the test/e2e package pass go build by moving this function into test-only source code as well.
2023-07-27 12:09:17 -04:00
Amir Alavi
db832fdfa6
fix 'pod' in kubelet prober metrics 2023-07-26 21:44:00 -04:00
Andrew Sy Kim
4c67c5d5e7 priority & fairness: support dynamically configuring work estimator max seats
Max seats from priority & fairness work estimator is now min(0.15 x
nominalCL, nominalCL/handSize)

'Max seats' calculated by work estimator is currently hard coded to 10.
When using lower values for --max-requests-inflight, a single
LIST request taking up 10 seats could end up using all if not most seats in
the priority level. This change updates the default work estimator
config such that 'max seats' is at most 10% of the
maximum concurrency limit for a priority level, with an upper limit of 10.
This ensures seats taken by a LIST request are proportional to the total
available seats.

Signed-off-by: Andrew Sy Kim <andrewsy@google.com>
2023-07-25 20:20:07 +00:00
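
The formula in the commit above can be worked through with a small sketch. The message itself mentions both 0.15 x nominalCL (first line) and 10% (body), so this example follows the first line, min(0.15 x nominalCL, nominalCL/handSize), and adds the upper limit of 10 from the body. The function name `maxSeats`, the rounding choices and the floor of 1 seat are assumptions made for illustration, not the upstream implementation.

```go
package main

import "fmt"

// maxSeats is a hypothetical helper that computes
// min(ceil(0.15*nominalCL), nominalCL/handSize), capped at 10 and
// floored at 1. Integer arithmetic avoids floating-point rounding.
func maxSeats(nominalCL, handSize int) int {
	if nominalCL < 1 || handSize < 1 {
		return 1
	}
	byShare := (15*nominalCL + 99) / 100 // ceil(0.15 * nominalCL)
	byHand := nominalCL / handSize       // integer division (assumption)
	seats := byShare
	if byHand < seats {
		seats = byHand
	}
	if seats > 10 { // upper limit stated in the commit message
		seats = 10
	}
	if seats < 1 { // assumed floor: a request occupies at least one seat
		seats = 1
	}
	return seats
}

func main() {
	for _, c := range []struct{ nominalCL, handSize int }{
		{100, 6}, {20, 6}, {10, 8},
	} {
		fmt.Printf("nominalCL=%d handSize=%d maxSeats=%d\n",
			c.nominalCL, c.handSize, maxSeats(c.nominalCL, c.handSize))
	}
}
```

With a small nominal concurrency limit the result drops well below the previous hard-coded 10, which is the effect the commit message describes for low --max-requests-inflight values.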
Kubernetes Prow Robot
6d31f4b31b
Merge pull request #119519 from jingxu97/automated-cherry-pick-of-#118451-upstream-release-1.27
Add mininumKubelet tag into ReadWriteOncePod test
2023-07-24 15:22:14 -07:00
jinxu
17c98720e8 Add mininumKubelet tag into ReadWriteOncePod test
The ReadWriteOncePod feature requires kubelet 1.27 or later; add the
tag to skip the test if the kubelet version is older than 1.27

Change-Id: I27959156db90f2477cead6dfc16f42dbc54663bc
2023-07-22 09:38:04 -07:00
Michal Wozniak
ed0cdc9e0b Include ignored pods when computing backoff delay for Job pod failures
# Conflicts:
#	pkg/controller/job/job_controller.go
2023-07-21 09:31:49 +02:00
Michal Wozniak
ae24a5cf74 Remarks 2023-07-21 09:29:47 +02:00
Michal Wozniak
9e1050b4d9 Adjust the algorithm for computing the pod finish time
Change-Id: Ic282a57169cab8dc498574f08b081914218a1039
2023-07-20 16:29:26 +02:00
Kubernetes Release Robot
fa950050cc Update CHANGELOG/CHANGELOG-1.27.md for v1.27.4 2023-07-19 12:38:41 +00:00
Kubernetes Release Robot
fa3d799010 Release commit for Kubernetes v1.27.4 2023-07-19 12:14:48 +00:00
Kubernetes Prow Robot
d794e0e5cf
Merge pull request #119366 from xmudrii/go1206-1.27
[release-1.27] releng/go: Bump images, versions and deps to use Go 1.20.6
2023-07-17 06:37:12 -07:00
Marko Mudrinić
a1b127ca7a
[release-1.27] releng/go: Bump images, versions and deps to use Go 1.20.6
Signed-off-by: Marko Mudrinić <mudrinic.mare@gmail.com>
2023-07-17 12:26:55 +02:00
Jan Safranek
aefc4d0392 Rename updateReconstructedFromAPIServer
to be in sync with volumesNeedUpdateFromNodeStatus.
2023-07-17 11:18:48 +02:00
Jan Safranek
eeba02fc62 Rename volumesNeedDevicePath
To volumesNeedUpdateFromNodeStatus - because both devicePath and uncertain
attach-ability need to be fixed from node status.
2023-07-17 11:18:48 +02:00