Commit graph

846 commits

Author SHA1 Message Date
Jordan Liggitt
203d8ac838
Generate and format files
- Run hack/update-codegen.sh
- Run hack/update-generated-device-plugin.sh
- Run hack/update-generated-protobuf.sh
- Run hack/update-generated-runtime.sh
- Run hack/update-generated-swagger-docs.sh
- Run hack/update-openapi-spec.sh
- Run hack/update-gofmt.sh

Replay of a9593d634c
2022-12-20 17:26:07 -05:00
xing-yang
479f049df9 Fix unit test 2022-09-07 22:44:17 -04:00
ZhangKe10140699
62e1ea58c4 Fix problem in updating VolumeAttached in node status 2022-09-07 18:21:39 -04:00
Jean-Francois Remy
56bfc202e4 Add unit tests
- actual_state_of_world_test.go: test the new method GetVolumesToReportAttachedForNode
  for an existing node and a non-existing node
- node_status_updater_test.go: test UpdateNodeStatuses and UpdateNodeStatusForNode in nominal
  case with 2 nodes getting one volume each. Test UpdateNodeStatuses with the first call
  to node.patch failing but the following one succeeding
- add comment in node_status_updater.go
- fix log line in reconciler.go
- rename variable in actual_state_of_world.go
2022-03-02 10:48:18 -08:00
Jean-Francois Remy
a5faf0b5ce Fix nodes volumesAttached status not updated
The UpdateNodeStatuses code stops too early in case there is
an error when calling updateNodeStatus. It will return immediately
which means any remaining node won't have its update status put back
to true.

Looking at the call sites for UpdateNodeStatuses, it appears this is
not the only issue. If the lister call fails with anything but a Not Found
error, it's silently ignored which is wrong in the detach path.
Also the reconciler detach path calls UpdateNodeStatuses but the real intent
is to only update the node currently processed in the loop and not proceed
with the detach call if there is an error updating that specific node volumesAttached
property. With the current implementation, it will not proceed if there is
an error updating another node (which is not completely bad but not ideal) and
worse it will proceed if there is a lister error on that node which means the
node volumesAttached property won't have been updated.

To fix those issues, introduce the following changes:
- [node_status_updater] introduce UpdateNodeStatusForNode which does what
  UpdateNodeStatuses does but only for the provided node
- [node_status_updater] if the node lister call fails for anything but a Not
  Found error, we will return an error, not ignore it
- [node_status_updater] if the update of a node volumesAttached properties fails
  we continue processing the other nodes
- [actual_state_of_world] introduce GetVolumesToReportAttachedForNode which
  does what GetVolumesToReportAttached does, but only for the node whose name is
  provided. It returns a bool which indicates whether that node needs an update,
  as well as the volumesAttached list. It is used by UpdateNodeStatusForNode
- [actual_state_of_world] use write lock in updateNodeStatusUpdateNeeded, we're
  modifying the map content
- [reconciler] use UpdateNodeStatusForNode in the detach loop
2022-03-02 10:47:28 -08:00
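
The "continue processing the other nodes" behaviour described in the commit above can be illustrated with a minimal Go sketch. It is not the Kubernetes implementation: `nodeUpdater`, `updateAllNodes`, and `errPatchFailed` are hypothetical names introduced only for this illustration, and the real code works with node listers and patch calls rather than a plain callback.

```go
package main

import (
	"errors"
	"fmt"
)

// errPatchFailed is a hypothetical error used only in this sketch.
var errPatchFailed = errors.New("patch failed")

// nodeUpdater is a hypothetical stand-in for the call that updates one
// node's volumesAttached field. It is not the real Kubernetes API.
type nodeUpdater func(nodeName string) error

// updateAllNodes attempts every node and keeps going after a failure, so a
// single failing node does not prevent the remaining nodes from being
// processed. It returns the names of the nodes that failed and a summary error.
func updateAllNodes(nodes []string, update nodeUpdater) ([]string, error) {
	var failed []string
	for _, name := range nodes {
		if err := update(name); err != nil {
			// Record the failure and move on instead of returning early.
			failed = append(failed, name)
		}
	}
	if len(failed) > 0 {
		return failed, fmt.Errorf("failed to update %d node(s): %v", len(failed), failed)
	}
	return nil, nil
}

func main() {
	updated := map[string]bool{}
	update := func(name string) error {
		if name == "node-a" {
			return errPatchFailed
		}
		updated[name] = true
		return nil
	}

	failed, err := updateAllNodes([]string{"node-a", "node-b", "node-c"}, update)
	fmt.Println("failed nodes:", failed)
	fmt.Println("updated nodes:", len(updated))
	fmt.Println("error:", err)
}
```

Returning the list of failed nodes alongside the error lets a caller decide per node whether to proceed, which is the distinction the commit draws between updating all nodes and updating only the node being processed.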
Hemant Kumar
f61c4b18c4 use node informer to check volumes attachment status before backoff
fix unit tests
2022-01-06 12:15:56 -05:00
Kubernetes Prow Robot
6805e6ee41
Merge pull request #104722 from leiyiz/migration
turning on the CSIMigrationGCE feature flag
2021-11-16 15:28:32 -08:00
Kubernetes Prow Robot
f151a40d8d
Merge pull request #106154 from gnufied/recover-expansion-failure-123
Recover expansion failure
2021-11-16 13:21:34 -08:00
Léiyì Zhang
275fdf0884 fixing unit test failures induced by turning on CSIMigrationGCE
disable CSIMigrationGCE in some unit tests
2021-11-16 19:26:30 +00:00
Hemant Kumar
1ddd598d31 Implement controller and kubelet changes for recovery from resize
failures
2021-11-16 11:06:46 -05:00
Kubernetes Prow Robot
ce98eda406
Merge pull request #106376 from jsafrane/stabilize-unit-test
Fix deletion protection unit test
2021-11-15 13:04:48 -08:00
Neha Lohia
fa1b6765d5
move pkg/util/node to component-helpers/node/util (#105347)
Signed-off-by: Neha Lohia <nehapithadiya444@gmail.com>
2021-11-12 07:52:27 -08:00
Jan Safranek
bb8157d780 Fix deletion protection unit test
The test should not depend on current set of default feature gates, it
should always ensure the ones necessary for the tests are set.
2021-11-12 10:47:15 +01:00
Deepak Kinni
bfd5f23a0b PV controller changes to support PV Deletion protection finalizer
Signed-off-by: Deepak Kinni <dkinni@vmware.com>
2021-11-08 10:35:58 -08:00
Konstantin Misyutin
808c8f42d5 Remove StorageObjectInUseProtection feature gate logic
This feature has graduated to GA in v1.11 and will always be
enabled. So there is no longer any need to check whether it is enabled.

Signed-off-by: Konstantin Misyutin <konstantin.misyutin@huawei.com>
2021-11-03 00:13:50 +03:00
Mike Dame
4960d0976a Wire contexts to Core controllers 2021-11-01 10:29:00 -04:00
Kubernetes Prow Robot
c592bd40f2
Merge pull request #105609 from pohly/generic-ephemeral-volume-ga
generic ephemeral volume GA
2021-10-28 17:36:50 -07:00
Konstantin Misyutin
dbc9d7b71a Remove tests when StorageObjectInUseProtection feature is disabled
Once the feature gate is locked, the tests that run with this feature
disabled will crash. So we should remove them together with locking
the feature.

Signed-off-by: Konstantin Misyutin <konstantin.misyutin@huawei.com>
2021-10-15 19:39:37 +08:00
Kubernetes Prow Robot
baaa53db64
Merge pull request #105211 from xiaopingrubyist/fix-pv-controller-claim-cache-issue
fix:claim cached in pvcontroller is not the newest may cause unexpected issue
2021-10-14 05:47:18 -07:00
torubylist
f28a8d7f2b fix:cached claim is not the newest will cause unexpected issue 2021-10-13 20:03:00 +08:00
Patrick Ohly
a8c930ef46 generic ephemeral volume: graduation to GA
The feature gate gets locked to "true", with the goal to remove it in two
releases.

All code now can assume that the feature is enabled. Tests for "feature
disabled" are no longer needed and get removed.

Some code wasn't using the new helper functions yet. That gets changed while
touching those lines.
2021-10-11 20:54:20 +02:00
Kubernetes Prow Robot
b0eac84937
Merge pull request #105345 from pohly/generic-ephemeral-volume-util
generic ephemeral volume util, base code and controller
2021-10-07 08:19:47 -07:00
Patrick Ohly
4ae0eecb34 controller: use generic ephemeral volume helper functions
The name concatenation and ownership check were originally considered small
enough to not warrant dedicated functions, but the intent of the code is more
readable with them.

There also was a missing owner check in the attach controller.
2021-10-06 14:01:44 +02:00
Kubernetes Prow Robot
debd6c1e9e
Merge pull request #104526 from jingxu97/aug/volumeattach
Fix issue in node status updating VolumeAttached list
2021-10-05 17:30:32 -07:00
Jing Xu
69b9f9b1f0 Fix issue in node status updating VolumeAttached list
During volume detach, the following might happen in reconciler

1. Pod is deleting
2. remove volume from reportedAsAttached, so node status updater will
update volumeAttached list
3. detach failed due to some issue
4. volume is added back in reportedAsAttached
5. reconciler loops over the volume again and removes it from
reportedAsAttached
6. detach will not be triggered because of exponential backoff; the detach call
will fail with an exponential backoff error
7. another pod is added which uses the same volume on the same node
8. reconciler loops and it will NOT try to trigger detach anymore

At this point, the volume is still attached and in the actual state, but the
volumeAttached list in node status does not have this volume anymore, which
will block volume mount from kubelet.

The fix in the first round was to add the volume back into the list of volumes
that need to be reported as attached at step 6, when the detach call failed
with an error (exponential backoff). However, this might have a performance
issue if detach fails for a while. During this time, the volume keeps being
removed from and added back to node status, which causes a surge of API
calls.

So we changed the logic to first check whether the operation is safe to retry,
which means there is no pending operation and it is not in the exponential
backoff time period, before calling detach. This way we avoid repeatedly
removing the volume from and adding it back to node status.

Change-Id: I5d4e760c880d72937d34b9d3e904ecad125f802e
2021-10-05 09:44:35 -07:00
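
The ordering described in the commit above (check that a retry is safe before touching the reported-as-attached list) can be sketched in Go as follows. This is an illustration under stated assumptions, not the reconciler's code: `opState`, `safeToRetry`, and `reconcileDetach` are hypothetical names, and the reported list is modelled as a plain map.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// opState is a hypothetical record of the last detach attempt for a volume.
type opState struct {
	pending    bool      // an operation is still in flight
	retryAfter time.Time // end of the current backoff window (zero if none)
}

// safeToRetry reports whether a new detach may start at time now: there is
// no pending operation and the backoff window has elapsed.
func safeToRetry(s opState, now time.Time) bool {
	return !s.pending && !now.Before(s.retryAfter)
}

// reconcileDetach removes the volume from the reported set only when a detach
// can actually be attempted. During backoff it leaves the set untouched, so
// the volume is not repeatedly removed from and re-added to node status.
// It returns true if a detach was attempted.
func reconcileDetach(vol string, s opState, now time.Time,
	reported map[string]bool, detach func(string) error) bool {
	if !safeToRetry(s, now) {
		return false
	}
	delete(reported, vol)
	if err := detach(vol); err != nil {
		// Detach failed: keep reporting the volume as attached.
		reported[vol] = true
	}
	return true
}

func main() {
	now := time.Now()
	reported := map[string]bool{"vol-1": true}
	failingDetach := func(string) error { return errors.New("detach failed") }

	// Inside the backoff window nothing changes.
	inBackoff := opState{retryAfter: now.Add(30 * time.Second)}
	fmt.Println("attempted:", reconcileDetach("vol-1", inBackoff, now, reported, failingDetach))
	fmt.Println("still reported:", reported["vol-1"])

	// Outside the window a detach is attempted; on failure the volume is re-added.
	fmt.Println("attempted:", reconcileDetach("vol-1", opState{}, now, reported, failingDetach))
	fmt.Println("still reported:", reported["vol-1"])
}
```

With this ordering, a volume stuck in backoff costs no node status updates at all, which is the API call surge the commit message sets out to avoid.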
Kubernetes Prow Robot
b6924839ca
Merge pull request #101987 from sky-philipalmeida/patch-1
Log if PV is still in use trying to delete it
2021-09-23 14:30:54 -07:00
Phil
f1a9402082 Log if PV is still in use trying to delete it
Similar to what we have in:
https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/volume/pvcprotection/pvc_protection_controller.go#L181
The objective is to have an easy way to monitor whether a PV will enter the Terminating state due to a failed removal while it is still in use.
This way we can capture the PV log and alert accordingly.
The code is not tested.

Update pv_protection_controller.go

Change call to Infof
2021-09-21 18:05:16 +01:00
Shivanshu Raj Shrivastava
bbd809cbd0
Fixing incorrectly migrated structured logs (#105122)
* added keys for structured logging

* used KObj
2021-09-19 12:28:08 -07:00
Kubernetes Prow Robot
bcd2ffbdc1
Merge pull request #104590 from Jiawei0227/anno
Add GA AnnStorageProvisioner annotation to PVC
2021-09-03 06:09:49 -07:00
Kubernetes Prow Robot
fca3175df7
Merge pull request #104231 from astraw99/fix_unified_workers
Unify controller worker num param `threadiness` to `workers`
2021-08-27 09:34:05 -07:00
Jiawei Wang
8de0f11946 Add GA AnnStorageProvisioner annotation to PVC
This PR adds GA AnnStorageProvisioner annotation to
a PVC if the PVC requires dynamic provisioning. This
also deprecates the beta AnnStorageProvisioner annotation
and it will be removed in a later release.
2021-08-26 12:46:47 -07:00
Stephen Augustus
481cf6fbe7
generated: Run hack/update-gofmt.sh
Signed-off-by: Stephen Augustus <foo@auggie.dev>
2021-08-24 15:47:49 -04:00
Konstantin Misyutin
29bd66d018 Remove "pkg/controller/volume/scheduling" dependency from "pkg/scheduler/framework/plugins"
All dependencies of VolumeBinding plugin from
"k8s.io/kubernetes/pkg/controller/volume/scheduling" package moved to
"k8s.io/kubernetes/pkg/scheduler/framework/plugins/volumebinding" package:

- whole file pkg/controller/volume/scheduling/scheduler_assume_cache.go
- whole file pkg/controller/volume/scheduling/scheduler_assume_cache_test.go
- whole file pkg/controller/volume/scheduling/scheduler_binder.go
- whole file pkg/controller/volume/scheduling/scheduler_binder_fake.go
- whole file pkg/controller/volume/scheduling/scheduler_binder_test.go

Package "k8s.io/kubernetes/pkg/controller/volume/scheduling/metrics" moved
to "k8s.io/kubernetes/pkg/scheduler/framework/plugins/volumebinding/metrics"
because it is only used in the VolumeBinding plugin and (e2e) tests.

More details are in issue #89930 and PR #102953.

Signed-off-by: Konstantin Misyutin <konstantin.misyutin@huawei.com>
2021-08-13 19:08:45 +08:00
astraw99
e6df935fd3 unify worker num to workers 2021-08-09 15:46:04 +08:00
SataQiu
7fa0b9b6c1 add --concurrent-ephemeralvolume-syncs flag for kube-controller-manager 2021-07-25 21:36:57 +08:00
Cheng Xing
0e315355df Pass FsGroup to MountDevice 2021-07-03 16:29:42 -07:00
Chris Henzie
83e3ee780a Rename access mode contains helper method
So it is consistent with other methods performing the same check (one
for internal and external types)
2021-06-28 21:24:56 -07:00
Jordan Liggitt
ca279bbcc1 Fix race in attachdetach tests 2021-06-04 01:59:32 -04:00
yuzhiquan
0b8dc56408 fix volume failing test 2021-06-04 09:45:21 +08:00
Tim Ebert
cd3709232f
Fix VolumeAttachment garbage collection for migrated PVs 2021-05-28 08:35:05 +02:00
Kubernetes Prow Robot
894803ab2e
Merge pull request #98199 from yangjunmyfm192085/run-test3
fix mistake about [avaliable] for index_test.go
2021-05-25 02:46:22 -07:00
Kubernetes Prow Robot
838a967be5
Merge pull request #101175 from lojies/cleanupforpvcontroller
code cleanup:remove redundant return statement in pv_controller.go
2021-05-24 21:48:49 -07:00
Jiawei Wang
be583070d2 Use CSI driver to determine unique name for migrated in-tree plugins 2021-05-06 10:31:30 -07:00
Kubernetes Prow Robot
fe88bdc1ab
Merge pull request #101304 from wangyx1992/capatial-log-controller
cleanup: fix errors in wrapped format and log capitalization in controller
2021-04-22 15:55:52 -07:00
wangyx1992
fd51e654af cleanup: fix errors in wrapped format and log capitalization in controller
Signed-off-by: wangyx1992 <wang.yixiang@zte.com.cn>
2021-04-22 15:40:54 +08:00
andyzhangx
e10d3948f5 fix: azure file namespace issue in csi translation
fix build failure

fix comments
2021-04-20 07:23:09 +00:00
Kubernetes Prow Robot
df9ad4d7d2
Merge pull request #96094 from Hellcatlk/m
Some comments' typos
2021-04-16 11:54:22 -07:00
卢振兴10069964
8009823867 code cleanup:remove redundant return statement in pv_controller.go 2021-04-16 09:02:21 +08:00
Kubernetes Prow Robot
410d092d8a
Merge pull request #99643 from pohly/generic-ephemeral-volume-beta
generic ephemeral volume beta
2021-03-09 17:39:26 -08:00
Kubernetes Prow Robot
5155865ae2
Merge pull request #99326 from sunpa93/fs_resize_fix
fix: use pv annotation to trigger filesystem resize when necessary
2021-03-09 11:05:18 -08:00