Commit graph

1695 commits

Author SHA1 Message Date
Brad Davidson
6199b79f4b Add etcd snapshot metrics
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-02-18 11:09:42 -08:00
Roberto Bonafiglia
3f2373b55a Revert "Add ability to pass configuration options to flannel backend"
This reverts commit 8643576985.

Signed-off-by: Roberto Bonafiglia <roberto.bonafiglia@suse.com>
2025-02-14 18:33:46 +01:00
Brad Davidson
bc45972398 Update containerd config schema to version 3
Ref: https://github.com/containerd/containerd/blob/release/2.0/docs/cri/config.md

Since this is a breaking change, add support for a new v3 template file. If no v3 template is present, fall back to checking for the legacy v2 template and render the old structure.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-02-07 12:03:48 -08:00
Brad Davidson
124e46bccf Upgrade containerd to v2.0.2
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-02-07 12:03:48 -08:00
Brad Davidson
77cf99aa5f Bump traefik to 3.3.2
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-02-07 07:47:32 -08:00
Brad Davidson
96c2dd3865 Skip netpol startup on windows instead of panicing
Netpol startup is skipped with a warning on linux if ipset support is missing, we should do the same on windows

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-02-07 07:46:19 -08:00
Brad Davidson
99f4f5ad12 Add linux nodeSelector to local-storage and metrics-server
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-02-07 07:46:19 -08:00
Brad Davidson
85987ac23f Fix default pause image on windows
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-02-07 07:46:19 -08:00
Brad Davidson
50326c8bca Add missing windows runtime type definition
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-02-07 07:46:19 -08:00
Brad Davidson
8aa412ed66 Fix windows path quoting/escaping in containerd config template
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-02-07 07:46:19 -08:00
Brad Davidson
bf97b8facc Fix containerd hosts.toml path on windows
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-02-07 07:46:19 -08:00
Brad Davidson
838d68777f Fix permissions checks on windows
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-02-07 07:46:19 -08:00
Brad Davidson
b2418ba354 Replace hardcoded unix-style paths in test
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-02-07 07:46:19 -08:00
Brad Davidson
8f85ee3c60 Remove broken unused windows test
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-02-07 07:46:19 -08:00
Brad Davidson
4cacf6e1c0 Make etcd test linux-only
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-02-07 07:46:19 -08:00
Brad Davidson
0d15457c77 Fix linux-specific clientaccess test
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-02-07 07:46:19 -08:00
Brad Davidson
85b3775071 Consolidate linux and windows containerd config templates
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-02-05 11:34:05 -08:00
Brad Davidson
0d028a2283 Add support for AWS shared credentials file
Also adds a CLI flag and fields for session token, which must be passed
alongside the access key and secret when using temporary credentials.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-01-29 00:45:56 -08:00
manuelbuil
2b00ef5b46 Correct the k3s token command help
Signed-off-by: manuelbuil <mbuil@suse.com>
2025-01-29 07:46:58 +01:00
github-actions[bot]
28300ea154
Bump Local Path Provisioner version (#11657)
* chore: Bump Local Path Provisioner version

Made with ❤️️ by updatecli

* chore: Bump Local Path Provisioner version

Made with ❤️️ by updatecli

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-01-27 11:03:35 -08:00
Brad Davidson
95700aa6b3 Update p2p boostrap helpers for Spegel v0.0.30
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-01-27 11:02:27 -08:00
Brad Davidson
fd8348324d Disable s3 transport transparent compression/decompression
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-01-27 11:00:01 -08:00
Brad Davidson
976b23d432 Update tests
Also add an ordinal to subtests so its easier to figure out which one is failing

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-01-23 17:29:28 -08:00
Brad Davidson
29a5739b7e Remove local restriction for deferred node password validation
Restricting deferred node password validation to only requests from the local node is not possible without breaking split-role cluster cold start. There are too many cases where node password secrets may not yet be available due to the apiserver not being up.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-01-23 17:29:28 -08:00
Derek Nola
08c30f5ae6
chore: Bump klipper-lb and klipper-helm (#11595)
* Bump klipper-lb to v0.4.10

Bump klipper-helm to v0.9.4
Signed-off-by: Derek Nola <derek.nola@suse.com>

* Bump helm-controller

Signed-off-by: Derek Nola <derek.nola@suse.com>

---------

Signed-off-by: Derek Nola <derek.nola@suse.com>
2025-01-16 12:08:26 -08:00
Brad Davidson
d0ea741b13 Fix local password validation when bind-address is set
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-01-15 12:45:16 -08:00
Maja Bojarska
646e3135bc Align etcd-snapshot-dir default path description
The effective snapshot dir is "${data-dir}/server/db/snapshots". The
server segment is missing in the CLI-reported default path, potentially
misleading the user about the actual default snapshot destination.

Signed-off-by: Maja Bojarska <majabojarska98@gmail.com>
2025-01-13 11:32:56 -08:00
Brad Davidson
6b0247fa4d Improve flannel RBAC changes
Only wait for k3s-controller RBAC when AuthorizeNodeWithSelectors blocks kubelet from listing nodes

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-01-10 17:50:25 -08:00
muicoder
0144d9b749 Update Traefik to v2.11.18
#11501
Signed-off-by: muicoder <muicoder@gmail.com>
2025-01-09 11:45:59 -08:00
Vitor Savian
7e18c69254
Add auto import images for containerd image store
* Add auto import images

Signed-off-by: Vitor Savian <vitor.savian@suse.com>

* Fix EOF error log when importing tarball files

Signed-off-by: Vitor Savian <vitor.savian@suse.com>

* Delaying queue

Signed-off-by: Vitor Savian <vitor.savian@suse.com>

* Add parse for images

Signed-off-by: Vitor Savian <vitor.savian@suse.com>
2025-01-09 13:15:27 -03:00
Brad Davidson
f345697c0a Add tests for supervisor request handlers
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-01-09 00:51:19 -08:00
Brad Davidson
e6327652f0 Replace *core.Factory with CoreFactory interface
Make this field an interface instead of pointer to allow mocking. Not sure why wrangler has a type that returns an interface instead of just making it an interface itself. Wrangler in general is hard to mock for testing.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-01-09 00:51:19 -08:00
Brad Davidson
c20c06373a Move additional core/v1 mocks into tests package
Convert nodepassword tests to use shared mocks

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-01-09 00:51:19 -08:00
Brad Davidson
8f8cfb56b5 Move core/v1 mock into tests package for reuse
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-01-09 00:51:19 -08:00
Brad Davidson
f8271d8506 Add test for join existing cluster
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-01-09 00:51:19 -08:00
Brad Davidson
365372441b Handle cluster join as create if we're the only member
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-01-09 00:51:19 -08:00
Brad Davidson
caeebc52b7 Add client-side certificate generation support
Clients now generate keys client-side and send CSRs. If the server is down-level and sends a cert+key instead of just responding with a cert signed with the client's public key, we use the key from the server instead.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-01-09 00:51:19 -08:00
Brad Davidson
5b1d57f7b9 Remove unused Certificate field from Node struct
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-01-09 00:51:19 -08:00
Brad Davidson
2e4e7cf2c1 Move request handlers out of server package
The servers package, and router.go in particular, had become quite
large. Address this by moving some things out to separate packages:
* http request handlers all move to pkg/server/handlers.
* node password bootstrap auth handler goes into pkg/nodepassword with
  the other nodepassword code.

While we're at it, also be more consistent about calling variables that
hold a config.Control struct or reference `control` instead of `config` or `server`.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-01-09 00:51:19 -08:00
muicoder
056cee8290
Update Traefik to v2.11.17 (#11502)
#11501
Signed-off-by: muicoder <muicoder@gmail.com>
2025-01-07 14:52:20 -08:00
Derek Nola
c3460fce73
Add "k3s certificate check" clause for better test coverage (#11485)
* Add "k3s certificate check" clause for better test coverage

Signed-off-by: Derek Nola <derek.nola@suse.com>

* Add table support to cert check

Signed-off-by: Derek Nola <derek.nola@suse.com>

---------

Signed-off-by: Derek Nola <derek.nola@suse.com>
2025-01-07 10:19:23 -08:00
Hussein Galal
e6f6fce676
Load kernel modules for nft in agent setup (#11524)
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2025-01-07 03:01:02 +02:00
Hyouka
e64e2fcfd4
add IPv6 to cluster-dns Usage Docs (#11498)
Signed-off-by: rivolity <hamdaouiomar1@gmail.com>
2025-01-03 09:30:31 -08:00
Brad Davidson
6381ae93e7 Switch to using kubelet config files instead of CLI args
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-12-20 14:41:40 -08:00
Hussein Galal
763188d642
V1.32.0+k3s1 (#11478)
* Update libraries and codegen for k8s 1.32

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>

* Fixes for 1.32

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>

* Disable tests with down-rev agents

These are broken by AuthorizeNodeWithSelectors being on by default. All
agents must be upgraded to v1.32 or newer to work properly, until we
backport RBAC changes to older branches.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>

---------

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Co-authored-by: Brad Davidson <brad.davidson@rancher.com>
2024-12-20 23:17:14 +02:00
Reinhard Nägele
124be7472b
Update coredns to 1.12.0 (#11387)
* Update to coredns 1.12.0

Signed-off-by: Reinhard Nägele <unguiculus@gmail.com>
2024-12-10 10:09:27 -08:00
Brad Davidson
e143e0fa12 Add hidden flag/var for supervisor/apiserver listen config
Add flags supervisor and apiserver ports and bind address so that we can add an e2e to cover supervisor and apiserver on separate ports, as used by rke2

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-12-10 09:31:18 -08:00
Brad Davidson
5a5b136151 Fix agent tunnel address on rke2
Fix issue where rke2 tunnel was trying to connect to apiserver port instead of supervisor

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-12-10 09:31:18 -08:00
Derek Nola
69c310d68b
Remove experimental from embedded-registry flag (#11443)
Signed-off-by: Derek Nola <derek.nola@suse.com>
2024-12-10 08:37:13 -08:00
Brad Davidson
c7ff957cae Fall back to polling the supervisor for apiserver addresses when the watch fails
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-12-06 11:45:34 -08:00
Brad Davidson
168b344d1d Return apiserver addresses from both etcd and endpoints
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-12-06 11:45:34 -08:00
Brad Davidson
71918e0d69 Use helper to set consistent rest.Config rate limits and timeouts
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-12-06 11:45:34 -08:00
Brad Davidson
3d2fabb013 Add loadbalancer metrics
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-12-06 11:45:34 -08:00
Brad Davidson
911ee19a93 Refactor load balancer server list and health checking
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-12-06 11:45:34 -08:00
Brad Davidson
95797c4a79 Refactor filterCN to use a Set instead of map[string]bool
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-12-06 11:45:34 -08:00
Brad Davidson
67fd5fa9e5 Separate persistent config struct from LoadBalancer and make fields private
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-12-06 11:45:34 -08:00
Brad Davidson
13e9113787 Move http/socks proxy stuff to separate file
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-12-06 11:45:34 -08:00
Brad Davidson
f2f57b4a4b Remove unused code from etcdproxy
None of these fields or functions are used in k3s or rke2

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-12-06 11:45:34 -08:00
Derek Nola
183f0c8d09
Fix secrets-encrypt reencrypt timeout error (#11385)
* Add missing default OS for split server test

Signed-off-by: Derek Nola <derek.nola@suse.com>

* Launch go routine and return for k3s secrets-encrypt reencrypt

Signed-off-by: Derek Nola <derek.nola@suse.com>

---------

Signed-off-by: Derek Nola <derek.nola@suse.com>
2024-12-05 09:11:22 -08:00
manuelbuil
4ec261733e If no etcd was deployed, fail etcd-snapshot with a useful error
Signed-off-by: manuelbuil <mbuil@suse.com>
2024-11-28 09:40:42 +01:00
Brad Davidson
cd4ddedbc9 Fix issue with loadbalancer failover to default server
The loadbalancer should only fail over to the default server if all other server have failed, and it should force fail-back to a preferred server as soon as one passes health checks.

The loadbalancer tests have been improved to ensure that this occurs.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-11-13 19:41:45 -08:00
Brad Davidson
0c29696eef Fix handling of wrapped subcommands when run with a path
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-11-12 16:35:54 -08:00
Ludo Stellingwerff
2441e46950 Fix the "Standalone"-mode of oidc-login in the wrapped kubectl application.
This fixes: 'error: no Auth Provider found for name "oidc"' when trying to run any subcommands in kubectl that require a valid server login.

Signed-off-by: Ludo Stellingwerff <ludo.stellingwerff@gmail.com>
2024-11-08 08:52:53 -08:00
Brad Davidson
ff5c633fe7 Fix MustFindString returning override flags on external CLI commands
External CLI actions cannot short-circuit on --help or --version, so we
cannot skip loading the config file if these flags are present when
running these wrapped commands. The behavior of just returning the
override flag name instead of the requested flag value was breaking
data-dir lookup when running wrapped commands.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-11-06 09:51:55 -08:00
Brad Davidson
56fb3b0991 Add nonroot-devices flag to agent CLI
Add new flag that is passed through to the device_ownership_from_security_context parameter in the containerd CRI config. This is not possible to change without providing a complete custom containerd.toml template so we should add a flag for it.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-11-05 11:36:55 -08:00
Brad Davidson
bc60ff79f6 Set kine EmulatedETCDVersion from embedded etcd version
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-11-05 09:07:11 -08:00
Brad Davidson
a39e191906 Add tests for ETCD.Test()
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-10-30 12:18:48 -07:00
Brad Davidson
095e34d816 Fix issues with defragment and alarm clear on etcd startup
* Use clientv3.NewCtxClient instead of New to avoid automatic retry of all RPCs
* Only timeout status requests; allow defrag and alarm clear requests to run to completion.
* Only clear alarms on the local cluster member, not ALL cluster members

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-10-30 12:18:48 -07:00
Derek Nola
4888376682
Fix Github Actions for Ubuntu-24.04 (#11112)
* Fix vagrant/libvirt composite action for ubuntu-24.04

* Don't ignore changes to internal actions

* Fix unit tests for ubuntu 24.04, new lsof version

* Pin os version for unit and E2E workflows

Signed-off-by: Derek Nola <derek.nola@suse.com>
2024-10-16 12:22:07 -07:00
manuelbuil
536fa44eb0 Revert "Make svclb as simple as possible"
This reverts commit 1befd65a0a.

Signed-off-by: manuelbuil <mbuil@suse.com>
2024-10-15 20:30:03 +02:00
manuelbuil
054cec849f Add the nvidia runtime cdi
Signed-off-by: manuelbuil <mbuil@suse.com>
2024-10-11 21:38:21 +02:00
manuelbuil
660c6052c2 Make svclb as simple as possible
Signed-off-by: manuelbuil <mbuil@suse.com>
2024-10-11 10:52:47 +02:00
Brad Davidson
b0ad6d846d Bump local-path-provisioner to v0.0.30
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-10-09 19:49:35 -07:00
github-actions[bot]
c00af8e95e chore: Bump Local Path Provisioner version
Made with ❤️️ by updatecli

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-10-09 14:06:21 -07:00
Brad Davidson
1ae9ca73f5 Update tcpproxy for import path change
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-10-09 11:46:08 -07:00
Brad Davidson
c6392c9ffc Fix issue that caused passwd file and psk to be regenerated when rotating CA certs
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-10-08 17:03:31 -07:00
Brad Davidson
0826ebc142 Fix race condition when multiple nodes reconcile S3 snapshots
Don't delete s3 etcdsnapshotfiles if they are missing from s3 but less than a minute old, its possible the other node just finished uploading it and the object key has not yet become visible.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-10-07 11:11:58 -07:00
Ludo Stellingwerff
38d13e03d9
Allow additional Rootless CopyUpDirs through K3S_ROOTLESS_COPYUPDIRS env variable (#10386)
Signed-off-by: Ludo Stellingwerff <ludo.stellingwerff@gmail.com>
2024-10-07 09:38:11 -07:00
Brad Davidson
0942e6a0c5 Fix sqlite endpoint when migrating from sqlite to etcd
Support for 'sqlite' as the endpoint was removed in
https://github.com/k3s-io/kine/pull/320 and the constant removed in
https://github.com/k3s-io/kine/pull/325

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-10-03 10:54:03 -07:00
Brad Davidson
c9e7b05971 Bump kine
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-10-03 10:54:03 -07:00
Vitor Savian
1ff43bf07f Add user path to runtimes search
Signed-off-by: Vitor Savian <vitor.savian@suse.com>
2024-10-02 09:52:11 -03:00
Brad Davidson
6c6d87d1b0 Bump traefik to chart 27.0.2 / appVersion v2.11.10
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-09-30 12:49:18 -07:00
Brad Davidson
d6c20b7452 Fix hosts.toml header var
Resolves issue from 270f85e468 that prevented old hosts.toml files from being cleaned up.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-09-10 14:59:41 -07:00
Arne Winter
c4c11e51f1
add node-internal-dns/node-external-dns address pass-through support (#10852)
* add --node-internal-dns and --node-external-dns

Signed-off-by: Arne Winter <github@arnewinter.dev>
Co-authored-by: Brad Davidson <brad@oatmail.org>
2024-09-06 14:15:19 -07:00
Brad Davidson
270f85e468 Only clean up containerd hosts dirs managed by k3s
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-09-05 17:21:55 -07:00
Harsimran Singh Maan
0b4d2497e5 Update coredns to 1.11.3 and metrics-server to 0.7.2
Used https://github.com/coredns/corefile-migration to
migrate the corefile. There are no changes for the
default file from 1.10.1 to 1.11.3.

Notable plugin changes include the k8s_external with fallthrough option
and rewrite with cname_target option.

These changes are not part of the default config that ships
with k3s. Customers using these two plugins can start using the new options

Metrics does not have any new features other than build tooling updates.

Requires https://github.com/rancher/image-mirror/pull/704

Signed-off-by: Harsimran Singh Maan <maan.harry@gmail.com>
2024-08-29 15:00:45 -07:00
Brad Davidson
bd45aa5c45 Bump traefik to v2.11.8
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-08-29 14:02:58 -07:00
Derek Nola
85e02e10d7
Remove secrets encryption controller (#10612)
* Remove secrets encryption controller

Signed-off-by: Derek Nola <derek.nola@suse.com>
2024-08-26 08:31:49 -07:00
Brad Davidson
fe3324cb84 Fix rotateca validation failures when not touching default self-signed CAs
Also silences warnings about bootstrap fields that are not intended to be handled by CA rotation

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-08-22 14:47:40 -07:00
Derek Nola
d358a89171 Fix secrets-encrypt metrics
Signed-off-by: Derek Nola <derek.nola@suse.com>
2024-08-22 14:23:34 -07:00
galal-hussein
5087240e32 Downgrade Microsoft/hcsshim to v0.8.26
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2024-08-22 14:23:34 -07:00
galal-hussein
8cbcbcd044 go generate
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2024-08-22 14:23:34 -07:00
galal-hussein
20b50426ab Update to v1.31.0
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2024-08-22 14:23:34 -07:00
Alireza Eskandari
22fb7049bd Add tolerations support for DaemonSet pods
Signed-off-by: Alireza Eskandari <alireza.eskandari@wsd.com>
2024-08-12 13:01:27 -07:00
Brad Davidson
bffdf463e1 Fix cloudprovider controller name
Looking at metrics revealed the cloudprovider controller name was anempty string.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-29 21:54:20 -07:00
Brad Davidson
e168438d44 Wire lasso metrics up to common gatherer
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-29 21:54:20 -07:00
Will Andrews
e2179aa957 Update pkg/cluster/managed.go
Co-authored-by: Derek Nola <derek.nola@suse.com>
Signed-off-by: Will Andrews <will7989@hotmail.com>
2024-07-29 16:23:17 -07:00
Will Andrews
3ec086f6f7 Update pkg/secretsencrypt/config.go
Co-authored-by: Brad Davidson <brad@oatmail.org>
Signed-off-by: Will Andrews <will7989@hotmail.com>
2024-07-29 16:23:17 -07:00
Will
e4f3cc7b54 remove deprecated use of wait functions
Signed-off-by: Will <will7989@hotmail.com>
2024-07-29 16:23:17 -07:00
Brad Davidson
e514940020 Fix inconsistent loading of config dropins when config file does not exist
FindString would silently skip parsing dropins if the main config file
didn't exist. If a custom config file path was passed it would raise an
error, but if we were parsing the default config file and it didn't
exist it would just silently fail to load the dropins.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-29 15:23:52 -07:00
Brad Davidson
9111b1f77e Add K3S_DATA_DIR as env var for --data-dir flag
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-29 15:23:52 -07:00
Derek Nola
59e0761043
Use higher QPS for secrets reencryption (#10571)
* Use higher QPS for secrets reencryption

Signed-off-by: Derek Nola <derek.nola@suse.com>
2024-07-26 12:07:26 -07:00
Derek Nola
a70157c12e
Allow Pprof and Superisor metrics in standalone mode (#10576)
* Allow pprof to run on server with `--disable-agent`
* Allow supervisor metrics to run on server with `--disable-agent`

Signed-off-by: Derek Nola <derek.nola@suse.com>
2024-07-26 11:23:57 -07:00
Brad Davidson
d4c3422a85 Fix ipv6 sysctl required by non-ipv6 LoadBalancer service
This is a partial revert of 095ecdb034,
with the workaround moved into klipper-lb.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-24 13:40:33 -07:00
Brad Davidson
21611c5665 Cap length of generated name used for servicelb daemonset
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-24 13:40:33 -07:00
Brad Davidson
891e72f90f Update secretsencrypt pagination
Make secretsencrypt page size and iteration consistent with other paginators

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-24 12:44:29 -07:00
Brad Davidson
c2216a62ad Use pagination when retrieving etcd snapshot list
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-24 12:44:29 -07:00
Brad Davidson
37830fe170 Don't use server and token values from config file for etcd-snapshot commands
Fixes an issue where running etcd-snapshot commands on a node that has a server address set in the config will manage snapshots on that server, instead of on the local node as intended.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-15 10:12:50 -07:00
Brad Davidson
cb6bf74bc4 Add dial duration to debug error message
This should give us more detail on how long dials take before failing, so that we can perhaps better tune the retry loop in the future.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-15 09:46:52 -07:00
Brad Davidson
118acabec2 Fix IPv6 primary node-ip handling
I should have caught `[]string{cfg.NodeIP}[0]` and `[]string{envInfo.NodeIP.String()}[0]` in code review...

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-15 09:46:52 -07:00
Brad Davidson
9841517457 Fix agents removing configured supervisor address
We shouldn't be replacing the configured server address on agents. Doing
so breaks the agent's ability to fall back to the fixed registration
endpoint when all servers are down, since we replaced it with the first
discovered apiserver address. The fixed registration endpoint will be
restored as default when the service is restarted, but this is not the
correct behavior. This should have only been done on etcd-only nodes
that start up using their local supervisor, but need to switch to a
control-plane node as soon as one is available.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-15 09:46:52 -07:00
Brad Davidson
9d0c2e0000 Fix reentrant rlock in loadbalancer.dialContext
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-15 09:46:52 -07:00
Brad Davidson
c36db53e54 Add etcd s3 config secret implementation
* Move snapshot structs and functions into pkg/etcd/snapshot
* Move s3 client code and functions into pkg/etcd/s3
* Refactor pkg/etcd to track snapshot and s3 moves
* Add support for reading s3 client config from secret
* Add minio client cache, since S3 client configuration can now be
  changed at runtime by modifying the secret, and don't want to have to
  create a new minio client every time we read config.
* Add tests for pkg/etcd/s3

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-10 13:13:55 -07:00
Brad Davidson
eb8bd15889 Ensure remotedialer kubelet connections use kubelet bind address
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-07-10 13:00:25 -07:00
github-actions[bot]
a0b374508e
Bump Local Path Provisioner version (#10394)
* chore: Bump Local Path Provisioner version

Made with ❤️️ by updatecli

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-07-10 12:53:46 -07:00
Roberto Bonafiglia
faeaf1b01b Update flannel to v0.25.4 and fixed issue with IPv6 mask
Signed-off-by: Roberto Bonafiglia <roberto.bonafiglia@suse.com>
2024-07-01 18:57:34 +02:00
Brad Davidson
aa4794b372 Replace 1-weight semaphore on snapshots with simple mutex
Fixes an issue where the semaphore wasn't permanently initialized
until a scheduled snapshot was taken, allowing multiple on-demand
snapshots to be taken until the first scheduled snapshot was triggered.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-06-19 09:47:58 -07:00
Brad Davidson
b4d4ed8f01 Fix agent supervisor port using apiserver port instead
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-06-13 15:13:21 -07:00
Harrison Affel
f10cb29534 fix typo, use rancher/permissions
Signed-off-by: Harrison Affel <harrisonaffel@gmail.com>
2024-06-07 08:00:44 -07:00
Brad Davidson
c0450a2cb4 Fix race condition panic in loadbalancer.nextServer
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-06-07 07:39:48 -07:00
Vitor Savian
d9b8ba8d71
Add snapshot retention etcd-s3-folder fix
* Add snapshot retention folder fix

Signed-off-by: Vitor Savian <vitor.savian@suse.com>

* Add snapshot retention E2E test

Signed-off-by: Vitor Savian <vitor.savian@suse.com>

---------

Signed-off-by: Vitor Savian <vitor.savian@suse.com>
2024-06-06 17:31:01 -03:00
fmoral2
043b1eac5d
Add test for isValidResolvConf (#10302)
Signed-off-by: Francisco <francisco.moral@suse.com>
2024-06-06 17:02:31 -03:00
Brad Davidson
1661f1024a Fix bug that caused agents to bypass local loadbalancer
If proxy.SetAPIServerPort was called multiple times, all calls after the
first one would cause the apiserver address to be set to the default
server address, bypassing the local load-balancer. This was most likely
to occur on RKE2, where the supervisor may be up for a period of time
before it is ready to manage node password secrets, causing the agent
to retry.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-06-04 11:18:45 -07:00
Koen de Laat
79ba10f5ec fix: Use actual warningPeriod in certmonitor
Signed-off-by: Koen de Laat <koen.de.laat@philips.com>
2024-06-03 11:20:15 -07:00
github-actions[bot]
1268779ea0
Bump Local Path Provisioner version (#10268)
* chore: Bump Local Path Provisioner version

Made with ❤️️ by updatecli
2024-06-03 11:19:23 -07:00
Brad Davidson
f9130d537d Fix embedded mirror blocked by SAR RBAC and re-enable test
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-31 08:33:18 -07:00
Katherine Door
7a0ea3c953
Add write-kubeconfig-group flag to server (#9233)
* Add write-kubeconfig-group flag to server
* update kubectl unable to read config message for kubeconfig mode/group

Signed-off-by: Katherine Pata <me@kitty.sh>
2024-05-30 23:45:34 -07:00
Brad Davidson
307f07bd61 Fix issue caused by sole server marked as failed under load
If health checks are failing for all servers, make a second pass through the server list with health-checks ignored before returning failure

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-30 11:47:23 -07:00
Brad Davidson
ed23a2bb48 Fix netpol crash when node remains tained unintialized
It is concievable that users might take more than 60 seconds to deploy their own cloud-provider. Instead of exiting, we should wait forever, but with more logging to indicate what's being waited on.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-28 23:34:44 -07:00
Brad Davidson
f8e0648304 Convert remaining http handlers over to use util.SendError
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-28 16:24:57 -07:00
Brad Davidson
ff679fb3ab Refactor supervisor listener startup and add metrics
* Refactor agent supervisor listener startup and authn/authz to use upstream
  auth delegators to perform for SubjectAccessReview for access to
  metrics.
* Convert spegel and pprof handlers over to new structure.
* Promote bind-address to agent flag to allow setting supervisor bind
  address for both agent and server.
* Promote enable-pprof to agent flag to allow profiling agents. Access
  to the pprof endpoint now requires client cert auth, similar to the
  spegel registry api endpoint.
* Add prometheus metrics handler.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-28 16:24:57 -07:00
Brad Davidson
3d14092f76 Fix issue with k3s-etcd informers not starting
Start shared informer caches when k3s-etcd controller wins leader election. Previously, these were only started when the main k3s apiserver controller won an election. If the leaders ended up going to different nodes, some informers wouldn't be started

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-28 15:48:15 -07:00
Thomas Ferrandiz
6dcd52eb8e Use TrafficManager interface when calling flannel
Signed-off-by: Thomas Ferrandiz <thomas.ferrandiz@suse.com>
2024-05-27 13:05:18 +00:00
Thomas Ferrandiz
af7bcc3900 Bump flannel version to v0.25.2
Signed-off-by: Thomas Ferrandiz <thomas.ferrandiz@suse.com>
2024-05-27 13:05:18 +00:00
huangzy
6fcaad553d allow helm controller set owner reference
Signed-off-by: huangzy <huangzynn@outlook.com>
2024-05-24 12:44:10 -07:00
Robert Rose
6886c0977f Follow directory symlinks in auto deploying manifests (#9288)
Signed-off-by: Robert Rose <robert.rose@mailbox.org>
2024-05-24 12:42:25 -07:00
linxin
f24ba9d3a9 Validate resolv.conf for presence of nameserver entries
Co-authored-by: Brad Davidson <brad@oatmail.org>
Signed-off-by: linxin <linxin@geedgenetworks.com>
2024-05-24 12:39:34 -07:00
Brad Davidson
5a0162d8ee Drop check for legacy traefik v1 chart
We have been bundling traefik v2 for three years, its time to drop the legacy chart check

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-23 14:13:13 -07:00
Brad Davidson
37f97b33c9 Add support for svclb pod PriorityClassName
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-23 14:11:15 -07:00
Brad Davidson
b453630478 Update local-path-provisioner helper script
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-23 14:00:00 -07:00
Brad Davidson
095ecdb034 Fix issue with local traffic policy for single-stack services on dual-stack nodes.
Just enable IP forwarding for all address families regardless of service address families.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-23 13:54:30 -07:00
Brad Davidson
5cf4d75749 Bump spegel version
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-23 13:48:38 -07:00
Brad Davidson
30999f9a07 Switch stargz over to cri registry config_path
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-23 13:35:15 -07:00
Brad Davidson
7374010c0c Use fixed stream server bind address for cri-dockerd
Will now use 127.0.0.1:10010, same as containerd's CRI

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-23 13:33:27 -07:00
Brad Davidson
5f6b813cc8 Add WithSkipMissing to not fail import on missing blobs
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-23 13:32:22 -07:00
Manuel Buil
811de8b819 Fix bug when using tailscale config by file
Signed-off-by: Manuel Buil <mbuil@suse.com>
2024-05-23 11:55:20 +02:00
Harrison Affel
1d22b6971f windows changes
Signed-off-by: Harrison Affel <harrisonaffel@gmail.com>
2024-05-16 14:40:27 -07:00
Derek Nola
6531fb79b0
Deprecate pod-infra-container-image kubelet flag (#7409)
Signed-off-by: Derek Nola <derek.nola@suse.com>
2024-05-06 10:39:10 -07:00
Hussein Galal
144f5ad333
Kubernetes V1.30.0-k3s1 (#10063)
* kubernetes 1.30.0-k3s1

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Update go version to v1.22.2

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* update dynamiclistener and helm-controller

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* update go in go.mod to 1.22.2

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* update go in Dockerfiles

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* update cri-dockerd

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Add proctitle package with linux and windows constraints

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* go mod tidy

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Fixing setproctitle function

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* update dynamiclistener to v0.6.0-rc1

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

---------

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2024-05-06 19:42:27 +03:00
Brad Davidson
94e29e2ef5 Make /db/info available anonymously from localhost
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-04-22 19:34:43 -07:00
Brad Davidson
d3b60543e7 Fix 10 second etcd-snapshot request timeout
The default clientaccess request timeout is too short. Wait longer by default, and add the s3 timeout if s3 is enabled.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-04-19 23:26:51 -07:00
Brad Davidson
5b431ca531 Fix on-demand snapshots not honoring folder
Also fix etcd s3 tests to actually check that the files are saved to s3 🙃

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-04-19 23:26:51 -07:00
Thomas Anderson
c59820a52a Allow LPP to read helper logs (#9834)
Signed-off-by: Thomas Anderson <127358482+zc-devs@users.noreply.github.com>
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-04-11 12:31:54 -07:00
Brad Davidson
3f906bee79 Update packaged manifests
* Update traefik chart to bump image tag and fix quoting
* Fix image quoting in flat manifests
* Update local-path-provisioner config to stop using deprecated hostpath volume type

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-04-11 09:22:51 -07:00
Brad Davidson
4cc73b1fee Actually fix agent certificate rotation
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-04-10 09:21:01 -07:00
Brad Davidson
08f1022663 Don't log 'apiserver disabled' error sent by etcd-only nodes
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-04-09 15:36:33 -07:00
Brad Davidson
7d9abc9f07 Improve etcd load-balancer startup behavior
Prefer the address of the etcd member being joined, and seed the full address list immediately on startup.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-04-09 15:36:33 -07:00
Brad Davidson
fe465cc832 Move etcd snapshot management CLI to request/response
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-04-09 15:21:26 -07:00
Brad Davidson
60248c42de Add supervisor cert/key to rotate list
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-04-05 10:59:17 -07:00
Derek Nola
9846a72e92
Bump spegel to v0.0.20-k3s1 (#9863)
* Bump spegel to v0.0.20-k3s1

* Remove deprecated libp2p Pretty function

* Remove quic-go pin
   Pinned version is now out of date,  indirect dependencies are now newer, with CVE issue fixed
Signed-off-by: Derek Nola <derek.nola@suse.com>
2024-04-05 08:43:19 -07:00
Brad Davidson
f2961fb5d2 Add workaround for containerd hosts.toml bug
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-04-03 20:47:54 -07:00
Brad Davidson
7f659759dd Add certificate expiry check and warnings
* Add ADR
* Add `k3s certificate check` command.
* Add periodic check and events when certs are about to expire.
* Add metrics for certificate validity remaining, labeled by cert subject

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-03-28 12:05:21 -07:00
Derek Nola
6a42c6fcfe
Remove old pinned dependencies (#9806)
Signed-off-by: Derek Nola <derek.nola@suse.com>
2024-03-28 10:09:48 -07:00
Derek Nola
14f54d0b26
Transition from deprecated pointer library to ptr (#9801)
Signed-off-by: Derek Nola <derek.nola@suse.com>
2024-03-28 10:07:02 -07:00
Vitor Savian
5d69d6e782 Add tls for kine
Signed-off-by: Vitor Savian <vitor.savian@suse.com>

Bump kine

Signed-off-by: Vitor Savian <vitor.savian@suse.com>

Add integration tests for kine with tls

Signed-off-by: Vitor Savian <vitor.savian@suse.com>
2024-03-28 11:12:07 -03:00
Brad Davidson
c51d7bfbd1 Add health-check support to loadbalancer
* Adds support for health-checking loadbalancer servers. If a
  health-check fails when dialing, all existing connections to the
  server will be closed.
* Wires up a remotedialer tunnel connectivity check as the health check
  for supervisor/apiserver connections.
* Wires up a simple ping request to the supervisor port as the health
  check for etcd connections.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-03-27 16:50:27 -07:00
Brad Davidson
edb0440017 Fix etcd snapshot reconcile for agentless nodes
Disable cleanup of orphaned snapshots and patching of node annotations if running agentless

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-03-27 16:44:36 -07:00
Vitor Savian
3f649e3bcb Add a new error when kine is with disable apiserver or disable etcd
Signed-off-by: Vitor Savian <vitor.savian@suse.com>
2024-03-27 10:59:34 -03:00
Brad Davidson
f099bfa508 Fix error when image has already been pulled
CRI and containerd APIs disagree about the registry names - CRI supports
index.docker.io as an alias for docker.io, while containerd does not.
Use the actual stored RepoTag to determine what image to ask containerd for.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-03-26 16:19:40 -07:00
Brad Davidson
65cd606832 Respect cloud-provider fields set by kubelet
Don't clobber the providerID field and instance-type/region/zone labels if provided by the kubelet. This allows the user to set these to the correct values when using the embedded CCM in a real cloud environment.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-03-26 16:18:34 -07:00
Brad Davidson
d7cdbb7d4d Send error response if member list cannot be retrieved
Prevents joining nodes from being stuck with bad initial member list if there is a transient failure, or if they try to join themselves

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-03-26 15:17:15 -07:00
Brad Davidson
7a2a2d075c Move error response generation code into util
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-03-26 15:17:15 -07:00
Brad Davidson
bba3e3c66b Fix wildcard entry upstream fallback
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-03-12 23:31:16 -07:00
Brad Davidson
fe2ca9ecf1 Warn and suppress duplicate registry mirror endpoints
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-03-07 16:30:06 -08:00
Brad Davidson
2a091a693a Bump metrics-server to v0.7.0
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-03-07 12:45:29 -08:00
Roberto Bonafiglia
88c431aea5 Adjust first node-ip based on configured clusterCIDR
Signed-off-by: Roberto Bonafiglia <roberto.bonafiglia@suse.com>
2024-03-06 11:10:41 +01:00
Vitor Savian
59c724f7a6 Fix wildcard with embbeded registry test
Signed-off-by: Vitor Savian <vitor.savian@suse.com>
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-03-05 14:38:36 -08:00
Flavio Castelli
64e4f0e6e7 fix: use correct wasm shims names
Fix the wasm shim detection and the containerd configuration generation.

Prior to this commit, the binary and the `RuntimeType` values were not
correct.

Signed-off-by: Flavio Castelli <fcastelli@suse.com>
2024-03-05 13:12:08 -08:00
Brad Davidson
091a5c8965 Don't register embedded registry address as an upstream registry
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-03-04 15:11:26 -08:00
Brad Davidson
b5a4846e9d Remove filtering of wildcard mirror entry
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-03-04 15:11:26 -08:00
Brad Davidson
84a071a81e Add env var to allow spegel mirroring of latest tag
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-03-04 15:11:26 -08:00
Philip Laine
26feb25c40 Bump spegel to v0.0.18-k3s4
Signed-off-by: Philip Laine <philip.laine@gmail.com>
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-03-04 15:11:26 -08:00
Brad Davidson
0b3593205a Move snapshot-retention to EtcdSnapshotFlags in order to support loading from config
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-03-04 12:09:29 -08:00
Brad Davidson
3576ed4327 Clean up snapshotDir create/exists logic
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-03-04 12:09:29 -08:00
Brad Davidson
b164d7a270 Fix additional corner cases in registries handling
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-03-04 11:59:33 -08:00
Brad Davidson
82432a2df7 Fix issue with etcd node name missing hostname
* Set ServerNodeName in snapshot CLI setup
* Raise errer if ServerNodeName ends up empty some other way
* Fix status controller to use etcd node name annotation instead of prefix checking

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-03-01 13:52:53 -08:00
Brad Davidson
513c3416e7 Tweak netpol node wait logs
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-03-01 12:01:34 -08:00
Brad Davidson
be569f65a9 Fix NodeHosts on dual-stack clusters
* Add both dual-stack addresses to the node hosts file
* Add hostname to hosts file as alias for node name to ensure consistent resolution

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-03-01 11:59:59 -08:00
Edgar Lee
8c83b5e0f3 Rootless mode also bind service nodePort to host for LoadBalancer type
Signed-off-by: Edgar Lee <edgarhinshunlee@gmail.com>
2024-03-01 10:43:19 -08:00
Manuel Buil
3b4f13f28d Update klipper-lb image version
Signed-off-by: Manuel Buil <mbuil@suse.com>
2024-03-01 11:28:12 +01:00
Brad Davidson
86f102134e Fix netpol startup when flannel is disabled
Don't break out of the poll loop if we can't get the node, RBAC might not be ready yet.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-02-26 14:58:48 -08:00
Derek Nola
fae41a8b2a Rename AgentReady to ContainerRuntimeReady for better clarity
Signed-off-by: Derek Nola <derek.nola@suse.com>
2024-02-21 12:21:19 -08:00
Derek Nola
91cc2feed2 Restore original order of agent startup functions
Signed-off-by: Derek Nola <derek.nola@suse.com>
2024-02-21 12:21:19 -08:00
Brad Davidson
de825845b2 Bump kine and set NotifyInterval to what the apiserver expects
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-02-09 14:22:38 -08:00
Edgar Lee
0ac4c6a056 Expose rootless containerd socket directories for external access
Signed-off-by: Edgar Lee <edgarhinshunlee@gmail.com>
2024-02-09 14:22:03 -08:00
Edgar Lee
14c6c63b30 Expose rootless state dir under ~/.rancher/k3s/rootless
Signed-off-by: Edgar Lee <edgarhinshunlee@gmail.com>
2024-02-09 14:21:52 -08:00
Oleg Matskiv
e3b237fc35 Don't verify the node password if the local host is not running an agent
Signed-off-by: Oleg Matskiv <oleg.matskiv@gmail.com>
2024-02-09 14:21:43 -08:00
Derek Nola
fa11850563
Readd k3s secrets-encrypt rotate-keys with correct support for KMSv2 GA (#9340)
* Reorder copy order for caching
* Enable longer http timeout requests

Signed-off-by: Derek Nola <derek.nola@suse.com>

* Setup reencrypt controller to run on all apiserver nodes
* Fix reencryption for disabling secrets encryption, reenable drone tests
2024-02-09 11:37:37 -08:00
Oliver Larsson
cfc3a124ee
[Testing]: Test_UnitApplyContainerdQoSClassConfigFileIfPresent (Created) (#8945)
Problem:
Function not tested.

Solution:
Unit test added.

Signed-off-by: Oliver Larsson <larsson.e.oliver@gmail.com>
2024-02-09 11:28:06 -08:00
Harrison Affel
a36cc736bc allow executors to define containerd and docker behavior
Signed-off-by: Harrison Affel <harrisonaffel@gmail.com>
2024-02-09 15:51:35 -03:00
Brad Davidson
753c00f30c Consistently handle component exit on shutdown
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-02-07 10:23:54 -08:00
Vitor Savian
e9cec46a23 Runtimes refactor using exec.LookPath
Signed-off-by: Vitor Savian <vitor.savian@suse.com>
2024-02-07 15:06:16 -03:00
Vitor Savian
f9ee66f4d8 Changed how lastHeartBeatTime works in the etcd condition
Signed-off-by: Vitor Savian <vitor.savian@suse.com>
2024-02-07 15:05:33 -03:00
Brad Davidson
8224a3a7f6 Fix ipv6 endpoint address selection for on-demand snapshots
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-02-06 18:02:36 -08:00
Brad Davidson
888f866dae Fix issue with coredns node hosts controller
The nodes controller was reading from the configmaps cache, but doesn't add any handlers, so if no other controller added configmap handlers, the cache would remain empty.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-02-06 18:02:06 -08:00
Brad Davidson
6ec1926f88 Add check for etcd-snapshot-dir and fix panic in Walk
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-02-06 17:47:33 -08:00
Brad Davidson
82e3c32c9f Retry startup snapshot reconcile
The reconcile may run before the kubelet has created the node object; retry until it succeeds

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-02-06 17:46:24 -08:00
Brad Davidson
4005600d4e Fix excessive retry on snapshot reconcile
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-02-06 17:46:24 -08:00
github-actions[bot]
f249fcc2f1
Bump Local Path Provisioner version (#8953)
* chore: Bump Local Path Provisioner version
---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-02-06 16:57:07 -06:00
Brad Davidson
c635818956 Bump runc and helm-controller versions
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-02-01 18:51:51 -08:00
Brad Davidson
97a22632b9 gofmt config_test.go
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-02-01 18:51:51 -08:00
Brad Davidson
29848dea3d Fix issues with certs.d template generation
* Fix issue with bare host or IP as endpoint
* Fix issue with localhost registries not defaulting to http.
* Move the registry template prep to a separate function,
  and adds tests of that function so that we can ensure we're
  generating the correct content.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-02-01 12:09:13 -08:00
Vitor Savian
9a70021a9e Error getting node in setEtcdStatusCondition
Signed-off-by: Vitor Savian <vitor.savian@suse.com>

Added retry and changed nodes for

Signed-off-by: Vitor Savian <vitor.savian@suse.com>
2024-01-11 22:06:36 -03:00
Brad Davidson
c87e6e5f7e Move proxy dialer out of init() and fix crash
* Fixes issue where proxy support only honored server address via K3S_URL, not CLI or config.
* Fixes crash when agent proxy is enabled, but proxy env vars do not return a proxy URL for the server address (server URL is in NO_PROXY list).
* Adds tests

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-01-11 16:12:15 -08:00
Brad Davidson
76fa022045 Enable network policy controller metrics
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-01-11 10:19:39 -08:00
Brad Davidson
37e9b87f62 Add embedded registry implementation
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-01-09 15:23:05 -08:00
Brad Davidson
ef90da5c6e Add server CLI flag and config fields for embedded registry
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-01-09 15:23:05 -08:00
Brad Davidson
77846d63c1 Propagate errors up from config.Get
Fixes crash when killing agent while waiting for config from server

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-01-09 15:23:05 -08:00
Brad Davidson
16d29398ad Move registries.yaml load into agent config
Moving it into config.Agent so that we can use or modify it outside the context of containerd setup

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-01-09 15:23:05 -08:00
Brad Davidson
5c99bdd9bd Pin images instead of locking layers with lease
Layer leases never did what we wanted anyways, and this is the new approved interface for ensuring that images do not get GCd

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-01-09 15:23:05 -08:00
Vitor Savian
4a92ced8ee Handle etcd status condition when cluster reset and disable etcd
Signed-off-by: Vitor Savian <vitor.savian@suse.com>

Set condition if node is unhealthy

Signed-off-by: Vitor Savian <vitor.savian@suse.com>
2024-01-09 11:20:41 -03:00
Aofei Sheng
8d2c40cdac
Use ipFamilyPolicy: RequireDualStack for dual-stack kube-dns (#8984)
Signed-off-by: Aofei Sheng <aofei@aofeisheng.com>
2024-01-09 00:44:03 +02:00
Manuel Buil
6330e26bb3 Wait for taint to be gone in the node before starting the netpol controller
Signed-off-by: Manuel Buil <mbuil@suse.com>
2024-01-08 12:04:18 +01:00
Brad Davidson
b297996b92 Add runtime checking of golang version
Forces other groups packaging k3s to intentionally choose to build k3s with an unvalidated golang version

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-01-04 17:22:46 -08:00
Lex Rivera
5fe074b540
Add more paths to crun runtime detection (#9086)
* add usr/local paths for crun detection

Signed-off-by: Lex Rivera <me@lex.io>
2024-01-04 16:51:13 -08:00
Brad Davidson
c45524e662 Add support for containerd cri registry config_path
Render cri registry mirrors.x.endpoints and configs.x.tls into config_path; keep
using mirrors.x.rewrites and configs.x.auth those do not yet have an
equivalent in the new format.

The new config file format allows disabling containerd's fallback to the
default endpoint when using mirror endpoints; a new CLI flag is added to
control that behavior.

This also re-shares some code that was unnecessarily split into parallel
implementations for linux/windows versions. There is probably more work
to be done on this front but it's a good start.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-01-04 16:50:26 -08:00
Brad Davidson
319dca3e82 Fix nil map in full snapshot configmap reconcile
If a full reconcile wins the race against sync of an individual snapshot resource, or someone intentionally deletes the configmap, the data map could be nil and cause a crash.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-01-04 16:49:58 -08:00
Brad Davidson
db7091b3f6 Handle logging flags when parsing kube-proxy args
Also adds a test to ensure this continues to work.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-01-04 16:23:03 -08:00
Brad Davidson
1e663622d2 Fix the OTHER log message that prints the wrong variable
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-01-04 15:23:39 -08:00
Brad Davidson
a27d660a24 Add ServiceLB support for PodHostIPs FeatureGate
If the feature-gate is enabled, use status.hostIPs for dual-stack externalTrafficPolicy=Local support

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-01-02 16:00:09 -08:00
Derek Nola
aca1c2fd11
Add a retry around updating a secrets-encrypt node annotations (#9039)
* Add a retry around updating a se node annotations

Signed-off-by: Derek Nola <derek.nola@suse.com>
2024-01-02 12:21:37 -08:00
Pierre
bbd68f3a50
Rebase & Squash (#9070)
Signed-off-by: Yodo <pierre@azmed.co>
2024-01-02 12:05:36 -08:00
Derek Nola
3190a5faa2
Remove rotate-keys subcommand (#9079)
Signed-off-by: Derek Nola <derek.nola@suse.com>
2023-12-20 12:26:41 -08:00
Hussein Galal
9411196406
Update flannel to v0.24.0 and remove multiclustercidr flag (#9075)
* update flannel to v0.24.0

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* remove multiclustercidr flag

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

---------

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2023-12-20 00:25:38 +02:00
Hussein Galal
7101af36bb
Update Kubernetes to v1.29.0+k3s1 (#9052)
* Update to v1.29.0

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Update to v1.29.0

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Update go to 1.21.5

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* update golangci-lint

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* update flannel to 0.23.0-k3s1

This update uses k3s' fork of flannel to allow the removal of
multicluster cidr flag logic from the code

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fix flannel calls

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* update cri-tools to version v1.29.0-k3s1

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Remove GOEXPERIMENT=nounified from arm builds

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Skip golangci-lint

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Fix setup logging with newer go version

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Move logging flags to components arguments

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* add sysctl commands to the test script

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Update scripts/test

Signed-off-by: Brad Davidson <brad@oatmail.org>

* disable secretsencryption tests

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

---------

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
Signed-off-by: Brad Davidson <brad@oatmail.org>
Co-authored-by: Brad Davidson <brad@oatmail.org>
2023-12-19 05:14:02 +02:00
Brad Davidson
231cb6ed20
Remove GA feature-gates (#8970)
Remove KubeletCredentialProviders and JobTrackingWithFinalizers feature-gates, both of which are GA and cannot be disabled.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-12-14 22:57:24 +02:00
Brad Davidson
08509a2a90 Allow setting default-runtime on servers
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-12-08 18:18:08 -08:00
Brad Davidson
b9c288f702 Bump containerd/runc to v1.7.10-k3s1/v1.1.10
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-12-08 18:17:19 -08:00
Vitor Savian
03532f7c0b Added runtime classes for crun/wasm/nvidia
Signed-off-by: Vitor Savian <vitor.savian@suse.com>

Added default runtime flag

Signed-off-by: Vitor Savian <vitor.savian@suse.com>
2023-12-08 15:49:28 -03:00
Brad Davidson
6d3a92a658 Print key instead of file path in snapshot metadata log message
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-11-21 14:03:27 -08:00
Brad Davidson
b23e70d519 Don't apply s3 retention if S3 client failed to initialize
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-11-21 14:03:27 -08:00
Brad Davidson
a92c4a0f17 Don't request metadata when listing objects
While some implementations may support it, it appears that most don't,
and some may in fact return an error if it is requested.

We already stat the object to get the metadata anyway, so this was
unnecessary if harmless on most implementations.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-11-21 14:03:27 -08:00
Brad Davidson
1e0a7044cf Reorder snapshot configmap reconcile to reduce log spew during initial startup
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-11-17 10:09:01 -08:00
Vitor Savian
e53c189587
Handle nil pointer when runtime core is not ready in etcd
Signed-off-by: Vitor <vitor.savian@suse.com>
2023-11-16 15:58:42 -08:00
Brad Davidson
6c544a4679 Add jitter to client config retry
Also:
* Replaces labeled for/continue RETRY loops with wait helpers for improved readability
* Pulls secrets and nodes from cache for node password verification
* Migrate nodepassword tests to wrangler mocks for better code reuse

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-11-16 09:53:28 -08:00
Harsimran Singh Maan
abc2efdd57
Disable helm CRD installation for disable-helm-controller (#8702)
* Disable helm CRD installation for disable-helm-controller
    The NewContext package requires config as input which would
    require all third-party callers to update when the new go module
    is published.
    
    This change only affects the behaviour of installation of helm
    CRDs. Existing helm crds installed in a cluster would not be removed
    when disable-helm-controller flag is set on the server.
    
    Addresses #8701
* address review comments
* remove redundant check

Signed-off-by: Harsimran Singh Maan <maan.harry@gmail.com>
2023-11-15 14:35:31 -08:00
Jason Costello
07ee854914
Tweaked order of ingress IPs in ServiceLB (#8711)
* Tweaked order of ingress IPs in ServiceLB
    Previously, ingress IPs were only string-sorted when returned
    Sorted by IP family and string-sorted in each family as part of
    filterByIPFamily method
* Update pkg/cloudprovider/servicelb.go
* Formatting

Signed-off-by: Jason Costello <jason@hazy.com>
Co-authored-by: Brad Davidson <brad@oatmail.org>
2023-11-15 14:33:31 -08:00
Brad Davidson
7ecd5874d2 Skip initial datastore reconcile during cluster-reset
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-11-15 14:31:44 -08:00
Brad Davidson
2088218c5f Fix issue with snapshot metadata configmap
Omit snapshot list configmap entries for snapshots without extra metadata; reduce log level of warnings about missing s3 metadata files.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-11-15 14:25:28 -08:00
chenk008
b47cbbfd42
add agent flag disable-apiserver-lb (#8717)
* add node flag disable-agent-lb
* add agent flag disable-apiserver-lb

Co-authored-by: Brad Davidson <brad@oatmail.org>
Signed-off-by: chenk008 <kongchen28@gmail.com>
2023-11-14 15:54:32 -08:00