41 commits

Author SHA1 Message Date
Brad Davidson
d38b4b30cd Replace temporary etcd server with raw mvcc store access
Fixes an issue where copying files out from under a currently-running etcd instance can cause startup reconcile to fail. Direct creation of an mvcc store without any of the raft stuff is faster, and gives us direct control over how the store handles snapshot recovery.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2026-01-05 09:59:29 -08:00
Brad Davidson
d8af4f162a lint: if-return
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-12-18 11:20:07 -08:00
Brad Davidson
4974fc7c24 Use sync.WaitGroup to avoid exiting before components have shut down
Currently only waits on etcd and kine, as other components
are stateless and do not need to shut down cleanly.

Terminal but non-fatal errors now request shutdown via context
cancellation, instead of just logging a fatal error.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-09-17 09:37:08 -07:00
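
The shutdown pattern described in the commit above can be illustrated with a minimal, standard-library-only sketch. This is not the K3s implementation: `runComponents`, `demo`, and the component functions are hypothetical names invented for illustration, and only the component labels "etcd" and "kine" come from the commit message. It shows a terminal error requesting shutdown through context cancellation while a `sync.WaitGroup` keeps the caller from returning before every component has stopped.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// runComponents (hypothetical helper) starts each component in its own
// goroutine and blocks until all of them have returned. A component that
// returns an error cancels the shared context instead of exiting the
// process, so the remaining components can shut down cleanly.
func runComponents(ctx context.Context, components map[string]func(context.Context) error) []string {
	ctx, cancel := context.WithCancel(ctx)
	defer cancel()

	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		stopped []string
	)
	for name, run := range components {
		wg.Add(1)
		go func(name string, run func(context.Context) error) {
			defer wg.Done()
			if err := run(ctx); err != nil {
				fmt.Printf("%s failed: %v; requesting shutdown\n", name, err)
				cancel() // request shutdown rather than logging a fatal error
			}
			mu.Lock()
			stopped = append(stopped, name)
			mu.Unlock()
		}(name, run)
	}

	wg.Wait() // do not return until every component has shut down
	return stopped
}

// demo wires up two stand-in components: one that runs until shutdown is
// requested and one that fails immediately. It returns how many stopped.
func demo() int {
	stopped := runComponents(context.Background(), map[string]func(context.Context) error{
		"etcd": func(ctx context.Context) error {
			<-ctx.Done() // stand-in for a long-running stateful component
			return nil
		},
		"kine": func(ctx context.Context) error {
			return errors.New("simulated terminal error")
		},
	})
	return len(stopped)
}

func main() {
	fmt.Println("components stopped:", demo())
}
```

In this sketch the failing component triggers cancellation, the long-running one observes `ctx.Done()` and returns, and only then does `wg.Wait()` unblock. Stateless components would simply not be added to the wait group.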
Derek Nola
56ef1cd3a2
Update etcd to v3.6.4-k3s3
* Raft is now an independent dependency, with a separate release version
* errors moved into their own subpackage
* set a default WarningUnaryRequestDuration

Signed-off-by: Derek Nola <derek.nola@suse.com>
Co-authored-by: Michael Fritch <mfritch@suse.com>
2025-09-04 17:33:10 -06:00
Brad Davidson
c837bfcdc7 Bump kine for metrics panic fix
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-09-03 09:52:51 -07:00
bo.jiang
b5f4fd1d73 Fix K3s not validating datastore connection when no token is set
Signed-off-by: bo.jiang <bo.jiang@daocloud.io>
2025-06-05 12:49:26 -07:00
Brad Davidson
f940368747 Use etcd proxy to bootstrap control-plane-only nodes, if possible
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-02-27 11:19:26 -08:00
Brad Davidson
53fcadc028 Serve HTTP bootstrap data from datastore before disk
Fixes issue where CA rotation would fail on servers with join URL set due to using old data from disk on other server

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-02-27 11:19:26 -08:00
Will
e4f3cc7b54 remove deprecated use of wait functions
Signed-off-by: Will <will7989@hotmail.com>
2024-07-29 16:23:17 -07:00
Vitor Savian
5d69d6e782 Add tls for kine
Signed-off-by: Vitor Savian <vitor.savian@suse.com>

Bump kine

Signed-off-by: Vitor Savian <vitor.savian@suse.com>

Add integration tests for kine with tls

Signed-off-by: Vitor Savian <vitor.savian@suse.com>
2024-03-28 11:12:07 -03:00
Brad Davidson
d885162967 Add server token hash to CR and S3
This required pulling the token hash stuff out of the cluster package, into util.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-10-12 15:04:45 -07:00
Derek Nola
dface01de8
Server Token Rotation (#8265)
* Consolidate NewCertCommands
* Add support for user defined new token
* Add E2E testlets

Signed-off-by: Derek Nola <derek.nola@suse.com>

* Ensure agent token also changes

Signed-off-by: Derek Nola <derek.nola@suse.com>
2023-10-09 10:58:49 -07:00
Brad Davidson
cf9ebb3259 Fail to validate server tokens that use bootstrap id/secret format
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-05-05 12:24:35 -07:00
Brad Davidson
d95980bba3 Lock bootstrap data with empty key to prevent conflicts
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-04-05 10:56:57 -07:00
Derek Nola
06d81cb936
Replace deprecated ioutil package (#6230)
* Replace ioutil package
* check integration test null pointer
* Remove rotate retries

Signed-off-by: Derek Nola <derek.nola@suse.com>
2022-10-07 17:36:57 -07:00
Luther Monson
9a849b1bb7
[master] changing package to k3s-io (#4846)
* changing package to k3s-io

Signed-off-by: Luther Monson <luther.monson@gmail.com>

Co-authored-by: Derek Nola <derek.nola@suse.com>
2022-03-02 15:47:27 -08:00
Brad Davidson
e4846c92b4 Move temporary etcd startup into etcd module
Reuse the existing etcd library code to start up the temporary etcd
server for bootstrap reconcile. This allows us to do proper
health-checking of the datastore on startup, including handling of
alarms.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-03-01 20:25:20 -08:00
Brad Davidson
a1b800f0bf Remove unnecessary copies of etcdconfig struct
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-02-28 12:05:16 -08:00
Brad Davidson
2989b8b2c5 Remove unnecessary copies of runtime struct
Several types contained redundant references to ControlRuntime data. Switch to consistently accessing this via config.Runtime instead.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-02-28 12:05:16 -08:00
Brad Davidson
8ad7d141e8 Close etcd clients to avoid leaking GRPC connections
If you don't explicitly close the etcd client when you're done with it,
the GRPC connection hangs around in the background. Normally this is
harmless, but in the case of the temporary etcd we start up on 2399 to
reconcile bootstrap data, the client will start logging errors
afterwards when the server goes away.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-12-17 23:55:17 -08:00
Derek Nola
bcb662926d
Secrets-encryption rotation (#4372)
* Regular CLI framework for encrypt commands
* New secrets-encryption feature
* New integration test
* fixes for flaky integration test CI
* Fix to bootstrap on restart of existing nodes
* Consolidate event recorder

Signed-off-by: Derek Nola <derek.nola@suse.com>
2021-12-07 14:31:32 -08:00
Brian Downs
adaeae351c
update bootstrap logic (#4438)
* update bootstrap logic resolving a startup bug and account for etcd
2021-11-10 05:33:42 -07:00
Brian Downs
34080b23b1
Copy old bootstrap buffer data for use during migration (#4215)
2021-10-15 10:17:29 -07:00
Derek Nola
feec44572d
Improve error message when using a "K10" prefixed token (#4180)
* Add new error message with a K10 prefixed secret token

Signed-off-by: dereknola <derek.nola@suse.com>
2021-10-11 10:00:22 -07:00
Brian Downs
ac7a8d89c6
Add ability to reconcile bootstrap data between datastore and disk (#3398)
2021-10-07 12:47:00 -07:00
galal-hussein
20a48734c2 more fixes
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-07-21 22:42:05 +02:00
galal-hussein
7ebcc4b134 more fixes
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-07-21 22:39:44 +02:00
galal-hussein
b4401296ec replace error with warn in delete
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-07-21 22:18:56 +02:00
galal-hussein
2f82bfcf67 fix warning msg
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-07-21 22:05:43 +02:00
galal-hussein
b377839148 migrate old token key format
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-07-21 20:59:57 +02:00
galal-hussein
997ed7b9b4 simplifying the code
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-07-21 19:56:19 +02:00
galal-hussein
ad17292fa8 migrate empty string key properly
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-07-21 19:21:38 +02:00
galal-hussein
a65e5b6466 Fix multiple bootstrap keys found
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-07-21 02:50:42 +02:00
Brad Davidson
246b378a27 Bump kine to resolve race condition and unrevisioned delete
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-06-30 09:54:46 -07:00
Hussein Galal
136dddca11
Fix storing bootstrap data with empty token string (#3422)
* Fix storing bootstrap data with empty token string

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* delete node password secret after restoration

fixes to bootstrap key

vendor update

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fix comment

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fix typo

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* more fixes

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fixes

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fixes

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* typos

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Removing dynamic listener file after restoration

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* go mod tidy

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-06-22 22:42:34 +02:00
Hussein Galal
73df65d93a
remove etcd data dir when etcd is disabled (#3059)
* remove etcd data dir when etcd is disabled

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fix comment

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* more fixes

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* use debug instead of info logs

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-03-16 18:14:43 +02:00
Brian Downs
7c99f8645d
Have Bootstrap Data Stored in etcd at Completed Start (#3038)
* have state stored in etcd at completed start and remove unneeded code
2021-03-11 13:07:40 -07:00
Jacob Blain Christen
36230daa86
[migration k3s-io] update kine dependency (#2568)
rancher/kine ➡️ k3s-io/kine

Part of https://github.com/rancher/k3s/issues/2189

Signed-off-by: Jacob Blain Christen <jacob@rancher.com>
2020-11-30 16:45:22 -07:00
Brad Davidson
703ba5cde7 Add a bunch of doc comments
Also change identical error messages to clarify where problems are
occurring.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2020-09-27 03:10:00 -07:00
Darren Shepherd
a18d387390 Refactor clustered DB framework
2020-06-06 16:39:41 -07:00
Darren Shepherd
0ae20eb7a3 Support both http and db based bootstrap
2019-11-12 01:12:24 +00:00