Commit graph

48 commits

Author SHA1 Message Date
Brad Davidson
3acf8db8f2 Update packages to remove dep on archived github.com/pkg/errors
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2026-03-09 16:09:01 -07:00
Brad Davidson
100cb633a3 lint: duplicated-imports
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-12-18 11:20:07 -08:00
Brad Davidson
316464975e lint: redundant-build-tag
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-12-18 11:20:07 -08:00
Brad Davidson
4974fc7c24 Use sync.WaitGroup to avoid exiting before components have shut down
Currently only waits on etcd and kine, as other components
are stateless and do not need to shut down cleanly.

Terminal but non-fatal errors now request shutdown via context
cancellation, instead of just logging a fatal error.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-09-17 09:37:08 -07:00
Brad Davidson
7e253dbf02 Fix netpol fatal error when changing node IP
Wait for updated ready condition before starting netpol controller, to ensure that node IPs have been updated following a restart. The current checks only ensure that the taint is removed, which works for the initial join - but does not handle changing node IPs on restarts.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-09-10 10:27:52 -07:00
Roberto Bonafiglia
573da0d41c Update network components
Signed-off-by: Roberto Bonafiglia <roberto.bonafiglia@suse.com>
2025-06-16 16:04:25 +02:00
Brad Davidson
bed1f66880 Avoid use of github.com/pkg/errors functions that capture stack
We are not making use of the stack traces that these functions capture, so we should avoid using them as unnecessary overhead.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-03-05 00:41:38 -08:00
Brad Davidson
96c2dd3865 Skip netpol startup on Windows instead of panicking
Netpol startup is skipped with a warning on Linux if ipset support is missing; we should do the same on Windows.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2025-02-07 07:46:19 -08:00
Brad Davidson
71918e0d69 Use helper to set consistent rest.Config rate limits and timeouts
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-12-06 11:45:34 -08:00
Will
e4f3cc7b54 remove deprecated use of wait functions
Signed-off-by: Will <will7989@hotmail.com>
2024-07-29 16:23:17 -07:00
Brad Davidson
ed23a2bb48 Fix netpol crash when node remains tainted uninitialized
It is conceivable that users might take more than 60 seconds to deploy their own cloud-provider. Instead of exiting, we should wait forever, but with more logging to indicate what's being waited on.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-28 23:34:44 -07:00
Brad Davidson
ff679fb3ab Refactor supervisor listener startup and add metrics
* Refactor agent supervisor listener startup and authn/authz to use upstream
  auth delegators to perform SubjectAccessReview for access to
  metrics.
* Convert spegel and pprof handlers over to new structure.
* Promote bind-address to agent flag to allow setting supervisor bind
  address for both agent and server.
* Promote enable-pprof to agent flag to allow profiling agents. Access
  to the pprof endpoint now requires client cert auth, similar to the
  spegel registry api endpoint.
* Add prometheus metrics handler.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-05-28 16:24:57 -07:00
Brad Davidson
513c3416e7 Tweak netpol node wait logs
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-03-01 12:01:34 -08:00
Brad Davidson
86f102134e Fix netpol startup when flannel is disabled
Don't break out of the poll loop if we can't get the node, RBAC might not be ready yet.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-02-26 14:58:48 -08:00
Brad Davidson
76fa022045 Enable network policy controller metrics
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2024-01-11 10:19:39 -08:00
Manuel Buil
6330e26bb3 Wait for taint to be gone in the node before starting the netpol controller
Signed-off-by: Manuel Buil <mbuil@suse.com>
2024-01-08 12:04:18 +01:00
Manuel Buil
4aafff0219 Wrap error stating that it is coming from netpol
Signed-off-by: Manuel Buil <mbuil@suse.com>
2023-05-12 19:33:25 +02:00
Roberto Bonafiglia
3e3512bdae Updated kube-router version to move the iptables ACCEPT default rule to the end of the chain
Signed-off-by: Roberto Bonafiglia <roberto.bonafiglia@suse.com>
2023-04-06 09:55:34 +02:00
Roberto Bonafiglia
e098b99bfa Update flannel and kube-router (#7039)
* Update kube-router version to fix iptables rules

Signed-off-by: Roberto Bonafiglia <roberto.bonafiglia@suse.com>

* Update Flannel to v0.21.3

Signed-off-by: Roberto Bonafiglia <roberto.bonafiglia@suse.com>

---------

Signed-off-by: Roberto Bonafiglia <roberto.bonafiglia@suse.com>
2023-03-10 19:57:16 -08:00
Thomas Ferrandiz
68ac954489 log kube-router version when starting netpol controller
Signed-off-by: Thomas Ferrandiz <thomas.ferrandiz@suse.com>
2022-11-03 12:26:50 +01:00
Michal Rostecki
5f2a4d4209 server: Allow enabling network policies with IPv6-only
After previous changes, network policies are working on IPv6-only
installations.

Signed-off-by: Michal Rostecki <vadorovsky@gmail.com>
2022-04-29 10:51:38 -07:00
Michal Rostecki
c0045f415b agent(netpol): Explicitly enable IPv4 when necessary
Before this change, kube-router was always assuming that IPv4 is
enabled, which is not the case in IPv6-only clusters. To enable network
policies in IPv6-only, we need to explicitly let kube-router know when
to disable IPv4.

Signed-off-by: Michal Rostecki <vadorovsky@gmail.com>
2022-04-29 10:51:38 -07:00
Michal Rostecki
c707948adf netpol: Add dual-stack support
This change allows defining two cluster CIDRs for compatibility with
Kubernetes dual-stack, with an assumption that two CIDRs are usually
IPv4 and IPv6.

It does that by leveraging changes in our kube-router fork, with the
following downstream release:

https://github.com/k3s-io/kube-router/releases/tag/v1.3.2%2Bk3s

Signed-off-by: Michal Rostecki <vadorovsky@gmail.com>
2022-04-06 14:43:09 +02:00
Luther Monson
9a849b1bb7 [master] changing package to k3s-io (#4846)
* changing package to k3s-io

Signed-off-by: Luther Monson <luther.monson@gmail.com>

Co-authored-by: Derek Nola <derek.nola@suse.com>
2022-03-02 15:47:27 -08:00
Manuel Buil
062fe63dd1 Fix annoying netpol log
Signed-off-by: Manuel Buil <mbuil@suse.com>
2022-02-10 20:01:27 +01:00
Michal Rostecki
4fed9f4052 netpol: Use kube-router as a library
Before this change, we were copying a part of kube-router code to
pkg/agent/netpol directory with modifications, from which the biggest
one was consumption of k3s node config instead of kube-router config.

However, that approach made it hard to follow new upstream versions.
It's possible to use kube-router as a library, so it seems like a better
way to do that.

Instead of modifying kube-router network policy controller to consume
k3s configuration, this change just converts k3s node config into
kube-router config. All the functionality of kube-router except netpol
is still disabled.

Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
Signed-off-by: Manuel Buil <mbuil@suse.com>
2022-02-07 10:54:08 +01:00
Derek Nola
21c8a33647 Introduction of Integration Tests (#3695)
* Commit of new etcd snapshot integration tests.
* Updated integration github action to not run on doc changes.
* Update Drone runner to only run unit tests

Signed-off-by: dereknola <derek.nola@suse.com>
2021-07-26 09:59:33 -07:00
Derek Nola
73df2d806b Update embedded kube-router (#3557)
* Update embedded kube-router

Signed-off-by: dereknola <derek.nola@suse.com>
2021-07-07 08:46:10 -07:00
Brad Davidson
2705431d96 Add support for dual-stack Pod/Service CIDRs and node IP addresses (#3212)
* Add support for dual-stack cluster/service CIDRs and node addresses

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-04-21 15:56:20 -07:00
Brad Davidson
65c78cc397 Replace options.KubeRouterConfig with config.Node and remove metrics/waitgroup stuff
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-02-03 10:41:51 -08:00
Brad Davidson
95a1a86847 Spell check upstream code
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-02-03 10:41:51 -08:00
Brad Davidson
29483d0651 Initial update of netpol and utils from upstream
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-02-03 10:41:51 -08:00
JenTing Hsiao
57041f0239 Add codespell CI test and fix codespell error (#2740)
* Add codespell CI test
* Fix codespell error
2020-12-22 12:35:58 -08:00
Erik Wilson
e26e333b7e Add network policy controller CacheSyncOrTimeout
2020-10-07 12:35:44 -07:00
Erik Wilson
045cd49ab5 Add event handlers to network policy controller
2020-10-07 12:10:27 -07:00
Brad Davidson
8c6d3567fe Rename k3s-controller based on the build-time program name
Since we're replacing the k3s rolebindings.yaml in rke2, we should allow
renaming this so that we can use the white-labeled name downstream.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2020-09-16 10:53:07 -07:00
Brian Downs
866dc94cea Galal hussein etcd backup restore (#2154)
* Add etcd snapshot and restore

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fix error logs

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* goimports

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fix flag description

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Add disable snapshot and retention

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* use creation time for snapshot retention

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* unexport method, update var name

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* adjust snapshot flags

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update var name, string concat

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* revert previous change, create constants

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* updates

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* type assertion error checking

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* pr remediation

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* pr remediation

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* pr remediation

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* pr remediation

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* pr remediation

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* updates

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* updates

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* simplify logic, remove unneeded function

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update flags

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update flags

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* add comment

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* exit on restore completion, update flag names, move retention check

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* exit on restore completion, update flag names, move retention check

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* exit on restore completion, update flag names, move retention check

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update disable snapshots flag and field names

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* move function

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update field names

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update var and field names

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update var and field names

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update defaultSnapshotIntervalMinutes to 12 like rke

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update directory perms

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update etc-snapshot-dir usage

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update interval to 12 hours

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* fix usage typo

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* add cron

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* add cron

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* add cron

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* wire in cron

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* wire in cron

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* wire in cron

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* wire in cron

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* wire in cron

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* wire in cron

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* wire in cron

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update deps target to work, add build/data target for creation, and generate

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* remove dead make targets

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* error handling, cluster reset functionality

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* error handling, cluster reset functionality

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* update

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* remove intermediate dapper file

Signed-off-by: Brian Downs <brian.downs@gmail.com>

Co-authored-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2020-08-28 16:57:40 -07:00
Erik Wilson
0d6a2bfb0b Merge pull request #1974 from mschneider82/patch-1
fixed panic in network_policy_controller
2020-07-01 09:48:00 -07:00
niusmallnan
d713683614 Add retry backoff for starting network-policy controller
Signed-off-by: niusmallnan <niusmallnan@gmail.com>
2020-06-30 09:25:09 +08:00
Matthias Schneider
56a083c812 fixed panic in network_policy_controller
I have rebooted a newly created k3s etcd cluster and this panic was triggered:

    ```
    k3s[948]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x45f2945]
    k3s[948]: goroutine 1 [running]:
    k3s[948]: github.com/rancher/k3s/pkg/agent/netpol.NewNetworkPolicyController(0xc00159e180, 0x61b4a60, 0xc006294000, 0xdf8475800, 0xc011d9a360, 0xc, 0x0, 0xc00bf545b8, 0x2b2edbc)
    k3s[948]:         /home/x/git/k3s/pkg/agent/netpol/network_policy_controller.go:1698 +0x275
    ```

Signed-off-by: Matthias Schneider <ms@wck.biz>
2020-06-29 20:49:24 +02:00
Darren Shepherd
a8d96112d9 Updates for k8s v1.18 support
2020-04-18 23:59:08 -07:00
Knic Knic
c2db115ec3 fix formatting
2020-02-23 00:48:26 -08:00
Knic Knic
2346ccc63f get build on windows and get api_server to work
2020-02-22 23:17:59 -08:00
Erik Wilson
5b98d10e4b Warn if NPC can't start rather than fatal error
If the ip_set kernel module is not available we should warn
that the network policy controller can not start rather than
cause a fatal error.

Also adds module probing and config checks for ip_set.
2020-01-14 14:30:12 -07:00
yuzhiquan
24869ddf21 remove []byte trans, handle func error
2019-11-28 19:26:45 +08:00
yuzhiquan
7cc0110081 fix typo
2019-11-28 19:24:19 +08:00
Darren Shepherd
ba240d0611 Refactor tokens, bootstrap, and cli args
2019-10-30 19:06:49 -07:00
Erik Wilson
da3a7c6bbc Add network policy controller
2019-10-18 16:11:42 -07:00