prometheus

mirror of https://github.com/prometheus/prometheus.git synced 2026-06-08 16:12:16 -04:00

Author	SHA1	Message	Date
Matthieu MOREL	cef219c31c	chore: enable unused-receiver rule from revive Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2025-08-04 09:43:33 +00:00
Joe Adams	cdfb67467f	Review feedback Signed-off-by: Joe Adams <github@joeadams.io>	2025-07-31 21:22:43 -04:00
Joe Adams	56a3bbf5c5	Fix import formatting Signed-off-by: Joe Adams <github@joeadams.io>	2025-07-29 23:17:23 -04:00
Joe Adams	eab9b696f2	Upgrade AWS SDK to v2 AWS SDK v1 is end of life soon, so migrate to the V2 SDK. The credential loading should work more consistently with other projects that use the SDK and load credentials from the appropriate locations including from environment variables. This affects the EC2 and Lightsail service discovery features. Signed-off-by: Joe Adams <github@joeadams.io>	2025-07-29 23:06:05 -04:00
Ayoub Mrini	9dc274687b	Merge pull request #16831 from machine424/nsmeta feat(discovery/kubernetes): allow attaching namespace metadata	2025-07-17 10:30:27 +01:00
machine424	a9f6fdd910	feat(discovery/kubernetes): allow attaching namespace metadata to ingress and service roles. with the help of claude-4-sonnet Signed-off-by: machine424 <ayoubmrini424@gmail.com>	2025-07-17 09:53:16 +02:00
Yandi Lee	8eb445b8a4	Discovery.Manager: close sync ch after sender() is stopped (#14465 ) * close sync ch after sender() is stopped * break if chan is closed Signed-off-by: liyandi <littlepangdi@163.com> Co-authored-by: liyandi <liyandi@xiaomi.com>	2025-07-11 17:15:01 +01:00
machine424	020e803ee0	chore(discovery): remove unused StaticProvider struct, library users can easily define it on their side Signed-off-by: machine424 <ayoubmrini424@gmail.com>	2025-07-09 17:10:13 +01:00
chenlujjj	a2735494e1	chore: complete error message in RegisterSDMetrics function (#14635 ) Signed-off-by: chenlujjj <953546398@qq.com>	2025-07-08 12:05:24 +00:00
machine424	c2d6e528e4	feat(discovery/kubernetes): allow attaching namespace metadata to endpointslice, endpoints and pod roles after injecting the labels for endpointslice, claude-4-sonnet helped transpose the code and tests to endpoints and pod roles fixes https://github.com/prometheus/prometheus/issues/9510 supersedes https://github.com/prometheus/prometheus/pull/13798 Signed-off-by: machine424 <ayoubmrini424@gmail.com> Co-authored-by: Paul BARRIE <paul.barrie.calmels@gmail.com>	2025-07-03 19:41:08 +02:00
Lukasz Mierzwa	b49d143595	Fix a race in discovery manager ApplyConfig & shutdown If we call ApplyConfig() at the same time the manager is being stopped we might end up hanging forever. This is because ApplyConfig() will try to cancel obsolete providers and wait until they are cancelled. It's done by setting a done() function that call Done() on a sync.WaitGroup: ``` if len(prov.newSubs) == 0 { wg.Add(1) prov.done = func() { wg.Done() } } ``` then calling prov.cancel() and finally waiting until all providers run done() function that by blocking it all on a wg.Wait() call. For each provider there is a goroutine created by calling Manager.startProvider(Provider): ``` func (m Manager) startProvider(ctx context.Context, p Provider) { m.logger.Debug("Starting provider", "provider", p.name, "subs", fmt.Sprintf("%v", p.subs)) ctx, cancel := context.WithCancel(ctx) updates := make(chan []targetgroup.Group) p.mu.Lock() p.cancel = cancel p.mu.Unlock() go p.d.Run(ctx, updates) go m.updater(ctx, p, updates) } ``` It creates a context that can be cancelled and that cancel function becomes prov.cancel. This is what ApplyConfig will call. If we look at the body of updater() method: ``` func (m Manager) updater(ctx context.Context, p Provider, updates chan []targetgroup.Group) { // Ensure targets from this provider are cleaned up. defer m.cleaner(p) for { select { case <-ctx.Done(): return [...] ``` we can see that it will exit if that context is cancelled and that will trigger a call to Manager.cleaner(). That cleaner() is where done() is called. So ApplyConfig() -> calls cancel() -> causes cleaner() to be executed -> calls done(). 
cancel() is also called from cancelDiscoverers() method that will be called by Manager.Run() when Manager is stopping: ``` func (m Manager) Run() error { go m.sender() <-m.ctx.Done() m.cancelDiscoverers() return m.ctx.Err() } ``` The problem is that if we call both ApplyConfig and stop the manager at the same time we might end up with: - We call Manager.ApplyConfig() - We stop the Manager - Manager.cancelDiscoverers() is called - Provider.cancel() is called for every Provider - cancel() causes provider context to be cancelled which terminates updater() for given Provider - cancelling context causes cleaner() method to be called for given Provider - cleaner() calls done() and exits - Provider is considered stopped at this point, there is no goroutine running that will call done() anymore - ApplyConfig iterates providers and decides that one is obsolete and must be stopped - It sets a custom done() function body with a WaitGroup.Done() call in it - Then ApplyConfig waits until all Providers run done() - But they are all stopped and no done() will be run - We wait forever This only happens if cancelDiscoverers() is run before ApplyConfig, if ApplyConfig runs first done() will be called, if cancelDiscoverers() is called first it will stop updater() instances and so done() won't be called anymore. Part of the problem is that there is no distinction between running and stopped providers. There is Provider.IsStarted() method that returns a bool based on the value of cancel function but ApplyConfig doesn't check it. Second problem is that although there is a mutex on a Provider it's not used much in the code, so two goroutines can try to read and/or write provider.cancel and/or provider.done at the same time, making it all more likely to race. The easiest way to fix it is to check if the provider is started inside ApplyConfig so we don't try to stop a provider that's already stopped. For that we need to mark it as stopped after cancel() is called, by setting cancel to nil. 
This also needs better lock usage to avoid different parts of the code trying to set cancel and done at the same time. Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>	2025-07-02 16:03:10 +01:00
Lukasz Mierzwa	357e652044	Add a test for a rare shutdown hang When doing a config reload that need to stop some providers while also sending SIGTERM to Prometheus at the same time can sometimes hang 1: sync.WaitGroup.Wait [83 minutes] [Created by run.(Group).Run in goroutine 1 @ group.go:37] sync sema.go:110 runtime_SemacquireWaitGroup(uint32(#166)) sync waitgroup.go:118 (WaitGroup).Wait(WaitGroup(#23)) discovery manager.go:276 (Manager).ApplyConfig(#23, #167) main main.go:964 main.func5(#120) main main.go:1505 reloadConfig({#183, 0x1b}, 1, #40, #43, #50, {#31, 0xa, 0}) main main.go:1182 main.func22() run group.go:38 (Group).Run.func1(*Group(#26), #51) Add a test for it. Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>	2025-07-02 16:01:42 +01:00
Bryan Boreham	d6f9ba6310	[BUILD] Docker SD: Fix up deprecated types Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2025-06-23 16:15:58 +01:00
Jan-Otto Kröpke	ceaa3bd6f9	discovery: add STACKIT SD (#16401 )	2025-06-17 15:41:14 +02:00
Ayoub Mrini	50ba25f273	chore(docs/kubernetes SD): add a note about Endpoints API being deprecated in kubernetes 1.33+ (#16684 ) * chore(docs/kubernetes SD): add a note about Endpoints API being deprecated in kubernetes 1.33+ Signed-off-by: machine424 <ayoubmrini424@gmail.com> * chore(discovery/kubernetes): add Endpoints API deprecation comment Signed-off-by: machine424 <ayoubmrini424@gmail.com> --------- Signed-off-by: machine424 <ayoubmrini424@gmail.com>	2025-06-06 11:56:27 +02:00
Zhengke Zhou	45211dc72f	chore: Adjust test and add comment about DNS resolution issue for failing tests (#16200 ) * chore: Add comment about DNS resolution issue for failing tests Signed-off-by: zhengkezhou1 <madzhou1@gmail.com> * remove unexported-return Signed-off-by: zhengkezhou1 <madzhou1@gmail.com> --------- Signed-off-by: zhengkezhou1 <madzhou1@gmail.com>	2025-05-27 14:40:09 +02:00
Ryan Wu	091e662f4d	refactor(endpointslice): use service cache.Indexer to achieve better iteration performance (#16365 ) * refactor(endpointslice): use cache.Indexer to index endpointslices by LabelServiceName so not have to iterate over all endpoint objects. Signed-off-by: Ryan Wu <rongjun0821@gmail.com> * check the type and error early and add 'TestEndpointSliceDiscoveryWithUnrelatedServiceUpdate' unit test to give a regression test Signed-off-by: Ryan Wu <rongjun0821@gmail.com> * make service indexer namespaced Signed-off-by: Ryan Wu <rongjun0821@gmail.com> * remove unneeded test func Signed-off-by: Ryan Wu <rongjun0821@gmail.com> * Apply suggestions from code review Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com> Signed-off-by: Ryan Wu <rongjun0821@gmail.com> --------- Signed-off-by: Ryan Wu <rongjun0821@gmail.com> Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>	2025-05-20 20:33:25 +02:00
Ayoub Mrini	eb8d34c2ad	Merge pull request #16587 from prymitive/discoveryLocks discovery: Try fixing potential deadlocks in discovery	2025-05-19 11:09:49 +02:00
Ben Kochie	1eaf12e99b	Add golangci-lint fmt (#16602 ) With golangci-lint v2, it now has "formatters" that can be configured. Add `golangci-lint fmt` to the `make format` in Makefile.common. * Enable goimports formatter. Signed-off-by: SuperQ <superq@gmail.com>	2025-05-16 11:05:35 +02:00
Lukasz Mierzwa	59761f631b	Move m.targetsMtx.Lock down into the loop Make sure the order of locks is always the same in all functions. In ApplyConfig() we have m.targetsMtx.Lock() after provider is locked, so replicate the same in allGroups(). Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>	2025-05-15 12:30:48 +01:00
Lukasz Mierzwa	7d55ee8cc8	Try fixing potential deadlocks in discovery Manager.ApplyConfig() uses multiple locks: - Provider.mu - Manager.targetsMtx Manager.cleaner() uses the same locks but in the opposite order: - First it locks Manager.targetsMtx - Then it locks Provider.mu I've seen a few strange cases of Prometheus hanging up on shutdown and never completing that shutdown. From a few traces I was given it appears that while Prometheus is still running only discovery.Manager and notifier.Manager are running. From that trace it also seems like they are stuck on a lock from two functions: - cleaner waits on a RLock() - ApplyConfig waits on a Lock() I cannot reproduce it but I suspect this is a race between locks. Imagine this scenario: - Manager.ApplyConfig() is called - Manager.ApplyConfig locks Provider.mu.Lock() - at the same time cleaner() is called on the same Provider instance and it calls Manager.targetsMtx.Lock() - Manager.ApplyConfig() now calls Manager.targetsMtx.Lock() but that lock is already held by cleaner() function so ApplyConfig() hangs there - at the same time cleaner() now wants to lock Provider.mu.RLock() but that lock is already held by Manager.ApplyConfig() - we end up with both functions locking each other out without any way to break that lock Re-order lock calls to try to avoid this scenario. I tried writing a test case for it but couldn't hit this issue. Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>	2025-05-12 09:13:46 +01:00
hardlydearly	ba4b058b7a	refactor: use slices.Contains to simplify code Signed-off-by: hardlydearly <799511800@qq.com>	2025-05-09 08:27:10 +02:00
Arve Knudsen	e7e3ab2824	Fix linting issues found by golangci-lint v2.0.2 (#16368 ) * Fix linting issues found by golangci-lint v2.0.2 --------- Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	2025-05-03 19:05:13 +02:00
Jonas Lammler	08982b177f	Add `label_selector` to hetzner service discovery Allows to filter the servers when sending the listing request to the API. This feature is only available when using the `role=hcloud`. See https://docs.hetzner.cloud/#label-selector for details on how to use the label selector. Signed-off-by: Jonas Lammler <jonas.lammler@hetzner-cloud.de>	2025-04-30 09:24:14 +02:00
Ryan Wu	b4d3c06acb	discovery: make endpointSlice discovery more efficient (#16433 ) * discovery: a change to a service with the same name but from another namespace won't enqueue the endpointSlice Signed-off-by: Ryan Wu <rongjun0821@gmail.com> * Update discovery/kubernetes/endpointslice.go Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com> Signed-off-by: Ryan Wu <rongjun0821@gmail.com> * Update endpointslice.go Signed-off-by: Ryan Wu <rongjun0821@gmail.com> --------- Signed-off-by: Ryan Wu <rongjun0821@gmail.com> Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>	2025-04-16 16:43:30 +02:00
Zhengke Zhou	c884dd16ac	discovery: Remove ingress & endpoint slice adaptors (#16413 ) * Remove ingress & endpoint slice adaptors * fix ci Signed-off-by: zhengkezhou1 <madzhou1@gmail.com>	2025-04-09 10:25:53 +01:00
Ryan Wu	7d73c1d3f8	refactor[discovery, tsdb]: simplify error handling and remove redundant checks (#16328 ) * refactor: simplify error handling and remove redundant checks Signed-off-by: Ryan Wu <rongjun0821@gmail.com> * Add the comment for return of reloading blocks failure Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com> Signed-off-by: Ryan Wu <rongjun0821@gmail.com> * Add the comment for return of reloading blocks failure Signed-off-by: Ryan Wu <rongjun0821@gmail.com> --------- Signed-off-by: Ryan Wu <rongjun0821@gmail.com> Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>	2025-03-27 12:20:59 +01:00
Matthieu MOREL	5fa1146e21	chore: enable gci linter (#16245 ) Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2025-03-22 15:46:13 +00:00
Matthieu MOREL	6719867196	test(kubernetes): replace equality check with JSON equality assertion (#16246 ) Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2025-03-22 13:55:30 +01:00
Patryk Prus	452fd42aeb	Disable additional test as flaky on windows Signed-off-by: Patryk Prus <p@trykpr.us>	2025-03-18 14:06:33 -04:00
machine424	b0227d1f16	chore(discovery): disable some file update tests as flaky Signed-off-by: machine424 <ayoubmrini424@gmail.com>	2025-03-14 18:33:13 +01:00
Paulo Dias	9630dc656c	discovery(openstack): remove duplicated error handling for floatingips.List (#16205 ) Signed-off-by: Paulo Dias <paulodias.gm@gmail.com>	2025-03-12 15:25:50 +01:00
dependabot[bot]	6f9f29542e	chore(deps): bump github.com/docker/docker (#16118 ) Bumps [github.com/docker/docker](https://github.com/docker/docker) from 27.5.1+incompatible to 28.0.1+incompatible. - [Release notes](https://github.com/docker/docker/releases) - [Commits](https://github.com/docker/docker/compare/v27.5.1...v28.0.1) --- updated-dependencies: - dependency-name: github.com/docker/docker dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-03-07 15:40:29 +01:00
co63oc	0e4e5a71bd	Fix typos (#16076 ) Signed-off-by: co63oc <co63oc@users.noreply.github.com>	2025-02-28 11:24:25 +11:00
Matthieu MOREL	c7d4b53ec1	chore: enable unused-parameter from revive Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2025-02-19 19:50:28 +01:00
Björn Rabenstein	13c05a385c	Merge pull request #16007 from mmorel-35/revive/early-return chore: enable early-return from revive	2025-02-11 20:31:34 +01:00
Ayoub Mrini	de6add2c7d	Merge pull request #14228 from Codelax/sd-scaleway-routed-ips feat(scaleway-sd): add labels for multiple public IPs	2025-02-11 17:21:29 +01:00
Matthieu MOREL	b472ce7010	chore: enable early-return from revive Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2025-02-10 22:08:43 +01:00
Pierre Prinetti	bb30a871ac	deps: Use Gophercloud v2 Signed-off-by: Pierre Prinetti <pierreprinetti@redhat.com>	2025-01-28 15:08:34 +01:00
Jan Fajerski	ffea9f005b	Merge pull request #15539 from paulojmdias/openstack-loadbalancer-discovery discovery(openstack): add load balancer discovery	2025-01-28 14:10:06 +01:00
Jan Fajerski	7f37a008c4	Merge pull request #15540 from mmorel-35/prometheus/common@v0.61.0 chore(deps): use `version.PrometheusUserAgent`	2025-01-28 13:10:48 +01:00
Matthieu MOREL	dd5ab743ea	chore(deps): use version.PrometheusUserAgent Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2025-01-22 07:31:02 +01:00
Paulo Dias	803b1565a5	fix: fix network endpoint id Signed-off-by: Paulo Dias <paulodias.gm@gmail.com>	2025-01-21 11:20:15 +00:00
Paulo Dias	1d49d11786	fix: fix testing Signed-off-by: Paulo Dias <paulodias.gm@gmail.com>	2025-01-21 11:18:34 +00:00
Paulo Dias	cddf729ca3	Merge branch 'main' of github.com:prometheus/prometheus into openstack-loadbalancer-discovery Signed-off-by: Paulo Dias <paulodias.gm@gmail.com>	2025-01-21 11:16:52 +00:00
Paulo Dias	e8fab32ca2	discovery: move openstack floating ips function from deprecated Compute API /os-floating-ips to Network API /floatingips (#14367 )	2025-01-21 11:40:15 +01:00
crystalstall	616914abe2	Signed-off-by: crystalstall <crystalruby@qq.com> refactor: using slices.Contains to simplify the code Signed-off-by: crystalstall <crystalruby@qq.com>	2025-01-11 00:41:51 +08:00
Paulo Dias	36ccf62692	Merge branch 'prometheus:main' into openstack-loadbalancer-discovery	2025-01-02 14:44:19 +00:00
Paulo Dias	d40e99c2ec	Merge branch 'openstack-loadbalancer-discovery' of github.com:paulojmdias/prometheus into openstack-loadbalancer-discovery Signed-off-by: Paulo Dias <paulodias.gm@gmail.com>	2025-01-02 14:43:46 +00:00
Paulo Dias	cb7254158b	feat: rename status to provisioning_status and add operating_status Signed-off-by: Paulo Dias <paulodias.gm@gmail.com>	2025-01-02 14:43:31 +00:00
pinglanlu	6a61efcfc3	discovery: use a more direct and less error-prone return value (#15347 ) Signed-off-by: pinglanlu <pinglanlu@outlook.com>	2024-12-29 18:03:06 +01:00
Paulo Dias	a5c20713dc	Merge branch 'prometheus:main' into openstack-loadbalancer-discovery	2024-12-08 22:54:18 +00:00
Paulo Dias	713903fe48	fix: fix configuration and remove unneeded libs Signed-off-by: Paulo Dias <paulodias.gm@gmail.com>	2024-12-06 17:58:21 +00:00
Ayoub Mrini	af2a1cb10c	Merge pull request #15227 from aniketnk/i15185_1 Run discovery/kubernetes tests in parallel	2024-12-05 10:48:26 +01:00
Paulo Dias	d136e43109	fix: fix comment Signed-off-by: Paulo Dias <paulodias.gm@gmail.com>	2024-12-04 23:48:31 +00:00
Paulo Dias	9e9929c421	fix: remove new line Signed-off-by: Paulo Dias <paulodias.gm@gmail.com>	2024-12-04 23:46:11 +00:00
Paulo Dias	fc0141aec2	discovery: add openstack load balancer discovery Signed-off-by: Paulo Dias <paulodias.gm@gmail.com>	2024-12-04 23:34:29 +00:00
machine424	c9f3d9b47f	doc(nomad): adjust sections about nomad_sd_config's server test(nomad): extend TestConfiguredService with more valid/invalid servers configs fixes https://github.com/prometheus/prometheus/issues/12306 Signed-off-by: machine424 <ayoubmrini424@gmail.com>	2024-12-03 19:41:45 +01:00
hongmengning	2a1b940ae4	discovery: fix some function names in comment Signed-off-by: hongmengning <go@before.tech>	2024-11-25 17:33:04 +08:00
Aniket Kaulavkar	f7685caf0d	Parallelize discovery/kubernetes tests using t.Parallel() Signed-off-by: Aniket Kaulavkar <aniket.kaulavkar@gmail.com>	2024-11-14 10:44:03 +05:30
Matthieu MOREL	af1a19fc78	enable errorf rule from perfsprint linter Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2024-11-06 16:50:36 +01:00
Giedrius Statkevičius	58fedb6b61	discovery/kubernetes: optimize more gets Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>	2024-10-28 17:17:37 +02:00
Giedrius Statkevičius	716fd5b11f	discovery/kubernetes: use namespacedName Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>	2024-10-28 16:19:56 +02:00
Giedrius Statkevičius	e452308e37	discovery/kubernetes: optimize resolvePodRef resolvePodRef is in a hot path: ``` ROUTINE ======================== github.com/prometheus/prometheus/discovery/kubernetes.(Endpoints).resolvePodRef in discovery/kubernetes/endpoints.go 2.50TB 2.66TB (flat, cum) 22.28% of Total . . 447:func (e Endpoints) resolvePodRef(ref apiv1.ObjectReference) apiv1.Pod { . . 448: if ref == nil \|\| ref.Kind != "Pod" { . . 449: return nil . . 450: } 2.50TB 2.50TB 451: p := &apiv1.Pod{} . . 452: p.Namespace = ref.Namespace . . 453: p.Name = ref.Name . . 454: . 156.31GB 455: obj, exists, err := e.podStore.Get(p) . . 456: if err != nil { . . 457: level.Error(e.logger).Log("msg", "resolving pod ref failed", "err", err) . . 458: return nil . . 459: } . . 460: if !exists { ``` This is some low hanging fruit that we can easily optimize. The key of an object has format "namespace/name" so generate that inside of Prometheus itself and use pooling. ``` goos: linux goarch: amd64 pkg: github.com/prometheus/prometheus/discovery/kubernetes cpu: Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz │ olddisc │ newdisc │ │ sec/op │ sec/op vs base │ ResolvePodRef-16 516.3n ± 17% 289.5n ± 7% -43.92% (p=0.000 n=10) │ olddisc │ newdisc │ │ B/op │ B/op vs base │ ResolvePodRef-16 1168.00 ± 0% 24.00 ± 0% -97.95% (p=0.000 n=10) │ olddisc │ newdisc │ │ allocs/op │ allocs/op vs base │ ResolvePodRef-16 2.000 ± 0% 2.000 ± 0% ~ (p=1.000 n=10) ¹ ¹ all samples are equal ``` Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>	2024-10-28 12:12:40 +02:00
3Juhwan	685d6d169f	refactor: reorder fields in defaultSDConfig initialization for consistency Signed-off-by: 3Juhwan <13selfesteem91@naver.com>	2024-10-28 10:40:49 +01:00
Ayoub Mrini	98dcd28b1a	Merge pull request #15170 from machine424/awldi fix(discovery): Handle cache.DeletedFinalStateUnknown in node informers' DeleteFunc	2024-10-18 17:33:08 +02:00
akunszt	08a7162502	discovery: aws/ec2 unit tests (#14364 ) * discovery: add aws/ec2 unit tests * discovery: initial skeleton for aws/ec2 unit tests This is a - very likely - not too useful unit test for the AWS SD. It is committed so other people can check the basic logic and the implementation. Signed-off-by: Arpad Kunszt <akunszt@hiya.com> * discovery: fix linter complains about ec2_test.go Signed-off-by: Arpad Kunszt <akunszt@hiya.com> * discovery: add basic unit test for aws This tests only the basic labelling, not including the VPC related information. Signed-off-by: Arpad Kunszt <akunszt@hiya.com> * discovery: fix linter complains about ec2_test.go Signed-off-by: Arpad Kunszt <akunszt@hiya.com> * discovery: other linter fixes in aws/ec2_test.go Signed-off-by: Arpad Kunszt <akunszt@hiya.com> * discovery: implement remaining tests for aws/ec2 The coverage is not 100% but I think it is a good starting point if someone wants to improve that. Currently it covers all the AWS API calls. Signed-off-by: Arpad Kunszt <akunszt@hiya.com> * discovery: make linter happy in aws/ec2_test.go Signed-off-by: Arpad Kunszt <akunszt@hiya.com> * discovery: make utility functions private Signed-off-by: Arpad Kunszt <akunszt@hiya.com> * discover: no global variable in the aws/ec2 test Signed-off-by: Arpad Kunszt <akunszt@hiya.com> * discovery: common body for some tests in ec2 Signed-off-by: Arpad Kunszt <akunszt@hiya.com> * discovery: try to make golangci-lint happy Signed-off-by: Arpad Kunszt <akunszt@hiya.com> * discovery: make every non-test function private Signed-off-by: Arpad Kunszt <akunszt@hiya.com> * discovery: test for errors first in TestRefresh Signed-off-by: Arpad Kunszt <akunszt@hiya.com> * discovery: move refresh tests into the function This way people can find both the test cases and the execution of the test at the same place. 
Signed-off-by: Arpad Kunszt <akunszt@hiya.com> * discovery: fix copyright date Signed-off-by: Arpad Kunszt <akunszt@hiya.com> * discovery: remove misleading comment Signed-off-by: Arpad Kunszt <akunszt@hiya.com> * discovery: rename test for easier identification Signed-off-by: Arpad Kunszt <akunszt@hiya.com> * discovery: use static values for the test cases Signed-off-by: Arpad Kunszt <akunszt@hiya.com> * discover: try to make the linter happy Signed-off-by: Arpad Kunszt <akunszt@hiya.com> * discovery: drop redundant data from ec2 and use common ptr functions Signed-off-by: Arpad Kunszt <akunszt@hiya.com> * discovery: use Error instead of Equal Signed-off-by: Arpad Kunszt <akunszt@hiya.com> * discovery: merge refreshAZIDs tests into one Signed-off-by: Arpad Kunszt <akunszt@hiya.com> --------- Signed-off-by: Arpad Kunszt <akunszt@hiya.com>	2024-10-16 14:36:37 +02:00
machine424	b1c356beea	fix(discovery): Handle cache.DeletedFinalStateUnknown in node informers' DeleteFunc Signed-off-by: machine424 <ayoubmrini424@gmail.com>	2024-10-16 10:20:37 +02:00
M Viswanath Sai	16bba78f15	discovery: Improve Azure test coverage to 50% (#14586 ) * azure sd: separate refresh and refreshAzure * azure sd: create a client with mocked servers for tests * add test for refresh function --------- Signed-off-by: mviswanathsai <mviswanath.sai.met21@itbhu.ac.in>	2024-10-13 10:24:51 +02:00
Bryan Boreham	b87b88ddc2	Merge branch 'main' into consul-catalog-filter-support Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2024-10-08 12:20:31 +01:00
TJ Hoplock	6ebfbd2d54	chore!: adopt log/slog, remove go-kit/log For: #14355 This commit updates Prometheus to adopt stdlib's log/slog package in favor of go-kit/log. As part of converting to use slog, several other related changes are required to get prometheus working, including: - removed unused logging util func `RateLimit()` - forward ported the util/logging/Deduper logging by implementing a small custom slog.Handler that does the deduping before chaining log calls to the underlying real slog.Logger - move some of the json file logging functionality to use prom/common package functionality - refactored some of the new json file logging for scraping - changes to promql.QueryLogger interface to swap out logging methods for relevant slog sugar wrappers - updated lots of tests that used/replicated custom logging functionality, attempting to keep the logical goal of the tests consistent after the transition - added a healthy amount of `if logger == nil { $makeLogger }` type conditional checks amongst various functions where none were provided -- old code that used the go-kit/log.Logger interface had several places where there were nil references when trying to use functions like `With()` to add keyvals on the new *slog.Logger type Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>	2024-10-07 15:58:50 -04:00
Matthieu MOREL	ab64966e9d	fix: use "ErrorContains" or "EqualError" instead of "Contains(t, err.Error()" and "Equal(t, err.Error()" (#15094 ) * fix: use "ErrorContains" or "EqualError" instead of "Contains(t, err.Error()" and "Equal(t, err.Error()" --------- Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com> Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com> Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>	2024-10-06 16:35:29 +00:00
bas smit	73997289c3	tests: update discovery tests with new label Previous commit added the pod_container_init label to discovery, so all the tests need to reflect that. Signed-off-by: bas smit <bsmit@bol.com>	2024-10-01 10:26:58 +02:00
bas smit	a10dc9298e	sd k8s: support sidecar containers in endpoint discovery Sidecar containers are a newish feature in k8s. They're implemented similar to init containers but actually stay running and allow you to delay startup of your application pod until the sidecar started (like init containers always do). This adds the ports of the sidecar container to the list of discovered endpoint(slice), allowing you to target those containers as well. The implementation is a copy of that of Pod discovery fixes: #14927 Signed-off-by: bas smit <bsmit@bol.com>	2024-10-01 10:26:58 +02:00
bas smit	7a90d73fa6	sd k8s: test for sidecar container support in endpoints This test is expected to fail, the followup will add the feature Signed-off-by: bas smit <bsmit@bol.com>	2024-10-01 10:26:58 +02:00
machine424	b5569c4070	fix(discovery): adjust how type is retrieved in Configs' MarshalYAML/UnmarshalYAML Signed-off-by: machine424 <ayoubmrini424@gmail.com>	2024-09-30 12:33:07 +02:00
machine424	97f3219157	test(discovery): add a Configs test showing that the custom unmarshalling/marshalling is broken. This went under the radar because the utils are never called directly. We usually marshal/unmarshal Configs as embedded in a struct using UnmarshalYAMLWithInlineConfigs/MarshalYAMLWithInlineConfigs which bypasses Configs' custom UnmarshalYAML/MarshalYAML Signed-off-by: machine424 <ayoubmrini424@gmail.com>	2024-09-30 12:33:07 +02:00
Nathan Baulch	50cd453c8f	chore: Fix typos (#14868 ) * Fix typos --------- Signed-off-by: Nathan Baulch <nathan.baulch@gmail.com>	2024-09-10 22:32:03 +02:00
machine424	d18fa62ae9	chore(discovery): enable new-service-discovery-manager by default and drop legacymanager package Signed-off-by: machine424 <ayoubmrini424@gmail.com>	2024-09-05 12:46:03 +02:00
Jan Fajerski	fe4289b502	Merge branch 'main' into HEAD	2024-09-04 18:50:00 +02:00
Jan Fajerski	00315ce15e	Merge branch 'main' into 3.0-main-sync-24-08-30 using -Xours Signed-off-by: Jan Fajerski <jfajersk@redhat.com>	2024-09-02 11:27:18 +02:00
machine424	d23d196db5	fix(discovery): prevent the manager from storing stale targetGroups Signed-off-by: machine424 <ayoubmrini424@gmail.com>	2024-08-30 14:39:31 +02:00
machine424	c586c15ae6	fix(discovery): make discovery manager notify consumers of dropped targets for still defined jobs scrape/manager_test.go: add a test to check that the manager gets notified for targets that got dropped by discovery to reproduce: https://github.com/prometheus/prometheus/issues/12858#issuecomment-1732318102 Signed-off-by: machine424 <ayoubmrini424@gmail.com>	2024-08-28 17:39:02 +02:00
Bryan Boreham	4202be5e79	Merge branch 'release-2.54' into merge-2.54.1-into-main	2024-08-27 12:04:48 +01:00
beorn7	0f760f63dd	lint: Revamp our linting rules, mostly around doc comments Several things done here: - Set `max-issues-per-linter` to 0 so that we actually see all linter warnings and not just 50 per linter. (As we also set `max-same-issues` to 0, I assume this was the intention from the beginning.) - Stop using the golangci-lint default excludes (by setting `exclude-use-default: false`). Those are too generous and don't match our style conventions. (I have re-added some of the excludes explicitly in this commit. See below.) - Re-add the `errcheck` exclusion we have used so far via the defaults. - Exclude the signature requirement `govet` has for `Seek` methods because we use non-standard `Seek` methods a lot. (But we keep other requirements, while the default excludes completely disabled the check for common method signatures.) - Exclude warnings about missing doc comments on exported symbols. (We used to be pretty adamant about doc comments, but stopped that at some point in the past. By now, we have about 500 missing doc comments. We may consider reintroducing this check, but that's outside of the scope of this commit. The default excludes of golangci-lint essentially ignore doc comments completely.) - By no longer using the default excludes, we now get warnings back on malformed doc comments. That's the most impactful change in this commit. It does not enforce doc comments (again), but _if_ there is a doc comment, it has to have the recommended form. (Most of the changes in this commit are fixing this form.) - Improve wording/spelling of some comments in .golangci.yml, and remove an outdated comment. - Leave `package-comments` inactive, but add a TODO asking if we should change that. - Add a new sub-linter `comment-spacings` (and fix corresponding comments), which avoids missing spaces after the leading `//`. Signed-off-by: beorn7 <beorn@grafana.com>	2024-08-22 17:36:11 +02:00
Jan Fajerski	5138922b0d	Merge branch 'main' into 3.0-main-sync-24-08-21	2024-08-21 09:09:36 +02:00
ouyang1204@gmail.com	89dee48cc8	fix the issue of failing to match the first network when the container is reconnected to a new network Signed-off-by: ouyang1204@gmail.com <ouyang1204@gmail.com>	2024-08-19 21:26:25 +08:00
Arve Knudsen	3a78e76282	Upgrade golangci-lint to v1.60.1 Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	2024-08-18 12:13:25 +02:00
Julien Pivotto	7711cd5ab5	Remove deprecated storage.tsdb.retention flag Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu> Signed-off-by: Julien <roidelapluie@o11y.eu>	2024-08-16 13:53:09 +02:00
cuiweiyuan	1800af54f0	chore: fix some function names Signed-off-by: cuiweiyuan <cuiweiyuan@aliyun.com>	2024-08-15 13:57:21 +08:00
Björn Rabenstein	3f16a2e7de	Merge pull request #14543 from jan--f/3.0-main-sync-24-08-01 3.0 main sync 24 08 01	2024-08-13 15:54:13 +02:00
Julien	3933cba052	Merge pull request #14365 from simonpasquier/fix-12884 discovery(k8s): remove support for API versions no longer served	2024-08-09 12:48:54 +02:00
Bryan Boreham	79a0ba9d64	Merge pull request #13503 from tylitianrui/chore/remove_redundance remove redundant code	2024-07-30 12:44:03 +01:00
Bryan Boreham	ce3bd4abea	Update for Docker deprecation Signed-off-by: Bryan Boreham <bjboreham@gmail.com>	2024-07-17 17:03:32 +01:00
Simon Pasquier	145988d48f	discovery(k8s): remove support for API versions no longer served This commit removes support for the following API versions: * `discovery.k8s.io/v1beta1` API version of EndpointSlice (no longer served as of v1.25). * `networking.k8s.io/v1beta1` API version of Ingress (no longer served as of v1.22). Closes #12884 Signed-off-by: Simon Pasquier <spasquie@redhat.com>	2024-07-04 14:54:27 +02:00
Paulo Dias	f4b1fcb73e	discovery: add support for gathering flavor name in Openstack discovery (#14312 ) * feat: add support for gathering flavor name in Openstack discovery Signed-off-by: Paulo Dias <paulodias.gm@gmail.com> * Update instance.go Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com> Signed-off-by: Paulo Dias <44772900+paulojmdias@users.noreply.github.com> * Update configuration.md Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com> Signed-off-by: Paulo Dias <44772900+paulojmdias@users.noreply.github.com> * fix: fix linting Signed-off-by: Paulo Dias <paulodias.gm@gmail.com> * fix: fix instance type Signed-off-by: Paulo Dias <paulodias.gm@gmail.com> * Update docs/configuration/configuration.md Co-authored-by: Simon Pasquier <spasquie@redhat.com> Signed-off-by: Paulo Dias <44772900+paulojmdias@users.noreply.github.com> --------- Signed-off-by: Paulo Dias <paulodias.gm@gmail.com> Signed-off-by: Paulo Dias <44772900+paulojmdias@users.noreply.github.com> Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com> Co-authored-by: Simon Pasquier <spasquie@redhat.com>	2024-06-30 19:18:18 +02:00
Bryan Boreham	c5040c5ea9	Merge pull request #10490 from DrAuYueng/fix-docker-sd-service-missing [ENHANCEMENT] Docker SD: add MatchFirstNetwork for containers with multiple networks Fixes docker sd service missing in shared mode and deduplicate targets by network	2024-06-26 12:33:50 +01:00
Arve Knudsen	d902116b41	Fix various linting errors Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>	2024-06-24 16:11:53 -07:00
unknown	0d25931049	rebase main and adjust the configuration Signed-off-by: ouyang1204@gmail.com <ouyang1204@gmail.com>	2024-06-21 19:10:18 +08:00
akunszt	2aaf99dd0a	discovery: aws: expose Primary IPv6 addresses as label, partially fixes #7406 (#14156 ) * discovery: aws: expose Primary IPv6 addresses as label Add __meta_ec2_primary_ipv6_addresses label. This label contains the Primary IPv6 address for every ENI attached to the EC2 instance. It is ordered by the DeviceIndex and the missing elements (interface without Primary IPv6 address) are kept in the list. --------- Signed-off-by: Arpad Kunszt <akunszt@hiya.com> Co-authored-by: Ayoub Mrini <ayoubmrini424@gmail.com>	2024-06-20 14:36:20 +01:00


870 commits