prometheus

mirror of https://github.com/prometheus/prometheus.git synced 2026-06-11 01:20:07 -04:00

Author	SHA1	Message	Date
Antoine Labarussias	24b1b60836	discovery/aws: handle rds clusters without instances Signed-off-by: Antoine Labarussias <antoinelabarussias@gmail.com>	2026-06-03 09:29:36 +02:00
Bartlomiej Plotka	c0b4c5ef18	Merge pull request #18821 from roidelapluie/roidelapluie/k8s-sd-binary-size discovery/kubernetes: keep SD client from defeating dead-code elimination	2026-06-01 15:06:07 +01:00
Julien	3294878850	Merge pull request #18824 from roidelapluie/roidelapluie/azure-sd-binary-size discovery/azure: keep SD clients from defeating dead-code elimination	2026-06-01 11:32:18 +02:00
Julien Pivotto	5040dd9e68	discovery/azure: keep SD clients from defeating dead-code elimination The azureClient struct held the armcompute and armnetwork SDK clients as concrete fields while satisfying the client interface. Once such a client is reachable through an interface, Go's linker conservatively retains every exported method of the concrete type plus the entire (de)serializer graph those operations drag in, even though discovery calls only a handful of them. Wrap each SDK client in a small adapter that captures only the operations discovery uses as method-value closures, and box the adapters instead of the raw clients. The concrete clients then live only inside closure contexts, which reflection cannot traverse, so dead-code elimination drops the unused operations. This drops the retained operations per client from ~60 down to the 2-3 actually used (UsedInIface markers go from 244+66 to 0), shrinking both the prometheus and promtool binaries by ~3.2 MB each. No functional or API change. Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>	2026-05-29 16:17:59 +02:00
Julien	27925e446c	Merge pull request #18818 from roidelapluie/roidelapluie/aws-sd-binary-size discovery/aws: keep AWS SDK clients out of interfaces to shrink the binary	2026-05-29 14:26:36 +02:00
Julien Pivotto	da7be2b867	discovery/kubernetes: keep SD client from defeating dead-code elimination The Kubernetes SD Discovery struct held the clientset as a kubernetes.Interface field. Boxing the concrete *kubernetes.Clientset into an interface marks it <UsedInIface>, so the Go linker conservatively retains every API-group accessor and, transitively, every resource client and its apply configurations, even though discovery only touches the core, apps, batch, discovery and networking v1 groups. Wrap the clientset in an adapter that captures only the used API-group accessors as method-value closures and exposes them through a narrow k8sClient interface. The concrete clientset now lives only inside closure contexts, which reflection cannot traverse, so dead-code elimination drops the unused groups. The fake clientset still satisfies the narrow interface, so tests are unchanged. This trims about 10 MB from each of the prometheus and promtool binaries. Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>	2026-05-29 14:17:57 +02:00
Julien Pivotto	fd493fe25a	discovery/gce: keep Compute SD client from defeating dead-code elimination The Discovery struct held compute.Service and compute.InstancesService as fields and is boxed into the discovery.Discoverer interface. Once reflection is reachable in the program (it always is, via the YAML/config machinery), the Go linker conservatively retains every exported method of any concrete type reachable through an interface, including via struct fields. compute.Service exposes ~150 sub-services and their operations, so all of them — 994 list/get operations and their serializers — were retained even though discovery only calls Instances.List. Wrap the single used operation in a closure over the concrete compute.Service so the service lives only in closure context, which reflection cannot traverse. Returning a compute.InstancesListCall would not help, since that type has an s Service back-reference that re-propagates the marker, so the closure encapsulates the whole List/Filter/Pages chain and only exposes the *compute.InstanceList data type discovery already uses. compute/v1 footprint drops from ~4.9 MB to ~33 KB, and the prometheus and promtool binaries each shrink by ~13.5 MB. No functional or API change. Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>	2026-05-29 14:00:09 +02:00
Julien Pivotto	5550b3689b	discovery/aws: keep AWS SDK clients out of interfaces to shrink the binary The AWS service-discovery code boxed each concrete SDK client (ec2.Client, rds.Client, lightsail.Client, elasticache.Client, ecs.Client and kafka.Client) into an interface, either directly or as a field of a struct that is itself boxed into the Discoverer interface. Once reflection is reachable in the program -- it always is, via the YAML/config machinery -- the Go linker conservatively retains every exported method of any concrete type reachable through an interface, including a type held as a field of an interface-boxed struct. Each SDK client exposes the service's full API (e.g. *ec2.Client has ~470 operation methods), so all of their operation serializers and the corresponding types (de)serializer graphs were kept, even though discovery only calls a handful of operations. EC2 alone accounted for ~21 MB. Wrap each client in a small adapter that captures only the operations discovery uses as method-value closures. The concrete client then lives only inside closure contexts, which reflection cannot traverse, so dead-code elimination can drop the unused operations. This reduces the binary sizes substantially: prometheus 228.6 MB -> 162.6 MB (-66 MB, -29%) promtool 205.1 MB -> 139.0 MB (-66 MB, -32%) There is no functional or API change; the mocking interfaces used by the tests are unchanged. Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>	2026-05-29 13:39:36 +02:00
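The adapter pattern that the dead-code-elimination commits above describe can be sketched roughly as follows. This is a minimal illustration, not the Prometheus code: `bigClient`, `lister`, `adapter`, and `newAdapter` are hypothetical names standing in for an SDK client and the narrow interface discovery would use. The toy program only shows the shape of the wrapping; whether the linker actually drops the unused methods depends on the real program and toolchain.

```go
package main

import "fmt"

// bigClient is a hypothetical stand-in for a generated SDK client
// that exposes many operations, of which discovery needs only one.
type bigClient struct{ region string }

func (c *bigClient) DescribeInstances() []string { return []string{c.region + "-i-1"} }
func (c *bigClient) DescribeVolumes() []string   { return nil } // unused by discovery
func (c *bigClient) DescribeSubnets() []string   { return nil } // unused by discovery

// lister is the narrow interface the discovery code depends on.
type lister interface {
	ListInstances() []string
}

// adapter stores only method-value closures. The concrete client is
// captured inside the closure and is not a field of any boxed struct.
type adapter struct {
	describeInstances func() []string
}

// newAdapter captures just the operations that are actually used.
func newAdapter(c *bigClient) *adapter {
	return &adapter{describeInstances: c.DescribeInstances}
}

func (a *adapter) ListInstances() []string { return a.describeInstances() }

func main() {
	// Only the adapter is boxed into an interface, never *bigClient.
	var l lister = newAdapter(&bigClient{region: "eu"})
	fmt.Println(l.ListInstances()) // prints [eu-i-1]
}
```

The design choice is that test fakes can still satisfy the narrow interface directly, which matches the commits' note that the mocking interfaces are unchanged.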
Nandhis	80b1044a9e	discovery/scaleway: discover IPAM private NIC addresses (#18772) Signed-off-by: msnandhis <45960035+msnandhis@users.noreply.github.com>	2026-05-26 16:17:42 +02:00
Ogulcan Aydogan	4f9db4d29e	discovery/marathon: run tests in parallel (#18613) Add t.Parallel() to the 13 top-level test functions in marathon_test.go. Tests have no shared mutable state (each test creates its own registry, metrics, and config), so parallelisation is safe and speeds up the suite. Refs #15185 Signed-off-by: Ogulcan Aydogan <ogulcanaydogan@hotmail.com>	2026-05-19 14:04:03 +02:00
Ogulcan Aydogan	e01be38010	discovery/consul: run tests in parallel (#18611) Each test creates its own httptest.Server and prometheus.Registry so there is no shared global state between them. Adding t.Parallel() to all 13 top-level test functions and the subtests in TestUnmarshalConfig allows the Go test runner to overlap them, cutting wall-clock time. Refs: #15185 Signed-off-by: Ogulcan Aydogan <ogulcanaydogan@hotmail.com>	2026-05-19 14:01:50 +02:00
Ogulcan Aydogan	1d1b670c40	discovery/aws: run tests in parallel Each test function creates its own mock AWS client and operates on independent data stores with no shared global state between them. Adding t.Parallel() to the 33 top-level test functions across the six test files (aws, ec2, ecs, elasticache, msk, rds) allows the Go test runner to overlap their execution, cutting wall-clock time. TestLoadRegion is excluded because its subtests use t.Setenv, which panics when a parallel ancestor is detected (Go 1.25+). Refs: #15185 Signed-off-by: Ogulcan Aydogan <ogulcanaydogan@hotmail.com>	2026-05-15 13:45:20 +01:00
Joe Adams	63f2bf9b9a	Merge pull request #16088 from rlees85/ipv6-only-ec2-sd discovery: Allow EC2 Service Discovery to work with IPv6-only instances	2026-05-11 23:09:34 -04:00
Julien Pivotto	aa5927029e	discovery/stackit: use config.Secret for ServiceAccountKey and PrivateKey Fixes GHSA-39j6-789q-qxvh Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>	2026-05-08 15:16:54 +02:00
Rich	fcdaef1365	Use a default IPv6 address for IPv6 only instances Signed-off-by: Rich <git0@bitservices.io>	2026-04-29 19:58:17 +01:00
matt-gp	8b63debbca	[ENHANCEMENT] AWS SD: Add optional external_id field	2026-04-26 18:34:56 +01:00
matt-gp	c08d850668	AWS SD: Set Directory	2026-04-25 14:18:53 +01:00
Julien	551b5b1c56	Merge pull request #17171 from bboreham/aws-external-id [ENHANCEMENT] AWS SD: Add optional external_id field	2026-04-24 09:41:58 +02:00
Julien Pivotto	ddca8ee45a	discovery/scaleway: use http.Client instead of RoundTripper, enabling follow_redirects support Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>	2026-04-17 18:01:44 +02:00
Julien	6e83b49dd6	Merge pull request #17685 from akshatsinha0/fix-aws-sd-setdirectory Fix(discovery/aws): Added SetDirectory method to EC2SDConfig.	2026-04-15 14:35:46 +02:00
Julien	12de1243c0	Merge pull request #18518 from prometheus/release-3.11 Merge back release 3.11.2	2026-04-14 10:26:38 +02:00
Julien Pivotto	4cc50803ff	discovery/consul: fix catalog watch trigger and improve filter tests When health_filter is set without explicit services, the catalog needs to be watched to enumerate services. Add watchedFilter to the condition that triggers catalog watching. Improve the filter test suite: - Replace defer with t.Cleanup for stub servers. - Rewrite TestFilterOption to assert that the catalog receives the filter and the health endpoint does not. - Rewrite TestHealthFilterOption to assert that health_filter is routed correctly to the health endpoint only. - Add TestBothFiltersOption to verify both filters are routed to their respective endpoints when both are configured. Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>	2026-04-10 10:26:40 +02:00
Arthur Silva Sens	72293ff1d2	Merge pull request #18433 from codeboten/codeboten/remove-docker-monopkg-dep chore: remove dependency on github.com/docker/docker	2026-04-09 11:58:10 -03:00
Julien Pivotto	1e73d2fcde	discovery/consul: add health_filter for Health API filtering The filter field was documented as targeting the Catalog API but since PR #17349 it was also passed to the Health API. This broke existing configs using Catalog-only fields like ServiceTags, which the Health API rejects (it uses Service.Tags instead). Introduce a separate health_filter field that is passed exclusively to the Health API, while filter remains catalog-only. Update the docs to explain the two-phase discovery (Catalog for service listing, Health for instances) and the field name differences between the two APIs. Fixes #18479 Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>	2026-04-09 16:03:16 +02:00
avilevy18	eb220862e5	discovery, scrape: Use backoff interval for throttling discovery updates; add DiscoveryReloadOnStartup option for short-lived environments (#18187 ) * Adding scape on shutdown Signed-off-by: avilevy <avilevy@google.com> * scrape: replace skipOffsetting to make the test offset deterministic instead of skipping it entirely Signed-off-by: avilevy <avilevy@google.com> * renamed calculateScrapeOffset to getScrapeOffset Signed-off-by: avilevy <avilevy@google.com> * discovery: Add skipStartupWait to bypass initial discovery delay In short-lived environments like agent mode or serverless, the Prometheus process may only execute for a few seconds. Waiting for the default 5-second `updatert` ticker before sending the first target groups means the process could terminate before collecting any metrics at all. This commit adds a `skipStartupWait` option to the Discovery Manager to bypass this initial delay. When enabled, the sender uses an unthrottled startup loop that instantly forwards all triggers. This ensures both the initial empty update from `ApplyConfig` and the first real targets from discoverers are passed downstream immediately. After the first ticker interval elapses, the sender cleanly breaks out of the startup phase, resets the ticker, and resumes standard operations. Signed-off-by: avilevy <avilevy@google.com> * scrape: Bypass initial reload delay for ScrapeOnShutdown In short-lived environments like agent mode or serverless, the default 5-second `DiscoveryReloadInterval` can cause the process to terminate before the scrape manager has a chance to process targets and collect any metrics. Because the discovery manager sends an initial empty update upon configuration followed rapidly by the actual targets, simply waiting for a single reload trigger is insufficient—the real targets would still get trapped behind the ticker delay. This commit introduces an unthrottled startup loop in the `reloader` when `ScrapeOnShutdown` is enabled. 
It processes all incoming `triggerReload` signals immediately during the first interval. Once the initial tick fires, the `reloader` resets the ticker and falls back into its standard throttled loop, ensuring short-lived processes can discover and scrape targets instantly. Signed-off-by: avilevy <avilevy@google.com> * test(scrape): refactor time-based manager tests to use synctest Addresses PR feedback to remove flaky, time-based sleeping in the scrape manager tests. Add TestManager_InitialScrapeOffset and TestManager_ScrapeOnShutdown to use the testing/synctest package, completely eliminating real-world time.Sleep delays and making the assertions 100% deterministic. - Replaced httptest.Server with net.Pipe and a custom startFakeHTTPServer helper to ensure all network I/O remains durably blocked inside the synctest bubble. - Leveraged the skipOffsetting option to eliminate random scrape jitter, making the time-travel math exact and predictable. - Using skipOffsetting also safely bypasses the global singleflight DNS lookup in setOffsetSeed, which previously caused cross-bubble panics in synctest. - Extracted shared boilerplate into a setupSynctestManager helper to keep the test cases highly readable and data-driven. Signed-off-by: avilevy <avilevy@google.com> * Clarify use cases in InitialScrapeOffset comment Signed-off-by: avilevy <avilevy@google.com> * test(scrape): use httptest for mock server to respect context cancellation - Replaced manual HTTP string formatting over `net.Pipe` with `httptest.NewUnstartedServer`. - Implemented an in-memory `pipeListener` to allow the server to handle `net.Pipe` connections directly. This preserves `synctest` time isolation without opening real OS ports. - Added explicit `r.Context().Done()` handling in the mock HTTP handler to properly simulate aborted requests and scrape timeouts. - Validates that the request context remains active and is not prematurely cancelled during `ScrapeOnShutdown` scenarios. 
- Renamed `skipOffsetting` to `skipJitterOffsetting`. - Addressed other PR comments. Signed-off-by: avilevy <avilevy@google.com> * tmp Signed-off-by: bwplotka <bwplotka@gmail.com> * exp2 Signed-off-by: bwplotka <bwplotka@gmail.com> * fix Signed-off-by: bwplotka <bwplotka@gmail.com> * scrape: fix scrapeOnShutdown context bug and refactor test helpers The scrapeOnShutdown feature was failing during manager shutdown because the scrape pool context was being cancelled before the final shutdown scrapes could execute. Fix this by delaying context cancellation in scrapePool.stop() until after all scrape loops have stopped. In addition: - Added test cases to verify scrapeOnShutdown works with InitialScrapeOffset. - Refactored network test helper functions from manager_test.go to helpers_test.go. - Addressed other comments. Signed-off-by: avilevy <avilevy@google.com> * Update scrape/scrape.go Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com> Signed-off-by: avilevy18 <105948922+avilevy18@users.noreply.github.com> * feat(discovery): add SkipInitialWait to bypass initial startup delay This adds a SkipInitialWait option to the discovery Manager, allowing consumers sensitive to startup latency to receive the first batch of discovered targets immediately instead of waiting for the updatert ticker. To support this without breaking the immediate dropped target notifications introduced in #13147, ApplyConfig now uses a keep flag to only trigger immediate downstream syncs for obsolete or updated providers. This prevents sending premature empty target groups for brand-new providers on initial startup. Additionally, the scrape manager's reloader loop is updated to process the initial triggerReload immediately, ensuring the end-to-end pipeline processes initial targets without artificial delays. 
Signed-off-by: avilevy <avilevy@google.com> * scrape: Add TestManagerReloader and refactor discovery triggerSync Adds a new TestManagerReloader test suite using synctest to assert behavior of target updates, discovery reload ticker intervals, and ScrapeOnShutdown flags. Updates setupSynctestManager to allow skipping initial config setup by passing an interval of 0. Also renames the 'keep' variable to 'triggerSync' in ApplyConfig inside discovery/manager.go for clarity, and adds a descriptive comment. Signed-off-by: avilevy <avilevy@google.com> * feat(discovery,scrape): rename startup wait options and add DiscoveryReloadOnStartup - discovery: Rename `SkipInitialWait` to `SkipStartupWait` for clarity. - discovery: Pass `context.Context` to `flushUpdates` to handle cancellation and avoid leaks. - scrape: Add `DiscoveryReloadOnStartup` to `Options` to decouple startup discovery from `ScrapeOnShutdown`. - tests: Refactor `TestTargetSetTargetGroupsPresentOnStartup` and `TestManagerReloader` to use table-driven tests and `synctest` for better stability and coverage. Signed-off-by: avilevy <avilevy@google.com> * feat(discovery,scrape): importing changes proposed in `043d710` - Refactor sender to use exponential backoff - Replaces `time.NewTicker` in `sender()` with an exponential backoff to prevent panics on non-positive intervals and better throttle updates. - Removes obsolete `skipStartupWait` logic. 
- Refactors `setupSynctestManager` to use an explicit `initConfig` argument Signed-off-by: avilevy <avilevy@google.com> * fix: updating go mod Signed-off-by: avilevy <avilevy@google.com> * fixing merge Signed-off-by: avilevy <avilevy@google.com> * fixing issue: 2 variables but NewTestMetrics returns 1 value Signed-off-by: avilevy <avilevy@google.com> * Update discovery/manager.go Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com> Signed-off-by: avilevy18 <105948922+avilevy18@users.noreply.github.com> * Refactor setupSynctestManager initConfig into a separate function Signed-off-by: avilevy <avilevy@google.com> --------- Signed-off-by: avilevy <avilevy@google.com> Signed-off-by: bwplotka <bwplotka@gmail.com> Signed-off-by: avilevy18 <105948922+avilevy18@users.noreply.github.com> Co-authored-by: bwplotka <bwplotka@gmail.com>	2026-04-03 11:01:49 +01:00
alex boten	ab63eeb0d2	update deprecated calls Signed-off-by: alex boten <223565+codeboten@users.noreply.github.com>	2026-04-02 15:54:32 -07:00
alex boten	39672984b5	chore: remove dependency on github.com/docker/docker The package makes vulnerability scanners unhappy, and the functionality is available in the smaller moby/moby packages. Signed-off-by: alex boten <223565+codeboten@users.noreply.github.com>	2026-04-02 15:11:24 -07:00
George Krajcsovits	f1b226a2f3	discovery: fix build error in TestGaugeLastUpdateTimestamp (#18430) NewTestMetrics returns a single value but the test was assigning it to two variables. Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>	2026-04-02 20:42:03 +02:00
Will Bollock	b70a871988	fix(discovery): delete expired refresh metrics on reload (#17614 ) Building off config-specific Prometheus refresh metrics from an earlier PR (https://github.com/prometheus/prometheus/pull/17138), this deletes refresh metrics like `prometheus_sd_refresh_duration_seconds` and `prometheus_sd_refresh_failures_total` when the underlying scrape job configuration is removed on reload. This reduces un-needed cardinality from scrape job specific metrics while still preserving metrics that indicate overall health of a service discovery engine. For example, `prometheus_sd_refresh_failures_total{config="linode-servers",mechanism="linode"} 1` will no longer be exported by Prometheus when the `linode-servers` scrape job for the Linode service provider is removed. The generic, service discovery specific `prometheus_sd_linode_failures_total` metric will persist however. * fix: add targetsMtx lock for targets access * test: validate refresh/discover metrics are gone * ref: combine sdMetrics and refreshMetrics Good idea from @bboreham to combine sdMetrics and refreshMetrics! They're always passed around together and don't have much of a reason not to be combined. mechanismMetrics makes it clear what kind of metrics this is used for (service discovery mechanisms). --------- Signed-off-by: Will Bollock <wbollock@linode.com>	2026-04-02 13:43:35 +01:00
Julien Pivotto	a2d741cbf7	discovery/kubernetes: fix pod_test.go after makeNode signature change PR #17601 extended makeNode with annotations and conditions parameters but missed updating two call sites in pod_test.go. Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>	2026-04-02 10:03:00 +02:00
Frederic Branczyk	b014aa101d	Merge pull request #17601 from joelkbiju12/ft-kubernetes-nodeready-label discovery: adding kubernetes node condition labels	2026-04-01 18:05:27 +02:00
Aurélien Duboc	fa5960bf08	feat(discovery): add Outscale VM service discovery (#18139) Add Outscale VM service discovery using osc-sdk-go, including optional secret_key_file support, metrics, docs, and configuration examples. Document the default region (eu-west-2). Signed-off-by: Aurelien Duboc <aurelienduboc96@gmail.com>	2026-03-31 18:03:51 +02:00
Vladimir Skesov	20ff771593	discovery: add DigitalOcean Managed Databases service discovery This adds 'databases' role to digitalocean_sd_config to discover DigitalOcean Managed Database clusters. It follows the multi-role design pattern by introducing a 'role' parameter (default: 'droplets'). Includes: - Support for Managed Databases API. - Pagination handling for Databases API. - Comprehensive meta labels for database targets. - Updated documentation and tests. Signed-off-by: Vladimir Skesov <skesov@gmail.com>	2026-03-30 11:31:17 +03:00
Pierluigi Lenoci	73902efbd0	discovery/vultr: upgrade govultr from v2 to v3 (#18347) * discovery/vultr: upgrade govultr from v2 to v3 The govultr/v2 library is no longer actively maintained. Upgrade to govultr/v3 (v3.28.1) which receives regular updates and security patches. The v3 library is API-compatible with v2 for the Instance.List method used by the Vultr SD, with the only change being an additional http.Response return value. Signed-off-by: Pierluigi Lenoci <pierluigi.lenoci@gmail.com> discovery/vultr: check HTTP response status code Validate that the Vultr API returns a 2xx status code after listing instances, as the http.Response from govultr v3 is now available. Signed-off-by: Pierluigi Lenoci <pierluigi.lenoci@gmail.com> discovery/vultr: fix linter error in error string capitalization Error strings should not be capitalized per Go conventions (ST1005). Signed-off-by: Pierluigi Lenoci <pierluigi.lenoci@gmail.com> --------- Signed-off-by: Pierluigi Lenoci <pierluigi.lenoci@gmail.com>	2026-03-26 09:42:25 +01:00
Ogulcan Aydogan	7bbff490a3	discovery/azure: fix system managed identity when client_id is empty When using ManagedIdentity authentication with system-assigned identity, the client_id field is intentionally left empty. However, the current code unconditionally sets options.ID = azidentity.ClientID(cfg.ClientID), which passes an empty string instead of nil. The Azure SDK treats an empty ClientID as a request for a user-assigned identity with an empty client ID, rather than falling back to system-assigned identity. Fix by only setting options.ID when cfg.ClientID is non-empty, matching the pattern already used in storage/remote/azuread/azuread.go. Fixes #16634 Signed-off-by: Ogulcan Aydogan <ogulcanaydogan@hotmail.com>	2026-03-19 10:49:17 +00:00
Jonas L.	a9d90952ba	Deprecate Hetzner Cloud server datacenter labels (#17850) [hcloud.Server.Datacenter] is deprecated and will be removed after 1 July 2026. Use [hcloud.Server.Location] instead. See https://docs.hetzner.cloud/changelog#2025-12-16-phasing-out-datacenters Changes to Hetzner meta labels: - `__meta_hetzner_datacenter` - is deprecated for the role `robot` but kept for backward compatibility. Using `__meta_hetzner_robot_datacenter` is preferred. - is deprecated for the role `hcloud` and will stop working after the 1 July 2026.
- `__meta_hetzner_hcloud_datacenter_location` label - is deprecated but kept for backward compatibility, the same data is available in the [`hcloud.Server.Location`](https://pkg.go.dev/github.com/hetznercloud/hcloud-go/v2/hcloud#Server) struct. - using `__meta_hetzner_hcloud_location` is preferred. - `__meta_hetzner_hcloud_datacenter_location_network_zone` - is deprecated but kept for backward compatibility, the same data is available in the [`hcloud.Server.Location`](https://pkg.go.dev/github.com/hetznercloud/hcloud-go/v2/hcloud#Server) struct. - using `__meta_hetzner_hcloud_location_network_zone` is preferred. - `__meta_hetzner_hcloud_location` - replacement label for `__meta_hetzner_hcloud_datacenter_location` - `__meta_hetzner_hcloud_location_network_zone` - replacement label for `__meta_hetzner_hcloud_datacenter_location_network_zone` - `__meta_hetzner_robot_datacenter` - replacement label for `__meta_hetzner_datacenter` with the role `robot`. Signed-off-by: Jonas Lammler <jonas.lammler@hetzner-cloud.de>	2026-03-19 11:25:01 +01:00
Munem Hashmi	89b3ad45a8	discovery/file: restore atomic file writes in tests (#18259) PR #17269 replaced atomic os.Rename-based file writes with os.WriteFile to fix a Windows flake. However, os.WriteFile is not atomic (it truncates then writes), and fsnotify can fire between the truncate and write, causing the watcher to read an empty file and replace valid targets with empty ones. Restore atomicity by writing to a temporary file and renaming. On Windows, retry the rename with a short backoff to handle transient "Access is denied" errors when the file watcher or readFile holds an open handle to the destination. Fixes #18237 Signed-off-by: Munem Hashmi <munem.hashmi@gmail.com>	2026-03-11 08:12:56 +01:00
Matt	5a02b92c0e	AWS SD: RDS Role (#18206)	2026-03-04 12:17:38 +01:00
Frederic Branczyk	b9ce7f3be0	Merge pull request #18192 from rexagod/17193 discovery/k8s: Dedup EPS for `*DualStack` policies	2026-03-04 11:46:44 +01:00
Frederic Branczyk	44699107d2	Merge pull request #17774 from rexagod/16747 discovery/kubernetes: Support linked pod controllers	2026-03-04 11:07:14 +01:00
Matthieu MOREL	45b9329e68	chore: fix emptyStringTest issues from gocritic (#18226) Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2026-03-04 08:24:50 +01:00
Pranshu Srivastava	ebacf4a084	fixup! fixup! discovery/kubernetes: Support linked pod controllers Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>	2026-03-03 18:30:52 +05:30
Pranshu Srivastava	f21785f5a9	fixup! discovery/kubernetes: Support linked pod controllers Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>	2026-03-03 18:30:52 +05:30
Pranshu Srivastava	228b94f6eb	discovery/kubernetes: Support linked pod controllers Extended Kubernetes SD to support the following pod-based labels: * `__meta_kubernetes_pod_deployment_name` * `__meta_kubernetes_pod_cronjob_name` * `__meta_kubernetes_pod_job_name` Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>	2026-03-03 18:30:48 +05:30
Pranshu Srivastava	2684af0ca8	fixup! discovery/k8s: Dedup EPS for `*DualStack` policies Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>	2026-03-03 18:03:37 +05:30
George Krajcsovits	318980a5c2	Merge pull request #17207 from thomas-gouveia/feat/16634/add-support-for-workload-identity-azure-discovery feat: add support for Azure Workload Identity authentication method for Azure discovery [#16634]	2026-03-03 12:59:58 +01:00
George Krajcsovits	feb741e470	Merge pull request #18215 from mmorel-35/gocritic chore: fix httpNoBody issues from gocritic	2026-03-03 12:50:19 +01:00
Hank (Yong-Han) Chen	264be9aa74	fix(discovery/file): Fix flaky test on Windows by replacing os.rename with os.WriteFile (#17269) Signed-off-by: Yong-Han Chen <hank96015@gmail.com>	2026-03-03 12:43:50 +01:00
Matthieu MOREL	026d284c43	chore: fix httpNoBody issues from gocritic Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2026-03-02 20:06:30 +01:00
Frederic Branczyk	1751685dd4	Merge pull request #18194 from rexagod/11726 discovery: Introduce `prometheus_sd_last_update_timestamp_seconds`	2026-03-02 17:00:42 +01:00

928 commits