prometheus/notifier
Will Bollock b70a871988
fix(discovery): delete expired refresh metrics on reload (#17614)
Building off config-specific Prometheus refresh metrics from an earlier
PR (https://github.com/prometheus/prometheus/pull/17138), this deletes
refresh metrics like `prometheus_sd_refresh_duration_seconds` and
`prometheus_sd_refresh_failures_total` when the underlying scrape job
configuration is removed on reload. This reduces unneeded cardinality
from scrape-job-specific metrics while still preserving metrics that
indicate the overall health of a service discovery engine.

For example,
`prometheus_sd_refresh_failures_total{config="linode-servers",mechanism="linode"} 1`
will no longer be exported by Prometheus when the `linode-servers`
scrape job for the Linode service provider is removed. The generic,
service-discovery-specific `prometheus_sd_linode_failures_total` metric
will persist, however.
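
The cleanup described above can be illustrated with a minimal, standard-library-only sketch. It is not the code from this PR: `refreshFailures`, `labelKey`, `Inc`, `DeleteStale`, and `Configs` are hypothetical names that stand in for a counter vector labelled by `config` and `mechanism`, and the sketch assumes the reload path knows the set of scrape job names that are still configured.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// labelKey is a hypothetical label set: one series per (config, mechanism).
type labelKey struct{ config, mechanism string }

// refreshFailures is a hypothetical stand-in for a labelled counter vector.
// It is not the Prometheus client API.
type refreshFailures struct {
	mtx    sync.Mutex
	series map[labelKey]float64
}

func newRefreshFailures() *refreshFailures {
	return &refreshFailures{series: make(map[labelKey]float64)}
}

// Inc increments the counter for one (config, mechanism) pair.
func (r *refreshFailures) Inc(config, mechanism string) {
	r.mtx.Lock()
	defer r.mtx.Unlock()
	r.series[labelKey{config, mechanism}]++
}

// DeleteStale drops every series whose config is absent from active and
// returns how many series were removed.
func (r *refreshFailures) DeleteStale(active map[string]struct{}) int {
	r.mtx.Lock()
	defer r.mtx.Unlock()
	removed := 0
	for k := range r.series {
		if _, ok := active[k.config]; !ok {
			delete(r.series, k) // deleting while ranging over a map is safe in Go
			removed++
		}
	}
	return removed
}

// Configs returns the sorted config names that still have a series.
func (r *refreshFailures) Configs() []string {
	r.mtx.Lock()
	defer r.mtx.Unlock()
	out := make([]string, 0, len(r.series))
	for k := range r.series {
		out = append(out, k.config)
	}
	sort.Strings(out)
	return out
}

func main() {
	m := newRefreshFailures()
	m.Inc("linode-servers", "linode")
	m.Inc("web", "file")

	// Simulated reload: only the "web" scrape job remains configured.
	removed := m.DeleteStale(map[string]struct{}{"web": {}})
	fmt.Println(removed, m.Configs())
}
```

In this sketch the per-config series for `linode-servers` disappears after the simulated reload, while any mechanism-wide counter kept elsewhere would be unaffected.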

* fix: add targetsMtx lock for targets access

* test: validate refresh/discover metrics are gone

* ref: combine sdMetrics and refreshMetrics

Good idea from @bboreham to combine sdMetrics and refreshMetrics!
They're always passed around together and don't have much of a
reason not to be combined. mechanismMetrics makes it clear what kind of
metrics this is used for (service discovery mechanisms).

---------

Signed-off-by: Will Bollock <wbollock@linode.com>
2026-04-02 13:43:35 +01:00
alert.go Remove copyright date from headers (#17785) 2026-01-05 13:46:21 +01:00
alertmanager.go Remove copyright date from headers (#17785) 2026-01-05 13:46:21 +01:00
alertmanager_test.go feat(notifier): independent alertmanager sendloops (#16355) 2026-01-20 10:33:07 +01:00
alertmanagerset.go feat(notifier): independent alertmanager sendloops (#16355) 2026-01-20 10:33:07 +01:00
manager.go fix(notify): apply config sendloop cleanup fix (#17915) 2026-01-22 22:22:44 +01:00
manager_test.go fix(discovery): delete expired refresh metrics on reload (#17614) 2026-04-02 13:43:35 +01:00
metric.go feat(notifier): independent alertmanager sendloops (#16355) 2026-01-20 10:33:07 +01:00
sendloop.go feat(notifier): independent alertmanager sendloops (#16355) 2026-01-20 10:33:07 +01:00
sendloop_test.go feat(notifier): independent alertmanager sendloops (#16355) 2026-01-20 10:33:07 +01:00
util.go Remove copyright date from headers (#17785) 2026-01-05 13:46:21 +01:00
util_test.go feat(notifier): independent alertmanager sendloops (#16355) 2026-01-20 10:33:07 +01:00