prometheus/docs
Will Bollock b70a871988
fix(discovery): delete expired refresh metrics on reload (#17614)
Building off config-specific Prometheus refresh metrics from an earlier
PR (https://github.com/prometheus/prometheus/pull/17138), this deletes
refresh metrics like `prometheus_sd_refresh_duration_seconds` and
`prometheus_sd_refresh_failures_total` when the underlying scrape job
configuration is removed on reload. This reduces unneeded cardinality
from scrape-job-specific metrics while still preserving metrics that
indicate the overall health of a service discovery engine.

For example,
`prometheus_sd_refresh_failures_total{config="linode-servers",mechanism="linode"} 1`
will no longer be exported by Prometheus when the `linode-servers`
scrape job for the Linode service provider is removed. The generic,
service-discovery-specific `prometheus_sd_linode_failures_total` metric
will persist, however.
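
The behavior described above can be modelled with a small, stdlib-only Go sketch. It is not the code from this change: the `refreshFailures` type and its `inc`, `reload`, and `exported` methods are hypothetical names used only for illustration, and the way the per-mechanism counter is incremented is an assumption. In client_golang the per-series removal would typically be done with a metric vector's `DeleteLabelValues` method.

```go
package main

import (
	"fmt"
	"sync"
)

// seriesKey identifies one per-config series, mirroring the
// {config, mechanism} label pair shown in the example above.
type seriesKey struct {
	config    string
	mechanism string
}

// refreshFailures is a hypothetical stand-in for the two kinds of metrics:
// per-config series (like prometheus_sd_refresh_failures_total) and a
// per-mechanism total (like prometheus_sd_linode_failures_total).
type refreshFailures struct {
	mtx          sync.Mutex
	perConfig    map[seriesKey]int
	perMechanism map[string]int
}

func newRefreshFailures() *refreshFailures {
	return &refreshFailures{
		perConfig:    make(map[seriesKey]int),
		perMechanism: make(map[string]int),
	}
}

// inc records one refresh failure for a scrape job and its mechanism.
func (r *refreshFailures) inc(config, mechanism string) {
	r.mtx.Lock()
	defer r.mtx.Unlock()
	r.perConfig[seriesKey{config, mechanism}]++
	r.perMechanism[mechanism]++
}

// reload deletes per-config series whose scrape job is not in active and
// returns how many were deleted. Per-mechanism totals are left untouched.
func (r *refreshFailures) reload(active map[string]bool) int {
	r.mtx.Lock()
	defer r.mtx.Unlock()
	deleted := 0
	for k := range r.perConfig {
		if !active[k.config] {
			delete(r.perConfig, k) // deleting during range is safe in Go
			deleted++
		}
	}
	return deleted
}

// exported reports whether a per-config series still exists.
func (r *refreshFailures) exported(config, mechanism string) bool {
	r.mtx.Lock()
	defer r.mtx.Unlock()
	_, ok := r.perConfig[seriesKey{config, mechanism}]
	return ok
}

func main() {
	m := newRefreshFailures()
	m.inc("linode-servers", "linode")
	m.inc("other-servers", "linode")

	// Simulate a reload in which the linode-servers scrape job was removed.
	deleted := m.reload(map[string]bool{"other-servers": true})

	fmt.Println(deleted, m.exported("linode-servers", "linode"), m.perMechanism["linode"])
	// prints: 1 false 2
}
```

The per-config series disappears with its scrape job, while the per-mechanism total keeps its value across the reload.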

* fix: add targetsMtx lock for targets access

* test: validate refresh/discover metrics are gone

* ref: combine sdMetrics and refreshMetrics

Good idea from @bboreham to combine sdMetrics and refreshMetrics!
They're always passed around together, so there is little reason to
keep them separate. The name mechanismMetrics makes it clear what kind
of metrics these are (metrics for service discovery mechanisms).

---------

Signed-off-by: Will Bollock <wbollock@linode.com>
2026-04-02 13:43:35 +01:00
command-line Merge branch 'feature/start-time' into cedwards/document-st-storage 2026-03-17 06:21:05 +01:00
configuration Merge pull request #17601 from joelkbiju12/ft-kubernetes-nodeready-label 2026-04-01 18:05:27 +02:00
images prometheus-agent-documentation 2024-07-27 14:21:24 +01:00
querying promql: add test and docs for info() behavior when info series goes stale (#18352) 2026-03-30 09:32:29 +02:00
feature_flags.md st: disconnect st-storage with xor2-encoding given planned experiments (#18316) 2026-03-19 08:47:42 +00:00
federation.md feat(nh): mark native histograms as stable in docs 2025-10-24 12:31:42 +02:00
getting_started.md docs(): fix gettingStarted outdated graph reference 2025-09-15 17:31:18 +02:00
http_sd.md fix(discovery): delete expired refresh metrics on reload (#17614) 2026-04-02 13:43:35 +01:00
index.md Update Prometheus Agent doc (#17591) 2025-11-21 11:34:19 +01:00
installation.md Standardize doc page title handling 2025-05-28 21:37:27 +02:00
management_api.md Standardize doc page title handling 2025-05-28 21:37:27 +02:00
migration.md Merge pull request #16155 from LukoJy3D/docs/migration/clarify_on_content_type_headers 2026-03-31 12:58:48 +01:00
prometheus_agent.md fix(docs): typo in prometheus_agent.md doc 2026-02-08 19:02:56 +03:30
stability.md chore: exclude experimental /v1/ endpoints from stability guarantees 2025-08-12 16:45:17 +02:00
storage.md Merge pull request #14317 from anarcat/wal-backups 2025-12-16 11:35:43 +00:00