* fix(promql): histogram_quantile NaN observed in native histogram
Fixes: #16578
See the issue for detailed explanation.
When a histogram had only NaN observations and no normal observations,
we returned 0 from the quantile, which is completely wrong. If there were
normal observations but we went over them, we returned the upper bound of
the existing buckets, however that contradicts expectations on
histogram_fraction. Now we return NaN if the quantile is calculated to be
over all normal observations, falling into NaNs (in a virtual +Inf bucket).
We also return info level annotations if we see any NaN observations.
The annotation calls out if we returned NaN or even if we took the
virtual +Inf bucket into account.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
* fix(promql): histogram_fraction NaN observed in native histogram
Fixes: #16580
According to the specification we should not take NaN observations
into account when calculating the fraction. This commit fixes that
and adds an info level annotation to let the user know about this.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
This commit adds the ts_of_(max,min,last)_over_time functions behind the experimental feature flag.
Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
Add support for promoting all OTel resource attributes via `promote_all_resource_attributes`,
except for those ignored using 'ignore_resource_attributes'.
---------
Signed-off-by: Antonio Jimenez <antonjim@thousandEyes.com>
Signed-off-by: Antonio Jimenez <123171955+antonjim-te@users.noreply.github.com>
The new docs site will have syntax highlighting, so this adds language tags
to code boxes that are currently missing them. I didn't add `promql` as a
language yet since the highlighter doesn't support it yet, plus a lot of
the PromQL codeboxes in our docs aren't strictly valid PromQL, they are
more like multiple expressions listed in the same code box on multiple
lines. So I'm leaving that for sometime later.
In the HTTP API page, I moved the curl examples from the JSON codeboxes to
their own ones above the JSON output. I considered putting an "Output:"
text between the curl + JSON output, but I think the way it currently looks
without it is probably fine.
I also fixed a number of headings which were at the wrong level relative to
their nesting in the document.
I also removed `go` as a language from the Go template language examples,
because the Go template language isn't Go at all.
I also adjusted the indent on one codebox to be more reasonable (2 spaces
instead of 8).
And then finally, my editor made a bunch of whitespace changes
automatically, like removing trailing spaces.
Signed-off-by: Julius Volz <julius.volz@gmail.com>
Signed-off-by: Julius Volz <julius.volz@gmail.com>
This is a fix for documentation update done in https://github.com/prometheus/prometheus/pull/16066, setting correct name for a configuration value.
Signed-off-by: Martin Danko <46035688+zepellin@users.noreply.github.com>
Addresses https://github.com/prometheus/prometheus/issues/16371
This will help with migrating to native histograms with `convert_classic_histograms_to_nhcb` since users may still need to keep the classic histograms during a migration
Signed-off-by: chardch <otwordsne@gmail.com>
State explicitely what kind of timestamps are expected for the
--min-time and --max-time options of promtool tsdb commands.
This is especially important for the dump-openmetrics command as users
could otherwise mistakenly think it would be in seconds, like the
OpenMetrics timestamps themselves.
Signed-off-by: Nicolas Peugnet <nicolas.peugnet@lip6.fr>
Allows to filter the servers when sending the listing request to the API. This feature is only available when using the `role=hcloud`.
See https://docs.hetzner.cloud/#label-selector for details on how to use the label selector.
Signed-off-by: Jonas Lammler <jonas.lammler@hetzner-cloud.de>
* Add documentation for 'NoTranslation' mode
Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
* Update docs/configuration/configuration.md
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
---------
Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
* promql: histogram_fraction for bucket histograms
This PR extends the histogram_fraction function to also work with classic bucket histograms. This is beneficial because it allows expressions like sum(increase(my_bucket{le="0.5"}[10m]))/sum(increase(my_total[10m])) to be written without knowing the actual values of the "le" label, easing the transition to native histograms later on.
It also feels natural since histogram_quantile also can deal with classic histograms.
Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
* promql: histogram_fraction for bucket histograms
* Add documentation and reduce code duplication
* Fix a bug in linear interpolation between bucket boundaries
* Add more PromQL tests
Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
* Update docs/querying/functions.md
Co-authored-by: Björn Rabenstein <github@rabenste.in>
Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
---------
Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
Co-authored-by: Björn Rabenstein <github@rabenste.in>
These are supported in the main prometheus binary but the feature flags
weren't supported in promtool.
Fixes#16412.
Signed-off-by: David Leadbeater <dgl@dgl.cx>
We implemented proper type handling (gauge vs. counter histogram) a
while ago. The note about it is obsolete.
Signed-off-by: beorn7 <beorn@grafana.com>
Updated the parser to allow calculations in PromQL durations.
This enables durations in the form of:
rate(http_requests_total[10m+2s])
The computation of the calculations is done directly at the parse level and does not hit the PromQL Engine.
The lexer has also been updated and improved, in particular for subqueries.
Buxfix: rate(http_requests_total[0]) is no longer allowed.
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
The new metric_name_escaping_scheme config option works in parallel with metric_name_validation_scheme and controls which escaping scheme is requested when scraping. When not specified, the scheme will request underscores if the validation scheme is set to legacy, and will request allow-utf-8 when the validation scheme is set to utf8. This setting allows users to allow utf8 names if they like, but explicitly request an escaping scheme rather than UTF-8.
Fixes https://github.com/prometheus/prometheus/issues/16034
Built on https://github.com/prometheus/prometheus/pull/16080
Signed-off-by: Owen Williams <owen.williams@grafana.com>
* ruler notifier: make batch size configurable
In Mimir we experimented with setting a higher value for the batch size.
A 4x increase in batch size decreased the time to process a single notification by about 2x.
This reduces the processing time of the notifications queue and increases the throughput of the queue.
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
* Update cmd/prometheus/main.go
Co-authored-by: gotjosh <josue.abreu@gmail.com>
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
* Update docs
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
* Use a string constant
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
* Add godoc comment on exported constant
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
---------
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Co-authored-by: gotjosh <josue.abreu@gmail.com>
doc: fix broken kuma.io link
I'm not actually familiar with kuma, but I noticed this link was broken, and I believe the one I've found here is equivalent.
Signed-off-by: Ian Kerins <git@isk.haus>
Signed-off-by: Björn Rabenstein <github@rabenste.in>
Co-authored-by: Björn Rabenstein <github@rabenste.in>
Rationales:
* metadata-wal-records might be deprecated and replaced going forward: https://github.com/prometheus/prometheus/issues/15911
* PRW 2.0 works without metadata just fine (although it sends untyped metrics as expected).
Signed-off-by: bwplotka <bwplotka@gmail.com>
This is also meant to document the actual implementation, but
see #13934 for the current state.
This also improves and streamlines some parts of the documentation
that are not strictly native histogram related, but are colocated with
them. In particular, the section about aggregation operators got
restructured quite a bit, including the removal of a quite verbose
example for `limit_ratio` (which was just too long an this location
and also a bit questionabl in its usefulness).
Signed-off-by: beorn7 <beorn@grafana.com>
This fixes a formatting problem (`__name__`) was rendered in boldface
without the underscores in the headline).
Furthermore, it explains the possible issues with the feature flag
(change of behavior of certain "weird" queries, problems when
disecting a query manually or in PromLens).
Signed-off-by: beorn7 <beorn@grafana.com>
Add --ignore-unknown-fields that ignores unknown fields in rule group
files. There are lots of tools in the ecosystem that "like" to extend
the rule group file structure but they are currently unreadable by
promtool if there's anything extra. The purpose of this flag is so that
we could use the "vanilla" promtool instead of rolling our own.
Some examples of tools/code:
https://github.com/grafana/mimir/blob/main/pkg/mimirtool/rules/rwrulefmt/rulefmt.go8898eb3cc5/pkg/rules/rules.go (L18-L25)
Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
What
Adds support for OTLP delta temporality to the OTLP endpoint.
This is done by calling the deltatocumulative processor from the OpenTelemetry collector during OTLP conversion.
Why
Delta conversion is a naturally stateful process, which requires careful request routing when operated inside a collector.
Prometheus is already stateful and doing the conversion in-server reduces the operational burden on the ingest architecture by only having one stateful component.
How
deltatocumulative is a OTel collector component that works as follows:
* pmetric.Metrics come from a receiver or in this case from the HTTP client
* It operates as an in-place update loop:
* for each sample, if not delta, leave unmodified
* if delta, do:
* state += sample, where state is the in-memory sum of all previous samples
* sample = state, sample value is now cumulative
* this is supported for sums (counters), gauges, histograms (old histograms) and exponential histograms (native histograms)
If a series receives no new samples for 5m, its state is removed from memory
Performance
Delta performance is a stateful operation and the OTel code is not highly optimized yet, e.g. it locks the entire processor for each request. Nonetheless, care has been taken to mitigate those effects:
delta conversion is behind a feature flag. If disabled, no conversion code is ever invoked
if enabled, conversion is not invoked if request not actually contains delta samples. This leads to no measureable performance difference between default-cumulative to convert-cumulative (only cumulative, feature on/off)
Signed-off-by: sh0rez <me@shorez.de>
This commit introduced two field in `/status` endpoint:
- The node currently serving the request.
- The current server time for debugging time drift issues.
fixes#15394.
Signed-off-by: sujal shah <sujalshah28092004@gmail.com>
When a remote-write is executed towards a host name that is resolved to multiple IP addresses, this PR introduces a possibility to force creation of new connections used for the remote-write request to a randomly chosen IP address from the ones corresponding to the host name. The default behavior remains unchanged, i.s., the IP address used for the connection creation remains the one chosen by Go.
This is an experimental feature, it is disabled by default.
Signed-off-by: Yuri Nikolic <durica.nikolic@grafana.com>
Enable the `auto-gomaxprocs` feature flag by default.
* Add command line flag `--no-auto-gomaxprocs` to disable.
Signed-off-by: SuperQ <superq@gmail.com>
Enable the `auto-gomemlimit` feature flag by default.
* Add command line flag `--no-auto-gomemlimit` to disable.
Signed-off-by: SuperQ <superq@gmail.com>
Enable the `auto-gomaxprocs` feature flag by default.
* Add command line flag `--no-auto-gomaxprocs` to disable.
Signed-off-by: SuperQ <superq@gmail.com>
* Enable auto-gomemlimit by default
Enable the `auto-gomemlimit` feature flag by default.
* Add command line flag `--no-auto-gomemlimit` to disable.
---------
Signed-off-by: SuperQ <superq@gmail.com>
Documented that WAL can still be written after memory-snapshot-on-shutdown - #10824
Co-authored-by: Björn Rabenstein <github@rabenste.in>
Signed-off-by: gopi <gopi.singaravelan.k@gmail.com>
---------
Signed-off-by: Gopi-eng2202 <gopi.singaravelan.k@gmail.com>
Signed-off-by: gopi <gopi.singaravelan.k@gmail.com>
Co-authored-by: Björn Rabenstein <github@rabenste.in>
* docs: 2 to 3 migration guide
Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
* docs/stability: add 3.0 section
Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
* docs/migration: details on enabling legacy name validation
Signed-off-by: Owen Williams <owen.williams@grafana.com>\
* migration: add log format and `le` normalization
Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
* migration: add new enable_http2 default for remote write
Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
---------
Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
Signed-off-by: Owen Williams <owen.williams@grafana.com>
Co-authored-by: Owen Williams <owen.williams@grafana.com>
Remote-write creates several shards to parallelise sending, each with
its own http connection. We do not want them all combined onto one
socket by http2.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* promtool: Add debug flag for rule tests
This makes it print out the tsdb state (both input_series and rules that
are run) at the end of a test, making reasoning about tests much easier.
Signed-off-by: David Leadbeater <dgl@dgl.cx>
* Reuse generated test name from junit testing
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
---------
Signed-off-by: David Leadbeater <dgl@dgl.cx>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored-by: David Leadbeater <dgl@dgl.cx>
scrape: Remove implicit fallback to the Prometheus text format
Remove implicit fallback to the Prometheus text format in case of invalid/missing Content-Type and fail the scrape instead. Add ability to specify a `fallback_scrape_protocol` in the scrape config.
---------
Signed-off-by: alexgreenbank <alex.greenbank@grafana.com>
Signed-off-by: Alex Greenbank <alex.greenbank@grafana.com>
Co-authored-by: Björn Rabenstein <beorn@grafana.com>
The `info` function is an experiment to improve UX
around including labels from info metrics.
`info` has to be enabled via the feature flag `--enable-feature=promql-experimental-functions`.
This MVP of info simplifies the implementation by assuming:
* Only support for the target_info metric
* That target_info's identifying labels are job and instance
Also:
* Encode info samples' original timestamp as sample value
* Deduce info series select hints from top-most VectorSelector
---------
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Ying WANG <ying.wang@grafana.com>
Co-authored-by: Augustin Husson <augustin.husson@amadeus.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: Björn Rabenstein <github@rabenste.in>
Co-authored-by: Bryan Boreham <bjboreham@gmail.com>
Put a scalar to query, it can be graphed.
So the doc says "an expression that returns an instant vector is the only type which can be graphed." is not correct?
And also, a query_range, which used for graph, always return a range vector <https://promlabs.com/blog/2020/06/18/the-anatomy-of-a-promql-query/#range-queries> , so it's confusing to read the above statement.
Signed-off-by: Viet Hung Nguyen <hvn@familug.org>