When using the @ modifier with a timestamp that has no data, several
PromQL range functions were panicking with "index out of range [0]
with length 0". This was introduced by #16797 which changed function
signatures to use concrete types instead of interfaces.
The panic occurred because functions were accessing array elements
(matrixVal[0], vectorVals[0][0]) without checking if the arrays were
empty first.
Fixes#18018
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
Fix incorrect interpolation when counter resets occur in smoothed range
selector evaluation. Previously, the asymmetric handling of counter
resets (y1=0 on left edge, y2+=y1 on right edge) produced wrong values.
Now uniformly set y1=0 when a counter reset is detected, correctly
modeling the counter as starting from 0 post-reset.
This fixes rate calculations across counter resets. For example,
rate(metric[10s] smoothed) where metric goes from 100 to 10 (a reset)
now correctly computes 0.666... by treating the counter as resetting
to 0 rather than producing inflated values from the old behavior.
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
When filtering by a label that exists on both the input metric and
target_info (e.g., info(metric, {host_name="orbstack"}) where host_name
exists on both), the function incorrectly returned empty results.
The bug was in combineWithInfoVector: when no new labels were added
(because they all overlapped with base metric labels), the code entered
the "no match" filtering block even though an info series WAS matched.
The fix checks len(seenInfoMetrics) == 0 to correctly identify when no
info series matched. If an info series matched (seenInfoMetrics is
non-empty), the series is kept even if no new labels were added.
Fixes#17813
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Generally, binary operations between two vectors fail if there is a many-to-one
or one-to-many matching situation between series within a match group and no
`group_left()` or `group_right()` modifier is present. For filter ops this is
also generally the case, but there can be situations where multiple series on
one side can match a single series on the other side, but only 0 or 1 of those
multiple series remains after the filter operator has been applied. In this
case, the PromQL engine does not produce a matching error, since it only tracks
series matching for those series that survive the filtering. IMO this is
incorrect behavior (which can also erratically make a query sometimes fail and
sometimes succeed, depending on current sample values), and we should always
produce an error if there is a match error prior to applying the filter op.
This PR ensures that we do the cardinality / match tracking independently of
the result of the filter operation.
Signed-off-by: Julius Volz <julius.volz@gmail.com>
The test at line 1283 for avg_over_time(nhcb_metric[13m]) incorrectly
expected counter_reset_hint:gauge in the result. However, the actual
avg_over_time implementation does not explicitly set the CounterResetHint
to GaugeType on its output histogram.
With the new counter reset hint comparison logic added to the promqltest
framework (which compares hints when explicitly specified in expected
results), this incorrect expectation was now being caught.
This fix removes the incorrect counter_reset_hint:gauge from the expected
result, allowing the test to correctly verify the avg_over_time behavior
without asserting a specific hint value that the function does not set.
The counter reset hint comparison logic works as designed: if the expected
histogram has UnknownCounterReset (the default when not specified), no
comparison is performed. Only when a hint is explicitly specified in the
test expectation will it be compared against the actual result.
Fixes the test failure introduced by the counter reset hint comparison
feature in promqltest.
Signed-off-by: Aviral Garg <aviralg2106@gmail.com>
Signed-off-by: aviralgarg05 <gargaviral99@gmail.com>
The funcResets and funcChanges functions now correctly return no result when all float samples are at or before the range start for anchored selectors, consistent with the behavior of rate/increase functions.
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
Fixes#17255.
The implementation happens mostly in the Add and Sub method, but the reconciliation works for all relevant operations. For example, you can now `rate` over a range wherein the custom bucket boundaries are changing.
Any custom bucket reconciliation is flagged with an info-level annotation.
---------
Signed-off-by: Linas Medziunas <linas.medziunas@gmail.com>
Signed-off-by: Linas Medžiūnas <linasm@users.noreply.github.com>
avg_over_time already correctly checked the counter reset hint fo all
histograms, but in sum_over_time, the 1st histogram was missed in the
loop. This commit exposes the bug in a test.
Signed-off-by: beorn7 <beorn@grafana.com>
Fixes#17308.
As explained adding the warn-annotation about conflicting counter
reset hints doesn't happen consistently. Furthermore, because of
incremental mean calculation being used so far (which includes
subtraction), avg calculation always created gauge histograms.
The fix is to make Sub behave like Add WRT counter reset handling, and
then set the result of a subtraction to gauge explicitly in actual
PromQL subtraction (rather than using Sub for something else, like
incremental mean calculation). Also, track the presence of a
CounterReset hint and a NotCounterReset hint separately for the
entirety of aggregated histograms and create the warn-annotation based
on that.
As a minor fix, this commit also consistently creates the warn
annotation in aggregation to be about "aggregation" rather than
"subtraction" or "addition", because the latter are just internal
operations within the aggregation, which is not of interest for the
user.
Signed-off-by: beorn7 <beorn@grafana.com>
* Add anchored and smoothed to vector selectors.
This adds "anchored" and "smoothed" keywords that can be used following a matrix selector.
"Anchored" selects the last point before the range (or the first one after the range) and adds it at the boundary of the matrix selector.
"Smoothed" applies linear interpolation at the edges using the points around the edges. In the absence of a point before or after the edge, the first or the last point is added to the edge, without interpolation.
*Exemple usage*
* `increase(caddy_http_requests_total[5m] anchored)` (equivalent of *caddy_http_requests_total - caddy_http_requests_total offset 5m* but takes counter reset into consideration)
* `rate(caddy_http_requests_total[step()] smoothed)`
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
* Update docs/feature_flags.md
Co-authored-by: Charles Korn <charleskorn@users.noreply.github.com>
Signed-off-by: Julien <291750+roidelapluie@users.noreply.github.com>
* Smoothed/Anchored rate: Add more tests
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
* Anchored/Smoothed modifier: error out with histograms
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
---------
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
Signed-off-by: Julien <291750+roidelapluie@users.noreply.github.com>
Co-authored-by: Charles Korn <charleskorn@users.noreply.github.com>
Signed-off-by: Andrew Hall <andrew.hall@grafana.com>
Co-authored-by: Charles Korn <charleskorn@users.noreply.github.com>
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
The current code stops the walk after we have found the first relevant
function. However, in expressions with multiple legs, we will then use
the `HistogramStatsIterator` at most once. This change should make
sure we explore all legs.
The added tests make sure we are not using `HistogramStatsIterator`
where we shouldn't (but the opposite can only be seen in a benchmark
or with a more explicit test).
Signed-off-by: beorn7 <beorn@grafana.com>
Prior to #17127, we needed to add another level in the AST to trigger
the usage of `HistogramStatsIterator`. This is fixed now.
Signed-off-by: beorn7 <beorn@grafana.com>
This is an attempt to make sure that we are not accidentally warning
about conflicting counter resets in rate calculation, see
https://github.com/prometheus/prometheus/pull/17051#issuecomment-3226503416 .
This is done by being more explicit about the warn expectation.
However, as long as
https://github.com/prometheus/prometheus/issues/15346 is not
addressed, we won't be able to trigger the annotation this way anyway.
However, we can play a trick, by wrapping a suitable expression in
`histogram_count` or `histogram_sum`, which will invoke the
`HistogramStatsIterator`, which in turn creates counter reset hints on
the fly. So this commit also adds tests with that, both for absence of
an annotation with `rate` and presence of an annotation with
`sum_over_time`.
Signed-off-by: beorn7 <beorn@grafana.com>
test tbs
Signed-off-by: beorn7 <beorn@grafana.com>
Add further tests for first_over_time (also covering existing
last_over_time, count_over_time, etc) to exercise vectors
containing a mix of float and histogram samples where the
histogram samples do not come last in the series.
This tripped over https://github.com/prometheus/prometheus/issues/17025
so it's structured a bit oddly to work around that bug in the
appender as used by promtest.
Signed-off-by: Craig Ringer <craig.ringer@enterprisedb.com>
Add a first_over_time function, and corresponding ts_of_first_over_time
function. Both are behind the experimental functions feature flag.
Signed-off-by: Craig Ringer <craig.ringer@enterprisedb.com>
As `histogram_count` is playing tricks to improve performance, we
better make sure that the limitation of extrapolation below zero still
works as expected.
Signed-off-by: beorn7 <beorn@grafana.com>
This deals with the count field of native histograms in the same way
as with simple float counters. It then scale the whole histogram with
the same factor as it has scaled the count. This will still allow
individual buckets to get extrapolated below zero, but maybe that is
fine.
This implements approach (2) as described in
https://github.com/prometheus/prometheus/issues/15976#issuecomment-3032095158
Signed-off-by: beorn7 <beorn@grafana.com>
This shows how float counters cannot go below zero when extrapolationg
for rate/increase, and how histograms do not have that protection yet,
leading to an overestimation of the rate/increase.
This also demonstrates edge cases where the count extrapolation does
not need to be limited, but an individual bucket still goes below
zero.
Signed-off-by: beorn7 <beorn@grafana.com>
step() is a new keyword introduced to represent the query step width in duration expressions.
min(a,b) and max(a,b) return the min and max from two duration expressions.
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>