Commit graph

39 commits

Author SHA1 Message Date
beorn7
6fcd225aee promql(native histograms): Introduce exponential interpolation
The linear interpolation (assuming that observations are uniformly
distributed within a bucket) is a solid and simple assumption in lack
of any other information. However, the exponential bucketing used by
standard schemas of native histograms has been chosen to cover the
whole range of observations in a way that bucket populations are
spread out over buckets in a reasonably way for typical distributions
encountered in real-world scenarios.

This is the origin of the idea implemented here: If we divide a given
bucket into two (or more) smaller exponential buckets, we "most
naturally" expect that the samples in the original buckets will split
among those smaller buckets in a more or less uniform fashion. With
this assumption, we end up with an "exponential interpolation", which
therefore appears to be a better match for histograms with exponential
bucketing.

This commit leaves the linear interpolation in place for NHCB, but
changes the interpolation for exponential native histograms to
exponential. This affects `histogram_quantile` and
`histogram_fraction` (because the latter is more or less the inverse
of the former).

The zero bucket has to be treated specially because the assumption
above would lead to an "interpolation to zero" (the bucket density
approaches infinity around zero, and with the postulated uniform usage
of buckets, we would end up with an estimate of zero for all quantiles
ending up in the zero bucket). We simply fall back to linear
interpolation within the zero bucket.

At the same time, this commit makes the call to stick with the
assumption that the zero bucket only contains positive observations
for native histograms without negative buckets (and vice versa). (This
is an assumption relevant for interpolation. It is a mostly academic
point, as the zero bucket is supposed to be very small anyway.
However, in cases where it _is_ relevantly broad, the assumption helps
a lot in practice.)

This commit also updates and completes the documentation to match both
details about interpolation.

As a more high level note: The approach here attempts to strike a
balance between a more simplistic approach without any assumption, and
a more involved approach with more sophisticated assumptions. I will
shortly describe both for reference:

The "zero assumption" approach would be to not interpolate at all, but
_always_ return the harmonic mean of the bucket boundaries of the
bucket the quantile ends up in. This has the advantage of minimizing
the maximum possible relative error of the quantile estimation.
(Depending on the exact definition of the relative error of an
estimation, there is also an argument to return the arithmetic mean of
the bucket boundaries.) While limiting the maximum possible relative
error is a good property, this approach would throw away the
information if a quantile is closer to the upper or lower end of the
population within a bucket. This can be valuable trending information
in a dashboard. With any kind of interpolation, the maximum possible
error of a quantile estimation increases to the full width of a bucket
(i.e. it more than doubles for the harmonic mean approach, and
precisely doubles for the arithmetic mean approach). However, in
return the _expectation value_ of the error decreases. The increase of
the theoretical maximum only has practical relevance for pathologic
distributions. For example, if there are thousand observations within
a bucket, they could _all_ be at the upper bound of the bucket. If the
quantile calculation picks the 1st observation in the bucket as the
relevant one, an interpolation will yield a value close to the lower
bucket boundary, while the true quantile value is close to the upper
boundary.

The "fancy interpolation" approach would be one that analyses the
_actual_ distribution of samples in the histogram. A lot of statistics
could be applied based on the information we have available in the
histogram. This would include the population of neighboring (or even
all) buckets in the histogram. In general, the resolution of a native
histogram should be quite high, and therefore, those "fancy"
approaches would increase the computational cost quite a bit with very
little practical benefits (i.e. just tiny corrections of the estimated
quantile value). The results are also much harder to reason with.

Signed-off-by: beorn7 <beorn@grafana.com>
2024-09-19 14:19:10 +02:00
Jan Fajerski
91608c002f Merge branch 'main' into release-3.0-beta.0
Conflicts:
	scrape/scrape_test.go
          Pick both changes.
2024-09-10 20:51:20 +02:00
Charles Korn
e8c7482137
Return negative counts when multiplied or divided by a negative value
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-09-09 14:37:59 +10:00
Charles Korn
113de6301c
Add failing test cases for unary negation and multiplication and division with negative scalars
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-09-04 16:20:28 +10:00
Charles Korn
9b451abec7
Make positive and negative bucket counts different in existing test cases
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-09-04 16:08:05 +10:00
Jan Fajerski
956245b25b promqltest: adjust eval times and range selector
In order to fix new tests for changes added in
https://github.com/prometheus/prometheus/pull/13904.

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
2024-09-02 11:27:39 +02:00
Jan Fajerski
00315ce15e Merge branch 'main' into 3.0-main-sync-24-08-30
using -Xours

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
2024-09-02 11:27:18 +02:00
Neeraj Gartia
8c7bf39d96
Moves TestNativeHistogram_MulDivOperator to promql testing framework (#14688)
PromQL: add test for mul and div operator

Also, remove the converted test from the engine_test.go file.

This also includes an extension of the test framework to allow NaN/Inf in histogram buckets.

---------

Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
2024-08-29 16:42:35 +02:00
Jan Fajerski
7c8c748399 promql tests: adjust range query intervals
Some test queries need their interval adjusted to account for
https://github.com/prometheus/prometheus/pull/13904. Otherwise the
queries don't return enough samples.
promql/engine_test.go:TestHistogramCopyFromIteratorRegression needed the
same, but also the result needed a fix since `increase` interpolates
over the full range.

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
2024-08-21 12:33:52 +02:00
Jan Fajerski
5138922b0d Merge branch 'main' into 3.0-main-sync-24-08-21 2024-08-21 09:09:36 +02:00
Ziqi Zhao
8f828d45c1 convert TestNativeHistogram_Sum_Count_Add_AvgOperator into testing framework
Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>
2024-08-21 09:24:50 +08:00
Charles Korn
52818a97e2
Merge branch 'main' into sum-and-avg-over-mixed-custom-exponential-histograms
# Conflicts:
#	promql/promqltest/testdata/native_histograms.test
2024-08-14 07:52:08 +10:00
György Krajcsovits
386fc8b9f6 Update from review comments.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-08-13 15:26:07 +02:00
György Krajcsovits
6aee5b4b38 fix typo
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-08-12 12:04:45 +02:00
György Krajcsovits
06a8886b94 Native histograms: define behavior when rate is null.
Histogram quantile returns NaN in this case, which might be
surprising, so add a unit test that clarifies that this is
intentional.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2024-08-12 10:40:21 +02:00
Charles Korn
f992f81bd0
Merge branch 'main' into sum-and-avg-over-mixed-custom-exponential-histograms
Signed-off-by: Charles Korn <charleskorn@users.noreply.github.com>
2024-08-09 13:58:54 +10:00
Charles Korn
5cfdde327c
Address PR feedback: add extra test case
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-08-09 13:57:37 +10:00
Charles Korn
82bb35fabb
Address PR feedback: fix typo and rename variable
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-08-09 13:51:31 +10:00
Charles Korn
f07b3ae67b
Fix issue where avg over mixed exponential and custom buckets, or incompatible custom buckets, produces incorrect results or panics
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-08-07 15:32:35 +10:00
Charles Korn
5ee94f49a2
Fix issue where sum over mixed exponential and custom buckets, or incompatible custom buckets, produces incorrect results
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-08-07 15:30:01 +10:00
Charles Korn
424cefcf5e
Fix "cannot reduce resolution to custom buckets schema" panic in rate over native histograms with mix of custom and exponential buckets
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-08-07 14:45:02 +10:00
Charles Korn
aadec25faf
promql: Fix issue where some native histogram-related annotations are not emitted by rate (#14575)
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-08-06 09:10:40 +01:00
Jan Fajerski
adf5d6bce1 Merge branch 'main' into 3.0-main-sync-24-07-18
Signed-off-by: Jan Fajerski <jfajersk@redhat.com>

Conflicts:
	VERSION
          pick 3.0.0
	promql/promqltest/testdata/histograms.test
          pick changes from c39776c5b5,
          but adjust 5m range selectors to 10m to account for
          https://github.com/prometheus/prometheus/pull/13904.
Fixes:
        promql/promqltest/testdata/functions.test
	promql/promqltest/testdata/staleness.test
          Tests added in https://github.com/prometheus/prometheus/pull/9138
          need to be adjusted to account for
          https://github.com/prometheus/prometheus/pull/13904.
2024-07-18 15:56:40 +02:00
Filip Petkovski
acb6c1ae4b
Fix decoding buckets for native histograms in binops
The optimizer which detects cases where histogram buckets can be skipped
does not take into account binary expressions. This can lead to buckets
not being decoded if a metric is used with both histogram_fraction/quantile and
histogram_sum/count in the same expression.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2024-07-10 11:55:29 +02:00
Jan Fajerski
c0d67fd845 Merge branch 'main' into 3.0-main-sync-24-07-09 2024-07-09 16:51:07 +02:00
Jan Fajerski
b4152309a4 Merge branch 'main' into 3.0-main-sync
Conflicts:
  promql/engine_test.go
    Resolved by picking main changes but adjusting total_samples for
    query "max_over_time(metricWith1HistogramEvery10Seconds[60s])[20s:5s]"
    to 312. Via https://github.com/prometheus/prometheus/pull/13662 this
    histogram now stores 13 values per timestamp, but via
    https://github.com/prometheus/prometheus/pull/13904 the range query
    is now left-open.
  promql/promqltest/testdata/functions.test
    Resolved by picking changes in main. See also
    https://github.com/prometheus/prometheus/pull/13662, but adjust some
    range selectors (`s/1m/2m/`) to account for
    https://github.com/prometheus/prometheus/pull/13904.
  promql/promqltest/testdata/histograms.test
    Resolved by picking changes in main. See also
    https://github.com/prometheus/prometheus/pull/13662, but adjust some
    range selectors (`s/5m/10m/`) to account for
    https://github.com/prometheus/prometheus/pull/13904.

Signed-off-by: Jan Fajerski <jfajersk@redhat.com>
2024-07-03 09:45:25 +02:00
Ganesh Vernekar
3d54bcc018
Merge pull request #14362 from charleskorn/charleskorn/sum-infinity 2024-07-03 01:05:03 -04:00
Charles Korn
fd6bdf5230
Fix issue where summation of +/- infinity returns NaN instead of infinity
Signed-off-by: Charles Korn <charles.korn@grafana.com>
2024-06-28 11:26:54 +10:00
Jeanette Tan
dda5f48c9e Merge branch 'main' into nhcb-review-2 2024-06-20 22:50:00 +08:00
Zhang Zhanpeng
debbdb8608 make matrix selection and lookback left-open and right-closed
Signed-off-by: Zhang Zhanpeng <zhangzhanpeng.zzp@alibaba-inc.com>
Signed-off-by: beorn7 <beorn@grafana.com>
Co-authored-by: beorn7 <beorn@grafana.com>
2024-06-20 22:05:40 +08:00
Jeanette Tan
14f8dded39 Merge branch 'main' into nhcb
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2024-06-07 19:17:14 +08:00
beorn7
c7fdfe8004 promql: Add tests for histogram counter reset only in bucket
This also exercises the "fast path" (only decoding count and sum),
i.e. where the counter reset isn't visible at all in the decoded data.

Signed-off-by: beorn7 <beorn@grafana.com>
2024-06-06 17:47:38 +02:00
Jeanette Tan
f028496133 Merge branch 'main' into nhcb
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2024-05-14 16:20:15 +08:00
Neeraj Gartia
661856cb65 removes the added tests from engine_test.go
Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
2024-05-13 22:58:25 +05:30
Neeraj Gartia
6119124d0e some nits
Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
2024-05-13 22:55:38 +05:30
Neeraj Gartia
adf5a36c1e adds test for sum, count, stddev, stdvar, quantile and fraction func to promql testing framework
Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
2024-05-13 22:55:38 +05:30
Neeraj Gartia
8b838a05d9 adds test for native histogram rate func in promql testing framework
Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
2024-05-13 22:55:38 +05:30
Neeraj Gartia
548bd9d6fb adds TestNativeHistogramRate func to promql test framework
Signed-off-by: Neeraj Gartia <neerajgartia211002@gmail.com>
2024-05-13 22:55:38 +05:30
Bryan Boreham
11b27d5d22 test: move test files into new promqltest package
So that promql package does not bring in test-only dependencies.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-05-08 13:42:55 +01:00
Renamed from promql/testdata/native_histograms.test (Browse further)