Add a maximum limit of 10,000 to the TSDB status endpoint to prevent
resource exhaustion from excessively large limit values, as we preallocate
[]Stat for up to the limit: `make([]Stat, 0, length)`.
Note that the endpoint acquires a cardinality mutex during
stats calculation, so requests cannot run in parallel.
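A minimal sketch of the idea, not the actual handler code: validate the
limit before preallocating. The names `parseLimit`, `maxTSDBLimit` and the
`Stat` fields are illustrative assumptions, as is the choice to reject
(rather than clamp) values above the cap.

```go
package main

import (
	"fmt"
	"strconv"
)

// maxTSDBLimit mirrors the 10,000 cap described above (name is hypothetical).
const maxTSDBLimit = 10000

// Stat is a stand-in for the stats entry type; the fields are illustrative.
type Stat struct {
	Name  string
	Count uint64
}

// parseLimit validates the raw "limit" parameter before anything is
// allocated. An empty value falls back to the given default.
func parseLimit(raw string, def int) (int, error) {
	if raw == "" {
		return def, nil
	}
	n, err := strconv.Atoi(raw)
	if err != nil || n < 1 || n > maxTSDBLimit {
		return 0, fmt.Errorf("limit must be between 1 and %d, got %q", maxTSDBLimit, raw)
	}
	return n, nil
}

func main() {
	for _, raw := range []string{"", "10", "10000", "10001", "abc"} {
		n, err := parseLimit(raw, 10)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		// The preallocation is now bounded by maxTSDBLimit.
		stats := make([]Stat, 0, n)
		fmt.Println("accepted:", n, "capacity:", cap(stats))
	}
}
```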
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
Remove redundant IsZero check since promqltest.LazyLoader already
handles zero StartTime by defaulting to Unix epoch.
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
Currently both the backend and frontend printers/formatters/serializers
incorrectly transform the following expression:
```
up * ignoring() group_left(__name__) node_boot_time_seconds
```
...into:
```
up * node_boot_time_seconds
```
...which yields a different result (including the metric name in the result
vs. no metric name).
We need to keep empty `ignoring()` modifiers if there is a grouping modifier
present.
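The rule can be sketched roughly as follows. This is a simplified
illustration, not the real printer: `VectorMatching` here is a local
stand-in type and `matchingString` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// VectorMatching is a simplified, local stand-in for the matching
// information attached to a binary expression.
type VectorMatching struct {
	On             bool     // true for on(...), false for ignoring(...)
	MatchingLabels []string // labels listed in on(...) / ignoring(...)
	GroupSide      string   // "", "group_left" or "group_right"
	Include        []string // labels listed in the grouping modifier
}

// matchingString renders the modifiers. An empty ignoring() is dropped
// only when no grouping modifier is present.
func matchingString(vm VectorMatching) string {
	var b strings.Builder
	hasGroup := vm.GroupSide != ""
	if vm.On || len(vm.MatchingLabels) > 0 || hasGroup {
		keyword := "ignoring"
		if vm.On {
			keyword = "on"
		}
		fmt.Fprintf(&b, " %s(%s)", keyword, strings.Join(vm.MatchingLabels, ", "))
	}
	if hasGroup {
		fmt.Fprintf(&b, " %s(%s)", vm.GroupSide, strings.Join(vm.Include, ", "))
	}
	return b.String()
}

func main() {
	withGroup := VectorMatching{GroupSide: "group_left", Include: []string{"__name__"}}
	fmt.Println("up *" + matchingString(withGroup) + " node_boot_time_seconds")

	// Without a grouping modifier, the empty ignoring() is redundant.
	fmt.Println("up *" + matchingString(VectorMatching{}) + " node_boot_time_seconds")
}
```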
Signed-off-by: Julius Volz <julius.volz@gmail.com>
This commit adds support for configuring a custom start timestamp
for Prometheus unit tests, allowing tests to use realistic timestamps
instead of starting at Unix epoch 0.
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
For tests only, we had various ways of opening a DB. Reduce them to one
instead of:
* Open
* newTestDB
* newTestDBOpts
* openTestDB
This is so https://github.com/prometheus/prometheus/pull/17629 is smaller
and a bit easier. Also for test maintainability and consistency.
Signed-off-by: bwplotka <bwplotka@gmail.com>
The return value of functions relating to the current time, e.g. time(),
is set by promtool to start at timestamp 0 at the start of a test's
evaluation.
This has the very nice consequence that tests can run reliably without
depending on when they are run.
It does, however, mean that tests can produce results that users do
not expect.
If this behaviour is documented, then users will be empowered to write
tests for their rules that use time-dependent functions.
(Closes: prometheus/docs#1464)
Signed-off-by: Gabriel Filion <lelutin@torproject.org>
* Delay compactions until Thanos uploads all blocks
Using the Thanos sidecar with Prometheus requires us to disable TSDB compactions on the Prometheus side by setting --storage.tsdb.min-block-duration and --storage.tsdb.max-block-duration to the same value. See https://thanos.io/tip/components/sidecar.md. The main problem this avoids is that Prometheus might compact a given block before Thanos uploads it, creating a gap in Thanos metrics. Thanos does not upload compacted blocks because that would upload the same sample multiple times. You can tell Thanos to upload compacted blocks, but that is aimed at one-time migrations. This patch creates a bridge between Thanos and Prometheus by allowing Prometheus to read the shipper file Thanos creates, where it tracks which blocks were already uploaded, and using that data to delay compaction of blocks until they are marked as uploaded by Thanos. Thanks to this, both services can coordinate with each other (in a way) and we can stop disabling compaction on the Prometheus side when Thanos uploads are enabled.
The reason to have this is that disabling compactions has a very dramatic performance cost. Since most time series exist for longer than a single block duration (2h by default), large chunks of the block index will reference the same series, so 10 * 2h blocks will each have an index that is usually fairly big and is almost the same for all 10 blocks. Compaction de-duplicates the index, so merging 10 blocks together would leave us with a single index that is around the same size as the index of each of these 10 2h blocks (plus some extra for series that only exist in some blocks, but not all). Every range query that iterates over all 10 blocks would then have to read each index, so we're doing 10x more work than if we had a single compacted block.
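A rough sketch of the coordination idea, not the code in this patch: read the list of uploaded block IDs and hold back a compaction plan until every block in it is listed. The JSON shape (an `uploaded` array), the type `uploadMeta`, and the helpers `parseUploaded` and `canCompact` are all assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// uploadMeta is a hypothetical, minimal view of the shipper file: a list
// of block IDs that have already been uploaded.
type uploadMeta struct {
	Uploaded []string `json:"uploaded"`
}

// parseUploaded decodes the file contents into a set of uploaded block IDs.
func parseUploaded(data []byte) (map[string]struct{}, error) {
	var meta uploadMeta
	if err := json.Unmarshal(data, &meta); err != nil {
		return nil, fmt.Errorf("parse upload meta: %w", err)
	}
	set := make(map[string]struct{}, len(meta.Uploaded))
	for _, id := range meta.Uploaded {
		set[id] = struct{}{}
	}
	return set, nil
}

// canCompact reports whether every block in the plan has been uploaded.
// If any block is missing, compaction of the plan should be delayed.
func canCompact(plan []string, uploaded map[string]struct{}) bool {
	for _, id := range plan {
		if _, ok := uploaded[id]; !ok {
			return false
		}
	}
	return true
}

func main() {
	// Placeholder block IDs; real blocks are identified by ULIDs.
	data := []byte(`{"uploaded": ["block-a", "block-b"]}`)
	uploaded, err := parseUploaded(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(canCompact([]string{"block-a", "block-b"}, uploaded)) // all uploaded
	fmt.Println(canCompact([]string{"block-b", "block-c"}, uploaded)) // block-c pending
}
```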
Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
* Rename structs and functions to make this more generic
Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
* Address review comments
Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
* Cache UploadMeta for 1 minute
Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
---------
Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
The test at line 1283 for avg_over_time(nhcb_metric[13m]) incorrectly
expected counter_reset_hint:gauge in the result. However, the actual
avg_over_time implementation does not explicitly set the CounterResetHint
to GaugeType on its output histogram.
With the new counter reset hint comparison logic added to the promqltest
framework (which compares hints when explicitly specified in expected
results), this incorrect expectation is now being caught.
This fix removes the incorrect counter_reset_hint:gauge from the expected
result, allowing the test to correctly verify the avg_over_time behavior
without asserting a specific hint value that the function does not set.
The counter reset hint comparison logic works as designed: if the expected
histogram has UnknownCounterReset (the default when not specified), no
comparison is performed. Only when a hint is explicitly specified in the
test expectation will it be compared against the actual result.
Fixes the test failure introduced by the counter reset hint comparison
feature in promqltest.
Signed-off-by: Aviral Garg <aviralg2106@gmail.com>
Signed-off-by: aviralgarg05 <gargaviral99@gmail.com>
This commit implements counter reset hint comparison in the promqltest
framework to address issue #17615. Previously, while test definitions
could specify a counter_reset_hint in expected native histogram results,
the framework did not actually compare this hint between expected and
actual results.
The implementation adds optional comparison logic to the
compareNativeHistogram function:
- If the expected histogram has UnknownCounterReset (the default),
the hint is not compared (meaning "don't care")
- If the expected histogram explicitly specifies CounterReset,
NotCounterReset, or GaugeType, it is verified against the actual
histogram's hint
This allows tests to verify that PromQL functions correctly set or
preserve counter reset hints while maintaining backward compatibility
with existing tests that don't specify explicit hints.
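The comparison rule above can be sketched as follows. This is an
illustration only: the hint type and constants are redeclared locally to
keep the example self-contained, and `hintsMatch` is a hypothetical helper,
not the code added to compareNativeHistogram.

```go
package main

import "fmt"

// CounterResetHint is a local stand-in for the histogram hint type.
type CounterResetHint byte

const (
	UnknownCounterReset CounterResetHint = iota // default: "don't care"
	CounterReset
	NotCounterReset
	GaugeType
)

// hintsMatch skips the comparison when the expectation is left at its
// default and otherwise requires the actual hint to be identical.
func hintsMatch(expected, actual CounterResetHint) bool {
	if expected == UnknownCounterReset {
		return true
	}
	return expected == actual
}

func main() {
	fmt.Println(hintsMatch(UnknownCounterReset, GaugeType)) // no hint expected
	fmt.Println(hintsMatch(GaugeType, GaugeType))           // explicit and equal
	fmt.Println(hintsMatch(GaugeType, UnknownCounterReset)) // explicit and different
}
```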
Fixes #17615
Signed-off-by: aviralgarg05 <gargaviral99@gmail.com>
This adds the following native histograms (with a few classic buckets for backwards compatibility), while keeping the corresponding summaries (same name, just without `_histogram`):
- `prometheus_sd_refresh_duration_histogram_seconds`
- `prometheus_rule_evaluation_duration_histogram_seconds`
- `prometheus_rule_group_duration_histogram_seconds`
- `prometheus_target_sync_length_histogram_seconds`
- `prometheus_target_interval_length_histogram_seconds`
- `prometheus_engine_query_duration_histogram_seconds`
Signed-off-by: Harsh <harshmastic@gmail.com>
Signed-off-by: harsh kumar <135993950+hxrshxz@users.noreply.github.com>
Co-authored-by: Björn Rabenstein <github@rabenste.in>
Improve test stability by waiting for the relevant metrics to appear on /metrics before the
first check on the desired shard count.
Increase the scrape interval to avoid timeouts, as 100 ms may be insufficient for Prometheus
to scrape itself in some environments (e.g., CI).
Have Prometheus scrape itself multiple times to increase the volume of data sent and help
fill the queue more quickly.
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
The original implementation in #9705 for native histograms included
technical debt (#15177): samples were committed ordered by type,
not by their append order. This was fixed in #17071, but this docstring
was not updated.
I've also taken the liberty to mention that we do not order by timestamp
either, thus it is possible to append out-of-order samples.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
See https://github.com/prometheus/prometheus/issues/16911
This will create a denser layout by default, enabling people to see more
information on the page without having to discover the global settings menu.
Signed-off-by: Julius Volz <julius.volz@gmail.com>
Added tests for durationWithUnitRegexp to validate matching of complete durations with units and ensure non-matching cases are correctly identified.
Signed-off-by: ADITYA TIWARI <142050150+ADITYATIWARI342005@users.noreply.github.com>