Commit graph

1656 commits

Author SHA1 Message Date
George Krajcsovits
2aefd00c24
tsdb/chunkenc: take a generic Appender as the prev parameter (#18857)
The chunkenc.Appender interface's AppendHistogram and AppendFloatHistogram
methods used to require a typed previous appender (*HistogramAppender or
*FloatHistogramAppender). Callers were forced to type-assert s.app to that
concrete type before each call, discarding it (and the cross-chunk
counter-reset signal it carries) whenever the actual concrete type didn't
match -- for example when a chunk used a different histogram encoding.

Change the signature to accept the generic chunkenc.Appender interface and
move the concrete-type check inside each implementation, onto the code path
that actually needs the previous appender's state (the new-chunk branch
where setCounterResetHeader runs). The check goes through small private
interfaces -- histogramAppendable and floatHistogramAppendable -- so any
appender type that exposes the appropriate appendable() method can serve as
prev, and types that don't (xor, xor2, or a histogram appender on the
opposite-kind code path) are silently ignored.

This prepares the ground for #18609, which introduces *HistogramSTAppender
and *FloatHistogramSTAppender. Both embed their non-ST counterparts and
will satisfy the new interfaces automatically, so they can be passed as
prev without a special case in the caller.

Callers in tsdb/head_append.go and tsdb/ooo_head.go are simplified
accordingly. The Appender consumers in storage/series.go and tsdb/querier.go
were already passing nil and continue to do so unchanged.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2026-06-09 10:41:43 +02:00
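The prev handling described in 2aefd00c24 can be sketched as follows. This is a minimal illustration, not the chunkenc code: the `Appender` interface is reduced to a hypothetical `Kind()` method, the return type of `appendable()` is an assumption, and `prevHistogramState` is a made-up helper name.

```go
package main

import "fmt"

// Appender stands in for chunkenc.Appender; only what the sketch needs.
type Appender interface {
	Kind() string
}

// histogramAppendable mirrors the private interface named in the commit
// message. The return type of appendable() is an assumption.
type histogramAppendable interface {
	appendable() *HistogramAppender
}

// HistogramAppender carries the state a new chunk needs from its predecessor.
type HistogramAppender struct{ sum float64 }

func (a *HistogramAppender) Kind() string                   { return "histogram" }
func (a *HistogramAppender) appendable() *HistogramAppender { return a }

// HistogramSTAppender embeds the non-ST appender, so both methods are
// promoted and it satisfies histogramAppendable without extra code.
type HistogramSTAppender struct{ *HistogramAppender }

// XORAppender is a float appender with no appendable() method.
type XORAppender struct{}

func (XORAppender) Kind() string { return "xor" }

// prevHistogramState returns the previous appender's histogram state, or nil
// when prev is nil or of an unrelated kind (which is then silently ignored).
func prevHistogramState(prev Appender) *HistogramAppender {
	if p, ok := prev.(histogramAppendable); ok {
		return p.appendable()
	}
	return nil
}

func main() {
	h := &HistogramAppender{sum: 42}
	for _, prev := range []Appender{h, HistogramSTAppender{h}, XORAppender{}, nil} {
		fmt.Println(prevHistogramState(prev) != nil)
	}
}
```

Because the check is a type assertion against a small interface, the caller passes whatever appender it holds and never needs to know the concrete type.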
Yuri Nikolic
ed12b940fd
tsdb: capture chunk-boundary samples in CompactStaleHead
Signed-off-by: Yuri Nikolic <durica.nikolic@grafana.com>
2026-06-03 15:58:05 +02:00
Đurica Yuri Nikolić
5ddb7e49e3
tsdb: store maxt timestamp in walExpiries on stale series eviction (#18847)
[BUGFIX] tsdb: store a millisecond timestamp (not a WAL segment number) in `walExpiries` 
when a series is evicted via `CompactStaleHead`/`CompactSelectedSeries`, so the series' 
label record is correctly retained in the next WAL checkpoint and replays cleanly.

Signed-off-by: Yuri Nikolic <durica.nikolic@grafana.com>
2026-06-03 14:56:17 +01:00
RoyS
f3d653fb5c
chore: fix typos in comments (#18834)
* chore: fix typos in comments

Fix three minor typos in source comments:
- scrape: mimicks -> mimics
- tsdb: descibes -> describes
- ui/codemirror-promql: theses -> these

Signed-off-by: RoySerbi <roy676564@gmail.com>

* ci: retrigger CI to clear known 32-bit flake

Empty commit to retrigger CI. The previous run failed only on
'Go tests for 32-bit x86' due to the known intermittent flake in
TestRemoteWrite_PerQueueMetricsAfterRelabeling (see #17356), which
is unrelated to this comment-only PR.

Signed-off-by: RoySerbi <roy676564@gmail.com>

---------

Signed-off-by: RoySerbi <roy676564@gmail.com>
2026-06-02 15:34:02 +02:00
Bryan Boreham
87866e0c3f
Merge pull request #18838 from prometheus/tsdb/fix-histogram-pending-commit-condition
tsdb: fix pendingCommit condition for classic histogram append
2026-06-02 10:32:06 +01:00
György Krajcsovits
c2b77f753b
tsdb: add regression tests for histogram pendingCommit on append error
Add TestAppendHistogramErrorDoesNotSetPendingCommit (V1) and
TestHeadAppenderV2_HistogramErrorDoesNotSetPendingCommit (V2),
each covering the integer and float histogram branches.

The integer V1 branch previously set s.pendingCommit on the error
path, which left the flag stuck on existing series whenever an
append was rejected (e.g. ErrOutOfOrderSample). Because the failed
sample is never added to the appender's batch, Commit/Rollback
never clears pendingCommit for that series, and head GC at
tsdb/head.go treats it as still in use.

The V1 integer subtest fails on main without the prior commit;
both subtests pass with it. The V2 paths already use err == nil
and the V2 test is a lock-in; inverting the V2 condition locally
confirms the test would catch a similar regression there.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2026-06-02 09:33:40 +02:00
Weixie Cui
56918ba034
tsdb: fix pendingCommit condition for classic histogram append
AppendHistogram used err != nil when deciding to set pendingCommit for integer histograms, while the float histogram branch uses err == nil. Align the classic histogram path so pendingCommit is set only after a successful appendableHistogram check, matching appendableFloatHistogram.

Signed-off-by: Weixie Cui <cuiweixie@gmail.com>
2026-06-02 09:15:05 +02:00
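The condition fixed in 56918ba034 can be illustrated with a small sketch. The `series` type, the `stage` function and the error value below are hypothetical stand-ins, not the tsdb types; only the `err == nil` guard reflects the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// errOutOfOrder is a stand-in for ErrOutOfOrderSample.
var errOutOfOrder = errors.New("out of order sample")

// series is a hypothetical, stripped-down series record.
type series struct {
	lastT         int64
	pendingCommit bool
}

// appendable rejects samples at or before the last accepted timestamp.
func (s *series) appendable(t int64) error {
	if t <= s.lastT {
		return errOutOfOrder
	}
	return nil
}

// stage marks the series as pending only when the check succeeds. With the
// inverted condition (err != nil) a rejected sample would leave the flag set,
// and nothing would clear it because the sample never joins the batch.
func stage(s *series, t int64) error {
	err := s.appendable(t)
	if err == nil {
		s.pendingCommit = true
	}
	return err
}

func main() {
	s := &series{lastT: 10}
	fmt.Println(stage(s, 5), s.pendingCommit)
	fmt.Println(stage(s, 11), s.pendingCommit)
}
```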
Iheanacho Amarachi Sharon
f9ba49a9b6
tsdb: complete TestHistogramCounterResetHeader for integer histograms (#18289)
Signed-off-by: Amarachi Iheanacho <amarachi.iheanacho@siderolabs.com>
2026-06-02 09:04:42 +02:00
Bartlomiej Plotka
178c53d83c
Merge pull request #18813 from miguelbernadi/improve-hostogram-allocations
tsdb/record: eliminate prev pointer escapes in V2 histogram WAL decoder
2026-06-01 10:54:02 +01:00
Bryan Boreham
489d90e717
Merge pull request #18735 from colega/implement-head-stale-index-reader-without-sacrificing-performance
tsdb: Implement `headStaleIndexReader` methods properly
2026-06-01 10:48:08 +01:00
Miguel Bernabeu Diaz
423c7878de
Update tsdb/record/record.go
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Miguel Bernabeu Diaz <miguelbernadi@gmail.com>
2026-06-01 11:13:00 +02:00
Oleg Zaytsev
84b870106e
Remove unnecessary method and rename
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2026-06-01 10:19:06 +02:00
Miguel Bernabeu Diaz
b4db611d52
tsdb/record: eliminate prev pointer escapes in V2 histogram decoder
histogramSamplesV2 and floatHistogramSamplesV2 tracked the previous
sample's Ref and ST via a *RefHistogramSample pointer (prev). Taking the
address of a loop-local variable (prev = &rh) forced the compiler to
heap-allocate rh on every iteration; the first iteration also allocated
a separate sentinel struct. The pointed-to fields were only ever read as
two int64 scalars, so the pointer added zero semantic value.

Replace prev with two scalar variables (prevRef, prevST) and a boolean
sentinel. rh no longer has its address taken and stays on the stack.

This affects every caller of dec.HistogramSamples that produces V2
records (EnableSTStorage=true): WAL replay, the WAL watcher (remote
write tail), and checkpoint creation.

Benchmarks (go test -count=6 -benchmem, benchstat):

  BenchmarkDecodeHistogramSamples (tsdb/record)
                            │    before    │              after               │
                            │   allocs/op  │  allocs/op    vs base            │
  buckets=0/v2              │   2.001k ± 0%│   1.000k ± 0%  -50.02% (p=0.002)│
  buckets=4/v2              │   4.001k ± 0%│   3.000k ± 0%  -25.02% (p=0.002)│
  buckets=16/v2             │   4.001k ± 0%│   3.000k ± 0%  -25.02% (p=0.002)│

                            │    before    │              after               │
                            │    B/op      │    B/op       vs base            │
  buckets=0/v2              │  187.5Ki ± 0%│  156.2Ki ± 0%  -16.68% (p=0.002)│
  buckets=4/v2              │  250.0Ki ± 0%│  218.8Ki ± 0%  -12.51% (p=0.002)│
  buckets=16/v2             │  437.5Ki ± 0%│  406.2Ki ± 0%   -7.15% (p=0.002)│

  BenchmarkLoadWLs end-to-end WAL replay (tsdb), stStorage=true only
                            │    before    │              after               │
                            │   allocs/op  │  allocs/op    vs base            │
  histogramSeriesPct=1.000  │  19.70M ± 0% │  14.90M ± 0%  -24.39% (p=0.002)│
  histogramSeriesPct=0.500  │  10.47M ± 0% │   8.06M ± 0%  -23.00% (p=0.002)│

                            │    before    │              after               │
                            │    B/op      │    B/op       vs base            │
  histogramSeriesPct=1.000  │  1.539Gi ± 0%│  1.394Gi ± 0%  -9.42% (p=0.002)│
  histogramSeriesPct=0.500  │  1051.3Mi ± 0%│  975.1Mi ± 0%  -7.25% (p=0.002)│

                            │    before    │              after               │
                            │    sec/op    │    sec/op     vs base            │
  histogramSeriesPct=1.000  │  824.9m ± 0% │  762.6m ± 1%   -7.55% (p=0.002)│
  histogramSeriesPct=0.500  │  488.6m ± 1% │  451.4m ± 1%   -7.61% (p=0.002)│

V1 paths and float-only shapes are unchanged (p >> 0.05 throughout).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Miguel Bernabeu Diaz <miguel.bernabeu@coralogix.com>
2026-05-28 21:40:52 +02:00
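The change in b4db611d52 replaces a pointer to the previous loop variable with two scalars and a boolean sentinel. The sketch below shows that shape on a hypothetical delta-encoded record; the `encoded` and `sample` types and the delta scheme are assumptions made for illustration, since the real wire format is not shown here.

```go
package main

import "fmt"

// sample is a simplified decoded record.
type sample struct{ Ref, ST, T int64 }

// encoded is a hypothetical wire form: the first entry holds absolute values,
// later entries hold deltas against the previous sample's Ref and ST.
type encoded struct{ ref, st, t int64 }

// decode tracks the previous sample through two scalars and a flag instead of
// a pointer to the loop variable, so s never has its address taken.
func decode(in []encoded, out []sample) []sample {
	var (
		prevRef, prevST int64
		havePrev        bool // replaces the nil check on a prev pointer
	)
	for _, e := range in {
		s := sample{Ref: e.ref, ST: e.st, T: e.t}
		if havePrev {
			s.Ref += prevRef
			s.ST += prevST
		}
		prevRef, prevST, havePrev = s.Ref, s.ST, true
		out = append(out, s)
	}
	return out
}

func main() {
	in := []encoded{{10, 100, 1}, {1, 5, 2}, {2, -5, 3}}
	fmt.Println(decode(in, nil))
}
```

Whether a given variable escapes to the heap can be checked with `go build -gcflags=-m`.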
Miguel Bernabeu Diaz
badf9da96a
tsdb/record,tsdb: add native histogram WAL decode benchmarks
Add two benchmark components to measure the native histogram decode hot
path, which is shared by WAL replay, WAL watcher (remote write), and
checkpoint creation.

tsdb/record: BenchmarkDecodeHistogramSamples isolates the V1 and V2
histogram decoder paths across bucket counts (0, 4, 16), giving a
precise per-sample allocation signal for decoder changes.

tsdb: BenchmarkLoadWLs gains two new shapes:
- all-histogram (histogramSeriesPct=1.0, bucketsPerHistogram=8): mirrors
  the existing "In between" float shape for direct comparison.
- mixed (histogramSeriesPct=0.5, bucketsPerHistogram=8): models a
  deployment partway through migrating to native histograms.

Both shapes are parameterised over stStorage (V1 vs V2 encoding) via the
existing enableSTStorage loop, so benchstat can show the V1/V2 delta
without additional test infrastructure. The subtest names include
histogramSeriesPct and bucketsPerHistogram only when non-zero, leaving
existing float-only subtest names unchanged.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Miguel Bernabeu Diaz <miguel.bernabeu@coralogix.com>
2026-05-28 21:40:12 +02:00
Owen Williams
134051d480
tsdb: Add TODOs for ST-in-WAL work (#18773)
Comment-only changes.
This will make it easier for me to track my work.

Signed-off-by: Owen Williams <owen.williams@grafana.com>
2026-05-22 13:37:35 -04:00
Julien Pivotto
fae25e1405
tsdb: replace default encoding cases with explicit cases in snapshot encode/decode
Replace the catch-all default branch in encodeToSnapshotRecord and
decodeSeriesFromChunkSnapshot with an explicit EncFloatHistogram case and a
default that panics (encode) or returns an error (decode), making unknown
encodings immediately visible rather than silently mishandling them.

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
2026-05-21 12:47:55 +02:00
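The pattern in fae25e1405, one explicit case per known encoding plus a default that fails loudly, might look like the sketch below. The `Encoding` constants are local stand-ins with assumed values, and `sampleKind` is a hypothetical helper, not a tsdb function.

```go
package main

import "fmt"

// Encoding is a local stand-in for the chunk encoding type; the numeric
// values here are assumptions made for the sketch.
type Encoding uint8

const (
	EncXOR Encoding = iota + 1
	EncHistogram
	EncFloatHistogram
	EncXOR2
)

// sampleKind names every known encoding explicitly. An unknown value hits the
// default branch and is reported, instead of being treated as whatever
// encoding the catch-all used to assume.
func sampleKind(enc Encoding) (string, error) {
	switch enc {
	case EncXOR, EncXOR2:
		return "float", nil
	case EncHistogram:
		return "histogram", nil
	case EncFloatHistogram:
		return "float histogram", nil
	default:
		return "", fmt.Errorf("unknown chunk encoding %d", enc)
	}
}

func main() {
	for _, enc := range []Encoding{EncXOR, EncXOR2, EncFloatHistogram, Encoding(200)} {
		kind, err := sampleKind(enc)
		fmt.Println(kind, err)
	}
}
```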
Owen Williams
5fe52643a0
tsdb: Rewrite TestCancelCompactions to run faster (#18632)
* Rewrite TestCancelCompactions to run faster

---------

Signed-off-by: Owen Williams <owen.williams@grafana.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
2026-05-21 09:06:10 +02:00
Julien
0f5727f420
tsdb: count EncXOR2 chunks as float samples; fix snapshot encoding (#18739)
EncXOR2 is a float encoding and must be treated like EncXOR in all
places that enumerate chunk types:

- compact.go: NumFloatSamples was not incremented for EncXOR2 chunks
  during compaction, leading to under-reported block stats.
- head_wal.go: encodeToSnapshotRecord fell through to the default
  (FloatHistogram) branch for EncXOR2 head chunks, which would corrupt
  chunk snapshots; the decode path already handled EncXOR2 correctly.
- ooo_head.go: update stale comment to mention EncXOR2.

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
2026-05-20 15:20:37 +00:00
Oleg Zaytsev
959bc1c90e
Unexport sortedStaleSeriesRefsNoOOOData
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2026-05-20 10:07:38 +02:00
Oleg Zaytsev
58898e8031
Implement headStaleIndexReader methods properly
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2026-05-20 09:41:16 +02:00
Bartlomiej Plotka
ccf503efe6
Merge pull request #18629 from prometheus/owilliams/flakefix
tsdb: fix init race that lets initialized() return true before maxTime is set
2026-05-18 14:21:59 +02:00
Arve Knudsen
43e5fc6a24
perf(tsdb): inline oversize-chunk check in populateChunksFromIterable (#18699)
The oversize-chunk trigger introduced in #18692 was implemented as a
closure defined inside the per-sample loop in
populateChunksFromIterable and invoked once at the `if` condition.
Replace it with a plain conditional and hoist
`len(currentChunk.Bytes())` out of the switch so the two encoding
cases don't repeat the same expression. The new shape preserves the
original `||` short-circuit: the size check is only evaluated when
neither the encoding nor the start-timestamp capability forces a new
chunk, which also keeps `currentChunk` non-nil at the point of read.

`gcflags=-m=2` reports the closure body inlined and the symbol table
shows no separate `func1` symbol, yet benchstat shows a measurable
speedup. The most likely explanation: the closure body inlines, but
the `funcval` struct (capturing `currentChunk` and `currentValueType`)
is still stack-constructed each iteration — invisible to escape
analysis, but a real per-iteration cost in a hot loop.

Benchmark, `go test -count=6 -benchmem -bench=BenchmarkQuerierSelectWithOutOfOrder -benchtime=5s -run=^$ ./tsdb/`,
Intel Xeon Platinum 8280 @ 2.70 GHz (linux/amd64), 1M-series head,
query selectivity varies:

                                          │   main      │           optimized                │
                                          │   sec/op    │    sec/op    vs base               │
Head/1of1000000-16                          301.5m ± 4%   257.0m ±  4%  -14.74% (p=0.002 n=6)
Head/10of1000000-16                         305.6m ± 3%   260.4m ±  2%  -14.80% (p=0.002 n=6)
Head/100of1000000-16                        303.9m ± 2%   259.7m ±  2%  -14.54% (p=0.002 n=6)
Head/1000of1000000-16                       303.8m ± 2%   267.0m ±  2%  -12.13% (p=0.002 n=6)
Head/10000of1000000-16                      318.1m ± 1%   278.9m ±  8%  -12.33% (p=0.002 n=6)
Head/100000of1000000-16                     364.1m ± 7%   352.8m ±  4%        ~ (p=0.065 n=6)
Head/1000000of1000000-16                     1.115 ± 2%    1.089 ± 26%        ~ (p=0.394 n=6)
geomean                                     377.8m        337.3m        -10.71%

allocs/op and B/op unchanged. The two largest-selectivity cases trend
faster but are dominated by the per-sample append cost so the relative
delta is smaller and lost in variance.

`TestChunkQuerier_OverlappingInOrderAndOOOChunks` continues to
exercise the overflow path.

```release-notes
NONE
```

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2026-05-15 11:40:31 +00:00
George Krajcsovits
8a9f4ff440
fix(tsdb): chunk overflow on ooo query (#18692)
* fix(tsdb): chunk overflow on ooo query

Guard against, and fix, overflow of chunks holding more than 2^16-1 samples
when chunks are re-encoded, for example because in-order and OOO samples
overlap during compaction or a query.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2026-05-14 22:22:52 +02:00
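The 2^16-1 limit in 8a9f4ff440 suggests a 16-bit sample counter per chunk; that is an assumption in the sketch below, as is the `chunkSizes` helper. It only shows the arithmetic: a new chunk is cut before the counter would wrap.

```go
package main

import (
	"fmt"
	"math"
)

// maxSamplesPerChunk assumes the per-chunk sample count is a uint16.
const maxSamplesPerChunk = math.MaxUint16

// chunkSizes returns how many samples land in each chunk when total samples
// are written and a new chunk is cut once the current one is full.
func chunkSizes(total int) []uint16 {
	var (
		sizes []uint16
		n     uint16
	)
	for i := 0; i < total; i++ {
		if n == maxSamplesPerChunk {
			sizes = append(sizes, n) // close the full chunk
			n = 0
		}
		n++
	}
	if n > 0 {
		sizes = append(sizes, n)
	}
	return sizes
}

func main() {
	var counter uint16 = maxSamplesPerChunk
	counter++ // without a guard the count wraps around to zero
	fmt.Println(counter)
	fmt.Println(chunkSizes(maxSamplesPerChunk + 1))
}
```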
Arve Knudsen
cf4505c6cd
tsdb: skip entire stripes in mmapHeadChunks via per-stripe ready count (#18541)
* tsdb: extract stripeSeries.refStripe helper

Extract the repeated ref-to-stripe-index calculation into a method on
stripeSeries, replacing five inline copies that used two different
casting styles (int and uint64). The helper computes with uint64
internally so it is correct on 32-bit architectures.

* tsdb: skip entire stripes in mmapHeadChunks via per-stripe ready count

Add a per-stripe mmapReady counter to stripeSeries that tracks how many
series in each stripe have headChunkCount >= 2 (i.e., are ready for
mmapping). mmapHeadChunks skips stripes where the counter is zero,
avoiding the RLock and map iteration entirely.

---------

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2026-05-14 18:24:56 +02:00
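The per-stripe counter from cf4505c6cd can be sketched as below. Everything here is simplified and partly assumed: the stripe count is taken to be a power of two so that `refStripe` can mask, `add` is a hypothetical helper, and "mmapping" is reduced to resetting a counter.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

type memSeries struct{ headChunkCount int }

type stripe struct {
	mu        sync.RWMutex
	series    map[uint64]*memSeries
	mmapReady atomic.Int64 // series in this stripe with headChunkCount >= 2
}

type stripeSeries struct{ stripes []stripe }

func newStripeSeries(n int) *stripeSeries {
	s := &stripeSeries{stripes: make([]stripe, n)}
	for i := range s.stripes {
		s.stripes[i].series = map[uint64]*memSeries{}
	}
	return s
}

// refStripe does the arithmetic in uint64 before narrowing to int, so it
// behaves the same on 32-bit platforms. Assumes len(stripes) is a power of two.
func (s *stripeSeries) refStripe(ref uint64) int {
	return int(ref & uint64(len(s.stripes)-1))
}

func (s *stripeSeries) add(ref uint64, headChunks int) {
	st := &s.stripes[s.refStripe(ref)]
	st.mu.Lock()
	st.series[ref] = &memSeries{headChunkCount: headChunks}
	st.mu.Unlock()
	if headChunks >= 2 {
		st.mmapReady.Add(1)
	}
}

// mmapHeadChunks skips stripes whose ready counter is zero without touching
// the stripe lock or the map. The per-series lock is omitted in this sketch.
func (s *stripeSeries) mmapHeadChunks() (visited, mmapped int) {
	for i := range s.stripes {
		st := &s.stripes[i]
		if st.mmapReady.Load() == 0 {
			continue
		}
		visited++
		st.mu.RLock()
		for _, ms := range st.series {
			if ms.headChunkCount >= 2 {
				ms.headChunkCount = 1
				st.mmapReady.Add(-1)
				mmapped++
			}
		}
		st.mu.RUnlock()
	}
	return visited, mmapped
}

func main() {
	s := newStripeSeries(4)
	s.add(0, 1)
	s.add(1, 3)
	s.add(5, 2)
	fmt.Println(s.mmapHeadChunks())
	fmt.Println(s.mmapHeadChunks())
}
```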
Owen Williams
9ec9b1e3c6
Comments to explain what's going on
Signed-off-by: Owen Williams <owen.williams@grafana.com>
2026-05-12 11:00:12 -04:00
Owen Williams
9e16c38214
Merge remote-tracking branch 'origin/main' into owilliams/flakefix
2026-05-12 10:42:09 -04:00
Owen Williams
da1f89e736
tsdb(wal): st-per-sample for histograms initial code and benchmarks (#18221)
Implements ST for Histograms and Float Histograms (and their custom bucket cousins) in WAL. New tests, new benchmarks.

Part of https://github.com/prometheus/prometheus/issues/17790

```release-notes
[CHANGE] Adds Start Time value to all WAL Histogram samples in memory, and therefore may increase memory usage.
```

Signed-off-by: Owen Williams <owen.williams@grafana.com>
2026-05-06 14:33:03 -04:00
Owen Williams
1cdee43726
tsdb: fix init race that lets initialized() return true before maxTime is set
initTime previously set minTime first and maxTime second. Because
Head.initialized() keys only off minTime, a concurrent Head.Appender call
could observe initialized() == true while maxTime was still
math.MinInt64. h.appender() then computes appendableMinValidTime as
MaxTime() - chunkRange/2, which underflows to a large positive number
and rejects in-range samples with ErrOutOfBounds.

Set maxTime first, then minTime. The CAS-loser wait now spins on
minTime instead of maxTime, preserving the existing anti-deadlock
timeout. AppenderV2 shares the same gate, so this single change covers
both paths.

The TestHead_InitAppenderRace_ErrOutOfBounds test added in #17963 is now
stable across 1000 iterations (and 100 iterations under -race).

Relates to #17941
Builds on #17963

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Owen Williams <owen.williams@grafana.com>
2026-05-06 11:16:47 -04:00
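The underflow and the ordering fix from 1cdee43726 can be reproduced in a few lines. This is a sketch with a hypothetical `head` type: the CAS loop and the timeout are left out, and using math.MaxInt64 as the "unset" value of minTime is an assumption.

```go
package main

import (
	"fmt"
	"math"
	"sync/atomic"
)

const (
	unsetMinTime = math.MaxInt64 // assumed sentinel for "not initialized"
	unsetMaxTime = math.MinInt64
)

type head struct{ minTime, maxTime atomic.Int64 }

func newHead() *head {
	h := &head{}
	h.minTime.Store(unsetMinTime)
	h.maxTime.Store(unsetMaxTime)
	return h
}

// initialized keys only off minTime, as the commit message describes.
func (h *head) initialized() bool { return h.minTime.Load() != unsetMinTime }

// initTime publishes maxTime first, so that once initialized() reports true
// a reader can no longer observe the unset maxTime.
func (h *head) initTime(t int64) {
	h.maxTime.Store(t)
	h.minTime.Store(t)
}

// appendableMinValidTime wraps around to a large positive value when maxTime
// is still math.MinInt64, because signed overflow wraps at run time in Go.
func appendableMinValidTime(maxTime, chunkRange int64) int64 {
	return maxTime - chunkRange/2
}

func main() {
	const chunkRange = int64(2 * 60 * 60 * 1000) // two hours in milliseconds
	h := newHead()
	fmt.Println(appendableMinValidTime(h.maxTime.Load(), chunkRange) > 0)
	h.initTime(1_000_000)
	fmt.Println(h.initialized(), appendableMinValidTime(h.maxTime.Load(), chunkRange))
}
```

With the old order (minTime first) a reader could see `initialized() == true` and still hit the first, wrapped result.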
Owen Williams
a5876a0143
tsdb: Reduce test flakiness (#18577)
I have seen some flakiness in these tests, including timeouts. An LLM suggested these fixes to make them more deterministic. They look good to me.

Signed-off-by: Owen Williams <owen.williams@grafana.com>
2026-04-27 10:19:15 +02:00
Denys Sedchenko
ca578101af
feat(tsdb/agent): Implement checkpoint based on series in memory (#17948)
Adds CheckpointFromInMemorySeries option for agent.Options to enable a faster checkpoint implementation that skips segment re-read and just uses in-memory data instead.

* feat: impl agent-specific checkpoint dir
* feat: impl ActiveSeries interface
* feat: use new checkpoint impl
* feat: hide new checkpoint impl behind a feature flag
* feat: add benchmark
* feat: add benchstat case
* feat: use feature flag in bench
* feat: use same labels for persisted state and append
* feat: set WAL segment size
* feat: add checkpoint size metric and bump series size
* feat: wal replay test
* feat: expose new checkpoint opts in cmd flags
* feat: update cli doc
* add ActiveSeries and DeletedSeries doc

Signed-off-by: x1unix <9203548+x1unix@users.noreply.github.com>
Signed-off-by: Denys Sedchenko <9203548+x1unix@users.noreply.github.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
2026-04-24 19:42:26 +02:00
Julien Pivotto
f69db5bc54
storage: introduce search interface with scoring and filtering
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
2026-04-23 15:05:48 +02:00
Julien
3b9caf6564
Merge pull request #18569 from roidelapluie/roidelapluie/labelnames-limit
tsdb: apply LabelNames limit from LabelHints in blockBaseQuerier
2026-04-23 12:28:19 +02:00
Julien Pivotto
a5b5a3329c
tsdb: apply LabelNames limit from LabelHints in blockBaseQuerier
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
2026-04-23 11:05:17 +02:00
George Krajcsovits
c84b0acdb4
test(tsdb): add OOO error coverage for ST zero sample appends (#18554)
* test(tsdb): add OOO error coverage for ST zero sample appends

Add unit tests exercising the out-of-order error paths in
AppendSTZeroSample, AppendHistogramSTZeroSample (AppenderV1), and
the best-effort ST injection in AppenderV2.Append.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* make format

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* test(tsdb): add TestHeadAppenderV2_BestEffortSTZeroSample_OOO

The three OOO cases added to TestHeadAppenderV2_Append_EnableSTAsZeroSample
use a single appender so headChunks is nil at append time; the zero sample
enters the batch and is rejected silently in commitFloats, never reaching
the error-handling branch at line 374 of bestEffortAppendSTZeroSample.

Add a dedicated test that commits the first sample before appending the
second. This makes headChunks non-nil, so appendFloat/appendHistogram/
appendFloatHistogram returns ErrOutOfOrderSample at append time and the
branch at line 374 is actually executed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

---------

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-23 09:48:12 +02:00
Arve Knudsen
c7b2210ac3
tsdb: cache collected head chunks on ChunkReader for O(1) lookup (#18302)
tsdb: cache collected head chunks on ChunkReader for O(1) lookup

The query path calls s.chunk() once per chunk meta via
ChunkOrIterableWithCopy. Each call walks the head chunks linked list
from the head to the target position. For a series with N head chunks
iterated oldest-first, total work is O(N²).

Cache the collected []*memChunk slice on headChunkReader, keyed by
series ref, head pointer, and mmapped chunks length. Collected once
per series under lock; reused on subsequent chunk lookups for the same
series. The backing array is reused across series (zero alloc after
first use).

Series with 0 or 1 head chunks skip the cache entirely to avoid
per-series overhead that dominates for typical workloads where most
series have a single head chunk.

The cache is gated behind an enableCache flag, toggled via an optional
chunkCacheToggler interface only when hints.Step > 0 (range queries).
Instant queries only need one chunk per series, so the cache overhead
is not recouped.

Also replace O(N²) linked-list traversals in appendSeriesChunks with
O(N) collectHeadChunks + slice iteration, and thread reusable
headChunksBuf through the index reader paths to avoid per-series
allocations.


---------

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
2026-04-17 18:34:41 +02:00
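The caching idea in c7b2210ac3 is to walk the head-chunk linked list once per series and index into a slice afterwards. The sketch below uses hypothetical `memChunk` and `headChunkCache` types; the `walks` counter exists only to make the saving visible, and locking is omitted.

```go
package main

import "fmt"

// memChunk is a node in a newest-to-oldest linked list of head chunks.
type memChunk struct {
	id   int
	prev *memChunk // next-older chunk
}

// headChunkCache keeps the collected slice keyed by series ref, head pointer
// and mmapped chunk count, mirroring the key described in the commit message.
type headChunkCache struct {
	valid      bool
	ref        uint64
	head       *memChunk
	mmappedLen int
	chunks     []*memChunk // oldest first; backing array is reused
	walks      int         // number of full list walks performed
}

// collectHeadChunks walks the list once and returns the chunks oldest-first.
func collectHeadChunks(head *memChunk, buf []*memChunk) []*memChunk {
	buf = buf[:0]
	for c := head; c != nil; c = c.prev {
		buf = append(buf, c)
	}
	for i, j := 0, len(buf)-1; i < j; i, j = i+1, j-1 {
		buf[i], buf[j] = buf[j], buf[i]
	}
	return buf
}

// chunk returns the i-th oldest head chunk, collecting only on a key change.
func (c *headChunkCache) chunk(ref uint64, head *memChunk, mmappedLen, i int) *memChunk {
	if !c.valid || c.ref != ref || c.head != head || c.mmappedLen != mmappedLen {
		c.chunks = collectHeadChunks(head, c.chunks)
		c.valid, c.ref, c.head, c.mmappedLen = true, ref, head, mmappedLen
		c.walks++
	}
	if i < 0 || i >= len(c.chunks) {
		return nil
	}
	return c.chunks[i]
}

func main() {
	var head *memChunk
	for id := 0; id < 4; id++ {
		head = &memChunk{id: id, prev: head}
	}
	var cache headChunkCache
	for i := 0; i < 4; i++ {
		fmt.Print(cache.chunk(7, head, 0, i).id, " ")
	}
	fmt.Println("walks:", cache.walks)
}
```

Without the cache, each of the N lookups walks up to N nodes; with it, one walk serves all lookups for the same series.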
Arve Knudsen
98809e40c6
tsdb: Skip clean series during periodic head chunk mmap (#18272)
tsdb: Skip clean series during periodic head chunk mmap

The periodic mmapHeadChunks cycle previously acquired a per-series
lock on every series, even though typically >99% have nothing to
mmap. This was identified as a CPU bottleneck in Grafana Mimir.

Add a headChunkCount field (sync/atomic.Uint32) to memSeries that
tracks the number of head chunks. It is incremented in
cutNewHeadChunk and the histogram new-chunk paths, and reset by
mmapChunks and truncateChunksBefore. mmapHeadChunks uses a lock-free
Load to skip series with fewer than 2 head chunks, avoiding the
per-series lock for clean series.

sync/atomic.Uint32 (4 bytes) is used instead of go.uber.org/atomic
(8 bytes) to fit in existing struct padding without growing
memSeries. Chunk counts are bounded by the 3-byte field in
HeadChunkRef, so cannot overflow uint32.

Also fix pre-existing comment inaccuracies in the touched code:
headChunks.next -> headChunks.prev, mmapHeadChunks() -> mmapChunks()
in the doc comment, and a grammar error.

---------

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2026-04-14 17:11:35 +02:00
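The lock-free skip from 98809e40c6 can be sketched with sync/atomic.Uint32. The `memSeries` here is a hypothetical reduction, and resetting the counter to 1 after mmapping (all but the newest chunk) is an assumption about what "reset" means.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

type memSeries struct {
	mu             sync.Mutex
	headChunkCount atomic.Uint32 // readable without holding mu
	mmapped        int
}

// cutNewHeadChunk models starting a new head chunk.
func (s *memSeries) cutNewHeadChunk() {
	s.mu.Lock()
	s.headChunkCount.Add(1)
	s.mu.Unlock()
}

// mmapChunks moves every head chunk except the newest out of the head and
// resets the counter accordingly (assumed behaviour for this sketch).
func (s *memSeries) mmapChunks() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	n := s.headChunkCount.Load()
	if n < 2 {
		return 0
	}
	s.mmapped += int(n - 1)
	s.headChunkCount.Store(1)
	return int(n - 1)
}

// mmapHeadChunks takes the per-series lock only for series that have at
// least two head chunks; clean series cost a single atomic load.
func mmapHeadChunks(all []*memSeries) (locked, mmapped int) {
	for _, s := range all {
		if s.headChunkCount.Load() < 2 {
			continue
		}
		locked++
		mmapped += s.mmapChunks()
	}
	return locked, mmapped
}

func main() {
	a, b := &memSeries{}, &memSeries{}
	a.cutNewHeadChunk()
	for i := 0; i < 3; i++ {
		b.cutNewHeadChunk()
	}
	fmt.Println(mmapHeadChunks([]*memSeries{a, b}))
	fmt.Println(mmapHeadChunks([]*memSeries{a, b}))
}
```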
Julien Pivotto
2828c543bc
tsdb: reduce chunk segment size in TestDiskFillingUpAfterDisablingOOO
The test only writes ~80 samples, so the default 512MB chunk segment
pre-allocation during compaction is unnecessary. Use 1MB instead to
avoid large file allocations on constrained CI environments.

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
2026-04-02 12:23:55 +02:00
Ayoub Mrini
0dd834e924
Merge pull request #18406 from machine424/depll
test: migrate TestDelayedCompaction to synctest to eliminate flakiness
2026-04-01 16:40:50 +02:00
Björn Rabenstein
4280662cdf
Merge pull request #18304 from crawfordxx/fix-typos-in-comments
Fix typos in comments and metric help strings
2026-04-01 13:45:59 +02:00
Jorge Creixell
4b562bba6e
tsdb: fix prometheus_tsdb_head_chunks going negative after WAL replay (#18401)
* tsdb: fix prometheus_tsdb_head_chunks going negative after WAL replay

  When truncateStaleSeries deletes a series (writing a full-range tombstone
  to the WAL) and the same label set is immediately re-created, WAL replay
  queues the following sequence on the same processor shard for the shared
  memSeries pointer:

    reset(mSeries, M mmappedChunks, walRef=old)
    deleteSeriesByID(old)
    reset(mSeries, N mmappedChunks, walRef=new)

  deleteSeriesByID correctly subtracts M from the gauge but does not clear
  series.mmappedChunks. The subsequent reset subtracts M again, driving
  prometheus_tsdb_head_chunks negative when M > N.

  Fix by setting series.mmappedChunks = nil in deleteSeriesByID after
  accounting for those chunks.
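
  The double subtraction can be reproduced with a simplified model. This is
  only a sketch: the `head` and `memSeries` types below are hypothetical
  reductions, and `reset` is assumed to subtract the old mmapped chunk count
  before adding the new one, as the description above implies.

```go
package main

import "fmt"

// memSeries is a hypothetical reduction of the real type: only the
// mmapped chunk slice matters for the gauge accounting.
type memSeries struct {
	mmappedChunks []int
}

// head holds a plain int standing in for prometheus_tsdb_head_chunks.
type head struct {
	chunksGauge int
}

// reset models the replay step: drop the old chunks from the gauge,
// then account for the n chunks of the re-created series.
func (h *head) reset(s *memSeries, n int) {
	h.chunksGauge -= len(s.mmappedChunks)
	s.mmappedChunks = make([]int, n)
	h.chunksGauge += n
}

// deleteSeriesByID models the deletion. With clearChunks=false it shows
// the old behaviour; with clearChunks=true it shows the fix.
func (h *head) deleteSeriesByID(s *memSeries, clearChunks bool) {
	h.chunksGauge -= len(s.mmappedChunks)
	if clearChunks {
		s.mmappedChunks = nil
	}
}

// replay runs reset(M), delete, reset(N) on one shared memSeries and
// returns the final gauge value.
func replay(m, n int, clearChunks bool) int {
	h := &head{}
	s := &memSeries{}
	h.reset(s, m)
	h.deleteSeriesByID(s, clearChunks)
	h.reset(s, n)
	return h.chunksGauge
}

func main() {
	fmt.Println("without fix:", replay(3, 1, false)) // -2
	fmt.Println("with fix:   ", replay(3, 1, true))  // 1
}
```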

  Fixes #10884

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Signed-off-by: Jorge Creixell <jcreixell@gmail.com>

* Simplify test

  - Re-use appending helper
  - Cleanup comments

Signed-off-by: Jorge Creixell <jcreixell@gmail.com>

* Improve comments in test

Signed-off-by: Jorge Creixell <jcreixell@gmail.com>

* Fix formatting

Signed-off-by: Jorge Creixell <jcreixell@gmail.com>

* Improve comment

Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Signed-off-by: Jorge Creixell <jcreixell@gmail.com>

---------

Signed-off-by: Jorge Creixell <jcreixell@gmail.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
2026-04-01 11:30:33 +02:00
Rushabh Mehta
a2172f91c1
tsdb: Find the last series ID on startup from the last series id file and WAL scan (#18333)
* Add logic to Head.Init(...) for fast startup

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

* Add unit tests

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

* Empty commit to retrigger CI

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

* Empty commit to retrigger CI

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

* Make readSeriesStateFile return a struct directly, fix small nits, remove test

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

* Fix test for readSeriesStateFile function

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

* Fix some more nits, add extra testcase

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

---------

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>
2026-03-31 21:45:53 -07:00
Bartlomiej Plotka
fb38463dfb
Merge pull request #18321 from atoulme/aix
aix: support the aix/ppc64 compilation target
2026-03-31 16:42:20 +02:00
Julien
4b4d5157b8
chunkenc: add tests for XOR2 active ST delta and value branches (#18363)
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
2026-03-31 15:15:48 +02:00
Kyle Eckhart
37d85980a3
tsdb/agent: fix getOrCreate race (#18292)
* tsdb/agent: fix race in getOrCreate and consolidate series lookup
* tsdb/agent: fix transition window race in SetUnlessAlreadySet
* tsdb/agent: address review feedback and improve BenchmarkGetOrCreate

Signed-off-by: Kyle Eckhart <kgeckhart@users.noreply.github.com>

---------

Signed-off-by: Kyle Eckhart <kgeckhart@users.noreply.github.com>
2026-03-31 15:08:58 +02:00
machine424
86215cf91f
test: migrate TestDelayedCompaction to synctest to eliminate flakiness
The previous implementation relied on real wall-clock time and busy-loops
(time.Sleep + polling loops) to detect when compaction had finished, making
it both slow and flaky, especially on busy CI envs and also on Windows (due to timer imprecision).

Now both subtests run on Windows.

The delay value can be increased (1s → 5s) at zero cost to test runtime.

Also cleaned up shared logic into small helpers and split the no-delay and
delay-enabled cases into separate subtests for clarity.

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2026-03-30 23:58:08 +02:00
machine424
dcfb8ce59c
chore: remove util/testutil/synctest now that we use Go>=1.25
Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2026-03-30 19:48:39 +02:00
Julien Pivotto
3856195bb8
tsdb: use float64 for retention percentage
The retention.percentage config field was typed as uint, which silently
truncated fractional values. Setting percentage: 1.5 in prometheus.yml
resulted in a retention of 1%, with no warning or error.

Remove the redundant MaxPercentage > 100 clamp in main.go; the config
UnmarshalYAML already returns an error for out-of-range values before
this code is reached.
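
The effect of the truncation can be illustrated with plain Go conversions. This sketch does not reproduce the YAML decoding path; `retainedBytes` and the 1000-byte total are hypothetical and only show how much a dropped fraction changes the result.

```go
package main

import "fmt"

// retainedBytes is a hypothetical helper: it returns pct percent of
// total, computed in floating point and rounded down to whole bytes.
func retainedBytes(total uint64, pct float64) uint64 {
	return uint64(float64(total) * pct / 100)
}

func main() {
	configured := 1.5 // value written in the config file

	// Converting to an unsigned integer drops the fractional part.
	truncated := uint(configured)
	fmt.Println(truncated) // 1

	// With a uint field the effective retention is 1%, not 1.5%.
	fmt.Println(retainedBytes(1000, float64(truncated))) // 10

	// With a float64 field the fraction is preserved.
	fmt.Println(retainedBytes(1000, configured)) // 15
}
```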

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
2026-03-26 12:39:22 +01:00
Julien Pivotto
7a1a5e285f
chunkenc: add extra tests
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
2026-03-25 09:59:12 +01:00
Julien Pivotto
d8607cbd9b
tsdb/chunkenc: optimise XOR2 and varbit hot paths
Use writeBitsFast instead of writeBits in putVarbitInt/putVarbitUint,
combining prefix and value into a single call per bucket. Inline the
common fast paths in XOR2 Append to avoid encodeJoint and putVarbitInt
calls for the typical dod=0 and 13-bit dod cases.

Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
2026-03-25 09:09:46 +01:00
Rushabh Mehta
df61021436
tsdb: Add series_state.json file to wal/ directory to track state (#18303)
* Add series_state.json file creation and update logic.

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

* Make comments follow the guidelines.

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

* Fix linter complaints

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

* Put PR behind feature flag fast-startup

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

* Marshal updated information to file directly

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

* Fix linter failures

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

* Move series state code from head.go to head_wal.go

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

* Fix nits

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

* Add unit test

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>

---------

Signed-off-by: Rushabh Mehta <mehtarushabh2005@gmail.com>
2026-03-23 20:46:04 -07:00