[BUGFIX] tsdb: store a millisecond timestamp (not a WAL segment number) in `walExpiries`
when a series is evicted via `CompactStaleHead`/`CompactSelectedSeries`, so the series'
label record is correctly retained in the next WAL checkpoint and replays cleanly.
Signed-off-by: Yuri Nikolic <durica.nikolic@grafana.com>
resetMatchedSigs used clear() to recycle inner maps, but the fill-LHS
loop checked matchedSigs[sigOrd] != nil to decide whether an RHS sample
was already matched. A cleared but non-nil map from a previous timestep
caused a false positive, skipping the fill and producing missing samples
in range queries using group_left fill_left or group_right fill_right.
Fix by restoring clear() for map reuse and changing the check to
len(matchedSigs[sigOrd]) > 0, which correctly treats an empty map as
unmatched while preserving the allocation-reuse across timesteps.
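A minimal sketch of the difference between the two checks, assuming a slice of inner maps indexed by signature ordinal. The element type and the helper names isMatchedNilCheck and isMatchedLenCheck are hypothetical and not taken from the engine code:

```go
package main

import "fmt"

// isMatchedNilCheck mirrors the old check: any non-nil inner map counts
// as "already matched", even if it holds no entries.
func isMatchedNilCheck(matchedSigs []map[uint64]struct{}, sigOrd int) bool {
	return matchedSigs[sigOrd] != nil
}

// isMatchedLenCheck mirrors the new check: an empty map (or a nil one,
// whose len is 0) counts as unmatched.
func isMatchedLenCheck(matchedSigs []map[uint64]struct{}, sigOrd int) bool {
	return len(matchedSigs[sigOrd]) > 0
}

func main() {
	matchedSigs := make([]map[uint64]struct{}, 2)

	// Timestep 1: signature 0 gets a match, signature 1 does not.
	matchedSigs[0] = map[uint64]struct{}{42: {}}

	// Between timesteps: recycle the inner maps instead of dropping them.
	// clear on a nil map is a no-op.
	for _, m := range matchedSigs {
		clear(m)
	}

	// Timestep 2: nothing has matched yet.
	fmt.Println("nil check, sig 0:", isMatchedNilCheck(matchedSigs, 0))
	fmt.Println("len check, sig 0:", isMatchedLenCheck(matchedSigs, 0))
	fmt.Println("len check, sig 1:", isMatchedLenCheck(matchedSigs, 1))
}
```

The nil check reports signature 0 as matched after the reset, which is the false positive described above; the len check does not.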
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
Adding this as a meta label makes it possible to dynamically configure
this setting through discovery labels.
This helps in use cases such as Kubernetes, where the setting could be
enabled with a pod annotation.
Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
* chore: fix typos in comments
Fix three minor typos in source comments:
- scrape: mimicks -> mimics
- tsdb: descibes -> describes
- ui/codemirror-promql: theses -> these
Signed-off-by: RoySerbi <roy676564@gmail.com>
* ci: retrigger CI to clear known 32-bit flake
Empty commit to retrigger CI. The previous run failed only on
'Go tests for 32-bit x86' due to the known intermittent flake in
TestRemoteWrite_PerQueueMetricsAfterRelabeling (see #17356), which
is unrelated to this comment-only PR.
Signed-off-by: RoySerbi <roy676564@gmail.com>
---------
Signed-off-by: RoySerbi <roy676564@gmail.com>
This change adds support for case-insensitive prefix matching. The goal is to improve performance when evaluating long case-insensitive regexes without noticeably degrading performance in other cases.
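As an illustration of the idea only, not the matcher added by this change: a case-insensitive literal prefix can be tested with a byte loop instead of running the regex engine. The helper names below are hypothetical and the sketch assumes ASCII-only input:

```go
package main

import (
	"fmt"
	"regexp"
)

// lowerASCII lowercases a single ASCII letter and leaves other bytes as-is.
func lowerASCII(c byte) byte {
	if 'A' <= c && c <= 'Z' {
		return c + ('a' - 'A')
	}
	return c
}

// hasPrefixFoldASCII reports whether s starts with prefix, ignoring ASCII
// case. It does not apply Unicode case folding.
func hasPrefixFoldASCII(s, prefix string) bool {
	if len(s) < len(prefix) {
		return false
	}
	for i := 0; i < len(prefix); i++ {
		if lowerASCII(s[i]) != lowerASCII(prefix[i]) {
			return false
		}
	}
	return true
}

func main() {
	// For inputs without newlines, the regex below accepts exactly the
	// strings that start with "prod-" in any ASCII case.
	re := regexp.MustCompile(`(?i)^prod-.*$`)
	for _, s := range []string{"PROD-eu-1", "Prod-us-2", "staging-1", "pro"} {
		fmt.Println(s, hasPrefixFoldASCII(s, "prod-"), re.MatchString(s))
	}
}
```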
Signed-off-by: Casie Chen <casie.chen@grafana.com>
Add TestAppendHistogramErrorDoesNotSetPendingCommit (V1) and
TestHeadAppenderV2_HistogramErrorDoesNotSetPendingCommit (V2),
each covering the integer and float histogram branches.
The integer V1 branch previously set s.pendingCommit on the error
path, which left the flag stuck on existing series whenever an
append was rejected (e.g. ErrOutOfOrderSample). Because the failed
sample is never added to the appender's batch, Commit/Rollback
never clears pendingCommit for that series, and head GC at
tsdb/head.go treats it as still in use.
The V1 integer subtest fails on main without the prior commit;
both subtests pass with it. The V2 paths already use err == nil
and the V2 test is a lock-in; inverting the V2 condition locally
confirms the test would catch a similar regression there.
Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
AppendHistogram used err != nil when deciding to set pendingCommit for integer histograms, while the float histogram branch uses err == nil. Align the integer histogram path so pendingCommit is set only after a successful appendableHistogram check, matching appendableFloatHistogram.
Signed-off-by: Weixie Cui <cuiweixie@gmail.com>
- Add packages: write permission to publish_main and publish_release jobs
- Add ghcr_io_password: github.token to both publish jobs
- Bump promci build/publish actions from v0.6.0 to v0.8.2 (SHA-pinned)
- Drop standalone checkout steps preceding promci build/publish steps
  (promci v0.8.2 performs its own checkout)
- Add persist-credentials: false to check_release_notes checkout
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
The azureClient struct held the armcompute and armnetwork SDK clients as
concrete fields while satisfying the client interface. Once such a client is
reachable through an interface, Go's linker conservatively retains every
exported method of the concrete type plus the entire (de)serializer graph those
operations drag in, even though discovery calls only a handful of them.
Wrap each SDK client in a small adapter that captures only the operations
discovery uses as method-value closures, and box the adapters instead of the raw
clients. The concrete clients then live only inside closure contexts, which
reflection cannot traverse, so dead-code elimination drops the unused
operations.
This drops the retained operations per client from ~60 down to the 2-3 actually
used (UsedInIface markers go from 244+66 to 0), shrinking both the prometheus
and promtool binaries by ~3.2 MB each. No functional or API change.
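The shape of the adapter can be sketched as follows. bigClient, vmLister and vmAdapter are hypothetical names standing in for an SDK client, the narrow interface and the wrapper; the sketch shows only the structure and does not itself demonstrate the binary size effect:

```go
package main

import "fmt"

// bigClient stands in for an SDK client with many exported methods.
type bigClient struct{ name string }

func (c *bigClient) ListVMs() []string     { return []string{c.name + "-vm0"} }
func (c *bigClient) DeleteVM(string) error { return nil } // not used by discovery
func (c *bigClient) ResizeVM(string) error { return nil } // not used by discovery

// vmLister is the narrow interface the discovery code depends on.
type vmLister interface {
	ListVMs() []string
}

// vmAdapter stores only the method values that are actually needed. The
// concrete *bigClient is captured by the closure, not held as a field.
type vmAdapter struct {
	listVMs func() []string
}

func newVMAdapter(c *bigClient) *vmAdapter {
	return &vmAdapter{listVMs: c.ListVMs} // method value bound to c
}

func (a *vmAdapter) ListVMs() []string { return a.listVMs() }

func main() {
	// The adapter, not the raw client, is what gets boxed into the interface.
	var l vmLister = newVMAdapter(&bigClient{name: "demo"})
	fmt.Println(l.ListVMs())
}
```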
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
Set ignore-scripts=true, allow-git=none, and min-release-age=3 to
harden npm installs against lifecycle script abuse, git-sourced
dependencies, and recently published packages.
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
The Kubernetes SD Discovery struct held the clientset as a
kubernetes.Interface field. Boxing the concrete *kubernetes.Clientset into
an interface marks it <UsedInIface>, so the Go linker conservatively retains
every API-group accessor and, transitively, every resource client and its
apply configurations, even though discovery only touches the core, apps,
batch, discovery and networking v1 groups.
Wrap the clientset in an adapter that captures only the used API-group
accessors as method-value closures and exposes them through a narrow
k8sClient interface. The concrete clientset now lives only inside closure
contexts, which reflection cannot traverse, so dead-code elimination drops
the unused groups. The fake clientset still satisfies the narrow interface,
so tests are unchanged.
This trims about 10 MB from each of the prometheus and promtool binaries.
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
The Discovery struct held *compute.Service and *compute.InstancesService as
fields and is boxed into the discovery.Discoverer interface. Once reflection is
reachable in the program (it always is, via the YAML/config machinery), the Go
linker conservatively retains every exported method of any concrete type
reachable through an interface, including via struct fields. *compute.Service
exposes ~150 sub-services and their operations, so all of them — 994 list/get
operations and their serializers — were retained even though discovery only
calls Instances.List.
Wrap the single used operation in a closure over the concrete *compute.Service
so the service lives only in closure context, which reflection cannot traverse.
Returning a *compute.InstancesListCall would not help, since that type has an
s *Service back-reference that re-propagates the marker, so the closure
encapsulates the whole List/Filter/Pages chain and only exposes the
*compute.InstanceList data type discovery already uses.
compute/v1 footprint drops from ~4.9 MB to ~33 KB, and the prometheus and
promtool binaries each shrink by ~13.5 MB. No functional or API change.
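A sketch of wrapping a whole call chain in one closure so that only plain data crosses the boundary. The service and listCall types are hypothetical stand-ins for *compute.Service and *compute.InstancesListCall, and listInstancesFunc is an illustrative name:

```go
package main

import (
	"fmt"
	"strings"
)

// service stands in for the large SDK service type.
type service struct {
	instances map[string][]string // project -> instance names
}

// listCall stands in for a builder-style call object. Like the real call
// type described above, it keeps a back-reference to the service.
type listCall struct {
	s       *service
	project string
	filter  string
}

func (s *service) List(project string) *listCall {
	return &listCall{s: s, project: project}
}

func (c *listCall) Filter(f string) *listCall { c.filter = f; return c }

// Pages invokes fn once with all instances that contain the filter string.
func (c *listCall) Pages(fn func([]string) error) error {
	var page []string
	for _, name := range c.s.instances[c.project] {
		if c.filter == "" || strings.Contains(name, c.filter) {
			page = append(page, name)
		}
	}
	return fn(page)
}

// listInstancesFunc exposes only data types; neither the service nor the
// call object appears in its signature.
type listInstancesFunc func(project, filter string, fn func([]string) error) error

func newListInstances(s *service) listInstancesFunc {
	return func(project, filter string, fn func([]string) error) error {
		return s.List(project).Filter(filter).Pages(fn)
	}
}

func main() {
	svc := &service{instances: map[string][]string{"p": {"web-1", "db-1"}}}
	list := newListInstances(svc)
	_ = list("p", "web", func(page []string) error {
		fmt.Println(page)
		return nil
	})
}
```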
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
The AWS service-discovery code boxed each concrete SDK client (*ec2.Client,
*rds.Client, *lightsail.Client, *elasticache.Client, *ecs.Client and
*kafka.Client) into an interface, either directly or as a field of a struct
that is itself boxed into the Discoverer interface.
Once reflection is reachable in the program -- it always is, via the
YAML/config machinery -- the Go linker conservatively retains every exported
method of any concrete type reachable through an interface, including a type
held as a field of an interface-boxed struct. Each SDK client exposes the
service's full API (e.g. *ec2.Client has ~470 operation methods), so all of
their operation serializers and the (de)serializer graphs of the corresponding types
were kept, even though discovery only calls a handful of operations. EC2 alone
accounted for ~21 MB.
Wrap each client in a small adapter that captures only the operations
discovery uses as method-value closures. The concrete client then lives only
inside closure contexts, which reflection cannot traverse, so dead-code
elimination can drop the unused operations.
This reduces the binary sizes substantially:
prometheus 228.6 MB -> 162.6 MB (-66 MB, -29%)
promtool 205.1 MB -> 139.0 MB (-66 MB, -32%)
There is no functional or API change; the mocking interfaces used by the tests
are unchanged.
Signed-off-by: Julien Pivotto <291750+roidelapluie@users.noreply.github.com>
histogramSamplesV2 and floatHistogramSamplesV2 tracked the previous
sample's Ref and ST via a *RefHistogramSample pointer (prev). Taking the
address of a loop-local variable (prev = &rh) forced the compiler to
heap-allocate rh on every iteration; the first iteration also allocated
a separate sentinel struct. The pointed-to fields were only ever read as
two int64 scalars, so the pointer added zero semantic value.
Replace prev with two scalar variables (prevRef, prevST) and a boolean
sentinel. rh no longer has its address taken and stays on the stack.
This affects every caller of dec.HistogramSamples that produces V2
records (EnableSTStorage=true): WAL replay, the WAL watcher (remote
write tail), and checkpoint creation.
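A reduced sketch of the scalar-tracking pattern. The sample type and the delta scheme (first pair absolute, later pairs relative to the previous sample) are assumptions made for illustration and do not reproduce the record format:

```go
package main

import "fmt"

// sample is a stand-in for a decoded histogram sample header.
type sample struct {
	Ref int64
	ST  int64
}

// decode turns (refDelta, stDelta) pairs into absolute values. The previous
// sample is tracked with two scalars and a boolean, so the loop-local rh
// never has its address taken.
func decode(deltas [][2]int64) []sample {
	out := make([]sample, 0, len(deltas))
	var (
		prevRef, prevST int64
		havePrev        bool
	)
	for _, d := range deltas {
		var rh sample
		if !havePrev {
			// Assumed: the first pair carries absolute values.
			rh.Ref, rh.ST = d[0], d[1]
			havePrev = true
		} else {
			rh.Ref = prevRef + d[0]
			rh.ST = prevST + d[1]
		}
		prevRef, prevST = rh.Ref, rh.ST
		out = append(out, rh)
	}
	return out
}

func main() {
	fmt.Println(decode([][2]int64{{10, 100}, {1, 5}, {2, -3}}))
}
```

Building with go build -gcflags=-m prints the compiler's escape analysis decisions, which is one way to check whether a loop variable is moved to the heap.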
Benchmarks (go test -count=6 -benchmem, benchstat):
BenchmarkDecodeHistogramSamples (tsdb/record)
              │    before    │              after              │
              │  allocs/op   │  allocs/op    vs base           │
buckets=0/v2  │ 2.001k ± 0%  │ 1.000k ± 0%  -50.02% (p=0.002)  │
buckets=4/v2  │ 4.001k ± 0%  │ 3.000k ± 0%  -25.02% (p=0.002)  │
buckets=16/v2 │ 4.001k ± 0%  │ 3.000k ± 0%  -25.02% (p=0.002)  │
              │    before    │              after              │
              │     B/op     │     B/op      vs base           │
buckets=0/v2  │ 187.5Ki ± 0% │ 156.2Ki ± 0%  -16.68% (p=0.002) │
buckets=4/v2  │ 250.0Ki ± 0% │ 218.8Ki ± 0%  -12.51% (p=0.002) │
buckets=16/v2 │ 437.5Ki ± 0% │ 406.2Ki ± 0%   -7.15% (p=0.002) │
BenchmarkLoadWLs end-to-end WAL replay (tsdb), stStorage=true only
                         │    before     │              after              │
                         │   allocs/op   │   allocs/op   vs base           │
histogramSeriesPct=1.000 │ 19.70M ± 0%   │ 14.90M ± 0%  -24.39% (p=0.002)  │
histogramSeriesPct=0.500 │ 10.47M ± 0%   │  8.06M ± 0%  -23.00% (p=0.002)  │
                         │    before     │              after              │
                         │     B/op      │     B/op      vs base           │
histogramSeriesPct=1.000 │ 1.539Gi ± 0%  │ 1.394Gi ± 0%   -9.42% (p=0.002) │
histogramSeriesPct=0.500 │ 1051.3Mi ± 0% │ 975.1Mi ± 0%   -7.25% (p=0.002) │
                         │    before     │              after              │
                         │    sec/op     │    sec/op     vs base           │
histogramSeriesPct=1.000 │ 824.9m ± 0%   │ 762.6m ± 1%    -7.55% (p=0.002) │
histogramSeriesPct=0.500 │ 488.6m ± 1%   │ 451.4m ± 1%    -7.61% (p=0.002) │
V1 paths and float-only shapes are unchanged (p >> 0.05 throughout).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Miguel Bernabeu Diaz <miguel.bernabeu@coralogix.com>
Add two benchmark components to measure the native histogram decode hot
path, which is shared by WAL replay, WAL watcher (remote write), and
checkpoint creation.
tsdb/record: BenchmarkDecodeHistogramSamples isolates the V1 and V2
histogram decoder paths across bucket counts (0, 4, 16), giving a
precise per-sample allocation signal for decoder changes.
tsdb: BenchmarkLoadWLs gains two new shapes:
- all-histogram (histogramSeriesPct=1.0, bucketsPerHistogram=8): mirrors
  the existing "In between" float shape for direct comparison.
- mixed (histogramSeriesPct=0.5, bucketsPerHistogram=8): models a
  deployment partway through migrating to native histograms.
Both shapes are parameterised over stStorage (V1 vs V2 encoding) via the
existing enableSTStorage loop, so benchstat can show the V1/V2 delta
without additional test infrastructure. The subtest names include
histogramSeriesPct and bucketsPerHistogram only when non-zero, leaving
existing float-only subtest names unchanged.
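One way such conditional naming could look. The shape struct, its batches and seriesPerBatch fields, and the exact name layout are hypothetical; only histogramSeriesPct, bucketsPerHistogram and stStorage come from the description above:

```go
package main

import (
	"fmt"
	"strings"
)

// shape is a stand-in for one benchmark case.
type shape struct {
	batches             int
	seriesPerBatch      int
	histogramSeriesPct  float64
	bucketsPerHistogram int
}

// name builds the subtest name. The histogram fields are appended only
// when non-zero, so float-only shapes keep their previous names.
func (s shape) name(stStorage bool) string {
	var b strings.Builder
	fmt.Fprintf(&b, "batches=%d,seriesPerBatch=%d", s.batches, s.seriesPerBatch)
	if s.histogramSeriesPct > 0 {
		fmt.Fprintf(&b, ",histogramSeriesPct=%.3f", s.histogramSeriesPct)
	}
	if s.bucketsPerHistogram > 0 {
		fmt.Fprintf(&b, ",bucketsPerHistogram=%d", s.bucketsPerHistogram)
	}
	fmt.Fprintf(&b, ",stStorage=%t", stStorage)
	return b.String()
}

func main() {
	floatOnly := shape{batches: 10, seriesPerBatch: 100}
	mixed := shape{batches: 10, seriesPerBatch: 100, histogramSeriesPct: 0.5, bucketsPerHistogram: 8}
	for _, st := range []bool{false, true} {
		fmt.Println(floatOnly.name(st))
		fmt.Println(mixed.name(st))
	}
}
```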
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Miguel Bernabeu Diaz <miguel.bernabeu@coralogix.com>