Commit graph

25 commits

Author SHA1 Message Date
Casie Chen
cbf414d68a Add regex optimization for simple contains alternations
Signed-off-by: Casie Chen <casie.chen@grafana.com>
2026-02-06 16:16:58 -08:00
Nick Pillitteri
a769a7eeb7
model/labels: fix regex with capture, wildcards, literal (#17828)
Some checks are pending
buf.build / lint and publish (push) Waiting to run
CI / Go tests (push) Waiting to run
CI / More Go tests (push) Waiting to run
CI / Go tests with previous Go version (push) Waiting to run
CI / UI tests (push) Waiting to run
CI / Go tests on Windows (push) Waiting to run
CI / Mixins tests (push) Waiting to run
CI / Build Prometheus for common architectures (push) Waiting to run
CI / Build Prometheus for all architectures (push) Waiting to run
CI / Report status of build Prometheus for all architectures (push) Blocked by required conditions
CI / Check generated parser (push) Waiting to run
CI / golangci-lint (push) Waiting to run
CI / fuzzing (push) Waiting to run
CI / codeql (push) Waiting to run
CI / Publish main branch artifacts (push) Blocked by required conditions
CI / Publish release artefacts (push) Blocked by required conditions
CI / Publish UI on npm Registry (push) Blocked by required conditions
Scorecards supply-chain security / Scorecards analysis (push) Waiting to run
This change fixes an issue introduced in #17707. When a regex
with a wildcard, literal, and final wildcard surounded by a
capture group was parsed - the capture group was not removed
first preventing `optimizeConcatRegex` from running.

Found via fuzz testing.

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
2026-01-12 15:51:45 +00:00
Charles Korn
a919e6d5ef
model/labels: improve performance of regex matchers like .*-.*-.* (#17707)
#14173 introduced an optimisation to better handle regex patterns like .*-.*-.*. It identifies strings the pattern cannot possibly match (because they do not contain all of the literal values) and returns false from MatchString early.

However, if the string does contain all literal values, then the Go regex engine is used to confirm that the string does match the pattern. But this is not necessary in the case where the start and end of the pattern is .* and everything in between is either a literal or .*: if the string contains all of the literals in order, then it matches the pattern, and invoking Go's regex engine to confirm this is unnecessary and quite slow.

* Add some more test cases
* Add benchmark, since existing benchmark doesn't show much impact given most of the random test strings will not match the patterns.

Signed-off-by: Charles Korn <charles.korn@grafana.com>
2026-01-08 10:20:23 +00:00
Ben Kochie
e14795bbf4
Remove copyright date from headers (#17785)
Remove copyright dates from various files as part of [PROM-50].

[PROM-50]: https://github.com/prometheus/proposals/blob/main/proposals/0050-remove-copyright-dates.md

Signed-off-by: SuperQ <superq@gmail.com>
2026-01-05 13:46:21 +01:00
Ben Kochie
204249fcb5
Update golangci-lint (#17478)
Some checks are pending
buf.build / lint and publish (push) Waiting to run
CI / Go tests (push) Waiting to run
CI / More Go tests (push) Waiting to run
CI / Go tests with previous Go version (push) Waiting to run
CI / UI tests (push) Waiting to run
CI / Go tests on Windows (push) Waiting to run
CI / Mixins tests (push) Waiting to run
CI / Build Prometheus for common architectures (push) Waiting to run
CI / Build Prometheus for all architectures (push) Waiting to run
CI / Report status of build Prometheus for all architectures (push) Blocked by required conditions
CI / Check generated parser (push) Waiting to run
CI / golangci-lint (push) Waiting to run
CI / fuzzing (push) Waiting to run
CI / codeql (push) Waiting to run
CI / Publish main branch artifacts (push) Blocked by required conditions
CI / Publish release artefacts (push) Blocked by required conditions
CI / Publish UI on npm Registry (push) Blocked by required conditions
Scorecards supply-chain security / Scorecards analysis (push) Waiting to run
* Update golangci-lint to v2.6.0
* Fixup various linting issues.
* Fixup deprecations.
* Add exception for `labels.MetricName` deprecation.

Signed-off-by: SuperQ <superq@gmail.com>
2025-11-05 13:47:34 +01:00
Ben Kochie
48956f60d7
Update modernize (#17471)
Apply additional Go modernize tool improvements.

Signed-off-by: SuperQ <superq@gmail.com>
2025-11-04 05:13:49 +00:00
beorn7
747c5ee2b1 Apply analyzer "modernize" to the whole codebase
See
https://pkg.go.dev/golang.org/x/tools/gopls/internal/analysis/modernize
for details.

This ran into a few issues (arguably bugs in the modernize tool),
which I will fix in the next commit, so that we have transparency what
was done automatically.

Beyond those hiccups, I believe all the changes applied are
legitimate. Even where there might be no tangible direct gain, I would
argue it's still better to use the "modern" way to avoid micro
discussions in tiny style PRs later.

Signed-off-by: beorn7 <beorn@grafana.com>
2025-08-27 14:48:41 +02:00
Matthieu MOREL
c7d4b53ec1 chore: enable unused-parameter from revive
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-02-19 19:50:28 +01:00
Bryan Boreham
5571c7dc98 FastRegexMatcher: use stack memory for lowercase copy of string
Up to 32-byte values this saves garbage, runs faster.
For prefixes, only `toLower` the part we need for the map lookup.

Split toNormalisedLower into fast and slow paths, to avoid a penalty
for the `copy` call in the case where no allocations are done.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-10-28 16:28:58 +00:00
Mario Fernandez
5814920601
Fix: optimize .* regexp performance
Shortcut for `.*` matches newlines as well.
Add preamble change ^(?s:
Add test
dotAll flag por al regex
Add and fix regex tests

Signed-off-by: Mario Fernandez <mariofer@redhat.com>
2024-09-17 12:18:31 +02:00
Bryan Boreham
82a8c6abe2
[ENHANCEMENT] Optimize regexps with multiple prefixes (#13843)
For example `foo.*|bar.*|baz.*`. Instead of checking each one in turn,
we build a map of prefixes, then check the smaller set that could match
the string supplied.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Improve testing and readability

Address review comments on #13843

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2024-07-03 18:45:36 +01:00
Marco Pracucci
ec31acaf02
FastRegexMatcher: small optimization for the literal prefix case
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2024-07-01 10:12:50 +02:00
Oleg Zaytsev
4f78cc809c
Refactor toNormalisedLower: shorter and slightly faster. (#14299)
Refactor toNormalisedLower: shorter and slightly faster

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-06-18 09:57:37 +00:00
Ranveer Avhad
39902ba694
[BUGFIX] FastRegexpMatcher: do Unicode normalization as part of case-insensitive comparison (#14170)
* Converted string to standarized form
* Added golang.org/x/text in Go dependencies
* Added test cases for FastRegexMatcher
* Added benchmark for toNormalizedLower

Signed-off-by: RA <ranveeravhad777@gmail.com>
2024-06-10 18:31:41 -04:00
Marco Pracucci
a0807733be
Improved tests
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2024-06-04 10:34:15 +02:00
Marco Pracucci
78fdd2188d
Improve contains check done by FastRegexMatcher
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2024-06-04 10:34:15 +02:00
Oleg Zaytsev
8b4c9459a2
Check utf8.RuneError result
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-05-13 17:44:26 +02:00
Oleg Zaytsev
dbe88fae22
Add invalid utf8 test cases to regexp
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-05-13 17:05:31 +02:00
Oleg Zaytsev
fdfc6d4725
Benchmark zeroOrOneCharacterStringMatcher.Matches
This adds some more test cases for unicode values, and also a benchmark
for zeroOrOneCharacterStringMatcher.Matches()

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-05-13 15:36:55 +02:00
Oleg Zaytsev
2524a91591
Fix FastRegexMatcher matching multibyte runes with . (#14059)
When `zeroOrOneCharacterStringMatcher` wach checking the input string,
it assumed that if there are more than one bytes, then there are more
than one runes, but that's not necessarily true.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-05-07 16:33:37 +02:00
Bryan Boreham
7c28521451 [TESTS] Truncate some long test names, for readability
The strings produced by these tests can run to thousands of characters,
which makes test logs difficult to read.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-04-03 10:10:39 +01:00
carehabit
a672662073
all: fix some typos (#13863)
Signed-off-by: carehabit <shenyuting@outlook.com>
2024-04-01 18:06:05 +02:00
Marco Pracucci
bfec57bd2e
Further optimise FastRegexMatcher
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2024-01-25 10:40:57 +01:00
Bryan Boreham
579331446a
Allow downstream projects to use faster regexp (#10251)
* Add benchmark for FastRegexMatcher

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Use modified regexp package with optimisations

See https://github.com/grafana/regexp/tree/speedup#readme

Includes the following changes proposed upstream:
* [regexp: allow patterns with no alternates to be one-pass](https://go-review.googlesource.com/c/go/+/353711)
* [regexp: speed up onepass prefix check](https://go-review.googlesource.com/c/go/+/354909)
* [regexp: handle prefix string with fold-case](https://go-review.googlesource.com/c/go/+/358756)
* [regexp: avoid copying each instruction executed](https://go-review.googlesource.com/c/go/+/355789)
* [regexp: allow prefix string anchored at beginning](https://go-review.googlesource.com/c/go/+/377294)

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Use regexp code identical to upstream Go

Change `grafana/regexp` import to use `main` branch.

This means Prometheus is not using the proposed optimisations, but
downstream users of Prometheus code are able to `replace` the library
with the `speedup` branch which does.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-02-08 11:03:20 +01:00
beorn7
c954cd9d1d Move packages out of deprecated pkg directory
This creates a new `model` directory and moves all data-model related
packages over there:
  exemplar labels relabel rulefmt textparse timestamp value

All the others are more or less utilities and have been moved to `util`:
  gate logging modetimevfs pool runtime

Signed-off-by: beorn7 <beorn@grafana.com>
2021-11-09 08:03:10 +01:00
Renamed from pkg/labels/regexp_test.go (Browse further)