mirror of https://github.com/redis/redis.git, synced 2026-06-11 01:40:25 -04:00
Implement the new Redis Array type (#15162)
# Redis Array

For years, Redis has been missing a real indexed data structure for the use cases where the index and the spatial relationship of elements are semantic. Hashes give you random lookups, but you have to store an index as a key, and have no range visibility. Lists give you appending and trimming, but what is in the middle remains hard to access. Streams give you append-only events, which is another (useful, indeed) beast. None of these is what you want when the *position itself* has business meaning: slot 37, step 4, row 18552, day from 2934 to 2949, file line 11, 12, 15 and so forth. And all those types, for different reasons, are suboptimal when you want a **ring buffer** able to store the latest N observed samples of something.

Up to now, users found ways (they always do \o/) using the fact that the data structures that are obvious in this universe are also extremely powerful, if well implemented. But this forces compromises. Arrays handle these index-first requirements natively, and usually with much better memory and CPU usage than the workarounds. If the use case is the right one, Arrays often provide much better space, time and usability at the same time.

## Internal encoding

1. When dense, an Array is essentially a fancier C array. You don't pay anything for storing the index.
2. Yet, instead of going really flat, arrays are sliced into 4096-element slices, and each slice, when it contains just a few elements, uses a special sparse encoding. When a slice is empty it's just a `NULL` stored in the directory.
3. Small ints, floats, and short strings are pointer-tagged, so they cost zero additional memory beyond the pointer slot itself.
4. When very sparse, a super-directory of windowed directories is used. This allows the data type to be safe, instead of exhibiting pathological space or time behavior. This representation is only triggered when there are more than 8 million elements or very high indexes set.
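As a rough illustration of points 1 and 2 of the internal encoding, here is a minimal C sketch of a directory of 4096-element slices in which an empty slice is just a `NULL` pointer. Everything in it is an assumption made for illustration: the names `sketch_array`, `sketch_set` and `sketch_get` are hypothetical, slices are always dense, and the sparse slice encoding, pointer tagging and super-directory described in the commit message are not modeled. It is not the code added by this commit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SLICE_SIZE 4096  /* elements per slice, as stated in the commit message */

/* Hypothetical layout: a flat directory of slice pointers.
 * A directory entry is NULL while its slice holds no element. */
typedef struct {
    void ***dir;    /* dir[i] is NULL or an array of SLICE_SIZE value pointers */
    size_t dirlen;  /* number of directory entries */
} sketch_array;

/* Store val at idx, growing the directory and allocating the slice on demand.
 * Returns 0 on success, -1 on allocation failure. */
int sketch_set(sketch_array *a, uint64_t idx, void *val) {
    assert(a != NULL);
    size_t slice = (size_t)(idx / SLICE_SIZE);
    size_t off = (size_t)(idx % SLICE_SIZE);
    if (slice >= a->dirlen) {
        size_t newlen = slice + 1;
        void ***nd = realloc(a->dir, newlen * sizeof(*nd));
        if (nd == NULL) return -1;
        for (size_t i = a->dirlen; i < newlen; i++) nd[i] = NULL;
        a->dir = nd;
        a->dirlen = newlen;
    }
    if (a->dir[slice] == NULL) {
        void **s = malloc(SLICE_SIZE * sizeof(*s));
        if (s == NULL) return -1;
        for (size_t i = 0; i < SLICE_SIZE; i++) s[i] = NULL;
        a->dir[slice] = s;
    }
    a->dir[slice][off] = val;
    return 0;
}

/* Return the value at idx, or NULL when the slot or its whole slice is empty. */
void *sketch_get(const sketch_array *a, uint64_t idx) {
    assert(a != NULL);
    size_t slice = (size_t)(idx / SLICE_SIZE);
    size_t off = (size_t)(idx % SLICE_SIZE);
    if (slice >= a->dirlen || a->dir[slice] == NULL) return NULL;
    return a->dir[slice][off];
}
```

Starting from a zero-initialized `sketch_array`, setting indexes 3 and 18552 allocates only slices 0 and 4; the directory entries in between stay `NULL`. Note that this flat directory still grows with the highest index set, which is presumably the kind of pathological behavior the super-directory in point 4 is meant to avoid.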
## Use cases

Arrays are mostly stateless if not for the fact that each array remembers the index of the latest added item, allowing `ARINSERT` and `ARRING` to work properly. Otherwise it is a set/get at this index game, with solid support for both setting / getting ranges, server-side scanning, returning only populated elements in a time which is proportional not to the range size, but to the population size.

A few concrete examples, that may work as mental models for the set of problems that are similar to them (from the POV of the data modeling).

**Thermometer.** A sensor reporting once per minute, with gaps:

```
ARSET temp:room12:day7 123 22.3
ARGETRANGE temp:room12:day7 600 660  # the 10:00–11:00 window, with NULLs
ARSCAN temp:room12:day7 600 660  # only populated elements
AROP temp:room12:day7 0 1439 MAX  # peak of the day, server-side
```

Missing minutes cost little to nothing. Numeric aggregation runs inside Redis. Telemetry, IoT, meter readings, KPI rollups.

**Calendar.** A clinic with 96 fifteen-minute slots per day:

```
ARSET sched:room12:day 32 booking:991
ARSCAN sched:room12:day 0 95  # only occupied slots
ARGETRANGE sched:room12:day 48 63  # the afternoon full view to render
```

The slot number is the business key in this case. Room booking, parking spaces, warehouse bins, lockers, ...

**Ring buffer.** ARRING replaces the classic LPUSH+LTRIM pattern. Imagine remote `dmesg`.

```
ARRING machine:123 200 "[141087.430123]: arm_cpu_init(): cpu 14 online"  # Capped to 200 entries
ARLASTITEMS machine:123 50 REV  # 50 newest first
```

Faster than LPUSH+LTRIM, and it keeps indexed access to past elements. Last-N alarms, recent fraud scores, access history, remote logs, device events. Ok here the use cases are mainly the ones of the old pattern: it is just a better fit and allows accessing random items in the middle, aggregating server-side, and so forth.

**Workflow.** Step number is the index, value is the status. Gaps are meaningful:

```
ARSET claim:99172 0 received
ARSET claim:99172 3 waiting:reviewer42
ARSET claim:99172 5 approved
ARGETRANGE claim:99172 0 5  # full workflow view, with NULLs for missing steps
ARSCAN claim:99172 0 5  # only steps that have a state
ARCOUNT claim:99172  # number of recorded steps
ARLEN claim:99172  # highest reached step + 1
```

**Skills knowledge base for agents.** Arrays are good at representing / grepping into Markdown files:

```
ARSET skill:metal_gpu 0 "...."
ARSET skill:metal_gpu 1 "...."
ARSET skill:metal_gpu 2 "...."
ARGREP skill:metal_gpu - + RE "M3|M4" WITHVALUES
```

ARGREP has EXACT, MATCH, GLOB, RE; you can have multiple predicates and select AND or OR behavior.

**Bulk import results.** Sparse row annotations over millions of rows / CSV / ...:

```
ARSET import:job551 18552 ERR:bad_email
ARSCAN import:job551 0 1000000  # Provides only rows that have something
```

## TLDR

If the position is part of the meaning, use an Array. If you want to aggregate or grep remotely, use an Array.

Feedback welcome :)

---------

Co-authored-by: debing.sun <debing.sun@redis.com>
Co-authored-by: Shubham S Taple <155555100+ShubhamTaple@users.noreply.github.com>
Co-authored-by: Yuan Wang <yuan.wang@redis.com>
Co-authored-by: Marc Gravell <marc.gravell@gmail.com>
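As a mental model for the ring buffer use case in the commit message, the following C sketch shows a capped buffer that remembers the index of the latest written slot, which is the only piece of state the message attributes to an Array. The names `ring_sketch`, `ring_push` and `ring_nth_newest` are hypothetical, and the wrap-around rule (write at the slot after the latest one, modulo the capacity) is an assumption about how a command like `ARRING` could behave, not a description of the actual implementation.

```c
#include <assert.h>
#include <stddef.h>

#define RING_CAP 4  /* tiny capacity for illustration; the example above uses 200 */

/* Hypothetical model: a fixed array plus the index of the latest written slot. */
typedef struct {
    const char *slot[RING_CAP];
    size_t count;  /* number of populated slots, at most RING_CAP */
    size_t last;   /* index of the most recently written slot */
} ring_sketch;

/* Append an item, overwriting the oldest one once the ring is full. */
void ring_push(ring_sketch *r, const char *item) {
    assert(r != NULL);
    r->last = (r->count == 0) ? 0 : (r->last + 1) % RING_CAP;
    r->slot[r->last] = item;
    if (r->count < RING_CAP) r->count++;
}

/* Return the n-th newest item (0 is the newest), or NULL if it does not exist. */
const char *ring_nth_newest(const ring_sketch *r, size_t n) {
    assert(r != NULL);
    if (n >= r->count) return NULL;
    return r->slot[(r->last + RING_CAP - n) % RING_CAP];
}
```

Walking `n` from 0 upward corresponds to reading the newest items first, as `ARLASTITEMS ... REV` is described to do. Because every element keeps a plain index, older entries remain addressable directly, which the LPUSH+LTRIM pattern does not offer cheaply.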
parent 3ab7fe0812
commit 0d9576435f
86 changed files with 22258 additions and 39 deletions

.gitignore (vendored, 1 line changed)
@@ -30,6 +30,7 @@ deps/lua/src/luac
deps/lua/src/liblua.a
deps/hdr_histogram/libhdrhistogram.a
deps/fpconv/libfpconv.a
deps/tre/libtre.a
tests/tls/*
.make-*
.prerequisites

deps/Makefile (vendored, 8 lines changed)
@@ -59,6 +59,7 @@ distclean:
	-(cd jemalloc && [ -f Makefile ] && $(MAKE) distclean) > /dev/null || true
	-(cd hdr_histogram && $(MAKE) clean) > /dev/null || true
	-(cd fpconv && $(MAKE) clean) > /dev/null || true
	-(cd tre && $(MAKE) clean) > /dev/null || true
	-(cd xxhash && $(MAKE) clean) > /dev/null || true
	-(rm -f .make-*)

@@ -94,6 +95,13 @@ fpconv: .make-prerequisites

.PHONY: fpconv

tre: .make-prerequisites
	@printf '%b %b\n' $(MAKECOLOR)MAKE$(ENDCOLOR) $(BINCOLOR)$@$(ENDCOLOR)
	cd tre && $(MAKE) CFLAGS="$(DEPS_CFLAGS)" LDFLAGS="$(DEPS_LDFLAGS)"

.PHONY: tre


XXHASH_CFLAGS = -fPIC $(DEPS_CFLAGS)
xxhash: .make-prerequisites
	@printf '%b %b\n' $(MAKECOLOR)MAKE$(ENDCOLOR) $(BINCOLOR)$@$(ENDCOLOR)

deps/tre/LICENSE (vendored, new file, 29 lines changed)
@@ -0,0 +1,29 @@
This is the license, copyright notice, and disclaimer for TRE, a regex
matching package (library and tools) with support for approximate
matching.

Copyright (c) 2001-2009 Ville Laurikari <vl@iki.fi>
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

  1. Redistributions of source code must retain the above copyright
     notice, this list of conditions and the following disclaimer.

  2. Redistributions in binary form must reproduce the above copyright
     notice, this list of conditions and the following disclaimer in the
     documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER AND CONTRIBUTORS
``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

deps/tre/Makefile (vendored, new file, 79 lines changed)
@@ -0,0 +1,79 @@
STD= -std=c99
WARN= -Wall
OPT= -Os

ifeq ($(SANITIZER),address)
    CFLAGS+=-fsanitize=address -fno-sanitize-recover=all -fno-omit-frame-pointer
    LDFLAGS+=-fsanitize=address
else
ifeq ($(SANITIZER),undefined)
    CFLAGS+=-fsanitize=undefined -fno-sanitize-recover=all -fno-omit-frame-pointer
    LDFLAGS+=-fsanitize=undefined
else
ifeq ($(SANITIZER),thread)
    CFLAGS+=-fsanitize=thread -fno-sanitize-recover=all -fno-omit-frame-pointer
    LDFLAGS+=-fsanitize=thread
else
ifeq ($(SANITIZER),memory)
    CFLAGS+=-fsanitize=memory -fsanitize-memory-track-origins=2 -fno-sanitize-recover=all -fno-omit-frame-pointer
    LDFLAGS+=-fsanitize=memory
endif
endif
endif
endif

R_CFLAGS= $(STD) $(WARN) $(OPT) $(DEBUG) $(CFLAGS) -DTRE_REGEX_T_FIELD=value -Ilocal_includes -Ilib
R_LDFLAGS= $(LDFLAGS)
DEBUG= -g

R_CC=$(CC) $(R_CFLAGS)
R_LD=$(CC) $(R_LDFLAGS)

AR= ar
ARFLAGS= rcs

TRE_OBJ=lib/regcomp.o lib/regerror.o lib/regexec.o lib/tre-ast.o lib/tre-compile.o \
        lib/tre-filter.o lib/tre-match-backtrack.o lib/tre-match-parallel.o \
        lib/tre-mem.o lib/tre-parse.o lib/tre-stack.o lib/xmalloc.o
TRE_TESTS=tests/retest tests/test-str-source tests/test-literal-opt tests/test-malformed-regn

libtre.a: $(TRE_OBJ)
	$(AR) $(ARFLAGS) $@ $+

check: $(TRE_TESTS)
	@set -e; \
	for test in $(TRE_TESTS); do \
		echo "TEST $$test"; \
		./$$test; \
	done

tests/retest: tests/retest.c libtre.a
	$(R_LD) $(R_CFLAGS) -DHAVE_REGNEXEC -DHAVE_REGNCOMP -o $@ $< libtre.a

tests/test-str-source: tests/test-str-source.c libtre.a
	$(R_LD) $(R_CFLAGS) -o $@ $< libtre.a

tests/test-literal-opt: tests/test-literal-opt.c libtre.a
	$(R_LD) $(R_CFLAGS) -o $@ $< libtre.a

tests/test-malformed-regn: tests/test-malformed-regn.c libtre.a
	$(R_LD) $(R_CFLAGS) -o $@ $< libtre.a

lib/regcomp.o: lib/regcomp.c local_includes/tre.h local_includes/tre-config.h lib/tre-internal.h lib/xmalloc.h
lib/regerror.o: lib/regerror.c local_includes/tre.h
lib/regexec.o: lib/regexec.c local_includes/tre.h lib/tre-internal.h lib/xmalloc.h
lib/tre-ast.o: lib/tre-ast.c lib/tre-ast.h lib/tre-internal.h
lib/tre-compile.o: lib/tre-compile.c lib/tre-compile.h lib/tre-internal.h lib/tre-mem.h lib/tre-parse.h lib/tre-stack.h lib/xmalloc.h
lib/tre-filter.o: lib/tre-filter.c lib/tre-filter.h lib/tre-internal.h
lib/tre-match-backtrack.o: lib/tre-match-backtrack.c lib/tre-internal.h lib/tre-match-utils.h lib/tre-mem.h lib/tre-stack.h
lib/tre-match-parallel.o: lib/tre-match-parallel.c lib/tre-internal.h lib/tre-match-utils.h lib/tre-mem.h
lib/tre-mem.o: lib/tre-mem.c lib/tre-mem.h
lib/tre-parse.o: lib/tre-parse.c lib/tre-ast.h lib/tre-compile.h lib/tre-filter.h lib/tre-internal.h lib/tre-mem.h lib/tre-parse.h lib/tre-stack.h lib/xmalloc.h
lib/tre-stack.o: lib/tre-stack.c lib/tre-internal.h lib/tre-stack.h
lib/xmalloc.o: lib/xmalloc.c lib/xmalloc.h

.c.o:
	$(R_CC) -c -o $@ $<

clean:
	rm -f $(TRE_OBJ) libtre.a $(TRE_TESTS)

deps/tre/README.md (vendored, new file, 276 lines changed)
@@ -0,0 +1,276 @@
Introduction
============

TRE is a lightweight, robust, and efficient POSIX compliant regexp
matching library with some exciting features such as approximate
(fuzzy) matching.

The matching algorithm used in TRE uses linear worst-case time in
the length of the text being searched, and quadratic worst-case
time in the length of the used regular expression.

In other words, the time complexity of the algorithm is O(M^2N), where
M is the length of the regular expression and N is the length of the
text. The used space is also quadratic on the length of the regex,
but does not depend on the searched string. This quadratic behaviour
occurs only on pathological cases which are probably very rare in
practice.


Hacking
=======

Here's how to work with this code.

Prerequisites
-------------

You will need the following tools installed on your system:

- autoconf
- automake
- gettext (including autopoint)
- libtool
- zip (optional)


Building
--------

First, prepare the tree. Change to the root of the source directory
and run

    ./utils/autogen.sh

This will regenerate various things using the prerequisite tools so
that you end up with a buildable tree.

After this, you can run the configure script and build TRE as usual:

    ./configure
    make
    make check
    make install


Building a source code package
------------------------------

In a prepared tree, this command creates a source code tarball:

    ./configure && make dist

Alternatively, you can run

    ./utils/build-sources.sh

which builds the source code packages and puts them in the `dist`
subdirectory. This script needs a working `zip` command.


Features
========

TRE is not just yet another regexp matcher. TRE has some features
which are not there in most free POSIX compatible implementations.
Most of these features are not present in non-free implementations
either, for that matter.

Approximate matching
--------------------

Approximate pattern matching allows matches to be approximate, that
is, allows the matches to be close to the searched pattern under some
measure of closeness. TRE uses the edit-distance measure (also known
as the Levenshtein distance) where characters can be inserted,
deleted, or substituted in the searched text in order to get an exact
match.

Each insertion, deletion, or substitution adds the distance, or cost,
of the match. TRE can report the matches which have a cost lower than
some given threshold value. TRE can also be used to search for
matches with the lowest cost.

TRE includes a version of the agrep (approximate grep) command line
tool for approximate regexp matching in the style of grep. Unlike
other agrep implementations (like the one by Sun Wu and Udi Manber
from University of Arizona) TRE agrep allows full regexps of any
length, any number of errors, and non-uniform costs for insertion,
deletion and substitution.

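To make the edit-distance measure concrete, here is a small C sketch of the classic dynamic-programming computation of the unit-cost Levenshtein distance between two plain strings. It is an editorial illustration, not part of the vendored README and not TRE's algorithm: TRE matches full regexps approximately and supports non-uniform costs, while this sketch only compares two literal strings with a cost of 1 per insertion, deletion or substitution. The function name `edit_distance` is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Unit-cost Levenshtein distance between a and b, using a single row of
 * the DP table. Returns -1 on allocation failure. */
long edit_distance(const char *a, const char *b) {
    assert(a != NULL && b != NULL);
    size_t la = strlen(a), lb = strlen(b);
    size_t *row = malloc((lb + 1) * sizeof(*row));
    if (row == NULL) return -1;

    /* Distance from the empty prefix of a to each prefix of b. */
    for (size_t j = 0; j <= lb; j++) row[j] = j;

    for (size_t i = 1; i <= la; i++) {
        size_t diag = row[0];  /* value of D[i-1][j-1] */
        row[0] = i;            /* D[i][0]: delete i characters */
        for (size_t j = 1; j <= lb; j++) {
            size_t up = row[j];                          /* D[i-1][j] */
            size_t sub = diag + (a[i - 1] != b[j - 1]);  /* substitute or match */
            size_t del = up + 1;                         /* delete from a */
            size_t ins = row[j - 1] + 1;                 /* insert into a */
            size_t best = sub < del ? sub : del;
            row[j] = best < ins ? best : ins;
            diag = up;
        }
    }

    long d = (long)row[lb];
    free(row);
    return d;
}
```

For example, turning "kitten" into "sitting" takes two substitutions and one insertion, so the distance is 3. A threshold search in the style described above would accept a candidate whenever this cost is below the chosen limit.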
Strict standard conformance
---------------------------

POSIX defines the behaviour of regexp functions precisely. TRE
attempts to conform to these specifications as strictly as possible.
TRE always returns the correct matches for subpatterns, for example.
Very few other implementations do this correctly. In fact, the only
other implementations besides TRE that I am aware of (free or not)
that get it right are Rx by Tom Lord, Regex++ by John Maddock, and the
AT&T ast regex by Glenn Fowler and Doug McIlroy.

The standard TRE tries to conform to is the IEEE Std 1003.1-2001, or
Open Group Base Specifications Issue 6, commonly referred to as
“POSIX”. The relevant parts are the base specifications on regular
expressions (and the rationale) and the description of the `regcomp()`
API.

For an excellent survey on POSIX regexp matchers, see the testregex
pages by Glenn Fowler of AT&T Labs Research.

Predictable matching speed
--------------------------

Because of the matching algorithm used in TRE, the maximum time
consumed by any `regexec()` call is always directly proportional to
the length of the searched string. There is one exception: if back
references are used, the matching may take time that grows
exponentially with the length of the string. This is because matching
back references is an NP complete problem, and almost certainly
requires exponential time to match in the worst case.

Predictable and modest memory consumption
-----------------------------------------

A `regexec()` call never allocates memory from the heap. TRE allocates
all the memory it needs during a `regcomp()` call, and some temporary
working space from the stack frame for the duration of the `regexec()`
call. The amount of temporary space needed is constant during
matching and does not depend on the searched string. For regexps of
reasonable size TRE needs less than 50K of dynamically allocated
memory during the `regcomp()` call, less than 20K for the compiled
pattern buffer, and less than two kilobytes of temporary working space
from the stack frame during a `regexec()` call. There is no time /
memory tradeoff. TRE is also small in code size; statically linking
with TRE increases the executable size less than 30K (gcc-3.2, x86,
GNU/Linux).

Wide character and multibyte character set support
--------------------------------------------------

TRE supports multibyte character sets. This makes it possible to use
regexps seamlessly with, for example, Japanese locales. TRE also
provides a wide character API.

Binary pattern and data support
-------------------------------

TRE provides APIs which allow binary zero characters both in regexps
and searched strings. The standard API cannot be easily used to, for
example, search for printable words from binary data (although it is
possible with some hacking). Searching for patterns which contain
binary zeroes embedded is not possible at all with the standard API.

Completely thread safe
----------------------

TRE is completely thread safe. All the exported functions are
re-entrant, and a single compiled regexp object can be used
simultaneously in multiple contexts; e.g. in `main()` and a signal
handler, or in many threads of a multithreaded application.

Portable
--------

TRE is portable across multiple platforms. Below is a table of
platforms and compilers used to develop and test TRE:

<table>
  <tr><th>Platform</th> <th>Compiler</th></tr>
  <tr><td>FreeBSD 14.1</td> <td>Clang 18</td></tr>
  <tr><td>Ubuntu 22.04</td> <td>GCC 11</td></tr>
  <tr><td>macOS 14.6</td> <td>Clang 14</td></tr>
  <tr><td>Windows 11</td> <td>Microsoft Visual Studio 2022</td></tr>
</table>

TRE should compile without changes on most modern POSIX-like
platforms, and be easily portable to any platform with a hosted C
implementation.

Depending on the platform, you may need to install libutf8 to get
wide character and multibyte character set support.

Free
----

TRE is released under a license which is essentially the same as the
“2 clause” BSD-style license used in NetBSD. See the file LICENSE for
details.

Roadmap
-------

There are currently two features, both related to collating elements,
missing from 100% POSIX compliance. These are:

* Support for collating elements (e.g. `[[.\<X>.]]`, where `\<X>` is a
  collating element). It is not possible to support multi-character
  collating elements portably, since POSIX does not define a way to
  determine whether a character sequence is a multi-character
  collating element or not.

* Support for equivalence classes, for example `[[=\<X>=]]`, where
  `\<X>` is a collating element. An equivalence class matches any
  character which has the same primary collation weight as `\<X>`.
  Again, POSIX provides no portable mechanism for determining the
  primary collation weight of a collating element.

Note that other portable regexp implementations don't support
collating elements either. The single exception is Regex++, which
comes with its own database for collating elements for different
locales. Support for collating elements and equivalence classes has
not been widely requested and is not very high on the TODO list at the
moment.

These are other features I'm planning to implement real soon now:

* All the missing GNU extensions enabled in GNU regex, such as
  `[[:<:]]` and `[[:>:]]`.

* A `REG_SHORTEST` `regexec()` flag for returning the shortest match
  instead of the longest match.

* Perl-compatible syntax:
    * `[:^class:]`
      Matches anything but the characters in class. Note that
      `[^[:class:]]` works already, this would be just a convenience
      shorthand.

    * `\A`
      Match only at beginning of string.

    * `\Z`
      Match only at end of string, or before newline at the end.

    * `\z`
      Match only at end of string.

    * `\l`
      Lowercase next char (think vi).

    * `\u`
      Uppercase next char (think vi).

    * `\L`
      Lowercase till `\E` (think vi).

    * `\U`
      Uppercase till `\E` (think vi).

    * `(?=pattern)`
      Zero-width positive look-ahead assertions.

    * `(?!pattern)`
      Zero-width negative look-ahead assertions.

    * `(?<=pattern)`
      Zero-width positive look-behind assertions.

    * `(?<!pattern)`
      Zero-width negative look-behind assertions.

Documentation especially for the nonstandard features of TRE, such as
approximate matching, is a work in progress (with “progress” loosely
defined...) If you want to find an extension to use, reading the
`include/tre/tre.h` header might provide some additional hints if you
are comfortable with C source code.

deps/tre/lib/regcomp.c (vendored, new file, 188 lines changed)
@@ -0,0 +1,188 @@
/*
  tre_regcomp.c - TRE POSIX compatible regex compilation functions.

  This software is released under a BSD-style license.
  See the file LICENSE for details and copyright.

*/

#ifdef HAVE_CONFIG_H
#include <config.h>
#endif /* HAVE_CONFIG_H */

#include <string.h>
#include <errno.h>
#include <stdlib.h>

#include "tre-internal.h"
#include "xmalloc.h"

int
tre_regncomp(regex_t *preg, const char *regex, size_t n, int cflags)
{
  int ret;
  if (n > TRE_MAX_RE)
    return REG_ESPACE;
#if TRE_WCHAR
  tre_char_t *wregex;
  size_t wlen;

  wregex = xmalloc(sizeof(tre_char_t) * (n + 1));
  if (wregex == NULL)
    return REG_ESPACE;

  /* If the current locale uses the standard single byte encoding of
     characters, we don't do a multibyte string conversion. If we did,
     many applications which use the default locale would break since
     the default "C" locale uses the 7-bit ASCII character set, and
     all characters with the eighth bit set would be considered invalid. */
#if TRE_MULTIBYTE
  if (TRE_MB_CUR_MAX == 1)
#endif /* TRE_MULTIBYTE */
    {
      size_t i;
      const unsigned char *str = (const unsigned char *)regex;
      tre_char_t *wstr = wregex;

      for (i = 0; i < n; i++)
        *(wstr++) = *(str++);
      wlen = n;
    }
#if TRE_MULTIBYTE
  else
    {
      size_t consumed;
      tre_char_t *wcptr = wregex;
#ifdef HAVE_MBSTATE_T
      mbstate_t state;
      memset(&state, '\0', sizeof(state));
#endif /* HAVE_MBSTATE_T */
      while (n > 0)
        {
          consumed = tre_mbrtowc(wcptr, regex, n, &state);

          switch (consumed)
            {
            case 0:
              if (*regex == '\0')
                consumed = 1;
              else
                {
                  xfree(wregex);
                  return REG_BADPAT;
                }
              break;
            case -1:
              DPRINT(("mbrtowc: error %d: %s.\n", errno, strerror(errno)));
              xfree(wregex);
              return REG_BADPAT;
            case -2:
              /* The last character wasn't complete. Let's not call it a
                 fatal error. */
              consumed = n;
              break;
            }
          regex += consumed;
          n -= consumed;
          wcptr++;
        }
      wlen = wcptr - wregex;
    }
#endif /* TRE_MULTIBYTE */

  wregex[wlen] = L'\0';
  ret = tre_compile(preg, wregex, wlen, cflags);
  xfree(wregex);
#else /* !TRE_WCHAR */
  ret = tre_compile(preg, (const tre_char_t *)regex, n, cflags);
#endif /* !TRE_WCHAR */

  return ret;
}

/* this version takes bytes literally, to be used with raw vectors */
int
tre_regncompb(regex_t *preg, const char *regex, size_t n, int cflags)
{
  int ret;
  if (n > TRE_MAX_RE)
    return REG_ESPACE;
#if TRE_WCHAR /* wide chars = we need to convert it all to the wide format */
  tre_char_t *wregex;
  size_t i;

  wregex = xmalloc(sizeof(tre_char_t) * n);
  if (wregex == NULL)
    return REG_ESPACE;

  for (i = 0; i < n; i++)
    wregex[i] = (tre_char_t) ((unsigned char) regex[i]);

  ret = tre_compile(preg, wregex, n, cflags | REG_USEBYTES);
  xfree(wregex);
#else /* !TRE_WCHAR */
  ret = tre_compile(preg, (const tre_char_t *)regex, n, cflags | REG_USEBYTES);
#endif /* !TRE_WCHAR */

  return ret;
}

int
tre_regcomp(regex_t *preg, const char *regex, int cflags)
{
  size_t n = regex ? strlen(regex) : 0;
  if (n > TRE_MAX_RE)
    return REG_ESPACE;
  return tre_regncomp(preg, regex, n, cflags);
}

int
tre_regcompb(regex_t *preg, const char *regex, int cflags)
{
  int ret;
  tre_char_t *wregex;
  size_t i, n = regex ? strlen(regex) : 0;
  const unsigned char *str = (const unsigned char *)regex;
  tre_char_t *wstr;

  if (n > TRE_MAX_RE)
    return REG_ESPACE;
  wregex = xmalloc(sizeof(tre_char_t) * (n + 1));
  if (wregex == NULL) return REG_ESPACE;
  wstr = wregex;

  for (i = 0; i < n; i++)
    *(wstr++) = *(str++);
  wregex[n] = L'\0';
  ret = tre_compile(preg, wregex, n, cflags | REG_USEBYTES);
  xfree(wregex);
  return ret;
}


#ifdef TRE_WCHAR
int
tre_regwncomp(regex_t *preg, const wchar_t *regex, size_t n, int cflags)
{
  if (n > TRE_MAX_RE)
    return REG_ESPACE;
  return tre_compile(preg, regex, n, cflags);
}

int
tre_regwcomp(regex_t *preg, const wchar_t *regex, int cflags)
{
  size_t n = regex ? wcslen(regex) : 0;
  if (n > TRE_MAX_RE)
    return REG_ESPACE;
  return tre_compile(preg, regex, n, cflags);
}
#endif /* TRE_WCHAR */

void
tre_regfree(regex_t *preg)
{
  tre_free(preg);
}

/* EOF */
86
deps/tre/lib/regerror.c
vendored
Normal file
86
deps/tre/lib/regerror.c
vendored
Normal file
|
|
@ -0,0 +1,86 @@
|
|||
/*
|
  tre_regerror.c - POSIX tre_regerror() implementation for TRE.

  This software is released under a BSD-style license.
  See the file LICENSE for details and copyright.

*/

#ifdef HAVE_CONFIG_H
#include <config.h>
#endif /* HAVE_CONFIG_H */

#include <string.h>
#ifdef HAVE_WCHAR_H
#include <wchar.h>
#endif /* HAVE_WCHAR_H */
#ifdef HAVE_WCTYPE_H
#include <wctype.h>
#endif /* HAVE_WCTYPE_H */

#include "tre-internal.h"

#ifdef HAVE_GETTEXT
#include <libintl.h>
#else
#define dgettext(p, s) s
#define gettext(s) s
#endif

#define _(String) dgettext(PACKAGE, String)
#define gettext_noop(String) String

#define xstr(s) str(s)
#define str(s) #s

/* Error message strings for error codes listed in `tre.h'. This list
   needs to be in sync with the codes listed there, naturally. */
static const char *tre_error_messages[] =
  { gettext_noop("No error"), /* REG_OK */
    gettext_noop("No match"), /* REG_NOMATCH */
    gettext_noop("Invalid regexp"), /* REG_BADPAT */
    gettext_noop("Unknown collating element"), /* REG_ECOLLATE */
    gettext_noop("Unknown character class name"), /* REG_ECTYPE */
    gettext_noop("Trailing backslash"), /* REG_EESCAPE */
    gettext_noop("Invalid back reference"), /* REG_ESUBREG */
    gettext_noop("Missing ']'"), /* REG_EBRACK */
    gettext_noop("Missing ')'"), /* REG_EPAREN */
    gettext_noop("Missing '}'"), /* REG_EBRACE */
    gettext_noop("Invalid contents of {}"), /* REG_BADBR */
    gettext_noop("Invalid character range"), /* REG_ERANGE */
    gettext_noop("Out of memory"), /* REG_ESPACE */
    gettext_noop("Invalid use of repetition operators"), /* REG_BADRPT */
    gettext_noop("Maximum repetition in {} larger than " xstr(RE_DUP_MAX)), /* REG_BADMAX */
  };

size_t
tre_regerror(int errcode, const regex_t *preg, char *errbuf, size_t errbuf_size)
{
  const char *err;
  size_t err_len;

  /*LINTED*/(void)&preg;
  if (errcode >= 0
      && errcode < (int)(sizeof(tre_error_messages)
                         / sizeof(*tre_error_messages)))
    err = gettext(tre_error_messages[errcode]);
  else
    err = gettext("Unknown error");

  err_len = strlen(err) + 1;
  if (errbuf_size > 0 && errbuf != NULL)
    {
      if (err_len > errbuf_size)
        {
          strncpy(errbuf, err, errbuf_size - 1);
          errbuf[errbuf_size - 1] = '\0';
        }
      else
        {
          strcpy(errbuf, err);
        }
    }
  return err_len;
}

/* EOF */
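The buffer handling in `tre_regerror()` above (return the size needed for the whole message, copy only what fits, always NUL-terminate) can be sketched in isolation. This is a minimal illustration, not part of TRE: `copy_error_message` is a hypothetical helper name, and it covers only the copy step, not the error-code lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper mirroring the buffer handling of tre_regerror():
   returns the number of bytes needed for the full message, including the
   terminating NUL, and copies as much as fits into buf. */
size_t
copy_error_message(const char *msg, char *buf, size_t buf_size)
{
  size_t needed;

  assert(msg != NULL);
  needed = strlen(msg) + 1;

  if (buf_size > 0 && buf != NULL)
    {
      if (needed > buf_size)
        {
          /* Truncate: keep buf_size - 1 bytes and terminate by hand. */
          memcpy(buf, msg, buf_size - 1);
          buf[buf_size - 1] = '\0';
        }
      else
        memcpy(buf, msg, needed);
    }
  return needed;
}
```

A caller can pass a zero-sized buffer first to learn the required size, then allocate and call again; comparing the return value with the buffer size tells whether the message was truncated.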
584 deps/tre/lib/regexec.c vendored Normal file
@@ -0,0 +1,584 @@
/*
  tre_regexec.c - TRE POSIX compatible matching functions (and more).

  This software is released under a BSD-style license.
  See the file LICENSE for details and copyright.

*/

#ifdef HAVE_CONFIG_H
#include <config.h>
#endif /* HAVE_CONFIG_H */

#ifdef TRE_USE_ALLOCA
/* AIX requires this to be the first thing in the file. */
#ifndef __GNUC__
# if HAVE_ALLOCA_H
#  include <alloca.h>
# else
#  ifdef _AIX
 #pragma alloca
#  else
#   ifndef alloca /* predefined by HP cc +Olibcalls */
char *alloca ();
#   endif
#  endif
# endif
#endif
#endif /* TRE_USE_ALLOCA */

#include <assert.h>
#include <stdlib.h>
#include <string.h>
#ifdef HAVE_WCHAR_H
#include <wchar.h>
#endif /* HAVE_WCHAR_H */
#ifdef HAVE_WCTYPE_H
#include <wctype.h>
#endif /* HAVE_WCTYPE_H */
#ifndef TRE_WCHAR
#include <ctype.h>
#endif /* !TRE_WCHAR */
#ifdef HAVE_MALLOC_H
#include <malloc.h>
#endif /* HAVE_MALLOC_H */
#include <limits.h>

#include "tre-internal.h"
#include "xmalloc.h"

/* Literal alternatives are grouped by the first byte so the matcher can
 * reach the relevant candidates in O(1). In nocase mode the lookup uses the
 * same folded byte mapping that was applied at compile time. */
static void
tre_litopt_candidate_range(const tre_literal_opt_t *opt, unsigned char first_byte,
                           size_t *start, size_t *end)
{
  unsigned char key = opt->nocase ? opt->fold_map[first_byte] : first_byte;
  *start = opt->start_offsets[key];
  *end = opt->start_offsets[key + 1];
}

static int
tre_litopt_bytes_equal(const unsigned char *haystack,
                       const unsigned char *needle, size_t len,
                       const unsigned char *fold_map)
{
  size_t i;

  if (fold_map == NULL)
    return memcmp(haystack, needle, len) == 0;

  for (i = 0; i < len; i++)
    if (fold_map[haystack[i]] != needle[i])
      return 0;
  return 1;
}

static int
tre_litopt_contains_case(const unsigned char *haystack, size_t hay_len,
                         const unsigned char *needle, size_t needle_len,
                         int *match_end_ofs)
{
  const unsigned char *p;
  size_t remaining;

  if (needle_len > hay_len)
    return 0;

  p = haystack;
  remaining = hay_len;
  while (remaining >= needle_len)
    {
      p = memchr(p, needle[0], remaining - needle_len + 1);
      if (p == NULL)
        return 0;
      if (memcmp(p, needle, needle_len) == 0)
        {
          if (match_end_ofs != NULL)
            *match_end_ofs = (int)(p - haystack + needle_len);
          return 1;
        }
      remaining = hay_len - (size_t)(p - haystack) - 1;
      p++;
    }
  return 0;
}

/* Nocase substring matching is still byte-oriented, but scanning once and
 * only checking literals that share the same folded first byte avoids the
 * old O(haystack * literals) restart pattern. */
static int
tre_litopt_contains_nocase(const tre_literal_opt_t *opt,
                           const unsigned char *haystack, size_t hay_len,
                           int *match_end_ofs)
{
  size_t i, start, end, j;

  for (i = 0; i < hay_len; i++)
    {
      tre_litopt_candidate_range(opt, haystack[i], &start, &end);
      for (j = start; j < end; j++)
        {
          const tre_literal_opt_literal_t *lit = &opt->literals[j];
          if (lit->len <= hay_len - i
              && tre_litopt_bytes_equal(haystack + i, lit->data, lit->len,
                                        opt->fold_map))
            {
              if (match_end_ofs != NULL)
                *match_end_ofs = (int)(i + lit->len);
              return 1;
            }
        }
    }
  return 0;
}

static reg_errcode_t
tre_match_literal_opt(const tre_tnfa_t *tnfa, const char *string, size_t len,
                      int eflags, int *match_end_ofs)
{
  const tre_literal_opt_t *opt = &tnfa->literal_opt;
  const unsigned char *haystack = (const unsigned char *)string;
  size_t start = 0, end = opt->num_literals, i;
  const unsigned char *fold_map = opt->nocase ? opt->fold_map : NULL;

  if ((opt->mode == TRE_LITERAL_OPT_PREFIX
       || opt->mode == TRE_LITERAL_OPT_EXACT)
      && (eflags & REG_NOTBOL))
    return REG_NOMATCH;
  if ((opt->mode == TRE_LITERAL_OPT_SUFFIX
       || opt->mode == TRE_LITERAL_OPT_EXACT)
      && (eflags & REG_NOTEOL))
    return REG_NOMATCH;

  if ((opt->mode == TRE_LITERAL_OPT_EXACT
       || opt->mode == TRE_LITERAL_OPT_PREFIX)
      && len > 0)
    tre_litopt_candidate_range(opt, haystack[0], &start, &end);

  if (opt->mode == TRE_LITERAL_OPT_CONTAINS)
    {
      if (opt->nocase)
        return tre_litopt_contains_nocase(opt, haystack, len, match_end_ofs)
               ? REG_OK : REG_NOMATCH;

      for (i = 0; i < opt->num_literals; i++)
        {
          const tre_literal_opt_literal_t *lit = &opt->literals[i];
          if (tre_litopt_contains_case(haystack, len, lit->data, lit->len,
                                       match_end_ofs))
            return REG_OK;
        }
      return REG_NOMATCH;
    }

  for (i = start; i < end; i++)
    {
      const tre_literal_opt_literal_t *lit = &opt->literals[i];

      switch (opt->mode)
        {
        case TRE_LITERAL_OPT_EXACT:
          if (len == lit->len
              && tre_litopt_bytes_equal(haystack, lit->data, len, fold_map))
            {
              if (match_end_ofs != NULL)
                *match_end_ofs = (int)len;
              return REG_OK;
            }
          break;

        case TRE_LITERAL_OPT_PREFIX:
          if (len >= lit->len
              && tre_litopt_bytes_equal(haystack, lit->data, lit->len,
                                        fold_map))
            {
              if (match_end_ofs != NULL)
                *match_end_ofs = (int)lit->len;
              return REG_OK;
            }
          break;

        case TRE_LITERAL_OPT_SUFFIX:
          if (len >= lit->len
              && tre_litopt_bytes_equal(haystack + len - lit->len, lit->data,
                                        lit->len, fold_map))
            {
              if (match_end_ofs != NULL)
                *match_end_ofs = (int)len;
              return REG_OK;
            }
          break;

        case TRE_LITERAL_OPT_CONTAINS:
        case TRE_LITERAL_OPT_NONE:
          break;
        }
    }

  return REG_NOMATCH;
}


/* Fills the POSIX.2 regmatch_t array according to the TNFA tag and match
   endpoint values. */
void
tre_fill_pmatch(size_t nmatch, regmatch_t pmatch[], int cflags,
                const tre_tnfa_t *tnfa, int *tags, int match_eo)
{
  tre_submatch_data_t *submatch_data;
  unsigned int i, j;
  int *parents;

  i = 0;
  if (match_eo >= 0 && !(cflags & REG_NOSUB))
    {
      /* Construct submatch offsets from the tags. */
      DPRINT(("end tag = t%d = %d\n", tnfa->end_tag, match_eo));
      submatch_data = tnfa->submatch_data;
      while (i < tnfa->num_submatches && i < nmatch)
        {
          if (submatch_data[i].so_tag == tnfa->end_tag)
            pmatch[i].rm_so = match_eo;
          else
            pmatch[i].rm_so = tags[submatch_data[i].so_tag];

          if (submatch_data[i].eo_tag == tnfa->end_tag)
            pmatch[i].rm_eo = match_eo;
          else
            pmatch[i].rm_eo = tags[submatch_data[i].eo_tag];

          /* If either of the endpoints were not used, this submatch
             was not part of the match. */
          if (pmatch[i].rm_so == -1 || pmatch[i].rm_eo == -1)
            pmatch[i].rm_so = pmatch[i].rm_eo = -1;

          DPRINT(("pmatch[%d] = {t%d = %d, t%d = %d}\n", i,
                  submatch_data[i].so_tag, pmatch[i].rm_so,
                  submatch_data[i].eo_tag, pmatch[i].rm_eo));
          i++;
        }
      /* Reset all submatches that are not within all of their parent
         submatches. */
      i = 0;
      while (i < tnfa->num_submatches && i < nmatch)
        {
          if (pmatch[i].rm_eo == -1)
            assert(pmatch[i].rm_so == -1);
          assert(pmatch[i].rm_so <= pmatch[i].rm_eo);

          parents = submatch_data[i].parents;
          if (parents != NULL)
            for (j = 0; parents[j] >= 0; j++)
              {
                DPRINT(("pmatch[%d] parent %d\n", i, parents[j]));
                if (pmatch[i].rm_so < pmatch[parents[j]].rm_so
                    || pmatch[i].rm_eo > pmatch[parents[j]].rm_eo)
                  pmatch[i].rm_so = pmatch[i].rm_eo = -1;
              }
          i++;
        }
    }

  while (i < nmatch)
    {
      pmatch[i].rm_so = -1;
      pmatch[i].rm_eo = -1;
      i++;
    }
}


/*
  Wrapper functions for POSIX compatible regexp matching.
*/

int
tre_have_backrefs(const regex_t *preg)
{
  tre_tnfa_t *tnfa = (void *)preg->TRE_REGEX_T_FIELD;
  return tnfa->have_backrefs;
}

int
tre_have_approx(const regex_t *preg)
{
  tre_tnfa_t *tnfa = (void *)preg->TRE_REGEX_T_FIELD;
  return tnfa->have_approx;
}

static int
tre_match(const tre_tnfa_t *tnfa, const void *string, ssize_t len,
          tre_str_type_t type, size_t nmatch, regmatch_t pmatch[],
          int eflags)
{
  reg_errcode_t status;
  int *tags = NULL, eo;
  if (tnfa->num_tags > 0 && nmatch > 0)
    {
#ifdef TRE_USE_ALLOCA
      tags = alloca(sizeof(*tags) * tnfa->num_tags);
#else /* !TRE_USE_ALLOCA */
      tags = xmalloc(sizeof(*tags) * tnfa->num_tags);
#endif /* !TRE_USE_ALLOCA */
      if (tags == NULL)
        return REG_ESPACE;
    }

  if (type == STR_BYTE
      && tnfa->literal_opt.mode != TRE_LITERAL_OPT_NONE
      && (nmatch == 0 || (tnfa->cflags & REG_NOSUB))
#ifdef TRE_APPROX
      && !(eflags & REG_APPROX_MATCHER)
#endif /* TRE_APPROX */
      && !(eflags & REG_BACKTRACKING_MATCHER))
    {
      size_t byte_len = (len >= 0) ? (size_t)len : strlen((const char *)string);
      status = tre_match_literal_opt(tnfa, string, byte_len, eflags, &eo);

      /* Even when the caller asked for no submatches, regexec() still has to
       * clear any pmatch entries it was handed. The normal matcher path does
       * this through tre_fill_pmatch(), so mirror that behavior here. */
      if (status == REG_OK && nmatch > 0)
        tre_fill_pmatch(nmatch, pmatch, tnfa->cflags, tnfa, NULL, eo);

#ifndef TRE_USE_ALLOCA
      if (tags)
        xfree(tags);
#endif /* !TRE_USE_ALLOCA */
      return status;
    }

  /* Dispatch to the appropriate matcher. */
  if (tnfa->have_backrefs || eflags & REG_BACKTRACKING_MATCHER)
    {
      /* The regex has back references, use the backtracking matcher. */
      if (type == STR_USER)
        {
          const tre_str_source *source = string;
          if (source->rewind == NULL || source->compare == NULL)
            {
              /* The backtracking matcher requires rewind and compare
                 capabilities from the input stream. */
#ifndef TRE_USE_ALLOCA
              if (tags)
                xfree(tags);
#endif /* !TRE_USE_ALLOCA */
              return REG_BADPAT;
            }
        }
      status = tre_tnfa_run_backtrack(tnfa, string, len, type,
                                      tags, eflags, &eo);
    }
#ifdef TRE_APPROX
  else if (tnfa->have_approx || eflags & REG_APPROX_MATCHER)
    {
      /* The regex uses approximate matching, use the approximate matcher. */
      regamatch_t match;
      regaparams_t params;
      tre_regaparams_default(&params);
      params.max_err = 0;
      params.max_cost = 0;
      status = tre_tnfa_run_approx(tnfa, string, len, type, tags,
                                   &match, params, eflags, &eo);
    }
#endif /* TRE_APPROX */
  else
    {
      /* Exact matching, no back references, use the parallel matcher. */
      status = tre_tnfa_run_parallel(tnfa, string, len, type,
                                     tags, eflags, &eo);
    }

  if (status == REG_OK)
    /* A match was found, so fill the submatch registers. */
    tre_fill_pmatch(nmatch, pmatch, tnfa->cflags, tnfa, tags, eo);
#ifndef TRE_USE_ALLOCA
  if (tags)
    xfree(tags);
#endif /* !TRE_USE_ALLOCA */
  return status;
}

int
tre_regnexec(const regex_t *preg, const char *str, size_t len,
             size_t nmatch, regmatch_t pmatch[], int eflags)
{
  tre_tnfa_t *tnfa = (void *)preg->TRE_REGEX_T_FIELD;
  tre_str_type_t type = (TRE_MB_CUR_MAX == 1) ? STR_BYTE : STR_MBS;

  return tre_match(tnfa, str, len, type, nmatch, pmatch, eflags);
}

#ifdef TRE_USE_GNUC_REGEXEC_FPL
int
tre_regexec(const regex_t *preg, const char *str,
            size_t nmatch, regmatch_t pmatch[_Restrict_arr_ _REGEX_NELTS (nmatch)],
            int eflags)
#else
int
tre_regexec(const regex_t *preg, const char *str,
            size_t nmatch, regmatch_t pmatch[], int eflags)
#endif
{
  return tre_regnexec(preg, str, -1, nmatch, pmatch, eflags);
}

int
tre_regexecb(const regex_t *preg, const char *str,
             size_t nmatch, regmatch_t pmatch[], int eflags)
{
  tre_tnfa_t *tnfa = (void *)preg->TRE_REGEX_T_FIELD;

  return tre_match(tnfa, str, -1, STR_BYTE, nmatch, pmatch, eflags);
}

int
tre_regnexecb(const regex_t *preg, const char *str, size_t len,
              size_t nmatch, regmatch_t pmatch[], int eflags)
{
  tre_tnfa_t *tnfa = (void *)preg->TRE_REGEX_T_FIELD;

  return tre_match(tnfa, str, len, STR_BYTE, nmatch, pmatch, eflags);
}


#ifdef TRE_WCHAR

int
tre_regwnexec(const regex_t *preg, const wchar_t *str, size_t len,
              size_t nmatch, regmatch_t pmatch[], int eflags)
{
  tre_tnfa_t *tnfa = (void *)preg->TRE_REGEX_T_FIELD;
  return tre_match(tnfa, str, len, STR_WIDE, nmatch, pmatch, eflags);
}

int
tre_regwexec(const regex_t *preg, const wchar_t *str,
             size_t nmatch, regmatch_t pmatch[], int eflags)
{
  return tre_regwnexec(preg, str, -1, nmatch, pmatch, eflags);
}

#endif /* TRE_WCHAR */

int
tre_reguexec(const regex_t *preg, const tre_str_source *str,
             size_t nmatch, regmatch_t pmatch[], int eflags)
{
  tre_tnfa_t *tnfa = (void *)preg->TRE_REGEX_T_FIELD;
  return tre_match(tnfa, str, -1, STR_USER, nmatch, pmatch, eflags);
}


#ifdef TRE_APPROX

/*
  Wrapper functions for approximate regexp matching.
*/

static int
tre_match_approx(const tre_tnfa_t *tnfa, const void *string, ssize_t len,
                 tre_str_type_t type, regamatch_t *match, regaparams_t params,
                 int eflags)
{
  reg_errcode_t status;
  int *tags = NULL, eo;

  /* If the regexp does not use approximate matching features, the
     maximum cost is zero, and the approximate matcher isn't forced,
     use the exact matcher instead. */
  if (params.max_cost == 0 && !tnfa->have_approx
      && !(eflags & REG_APPROX_MATCHER))
    return tre_match(tnfa, string, len, type, match->nmatch, match->pmatch,
                     eflags);

  /* Back references are not supported by the approximate matcher. */
  if (tnfa->have_backrefs)
    return REG_BADPAT;

  if (tnfa->num_tags > 0 && match->nmatch > 0)
    {
#if TRE_USE_ALLOCA
      tags = alloca(sizeof(*tags) * tnfa->num_tags);
#else /* !TRE_USE_ALLOCA */
      tags = xmalloc(sizeof(*tags) * tnfa->num_tags);
#endif /* !TRE_USE_ALLOCA */
      if (tags == NULL)
        return REG_ESPACE;
    }
  status = tre_tnfa_run_approx(tnfa, string, len, type, tags,
                               match, params, eflags, &eo);
  if (status == REG_OK)
    tre_fill_pmatch(match->nmatch, match->pmatch, tnfa->cflags, tnfa, tags, eo);
#ifndef TRE_USE_ALLOCA
  if (tags)
    xfree(tags);
#endif /* !TRE_USE_ALLOCA */
  return status;
}

int
tre_reganexec(const regex_t *preg, const char *str, size_t len,
              regamatch_t *match, regaparams_t params, int eflags)
{
  tre_tnfa_t *tnfa = (void *)preg->TRE_REGEX_T_FIELD;
  tre_str_type_t type = (TRE_MB_CUR_MAX == 1) ? STR_BYTE : STR_MBS;

  return tre_match_approx(tnfa, str, len, type, match, params, eflags);
}

int
tre_regaexec(const regex_t *preg, const char *str,
             regamatch_t *match, regaparams_t params, int eflags)
{
  return tre_reganexec(preg, str, -1, match, params, eflags);
}

int
tre_regaexecb(const regex_t *preg, const char *str,
              regamatch_t *match, regaparams_t params, int eflags)
{
  tre_tnfa_t *tnfa = (void *)preg->TRE_REGEX_T_FIELD;

  return tre_match_approx(tnfa, str, -1, STR_BYTE, match, params, eflags);
}

#ifdef TRE_WCHAR

int
tre_regawnexec(const regex_t *preg, const wchar_t *str, size_t len,
               regamatch_t *match, regaparams_t params, int eflags)
{
  tre_tnfa_t *tnfa = (void *)preg->TRE_REGEX_T_FIELD;
  return tre_match_approx(tnfa, str, len, STR_WIDE,
                          match, params, eflags);
}

int
tre_regawexec(const regex_t *preg, const wchar_t *str,
              regamatch_t *match, regaparams_t params, int eflags)
{
  return tre_regawnexec(preg, str, -1, match, params, eflags);
}

#endif /* TRE_WCHAR */

void
tre_regaparams_default(regaparams_t *params)
{
  memset(params, 0, sizeof(*params));
  params->cost_ins = 1;
  params->cost_del = 1;
  params->cost_subst = 1;
  params->max_cost = INT_MAX;
  params->max_ins = INT_MAX;
  params->max_del = INT_MAX;
  params->max_subst = INT_MAX;
  params->max_err = INT_MAX;
}

#endif /* TRE_APPROX */

/* EOF */
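The first-byte grouping used by `tre_litopt_candidate_range()` relies on an offsets table that is built at compile time, which this file does not show. The sketch below is one plausible way to build and query such a table, using a counting pass followed by a prefix sum. All `sketch_*` names and the 257-entry layout are assumptions made for illustration; the real `tre_literal_opt_t` may differ, and case folding is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the literal table: literals are
   sorted by first byte, and start_offsets[b] .. start_offsets[b + 1]
   delimits the literals whose first byte is b. */
typedef struct {
  const char *data;
  size_t len;
} sketch_literal_t;

typedef struct {
  const sketch_literal_t *literals;
  size_t num_literals;
  size_t start_offsets[257];
} sketch_table_t;

/* Build the offsets with a counting pass followed by a prefix sum.
   Assumes `literals' is already sorted by first byte and none is empty. */
void
sketch_table_init(sketch_table_t *t, const sketch_literal_t *literals, size_t n)
{
  size_t counts[256] = {0};
  size_t i, sum = 0;

  assert(t != NULL && (literals != NULL || n == 0));
  for (i = 0; i < n; i++)
    {
      assert(literals[i].len > 0);
      counts[(unsigned char)literals[i].data[0]]++;
    }
  for (i = 0; i < 256; i++)
    {
      t->start_offsets[i] = sum;
      sum += counts[i];
    }
  t->start_offsets[256] = sum;
  t->literals = literals;
  t->num_literals = n;
}

/* Return 1 if some literal is a prefix of the haystack, looking only at
   the bucket selected by the first haystack byte. */
int
sketch_prefix_match(const sketch_table_t *t, const char *hay, size_t hay_len)
{
  size_t i, start, end;

  assert(t != NULL && hay != NULL);
  if (hay_len == 0)
    return 0;
  start = t->start_offsets[(unsigned char)hay[0]];
  end = t->start_offsets[(unsigned char)hay[0] + 1];
  for (i = start; i < end; i++)
    if (t->literals[i].len <= hay_len
        && memcmp(hay, t->literals[i].data, t->literals[i].len) == 0)
      return 1;
  return 0;
}
```

The extra 257th entry lets the lookup read `start_offsets[key + 1]` for every byte value without a special case, which matches the `key + 1` access in the function above.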
226 deps/tre/lib/tre-ast.c vendored Normal file
@@ -0,0 +1,226 @@
/*
  tre-ast.c - Abstract syntax tree (AST) routines

  This software is released under a BSD-style license.
  See the file LICENSE for details and copyright.

*/

#ifdef HAVE_CONFIG_H
#include <config.h>
#endif /* HAVE_CONFIG_H */
#include <assert.h>

#include "tre-ast.h"
#include "tre-mem.h"

tre_ast_node_t *
tre_ast_new_node(tre_mem_t mem, tre_ast_type_t type, size_t size)
{
  tre_ast_node_t *node;

  node = tre_mem_calloc(mem, sizeof(*node));
  if (!node)
    return NULL;
  node->obj = tre_mem_calloc(mem, size);
  if (!node->obj)
    return NULL;
  node->type = type;
  node->nullable = -1;
  node->submatch_id = -1;

  return node;
}

tre_ast_node_t *
tre_ast_new_literal(tre_mem_t mem, int code_min, int code_max)
{
  tre_ast_node_t *node;
  tre_literal_t *lit;

  node = tre_ast_new_node(mem, LITERAL, sizeof(tre_literal_t));
  if (!node)
    return NULL;
  lit = node->obj;
  lit->code_min = code_min;
  lit->code_max = code_max;
  lit->position = -1;

  return node;
}

tre_ast_node_t *
tre_ast_new_iter(tre_mem_t mem, tre_ast_node_t *arg, int min, int max,
                 int minimal)
{
  tre_ast_node_t *node;
  tre_iteration_t *iter;

  node = tre_ast_new_node(mem, ITERATION, sizeof(tre_iteration_t));
  if (!node)
    return NULL;
  iter = node->obj;
  iter->arg = arg;
  iter->min = min;
  iter->max = max;
  iter->minimal = minimal;
  node->num_submatches = arg->num_submatches;

  return node;
}

tre_ast_node_t *
tre_ast_new_union(tre_mem_t mem, tre_ast_node_t *left, tre_ast_node_t *right)
{
  tre_ast_node_t *node;

  node = tre_ast_new_node(mem, UNION, sizeof(tre_union_t));
  if (node == NULL)
    return NULL;
  ((tre_union_t *)node->obj)->left = left;
  ((tre_union_t *)node->obj)->right = right;
  node->num_submatches = left->num_submatches + right->num_submatches;

  return node;
}

tre_ast_node_t *
tre_ast_new_catenation(tre_mem_t mem, tre_ast_node_t *left,
                       tre_ast_node_t *right)
{
  tre_ast_node_t *node;

  node = tre_ast_new_node(mem, CATENATION, sizeof(tre_catenation_t));
  if (node == NULL)
    return NULL;
  ((tre_catenation_t *)node->obj)->left = left;
  ((tre_catenation_t *)node->obj)->right = right;
  node->num_submatches = left->num_submatches + right->num_submatches;

  return node;
}

#ifdef TRE_DEBUG

static void
tre_findent(FILE *stream, int i)
{
  while (i-- > 0)
    fputc(' ', stream);
}

void
tre_print_params(int *params)
{
  int i;
  if (params)
    {
      DPRINT(("params ["));
      for (i = 0; i < TRE_PARAM_LAST; i++)
        {
          if (params[i] == TRE_PARAM_UNSET)
            DPRINT(("unset"));
          else if (params[i] == TRE_PARAM_DEFAULT)
            DPRINT(("default"));
          else
            DPRINT(("%d", params[i]));
          if (i < TRE_PARAM_LAST - 1)
            DPRINT((", "));
        }
      DPRINT(("]"));
    }
}

static void
tre_do_print(FILE *stream, tre_ast_node_t *ast, int indent)
{
  int code_min, code_max, pos;
  int num_tags = ast->num_tags;
  tre_literal_t *lit;
  tre_iteration_t *iter;

  tre_findent(stream, indent);
  switch (ast->type)
    {
    case LITERAL:
      lit = ast->obj;
      code_min = lit->code_min;
      code_max = lit->code_max;
      pos = lit->position;
      if (IS_EMPTY(lit))
        {
          fprintf(stream, "literal empty\n");
        }
      else if (IS_ASSERTION(lit))
        {
          int i;
          char *assertions[] = { "bol", "eol", "ctype", "!ctype",
                                 "bow", "eow", "wb", "!wb" };
          if (code_max >= ASSERT_LAST << 1)
            assert(0);
          fprintf(stream, "assertions: ");
          for (i = 0; (1 << i) <= ASSERT_LAST; i++)
            if (code_max & (1 << i))
              fprintf(stream, "%s ", assertions[i]);
          fprintf(stream, "\n");
        }
      else if (IS_TAG(lit))
        {
          fprintf(stream, "tag %d\n", code_max);
        }
      else if (IS_BACKREF(lit))
        {
          fprintf(stream, "backref %d, pos %d\n", code_max, pos);
        }
      else if (IS_PARAMETER(lit))
        {
          tre_print_params(lit->u.params);
          fprintf(stream, "\n");
        }
      else
        {
          fprintf(stream, "literal (%c, %c) (%d, %d), pos %d, sub %d, "
                  "%d tags\n", code_min, code_max, code_min, code_max, pos,
                  ast->submatch_id, num_tags);
        }
      break;
    case ITERATION:
      iter = ast->obj;
      fprintf(stream, "iteration {%d, %d}, sub %d, %d tags, %s\n",
              iter->min, iter->max, ast->submatch_id, num_tags,
              iter->minimal ? "minimal" : "greedy");
      tre_do_print(stream, iter->arg, indent + 2);
      break;
    case UNION:
      fprintf(stream, "union, sub %d, %d tags\n", ast->submatch_id, num_tags);
      tre_do_print(stream, ((tre_union_t *)ast->obj)->left, indent + 2);
      tre_do_print(stream, ((tre_union_t *)ast->obj)->right, indent + 2);
      break;
    case CATENATION:
      fprintf(stream, "catenation, sub %d, %d tags\n", ast->submatch_id,
              num_tags);
      tre_do_print(stream, ((tre_catenation_t *)ast->obj)->left, indent + 2);
      tre_do_print(stream, ((tre_catenation_t *)ast->obj)->right, indent + 2);
      break;
    default:
      assert(0);
      break;
    }
}

static void
tre_ast_fprint(FILE *stream, tre_ast_node_t *ast)
{
  tre_do_print(stream, ast, 0);
}

void
tre_ast_print(tre_ast_node_t *tree)
{
  printf("AST:\n");
  tre_ast_fprint(stdout, tree);
}

#endif /* TRE_DEBUG */

/* EOF */
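How the constructors above compose into a tree can be shown with a much smaller model. The sketch below builds the AST for a plain string the way repeated calls to `tre_ast_new_literal()` and `tre_ast_new_catenation()` would, yielding a left-leaning tree. The `sk_*` types and the fixed-size pool are assumptions for illustration only; TRE itself allocates nodes from a `tre_mem_t` arena and stores the node payload behind `obj`.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { SK_LITERAL, SK_CATENATION } sk_type_t;

typedef struct sk_node {
  sk_type_t type;
  int ch;                  /* used by SK_LITERAL */
  struct sk_node *left;    /* used by SK_CATENATION */
  struct sk_node *right;   /* used by SK_CATENATION */
} sk_node_t;

/* Tiny bump allocator standing in for an arena: nodes are never freed
   individually, the whole pool is discarded at once. */
typedef struct {
  sk_node_t nodes[64];
  size_t used;
} sk_pool_t;

static sk_node_t *
sk_new(sk_pool_t *pool, sk_type_t type)
{
  sk_node_t *n;

  if (pool->used == sizeof(pool->nodes) / sizeof(pool->nodes[0]))
    return NULL;
  n = &pool->nodes[pool->used++];
  n->type = type;
  n->ch = 0;
  n->left = n->right = NULL;
  return n;
}

/* Build the tree for a plain string: "abc" becomes ((a b) c).
   Returns NULL for the empty string or when the pool is exhausted. */
sk_node_t *
sk_build_string(sk_pool_t *pool, const char *s)
{
  sk_node_t *tree = NULL;

  assert(pool != NULL && s != NULL);
  for (; *s != '\0'; s++)
    {
      sk_node_t *lit = sk_new(pool, SK_LITERAL);
      if (lit == NULL)
        return NULL;
      lit->ch = (unsigned char)*s;
      if (tree == NULL)
        tree = lit;
      else
        {
          /* Everything built so far goes on the left, the new leaf on
             the right, so the tree leans left. */
          sk_node_t *cat = sk_new(pool, SK_CATENATION);
          if (cat == NULL)
            return NULL;
          cat->left = tree;
          cat->right = lit;
          tree = cat;
        }
    }
  return tree;
}
```

For a string of n characters this uses n literal nodes and n - 1 catenation nodes; the arena-style allocation is what lets the constructors above return `NULL` on failure without cleaning up partial trees.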
128 deps/tre/lib/tre-ast.h vendored Normal file
@@ -0,0 +1,128 @@
/*
  tre-ast.h - Abstract syntax tree (AST) definitions

  This software is released under a BSD-style license.
  See the file LICENSE for details and copyright.

*/


#ifndef TRE_AST_H
#define TRE_AST_H 1

#include "tre-mem.h"
#include "tre-internal.h"
#include "tre-compile.h"

/* The different AST node types. */
typedef enum {
  LITERAL,
  CATENATION,
  ITERATION,
  UNION
} tre_ast_type_t;

/* Special subtypes of TRE_LITERAL. */
#define EMPTY     -1 /* Empty leaf (denotes empty string). */
#define ASSERTION -2 /* Assertion leaf. */
#define TAG       -3 /* Tag leaf. */
#define BACKREF   -4 /* Back reference leaf. */
#define PARAMETER -5 /* Parameter. */

#define IS_SPECIAL(x)   ((x)->code_min < 0)
#define IS_EMPTY(x)     ((x)->code_min == EMPTY)
#define IS_ASSERTION(x) ((x)->code_min == ASSERTION)
#define IS_TAG(x)       ((x)->code_min == TAG)
#define IS_BACKREF(x)   ((x)->code_min == BACKREF)
#define IS_PARAMETER(x) ((x)->code_min == PARAMETER)


/* A generic AST node. All AST nodes consist of this node on the top
   level with `obj' pointing to the actual content. */
typedef struct {
  tre_ast_type_t type; /* Type of the node. */
  void *obj; /* Pointer to actual node. */
  int nullable;
  int submatch_id;
  unsigned int num_submatches;
  unsigned int num_tags;
  tre_pos_and_tags_t *firstpos;
  tre_pos_and_tags_t *lastpos;
} tre_ast_node_t;


/* A "literal" node. These are created for assertions, back references,
   tags, matching parameter settings, and all expressions that match one
   character. */
typedef struct {
  long code_min;
  long code_max;
  int position;
  union {
    tre_ctype_t class;
    int *params;
  } u;
  tre_ctype_t *neg_classes;
} tre_literal_t;

/* A "catenation" node. These are created when two regexps are concatenated.
   If there is more than one subexpression in sequence, the `left' part
   holds all but the last, and the `right' part holds the last subexpression
   (catenation is left associative). */
typedef struct {
  tre_ast_node_t *left;
  tre_ast_node_t *right;
} tre_catenation_t;

/* An "iteration" node. These are created for the "*", "+", "?", and "{m,n}"
   operators. */
typedef struct {
  /* Subexpression to match. */
  tre_ast_node_t *arg;
  /* Minimum number of consecutive matches. */
  int min;
  /* Maximum number of consecutive matches. */
  int max;
  /* If 0, match as many characters as possible, if 1 match as few as
     possible. Note that this does not always mean the same thing as
     matching as many/few repetitions as possible. */
  unsigned int minimal:1;
  /* Approximate matching parameters (or NULL). */
  int *params;
} tre_iteration_t;

/* A "union" node. These are created for the "|" operator. */
typedef struct {
  tre_ast_node_t *left;
  tre_ast_node_t *right;
} tre_union_t;

tre_ast_node_t *
tre_ast_new_node(tre_mem_t mem, tre_ast_type_t type, size_t size);

tre_ast_node_t *
tre_ast_new_literal(tre_mem_t mem, int code_min, int code_max);

tre_ast_node_t *
tre_ast_new_iter(tre_mem_t mem, tre_ast_node_t *arg, int min, int max,
                 int minimal);

tre_ast_node_t *
tre_ast_new_union(tre_mem_t mem, tre_ast_node_t *left, tre_ast_node_t *right);

tre_ast_node_t *
tre_ast_new_catenation(tre_mem_t mem, tre_ast_node_t *left,
                       tre_ast_node_t *right);

#ifdef TRE_DEBUG
void
tre_ast_print(tre_ast_node_t *tree);

/* XXX - rethink AST printing API */
void
tre_print_params(int *params);
#endif /* TRE_DEBUG */

#endif /* TRE_AST_H */

/* EOF */
2673 deps/tre/lib/tre-compile.c vendored Normal file
File diff suppressed because it is too large
27 deps/tre/lib/tre-compile.h vendored Normal file
@@ -0,0 +1,27 @@
/*
  tre-compile.h: Regex compilation definitions

  This software is released under a BSD-style license.
  See the file LICENSE for details and copyright.

*/


#ifndef TRE_COMPILE_H
#define TRE_COMPILE_H 1

typedef struct {
  int position;
  int code_min;
  int code_max;
  int *tags;
  int assertions;
  tre_ctype_t class;
  tre_ctype_t *neg_classes;
  int backref;
  int *params;
} tre_pos_and_tags_t;

#endif /* TRE_COMPILE_H */

/* EOF */
73 deps/tre/lib/tre-filter.c vendored Normal file
@@ -0,0 +1,73 @@
/*
  tre-filter.c: Histogram filter to quickly find regexp match candidates

  This software is released under a BSD-style license.
  See the file LICENSE for details and copyright.

*/

/* The idea of this filter is quite simple. First, let's assume the
   search pattern is a simple string. In order for a substring of a
   longer string to match the search pattern, it must have the same
   numbers of different characters as the pattern, and those
   characters must occur in the same order as they occur in pattern. */

#ifdef HAVE_CONFIG_H
#include <config.h>
#endif /* HAVE_CONFIG_H */
#include <stdio.h>
#include "tre-internal.h"
#include "tre-filter.h"

int
tre_filter_find(const unsigned char *str, size_t len, tre_filter_t *filter)
{
  unsigned short counts[256];
  unsigned int i;
  unsigned int window_len = filter->window_len;
  tre_filter_profile_t *profile = filter->profile;
  const unsigned char *str_orig = str;

  DPRINT(("tre_filter_find: %.*s\n", len, str));

  for (i = 0; i < elementsof(counts); i++)
    counts[i] = 0;

  i = 0;
  while (*str && i < window_len && i < len)
    {
      counts[*str]++;
      i++;
      str++;
      len--;
    }

  while (len > 0)
    {
      tre_filter_profile_t *p;
      counts[*str]++;
      counts[*(str - window_len)]--;

      p = profile;
      while (p->ch)
        {
          if (counts[p->ch] < p->count)
            break;
          p++;
        }
      if (!p->ch)
        {
          DPRINT(("Found possible match at %d\n",
                  str - str_orig));
          return str - str_orig;
        }
      else
        {
          DPRINT(("No match so far...\n"));
        }
      len--;
      str++;
    }
  DPRINT(("This string cannot match.\n"));
  return -1;
}
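The sliding-window histogram idea described in the tre-filter.c header comment can be illustrated in isolation. The following is a minimal standalone sketch, not the vendored implementation: `sketch_profile_t` and `sketch_filter_find` are hypothetical names, the index-based loop is a simplification, and the profile is assumed to be terminated by an entry with `ch == 0`, as in `tre_filter_profile_t`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical profile entry: byte `ch` must occur at least `count`
   times inside the window.  An entry with ch == 0 ends the table. */
typedef struct {
  unsigned char ch;
  unsigned char count;
} sketch_profile_t;

/* Returns the index of the last byte of the first window of
   `window_len` bytes that satisfies every profile entry, or -1 if no
   window does.  This is a sketch under the assumptions stated above. */
long
sketch_filter_find(const unsigned char *str, size_t len, size_t window_len,
                   const sketch_profile_t *profile)
{
  unsigned short counts[256];
  size_t i;

  assert(str != NULL && profile != NULL);
  /* Reject windows the 16-bit counters cannot represent. */
  if (window_len == 0 || window_len > 65535 || len < window_len)
    return -1;
  memset(counts, 0, sizeof(counts));

  for (i = 0; i < len; i++)
    {
      const sketch_profile_t *p;

      /* Slide the window: add the new byte, drop the one that left. */
      counts[str[i]]++;
      if (i >= window_len)
        counts[str[i - window_len]]--;
      if (i + 1 < window_len)
        continue;               /* window not full yet */

      /* Check every required byte count against the histogram. */
      for (p = profile; p->ch; p++)
        if (counts[p->ch] < p->count)
          break;
      if (!p->ch)
        return (long)i;         /* candidate window ends here */
    }
  return -1;
}
```

A hit only means the window contains enough of each required byte; a window such as "baab" also passes a profile built from "abba", so the real matcher still has to verify each candidate.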
19
deps/tre/lib/tre-filter.h
vendored
Normal file
@@ -0,0 +1,19 @@




typedef struct {
  unsigned char ch;
  unsigned char count;
} tre_filter_profile_t;

typedef struct {
  /* Length of the window where the character counts are kept. */
  int window_len;
  /* Required character counts table. */
  tre_filter_profile_t *profile;
} tre_filter_t;


int
tre_filter_find(const unsigned char *str, size_t len, tre_filter_t *filter);
319
deps/tre/lib/tre-internal.h
vendored
Normal file
|
|
@ -0,0 +1,319 @@
|
|||
/*
|
||||
tre-internal.h - TRE internal definitions
|
||||
|
||||
This software is released under a BSD-style license.
|
||||
See the file LICENSE for details and copyright.
|
||||
|
||||
*/
|
||||
|
||||
#ifndef TRE_INTERNAL_H
|
||||
#define TRE_INTERNAL_H 1
|
||||
|
||||
#ifdef HAVE_WCHAR_H
|
||||
#include <wchar.h>
|
||||
#endif /* HAVE_WCHAR_H */
|
||||
|
||||
#ifdef HAVE_WCTYPE_H
|
||||
#include <wctype.h>
|
||||
#endif /* HAVE_WCTYPE_H */
|
||||
|
||||
#ifdef HAVE_SYS_TYPES_H
|
||||
#include <sys/types.h>
|
||||
#endif /* HAVE_SYS_TYPES_H */
|
||||
|
||||
#include <limits.h>
|
||||
#include <ctype.h>
|
||||
#include "../local_includes/tre.h"
|
||||
|
||||
#define TRE_MAX_RE 65536
|
||||
#define TRE_MAX_STRING INT_MAX
|
||||
#define TRE_MAX_STACK 1048576
|
||||
|
||||
#ifdef TRE_DEBUG
|
||||
#include <stdio.h>
|
||||
#define DPRINT(msg) do {printf msg; fflush(stdout);} while(/*CONSTCOND*/(void)0,0)
|
||||
#else /* !TRE_DEBUG */
|
||||
#define DPRINT(msg) do { } while(/*CONSTCOND*/(void)0,0)
|
||||
#endif /* !TRE_DEBUG */
|
||||
|
||||
#define elementsof(x) ( sizeof(x) / sizeof(x[0]) )
|
||||
|
||||
#ifdef HAVE_MBRTOWC
|
||||
#define tre_mbrtowc(pwc, s, n, ps) (mbrtowc((pwc), (s), (n), (ps)))
|
||||
#else /* !HAVE_MBRTOWC */
|
||||
#ifdef HAVE_MBTOWC
|
||||
#define tre_mbrtowc(pwc, s, n, ps) (mbtowc((pwc), (s), (n)))
|
||||
#endif /* HAVE_MBTOWC */
|
||||
#endif /* !HAVE_MBRTOWC */
|
||||
|
||||
#ifdef TRE_MULTIBYTE
|
||||
#ifdef HAVE_MBSTATE_T
|
||||
#define TRE_MBSTATE
|
||||
#endif /* HAVE_MBSTATE_T */
|
||||
#endif /* TRE_MULTIBYTE */
|
||||
|
||||
/* Define the character types and functions. */
|
||||
#ifdef TRE_WCHAR
|
||||
|
||||
/* Wide characters. */
|
||||
typedef wint_t tre_cint_t;
|
||||
#if WCHAR_MAX <= INT_MAX
|
||||
#define TRE_CHAR_MAX WCHAR_MAX
|
||||
#else /* WCHAR_MAX > INT_MAX */
|
||||
#define TRE_CHAR_MAX INT_MAX
|
||||
#endif
|
||||
|
||||
#ifdef TRE_MULTIBYTE
|
||||
#define TRE_MB_CUR_MAX MB_CUR_MAX
|
||||
#else /* !TRE_MULTIBYTE */
|
||||
#define TRE_MB_CUR_MAX 1
|
||||
#endif /* !TRE_MULTIBYTE */
|
||||
|
||||
#define tre_isalnum iswalnum
|
||||
#define tre_isalpha iswalpha
|
||||
#ifdef HAVE_ISWBLANK
|
||||
#define tre_isblank iswblank
|
||||
#endif /* HAVE_ISWBLANK */
|
||||
#define tre_iscntrl iswcntrl
|
||||
#define tre_isdigit iswdigit
|
||||
#define tre_isgraph iswgraph
|
||||
#define tre_islower iswlower
|
||||
#define tre_isprint iswprint
|
||||
#define tre_ispunct iswpunct
|
||||
#define tre_isspace iswspace
|
||||
#define tre_isupper iswupper
|
||||
#define tre_isxdigit iswxdigit
|
||||
|
||||
#define tre_tolower towlower
|
||||
#define tre_toupper towupper
|
||||
#define tre_strlen wcslen
|
||||
|
||||
#else /* !TRE_WCHAR */
|
||||
|
||||
/* 8 bit characters. */
|
||||
typedef short tre_cint_t;
|
||||
#define TRE_CHAR_MAX 255
|
||||
#define TRE_MB_CUR_MAX 1
|
||||
|
||||
#define tre_isalnum isalnum
|
||||
#define tre_isalpha isalpha
|
||||
#ifdef HAVE_ISASCII
|
||||
#define tre_isascii isascii
|
||||
#endif /* HAVE_ISASCII */
|
||||
#ifdef HAVE_ISBLANK
|
||||
#define tre_isblank isblank
|
||||
#endif /* HAVE_ISBLANK */
|
||||
#define tre_iscntrl iscntrl
|
||||
#define tre_isdigit isdigit
|
||||
#define tre_isgraph isgraph
|
||||
#define tre_islower islower
|
||||
#define tre_isprint isprint
|
||||
#define tre_ispunct ispunct
|
||||
#define tre_isspace isspace
|
||||
#define tre_isupper isupper
|
||||
#define tre_isxdigit isxdigit
|
||||
|
||||
#define tre_tolower(c) (tre_cint_t)(tolower(c))
|
||||
#define tre_toupper(c) (tre_cint_t)(toupper(c))
|
||||
#define tre_strlen(s) (strlen((const char*)s))
|
||||
|
||||
#endif /* !TRE_WCHAR */
|
||||
|
||||
#if defined(TRE_WCHAR) && defined(HAVE_ISWCTYPE) && defined(HAVE_WCTYPE)
|
||||
#define TRE_USE_SYSTEM_WCTYPE 1
|
||||
#endif
|
||||
|
||||
#ifdef TRE_USE_SYSTEM_WCTYPE
|
||||
/* Use system provided iswctype() and wctype(). */
|
||||
typedef wctype_t tre_ctype_t;
|
||||
#define tre_isctype iswctype
|
||||
#define tre_ctype wctype
|
||||
#else /* !TRE_USE_SYSTEM_WCTYPE */
|
||||
/* Define our own versions of iswctype() and wctype(). */
|
||||
typedef int (*tre_ctype_t)(tre_cint_t);
|
||||
#define tre_isctype(c, type) ( (type)(c) )
|
||||
tre_ctype_t tre_ctype(const char *name);
|
||||
#endif /* !TRE_USE_SYSTEM_WCTYPE */
|
||||
|
||||
typedef enum { STR_WIDE, STR_BYTE, STR_MBS, STR_USER } tre_str_type_t;
|
||||
|
||||
/* Returns number of bytes to add to (char *)ptr to make it
|
||||
properly aligned for the type. */
|
||||
#define ALIGN(ptr, type) \
|
||||
((((long)ptr) % sizeof(type)) \
|
||||
? (sizeof(type) - (((long)ptr) % sizeof(type))) \
|
||||
: 0)
|
||||
|
||||
#undef MAX
|
||||
#undef MIN
|
||||
#define MAX(a, b) (((a) >= (b)) ? (a) : (b))
|
||||
#define MIN(a, b) (((a) <= (b)) ? (a) : (b))
|
||||
|
||||
/* Define STRF to the correct printf formatter for strings. */
|
||||
#ifdef TRE_WCHAR
|
||||
#define STRF "ls"
|
||||
#else /* !TRE_WCHAR */
|
||||
#define STRF "s"
|
||||
#endif /* !TRE_WCHAR */
|
||||
|
||||
/* TNFA transition type. A TNFA state is an array of transitions,
|
||||
the terminator is a transition with NULL `state'. */
|
||||
typedef struct tnfa_transition tre_tnfa_transition_t;
|
||||
|
||||
struct tnfa_transition {
|
||||
/* Range of accepted characters. */
|
||||
tre_cint_t code_min;
|
||||
tre_cint_t code_max;
|
||||
/* Pointer to the destination state. */
|
||||
tre_tnfa_transition_t *state;
|
||||
/* ID number of the destination state. */
|
||||
int state_id;
|
||||
/* -1 terminated array of tags (or NULL). */
|
||||
int *tags;
|
||||
/* Matching parameters settings (or NULL). */
|
||||
int *params;
|
||||
/* Assertion bitmap. */
|
||||
int assertions;
|
||||
/* Assertion parameters. */
|
||||
union {
|
||||
/* Character class assertion. */
|
||||
tre_ctype_t class;
|
||||
/* Back reference assertion. */
|
||||
int backref;
|
||||
} u;
|
||||
/* Negative character class assertions. */
|
||||
tre_ctype_t *neg_classes;
|
||||
};
|
||||
|
||||
|
||||
/* Assertions. */
|
||||
#define ASSERT_AT_BOL 1 /* Beginning of line. */
|
||||
#define ASSERT_AT_EOL 2 /* End of line. */
|
||||
#define ASSERT_CHAR_CLASS 4 /* Character class in `class'. */
|
||||
#define ASSERT_CHAR_CLASS_NEG 8 /* Character classes in `neg_classes'. */
|
||||
#define ASSERT_AT_BOW 16 /* Beginning of word. */
|
||||
#define ASSERT_AT_EOW 32 /* End of word. */
|
||||
#define ASSERT_AT_WB 64 /* Word boundary. */
|
||||
#define ASSERT_AT_WB_NEG 128 /* Not a word boundary. */
|
||||
#define ASSERT_BACKREF 256 /* A back reference in `backref'. */
|
||||
#define ASSERT_LAST 256
|
||||
|
||||
/* Tag directions. */
|
||||
typedef enum {
|
||||
TRE_TAG_MINIMIZE = 0,
|
||||
TRE_TAG_MAXIMIZE = 1
|
||||
} tre_tag_direction_t;
|
||||
|
||||
/* Parameters that can be changed dynamically while matching. */
|
||||
typedef enum {
|
||||
TRE_PARAM_COST_INS = 0,
|
||||
TRE_PARAM_COST_DEL = 1,
|
||||
TRE_PARAM_COST_SUBST = 2,
|
||||
TRE_PARAM_COST_MAX = 3,
|
||||
TRE_PARAM_MAX_INS = 4,
|
||||
TRE_PARAM_MAX_DEL = 5,
|
||||
TRE_PARAM_MAX_SUBST = 6,
|
||||
TRE_PARAM_MAX_ERR = 7,
|
||||
TRE_PARAM_DEPTH = 8,
|
||||
TRE_PARAM_LAST = 9
|
||||
} tre_param_t;
|
||||
|
||||
/* Unset matching parameter */
|
||||
#define TRE_PARAM_UNSET -1
|
||||
|
||||
/* Signifies the default matching parameter value. */
|
||||
#define TRE_PARAM_DEFAULT -2
|
||||
|
||||
/* Instructions to compute submatch register values from tag values
|
||||
after a successful match. */
|
||||
struct tre_submatch_data {
|
||||
/* Tag that gives the value for rm_so (submatch start offset). */
|
||||
int so_tag;
|
||||
/* Tag that gives the value for rm_eo (submatch end offset). */
|
||||
int eo_tag;
|
||||
/* List of submatches this submatch is contained in. */
|
||||
int *parents;
|
||||
};
|
||||
|
||||
typedef struct tre_submatch_data tre_submatch_data_t;
|
||||
|
||||
typedef enum {
|
||||
TRE_LITERAL_OPT_NONE = 0,
|
||||
TRE_LITERAL_OPT_CONTAINS,
|
||||
TRE_LITERAL_OPT_PREFIX,
|
||||
TRE_LITERAL_OPT_SUFFIX,
|
||||
TRE_LITERAL_OPT_EXACT
|
||||
} tre_literal_opt_mode_t;
|
||||
|
||||
typedef struct {
|
||||
unsigned char *data;
|
||||
size_t len;
|
||||
} tre_literal_opt_literal_t;
|
||||
|
||||
typedef struct {
|
||||
tre_literal_opt_mode_t mode;
|
||||
int nocase;
|
||||
size_t num_literals;
|
||||
/* Folded byte mapping used by the nocase fast path. */
|
||||
unsigned char fold_map[256];
|
||||
/* Literal index ranges grouped by the first literal byte. */
|
||||
size_t start_offsets[257];
|
||||
tre_literal_opt_literal_t *literals;
|
||||
} tre_literal_opt_t;
|
||||
|
||||
|
||||
/* TNFA definition. */
|
||||
typedef struct tnfa tre_tnfa_t;
|
||||
|
||||
struct tnfa {
|
||||
tre_tnfa_transition_t *transitions;
|
||||
unsigned int num_transitions;
|
||||
tre_tnfa_transition_t *initial;
|
||||
tre_tnfa_transition_t *final;
|
||||
tre_submatch_data_t *submatch_data;
|
||||
char *firstpos_chars;
|
||||
int first_char;
|
||||
unsigned int num_submatches;
|
||||
tre_tag_direction_t *tag_directions;
|
||||
int *minimal_tags;
|
||||
int num_tags;
|
||||
int num_minimals;
|
||||
int end_tag;
|
||||
int num_states;
|
||||
int cflags;
|
||||
int have_backrefs;
|
||||
int have_approx;
|
||||
int params_depth;
|
||||
tre_literal_opt_t literal_opt;
|
||||
};
|
||||
|
||||
int
|
||||
tre_compile(regex_t *preg, const tre_char_t *regex, size_t n, int cflags);
|
||||
|
||||
void
|
||||
tre_free(regex_t *preg);
|
||||
|
||||
void
|
||||
tre_fill_pmatch(size_t nmatch, regmatch_t pmatch[], int cflags,
|
||||
const tre_tnfa_t *tnfa, int *tags, int match_eo);
|
||||
|
||||
reg_errcode_t
|
||||
tre_tnfa_run_parallel(const tre_tnfa_t *tnfa, const void *string, ssize_t len,
|
||||
tre_str_type_t type, int *match_tags, int eflags,
|
||||
int *match_end_ofs);
|
||||
|
||||
reg_errcode_t
|
||||
tre_tnfa_run_backtrack(const tre_tnfa_t *tnfa, const void *string, ssize_t len,
|
||||
tre_str_type_t type, int *match_tags, int eflags,
|
||||
int *match_end_ofs);
|
||||
|
||||
#ifdef TRE_APPROX
|
||||
reg_errcode_t
|
||||
tre_tnfa_run_approx(const tre_tnfa_t *tnfa, const void *string, ssize_t len,
|
||||
tre_str_type_t type, int *match_tags, regamatch_t *match,
|
||||
regaparams_t params, int eflags, int *match_end_ofs);
|
||||
#endif /* TRE_APPROX */
|
||||
|
||||
#endif /* TRE_INTERNAL_H */
|
||||
|
||||
/* EOF */
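The `ALIGN(ptr, type)` macro in tre-internal.h returns the number of bytes to add to a pointer so that it becomes aligned for a type. The same computation can be sketched as a function. This is an illustrative variant, not part of TRE: `sketch_align_pad` is a hypothetical name, it takes the alignment as a byte size instead of a type, and it goes through `uintptr_t` where the macro casts the pointer to `long`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns how many bytes must be added to `ptr` so that the result is
   a multiple of `size`.  Hypothetical helper mirroring the ALIGN macro. */
size_t
sketch_align_pad(const void *ptr, size_t size)
{
  size_t rem;

  assert(size != 0);
  /* Remainder of the address modulo the requested alignment. */
  rem = (size_t)((uintptr_t)ptr % size);
  return rem ? size - rem : 0;
}
```

For example, an address of 13 needs 3 bytes of padding to reach an 8-byte boundary, while an address of 16 needs none.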
|
||||
676
deps/tre/lib/tre-match-backtrack.c
vendored
Normal file
|
|
@ -0,0 +1,676 @@
|
|||
/*
|
||||
tre-match-backtrack.c - TRE backtracking regex matching engine
|
||||
|
||||
This software is released under a BSD-style license.
|
||||
See the file LICENSE for details and copyright.
|
||||
|
||||
*/
|
||||
|
||||
/*
|
||||
This matcher is for regexps that use back referencing. Regexp matching
|
||||
with back referencing is an NP-complete problem on the number of back
|
||||
references. The easiest way to match them is to use a backtracking
|
||||
routine which basically goes through all possible paths in the TNFA
|
||||
and chooses the one which results in the best (leftmost and longest)
|
||||
match. This can be spectacularly expensive and may run out of stack
|
||||
space, but there really is no better known generic algorithm. Quoting
|
||||
Henry Spencer from comp.compilers:
|
||||
<URL: http://compilers.iecc.com/comparch/article/93-03-102>
|
||||
|
||||
POSIX.2 REs require longest match, which is really exciting to
|
||||
implement since the obsolete ("basic") variant also includes
|
||||
\<digit>. I haven't found a better way of tackling this than doing
|
||||
a preliminary match using a DFA (or simulation) on a modified RE
|
||||
that just replicates subREs for \<digit>, and then doing a
|
||||
backtracking match to determine whether the subRE matches were
|
||||
right. This can be rather slow, but I console myself with the
|
||||
thought that people who use \<digit> deserve very slow execution.
|
||||
(Pun unintentional but very appropriate.)
|
||||
|
||||
*/
|
||||
|
||||
|
||||
#ifdef HAVE_CONFIG_H
|
||||
#include <config.h>
|
||||
#endif /* HAVE_CONFIG_H */
|
||||
|
||||
#ifdef TRE_USE_ALLOCA
|
||||
/* AIX requires this to be the first thing in the file. */
|
||||
#ifndef __GNUC__
|
||||
# if HAVE_ALLOCA_H
|
||||
# include <alloca.h>
|
||||
# else
|
||||
# ifdef _AIX
|
||||
#pragma alloca
|
||||
# else
|
||||
# ifndef alloca /* predefined by HP cc +Olibcalls */
|
||||
char *alloca ();
|
||||
# endif
|
||||
# endif
|
||||
# endif
|
||||
#endif
|
||||
#endif /* TRE_USE_ALLOCA */
|
||||
|
||||
#include <assert.h>
|
||||
#include <stdlib.h>
|
||||
#include <string.h>
|
||||
#ifdef HAVE_WCHAR_H
|
||||
#include <wchar.h>
|
||||
#endif /* HAVE_WCHAR_H */
|
||||
#ifdef HAVE_WCTYPE_H
|
||||
#include <wctype.h>
|
||||
#endif /* HAVE_WCTYPE_H */
|
||||
#ifndef TRE_WCHAR
|
||||
#include <ctype.h>
|
||||
#endif /* !TRE_WCHAR */
|
||||
#ifdef HAVE_MALLOC_H
|
||||
#include <malloc.h>
|
||||
#endif /* HAVE_MALLOC_H */
|
||||
|
||||
#include "tre-internal.h"
|
||||
#include "tre-mem.h"
|
||||
#include "tre-match-utils.h"
|
||||
#include "xmalloc.h"
|
||||
|
||||
typedef struct {
|
||||
int pos;
|
||||
const char *str_byte;
|
||||
#ifdef TRE_WCHAR
|
||||
const wchar_t *str_wide;
|
||||
#endif /* TRE_WCHAR */
|
||||
tre_tnfa_transition_t *state;
|
||||
int state_id;
|
||||
int next_c;
|
||||
int *tags;
|
||||
#ifdef TRE_MBSTATE
|
||||
mbstate_t mbstate;
|
||||
#endif /* TRE_MBSTATE */
|
||||
} tre_backtrack_item_t;
|
||||
|
||||
typedef struct tre_backtrack_struct {
|
||||
tre_backtrack_item_t item;
|
||||
struct tre_backtrack_struct *prev;
|
||||
struct tre_backtrack_struct *next;
|
||||
} *tre_backtrack_t;
|
||||
|
||||
#ifdef TRE_WCHAR
|
||||
#define BT_STACK_WIDE_IN(_str_wide) stack->item.str_wide = (_str_wide)
|
||||
#define BT_STACK_WIDE_OUT (str_wide) = stack->item.str_wide
|
||||
#else /* !TRE_WCHAR */
|
||||
#define BT_STACK_WIDE_IN(_str_wide)
|
||||
#define BT_STACK_WIDE_OUT
|
||||
#endif /* !TRE_WCHAR */
|
||||
|
||||
#ifdef TRE_MBSTATE
|
||||
#define BT_STACK_MBSTATE_IN stack->item.mbstate = (mbstate)
|
||||
#define BT_STACK_MBSTATE_OUT (mbstate) = stack->item.mbstate
|
||||
#else /* !TRE_MBSTATE */
|
||||
#define BT_STACK_MBSTATE_IN
|
||||
#define BT_STACK_MBSTATE_OUT
|
||||
#endif /* !TRE_MBSTATE */
|
||||
|
||||
|
||||
#ifdef TRE_USE_ALLOCA
|
||||
#define tre_bt_mem_new tre_mem_newa
|
||||
#define tre_bt_mem_alloc tre_mem_alloca
|
||||
#define tre_bt_mem_destroy(obj) do { } while (0)
|
||||
#define xafree(obj) do { } while (0) /* do nothing, obj was obtained with alloca() */
|
||||
#else /* !TRE_USE_ALLOCA */
|
||||
#define tre_bt_mem_new tre_mem_new
|
||||
#define tre_bt_mem_alloc tre_mem_alloc
|
||||
#define tre_bt_mem_destroy tre_mem_destroy
|
||||
#define xafree(obj) xfree(obj)
|
||||
#endif /* !TRE_USE_ALLOCA */
|
||||
|
||||
|
||||
#define BT_STACK_PUSH(_pos, _str_byte, _str_wide, _state, _state_id, _next_c, _tags, _mbstate) \
|
||||
do \
|
||||
{ \
|
||||
int i; \
|
||||
if (!stack->next) \
|
||||
{ \
|
||||
tre_backtrack_t s; \
|
||||
s = tre_bt_mem_alloc(mem, sizeof(*s)); \
|
||||
if (!s) \
|
||||
{ \
|
||||
tre_bt_mem_destroy(mem); \
|
||||
if (tags) \
|
||||
xafree(tags); \
|
||||
if (pmatch) \
|
||||
xafree(pmatch); \
|
||||
if (states_seen) \
|
||||
xafree(states_seen); \
|
||||
return REG_ESPACE; \
|
||||
} \
|
||||
s->prev = stack; \
|
||||
s->next = NULL; \
|
||||
s->item.tags = tre_bt_mem_alloc(mem, \
|
||||
sizeof(*tags) * tnfa->num_tags); \
|
||||
if (!s->item.tags) \
|
||||
{ \
|
||||
tre_bt_mem_destroy(mem); \
|
||||
if (tags) \
|
||||
xafree(tags); \
|
||||
if (pmatch) \
|
||||
xafree(pmatch); \
|
||||
if (states_seen) \
|
||||
xafree(states_seen); \
|
||||
return REG_ESPACE; \
|
||||
} \
|
||||
stack->next = s; \
|
||||
stack = s; \
|
||||
} \
|
||||
else \
|
||||
stack = stack->next; \
|
||||
stack->item.pos = (_pos); \
|
||||
stack->item.str_byte = (_str_byte); \
|
||||
BT_STACK_WIDE_IN(_str_wide); \
|
||||
stack->item.state = (_state); \
|
||||
stack->item.state_id = (_state_id); \
|
||||
stack->item.next_c = (_next_c); \
|
||||
for (i = 0; i < tnfa->num_tags; i++) \
|
||||
stack->item.tags[i] = (_tags)[i]; \
|
||||
BT_STACK_MBSTATE_IN; \
|
||||
} \
|
||||
while (/*CONSTCOND*/(void)0,0)
|
||||
|
||||
#define BT_STACK_POP() \
|
||||
do \
|
||||
{ \
|
||||
int i; \
|
||||
assert(stack->prev); \
|
||||
pos = stack->item.pos; \
|
||||
if (type == STR_USER) \
|
||||
str_source->rewind(pos + pos_add_next, str_source->context); \
|
||||
str_byte = stack->item.str_byte; \
|
||||
BT_STACK_WIDE_OUT; \
|
||||
state = stack->item.state; \
|
||||
next_c = (tre_char_t) stack->item.next_c; \
|
||||
for (i = 0; i < tnfa->num_tags; i++) \
|
||||
tags[i] = stack->item.tags[i]; \
|
||||
BT_STACK_MBSTATE_OUT; \
|
||||
stack = stack->prev; \
|
||||
} \
|
||||
while (/*CONSTCOND*/(void)0,0)
|
||||
|
||||
#undef MIN
|
||||
#define MIN(a, b) ((a) <= (b) ? (a) : (b))
|
||||
|
||||
reg_errcode_t
|
||||
tre_tnfa_run_backtrack(const tre_tnfa_t *tnfa, const void *string,
|
||||
ssize_t len, tre_str_type_t type, int *match_tags,
|
||||
int eflags, int *match_end_ofs)
|
||||
{
|
||||
/* State variables required by GET_NEXT_WCHAR. */
|
||||
tre_char_t prev_c = 0, next_c = 0;
|
||||
const char *str_byte = string;
|
||||
ssize_t pos = 0;
|
||||
unsigned int pos_add_next = 1;
|
||||
#ifdef TRE_WCHAR
|
||||
const wchar_t *str_wide = string;
|
||||
#ifdef TRE_MBSTATE
|
||||
mbstate_t mbstate;
|
||||
#endif /* TRE_MBSTATE */
|
||||
#endif /* TRE_WCHAR */
|
||||
int reg_notbol = eflags & REG_NOTBOL;
|
||||
int reg_noteol = eflags & REG_NOTEOL;
|
||||
int reg_newline = tnfa->cflags & REG_NEWLINE;
|
||||
int str_user_end = 0;
|
||||
|
||||
/* These are used to remember the necessary values of the above
|
||||
variables to return to the position where the current search
|
||||
started from. */
|
||||
int next_c_start;
|
||||
const char *str_byte_start;
|
||||
int pos_start = -1;
|
||||
#ifdef TRE_WCHAR
|
||||
const wchar_t *str_wide_start;
|
||||
#endif /* TRE_WCHAR */
|
||||
#ifdef TRE_MBSTATE
|
||||
mbstate_t mbstate_start;
|
||||
#endif /* TRE_MBSTATE */
|
||||
reg_errcode_t ret;
|
||||
|
||||
/* End offset of best match so far, or -1 if no match found yet. */
|
||||
int match_eo = -1;
|
||||
/* Tag arrays. */
|
||||
int *next_tags, *tags = NULL;
|
||||
/* Current TNFA state. */
|
||||
tre_tnfa_transition_t *state;
|
||||
int *states_seen = NULL;
|
||||
|
||||
/* Memory allocator for allocating the backtracking stack. */
|
||||
tre_mem_t mem = tre_bt_mem_new();
|
||||
|
||||
/* The backtracking stack. */
|
||||
tre_backtrack_t stack;
|
||||
|
||||
tre_tnfa_transition_t *trans_i;
|
||||
regmatch_t *pmatch = NULL;
|
||||
|
||||
/*
|
||||
* TRE internals tend to use int instead of size_t for positions or
|
||||
* lengths and don't check for overflow. This will take time to fix
|
||||
* properly. In the meantime, simply limit the input to what we can
|
||||
* handle.
|
||||
*/
|
||||
if (len > TRE_MAX_STRING)
|
||||
len = TRE_MAX_STRING;
|
||||
|
||||
#ifdef TRE_MBSTATE
|
||||
memset(&mbstate, '\0', sizeof(mbstate));
|
||||
#endif /* TRE_MBSTATE */
|
||||
|
||||
if (!mem)
|
||||
return REG_ESPACE;
|
||||
stack = tre_bt_mem_alloc(mem, sizeof(*stack));
|
||||
if (!stack)
|
||||
{
|
||||
ret = REG_ESPACE;
|
||||
goto error_exit;
|
||||
}
|
||||
stack->prev = NULL;
|
||||
stack->next = NULL;
|
||||
|
||||
DPRINT(("tnfa_execute_backtrack, input type %d\n", type));
|
||||
DPRINT(("len = %zd\n", len));
|
||||
|
||||
#ifdef TRE_USE_ALLOCA
|
||||
tags = alloca(sizeof(*tags) * tnfa->num_tags);
|
||||
pmatch = alloca(sizeof(*pmatch) * tnfa->num_submatches);
|
||||
states_seen = alloca(sizeof(*states_seen) * tnfa->num_states);
|
||||
#else /* !TRE_USE_ALLOCA */
|
||||
if (tnfa->num_tags)
|
||||
{
|
||||
tags = xmalloc(sizeof(*tags) * tnfa->num_tags);
|
||||
if (!tags)
|
||||
{
|
||||
ret = REG_ESPACE;
|
||||
goto error_exit;
|
||||
}
|
||||
}
|
||||
if (tnfa->num_submatches)
|
||||
{
|
||||
pmatch = xmalloc(sizeof(*pmatch) * tnfa->num_submatches);
|
||||
if (!pmatch)
|
||||
{
|
||||
ret = REG_ESPACE;
|
||||
goto error_exit;
|
||||
}
|
||||
}
|
||||
if (tnfa->num_states)
|
||||
{
|
||||
states_seen = xmalloc(sizeof(*states_seen) * tnfa->num_states);
|
||||
if (!states_seen)
|
||||
{
|
||||
ret = REG_ESPACE;
|
||||
goto error_exit;
|
||||
}
|
||||
}
|
||||
#endif /* !TRE_USE_ALLOCA */
|
||||
|
||||
retry:
|
||||
{
|
||||
int i;
|
||||
for (i = 0; i < tnfa->num_tags; i++)
|
||||
{
|
||||
tags[i] = -1;
|
||||
if (match_tags)
|
||||
match_tags[i] = -1;
|
||||
}
|
||||
for (i = 0; i < tnfa->num_states; i++)
|
||||
states_seen[i] = 0;
|
||||
}
|
||||
|
||||
state = NULL;
|
||||
pos = pos_start;
|
||||
if (type == STR_USER)
|
||||
str_source->rewind(pos + pos_add_next, str_source->context);
|
||||
GET_NEXT_WCHAR();
|
||||
pos_start = pos;
|
||||
next_c_start = next_c;
|
||||
str_byte_start = str_byte;
|
||||
#ifdef TRE_WCHAR
|
||||
str_wide_start = str_wide;
|
||||
#endif /* TRE_WCHAR */
|
||||
#ifdef TRE_MBSTATE
|
||||
mbstate_start = mbstate;
|
||||
#endif /* TRE_MBSTATE */
|
||||
|
||||
/* Handle initial states. */
|
||||
next_tags = NULL;
|
||||
for (trans_i = tnfa->initial; trans_i->state; trans_i++)
|
||||
{
|
||||
DPRINT(("> init %p, prev_c %lc\n", trans_i->state, (tre_cint_t)prev_c));
|
||||
if (trans_i->assertions && CHECK_ASSERTIONS(trans_i->assertions))
|
||||
{
|
||||
DPRINT(("assert failed\n"));
|
||||
continue;
|
||||
}
|
||||
if (state == NULL)
|
||||
{
|
||||
/* Start from this state. */
|
||||
state = trans_i->state;
|
||||
next_tags = trans_i->tags;
|
||||
}
|
||||
else
|
||||
{
|
||||
/* Backtrack to this state. */
|
||||
DPRINT(("saving state %d for backtracking\n", trans_i->state_id));
|
||||
BT_STACK_PUSH(pos, str_byte, str_wide, trans_i->state,
|
||||
trans_i->state_id, next_c, tags, mbstate);
|
||||
{
|
||||
int *tmp = trans_i->tags;
|
||||
if (tmp)
|
||||
while (*tmp >= 0)
|
||||
stack->item.tags[*tmp++] = pos;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if (next_tags)
|
||||
for (; *next_tags >= 0; next_tags++)
|
||||
tags[*next_tags] = pos;
|
||||
|
||||
|
||||
DPRINT(("entering match loop, pos %zd, str_byte %p\n", pos, str_byte));
|
||||
DPRINT(("pos:chr/code | state and tags\n"));
|
||||
DPRINT(("-------------+------------------------------------------------\n"));
|
||||
|
||||
if (state == NULL)
|
||||
goto backtrack;
|
||||
|
||||
while (/*CONSTCOND*/(void)1,1)
|
||||
{
|
||||
tre_tnfa_transition_t *next_state;
|
||||
int empty_br_match;
|
||||
|
||||
DPRINT(("start loop\n"));
|
||||
if (state == tnfa->final)
|
||||
{
|
||||
DPRINT((" match found, %d %zd\n", match_eo, pos));
|
||||
if (match_eo < pos
|
||||
|| (match_eo == pos
|
||||
&& match_tags
|
||||
&& tre_tag_order(tnfa->num_tags, tnfa->tag_directions,
|
||||
tags, match_tags)))
|
||||
{
|
||||
int i;
|
||||
/* This match beats the previous match. */
|
||||
DPRINT((" win previous\n"));
|
||||
match_eo = pos;
|
||||
if (match_tags)
|
||||
for (i = 0; i < tnfa->num_tags; i++)
|
||||
match_tags[i] = tags[i];
|
||||
}
|
||||
/* Our TNFAs never have transitions leaving from the final state,
|
||||
so we jump right to backtracking. */
|
||||
goto backtrack;
|
||||
}
|
||||
|
||||
#ifdef TRE_DEBUG
|
||||
DPRINT(("%3zd:%2lc/%05d | %p ", pos, (tre_cint_t)next_c, (int)next_c,
|
||||
state));
|
||||
{
|
||||
int i;
|
||||
for (i = 0; i < tnfa->num_tags; i++)
|
||||
DPRINT(("%d%s", tags[i], i < tnfa->num_tags - 1 ? ", " : ""));
|
||||
DPRINT(("\n"));
|
||||
}
|
||||
#endif /* TRE_DEBUG */
|
||||
|
||||
/* Go to the next character in the input string. */
|
||||
empty_br_match = 0;
|
||||
trans_i = state;
|
||||
if (trans_i->state && trans_i->assertions & ASSERT_BACKREF)
|
||||
{
|
||||
/* This is a back reference state. All transitions leaving from
|
||||
this state have the same back reference "assertion". Instead
|
||||
of reading the next character, we match the back reference. */
|
||||
int so, eo, bt = trans_i->u.backref;
|
||||
int bt_len;
|
||||
int result;
|
||||
|
||||
DPRINT((" should match back reference %d\n", bt));
|
||||
/* Get the substring we need to match against. Remember to
|
||||
turn off REG_NOSUB temporarily. */
|
||||
tre_fill_pmatch(bt + 1, pmatch, tnfa->cflags & ~REG_NOSUB,
|
||||
tnfa, tags, pos);
|
||||
so = pmatch[bt].rm_so;
|
||||
eo = pmatch[bt].rm_eo;
|
||||
bt_len = eo - so;
|
||||
|
||||
#ifdef TRE_DEBUG
|
||||
{
|
||||
int slen;
|
||||
if (len < 0)
|
||||
slen = bt_len;
|
||||
else
|
||||
slen = MIN(bt_len, len - pos);
|
||||
|
||||
if (type == STR_BYTE)
|
||||
{
|
||||
DPRINT((" substring (len %d) is [%d, %d[: '%.*s'\n",
|
||||
bt_len, so, eo, bt_len, (char*)string + so));
|
||||
DPRINT((" current string is '%.*s'\n", slen, str_byte - 1));
|
||||
}
|
||||
#ifdef TRE_WCHAR
|
||||
else if (type == STR_WIDE)
|
||||
{
|
||||
DPRINT((" substring (len %d) is [%d, %d[: '%.*" STRF "'\n",
|
||||
bt_len, so, eo, bt_len, (wchar_t*)string + so));
|
||||
DPRINT((" current string is '%.*" STRF "'\n",
|
||||
slen, str_wide - 1));
|
||||
}
|
||||
#endif /* TRE_WCHAR */
|
||||
}
|
||||
#endif
|
||||
|
||||
if (len < 0)
|
||||
{
|
||||
if (type == STR_USER)
|
||||
result = str_source->compare((unsigned)so, (unsigned)pos,
|
||||
(unsigned)bt_len,
|
||||
str_source->context);
|
||||
#ifdef TRE_WCHAR
|
||||
else if (type == STR_WIDE)
|
||||
result = wcsncmp((const wchar_t*)string + so, str_wide - 1,
|
||||
(size_t)bt_len);
|
||||
#endif /* TRE_WCHAR */
|
||||
else
|
||||
result = strncmp((const char*)string + so, str_byte - 1,
|
||||
(size_t)bt_len);
|
||||
}
|
||||
else if (len - pos < bt_len)
|
||||
result = 1;
|
||||
#ifdef TRE_WCHAR
|
||||
else if (type == STR_WIDE)
|
||||
result = wmemcmp((const wchar_t*)string + so, str_wide - 1,
|
||||
(size_t)bt_len);
|
||||
#endif /* TRE_WCHAR */
|
||||
else
|
||||
result = memcmp((const char*)string + so, str_byte - 1,
|
||||
(size_t)bt_len);
|
||||
|
||||
if (result == 0)
|
||||
{
|
||||
/* Back reference matched. Check for infinite loop. */
|
||||
if (bt_len == 0)
|
||||
empty_br_match = 1;
|
||||
if (empty_br_match && states_seen[trans_i->state_id])
|
||||
{
|
||||
DPRINT((" avoid loop\n"));
|
||||
goto backtrack;
|
||||
}
|
||||
|
||||
states_seen[trans_i->state_id] = empty_br_match;
|
||||
|
||||
/* Advance in input string and resync `prev_c', `next_c'
|
||||
and pos. */
|
||||
DPRINT((" back reference matched\n"));
|
||||
str_byte += bt_len - 1;
|
||||
#ifdef TRE_WCHAR
|
||||
str_wide += bt_len - 1;
|
||||
#endif /* TRE_WCHAR */
|
||||
pos += bt_len - 1;
|
||||
GET_NEXT_WCHAR();
|
||||
DPRINT((" pos now %zd\n", pos));
|
||||
}
|
||||
else
|
||||
{
|
||||
DPRINT((" back reference did not match\n"));
|
||||
goto backtrack;
|
||||
}
|
||||
}
|
||||
else
|
||||
{
|
||||
/* Check for end of string. */
|
||||
if (len < 0)
|
||||
{
|
||||
if (type == STR_USER)
|
||||
{
|
||||
if (str_user_end)
|
||||
goto backtrack;
|
||||
}
|
||||
else if (next_c == L'\0' || pos >= TRE_MAX_STRING)
|
||||
goto backtrack;
|
||||
}
|
||||
else
|
||||
{
|
||||
if (pos >= len)
|
||||
goto backtrack;
|
||||
}
|
||||
|
||||
/* Read the next character. */
|
||||
GET_NEXT_WCHAR();
|
||||
}
|
||||
|
||||
next_state = NULL;
|
||||
for (trans_i = state; trans_i->state; trans_i++)
|
||||
{
|
||||
DPRINT((" transition %d-%d (%c-%c) %d to %d\n",
|
||||
trans_i->code_min, trans_i->code_max,
|
||||
trans_i->code_min, trans_i->code_max,
|
||||
trans_i->assertions, trans_i->state_id));
|
||||
if (trans_i->code_min <= (tre_cint_t)prev_c
|
||||
&& trans_i->code_max >= (tre_cint_t)prev_c)
|
||||
{
|
||||
if (trans_i->assertions
|
||||
&& (CHECK_ASSERTIONS(trans_i->assertions)
|
||||
|| CHECK_CHAR_CLASSES(trans_i, tnfa, eflags)))
|
||||
{
|
||||
DPRINT((" assertion failed\n"));
|
||||
continue;
|
||||
}
|
||||
|
||||
if (next_state == NULL)
|
||||
{
|
||||
/* First matching transition. */
|
||||
DPRINT((" Next state is %d\n", trans_i->state_id));
|
||||
next_state = trans_i->state;
|
||||
next_tags = trans_i->tags;
|
||||
}
|
||||
else
|
||||
{
|
||||
/* Second matching transition. We may need to backtrack here
|
||||
to take this transition instead of the first one, so we
|
||||
push this transition in the backtracking stack so we can
|
||||
jump back here if needed. */
|
||||
DPRINT((" saving state %d for backtracking\n",
|
||||
trans_i->state_id));
|
||||
BT_STACK_PUSH(pos, str_byte, str_wide, trans_i->state,
|
||||
trans_i->state_id, next_c, tags, mbstate);
|
||||
{
|
||||
int *tmp;
|
||||
for (tmp = trans_i->tags; tmp && *tmp >= 0; tmp++)
|
||||
stack->item.tags[*tmp] = pos;
|
||||
}
|
||||
#if 0 /* XXX - it's important not to look at all transitions here to keep
|
||||
the stack small! */
|
||||
break;
|
||||
#endif
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if (next_state != NULL)
|
||||
{
|
||||
/* Matching transitions were found. Take the first one. */
|
||||
state = next_state;
|
||||
|
||||
/* Update the tag values. */
|
||||
if (next_tags)
|
||||
while (*next_tags >= 0)
|
||||
tags[*next_tags++] = pos;
|
||||
}
|
||||
else
|
||||
{
|
||||
backtrack:
|
||||
/* A matching transition was not found. Try to backtrack. */
|
||||
if (stack->prev)
|
||||
{
|
||||
DPRINT((" backtracking\n"));
|
||||
if (stack->item.state->assertions & ASSERT_BACKREF)
|
||||
{
|
||||
DPRINT((" states_seen[%d] = 0\n",
|
||||
stack->item.state_id));
|
||||
states_seen[stack->item.state_id] = 0;
|
||||
}
|
||||
|
||||
BT_STACK_POP();
|
||||
}
|
||||
else if (match_eo < 0)
|
||||
{
|
||||
/* Try starting from a later position in the input string. */
|
||||
/* Check for end of string. */
|
||||
if (len < 0)
|
||||
{
|
||||
if (next_c_start == L'\0' || pos_start >= TRE_MAX_STRING)
|
||||
{
|
||||
DPRINT(("end of string.\n"));
|
||||
break;
|
||||
}
|
||||
}
|
||||
else
|
||||
{
|
||||
if (pos_start >= len)
|
||||
{
|
||||
DPRINT(("end of string.\n"));
|
||||
break;
|
||||
}
|
||||
}
|
||||
DPRINT(("restarting from next start position\n"));
|
||||
next_c = (tre_char_t) next_c_start;
|
||||
#ifdef TRE_MBSTATE
|
||||
mbstate = mbstate_start;
|
||||
#endif /* TRE_MBSTATE */
|
||||
str_byte = str_byte_start;
|
||||
#ifdef TRE_WCHAR
|
||||
str_wide = str_wide_start;
|
||||
#endif /* TRE_WCHAR */
|
||||
goto retry;
|
||||
}
|
||||
else
|
||||
{
|
||||
DPRINT(("finished\n"));
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
ret = match_eo >= 0 ? REG_OK : REG_NOMATCH;
|
||||
*match_end_ofs = match_eo;
|
||||
|
||||
error_exit:
|
||||
tre_bt_mem_destroy(mem);
|
||||
#ifndef TRE_USE_ALLOCA
|
||||
if (tags)
|
||||
xafree(tags);
|
||||
if (pmatch)
|
||||
xafree(pmatch);
|
||||
if (states_seen)
|
||||
xafree(states_seen);
|
||||
#endif /* !TRE_USE_ALLOCA */
|
||||
|
||||
return ret;
|
||||
}
|
||||
538
deps/tre/lib/tre-match-parallel.c
vendored
Normal file
|
|
@ -0,0 +1,538 @@
|
|||
/*
  tre-match-parallel.c - TRE parallel regex matching engine

  This software is released under a BSD-style license.
  See the file LICENSE for details and copyright.

*/

/*
  This algorithm searches for matches basically by reading characters
  in the searched string one by one, starting at the beginning.  All
  matching paths in the TNFA are traversed in parallel.  When two or
  more paths reach the same state, exactly one is chosen according to
  tag ordering rules; if returning submatches is not required it does
  not matter which path is chosen.

  The worst case time required for finding the leftmost and longest
  match, or determining that there is no match, is always linearly
  dependent on the length of the text being searched.

  This algorithm cannot handle TNFAs with back referencing nodes.
  See `tre-match-backtrack.c'.
*/
|
||||
|
||||
#ifdef HAVE_CONFIG_H
|
||||
#include <config.h>
|
||||
#endif /* HAVE_CONFIG_H */
|
||||
|
||||
#ifdef TRE_USE_ALLOCA
|
||||
/* AIX requires this to be the first thing in the file. */
|
||||
#ifndef __GNUC__
|
||||
# if HAVE_ALLOCA_H
|
||||
# include <alloca.h>
|
||||
# else
|
||||
# ifdef _AIX
|
||||
#pragma alloca
|
||||
# else
|
||||
# ifndef alloca /* predefined by HP cc +Olibcalls */
|
||||
char *alloca ();
|
||||
# endif
|
||||
# endif
|
||||
# endif
|
||||
#endif
|
||||
#endif /* TRE_USE_ALLOCA */
|
||||
|
||||
#include <assert.h>
|
||||
#include <stdint.h>
|
||||
#include <stdlib.h>
|
||||
#include <string.h>
|
||||
#ifdef HAVE_WCHAR_H
|
||||
#include <wchar.h>
|
||||
#endif /* HAVE_WCHAR_H */
|
||||
#ifdef HAVE_WCTYPE_H
|
||||
#include <wctype.h>
|
||||
#endif /* HAVE_WCTYPE_H */
|
||||
#ifndef TRE_WCHAR
|
||||
#include <ctype.h>
|
||||
#endif /* !TRE_WCHAR */
|
||||
#ifdef HAVE_MALLOC_H
|
||||
#include <malloc.h>
|
||||
#endif /* HAVE_MALLOC_H */
|
||||
|
||||
#include "tre-internal.h"
|
||||
#include "tre-match-utils.h"
|
||||
#include "xmalloc.h"
|
||||
|
||||
|
||||
|
||||
typedef struct {
|
||||
tre_tnfa_transition_t *state;
|
||||
int *tags;
|
||||
} tre_tnfa_reach_t;
|
||||
|
||||
typedef struct {
|
||||
int pos;
|
||||
int **tags;
|
||||
} tre_reach_pos_t;
|
||||
|
||||
|
||||
#ifdef TRE_DEBUG
|
||||
static void
|
||||
tre_print_reach(const tre_tnfa_reach_t *reach, int num_tags)
|
||||
{
|
||||
int i;
|
||||
|
||||
while (reach->state != NULL)
|
||||
{
|
||||
DPRINT((" %p", (void *)reach->state));
|
||||
if (num_tags > 0)
|
||||
{
|
||||
DPRINT(("/"));
|
||||
for (i = 0; i < num_tags; i++)
|
||||
{
|
||||
DPRINT(("%d:%d", i, reach->tags[i]));
|
||||
if (i < (num_tags-1))
|
||||
DPRINT((","));
|
||||
}
|
||||
}
|
||||
reach++;
|
||||
}
|
||||
DPRINT(("\n"));
|
||||
|
||||
}
|
||||
#endif /* TRE_DEBUG */
|
||||
|
||||
reg_errcode_t
|
||||
tre_tnfa_run_parallel(const tre_tnfa_t *tnfa, const void *string, ssize_t len,
|
||||
tre_str_type_t type, int *match_tags, int eflags,
|
||||
int *match_end_ofs)
|
||||
{
|
||||
/* State variables required by GET_NEXT_WCHAR. */
|
||||
tre_char_t prev_c = 0, next_c = 0;
|
||||
const char *str_byte = string;
|
||||
ssize_t pos = -1;
|
||||
unsigned int pos_add_next = 1;
|
||||
#ifdef TRE_WCHAR
|
||||
const wchar_t *str_wide = string;
|
||||
#ifdef TRE_MBSTATE
|
||||
mbstate_t mbstate;
|
||||
#endif /* TRE_MBSTATE */
|
||||
#endif /* TRE_WCHAR */
|
||||
reg_errcode_t ret;
|
||||
int reg_notbol = eflags & REG_NOTBOL;
|
||||
int reg_noteol = eflags & REG_NOTEOL;
|
||||
int reg_newline = tnfa->cflags & REG_NEWLINE;
|
||||
int str_user_end = 0;
|
||||
|
||||
char *buf;
|
||||
tre_tnfa_transition_t *trans_i;
|
||||
tre_tnfa_reach_t *reach, *reach_next, *reach_i, *reach_next_i;
|
||||
tre_reach_pos_t *reach_pos;
|
||||
int *tag_i;
|
||||
int num_tags, i;
|
||||
|
||||
int match_eo = -1; /* end offset of match (-1 if no match found yet) */
|
||||
int new_match = 0;
|
||||
int *tmp_tags = NULL;
|
||||
int *tmp_iptr;
|
||||
|
||||
/*
|
||||
* TRE internals tend to use int instead of size_t for positions or
|
||||
* lengths and don't check for overflow. This will take time to fix
|
||||
* properly. In the meantime, simply limit the input to what we can
|
||||
* handle.
|
||||
*/
|
||||
if (len > TRE_MAX_STRING)
|
||||
len = TRE_MAX_STRING;
|
||||
|
||||
#ifdef TRE_MBSTATE
|
||||
memset(&mbstate, '\0', sizeof(mbstate));
|
||||
#endif /* TRE_MBSTATE */
|
||||
|
||||
DPRINT(("tre_tnfa_run_parallel, input type %d\n", type));
|
||||
|
||||
if (!match_tags)
|
||||
num_tags = 0;
|
||||
else
|
||||
num_tags = tnfa->num_tags;
|
||||
|
||||
/* Allocate memory for temporary data required for matching. This needs to
|
||||
be done for every matching operation to be thread safe. This allocates
|
||||
everything in a single large block from the stack frame using alloca()
|
||||
or with malloc() if alloca is unavailable. */
|
||||
{
|
||||
size_t tbytes, rbytes, pbytes, xbytes, total_bytes;
|
||||
size_t num_states = (size_t)tnfa->num_states;
|
||||
size_t state_tag_bytes, reach_bytes;
|
||||
size_t padding = (sizeof(long) - 1) * 4;
|
||||
char *tmp_buf;
|
||||
|
||||
if (num_states > SIZE_MAX / sizeof(*reach_pos))
|
||||
return REG_ESPACE;
|
||||
pbytes = sizeof(*reach_pos) * num_states;
|
||||
|
||||
if (num_states + 1 > SIZE_MAX / sizeof(*reach_next))
|
||||
return REG_ESPACE;
|
||||
rbytes = sizeof(*reach_next) * (num_states + 1);
|
||||
|
||||
if ((size_t)num_tags > SIZE_MAX / sizeof(*tmp_tags))
|
||||
return REG_ESPACE;
|
||||
tbytes = sizeof(*tmp_tags) * (size_t)num_tags;
|
||||
|
||||
if ((size_t)num_tags > SIZE_MAX / sizeof(int))
|
||||
return REG_ESPACE;
|
||||
xbytes = sizeof(int) * (size_t)num_tags;
|
||||
|
||||
if (num_states > 0 && xbytes > SIZE_MAX / num_states)
|
||||
return REG_ESPACE;
|
||||
state_tag_bytes = xbytes * num_states;
|
||||
|
||||
if (rbytes > SIZE_MAX - state_tag_bytes)
|
||||
return REG_ESPACE;
|
||||
reach_bytes = rbytes + state_tag_bytes;
|
||||
|
||||
if (reach_bytes > (SIZE_MAX - padding - tbytes - pbytes) / 2)
|
||||
return REG_ESPACE;
|
||||
|
||||
/* Compute the length of the block we need. */
|
||||
total_bytes =
|
||||
padding + reach_bytes * 2 + tbytes + pbytes;
|
||||
|
||||
/* Allocate the memory. */
|
||||
#ifdef TRE_USE_ALLOCA
|
||||
buf = alloca(total_bytes);
|
||||
#else /* !TRE_USE_ALLOCA */
|
||||
buf = xmalloc(total_bytes);
|
||||
#endif /* !TRE_USE_ALLOCA */
|
||||
if (buf == NULL)
|
||||
return REG_ESPACE;
|
||||
memset(buf, 0, total_bytes);
|
||||
|
||||
/* Get the various pointers within tmp_buf (properly aligned). */
|
||||
tmp_tags = (void *)buf;
|
||||
tmp_buf = buf + tbytes;
|
||||
tmp_buf += ALIGN(tmp_buf, long);
|
||||
reach_next = (void *)tmp_buf;
|
||||
tmp_buf += rbytes;
|
||||
tmp_buf += ALIGN(tmp_buf, long);
|
||||
reach = (void *)tmp_buf;
|
||||
tmp_buf += rbytes;
|
||||
tmp_buf += ALIGN(tmp_buf, long);
|
||||
reach_pos = (void *)tmp_buf;
|
||||
tmp_buf += pbytes;
|
||||
tmp_buf += ALIGN(tmp_buf, long);
|
||||
for (i = 0; i < tnfa->num_states; i++)
|
||||
{
|
||||
reach[i].tags = (void *)tmp_buf;
|
||||
tmp_buf += xbytes;
|
||||
reach_next[i].tags = (void *)tmp_buf;
|
||||
tmp_buf += xbytes;
|
||||
}
|
||||
}
|
||||
|
||||
for (i = 0; i < tnfa->num_states; i++)
|
||||
reach_pos[i].pos = -1;
|
||||
|
||||
/* If only one character can start a match, find it first. */
|
||||
if (tnfa->first_char >= 0 && type == STR_BYTE && str_byte)
|
||||
{
|
||||
const char *orig_str = str_byte;
|
||||
int first = tnfa->first_char;
|
||||
|
||||
if (len >= 0)
|
||||
str_byte = memchr(orig_str, first, (size_t)len);
|
||||
else
|
||||
str_byte = strchr(orig_str, first);
|
||||
if (str_byte == NULL)
|
||||
{
|
||||
#ifndef TRE_USE_ALLOCA
|
||||
if (buf)
|
||||
xfree(buf);
|
||||
#endif /* !TRE_USE_ALLOCA */
|
||||
return REG_NOMATCH;
|
||||
}
|
||||
DPRINT(("skipped %lu chars\n", (unsigned long)(str_byte - orig_str)));
|
||||
if (str_byte >= orig_str + 1)
|
||||
prev_c = (unsigned char)*(str_byte - 1);
|
||||
next_c = (unsigned char)*str_byte;
|
||||
pos = str_byte - orig_str;
|
||||
if (len < 0 || pos < len)
|
||||
str_byte++;
|
||||
}
|
||||
else
|
||||
{
|
||||
GET_NEXT_WCHAR();
|
||||
pos = 0;
|
||||
}
|
||||
|
||||
#if 0
|
||||
/* Skip over characters that cannot possibly be the first character
|
||||
of a match. */
|
||||
if (tnfa->firstpos_chars != NULL)
|
||||
{
|
||||
char *chars = tnfa->firstpos_chars;
|
||||
|
||||
if (len < 0)
|
||||
{
|
||||
const char *orig_str = str_byte;
|
||||
/* XXX - use strpbrk() and wcspbrk() because they might be
|
||||
optimized for the target architecture. Try also strcspn()
|
||||
and wcscspn() and compare the speeds. */
|
||||
while (next_c != L'\0' && !chars[next_c])
|
||||
{
|
||||
next_c = *str_byte++;
|
||||
}
|
||||
prev_c = *(str_byte - 2);
|
||||
pos += str_byte - orig_str;
|
||||
DPRINT(("skipped %d chars\n", str_byte - orig_str));
|
||||
}
|
||||
else
|
||||
{
|
||||
while (pos <= len && !chars[next_c])
|
||||
{
|
||||
prev_c = next_c;
|
||||
next_c = (unsigned char)(*str_byte++);
|
||||
pos++;
|
||||
}
|
||||
}
|
||||
}
|
||||
#endif
|
||||
|
||||
DPRINT(("length: %zd\n", len));
|
||||
DPRINT(("pos:chr/code | states and tags\n"));
|
||||
DPRINT(("-------------+------------------------------------------------\n"));
|
||||
|
||||
reach_next_i = reach_next;
|
||||
while (/*CONSTCOND*/(void)1,1)
|
||||
{
|
||||
/* If no match found yet, add the initial states to `reach_next'. */
|
||||
if (match_eo < 0)
|
||||
{
|
||||
DPRINT((" init >"));
|
||||
trans_i = tnfa->initial;
|
||||
while (trans_i->state != NULL)
|
||||
{
|
||||
if (reach_pos[trans_i->state_id].pos < pos)
|
||||
{
|
||||
if (trans_i->assertions
|
||||
&& CHECK_ASSERTIONS(trans_i->assertions))
|
||||
{
|
||||
DPRINT(("assertion failed\n"));
|
||||
trans_i++;
|
||||
continue;
|
||||
}
|
||||
|
||||
DPRINT((" %p", (void *)trans_i->state));
|
||||
reach_next_i->state = trans_i->state;
|
||||
for (i = 0; i < num_tags; i++)
|
||||
reach_next_i->tags[i] = -1;
|
||||
tag_i = trans_i->tags;
|
||||
if (tag_i)
|
||||
while (*tag_i >= 0)
|
||||
{
|
||||
if (*tag_i < num_tags)
|
||||
reach_next_i->tags[*tag_i] = pos;
|
||||
tag_i++;
|
||||
}
|
||||
if (reach_next_i->state == tnfa->final)
|
||||
{
|
||||
DPRINT((" found empty match\n"));
|
||||
match_eo = pos;
|
||||
new_match = 1;
|
||||
for (i = 0; i < num_tags; i++)
|
||||
match_tags[i] = reach_next_i->tags[i];
|
||||
}
|
||||
reach_pos[trans_i->state_id].pos = pos;
|
||||
reach_pos[trans_i->state_id].tags = &reach_next_i->tags;
|
||||
reach_next_i++;
|
||||
}
|
||||
trans_i++;
|
||||
}
|
||||
DPRINT(("\n"));
|
||||
reach_next_i->state = NULL;
|
||||
}
|
||||
else
|
||||
{
|
||||
if (num_tags == 0 || reach_next_i == reach_next)
|
||||
/* We have found a match. */
|
||||
break;
|
||||
}
|
||||
|
||||
/* Check for end of string. */
|
||||
if (len < 0)
|
||||
{
|
||||
if (type == STR_USER)
|
||||
{
|
||||
if (str_user_end)
|
||||
break;
|
||||
}
|
||||
else if (next_c == L'\0' || pos >= TRE_MAX_STRING)
|
||||
break;
|
||||
}
|
||||
else
|
||||
{
|
||||
if (pos >= len)
|
||||
break;
|
||||
}
|
||||
|
||||
GET_NEXT_WCHAR();
|
||||
|
||||
#ifdef TRE_DEBUG
|
||||
DPRINT(("%3zd:%2lc/%05d |", pos - 1, (tre_cint_t)prev_c, (int)prev_c));
|
||||
tre_print_reach(reach_next, num_tags);
|
||||
DPRINT(("%3zd:%2lc/%05d |", pos, (tre_cint_t)next_c, (int)next_c));
|
||||
tre_print_reach(reach_next, num_tags);
|
||||
#endif /* TRE_DEBUG */
|
||||
|
||||
/* Swap `reach' and `reach_next'. */
|
||||
reach_i = reach;
|
||||
reach = reach_next;
|
||||
reach_next = reach_i;
|
||||
|
||||
/* For each state in `reach', weed out states that don't fulfill the
|
||||
minimal matching conditions. */
|
||||
if (tnfa->num_minimals && new_match)
|
||||
{
|
||||
new_match = 0;
|
||||
reach_next_i = reach_next;
|
||||
for (reach_i = reach; reach_i->state; reach_i++)
|
||||
{
|
||||
int skip = 0;
|
||||
for (i = 0; tnfa->minimal_tags[i] >= 0; i += 2)
|
||||
{
|
||||
int end = tnfa->minimal_tags[i];
|
||||
int start = tnfa->minimal_tags[i + 1];
|
||||
DPRINT((" Minimal start %d, end %d\n", start, end));
|
||||
if (end >= num_tags)
|
||||
{
|
||||
DPRINT((" Throwing %p out.\n", reach_i->state));
|
||||
skip = 1;
|
||||
break;
|
||||
}
|
||||
else if (reach_i->tags[start] == match_tags[start]
|
||||
&& reach_i->tags[end] < match_tags[end])
|
||||
{
|
||||
DPRINT((" Throwing %p out because t%d < %d\n",
|
||||
reach_i->state, end, match_tags[end]));
|
||||
skip = 1;
|
||||
break;
|
||||
}
|
||||
}
|
||||
if (!skip)
|
||||
{
|
||||
reach_next_i->state = reach_i->state;
|
||||
tmp_iptr = reach_next_i->tags;
|
||||
reach_next_i->tags = reach_i->tags;
|
||||
reach_i->tags = tmp_iptr;
|
||||
reach_next_i++;
|
||||
}
|
||||
}
|
||||
reach_next_i->state = NULL;
|
||||
|
||||
/* Swap `reach' and `reach_next'. */
|
||||
reach_i = reach;
|
||||
reach = reach_next;
|
||||
reach_next = reach_i;
|
||||
}
|
||||
|
||||
/* For each state in `reach' see if there is a transition leaving with
|
||||
the current input symbol to a state not yet in `reach_next', and
|
||||
add the destination states to `reach_next'. */
|
||||
reach_next_i = reach_next;
|
||||
for (reach_i = reach; reach_i->state; reach_i++)
|
||||
{
|
||||
for (trans_i = reach_i->state; trans_i->state; trans_i++)
|
||||
{
|
||||
/* Does this transition match the input symbol? */
|
||||
if (trans_i->code_min <= (tre_cint_t)prev_c &&
|
||||
trans_i->code_max >= (tre_cint_t)prev_c)
|
||||
{
|
||||
if (trans_i->assertions
|
||||
&& (CHECK_ASSERTIONS(trans_i->assertions)
|
||||
|| CHECK_CHAR_CLASSES(trans_i, tnfa, eflags)))
|
||||
{
|
||||
DPRINT(("assertion failed\n"));
|
||||
continue;
|
||||
}
|
||||
|
||||
/* Compute the tags after this transition. */
|
||||
for (i = 0; i < num_tags; i++)
|
||||
tmp_tags[i] = reach_i->tags[i];
|
||||
tag_i = trans_i->tags;
|
||||
if (tag_i != NULL)
|
||||
while (*tag_i >= 0)
|
||||
{
|
||||
if (*tag_i < num_tags)
|
||||
tmp_tags[*tag_i] = pos;
|
||||
tag_i++;
|
||||
}
|
||||
|
||||
if (reach_pos[trans_i->state_id].pos < pos)
|
||||
{
|
||||
/* Found an unvisited node. */
|
||||
reach_next_i->state = trans_i->state;
|
||||
tmp_iptr = reach_next_i->tags;
|
||||
reach_next_i->tags = tmp_tags;
|
||||
tmp_tags = tmp_iptr;
|
||||
reach_pos[trans_i->state_id].pos = pos;
|
||||
reach_pos[trans_i->state_id].tags = &reach_next_i->tags;
|
||||
|
||||
if (reach_next_i->state == tnfa->final
|
||||
&& (match_eo == -1
|
||||
|| (num_tags > 0
|
||||
&& reach_next_i->tags[0] <= match_tags[0])))
|
||||
{
|
||||
DPRINT((" found match %p\n", trans_i->state));
|
||||
match_eo = pos;
|
||||
new_match = 1;
|
||||
for (i = 0; i < num_tags; i++)
|
||||
match_tags[i] = reach_next_i->tags[i];
|
||||
}
|
||||
reach_next_i++;
|
||||
|
||||
}
|
||||
else
|
||||
{
|
||||
assert(reach_pos[trans_i->state_id].pos == pos);
|
||||
/* Another path has also reached this state. We choose
|
||||
the winner by examining the tag values for both
|
||||
paths. */
|
||||
if (tre_tag_order(num_tags, tnfa->tag_directions,
|
||||
tmp_tags,
|
||||
*reach_pos[trans_i->state_id].tags))
|
||||
{
|
||||
/* The new path wins. */
|
||||
tmp_iptr = *reach_pos[trans_i->state_id].tags;
|
||||
*reach_pos[trans_i->state_id].tags = tmp_tags;
|
||||
if (trans_i->state == tnfa->final)
|
||||
{
|
||||
DPRINT((" found better match\n"));
|
||||
match_eo = pos;
|
||||
new_match = 1;
|
||||
for (i = 0; i < num_tags; i++)
|
||||
match_tags[i] = tmp_tags[i];
|
||||
}
|
||||
tmp_tags = tmp_iptr;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
reach_next_i->state = NULL;
|
||||
}
|
||||
|
||||
DPRINT(("match end offset = %d\n", match_eo));
|
||||
|
||||
*match_end_ofs = match_eo;
|
||||
ret = match_eo >= 0 ? REG_OK : REG_NOMATCH;
|
||||
|
||||
#ifndef TRE_USE_ALLOCA
|
||||
if (buf)
|
||||
xfree(buf);
|
||||
#endif /* !TRE_USE_ALLOCA */
|
||||
return ret;
|
||||
}
|
||||
|
||||
/* EOF */
|
||||
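Review note: the scratch-buffer sizing at the top of `tre_tnfa_run_parallel` guards every multiplication and addition against `size_t` wrap-around before allocating. The sketch below restates that pattern in isolation. The helper names (`checked_mul`, `checked_add`, `scratch_bytes`) and the element-size parameters are assumptions made for illustration and are not part of TRE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of TRE: multiply two sizes and report
   overflow instead of wrapping.  Returns 1 on success, 0 on overflow. */
static int
checked_mul(size_t a, size_t b, size_t *out)
{
  if (b != 0 && a > SIZE_MAX / b)
    return 0;
  *out = a * b;
  return 1;
}

/* Hypothetical helper: add two sizes with the same convention. */
static int
checked_add(size_t a, size_t b, size_t *out)
{
  if (a > SIZE_MAX - b)
    return 0;
  *out = a + b;
  return 1;
}

/* Computes padding + 2 * (reach array + per-state tag arrays)
   + tmp_tags + reach_pos, the same layout as the matcher's scratch
   block.  Element sizes are parameters so the sketch does not depend
   on TRE's internal types.  Returns 1 on success, 0 on overflow. */
static int
scratch_bytes(size_t num_states, size_t num_tags,
              size_t reach_elem, size_t pos_elem, size_t *total)
{
  size_t pbytes, rbytes, tbytes, state_tag_bytes, reach_bytes, twice;
  size_t padding = (sizeof(long) - 1) * 4;

  if (num_states == SIZE_MAX)   /* num_states + 1 would wrap */
    return 0;
  if (!checked_mul(pos_elem, num_states, &pbytes)) return 0;
  if (!checked_mul(reach_elem, num_states + 1, &rbytes)) return 0;
  if (!checked_mul(sizeof(int), num_tags, &tbytes)) return 0;
  if (!checked_mul(tbytes, num_states, &state_tag_bytes)) return 0;
  if (!checked_add(rbytes, state_tag_bytes, &reach_bytes)) return 0;
  if (!checked_mul(reach_bytes, 2, &twice)) return 0;
  if (!checked_add(twice, padding, total)) return 0;
  if (!checked_add(*total, tbytes, total)) return 0;
  if (!checked_add(*total, pbytes, total)) return 0;
  return 1;
}
```

A caller would typically map a zero return to `REG_ESPACE`, as the vendored code does, and only then pass the total to the allocator.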
215
deps/tre/lib/tre-match-utils.h
vendored
Normal file
|
|
@ -0,0 +1,215 @@
|
|||
/*
|
||||
tre-match-utils.h - TRE matcher helper definitions
|
||||
|
||||
This software is released under a BSD-style license.
|
||||
See the file LICENSE for details and copyright.
|
||||
|
||||
*/
|
||||
|
||||
#define str_source ((const tre_str_source*)string)
|
||||
|
||||
#ifdef TRE_WCHAR
|
||||
|
||||
#ifdef TRE_MULTIBYTE
|
||||
|
||||
/* Wide character and multibyte support. */
|
||||
|
||||
#define GET_NEXT_WCHAR() \
|
||||
do { \
|
||||
prev_c = next_c; \
|
||||
if (type == STR_BYTE) \
|
||||
{ \
|
||||
pos++; \
|
||||
if (len >= 0 && pos >= len) \
|
||||
next_c = '\0'; \
|
||||
else \
|
||||
next_c = (unsigned char)(*str_byte++); \
|
||||
} \
|
||||
else if (type == STR_WIDE) \
|
||||
{ \
|
||||
pos++; \
|
||||
if (len >= 0 && pos >= len) \
|
||||
next_c = L'\0'; \
|
||||
else \
|
||||
next_c = *str_wide++; \
|
||||
} \
|
||||
else if (type == STR_MBS) \
|
||||
{ \
|
||||
pos += pos_add_next; \
|
||||
if (str_byte == NULL) \
|
||||
next_c = L'\0'; \
|
||||
else \
|
||||
{ \
|
||||
size_t w; \
|
||||
size_t max; \
|
||||
if (len >= 0) \
|
||||
max = len - pos; \
|
||||
else \
|
||||
max = 32; \
|
||||
if (max <= 0) \
|
||||
{ \
|
||||
next_c = L'\0'; \
|
||||
pos_add_next = 1; \
|
||||
} \
|
||||
else \
|
||||
{ \
|
||||
w = tre_mbrtowc(&next_c, str_byte, (size_t)max, &mbstate); \
|
||||
if (w == (size_t)-1 || w == (size_t)-2) \
|
||||
return REG_NOMATCH; \
|
||||
if (w == 0 && len >= 0) \
|
||||
{ \
|
||||
pos_add_next = 1; \
|
||||
next_c = 0; \
|
||||
str_byte++; \
|
||||
} \
|
||||
else \
|
||||
{ \
|
||||
pos_add_next = w; \
|
||||
str_byte += w; \
|
||||
} \
|
||||
} \
|
||||
} \
|
||||
} \
|
||||
else if (type == STR_USER) \
|
||||
{ \
|
||||
pos += pos_add_next; \
|
||||
str_user_end = str_source->get_next_char(&next_c, &pos_add_next, \
|
||||
str_source->context); \
|
||||
} \
|
||||
} while(/*CONSTCOND*/(void)0,0)
|
||||
|
||||
#else /* !TRE_MULTIBYTE */
|
||||
|
||||
/* Wide character support, no multibyte support. */
|
||||
|
||||
#define GET_NEXT_WCHAR() \
|
||||
do { \
|
||||
prev_c = next_c; \
|
||||
if (type == STR_BYTE) \
|
||||
{ \
|
||||
pos++; \
|
||||
if (len >= 0 && pos >= len) \
|
||||
next_c = '\0'; \
|
||||
else \
|
||||
next_c = (unsigned char)(*str_byte++); \
|
||||
} \
|
||||
else if (type == STR_WIDE) \
|
||||
{ \
|
||||
pos++; \
|
||||
if (len >= 0 && pos >= len) \
|
||||
next_c = L'\0'; \
|
||||
else \
|
||||
next_c = *str_wide++; \
|
||||
} \
|
||||
else if (type == STR_USER) \
|
||||
{ \
|
||||
pos += pos_add_next; \
|
||||
str_user_end = str_source->get_next_char(&next_c, &pos_add_next, \
|
||||
str_source->context); \
|
||||
} \
|
||||
} while(/*CONSTCOND*/(void)0,0)
|
||||
|
||||
#endif /* !TRE_MULTIBYTE */
|
||||
|
||||
#else /* !TRE_WCHAR */
|
||||
|
||||
/* No wide character or multibyte support. */
|
||||
|
||||
#define GET_NEXT_WCHAR() \
|
||||
do { \
|
||||
prev_c = next_c; \
|
||||
if (type == STR_BYTE) \
|
||||
{ \
|
||||
pos++; \
|
||||
if (len >= 0 && pos >= len) \
|
||||
next_c = '\0'; \
|
||||
else \
|
||||
next_c = (unsigned char)(*str_byte++); \
|
||||
} \
|
||||
else if (type == STR_USER) \
|
||||
{ \
|
||||
pos += pos_add_next; \
|
||||
str_user_end = str_source->get_next_char(&next_c, &pos_add_next, \
|
||||
str_source->context); \
|
||||
} \
|
||||
} while(/*CONSTCOND*/(void)0,0)
|
||||
|
||||
#endif /* !TRE_WCHAR */
|
||||
|
||||
|
||||
|
||||
#define IS_WORD_CHAR(c) ((c) == L'_' || tre_isalnum(c))
|
||||
|
||||
#define CHECK_ASSERTIONS(assertions) \
|
||||
(((assertions & ASSERT_AT_BOL) \
|
||||
&& (pos > 0 || reg_notbol) \
|
||||
&& (prev_c != L'\n' || !reg_newline)) \
|
||||
|| ((assertions & ASSERT_AT_EOL) \
|
||||
&& (next_c != L'\0' || reg_noteol) \
|
||||
&& (next_c != L'\n' || !reg_newline)) \
|
||||
|| ((assertions & ASSERT_AT_BOW) \
|
||||
&& (IS_WORD_CHAR(prev_c) || !IS_WORD_CHAR(next_c))) \
|
||||
|| ((assertions & ASSERT_AT_EOW) \
|
||||
&& (!IS_WORD_CHAR(prev_c) || IS_WORD_CHAR(next_c))) \
|
||||
|| ((assertions & ASSERT_AT_WB) \
|
||||
&& (pos != 0 && next_c != L'\0' \
|
||||
&& IS_WORD_CHAR(prev_c) == IS_WORD_CHAR(next_c))) \
|
||||
|| ((assertions & ASSERT_AT_WB_NEG) \
|
||||
&& (pos == 0 || next_c == L'\0' \
|
||||
|| IS_WORD_CHAR(prev_c) != IS_WORD_CHAR(next_c))))
|
||||
|
||||
#define CHECK_CHAR_CLASSES(trans_i, tnfa, eflags) \
|
||||
(((trans_i->assertions & ASSERT_CHAR_CLASS) \
|
||||
&& !(tnfa->cflags & REG_ICASE) \
|
||||
&& !tre_isctype((tre_cint_t)prev_c, trans_i->u.class)) \
|
||||
|| ((trans_i->assertions & ASSERT_CHAR_CLASS) \
|
||||
&& (tnfa->cflags & REG_ICASE) \
|
||||
&& !tre_isctype(tre_tolower((tre_cint_t)prev_c),trans_i->u.class) \
|
||||
&& !tre_isctype(tre_toupper((tre_cint_t)prev_c),trans_i->u.class)) \
|
||||
|| ((trans_i->assertions & ASSERT_CHAR_CLASS_NEG) \
|
||||
&& tre_neg_char_classes_match(trans_i->neg_classes,(tre_cint_t)prev_c,\
|
||||
tnfa->cflags & REG_ICASE)))
|
||||
|
||||
|
||||
|
||||
|
||||
/* Returns 1 if `t1' wins over `t2', 0 otherwise (including ties). */
|
||||
inline static int
|
||||
tre_tag_order(int num_tags, tre_tag_direction_t *tag_directions,
|
||||
int *t1, int *t2)
|
||||
{
|
||||
int i;
|
||||
for (i = 0; i < num_tags; i++)
|
||||
{
|
||||
if (tag_directions[i] == TRE_TAG_MINIMIZE)
|
||||
{
|
||||
if (t1[i] < t2[i])
|
||||
return 1;
|
||||
if (t1[i] > t2[i])
|
||||
return 0;
|
||||
}
|
||||
else
|
||||
{
|
||||
if (t1[i] > t2[i])
|
||||
return 1;
|
||||
if (t1[i] < t2[i])
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
/* assert(0);*/
|
||||
return 0;
|
||||
}
|
||||
|
||||
inline static int
|
||||
tre_neg_char_classes_match(tre_ctype_t *classes, tre_cint_t wc, int icase)
|
||||
{
|
||||
DPRINT(("neg_char_classes_test: %p, %d, %d\n", classes, wc, icase));
|
||||
while (*classes != (tre_ctype_t)0)
|
||||
if ((!icase && tre_isctype(wc, *classes))
|
||||
|| (icase && (tre_isctype(tre_toupper(wc), *classes)
|
||||
|| tre_isctype(tre_tolower(wc), *classes))))
|
||||
return 1; /* Match. */
|
||||
else
|
||||
classes++;
|
||||
return 0; /* No match. */
|
||||
}
|
||||
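Review note: `tre_tag_order` decides which of two paths reaching the same state survives. It is a lexicographic comparison in which each tag position is either minimized or maximized. The standalone sketch below shows the same rule with a stand-in enum; `tag_direction_t`, `TAG_MINIMIZE`, `TAG_MAXIMIZE` and `tag_order` are illustrative names, not TRE's.

```c
#include <assert.h>

/* Stand-in for TRE's tre_tag_direction_t; names are illustrative. */
typedef enum { TAG_MINIMIZE = 0, TAG_MAXIMIZE = 1 } tag_direction_t;

/* Returns 1 if tag vector t1 is preferred over t2, 0 otherwise
   (including when the two vectors are identical). */
static int
tag_order(int num_tags, const tag_direction_t *dirs,
          const int *t1, const int *t2)
{
  int i;
  for (i = 0; i < num_tags; i++)
    {
      if (t1[i] == t2[i])
        continue;               /* tie on this tag, look at the next one */
      if (dirs[i] == TAG_MINIMIZE)
        return t1[i] < t2[i];   /* smaller value wins */
      return t1[i] > t2[i];     /* larger value wins */
    }
  return 0;                     /* all tags equal: t1 does not win */
}
```

The first position where the vectors differ settles the comparison, so tags with a lower index take priority over later ones.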
155
deps/tre/lib/tre-mem.c
vendored
Normal file
|
|
@ -0,0 +1,155 @@
|
|||
/*
|
||||
tre-mem.c - TRE memory allocator
|
||||
|
||||
This software is released under a BSD-style license.
|
||||
See the file LICENSE for details and copyright.
|
||||
|
||||
*/
|
||||
|
||||
/*
|
||||
This memory allocator is for allocating small memory blocks efficiently
|
||||
in terms of memory overhead and execution speed. The allocated blocks
|
||||
cannot be freed individually, only all at once. There can be multiple
|
||||
allocators, though.
|
||||
*/
|
||||
|
||||
#ifdef HAVE_CONFIG_H
|
||||
#include <config.h>
|
||||
#endif /* HAVE_CONFIG_H */
|
||||
#include <stdlib.h>
|
||||
#include <string.h>
|
||||
|
||||
#include "tre-internal.h"
|
||||
#include "tre-mem.h"
|
||||
#include "xmalloc.h"
|
||||
|
||||
|
||||
/* Returns a new memory allocator or NULL if out of memory. */
|
||||
tre_mem_t
|
||||
tre_mem_new_impl(int provided, void *provided_block)
|
||||
{
|
||||
tre_mem_t mem;
|
||||
if (provided)
|
||||
{
|
||||
mem = provided_block;
|
||||
memset(mem, 0, sizeof(*mem));
|
||||
}
|
||||
else
|
||||
mem = xcalloc(1, sizeof(*mem));
|
||||
if (mem == NULL)
|
||||
return NULL;
|
||||
return mem;
|
||||
}
|
||||
|
||||
|
||||
/* Frees the memory allocator and all memory allocated with it. */
|
||||
void
|
||||
tre_mem_destroy(tre_mem_t mem)
|
||||
{
|
||||
tre_list_t *tmp, *l = mem->blocks;
|
||||
|
||||
while (l != NULL)
|
||||
{
|
||||
xfree(l->data);
|
||||
tmp = l->next;
|
||||
xfree(l);
|
||||
l = tmp;
|
||||
}
|
||||
xfree(mem);
|
||||
}
|
||||
|
||||
|
||||
/* Allocates a block of `size' bytes from `mem'. Returns a pointer to the
|
||||
allocated block or NULL if an underlying malloc() failed. */
|
||||
void *
|
||||
tre_mem_alloc_impl(tre_mem_t mem, int provided, void *provided_block,
|
||||
int zero, size_t size)
|
||||
{
|
||||
void *ptr;
|
||||
|
||||
if (mem->failed)
|
||||
{
|
||||
DPRINT(("tre_mem_alloc: oops, called after failure?!\n"));
|
||||
return NULL;
|
||||
}
|
||||
|
||||
#ifdef MALLOC_DEBUGGING
|
||||
if (!provided)
|
||||
{
|
||||
ptr = xmalloc(1);
|
||||
if (ptr == NULL)
|
||||
{
|
||||
DPRINT(("tre_mem_alloc: xmalloc forced failure\n"));
|
||||
mem->failed = 1;
|
||||
return NULL;
|
||||
}
|
||||
xfree(ptr);
|
||||
}
|
||||
#endif /* MALLOC_DEBUGGING */
|
||||
|
||||
if (mem->n < size)
|
||||
{
|
||||
/* We need more memory than is available in the current block.
|
||||
Allocate a new block. */
|
||||
tre_list_t *l;
|
||||
if (provided)
|
||||
{
|
||||
DPRINT(("tre_mem_alloc: using provided block\n"));
|
||||
if (provided_block == NULL)
|
||||
{
|
||||
DPRINT(("tre_mem_alloc: provided block was NULL\n"));
|
||||
mem->failed = 1;
|
||||
return NULL;
|
||||
}
|
||||
mem->ptr = provided_block;
|
||||
mem->n = TRE_MEM_BLOCK_SIZE;
|
||||
}
|
||||
else
|
||||
{
|
||||
size_t block_size;
|
||||
if (size * 8 > TRE_MEM_BLOCK_SIZE)
|
||||
block_size = size * 8;
|
||||
else
|
||||
block_size = TRE_MEM_BLOCK_SIZE;
|
||||
DPRINT(("tre_mem_alloc: allocating new %zu byte block\n",
|
||||
block_size));
|
||||
l = xmalloc(sizeof(*l));
|
||||
if (l == NULL)
|
||||
{
|
||||
mem->failed = 1;
|
||||
return NULL;
|
||||
}
|
||||
l->data = xmalloc(block_size);
|
||||
if (l->data == NULL)
|
||||
{
|
||||
xfree(l);
|
||||
mem->failed = 1;
|
||||
return NULL;
|
||||
}
|
||||
l->next = NULL;
|
||||
if (mem->current != NULL)
|
||||
mem->current->next = l;
|
||||
if (mem->blocks == NULL)
|
||||
mem->blocks = l;
|
||||
mem->current = l;
|
||||
mem->ptr = l->data;
|
||||
mem->n = block_size;
|
||||
}
|
||||
}
|
||||
|
||||
/* Make sure the next pointer will be aligned. */
|
||||
size += ALIGN(mem->ptr + size, long);
|
||||
|
||||
/* Allocate from current block. */
|
||||
ptr = mem->ptr;
|
||||
mem->ptr += size;
|
||||
mem->n -= size;
|
||||
|
||||
/* Set to zero if needed. */
|
||||
if (zero)
|
||||
memset(ptr, 0, size);
|
||||
|
||||
return ptr;
|
||||
}
|
||||
|
||||
/* EOF */
|
||||
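Review note: `tre-mem.c` is a bump (arena) allocator: small requests are carved out of larger blocks, and everything is released at once. The sketch below shows the same idea in a reduced form. It is not TRE's implementation: the `arena_*` names are hypothetical, sizes are rounded up before the capacity check, and an oversized request gets a block of exactly its own size rather than TRE's `size * 8`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define ARENA_BLOCK_SIZE 1024   /* same value as TRE_MEM_BLOCK_SIZE */

typedef struct arena_block {
  struct arena_block *next;
  char *data;
} arena_block_t;

typedef struct {
  arena_block_t *blocks;   /* every block allocated so far, newest first */
  char *ptr;               /* next free byte in the current block */
  size_t n;                /* bytes left in the current block */
} arena_t;

static void
arena_init(arena_t *a)
{
  a->blocks = NULL;
  a->ptr = NULL;
  a->n = 0;
}

/* Rounds size up to a multiple of sizeof(long) so that every returned
   pointer stays aligned for a long. */
static size_t
round_up(size_t size)
{
  size_t rem = size % sizeof(long);
  return rem ? size + (sizeof(long) - rem) : size;
}

/* Returns a block of at least `size' bytes, or NULL on failure. */
static void *
arena_alloc(arena_t *a, size_t size)
{
  void *p;

  if (size > SIZE_MAX - sizeof(long))
    return NULL;                /* rounding up would overflow */
  size = round_up(size);
  if (a->n < size)
    {
      /* Not enough room in the current block: start a new one. */
      size_t block_size = size > ARENA_BLOCK_SIZE ? size : ARENA_BLOCK_SIZE;
      arena_block_t *b = malloc(sizeof(*b));
      if (b == NULL)
        return NULL;
      b->data = malloc(block_size);
      if (b->data == NULL)
        {
          free(b);
          return NULL;
        }
      b->next = a->blocks;
      a->blocks = b;
      a->ptr = b->data;
      a->n = block_size;
    }
  p = a->ptr;
  a->ptr += size;
  a->n -= size;
  return p;
}

/* Frees every block at once; individual allocations cannot be freed. */
static void
arena_destroy(arena_t *a)
{
  arena_block_t *b = a->blocks;
  while (b != NULL)
    {
      arena_block_t *next = b->next;
      free(b->data);
      free(b);
      b = next;
    }
  arena_init(a);
}
```

The trade-off is the one the file comment describes: allocation is a pointer bump with no per-object bookkeeping, at the price of holding all memory until `arena_destroy` is called.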
66
deps/tre/lib/tre-mem.h
vendored
Normal file
|
|
@ -0,0 +1,66 @@
|
|||
/*
  tre-mem.h - TRE memory allocator interface

  This software is released under a BSD-style license.
  See the file LICENSE for details and copyright.

*/

#ifndef TRE_MEM_H
#define TRE_MEM_H 1

#include <stdlib.h>

#define TRE_MEM_BLOCK_SIZE 1024

typedef struct tre_list {
  void *data;
  struct tre_list *next;
} tre_list_t;

typedef struct tre_mem_struct {
  tre_list_t *blocks;
  tre_list_t *current;
  char *ptr;
  size_t n;
  int failed;
  void **provided;
} *tre_mem_t;


tre_mem_t tre_mem_new_impl(int provided, void *provided_block);
void *tre_mem_alloc_impl(tre_mem_t mem, int provided, void *provided_block,
                         int zero, size_t size);

/* Returns a new memory allocator or NULL if out of memory. */
#define tre_mem_new() tre_mem_new_impl(0, NULL)

/* Allocates a block of `size' bytes from `mem'. Returns a pointer to the
   allocated block or NULL if an underlying malloc() failed. */
#define tre_mem_alloc(mem, size) tre_mem_alloc_impl(mem, 0, NULL, 0, size)

/* Allocates a block of `size' bytes from `mem'. Returns a pointer to the
   allocated block or NULL if an underlying malloc() failed. The memory
   is set to zero. */
#define tre_mem_calloc(mem, size) tre_mem_alloc_impl(mem, 0, NULL, 1, size)

#ifdef TRE_USE_ALLOCA
/* alloca() versions. Like above, but memory is allocated with alloca()
   instead of malloc(). */

#define tre_mem_newa() \
  tre_mem_new_impl(1, alloca(sizeof(struct tre_mem_struct)))

#define tre_mem_alloca(mem, size) \
  ((mem)->n >= (size) \
   ? tre_mem_alloc_impl((mem), 1, NULL, 0, (size)) \
   : tre_mem_alloc_impl((mem), 1, alloca(TRE_MEM_BLOCK_SIZE), 0, (size)))
#endif /* TRE_USE_ALLOCA */


/* Frees the memory allocator and all memory allocated with it. */
void tre_mem_destroy(tre_mem_t mem);

#endif /* TRE_MEM_H */

/* EOF */
1758
deps/tre/lib/tre-parse.c
vendored
Normal file
File diff suppressed because it is too large
52
deps/tre/lib/tre-parse.h
vendored
Normal file
|
|
@ -0,0 +1,52 @@
|
|||
/*
  tre-parse.h - Regexp parser definitions

  This software is released under a BSD-style license.
  See the file LICENSE for details and copyright.

*/

#ifndef TRE_PARSE_H
#define TRE_PARSE_H 1

/* Parse context. */
typedef struct {
  /* Memory allocator. The AST is allocated using this. */
  tre_mem_t mem;
  /* Stack used for keeping track of regexp syntax. */
  tre_stack_t *stack;
  /* The parse result. */
  tre_ast_node_t *result;
  /* The regexp to parse and its length. */
  const tre_char_t *re;
  /* The first character of the entire regexp. */
  const tre_char_t *re_start;
  /* The first character after the end of the regexp. */
  const tre_char_t *re_end;
  size_t len;
  /* Current submatch ID. */
  int submatch_id;
  /* The highest back reference or -1 if none seen so far. */
  int max_backref;
  /* This flag is set if the regexp uses approximate matching. */
  int have_approx;
  /* This flag is set if the regexp changes cflags inline using (?...) */
  int have_inline_cflags;
  /* Compilation flags. */
  int cflags;
  /* If this flag is set the top-level submatch is not captured. */
  int nofirstsub;
  /* The currently set approximate matching parameters. */
  int params[TRE_PARAM_LAST];
  /* The MB_CUR_MAX in use. */
  int mb_cur_max;
} tre_parse_ctx_t;

/* Parses a wide character regexp pattern into a syntax tree. This parser
   handles both syntaxes (BRE and ERE), including the TRE extensions. */
reg_errcode_t
tre_parse(tre_parse_ctx_t *ctx);

#endif /* TRE_PARSE_H */

/* EOF */
123
deps/tre/lib/tre-stack.c
vendored
Normal file
@ -0,0 +1,123 @@
/*
  tre-stack.c - Simple stack implementation

  This software is released under a BSD-style license.
  See the file LICENSE for details and copyright.

*/

#ifdef HAVE_CONFIG_H
#include <config.h>
#endif /* HAVE_CONFIG_H */
#include <stdlib.h>
#include <assert.h>

#include "tre-internal.h"
#include "tre-stack.h"
#include "xmalloc.h"

union tre_stack_item {
  void *voidptr_value;
  int int_value;
};

struct tre_stack_rec {
  size_t size;
  size_t max_size;
  size_t ptr;
  union tre_stack_item *stack;
};


tre_stack_t *
tre_stack_new(size_t size, size_t max_size)
{
  tre_stack_t *s;

  s = xmalloc(sizeof(*s));
  if (s != NULL)
    {
      s->stack = xmalloc(sizeof(*s->stack) * size);
      if (s->stack == NULL)
        {
          xfree(s);
          return NULL;
        }
      s->size = size;
      s->max_size = max_size;
      s->ptr = 0;
    }
  return s;
}

void
tre_stack_destroy(tre_stack_t *s)
{
  xfree(s->stack);
  xfree(s);
}

size_t
tre_stack_num_items(tre_stack_t *s)
{
  return s->ptr;
}

static reg_errcode_t
tre_stack_push(tre_stack_t *s, union tre_stack_item value)
{
  if (s->ptr < s->size)
    {
      s->stack[s->ptr] = value;
      s->ptr++;
    }
  else
    {
      if (s->size >= s->max_size)
        {
          DPRINT(("tre_stack_push: stack full\n"));
          return REG_ESPACE;
        }
      else
        {
          union tre_stack_item *new_buffer;
          size_t new_size;
          DPRINT(("tre_stack_push: trying to realloc more space\n"));
          new_size = s->size + s->size;
          if (new_size > s->max_size)
            new_size = s->max_size;
          new_buffer = xrealloc(s->stack, sizeof(*new_buffer) * new_size);
          if (new_buffer == NULL)
            {
              DPRINT(("tre_stack_push: realloc failed.\n"));
              return REG_ESPACE;
            }
          DPRINT(("tre_stack_push: realloc succeeded.\n"));
          assert(new_size > s->size);
          s->size = new_size;
          s->stack = new_buffer;
          tre_stack_push(s, value);
        }
    }
  return REG_OK;
}

#define define_pushf(typetag, type) \
  declare_pushf(typetag, type) { \
    union tre_stack_item item; \
    item.typetag ## _value = value; \
    return tre_stack_push(s, item); \
  }

define_pushf(int, int)
define_pushf(voidptr, void *)

#define define_popf(typetag, type) \
  declare_popf(typetag, type) { \
    return s->stack[--s->ptr].typetag ## _value; \
  }

define_popf(int, int)
define_popf(voidptr, void *)

/* EOF */
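The growth policy in `tre_stack_push` (double the capacity, clamp to `max_size`, fail once the clamp is reached) can be tried in isolation. The following is a minimal sketch, not the vendored code: it stores plain `int` items, uses `malloc`/`realloc` directly instead of the `xmalloc` wrappers, returns `0`/`-1` instead of `REG_OK`/`REG_ESPACE`, and every `demo_*` name is hypothetical. It also assumes the initial `size` is greater than zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical type: all demo_* names are illustrative only. */
typedef struct {
  size_t size;      /* current capacity, in items */
  size_t max_size;  /* hard upper bound on the capacity */
  size_t ptr;       /* number of items currently stored */
  int *items;
} demo_stack_t;

/* Returns 0 on success, -1 if the initial buffer cannot be allocated.
   Assumes size > 0. */
static int
demo_stack_init(demo_stack_t *s, size_t size, size_t max_size)
{
  s->items = malloc(sizeof(*s->items) * size);
  if (s->items == NULL)
    return -1;
  s->size = size;
  s->max_size = max_size;
  s->ptr = 0;
  return 0;
}

/* Returns 0 on success, -1 when the stack is at max_size or realloc fails. */
static int
demo_stack_push(demo_stack_t *s, int value)
{
  if (s->ptr == s->size)
    {
      size_t new_size;
      int *new_items;

      if (s->size >= s->max_size)
        return -1;                      /* full and not allowed to grow */
      new_size = s->size + s->size;     /* double the capacity ... */
      if (new_size > s->max_size)
        new_size = s->max_size;         /* ... but never past the maximum */
      new_items = realloc(s->items, sizeof(*new_items) * new_size);
      if (new_items == NULL)
        return -1;                      /* the old buffer is still valid */
      s->items = new_items;
      s->size = new_size;
    }
  s->items[s->ptr++] = value;
  return 0;
}

/* The stack must not be empty, as with the tre_stack_pop_* functions. */
static int
demo_stack_pop(demo_stack_t *s)
{
  assert(s->ptr > 0);
  return s->items[--s->ptr];
}

static void
demo_stack_free(demo_stack_t *s)
{
  free(s->items);
  s->items = NULL;
}
```

With an initial size of 2 and a maximum of 5, the capacity goes 2, 4, 5, and the sixth push is refused; the clamp is what keeps a runaway pattern from growing the stack without bound.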
76
deps/tre/lib/tre-stack.h
vendored
Normal file
@ -0,0 +1,76 @@
/*
  tre-stack.h: Stack definitions

  This software is released under a BSD-style license.
  See the file LICENSE for details and copyright.

*/


#ifndef TRE_STACK_H
#define TRE_STACK_H 1

#include "../local_includes/tre.h"

typedef struct tre_stack_rec tre_stack_t;

/* Creates a new stack object with initial size `size' and maximum size
   `max_size'. Pushing an additional item onto a full stack will resize
   the stack to double its capacity until the maximum is reached. Returns
   the stack object or NULL if out of memory. */
tre_stack_t *
tre_stack_new(size_t size, size_t max_size);

/* Frees the stack object. */
void
tre_stack_destroy(tre_stack_t *s);

/* Returns the current number of items on the stack. */
size_t
tre_stack_num_items(tre_stack_t *s);

/* Each tre_stack_push_*(tre_stack_t *s, <type> value) function pushes
   `value' on top of stack `s'. Returns REG_ESPACE if out of memory.
   This tries to realloc() more space before failing if maximum size
   has not yet been reached. Returns REG_OK if successful. */
#define declare_pushf(typetag, type) \
  reg_errcode_t tre_stack_push_ ## typetag(tre_stack_t *s, type value)

declare_pushf(voidptr, void *);
declare_pushf(int, int);

/* Each tre_stack_pop_*(tre_stack_t *s) function pops the topmost
   element off of stack `s' and returns it. The stack must not be
   empty. */
#define declare_popf(typetag, type) \
  type tre_stack_pop_ ## typetag(tre_stack_t *s)

declare_popf(voidptr, void *);
declare_popf(int, int);

/* Just to save some typing. */
#define STACK_PUSH(s, typetag, value) \
  do \
    { \
      status = tre_stack_push_ ## typetag(s, value); \
    } \
  while (/*CONSTCOND*/(void)0,0)

#define STACK_PUSHX(s, typetag, value) \
  { \
    status = tre_stack_push_ ## typetag(s, value); \
    if (status != REG_OK) \
      break; \
  }

#define STACK_PUSHR(s, typetag, value) \
  { \
    reg_errcode_t _status; \
    _status = tre_stack_push_ ## typetag(s, value); \
    if (_status != REG_OK) \
      return _status; \
  }

#endif /* TRE_STACK_H */

/* EOF */
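The three convenience macros in `tre-stack.h` differ in what they expect from the call site: `STACK_PUSH` and `STACK_PUSHX` assign to a variable named `status` that must already be in scope, `STACK_PUSHX` additionally needs an enclosing loop or `switch` for its `break`, and `STACK_PUSHR` returns from the enclosing function. The sketch below mimics the last two shapes against a hypothetical fixed-capacity stack so the control flow can be observed without TRE; `demo_push`, `DEMO_PUSHX`, `DEMO_PUSHR` and the `DEMO_*` codes are illustrative names, not part of the library.

```c
#include <stddef.h>

enum { DEMO_OK = 0, DEMO_ESPACE = 1 };

/* Hypothetical 4-slot stack, only here to give the macros something
   that can fail. */
typedef struct {
  int items[4];
  size_t ptr;
} demo_stack_t;

static int
demo_push(demo_stack_t *s, int value)
{
  if (s->ptr >= sizeof(s->items) / sizeof(s->items[0]))
    return DEMO_ESPACE;
  s->items[s->ptr++] = value;
  return DEMO_OK;
}

/* Same shape as STACK_PUSHX: writes to `status' from the surrounding
   scope and breaks out of the enclosing loop on failure. */
#define DEMO_PUSHX(s, value) \
  { \
    status = demo_push(s, value); \
    if (status != DEMO_OK) \
      break; \
  }

/* Same shape as STACK_PUSHR: returns the error from the enclosing
   function on failure. */
#define DEMO_PUSHR(s, value) \
  { \
    int _status; \
    _status = demo_push(s, value); \
    if (_status != DEMO_OK) \
      return _status; \
  }

/* Pushes 0..n-1; leaves the loop at the first failed push and reports
   the status afterwards, so cleanup code could run before returning. */
static int
demo_fill_break(demo_stack_t *s, int n)
{
  int status = DEMO_OK;
  int i;

  for (i = 0; i < n; i++)
    {
      DEMO_PUSHX(s, i);
    }
  return status;
}

/* Pushes 0..n-1; returns immediately at the first failed push. */
static int
demo_fill_return(demo_stack_t *s, int n)
{
  int i;

  for (i = 0; i < n; i++)
    {
      DEMO_PUSHR(s, i);
    }
  return DEMO_OK;
}
```

The break-style macro suits functions that must release resources after the loop, while the return-style macro is shorter when nothing needs to be undone. Only `STACK_PUSH` is wrapped in `do ... while (0)`, so it is the only one of the three that behaves as a single statement in an unbraced `if`/`else`.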
362
deps/tre/lib/xmalloc.c
vendored
Normal file
@ -0,0 +1,362 @@
/*
  xmalloc.c - Simple malloc debugging library implementation

  This software is released under a BSD-style license.
  See the file LICENSE for details and copyright.

*/

/*
  TODO:
   - red zones
   - group dumps by source location
*/

#ifdef HAVE_CONFIG_H
#include <config.h>
#endif /* HAVE_CONFIG_H */

#include <stdint.h>
#include <stdlib.h>
#include <assert.h>
#include <stdio.h>
#define XMALLOC_INTERNAL 1
#include "xmalloc.h"


/*
  Internal stuff.
*/

typedef struct hashTableItemRec {
  void *ptr;
  size_t bytes;
  const char *file;
  int line;
  const char *func;
  struct hashTableItemRec *next;
} hashTableItem;

typedef struct {
  hashTableItem **table;
} hashTable;

static int xmalloc_peak;
int xmalloc_current;
static int xmalloc_peak_blocks;
int xmalloc_current_blocks;
static int xmalloc_fail_after;

#define TABLE_BITS 8
#define TABLE_MASK ((1 << TABLE_BITS) - 1)
#define TABLE_SIZE (1 << TABLE_BITS)

static hashTable *
hash_table_new(void)
{
  hashTable *tbl;

  tbl = malloc(sizeof(*tbl));

  if (tbl != NULL)
    {
      tbl->table = calloc(TABLE_SIZE, sizeof(*tbl->table));

      if (tbl->table == NULL)
        {
          free(tbl);
          return NULL;
        }
    }

  return tbl;
}

static unsigned int
hash_void_ptr(void *ptr)
{
  unsigned int hash;
  unsigned int i;

  /* I took this hash function just off the top of my head, I have
     no idea whether it is bad or very bad. */
  hash = 0;
  for (i = 0; i < sizeof(ptr) * 8 / TABLE_BITS; i++)
    {
      hash ^= (uintptr_t)ptr >> i * 8;
      hash += i * 17;
      hash &= TABLE_MASK;
    }
  return hash;
}

static void
hash_table_add(hashTable *tbl, void *ptr, size_t bytes,
               const char *file, int line, const char *func)
{
  unsigned int i;
  hashTableItem *item, *new;

  i = hash_void_ptr(ptr);

  item = tbl->table[i];
  if (item != NULL)
    while (item->next != NULL)
      item = item->next;

  new = malloc(sizeof(*new));
  assert(new != NULL);
  new->ptr = ptr;
  new->bytes = bytes;
  new->file = file;
  new->line = line;
  new->func = func;
  new->next = NULL;
  if (item != NULL)
    item->next = new;
  else
    tbl->table[i] = new;

  xmalloc_current += bytes;
  if (xmalloc_current > xmalloc_peak)
    xmalloc_peak = xmalloc_current;
  xmalloc_current_blocks++;
  if (xmalloc_current_blocks > xmalloc_peak_blocks)
    xmalloc_peak_blocks = xmalloc_current_blocks;
}

static void
#if defined(__GNUC__) && __GNUC__ >= 10
__attribute__((access(none, 2)))
#endif
hash_table_del(hashTable *tbl, void *ptr)
{
  int i;
  hashTableItem *item, *prev;

  i = hash_void_ptr(ptr);

  item = tbl->table[i];
  if (item == NULL)
    {
      printf("xfree: invalid ptr %p\n", ptr);
      abort();
    }
  prev = NULL;
  /* Walk the bucket chain. Stop at the end of the chain so that an
     unknown pointer is reported below instead of dereferencing NULL. */
  while (item != NULL && item->ptr != ptr)
    {
      prev = item;
      item = item->next;
    }
  if (item == NULL)
    {
      printf("xfree: invalid ptr %p\n", ptr);
      abort();
    }

  xmalloc_current -= item->bytes;
  xmalloc_current_blocks--;

  if (prev != NULL)
    {
      prev->next = item->next;
      free(item);
    }
  else
    {
      tbl->table[i] = item->next;
      free(item);
    }
}

static hashTable *xmalloc_table = NULL;

static void
xmalloc_init(void)
{
  if (xmalloc_table == NULL)
    {
      xmalloc_table = hash_table_new();
      xmalloc_peak = 0;
      xmalloc_peak_blocks = 0;
      xmalloc_current = 0;
      xmalloc_current_blocks = 0;
      xmalloc_fail_after = -1;
    }
  assert(xmalloc_table != NULL);
  assert(xmalloc_table->table != NULL);
}



/*
  Public API.
*/

void
xmalloc_configure(int fail_after)
{
  xmalloc_init();
  xmalloc_fail_after = fail_after;
}

int
xmalloc_dump_leaks(void)
{
  unsigned int i;
  unsigned int num_leaks = 0;
  size_t leaked_bytes = 0;
  hashTableItem *item;

  xmalloc_init();

  for (i = 0; i < TABLE_SIZE; i++)
    {
      item = xmalloc_table->table[i];
      while (item != NULL)
        {
          printf("%s:%d: %s: %zu bytes at %p not freed\n",
                 item->file, item->line, item->func, item->bytes, item->ptr);
          num_leaks++;
          leaked_bytes += item->bytes;
          item = item->next;
        }
    }
  if (num_leaks == 0)
    printf("No memory leaks.\n");
  else
    printf("%u unfreed memory chunks, total %zu unfreed bytes.\n",
           num_leaks, leaked_bytes);
  printf("Peak memory consumption %d bytes (%.1f kB, %.1f MB) in %d blocks ",
         xmalloc_peak, (double)xmalloc_peak / 1024,
         (double)xmalloc_peak / (1024*1024), xmalloc_peak_blocks);
  printf("(average ");
  if (xmalloc_peak_blocks)
    printf("%d", ((xmalloc_peak + xmalloc_peak_blocks / 2)
                  / xmalloc_peak_blocks));
  else
    printf("N/A");
  printf(" bytes per block).\n");

  return num_leaks;
}

void *
xmalloc_impl(size_t size, const char *file, int line, const char *func)
{
  void *ptr;

  xmalloc_init();
  assert(size > 0);

  if (xmalloc_fail_after == 0)
    {
      xmalloc_fail_after = -2;
#if 0
      printf("xmalloc: forced failure %s:%d: %s\n", file, line, func);
#endif
      return NULL;
    }
  else if (xmalloc_fail_after == -2)
    {
      printf("xmalloc: called after failure from %s:%d: %s\n",
             file, line, func);
      assert(0);
    }
  else if (xmalloc_fail_after > 0)
    xmalloc_fail_after--;

  ptr = malloc(size);
  if (ptr != NULL)
    hash_table_add(xmalloc_table, ptr, (int)size, file, line, func);
  return ptr;
}

void *
xcalloc_impl(size_t nmemb, size_t size, const char *file, int line,
             const char *func)
{
  void *ptr;

  xmalloc_init();
  assert(size > 0);

  if (xmalloc_fail_after == 0)
    {
      xmalloc_fail_after = -2;
#if 0
      printf("xcalloc: forced failure %s:%d: %s\n", file, line, func);
#endif
      return NULL;
    }
  else if (xmalloc_fail_after == -2)
    {
      printf("xcalloc: called after failure from %s:%d: %s\n",
             file, line, func);
      assert(0);
    }
  else if (xmalloc_fail_after > 0)
    xmalloc_fail_after--;

  ptr = calloc(nmemb, size);
  if (ptr != NULL)
    hash_table_add(xmalloc_table, ptr, (int)(nmemb * size), file, line, func);
  return ptr;
}

void
xfree_impl(void *ptr, const char *file, int line, const char *func)
{
  /*LINTED*/(void)&file;
  /*LINTED*/(void)&line;
  /*LINTED*/(void)&func;
  xmalloc_init();

  if (ptr != NULL)
    hash_table_del(xmalloc_table, ptr);
  free(ptr);
}

void *
xrealloc_impl(void *ptr, size_t new_size, const char *file, int line,
              const char *func)
{
  void *new_ptr;

  xmalloc_init();
  assert(ptr != NULL);
  assert(new_size > 0);

  if (xmalloc_fail_after == 0)
    {
      xmalloc_fail_after = -2;
      return NULL;
    }
  else if (xmalloc_fail_after == -2)
    {
      printf("xrealloc: called after failure from %s:%d: %s\n",
             file, line, func);
      assert(0);
    }
  else if (xmalloc_fail_after > 0)
    xmalloc_fail_after--;

  new_ptr = realloc(ptr, new_size);
  if (new_ptr != NULL && new_ptr != ptr)
    {
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wuse-after-free"
#endif
      hash_table_del(xmalloc_table, ptr);
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC diagnostic pop
#endif
      hash_table_add(xmalloc_table, new_ptr, (int)new_size, file, line, func);
    }
  return new_ptr;
}



/* EOF */
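The `xmalloc_fail_after` counter is a small fault-injection device: `-1` means never fail, a value `N >= 0` lets `N` more allocations through and then forces one `NULL`, and `-2` records that the forced failure has already happened. A minimal sketch of that state machine follows, with two stated deviations from the file above: it does no leak tracking, and in the `-2` state it keeps returning `NULL` where the original prints a message and asserts. The `demo_*` names are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical counter with the same encoding as xmalloc_fail_after:
   -1 = never fail, N >= 0 = fail after N more successful calls,
   -2 = the forced failure has been delivered. */
static int demo_fail_after = -1;

static void
demo_configure(int fail_after)
{
  demo_fail_after = fail_after;
}

static void *
demo_malloc(size_t size)
{
  if (demo_fail_after == 0)
    {
      /* Deliver the forced failure exactly once. */
      demo_fail_after = -2;
      return NULL;
    }
  else if (demo_fail_after == -2)
    {
      /* Deviation: the original asserts here, because code that keeps
         allocating after an out-of-memory error is considered a bug. */
      return NULL;
    }
  else if (demo_fail_after > 0)
    demo_fail_after--;

  return malloc(size);
}
```

A test driver would typically call `demo_configure(n)` for n = 0, 1, 2, ... and rerun the same operation each time, so that every allocation site gets to see a failure once.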
77
deps/tre/lib/xmalloc.h
vendored
Normal file
@ -0,0 +1,77 @@
/*
  xmalloc.h - Simple malloc debugging library API

  This software is released under a BSD-style license.
  See the file LICENSE for details and copyright.

*/

#ifndef _XMALLOC_H
#define _XMALLOC_H 1

void *xmalloc_impl(size_t size, const char *file, int line, const char *func);
void *xcalloc_impl(size_t nmemb, size_t size, const char *file, int line,
                   const char *func);
void xfree_impl(void *ptr, const char *file, int line, const char *func);
void *xrealloc_impl(void *ptr, size_t new_size, const char *file, int line,
                    const char *func);
int xmalloc_dump_leaks(void);
void xmalloc_configure(int fail_after);


#ifndef XMALLOC_INTERNAL
#ifdef MALLOC_DEBUGGING

/* Version 2.4 and later of GCC define a magical variable `__PRETTY_FUNCTION__'
   which contains the name of the function currently being defined.
#  define __XMALLOC_FUNCTION __PRETTY_FUNCTION__
   This is broken in G++ before version 2.6.
   C9x has a similar variable called __func__, but prefer the GCC one since
   it demangles C++ function names. */
# ifdef __GNUC__
#  if __GNUC__ > 2 || (__GNUC__ == 2 \
                       && __GNUC_MINOR__ >= (defined __cplusplus ? 6 : 4))
#   define __XMALLOC_FUNCTION __PRETTY_FUNCTION__
#  else
#   define __XMALLOC_FUNCTION ((const char *) 0)
#  endif
# else
#  if defined __STDC_VERSION__ && __STDC_VERSION__ >= 199901L
#   define __XMALLOC_FUNCTION __func__
#  else
#   define __XMALLOC_FUNCTION ((const char *) 0)
#  endif
# endif

#define xmalloc(size) xmalloc_impl(size, __FILE__, __LINE__, \
                                   __XMALLOC_FUNCTION)
#define xcalloc(nmemb, size) xcalloc_impl(nmemb, size, __FILE__, __LINE__, \
                                          __XMALLOC_FUNCTION)
#define xfree(ptr) xfree_impl(ptr, __FILE__, __LINE__, __XMALLOC_FUNCTION)
#define xrealloc(ptr, new_size) xrealloc_impl(ptr, new_size, __FILE__, \
                                              __LINE__, __XMALLOC_FUNCTION)
#undef malloc
#undef calloc
#undef free
#undef realloc

#define malloc USE_XMALLOC_INSTEAD_OF_MALLOC
#define calloc USE_XCALLOC_INSTEAD_OF_CALLOC
#define free USE_XFREE_INSTEAD_OF_FREE
#define realloc USE_XREALLOC_INSTEAD_OF_REALLOC

#else /* !MALLOC_DEBUGGING */

#include <stdlib.h>

#define xmalloc(size) malloc(size)
#define xcalloc(nmemb, size) calloc(nmemb, size)
#define xfree(ptr) free(ptr)
#define xrealloc(ptr, new_size) realloc(ptr, new_size)

#endif /* !MALLOC_DEBUGGING */
#endif /* !XMALLOC_INTERNAL */

#endif /* _XMALLOC_H */

/* EOF */
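The debugging build works because `xmalloc(size)` is a macro: `__FILE__`, `__LINE__` and the function name are expanded at the call site, not inside the allocator. The sketch below shows only that mechanism. It assumes a C99 compiler (it uses `__func__` unconditionally rather than the `__XMALLOC_FUNCTION` selection above), it remembers just the most recent call site instead of keeping a table, and all `demo_*` names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical bookkeeping: the most recent allocation call site. */
static const char *demo_last_file;
static int demo_last_line;
static const char *demo_last_func;

static void *
demo_malloc_impl(size_t size, const char *file, int line, const char *func)
{
  demo_last_file = file;
  demo_last_line = line;
  demo_last_func = func;
  return malloc(size);
}

/* Expanded where it is used, so the three location arguments describe
   the caller rather than demo_malloc_impl itself. */
#define demo_malloc(size) \
  demo_malloc_impl(size, __FILE__, __LINE__, __func__)

/* An ordinary caller; its name is what ends up in demo_last_func. */
static void *
demo_caller(void)
{
  return demo_malloc(16);
}

/* Returns 1 if the last recorded allocation came from function `func'. */
static int
demo_called_from(const char *func)
{
  return demo_last_func != NULL && strcmp(demo_last_func, func) == 0;
}
```

The same trick is why the header redefines `malloc`, `calloc`, `free` and `realloc` to undefined identifiers in the debugging build: any call that bypasses the wrappers would go untracked, so it is turned into a compile error instead.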
48
deps/tre/local_includes/regex.h
vendored
Normal file
@ -0,0 +1,48 @@
/*
  regex.h - TRE legacy API

  This software is released under a BSD-style license.
  See the file LICENSE for details and copyright.

  This header is for source level compatibility with old code using
  the <tre/regex.h> header which defined the TRE API functions without
  a prefix. New code should include <tre/tre.h> instead.

*/

#ifndef TRE_REGEX_H
#define TRE_REGEX_H 1

#ifdef USE_LOCAL_TRE_H
/* Use the header(s) from the TRE package that this file is part of.
   (Yes, this file is in local_includes too, but the explicit path
   means there is no way to get a system tre.h by accident.) */
#include "../local_includes/tre.h"
#else
/* Use the header(s) from an installed version of the TRE package
   (so that this application matches the installed libtre),
   not the one(s) in the local_includes directory. */
#include <tre/tre.h>
#endif

#ifndef TRE_USE_SYSTEM_REGEX_H
#define regcomp tre_regcomp
#define regerror tre_regerror
#define regexec tre_regexec
#define regfree tre_regfree
#endif /* TRE_USE_SYSTEM_REGEX_H */

#define regacomp tre_regacomp
#define regaexec tre_regaexec
#define regancomp tre_regancomp
#define reganexec tre_reganexec
#define regawncomp tre_regawncomp
#define regawnexec tre_regawnexec
#define regncomp tre_regncomp
#define regnexec tre_regnexec
#define regwcomp tre_regwcomp
#define regwexec tre_regwexec
#define regwncomp tre_regwncomp
#define regwnexec tre_regwnexec

#endif /* TRE_REGEX_H */
14
deps/tre/local_includes/tre-config.h
vendored
Normal file
@ -0,0 +1,14 @@
/* Minimal TRE configuration for Redis.
 *
 * We use TRE as a byte-oriented regex matcher for ARGREP. Redis SDS values are
 * binary-safe byte strings, so we intentionally keep the dependency build
 * simple: no wide-char path, no multibyte locale handling, and no approximate
 * matching engine.
 */

#define HAVE_SYS_TYPES_H 1

#define TRE_VERSION "redis-vendored"
#define TRE_VERSION_1 0
#define TRE_VERSION_2 0
#define TRE_VERSION_3 0
344
deps/tre/local_includes/tre.h
vendored
Normal file
@ -0,0 +1,344 @@
/*
  tre.h - TRE public API definitions

  This software is released under a BSD-style license.
  See the file LICENSE for details and copyright.

*/

#ifndef TRE_H
#define TRE_H 1

#ifdef USE_LOCAL_TRE_H
/* Make certain to use the header(s) from the TRE package that this
   file is part of by giving the full path to the header from this directory. */
#include "../local_includes/tre-config.h"
#else
/* Use the header in the same directory as this file if there is one. */
#include "tre-config.h"
#endif

#ifdef HAVE_SYS_TYPES_H
#include <sys/types.h>
#endif /* HAVE_SYS_TYPES_H */

#ifdef HAVE_LIBUTF8_H
#include <libutf8.h>
#endif /* HAVE_LIBUTF8_H */

#ifdef TRE_USE_SYSTEM_REGEX_H
/* Include the system regex.h to make TRE ABI compatible with the
   system regex. */
#include TRE_SYSTEM_REGEX_H_PATH
#define tre_regcomp regcomp
#define tre_regexec regexec
#define tre_regerror regerror
#define tre_regfree regfree
/* The GNU C regex has a number of refinements to the POSIX standard for the
   formal parameter list of the regexec() function, and some of those fail to
   compile when using LLVM. The refinements seem to be opt-out rather than
   opt-in when using a recent gcc, and they produce a warning when TRE tries
   to mimic the API without the refinements. The TRE code still works but
   the warnings are distracting, so try to #define a flag to indicate when to
   add the refinements to TRE's parameter list too. */
#ifdef __GNUC__
/* Try to test something that looks pretty REGEX specific and hope we don't
   need a zillion different platform+compiler specific tests to deal with this. */
#ifdef _REGEX_NELTS
/* Define a TRE specific flag here so that:
   1) there is only one place where code has to be changed if the test above is not adequate, and
   2) the flag can be used in any other parts of the TRE source that might be affected by the
      GNUC refinements.
   Note that this flag is only defined when all of TRE_USE_SYSTEM_REGEX_H, __GNUC__, and _REGEX_NELTS are defined. */
#define TRE_USE_GNUC_REGEXEC_FPL 1
#endif
#endif
#endif /* TRE_USE_SYSTEM_REGEX_H */

#ifdef __cplusplus
extern "C" {
#endif

#ifdef TRE_USE_SYSTEM_REGEX_H

#ifndef REG_OK
#define REG_OK 0
#endif /* !REG_OK */

#ifndef HAVE_REG_ERRCODE_T
typedef int reg_errcode_t;
#endif /* !HAVE_REG_ERRCODE_T */

#if !defined(REG_NOSPEC) && !defined(REG_LITERAL)
#define REG_LITERAL 0x1000
#endif

/* Extra tre_regcomp() return error codes. */
#define REG_BADMAX REG_BADBR

/* Extra tre_regcomp() flags. */
#ifndef REG_BASIC
#define REG_BASIC 0
#endif /* !REG_BASIC */
#define REG_RIGHT_ASSOC (REG_LITERAL << 1)
#ifdef REG_UNGREEDY
/* We're going to use TRE code, so we need the TRE define (dodge problem in MacOS). */
#undef REG_UNGREEDY
#endif
#define REG_UNGREEDY (REG_RIGHT_ASSOC << 1)

#define REG_USEBYTES (REG_UNGREEDY << 1)

/* Extra tre_regexec() flags. */
#define REG_APPROX_MATCHER 0x1000
#ifdef REG_BACKTRACKING_MATCHER
/* We're going to use TRE code, so we need the TRE define (dodge problem in MacOS). */
#undef REG_BACKTRACKING_MATCHER
#endif
#define REG_BACKTRACKING_MATCHER (REG_APPROX_MATCHER << 1)

#else /* !TRE_USE_SYSTEM_REGEX_H */

/* If we're not using the system regex.h, we need to define the
   structs and enums ourselves. */

typedef int regoff_t;
typedef struct {
  size_t re_nsub;  /* Number of parenthesized subexpressions. */
  void *value;     /* For internal use only. */
} regex_t;

typedef struct {
  regoff_t rm_so;
  regoff_t rm_eo;
} regmatch_t;


typedef enum {
  REG_OK = 0,    /* No error. */
  /* POSIX tre_regcomp() return error codes. (In the order listed in the
     standard.) */
  REG_NOMATCH,   /* No match. */
  REG_BADPAT,    /* Invalid regexp. */
  REG_ECOLLATE,  /* Unknown collating element. */
  REG_ECTYPE,    /* Unknown character class name. */
  REG_EESCAPE,   /* Trailing backslash. */
  REG_ESUBREG,   /* Invalid back reference. */
  REG_EBRACK,    /* "[]" imbalance */
  REG_EPAREN,    /* "\(\)" or "()" imbalance */
  REG_EBRACE,    /* "\{\}" or "{}" imbalance */
  REG_BADBR,     /* Invalid content of {} */
  REG_ERANGE,    /* Invalid use of range operator */
  REG_ESPACE,    /* Out of memory. */
  REG_BADRPT,    /* Invalid use of repetition operators. */
  REG_BADMAX,    /* Maximum repetition in {} too large */
} reg_errcode_t;

/* POSIX tre_regcomp() flags. */
#define REG_EXTENDED 1
#define REG_ICASE (REG_EXTENDED << 1)
#define REG_NEWLINE (REG_ICASE << 1)
#define REG_NOSUB (REG_NEWLINE << 1)

/* Extra tre_regcomp() flags. */
#define REG_BASIC 0
#define REG_LITERAL (REG_NOSUB << 1)
#define REG_RIGHT_ASSOC (REG_LITERAL << 1)
#define REG_UNGREEDY (REG_RIGHT_ASSOC << 1)
#define REG_USEBYTES (REG_UNGREEDY << 1)

/* POSIX tre_regexec() flags. */
#define REG_NOTBOL 1
#define REG_NOTEOL (REG_NOTBOL << 1)

/* Extra tre_regexec() flags. */
#define REG_APPROX_MATCHER (REG_NOTEOL << 1)
#define REG_BACKTRACKING_MATCHER (REG_APPROX_MATCHER << 1)

#endif /* !TRE_USE_SYSTEM_REGEX_H */

/* REG_NOSPEC and REG_LITERAL mean the same thing. */
#if defined(REG_LITERAL) && !defined(REG_NOSPEC)
#define REG_NOSPEC REG_LITERAL
#elif defined(REG_NOSPEC) && !defined(REG_LITERAL)
#define REG_LITERAL REG_NOSPEC
#endif /* defined(REG_NOSPEC) */

/* The maximum number of iterations in a bound expression. */
#undef RE_DUP_MAX
#define RE_DUP_MAX 255

/* The POSIX.2 regexp functions */
extern int
tre_regcomp(regex_t *preg, const char *regex, int cflags);

#ifdef TRE_USE_GNUC_REGEXEC_FPL
extern int
tre_regexec(const regex_t *preg, const char *string,
            size_t nmatch, regmatch_t pmatch[_Restrict_arr_ _REGEX_NELTS (nmatch)],
            int eflags);
#else
extern int
tre_regexec(const regex_t *preg, const char *string, size_t nmatch,
            regmatch_t pmatch[], int eflags);
#endif

extern int
tre_regcompb(regex_t *preg, const char *regex, int cflags);

extern int
tre_regexecb(const regex_t *preg, const char *string, size_t nmatch,
             regmatch_t pmatch[], int eflags);

extern size_t
tre_regerror(int errcode, const regex_t *preg, char *errbuf,
             size_t errbuf_size);

extern void
tre_regfree(regex_t *preg);

#ifdef TRE_WCHAR
#ifdef HAVE_WCHAR_H
#include <wchar.h>
#endif /* HAVE_WCHAR_H */

/* Wide character versions (not in POSIX.2). */
extern int
tre_regwcomp(regex_t *preg, const wchar_t *regex, int cflags);

extern int
tre_regwexec(const regex_t *preg, const wchar_t *string,
             size_t nmatch, regmatch_t pmatch[], int eflags);
#endif /* TRE_WCHAR */

/* Versions with a maximum length argument and therefore the capability to
   handle null characters in the middle of the strings (not in POSIX.2). */
extern int
tre_regncomp(regex_t *preg, const char *regex, size_t len, int cflags);

extern int
tre_regnexec(const regex_t *preg, const char *string, size_t len,
             size_t nmatch, regmatch_t pmatch[], int eflags);

/* The regn*b versions take bytes literally, as 8-bit values. */
extern int
tre_regncompb(regex_t *preg, const char *regex, size_t n, int cflags);

extern int
tre_regnexecb(const regex_t *preg, const char *str, size_t len,
              size_t nmatch, regmatch_t pmatch[], int eflags);

#ifdef TRE_WCHAR
extern int
tre_regwncomp(regex_t *preg, const wchar_t *regex, size_t len, int cflags);

extern int
tre_regwnexec(const regex_t *preg, const wchar_t *string, size_t len,
              size_t nmatch, regmatch_t pmatch[], int eflags);
#endif /* TRE_WCHAR */

#ifdef TRE_APPROX

/* Approximate matching parameter struct. */
typedef struct {
  int cost_ins;    /* Default cost of an inserted character. */
  int cost_del;    /* Default cost of a deleted character. */
  int cost_subst;  /* Default cost of a substituted character. */
  int max_cost;    /* Maximum allowed cost of a match. */

  int max_ins;     /* Maximum allowed number of inserts. */
  int max_del;     /* Maximum allowed number of deletes. */
  int max_subst;   /* Maximum allowed number of substitutes. */
  int max_err;     /* Maximum allowed number of errors total. */
} regaparams_t;

/* Approximate matching result struct. */
typedef struct {
  size_t nmatch;       /* Length of pmatch[] array. */
  regmatch_t *pmatch;  /* Submatch data. */
  int cost;            /* Cost of the match. */
  int num_ins;         /* Number of inserts in the match. */
  int num_del;         /* Number of deletes in the match. */
  int num_subst;       /* Number of substitutes in the match. */
} regamatch_t;


/* Approximate matching functions. */
extern int
tre_regaexec(const regex_t *preg, const char *string,
             regamatch_t *match, regaparams_t params, int eflags);

extern int
tre_reganexec(const regex_t *preg, const char *string, size_t len,
              regamatch_t *match, regaparams_t params, int eflags);

extern int
tre_regaexecb(const regex_t *preg, const char *string,
              regamatch_t *match, regaparams_t params, int eflags);

#ifdef TRE_WCHAR
/* Wide character approximate matching. */
extern int
tre_regawexec(const regex_t *preg, const wchar_t *string,
              regamatch_t *match, regaparams_t params, int eflags);

extern int
tre_regawnexec(const regex_t *preg, const wchar_t *string, size_t len,
               regamatch_t *match, regaparams_t params, int eflags);
#endif /* TRE_WCHAR */

/* Sets the parameters to default values. */
extern void
tre_regaparams_default(regaparams_t *params);
#endif /* TRE_APPROX */

#ifdef TRE_WCHAR
typedef wchar_t tre_char_t;
#else /* !TRE_WCHAR */
typedef unsigned char tre_char_t;
#endif /* !TRE_WCHAR */

typedef struct {
  int (*get_next_char)(tre_char_t *c, unsigned int *pos_add, void *context);
  void (*rewind)(size_t pos, void *context);
  int (*compare)(size_t pos1, size_t pos2, size_t len, void *context);
  void *context;
} tre_str_source;

extern int
tre_reguexec(const regex_t *preg, const tre_str_source *string,
             size_t nmatch, regmatch_t pmatch[], int eflags);

/* Returns the version string. The returned string is static. */
extern char *
tre_version(void);

/* Returns the value for a config parameter. The type to which `result'
   must point depends on the value of `query'; see the documentation for
   more details. */
extern int
tre_config(int query, void *result);

enum {
  TRE_CONFIG_APPROX,
  TRE_CONFIG_WCHAR,
  TRE_CONFIG_MULTIBYTE,
  TRE_CONFIG_SYSTEM_ABI,
  TRE_CONFIG_VERSION
};

/* Returns 1 if the compiled pattern has back references, 0 if not. */
extern int
tre_have_backrefs(const regex_t *preg);

/* Returns 1 if the compiled pattern uses approximate matching features,
   0 if not. */
extern int
tre_have_approx(const regex_t *preg);

#ifdef __cplusplus
}
#endif
#endif /* TRE_H */

/* EOF */
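The core of the API declared in `tre.h` follows the POSIX compile, execute, free cycle: `tre_regcomp`, `tre_regexec` and `tre_regfree` are declared above with the same parameter lists as POSIX `regcomp`, `regexec` and `regfree`. The sketch below shows that cycle using the system `<regex.h>` as a stand-in so it builds without the vendored library; swapping in the `tre_`-prefixed names is an assumption that has not been tested here, and `demo_first_match` is a hypothetical helper. Note that these calls work on NUL-terminated strings; for binary-safe input the header declares the length-taking `tre_regncompb` and `tre_regnexecb` variants instead.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: finds the first match of the extended regular
   expression `pattern' in `text'.  Returns 0 and stores the byte offsets
   of the match in *start and *end; returns a non-zero regcomp()/regexec()
   code (for example REG_NOMATCH) otherwise. */
static int
demo_first_match(const char *pattern, const char *text,
                 regoff_t *start, regoff_t *end)
{
  regex_t preg;
  regmatch_t pmatch[1];
  int rc;

  rc = regcomp(&preg, pattern, REG_EXTENDED);
  if (rc != 0)
    return rc;              /* nothing to free when compilation fails */

  rc = regexec(&preg, text, 1, pmatch, 0);
  if (rc == 0)
    {
      /* pmatch[0] describes the whole match: [rm_so, rm_eo). */
      *start = pmatch[0].rm_so;
      *end = pmatch[0].rm_eo;
    }
  regfree(&preg);
  return rc;
}
```

Calling `demo_first_match("[0-9]+", "row 18552", &so, &eo)` reports the half-open byte range of the digit run; a caller that only needs a yes/no answer could compile with `REG_NOSUB` and pass `nmatch` as 0.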
1871
deps/tre/tests/retest.c
vendored
Normal file
File diff suppressed because it is too large
303
deps/tre/tests/test-literal-opt.c
vendored
Normal file
@ -0,0 +1,303 @@
/*
  test-literal-opt.c - Validate TRE literal optimization against the
  generic matcher.

  This software is released under a BSD-style license.
  See the file LICENSE for details and copyright.

*/

#ifdef HAVE_CONFIG_H
#include <config.h>
#endif /* HAVE_CONFIG_H */

#include <locale.h>
#include <stdio.h>
#include <string.h>

#include "tre-internal.h"

#define PMATCH_SLOTS 4
#define RC_ANY -9999

typedef struct {
  const char *name;
  const char *pattern;
  size_t pattern_len;
  int cflags;
  const char *string;
  size_t string_len;
  int eflags;
  int expected_rc;
  tre_literal_opt_mode_t expected_mode;
} litopt_case_t;

static void
init_pmatch(regmatch_t pmatch[], size_t count)
{
  size_t i;

  for (i = 0; i < count; i++)
    {
      pmatch[i].rm_so = 111;
      pmatch[i].rm_eo = 222;
    }
}

static int
same_pmatch(const regmatch_t a[], const regmatch_t b[], size_t count)
{
  size_t i;

  for (i = 0; i < count; i++)
    if (a[i].rm_so != b[i].rm_so || a[i].rm_eo != b[i].rm_eo)
      return 0;
  return 1;
}

static int
pmatch_cleared(const regmatch_t pmatch[], size_t count)
{
  size_t i;

  for (i = 0; i < count; i++)
    if (pmatch[i].rm_so != -1 || pmatch[i].rm_eo != -1)
      return 0;
  return 1;
}

static int
run_case(const litopt_case_t *tc)
{
  regex_t preg;
  tre_tnfa_t *tnfa;
  regmatch_t fast[PMATCH_SLOTS], slow[PMATCH_SLOTS];
  tre_literal_opt_mode_t saved_mode;
  char errbuf[256];
  int errcode, fast_rc, slow_rc;

  memset(&preg, 0, sizeof(preg));
  errcode = tre_regncompb(&preg, tc->pattern, tc->pattern_len, tc->cflags);
  if (errcode != REG_OK)
    {
      tre_regerror(errcode, &preg, errbuf, sizeof(errbuf));
      fprintf(stderr, "%s: compile failed: %s\n", tc->name, errbuf);
      return 1;
    }

  tnfa = (tre_tnfa_t *)preg.value;
  if (tnfa->literal_opt.mode != tc->expected_mode)
|
||||
{
|
||||
fprintf(stderr, "%s: optimizer mode %d, expected %d\n",
|
||||
tc->name, (int)tnfa->literal_opt.mode, (int)tc->expected_mode);
|
||||
tre_regfree(&preg);
|
||||
return 1;
|
||||
}
|
||||
|
||||
init_pmatch(fast, PMATCH_SLOTS);
|
||||
init_pmatch(slow, PMATCH_SLOTS);
|
||||
|
||||
fast_rc = tre_regnexecb(&preg, tc->string, tc->string_len,
|
||||
PMATCH_SLOTS, fast, tc->eflags);
|
||||
|
||||
saved_mode = tnfa->literal_opt.mode;
|
||||
tnfa->literal_opt.mode = TRE_LITERAL_OPT_NONE;
|
||||
slow_rc = tre_regnexecb(&preg, tc->string, tc->string_len,
|
||||
PMATCH_SLOTS, slow, tc->eflags);
|
||||
tnfa->literal_opt.mode = saved_mode;
|
||||
|
||||
if (fast_rc != slow_rc)
|
||||
{
|
||||
fprintf(stderr, "%s: fast rc %d, slow rc %d\n",
|
||||
tc->name, fast_rc, slow_rc);
|
||||
tre_regfree(&preg);
|
||||
return 1;
|
||||
}
|
||||
|
||||
if (tc->expected_rc != RC_ANY && fast_rc != tc->expected_rc)
|
||||
{
|
||||
fprintf(stderr, "%s: rc %d, expected %d\n",
|
||||
tc->name, fast_rc, tc->expected_rc);
|
||||
tre_regfree(&preg);
|
||||
return 1;
|
||||
}
|
||||
|
||||
if (!same_pmatch(fast, slow, PMATCH_SLOTS))
|
||||
{
|
||||
fprintf(stderr, "%s: fast and slow pmatch differ\n", tc->name);
|
||||
tre_regfree(&preg);
|
||||
return 1;
|
||||
}
|
||||
|
||||
if ((tc->cflags & REG_NOSUB) && fast_rc == REG_OK
|
||||
&& !pmatch_cleared(fast, PMATCH_SLOTS))
|
||||
{
|
||||
fprintf(stderr, "%s: REG_NOSUB match did not clear pmatch\n", tc->name);
|
||||
tre_regfree(&preg);
|
||||
return 1;
|
||||
}
|
||||
|
||||
tre_regfree(&preg);
|
||||
return 0;
|
||||
}
|
||||
|
||||
int
|
||||
main(void)
|
||||
{
|
||||
static const char nonascii_pattern[] = { (char)0xc0, '|', (char)0xe0 };
|
||||
static const char nonascii_haystack[] = { 'x', (char)0xe0, 'y' };
|
||||
static const litopt_case_t cases[] = {
|
||||
{
|
||||
"contains basic",
|
||||
"foo|bar|baz",
|
||||
sizeof("foo|bar|baz") - 1,
|
||||
REG_EXTENDED | REG_NOSUB,
|
||||
"xxbaryy",
|
||||
sizeof("xxbaryy") - 1,
|
||||
0,
|
||||
REG_OK,
|
||||
TRE_LITERAL_OPT_CONTAINS
|
||||
},
|
||||
{
|
||||
"contains ignores bol/eol flags",
|
||||
"foo|bar|baz",
|
||||
sizeof("foo|bar|baz") - 1,
|
||||
REG_EXTENDED | REG_NOSUB,
|
||||
"xxbaryy",
|
||||
sizeof("xxbaryy") - 1,
|
||||
REG_NOTBOL | REG_NOTEOL,
|
||||
REG_OK,
|
||||
TRE_LITERAL_OPT_CONTAINS
|
||||
},
|
||||
{
|
||||
"prefix basic",
|
||||
"^(foo|bar|baz)",
|
||||
sizeof("^(foo|bar|baz)") - 1,
|
||||
REG_EXTENDED | REG_NOSUB,
|
||||
"barrier",
|
||||
sizeof("barrier") - 1,
|
||||
0,
|
||||
REG_OK,
|
||||
TRE_LITERAL_OPT_PREFIX
|
||||
},
|
||||
{
|
||||
"prefix respects REG_NOTBOL",
|
||||
"^(foo|bar|baz)",
|
||||
sizeof("^(foo|bar|baz)") - 1,
|
||||
REG_EXTENDED | REG_NOSUB,
|
||||
"barrier",
|
||||
sizeof("barrier") - 1,
|
||||
REG_NOTBOL,
|
||||
REG_NOMATCH,
|
||||
TRE_LITERAL_OPT_PREFIX
|
||||
},
|
||||
{
|
||||
"suffix basic",
|
||||
"(foo|bar|baz)$",
|
||||
sizeof("(foo|bar|baz)$") - 1,
|
||||
REG_EXTENDED | REG_NOSUB,
|
||||
"crowbar",
|
||||
sizeof("crowbar") - 1,
|
||||
0,
|
||||
REG_OK,
|
||||
TRE_LITERAL_OPT_SUFFIX
|
||||
},
|
||||
{
|
||||
"suffix respects REG_NOTEOL",
|
||||
"(foo|bar|baz)$",
|
||||
sizeof("(foo|bar|baz)$") - 1,
|
||||
REG_EXTENDED | REG_NOSUB,
|
||||
"crowbar",
|
||||
sizeof("crowbar") - 1,
|
||||
REG_NOTEOL,
|
||||
REG_NOMATCH,
|
||||
TRE_LITERAL_OPT_SUFFIX
|
||||
},
|
||||
{
|
||||
"exact basic",
|
||||
"^(foo|bar|baz)$",
|
||||
sizeof("^(foo|bar|baz)$") - 1,
|
||||
REG_EXTENDED | REG_NOSUB,
|
||||
"bar",
|
||||
sizeof("bar") - 1,
|
||||
0,
|
||||
REG_OK,
|
||||
TRE_LITERAL_OPT_EXACT
|
||||
},
|
||||
{
|
||||
"exact respects REG_NOTBOL",
|
||||
"^(foo|bar|baz)$",
|
||||
sizeof("^(foo|bar|baz)$") - 1,
|
||||
REG_EXTENDED | REG_NOSUB,
|
||||
"bar",
|
||||
sizeof("bar") - 1,
|
||||
REG_NOTBOL,
|
||||
REG_NOMATCH,
|
||||
TRE_LITERAL_OPT_EXACT
|
||||
},
|
||||
{
|
||||
"exact respects REG_NOTEOL",
|
||||
"^(foo|bar|baz)$",
|
||||
sizeof("^(foo|bar|baz)$") - 1,
|
||||
REG_EXTENDED | REG_NOSUB,
|
||||
"bar",
|
||||
sizeof("bar") - 1,
|
||||
REG_NOTEOL,
|
||||
REG_NOMATCH,
|
||||
TRE_LITERAL_OPT_EXACT
|
||||
},
|
||||
{
|
||||
"empty alternation disables optimization",
|
||||
"(|foo|bar)",
|
||||
sizeof("(|foo|bar)") - 1,
|
||||
REG_EXTENDED | REG_NOSUB,
|
||||
"",
|
||||
0,
|
||||
0,
|
||||
REG_OK,
|
||||
TRE_LITERAL_OPT_NONE
|
||||
},
|
||||
{
|
||||
"inline flag disable stays generic",
|
||||
"foo(?-i:zap)zot",
|
||||
sizeof("foo(?-i:zap)zot") - 1,
|
||||
REG_EXTENDED | REG_ICASE | REG_NOSUB,
|
||||
"FoOzApZOt",
|
||||
sizeof("FoOzApZOt") - 1,
|
||||
0,
|
||||
REG_NOMATCH,
|
||||
TRE_LITERAL_OPT_NONE
|
||||
},
|
||||
{
|
||||
"inline flag disable still matches exact scoped bytes",
|
||||
"foo(?-i:zap)zot",
|
||||
sizeof("foo(?-i:zap)zot") - 1,
|
||||
REG_EXTENDED | REG_ICASE | REG_NOSUB,
|
||||
"FoOzapZOt",
|
||||
sizeof("FoOzapZOt") - 1,
|
||||
0,
|
||||
REG_OK,
|
||||
TRE_LITERAL_OPT_NONE
|
||||
},
|
||||
{
|
||||
"nocase non-ascii bytes stay in sync",
|
||||
nonascii_pattern,
|
||||
sizeof(nonascii_pattern),
|
||||
REG_EXTENDED | REG_ICASE | REG_NOSUB,
|
||||
nonascii_haystack,
|
||||
sizeof(nonascii_haystack),
|
||||
0,
|
||||
RC_ANY,
|
||||
TRE_LITERAL_OPT_CONTAINS
|
||||
}
|
||||
};
|
||||
size_t i;
|
||||
int failures = 0;
|
||||
|
||||
setlocale(LC_CTYPE, "en_US.ISO-8859-1");
|
||||
|
||||
for (i = 0; i < elementsof(cases); i++)
|
||||
failures += run_case(&cases[i]);
|
||||
|
||||
return failures;
|
||||
}
|
||||
85
deps/tre/tests/test-malformed-regn.c
vendored
Normal file
|
|
@ -0,0 +1,85 @@
|
|||
/*
|
||||
test-malformed-regn.c - Verify exact-length edge-case regexps compile or fail
|
||||
cleanly both with and without a trailing NUL byte.
|
||||
|
||||
This software is released under a BSD-style license.
|
||||
See the file LICENSE for details and copyright.
|
||||
*/
|
||||
|
||||
#include <stdio.h>
|
||||
#include <stdlib.h>
|
||||
#include <string.h>
|
||||
|
||||
#include "tre.h"
|
||||
|
||||
typedef struct {
|
||||
const char *name;
|
||||
const char *pattern;
|
||||
int expected_err;
|
||||
} malformed_case_t;
|
||||
|
||||
static int
|
||||
run_case(const malformed_case_t *tc, int nul_terminated)
|
||||
{
|
||||
regex_t preg;
|
||||
size_t len = strlen(tc->pattern);
|
||||
size_t alloc_len = len + (nul_terminated ? 1 : 0);
|
||||
char *pattern = malloc(alloc_len ? alloc_len : 1);
|
||||
int errcode;
|
||||
|
||||
if (pattern == NULL)
|
||||
{
|
||||
fprintf(stderr, "%s: out of memory\n", tc->name);
|
||||
return 1;
|
||||
}
|
||||
|
||||
if (len > 0)
|
||||
memcpy(pattern, tc->pattern, len);
|
||||
if (nul_terminated)
|
||||
pattern[len] = '\0';
|
||||
|
||||
memset(&preg, 0, sizeof(preg));
|
||||
errcode = tre_regncompb(&preg, pattern, len, REG_EXTENDED | REG_NOSUB);
|
||||
if (errcode == REG_OK)
|
||||
tre_regfree(&preg);
|
||||
|
||||
free(pattern);
|
||||
|
||||
if (errcode != tc->expected_err)
|
||||
{
|
||||
char errbuf[128];
|
||||
memset(&preg, 0, sizeof(preg));
|
||||
tre_regerror(errcode, &preg, errbuf, sizeof(errbuf));
|
||||
fprintf(stderr, "%s (%s): got %d (%s), expected %d\n",
|
||||
tc->name, nul_terminated ? "nul" : "exact",
|
||||
errcode, errbuf, tc->expected_err);
|
||||
return 1;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
int
|
||||
main(void)
|
||||
{
|
||||
static const malformed_case_t cases[] = {
|
||||
{ "open paren", "(", REG_EPAREN },
|
||||
{ "open bracket", "[", REG_EBRACK },
|
||||
{ "unterminated comment", "(?#", REG_BADPAT },
|
||||
{ "unterminated inline flags", "(?i", REG_BADPAT },
|
||||
{ "short hex escape", "\\x", REG_OK },
|
||||
{ "unterminated wide hex", "\\x{", REG_EBRACE },
|
||||
{ "empty wide hex", "\\x{}", REG_OK }
|
||||
};
|
||||
size_t i;
|
||||
|
||||
for (i = 0; i < sizeof(cases) / sizeof(*cases); i++)
|
||||
{
|
||||
if (run_case(&cases[i], 0))
|
||||
return 1;
|
||||
if (run_case(&cases[i], 1))
|
||||
return 1;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
192
deps/tre/tests/test-str-source.c
vendored
Normal file
|
|
@ -0,0 +1,192 @@
|
|||
/*
|
||||
test-str-source.c - Sample program for using tre_reguexec()
|
||||
|
||||
This software is released under a BSD-style license.
|
||||
See the file LICENSE for details and copyright.
|
||||
|
||||
*/
|
||||
|
||||
#ifdef HAVE_CONFIG_H
|
||||
#include <config.h>
|
||||
#endif /* HAVE_CONFIG_H */
|
||||
|
||||
#include <stdio.h>
|
||||
#include <stdlib.h>
|
||||
#include <string.h>
|
||||
/* look for getopt in order to use a -o option for output. */
|
||||
#if defined(HAVE_UNISTD_H)
|
||||
#include <unistd.h>
|
||||
#elif defined(HAVE_GETOPT_H)
|
||||
#include <getopt.h>
|
||||
#endif
|
||||
|
||||
#include "tre-internal.h"
|
||||
|
||||
static FILE *outf = NULL;
|
||||
|
||||
/* Context structure for the tre_str_source wrappers. */
|
||||
typedef struct {
|
||||
/* Our string. */
|
||||
const char *str;
|
||||
/* Current position in the string. */
|
||||
size_t pos;
|
||||
} str_handler_ctx;
|
||||
|
||||
/* The get_next_char() handler. Sets `c' to the value of the next character,
|
||||
and sets `pos_add' to the number of bytes read. Returns 1 if the
|
||||
string has ended, 0 if there are more characters. */
|
||||
static int
|
||||
str_handler_get_next(tre_char_t *c, unsigned int *pos_add, void *context)
|
||||
{
|
||||
str_handler_ctx *ctx = context;
|
||||
unsigned char ch = ctx->str[ctx->pos];
|
||||
|
||||
#ifdef TRE_DEBUG
|
||||
fprintf(outf, "str[%lu] = %d\n", (unsigned long)ctx->pos, ch);
|
||||
#endif /* TRE_DEBUG */
|
||||
*c = ch;
|
||||
if (ch)
|
||||
ctx->pos++;
|
||||
*pos_add = 1;
|
||||
|
||||
return ch == '\0';
|
||||
}
|
||||
|
||||
/* The rewind() handler. Resets the current position in the input string. */
|
||||
static void
|
||||
str_handler_rewind(size_t pos, void *context)
|
||||
{
|
||||
str_handler_ctx *ctx = context;
|
||||
|
||||
#ifdef TRE_DEBUG
|
||||
fprintf(outf, "rewind to %lu\n", (unsigned long)pos);
|
||||
#endif /* TRE_DEBUG */
|
||||
ctx->pos = pos;
|
||||
}
|
||||
|
||||
/* The compare() handler. Compares two substrings in the input and returns
|
||||
0 if the substrings are equal, and a nonzero value if not. */
|
||||
static int
|
||||
str_handler_compare(size_t pos1, size_t pos2, size_t len, void *context)
|
||||
{
|
||||
str_handler_ctx *ctx = context;
|
||||
#ifdef TRE_DEBUG
|
||||
fprintf(outf, "comparing %lu-%lu and %lu-%lu\n",
|
||||
(unsigned long)pos1, (unsigned long)pos1 + len,
|
||||
(unsigned long)pos2, (unsigned long)pos2 + len);
|
||||
#endif /* TRE_DEBUG */
|
||||
return strncmp(ctx->str + pos1, ctx->str + pos2, len);
|
||||
}
|
||||
|
||||
/* Creates a tre_str_source wrapper around the string `str'. Returns the
|
||||
tre_str_source object or NULL if out of memory. */
|
||||
static tre_str_source *
|
||||
make_str_source(const char *str)
|
||||
{
|
||||
tre_str_source *s;
|
||||
str_handler_ctx *ctx;
|
||||
|
||||
s = calloc(1, sizeof(*s));
|
||||
if (!s)
|
||||
return NULL;
|
||||
|
||||
ctx = malloc(sizeof(str_handler_ctx));
|
||||
if (!ctx)
|
||||
{
|
||||
free(s);
|
||||
return NULL;
|
||||
}
|
||||
|
||||
ctx->str = str;
|
||||
ctx->pos = 0;
|
||||
s->context = ctx;
|
||||
s->get_next_char = str_handler_get_next;
|
||||
s->rewind = str_handler_rewind;
|
||||
s->compare = str_handler_compare;
|
||||
|
||||
return s;
|
||||
}
|
||||
|
||||
/* Frees the memory allocated for `s'. */
|
||||
static void
|
||||
free_str_source(tre_str_source *s)
|
||||
{
|
||||
free(s->context);
|
||||
free(s);
|
||||
}
|
||||
|
||||
/* Run one test with tre_reguexec. Returns 1 if the regex matches, 0 if
|
||||
it doesn't, and -1 if an error occurs. */
|
||||
static int
|
||||
test_reguexec(const char *str, const char *regex)
|
||||
{
|
||||
regex_t preg;
|
||||
tre_str_source *source;
|
||||
regmatch_t pmatch[5];
|
||||
int ret;
|
||||
|
||||
if ((source = make_str_source(str)) == NULL)
|
||||
{
|
||||
fprintf(stderr, "Out of memory\n");
|
||||
ret = -1;
|
||||
}
|
||||
else
|
||||
{
|
||||
if (tre_regcomp(&preg, regex, REG_EXTENDED) != REG_OK)
|
||||
{
|
||||
fprintf(stderr, "Failed to compile /%s/\n", regex);
|
||||
ret = -1;
|
||||
}
|
||||
else
|
||||
{
|
||||
if (tre_reguexec(&preg, source, elementsof(pmatch), pmatch, 0) == 0)
|
||||
{
|
||||
fprintf(outf, "Match: /%s/ matches \"%.*s\" in \"%s\"\n", regex,
|
||||
(int)(pmatch[0].rm_eo - pmatch[0].rm_so),
|
||||
str + pmatch[0].rm_so, str);
|
||||
ret = 1;
|
||||
}
|
||||
else
|
||||
{
|
||||
fprintf(outf, "No match: /%s/ in \"%s\"\n", regex, str);
|
||||
ret = 0;
|
||||
}
|
||||
tre_regfree(&preg);
|
||||
}
|
||||
free_str_source(source);
|
||||
}
|
||||
return ret;
|
||||
}
|
||||
|
||||
int
|
||||
main(int argc, char **argv)
|
||||
{
|
||||
int ret = 0;
|
||||
outf = stdout;
|
||||
#if defined(HAVE_UNISTD_H) || defined(HAVE_GETOPT_H)
|
||||
int opt;
|
||||
while ((opt = getopt(argc, argv, "o:")) != EOF)
|
||||
{
|
||||
switch (opt)
|
||||
{
|
||||
case 'o':
|
||||
if ((outf = fopen(optarg, "w")) == NULL)
|
||||
{
|
||||
perror(optarg);
|
||||
exit(1);
|
||||
}
|
||||
break;
|
||||
default:
|
||||
/* getopt() will have printed an error message already */
|
||||
exit(1);
|
||||
}
|
||||
}
|
||||
#endif
|
||||
ret += test_reguexec("xfoofofoofoo", "(foo)\\1") != 1;
|
||||
ret += test_reguexec("catcat", "(cat|dog)\\1") != 1;
|
||||
ret += test_reguexec("catdog", "(cat|dog)\\1") != 0;
|
||||
ret += test_reguexec("dogdog", "(cat|dog)\\1") != 1;
|
||||
ret += test_reguexec("dogcat", "(cat|dog)\\1") != 0;
|
||||
|
||||
return ret;
|
||||
}
|
||||
34
redis.conf
|
|
@ -2044,6 +2044,7 @@ latency-monitor-threshold 0
|
|||
# e Evicted events (events generated when a key is evicted for maxmemory)
|
||||
# n New key events (Note: not included in the 'A' class)
|
||||
# t Stream commands
|
||||
# a Array commands
|
||||
# d Module key type events
|
||||
# m Key-miss events (Note: It is not included in the 'A' class)
|
||||
# o Overwritten events generated every time a key is overwritten.
|
||||
|
|
@ -2057,7 +2058,7 @@ latency-monitor-threshold 0
|
|||
# __subkeyspaceitem@<db>__:<key>\n<subkey> prefix.
|
||||
# V Subkeyspaceevent events, published with
|
||||
# __subkeyspaceevent@<db>__:<event>|<key> prefix.
|
||||
# A Alias for g$lshzxetd, so that the "AKE" string means all the events
|
||||
# A Alias for g$lshzxetad, so that the "AKE" string means all the events
|
||||
# except key-miss, new key, overwritten, type-changed and rate-limit.
|
||||
#
|
||||
# The "notify-keyspace-events" takes as argument a string that is composed
|
||||
|
|
@ -2187,6 +2188,37 @@ stream-node-max-entries 100
|
|||
# stream-idmp-duration 100
|
||||
# stream-idmp-maxsize 100
|
||||
|
||||
# Arrays use a sliced directory structure for O(1) access. The slice size
# controls the granularity of memory allocation: each slice covers a
# contiguous range of indices. Must be a power of two between 256 and 65536.
#
# Smaller slices (1024-2048): Better for sparse data with large gaps between
# indices, or for many small arrays. Each slice uses less memory, but more
# directory entries are needed.
#
# Larger slices (8192-16384): Better for dense/contiguous data. Fewer directory
# entries are needed, but memory may be wasted if data is sparse within slices.
#
# The default of 4096 works well for mixed workloads. If you change this
# setting via CONFIG SET, existing arrays retain their original slice size.
#
# IMPORTANT CONSIDERATION: for slices with very few elements, Redis arrays can
# use a sparse representation, where the slice is not materialized into an
# actual contiguous allocation. See the configuration parameters below for
# more information.
|
||||
array-slice-size 4096
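As a rough illustration of why a power-of-two slice size gives O(1) access, the sketch below splits an index into a slice number and an offset inside that slice, then rebuilds the index from the two. The names `slice_size_is_valid`, `split_index` and `make_index` are hypothetical and not taken from the Redis source. The layout `index = slice_id * slice_size + offset` is an assumption consistent with the comment above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if n is a power of two in [256, 65536],
 * the range documented for array-slice-size, 0 otherwise. */
int slice_size_is_valid(uint32_t n) {
    return n >= 256 && n <= 65536 && (n & (n - 1)) == 0;
}

/* Hypothetical helper: split idx into (slice_id, offset). Because the
 * slice size is a power of two, the division and modulo reduce to a
 * shift and a mask. */
void split_index(uint64_t idx, uint32_t slice_size,
                 uint64_t *slice_id, uint32_t *offset) {
    assert(slice_size_is_valid(slice_size));
    *slice_id = idx / slice_size;
    *offset = (uint32_t)(idx % slice_size);
}

/* Hypothetical inverse of split_index (assumed layout, see lead-in). */
uint64_t make_index(uint64_t slice_id, uint32_t offset, uint32_t slice_size) {
    return slice_id * slice_size + offset;
}
```

With the default slice size of 4096, index 18552 would fall in slice 4 at offset 2168 under this assumed layout.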
|
||||
|
||||
# Arrays start with sparse slices (sorted key-value pairs) for memory efficiency
# when elements are scattered. When a sparse slice exceeds array-sparse-kmax
# entries, it is promoted to a dense slice (direct array). When a dense slice's
# element count drops below array-sparse-kmin and demotion would save memory,
# it is demoted back to sparse. Set kmax to 0 to disable sparse encoding
# entirely. Set kmin to 0 if you never want dense slices to be demoted to
# sparse (useful when, in your workload, arrays repeatedly drain to an almost
# empty state and are then refilled).
|
||||
array-sparse-kmax 10
|
||||
array-sparse-kmin 5
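The promotion and demotion rules above form a hysteresis band between kmin and kmax, which can be sketched as a small decision function. This is only an illustration under stated assumptions: `slice_encoding` and `next_encoding` are hypothetical names, and `demotion_saves_memory` is a caller-supplied flag because the text does not say how that check is computed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical encoding tag for a slice. */
typedef enum { SLICE_SPARSE, SLICE_DENSE } slice_encoding;

/* Hypothetical decision function: given the current encoding and element
 * count of a slice, return the encoding it should have next. */
slice_encoding next_encoding(slice_encoding cur, uint32_t count,
                             uint32_t kmax, uint32_t kmin,
                             int demotion_saves_memory) {
    assert(cur == SLICE_SPARSE || cur == SLICE_DENSE);
    /* kmax == 0 disables sparse encoding entirely. */
    if (kmax == 0) return SLICE_DENSE;
    /* A sparse slice is promoted once it exceeds kmax entries. */
    if (cur == SLICE_SPARSE)
        return count > kmax ? SLICE_DENSE : SLICE_SPARSE;
    /* A dense slice is demoted only below kmin, only if that saves
     * memory, and never when kmin == 0. */
    if (kmin != 0 && count < kmin && demotion_saves_memory)
        return SLICE_SPARSE;
    return SLICE_DENSE;
}
```

With the defaults (kmax 10, kmin 5), a slice that hovers between 5 and 10 elements keeps whatever encoding it already has, which avoids converting back and forth on every insert or delete.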
|
||||
|
||||
# Active rehashing uses 1 millisecond every 100 milliseconds of CPU time in
|
||||
# order to help rehashing the main Redis hash table (the one mapping top-level
|
||||
# keys to values). The hash table implementation Redis uses (see dict.c)
|
||||
|
|
|
|||
|
|
@ -37,7 +37,7 @@ endif
|
|||
ifneq ($(OPTIMIZATION),-O0)
|
||||
OPTIMIZATION+=-fno-omit-frame-pointer
|
||||
endif
|
||||
DEPENDENCY_TARGETS=hiredis linenoise lua hdr_histogram fpconv xxhash
|
||||
DEPENDENCY_TARGETS=hiredis linenoise lua hdr_histogram fpconv xxhash tre
|
||||
NODEPS:=clean distclean
|
||||
|
||||
# Default settings
|
||||
|
|
@ -384,7 +384,7 @@ endif
|
|||
|
||||
REDIS_SERVER_NAME=redis-server$(PROG_SUFFIX)
|
||||
REDIS_SENTINEL_NAME=redis-sentinel$(PROG_SUFFIX)
|
||||
REDIS_SERVER_OBJ=threads_mngr.o memory_prefetch.o adlist.o quicklist.o ae.o anet.o dict.o ebuckets.o eventnotifier.o iothread.o mstr.o entry.o kvstore.o fwtree.o estore.o server.o sds.o zmalloc.o lzf_c.o lzf_d.o pqsort.o zipmap.o sha1.o ziplist.o release.o networking.o util.o object.o db.o replication.o rdb.o t_string.o t_list.o t_set.o t_zset.o t_hash.o config.o aof.o pubsub.o multi.o debug.o sort.o intset.o syncio.o cluster.o cluster_asm.o cluster_legacy.o cluster_slot_stats.o crc16.o endianconv.o slowlog.o eval.o bio.o rio.o rand.o memtest.o syscheck.o crcspeed.o crccombine.o crc64.o bitops.o sentinel.o notify.o setproctitle.o blocked.o hyperloglog.o latency.o sparkline.o redis-check-rdb.o redis-check-aof.o geo.o lazyfree.o module.o evict.o expire.o geohash.o geohash_helper.o childinfo.o defrag.o siphash.o rax.o t_stream.o listpack.o localtime.o lolwut.o lolwut5.o lolwut6.o lolwut8.o acl.o tracking.o socket.o tls.o sha256.o timeout.o setcpuaffinity.o monotonic.o mt19937-64.o resp_parser.o call_reply.o script_lua.o script.o functions.o function_lua.o commands.o strl.o connection.o unix.o logreqres.o keymeta.o chk.o hotkeys.o gcra.o vector.o fast_float_strtod.o
|
||||
REDIS_SERVER_OBJ=threads_mngr.o memory_prefetch.o adlist.o quicklist.o ae.o anet.o dict.o ebuckets.o eventnotifier.o iothread.o mstr.o entry.o kvstore.o fwtree.o estore.o server.o sds.o zmalloc.o lzf_c.o lzf_d.o pqsort.o zipmap.o sha1.o ziplist.o release.o networking.o util.o object.o db.o replication.o rdb.o t_string.o t_list.o t_set.o t_zset.o t_hash.o t_array.o sparsearray.o config.o aof.o pubsub.o multi.o debug.o sort.o intset.o syncio.o cluster.o cluster_asm.o cluster_legacy.o cluster_slot_stats.o crc16.o endianconv.o slowlog.o eval.o bio.o rio.o rand.o memtest.o syscheck.o crcspeed.o crccombine.o crc64.o bitops.o sentinel.o notify.o setproctitle.o blocked.o hyperloglog.o latency.o sparkline.o redis-check-rdb.o redis-check-aof.o geo.o lazyfree.o module.o evict.o expire.o geohash.o geohash_helper.o childinfo.o defrag.o siphash.o rax.o t_stream.o listpack.o localtime.o lolwut.o lolwut5.o lolwut6.o lolwut8.o acl.o tracking.o socket.o tls.o sha256.o timeout.o setcpuaffinity.o monotonic.o mt19937-64.o resp_parser.o call_reply.o script_lua.o script.o functions.o function_lua.o commands.o strl.o connection.o unix.o logreqres.o keymeta.o chk.o hotkeys.o gcra.o vector.o fast_float_strtod.o
|
||||
REDIS_CLI_NAME=redis-cli$(PROG_SUFFIX)
|
||||
REDIS_CLI_OBJ=anet.o adlist.o dict.o redis-cli.o zmalloc.o release.o ae.o redisassert.o crcspeed.o crccombine.o crc64.o siphash.o crc16.o monotonic.o cli_common.o mt19937-64.o strl.o cli_commands.o
|
||||
REDIS_BENCHMARK_NAME=redis-benchmark$(PROG_SUFFIX)
|
||||
|
|
@ -444,7 +444,7 @@ endif
|
|||
|
||||
# redis-server
|
||||
$(REDIS_SERVER_NAME): $(REDIS_SERVER_OBJ) $(REDIS_VEC_SETS_OBJ)
|
||||
$(REDIS_LD) -o $@ $^ ../deps/hiredis/libhiredis.a ../deps/lua/src/liblua.a ../deps/hdr_histogram/libhdrhistogram.a ../deps/fpconv/libfpconv.a ../deps/xxhash/libxxhash.a $(FINAL_LIBS)
|
||||
$(REDIS_LD) -o $@ $^ ../deps/hiredis/libhiredis.a ../deps/lua/src/liblua.a ../deps/hdr_histogram/libhdrhistogram.a ../deps/fpconv/libfpconv.a ../deps/xxhash/libxxhash.a ../deps/tre/libtre.a $(FINAL_LIBS)
|
||||
|
||||
# redis-sentinel
|
||||
$(REDIS_SENTINEL_NAME): $(REDIS_SERVER_NAME)
|
||||
|
|
|
|||
|
|
@ -57,6 +57,7 @@ struct ACLCategoryItem {
|
|||
{"list", ACL_CATEGORY_LIST},
|
||||
{"hash", ACL_CATEGORY_HASH},
|
||||
{"string", ACL_CATEGORY_STRING},
|
||||
{"array", ACL_CATEGORY_ARRAY},
|
||||
{"bitmap", ACL_CATEGORY_BITMAP},
|
||||
{"hyperloglog", ACL_CATEGORY_HYPERLOGLOG},
|
||||
{"geo", ACL_CATEGORY_GEO},
|
||||
|
|
|
|||
112
src/aof.c
|
|
@ -2515,6 +2515,116 @@ werr:
|
|||
return 0;
|
||||
}
|
||||
|
||||
/* Write unsigned 64-bit integer as bulk string.
|
||||
* Unlike rioWriteBulkLongLong which uses signed representation,
|
||||
* this correctly handles values >= 2^63 (e.g., array indices). */
|
||||
static int rioWriteBulkUnsignedLongLong(rio *r, uint64_t value) {
|
||||
char buf[24];
|
||||
int len = ull2string(buf, sizeof(buf), value);
|
||||
return rioWriteBulkString(r, buf, len);
|
||||
}
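To make the signed versus unsigned point concrete, here is a standalone sketch that formats an unsigned 64-bit value as a RESP bulk string (`$<len>\r\n<payload>\r\n`). It uses `snprintf` with `PRIu64` in place of the `ull2string` and `rioWriteBulkString` helpers used above, and `format_bulk_u64` is a hypothetical name, so treat it as an illustration rather than the code path Redis takes.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes value as "$<len>\r\n<digits>\r\n" into out.
 * Returns the number of bytes written (excluding the trailing NUL), or -1
 * if out is too small. */
int format_bulk_u64(char *out, size_t outlen, uint64_t value) {
    char digits[24]; /* UINT64_MAX needs 20 decimal digits plus NUL */
    int dlen = snprintf(digits, sizeof(digits), "%" PRIu64, value);
    assert(dlen > 0 && (size_t)dlen < sizeof(digits));

    int n = snprintf(out, outlen, "$%d\r\n%s\r\n", dlen, digits);
    if (n < 0 || (size_t)n >= outlen) return -1;
    return n;
}
```

An index such as 2^63 needs 19 decimal digits when printed as unsigned. A signed 64-bit formatter cannot represent that value at all, which is why the unsigned variant is needed for array indices.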
|
||||
|
||||
/* Helper to emit a single array element for AOF rewrite.
|
||||
* Returns 0 on error, 1 on success. Updates count and items. */
|
||||
static int aofEmitArrayElement(rio *r, robj *key, uint64_t idx, void *v,
|
||||
long long *count, long long *items) {
|
||||
if (*count == 0) {
|
||||
int cmd_items = (*items > AOF_REWRITE_ITEMS_PER_CMD/2) ?
|
||||
AOF_REWRITE_ITEMS_PER_CMD/2 : *items; /* pairs of idx+val */
|
||||
if (!rioWriteBulkCount(r,'*',2+cmd_items*2) ||
|
||||
!rioWriteBulkString(r,"ARMSET",6) ||
|
||||
!rioWriteBulkObject(r,key))
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
|
||||
/* Write index (unsigned to handle indices >= 2^63) */
|
||||
if (!rioWriteBulkUnsignedLongLong(r, idx)) return 0;
|
||||
|
||||
/* Write value - inline types use scratch space, arString aliases directly. */
|
||||
char buf[AR_INLINE_BUFSIZE];
|
||||
size_t len;
|
||||
const char *data = arDecode(v, buf, sizeof(buf), &len);
|
||||
if (!rioWriteBulkString(r, data, len)) return 0;
|
||||
|
||||
if (++(*count) == AOF_REWRITE_ITEMS_PER_CMD/2) *count = 0;
|
||||
(*items)--;
|
||||
return 1;
|
||||
}
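The batching in the helper above can be sketched separately from the I/O: given the number of remaining elements, it decides how many index/value pairs go into each ARMSET command and what multi-bulk count the `*` header carries. `plan_armset` is a hypothetical name, and `ITEMS_PER_CMD` is an assumed stand-in for `AOF_REWRITE_ITEMS_PER_CMD`, whose real value is not shown in this diff.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed stand-in for AOF_REWRITE_ITEMS_PER_CMD (value not shown in the
 * diff). Each element uses two arguments, so a command carries half as
 * many index/value pairs. */
#define ITEMS_PER_CMD 64
#define MAX_PAIRS (ITEMS_PER_CMD / 2)

/* Hypothetical planner: stores in argc_out the multi-bulk count of each
 * ARMSET command needed to emit `items` pairs. Returns the number of
 * commands, or -1 if argc_out has fewer than the required slots. */
int plan_armset(long long items, int *argc_out, size_t cap) {
    size_t ncmds = 0;
    assert(items >= 0);
    while (items > 0) {
        int pairs = items > MAX_PAIRS ? MAX_PAIRS : (int)items;
        if (ncmds == cap) return -1;
        /* "ARMSET" + key + one index and one value per pair. */
        argc_out[ncmds++] = 2 + pairs * 2;
        items -= pairs;
    }
    return (int)ncmds;
}
```

Under the assumed limit of 64, an array with 70 elements would be rewritten as three commands carrying 32, 32 and 6 pairs.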
|
||||
|
||||
/* Helper to emit all elements from a slice for AOF rewrite. */
|
||||
static int aofEmitSliceElements(rio *r, robj *key, arSlice *s, uint64_t slice_id,
|
||||
uint32_t slice_size, long long *count, long long *items) {
|
||||
if (s->encoding == AR_SLICE_DENSE) {
|
||||
for (uint32_t i = 0; i < s->layout.dense.winsize; i++) {
|
||||
void *v = s->layout.dense.items[i];
|
||||
if (arIsEmpty(v)) continue;
|
||||
uint64_t idx = arMakeIdx(slice_id, s->layout.dense.offset + i, slice_size);
|
||||
if (!aofEmitArrayElement(r, key, idx, v, count, items)) return 0;
|
||||
}
|
||||
} else {
|
||||
/* Sparse slice */
|
||||
uint16_t *offsets = s->layout.sparse.offsets;
|
||||
void **values = s->layout.sparse.values;
|
||||
for (uint32_t i = 0; i < s->count; i++) {
|
||||
uint64_t idx = arMakeIdx(slice_id, offsets[i], slice_size);
|
||||
if (!aofEmitArrayElement(r, key, idx, values[i], count, items)) return 0;
|
||||
}
|
||||
}
|
||||
return 1;
|
||||
}
|
||||
|
||||
/* Emit the commands needed to rebuild an array object.
|
||||
* The function returns 0 on error, 1 on success. */
|
||||
int rewriteArrayObject(rio *r, robj *key, robj *o) {
|
||||
redisArray *ar = o->ptr;
|
||||
long long count = 0, items = ar->count;
|
||||
if (items == 0) return 1;
|
||||
|
||||
/* Iterate through all slices, handling both flat directory mode and
|
||||
* superdir mode. This mirrors the iteration logic in rdb.c. */
|
||||
if (ar->superdir) {
|
||||
/* Superdir mode: iterate through blocks */
|
||||
for (uint32_t bi = 0; bi < ar->sdir_len; bi++) {
|
||||
arSDirEntry *e = ar->superdir + bi;
|
||||
uint64_t block_base = e->block_id * AR_SUPER_BLOCK_SLOTS;
|
||||
|
||||
for (uint32_t si = 0; si < AR_SUPER_BLOCK_SLOTS; si++) {
|
||||
arSlice *s = e->slots[si];
|
||||
if (!s) continue;
|
||||
uint64_t slice_id = block_base + si;
|
||||
if (!aofEmitSliceElements(r, key, s, slice_id, ar->slice_size,
|
||||
&count, &items)) return 0;
|
||||
}
|
||||
}
|
||||
} else {
|
||||
/* Flat directory mode */
|
||||
for (uint64_t slice_id = 0; slice_id <= ar->dir_highest_used && slice_id < ar->dir_alloc; slice_id++) {
|
||||
arSlice *s = ar->dir[slice_id];
|
||||
if (!s) continue;
|
||||
if (!aofEmitSliceElements(r, key, s, slice_id, ar->slice_size,
|
||||
&count, &items)) return 0;
|
||||
}
|
||||
}
|
||||
|
||||
/* If insert_idx is set, emit ARSEEK command to restore it.
|
||||
* When insert_idx == UINT64_MAX-1, we emit ARSEEK UINT64_MAX which
|
||||
* correctly sets insert_idx back to UINT64_MAX-1 (terminal state). */
|
||||
if (ar->insert_idx != AR_INSERT_IDX_NONE) {
|
||||
/* ARSEEK key insert_idx+1 (ARSEEK sets position for next insert) */
|
||||
if (!rioWriteBulkCount(r,'*',3) ||
|
||||
!rioWriteBulkString(r,"ARSEEK",6) ||
|
||||
!rioWriteBulkObject(r,key) ||
|
||||
!rioWriteBulkUnsignedLongLong(r, ar->insert_idx + 1))
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
|
||||
return 1;
|
||||
}
|
||||
|
||||
int rewriteObject(rio *r, robj *key, robj *o, int dbid, long long expiretime) {
|
||||
/* Save the key and associated value */
|
||||
if (o->type == OBJ_STRING) {
|
||||
|
|
@ -2536,6 +2646,8 @@ int rewriteObject(rio *r, robj *key, robj *o, int dbid, long long expiretime) {
|
|||
if (rewriteStreamObject(r,key,o) == 0) return C_ERR;
|
||||
} else if (o->type == OBJ_GCRA) {
|
||||
if (rewriteGCRAObject(r,key,o) == 0) return C_ERR;
|
||||
} else if (o->type == OBJ_ARRAY) {
|
||||
if (rewriteArrayObject(r,key,o) == 0) return C_ERR;
|
||||
} else if (o->type == OBJ_MODULE) {
|
||||
if (rewriteModuleObject(r,key,o,dbid) == 0) return C_ERR;
|
||||
} else {
|
||||
|
|
|
|||
549
src/commands.def
|
|
@ -24,6 +24,7 @@ const char *COMMAND_GROUP_STR[] = {
|
|||
"geo",
|
||||
"stream",
|
||||
"bitmap",
|
||||
"array",
|
||||
"module",
|
||||
"rate_limit"
|
||||
};
|
||||
|
|
@ -31,6 +32,535 @@ const char *COMMAND_GROUP_STR[] = {
|
|||
const char *commandGroupStr(int index) {
|
||||
return COMMAND_GROUP_STR[index];
|
||||
}
|
||||
/********** ARCOUNT ********************/
|
||||
|
||||
#ifndef SKIP_CMD_HISTORY_TABLE
|
||||
/* ARCOUNT history */
|
||||
#define ARCOUNT_History NULL
|
||||
#endif
|
||||
|
||||
#ifndef SKIP_CMD_TIPS_TABLE
|
||||
/* ARCOUNT tips */
|
||||
#define ARCOUNT_Tips NULL
|
||||
#endif
|
||||
|
||||
#ifndef SKIP_CMD_KEY_SPECS_TABLE
|
||||
/* ARCOUNT key specs */
|
||||
keySpec ARCOUNT_Keyspecs[1] = {
|
||||
{NULL,CMD_KEY_RO|CMD_KEY_ACCESS,KSPEC_BS_INDEX,.bs.index={1},KSPEC_FK_RANGE,.fk.range={0,1,0}}
|
||||
};
|
||||
#endif
|
||||
|
||||
/* ARCOUNT argument table */
|
||||
struct COMMAND_ARG ARCOUNT_Args[] = {
|
||||
{MAKE_ARG("key",ARG_TYPE_KEY,0,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
|
||||
};
|
||||
|
||||
/********** ARDEL ********************/
|
||||
|
||||
#ifndef SKIP_CMD_HISTORY_TABLE
|
||||
/* ARDEL history */
|
||||
#define ARDEL_History NULL
|
||||
#endif
|
||||
|
||||
#ifndef SKIP_CMD_TIPS_TABLE
|
||||
/* ARDEL tips */
|
||||
#define ARDEL_Tips NULL
|
||||
#endif
|
||||
|
||||
#ifndef SKIP_CMD_KEY_SPECS_TABLE
|
||||
/* ARDEL key specs */
|
||||
keySpec ARDEL_Keyspecs[1] = {
|
||||
{NULL,CMD_KEY_RW|CMD_KEY_DELETE,KSPEC_BS_INDEX,.bs.index={1},KSPEC_FK_RANGE,.fk.range={0,1,0}}
|
||||
};
|
||||
#endif
|
||||
|
||||
/* ARDEL argument table */
|
||||
struct COMMAND_ARG ARDEL_Args[] = {
|
||||
{MAKE_ARG("key",ARG_TYPE_KEY,0,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
|
||||
{MAKE_ARG("index",ARG_TYPE_INTEGER,-1,NULL,NULL,NULL,CMD_ARG_MULTIPLE,0,NULL)},
|
||||
};
|
||||
|
||||
/********** ARDELRANGE ********************/
|
||||
|
||||
#ifndef SKIP_CMD_HISTORY_TABLE
|
||||
/* ARDELRANGE history */
|
||||
#define ARDELRANGE_History NULL
|
||||
#endif
|
||||
|
||||
#ifndef SKIP_CMD_TIPS_TABLE
|
||||
/* ARDELRANGE tips */
|
||||
#define ARDELRANGE_Tips NULL
|
||||
#endif
|
||||
|
||||
#ifndef SKIP_CMD_KEY_SPECS_TABLE
|
||||
/* ARDELRANGE key specs */
|
||||
keySpec ARDELRANGE_Keyspecs[1] = {
|
||||
{NULL,CMD_KEY_RW|CMD_KEY_DELETE,KSPEC_BS_INDEX,.bs.index={1},KSPEC_FK_RANGE,.fk.range={0,1,0}}
|
||||
};
|
||||
#endif
|
||||
|
||||
/* ARDELRANGE range argument table */
|
||||
struct COMMAND_ARG ARDELRANGE_range_Subargs[] = {
|
||||
{MAKE_ARG("start",ARG_TYPE_INTEGER,-1,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
|
||||
{MAKE_ARG("end",ARG_TYPE_INTEGER,-1,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
|
||||
};
|
||||
|
||||
/* ARDELRANGE argument table */
|
||||
struct COMMAND_ARG ARDELRANGE_Args[] = {
|
||||
{MAKE_ARG("key",ARG_TYPE_KEY,0,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
|
||||
{MAKE_ARG("range",ARG_TYPE_BLOCK,-1,NULL,NULL,NULL,CMD_ARG_MULTIPLE,2,NULL),.subargs=ARDELRANGE_range_Subargs},
|
||||
};
|
||||
|
||||
/********** ARGET ********************/
|
||||
|
||||
#ifndef SKIP_CMD_HISTORY_TABLE
|
||||
/* ARGET history */
|
||||
#define ARGET_History NULL
|
||||
#endif
|
||||
|
||||
#ifndef SKIP_CMD_TIPS_TABLE
|
||||
/* ARGET tips */
|
||||
#define ARGET_Tips NULL
|
||||
#endif
|
||||
|
||||
#ifndef SKIP_CMD_KEY_SPECS_TABLE
|
||||
/* ARGET key specs */
|
||||
keySpec ARGET_Keyspecs[1] = {
|
||||
{NULL,CMD_KEY_RO|CMD_KEY_ACCESS,KSPEC_BS_INDEX,.bs.index={1},KSPEC_FK_RANGE,.fk.range={0,1,0}}
|
||||
};
|
||||
#endif
|
||||
|
||||
/* ARGET argument table */
|
||||
struct COMMAND_ARG ARGET_Args[] = {
|
||||
{MAKE_ARG("key",ARG_TYPE_KEY,0,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
|
||||
{MAKE_ARG("index",ARG_TYPE_INTEGER,-1,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
|
||||
};
|
||||
|
||||
/********** ARGETRANGE ********************/

#ifndef SKIP_CMD_HISTORY_TABLE
/* ARGETRANGE history */
#define ARGETRANGE_History NULL
#endif

#ifndef SKIP_CMD_TIPS_TABLE
/* ARGETRANGE tips */
#define ARGETRANGE_Tips NULL
#endif

#ifndef SKIP_CMD_KEY_SPECS_TABLE
/* ARGETRANGE key specs */
keySpec ARGETRANGE_Keyspecs[1] = {
{NULL,CMD_KEY_RO|CMD_KEY_ACCESS,KSPEC_BS_INDEX,.bs.index={1},KSPEC_FK_RANGE,.fk.range={0,1,0}}
};
#endif

/* ARGETRANGE argument table */
struct COMMAND_ARG ARGETRANGE_Args[] = {
{MAKE_ARG("key",ARG_TYPE_KEY,0,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("start",ARG_TYPE_INTEGER,-1,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("end",ARG_TYPE_INTEGER,-1,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
};

/********** ARGREP ********************/

#ifndef SKIP_CMD_HISTORY_TABLE
/* ARGREP history */
#define ARGREP_History NULL
#endif

#ifndef SKIP_CMD_TIPS_TABLE
/* ARGREP tips */
#define ARGREP_Tips NULL
#endif

#ifndef SKIP_CMD_KEY_SPECS_TABLE
/* ARGREP key specs */
keySpec ARGREP_Keyspecs[1] = {
{NULL,CMD_KEY_RO|CMD_KEY_ACCESS,KSPEC_BS_INDEX,.bs.index={1},KSPEC_FK_RANGE,.fk.range={0,1,0}}
};
#endif

/* ARGREP predicate exact argument table */
struct COMMAND_ARG ARGREP_predicate_exact_Subargs[] = {
{MAKE_ARG("exact",ARG_TYPE_PURE_TOKEN,-1,"EXACT",NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("string",ARG_TYPE_STRING,-1,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
};

/* ARGREP predicate match argument table */
struct COMMAND_ARG ARGREP_predicate_match_Subargs[] = {
{MAKE_ARG("match",ARG_TYPE_PURE_TOKEN,-1,"MATCH",NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("string",ARG_TYPE_STRING,-1,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
};

/* ARGREP predicate glob argument table */
struct COMMAND_ARG ARGREP_predicate_glob_Subargs[] = {
{MAKE_ARG("glob",ARG_TYPE_PURE_TOKEN,-1,"GLOB",NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("pattern",ARG_TYPE_STRING,-1,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
};

/* ARGREP predicate re argument table */
struct COMMAND_ARG ARGREP_predicate_re_Subargs[] = {
{MAKE_ARG("re",ARG_TYPE_PURE_TOKEN,-1,"RE",NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("pattern",ARG_TYPE_STRING,-1,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
};

/* ARGREP predicate argument table */
struct COMMAND_ARG ARGREP_predicate_Subargs[] = {
{MAKE_ARG("exact",ARG_TYPE_BLOCK,-1,NULL,NULL,NULL,CMD_ARG_NONE,2,NULL),.subargs=ARGREP_predicate_exact_Subargs},
{MAKE_ARG("match",ARG_TYPE_BLOCK,-1,NULL,NULL,NULL,CMD_ARG_NONE,2,NULL),.subargs=ARGREP_predicate_match_Subargs},
{MAKE_ARG("glob",ARG_TYPE_BLOCK,-1,NULL,NULL,NULL,CMD_ARG_NONE,2,NULL),.subargs=ARGREP_predicate_glob_Subargs},
{MAKE_ARG("re",ARG_TYPE_BLOCK,-1,NULL,NULL,NULL,CMD_ARG_NONE,2,NULL),.subargs=ARGREP_predicate_re_Subargs},
};

/* ARGREP options argument table */
struct COMMAND_ARG ARGREP_options_Subargs[] = {
{MAKE_ARG("and",ARG_TYPE_PURE_TOKEN,-1,"AND",NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("or",ARG_TYPE_PURE_TOKEN,-1,"OR",NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("limit",ARG_TYPE_INTEGER,-1,"LIMIT",NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("withvalues",ARG_TYPE_PURE_TOKEN,-1,"WITHVALUES",NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("nocase",ARG_TYPE_PURE_TOKEN,-1,"NOCASE",NULL,NULL,CMD_ARG_NONE,0,NULL)},
};

/* ARGREP argument table */
struct COMMAND_ARG ARGREP_Args[] = {
{MAKE_ARG("key",ARG_TYPE_KEY,0,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("start",ARG_TYPE_STRING,-1,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("end",ARG_TYPE_STRING,-1,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("predicate",ARG_TYPE_ONEOF,-1,NULL,NULL,NULL,CMD_ARG_MULTIPLE,4,NULL),.subargs=ARGREP_predicate_Subargs},
{MAKE_ARG("options",ARG_TYPE_ONEOF,-1,NULL,NULL,NULL,CMD_ARG_OPTIONAL|CMD_ARG_MULTIPLE,5,NULL),.subargs=ARGREP_options_Subargs},
};

/********** ARINFO ********************/

#ifndef SKIP_CMD_HISTORY_TABLE
/* ARINFO history */
#define ARINFO_History NULL
#endif

#ifndef SKIP_CMD_TIPS_TABLE
/* ARINFO tips */
#define ARINFO_Tips NULL
#endif

#ifndef SKIP_CMD_KEY_SPECS_TABLE
/* ARINFO key specs */
keySpec ARINFO_Keyspecs[1] = {
{NULL,CMD_KEY_RO|CMD_KEY_ACCESS,KSPEC_BS_INDEX,.bs.index={1},KSPEC_FK_RANGE,.fk.range={0,1,0}}
};
#endif

/* ARINFO argument table */
struct COMMAND_ARG ARINFO_Args[] = {
{MAKE_ARG("key",ARG_TYPE_KEY,0,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("full",ARG_TYPE_PURE_TOKEN,-1,"FULL",NULL,NULL,CMD_ARG_OPTIONAL,0,NULL)},
};

/********** ARINSERT ********************/

#ifndef SKIP_CMD_HISTORY_TABLE
/* ARINSERT history */
#define ARINSERT_History NULL
#endif

#ifndef SKIP_CMD_TIPS_TABLE
/* ARINSERT tips */
#define ARINSERT_Tips NULL
#endif

#ifndef SKIP_CMD_KEY_SPECS_TABLE
/* ARINSERT key specs */
keySpec ARINSERT_Keyspecs[1] = {
{NULL,CMD_KEY_RW|CMD_KEY_UPDATE,KSPEC_BS_INDEX,.bs.index={1},KSPEC_FK_RANGE,.fk.range={0,1,0}}
};
#endif

/* ARINSERT argument table */
struct COMMAND_ARG ARINSERT_Args[] = {
{MAKE_ARG("key",ARG_TYPE_KEY,0,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("value",ARG_TYPE_STRING,-1,NULL,NULL,NULL,CMD_ARG_MULTIPLE,0,NULL)},
};

/********** ARLASTITEMS ********************/

#ifndef SKIP_CMD_HISTORY_TABLE
/* ARLASTITEMS history */
#define ARLASTITEMS_History NULL
#endif

#ifndef SKIP_CMD_TIPS_TABLE
/* ARLASTITEMS tips */
#define ARLASTITEMS_Tips NULL
#endif

#ifndef SKIP_CMD_KEY_SPECS_TABLE
/* ARLASTITEMS key specs */
keySpec ARLASTITEMS_Keyspecs[1] = {
{NULL,CMD_KEY_RO|CMD_KEY_ACCESS,KSPEC_BS_INDEX,.bs.index={1},KSPEC_FK_RANGE,.fk.range={0,1,0}}
};
#endif

/* ARLASTITEMS argument table */
struct COMMAND_ARG ARLASTITEMS_Args[] = {
{MAKE_ARG("key",ARG_TYPE_KEY,0,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("count",ARG_TYPE_INTEGER,-1,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("rev",ARG_TYPE_PURE_TOKEN,-1,"REV",NULL,NULL,CMD_ARG_OPTIONAL,0,NULL)},
};

/********** ARLEN ********************/

#ifndef SKIP_CMD_HISTORY_TABLE
/* ARLEN history */
#define ARLEN_History NULL
#endif

#ifndef SKIP_CMD_TIPS_TABLE
/* ARLEN tips */
#define ARLEN_Tips NULL
#endif

#ifndef SKIP_CMD_KEY_SPECS_TABLE
/* ARLEN key specs */
keySpec ARLEN_Keyspecs[1] = {
{NULL,CMD_KEY_RO|CMD_KEY_ACCESS,KSPEC_BS_INDEX,.bs.index={1},KSPEC_FK_RANGE,.fk.range={0,1,0}}
};
#endif

/* ARLEN argument table */
struct COMMAND_ARG ARLEN_Args[] = {
{MAKE_ARG("key",ARG_TYPE_KEY,0,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
};

/********** ARMGET ********************/

#ifndef SKIP_CMD_HISTORY_TABLE
/* ARMGET history */
#define ARMGET_History NULL
#endif

#ifndef SKIP_CMD_TIPS_TABLE
/* ARMGET tips */
#define ARMGET_Tips NULL
#endif

#ifndef SKIP_CMD_KEY_SPECS_TABLE
/* ARMGET key specs */
keySpec ARMGET_Keyspecs[1] = {
{NULL,CMD_KEY_RO|CMD_KEY_ACCESS,KSPEC_BS_INDEX,.bs.index={1},KSPEC_FK_RANGE,.fk.range={0,1,0}}
};
#endif

/* ARMGET argument table */
struct COMMAND_ARG ARMGET_Args[] = {
{MAKE_ARG("key",ARG_TYPE_KEY,0,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("index",ARG_TYPE_INTEGER,-1,NULL,NULL,NULL,CMD_ARG_MULTIPLE,0,NULL)},
};

/********** ARMSET ********************/

#ifndef SKIP_CMD_HISTORY_TABLE
/* ARMSET history */
#define ARMSET_History NULL
#endif

#ifndef SKIP_CMD_TIPS_TABLE
/* ARMSET tips */
#define ARMSET_Tips NULL
#endif

#ifndef SKIP_CMD_KEY_SPECS_TABLE
/* ARMSET key specs */
keySpec ARMSET_Keyspecs[1] = {
{NULL,CMD_KEY_RW|CMD_KEY_UPDATE,KSPEC_BS_INDEX,.bs.index={1},KSPEC_FK_RANGE,.fk.range={0,1,0}}
};
#endif

/* ARMSET data argument table */
struct COMMAND_ARG ARMSET_data_Subargs[] = {
{MAKE_ARG("index",ARG_TYPE_INTEGER,-1,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("value",ARG_TYPE_STRING,-1,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
};

/* ARMSET argument table */
struct COMMAND_ARG ARMSET_Args[] = {
{MAKE_ARG("key",ARG_TYPE_KEY,0,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("data",ARG_TYPE_BLOCK,-1,NULL,NULL,NULL,CMD_ARG_MULTIPLE,2,NULL),.subargs=ARMSET_data_Subargs},
};

/********** ARNEXT ********************/

#ifndef SKIP_CMD_HISTORY_TABLE
/* ARNEXT history */
#define ARNEXT_History NULL
#endif

#ifndef SKIP_CMD_TIPS_TABLE
/* ARNEXT tips */
#define ARNEXT_Tips NULL
#endif

#ifndef SKIP_CMD_KEY_SPECS_TABLE
/* ARNEXT key specs */
keySpec ARNEXT_Keyspecs[1] = {
{NULL,CMD_KEY_RO|CMD_KEY_ACCESS,KSPEC_BS_INDEX,.bs.index={1},KSPEC_FK_RANGE,.fk.range={0,1,0}}
};
#endif

/* ARNEXT argument table */
struct COMMAND_ARG ARNEXT_Args[] = {
{MAKE_ARG("key",ARG_TYPE_KEY,0,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
};

/********** AROP ********************/

#ifndef SKIP_CMD_HISTORY_TABLE
/* AROP history */
#define AROP_History NULL
#endif

#ifndef SKIP_CMD_TIPS_TABLE
/* AROP tips */
#define AROP_Tips NULL
#endif

#ifndef SKIP_CMD_KEY_SPECS_TABLE
/* AROP key specs */
keySpec AROP_Keyspecs[1] = {
{NULL,CMD_KEY_RO|CMD_KEY_ACCESS,KSPEC_BS_INDEX,.bs.index={1},KSPEC_FK_RANGE,.fk.range={0,1,0}}
};
#endif

/* AROP operation match argument table */
struct COMMAND_ARG AROP_operation_match_Subargs[] = {
{MAKE_ARG("match",ARG_TYPE_PURE_TOKEN,-1,"MATCH",NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("value",ARG_TYPE_STRING,-1,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
};

/* AROP operation argument table */
struct COMMAND_ARG AROP_operation_Subargs[] = {
{MAKE_ARG("sum",ARG_TYPE_PURE_TOKEN,-1,"SUM",NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("min",ARG_TYPE_PURE_TOKEN,-1,"MIN",NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("max",ARG_TYPE_PURE_TOKEN,-1,"MAX",NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("and",ARG_TYPE_PURE_TOKEN,-1,"AND",NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("or",ARG_TYPE_PURE_TOKEN,-1,"OR",NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("xor",ARG_TYPE_PURE_TOKEN,-1,"XOR",NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("match",ARG_TYPE_BLOCK,-1,NULL,NULL,NULL,CMD_ARG_NONE,2,NULL),.subargs=AROP_operation_match_Subargs},
{MAKE_ARG("used",ARG_TYPE_PURE_TOKEN,-1,"USED",NULL,NULL,CMD_ARG_NONE,0,NULL)},
};

/* AROP argument table */
struct COMMAND_ARG AROP_Args[] = {
{MAKE_ARG("key",ARG_TYPE_KEY,0,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("start",ARG_TYPE_INTEGER,-1,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("end",ARG_TYPE_INTEGER,-1,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("operation",ARG_TYPE_ONEOF,-1,NULL,NULL,NULL,CMD_ARG_NONE,8,NULL),.subargs=AROP_operation_Subargs},
};

/********** ARRING ********************/

#ifndef SKIP_CMD_HISTORY_TABLE
/* ARRING history */
#define ARRING_History NULL
#endif

#ifndef SKIP_CMD_TIPS_TABLE
/* ARRING tips */
#define ARRING_Tips NULL
#endif

#ifndef SKIP_CMD_KEY_SPECS_TABLE
/* ARRING key specs */
keySpec ARRING_Keyspecs[1] = {
{NULL,CMD_KEY_RW|CMD_KEY_UPDATE,KSPEC_BS_INDEX,.bs.index={1},KSPEC_FK_RANGE,.fk.range={0,1,0}}
};
#endif

/* ARRING argument table */
struct COMMAND_ARG ARRING_Args[] = {
{MAKE_ARG("key",ARG_TYPE_KEY,0,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("size",ARG_TYPE_INTEGER,-1,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("value",ARG_TYPE_STRING,-1,NULL,NULL,NULL,CMD_ARG_MULTIPLE,0,NULL)},
};

/********** ARSCAN ********************/

#ifndef SKIP_CMD_HISTORY_TABLE
/* ARSCAN history */
#define ARSCAN_History NULL
#endif

#ifndef SKIP_CMD_TIPS_TABLE
/* ARSCAN tips */
#define ARSCAN_Tips NULL
#endif

#ifndef SKIP_CMD_KEY_SPECS_TABLE
/* ARSCAN key specs */
keySpec ARSCAN_Keyspecs[1] = {
{NULL,CMD_KEY_RO|CMD_KEY_ACCESS,KSPEC_BS_INDEX,.bs.index={1},KSPEC_FK_RANGE,.fk.range={0,1,0}}
};
#endif

/* ARSCAN argument table */
struct COMMAND_ARG ARSCAN_Args[] = {
{MAKE_ARG("key",ARG_TYPE_KEY,0,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("start",ARG_TYPE_INTEGER,-1,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("end",ARG_TYPE_INTEGER,-1,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("limit",ARG_TYPE_INTEGER,-1,"LIMIT",NULL,NULL,CMD_ARG_OPTIONAL,0,NULL)},
};

/********** ARSEEK ********************/

#ifndef SKIP_CMD_HISTORY_TABLE
/* ARSEEK history */
#define ARSEEK_History NULL
#endif

#ifndef SKIP_CMD_TIPS_TABLE
/* ARSEEK tips */
#define ARSEEK_Tips NULL
#endif

#ifndef SKIP_CMD_KEY_SPECS_TABLE
/* ARSEEK key specs */
keySpec ARSEEK_Keyspecs[1] = {
{NULL,CMD_KEY_RW|CMD_KEY_UPDATE,KSPEC_BS_INDEX,.bs.index={1},KSPEC_FK_RANGE,.fk.range={0,1,0}}
};
#endif

/* ARSEEK argument table */
struct COMMAND_ARG ARSEEK_Args[] = {
{MAKE_ARG("key",ARG_TYPE_KEY,0,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("index",ARG_TYPE_INTEGER,-1,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
};

/********** ARSET ********************/

#ifndef SKIP_CMD_HISTORY_TABLE
/* ARSET history */
#define ARSET_History NULL
#endif

#ifndef SKIP_CMD_TIPS_TABLE
/* ARSET tips */
#define ARSET_Tips NULL
#endif

#ifndef SKIP_CMD_KEY_SPECS_TABLE
/* ARSET key specs */
keySpec ARSET_Keyspecs[1] = {
{NULL,CMD_KEY_RW|CMD_KEY_UPDATE,KSPEC_BS_INDEX,.bs.index={1},KSPEC_FK_RANGE,.fk.range={0,1,0}}
};
#endif

/* ARSET argument table */
struct COMMAND_ARG ARSET_Args[] = {
{MAKE_ARG("key",ARG_TYPE_KEY,0,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("index",ARG_TYPE_INTEGER,-1,NULL,NULL,NULL,CMD_ARG_NONE,0,NULL)},
{MAKE_ARG("value",ARG_TYPE_STRING,-1,NULL,NULL,NULL,CMD_ARG_MULTIPLE,0,NULL)},
};

/********** BITCOUNT ********************/

#ifndef SKIP_CMD_HISTORY_TABLE
@ -11876,6 +12406,25 @@ struct COMMAND_ARG WATCH_Args[] = {
/* Main command table */
struct COMMAND_STRUCT redisCommandTable[] = {
/* array */
{MAKE_CMD("arcount","Returns the number of non-empty elements in an array.","O(1)","8.8.0",CMD_DOC_NONE,NULL,NULL,"array",COMMAND_GROUP_ARRAY,ARCOUNT_History,0,ARCOUNT_Tips,0,arcountCommand,2,CMD_READONLY|CMD_FAST,ACL_CATEGORY_ARRAY,ARCOUNT_Keyspecs,1,NULL,1),.args=ARCOUNT_Args},
{MAKE_CMD("ardel","Deletes elements at the specified indices in an array.","O(N) where N is the number of indices to delete","8.8.0",CMD_DOC_NONE,NULL,NULL,"array",COMMAND_GROUP_ARRAY,ARDEL_History,0,ARDEL_Tips,0,ardelCommand,-3,CMD_WRITE|CMD_FAST,ACL_CATEGORY_ARRAY,ARDEL_Keyspecs,1,NULL,2),.args=ARDEL_Args},
{MAKE_CMD("ardelrange","Deletes elements in one or more ranges.","Proportional to the number of existing elements / slices touched, not to the numeric span of the requested ranges","8.8.0",CMD_DOC_NONE,NULL,NULL,"array",COMMAND_GROUP_ARRAY,ARDELRANGE_History,0,ARDELRANGE_Tips,0,ardelrangeCommand,-4,CMD_WRITE,ACL_CATEGORY_ARRAY,ARDELRANGE_Keyspecs,1,NULL,2),.args=ARDELRANGE_Args},
{MAKE_CMD("arget","Gets the value at an index in an array.","O(1)","8.8.0",CMD_DOC_NONE,NULL,NULL,"array",COMMAND_GROUP_ARRAY,ARGET_History,0,ARGET_Tips,0,argetCommand,3,CMD_READONLY|CMD_FAST,ACL_CATEGORY_ARRAY,ARGET_Keyspecs,1,NULL,2),.args=ARGET_Args},
{MAKE_CMD("argetrange","Gets values in a range of indices.","O(N) where N is the range length","8.8.0",CMD_DOC_NONE,NULL,NULL,"array",COMMAND_GROUP_ARRAY,ARGETRANGE_History,0,ARGETRANGE_Tips,0,argetrangeCommand,4,CMD_READONLY,ACL_CATEGORY_ARRAY,ARGETRANGE_Keyspecs,1,NULL,3),.args=ARGETRANGE_Args},
{MAKE_CMD("argrep","Searches array elements in a range using textual predicates.","O(P * C) where P is the number of visited positions in touched slices and C is the cost of evaluating the predicates on one existing element.","8.8.0",CMD_DOC_NONE,NULL,NULL,"array",COMMAND_GROUP_ARRAY,ARGREP_History,0,ARGREP_Tips,0,argrepCommand,-6,CMD_READONLY,ACL_CATEGORY_ARRAY,ARGREP_Keyspecs,1,NULL,5),.args=ARGREP_Args},
{MAKE_CMD("arinfo","Returns metadata about an array.","O(1), or O(N) with FULL option where N is the number of slices.","8.8.0",CMD_DOC_NONE,NULL,NULL,"array",COMMAND_GROUP_ARRAY,ARINFO_History,0,ARINFO_Tips,0,arinfoCommand,-2,CMD_READONLY,ACL_CATEGORY_ARRAY,ARINFO_Keyspecs,1,NULL,2),.args=ARINFO_Args},
{MAKE_CMD("arinsert","Inserts one or more values at consecutive indices.","O(N) where N is the number of values","8.8.0",CMD_DOC_NONE,NULL,NULL,"array",COMMAND_GROUP_ARRAY,ARINSERT_History,0,ARINSERT_Tips,0,arinsertCommand,-3,CMD_WRITE|CMD_DENYOOM|CMD_FAST,ACL_CATEGORY_ARRAY,ARINSERT_Keyspecs,1,NULL,2),.args=ARINSERT_Args},
{MAKE_CMD("arlastitems","Returns the most recently inserted elements.","O(N) where N is the count","8.8.0",CMD_DOC_NONE,NULL,NULL,"array",COMMAND_GROUP_ARRAY,ARLASTITEMS_History,0,ARLASTITEMS_Tips,0,arlastitemsCommand,-3,CMD_READONLY,ACL_CATEGORY_ARRAY,ARLASTITEMS_Keyspecs,1,NULL,3),.args=ARLASTITEMS_Args},
{MAKE_CMD("arlen","Returns the length of an array (max index + 1).","O(1)","8.8.0",CMD_DOC_NONE,NULL,NULL,"array",COMMAND_GROUP_ARRAY,ARLEN_History,0,ARLEN_Tips,0,arlenCommand,2,CMD_READONLY|CMD_FAST,ACL_CATEGORY_ARRAY,ARLEN_Keyspecs,1,NULL,1),.args=ARLEN_Args},
{MAKE_CMD("armget","Gets values at multiple indices in an array.","O(N) where N is the number of indices","8.8.0",CMD_DOC_NONE,NULL,NULL,"array",COMMAND_GROUP_ARRAY,ARMGET_History,0,ARMGET_Tips,0,armgetCommand,-3,CMD_READONLY|CMD_FAST,ACL_CATEGORY_ARRAY,ARMGET_Keyspecs,1,NULL,2),.args=ARMGET_Args},
{MAKE_CMD("armset","Sets multiple index-value pairs in an array.","O(N) where N is the number of pairs","8.8.0",CMD_DOC_NONE,NULL,NULL,"array",COMMAND_GROUP_ARRAY,ARMSET_History,0,ARMSET_Tips,0,armsetCommand,-4,CMD_WRITE|CMD_DENYOOM|CMD_FAST,ACL_CATEGORY_ARRAY,ARMSET_Keyspecs,1,NULL,2),.args=ARMSET_Args},
{MAKE_CMD("arnext","Returns the next index ARINSERT would use.","O(1)","8.8.0",CMD_DOC_NONE,NULL,NULL,"array",COMMAND_GROUP_ARRAY,ARNEXT_History,0,ARNEXT_Tips,0,arnextCommand,2,CMD_READONLY|CMD_FAST,ACL_CATEGORY_ARRAY,ARNEXT_Keyspecs,1,NULL,1),.args=ARNEXT_Args},
{MAKE_CMD("arop","Performs aggregate operations on array elements in a range.","O(P) where P is visited positions in touched slices (dense scanned slots + sparse entries), with worst-case O(|end-start|+1) and typical case close to O(N), where N is the number of existing elements in range.","8.8.0",CMD_DOC_NONE,NULL,NULL,"array",COMMAND_GROUP_ARRAY,AROP_History,0,AROP_Tips,0,aropCommand,-5,CMD_READONLY,ACL_CATEGORY_ARRAY,AROP_Keyspecs,1,NULL,4),.args=AROP_Args},
{MAKE_CMD("arring","Inserts values into a ring buffer of specified size, wrapping and truncating as needed.","O(M) normally, O(N+M) on ring resize, where N is the maximum of the old and new ring size and M is the number of inserted values","8.8.0",CMD_DOC_NONE,NULL,NULL,"array",COMMAND_GROUP_ARRAY,ARRING_History,0,ARRING_Tips,0,arringCommand,-4,CMD_WRITE|CMD_DENYOOM,ACL_CATEGORY_ARRAY,ARRING_Keyspecs,1,NULL,3),.args=ARRING_Args},
{MAKE_CMD("arscan","Iterates existing elements in a range, returning index-value pairs.","O(P) where P is visited positions in touched slices (dense scanned slots + sparse entries), with worst-case O(|end-start|+1) and typical case close to O(N), where N is the number of existing elements in range.","8.8.0",CMD_DOC_NONE,NULL,NULL,"array",COMMAND_GROUP_ARRAY,ARSCAN_History,0,ARSCAN_Tips,0,arscanCommand,-4,CMD_READONLY,ACL_CATEGORY_ARRAY,ARSCAN_Keyspecs,1,NULL,4),.args=ARSCAN_Args},
{MAKE_CMD("arseek","Sets the ARINSERT / ARRING cursor to a specific index.","O(1)","8.8.0",CMD_DOC_NONE,NULL,NULL,"array",COMMAND_GROUP_ARRAY,ARSEEK_History,0,ARSEEK_Tips,0,arseekCommand,3,CMD_WRITE|CMD_FAST,ACL_CATEGORY_ARRAY,ARSEEK_Keyspecs,1,NULL,2),.args=ARSEEK_Args},
{MAKE_CMD("arset","Sets one or more contiguous values starting at an index in an array.","O(N) where N is the number of values","8.8.0",CMD_DOC_NONE,NULL,NULL,"array",COMMAND_GROUP_ARRAY,ARSET_History,0,ARSET_Tips,0,arsetCommand,-4,CMD_WRITE|CMD_DENYOOM|CMD_FAST,ACL_CATEGORY_ARRAY,ARSET_Keyspecs,1,NULL,3),.args=ARSET_Args},
/* bitmap */
{MAKE_CMD("bitcount","Counts the number of set bits (population counting) in a string.","O(N)","2.6.0",CMD_DOC_NONE,NULL,NULL,"bitmap",COMMAND_GROUP_BITMAP,BITCOUNT_History,1,BITCOUNT_Tips,0,bitcountCommand,-2,CMD_READONLY,ACL_CATEGORY_BITMAP,BITCOUNT_Keyspecs,1,NULL,2),.args=BITCOUNT_Args},
{MAKE_CMD("bitfield","Performs arbitrary bitfield integer operations on strings.","O(1) for each subcommand specified","3.2.0",CMD_DOC_NONE,NULL,NULL,"bitmap",COMMAND_GROUP_BITMAP,BITFIELD_History,0,BITFIELD_Tips,0,bitfieldCommand,-2,CMD_WRITE|CMD_DENYOOM,ACL_CATEGORY_BITMAP,BITFIELD_Keyspecs,1,bitfieldGetKeys,2),.args=BITFIELD_Args},

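The signed number that follows each command function in the table (for example `3` for `arget` and `-6` for `argrep`) is the command arity. By the usual Redis convention a positive arity is the exact argument count, command name included, while a negative arity is a minimum. A minimal sketch of that check follows; `arity_accepts` is a hypothetical helper written for illustration, not a function from the Redis source.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of Redis: returns true when a command
 * invoked with `argc` arguments (the command name counts as one) is
 * acceptable for the given arity. Positive arity means "exactly that
 * many", negative arity means "at least |arity|". */
bool arity_accepts(int arity, int argc) {
    assert(arity != 0); /* 0 is not a meaningful arity in this sketch */
    if (arity > 0) return argc == arity;
    return argc >= -arity;
}
```

Under this reading, `ARGET key 5` (three arguments) matches arity 3, and the shortest well-formed `ARGREP` call is `ARGREP key start end EXACT foo`, which has the six arguments that arity -6 requires.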
48
src/commands/arcount.json
Normal file
@ -0,0 +1,48 @@
{
    "ARCOUNT": {
        "summary": "Returns the number of non-empty elements in an array.",
        "complexity": "O(1)",
        "group": "array",
        "since": "8.8.0",
        "arity": 2,
        "function": "arcountCommand",
        "command_flags": [
            "READONLY",
            "FAST"
        ],
        "acl_categories": [
            "ARRAY"
        ],
        "key_specs": [
            {
                "flags": [
                    "RO",
                    "ACCESS"
                ],
                "begin_search": {
                    "index": {
                        "pos": 1
                    }
                },
                "find_keys": {
                    "range": {
                        "lastkey": 0,
                        "step": 1,
                        "limit": 0
                    }
                }
            }
        ],
        "reply_schema": {
            "description": "The number of non-empty elements, or 0 if key does not exist.",
            "type": "integer"
        },
        "arguments": [
            {
                "name": "key",
                "type": "key",
                "key_spec_index": 0
            }
        ]
    }
}
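Every array command in this change uses the same key spec: `begin_search` at index 1 and a `find_keys` range of `lastkey: 0`, `step: 1`, `limit: 0`, which resolves to the single key at `argv[1]`. The sketch below shows how such a range spec maps to argument positions. The struct and function names are hypothetical, and the `limit` field is deliberately left out because, as far as the Redis key-spec documentation describes it, it only matters when `lastkey` is negative.

```c
#include <assert.h>

/* Simplified stand-in for a key spec with an index begin_search and a
 * range find_keys. Field names mirror the JSON; the struct itself is
 * hypothetical and not the Redis keySpec type. */
struct range_key_spec {
    int pos;     /* begin_search.index.pos: argv index of the first key */
    int lastkey; /* find_keys.range.lastkey: >= 0 is relative to pos,
                    negative counts back from the end of argv */
    int step;    /* find_keys.range.step: distance between two keys */
};

/* Writes the argv positions of the keys into `out` (capacity `cap`) and
 * returns how many positions were written. */
int key_positions(const struct range_key_spec *spec, int argc,
                  int *out, int cap) {
    assert(spec->step > 0);
    int first = spec->pos;
    int last = spec->lastkey >= 0 ? first + spec->lastkey
                                  : argc + spec->lastkey;
    int n = 0;
    for (int i = first; i <= last && i < argc && n < cap; i += spec->step)
        out[n++] = i;
    return n;
}
```

With `{1, 0, 1}` the only position returned is 1, whatever follows the key on the command line, which is why the index, range, and predicate arguments of the array commands are never mistaken for key names.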
53
src/commands/ardel.json
Normal file
@ -0,0 +1,53 @@
{
    "ARDEL": {
        "summary": "Deletes elements at the specified indices in an array.",
        "complexity": "O(N) where N is the number of indices to delete",
        "group": "array",
        "since": "8.8.0",
        "arity": -3,
        "function": "ardelCommand",
        "command_flags": [
            "WRITE",
            "FAST"
        ],
        "acl_categories": [
            "ARRAY"
        ],
        "key_specs": [
            {
                "flags": [
                    "RW",
                    "DELETE"
                ],
                "begin_search": {
                    "index": {
                        "pos": 1
                    }
                },
                "find_keys": {
                    "range": {
                        "lastkey": 0,
                        "step": 1,
                        "limit": 0
                    }
                }
            }
        ],
        "reply_schema": {
            "description": "Number of elements deleted.",
            "type": "integer"
        },
        "arguments": [
            {
                "name": "key",
                "type": "key",
                "key_spec_index": 0
            },
            {
                "name": "index",
                "type": "integer",
                "multiple": true
            }
        ]
    }
}
62
src/commands/ardelrange.json
Normal file
@ -0,0 +1,62 @@
{
    "ARDELRANGE": {
        "summary": "Deletes elements in one or more ranges.",
        "complexity": "Proportional to the number of existing elements / slices touched, not to the numeric span of the requested ranges",
        "group": "array",
        "since": "8.8.0",
        "arity": -4,
        "function": "ardelrangeCommand",
        "command_flags": [
            "WRITE"
        ],
        "acl_categories": [
            "ARRAY"
        ],
        "key_specs": [
            {
                "flags": [
                    "RW",
                    "DELETE"
                ],
                "begin_search": {
                    "index": {
                        "pos": 1
                    }
                },
                "find_keys": {
                    "range": {
                        "lastkey": 0,
                        "step": 1,
                        "limit": 0
                    }
                }
            }
        ],
        "reply_schema": {
            "description": "Number of elements deleted.",
            "type": "integer"
        },
        "arguments": [
            {
                "name": "key",
                "type": "key",
                "key_spec_index": 0
            },
            {
                "name": "range",
                "type": "block",
                "multiple": true,
                "arguments": [
                    {
                        "name": "start",
                        "type": "integer"
                    },
                    {
                        "name": "end",
                        "type": "integer"
                    }
                ]
            }
        ]
    }
}
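The argument list above is a key followed by a repeatable `start end` block, and the arity of -4 means at least one such block must be present. A small sketch of the shape check this implies is shown below. `range_pair_count` is a hypothetical helper, and treating a dangling `start` as an error is an assumption about how the command validates its input, not something the JSON states.

```c
#include <assert.h>

/* Hypothetical helper: number of (start, end) pairs in an ARDELRANGE-style
 * argv of length argc, or -1 when the shape is wrong. argv[0] is the
 * command name and argv[1] the key, so pairs begin at argv[2]. */
int range_pair_count(int argc) {
    if (argc < 4) return -1;            /* arity -4: at least one pair */
    if ((argc - 2) % 2 != 0) return -1; /* a start without its end */
    return (argc - 2) / 2;
}
```

For instance, `ARDELRANGE key 0 9 100 199` has six arguments and carries two ranges.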
60
src/commands/arget.json
Normal file
@ -0,0 +1,60 @@
{
    "ARGET": {
        "summary": "Gets the value at an index in an array.",
        "complexity": "O(1)",
        "group": "array",
        "since": "8.8.0",
        "arity": 3,
        "function": "argetCommand",
        "command_flags": [
            "READONLY",
            "FAST"
        ],
        "acl_categories": [
            "ARRAY"
        ],
        "key_specs": [
            {
                "flags": [
                    "RO",
                    "ACCESS"
                ],
                "begin_search": {
                    "index": {
                        "pos": 1
                    }
                },
                "find_keys": {
                    "range": {
                        "lastkey": 0,
                        "step": 1,
                        "limit": 0
                    }
                }
            }
        ],
        "reply_schema": {
            "oneOf": [
                {
                    "description": "The value at the given index.",
                    "type": "string"
                },
                {
                    "description": "Null reply if key or index does not exist.",
                    "type": "null"
                }
            ]
        },
        "arguments": [
            {
                "name": "key",
                "type": "key",
                "key_spec_index": 0
            },
            {
                "name": "index",
                "type": "integer"
            }
        ]
    }
}
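The ARGET reply schema separates an existing value from a null reply for an empty slot, while the command table defines ARLEN as max index + 1 and ARCOUNT as the number of non-empty elements. The toy model below illustrates how these three notions differ on a sparse array. It assumes a plain fixed-size C array of string pointers, which is not the sliced encoding Redis uses, and every name in it is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CAP 16 /* arbitrary capacity for the toy model */

/* Toy sparse array: NULL marks an empty slot. */
struct toy_array {
    const char *slot[TOY_CAP];
};

/* Like ARGET: the stored value, or NULL when the index holds nothing. */
const char *toy_get(const struct toy_array *a, size_t idx) {
    assert(a != NULL);
    return idx < TOY_CAP ? a->slot[idx] : NULL;
}

/* Like ARCOUNT: number of non-empty slots. */
size_t toy_count(const struct toy_array *a) {
    size_t n = 0;
    for (size_t i = 0; i < TOY_CAP; i++)
        if (a->slot[i] != NULL) n++;
    return n;
}

/* Like ARLEN: highest used index + 1, or 0 when nothing is set. */
size_t toy_len(const struct toy_array *a) {
    for (size_t i = TOY_CAP; i > 0; i--)
        if (a->slot[i - 1] != NULL) return i;
    return 0;
}
```

With values stored only at indexes 2 and 9, the count is 2, the length is 10, and a lookup at index 3 yields the null case.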
64
src/commands/argetrange.json
Normal file
@ -0,0 +1,64 @@
{
    "ARGETRANGE": {
        "summary": "Gets values in a range of indices.",
        "complexity": "O(N) where N is the range length",
        "group": "array",
        "since": "8.8.0",
        "arity": 4,
        "function": "argetrangeCommand",
        "command_flags": [
            "READONLY"
        ],
        "acl_categories": [
            "ARRAY"
        ],
        "key_specs": [
            {
                "flags": [
                    "RO",
                    "ACCESS"
                ],
                "begin_search": {
                    "index": {
                        "pos": 1
                    }
                },
                "find_keys": {
                    "range": {
                        "lastkey": 0,
                        "step": 1,
                        "limit": 0
                    }
                }
            }
        ],
        "reply_schema": {
            "type": "array",
            "items": {
                "oneOf": [
                    {
                        "type": "string"
                    },
                    {
                        "type": "null"
                    }
                ]
            }
        },
        "arguments": [
            {
                "name": "key",
                "type": "key",
                "key_spec_index": 0
            },
            {
                "name": "start",
                "type": "integer"
            },
            {
                "name": "end",
                "type": "integer"
            }
        ]
    }
}
182
src/commands/argrep.json
Normal file
@ -0,0 +1,182 @@
{
    "ARGREP": {
        "summary": "Searches array elements in a range using textual predicates.",
        "complexity": "O(P * C) where P is the number of visited positions in touched slices and C is the cost of evaluating the predicates on one existing element.",
        "group": "array",
        "since": "8.8.0",
        "arity": -6,
        "function": "argrepCommand",
        "command_flags": [
            "READONLY"
        ],
        "acl_categories": [
            "ARRAY"
        ],
        "key_specs": [
            {
                "flags": [
                    "RO",
                    "ACCESS"
                ],
                "begin_search": {
                    "index": {
                        "pos": 1
                    }
                },
                "find_keys": {
                    "range": {
                        "lastkey": 0,
                        "step": 1,
                        "limit": 0
                    }
                }
            }
        ],
        "reply_schema": {
            "anyOf": [
                {
                    "description": "Array of matching indexes.",
                    "type": "array",
                    "items": {
                        "type": "integer",
                        "description": "Index of a matching element"
                    }
                },
                {
                    "description": "Array of [index, value] pairs. Returned in case `WITHVALUES` was used.",
                    "type": "array",
                    "items": {
                        "type": "array",
                        "minItems": 2,
                        "maxItems": 2,
                        "items": [
                            {
                                "type": "integer",
                                "description": "Index of a matching element"
                            },
                            {
                                "type": "string",
                                "description": "Value at that index"
                            }
                        ]
                    }
                }
            ]
        },
        "arguments": [
            {
                "name": "key",
                "type": "key",
                "key_spec_index": 0
            },
            {
                "name": "start",
                "type": "string"
            },
            {
                "name": "end",
                "type": "string"
            },
            {
                "name": "predicate",
                "type": "oneof",
                "multiple": true,
                "arguments": [
                    {
                        "name": "exact",
                        "type": "block",
                        "arguments": [
                            {
                                "name": "exact",
                                "type": "pure-token",
                                "token": "EXACT"
                            },
                            {
                                "name": "string",
                                "type": "string"
                            }
                        ]
                    },
                    {
                        "name": "match",
                        "type": "block",
                        "arguments": [
                            {
                                "name": "match",
                                "type": "pure-token",
                                "token": "MATCH"
                            },
                            {
                                "name": "string",
                                "type": "string"
                            }
                        ]
                    },
                    {
                        "name": "glob",
                        "type": "block",
                        "arguments": [
                            {
                                "name": "glob",
                                "type": "pure-token",
                                "token": "GLOB"
                            },
                            {
                                "name": "pattern",
                                "type": "string"
                            }
                        ]
                    },
                    {
                        "name": "re",
                        "type": "block",
                        "arguments": [
                            {
                                "name": "re",
                                "type": "pure-token",
                                "token": "RE"
                            },
                            {
                                "name": "pattern",
                                "type": "string"
                            }
                        ]
                    }
                ]
            },
            {
                "name": "options",
                "type": "oneof",
                "optional": true,
                "multiple": true,
                "arguments": [
                    {
                        "name": "and",
                        "type": "pure-token",
                        "token": "AND"
                    },
                    {
                        "name": "or",
                        "type": "pure-token",
                        "token": "OR"
                    },
                    {
                        "name": "limit",
                        "type": "integer",
                        "token": "LIMIT"
                    },
                    {
                        "name": "withvalues",
                        "type": "pure-token",
                        "token": "WITHVALUES"
                    },
                    {
                        "name": "nocase",
                        "type": "pure-token",
                        "token": "NOCASE"
                    }
                ]
            }
        ]
    }
}
103
src/commands/arinfo.json
Normal file
|
|
@ -0,0 +1,103 @@
|
|||
{
|
||||
"ARINFO": {
|
||||
"summary": "Returns metadata about an array.",
|
||||
"complexity": "O(1), or O(N) with FULL option where N is the number of slices.",
|
||||
"group": "array",
|
||||
"since": "8.8.0",
|
||||
"arity": -2,
|
||||
"function": "arinfoCommand",
|
||||
"command_flags": [
|
||||
"READONLY"
|
||||
],
|
||||
"acl_categories": [
|
||||
"ARRAY"
|
||||
],
|
||||
"key_specs": [
|
||||
{
|
||||
"flags": [
|
||||
"RO",
|
||||
"ACCESS"
|
||||
],
|
||||
"begin_search": {
|
||||
"index": {
|
||||
"pos": 1
|
||||
}
|
||||
},
|
||||
"find_keys": {
|
||||
"range": {
|
||||
"lastkey": 0,
|
||||
"step": 1,
|
||||
"limit": 0
|
||||
}
|
||||
}
|
||||
}
|
||||
],
|
||||
"reply_schema": {
|
||||
"type": "object",
|
||||
"additionalProperties": false,
|
||||
"properties": {
|
||||
"count": {
|
||||
"type": "integer",
|
||||
"description": "Total number of non-empty elements."
|
||||
},
|
||||
"len": {
|
||||
"type": "integer",
|
||||
"description": "Logical length (highest index + 1)."
|
||||
},
|
||||
"next-insert-index": {
|
||||
"type": "integer",
|
||||
"description": "Index the next ARINSERT would use, or 0 if unset/exhausted."
|
||||
},
|
||||
"slices": {
|
||||
"type": "integer",
|
||||
"description": "Number of allocated slices."
|
||||
},
|
||||
"directory-size": {
|
||||
"type": "integer",
|
||||
"description": "Directory allocation capacity (flat dir_alloc or superdir sdir_cap)."
|
||||
},
|
||||
"super-dir-entries": {
|
||||
"type": "integer",
|
||||
"description": "Number of super-directory entries (0 if not in superdir mode)."
|
||||
},
|
||||
"slice-size": {
|
||||
"type": "integer",
|
||||
"description": "Configured slice size."
|
||||
},
|
||||
"dense-slices": {
|
||||
"type": "integer",
|
||||
"description": "Number of dense slices (FULL only)."
|
||||
},
|
||||
"sparse-slices": {
|
||||
"type": "integer",
|
||||
"description": "Number of sparse slices (FULL only)."
|
||||
},
|
||||
"avg-dense-size": {
|
||||
"type": "number",
|
||||
"description": "Average allocation size of dense slices (FULL only)."
|
||||
},
|
||||
"avg-dense-fill": {
|
||||
"type": "number",
|
||||
"description": "Average fill rate of dense slices (FULL only)."
|
||||
},
|
||||
"avg-sparse-size": {
|
||||
"type": "number",
|
||||
"description": "Average capacity of sparse slices (FULL only)."
|
||||
}
|
||||
}
|
||||
},
|
||||
"arguments": [
|
||||
{
|
||||
"name": "key",
|
||||
"type": "key",
|
||||
"key_spec_index": 0
|
||||
},
|
||||
{
|
||||
"name": "full",
|
||||
"type": "pure-token",
|
||||
"token": "FULL",
|
||||
"optional": true
|
||||
}
|
||||
]
|
||||
}
|
||||
}
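The `count` and `len` fields in the ARINFO reply schema diverge as soon as an array has holes. A minimal C sketch of that distinction, assuming a hypothetical fixed-size slot table where `NULL` marks an empty position (`toy_count` and `toy_len` are illustrative names, not functions from the Redis source):

```c
#include <stddef.h>

/* Toy model only: a NULL slot stands for an empty array position. */

/* Number of non-empty slots, mirroring the "count" field. */
static size_t toy_count(const char *slots[], size_t cap) {
    size_t n = 0;
    for (size_t i = 0; i < cap; i++)
        if (slots[i] != NULL) n++;
    return n;
}

/* Highest non-empty index + 1 (0 when every slot is empty), mirroring "len". */
static size_t toy_len(const char *slots[], size_t cap) {
    for (size_t i = cap; i > 0; i--)
        if (slots[i - 1] != NULL) return i;
    return 0;
}
```

With values stored only at indexes 2 and 5, this model reports a count of 2 and a length of 6.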
|
||||
54
src/commands/arinsert.json
Normal file
|
|
@ -0,0 +1,54 @@
|
|||
{
|
||||
"ARINSERT": {
|
||||
"summary": "Inserts one or more values at consecutive indices.",
|
||||
"complexity": "O(N) where N is the number of values",
|
||||
"group": "array",
|
||||
"since": "8.8.0",
|
||||
"arity": -3,
|
||||
"function": "arinsertCommand",
|
||||
"command_flags": [
|
||||
"WRITE",
|
||||
"DENYOOM",
|
||||
"FAST"
|
||||
],
|
||||
"acl_categories": [
|
||||
"ARRAY"
|
||||
],
|
||||
"key_specs": [
|
||||
{
|
||||
"flags": [
|
||||
"RW",
|
||||
"UPDATE"
|
||||
],
|
||||
"begin_search": {
|
||||
"index": {
|
||||
"pos": 1
|
||||
}
|
||||
},
|
||||
"find_keys": {
|
||||
"range": {
|
||||
"lastkey": 0,
|
||||
"step": 1,
|
||||
"limit": 0
|
||||
}
|
||||
}
|
||||
}
|
||||
],
|
||||
"reply_schema": {
|
||||
"description": "The last index where a value was inserted.",
|
||||
"type": "integer"
|
||||
},
|
||||
"arguments": [
|
||||
{
|
||||
"name": "key",
|
||||
"type": "key",
|
||||
"key_spec_index": 0
|
||||
},
|
||||
{
|
||||
"name": "value",
|
||||
"type": "string",
|
||||
"multiple": true
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
66
src/commands/arlastitems.json
Normal file
|
|
@ -0,0 +1,66 @@
|
|||
{
|
||||
"ARLASTITEMS": {
|
||||
"summary": "Returns the most recently inserted elements.",
|
||||
"complexity": "O(N) where N is the count",
|
||||
"group": "array",
|
||||
"since": "8.8.0",
|
||||
"arity": -3,
|
||||
"function": "arlastitemsCommand",
|
||||
"command_flags": [
|
||||
"READONLY"
|
||||
],
|
||||
"acl_categories": [
|
||||
"ARRAY"
|
||||
],
|
||||
"key_specs": [
|
||||
{
|
||||
"flags": [
|
||||
"RO",
|
||||
"ACCESS"
|
||||
],
|
||||
"begin_search": {
|
||||
"index": {
|
||||
"pos": 1
|
||||
}
|
||||
},
|
||||
"find_keys": {
|
||||
"range": {
|
||||
"lastkey": 0,
|
||||
"step": 1,
|
||||
"limit": 0
|
||||
}
|
||||
}
|
||||
}
|
||||
],
|
||||
"reply_schema": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"oneOf": [
|
||||
{
|
||||
"type": "string"
|
||||
},
|
||||
{
|
||||
"type": "null"
|
||||
}
|
||||
]
|
||||
}
|
||||
},
|
||||
"arguments": [
|
||||
{
|
||||
"name": "key",
|
||||
"type": "key",
|
||||
"key_spec_index": 0
|
||||
},
|
||||
{
|
||||
"name": "count",
|
||||
"type": "integer"
|
||||
},
|
||||
{
|
||||
"name": "rev",
|
||||
"type": "pure-token",
|
||||
"token": "REV",
|
||||
"optional": true
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
48
src/commands/arlen.json
Normal file
|
|
@ -0,0 +1,48 @@
|
|||
{
|
||||
"ARLEN": {
|
||||
"summary": "Returns the length of an array (max index + 1).",
|
||||
"complexity": "O(1)",
|
||||
"group": "array",
|
||||
"since": "8.8.0",
|
||||
"arity": 2,
|
||||
"function": "arlenCommand",
|
||||
"command_flags": [
|
||||
"READONLY",
|
||||
"FAST"
|
||||
],
|
||||
"acl_categories": [
|
||||
"ARRAY"
|
||||
],
|
||||
"key_specs": [
|
||||
{
|
||||
"flags": [
|
||||
"RO",
|
||||
"ACCESS"
|
||||
],
|
||||
"begin_search": {
|
||||
"index": {
|
||||
"pos": 1
|
||||
}
|
||||
},
|
||||
"find_keys": {
|
||||
"range": {
|
||||
"lastkey": 0,
|
||||
"step": 1,
|
||||
"limit": 0
|
||||
}
|
||||
}
|
||||
}
|
||||
],
|
||||
"reply_schema": {
|
||||
"description": "The length of the array (max index + 1), or 0 if key does not exist.",
|
||||
"type": "integer"
|
||||
},
|
||||
"arguments": [
|
||||
{
|
||||
"name": "key",
|
||||
"type": "key",
|
||||
"key_spec_index": 0
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
62
src/commands/armget.json
Normal file
|
|
@ -0,0 +1,62 @@
|
|||
{
|
||||
"ARMGET": {
|
||||
"summary": "Gets values at multiple indices in an array.",
|
||||
"complexity": "O(N) where N is the number of indices",
|
||||
"group": "array",
|
||||
"since": "8.8.0",
|
||||
"arity": -3,
|
||||
"function": "armgetCommand",
|
||||
"command_flags": [
|
||||
"READONLY",
|
||||
"FAST"
|
||||
],
|
||||
"acl_categories": [
|
||||
"ARRAY"
|
||||
],
|
||||
"key_specs": [
|
||||
{
|
||||
"flags": [
|
||||
"RO",
|
||||
"ACCESS"
|
||||
],
|
||||
"begin_search": {
|
||||
"index": {
|
||||
"pos": 1
|
||||
}
|
||||
},
|
||||
"find_keys": {
|
||||
"range": {
|
||||
"lastkey": 0,
|
||||
"step": 1,
|
||||
"limit": 0
|
||||
}
|
||||
}
|
||||
}
|
||||
],
|
||||
"reply_schema": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"oneOf": [
|
||||
{
|
||||
"type": "string"
|
||||
},
|
||||
{
|
||||
"type": "null"
|
||||
}
|
||||
]
|
||||
}
|
||||
},
|
||||
"arguments": [
|
||||
{
|
||||
"name": "key",
|
||||
"type": "key",
|
||||
"key_spec_index": 0
|
||||
},
|
||||
{
|
||||
"name": "index",
|
||||
"type": "integer",
|
||||
"multiple": true
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
64
src/commands/armset.json
Normal file
|
|
@ -0,0 +1,64 @@
|
|||
{
|
||||
"ARMSET": {
|
||||
"summary": "Sets multiple index-value pairs in an array.",
|
||||
"complexity": "O(N) where N is the number of pairs",
|
||||
"group": "array",
|
||||
"since": "8.8.0",
|
||||
"arity": -4,
|
||||
"function": "armsetCommand",
|
||||
"command_flags": [
|
||||
"WRITE",
|
||||
"DENYOOM",
|
||||
"FAST"
|
||||
],
|
||||
"acl_categories": [
|
||||
"ARRAY"
|
||||
],
|
||||
"key_specs": [
|
||||
{
|
||||
"flags": [
|
||||
"RW",
|
||||
"UPDATE"
|
||||
],
|
||||
"begin_search": {
|
||||
"index": {
|
||||
"pos": 1
|
||||
}
|
||||
},
|
||||
"find_keys": {
|
||||
"range": {
|
||||
"lastkey": 0,
|
||||
"step": 1,
|
||||
"limit": 0
|
||||
}
|
||||
}
|
||||
}
|
||||
],
|
||||
"reply_schema": {
|
||||
"description": "Number of new slots that were set (previously empty).",
|
||||
"type": "integer"
|
||||
},
|
||||
"arguments": [
|
||||
{
|
||||
"name": "key",
|
||||
"type": "key",
|
||||
"key_spec_index": 0
|
||||
},
|
||||
{
|
||||
"name": "data",
|
||||
"type": "block",
|
||||
"multiple": true,
|
||||
"arguments": [
|
||||
{
|
||||
"name": "index",
|
||||
"type": "integer"
|
||||
},
|
||||
{
|
||||
"name": "value",
|
||||
"type": "string"
|
||||
}
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
56
src/commands/arnext.json
Normal file
|
|
@ -0,0 +1,56 @@
|
|||
{
|
||||
"ARNEXT": {
|
||||
"summary": "Returns the next index ARINSERT would use.",
|
||||
"complexity": "O(1)",
|
||||
"group": "array",
|
||||
"since": "8.8.0",
|
||||
"arity": 2,
|
||||
"function": "arnextCommand",
|
||||
"command_flags": [
|
||||
"READONLY",
|
||||
"FAST"
|
||||
],
|
||||
"acl_categories": [
|
||||
"ARRAY"
|
||||
],
|
||||
"key_specs": [
|
||||
{
|
||||
"flags": [
|
||||
"RO",
|
||||
"ACCESS"
|
||||
],
|
||||
"begin_search": {
|
||||
"index": {
|
||||
"pos": 1
|
||||
}
|
||||
},
|
||||
"find_keys": {
|
||||
"range": {
|
||||
"lastkey": 0,
|
||||
"step": 1,
|
||||
"limit": 0
|
||||
}
|
||||
}
|
||||
}
|
||||
],
|
||||
"reply_schema": {
|
||||
"oneOf": [
|
||||
{
|
||||
"description": "The next index ARINSERT would use. Returns 0 for missing keys or when no insert happened yet.",
|
||||
"type": "integer"
|
||||
},
|
||||
{
|
||||
"description": "Null when the insertion cursor is exhausted (next insert would overflow).",
|
||||
"type": "null"
|
||||
}
|
||||
]
|
||||
},
|
||||
"arguments": [
|
||||
{
|
||||
"name": "key",
|
||||
"type": "key",
|
||||
"key_spec_index": 0
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
123
src/commands/arop.json
Normal file
|
|
@ -0,0 +1,123 @@
|
|||
{
|
||||
"AROP": {
|
||||
"summary": "Performs aggregate operations on array elements in a range.",
|
||||
"complexity": "O(P) where P is visited positions in touched slices (dense scanned slots + sparse entries), with worst-case O(|end-start|+1) and typical case close to O(N), where N is the number of existing elements in range.",
|
||||
"group": "array",
|
||||
"since": "8.8.0",
|
||||
"arity": -5,
|
||||
"function": "aropCommand",
|
||||
"command_flags": [
|
||||
"READONLY"
|
||||
],
|
||||
"acl_categories": [
|
||||
"ARRAY"
|
||||
],
|
||||
"key_specs": [
|
||||
{
|
||||
"flags": [
|
||||
"RO",
|
||||
"ACCESS"
|
||||
],
|
||||
"begin_search": {
|
||||
"index": {
|
||||
"pos": 1
|
||||
}
|
||||
},
|
||||
"find_keys": {
|
||||
"range": {
|
||||
"lastkey": 0,
|
||||
"step": 1,
|
||||
"limit": 0
|
||||
}
|
||||
}
|
||||
}
|
||||
],
|
||||
"reply_schema": {
|
||||
"oneOf": [
|
||||
{
|
||||
"description": "Result of the operation.",
|
||||
"type": "string"
|
||||
},
|
||||
{
|
||||
"description": "Integer result for MATCH, USED, AND, OR, XOR.",
|
||||
"type": "integer"
|
||||
},
|
||||
{
|
||||
"description": "Null if no elements match the operation.",
|
||||
"type": "null"
|
||||
}
|
||||
]
|
||||
},
|
||||
"arguments": [
|
||||
{
|
||||
"name": "key",
|
||||
"type": "key",
|
||||
"key_spec_index": 0
|
||||
},
|
||||
{
|
||||
"name": "start",
|
||||
"type": "integer"
|
||||
},
|
||||
{
|
||||
"name": "end",
|
||||
"type": "integer"
|
||||
},
|
||||
{
|
||||
"name": "operation",
|
||||
"type": "oneof",
|
||||
"arguments": [
|
||||
{
|
||||
"name": "sum",
|
||||
"type": "pure-token",
|
||||
"token": "SUM"
|
||||
},
|
||||
{
|
||||
"name": "min",
|
||||
"type": "pure-token",
|
||||
"token": "MIN"
|
||||
},
|
||||
{
|
||||
"name": "max",
|
||||
"type": "pure-token",
|
||||
"token": "MAX"
|
||||
},
|
||||
{
|
||||
"name": "and",
|
||||
"type": "pure-token",
|
||||
"token": "AND"
|
||||
},
|
||||
{
|
||||
"name": "or",
|
||||
"type": "pure-token",
|
||||
"token": "OR"
|
||||
},
|
||||
{
|
||||
"name": "xor",
|
||||
"type": "pure-token",
|
||||
"token": "XOR"
|
||||
},
|
||||
{
|
||||
"name": "match",
|
||||
"type": "block",
|
||||
"arguments": [
|
||||
{
|
||||
"name": "match",
|
||||
"type": "pure-token",
|
||||
"token": "MATCH"
|
||||
},
|
||||
{
|
||||
"name": "value",
|
||||
"type": "string"
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "used",
|
||||
"type": "pure-token",
|
||||
"token": "USED"
|
||||
}
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
}
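The AROP schema says that `USED` and `MATCH` produce integer results over a range. One plausible reading, sketched below in C, is that `USED` counts the existing elements in the range and `MATCH` counts those equal to the given value. That interpretation is an assumption, as are the toy slot table and the names `toy_used` and `toy_match`; none of this is taken from the Redis implementation.

```c
#include <stddef.h>
#include <string.h>

/* Toy model only: a NULL slot stands for an empty array position.
 * Callers must keep start <= end and end inside the slot table. */

/* Count non-empty slots in the inclusive range [start, end]. */
static size_t toy_used(const char *slots[], size_t start, size_t end) {
    size_t n = 0;
    for (size_t i = start; i <= end; i++)
        if (slots[i] != NULL) n++;
    return n;
}

/* Count slots in [start, end] whose value equals `value` exactly. */
static size_t toy_match(const char *slots[], size_t start, size_t end,
                        const char *value) {
    size_t n = 0;
    for (size_t i = start; i <= end; i++)
        if (slots[i] != NULL && strcmp(slots[i], value) == 0) n++;
    return n;
}
```

Both loops visit every position in the range, which corresponds to the worst-case bound quoted in the complexity string; the real command is described as skipping untouched slices.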
|
||||
57
src/commands/arring.json
Normal file
57
src/commands/arring.json
Normal file
|
|
@ -0,0 +1,57 @@
|
|||
{
|
||||
"ARRING": {
|
||||
"summary": "Inserts values into a ring buffer of specified size, wrapping and truncating as needed.",
|
||||
"complexity": "O(M) normally, O(N+M) on ring resize, where N is the maximum of the old and new ring size and M is the number of inserted values",
|
||||
"group": "array",
|
||||
"since": "8.8.0",
|
||||
"arity": -4,
|
||||
"function": "arringCommand",
|
||||
"command_flags": [
|
||||
"WRITE",
|
||||
"DENYOOM"
|
||||
],
|
||||
"acl_categories": [
|
||||
"ARRAY"
|
||||
],
|
||||
"key_specs": [
|
||||
{
|
||||
"flags": [
|
||||
"RW",
|
||||
"UPDATE"
|
||||
],
|
||||
"begin_search": {
|
||||
"index": {
|
||||
"pos": 1
|
||||
}
|
||||
},
|
||||
"find_keys": {
|
||||
"range": {
|
||||
"lastkey": 0,
|
||||
"step": 1,
|
||||
"limit": 0
|
||||
}
|
||||
}
|
||||
}
|
||||
],
|
||||
"reply_schema": {
|
||||
"description": "The last index where a value was inserted.",
|
||||
"type": "integer"
|
||||
},
|
||||
"arguments": [
|
||||
{
|
||||
"name": "key",
|
||||
"type": "key",
|
||||
"key_spec_index": 0
|
||||
},
|
||||
{
|
||||
"name": "size",
|
||||
"type": "integer"
|
||||
},
|
||||
{
|
||||
"name": "value",
|
||||
"type": "string",
|
||||
"multiple": true
|
||||
}
|
||||
]
|
||||
}
|
||||
}
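The ARRING summary describes inserting into a ring buffer that wraps once it reaches the requested size. The wrapping part can be sketched in a few lines of C. This is a toy model under stated assumptions: the `toy_ring` type, its fixed capacity and `toy_ring_insert` are hypothetical, and the resize and truncation behaviour mentioned in the command spec is not modeled.

```c
#include <stddef.h>

#define TOY_RING_CAP 8

/* Toy ring buffer: `size` must be between 1 and TOY_RING_CAP. */
typedef struct {
    const char *slots[TOY_RING_CAP];
    size_t size; /* logical ring size */
    size_t next; /* index the next insert will use */
} toy_ring;

/* Store one value at the cursor, advance the cursor with wrap-around,
 * and return the index that was written. */
static size_t toy_ring_insert(toy_ring *r, const char *value) {
    size_t idx = r->next;
    r->slots[idx] = value;
    r->next = (idx + 1) % r->size;
    return idx;
}
```

In a ring of size 3, the fourth insert lands on index 0 again and overwrites the oldest value.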
|
||||
76
src/commands/arscan.json
Normal file
76
src/commands/arscan.json
Normal file
|
|
@ -0,0 +1,76 @@
|
|||
{
|
||||
"ARSCAN": {
|
||||
"summary": "Iterates existing elements in a range, returning index-value pairs.",
|
||||
"complexity": "O(P) where P is visited positions in touched slices (dense scanned slots + sparse entries), with worst-case O(|end-start|+1) and typical case close to O(N), where N is the number of existing elements in range.",
|
||||
"group": "array",
|
||||
"since": "8.8.0",
|
||||
"arity": -4,
|
||||
"function": "arscanCommand",
|
||||
"command_flags": [
|
||||
"READONLY"
|
||||
],
|
||||
"acl_categories": [
|
||||
"ARRAY"
|
||||
],
|
||||
"key_specs": [
|
||||
{
|
||||
"flags": [
|
||||
"RO",
|
||||
"ACCESS"
|
||||
],
|
||||
"begin_search": {
|
||||
"index": {
|
||||
"pos": 1
|
||||
}
|
||||
},
|
||||
"find_keys": {
|
||||
"range": {
|
||||
"lastkey": 0,
|
||||
"step": 1,
|
||||
"limit": 0
|
||||
}
|
||||
}
|
||||
}
|
||||
],
|
||||
"reply_schema": {
|
||||
"description": "Array of [index, value] pairs.",
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "array",
|
||||
"minItems": 2,
|
||||
"maxItems": 2,
|
||||
"items": [
|
||||
{
|
||||
"type": "integer",
|
||||
"description": "Index of existing element"
|
||||
},
|
||||
{
|
||||
"type": "string",
|
||||
"description": "Value at that index"
|
||||
}
|
||||
]
|
||||
}
|
||||
},
|
||||
"arguments": [
|
||||
{
|
||||
"name": "key",
|
||||
"type": "key",
|
||||
"key_spec_index": 0
|
||||
},
|
||||
{
|
||||
"name": "start",
|
||||
"type": "integer"
|
||||
},
|
||||
{
|
||||
"name": "end",
|
||||
"type": "integer"
|
||||
},
|
||||
{
|
||||
"name": "limit",
|
||||
"token": "LIMIT",
|
||||
"type": "integer",
|
||||
"optional": true
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
52
src/commands/arseek.json
Normal file
52
src/commands/arseek.json
Normal file
|
|
@ -0,0 +1,52 @@
|
|||
{
|
||||
"ARSEEK": {
|
||||
"summary": "Sets the ARINSERT / ARRING cursor to a specific index.",
|
||||
"complexity": "O(1)",
|
||||
"group": "array",
|
||||
"since": "8.8.0",
|
||||
"arity": 3,
|
||||
"function": "arseekCommand",
|
||||
"command_flags": [
|
||||
"WRITE",
|
||||
"FAST"
|
||||
],
|
||||
"acl_categories": [
|
||||
"ARRAY"
|
||||
],
|
||||
"key_specs": [
|
||||
{
|
||||
"flags": [
|
||||
"RW",
|
||||
"UPDATE"
|
||||
],
|
||||
"begin_search": {
|
||||
"index": {
|
||||
"pos": 1
|
||||
}
|
||||
},
|
||||
"find_keys": {
|
||||
"range": {
|
||||
"lastkey": 0,
|
||||
"step": 1,
|
||||
"limit": 0
|
||||
}
|
||||
}
|
||||
}
|
||||
],
|
||||
"reply_schema": {
|
||||
"description": "1 if the cursor was set, 0 if the key does not exist.",
|
||||
"type": "integer"
|
||||
},
|
||||
"arguments": [
|
||||
{
|
||||
"name": "key",
|
||||
"type": "key",
|
||||
"key_spec_index": 0
|
||||
},
|
||||
{
|
||||
"name": "index",
|
||||
"type": "integer"
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
58
src/commands/arset.json
Normal file
58
src/commands/arset.json
Normal file
|
|
@ -0,0 +1,58 @@
|
|||
{
|
||||
"ARSET": {
|
||||
"summary": "Sets one or more contiguous values starting at an index in an array.",
|
||||
"complexity": "O(N) where N is the number of values",
|
||||
"group": "array",
|
||||
"since": "8.8.0",
|
||||
"arity": -4,
|
||||
"function": "arsetCommand",
|
||||
"command_flags": [
|
||||
"WRITE",
|
||||
"DENYOOM",
|
||||
"FAST"
|
||||
],
|
||||
"acl_categories": [
|
||||
"ARRAY"
|
||||
],
|
||||
"key_specs": [
|
||||
{
|
||||
"flags": [
|
||||
"RW",
|
||||
"UPDATE"
|
||||
],
|
||||
"begin_search": {
|
||||
"index": {
|
||||
"pos": 1
|
||||
}
|
||||
},
|
||||
"find_keys": {
|
||||
"range": {
|
||||
"lastkey": 0,
|
||||
"step": 1,
|
||||
"limit": 0
|
||||
}
|
||||
}
|
||||
}
|
||||
],
|
||||
"reply_schema": {
|
||||
"description": "Number of new slots that were set (previously empty).",
|
||||
"type": "integer"
|
||||
},
|
||||
"arguments": [
|
||||
{
|
||||
"name": "key",
|
||||
"type": "key",
|
||||
"key_spec_index": 0
|
||||
},
|
||||
{
|
||||
"name": "index",
|
||||
"type": "integer"
|
||||
},
|
||||
{
|
||||
"name": "value",
|
||||
"type": "string",
|
||||
"multiple": true
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
|
|
@ -59,6 +59,9 @@
|
|||
{
|
||||
"const": "hyperloglog"
|
||||
},
|
||||
{
|
||||
"const": "array"
|
||||
},
|
||||
{
|
||||
"const": "list"
|
||||
},
|
||||
|
|
|
|||
31
src/config.c
|
|
@ -2461,6 +2461,33 @@ static int isValidProcTitleTemplate(char *val, const char **err) {
    return 1;
}

/* Validate that array-slice-size is a power of two */
static int isValidArraySliceSize(long long val, const char **err) {
    if (val <= 0 || (val & (val - 1)) != 0) {
        *err = "array-slice-size must be a power of two";
        return 0;
    }
    return 1;
}

/* Validate array-sparse-kmax: if non-zero, must be > kmin */
static int isValidArraySparseKmax(long long val, const char **err) {
    if (val > 0 && (unsigned int)val <= server.array_sparse_kmin) {
        *err = "array-sparse-kmax must be greater than array-sparse-kmin when non-zero";
        return 0;
    }
    return 1;
}

/* Validate array-sparse-kmin: must be < kmax when kmax is non-zero */
static int isValidArraySparseKmin(long long val, const char **err) {
    if (server.array_sparse_kmax > 0 && (unsigned int)val >= server.array_sparse_kmax) {
        *err = "array-sparse-kmin must be less than array-sparse-kmax";
        return 0;
    }
    return 1;
}

static int updateLocaleCollate(const char **err) {
    const char *s = setlocale(LC_COLLATE, server.locale_collate);
    if (s == NULL) {
|
|
@ -3252,6 +3279,10 @@ standardConfig static_configs[] = {
    createUIntConfig("socket-mark-id", NULL, IMMUTABLE_CONFIG, 0, UINT_MAX, server.socket_mark_id, 0, INTEGER_CONFIG, NULL, NULL),
    createUIntConfig("max-new-connections-per-cycle", NULL, MODIFIABLE_CONFIG, 1, 1000, server.max_new_conns_per_cycle, 10, INTEGER_CONFIG, NULL, NULL),
    createUIntConfig("max-new-tls-connections-per-cycle", NULL, MODIFIABLE_CONFIG, 1, 1000, server.max_new_tls_conns_per_cycle, 1, INTEGER_CONFIG, NULL, NULL),
    /* Array type configuration */
    createUIntConfig("array-slice-size", NULL, MODIFIABLE_CONFIG, AR_SLICE_SIZE_MIN, AR_SLICE_SIZE_MAX, server.array_slice_size, AR_SLICE_SIZE_DEFAULT, INTEGER_CONFIG, isValidArraySliceSize, NULL),
    createUIntConfig("array-sparse-kmax", NULL, MODIFIABLE_CONFIG, 0, 256, server.array_sparse_kmax, AR_SPARSE_KMAX_DEFAULT, INTEGER_CONFIG, isValidArraySparseKmax, NULL),
    createUIntConfig("array-sparse-kmin", NULL, MODIFIABLE_CONFIG, 0, 256, server.array_sparse_kmin, AR_SPARSE_KMIN_DEFAULT, INTEGER_CONFIG, isValidArraySparseKmin, NULL),
#ifdef LOG_REQ_RES
    createUIntConfig("client-default-resp", NULL, IMMUTABLE_CONFIG | HIDDEN_CONFIG, 2, 3, server.client_default_resp, 2, INTEGER_CONFIG, NULL, NULL),
#endif
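The three array validators in `src/config.c` reduce to two small predicates: a power-of-two test for `array-slice-size`, and an ordering rule between `array-sparse-kmin` and `array-sparse-kmax` in which a `kmax` of 0 is always accepted. A standalone C sketch of both checks follows; the helper names `is_power_of_two` and `sparse_bounds_ok` are illustrative only and do not appear in the patch.

```c
#include <stdbool.h>

/* A positive value is a power of two when it has exactly one bit set:
 * clearing the lowest set bit with val & (val - 1) must leave zero. */
static bool is_power_of_two(long long val) {
    return val > 0 && (val & (val - 1)) == 0;
}

/* Combined form of the kmin/kmax rule: kmax == 0 passes unconditionally,
 * otherwise kmax must be strictly greater than kmin. */
static bool sparse_bounds_ok(unsigned int kmin, unsigned int kmax) {
    return kmax == 0 || kmax > kmin;
}
```

Because each validator in the patch compares the new value against the other setting's current value, changing both settings at runtime may need to happen in an order that keeps this predicate true after every step.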
|
|
|
|||
14
src/db.c
|
|
@ -1751,14 +1751,15 @@ int parseScanCursorOrReply(client *c, robj *o, unsigned long long *cursor) {
|
|||
}
|
||||
|
||||
char *obj_type_name[OBJ_TYPE_MAX] = {
|
||||
"string",
|
||||
"list",
|
||||
"set",
|
||||
"zset",
|
||||
"hash",
|
||||
"string",
|
||||
"list",
|
||||
"set",
|
||||
"zset",
|
||||
"hash",
|
||||
NULL, /* module type is special */
|
||||
"stream",
|
||||
"gcra"
|
||||
"gcra",
|
||||
"array"
|
||||
};
|
||||
|
||||
/* Helper function to get type from a string in scan commands */
|
||||
|
|
@ -2438,6 +2439,7 @@ void copyCommand(client *c) {
|
|||
newobj = moduleTypeDupOrReply(c, key, newkey, dst->id, o);
|
||||
if (!newobj) return;
|
||||
break;
|
||||
case OBJ_ARRAY: newobj = arrayTypeDup(o); break;
|
||||
default:
|
||||
addReplyError(c, "unknown type object");
|
||||
return;
|
||||
|
|
|
|||
15
src/debug.c
|
|
@ -274,6 +274,21 @@ void xorObjectDigest(redisDb *db, robj *keyobj, unsigned char *digest, robj *o)
            mt->digest(&md,mv->value);
            xorDigest(digest,md.x,sizeof(md.x));
        }
    } else if (o->type == OBJ_ARRAY) {
        redisArray *ar = o->ptr;
        uint64_t len = arLen(ar);
        for (uint64_t idx = 0; idx < len; idx++) {
            void *v = arGet(ar, idx);
            if (arIsEmpty(v)) {
                /* For empty slots, contribute "(null)" */
                mixDigest(digest, "(null)", 6);
            } else {
                char vbuf[AR_INLINE_BUFSIZE];
                size_t vlen;
                const char *data = arDecode(v, vbuf, sizeof(vbuf), &vlen);
                mixDigest(digest, data, vlen);
            }
        }
    } else {
        serverPanic("Unknown object type");
    }
|
|
|
|||
32
src/defrag.c
|
|
@ -754,6 +754,32 @@ void defragSet(defragKeysCtx *ctx, kvobj *ob) {
        ob->ptr = newd;
}

/* Arrays can be expensive to defrag in one shot because they may contain many
 * independently allocated slices. Small arrays are defragmented immediately,
 * while large arrays are queued for later and processed one slice per step. */
void defragArray(defragKeysCtx *ctx, kvobj *ob) {
    serverAssert(ob->type == OBJ_ARRAY);
    /* Maybe arCount() is not the best possible value to check against
     * server.active_defrag_max_scan_fields, also because anyway when we
     * defrag incrementally, we defrag a single slice per call. Yet it makes
     * sense in a not very obvious way, for several reasons:
     *
     * 1. If the array is very sparse, it is an upper bound to the max
     *    number of slices it is composed of.
     * 2. If the array is dense, we will scan in the default case at most 4096
     *    entries, and the default defrag limit for max scans is 1000. They
     *    are kinda comparable numbers.
     * 3. In case of a highly sparse array with huge indexes, in superdir mode,
     *    the super blocks are still going to be at most arCount().
     *
     * So regardless of the fact we later will defrag in slice units, this
     * is a good trigger for the one shot or incremental selection. */
    if (arCount(ob->ptr) > server.active_defrag_max_scan_fields)
        defragLater(ctx, ob);
    else
        ob->ptr = arDefrag(ob->ptr, activeDefragAlloc);
}

/* Defrag callback for radix tree iterator, called for each node,
 * used in order to defrag the nodes allocations. */
int defragRaxNode(raxNode **noderef, void *privdata) {
|
|
@ -1172,6 +1198,8 @@ void defragKey(defragKeysCtx *ctx, dictEntry *de, dictEntryLink link) {
#endif
    } else if (ob->type == OBJ_MODULE) {
        defragModule(ctx,db, ob);
    } else if (ob->type == OBJ_ARRAY) {
        defragArray(ctx, ob);
    } else {
        serverPanic("Unknown object type");
    }
|
|
@ -1288,6 +1316,10 @@ int defragLaterItem(kvobj *ob, unsigned long *cursor, monotime endtime, int dbid
            robj keyobj;
            initStaticStringObject(keyobj, kvobjGetKey(ob));
            return moduleLateDefrag(&keyobj, ob, cursor, endtime, dbid);
        } else if (ob->type == OBJ_ARRAY) {
            redisArray *ar = ob->ptr;
            *cursor = arDefragIncremental(&ar, *cursor, activeDefragAlloc);
            ob->ptr = ar;
        } else {
            *cursor = 0; /* object type/encoding may have changed since we schedule it for later */
        }
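The `defrag.c` hunks above pick between two strategies: defragment a small array in one call, or queue a large one and process it one slice per step, driven by a cursor. The control flow can be sketched in plain C as below. Everything here is a hypothetical stand-in (`toy_array`, `toy_defrag_step`, `toy_defrag`), and the convention that a returned cursor of 0 means "finished" is an assumption rather than something confirmed by the patch.

```c
#include <stddef.h>

/* Toy stand-in for an array built from independently allocated slices. */
typedef struct {
    size_t nslices;   /* number of slices */
    size_t count;     /* number of stored elements */
    size_t defragged; /* slices processed so far (for the demo) */
} toy_array;

/* Process exactly one slice and return the next cursor, or 0 when done. */
static unsigned long toy_defrag_step(toy_array *a, unsigned long cursor) {
    if (cursor >= a->nslices) return 0;
    a->defragged++;
    cursor++;
    return cursor < a->nslices ? cursor : 0;
}

/* Choose one-shot or incremental processing from the element count.
 * Returns the number of calls that were needed. */
static size_t toy_defrag(toy_array *a, size_t max_scan_fields) {
    if (a->count <= max_scan_fields) {
        a->defragged = a->nslices; /* small array: everything in one call */
        return 1;
    }
    size_t steps = 0;
    unsigned long cursor = 0;
    do {
        cursor = toy_defrag_step(a, cursor);
        steps++;
    } while (cursor != 0);
    return steps;
}
```

In the server the incremental steps are spread across defrag cycles instead of running in a tight loop; the loop here only makes the cursor protocol visible.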
|
|
|
|||
|
|
@ -13,11 +13,6 @@
|
|||
#include "cluster.h"
|
||||
#include <sys/resource.h>
|
||||
|
||||
static inline int nearestNextPowerOf2(unsigned int count) {
|
||||
if (count <= 1) return 1;
|
||||
return 1 << (32 - __builtin_clz(count-1));
|
||||
}
|
||||
|
||||
/* Comparison function for qsort to sort slot indices */
|
||||
static inline int slotCompare(const void *a, const void *b) {
|
||||
return (*(const int *)a) - (*(const int *)b);
|
||||
|
|
|
|||
|
|
@ -207,6 +207,9 @@ size_t lazyfreeGetFreeEffort(robj *key, robj *obj, int dbid) {
|
|||
/* If the module's free_effort returns 0, we will use asynchronous free
|
||||
* memory by default. */
|
||||
return effort == 0 ? ULONG_MAX : effort;
|
||||
} else if (obj->type == OBJ_ARRAY) {
|
||||
redisArray *ar = obj->ptr;
|
||||
return arCount(ar);
|
||||
} else {
|
||||
return 1; /* Everything else is a single allocation. */
|
||||
}
|
||||
|
|
|
|||
|
|
@ -4255,6 +4255,7 @@ int RM_KeyType(RedisModuleKey *key) {
|
|||
case OBJ_MODULE: return REDISMODULE_KEYTYPE_MODULE;
|
||||
case OBJ_STREAM: return REDISMODULE_KEYTYPE_STREAM;
|
||||
case OBJ_GCRA: return REDISMODULE_KEYTYPE_GCRA;
|
||||
case OBJ_ARRAY: return REDISMODULE_KEYTYPE_ARRAY;
|
||||
default: return REDISMODULE_KEYTYPE_EMPTY;
|
||||
}
|
||||
}
|
||||
|
|
|
|||
|
|
@ -1181,6 +1181,18 @@ void addReplyLongLongFromStr(client *c, robj *str) {
    addReplyProto(c,"\r\n",2);
}

/* Reply with unsigned 64-bit value. Uses integer reply when value fits in
 * signed long long, otherwise big number (RESP3) or bulk string (RESP2). */
void addReplyUnsignedLongLong(client *c, uint64_t v) {
    if (v <= (uint64_t)LLONG_MAX) {
        addReplyLongLong(c, (long long)v);
    } else {
        char buf[LONG_STR_SIZE];
        int len = ull2string(buf, sizeof(buf), v);
        addReplyBigNum(c, buf, len);
    }
}

void addReplyAggregateLen(client *c, long length, int prefix) {
    serverAssert(length >= 0);
    if (_prepareClientToWrite(c) != C_OK) return;
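The decision inside `addReplyUnsignedLongLong` is a range check: values up to `LLONG_MAX` can travel as a normal integer reply, anything larger has to be sent as a decimal string. A self-contained C sketch of that split is shown below. The `reply_kind` enum and `format_u64_reply` are hypothetical names used for illustration, and `snprintf` stands in for the `ull2string` helper used by the patch.

```c
#include <inttypes.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

typedef enum { REPLY_INTEGER, REPLY_BIGNUM } reply_kind;

/* Write the decimal form of v into buf and report which reply type a
 * caller would need. A 21-byte buffer fits any 64-bit unsigned value. */
static reply_kind format_u64_reply(uint64_t v, char *buf, size_t buflen) {
    snprintf(buf, buflen, "%" PRIu64, v);
    return v <= (uint64_t)LLONG_MAX ? REPLY_INTEGER : REPLY_BIGNUM;
}
```

Comparing against `(uint64_t)LLONG_MAX` before casting avoids converting an out-of-range value to a signed type.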
|
|
|
|||
|
|
@ -37,6 +37,7 @@ int keyspaceEventsStringToFlags(char *classes) {
|
|||
case 't': flags |= NOTIFY_STREAM; break;
|
||||
case 'm': flags |= NOTIFY_KEY_MISS; break;
|
||||
case 'd': flags |= NOTIFY_MODULE; break;
|
||||
case 'a': flags |= NOTIFY_ARRAY; break;
|
||||
case 'n': flags |= NOTIFY_NEW; break;
|
||||
case 'o': flags |= NOTIFY_OVERWRITTEN; break;
|
||||
case 'c': flags |= NOTIFY_TYPE_CHANGED; break;
|
||||
|
|
@ -72,6 +73,7 @@ sds keyspaceEventsFlagsToString(int flags) {
|
|||
if (flags & NOTIFY_EVICTED) res = sdscatlen(res,"e",1);
|
||||
if (flags & NOTIFY_STREAM) res = sdscatlen(res,"t",1);
|
||||
if (flags & NOTIFY_MODULE) res = sdscatlen(res,"d",1);
|
||||
if (flags & NOTIFY_ARRAY) res = sdscatlen(res,"a",1);
|
||||
if (flags & NOTIFY_NEW) res = sdscatlen(res,"n",1);
|
||||
if (flags & NOTIFY_OVERWRITTEN) res = sdscatlen(res,"o",1);
|
||||
if (flags & NOTIFY_TYPE_CHANGED) res = sdscatlen(res,"c",1);
|
||||
|
|
|
|||
26
src/object.c
|
|
@ -531,6 +531,13 @@ robj *createGCRAObject(long long value) {
|
|||
return o;
|
||||
}
|
||||
|
||||
robj *createArrayObject(void) {
|
||||
redisArray *ar = arNew();
|
||||
robj *o = createObject(OBJ_ARRAY, ar);
|
||||
o->encoding = OBJ_ENCODING_SLICED_ARRAY;
|
||||
return o;
|
||||
}
|
||||
|
||||
robj *createModuleObject(moduleType *mt, void *value) {
|
||||
moduleValue *mv = zmalloc(sizeof(*mv));
|
||||
mv->type = mt;
|
||||
|
|
@ -611,6 +618,10 @@ void freeGCRAObject(robj *o) {
|
|||
#endif
|
||||
}
|
||||
|
||||
void freeArrayObject(robj *o) {
|
||||
arFree(o->ptr);
|
||||
}
|
||||
|
||||
void incrRefCount(robj *o) {
|
||||
if (o->refcount < OBJ_FIRST_SPECIAL_REFCOUNT - 1) {
|
||||
o->refcount++;
|
||||
|
|
@ -663,6 +674,7 @@ void decrRefCount(robj *o) {
|
|||
case OBJ_MODULE: freeModuleObject(o); break;
|
||||
case OBJ_STREAM: freeStreamObject(o); break;
|
||||
case OBJ_GCRA: freeGCRAObject(o); break;
|
||||
case OBJ_ARRAY: freeArrayObject(o); break;
|
||||
default: serverPanic("Unknown object type"); break;
|
||||
}
|
||||
}
|
||||
|
|
@ -810,6 +822,11 @@ void dismissStreamObject(robj *o, size_t size_hint) {
|
|||
}
|
||||
}
|
||||
|
||||
/* See dismissObject() */
|
||||
void dismissArrayObject(robj *o, size_t size_hint) {
|
||||
arDismiss(o->ptr, size_hint);
|
||||
}
|
||||
|
||||
void dismissGCRAObject(robj *o, size_t size_hint) {
|
||||
/* GCRA is a single allocation of a long long thus way smaller than a
|
||||
* page-size. The dismiss mechanism is not needed for it - hence NOOP.*/
|
||||
|
|
@ -846,6 +863,7 @@ void dismissObject(robj *o, size_t size_hint) {
|
|||
case OBJ_HASH: dismissHashObject(o, size_hint); break;
|
||||
case OBJ_STREAM: dismissStreamObject(o, size_hint); break;
|
||||
case OBJ_GCRA: dismissGCRAObject(o, size_hint); break;
|
||||
case OBJ_ARRAY: dismissArrayObject(o, size_hint); break;
|
||||
default: break;
|
||||
}
|
||||
#else
|
||||
|
|
@ -968,6 +986,7 @@ size_t getObjectLength(robj *o) {
|
|||
case OBJ_HASH: return hashTypeLength(o, 0);
|
||||
case OBJ_STREAM: return streamLength(o);
|
||||
case OBJ_GCRA: return gcraObjectLength(o);
|
||||
case OBJ_ARRAY: return arCount(o->ptr);
|
||||
default: return 0;
|
||||
}
|
||||
}
|
||||
|
|
@ -1265,6 +1284,7 @@ char *strEncoding(int encoding) {
|
|||
case OBJ_ENCODING_SKIPLIST: return "skiplist";
|
||||
case OBJ_ENCODING_EMBSTR: return "embstr";
|
||||
case OBJ_ENCODING_STREAM: return "stream";
|
||||
case OBJ_ENCODING_SLICED_ARRAY: return "sliced-array";
|
||||
default: return "unknown";
|
||||
}
|
||||
}
|
||||
|
|
@ -1283,7 +1303,8 @@ size_t kvobjComputeSize(robj *key, kvobj *o, size_t sample_size, int dbid) {
|
|||
o->type == OBJ_ZSET ||
|
||||
o->type == OBJ_HASH ||
|
||||
o->type == OBJ_STREAM ||
|
||||
o->type == OBJ_GCRA)
|
||||
o->type == OBJ_GCRA ||
|
||||
o->type == OBJ_ARRAY)
|
||||
{
|
||||
return kvobjAllocSize(o);
|
||||
} else if (o->type == OBJ_MODULE) {
|
||||
|
|
@ -1311,6 +1332,9 @@ size_t kvobjAllocSize(kvobj *o) {
|
|||
asize += s->alloc_size;
|
||||
} else if (o->type == OBJ_GCRA) {
|
||||
asize += gcraTypeAllocSize(o);
|
||||
} else if (o->type == OBJ_ARRAY) {
|
||||
redisArray *ar = o->ptr;
|
||||
asize += ar->alloc_size;
|
||||
} else if (o->type == OBJ_MODULE) {
|
||||
/* TODO: Provide moduleGetAllocSize() module API for O(1) allocation size retrieval */
|
||||
}
|
||||
|
|
|
|||
|
|
@ -85,6 +85,7 @@ struct RedisModuleType;
|
|||
#define OBJ_ENCODING_STREAM 10 /* Encoded as a radix tree of listpacks */
|
||||
#define OBJ_ENCODING_LISTPACK 11 /* Encoded as a listpack */
|
||||
#define OBJ_ENCODING_LISTPACK_EX 12 /* Encoded as listpack, extended with metadata */
|
||||
#define OBJ_ENCODING_SLICED_ARRAY 13 /* Encoded as sliced array */
|
||||
|
||||
#define LRU_BITS 24
|
||||
#define LRU_CLOCK_MAX ((1<<LRU_BITS)-1) /* Max value of obj->lru */
|
||||
|
|
@ -163,6 +164,7 @@ robj *createZsetListpackObject(void);
|
|||
robj *createStreamObject(void);
|
||||
robj *createGCRAObject(long long value);
|
||||
robj *createModuleObject(struct RedisModuleType *mt, void *value);
|
||||
robj *createArrayObject(void);
|
||||
int getLongFromObjectOrReply(struct client *c, robj *o, long *target, const char *msg);
|
||||
int getPositiveLongFromObjectOrReply(struct client *c, robj *o, long *target, const char *msg);
|
||||
int getRangeLongFromObjectOrReply(struct client *c, robj *o, long min, long max, long *target, const char *msg);
|
||||
|
|
|
|||
254
src/rdb.c
|
|
@ -124,33 +124,42 @@ time_t rdbLoadTime(rio *rdb) {
|
|||
return (time_t)t32;
|
||||
}
|
||||
|
||||
ssize_t rdbSaveMillisecondTime(rio *rdb, long long t) {
|
||||
int64_t t64 = (int64_t) t;
|
||||
memrev64ifbe(&t64); /* Store in little endian. */
|
||||
return rdbWriteRaw(rdb,&t64,8);
|
||||
/* Save a signed 64-bit integer in little-endian format. */
|
||||
ssize_t rdbSaveSignedInteger(rio *rdb, int64_t val) {
|
||||
memrev64ifbe(&val); /* Store in little endian. */
|
||||
return rdbWriteRaw(rdb, &val, 8);
|
||||
}
|
||||
|
||||
/* This function loads a time from the RDB file. It gets the version of the
|
||||
* RDB because, unfortunately, before Redis 5 (RDB version 9), the function
|
||||
* failed to convert data to/from little endian, so RDB files with keys having
|
||||
* expires could not be shared between big endian and little endian systems
|
||||
* (because the expire time will be totally wrong). The fix for this is just
|
||||
* to call memrev64ifbe(), however if we fix this for all the RDB versions,
|
||||
/* This function loads a signed 64-bit integer from the RDB file. It gets the
|
||||
* version of the RDB because, unfortunately, before Redis 5 (RDB version 9),
|
||||
* the function failed to convert data to/from little endian, so RDB files with
|
||||
* keys having expires could not be shared between big endian and little endian
|
||||
* systems (because the expire time will be totally wrong). The fix for this is
|
||||
* just to call memrev64ifbe(), however if we fix this for all the RDB versions,
|
||||
* this call will introduce an incompatibility for big endian systems:
|
||||
* after upgrading to Redis version 5 they will no longer be able to load their
|
||||
* own old RDB files. Because of that, we instead fix the function only for new
|
||||
* RDB versions, and load older RDB versions as we used to do in the past,
|
||||
* allowing big endian systems to load their own old RDB files.
|
||||
*
|
||||
* On I/O error the function returns LLONG_MAX, however if this is also a
|
||||
* On I/O error the function returns INT64_MAX, however if this is also a
|
||||
* valid stored value, the caller should use rioGetReadError() to check for
|
||||
* errors after calling this function. */
|
||||
long long rdbLoadMillisecondTime(rio *rdb, int rdbver) {
|
||||
int64_t t64;
|
||||
if (rioRead(rdb,&t64,8) == 0) return LLONG_MAX;
|
||||
int64_t rdbLoadSignedInteger(rio *rdb, int rdbver) {
|
||||
int64_t val;
|
||||
if (rioRead(rdb, &val, 8) == 0) return INT64_MAX;
|
||||
if (rdbver >= 9) /* Check the top comment of this function. */
|
||||
memrev64ifbe(&t64); /* Convert in big endian if the system is BE. */
|
||||
return (long long)t64;
|
||||
memrev64ifbe(&val); /* Convert in big endian if the system is BE. */
|
||||
return val;
|
||||
}
|
||||
|
||||
/* Wrappers for millisecond time - these just call the signed integer functions */
|
||||
ssize_t rdbSaveMillisecondTime(rio *rdb, long long t) {
|
||||
return rdbSaveSignedInteger(rdb, (int64_t)t);
|
||||
}
|
||||
|
||||
long long rdbLoadMillisecondTime(rio *rdb, int rdbver) {
|
||||
return (long long)rdbLoadSignedInteger(rdb, rdbver);
|
||||
}
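The save/load pair above stores a signed 64-bit value as 8 little-endian bytes, using `memrev64ifbe()` to swap on big-endian hosts. As a minimal sketch of the same on-disk layout, the helpers below (hypothetical names, not part of the Redis source) build the bytes with shifts instead of a conditional swap, so the result is the same on any host. Note that the real loader returns `INT64_MAX` on I/O error, which is also a valid payload, so callers still need `rioGetReadError()` to tell the two apart.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: write 'val' as 8 little-endian bytes, independent of
 * the host byte order. */
void sketch_put_i64_le(unsigned char out[8], int64_t val) {
    uint64_t u = (uint64_t)val; /* Two's complement bit pattern. */
    for (size_t i = 0; i < 8; i++)
        out[i] = (unsigned char)((u >> (8 * i)) & 0xFF);
}

/* Hypothetical helper: read back 8 little-endian bytes as a signed value. */
int64_t sketch_get_i64_le(const unsigned char in[8]) {
    uint64_t u = 0;
    for (size_t i = 0; i < 8; i++)
        u |= (uint64_t)in[i] << (8 * i);
    /* Conversion of values above INT64_MAX is implementation-defined before
     * C23; mainstream compilers wrap as two's complement. */
    return (int64_t)u;
}
```

With this layout, `-2` is written as `FE FF FF FF FF FF FF FF`, least significant byte first.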
|
||||
|
||||
/* Saves an encoded length. The first two bits in the first byte are used to
|
||||
|
|
@ -717,6 +726,8 @@ int rdbSaveObjectType(rio *rdb, robj *o) {
|
|||
return rdbSaveType(rdb,RDB_TYPE_GCRA);
|
||||
case OBJ_MODULE:
|
||||
return rdbSaveType(rdb,RDB_TYPE_MODULE_2);
|
||||
case OBJ_ARRAY:
|
||||
return rdbSaveType(rdb,RDB_TYPE_ARRAY);
|
||||
default:
|
||||
serverPanic("Unknown object type");
|
||||
}
|
||||
|
|
@ -1039,6 +1050,68 @@ size_t rdbSaveStreamConsumers(rio *rdb, streamCG *cg) {
|
|||
|
||||
/* Save a Redis object.
|
||||
* Returns -1 on error, number of bytes written on success. */
|
||||
static ssize_t rdbSaveArrayElement(rio *rdb, uint64_t idx, void *v) {
|
||||
ssize_t n, nwritten = 0;
|
||||
|
||||
if ((n = rdbSaveLen(rdb, idx)) == -1) return -1;
|
||||
nwritten += n;
|
||||
|
||||
if (arIsInt(v)) {
|
||||
if ((n = rdbSaveLen(rdb, AR_RDB_TAG_INT)) == -1) return -1;
|
||||
nwritten += n;
|
||||
int64_t ival = arToInt(v);
|
||||
if ((n = rdbSaveSignedInteger(rdb, ival)) == -1) return -1;
|
||||
nwritten += n;
|
||||
} else if (arIsFloat(v)) {
|
||||
if ((n = rdbSaveLen(rdb, AR_RDB_TAG_FLOAT)) == -1) return -1;
|
||||
nwritten += n;
|
||||
double d = arToDouble(v);
|
||||
if (rdbSaveBinaryDoubleValue(rdb, d) == -1) return -1;
|
||||
nwritten += 8;
|
||||
} else if (arIsSmallStr(v)) {
|
||||
char buf[AR_SMALLSTR_MAXLEN + 1];
|
||||
int len = arToSmallStr(v, buf);
|
||||
if ((n = rdbSaveLen(rdb, AR_RDB_TAG_SMALLSTR)) == -1) return -1;
|
||||
nwritten += n;
|
||||
if ((n = rdbSaveRawString(rdb, (unsigned char *)buf, len)) == -1) return -1;
|
||||
nwritten += n;
|
||||
} else {
|
||||
if ((n = rdbSaveLen(rdb, AR_RDB_TAG_SDS)) == -1) return -1;
|
||||
nwritten += n;
|
||||
if ((n = rdbSaveRawString(rdb, (unsigned char *)arStringData(v), arStringLen(v))) == -1) return -1;
|
||||
nwritten += n;
|
||||
}
|
||||
|
||||
return nwritten;
|
||||
}
|
||||
|
||||
static ssize_t rdbSaveArraySlice(rio *rdb, arSlice *s, uint64_t slice_id,
|
||||
uint32_t slice_size) {
|
||||
ssize_t n, nwritten = 0;
|
||||
|
||||
if (s->encoding == AR_SLICE_DENSE) {
|
||||
for (uint32_t i = 0; i < s->layout.dense.winsize; i++) {
|
||||
void *v = s->layout.dense.items[i];
|
||||
if (arIsEmpty(v)) continue;
|
||||
|
||||
uint64_t idx = arMakeIdx(slice_id, s->layout.dense.offset + i, slice_size);
|
||||
if ((n = rdbSaveArrayElement(rdb, idx, v)) == -1) return -1;
|
||||
nwritten += n;
|
||||
}
|
||||
} else {
|
||||
uint16_t *offsets = s->layout.sparse.offsets;
|
||||
void **values = s->layout.sparse.values;
|
||||
|
||||
for (uint32_t i = 0; i < s->count; i++) {
|
||||
uint64_t idx = arMakeIdx(slice_id, offsets[i], slice_size);
|
||||
if ((n = rdbSaveArrayElement(rdb, idx, values[i])) == -1) return -1;
|
||||
nwritten += n;
|
||||
}
|
||||
}
|
||||
|
||||
return nwritten;
|
||||
}
|
||||
|
||||
ssize_t rdbSaveObject(rio *rdb, robj *o, robj *key, int dbid) {
|
||||
ssize_t n = 0, nwritten = 0;
|
||||
|
||||
|
|
@ -1432,6 +1505,57 @@ ssize_t rdbSaveObject(rio *rdb, robj *o, robj *key, int dbid) {
|
|||
zfree(io.ctx);
|
||||
}
|
||||
return io.error ? -1 : (ssize_t)io.bytes;
|
||||
} else if (o->type == OBJ_ARRAY) {
|
||||
/* Save an array value. We persist only elements and insert_idx - no
|
||||
* implementation details like slice_size. Arrays are loaded using
|
||||
* the current ar_slice_size config. */
|
||||
redisArray *ar = o->ptr;
|
||||
|
||||
/* Save count */
|
||||
if ((n = rdbSaveLen(rdb, ar->count)) == -1) return -1;
|
||||
nwritten += n;
|
||||
|
||||
/* Save insert_idx: 0 = none, 1 = has value followed by actual value.
|
||||
* We can't save UINT64_MAX directly with rdbSaveLen/rdbLoadLen because
|
||||
* rdbLoadLen returns UINT64_MAX (RDB_LENERR) to signal an error, making
|
||||
* it impossible to distinguish a valid UINT64_MAX value from an error. */
|
||||
if (ar->insert_idx == AR_INSERT_IDX_NONE) {
|
||||
if ((n = rdbSaveLen(rdb, 0)) == -1) return -1;
|
||||
nwritten += n;
|
||||
} else {
|
||||
if ((n = rdbSaveLen(rdb, 1)) == -1) return -1;
|
||||
nwritten += n;
|
||||
if ((n = rdbSaveLen(rdb, ar->insert_idx)) == -1) return -1;
|
||||
nwritten += n;
|
||||
}
|
||||
|
||||
/* Save elements in index order.
|
||||
* We need to iterate through all slices, handling both flat directory
|
||||
* mode and superdir mode. In superdir mode, blocks are sorted by
|
||||
* block_id, so we iterate through blocks in order. */
|
||||
if (ar->superdir) {
|
||||
/* Superdir mode: iterate through blocks */
|
||||
for (uint32_t bi = 0; bi < ar->sdir_len; bi++) {
|
||||
arSDirEntry *e = ar->superdir + bi;
|
||||
uint64_t block_base = e->block_id * AR_SUPER_BLOCK_SLOTS;
|
||||
|
||||
for (uint32_t si = 0; si < AR_SUPER_BLOCK_SLOTS; si++) {
|
||||
arSlice *s = e->slots[si];
|
||||
if (!s) continue;
|
||||
uint64_t slice_id = block_base + si;
|
||||
if ((n = rdbSaveArraySlice(rdb, s, slice_id, ar->slice_size)) == -1) return -1;
|
||||
nwritten += n;
|
||||
}
|
||||
}
|
||||
} else {
|
||||
/* Flat directory mode */
|
||||
for (uint64_t slice_id = 0; slice_id <= ar->dir_highest_used && slice_id < ar->dir_alloc; slice_id++) {
|
||||
arSlice *s = ar->dir[slice_id];
|
||||
if (!s) continue;
|
||||
if ((n = rdbSaveArraySlice(rdb, s, slice_id, ar->slice_size)) == -1) return -1;
|
||||
nwritten += n;
|
||||
}
|
||||
}
|
||||
} else {
|
||||
serverPanic("Unknown object type");
|
||||
}
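The `insert_idx` handling in the hunk above avoids writing the `UINT64_MAX` sentinel through `rdbSaveLen`, because the reader uses the same value (`RDB_LENERR`) to report errors. A minimal sketch of that flag-then-value pattern follows, using a plain `uint64_t` array as a stand-in for the stream of `rdbSaveLen` fields. All `sketch_*` and `SKETCH_*` names are hypothetical and exist only for illustration.

```c
#include <stdint.h>
#include <stddef.h>

#define SKETCH_IDX_NONE UINT64_MAX /* Stand-in for AR_INSERT_IDX_NONE. */
#define SKETCH_LENERR   UINT64_MAX /* Stand-in for RDB_LENERR. */

/* Write the optional index as a presence flag, followed by the value only
 * when one exists. Returns the number of fields written (1 or 2). */
size_t sketch_save_insert_idx(uint64_t *out, uint64_t insert_idx) {
    if (insert_idx == SKETCH_IDX_NONE) {
        out[0] = 0; /* Flag 0: no value follows. */
        return 1;
    }
    out[0] = 1;     /* Flag 1: the value follows. */
    out[1] = insert_idx;
    return 2;
}

/* Read the optional index back. Returns the number of fields consumed, or 0
 * when the input is malformed (unknown flag, missing or sentinel value). */
size_t sketch_load_insert_idx(const uint64_t *in, size_t avail,
                              uint64_t *insert_idx) {
    if (avail < 1 || in[0] > 1) return 0;
    if (in[0] == 0) {
        *insert_idx = SKETCH_IDX_NONE;
        return 1;
    }
    if (avail < 2 || in[1] == SKETCH_LENERR) return 0;
    *insert_idx = in[1];
    return 2;
}
```

The cost is one extra length field per array, in exchange for an unambiguous reader: a stored value can never be confused with the error return.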
|
||||
|
|
@ -3653,6 +3777,104 @@ robj *rdbLoadObject(int rdbtype, rio *rdb, sds key, int dbid, int *error)
|
|||
return NULL;
|
||||
}
|
||||
o = createGCRAObject((long long)time);
|
||||
} else if (rdbtype == RDB_TYPE_ARRAY) {
|
||||
/* Load array value. We only persist elements and insert_idx - no
|
||||
* implementation details. Arrays use current ar_slice_size config. */
|
||||
uint64_t count;
|
||||
if ((count = rdbLoadLen(rdb, NULL)) == RDB_LENERR) return NULL;
|
||||
if (count == 0) {
|
||||
rdbReportCorruptRDB("Empty array (count == 0) is invalid");
|
||||
return NULL;
|
||||
}
|
||||
|
||||
/* Load insert_idx: 0 = none, 1 = has value followed by actual value */
|
||||
uint64_t insert_idx_flag;
|
||||
if ((insert_idx_flag = rdbLoadLen(rdb, NULL)) == RDB_LENERR) return NULL;
|
||||
if (insert_idx_flag > 1) {
|
||||
rdbReportCorruptRDB("Invalid array insert_idx_flag %llu",
|
||||
(unsigned long long)insert_idx_flag);
|
||||
return NULL;
|
||||
}
|
||||
uint64_t insert_idx;
|
||||
if (insert_idx_flag == 0) {
|
||||
insert_idx = AR_INSERT_IDX_NONE;
|
||||
} else {
|
||||
if ((insert_idx = rdbLoadLen(rdb, NULL)) == RDB_LENERR) return NULL;
|
||||
}
|
||||
|
||||
o = createArrayObject();
|
||||
redisArray *ar = o->ptr;
|
||||
ar->insert_idx = insert_idx;
|
||||
|
||||
/* Load elements */
|
||||
for (uint64_t i = 0; i < count; i++) {
|
||||
uint64_t idx;
|
||||
int idx_isencoded;
|
||||
if (rdbLoadLenByRef(rdb, &idx_isencoded, &idx) == -1) {
|
||||
decrRefCount(o);
|
||||
return NULL;
|
||||
}
|
||||
if (idx_isencoded || idx == UINT64_MAX) {
|
||||
decrRefCount(o);
|
||||
rdbReportCorruptRDB("Invalid array index %llu",
|
||||
(unsigned long long)idx);
|
||||
return NULL;
|
||||
}
|
||||
|
||||
uint64_t type_tag;
|
||||
if ((type_tag = rdbLoadLen(rdb, NULL)) == RDB_LENERR) {
|
||||
decrRefCount(o);
|
||||
return NULL;
|
||||
}
|
||||
|
||||
void *v;
|
||||
if (type_tag == AR_RDB_TAG_INT) {
|
||||
int64_t ival = rdbLoadSignedInteger(rdb, RDB_VERSION);
|
||||
if (ival == INT64_MAX && rioGetReadError(rdb)) {
|
||||
decrRefCount(o);
|
||||
return NULL;
|
||||
}
|
||||
v = arValueFromRdbInt(ival);
|
||||
} else if (type_tag == AR_RDB_TAG_FLOAT) {
|
||||
double d;
|
||||
if (rdbLoadBinaryDoubleValue(rdb, &d) == -1) {
|
||||
decrRefCount(o);
|
||||
return NULL;
|
||||
}
|
||||
v = arValueFromRdbFloat(d);
|
||||
} else if (type_tag == AR_RDB_TAG_SMALLSTR) {
|
||||
sds str;
|
||||
if ((str = rdbGenericLoadStringObject(rdb, RDB_LOAD_SDS, NULL)) == NULL) {
|
||||
decrRefCount(o);
|
||||
return NULL;
|
||||
}
|
||||
size_t len = sdslen(str);
|
||||
if (len > AR_SMALLSTR_MAXLEN) {
|
||||
sdsfree(str);
|
||||
decrRefCount(o);
|
||||
rdbReportCorruptRDB("Invalid small string length %zu in array", len);
|
||||
return NULL;
|
||||
}
|
||||
v = arValueFromRdbSmallStr(str, sdslen(str));
|
||||
sdsfree(str);
|
||||
} else if (type_tag == AR_RDB_TAG_SDS) {
|
||||
/* arString */
|
||||
sds str;
|
||||
if ((str = rdbGenericLoadStringObject(rdb, RDB_LOAD_SDS, NULL)) == NULL) {
|
||||
decrRefCount(o);
|
||||
return NULL;
|
||||
}
|
||||
v = arEncode(str, sdslen(str));
|
||||
sdsfree(str);
|
||||
} else {
|
||||
decrRefCount(o);
|
||||
rdbReportCorruptRDB("Unknown array element type_tag %llu",
|
||||
(unsigned long long)type_tag);
|
||||
return NULL;
|
||||
}
|
||||
|
||||
arSet(ar, idx, v);
|
||||
}
|
||||
} else {
|
||||
rdbReportReadError("Unknown RDB encoding type %d",rdbtype);
|
||||
return NULL;
|
||||
|
|
|
|||
|
|
@ -81,10 +81,11 @@
|
|||
#define RDB_TYPE_STREAM_LISTPACKS_4 26 /* Stream with IDMP support */
|
||||
#define RDB_TYPE_STREAM_LISTPACKS_5 27 /* Stream with XNACK support (NACKed entries) */
|
||||
#define RDB_TYPE_GCRA 28 /* GCRA object */
|
||||
#define RDB_TYPE_ARRAY 29 /* Array data type */
|
||||
/* NOTE: WHEN ADDING NEW RDB TYPE, UPDATE rdbIsObjectType(), and rdb_type_string[] */
|
||||
|
||||
/* Test if a type is an object type. */
|
||||
#define rdbIsObjectType(t) (((t) >= 0 && (t) <= 7) || ((t) >= 9 && (t) <= 28))
|
||||
#define rdbIsObjectType(t) (((t) >= 0 && (t) <= 7) || ((t) >= 9 && (t) <= 29))
|
||||
|
||||
/* Special RDB opcodes (saved/loaded with rdbSaveType/rdbLoadType). */
|
||||
#define RDB_OPCODE_KEY_META 243 /* Key metadata (module metadata classes). */
|
||||
|
|
@ -133,6 +134,8 @@ int rdbSaveType(rio *rdb, unsigned char type);
|
|||
int rdbLoadType(rio *rdb);
|
||||
time_t rdbLoadTime(rio *rdb);
|
||||
int rdbSaveLen(rio *rdb, uint64_t len);
|
||||
ssize_t rdbSaveSignedInteger(rio *rdb, int64_t val);
|
||||
int64_t rdbLoadSignedInteger(rio *rdb, int rdbver);
|
||||
ssize_t rdbSaveMillisecondTime(rio *rdb, long long t);
|
||||
long long rdbLoadMillisecondTime(rio *rdb, int rdbver);
|
||||
uint64_t rdbLoadLen(rio *rdb, int *isencoded);
|
||||
|
|
|
|||
|
|
@ -89,6 +89,7 @@ char *rdb_type_string[] = {
|
|||
"stream-v4",
|
||||
"stream-v5",
|
||||
"gcra",
|
||||
"array",
|
||||
};
|
||||
|
||||
/* Show a few stats collected into 'rdbstate' */
|
||||
|
|
|
|||
|
|
@ -90,6 +90,7 @@ typedef long long ustime_t;
|
|||
#define REDISMODULE_KEYTYPE_MODULE 6
|
||||
#define REDISMODULE_KEYTYPE_STREAM 7
|
||||
#define REDISMODULE_KEYTYPE_GCRA 8
|
||||
#define REDISMODULE_KEYTYPE_ARRAY 9
|
||||
|
||||
/* Reply types. */
|
||||
#define REDISMODULE_REPLY_UNKNOWN -1
|
||||
|
|
@ -254,18 +255,19 @@ This flag should not be used directly by the module.
|
|||
#define REDISMODULE_NOTIFY_SUBKEYEVENT (1<<20) /* T */
|
||||
#define REDISMODULE_NOTIFY_SUBKEYSPACEITEM (1<<21) /* I */
|
||||
#define REDISMODULE_NOTIFY_SUBKEYSPACEEVENT (1<<22) /* V */
|
||||
#define REDISMODULE_NOTIFY_ARRAY (1<<23) /* a, array key space notification */
|
||||
|
||||
/* Next notification flag, must be updated when adding new flags above!
|
||||
This flag should not be used directly by the module.
|
||||
* Use RedisModule_GetKeyspaceNotificationFlagsAll instead. */
|
||||
#define _REDISMODULE_NOTIFY_NEXT (1<<23)
|
||||
#define _REDISMODULE_NOTIFY_NEXT (1<<24)
|
||||
|
||||
/* Delivery flags for RM_SubscribeToKeyspaceEventsWithSubkeys.
|
||||
* These are passed in the 'flags' parameter, not in 'types'. */
|
||||
#define REDISMODULE_NOTIFY_FLAG_NONE 0 /* Invoke callback for all matching events */
|
||||
#define REDISMODULE_NOTIFY_FLAG_SUBKEYS_REQUIRED (1<<0) /* Only invoke callback when subkeys are present */
|
||||
|
||||
#define REDISMODULE_NOTIFY_ALL (REDISMODULE_NOTIFY_GENERIC | REDISMODULE_NOTIFY_STRING | REDISMODULE_NOTIFY_LIST | REDISMODULE_NOTIFY_SET | REDISMODULE_NOTIFY_HASH | REDISMODULE_NOTIFY_ZSET | REDISMODULE_NOTIFY_EXPIRED | REDISMODULE_NOTIFY_EVICTED | REDISMODULE_NOTIFY_STREAM | REDISMODULE_NOTIFY_MODULE) /* A */
|
||||
#define REDISMODULE_NOTIFY_ALL (REDISMODULE_NOTIFY_GENERIC | REDISMODULE_NOTIFY_STRING | REDISMODULE_NOTIFY_LIST | REDISMODULE_NOTIFY_SET | REDISMODULE_NOTIFY_HASH | REDISMODULE_NOTIFY_ZSET | REDISMODULE_NOTIFY_EXPIRED | REDISMODULE_NOTIFY_EVICTED | REDISMODULE_NOTIFY_STREAM | REDISMODULE_NOTIFY_MODULE | REDISMODULE_NOTIFY_ARRAY) /* A */
|
||||
|
||||
/* A special pointer that we can use between the core and the module to signal
|
||||
* field deletion, and that is impossible to be a valid pointer. */
|
||||
|
|
|
|||
37
src/server.h
|
|
@ -22,6 +22,7 @@
|
|||
#include "atomicvar.h"
|
||||
#include "commands.h"
|
||||
#include "object.h"
|
||||
#include "sparsearray.h"
|
||||
|
||||
#include <stdio.h>
|
||||
#include <stdlib.h>
|
||||
|
|
@ -288,6 +289,7 @@ extern int configOOMScoreAdjValuesDefaults[CONFIG_OOM_COUNT];
|
|||
#define ACL_CATEGORY_TRANSACTION (1ULL<<19)
|
||||
#define ACL_CATEGORY_SCRIPTING (1ULL<<20)
|
||||
#define ACL_CATEGORY_RATE_LIMIT (1ULL<<21)
|
||||
#define ACL_CATEGORY_ARRAY (1ULL<<22)
|
||||
|
||||
/* Key-spec flags *
|
||||
* -------------- */
|
||||
|
|
@ -801,7 +803,8 @@ typedef enum {
|
|||
#define NOTIFY_SUBKEYEVENT (1<<20) /* T, subkey-level keyevent notification */
|
||||
#define NOTIFY_SUBKEYSPACEITEM (1<<21) /* I, subkey-level notification per item: channel=key\nsubkey */
|
||||
#define NOTIFY_SUBKEYSPACEEVENT (1<<22) /* V, subkey-level notification: channel=event|key */
|
||||
#define NOTIFY_ALL (NOTIFY_GENERIC | NOTIFY_STRING | NOTIFY_LIST | NOTIFY_SET | NOTIFY_HASH | NOTIFY_ZSET | NOTIFY_EXPIRED | NOTIFY_EVICTED | NOTIFY_STREAM | NOTIFY_MODULE) /* A flag */
|
||||
#define NOTIFY_ARRAY (1<<23) /* a, array notification */
|
||||
#define NOTIFY_ALL (NOTIFY_GENERIC | NOTIFY_STRING | NOTIFY_LIST | NOTIFY_SET | NOTIFY_HASH | NOTIFY_ZSET | NOTIFY_EXPIRED | NOTIFY_EVICTED | NOTIFY_STREAM | NOTIFY_MODULE | NOTIFY_ARRAY) /* A flag */
|
||||
|
||||
/* Using the following macro you can run code inside serverCron() with the
|
||||
* specified period, specified in milliseconds.
|
||||
|
|
@ -866,7 +869,8 @@ typedef enum {
|
|||
#define OBJ_MODULE 5 /* Module object. */
|
||||
#define OBJ_STREAM 6 /* Stream object. */
|
||||
#define OBJ_GCRA 7 /* GCRA object. */
|
||||
#define OBJ_TYPE_MAX 8 /* Maximum number of object types */
|
||||
#define OBJ_ARRAY 8 /* Array object. */
|
||||
#define OBJ_TYPE_MAX 9 /* Maximum number of object types */
|
||||
|
||||
/* NOTE: adding a new object requires changes in the following places:
|
||||
* - rdb.c - save/load (also bump RDB_VERSION if needed)
|
||||
|
|
@ -2442,6 +2446,10 @@ struct redisServer {
|
|||
/* Stream IDMP parameters */
|
||||
long long stream_idmp_duration; /* Default IDMP duration in seconds. */
|
||||
long long stream_idmp_maxsize; /* Default IDMP max entries. */
|
||||
/* Array parameters */
|
||||
uint32_t array_slice_size; /* Slice size for new arrays */
|
||||
uint32_t array_sparse_kmax; /* Max elements before sparse->dense */
|
||||
uint32_t array_sparse_kmin; /* Min elements before dense->sparse */
|
||||
/* List parameters */
|
||||
int list_max_listpack_size;
|
||||
int list_compress_depth;
|
||||
|
|
@ -2801,6 +2809,7 @@ typedef enum {
|
|||
COMMAND_GROUP_GEO,
|
||||
COMMAND_GROUP_STREAM,
|
||||
COMMAND_GROUP_BITMAP,
|
||||
COMMAND_GROUP_ARRAY,
|
||||
COMMAND_GROUP_MODULE,
|
||||
COMMAND_GROUP_RATE_LIMIT,
|
||||
} redisCommandGroup;
|
||||
|
|
@ -3213,6 +3222,7 @@ void addReplyBigNum(client *c, const char *num, size_t len);
|
|||
void addReplyHumanLongDouble(client *c, long double d);
|
||||
void addReplyLongLong(client *c, long long ll);
|
||||
void addReplyLongLongFromStr(client *c, robj* str);
|
||||
void addReplyUnsignedLongLong(client *c, uint64_t v);
|
||||
void addReplyArrayLen(client *c, long length);
|
||||
void addReplyMapLen(client *c, long length);
|
||||
void addReplySetLen(client *c, long length);
|
||||
|
|
@ -3844,6 +3854,9 @@ struct listpackEx *listpackExCreate(void);
|
|||
void listpackExAddNew(robj *o, char *field, size_t flen,
|
||||
char *value, size_t vlen, uint64_t expireAt);
|
||||
|
||||
/* Array data type. */
|
||||
robj *arrayTypeDup(robj *o);
|
||||
|
||||
/* Pub / Sub */
|
||||
int pubsubUnsubscribeAllChannels(client *c, int notify);
|
||||
int pubsubUnsubscribeShardAllChannels(client *c, int notify);
|
||||
|
|
@ -4511,6 +4524,26 @@ void digestCommand(client *c);
|
|||
void gcraCommand(client *c);
|
||||
void gcraSetValueCommand(client *c);
|
||||
|
||||
/* Array commands (t_array.c) */
|
||||
void arsetCommand(client *c);
|
||||
void argetCommand(client *c);
|
||||
void ardelCommand(client *c);
|
||||
void ardelrangeCommand(client *c);
|
||||
void arlenCommand(client *c);
|
||||
void arcountCommand(client *c);
|
||||
void argetrangeCommand(client *c);
|
||||
void arscanCommand(client *c);
|
||||
void argrepCommand(client *c);
|
||||
void aropCommand(client *c);
|
||||
void arinsertCommand(client *c);
|
||||
void arringCommand(client *c);
|
||||
void arnextCommand(client *c);
|
||||
void arseekCommand(client *c);
|
||||
void arlastitemsCommand(client *c);
|
||||
void arinfoCommand(client *c);
|
||||
void armsetCommand(client *c);
|
||||
void armgetCommand(client *c);
|
||||
|
||||
#if defined(__GNUC__)
|
||||
void *calloc(size_t count, size_t size) __attribute__ ((deprecated));
|
||||
void free(void *ptr) __attribute__ ((deprecated));
|
||||
|
|
|
|||
2080
src/sparsearray.c
Normal file
File diff suppressed because it is too large
312
src/sparsearray.h
Normal file
|
|
@ -0,0 +1,312 @@
|
|||
/*
|
||||
* Copyright (c) 2026-Present, Redis Ltd.
|
||||
* All rights reserved.
|
||||
*
|
||||
* Licensed under your choice of (a) the Redis Source Available License 2.0
|
||||
* (RSALv2); or (b) the Server Side Public License v1 (SSPLv1); or (c) the
|
||||
* GNU Affero General Public License v3 (AGPLv3).
|
||||
*
|
||||
* Sparse Array - A memory-efficient sparse array with 64-bit index space.
|
||||
*
|
||||
* This data structure was designed and implemented by Salvatore Sanfilippo.
|
||||
*/
|
||||
|
||||
#ifndef __SPARSEARRAY_H
|
||||
#define __SPARSEARRAY_H
|
||||
|
||||
#include <stdint.h>
|
||||
#include <stddef.h>
|
||||
#include <string.h>
|
||||
|
||||
/* ============================================================================
 * SPARSE ARRAY OVERVIEW
 * ============================================================================
 *
 * Sparse arrays are random-access sequences indexed by non-negative 64-bit
 * integers. They support O(1) get/set operations and efficient iteration.
 *
 * MEMORY LAYOUT
 * -------------
 * The array uses a two-level structure: a directory pointing to "slices",
 * which contain just a range of elements. For very large/sparse arrays, a
 * three-level "superdir" structure is used.
 *
 * SLICE TYPES
 * -----------
 * Each slice holds up to slice_size elements and can be:
 *
 * - Sparse: Sorted array of (offset, value) pairs. Memory-efficient when
 *           elements are scattered within the slice.
 *
 * - Dense:  Contiguous array with a sliding window. Used when the slice
 *           has many elements.
 *
 * VALUE ENCODING (Tagged Pointers)
 * --------------------------------
 * Values are stored in tagged pointer-sized words, using the low 2 bits as a
 * tag. The exact immediate encoding depends on pointer width:
 *
 * 64-bit builds:
 *   Tag 00: arString pointer (heap-allocated, 8+ byte strings)
 *   Tag 01: Immediate signed integer in the 62-bit payload
 *   Tag 10: Immediate double (low 2 bits of the IEEE-754 payload cleared)
 *   Tag 11: Inline small string (0-7 bytes)
 *
 * 32-bit builds:
 *   Tag 00: arString pointer
 *   Tag 01: Immediate signed integer in the 30-bit payload
 *   Tag 10: Immediate float (low 2 bits of the IEEE-754 payload cleared)
 *   Tag 11: Inline small string (0-3 bytes)
 *
 * RDB persistence is architecture-neutral: values are saved as logical ints,
 * doubles and strings, never as raw tagged words.
 * ========================================================================== */
|
||||
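The overview above describes a sparse slice as a sorted array of (offset, value) pairs. The lookup code lives in `src/sparsearray.c`, whose diff is suppressed here, so the following is only a sketch of one plausible lookup over that layout: a binary search on the sorted offsets. The function name and signature are hypothetical. With the default `AR_SPARSE_KMAX_DEFAULT` of 10 entries, a linear scan would be just as reasonable.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical lookup in a sparse slice: 'offsets' is sorted ascending and
 * 'values[i]' belongs to 'offsets[i]'. Returns the stored value, or NULL
 * when nothing is stored at 'off'. */
void *sketch_sparse_get(const uint16_t *offsets, void *const *values,
                        uint32_t count, uint16_t off) {
    uint32_t lo = 0, hi = count; /* Search the half-open range [lo, hi). */
    while (lo < hi) {
        uint32_t mid = lo + (hi - lo) / 2;
        if (offsets[mid] == off) return values[mid];
        if (offsets[mid] < off)
            lo = mid + 1;
        else
            hi = mid;
    }
    return NULL;
}
```

Because a sparse slice is bounded by the sparse-to-dense threshold, the search cost stays small and constant-bounded, which is consistent with the O(1) get/set claim in the overview.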
|
||||
/* ----------------------------------------------------------------------------
|
||||
* Configuration defaults
|
||||
* -------------------------------------------------------------------------- */
|
||||
|
||||
#define AR_SLICE_SIZE_DEFAULT 4096
|
||||
#define AR_SLICE_SIZE_MIN 256
|
||||
#define AR_SLICE_SIZE_MAX 65536
|
||||
#define AR_SPARSE_KMAX_DEFAULT 10
|
||||
#define AR_SPARSE_KMIN_DEFAULT 5
|
||||
|
||||
/* Superdir: fixed-size blocks of slice pointers. Each block holds 2048
|
||||
* pointers to actual array slices, which uses about 8 KB on 32-bit builds
|
||||
* and 16 KB on 64-bit builds. This keeps very large indices from forcing
|
||||
* catastrophic flat-directory growth. */
|
||||
#define AR_SUPER_BLOCK_SLOTS 2048
|
||||
|
||||
/* Internal constants */
|
||||
#define AR_SLICE_MIN_ALLOC 8 /* Initial dense window allocation */
|
||||
#define AR_INSERT_IDX_NONE UINT64_MAX /* No insert performed yet */
|
||||
|
||||
/* Slice encoding types */
|
||||
#define AR_SLICE_DENSE 0
|
||||
#define AR_SLICE_SPARSE 1
|
||||
|
||||
/* Tagged value encoding (low 2 bits). NULL (0) means empty slot. */
|
||||
#define AR_TAG_PTR ((uintptr_t)0) /* arString pointer (low 2 bits = 00) */
|
||||
#define AR_TAG_INT ((uintptr_t)1) /* Immediate signed integer (01) */
|
||||
#define AR_TAG_FLOAT ((uintptr_t)2) /* Immediate float (10) */
|
||||
#define AR_TAG_STR ((uintptr_t)3) /* Inline small string (11) */
|
||||
#define AR_TAG_MASK ((uintptr_t)3)
|
||||
|
||||
#if UINTPTR_MAX == UINT64_MAX
|
||||
#define AR_SMALLSTR_MAXLEN 7
|
||||
#define AR_SMALLSTR_LEN_MASK 0x7u
|
||||
#elif UINTPTR_MAX == UINT32_MAX
|
||||
#define AR_SMALLSTR_MAXLEN 3
|
||||
#define AR_SMALLSTR_LEN_MASK 0x3u
|
||||
#else
|
||||
#error "Unsupported pointer size"
|
||||
#endif
|
||||
|
||||
/* RDB type tags for array elements */
|
||||
#define AR_RDB_TAG_SDS 0
|
||||
#define AR_RDB_TAG_INT 1
|
||||
#define AR_RDB_TAG_FLOAT 2
|
||||
#define AR_RDB_TAG_SMALLSTR 3
|
||||
|
||||
/* Buffer size for inline types (int/float/smallstr) */
|
||||
#define AR_INLINE_BUFSIZE 64
|
||||
|
||||
/* ----------------------------------------------------------------------------
|
||||
* Data structures
|
||||
* -------------------------------------------------------------------------- */
|
||||
|
||||
/* Array slice: holds a range of elements. Single allocation with payload. */
typedef struct arSlice {
    uint8_t encoding;           /* 0=dense, 1=sparse */
    uint8_t _pad1[3];
    uint32_t count;             /* Non-empty items in this slice */
    union {
        struct {
            uint32_t offset;    /* First logical offset in window */
            uint32_t winsize;   /* Window size (power of two) */
            uint32_t max_idx;   /* Highest offset with a value */
            void **items;       /* Points into payload */
        } dense;
        struct {
            uint32_t cap;       /* Capacity */
            uint16_t *offsets;  /* Points into payload */
            void **values;      /* Points into payload (aligned) */
        } sparse;
    } layout;
} arSlice;

/* Super-directory entry: groups slices into fixed-size pointer blocks. */
typedef struct arSDirEntry {
    uint64_t block_id;          /* slice_id / AR_SUPER_BLOCK_SLOTS */
    uint32_t count;             /* Non-NULL slots in this block */
    uint32_t _pad;
    arSlice **slots;            /* AR_SUPER_BLOCK_SLOTS pointers to slices */
} arSDirEntry;

/* Array header */
typedef struct redisArray {
    uint64_t count;             /* Total non-empty items */
    uint64_t insert_idx;        /* Last insert index, or UINT64_MAX if none */
    uint64_t dir_alloc;         /* Flat directory length (flat mode) */
    uint64_t dir_highest_used;  /* Highest non-NULL slice index */
    uint64_t num_slices;        /* Number of allocated slices */
    size_t alloc_size;          /* Tracked total allocation (for slot stats) */
    uint32_t slice_size;        /* Slice size (power of two) */
    uint32_t sdir_len;          /* Superdir entries count */
    uint32_t sdir_cap;          /* Superdir capacity */
    uint32_t _pad;
    arSlice **dir;              /* Flat directory or NULL */
    arSDirEntry *superdir;      /* Super-directory or NULL */
} redisArray;
|
||||
|
||||
/* ----------------------------------------------------------------------------
|
||||
* Inline helpers: index arithmetic
|
||||
* -------------------------------------------------------------------------- */
|
||||
|
||||
/* Compute bits needed to address elements within a slice. */
|
||||
static inline int arSliceBits(uint32_t slice_size) {
|
||||
if (slice_size == 4096) return 12; /* Fast path for default */
|
||||
int bits = 0;
|
||||
uint32_t x = slice_size;
|
||||
while (x > 1) { x >>= 1; bits++; }
|
||||
return bits;
|
||||
}
|
||||
|
||||
static inline uint64_t arSliceId(uint64_t idx, uint32_t slice_size) {
|
||||
return idx >> arSliceBits(slice_size);
|
||||
}
|
||||
|
||||
static inline uint32_t arSliceOff(uint64_t idx, uint32_t slice_size) {
|
||||
return (uint32_t)(idx & (slice_size - 1));
|
||||
}
|
||||
|
||||
static inline uint64_t arMakeIdx(uint64_t slice_id, uint32_t off, uint32_t slice_size) {
|
||||
return (slice_id << arSliceBits(slice_size)) | off;
|
||||
}
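The index helpers above split a 64-bit index into a slice id and an offset inside the slice, relying on `slice_size` being a power of two. The sketch below restates that arithmetic with hypothetical `sketch_*` names so it can be compiled on its own. It also adds the block/slot split that the `block_id` comment in `arSDirEntry` implies (`slice_id / AR_SUPER_BLOCK_SLOTS`); the slot formula is an assumption derived from that comment and from the `block_base + si` loop in the RDB save code.

```c
#include <stdint.h>

#define SKETCH_BLOCK_SLOTS 2048u /* Stand-in for AR_SUPER_BLOCK_SLOTS. */

/* log2 of a power-of-two slice size. */
int sketch_slice_bits(uint32_t slice_size) {
    int bits = 0;
    while (slice_size > 1) { slice_size >>= 1; bits++; }
    return bits;
}

/* Which slice holds 'idx'. */
uint64_t sketch_slice_id(uint64_t idx, uint32_t slice_size) {
    return idx >> sketch_slice_bits(slice_size);
}

/* Position of 'idx' inside its slice. */
uint32_t sketch_slice_off(uint64_t idx, uint32_t slice_size) {
    return (uint32_t)(idx & (slice_size - 1));
}

/* Inverse of the two functions above. */
uint64_t sketch_make_idx(uint64_t slice_id, uint32_t off, uint32_t slice_size) {
    return (slice_id << sketch_slice_bits(slice_size)) | off;
}

/* Superdir addressing (assumed): block that owns a slice, and its slot. */
uint64_t sketch_block_id(uint64_t slice_id) {
    return slice_id / SKETCH_BLOCK_SLOTS;
}

uint32_t sketch_block_slot(uint64_t slice_id) {
    return (uint32_t)(slice_id % SKETCH_BLOCK_SLOTS);
}
```

For example, with the default slice size of 4096, index 18552 falls in slice 4 at offset 2168, since 4 * 4096 = 16384 and 18552 - 16384 = 2168.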
|
||||
|
||||
/* ----------------------------------------------------------------------------
|
||||
* Inline helpers: tagged value encoding
|
||||
* -------------------------------------------------------------------------- */
|
||||
|
||||
static inline int arIsEmpty(void *v) { return v == NULL; }
|
||||
|
||||
static inline int arIsPtr(void *v) {
|
||||
return v != NULL && ((uintptr_t)v & AR_TAG_MASK) == AR_TAG_PTR;
|
||||
}
|
||||
|
||||
static inline int arIsInt(void *v) {
|
||||
return ((uintptr_t)v & AR_TAG_MASK) == AR_TAG_INT;
|
||||
}
|
||||
|
||||
static inline int64_t arToInt(void *v) {
|
||||
return (int64_t)(intptr_t)v >> 2; /* Arithmetic shift preserves sign */
|
||||
}
|
||||
|
||||
static inline void *arFromInt(int64_t ival) {
|
||||
return (void *)(((uintptr_t)ival << 2) | AR_TAG_INT);
|
||||
}
|
||||
|
||||
static inline int arIntFits(int64_t ival) {
|
||||
#if UINTPTR_MAX == UINT64_MAX
|
||||
return ival >= -(1LL << 61) && ival <= (1LL << 61) - 1;
|
||||
#else
|
||||
return ival >= -(1LL << 29) && ival <= (1LL << 29) - 1;
|
||||
#endif
|
||||
}
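The integer helpers above keep a signed value in the upper bits of a pointer-sized word and mark it with tag `01` in the low two bits. The following is a minimal sketch of the 64-bit case only, using `uint64_t` in place of `void *` so it does not depend on the host pointer width. The `sketch_*` names are hypothetical.

```c
#include <stdint.h>

#define SKETCH_TAG_MASK 3u
#define SKETCH_TAG_INT  1u

/* A value fits when it is representable in the 62-bit signed payload. */
int sketch_int_fits(int64_t v) {
    return v >= -(1LL << 61) && v <= (1LL << 61) - 1;
}

/* Shift the value up by two bits and set the integer tag. */
uint64_t sketch_from_int(int64_t v) {
    return ((uint64_t)v << 2) | SKETCH_TAG_INT;
}

int sketch_is_int(uint64_t w) {
    return (w & SKETCH_TAG_MASK) == SKETCH_TAG_INT;
}

/* Drop the tag with a signed right shift. Right-shifting a negative value is
 * implementation-defined in C; this relies on the arithmetic shift that
 * mainstream compilers provide, as the original helper does. */
int64_t sketch_to_int(uint64_t w) {
    return (int64_t)w >> 2;
}
```

Values outside the range accepted by `sketch_int_fits` would lose their top bits in the shift, which is why a range check has to come before encoding.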
|
||||
|
||||
static inline int arIsFloat(void *v) {
|
||||
return ((uintptr_t)v & AR_TAG_MASK) == AR_TAG_FLOAT;
|
||||
}
|
||||
|
||||
static inline double arToDouble(void *v) {
|
||||
#if UINTPTR_MAX == UINT64_MAX
|
||||
uint64_t bits = (uintptr_t)v & ~AR_TAG_MASK;
|
||||
double d;
|
||||
memcpy(&d, &bits, sizeof(d));
|
||||
return d;
|
||||
#else
|
||||
uint32_t bits = (uint32_t)((uintptr_t)v & ~(uintptr_t)AR_TAG_MASK);
|
||||
float f;
|
||||
memcpy(&f, &bits, sizeof(f));
|
||||
return (double)f;
|
||||
#endif
|
||||
}
|
||||
|
||||
static inline void *arFromFloatBits(uint64_t bits_trunc) {
|
||||
#if UINTPTR_MAX == UINT64_MAX
|
||||
return (void *)((bits_trunc & ~AR_TAG_MASK) | AR_TAG_FLOAT);
|
||||
#else
|
||||
uint32_t bits32 = (uint32_t)bits_trunc;
|
||||
return (void *)(uintptr_t)((bits32 & ~(uint32_t)AR_TAG_MASK) | AR_TAG_FLOAT);
|
||||
#endif
|
||||
}
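The float helpers above reuse the low two bits of the IEEE-754 bit pattern for the tag, so those two mantissa bits are not stored. The sketch below illustrates the 64-bit case with hypothetical `sketch_*` names and assumes `double` is IEEE-754 binary64. How the real implementation treats doubles whose low bits are non-zero is decided in `src/sparsearray.c`, which is not visible in this diff, so `sketch_double_is_exact` is only an illustration of when the round trip is lossless.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_TAG_MASK  ((uint64_t)3)
#define SKETCH_TAG_FLOAT ((uint64_t)2)

/* Clear the low two bits of the bit pattern and store the float tag there. */
uint64_t sketch_from_double(double d) {
    uint64_t bits;
    memcpy(&bits, &d, sizeof(bits)); /* Type-pun without aliasing issues. */
    return (bits & ~SKETCH_TAG_MASK) | SKETCH_TAG_FLOAT;
}

/* Strip the tag and reinterpret the remaining bits as a double. */
double sketch_to_double(uint64_t w) {
    uint64_t bits = w & ~SKETCH_TAG_MASK;
    double d;
    memcpy(&d, &bits, sizeof(d));
    return d;
}

/* 1 when the two lowest mantissa bits are already zero, meaning the tagged
 * form reproduces the value exactly. */
int sketch_double_is_exact(double d) {
    uint64_t bits;
    memcpy(&bits, &d, sizeof(bits));
    return (bits & SKETCH_TAG_MASK) == 0;
}
```

A side effect of the tag is that even `0.0` encodes to a non-zero word, so it cannot be confused with the `NULL` that marks an empty slot.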
|
||||
|
||||
static inline int arIsSmallStr(void *v) {
|
||||
return ((uintptr_t)v & AR_TAG_MASK) == AR_TAG_STR;
|
||||
}
|
||||
|
||||
static inline int arSmallStrLen(void *v) {
|
||||
return (int)(((uintptr_t)v >> 2) & AR_SMALLSTR_LEN_MASK);
|
||||
}
|
||||
|
||||
static inline int arToSmallStr(void *v, char *buf) {
|
||||
int len = arSmallStrLen(v);
|
||||
uintptr_t val = (uintptr_t)v;
|
||||
for (int i = 0; i < len; i++) {
|
||||
buf[i] = (char)((val >> (8 * (i + 1))) & 0xFF);
|
||||
}
|
||||
buf[len] = '\0';
|
||||
return len;
|
||||
}
|
||||
|
||||
static inline void *arFromSmallStr(const char *s, int len) {
|
||||
uintptr_t v = AR_TAG_STR | ((uintptr_t)len << 2);
|
||||
for (int i = 0; i < len; i++) {
|
||||
v |= ((uintptr_t)(uint8_t)s[i]) << (8 * (i + 1));
|
||||
}
|
||||
return (void *)v;
|
||||
}
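The small-string layout used by `arFromSmallStr()` and `arToSmallStr()` can be sketched the same way: the lowest byte carries the tag and the length, and byte `i` of the string sits at bit offset `8 * (i + 1)`. This is a Python model of the 64-bit case under assumptions: `TAG_STR` and `LEN_MASK` are hypothetical stand-ins for `AR_TAG_STR` and `AR_SMALLSTR_LEN_MASK`, and the 7-byte limit is inferred from an 8-byte word minus its header byte.

```python
TAG_BITS = 2
TAG_STR = 0b11        # assumed value of AR_TAG_STR (hypothetical)
LEN_MASK = 0x7        # assumed value of AR_SMALLSTR_LEN_MASK (hypothetical)
MAX_SMALL_LEN = 7     # inferred: 8-byte word minus the tag/length byte


def from_small_str(data: bytes) -> int:
    """Pack up to 7 bytes into one tagged word, like arFromSmallStr()."""
    if len(data) > MAX_SMALL_LEN:
        raise ValueError("string too long for the inline encoding")
    word = TAG_STR | (len(data) << TAG_BITS)
    for i, byte in enumerate(data):
        word |= byte << (8 * (i + 1))
    return word


def small_str_len(word: int) -> int:
    """Read the length field stored just above the tag bits."""
    return (word >> TAG_BITS) & LEN_MASK


def to_small_str(word: int) -> bytes:
    """Unpack the payload bytes, like arToSmallStr()."""
    return bytes((word >> (8 * (i + 1))) & 0xFF
                 for i in range(small_str_len(word)))
```

With these assumed constants, `from_small_str(b"hello")` yields `0x6F6C6C656817`: the header byte `0x17` encodes length 5 and the tag, followed by the five characters in little-endian order.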

/* ----------------------------------------------------------------------------
 * Public API
 * -------------------------------------------------------------------------- */

/* Lifecycle */
redisArray *arNew(void);
void arFree(redisArray *ar);
redisArray *arDup(redisArray *ar);
void arDismiss(redisArray *ar, size_t size_hint);

/* Element access */
void *arGet(redisArray *ar, uint64_t idx);
void arSet(redisArray *ar, uint64_t idx, void *v);
int arDel(redisArray *ar, uint64_t idx);

/* Value encoding/decoding */
void *arEncode(const char *s, size_t len);
const char *arDecode(void *v, char *buf, size_t bufsize, size_t *outlen);
int arFormatFloat(double d, char *buf, size_t bufsize);
size_t arStringLen(const void *ptr);
const char *arStringData(const void *ptr);
void *arValueFromRdbInt(int64_t ival);
void *arValueFromRdbFloat(double d);
void *arValueFromRdbSmallStr(const char *s, size_t len);

/* Queries */
uint64_t arCount(redisArray *ar);
uint64_t arLen(redisArray *ar);

/* Bulk operations */
uint64_t arDeleteRange(redisArray *ar, uint64_t lo, uint64_t hi);
void arTruncate(redisArray *ar, uint64_t limit);
void arMayPromoteToDenseForRangeSet(redisArray *ar, uint64_t lo, uint64_t hi);

/* Utilities */
uint32_t arSparseFindPos(arSlice *s, uint16_t rel_idx, int *found);
uint32_t arSuperDirFind(redisArray *ar, uint64_t block_id, int *found);
redisArray *arDefrag(redisArray *ar, void *(*defragfn)(void *));
unsigned long arDefragIncremental(redisArray **arref, unsigned long cursor,
                                  void *(*defragfn)(void *));

#endif /* __SPARSEARRAY_H */
2021
src/t_array.c
Normal file
File diff suppressed because it is too large
@@ -91,6 +91,12 @@ static inline int log2ceil(size_t x) {
 #endif
 }

+/* Return the smallest power of 2 >= count (e.g. 5 -> 8, 8 -> 8). */
+static inline int nearestNextPowerOf2(unsigned int count) {
+    if (count <= 1) return 1;
+    return 1 << (32 - __builtin_clz(count-1));
+}
+
 /* Check for __builtin_add_overflow() */
 #ifndef __has_builtin
 #define __has_builtin(x) 0
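The `nearestNextPowerOf2()` helper added in the hunk above relies on a count-leading-zeros trick. As a quick cross-check, here is a small Python sketch of the same rounding rule. It assumes that `(count - 1).bit_length()` plays the role of `32 - __builtin_clz(count - 1)`, which holds for positive 32-bit values. Python integers do not overflow, whereas the C helper returns `int`, so the sketch only mirrors it for counts up to 2^30.

```python
def nearest_next_power_of_2(count: int) -> int:
    """Smallest power of two >= count (e.g. 5 -> 8, 8 -> 8)."""
    if count <= 1:
        return 1
    # bit_length() of (count - 1) is the number of bits needed to hold it,
    # i.e. 32 minus the leading-zero count of a 32-bit value.
    return 1 << (count - 1).bit_length()
```

Subtracting one before counting bits is what keeps exact powers of two unchanged: 8 becomes 7, which needs three bits, and `1 << 3` is 8 again.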
BIN
tests/assets/array-32bit.rdb
Normal file
Binary file not shown.
@@ -15,7 +15,7 @@ if { ! [ catch {

 proc generate_collections {suffix elements} {
     set rd [redis_deferring_client]
-    set numcmd 7
+    set numcmd 8 ;# base commands including array
     set has_vsets [server_has_command vadd]
     if {$has_vsets} {incr numcmd}

@@ -29,6 +29,15 @@ proc generate_collections {suffix elements} {
         $rd zadd zset$suffix $j $val
         $rd sadd set$suffix $val
         $rd xadd stream$suffix * item 1 value $val
+        # Array with sparse indices and mixed value types (int, float, string)
+        set idx [expr {$j * 100 + int(rand() * 50)}] ;# sparse indices
+        if {$j % 3 == 0} {
+            $rd arset array$suffix $idx $j ;# integer value
+        } elseif {$j % 3 == 1} {
+            $rd arset array$suffix $idx [format "%.5f" [expr {rand() * 1000}]] ;# float value
+        } else {
+            $rd arset array$suffix $idx "str_$val" ;# string value
+        }
         if {$has_vsets} {
             $rd vadd vset$suffix VALUES 3 1 1 1 $j
         }
@@ -46,6 +46,15 @@ start_server {tags {"dismiss external:skip needs:debug"}} {
         # stream
         r xadd bigstream * entry1 $bigstr entry2 $bigstr

+        # array: dense slice populated with large string values, plus a
+        # sparsely-populated array whose indices span multiple slices.
+        for {set i 0} {$i < 32} {incr i} {
+            r arset dense_array $i $bigstr
+        }
+        for {set i 0} {$i < 16} {incr i} {
+            r arset sparse_array [expr {$i * 5000}] $bigstr
+        }
+
         set digest [debug_digest]
         # Test both RDB (yes) and AOF (no) rewrite paths.
         foreach preamble {yes no} {
@@ -802,7 +802,8 @@ proc generate_fuzzy_traffic_on_key {key type duration} {
     set stream_commands {XACK XADD XCLAIM XDEL XGROUP XINFO XLEN XPENDING XRANGE XREAD XREADGROUP XREVRANGE XTRIM XDELEX XACKDEL XNACK}
     set vset_commands {VADD VREM}
     set gcra_commands {GCRA}
-    set commands [dict create string $string_commands hash $hash_commands zset $zset_commands list $list_commands set $set_commands stream $stream_commands vectorset $vset_commands gcra $gcra_commands]
+    set array_commands {ARSET ARGET ARDEL ARCOUNT ARMSET ARMGET ARGETRANGE ARDELRANGE ARINFO}
+    set commands [dict create string $string_commands hash $hash_commands zset $zset_commands list $list_commands set $set_commands stream $stream_commands vectorset $vset_commands gcra $gcra_commands array $array_commands]

     set cmds [dict get $commands $type]
     set start_time [clock seconds]
@@ -863,6 +864,49 @@ proc generate_fuzzy_traffic_on_key {key type duration} {
             lappend cmd [randomValue]
             incr i 2
         }
+        # Array commands need integer indices
+        if {$cmd == "ARSET"} {
+            lappend cmd $key
+            lappend cmd [randomInt 100000] ;# index
+            lappend cmd [randomValue] ;# value
+            incr i 3
+        }
+        if {$cmd == "ARGET" || $cmd == "ARDEL"} {
+            lappend cmd $key
+            lappend cmd [randomInt 100000] ;# index
+            incr i 2
+        }
+        if {$cmd == "ARCOUNT" || $cmd == "ARINFO"} {
+            lappend cmd $key
+            incr i 1
+        }
+        if {$cmd == "ARMSET"} {
+            lappend cmd $key
+            # Add 2-4 index/value pairs
+            set npairs [expr {int(rand() * 3) + 2}]
+            for {set p 0} {$p < $npairs} {incr p} {
+                lappend cmd [randomInt 100000]
+                lappend cmd [randomValue]
+            }
+            incr i [expr {1 + $npairs * 2}]
+        }
+        if {$cmd == "ARMGET"} {
+            lappend cmd $key
+            # Add 2-4 indices
+            set nidx [expr {int(rand() * 3) + 2}]
+            for {set p 0} {$p < $nidx} {incr p} {
+                lappend cmd [randomInt 100000]
+            }
+            incr i [expr {1 + $nidx}]
+        }
+        if {$cmd == "ARGETRANGE" || $cmd == "ARDELRANGE"} {
+            lappend cmd $key
+            set idx1 [randomInt 100000]
+            set idx2 [expr {$idx1 + [randomInt 1000]}]
+            lappend cmd $idx1
+            lappend cmd $idx2
+            incr i 3
+        }

         for {} {$i < $arity} {incr i} {
             if {$i == $firstkey || $i == $lastkey} {
@@ -204,6 +204,70 @@ start_server {tags {"aofrw external:skip debug_defrag:skip"} overrides {aof-use-
         r FUNCTION LIST
     } {{library_name test engine LUA functions {{name test description {} flags {}}}}}

+    # Array AOF rewrite tests
+    test "AOF rewrite of array with mixed value types" {
+        r flushall
+        # Create array with various value types
+        r arset myarray 0 12345 ;# int
+        r arset myarray 1 "hello" ;# small string
+        r arset myarray 2 3.14159 ;# float
+        r arset myarray 100 [string repeat x 50] ;# large string
+        r arset myarray 10000 "sparse" ;# sparse index
+        set d1 [debug_digest]
+        r bgrewriteaof
+        waitForBgrewriteaof r
+        r debug loadaof
+        set d2 [debug_digest]
+        if {$d1 ne $d2} {
+            error "assertion:$d1 is not equal to $d2"
+        }
+    }
+
+    test "AOF rewrite of array with insert_idx (circular buffer)" {
+        r flushall
+        # Create circular buffer using ARRING
+        for {set i 0} {$i < 25} {incr i} {
+            r arring myarray 10 "v$i"
+        }
+        # insert_idx should be 4 ((25-1) % 10 = 4)
+        set next_before [r arnext myarray]
+        set d1 [debug_digest]
+
+        r bgrewriteaof
+        waitForBgrewriteaof r
+        r debug loadaof
+
+        set d2 [debug_digest]
+        if {$d1 ne $d2} {
+            error "assertion:$d1 is not equal to $d2"
+        }
+        # Verify insert_idx preserved
+        assert_equal $next_before [r arnext myarray]
+
+        # Continue inserting - should continue from correct position
+        set new_idx [r arring myarray 10 "after_aof"]
+        assert_equal $next_before $new_idx
+    }
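The index arithmetic that the circular-buffer test above depends on can be sketched as a small model. This is a Python illustration, not the server's implementation: the `RingModel` class and its methods are hypothetical, and it assumes that `ARRING` writes at the insert counter modulo the ring size and returns the index it wrote, which is what the test's assertions suggest.

```python
class RingModel:
    """Hypothetical model of a fixed-size ring filled in index order."""

    def __init__(self, size: int):
        self.size = size
        self.slots: dict[int, str] = {}
        self.inserted = 0  # total number of inserts so far

    def next_index(self) -> int:
        """Index the next insert would use (assumed meaning of ARNEXT)."""
        return self.inserted % self.size

    def insert(self, value: str) -> int:
        """Write at the next index, wrapping around, and return that index."""
        idx = self.next_index()
        self.slots[idx] = value
        self.inserted += 1
        return idx


ring = RingModel(10)
written = [ring.insert(f"v{i}") for i in range(25)]
last_written = written[-1]        # (25 - 1) % 10
next_slot = ring.next_index()     # 25 % 10
```

Under these assumptions the 25th insert lands on index 4, matching the `(25-1) % 10` comment in the test, and the following insert goes to the index reported beforehand. That is the property the test checks across the AOF rewrite.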
+
+    test "AOF rewrite of array spanning multiple slices" {
+        r flushall
+        # Create array across multiple slices (slice_size = 4096)
+        for {set slice 0} {$slice < 5} {incr slice} {
+            set base [expr {$slice * 4096}]
+            for {set i 0} {$i < 20} {incr i} {
+                r arset myarray [expr {$base + $i * 100}] "s${slice}_v$i"
+            }
+        }
+        set d1 [debug_digest]
+        r bgrewriteaof
+        waitForBgrewriteaof r
+        r debug loadaof
+        set d2 [debug_digest]
+        if {$d1 ne $d2} {
+            error "assertion:$d1 is not equal to $d2"
+        }
+    }
+
     test {BGREWRITEAOF is delayed if BGSAVE is in progress} {
         r flushall
         r set k v
@@ -1113,6 +1113,97 @@ run_solo {defrag} {
         } ;# standalone
     }
     }

+    if {[string match {*jemalloc*} [s mem_allocator]] &&
+        [r debug mallctl arenas.page] <= 8192 &&
+        $type eq "standalone"} { ;# skip in cluster mode and non-jemalloc
+        test "Active defrag arrays: $type" {
+            r flushdb
+            r config set hz 100
+            r config set activedefrag no
+            wait_for_defrag_stop 500 100
+            r config resetstat
+            r config set active-defrag-max-scan-fields 100
+            r config set active-defrag-threshold-lower 1
+            r config set active-defrag-cycle-min 65
+            r config set active-defrag-cycle-max 75
+            r config set active-defrag-ignore-bytes 512kb
+            r config set maxmemory 0
+
+            # Create two large arrays with interleaved allocations. Indices are
+            # one full slice apart so the surviving array is stored as many
+            # separate slices and uses superdir mode.
+            set rd [redis_deferring_client]
+            set payload [string repeat A 500]
+            set elements 3000
+            set base 8388608
+            set count 0
+            for {set j 0} {$j < $elements} {incr j} {
+                set idx [expr {$base + $j * 4096}]
+                $rd arset bigarray1 $idx "a1:$j:$payload"
+                $rd arset bigarray2 $idx "a2:$j:$payload"
+
+                incr count
+                discard_replies_every $rd $count 1000 2000
+            }
+            set remaining [expr {($count % 1000) * 2}]
+            for {set j 0} {$j < $remaining} {incr j} {
+                $rd read
+            }
+
+            assert_equal $elements [r arcount bigarray1]
+            assert_equal $elements [r arcount bigarray2]
+            assert_morethan [dict get [r arinfo bigarray1] directory-size] 0
+
+            # Free one full array to create fragmentation around the surviving
+            # array's slices and string allocations.
+            r del bigarray2
+
+            after 120 ;# serverCron only updates the info once in 100ms
+            r config set latency-monitor-threshold 5
+            r latency reset
+
+            set digest [debug_digest]
+            catch {r config set activedefrag yes} e
+            if {[r config get activedefrag] eq "activedefrag yes"} {
+                wait_for_condition 50 100 {
+                    [s total_active_defrag_time] ne 0
+                } else {
+                    after 120 ;# serverCron only updates the info once in 100ms
+                    puts [r info memory]
+                    puts [r info stats]
+                    puts [r memory malloc-stats]
+                    fail "defrag not started."
+                }
+
+                # This test only needs to verify that active defrag reached the
+                # array and processed it without corrupting the value. We do
+                # not require the allocator to fully converge to a no-fragmentation
+                # state on every platform.
+                wait_for_condition 500 100 {
+                    [s active_defrag_key_hits] + [s active_defrag_key_misses] > 0
+                } else {
+                    after 120 ;# serverCron only updates the info once in 100ms
+                    puts [r info memory]
+                    puts [r info stats]
+                    puts [r memory malloc-stats]
+                    fail "array defrag did not touch the key."
+                }
+
+                r config set activedefrag no
+                wait_for_defrag_stop 500 100
+            }
+
+            # Verify the array stayed intact after active defrag touched it.
+            assert_equal $elements [r arcount bigarray1]
+            assert_equal "a1:0:$payload" [r arget bigarray1 $base]
+            assert_equal "a1:1234:$payload" [r arget bigarray1 [expr {$base + 1234 * 4096}]]
+            assert_equal "a1:2999:$payload" [r arget bigarray1 [expr {$base + 2999 * 4096}]]
+            assert_equal $digest [debug_digest]
+            assert_equal OK [r save] ;# Iterates all pointers again after defrag.
+            expr 1
+        } {1}
+    }
+    }

     test "Active defrag can't be triggered during replicaof database flush. See issue #14267" {
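The defrag test above spaces its indices exactly one slice apart, so each element should end up in a slice of its own. A short Python sketch of that arithmetic follows. It takes the 4096-element slice size from the PR description and assumes, for illustration only, that slices are aligned at multiples of that size. `slice_of` is a hypothetical helper, not a function from the patch.

```python
SLICE_SIZE = 4096      # slice size stated in the PR description
BASE = 8_388_608       # first index used by the defrag test
ELEMENTS = 3000        # number of elements per array in the test


def slice_of(idx: int) -> int:
    """Slice number of an index, assuming slices aligned at SLICE_SIZE."""
    return idx // SLICE_SIZE


indices = [BASE + j * SLICE_SIZE for j in range(ELEMENTS)]
touched = {slice_of(idx) for idx in indices}
first_slice, last_slice = min(touched), max(touched)
```

Under that assumption the 3000 elements occupy 3000 distinct slices, starting at slice 2048, with a single element at offset 0 of each one. That is the worst case for per-slice overhead and gives the defragmenter many small allocations to move.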
3114
tests/unit/type/array.tcl
Normal file
File diff suppressed because it is too large
431
tools/array-bench.py
Executable file
@@ -0,0 +1,431 @@
#!/usr/bin/env python3
import argparse
import json
import os
import re
import signal
import subprocess
import sys
import tempfile
import time
from dataclasses import dataclass, asdict
from pathlib import Path
from typing import Optional


QPS_RE = re.compile(r"([0-9]+(?:\.[0-9]+)?)\s+requests per second")


@dataclass
class Workload:
    name: str
    description: str
    command: list[str]
    requests: int
    clients: int
    pipeline: int
    rand_range: int = 0
    warmup_requests: int = 2000
    setup: Optional[str] = None


@dataclass
class Result:
    name: str
    description: str
    qps: float
    requests: int
    clients: int
    pipeline: int
    rand_range: int
    command: list[str]
    raw_output: str


class BenchError(RuntimeError):
    pass


class RedisArrayBench:
    def __init__(self, args: argparse.Namespace):
        self.args = args
        self.base_dir = Path(__file__).resolve().parent
        repo_root = self.base_dir.parent
        src_dir = Path(args.src_dir) if args.src_dir else repo_root / "src"
        self.redis_server = str(src_dir / "redis-server")
        self.redis_cli = str(src_dir / "redis-cli")
        self.redis_benchmark = str(src_dir / "redis-benchmark")
        self.server_proc: Optional[subprocess.Popen[str]] = None
        self.server_dir: Optional[tempfile.TemporaryDirectory[str]] = None
        self.host = args.host
        self.port = args.port
        self.db = args.db
        self.results: list[Result] = []

        for binary in (self.redis_server, self.redis_cli, self.redis_benchmark):
            if not os.path.exists(binary):
                raise BenchError(f"missing binary: {binary}")

    def run(self) -> int:
        try:
            if self.args.start_server:
                self.start_server()
            self.prepare_data()
            self.print_dataset_summary()
            for workload in self.selected_workloads():
                result = self.run_workload(workload)
                self.results.append(result)
                print(f"{result.name:28s} {result.qps:12.2f} req/s")
            self.print_summary()
            if self.args.json_out:
                with open(self.args.json_out, "w", encoding="utf-8") as fp:
                    json.dump({
                        "host": self.host,
                        "port": self.port,
                        "db": self.db,
                        "results": [asdict(r) for r in self.results],
                    }, fp, indent=2)
                print(f"json written to {self.args.json_out}")
            return 0
        finally:
            if self.args.start_server and not self.args.keep_server:
                self.stop_server()

    def start_server(self) -> None:
        self.server_dir = tempfile.TemporaryDirectory(prefix="array-bench-")
        cmd = [
            self.redis_server,
            "--port", str(self.port),
            "--save", "",
            "--appendonly", "no",
            "--dir", self.server_dir.name,
            "--loglevel", "warning",
            "--daemonize", "no",
        ]
        self.server_proc = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
        self.wait_for_ping(timeout=10.0)

    def stop_server(self) -> None:
        if self.server_proc is not None and self.server_proc.poll() is None:
            self.server_proc.send_signal(signal.SIGTERM)
            try:
                self.server_proc.wait(timeout=5)
            except subprocess.TimeoutExpired:
                self.server_proc.kill()
                self.server_proc.wait(timeout=5)
        if self.server_dir is not None:
            self.server_dir.cleanup()
        self.server_proc = None
        self.server_dir = None

    def wait_for_ping(self, timeout: float) -> None:
        deadline = time.time() + timeout
        last_error = None
        while time.time() < deadline:
            if self.server_proc is not None and self.server_proc.poll() is not None:
                raise BenchError(
                    "server exited before becoming ready:\n"
                    f"{self.read_server_output().strip()}"
                )
            try:
                cmd = [
                    self.redis_cli,
                    "-h", self.host,
                    "-p", str(self.port),
                    "-n", str(self.db),
                    "--raw",
                    "PING",
                ]
                probe = subprocess.run(
                    cmd,
                    stdout=subprocess.PIPE,
                    stderr=subprocess.PIPE,
                    text=True,
                )
                if probe.returncode != 0:
                    raise BenchError(probe.stderr.strip() or probe.stdout.strip())
                out = probe.stdout.strip()
                if out == "PONG":
                    return
            except Exception as exc: # pragma: no cover - startup race handling
                last_error = exc
            time.sleep(0.05)
        raise BenchError(
            f"server did not start on {self.host}:{self.port}: {last_error}\n"
            f"{self.read_server_output().strip()}"
        )

    def read_server_output(self) -> str:
        if self.server_proc is None or self.server_proc.stdout is None:
            return ""
        try:
            return self.server_proc.stdout.read()
        except Exception: # pragma: no cover - best effort diagnostics
            return ""

    def cli(self, command: list[str], raw: bool = False) -> str:
        cmd = [self.redis_cli, "-h", self.host, "-p", str(self.port), "-n", str(self.db)]
        if raw:
            cmd.append("--raw")
        cmd.extend(command)
        return subprocess.check_output(cmd, text=True)

    def pipe(self, payload: bytes) -> None:
        cmd = [self.redis_cli, "-h", self.host, "-p", str(self.port), "-n", str(self.db), "--pipe"]
        proc = subprocess.run(cmd, input=payload, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
        if proc.returncode != 0:
            raise BenchError(f"redis-cli --pipe failed:\n{proc.stdout.decode('utf-8', 'replace')}")
        out = proc.stdout.decode("utf-8", "replace")
        if "errors: 0, replies:" not in out:
            raise BenchError(f"unexpected --pipe output:\n{out}")

    @staticmethod
    def resp(parts: list[str]) -> bytes:
        out = [f"*{len(parts)}\r\n".encode()]
        for part in parts:
            data = part.encode("utf-8")
            out.append(f"${len(data)}\r\n".encode())
            out.append(data)
            out.append(b"\r\n")
        return b"".join(out)
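To make the wire format produced by `resp()` concrete, here is the same encoder as a standalone function with one sample command. The key name `bench:array:demo` is made up for the example. The rest follows the method above: an array header with the number of parts, then one length-prefixed bulk string per part.

```python
def resp(parts: list[str]) -> bytes:
    """Encode one command as a RESP array of bulk strings."""
    out = [f"*{len(parts)}\r\n".encode()]
    for part in parts:
        data = part.encode("utf-8")
        # The length prefix counts encoded bytes, not characters.
        out.append(f"${len(data)}\r\n".encode())
        out.append(data)
        out.append(b"\r\n")
    return b"".join(out)


# Hypothetical key name, used only for illustration.
encoded = resp(["ARSET", "bench:array:demo", "0", "42"])
print(encoded)
```

Because every part carries an explicit byte length, many commands can be concatenated into one payload and streamed through `redis-cli --pipe`, which is how the harness loads its datasets.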

    def prepare_data(self) -> None:
        print("preparing datasets...", file=sys.stderr)
        self.cli(["FLUSHDB"])
        payload = bytearray()
        payload += self.resp(["DEL", "bench:array:dense:num", "bench:array:dense:text", "bench:array:sparse:text", "bench:array:append", "bench:array:ring"])
        payload += self.build_dense_numeric()
        payload += self.build_dense_text()
        payload += self.build_sparse_text()
        self.pipe(bytes(payload))

    def build_dense_numeric(self) -> bytes:
        key = "bench:array:dense:num"
        total = self.args.dense_len
        batch = 256
        payload = bytearray()
        for start in range(0, total, batch):
            values = [str(start + i) for i in range(min(batch, total - start))]
            payload += self.resp(["ARSET", key, str(start), *values])
        return bytes(payload)

    def build_dense_text(self) -> bytes:
        key = "bench:array:dense:text"
        total = self.args.dense_len
        batch = 128
        payload = bytearray()
        for start in range(0, total, batch):
            values = []
            for i in range(start, min(start + batch, total)):
                mod = i % 4
                if mod == 0:
                    values.append(f"row:{i} alpha encoding complexity")
                elif mod == 1:
                    values.append(f"row:{i} beta sparse vector")
                elif mod == 2:
                    values.append(f"row:{i} gamma dense matcher")
                else:
                    values.append(f"row:{i} delta encoding helper")
            payload += self.resp(["ARSET", key, str(start), *values])
        return bytes(payload)

    def build_sparse_text(self) -> bytes:
        key = "bench:array:sparse:text"
        clusters = [
            (0, 97, 384),
            (8_388_608, 113, 640),
            (16_777_216, 127, 896),
            (25_165_824, 151, 896),
        ]
        batch_pairs = 64
        pairs: list[str] = []
        payload = bytearray()
        nth = 0
        for base, stride, count in clusters:
            for i in range(count):
                idx = base + i * stride
                mod = nth % 4
                if mod == 0:
                    value = f"slot:{idx} alpha encoding complexity"
                elif mod == 1:
                    value = f"slot:{idx} beta sparse needle"
                elif mod == 2:
                    value = f"slot:{idx} gamma dense helper"
                else:
                    value = f"slot:{idx} delta complexity marker"
                pairs.extend([str(idx), value])
                nth += 1
                if len(pairs) >= batch_pairs * 2:
                    payload += self.resp(["ARMSET", key, *pairs])
                    pairs.clear()
        if pairs:
            payload += self.resp(["ARMSET", key, *pairs])
        return bytes(payload)

    def print_dataset_summary(self) -> None:
        summary = {
            "bench:array:dense:num": {
                "count": self.cli(["ARCOUNT", "bench:array:dense:num"], raw=True).strip(),
                "len": self.cli(["ARLEN", "bench:array:dense:num"], raw=True).strip(),
            },
            "bench:array:dense:text": {
                "count": self.cli(["ARCOUNT", "bench:array:dense:text"], raw=True).strip(),
                "len": self.cli(["ARLEN", "bench:array:dense:text"], raw=True).strip(),
            },
            "bench:array:sparse:text": {
                "count": self.cli(["ARCOUNT", "bench:array:sparse:text"], raw=True).strip(),
                "len": self.cli(["ARLEN", "bench:array:sparse:text"], raw=True).strip(),
            },
        }
        print("dataset:")
        for key, info in summary.items():
            print(f" {key}: count={info['count']} len={info['len']}")

    def selected_workloads(self) -> list[Workload]:
        workloads = self.workloads()
        if not self.args.only:
            return workloads
        wanted = {name.strip() for name in self.args.only.split(",") if name.strip()}
        unknown = wanted - {w.name for w in workloads}
        if unknown:
            raise BenchError(f"unknown workload(s): {', '.join(sorted(unknown))}")
        return [w for w in workloads if w.name in wanted]

    def workloads(self) -> list[Workload]:
        dense_range_end = min(8192 + 31, self.args.dense_len - 1)
        return [
            Workload("arget_dense_rand", "ARGET dense random hit", ["ARGET", "bench:array:dense:num", "__rand_int__"], 200_000, 50, 16, rand_range=self.args.dense_len),
            Workload("armget_dense_4_rand", "ARMGET dense 4 random hits", ["ARMGET", "bench:array:dense:num", "__rand_int__", "__rand_int__", "__rand_int__", "__rand_int__"], 100_000, 50, 16, rand_range=self.args.dense_len),
            Workload("argetrange_dense_32", "ARGETRANGE dense 32 hot", ["ARGETRANGE", "bench:array:dense:num", "8192", str(dense_range_end)], 50_000, 32, 8),
            Workload("arscan_dense_limit_100", "ARSCAN dense LIMIT 100", ["ARSCAN", "bench:array:dense:text", "0", str(self.args.dense_len - 1), "LIMIT", "100"], 50_000, 24, 4),
            Workload("argrep_match_dense", "ARGREP MATCH dense", ["ARGREP", "bench:array:dense:text", "0", str(self.args.dense_len - 1), "MATCH", "encoding", "LIMIT", "20", "WITHVALUES"], 20_000, 20, 2),
            Workload("argrep_re_dense_nocase", "ARGREP RE dense nocase", ["ARGREP", "bench:array:dense:text", "0", str(self.args.dense_len - 1), "RE", "encoding|complexity|helper", "NOCASE", "LIMIT", "20", "WITHVALUES"], 20_000, 20, 2),
            Workload("arop_sum_dense_4096", "AROP SUM dense 4096", ["AROP", "bench:array:dense:num", "0", "4095", "SUM"], 50_000, 24, 4),
            Workload("arget_sparse_rand", "ARGET sparse random mostly miss", ["ARGET", "bench:array:sparse:text", "__rand_int__"], 200_000, 50, 16, rand_range=self.args.sparse_space),
            Workload("arscan_sparse_limit_100", "ARSCAN sparse LIMIT 100", ["ARSCAN", "bench:array:sparse:text", "0", str(self.args.sparse_space - 1), "LIMIT", "100"], 25_000, 20, 2),
            Workload("argrep_match_sparse", "ARGREP MATCH sparse", ["ARGREP", "bench:array:sparse:text", "0", str(self.args.sparse_space - 1), "MATCH", "encoding", "LIMIT", "20", "WITHVALUES"], 10_000, 16, 1),
            Workload("arop_used_sparse", "AROP USED sparse", ["AROP", "bench:array:sparse:text", "0", str(self.args.sparse_space - 1), "USED"], 25_000, 20, 2),
            Workload("arset_dense_rand", "ARSET dense random update", ["ARSET", "bench:array:dense:num", "__rand_int__", "42"], 150_000, 50, 16, rand_range=self.args.dense_len),
            Workload("armset_dense_4_rand", "ARMSET dense 4 random updates", ["ARMSET", "bench:array:dense:num", "__rand_int__", "11", "__rand_int__", "22", "__rand_int__", "33", "__rand_int__", "44"], 100_000, 50, 16, rand_range=self.args.dense_len),
            Workload("arinsert_append_hot", "ARINSERT append hot path", ["ARINSERT", "bench:array:append", "x"], 50_000, 24, 8, setup="reset_append"),
            Workload("arring_hot_1024", "ARRING size 1024 hot path", ["ARRING", "bench:array:ring", "1024", "x"], 100_000, 50, 16, setup="reset_ring"),
        ]

    def run_workload(self, workload: Workload) -> Result:
        if workload.setup:
            getattr(self, workload.setup)()
        if self.args.warmup and workload.warmup_requests > 0:
            self.invoke_benchmark(workload, workload.warmup_requests, quiet=True)
        raw = self.invoke_benchmark(workload, self.scale_requests(workload.requests), quiet=True)
        qps = self.parse_qps(raw)
        return Result(
            name=workload.name,
            description=workload.description,
            qps=qps,
            requests=self.scale_requests(workload.requests),
            clients=workload.clients,
            pipeline=workload.pipeline,
            rand_range=workload.rand_range,
            command=workload.command,
            raw_output=raw.strip(),
        )

    def invoke_benchmark(self, workload: Workload, requests: int, quiet: bool) -> str:
        cmd = [
            self.redis_benchmark,
            "-h", self.host,
            "-p", str(self.port),
            "--dbnum", str(self.db),
            "-n", str(requests),
            "-c", str(workload.clients),
            "-P", str(workload.pipeline),
            "--seed", str(self.args.seed),
        ]
        if quiet:
            cmd.append("-q")
        if workload.rand_range:
            cmd.extend(["-r", str(workload.rand_range)])
        cmd.extend(workload.command)
        return subprocess.check_output(cmd, text=True, stderr=subprocess.STDOUT)

    def parse_qps(self, raw: str) -> float:
        m = QPS_RE.search(raw)
        if not m:
            raise BenchError(f"could not parse qps from redis-benchmark output:\n{raw}")
        return float(m.group(1))
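The throughput extraction in `parse_qps()` can be tried in isolation. The sketch below reuses the same regular expression on a made-up output line. The sample string is an assumption about what a quiet-mode summary line looks like, not captured output, and `ValueError` stands in for the harness's `BenchError`.

```python
import re

QPS_RE = re.compile(r"([0-9]+(?:\.[0-9]+)?)\s+requests per second")


def parse_qps(raw: str) -> float:
    """Return the first 'N requests per second' figure found in raw."""
    m = QPS_RE.search(raw)
    if not m:
        raise ValueError(f"could not parse qps from output:\n{raw}")
    return float(m.group(1))


# Hypothetical sample line, for illustration only.
sample = "ARGET bench:array:dense:num __rand_int__: 152671.76 requests per second, p50=0.255 msec\n"
qps = parse_qps(sample)
```

Anchoring on the literal phrase "requests per second" rather than on a column position keeps the parser tolerant of whatever precedes the number, such as the command label or extra progress output.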

    def scale_requests(self, requests: int) -> int:
        scaled = int(requests * self.args.request_scale)
        return max(1000, scaled)

    def reset_append(self) -> None:
        self.cli(["DEL", "bench:array:append"])

    def reset_ring(self) -> None:
        self.cli(["DEL", "bench:array:ring"])

    def print_summary(self) -> None:
        print("\nsummary:")
        print("| workload | qps | req | c | P | notes |")
        print("|---|---:|---:|---:|---:|---|")
        for r in self.results:
            notes = r.description
            if r.rand_range:
                notes += f", rand=0..{r.rand_range - 1}"
            print(f"| {r.name} | {r.qps:.2f} | {r.requests} | {r.clients} | {r.pipeline} | {notes} |")


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description=(
            "Standalone Array benchmark harness. It uses DB 9 by default, "
            "flushes that DB, loads deterministic Array datasets, and runs "
            "custom redis-benchmark workloads."
        )
    )
    parser.add_argument("--src-dir", help="Path to the src directory containing redis-server, redis-cli, and redis-benchmark")
    parser.add_argument("--host", default="127.0.0.1")
    parser.add_argument("--port", type=int, default=6395)
    parser.add_argument("--db", type=int, default=9)
    parser.add_argument("--start-server", action="store_true", default=True,
                        help="Start an ephemeral redis-server on --port (default: enabled)")
    parser.add_argument("--no-start-server", dest="start_server", action="store_false",
                        help="Use an already running server instead of starting one")
    parser.add_argument("--keep-server", action="store_true",
                        help="Do not stop the ephemeral server after the run")
    parser.add_argument("--only", help="Comma-separated workload names to run")
    parser.add_argument("--seed", type=int, default=12345)
    parser.add_argument("--request-scale", type=float, default=1.0,
                        help="Scale factor applied to all workload request counts")
    parser.add_argument("--warmup", action="store_true", default=True,
                        help="Run a short warmup before each benchmark (default: enabled)")
    parser.add_argument("--no-warmup", dest="warmup", action="store_false")
    parser.add_argument("--json-out", help="Optional path for machine-readable results")
    parser.add_argument("--dense-len", type=int, default=16_384,
                        help="Number of contiguous dense elements to preload")
    parser.add_argument("--sparse-space", type=int, default=30_000_000,
                        help="Logical range used by sparse benchmarks")
    return parser.parse_args()


def main() -> int:
    args = parse_args()
    try:
        bench = RedisArrayBench(args)
        return bench.run()
    except BenchError as exc:
        print(f"error: {exc}", file=sys.stderr)
        return 1
    except subprocess.CalledProcessError as exc:
        output = exc.output if isinstance(exc.output, str) else exc.output.decode("utf-8", "replace")
        print(output, file=sys.stderr)
        return exc.returncode or 1


if __name__ == "__main__":
    raise SystemExit(main())
@@ -34,6 +34,7 @@ GROUPS = {
     "geo": "COMMAND_GROUP_GEO",
     "stream": "COMMAND_GROUP_STREAM",
     "bitmap": "COMMAND_GROUP_BITMAP",
+    "array": "COMMAND_GROUP_ARRAY",
     "rate_limit": "COMMAND_GROUP_RATE_LIMIT",
 }

@@ -603,6 +604,7 @@ const char *COMMAND_GROUP_STR[] = {
     "geo",
     "stream",
     "bitmap",
+    "array",
     "module",
     "rate_limit"
 };