opnsense-src/lib/libc/regex
Bill Sommerfeld 4f4860c9b0 regex: mixed sets are misidentified as singletons
Fix the "singleton" function used by regcomp() to turn character set
matches into exact character matches if a character set has exactly one
element.

The underlying cset representation is complex; most critically it
records "small" characters (codepoint less than either 128
or 256 depending on locale) in a bit vector, and "wide" characters in
a secondary array.

Unfortunately, the "singleton" function used to identify singleton sets
treated a cset as a singleton if either the "small" or the "wide" set
had exactly one element (it would then ignore the other set).
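
The flaw can be sketched with a toy model. This is not the code in
regcomp.c: the toy_* names, TOY_NC and TOY_NONE are hypothetical, the
"small" limit is assumed to be 128, and the real cset also carries other
state that is left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

#define TOY_NC   128            /* assumed "small" limit; locale-dependent in reality */
#define TOY_NONE ((wint_t)-1)   /* hypothetical "not a singleton" marker */

/* Toy cset: a bit vector for small characters plus an array of wide ones. */
struct toy_cset {
	unsigned char	 bits[TOY_NC / 8];
	const wint_t	*wides;
	size_t		 nwides;
};

static void
toy_add_small(struct toy_cset *cs, unsigned int c)
{
	assert(c < TOY_NC);
	cs->bits[c / 8] |= (unsigned char)(1u << (c % 8));
}

/* Count the small members; *last receives the highest one found, if any. */
static size_t
toy_count_small(const struct toy_cset *cs, wint_t *last)
{
	size_t n = 0;
	unsigned int c;

	for (c = 0; c < TOY_NC; c++)
		if (cs->bits[c / 8] & (1u << (c % 8))) {
			n++;
			*last = (wint_t)c;
		}
	return (n);
}

/* Flawed: each half is tested on its own, so the other half is ignored. */
static wint_t
toy_singleton_flawed(const struct toy_cset *cs)
{
	wint_t s = TOY_NONE;

	if (toy_count_small(cs, &s) == 1)
		return (s);
	if (cs->nwides == 1)
		return (cs->wides[0]);
	return (TOY_NONE);
}

/* Corrected: a singleton only if both halves together hold one member. */
static wint_t
toy_singleton_fixed(const struct toy_cset *cs)
{
	wint_t s = TOY_NONE;
	size_t nsmall = toy_count_small(cs, &s);

	if (nsmall + cs->nwides != 1)
		return (TOY_NONE);
	return (nsmall == 1 ? s : cs->wides[0]);
}
```

For a set holding the small characters 'a' and 'b' plus the wide
character U+00E0, toy_singleton_flawed() returns 0xE0 and
toy_singleton_fixed() returns TOY_NONE.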

The easiest way to demonstrate this bug:

	$ export LANG=C.UTF-8
	$ echo 'a' | grep '[abà]'

It should match (and print "a") but instead it doesn't match because the
set is misidentified as a singleton containing only the accented
character.
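
The same check can be made from C through the POSIX regcomp() and
regexec() interface. This is a minimal sketch, not a test from the
commit: bre_matches() and mixed_set_matches() are hypothetical helpers,
and it assumes the C.UTF-8 locale is installed.

```c
#include <locale.h>
#include <regex.h>
#include <stddef.h>

/* Returns 1 on match, 0 on no match, -1 on a compile or execution error. */
static int
bre_matches(const char *pattern, const char *text)
{
	regex_t re;
	int rc;

	/* No REG_EXTENDED: compile as a basic RE, as plain grep does. */
	if (regcomp(&re, pattern, REG_NOSUB) != 0)
		return (-1);
	rc = regexec(&re, text, 0, NULL, 0);
	regfree(&re);
	if (rc == 0)
		return (1);
	return (rc == REG_NOMATCH ? 0 : -1);
}

/* Hypothetical regression check: 1 if "a" matches the mixed set. */
static int
mixed_set_matches(void)
{
	/* Best effort; if the locale is missing, the current one is kept. */
	(void)setlocale(LC_ALL, "C.UTF-8");
	/* "\xc3\xa0" is the UTF-8 encoding of U+00E0 (a with grave). */
	return (bre_matches("[ab\xc3\xa0]", "a"));
}
```

A libc with the fix applied should return 1 from mixed_set_matches(); a
return of 0 would indicate the misidentification described above.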

PR:		281710
Reviewed by:	kevans, yuripv
Obtained from:	illumos

(cherry picked from commit 8f7ed58a15556bf567ff876e1999e4fe4d684e1d)
2024-09-25 15:42:25 -05:00
grot libc: Purge unneeded cdefs.h 2023-11-26 21:20:09 -07:00
cname.h Remove $FreeBSD$: one-line .h pattern 2023-08-16 11:54:23 -06:00
COPYRIGHT
engine.c libc: Purge unneeded cdefs.h 2023-11-26 21:20:09 -07:00
Makefile.inc Remove $FreeBSD$: one-line sh pattern 2023-08-16 11:55:03 -06:00
re_format.7 Remove $FreeBSD$: one-line nroff pattern 2023-08-16 11:55:15 -06:00
regcomp.c regex: mixed sets are misidentified as singletons 2024-09-25 15:42:25 -05:00
regerror.c libc: Purge unneeded cdefs.h 2023-11-26 21:20:09 -07:00
regex.3 Remove $FreeBSD$: one-line nroff pattern 2023-08-16 11:55:15 -06:00
regex2.h Remove $FreeBSD$: one-line .h pattern 2023-08-16 11:54:23 -06:00
regexec.c libc: Purge unneeded cdefs.h 2023-11-26 21:20:09 -07:00
regfree.c libc: Purge unneeded cdefs.h 2023-11-26 21:20:09 -07:00
Symbol.map libc: Remove empty comments in Symbol.map 2023-12-13 22:08:13 +00:00
utils.h Remove $FreeBSD$: one-line .h pattern 2023-08-16 11:54:23 -06:00
WHATSNEW