dfilter: Disallow embedded NUL bytes in regular strings

When byte escape sequences, that is hex \xhh or octal \0ddd,
are interpreted at the lexical level it is not possible to
use strings with embedded NUL bytes. The NUL byte is interpreted
as a C string terminator. As a consequence, for example, the
strings "AB" and "AB\x00CDE" compare as the same. This leads to
unexpected false matches and a poor user experience.
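
The false match can be reproduced outside the filter engine. The following is a minimal sketch in plain C, not Wireshark code; the helper names `equal_as_c_strings` and `equal_as_bytes` are hypothetical and exist only for this illustration. It shows that a terminator-based comparison treats the two strings as equal, while a length-counted comparison does not.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Compare the way C string functions do: stop at the first NUL byte. */
int equal_as_c_strings(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0;
}

/* Compare as length-counted buffers: embedded NUL bytes are significant. */
int equal_as_bytes(const void *a, size_t a_len, const void *b, size_t b_len)
{
    assert(a != NULL && b != NULL);
    return a_len == b_len && memcmp(a, b, a_len) == 0;
}

const char short_str[] = "AB";
/* Split in two so the C compiler does not read C, D, E as more hex digits. */
const char long_str[] = "AB\x00" "CDE";
```

Calling `equal_as_c_strings(short_str, long_str)` reports a match because `strcmp` never looks past the NUL, whereas `equal_as_bytes` with the `sizeof`-derived lengths (2 and 6) reports a difference.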

Disallow embedded NULs for regular strings (string literals that
do not begin with 'r' or 'R') for this reason.
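
As a rough illustration of the rule, the check can be written as a standalone function. This is a sketch only: `hex_escape_value` is a hypothetical helper, not part of the patch, and it folds the NUL check and a range check into a single return value instead of reporting an error through the scanner.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper: decode a "\xhh" escape sequence.
 * Returns the byte value (1..255), or -1 if the escape denotes a NUL
 * byte or a value that does not fit in one byte.
 */
int hex_escape_value(const char *text)
{
    unsigned long result;

    assert(text != NULL);
    assert(text[0] == '\\' && text[1] == 'x');

    result = strtoul(text + 2, NULL, 16);   /* skip the leading "\x" */
    if (result == 0 || result > 0xff)
        return -1;                          /* NUL or out of range: rejected */
    return (int)result;
}
```

With this sketch, `hex_escape_value("\\x41")` yields the byte for 'A' on an ASCII system, while `hex_escape_value("\\x00")` is rejected.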

It is possible to use a raw string instead (e.g. r"AB\x00C")
to match embedded NUL bytes, although that only works with regular
expressions. Normal escape rules also work with regular
expressions (e.g. "AB\\x00C"); this is the same string as the previous
one, written in an alternate form. What won't work is "AB\x00C": this
string is syntactically invalid.

So the expression: data matches r"AB\x00C"
will match the bytes {'A', 'B', '\0', 'C'}.

However, the expression: data contains r"AB\x00C"
won't match the fvalue above. Because the "contains" operator
doesn't compile a regular expression, it literally tries to
contains-match the bytes {'A', 'B', '\\', 'x', '0', '0', 'C'}.
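
The "contains" half of this can be sketched in plain C. This is an illustration under assumptions, not the dfilter implementation: `contains_bytes` is a hypothetical byte search, and the two arrays spell out the field value and the raw string exactly as the byte lists above describe them. The regular expression half is not shown because it would need a regex library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical byte search: returns 1 if needle occurs in hay, else 0. */
int contains_bytes(const unsigned char *hay, size_t hay_len,
                   const unsigned char *needle, size_t needle_len)
{
    size_t i;

    assert(hay != NULL && needle != NULL);
    if (needle_len > hay_len)
        return 0;
    for (i = 0; i + needle_len <= hay_len; i++) {
        if (memcmp(hay + i, needle, needle_len) == 0)
            return 1;
    }
    return 0;
}

/* The field value: 'A', 'B', NUL, 'C'. */
const unsigned char field_data[] = { 'A', 'B', '\0', 'C' };

/* The raw string r"AB\x00C" taken literally: seven bytes, no NUL. */
const unsigned char raw_pattern[] = { 'A', 'B', '\\', 'x', '0', '0', 'C' };
```

Searching `field_data` for `raw_pattern` finds nothing, since the seven literal bytes are longer than the four-byte value and contain no NUL.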

Therefore raw strings are very convenient, but it is still necessary
to be aware that the matches operator has one more level of indirection
than the other string operators (same as in Python).

Fixes #16156.
João Valverde 2021-05-30 08:40:30 +01:00
parent 85c257431f
commit 0fe551e5e7
1 changed files with 12 additions and 0 deletions

@@ -284,6 +284,12 @@ static void mark_lval_deprecated(const char *s);
     else {
         unsigned long result;
         result = strtoul(yytext + 1, NULL, 8);
+        if (result == 0) {
+            g_string_free(yyextra->quoted_string, TRUE);
+            yyextra->quoted_string = NULL;
+            dfilter_fail(yyextra->dfw, "%s (NUL byte) cannot be used with a regular string.", yytext);
+            return SCAN_FAILED;
+        }
         if (result > 0xff) {
             g_string_free(yyextra->quoted_string, TRUE);
             yyextra->quoted_string = NULL;
@@ -302,6 +308,12 @@ static void mark_lval_deprecated(const char *s);
     else {
         unsigned long result;
         result = strtoul(yytext + 2, NULL, 16);
+        if (result == 0) {
+            g_string_free(yyextra->quoted_string, TRUE);
+            yyextra->quoted_string = NULL;
+            dfilter_fail(yyextra->dfw, "%s (NUL byte) cannot be used with a regular string.", yytext);
+            return SCAN_FAILED;
+        }
         g_string_append_c(yyextra->quoted_string, (gchar) result);
     }
 }