Commit Graph

620 Commits

Author SHA1 Message Date
João Valverde a3b76138f0 dfilter: Fix memory leak
Filter: tcp.srcport == udp.port

Instructions:
00000 READ_TREE		tcp.srcport -> reg#0
00001 IF_FALSE_GOTO	5
00002 READ_TREE		udp.port -> reg#1
00003 IF_FALSE_GOTO	5
00004 ANY_EQ		reg#0 == reg#1
00005 RETURN

=================================================================
==180444==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 34 byte(s) in 1 object(s) allocated from:
    #0 0x55f21e4a9ff9  (/home/jpv/projects/wireshark/wireshark/build-asan/run/dftest+0xcdff9)
    #1 0x7f95ea661338  (/usr/lib/libc.so.6+0x82338)

SUMMARY: AddressSanitizer: 34 byte(s) leaked in 1 allocation(s).

Fixes a68b408a9f.
2022-03-25 18:38:11 +00:00
João Valverde 2fc8c0e36b dfilter: Handle a bitwise expr on the RHS 2022-03-23 11:04:41 +00:00
João Valverde 0335ebdc3a dfilter: ftype_is_true -> ftype_is_zero 2022-03-23 11:04:41 +00:00
João Valverde 16729be2c1 dfilter: Add bitwise masking of bits
Add support for masking of bits. Before the bitwise operator
could only test bits, it did not support clearing bits.

This allows testing if any combination of bits are set/unset
more naturally with a single test. Previously this was only
possible by combining several bitwise predicates.

Bitwise is implemented as a test node, even though it is not.
Maybe the test node should be renamed to something else.

Fixes #17246.
2022-03-22 12:58:04 +00:00
João Valverde 631cf34f0c dfilter: Use a function pointer array to free registers 2022-03-21 18:43:36 +00:00
João Valverde 6a0129a0e3 dfilter: Fix EditorConfig settings 2022-03-21 17:49:12 +00:00
João Valverde 54d8627c9a dfilter: Add more comments to optimization pass 2022-03-21 17:36:41 +00:00
João Valverde d60f2580ba dfilter: Pass around constants in instructions
The DFVM instructions arguments are generic boxed types but instead
of using FVALUE and PCRE types the code passes aroung REGISTER types
instead. Change that to pass constants in the instruction.
2022-03-21 17:09:56 +00:00
João Valverde 94d909103e dfilter: Remove DFVM constant initialization 2022-03-21 17:09:43 +00:00
João Valverde ae17e733ac dfilter: Use more DFVM values in gencode 2022-03-21 17:09:29 +00:00
João Valverde 769f1f10de dfilter: Add DFVM value constructor 2022-03-21 17:09:19 +00:00
João Valverde 1b574e7466 dfilter: Cleanup dfvm_apply() 2022-03-21 12:38:09 +00:00
João Valverde 22f3d87a8f dfilter: Use singly linked list for registers
Replace calls to list append with list prepend where applicable.
2022-03-21 11:47:19 +00:00
João Valverde ea949ef719 dfilter: Cleanup dfilter_dump() 2022-03-21 11:26:52 +00:00
João Valverde 50f04cb9da dfilter: Remove dead code 2022-03-19 20:10:43 +00:00
João Valverde 588d22a82b dfilter: Allow variable number of jumps during codegen
Use a list to allow a variable number of jumps, instead of a fixed
count. The flexibility in the number of jumps a given syntax tree
node might need to handle is useful to add new kinds of
operations.
2022-03-16 20:12:22 +00:00
João Valverde 32446523f6 dfilter: Fix stnode_tostr()
Syntax tree nodes can mutate and change type so the caching being used
is keepign a stale representation and printing wrong results. Recreate
the string every time the function is called.

We still store the string pointer in the node to be able to pass a const
char * to the caller without leaking memory, as a convenience.
2022-03-16 19:23:33 +00:00
João Valverde 7e07f373f5 dfilter: Remove unused function
Clean-up for a68b408a9f.
2022-03-09 11:51:47 +00:00
João Valverde 8983dda8f2 dfilter: Deprecate "~=" (any_ne)
The representation "~= has been superseded by "!==" with the same
meaning, making it superfluous and somewhat confusing. Deprecate
"~=" and recommend "!==" instead.
2022-03-09 11:28:39 +00:00
João Valverde df0fc8b517 dfilter: Try to be more flexible with leading colons
For an expression starting with a colon (a literal) try to parse
the value with and without colon. This avoids excluding some
valid representations like the IPv6 address "::1".
2022-03-05 11:10:54 +00:00
João Valverde bd48f947b0 dfilter: Require a field-like value on the LHS
Comparisons require a field-like value on one of the sides,
or both. Change this to require on the LHS or both. There is
realy no reason that I can see to allow the relation to commute,
and it allows removing a lot of unnecessary code and extra tests.
2022-03-05 11:10:54 +00:00
João Valverde a68b408a9f dfilter: Add RHS bias for literal values
For unparsed values on the RHS of a comparison try
to parse them first as a literal and only then as
a protocol. This is more complicated in code but
should be a use case a lot more common and useful in
practice.

It removes some annoying special cases and applies this
rule consistently to any expression. Consistency is
important otherwise the special cases and exceptions
make the language confusing and difficult to learn.

For values on the LHS the rule remains to first try a
protocol value, then a literal.

Related with issue #17731.
2022-03-05 11:10:54 +00:00
João Valverde c4f9d8abda dfilter: Rename "unparsed" to "literal"
A literal value is a value that cannot be interpreted as a
registered protocol. An unparsed value can be a literal or
an identifier (protocol/field) according to context and the
current disambiguation rules.

Strictly literal here is to be understood to  mean "numeric
literal, including numeric arrays, but not strings or character
constants".
2022-03-05 11:10:54 +00:00
João Valverde 6d520addd1 dfilter: Add special syntax for literals and names
The syntax for protocols and some literals like numbers
and bytes/addresses can be  ambiguous. Some protocols can
be parsed as a literal, for example the protocol "fc"
(Fibre Channel) can be parsed as 0xFC.

If a numeric protocol is registered that will also take
precedence over any literal, according to the current
rules, thereby breaking numerical comparisons to that
number. The same for an hypothetical protocol named "true",
etc.

To allow the user to disambiguate this meaning introduce
new syntax.

Any value prefixed with ':' or enclosed in <,> will be treated
as a literal value only. The value :fc or <fc> will always
mean 0xFC, under any context. Never a protocol whose filter
name is "fc".

Likewise any value prefixed with a dot will always be parsed
as an identifier (protocol or protocol field) in the language.
Never any literal value parsed from the token "fc".

This allows the user to be explicit about the meaning,
and between the two explicit methods plus the ambiguous one
it doesn't completely break any one meaning.

The difference can be seen in the following two programs:

    Filter: frame == fc

    Constants:

    Instructions:
    00000 READ_TREE		frame -> reg#0
    00001 IF-FALSE-GOTO	5
    00002 READ_TREE		fc -> reg#1
    00003 IF-FALSE-GOTO	5
    00004 ANY_EQ		reg#0 == reg#1
    00005 RETURN

    --------

    Filter: frame == :fc

    Constants:
    00000 PUT_FVALUE	fc <FT_PROTOCOL> -> reg#1

    Instructions:
    00000 READ_TREE		frame -> reg#0
    00001 IF-FALSE-GOTO	3
    00002 ANY_EQ		reg#0 == reg#1
    00003 RETURN

The filter "frame == fc" is the same as "filter == .fc",
according to the current heuristic, except the first form
will try to parse it as a literal if the name does not
correspond to any registered protocol.

By treating a leading dot as a name in the language we
necessarily disallow writing floats with a leading dot. We
will also disallow writing with an ending dot when using
unparsed values. This is a backward incompatibility but has
the happy side effect of making the expression {1...2}
unambiguous.

This could either mean "1 .. .2" or "1. .. 2". If we require
a leading and ending digit then the meaning is clear:
    1.0..0.2 -> 1.0 .. 0.2

Fixes #17731.
2022-03-05 11:10:54 +00:00
João Valverde 1278e36152 dfilter: Add more debug code 2022-02-27 23:35:57 +00:00
João Valverde 70301ba54c dfilter: Fix dfvm dump display
Fix operators to reflect their true meaning.
2022-02-27 19:12:02 +00:00
João Valverde 1aef88df4b dfilter: Fix node debug representation 2022-02-23 22:27:59 +00:00
João Valverde b5d74c69a7 dfilter: Fix error message with non printable ASCII
Before:
    Filter: http.user_agent == açaí
    dftest: "�" was unexpected in this context.

After:
    Filter: http.user_agent == açaí
    dftest: Non-printable ASCII characters may only appear inside double-quotes.

Related with #17770.
2022-02-19 17:49:29 +00:00
João Valverde d8b7d1f821 dfilter: Add aliases "any_eq" and "all_ne" 2021-12-22 14:32:32 +00:00
João Valverde 8b23dd3a3c dfilter: Add an "all equal" operator
To complete the set of equality operators add an "all equal"
operator that matches a frame if all fields match the condition.

The symbol chosen for "all_eq" is "===".
2021-12-22 14:32:32 +00:00
João Valverde c5a19582e4 epan: Convert to use stdio.h from GLib
Replace:
    g_snprintf() -> snprintf()
    g_vsnprintf() -> vsnprintf()
    g_strdup_printf() -> ws_strdup_printf()
    g_strdup_vprintf() -> ws_strdup_vprintf()

This is more portable, user-friendly and faster on platforms
where GLib does not like the native I/O.

Adjust the format string to use macros from intypes.h.
2021-12-19 19:29:53 +00:00
João Valverde 5bba669579 Remove some lingering uses of g_assert()
Also replace some incorrect uses of g_assert_true().

  g_assert_true -> g_assert -> ws_assert
2021-12-16 10:19:45 +00:00
João Valverde f5f8d9ebb6 dfilter: Fix token associativity
TEST_EQ and TEST_NE are unused. Replace by the correct values
and add missing token to string representations.
2021-12-13 01:24:18 +00:00
João Valverde 7cffcfa835 dfilter: Remove a default switch case 2021-12-12 10:16:27 +00:00
João Valverde 60e305d1e1 dfilter: Convert grammar.lemon to 4-space indentation
Add global EditorConfig settings for lemon files.

Add exceptions for the two grammar files that use tab indentation.
2021-12-02 15:48:40 +00:00
João Valverde 3657788cbb dfilter: Add default grammar type 2021-12-01 19:43:30 +00:00
João Valverde 647decd509 dfilter: Avoid double strdup to save token value
Store the lval token value instead.
2021-12-01 19:42:51 +00:00
João Valverde 557cee31fc dfilter: Save lexical token value to syntax tree
Use that for error messages, including any using test operators.

This allows to always use the same name as the user. It avoids
cases where the user write "a && b" and the message is "a and b"
is syntactically invalid.

It should also allow us to be more consistent with the use of
double quotes.
2021-12-01 13:34:01 +00:00
João Valverde 3e0806ca09 dfilter: Remove dfilter_fail_parse()
Instead of requiring a special error function in the parser
just set the syntax_error flag if an error occurs, in any stage
of compilation. Outside of the parser loop it will not be used
but that is fine.
2021-11-30 19:52:05 +00:00
João Valverde a6f978b4d3 dfilter: Remove two stnode replacement functions
One is unused and the other is only used with a corner
case. They are probably not necessary otherwise.
2021-11-30 19:48:47 +00:00
Moshe Kaplan a523135202 epan: Add header files to Doxygen
Add @file markers for epan
headers so that Doxygen will
generate documentation for them.
2021-11-30 08:46:49 +00:00
Moshe Kaplan 40016daeb3 Add files with WS_DLL_PUBLIC to Doxygen part2
Add @file markers for remaining non-dissector
files that contain functions exported with
WS_DLL_PUBLIC so that Doxygen will
generate documentation for them.
2021-11-30 06:47:35 +00:00
João Valverde fbfb4959ae dfilter: Better representation for charconst 2021-11-27 18:38:22 +00:00
João Valverde 352390aa97 dfilter: Need to handle a charconst on the LHS 2021-11-27 17:19:11 +00:00
João Valverde 702c7f0cc8 Remove stale comment. 2021-11-24 10:45:42 +00:00
João Valverde 35ad2e85c8 dfilter: Free a scanner string 2021-11-24 10:06:19 +00:00
João Valverde eb8c3169e7 dfilter: Clean up charconst error message 2021-11-24 09:38:58 +00:00
João Valverde 943c282009 dfilter: Parse character constants in lexer
Invalid character constants should be handled in the lexical scanner.

Todo: See if some code could be shared to parse double quoted strings.

It also fixes some unintuitive type coercions to string. Character
constants should be treated as characters, or maybe integers, or
maybe even throw an invalid comparison error, but coverting to a
literal string or byte array is surprising and not particularly
useful:
  '\xFF' -> "'\xFF'" (equals)
  '\xFF' -> "FF"     (contains)

Before:

    Filter: http.request.method contains "\x63"

    Constants:
    00000 PUT_FVALUE	"c" <FT_STRING> -> reg#1
    (...)

    Filter: http.request.method contains '\x63'

    Constants:
    00000 PUT_FVALUE	"63" <FT_STRING> -> reg#1
    (...)

    Filter: http.request.method == "\x63"

    Constants:
    00000 PUT_FVALUE	"c" <FT_STRING> -> reg#1
    (...)

    Filter: http.request.method == '\x63'

    Constants:
    00000 PUT_FVALUE	"'\\x63'" <FT_STRING> -> reg#1
    (...)

After:

    Filter: http.request.method contains '\x63'

    Constants:
    00000 PUT_FVALUE	"c" <FT_STRING> -> reg#1
    (...)

    Filter: http.request.method == '\x63'

    Constants:
    00000 PUT_FVALUE	"c" <FT_STRING> -> reg#1
    (...)
2021-11-24 08:40:20 +00:00
João Valverde 7028646f9e dfilter: Fix invalid character constant error message
This reverts commit d635ff4933.

A charconst cannot be a value string, for that reason it is not
redundant with unparsed.

Maybe character constants should be parsed in the lexical scanner
instead.

Before:
  Filter: ip.proto == '\g'
  dftest: "'\g'" cannot be found among the possible values for ip.proto.

After:
  Filter: ip.proto == '\g'
  dftest: "'\g'" isn't a valid character constant.
2021-11-23 17:35:40 +00:00
João Valverde 72c5efea1b dfilter: Reject invalid character escape sequences
For double quoted strings. This is consistent with single quote
character constants and the C standard. It also avoids common
mistakes where the superfluous backslash is silently suppressed.
2021-11-23 16:48:02 +00:00