Commit Graph

83 Commits

Author SHA1 Message Date
João Valverde e8800ff3c4 dfilter: Add a thin encapsulation layer for REs 2021-10-18 12:09:36 +00:00
João Valverde 2e048df011 dfilter: Improve error message for "matches"
Should be more obvious that this error is caused
by a string syntax error and not something else.
2021-10-18 12:09:36 +00:00
João Valverde a975d478ba dfilter: Require double-quoted strings with "matches"
Matches is a special case that looks on the RHS and tries
to convert every unparsed value to a string, regardless
of the LHS type. This is not how types work in the display
filter. Require double-quotes to avoid ambiguity, because
matches doesn't follow normal Wireshark display filter
type rules. It doesn't need nor benefit from the flexibility
provided by unparsed strings in the syntax.

For matches the RHS is always a literal strings except
if the RHS is also a field name, then it complains of an
incompatible type. This is confusing. No type can be compatible
because no type rules are ever considered. Every unparsed value is
a text string except if it happens to coincide with a field
name it also requires double-quoting or it throws a syntax error,
just to be difficult. We could remove this odd quirk but requiring
double-quotes for regular expressions is a better, more elegant
fix.

Before:
  Filter: tcp matches "udp"

  Constants:
  00000 PUT_PCRE	udp -> reg#1

  Instructions:
  00000 READ_TREE		tcp -> reg#0
  00001 IF-FALSE-GOTO	3
  00002 ANY_MATCHES	reg#0 matches reg#1
  00003 RETURN

  Filter: tcp matches udp

  Constants:
  00000 PUT_PCRE	udp -> reg#1

  Instructions:
  00000 READ_TREE		tcp -> reg#0
  00001 IF-FALSE-GOTO	3
  00002 ANY_MATCHES	reg#0 matches reg#1
  00003 RETURN

  Filter: tcp matches udp.srcport
  dftest: tcp and udp.srcport are not of compatible types.

  Filter: tcp matches udp.srcportt

  Constants:
  00000 PUT_PCRE	udp.srcportt -> reg#1

  Instructions:
  00000 READ_TREE		tcp -> reg#0
  00001 IF-FALSE-GOTO	3
  00002 ANY_MATCHES	reg#0 matches reg#1
  00003 RETURN

After:
  Filter: tcp matches "udp"

  Constants:
  00000 PUT_PCRE	udp -> reg#1

  Instructions:
  00000 READ_TREE		tcp -> reg#0
  00001 IF-FALSE-GOTO	3
  00002 ANY_MATCHES	reg#0 matches reg#1
  00003 RETURN

  Filter: tcp matches udp
  dftest: "udp" was unexpected in this context.

  Filter: tcp matches udp.srcport
  dftest: "udp.srcport" was unexpected in this context.

  Filter: tcp matches udp.srcportt
  dftest: "udp.srcportt" was unexpected in this context.

The error message could still be improved.
2021-10-17 22:53:36 +00:00
João Valverde e46deda5cf Fix build with WS_DISABLE_DEBUG 2021-10-15 12:23:43 +01:00
João Valverde 1ace61074e dfilter: Display token value for debugging 2021-10-14 23:24:57 +01:00
João Valverde 0d3bfedfb0 dfilter: Fixup deprecated tokens initialization
Always use the internal API to access "deprecated" and initialize
the data structure on demand. This fixes a null pointer dereference
introduced previously.

Use reference counting to share the array cleanly and avoid memory
leaks.

Keep the pointer in dfwork_t.
2021-10-14 16:49:23 +01:00
João Valverde e91b5beafd dfilter: Resolve field names in the parser
The lexical rules for fields and unparsed strings are ambiguous,
e.g. "fc" can be the protocol fibre channel or the byte 0xfc.
In general a name is determined to be a protocol field or not by
checking the registry.

Resolving the name in the parser gives more flexibility, for example
to use different semantic rules according to the relation between
LHS and RHS, and allows function names and protocol names to co-exist
without ambiguity.

Before:
  Filter: tcp == 1

  Constants:
  00000 PUT_FVALUE	01 <FT_PROTOCOL> -> reg#1

  Instructions:
  00000 READ_TREE		tcp -> reg#0
  00001 IF-FALSE-GOTO	3
  00002 ANY_EQ		reg#0 == reg#1
  00003 RETURN

  Filter: tcp() == 1
  dftest: Syntax error near "(".

After:
  Filter: tcp == 1

  Constants:
  00000 PUT_FVALUE	01 <FT_PROTOCOL> -> reg#1

  Instructions:
  (same)

  Filter: tcp() == 1
  dftest: Function 'tcp' does not exist

It's also a goal to make it easier to modify the lexer rules.

Ping #12810.
2021-10-14 16:45:19 +01:00
João Valverde 2c701ddf6f dfilter: Improve grammar to parse ranges
Do the integer conversion for ranges in the parser. This is more
conventional, I think, and allows removing the unnecessary integer
syntax tree node type.

Try to minimize the number and complexity of lexical rules for
ranges. But it seems we need to keep different states for integer
and punctuation because of the need to disambiguate the ranges
[-n-n] and [-n--n].
2021-10-08 19:18:56 +01:00
João Valverde 92285e6258 dfilter: Improve grammar to parse functions
A function is grammatically an identifier that is followed by '(' and ')'
according to some rules. We should avoid assuming a token is a function
just because it matches a registered function name.

Before:
  Filter: foobar(http.user_agent) contains "UPDATE"
  dftest: Syntax error near "(".

After:
  Filter: foobar(http.user_agent) contains "UPDATE"
  dftest: The function 'foobar' does not exist.

This has the problem that a function cannot have the same name
as a protocol but that limitation already existed before.
2021-10-08 04:01:24 +00:00
João Valverde db18865e55 dfilter: Save token value to syntax tree
When parsing we save the token value to the syntax tree. This is
useful for better error reporting. Use it to report an invalid
entity for the slice operation. Before only the memory location
was reported, which is not a good error message.

Before:
  % dftest '"01:02:03:04"[0:3] == foo'
  Filter: ""01:02:03:04"[0:3] == foo"
  dftest: Range is not supported for entity <0x7f6c84017740> of type STRING

After:
  % dftest '"01:02:03:04"[0:3] == foo'
  Filter: ""01:02:03:04"[0:3] == foo"
  dftest: Range is not supported for entity 01:02:03:04 of type STRING

When creating a new node from an old one we need to copy the token
value. Simple tokens such as RBRACKET, COMMA and COLON are
not part of the AST and don't have an associated semantic value.
2021-10-01 16:04:37 +00:00
João Valverde de6f5b9d82 dfilter: Fixup syntax tree node display 2021-09-30 19:11:17 +01:00
João Valverde 0e7ba54d98 dfilter: Clean up handling of "deprecated" tokens
Pass the deprecated data struture to the scanner and insert the deprecated
tokens there. This avoids having to keep a dedicated syntax node field
for this.

Pass the deprecated argument in dfwork_t instead of in a separate
argument. This is less cumbersome than adding an extra argument
to every level of the semantic checker.
2021-09-30 17:26:19 +01:00
João Valverde 3ea2a61f2a dfilter: Display syntax tree for debugging
Use wslog to output debug information. Being able to control
it at runtime is a big advantage.

We extend the syntax tree nodes with a method to return a
canonical string representation.

Add a routine to walk the tree and return an textual representation
for debugging purposes.
2021-09-30 16:29:11 +01:00
João Valverde 0e50979b3f Replace g_assert() with ws_assert() 2021-06-19 01:23:31 +00:00
João Valverde 39df3ae3c0 Replace g_log() calls with ws_log() 2021-06-16 12:50:27 +00:00
João Valverde 85c257431f dfilter: Add support for raw strings
Add support for a literal string specification copied from Python
raw strings[1].

Raw string literals are enclosed with r"..." or R"...". Double quotes
can be include in the string but they must be escaped with backslash.
In escape sequences backslashes are preserved in the final result.

So for example the string "a\\\"b" is the same as r"a\"b".

r"\\\a" is the same as "\\\\\\a".

Raw strings should be used for convenience wherever a regular expression
is used in a display filter expression.

[1]https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals
2021-06-05 02:46:40 +01:00
João Valverde 9ba97d12d6 Add ws_debug() and use it
Replace most instances of ws_debug_printf() except in
epan/dissectors and dissector plugins.

Some replacements use printf(), some use ws_debug(), and
some were removed because they were dead or judged to be
temporary.
2021-05-24 01:13:19 +00:00
Guy Harris 20800366dd HTTPS (almost) everywhere.
Change all wireshark.org URLs to use https.

Fix some broken links while we're at it.

Change-Id: I161bf8eeca43b8027605acea666032da86f5ea1c
Reviewed-on: https://code.wireshark.org/review/34089
Reviewed-by: Guy Harris <guy@alum.mit.edu>
2019-07-26 18:44:40 +00:00
Guy Harris 9d8b0a9cd0 Always set *dfp to NULL on an error return from dfilter_compile().
All other error-return code paths set *dfp to NULL; make this one do so
as well.

Change-Id: I4015c1d53bdbac99cdeda158d7d01c8da7bf2562
Reviewed-on: https://code.wireshark.org/review/31102
Reviewed-by: Guy Harris <guy@alum.mit.edu>
2018-12-19 06:04:16 +00:00
Peter Wu 6144951380 dfilter: fix memleaks with functions and slice operator
Running tools/dfilter-test.py with LSan enabled resulted in 38 test
failures due to memory leaks from "fvalue_new". Problematic dfilters:
- Return values from functions, e.g. `len(data.data) > 8` (instruction
  CALL_FUNCTION invoking functions from epan/dfilter/dfunctions.c)
- Slice operator: `data.data[1:2] == aa:bb` (function mk_range)

These values end up in "registers", but as some values (from READ_TREE)
reference the proto tree, a new tracking flag ("owns_memory") is added.

Add missing tests for some functions and try to improve documentation.

Change-Id: I28e8cf872675d0a81ea7aa5fac7398257de3f47b
Reviewed-on: https://code.wireshark.org/review/27132
Petri-Dish: Peter Wu <peter@lekensteyn.nl>
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
Tested-by: Petri Dish Buildbot
Reviewed-by: Anders Broman <a.broman58@gmail.com>
2018-04-25 06:57:00 +00:00
Peter Wu 6a45dcd7a2 dfilter: require spaces as set element separator
Previously a filter such as `http.request.method in {"GET"HEAD""}` would
be parsed as three strings (GET, HEAD and an empty string). As it seems
more likely that people make typos rather than intending to construct
such a filter, forbid this by always requiring a whitespace separator.

Change-Id: I77e531fd6be072f62dd06aac27f856106c8920c6
Reported-by: Stig Bjørlykke
Reviewed-on: https://code.wireshark.org/review/26989
Petri-Dish: Peter Wu <peter@lekensteyn.nl>
Tested-by: Petri Dish Buildbot
Reviewed-by: Anders Broman <a.broman58@gmail.com>
2018-04-18 03:47:58 +00:00
Dario Lombardo 55c68ee69c epan: use SPDX indentifiers.
Skipping dissectors dir for now.

Change-Id: I717b66bfbc7cc81b83f8c2cbc011fcad643796aa
Reviewed-on: https://code.wireshark.org/review/25694
Petri-Dish: Dario Lombardo <lomato@gmail.com>
Tested-by: Petri Dish Buildbot
Reviewed-by: Anders Broman <a.broman58@gmail.com>
2018-02-08 19:29:45 +00:00
Guy Harris 4d2d423106 Rename routines to clarify what they do.
XXX_prime_with_YYY makes it a bit clearer than does XXX_prime_YYY that
we're not priming YYY, we're priming XXX *using* YYY.

Change-Id: I1686b8b5469bc0f0bd6db8551fb6301776a1b133
Reviewed-on: https://code.wireshark.org/review/21031
Reviewed-by: Guy Harris <guy@alum.mit.edu>
2017-04-12 04:56:49 +00:00
Dario Lombardo 7c69ae929d dfilter-macro: add cleanup routine.
Change-Id: I3de59c0366e9bec058de144eb136abaca24b5911
Reviewed-on: https://code.wireshark.org/review/19918
Petri-Dish: Dario Lombardo <lomato@gmail.com>
Tested-by: Petri Dish Buildbot <buildbot-no-reply@wireshark.org>
Reviewed-by: Michael Mann <mmann78@netscape.net>
2017-02-03 02:38:20 +00:00
Peter Wu 032a6ac3be Fix memleaks in capture file dialog
Tried to poke various fields (including the capture filter field), this
revealed some memleaks.

Change-Id: I1eca431a09839906a4b3c902ad85e55bffc71ca8
Reviewed-on: https://code.wireshark.org/review/17648
Petri-Dish: Peter Wu <peter@lekensteyn.nl>
Tested-by: Petri Dish Buildbot <buildbot-no-reply@wireshark.org>
Reviewed-by: Michael Mann <mmann78@netscape.net>
2016-09-12 01:33:38 +00:00
Michael Mann 1da1f945e2 Fix checkAPI.pl warnings about printf
Many of the complaints from checkAPI.pl for use of printf are when its embedded
in an #ifdef and checkAPI isn't smart enough to figure that out.
The other (non-ifdef) use is dumping internal structures (which is a type of
debug functionality)
Add a "ws_debug_printf" macro for printf to pacify the warnings.

Change-Id: I63610e1adbbaf2feffb4ec9d4f817247d833f7fd
Reviewed-on: https://code.wireshark.org/review/16623
Reviewed-by: Michael Mann <mmann78@netscape.net>
Petri-Dish: Michael Mann <mmann78@netscape.net>
Tested-by: Petri Dish Buildbot <buildbot-no-reply@wireshark.org>
Reviewed-by: Anders Broman <a.broman58@gmail.com>
2016-07-25 04:26:50 +00:00
Guy Harris 59816ef00c Make the Flex scanners and YACC parser in libraries reentrant.
master-branch libpcap now generates a reentrant Flex scanner and
Bison/Berkeley YACC parser for capture filter expressions, so it
requires versions of Flex and Bison/Berkeley YACC that support that.

We might as well do the same.  For libwiretap, it means we could
actually have multiple K12 text or Ascend/Lucent text files open at the
same time.  For libwireshark, it might not be as useful, as we only read
configuration files at startup (which should only happen once, in one
thread) or on demand (in which case, if we ever support multiple threads
running libwireshark, we'd need a mutex to ensure that only one file
reads it), but it's still the right thing to do.

We also require a version of Flex that can write out a header file, so
we change the runlex script to generate the header file ourselves. This
means we require a version of Flex new enough to support --header-file.

Clean up some other stuff encountered in the process.

Change-Id: Id23078c6acea549a52fc687779bb55d715b55c16
Reviewed-on: https://code.wireshark.org/review/14719
Reviewed-by: Guy Harris <guy@alum.mit.edu>
2016-04-03 22:21:29 +00:00
Stig Bjørlykke cc679ca5ce Qt: Add check for field extractors
The proto tree is needed in several cases when using Lua field extractors,
because they fetch values from the tree.  Without a valid field extractor
a Lua plugin may misbehave and display wrong column info.

This fixes column issues when:
- Calling resetColumns() in Qt.  This involves adding a display filter,
  change time display format, change name resolution and other changes
  in UI which requires column updates.
- Print summary lines.
- Export as CSV and PSML.

Change-Id: Ieed6f8578cdf2759f1f836cd8413a4529b7bbd80
Reviewed-on: https://code.wireshark.org/review/13708
Petri-Dish: Stig Bjørlykke <stig@bjorlykke.org>
Tested-by: Petri Dish Buildbot <buildbot-no-reply@wireshark.org>
Reviewed-by: Anders Broman <a.broman58@gmail.com>
2016-02-05 05:35:02 +00:00
Guy Harris 4348d4dd34 Type cleanups.
dfilter_macro_apply_recurse() returns either NULL or a pointer to
freshly-allocated memory, so it doesn't return a const pointer.
dfilter_macro_apply() calls dfilter_macro_apply_recurse(), so it doesn't
return a const pointer, either.

In dfilter_compile(), have separate variables for the filter handed in
and the macro-expanded filter, the former being const gchar * and the
latter being gchar *.

Change-Id: I191549bf0ff6c09c1278a98432a907c93d5e0e74
Reviewed-on: https://code.wireshark.org/review/12446
Reviewed-by: Guy Harris <guy@alum.mit.edu>
2015-12-05 20:41:32 +00:00
Bill Meier 3e3fc9fc5e epan/dfilter/*.c: As needed: Add editor modelines & Fix indentation
Change-Id: I410839329a98bd806c60961dfb9693d5eeeeb702
Reviewed-on: https://code.wireshark.org/review/7104
Reviewed-by: Bill Meier <wmeier@newsguy.com>
2015-02-13 19:04:44 +00:00
Guy Harris cfcbb28671 Clean up ftype-conversion and dfilter error message string handling.
Have dfilter_compile() take an additional gchar ** argument, pointing to
a gchar * item that, on error, gets set to point to a g_malloc()ed error
string.  That removes one bit of global state from the display filter
parser, and doesn't impose a fixed limit on the error message strings.

Have fvalue_from_string() and fvalue_from_unparsed() take a gchar **
argument, pointer to a gchar * item, rather than an error-reporting
function, and set the gchar * item to point to a g_malloc()ed error
string on an error.

Allow either gchar ** argument to be null; if the argument is null, no
error message is allocated or provided.

Change-Id: Ibd36b8aaa9bf4234aa6efa1e7fb95f7037493b4c
Reviewed-on: https://code.wireshark.org/review/6608
Reviewed-by: Guy Harris <guy@alum.mit.edu>
2015-01-18 10:22:59 +00:00
Michael Mann 86726f404a Trim down the use of ep_ memory in the display filter code.
Couldn't quite eliminate it completely, but it's much improved.  Need to figure out where/when to free dfilter_error_msg.

Change-Id: I10216e9546d38e83f69991ded8ec0b3fc8472035
Reviewed-on: https://code.wireshark.org/review/6591
Reviewed-by: Michael Mann <mmann78@netscape.net>
2015-01-18 00:28:53 +00:00
Alexis La Goutte 296591399f Remove all $Id$ from top of file
(Using sed : sed -i '/^ \* \$Id\$/,+1 d')

Fix manually some typo (in export_object_dicom.c and crc16-plain.c)

Change-Id: I4c1ae68d1c4afeace8cb195b53c715cf9e1227a8
Reviewed-on: https://code.wireshark.org/review/497
Reviewed-by: Anders Broman <a.broman58@gmail.com>
2014-03-04 14:27:33 +00:00
Jakub Zawadzki c6669a3c63 dfilter: report warning if OR and AND logic operands are mixed without parentheses.
svn path=/trunk/; revision=51247
2013-08-10 17:49:28 +00:00
Jeff Morriss 54bb2e7a5c Move report_err.{h,c} from epan into wsutil: there's nothing epan-specific there and moving it avoids having to recompile the file for use in editcap and capinfos (which don't link against libwireshark).
svn path=/trunk/; revision=50598
2013-07-15 02:48:26 +00:00
Jeff Morriss 3729335973 We always HAVE_CONFIG_H so don't bother checking whether we have it or not.
svn path=/trunk/; revision=45016
2012-09-20 01:48:30 +00:00
Jakub Zawadzki bf81b42e1e Update Free Software Foundation address.
(COPYING will be updated in next commit)

svn path=/trunk/; revision=43536
2012-06-28 22:56:06 +00:00
Anders Broman 345b48d1ea Try to squelch warnings
svn path=/trunk/; revision=43019
2012-06-03 09:26:15 +00:00
Anders Broman 7f96d94b7c From Gilbert Ramirez: When filtering on a single-byte byte-array-slice, using a normal hex string would be nice
svn path=/trunk/; revision=41232
2012-02-29 05:58:45 +00:00
Guy Harris 70a8b27948 Get rid of the depth argument to dfilter_macro_apply(); have an internal
routine that does all the work and that takes a depth argumen, and an
external routine that calls that internal routine with a depth argument
of 0.  The depth is only of use internally, to avoid infinite recursion.

When recursing with that routine, pass depth+1 as the depth value,
rather than passing depth and incrementing it afterwards; the latter
doesn't prevent infinite recursion.  (Thanks and a tip of the hat to
Clang Cat for catching this.)

Squelch some other (harmless) warnings from Clang Cat.

Clean up indentation.

svn path=/trunk/; revision=39838
2011-11-15 04:26:38 +00:00
Guy Harris b32b4d08bd It wasn't complaining about that null pointer reference.
svn path=/trunk/; revision=35981
2011-02-17 18:43:38 +00:00
Guy Harris 8eb8623b15 OK, let's try a couple more explicit checks against NULL, to see whether
that de-confuses Microsoft's code analyzer.

svn path=/trunk/; revision=35975
2011-02-17 08:15:05 +00:00
Stig Bjørlykke 54fa8338b9 Fixed a data type and removed a shadowed variable.
svn path=/trunk/; revision=30610
2009-10-18 21:30:39 +00:00
Kovarththanan Rajaratnam 17f010119a From Jakub Zawadzki via. Bug 3330:
* Fix memleak (df->deprecated in dfilter_free())
* Free protocol hash tables on cleanup.
* Free protocols list on cleanup.
* Free memory allocated by fgetline() in parse_services_file()

From me:

* proto.c: set gmc_hfinfo to NULL after free
* proto.c: switch order of g_free() and g_list_remove() in proto_cleanup()

svn path=/trunk/; revision=29656
2009-09-01 18:16:55 +00:00
Kovarththanan Rajaratnam 2afdee256c Handle a text NULL pointer more gracefully
svn path=/trunk/; revision=29491
2009-08-21 11:31:21 +00:00
Anders Broman 29bb0505d5 NULL is zero on all platforms we run on.
svn path=/trunk/; revision=29003
2009-07-07 22:22:05 +00:00
Anders Broman 0d64273468 Initialize memory to zero.
(is NULL = zero on all platforms?)

svn path=/trunk/; revision=28955
2009-07-06 15:37:29 +00:00
Bill Meier 277980a6c6 From Kovarththanan Rajaratnam: dfilter: Fix for bug #3490:
"Purify reports an uninitialized memory read in dfw_append_const() when
accessing the 'next_const_id' member. This seems to be caused by dfwork_new()
which doesn't properly initialize the member."

svn path=/trunk/; revision=28486
2009-05-26 15:13:19 +00:00
Stig Bjørlykke 62f60df6b4 From Jakub Zawadzki (bug 3331):
g_free() is NULL safe, so we don't need check against it.

svn path=/trunk/; revision=27718
2009-03-13 22:06:48 +00:00
Guy Harris 884a635762 Assign pointers to strings to a const pointer.
svn path=/trunk/; revision=25582
2008-06-24 18:22:26 +00:00