Commit Graph

28 Commits

Author SHA1 Message Date
João Valverde b10db887ce dfilter: Remove unparsed syntax type and RHS literal bias
This removes unparsed name resolution during the semantic
check because it feels like a hack to work around limitations
in the language syntax, that should be solved at the lexical
level instead.

We were interpreting unparsed differently on the LHS and RHS.
Now an unparsed value is always a field if it matches a
registered field name (this matches the implementation in 3.6
and before).

This requires tightening a bit the allowed filter names for
protocols to avoid some common and potentially weird conflicting
cases.

Incidentally this extends set grammar to accept all entities.
That is experimental and may be reverted in the future.
2022-07-02 11:18:20 +01:00
João Valverde 47348ae598 dfilter: Add support for literal strings with null bytes
Before:
    Filter: frame matches "abc\x00def"
    dftest: \x00 (NUL byte) cannot be used with a regular string.
    	frame matches "abc\x00def"
    	                  ^~~~
    Filter: _ws.ftypes.string == "a string with a \0 byte"
    dftest: \0 (NUL byte) cannot be used with a regular string.
    	_ws.ftypes.string == "a string with a \0 byte"
    	                                      ^~

After:
    Filter: frame matches "abc\x00def"

    Syntax tree:
     0 TEST_MATCHES:
       1 FIELD(frame)
       1 PCRE(abc\0def)

    Instructions:
    00000 READ_TREE		frame -> reg#0
    00001 IF_FALSE_GOTO	3
    00002 ANY_MATCHES	reg#0 matches abc\0def
    00003 RETURN

    Filter: _ws.ftypes.string == "a string with a \0 byte"

    Syntax tree:
     0 TEST_ANY_EQ:
       1 FIELD(_ws.ftypes.string)
       1 FVALUE("a string with a \0 byte" <FT_STRING>)

    Instructions:
    00000 READ_TREE		_ws.ftypes.string -> reg#0
    00001 IF_FALSE_GOTO	3
    00002 ANY_EQ		reg#0 == "a string with a \0 byte" <FT_STRING>
    00003 RETURN

Fixes issue #16156.
2022-06-21 15:10:08 +00:00
João Valverde a68b408a9f dfilter: Add RHS bias for literal values
For unparsed values on the RHS of a comparison try
to parse them first as a literal and only then as
a protocol. This is more complicated in code but
should be a use case a lot more common and useful in
practice.

It removes some annoying special cases and applies this
rule consistently to any expression. Consistency is
important otherwise the special cases and exceptions
make the language confusing and difficult to learn.

For values on the LHS the rule remains to first try a
protocol value, then a literal.

Related with issue #17731.
2022-03-05 11:10:54 +00:00
João Valverde c4f9d8abda dfilter: Rename "unparsed" to "literal"
A literal value is a value that cannot be interpreted as a
registered protocol. An unparsed value can be a literal or
an identifier (protocol/field) according to context and the
current disambiguation rules.

Strictly literal here is to be understood to  mean "numeric
literal, including numeric arrays, but not strings or character
constants".
2022-03-05 11:10:54 +00:00
João Valverde 943c282009 dfilter: Parse character constants in lexer
Invalid character constants should be handled in the lexical scanner.

Todo: See if some code could be shared to parse double quoted strings.

It also fixes some unintuitive type coercions to string. Character
constants should be treated as characters, or maybe integers, or
maybe even throw an invalid comparison error, but coverting to a
literal string or byte array is surprising and not particularly
useful:
  '\xFF' -> "'\xFF'" (equals)
  '\xFF' -> "FF"     (contains)

Before:

    Filter: http.request.method contains "\x63"

    Constants:
    00000 PUT_FVALUE	"c" <FT_STRING> -> reg#1
    (...)

    Filter: http.request.method contains '\x63'

    Constants:
    00000 PUT_FVALUE	"63" <FT_STRING> -> reg#1
    (...)

    Filter: http.request.method == "\x63"

    Constants:
    00000 PUT_FVALUE	"c" <FT_STRING> -> reg#1
    (...)

    Filter: http.request.method == '\x63'

    Constants:
    00000 PUT_FVALUE	"'\\x63'" <FT_STRING> -> reg#1
    (...)

After:

    Filter: http.request.method contains '\x63'

    Constants:
    00000 PUT_FVALUE	"c" <FT_STRING> -> reg#1
    (...)

    Filter: http.request.method == '\x63'

    Constants:
    00000 PUT_FVALUE	"c" <FT_STRING> -> reg#1
    (...)
2021-11-24 08:40:20 +00:00
João Valverde 7028646f9e dfilter: Fix invalid character constant error message
This reverts commit d635ff4933.

A charconst cannot be a value string, for that reason it is not
redundant with unparsed.

Maybe character constants should be parsed in the lexical scanner
instead.

Before:
  Filter: ip.proto == '\g'
  dftest: "'\g'" cannot be found among the possible values for ip.proto.

After:
  Filter: ip.proto == '\g'
  dftest: "'\g'" isn't a valid character constant.
2021-11-23 17:35:40 +00:00
João Valverde b62d4b8eca dfilter: Change string node display representation again
Adding double quotes to the display output format was probably a mistake.
2021-11-10 03:19:24 +00:00
João Valverde e7ecc9b9e5 dfilter: Clean up error format and exception code
Misc code cleanups. Add some extra stnode functions for increased type
safety. Fix a constness issue with df_lval_value().
2021-11-10 03:18:50 +00:00
João Valverde 2d45cb0881 dfilter: Improve some error messages 2021-11-06 11:45:21 +00:00
João Valverde d635ff4933 dfilter: Remove redundant STTYPE_CHARCONST syntax node
A charconst uses the same semantic rules as unparsed so just
use the latter to avoid redundancies.

We keep the use of TOKEN_CHARCONST as an optimization to avoid
an unnecessary name resolution (lookup for a registered field with
the same name as the charconst).
2021-10-31 20:33:31 +00:00
João Valverde c6b68b3ee2 dfilter: Need to check validity of LHS of "matches" expression
Fixes #17690, a crash on a failed assertion.
2021-10-28 16:26:36 +00:00
João Valverde db04d188e1 Remove some unnecessary casts.
Casts are best avoided unless they are truly required. Fix some
constness mismatches this revealed.
2021-10-27 10:24:20 +01:00
João Valverde 07371d4557 dfilter: Split tostr() into debug and pretty print 2021-10-11 21:55:45 +00:00
João Valverde 3ea2a61f2a dfilter: Display syntax tree for debugging
Use wslog to output debug information. Being able to control
it at runtime is a big advantage.

We extend the syntax tree nodes with a method to return a
canonical string representation.

Add a routine to walk the tree and return an textual representation
for debugging purposes.
2021-09-30 16:29:11 +01:00
Guy Harris 20800366dd HTTPS (almost) everywhere.
Change all wireshark.org URLs to use https.

Fix some broken links while we're at it.

Change-Id: I161bf8eeca43b8027605acea666032da86f5ea1c
Reviewed-on: https://code.wireshark.org/review/34089
Reviewed-by: Guy Harris <guy@alum.mit.edu>
2019-07-26 18:44:40 +00:00
Dario Lombardo 55c68ee69c epan: use SPDX indentifiers.
Skipping dissectors dir for now.

Change-Id: I717b66bfbc7cc81b83f8c2cbc011fcad643796aa
Reviewed-on: https://code.wireshark.org/review/25694
Petri-Dish: Dario Lombardo <lomato@gmail.com>
Tested-by: Petri Dish Buildbot
Reviewed-by: Anders Broman <a.broman58@gmail.com>
2018-02-08 19:29:45 +00:00
Guy Harris d7fe514fc0 Improve support for single-character fields and filter expressions.
Add an FT_CHAR type, which is like FT_UINT8 except that the value is
displayed as a C-style character constant.

Allow use of C-style character constants in filter expressions; they can
be used in comparisons with all integral types, and in "contains"
operators.

Use that type for some fields that appear (based on the way they're
displayed, or on the use of C-style character constants in their
value_string tables) to be 1-byte characters rather than 8-bit numbers.

Change-Id: I39a9f0dda0bd7f4fa02a9ca8373216206f4d7135
Reviewed-on: https://code.wireshark.org/review/17787
Reviewed-by: Guy Harris <guy@alum.mit.edu>
2016-09-19 02:51:13 +00:00
Bill Meier 3e3fc9fc5e epan/dfilter/*.c: As needed: Add editor modelines & Fix indentation
Change-Id: I410839329a98bd806c60961dfb9693d5eeeeb702
Reviewed-on: https://code.wireshark.org/review/7104
Reviewed-by: Bill Meier <wmeier@newsguy.com>
2015-02-13 19:04:44 +00:00
Alexis La Goutte 296591399f Remove all $Id$ from top of file
(Using sed : sed -i '/^ \* \$Id\$/,+1 d')

Fix manually some typo (in export_object_dicom.c and crc16-plain.c)

Change-Id: I4c1ae68d1c4afeace8cb195b53c715cf9e1227a8
Reviewed-on: https://code.wireshark.org/review/497
Reviewed-by: Anders Broman <a.broman58@gmail.com>
2014-03-04 14:27:33 +00:00
Jakub Zawadzki bf81b42e1e Update Free Software Foundation address.
(COPYING will be updated in next commit)

svn path=/trunk/; revision=43536
2012-06-28 22:56:06 +00:00
Jakub Zawadzki addf9236dc Support multiple relation test without logic and (python-like)
Like: 
  a == b == c 
  or 
  a < b <= c <= d < e 

Real life example:
  6660 <= tcp.port <= 6669

Just syntactic sugar, this is *NOT* optimized.

svn path=/trunk/; revision=43353
2012-06-19 12:12:41 +00:00
Stig Bjørlykke 6d4a2e7ebf Changed email address for Gerald from zing.org to wireshark.org
in a lot of files, which I suppose is correct.

svn path=/trunk/; revision=24034
2008-01-08 22:54:51 +00:00
Ronnie Sahlberg 89f022b12b name change
svn path=/trunk/; revision=18197
2006-05-21 05:12:17 +00:00
Guy Harris 8a8b883450 Set the svn:eol-style property on all text files to "native", so that
they have LF at the end of the line on UN*X and CR/LF on Windows;
hopefully this means that if a CR/LF version is checked in on Windows,
the CRs will be stripped so that they show up only when checked out on
Windows, not on UN*X.

svn path=/trunk/; revision=11400
2004-07-18 00:24:25 +00:00
Gilbert Ramirez 086774b71f Add to the fundamental types passed between the scanner and the parser.
Besides "STRING", there is now "UNPARSED_STRING", where the distinction
is that "STRING" was a double-quoted string and "UNPARSED_STRING" is just
a sequence of characters that the scanner didn't know how to scan/parse,
so it's up to the Ftype to parse it.

This gives us more flexibility and prepares the dfilter parsing engine
for the upcoming addition of the "contains" operator.

In the process of doing this, I also re-did the double-quoted string
support in the scanner, so that instead of the naively-simple support we
used to have, double-quoted strings now can have embedded dobule-quotes,
embedded octal sequences, and embedded hexadecimal sequences:
    "\""    embedded double-quote
    "\110"  embedded octal
    "\x48"  embedded hex

Enhance the dfilter unit test script to be able to run a single collection
of tests instead of having to run all of them all the time.

svn path=/trunk/; revision=8083
2003-07-25 03:44:05 +00:00
Jörg Mayer 48be4e530d Removed trailing whitespaces from .h and .c files using the
winapi_cleanup tool written by Patrik Stridvall for the wine
project.

svn path=/trunk/; revision=6116
2002-08-28 20:41:00 +00:00
Gilbert Ramirez 96e0398fc6 Grumble, grumble. I forgot to add the license comment at the top
of these files.

svn path=/trunk/; revision=2968
2001-02-01 20:31:21 +00:00
Gilbert Ramirez 8f1fff2e6a Create a more modular type system for the FT_* types. Put them
into epan/ftypes.

Re-write display filter routines using Lemon parser instead of yacc.
Besides using a different tool, the new grammar is much simpler, while
the display filter engine itself is more powerful and more easily extended.

Add dftest executable, to test display filter "bytecode" generation.
Add option to "configure" to build dftest or randpkt, both of which are not
built by default.

Implement Ed Warnicke's ideas about dranges in the new display filter and
ftype code.

Remove type FT_TEXT_ONLY in favor of FT_NONE, and have protocols registered
as FT_PROTOCOL. Thus, FT_NONE is used only for simple labels in the proto tree,
while FT_PROTOCOL is used for protocols. This was necessary for being
able to make byte slices (ranges) out of protocols, like "frame[0:3]"

Win32 Makefile.nmake's will be added tonight.

svn path=/trunk/; revision=2967
2001-02-01 20:21:25 +00:00