This removes unparsed name resolution during the semantic
check because it feels like a hack to work around limitations
in the language syntax, that should be solved at the lexical
level instead.
We were interpreting unparsed differently on the LHS and RHS.
Now an unparsed value is always a field if it matches a
registered field name (this matches the implementation in 3.6
and before).
This requires tightening a bit the allowed filter names for
protocols to avoid some common and potentially weird conflicting
cases.
Incidentally this extends set grammar to accept all entities.
That is experimental and may be reverted in the future.
For unparsed values on the RHS of a comparison try
to parse them first as a literal and only then as
a protocol. This is more complicated in code but
should be a use case a lot more common and useful in
practice.
It removes some annoying special cases and applies this
rule consistently to any expression. Consistency is
important otherwise the special cases and exceptions
make the language confusing and difficult to learn.
For values on the LHS the rule remains to first try a
protocol value, then a literal.
Related with issue #17731.
A literal value is a value that cannot be interpreted as a
registered protocol. An unparsed value can be a literal or
an identifier (protocol/field) according to context and the
current disambiguation rules.
Strictly literal here is to be understood to mean "numeric
literal, including numeric arrays, but not strings or character
constants".
Invalid character constants should be handled in the lexical scanner.
Todo: See if some code could be shared to parse double quoted strings.
It also fixes some unintuitive type coercions to string. Character
constants should be treated as characters, or maybe integers, or
maybe even throw an invalid comparison error, but coverting to a
literal string or byte array is surprising and not particularly
useful:
'\xFF' -> "'\xFF'" (equals)
'\xFF' -> "FF" (contains)
Before:
Filter: http.request.method contains "\x63"
Constants:
00000 PUT_FVALUE "c" <FT_STRING> -> reg#1
(...)
Filter: http.request.method contains '\x63'
Constants:
00000 PUT_FVALUE "63" <FT_STRING> -> reg#1
(...)
Filter: http.request.method == "\x63"
Constants:
00000 PUT_FVALUE "c" <FT_STRING> -> reg#1
(...)
Filter: http.request.method == '\x63'
Constants:
00000 PUT_FVALUE "'\\x63'" <FT_STRING> -> reg#1
(...)
After:
Filter: http.request.method contains '\x63'
Constants:
00000 PUT_FVALUE "c" <FT_STRING> -> reg#1
(...)
Filter: http.request.method == '\x63'
Constants:
00000 PUT_FVALUE "c" <FT_STRING> -> reg#1
(...)
This reverts commit d635ff4933.
A charconst cannot be a value string, for that reason it is not
redundant with unparsed.
Maybe character constants should be parsed in the lexical scanner
instead.
Before:
Filter: ip.proto == '\g'
dftest: "'\g'" cannot be found among the possible values for ip.proto.
After:
Filter: ip.proto == '\g'
dftest: "'\g'" isn't a valid character constant.
A charconst uses the same semantic rules as unparsed so just
use the latter to avoid redundancies.
We keep the use of TOKEN_CHARCONST as an optimization to avoid
an unnecessary name resolution (lookup for a registered field with
the same name as the charconst).
Use wslog to output debug information. Being able to control
it at runtime is a big advantage.
We extend the syntax tree nodes with a method to return a
canonical string representation.
Add a routine to walk the tree and return an textual representation
for debugging purposes.
Change all wireshark.org URLs to use https.
Fix some broken links while we're at it.
Change-Id: I161bf8eeca43b8027605acea666032da86f5ea1c
Reviewed-on: https://code.wireshark.org/review/34089
Reviewed-by: Guy Harris <guy@alum.mit.edu>
Add an FT_CHAR type, which is like FT_UINT8 except that the value is
displayed as a C-style character constant.
Allow use of C-style character constants in filter expressions; they can
be used in comparisons with all integral types, and in "contains"
operators.
Use that type for some fields that appear (based on the way they're
displayed, or on the use of C-style character constants in their
value_string tables) to be 1-byte characters rather than 8-bit numbers.
Change-Id: I39a9f0dda0bd7f4fa02a9ca8373216206f4d7135
Reviewed-on: https://code.wireshark.org/review/17787
Reviewed-by: Guy Harris <guy@alum.mit.edu>
(Using sed : sed -i '/^ \* \$Id\$/,+1 d')
Fix manually some typo (in export_object_dicom.c and crc16-plain.c)
Change-Id: I4c1ae68d1c4afeace8cb195b53c715cf9e1227a8
Reviewed-on: https://code.wireshark.org/review/497
Reviewed-by: Anders Broman <a.broman58@gmail.com>
Like:
a == b == c
or
a < b <= c <= d < e
Real life example:
6660 <= tcp.port <= 6669
Just syntactic sugar, this is *NOT* optimized.
svn path=/trunk/; revision=43353
they have LF at the end of the line on UN*X and CR/LF on Windows;
hopefully this means that if a CR/LF version is checked in on Windows,
the CRs will be stripped so that they show up only when checked out on
Windows, not on UN*X.
svn path=/trunk/; revision=11400
Besides "STRING", there is now "UNPARSED_STRING", where the distinction
is that "STRING" was a double-quoted string and "UNPARSED_STRING" is just
a sequence of characters that the scanner didn't know how to scan/parse,
so it's up to the Ftype to parse it.
This gives us more flexibility and prepares the dfilter parsing engine
for the upcoming addition of the "contains" operator.
In the process of doing this, I also re-did the double-quoted string
support in the scanner, so that instead of the naively-simple support we
used to have, double-quoted strings now can have embedded dobule-quotes,
embedded octal sequences, and embedded hexadecimal sequences:
"\"" embedded double-quote
"\110" embedded octal
"\x48" embedded hex
Enhance the dfilter unit test script to be able to run a single collection
of tests instead of having to run all of them all the time.
svn path=/trunk/; revision=8083
into epan/ftypes.
Re-write display filter routines using Lemon parser instead of yacc.
Besides using a different tool, the new grammar is much simpler, while
the display filter engine itself is more powerful and more easily extended.
Add dftest executable, to test display filter "bytecode" generation.
Add option to "configure" to build dftest or randpkt, both of which are not
built by default.
Implement Ed Warnicke's ideas about dranges in the new display filter and
ftype code.
Remove type FT_TEXT_ONLY in favor of FT_NONE, and have protocols registered
as FT_PROTOCOL. Thus, FT_NONE is used only for simple labels in the proto tree,
while FT_PROTOCOL is used for protocols. This was necessary for being
able to make byte slices (ranges) out of protocols, like "frame[0:3]"
Win32 Makefile.nmake's will be added tonight.
svn path=/trunk/; revision=2967