Revert to the original design of having a single pattern to catch
everything as unparsed and also try to be less hackish and fragile
parsing "..".
Strings like "80..90" are tricky because it can be parsed as
INTEGER DOTDOT INTEGER or FLOAT FLOAT.
Deprecate the usage of significant whitespace to separate set elements
(or anywhere else for that matter). This will make the implementation
simpler and cleaner and the language more expressive and user-friendly.
Wireshark defines the relation of equality A == B as
A any_eq B <=> An == Bn for at least one An, Bn.
More accurately I think this is (formally) an equivalence
relation, not true equality.
Whichever definition for "==" we choose we must keep the
definition of "!=" as !(A == B), otherwise it will
lead to logical contradictions like (A == B) AND (A != B)
being true.
Fix the '!=' relation to match the definition of equality:
A != B <=> !(A == B) <=> A all_ne B <=> An != Bn, for
every n.
This has been the recomended way to write "not equal" for a
long time in the documentation, even to the point where != was
deprecated, but it just wasn't implemented consistently in the
language, which has understandably been a persistent source
of confusion. Even a field that is normally well-behaved
with "!=" like "ip.src" or "ip.dst" will produce unexpected
results with encapsulations like IP-over-IP.
The opcode ALL_NE could have been implemented in the compiler
instead using NOT and ANY_EQ but I chose to implement it in
bytecode. It just seemed more elegant and efficient
but the difference was not very significant.
Keep around "~=" for any_ne relation, in case someone depends
on that, and because we don't have an operator for true equality:
A strict_equal B <=> A all_eq B <=> !(A any_ne B).
If there is only one value then any_ne and all_ne are the same
comparison operation.
Implementing this change did not require fixing any tests so it
is unlikely the relation "~=" (any_ne) will be very useful.
Note that the behaviour of the '<' (less than) comparison relation
is a separate, more subtle issue. In the general case the definition
of '<' that is used is only a partial order.
Matches is a special case that looks on the RHS and tries
to convert every unparsed value to a string, regardless
of the LHS type. This is not how types work in the display
filter. Require double-quotes to avoid ambiguity, because
matches doesn't follow normal Wireshark display filter
type rules. It doesn't need nor benefit from the flexibility
provided by unparsed strings in the syntax.
For matches the RHS is always a literal strings except
if the RHS is also a field name, then it complains of an
incompatible type. This is confusing. No type can be compatible
because no type rules are ever considered. Every unparsed value is
a text string except if it happens to coincide with a field
name it also requires double-quoting or it throws a syntax error,
just to be difficult. We could remove this odd quirk but requiring
double-quotes for regular expressions is a better, more elegant
fix.
Before:
Filter: tcp matches "udp"
Constants:
00000 PUT_PCRE udp -> reg#1
Instructions:
00000 READ_TREE tcp -> reg#0
00001 IF-FALSE-GOTO 3
00002 ANY_MATCHES reg#0 matches reg#1
00003 RETURN
Filter: tcp matches udp
Constants:
00000 PUT_PCRE udp -> reg#1
Instructions:
00000 READ_TREE tcp -> reg#0
00001 IF-FALSE-GOTO 3
00002 ANY_MATCHES reg#0 matches reg#1
00003 RETURN
Filter: tcp matches udp.srcport
dftest: tcp and udp.srcport are not of compatible types.
Filter: tcp matches udp.srcportt
Constants:
00000 PUT_PCRE udp.srcportt -> reg#1
Instructions:
00000 READ_TREE tcp -> reg#0
00001 IF-FALSE-GOTO 3
00002 ANY_MATCHES reg#0 matches reg#1
00003 RETURN
After:
Filter: tcp matches "udp"
Constants:
00000 PUT_PCRE udp -> reg#1
Instructions:
00000 READ_TREE tcp -> reg#0
00001 IF-FALSE-GOTO 3
00002 ANY_MATCHES reg#0 matches reg#1
00003 RETURN
Filter: tcp matches udp
dftest: "udp" was unexpected in this context.
Filter: tcp matches udp.srcport
dftest: "udp.srcport" was unexpected in this context.
Filter: tcp matches udp.srcportt
dftest: "udp.srcportt" was unexpected in this context.
The error message could still be improved.
It won't work with embedded null bytes so don't try. This is
not an additional restriction, it just removes a hidden failure
mode. To support matching embedded NUL bytes we would have
to use an internal string representation other than
null-terminated C strings (which doesn't seem very onerous with
GString).
Before:
Filter: http.user_agent == 41:42:00:43
Constants:
00000 PUT_FVALUE "AB" <FT_STRING> -> reg#1
Instructions:
00000 READ_TREE http.user_agent -> reg#0
00001 IF-FALSE-GOTO 3
00002 ANY_EQ reg#0 == reg#1
00003 RETURN
After:
Filter: http.user_agent == 41:42:00:43
Constants:
00000 PUT_FVALUE "41:42:00:43" <FT_STRING> -> reg#1
Instructions:
00000 READ_TREE http.user_agent -> reg#0
00001 IF-FALSE-GOTO 3
00002 ANY_EQ reg#0 == reg#1
00003 RETURN
FT_PROTOCOL and FT_BYTES are the same semantic type, but one is
backed by a GByteArray and the other by a TVBuff. Use the same
semantic rules to parse both. In particular unparsed strings
are not converted to literal strings for protocols.
Before:
Filter: frame contains 0x0000
Constants:
00000 PUT_FVALUE 30:78:30:30:30:30 <FT_PROTOCOL> -> reg#1
Instructions:
00000 READ_TREE frame -> reg#0
00001 IF-FALSE-GOTO 3
00002 ANY_CONTAINS reg#0 contains reg#1
00003 RETURN
Filter: frame[5:] contains 0x0000
dftest: "0x0000" is not a valid byte string.
After:
Filter: frame contains 0x0000
dftest: "0x0000" is not a valid byte string.
Filter: frame[5:] contains 0x0000
dftest: "0x0000" is not a valid byte string.
Related to #17634.
If we have a STRING value in an expression and a numeric comparison
we must also check if it matches a value string before throwing
a type error.
Add appropriate tests to the test suite.
Fixes 4d2f469212.
Allow an entity in the grammar as range body. Perform a stronger
sanity check during semantic analysis everywhere a range is used.
This is both safer (unless we want to allow FIELD bodies only, but
functions are allowed too) and also provides better error messages.
Previously a range of range only compiled on the RHS. Now it can
appear on both sides of a relation.
This fixes a crash with STRING entities similar to #10690 for
UNPARSED.
This also adds back support for slicing functions that was removed
in f3f833ccec (by accident presumably).
Ping #10690
PEP 394[1] says,
"In cases where the script is expected to be executed outside virtual
environments, developers will need to be aware of the following
discrepancies across platforms and installation methods:
* Older Linux distributions will provide a python command that refers
to Python 2, and will likely not provide a python2 command.
* Some newer Linux distributions will provide a python command that
refers to Python 3.
* Some Linux distributions will not provide a python command at all by
default, but will provide a python3 command by default."
Debian has forced the issue by choosing the third option[2]:
"NOTE: Debian testing (bullseye) has removed the "python" package and
the '/usr/bin/python' symlink due to the deprecation of Python 2."
Switch our shebang from "#!/usr/bin/env python" to "#!/usr/bin/env
python3" in some places. Remove some 2/3 version checks if we know we're
running under Python 3. Remove the "coding: utf-8" in a bunch of places
since that's the default in Python 3.
[1]https://www.python.org/dev/peps/pep-0394/#for-python-script-publishers
[2]https://wiki.debian.org/Python
If a display filter contains a set for the set membership operator and
an error occurs, then gen_relation_in() (called via dfw_gencode() will
not take ownership of the set and a memory leak occurs.
Fix this by implementing a free callback for STTYPE_SET nodes which
frees unclaimed data. Add tests to verify the effectiveness, ASAN no
longer complains after this fix.
Bug: 15442
Change-Id: If37cf047660464b2d0304748034d0bc22111e5d6
Reviewed-on: https://code.wireshark.org/review/31758
Petri-Dish: Peter Wu <peter@lekensteyn.nl>
Tested-by: Petri Dish Buildbot
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
Instead of just reporting a mismatching error code, include the program
output. This should help tracking down unexpected errors. While at it,
check the expected error message too.
Change-Id: Ib8fe51cc06b795bb54bfe1e6eaa828c6ba1128ef
Reviewed-on: https://code.wireshark.org/review/31714
Petri-Dish: Peter Wu <peter@lekensteyn.nl>
Tested-by: Petri Dish Buildbot
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
Do not rely on strptime("%b") to parse the month, it does not correctly
recognize English month abbreviations on non-English systems. While at
it, do not try to parse milliseconds if seconds are missing.
Change-Id: Ia049bf362195eef1eba2f04ff7217049fa6a7d9d
Reviewed-on: https://code.wireshark.org/review/31707
Petri-Dish: Peter Wu <peter@lekensteyn.nl>
Tested-by: Petri Dish Buildbot
Reviewed-by: João Valverde <j@v6e.pt>
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
This function can convert non-string fields into strings. This allows the
user to apply string functions (like contains and matches) to non-string fields.
Examples:
string(frame.number) matches "[13579]$" => for odd frames
string(eth.dst) matches "aa\.bb\.cc\.dd\.ee\..." => to match a group of stations
string(snmp.name) matches "^1.2.3.4" => for all OIDs under a specific node
Change-Id: I18173f50ba5314ecdcd1e4b66c7e8ba5b44257ee
Reviewed-on: https://code.wireshark.org/review/31427
Petri-Dish: Peter Wu <peter@lekensteyn.nl>
Tested-by: Petri Dish Buildbot
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
The buildbot detects random errors on Windows. Log some more details in
order to understand the problem better.
Change-Id: I903457894985273a63b8907b6784a2897cd93d93
Reviewed-on: https://code.wireshark.org/review/31340
Petri-Dish: Peter Wu <peter@lekensteyn.nl>
Tested-by: Petri Dish Buildbot
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
Stop using subprocesstest, drop the (now redundant) DFTestCase base
class and use pytest-style fixtures to inject the dependency on tshark.
This approach makes it easier to switch to pytest in the future.
Most substitutions were automated, so no typos should be present.
Change-Id: I3516029162f87423816937410ff63507ff82e96f
Reviewed-on: https://code.wireshark.org/review/30649
Petri-Dish: Peter Wu <peter@lekensteyn.nl>
Tested-by: Petri Dish Buildbot
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
'double' tests have been disabled in aa03833 due to format change
in ntp fields.
Change-Id: Id3ab0a736c164bb7fdfed7b5da8856b512308978
Reviewed-on: https://code.wireshark.org/review/30366
Petri-Dish: Dario Lombardo <lomato@gmail.com>
Tested-by: Petri Dish Buildbot
Reviewed-by: Anders Broman <a.broman58@gmail.com>
dfilter/group_double tests have been removed and need to be replaced by leveraging
another protocol.
Bug: 15049
Change-Id: I354a27a5217336ee5c9b1d021a2d3226e3532eec
Reviewed-on: https://code.wireshark.org/review/29035
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
Petri-Dish: Peter Wu <peter@lekensteyn.nl>
Tested-by: Petri Dish Buildbot
Reviewed-by: Michael Mann <mmann78@netscape.net>
Change the test suite list in CMakeLists.txt to a static list. Add a
CTest coverage unit test.
Change-Id: I8459f320a2d0707618d6d56abdfce80274fddd2d
Reviewed-on: https://code.wireshark.org/review/27377
Petri-Dish: Gerald Combs <gerald@wireshark.org>
Tested-by: Petri Dish Buildbot
Reviewed-by: Gerald Combs <gerald@wireshark.org>