Commit Graph

31 Commits

Author SHA1 Message Date
João Valverde f78ebe1564 dfilter: Remove deprecated support for whitespace separator in sets 2021-10-31 09:13:18 +00:00
João Valverde e876d499d1 dfilter: Refactor some scanner patterns
Revert to the original design of having a single pattern to catch
everything as unparsed and also try to be less hackish and fragile
parsing "..".

Strings like "80..90" are tricky because it can be parsed as
INTEGER DOTDOT INTEGER or FLOAT FLOAT.
2021-10-29 17:33:28 +01:00
João Valverde c6b68b3ee2 dfilter: Need to check validity of LHS of "matches" expression
Fixes #17690, a crash on a failed assertion.
2021-10-28 16:26:36 +00:00
João Valverde 2183738ef2 dfilter: Add support for comma as set separator
Deprecate the usage of significant whitespace to separate set elements
(or anywhere else for that matter). This will make the implementation
simpler and cleaner and the language more expressive and user-friendly.
2021-10-28 04:11:05 +00:00
João Valverde 0839f05bf7 tests/dfilter: Move deprecated to syntax group 2021-10-27 07:42:23 +00:00
João Valverde 0abe10e040 dfilter: Fix "!=" relation to be free of contradictions
Wireshark defines the relation of equality A == B as
A any_eq B <=> An == Bn for at least one An, Bn.
More accurately I think this is (formally) an equivalence
relation, not true equality.

Whichever definition for "==" we choose we must keep the
definition of "!=" as !(A == B), otherwise it will
lead to logical contradictions like (A == B) AND (A != B)
being true.

Fix the '!=' relation to match the definition of equality:
  A != B <=> !(A == B) <=> A all_ne B <=> An != Bn, for
every n.

This has been the recomended way to write "not equal" for a
long time in the documentation, even to the point where != was
deprecated, but it just wasn't implemented consistently in the
language, which has understandably been a persistent source
of confusion. Even a field that is normally well-behaved
with "!=" like "ip.src" or "ip.dst" will produce unexpected
results with encapsulations like IP-over-IP.

The opcode ALL_NE could have been implemented in the compiler
instead using NOT and ANY_EQ but I chose to implement it in
bytecode. It just seemed more elegant and efficient
but the difference was not very significant.

Keep around "~=" for any_ne relation, in case someone depends
on that, and because we don't have an operator for true equality:
  A strict_equal B <=> A all_eq B <=> !(A any_ne B).
If there is only one value then any_ne and all_ne are the same
comparison operation.

Implementing this change did not require fixing any tests so it
is unlikely the relation "~=" (any_ne) will be very useful.

Note that the behaviour of the '<' (less than) comparison relation
is a separate, more subtle issue. In the general case the definition
of '<' that is used is only a partial order.
2021-10-24 06:55:54 +00:00
João Valverde e8800ff3c4 dfilter: Add a thin encapsulation layer for REs 2021-10-18 12:09:36 +00:00
João Valverde 2e048df011 dfilter: Improve error message for "matches"
Should be more obvious that this error is caused
by a string syntax error and not something else.
2021-10-18 12:09:36 +00:00
João Valverde a975d478ba dfilter: Require double-quoted strings with "matches"
Matches is a special case that looks on the RHS and tries
to convert every unparsed value to a string, regardless
of the LHS type. This is not how types work in the display
filter. Require double-quotes to avoid ambiguity, because
matches doesn't follow normal Wireshark display filter
type rules. It doesn't need nor benefit from the flexibility
provided by unparsed strings in the syntax.

For matches the RHS is always a literal strings except
if the RHS is also a field name, then it complains of an
incompatible type. This is confusing. No type can be compatible
because no type rules are ever considered. Every unparsed value is
a text string except if it happens to coincide with a field
name it also requires double-quoting or it throws a syntax error,
just to be difficult. We could remove this odd quirk but requiring
double-quotes for regular expressions is a better, more elegant
fix.

Before:
  Filter: tcp matches "udp"

  Constants:
  00000 PUT_PCRE	udp -> reg#1

  Instructions:
  00000 READ_TREE		tcp -> reg#0
  00001 IF-FALSE-GOTO	3
  00002 ANY_MATCHES	reg#0 matches reg#1
  00003 RETURN

  Filter: tcp matches udp

  Constants:
  00000 PUT_PCRE	udp -> reg#1

  Instructions:
  00000 READ_TREE		tcp -> reg#0
  00001 IF-FALSE-GOTO	3
  00002 ANY_MATCHES	reg#0 matches reg#1
  00003 RETURN

  Filter: tcp matches udp.srcport
  dftest: tcp and udp.srcport are not of compatible types.

  Filter: tcp matches udp.srcportt

  Constants:
  00000 PUT_PCRE	udp.srcportt -> reg#1

  Instructions:
  00000 READ_TREE		tcp -> reg#0
  00001 IF-FALSE-GOTO	3
  00002 ANY_MATCHES	reg#0 matches reg#1
  00003 RETURN

After:
  Filter: tcp matches "udp"

  Constants:
  00000 PUT_PCRE	udp -> reg#1

  Instructions:
  00000 READ_TREE		tcp -> reg#0
  00001 IF-FALSE-GOTO	3
  00002 ANY_MATCHES	reg#0 matches reg#1
  00003 RETURN

  Filter: tcp matches udp
  dftest: "udp" was unexpected in this context.

  Filter: tcp matches udp.srcport
  dftest: "udp.srcport" was unexpected in this context.

  Filter: tcp matches udp.srcportt
  dftest: "udp.srcportt" was unexpected in this context.

The error message could still be improved.
2021-10-17 22:53:36 +00:00
João Valverde 07023a7774 tests: Accept a partial string in checkDFilterFail() 2021-10-15 15:10:57 +01:00
João Valverde 00673e22ef tests: Fixup test names 2021-10-15 15:10:54 +01:00
João Valverde c484ad0e5c dfilter: Don't try to parse byte arrays as strings
It won't work with embedded null bytes so don't try. This is
not an additional restriction, it just removes a hidden failure
mode. To support matching embedded NUL bytes we would have
to use an internal string representation other than
null-terminated C strings (which doesn't seem very onerous with
GString).

Before:
  Filter: http.user_agent == 41:42:00:43

  Constants:
  00000 PUT_FVALUE	"AB" <FT_STRING> -> reg#1

  Instructions:
  00000 READ_TREE		http.user_agent -> reg#0
  00001 IF-FALSE-GOTO	3
  00002 ANY_EQ		reg#0 == reg#1
  00003 RETURN

After:
  Filter: http.user_agent == 41:42:00:43

  Constants:
  00000 PUT_FVALUE	"41:42:00:43" <FT_STRING> -> reg#1

  Instructions:
  00000 READ_TREE		http.user_agent -> reg#0
  00001 IF-FALSE-GOTO	3
  00002 ANY_EQ		reg#0 == reg#1
  00003 RETURN
2021-10-15 13:06:51 +01:00
João Valverde 144dc1e2ee dfilter: Use the same semantic rules for protocols and bytes
FT_PROTOCOL and FT_BYTES are the same semantic type, but one is
backed by a GByteArray and the other by a TVBuff. Use the same
semantic rules to parse both. In particular unparsed strings
are not converted to literal strings for protocols.

Before:
  Filter: frame contains 0x0000

  Constants:
  00000 PUT_FVALUE	30:78:30:30:30:30 <FT_PROTOCOL> -> reg#1

  Instructions:
  00000 READ_TREE		frame -> reg#0
  00001 IF-FALSE-GOTO	3
  00002 ANY_CONTAINS	reg#0 contains reg#1
  00003 RETURN

  Filter: frame[5:] contains 0x0000
  dftest: "0x0000" is not a valid byte string.

After:
  Filter: frame contains 0x0000
  dftest: "0x0000" is not a valid byte string.

  Filter: frame[5:] contains 0x0000
  dftest: "0x0000" is not a valid byte string.

Related to #17634.
2021-10-15 13:06:51 +01:00
João Valverde 9d87c4712e dfilter: Fix parsing of value strings
If we have a STRING value in an expression and a numeric comparison
we must also check if it matches a value string before throwing
a type error.

Add appropriate tests to the test suite.

Fixes 4d2f469212.
2021-10-08 18:53:15 +01:00
João Valverde 4a2b18a9c0 dfilter: Skip equality test and add explanation
Also fix a byte typo in the 'eth' filter expression.
2021-10-07 13:21:32 +00:00
João Valverde 39036a0a30 dfilter: Add some more syntax tests 2021-10-05 19:19:36 +01:00
João Valverde d45ba348fd dfilter: Strengthen sanity check for range
Allow an entity in the grammar as range body. Perform a stronger
sanity check during semantic analysis everywhere a range is used.
This is both safer (unless we want to allow FIELD bodies only, but
functions are allowed too) and also provides better error messages.

Previously a range of range only compiled on the RHS. Now it can
appear on both sides of a relation.

This fixes a crash with STRING entities similar to #10690 for
UNPARSED.

This also adds back support for slicing functions that was removed
in f3f833ccec (by accident presumably).

Ping #10690
2021-10-05 16:39:41 +01:00
João Valverde d6836d103d dfilter: Add test for "deprecated" tokens
Tokens that are (so-called) deprecated produce a warning/hint to
the user in the UI.
2021-09-30 17:26:19 +01:00
Gerald Combs 30c392f166 Tools+test: Call python3 explicitly.
PEP 394[1] says,

"In cases where the script is expected to be executed outside virtual
 environments, developers will need to be aware of the following
 discrepancies across platforms and installation methods:

  * Older Linux distributions will provide a python command that refers
    to Python 2, and will likely not provide a python2 command.

  * Some newer Linux distributions will provide a python command that
    refers to Python 3.

  * Some Linux distributions will not provide a python command at all by
    default, but will provide a python3 command by default."

Debian has forced the issue by choosing the third option[2]:

"NOTE: Debian testing (bullseye) has removed the "python" package and
 the '/usr/bin/python' symlink due to the deprecation of Python 2."

Switch our shebang from "#!/usr/bin/env python" to "#!/usr/bin/env
python3" in some places. Remove some 2/3 version checks if we know we're
running under Python 3. Remove the "coding: utf-8" in a bunch of places
since that's the default in Python 3.

[1]https://www.python.org/dev/peps/pep-0394/#for-python-script-publishers
[2]https://wiki.debian.org/Python
2020-11-05 06:46:35 +00:00
Peter Wu eec3ce3bb2 dfilter: fix memory leaks on dfilter compile errors involving a set
If a display filter contains a set for the set membership operator and
an error occurs, then gen_relation_in() (called via dfw_gencode() will
not take ownership of the set and a memory leak occurs.

Fix this by implementing a free callback for STTYPE_SET nodes which
frees unclaimed data. Add tests to verify the effectiveness, ASAN no
longer complains after this fix.

Bug: 15442
Change-Id: If37cf047660464b2d0304748034d0bc22111e5d6
Reviewed-on: https://code.wireshark.org/review/31758
Petri-Dish: Peter Wu <peter@lekensteyn.nl>
Tested-by: Petri Dish Buildbot
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
2019-01-28 11:09:35 +00:00
Peter Wu 198c5a2cac test/dfilter: be explicit with the expected error message
Instead of just reporting a mismatching error code, include the program
output. This should help tracking down unexpected errors. While at it,
check the expected error message too.

Change-Id: Ib8fe51cc06b795bb54bfe1e6eaa828c6ba1128ef
Reviewed-on: https://code.wireshark.org/review/31714
Petri-Dish: Peter Wu <peter@lekensteyn.nl>
Tested-by: Petri Dish Buildbot
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
2019-01-24 18:24:00 +00:00
Peter Wu a946eb3141 ftype-time: parse the month independent of the locale
Do not rely on strptime("%b") to parse the month, it does not correctly
recognize English month abbreviations on non-English systems. While at
it, do not try to parse milliseconds if seconds are missing.

Change-Id: Ia049bf362195eef1eba2f04ff7217049fa6a7d9d
Reviewed-on: https://code.wireshark.org/review/31707
Petri-Dish: Peter Wu <peter@lekensteyn.nl>
Tested-by: Petri Dish Buildbot
Reviewed-by: João Valverde <j@v6e.pt>
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
2019-01-24 09:20:10 +00:00
Dario Lombardo c3d198c401 dfilter: add string() function.
This function can convert non-string fields into strings. This allows the
user to apply string functions (like contains and matches) to non-string fields.

Examples:

string(frame.number) matches "[13579]$" => for odd frames
string(eth.dst) matches "aa\.bb\.cc\.dd\.ee\..." => to match a group of stations
string(snmp.name) matches "^1.2.3.4" => for all OIDs under a specific node

Change-Id: I18173f50ba5314ecdcd1e4b66c7e8ba5b44257ee
Reviewed-on: https://code.wireshark.org/review/31427
Petri-Dish: Peter Wu <peter@lekensteyn.nl>
Tested-by: Petri Dish Buildbot
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
2019-01-14 16:00:29 +00:00
Peter Wu 8dbca7320d test: print command output for dfiltertest failures
The buildbot detects random errors on Windows. Log some more details in
order to understand the problem better.

Change-Id: I903457894985273a63b8907b6784a2897cd93d93
Reviewed-on: https://code.wireshark.org/review/31340
Petri-Dish: Peter Wu <peter@lekensteyn.nl>
Tested-by: Petri Dish Buildbot
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
2019-01-03 20:59:56 +00:00
Peter Wu d631c17eee test: convert suite_dfilter to use fixtures
Stop using subprocesstest, drop the (now redundant) DFTestCase base
class and use pytest-style fixtures to inject the dependency on tshark.
This approach makes it easier to switch to pytest in the future.
Most substitutions were automated, so no typos should be present.

Change-Id: I3516029162f87423816937410ff63507ff82e96f
Reviewed-on: https://code.wireshark.org/review/30649
Petri-Dish: Peter Wu <peter@lekensteyn.nl>
Tested-by: Petri Dish Buildbot
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
2018-11-15 22:57:40 +00:00
Dario Lombardo d4b60271d9 test: make 'double' tests rely on icmp instead of ntp.
'double' tests have been disabled in aa03833 due to format change
in ntp fields.

Change-Id: Id3ab0a736c164bb7fdfed7b5da8856b512308978
Reviewed-on: https://code.wireshark.org/review/30366
Petri-Dish: Dario Lombardo <lomato@gmail.com>
Tested-by: Petri Dish Buildbot
Reviewed-by: Anders Broman <a.broman58@gmail.com>
2018-10-25 04:09:44 +00:00
Dario Lombardo aa038336ce ntp: change root delay and dispersion to integer for fixed precision.
dfilter/group_double tests have been removed and need to be replaced by leveraging
another protocol.

Bug: 15049
Change-Id: I354a27a5217336ee5c9b1d021a2d3226e3532eec
Reviewed-on: https://code.wireshark.org/review/29035
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
Petri-Dish: Peter Wu <peter@lekensteyn.nl>
Tested-by: Petri Dish Buildbot
Reviewed-by: Michael Mann <mmann78@netscape.net>
2018-10-21 23:29:09 +00:00
Dario Lombardo 77b4b938e3 ntp: make ntp.precision an uint8.
Change-Id: I7ee0c7fbe5bab90bd1109b2f39feaec033b95621
Reviewed-on: https://code.wireshark.org/review/29178
Petri-Dish: Dario Lombardo <lomato@gmail.com>
Tested-by: Petri Dish Buildbot
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
2018-09-04 09:05:24 +00:00
Gerald Combs 55304159fc Test: Add UTF-8 filter tests.
Change-Id: Ic1e961802e716b5c446428efa068a6205faab954
Reviewed-on: https://code.wireshark.org/review/27912
Petri-Dish: Gerald Combs <gerald@wireshark.org>
Tested-by: Petri Dish Buildbot
Reviewed-by: Gerald Combs <gerald@wireshark.org>
2018-05-30 21:16:38 +00:00
Gerald Combs f72481a144 Test: Make sure we run our display filter tests.
Change the test suite list in CMakeLists.txt to a static list. Add a
CTest coverage unit test.

Change-Id: I8459f320a2d0707618d6d56abdfce80274fddd2d
Reviewed-on: https://code.wireshark.org/review/27377
Petri-Dish: Gerald Combs <gerald@wireshark.org>
Tested-by: Petri Dish Buildbot
Reviewed-by: Gerald Combs <gerald@wireshark.org>
2018-05-06 23:56:41 +00:00
Gerald Combs 7591ed848e Test: Add dftest to our tests.
Move the dfilter tests and captures from tools to test.

Change-Id: I2e6a6cc1d383c985ba07c76c93ae1c57d3c8f84c
Reviewed-on: https://code.wireshark.org/review/27339
Petri-Dish: Gerald Combs <gerald@wireshark.org>
Tested-by: Petri Dish Buildbot
Reviewed-by: Gerald Combs <gerald@wireshark.org>
2018-05-04 22:44:32 +00:00