Commit Graph

10 Commits

Author SHA1 Message Date
João Valverde f78ebe1564 dfilter: Remove deprecated support for whitespace separator in sets 2021-10-31 09:13:18 +00:00
João Valverde c6b68b3ee2 dfilter: Need to check validity of LHS of "matches" expression
Fixes #17690, a crash on a failed assertion.
2021-10-28 16:26:36 +00:00
João Valverde 2183738ef2 dfilter: Add support for comma as set separator
Deprecate the usage of significant whitespace to separate set elements
(or anywhere else for that matter). This will make the implementation
simpler and cleaner and the language more expressive and user-friendly.
2021-10-28 04:11:05 +00:00
João Valverde 0839f05bf7 tests/dfilter: Move deprecated to syntax group 2021-10-27 07:42:23 +00:00
João Valverde 0abe10e040 dfilter: Fix "!=" relation to be free of contradictions
Wireshark defines the relation of equality A == B as
A any_eq B <=> An == Bn for at least one An, Bn.
More accurately I think this is (formally) an equivalence
relation, not true equality.

Whichever definition for "==" we choose we must keep the
definition of "!=" as !(A == B), otherwise it will
lead to logical contradictions like (A == B) AND (A != B)
being true.

Fix the '!=' relation to match the definition of equality:
  A != B <=> !(A == B) <=> A all_ne B <=> An != Bn, for
every n.

This has been the recomended way to write "not equal" for a
long time in the documentation, even to the point where != was
deprecated, but it just wasn't implemented consistently in the
language, which has understandably been a persistent source
of confusion. Even a field that is normally well-behaved
with "!=" like "ip.src" or "ip.dst" will produce unexpected
results with encapsulations like IP-over-IP.

The opcode ALL_NE could have been implemented in the compiler
instead using NOT and ANY_EQ but I chose to implement it in
bytecode. It just seemed more elegant and efficient
but the difference was not very significant.

Keep around "~=" for any_ne relation, in case someone depends
on that, and because we don't have an operator for true equality:
  A strict_equal B <=> A all_eq B <=> !(A any_ne B).
If there is only one value then any_ne and all_ne are the same
comparison operation.

Implementing this change did not require fixing any tests so it
is unlikely the relation "~=" (any_ne) will be very useful.

Note that the behaviour of the '<' (less than) comparison relation
is a separate, more subtle issue. In the general case the definition
of '<' that is used is only a partial order.
2021-10-24 06:55:54 +00:00
João Valverde e8800ff3c4 dfilter: Add a thin encapsulation layer for REs 2021-10-18 12:09:36 +00:00
João Valverde 2e048df011 dfilter: Improve error message for "matches"
Should be more obvious that this error is caused
by a string syntax error and not something else.
2021-10-18 12:09:36 +00:00
João Valverde a975d478ba dfilter: Require double-quoted strings with "matches"
Matches is a special case that looks on the RHS and tries
to convert every unparsed value to a string, regardless
of the LHS type. This is not how types work in the display
filter. Require double-quotes to avoid ambiguity, because
matches doesn't follow normal Wireshark display filter
type rules. It doesn't need nor benefit from the flexibility
provided by unparsed strings in the syntax.

For matches the RHS is always a literal strings except
if the RHS is also a field name, then it complains of an
incompatible type. This is confusing. No type can be compatible
because no type rules are ever considered. Every unparsed value is
a text string except if it happens to coincide with a field
name it also requires double-quoting or it throws a syntax error,
just to be difficult. We could remove this odd quirk but requiring
double-quotes for regular expressions is a better, more elegant
fix.

Before:
  Filter: tcp matches "udp"

  Constants:
  00000 PUT_PCRE	udp -> reg#1

  Instructions:
  00000 READ_TREE		tcp -> reg#0
  00001 IF-FALSE-GOTO	3
  00002 ANY_MATCHES	reg#0 matches reg#1
  00003 RETURN

  Filter: tcp matches udp

  Constants:
  00000 PUT_PCRE	udp -> reg#1

  Instructions:
  00000 READ_TREE		tcp -> reg#0
  00001 IF-FALSE-GOTO	3
  00002 ANY_MATCHES	reg#0 matches reg#1
  00003 RETURN

  Filter: tcp matches udp.srcport
  dftest: tcp and udp.srcport are not of compatible types.

  Filter: tcp matches udp.srcportt

  Constants:
  00000 PUT_PCRE	udp.srcportt -> reg#1

  Instructions:
  00000 READ_TREE		tcp -> reg#0
  00001 IF-FALSE-GOTO	3
  00002 ANY_MATCHES	reg#0 matches reg#1
  00003 RETURN

After:
  Filter: tcp matches "udp"

  Constants:
  00000 PUT_PCRE	udp -> reg#1

  Instructions:
  00000 READ_TREE		tcp -> reg#0
  00001 IF-FALSE-GOTO	3
  00002 ANY_MATCHES	reg#0 matches reg#1
  00003 RETURN

  Filter: tcp matches udp
  dftest: "udp" was unexpected in this context.

  Filter: tcp matches udp.srcport
  dftest: "udp.srcport" was unexpected in this context.

  Filter: tcp matches udp.srcportt
  dftest: "udp.srcportt" was unexpected in this context.

The error message could still be improved.
2021-10-17 22:53:36 +00:00
João Valverde 9d87c4712e dfilter: Fix parsing of value strings
If we have a STRING value in an expression and a numeric comparison
we must also check if it matches a value string before throwing
a type error.

Add appropriate tests to the test suite.

Fixes 4d2f469212.
2021-10-08 18:53:15 +01:00
João Valverde 39036a0a30 dfilter: Add some more syntax tests 2021-10-05 19:19:36 +01:00