wireshark

Commit Graph

Author	SHA1	Message	Date
João Valverde	74a89a9862	dfilter: Minor set grammar cleanup	2021-10-27 11:13:52 +01:00
João Valverde	a7c625808c	dfilter: Add a helper function to create test stnodes	2021-10-27 09:27:45 +01:00
João Valverde	f5fea52982	dfilter: Remove token value from syntax tree Currently unused. This might still be useful to differentiate different spelling of the same token in user messages, like "==" and "eq", but currently we are not storing test tokens anyway, so just remove it, it makes everything simpler. If it's ever necessary it can be added back.	2021-10-27 09:27:45 +01:00
João Valverde	0e4851b025	dfilter: Use a string lval type in scanner Minor change to decouple the AST data structures from the lexical scanner. We pass a structure to allow for some future enhancements.	2021-10-27 09:27:45 +01:00
João Valverde	b1222edcd2	dfilter: Parse ranges in the drange node constructor Using a hand written tokenizer is simpler than using flex start conditions. Do the validation in the drange node constructor. Add validation for malformed ranges with different endpoint signs.	2021-10-27 06:02:07 +00:00
João Valverde	0abe10e040	dfilter: Fix "!=" relation to be free of contradictions Wireshark defines the relation of equality A == B as A any_eq B <=> An == Bn for at least one An, Bn. More accurately I think this is (formally) an equivalence relation, not true equality. Whichever definition for "==" we choose we must keep the definition of "!=" as !(A == B), otherwise it will lead to logical contradictions like (A == B) AND (A != B) being true. Fix the '!=' relation to match the definition of equality: A != B <=> !(A == B) <=> A all_ne B <=> An != Bn, for every n. This has been the recommended way to write "not equal" for a long time in the documentation, even to the point where != was deprecated, but it just wasn't implemented consistently in the language, which has understandably been a persistent source of confusion. Even a field that is normally well-behaved with "!=" like "ip.src" or "ip.dst" will produce unexpected results with encapsulations like IP-over-IP. The opcode ALL_NE could have been implemented in the compiler instead using NOT and ANY_EQ but I chose to implement it in bytecode. It just seemed more elegant and efficient but the difference was not very significant. Keep around "~=" for any_ne relation, in case someone depends on that, and because we don't have an operator for true equality: A strict_equal B <=> A all_eq B <=> !(A any_ne B). If there is only one value then any_ne and all_ne are the same comparison operation. Implementing this change did not require fixing any tests so it is unlikely the relation "~=" (any_ne) will be very useful. Note that the behaviour of the '<' (less than) comparison relation is a separate, more subtle issue. In the general case the definition of '<' that is used is only a partial order.	2021-10-24 06:55:54 +00:00
João Valverde	2e048df011	dfilter: Improve error message for "matches" Should be more obvious that this error is caused by a string syntax error and not something else.	2021-10-18 12:09:36 +00:00
João Valverde	a975d478ba	dfilter: Require double-quoted strings with "matches" Matches is a special case that looks on the RHS and tries to convert every unparsed value to a string, regardless of the LHS type. This is not how types work in the display filter. Require double-quotes to avoid ambiguity, because matches doesn't follow normal Wireshark display filter type rules. It doesn't need nor benefit from the flexibility provided by unparsed strings in the syntax. For matches the RHS is always a literal string except if the RHS is also a field name, then it complains of an incompatible type. This is confusing. No type can be compatible because no type rules are ever considered. Every unparsed value is a text string except if it happens to coincide with a field name it also requires double-quoting or it throws a syntax error, just to be difficult. We could remove this odd quirk but requiring double-quotes for regular expressions is a better, more elegant fix. Before: Filter: tcp matches "udp" Constants: 00000 PUT_PCRE udp -> reg#1 Instructions: 00000 READ_TREE tcp -> reg#0 00001 IF-FALSE-GOTO 3 00002 ANY_MATCHES reg#0 matches reg#1 00003 RETURN Filter: tcp matches udp Constants: 00000 PUT_PCRE udp -> reg#1 Instructions: 00000 READ_TREE tcp -> reg#0 00001 IF-FALSE-GOTO 3 00002 ANY_MATCHES reg#0 matches reg#1 00003 RETURN Filter: tcp matches udp.srcport dftest: tcp and udp.srcport are not of compatible types. Filter: tcp matches udp.srcportt Constants: 00000 PUT_PCRE udp.srcportt -> reg#1 Instructions: 00000 READ_TREE tcp -> reg#0 00001 IF-FALSE-GOTO 3 00002 ANY_MATCHES reg#0 matches reg#1 00003 RETURN After: Filter: tcp matches "udp" Constants: 00000 PUT_PCRE udp -> reg#1 Instructions: 00000 READ_TREE tcp -> reg#0 00001 IF-FALSE-GOTO 3 00002 ANY_MATCHES reg#0 matches reg#1 00003 RETURN Filter: tcp matches udp dftest: "udp" was unexpected in this context. Filter: tcp matches udp.srcport dftest: "udp.srcport" was unexpected in this context. Filter: tcp matches udp.srcportt dftest: "udp.srcportt" was unexpected in this context. The error message could still be improved.	2021-10-17 22:53:36 +00:00
João Valverde	4e5e806604	dfilter: Do not chain matches expressions It is always an error to chain regexes using the logic for "le" and "eq". var matches "regex1" matches "regex2" => var matches "regex1" and "regex1" matches "regex2" Before: Filter: tcp matches "abc$" matches "^cde" dftest: Neither "abc$" nor "^cde" are field or protocol names. Filter: "abc$" matches tcp matches "^cde" dftest: Neither "abc$" nor "tcp" are field or protocol names. After: Filter: tcp matches "abc$" matches "^cde" dftest: "matches" was unexpected in this context. Filter: "abc$" matches tcp matches "^cde" dftest: "matches" was unexpected in this context.	2021-10-17 22:53:36 +00:00
João Valverde	e91b5beafd	dfilter: Resolve field names in the parser The lexical rules for fields and unparsed strings are ambiguous, e.g. "fc" can be the protocol fibre channel or the byte 0xfc. In general a name is determined to be a protocol field or not by checking the registry. Resolving the name in the parser gives more flexibility, for example to use different semantic rules according to the relation between LHS and RHS, and allows function names and protocol names to co-exist without ambiguity. Before: Filter: tcp == 1 Constants: 00000 PUT_FVALUE 01 <FT_PROTOCOL> -> reg#1 Instructions: 00000 READ_TREE tcp -> reg#0 00001 IF-FALSE-GOTO 3 00002 ANY_EQ reg#0 == reg#1 00003 RETURN Filter: tcp() == 1 dftest: Syntax error near "(". After: Filter: tcp == 1 Constants: 00000 PUT_FVALUE 01 <FT_PROTOCOL> -> reg#1 Instructions: (same) Filter: tcp() == 1 dftest: Function 'tcp' does not exist It's also a goal to make it easier to modify the lexer rules. Ping #12810.	2021-10-14 16:45:19 +01:00
João Valverde	2c701ddf6f	dfilter: Improve grammar to parse ranges Do the integer conversion for ranges in the parser. This is more conventional, I think, and allows removing the unnecessary integer syntax tree node type. Try to minimize the number and complexity of lexical rules for ranges. But it seems we need to keep different states for integer and punctuation because of the need to disambiguate the ranges [-n-n] and [-n--n].	2021-10-08 19:18:56 +01:00
João Valverde	92285e6258	dfilter: Improve grammar to parse functions A function is grammatically an identifier that is followed by '(' and ')' according to some rules. We should avoid assuming a token is a function just because it matches a registered function name. Before: Filter: foobar(http.user_agent) contains "UPDATE" dftest: Syntax error near "(". After: Filter: foobar(http.user_agent) contains "UPDATE" dftest: The function 'foobar' does not exist. This has the problem that a function cannot have the same name as a protocol but that limitation already existed before.	2021-10-08 04:01:24 +00:00
João Valverde	7bf02254c1	dfilter: Rename function production rule Make it more obvious that entities are also functions.	2021-10-05 19:19:36 +01:00
João Valverde	a940318f37	dfilter: Minor grammar fixups Clean up syntax error code. TEST and SET are never returned by the tokenizer. Remove unnecessary range_body() grammar element. Fix a comment. Move the stnode_token_value() function to its proper place.	2021-10-05 17:56:21 +01:00
João Valverde	d45ba348fd	dfilter: Strengthen sanity check for range Allow an entity in the grammar as range body. Perform a stronger sanity check during semantic analysis everywhere a range is used. This is both safer (unless we want to allow FIELD bodies only, but functions are allowed too) and also provides better error messages. Previously a range of range only compiled on the RHS. Now it can appear on both sides of a relation. This fixes a crash with STRING entities similar to #10690 for UNPARSED. This also adds back support for slicing functions that was removed in `f3f833ccec` (by accident presumably). Ping #10690	2021-10-05 16:39:41 +01:00
João Valverde	c7dc907d0e	dfilter: Rename some identifiers in grammar Prefer grammar names for readability over C names. Prefer rel_binop to rel_op2. Clean formatting.	2021-10-01 16:58:42 +00:00
João Valverde	2c55bffb41	dfilter: Improve syntax error message Pass simple token value and use it for the error message. This string is freed in the parser destructor.	2021-10-01 16:04:37 +00:00
João Valverde	db18865e55	dfilter: Save token value to syntax tree When parsing we save the token value to the syntax tree. This is useful for better error reporting. Use it to report an invalid entity for the slice operation. Before only the memory location was reported, which is not a good error message. Before: % dftest '"01:02:03:04"[0:3] == foo' Filter: ""01:02:03:04"[0:3] == foo" dftest: Range is not supported for entity <0x7f6c84017740> of type STRING After: % dftest '"01:02:03:04"[0:3] == foo' Filter: ""01:02:03:04"[0:3] == foo" dftest: Range is not supported for entity 01:02:03:04 of type STRING When creating a new node from an old one we need to copy the token value. Simple tokens such as RBRACKET, COMMA and COLON are not part of the AST and don't have an associated semantic value.	2021-10-01 16:04:37 +00:00
João Valverde	b4af7c52a5	dfilter: Add a flags member to the syntax tree node Use it to record "inside parenthesis".	2021-09-30 17:03:55 +00:00
João Valverde	0e50979b3f	Replace g_assert() with ws_assert()	2021-06-19 01:23:31 +00:00
Guy Harris	b61fd6d76a	dfilter, ftypes: get rid of FT_PCRE. It's not a valid field type, it's only a hack to support regular expression matching in packet-matching expressions. Instead, in the packet-matching code, have a separate syntax tree type for Perl-compatible regular expressions, and a separate instruction to load one into a register, and have the "matching" operator for field types take a GRegex * as the second argument.	2021-03-21 03:27:44 -07:00
Guy Harris	7d4b47a073	Further improve that error message. Put the function name in quotes. Change-Id: I09be392a9bac3b56c13b82a554d17ea29695657c Reviewed-on: https://code.wireshark.org/review/31790 Reviewed-by: Guy Harris <guy@alum.mit.edu>	2019-01-28 22:23:45 +00:00
Guy Harris	e8f54b8aed	Fix an error message. I guess "s" in "The function s" was supposed to be "%s", giving the function name. Make it so, and properly fetch the function name. Change-Id: I67287f24626fa0a2816fb2cf574e5d9ff58713bf Reviewed-on: https://code.wireshark.org/review/31787 Reviewed-by: Guy Harris <guy@alum.mit.edu>	2019-01-28 22:08:41 +00:00
Peter Wu	6a45dcd7a2	dfilter: require spaces as set element separator Previously a filter such as `http.request.method in {"GET"HEAD""}` would be parsed as three strings (GET, HEAD and an empty string). As it seems more likely that people make typos rather than intending to construct such a filter, forbid this by always requiring a whitespace separator. Change-Id: I77e531fd6be072f62dd06aac27f856106c8920c6 Reported-by: Stig Bjørlykke Reviewed-on: https://code.wireshark.org/review/26989 Petri-Dish: Peter Wu <peter@lekensteyn.nl> Tested-by: Petri Dish Buildbot Reviewed-by: Anders Broman <a.broman58@gmail.com>	2018-04-18 03:47:58 +00:00
Peter Wu	1ff82572ca	dfilter: add range support to set membership operator ("f in {x .. y}") Allow "tcp.srcport in {1662 1663 1664}" to be abbreviated to "tcp.srcport in {1662 .. 1664}". The range operator is supported for any field value which supports the "<=" and ">=" operators and thus works for integers, IP addresses, etc. The naive mapping "tcp.srcport >= 1662 and tcp.srcport <= 1664" is not used because it does not have the intended effect with fields that have multiple occurrences (e.g. tcp.port). Each condition could be satisfied by another value. Therefore a new DFVM instruction (ANY_IN_RANGE) is added to test the range condition against each individual field value. Bug: 14180 Change-Id: I53c2d0f9bc9d4f0ffaabde9a83442122965c95f7 Reviewed-on: https://code.wireshark.org/review/26945 Petri-Dish: Peter Wu <peter@lekensteyn.nl> Tested-by: Petri Dish Buildbot Reviewed-by: Anders Broman <a.broman58@gmail.com>	2018-04-18 03:47:02 +00:00
Guy Harris	d7fe514fc0	Improve support for single-character fields and filter expressions. Add an FT_CHAR type, which is like FT_UINT8 except that the value is displayed as a C-style character constant. Allow use of C-style character constants in filter expressions; they can be used in comparisons with all integral types, and in "contains" operators. Use that type for some fields that appear (based on the way they're displayed, or on the use of C-style character constants in their value_string tables) to be 1-byte characters rather than 8-bit numbers. Change-Id: I39a9f0dda0bd7f4fa02a9ca8373216206f4d7135 Reviewed-on: https://code.wireshark.org/review/17787 Reviewed-by: Guy Harris <guy@alum.mit.edu>	2016-09-19 02:51:13 +00:00
Stig Bjørlykke	b86e2a3609	Dfilter: Mark an error in %syntax_error Because of a change in lemon the %parse_failure is not always called. Bug: 11637 Change-Id: Iea218aeee10e20f29461169829a10345bbdac903 Reviewed-on: https://code.wireshark.org/review/11302 Petri-Dish: Stig Bjørlykke <stig@bjorlykke.org> Tested-by: Petri Dish Buildbot <buildbot-no-reply@wireshark.org> Tested-by: Jeff Morriss <jeff.morriss.ws@gmail.com> Reviewed-by: Stig Bjørlykke <stig@bjorlykke.org>	2015-10-27 17:22:38 +00:00
Jeffrey Smith	80322d88da	dfilter: Add membership operator Added a new relational test: 'x in {a b c}'. The only LHS entity supported at this time is a field. The generated DFVM operations are equivalent to an OR'ed series of =='s, but with the redundant existence tests removed. Change-Id: Iddc89b81cf7ad6319aef1a2a94f93314cb721a8a Reviewed-on: https://code.wireshark.org/review/10246 Reviewed-by: Hadriel Kaplan <hadrielk@yahoo.com> Petri-Dish: Hadriel Kaplan <hadrielk@yahoo.com> Tested-by: Petri Dish Buildbot <buildbot-no-reply@wireshark.org> Reviewed-by: Michael Mann <mmann78@netscape.net>	2015-09-11 06:31:33 +00:00
Alexis La Goutte	2e1fa634c6	Lemon grammar: fix indent (use tabs) Change-Id: I6fa38d5d85b25ac6c55fcfa67d6c8dba8482cc8c Reviewed-on: https://code.wireshark.org/review/10266 Petri-Dish: Alexis La Goutte <alexis.lagoutte@gmail.com> Tested-by: Petri Dish Buildbot <buildbot-no-reply@wireshark.org> Reviewed-by: Anders Broman <a.broman58@gmail.com>	2015-08-27 04:35:23 +00:00
Guy Harris	cfcbb28671	Clean up ftype-conversion and dfilter error message string handling. Have dfilter_compile() take an additional gchar ** argument, pointing to a gchar * item that, on error, gets set to point to a g_malloc()ed error string. That removes one bit of global state from the display filter parser, and doesn't impose a fixed limit on the error message strings. Have fvalue_from_string() and fvalue_from_unparsed() take a gchar ** argument, pointer to a gchar * item, rather than an error-reporting function, and set the gchar * item to point to a g_malloc()ed error string on an error. Allow either gchar ** argument to be null; if the argument is null, no error message is allocated or provided. Change-Id: Ibd36b8aaa9bf4234aa6efa1e7fb95f7037493b4c Reviewed-on: https://code.wireshark.org/review/6608 Reviewed-by: Guy Harris <guy@alum.mit.edu>	2015-01-18 10:22:59 +00:00
Martin Kaiser	f3f833ccec	display filter: the body of a range should only be a string, a field name or another range - not an unparsed element Bug: 10690 Change-Id: I126143636c940cc73ed6467660f0a573209e2ae9 Reviewed-on: https://code.wireshark.org/review/5243 Reviewed-by: Martin Kaiser <wireshark@kaiser.cx> Tested-by: Martin Kaiser <wireshark@kaiser.cx>	2014-11-17 07:05:35 +00:00
Peter Wu	f2b4daf400	Add printf-format annotations, fix garbage The WRETH dissector showed up some garbage in the column display. Upon further inspection, it turns out that the format string had a trailing percent sign which caused (unsigned)-1 to be returned by g_printf_string_upper_bound (in emem_strdup_vprintf). Then ep_alloc is called with (unsigned)-1 + 1 = 0 memory, no wonder that garbage shows up. ASAN could not even catch this error because EP is in charge of this. So, start adding G_GNUC_PRINTF annotations in each header that uses the "fmt" or "format" parameters (grepped + awk). This revealed some other errors. The NCP2222 dissector was missing a format string (not a security vuln though). Many dissectors used val_to_str with a constant (but empty) string, these have been replaced by val_to_str_const. ASN.1 dissectors were regenerated for this. Minor: the mate plugin used "%X" instead of "%p" for a pointer type. The ncp2222 dissector and wimax plugin gained modelines. Change-Id: I7f3f6a3136116f9b251719830a39a7b21646f622 Reviewed-on: https://code.wireshark.org/review/2881 Reviewed-by: Evan Huus <eapache@gmail.com>	2014-07-06 23:00:40 +00:00
Alexis La Goutte	7d77d753c6	Continue to remove $Id$ from top of file (Using sed :sed -i '/^\/\* \$Id\$ \\//,+0 d') ( / $Id */ ) Change-Id: I46e928d7f2a307c35876ed5d34cb6b7cccfcd6e9 Reviewed-on: https://code.wireshark.org/review/886 Reviewed-by: Anders Broman <a.broman58@gmail.com>	2014-03-31 18:49:26 +00:00
Gerald Combs	465e4664de	Use "(void) <variable>" to avoid unused variable warnings similar to Qt's Q_UNUSED macro. svn path=/trunk/; revision=54110	2013-12-14 23:44:25 +00:00
Jakub Zawadzki	ae59b09443	Add missing includes in order to remove exceptions.h from proto.h (next commit). svn path=/trunk/; revision=53230	2013-11-10 15:59:37 +00:00
Jakub Zawadzki	c6669a3c63	dfilter: report warning if OR and AND logic operands are mixed without parentheses. svn path=/trunk/; revision=51247	2013-08-10 17:49:28 +00:00
Jakub Zawadzki	73aa1e7807	Support drange for functions, last thing from bug #8979 + fix semcheck.c:875: warning: signed and unsigned type in conditional expression svn path=/trunk/; revision=50951	2013-07-27 19:14:34 +00:00
Jeff Morriss	3729335973	We always HAVE_CONFIG_H so don't bother checking whether we have it or not. svn path=/trunk/; revision=45016	2012-09-20 01:48:30 +00:00
Jakub Zawadzki	addf9236dc	Support multiple relation test without logic and (python-like) Like: a == b == c or a < b <= c <= d < e Real life example: 6660 <= tcp.port <= 6669 Just syntactic sugar, this is NOT optimized. svn path=/trunk/; revision=43353	2012-06-19 12:12:41 +00:00
Anders Broman	d1c1455882	Fix warnings svn path=/trunk/; revision=43046	2012-06-03 20:59:41 +00:00
Luis Ontanon	9709011a9b	Implement a proposal from Elefterios Gabriel for SCCP: Add a table of DPCs and SSNs that allows overriding the protocol that would be chosen so that the same SSN can use two different protocols in two different DPCs. I did not believe someone could have done it, then I saw the captures... svn path=/trunk/; revision=21321	2007-04-03 19:08:00 +00:00
Luis Ontanon	22004e8190	productions of non-terminal "sentence" do not generate any value. Avoid a destructor being called for them. see http://www.sqlite.org/cvstrac/tktview?tn=2172 svn path=/trunk/; revision=20460	2007-01-17 16:41:22 +00:00
Gilbert Ramirez	e3899ed4a4	Add infrastructure for display filter functions. Add upper() and lower() display filter functions for string fields. svn path=/trunk/; revision=18071	2006-05-02 14:26:17 +00:00
Jörg Mayer	96adc5f4a1	- Include the .h files in their .c files. - Remove epan/dissectors/packet-sna.h, it isn't used anywhere. svn path=/trunk/; revision=15475	2005-08-20 16:19:22 +00:00
Guy Harris	8a8b883450	Set the svn:eol-style property on all text files to "native", so that they have LF at the end of the line on UN*X and CR/LF on Windows; hopefully this means that if a CR/LF version is checked in on Windows, the CRs will be stripped so that they show up only when checked out on Windows, not on UN*X. svn path=/trunk/; revision=11400	2004-07-18 00:24:25 +00:00
Guy Harris	e1e690ff3a	From Graeme Hewson: Use gint32 instead of guint32 for node data. Fix up some other signed-vs-unsigned issues in the display filter parser and lexical analyzer. svn path=/trunk/; revision=11085	2004-06-03 07:36:25 +00:00
Olivier Biot	1791f84919	First attempt at "bitwise AND" display filter operator. Document how a display operator can be added. svn path=/trunk/; revision=10250	2004-02-27 12:00:32 +00:00
Guy Harris	b9b4a23834	Make an existence test of an arbitrary entity syntactically valid, but check, in the semantics-checking phase, that we're testing a field, so that we can give a better message than, for example, "Unexpected end of filter string." for an existence test with a misspelled field name. svn path=/trunk/; revision=10043	2004-02-11 21:20:52 +00:00
Gilbert Ramirez	55a6251e7c	From Olivier Biot New "matches" operator in display filter language. Uses PCRE. If a "matches" operator is found in a dfilter while libpcre has not been used to build the binary, then an exception is thrown after using dfilter_fail() to set an appropriate error message. svn path=/trunk/; revision=9182	2003-12-06 16:35:20 +00:00
Gilbert Ramirez	52338a3baf	Add a "contains" operator for byte-strings, strings, and tvbuffs (protocols). The search uses a naive approach; more work is required to add a Boyer-Moore Search algorithm. svn path=/trunk/; revision=8280	2003-08-27 15:23:11 +00:00
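
Commits 0abe10e040 and 1ff82572ca in the table above both turn on how a relation is evaluated when a field occurs more than once in a packet. The following is a minimal Python sketch of those semantics, not Wireshark's DFVM code: every helper name is hypothetical, the sample values are made up for illustration, and the model assumes the field is present at least once (the existence test that precedes the comparison is left out).

```python
# Illustrative model only. These helpers are hypothetical and are NOT
# Wireshark's DFVM implementation; they mirror the definitions quoted in
# the commit messages for fields with multiple occurrences.

def any_eq(values, target):
    """A == B: true if at least one field value equals the target."""
    return any(v == target for v in values)

def any_ne(values, target):
    """A ~= B: true if at least one field value differs from the target."""
    return any(v != target for v in values)

def all_ne(values, target):
    """A != B (per 0abe10e040): true only if every value differs."""
    return all(v != target for v in values)

def naive_range(values, low, high):
    """f >= low and f <= high: each bound may be met by a different value."""
    return any(v >= low for v in values) and any(v <= high for v in values)

def any_in_range(values, low, high):
    """f in {low .. high}: one single value must satisfy both bounds."""
    return any(low <= v <= high for v in values)

# IP-over-IP style packet: the field ip.src occurs twice (made-up values).
ip_src = ["10.0.0.1", "192.168.1.1"]

# With any_ne as "not equal", both relations hold at once: the contradiction
# the commit message describes.
print(any_eq(ip_src, "10.0.0.1"), any_ne(ip_src, "10.0.0.1"))

# With all_ne, "!=" is exactly the negation of "==".
print(all_ne(ip_src, "10.0.0.1") == (not any_eq(ip_src, "10.0.0.1")))

# A field with two occurrences (made-up port values).
tcp_port = [80, 50000]

# The naive mapping matches although neither port lies in 1662..1664.
print(naive_range(tcp_port, 1662, 1664), any_in_range(tcp_port, 1662, 1664))
```

With a single-valued field, any_ne and all_ne give the same answer, which matches the commit's remark that the two are the same comparison when there is only one value.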

55 Commits