wireshark

osmith

Author	SHA1	Message	Date
João Valverde	c98df5eef5	dfilter: Print syntax tree using dftest + format enhancements Add argument to dfilter_compile_real() to save syntax tree text representation. Use it with dftest to print syntax tree. Misc debug output format improvements.	2022-04-05 12:04:37 +01:00
João Valverde	5cd0e4cc97	dfilter: Fix use after free with references By the time we are using the reference fvalue the tree may have gone away and with it the fvalue. We need to duplicate the reference fvalues and take ownership of the memory.	2022-03-30 14:05:22 +01:00
João Valverde	260942e170	dfilter: Refactor macro tree references This replaces the current macro reference system with a completely different implementation. Instead of a macro a reference is a syntax element. A reference is a constant that can be filled in the dfilter code after compilation from an existing protocol tree. It is best understood as a field value that can be read from a fixed tree that is not the frame being filtered. Usually this fixed tree is the currently selected frame when the filter is applied. This allows comparing fields in the filtered frame with fields in the selected frame. Because the field reference syntax uses the same sigil notation as a macro we have to use a heuristic to distinguish them: if the name has a dot it is a field reference, otherwise it is a macro name. The reference is syntactically validated at compile time. There are two main advantages to this implementation (and a couple of minor ones): The protocol tree for each selected frame is only walked if we have a display filter and if the display filter uses references. Also only the actual reference values are copied, instead of loading the entire tree into a hash table (in textual form even). The other advantage is that the reference is tested like a protocol field against all the values in the selected frame (if there is more than one). Currently the reference fields are not "primed" during dissection, so the entire tree is walked to find a particular reference (this is similar to the previous implementation). If the display filter contains a valid reference and the reference is not loaded at the time the filter is run the result is the same as a non existing field for a regular READ_TREE instruction. Fixes #17599.	2022-03-29 12:36:31 +00:00
João Valverde	9ee9b40b64	dfilter: Store expanded text	2022-03-28 17:22:01 +01:00
João Valverde	16729be2c1	dfilter: Add bitwise masking of bits Add support for masking of bits. Before the bitwise operator could only test bits, it did not support clearing bits. This allows testing if any combination of bits are set/unset more naturally with a single test. Previously this was only possible by combining several bitwise predicates. Bitwise is implemented as a test node, even though it is not. Maybe the test node should be renamed to something else. Fixes #17246.	2022-03-22 12:58:04 +00:00
João Valverde	631cf34f0c	dfilter: Use a function pointer array to free registers	2022-03-21 18:43:36 +00:00
João Valverde	94d909103e	dfilter: Remove DFVM constant initialization	2022-03-21 17:09:43 +00:00
João Valverde	22f3d87a8f	dfilter: Use singly linked list for registers Replace calls to list append with list prepend where applicable.	2022-03-21 11:47:19 +00:00
João Valverde	6d520addd1	dfilter: Add special syntax for literals and names The syntax for protocols and some literals like numbers and bytes/addresses can be ambiguous. Some protocols can be parsed as a literal, for example the protocol "fc" (Fibre Channel) can be parsed as 0xFC. If a numeric protocol is registered that will also take precedence over any literal, according to the current rules, thereby breaking numerical comparisons to that number. The same for a hypothetical protocol named "true", etc. To allow the user to disambiguate this meaning introduce new syntax. Any value prefixed with ':' or enclosed in <,> will be treated as a literal value only. The value :fc or <fc> will always mean 0xFC, under any context. Never a protocol whose filter name is "fc". Likewise any value prefixed with a dot will always be parsed as an identifier (protocol or protocol field) in the language. Never any literal value parsed from the token "fc". This allows the user to be explicit about the meaning, and between the two explicit methods plus the ambiguous one it doesn't completely break any one meaning. The difference can be seen in the following two programs: Filter: frame == fc Constants: Instructions: 00000 READ_TREE frame -> reg#0 00001 IF-FALSE-GOTO 5 00002 READ_TREE fc -> reg#1 00003 IF-FALSE-GOTO 5 00004 ANY_EQ reg#0 == reg#1 00005 RETURN -------- Filter: frame == :fc Constants: 00000 PUT_FVALUE fc <FT_PROTOCOL> -> reg#1 Instructions: 00000 READ_TREE frame -> reg#0 00001 IF-FALSE-GOTO 3 00002 ANY_EQ reg#0 == reg#1 00003 RETURN The filter "frame == fc" is the same as "frame == .fc", according to the current heuristic, except the first form will try to parse it as a literal if the name does not correspond to any registered protocol. By treating a leading dot as a name in the language we necessarily disallow writing floats with a leading dot. We will also disallow writing with an ending dot when using unparsed values. This is a backward incompatibility but has the happy side effect of making the expression {1...2} unambiguous. This could either mean "1 .. .2" or "1. .. 2". If we require a leading and ending digit then the meaning is clear: 1.0..0.2 -> 1.0 .. 0.2 Fixes #17731.	2022-03-05 11:10:54 +00:00
João Valverde	647decd509	dfilter: Avoid double strdup to save token value Store the lval token value instead.	2021-12-01 19:42:51 +00:00
João Valverde	3e0806ca09	dfilter: Remove dfilter_fail_parse() Instead of requiring a special error function in the parser just set the syntax_error flag if an error occurs, in any stage of compilation. Outside of the parser loop it will not be used but that is fine.	2021-11-30 19:52:05 +00:00
Moshe Kaplan	a523135202	epan: Add header files to Doxygen Add @file markers for epan headers so that Doxygen will generate documentation for them.	2021-11-30 08:46:49 +00:00
João Valverde	943c282009	dfilter: Parse character constants in lexer Invalid character constants should be handled in the lexical scanner. Todo: See if some code could be shared to parse double quoted strings. It also fixes some unintuitive type coercions to string. Character constants should be treated as characters, or maybe integers, or maybe even throw an invalid comparison error, but converting to a literal string or byte array is surprising and not particularly useful: '\xFF' -> "'\xFF'" (equals) '\xFF' -> "FF" (contains) Before: Filter: http.request.method contains "\x63" Constants: 00000 PUT_FVALUE "c" <FT_STRING> -> reg#1 (...) Filter: http.request.method contains '\x63' Constants: 00000 PUT_FVALUE "63" <FT_STRING> -> reg#1 (...) Filter: http.request.method == "\x63" Constants: 00000 PUT_FVALUE "c" <FT_STRING> -> reg#1 (...) Filter: http.request.method == '\x63' Constants: 00000 PUT_FVALUE "'\\x63'" <FT_STRING> -> reg#1 (...) After: Filter: http.request.method contains '\x63' Constants: 00000 PUT_FVALUE "c" <FT_STRING> -> reg#1 (...) Filter: http.request.method == '\x63' Constants: 00000 PUT_FVALUE "c" <FT_STRING> -> reg#1 (...)	2021-11-24 08:40:20 +00:00
João Valverde	e7ecc9b9e5	dfilter: Clean up error format and exception code Misc code cleanups. Add some extra stnode functions for increased type safety. Fix a constness issue with df_lval_value().	2021-11-10 03:18:50 +00:00
João Valverde	146a840ad1	dfilter: Move a constructor to the grammar file	2021-11-06 11:45:21 +00:00
João Valverde	fb490eb172	dfilter: Move regex creation to semcheck	2021-11-06 11:45:21 +00:00
João Valverde	f78ebe1564	dfilter: Remove deprecated support for whitespace separator in sets	2021-10-31 09:13:18 +00:00
João Valverde	0e4851b025	dfilter: Use a string lval type in scanner Minor change to decouple the AST data structures from the lexical scanner. We pass a structure to allow for some future enhancements.	2021-10-27 09:27:45 +01:00
João Valverde	b1222edcd2	dfilter: Parse ranges in the drange node constructor Using a hand written tokenizer is simpler than using flex start conditions. Do the validation in the drange node constructor. Add validation for malformed ranges with different endpoint signs.	2021-10-27 06:02:07 +00:00
João Valverde	2e048df011	dfilter: Improve error message for "matches" Should be more obvious that this error is caused by a string syntax error and not something else.	2021-10-18 12:09:36 +00:00
João Valverde	a975d478ba	dfilter: Require double-quoted strings with "matches" Matches is a special case that looks on the RHS and tries to convert every unparsed value to a string, regardless of the LHS type. This is not how types work in the display filter. Require double-quotes to avoid ambiguity, because matches doesn't follow normal Wireshark display filter type rules. It doesn't need nor benefit from the flexibility provided by unparsed strings in the syntax. For matches the RHS is always a literal string except if the RHS is also a field name, then it complains of an incompatible type. This is confusing. No type can be compatible because no type rules are ever considered. Every unparsed value is a text string except if it happens to coincide with a field name it also requires double-quoting or it throws a syntax error, just to be difficult. We could remove this odd quirk but requiring double-quotes for regular expressions is a better, more elegant fix. Before: Filter: tcp matches "udp" Constants: 00000 PUT_PCRE udp -> reg#1 Instructions: 00000 READ_TREE tcp -> reg#0 00001 IF-FALSE-GOTO 3 00002 ANY_MATCHES reg#0 matches reg#1 00003 RETURN Filter: tcp matches udp Constants: 00000 PUT_PCRE udp -> reg#1 Instructions: 00000 READ_TREE tcp -> reg#0 00001 IF-FALSE-GOTO 3 00002 ANY_MATCHES reg#0 matches reg#1 00003 RETURN Filter: tcp matches udp.srcport dftest: tcp and udp.srcport are not of compatible types. Filter: tcp matches udp.srcportt Constants: 00000 PUT_PCRE udp.srcportt -> reg#1 Instructions: 00000 READ_TREE tcp -> reg#0 00001 IF-FALSE-GOTO 3 00002 ANY_MATCHES reg#0 matches reg#1 00003 RETURN After: Filter: tcp matches "udp" Constants: 00000 PUT_PCRE udp -> reg#1 Instructions: 00000 READ_TREE tcp -> reg#0 00001 IF-FALSE-GOTO 3 00002 ANY_MATCHES reg#0 matches reg#1 00003 RETURN Filter: tcp matches udp dftest: "udp" was unexpected in this context. Filter: tcp matches udp.srcport dftest: "udp.srcport" was unexpected in this context. Filter: tcp matches udp.srcportt dftest: "udp.srcportt" was unexpected in this context. The error message could still be improved.	2021-10-17 22:53:36 +00:00
João Valverde	0d3bfedfb0	dfilter: Fixup deprecated tokens initialization Always use the internal API to access "deprecated" and initialize the data structure on demand. This fixes a null pointer dereference introduced previously. Use reference counting to share the array cleanly and avoid memory leaks. Keep the pointer in dfwork_t.	2021-10-14 16:49:23 +01:00
João Valverde	e91b5beafd	dfilter: Resolve field names in the parser The lexical rules for fields and unparsed strings are ambiguous, e.g. "fc" can be the protocol fibre channel or the byte 0xfc. In general a name is determined to be a protocol field or not by checking the registry. Resolving the name in the parser gives more flexibility, for example to use different semantic rules according to the relation between LHS and RHS, and allows function names and protocol names to co-exist without ambiguity. Before: Filter: tcp == 1 Constants: 00000 PUT_FVALUE 01 <FT_PROTOCOL> -> reg#1 Instructions: 00000 READ_TREE tcp -> reg#0 00001 IF-FALSE-GOTO 3 00002 ANY_EQ reg#0 == reg#1 00003 RETURN Filter: tcp() == 1 dftest: Syntax error near "(". After: Filter: tcp == 1 Constants: 00000 PUT_FVALUE 01 <FT_PROTOCOL> -> reg#1 Instructions: (same) Filter: tcp() == 1 dftest: Function 'tcp' does not exist It's also a goal to make it easier to modify the lexer rules. Ping #12810.	2021-10-14 16:45:19 +01:00
João Valverde	2c701ddf6f	dfilter: Improve grammar to parse ranges Do the integer conversion for ranges in the parser. This is more conventional, I think, and allows removing the unnecessary integer syntax tree node type. Try to minimize the number and complexity of lexical rules for ranges. But it seems we need to keep different states for integer and punctuation because of the need to disambiguate the ranges [-n-n] and [-n--n].	2021-10-08 19:18:56 +01:00
João Valverde	92285e6258	dfilter: Improve grammar to parse functions A function is grammatically an identifier that is followed by '(' and ')' according to some rules. We should avoid assuming a token is a function just because it matches a registered function name. Before: Filter: foobar(http.user_agent) contains "UPDATE" dftest: Syntax error near "(". After: Filter: foobar(http.user_agent) contains "UPDATE" dftest: The function 'foobar' does not exist. This has the problem that a function cannot have the same name as a protocol but that limitation already existed before.	2021-10-08 04:01:24 +00:00
João Valverde	0e7ba54d98	dfilter: Clean up handling of "deprecated" tokens Pass the deprecated data structure to the scanner and insert the deprecated tokens there. This avoids having to keep a dedicated syntax node field for this. Pass the deprecated argument in dfwork_t instead of in a separate argument. This is less cumbersome than adding an extra argument to every level of the semantic checker.	2021-09-30 17:26:19 +01:00
João Valverde	3ea2a61f2a	dfilter: Display syntax tree for debugging Use wslog to output debug information. Being able to control it at runtime is a big advantage. We extend the syntax tree nodes with a method to return a canonical string representation. Add a routine to walk the tree and return a textual representation for debugging purposes.	2021-09-30 16:29:11 +01:00
João Valverde	85c257431f	dfilter: Add support for raw strings Add support for a literal string specification copied from Python raw strings[1]. Raw string literals are enclosed with r"..." or R"...". Double quotes can be included in the string but they must be escaped with backslash. In escape sequences backslashes are preserved in the final result. So for example the string "a\\\"b" is the same as r"a\"b". r"\\\a" is the same as "\\\\\\a". Raw strings should be used for convenience wherever a regular expression is used in a display filter expression. [1]https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals	2021-06-05 02:46:40 +01:00
Peter Wu	6144951380	dfilter: fix memleaks with functions and slice operator Running tools/dfilter-test.py with LSan enabled resulted in 38 test failures due to memory leaks from "fvalue_new". Problematic dfilters: - Return values from functions, e.g. `len(data.data) > 8` (instruction CALL_FUNCTION invoking functions from epan/dfilter/dfunctions.c) - Slice operator: `data.data[1:2] == aa:bb` (function mk_range) These values end up in "registers", but as some values (from READ_TREE) reference the proto tree, a new tracking flag ("owns_memory") is added. Add missing tests for some functions and try to improve documentation. Change-Id: I28e8cf872675d0a81ea7aa5fac7398257de3f47b Reviewed-on: https://code.wireshark.org/review/27132 Petri-Dish: Peter Wu <peter@lekensteyn.nl> Reviewed-by: Peter Wu <peter@lekensteyn.nl> Tested-by: Petri Dish Buildbot Reviewed-by: Anders Broman <a.broman58@gmail.com>	2018-04-25 06:57:00 +00:00
Peter Wu	6a45dcd7a2	dfilter: require spaces as set element separator Previously a filter such as `http.request.method in {"GET"HEAD""}` would be parsed as three strings (GET, HEAD and an empty string). As it seems more likely that people make typos rather than intending to construct such a filter, forbid this by always requiring a whitespace separator. Change-Id: I77e531fd6be072f62dd06aac27f856106c8920c6 Reported-by: Stig Bjørlykke Reviewed-on: https://code.wireshark.org/review/26989 Petri-Dish: Peter Wu <peter@lekensteyn.nl> Tested-by: Petri Dish Buildbot Reviewed-by: Anders Broman <a.broman58@gmail.com>	2018-04-18 03:47:58 +00:00
Dario Lombardo	55c68ee69c	epan: use SPDX identifiers. Skipping dissectors dir for now. Change-Id: I717b66bfbc7cc81b83f8c2cbc011fcad643796aa Reviewed-on: https://code.wireshark.org/review/25694 Petri-Dish: Dario Lombardo <lomato@gmail.com> Tested-by: Petri Dish Buildbot Reviewed-by: Anders Broman <a.broman58@gmail.com>	2018-02-08 19:29:45 +00:00
Guy Harris	59816ef00c	Make the Flex scanners and YACC parser in libraries reentrant. master-branch libpcap now generates a reentrant Flex scanner and Bison/Berkeley YACC parser for capture filter expressions, so it requires versions of Flex and Bison/Berkeley YACC that support that. We might as well do the same. For libwiretap, it means we could actually have multiple K12 text or Ascend/Lucent text files open at the same time. For libwireshark, it might not be as useful, as we only read configuration files at startup (which should only happen once, in one thread) or on demand (in which case, if we ever support multiple threads running libwireshark, we'd need a mutex to ensure that only one file reads it), but it's still the right thing to do. We also require a version of Flex that can write out a header file, so we change the runlex script to generate the header file ourselves. This means we require a version of Flex new enough to support --header-file. Clean up some other stuff encountered in the process. Change-Id: Id23078c6acea549a52fc687779bb55d715b55c16 Reviewed-on: https://code.wireshark.org/review/14719 Reviewed-by: Guy Harris <guy@alum.mit.edu>	2016-04-03 22:21:29 +00:00
Guy Harris	cfcbb28671	Clean up ftype-conversion and dfilter error message string handling. Have dfilter_compile() take an additional gchar ** argument, pointing to a gchar * item that, on error, gets set to point to a g_malloc()ed error string. That removes one bit of global state from the display filter parser, and doesn't impose a fixed limit on the error message strings. Have fvalue_from_string() and fvalue_from_unparsed() take a gchar ** argument, pointer to a gchar * item, rather than an error-reporting function, and set the gchar * item to point to a g_malloc()ed error string on an error. Allow either gchar ** argument to be null; if the argument is null, no error message is allocated or provided. Change-Id: Ibd36b8aaa9bf4234aa6efa1e7fb95f7037493b4c Reviewed-on: https://code.wireshark.org/review/6608 Reviewed-by: Guy Harris <guy@alum.mit.edu>	2015-01-18 10:22:59 +00:00
Peter Wu	f2b4daf400	Add printf-format annotations, fix garbage The WRETH dissector showed up some garbage in the column display. Upon further inspection, it turns out that the format string had a trailing percent sign which caused (unsigned)-1 to be returned by g_printf_string_upper_bound (in emem_strdup_vprintf). Then ep_alloc is called with (unsigned)-1 + 1 = 0 memory, no wonder that garbage shows up. ASAN could not even catch this error because EP is in charge of this. So, start adding G_GNUC_PRINTF annotations in each header that uses the "fmt" or "format" parameters (grepped + awk). This revealed some other errors. The NCP2222 dissector was missing a format string (not a security vuln though). Many dissectors used val_to_str with a constant (but empty) string, these have been replaced by val_to_str_const. ASN.1 dissectors were regenerated for this. Minor: the mate plugin used "%X" instead of "%p" for a pointer type. The ncp2222 dissector and wimax plugin gained modelines. Change-Id: I7f3f6a3136116f9b251719830a39a7b21646f622 Reviewed-on: https://code.wireshark.org/review/2881 Reviewed-by: Evan Huus <eapache@gmail.com>	2014-07-06 23:00:40 +00:00
Alexis La Goutte	296591399f	Remove all $Id$ from top of file (Using sed : sed -i '/^ \* \$Id\$/,+1 d') Fix manually some typo (in export_object_dicom.c and crc16-plain.c) Change-Id: I4c1ae68d1c4afeace8cb195b53c715cf9e1227a8 Reviewed-on: https://code.wireshark.org/review/497 Reviewed-by: Anders Broman <a.broman58@gmail.com>	2014-03-04 14:27:33 +00:00
Jakub Zawadzki	82f1fecf14	struct _dfilter_t: rename to epan_dfilter. typedef (dfilter_t) not renamed. svn path=/trunk/; revision=53765	2013-12-03 20:59:25 +00:00
Gerald Combs	c0702583d3	Make the minimum supported GLib version 2.16. svn path=/trunk/; revision=49444	2013-05-20 17:27:05 +00:00
Jakub Zawadzki	bf81b42e1e	Update Free Software Foundation address. (COPYING will be updated in next commit) svn path=/trunk/; revision=43536	2012-06-28 22:56:06 +00:00
Guy Harris	53a7a35e91	Neither num_registers nor max_registers in a dfilter_t are ever negative; make them unsigned. svn path=/trunk/; revision=30612	2009-10-18 23:25:33 +00:00
Stephen Fisher	726a1caaf1	- Remove GLIB1 code - Change ugly GLIB version checking statements to GLIB_CHECK_VERSION - Remove ws_strsplit files because we no longer need to borrow GLIB2's g_strsplit code for the no longer supported GLIB1 builds svn path=/trunk/; revision=24829	2008-04-07 05:22:54 +00:00
Anders Broman	1a2b14d60c	In glib 2.16 g_malloc Changed from: - gpointer g_malloc (gulong n_bytes) G_GNUC_MALLOC; to: + gpointer g_malloc (gsize n_bytes) G_GNUC_MALLOC; svn path=/trunk/; revision=24710	2008-03-21 16:10:47 +00:00
Gerald Combs	9703c2bb75	If "!=" or "ne" are used in a display filter, warn the user that the results may be unexpected. svn path=/trunk/; revision=24232	2008-01-31 19:50:38 +00:00
Bill Meier	b436aeaf5f	From Didier Gautheron: Bug #2042 : Move constants initialisation at compile time. svn path=/trunk/; revision=23659	2007-11-28 22:44:37 +00:00
Ronnie Sahlberg	89f022b12b	name change svn path=/trunk/; revision=18197	2006-05-21 05:12:17 +00:00
Jörg Mayer	c5ab5374c2	Some more 'char' -> 'const char' changes svn path=/trunk/; revision=15013	2005-07-23 06:53:59 +00:00
Guy Harris	8a8b883450	Set the svn:eol-style property on all text files to "native", so that they have LF at the end of the line on UN*X and CR/LF on Windows; hopefully this means that if a CR/LF version is checked in on Windows, the CRs will be stripped so that they show up only when checked out on Windows, not on UN*X. svn path=/trunk/; revision=11400	2004-07-18 00:24:25 +00:00
Guy Harris	95391789a3	From Graeme Hewson: Add a #define to enable parser tracing. Clean up parser state when finished parsing, even if we stopped parsing due to a syntax error, so that there's nothing left around to screw up the next parse. svn path=/trunk/; revision=11152	2004-06-15 10:38:14 +00:00
Guy Harris	60096bfad9	Use -1 rather than 0 as the SCAN_FAILED return value from the lexical analyzer on errors, and check for SCAN_FAILED from the lexical analyzer and abort the parse if we see it; 0 means "end of input", and we want to distinguish errors from end-of-input, so that we can report errors as such. If we see end-of-input while parsing a double-quoted string, report the error (missing closing quote). Fix the URL for the "Start conditions" section of the Flex manual. svn path=/trunk/; revision=10044	2004-02-11 22:52:54 +00:00
Jörg Mayer	48be4e530d	Removed trailing whitespaces from .h and .c files using the winapi_cleanup tool written by Patrik Stridvall for the wine project. svn path=/trunk/; revision=6116	2002-08-28 20:41:00 +00:00
Guy Harris	30f02bc99c	Move the code to build the balanced tree of fields into "proto_init()", move the code from "dfilter_lookup_token()" into "proto_registrar_get_byname()", and get rid of "dfilter_lookup_token()" and have its callers call "proto_registrar_get_byname()" instead. svn path=/trunk/; revision=5287	2002-04-29 07:55:32 +00:00
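
The raw-string equivalences quoted in commit 85c257431f can be checked against Python itself, which the commit message names as the model for the syntax. The snippet below is only a sketch in Python, not dfilter code; the variable names are illustrative, and it assumes the display filter lexer follows the same escaping rules as Python for these two cases, as the commit message states.

```python
# Python raw strings, cited by commit 85c257431f as the model for r"..." in dfilter.
# Variable names here are illustrative only.

# Regular string: escape sequences are processed (\\ -> backslash, \" -> quote).
regular_1 = "a\\\"b"
# Raw string: the backslash is preserved, and \" still does not end the literal.
raw_1 = r"a\"b"

# Six backslashes in a regular string collapse to three.
regular_2 = "\\\\\\a"
# Three backslashes in a raw string stay as three.
raw_2 = r"\\\a"

print(regular_1 == raw_1)  # True
print(regular_2 == raw_2)  # True
```

Both pairs compare equal, which matches the two examples given in the commit message.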


55 Commits