wireshark

Commit Graph

Author	SHA1	Message	Date
João Valverde	263bda375c	dfilter: Check if type supports unary minus Fix crash for types that do not support unary minus. Fixes #18750.	2022-12-21 14:43:39 +00:00
João Valverde	32f88ad22c	wmem: Remove strbuf max size parameter This parameter was introduced as a safeguard for bugs that generate an unbounded string but its utility for that purpose is doubtful and the way it is being used creates problems with invalid truncation of UTF-8 strings. Rename wmem_strbuf_sized_new() with a better name.	2022-12-03 01:54:52 +00:00
João Valverde	967a3c3df9	Qt: Check field autocomplete for syntactical validity Currently the autocompletion engine always suggests a protocol field completion, even in places where it isn't syntactically valid. Fix that by compiling the preamble to the token under the cursor and checking the returned error. If it is DF_ERROR_UNEXPECTED_END that indicates a field or literal value was expected. Otherwise a field replacement is not valid in this position. Fixes #12811.	2022-12-01 22:50:09 +00:00
João Valverde	b116ccd6d5	dfilter: Replace compile booleans arguments with a bit flag	2022-11-30 17:36:17 +00:00
João Valverde	84e75be5c6	dfilter: Add optimization flag When we are just testing code to see if it compiles, performing optimizations is wasteful. Add an option to disable them.	2022-11-30 17:36:17 +00:00
João Valverde	93814ef740	dfilter: Always set error pointer in case of failure	2022-11-30 15:00:34 +00:00
João Valverde	a0d77e9329	dfilter: Return an error object instead of string Return a struct containing error information. This simplifies the interface to more easily provide richer diagnostics in the future. Add an error code besides a human-readable error string to allow checking programmatically for errors in a robust manner. Currently there is only a generic error code; more codes are expected to be added in the future. Move error location information to the struct. Change callers and implementation to use the new interface.	2022-11-28 15:46:44 +00:00
João Valverde	79c3a77752	Add macros to control lemon diagnostics Rename flex macros using parentheses (mostly a style issue): DIAG_OFF_FLEX -> DIAG_OFF_FLEX() DIAG_ON_FLEX -> DIAG_ON_FLEX() Use the same kind of construct with lemon generated code using DIAG_OFF_LEMON() and DIAG_ON_LEMON(). Use %include and %code directives to enforce the desired order, with generated code in the middle between pragmas. Fix a clang-specific pragma to use DIAG_OFF_CLANG(). DIAG_OFF(unreachable-code) -> DIAG_OFF_CLANG(unreachable-code). Apparently GCC is ignoring the -Wunreachable flag, that's why it did not trigger an unknown pragma warning. From [1]: The -Wunreachable-code has been removed, because it was unstable: it relied on the optimizer, and so different versions of gcc would warn about different code. The compiler still accepts and ignores the command line option so that existing Makefiles are not broken. In some future release the option will be removed entirely. - Ian [1] https://gcc.gnu.org/legacy-ml/gcc-help/2011-05/msg00360.html	2022-11-20 10:11:27 +00:00
João Valverde	2443df7318	Disable another -Wunreachable lemon warning	2022-11-17 11:21:41 +00:00
Peter Wu	df478a365d	dfilter: treat carriage returns as whitespace Fixes #18595	2022-11-07 01:00:50 +00:00
João Valverde	4c2d0f16d4	dfilter: Improve representation of raw field references Instead of using the abstract type "<RAW>", which might be confusing, show FT_BYTES, but display the representation with the "@" operator, so it's not even more confusing in error messages why a field might flip-flop types. Refactor the field tostr() function and some other clean ups. Before: ``` Filter: _ws.ftypes.string ==${@frame.len} dftest: _ws.ftypes.string and frame.len <RAW> are not of compatible types. _ws.ftypes.string ==${@frame.len} ^~~~~~~~~ ``` After: ``` Filter: _ws.ftypes.string ==${@frame.len} dftest: _ws.ftypes.string <FT_STRING> and @frame.len <FT_BYTES> are not of compatible types. _ws.ftypes.string ==${@frame.len} ^~~~~~~~~ ```	2022-10-31 21:02:39 +00:00
João Valverde	b83658d8a4	dfilter: Add support for raw addressing with references Extends raw addressing syntax to work with references. The syntax is @field1 == ${@field2} This requires replicating the logic to load field references, but using raw values instead. We use separate hash tables for that, namely "references" vs "raw_references".	2022-10-31 21:02:39 +00:00
João Valverde	0853ddd1cb	dfilter: Add support for raw (bytes) addressing mode This adds new syntax to read a field from the tree as bytes, instead of the actual type. This is a useful extension for example to match malformed strings that contain unicode replacement characters. In this case it is not possible to match the raw value of the malformed string field. This extension fills this need and is generic enough that it should be useful in many other situations. The syntax used is to prefix the field name with "@". The following artificial example tests if the HTTP user agent contains a particular invalid UTF-8 sequence: @http.user_agent == "Mozill\xAA" Where simply using "http.user_agent" won't work because the invalid byte sequence will have been replaced with U+FFFD. Considering the following programs: $ dftest '_ws.ftypes.string == "ABC"' Filter: _ws.ftypes.string == "ABC" Syntax tree: 0 TEST_ANY_EQ: 1 FIELD(_ws.ftypes.string <FT_STRING>) 1 FVALUE("ABC" <FT_STRING>) Instructions: 00000 READ_TREE _ws.ftypes.string <FT_STRING> -> reg#0 00001 IF_FALSE_GOTO 3 00002 ANY_EQ reg#0 == "ABC" <FT_STRING> 00003 RETURN $ dftest '@_ws.ftypes.string == "ABC"' Filter: @_ws.ftypes.string == "ABC" Syntax tree: 0 TEST_ANY_EQ: 1 FIELD(_ws.ftypes.string <RAW>) 1 FVALUE(41:42:43 <FT_BYTES>) Instructions: 00000 READ_TREE @_ws.ftypes.string <FT_BYTES> -> reg#0 00001 IF_FALSE_GOTO 3 00002 ANY_EQ reg#0 == 41:42:43 <FT_BYTES> 00003 RETURN In the second case the field has a "raw" type, that equates directly to FT_BYTES, and the field value is read from the protocol raw data.	2022-10-31 21:02:39 +00:00
João Valverde	31a0147daa	dfilter: Pass a value by reference The lifetime of the reference is longer than the runtime so avoid an unnecessary fvalue dup.	2022-10-31 21:02:39 +00:00
João Valverde	0583b76204	dfilter: Remove unused data structure	2022-10-31 21:02:39 +00:00
João Valverde	0662a3f6ac	dfilter: Amend a numeric pattern in the scanner We amend the :<numeric> pattern to not eat the leading colon. Because the colon can be part of the value (with IPv6 addresses for example) we want to avoid doing that. IPv6 addresses are covered by their own rules but this removes the requirement in the future to handle any special cases and avoids surprises. For this reason the colon-prefix syntax is already explicitly defined to work only for byte arrays and there is currently no universal syntax for all literal values or even all numbers. Other numbers can keep using the lexical type "unparsed". ``` run/dftest "_ws.ftypes.uint8 == :fd" Filter: _ws.ftypes.uint8 == :fd dftest: ":fd" is not a valid number. _ws.ftypes.uint8 == :fd ^~~ run/dftest "_ws.ftypes.uint8 == fd" Filter: _ws.ftypes.uint8 == fd dftest: "fd" is not a valid number. _ws.ftypes.uint8 == fd ^~ run/dftest "_ws.ftypes.uint8 == 0xfd" Filter: _ws.ftypes.uint8 == 0xfd Syntax tree: 0 TEST_ANY_EQ: 1 FIELD(_ws.ftypes.uint8 <FT_UINT8>) 1 FVALUE(253 <FT_UINT8>) Instructions: 00000 READ_TREE _ws.ftypes.uint8 <FT_UINT8> -> reg#0 00001 IF_FALSE_GOTO 3 00002 ANY_EQ reg#0 == 253 <FT_UINT8> 00003 RETURN run/dftest "_ws.ftypes.bytes == fd" Filter: _ws.ftypes.bytes == fd Syntax tree: 0 TEST_ANY_EQ: 1 FIELD(_ws.ftypes.bytes <FT_BYTES>) 1 FVALUE(fd <FT_BYTES>) Instructions: 00000 READ_TREE _ws.ftypes.bytes <FT_BYTES> -> reg#0 00001 IF_FALSE_GOTO 3 00002 ANY_EQ reg#0 == fd <FT_BYTES> 00003 RETURN run/dftest "_ws.ftypes.bytes == :fd" Filter: _ws.ftypes.bytes == :fd Syntax tree: 0 TEST_ANY_EQ: 1 FIELD(_ws.ftypes.bytes <FT_BYTES>) 1 FVALUE(fd <FT_BYTES>) Instructions: 00000 READ_TREE _ws.ftypes.bytes <FT_BYTES> -> reg#0 00001 IF_FALSE_GOTO 3 00002 ANY_EQ reg#0 == fd <FT_BYTES> 00003 RETURN ```	2022-10-08 09:51:49 +00:00
João Valverde	14f5121c4a	dfilter: Remove problematic <...> literal syntax The <...> syntax for literals, intended to be as generic as possible, unintentionally introduced an ambiguity with the relational expression "a < b or a > c". Literals are values like numbers, bytes, IPv6 addresses or, one could imagine, UNC paths for example, if an FT_UNC type were to be added in the future. We could use a new unique symbol like @...@ but the <...> syntax is very recent and may not be necessary with ":xxx" so just remove it. A byte array can be explicitly declared by prefixing with a colon. It is not as generic but the main ambiguity that this new syntax attempted to solve is bytes vs protocol names. We don't want to introduce a new reserved symbol for now, until other requirements if any are more clear. Fixes #18418.	2022-10-08 09:51:49 +00:00
João Valverde	0816e317cb	dfilter: Fix crash with FT_NONE and arithmetic expressions Do the first ftype-can check in an arithmetic expression before evaluating the second term to be sure we do not allow FT_NONE as a valid LHS ftype. $ dftest '_ws.ftypes.none + 1 == 2' Filter: _ws.ftypes.none + 1 == 2 dftest: FT_NONE cannot +. _ws.ftypes.none + 1 == 2 ^~~~~~~~~~~~~~~	2022-07-28 16:50:09 +00:00
João Valverde	84f54d54e5	dfilter: Fix a crash using abs() Passing a literal value to abs() on the LHS segfaults, because it is incorrectly assumed to be a valid field. We need to check if we actually have a field. While at it improve the diagnostic of literals.	2022-07-19 19:11:47 +01:00
Alexis La Goutte	b448b6a591	semcheck: fix -Wmissing-prototypes semcheck.c:1110:1: warning: no previous prototype for function 'check_arithmetic_entity'	2022-07-15 13:45:52 +00:00
Alexis La Goutte	bd28c19ad6	dfvm: Fix -Wmissing-prototypes dfvm.c:206:1: warning: no previous prototype for function 'dfvm_value_tostr' dfvm.c:550:1: warning: no previous prototype for function 'filter_finfo_fvalues' dfvm.c:645:1: warning: no previous prototype for function 'filter_refs_fvalues'	2022-07-15 13:45:52 +00:00
João Valverde	4c975b770e	dfilter: Improve compatibility of integer types Before: $ dftest '_ws.ftypes.int64 == _ws.ftypes.int8' Filter: _ws.ftypes.int64 == _ws.ftypes.int8 dftest: _ws.ftypes.int64 and _ws.ftypes.int8 are not of compatible types. _ws.ftypes.int64 == _ws.ftypes.int8 ^~~~~~~~~~~~~~~ After: $ dftest '_ws.ftypes.int64 == _ws.ftypes.int8' Filter: _ws.ftypes.int64 == _ws.ftypes.int8 Syntax tree: 0 TEST_ANY_EQ: 1 FIELD(_ws.ftypes.int64 <FT_INT64>) 1 FIELD(_ws.ftypes.int8 <FT_INT8>) Instructions: 00000 READ_TREE _ws.ftypes.int64 <FT_INT64> -> reg#0 00001 IF_FALSE_GOTO 5 00002 READ_TREE _ws.ftypes.int8 <FT_INT8> -> reg#1 00003 IF_FALSE_GOTO 5 00004 ANY_EQ reg#0 === reg#1 00005 RETURN	2022-07-14 20:12:30 +00:00
João Valverde	f68f172454	dfilter: Remove a debug message Still too noisy even with noisy level.	2022-07-13 16:06:28 +00:00
João Valverde	6c8a8d7960	dfilter: Fix dfvm code string All/any equal have their own symbols for operators so cannot be handled in the same switch case. Other comparisons don't have different symbols for any/all.	2022-07-13 00:37:12 +01:00
João Valverde	5e3a7e9ab8	dfilter: Small optimization for "not all zero" code Remove extra NOT instruction. Also remove unused ANY_ZERO opcode.	2022-07-05 09:58:43 +01:00
João Valverde	a877f2d5f3	dfilter: Allow existence check for slices Allow checking if a slice exists. The result is true if the slice has length greater than zero. The len() function is implemented as a DFVM instruction instead. The semantics are the same.	2022-07-04 22:45:14 +00:00
João Valverde	0fc81c21b2	dfilter: Cleanup scanner value setters	2022-07-04 22:15:40 +00:00
João Valverde	8d93f0920a	dfilter: Fix some debug strings	2022-07-02 21:21:12 +01:00
João Valverde	eb8acd088e	dfilter: Rename dfvm opcodes with a namespace prefix	2022-07-02 11:46:45 +01:00
João Valverde	fc5c81328e	dfilter: Rename test syntax tree node Test node also includes arithmetic operations so rename it to a generic "operator" node.	2022-07-02 11:39:17 +01:00
João Valverde	b10db887ce	dfilter: Remove unparsed syntax type and RHS literal bias This removes unparsed name resolution during the semantic check because it feels like a hack to work around limitations in the language syntax, that should be solved at the lexical level instead. We were interpreting unparsed differently on the LHS and RHS. Now an unparsed value is always a field if it matches a registered field name (this matches the implementation in 3.6 and before). This requires tightening a bit the allowed filter names for protocols to avoid some common and potentially weird conflicting cases. Incidentally this extends set grammar to accept all entities. That is experimental and may be reverted in the future.	2022-07-02 11:18:20 +01:00
Roland Knall	8bdff72625	dfilter: Fix undefined dereference and add null check The value of ref could be accessed while undefined. Add additional checks to ensure that refs_array actually contains data, or return null immediately.	2022-06-27 14:57:01 +00:00
João Valverde	efbe699756	dfilter: Remove STTYPE_RANGE_NODE STTYPE_RANGE_NODE is just a lexical token, it is not used within the syntax tree so remove it.	2022-06-25 16:06:48 +01:00
João Valverde	aaff0d21ae	dfilter: Add layer support for references This adds support for using the layers filter with field references. Before: $ dftest 'ip.src != ${ip.src#2}' dftest: invalid character in macro name After: $ dftest 'ip.src != ${ip.src#2}' Filter: ip.src != ${ip.src#2} Syntax tree: 0 TEST_ALL_NE: 1 FIELD(ip.src <FT_IPv4>) 1 REFERENCE(ip.src#[2:1] <FT_IPv4>) Instructions: 00000 READ_TREE ip.src <FT_IPv4> -> reg#0 00001 IF_FALSE_GOTO 5 00002 READ_REFERENCE_R ${ip.src <FT_IPv4>} #[2:1] -> reg#1 00003 IF_FALSE_GOTO 5 00004 ALL_NE reg#0 != reg#1 00005 RETURN This requires adding another level of complexity to references. When loading references we need to copy the 'proto_layer_num' and add the logic to filter on that. The "layer" sttype is removed and replaced by a new field sttype with support for a range. This is a nice cleanup for the semantic check and general simplification. The grammar is better too with this design. Range sttype is renamed to slice for clarity.	2022-06-25 14:57:40 +01:00
João Valverde	8793650707	dftest: Print ftype of protocol fields	2022-06-24 21:10:45 +00:00
João Valverde	354e0d7edf	dfilter: Add support for unicode escape sequences Add support for entering unicode codepoints as \uNNNN or \uNNNNNNNN for strings and charconsts (following the C standard).	2022-06-21 16:54:16 +01:00
João Valverde	47348ae598	dfilter: Add support for literal strings with null bytes Before: Filter: frame matches "abc\x00def" dftest: \x00 (NUL byte) cannot be used with a regular string. frame matches "abc\x00def" ^~~~ Filter: _ws.ftypes.string == "a string with a \0 byte" dftest: \0 (NUL byte) cannot be used with a regular string. _ws.ftypes.string == "a string with a \0 byte" ^~ After: Filter: frame matches "abc\x00def" Syntax tree: 0 TEST_MATCHES: 1 FIELD(frame) 1 PCRE(abc\0def) Instructions: 00000 READ_TREE frame -> reg#0 00001 IF_FALSE_GOTO 3 00002 ANY_MATCHES reg#0 matches abc\0def 00003 RETURN Filter: _ws.ftypes.string == "a string with a \0 byte" Syntax tree: 0 TEST_ANY_EQ: 1 FIELD(_ws.ftypes.string) 1 FVALUE("a string with a \0 byte" <FT_STRING>) Instructions: 00000 READ_TREE _ws.ftypes.string -> reg#0 00001 IF_FALSE_GOTO 3 00002 ANY_EQ reg#0 == "a string with a \0 byte" <FT_STRING> 00003 RETURN Fixes issue #16156.	2022-06-21 15:10:08 +00:00
João Valverde	0615ba6317	ftypes: Make accessor functions type safe	2022-06-20 17:29:57 +00:00
João Valverde	de103394fe	dfilter: Make regex matches case insensitive by default	2022-06-08 12:17:22 +01:00
Alexis La Goutte	8ee1eabeee	dfvm(dfilter): fix clang analyzer warning (Dead.Store)	2022-05-22 08:40:44 +00:00
João Valverde	bebf7afa37	dfilter: Remove unused DFVM 4th instruction argument	2022-05-13 14:13:18 +01:00
João Valverde	3bb918428e	dfilter: Remove stale comment	2022-05-13 12:50:33 +00:00
João Valverde	ac901e5de8	dfilter: Fix maybe-uninitialized warning [1702/2528] Building C object epan/dfilter/CMakeFiles/dfilter.dir/dfvm.c.o In function ‘drange_contains_layer’, inlined from ‘filter_finfo_fvalues’ at /home/jpv/code/wireshark/wireshark/epan/dfilter/dfvm.c:587:21: /home/jpv/code/wireshark/wireshark/epan/dfilter/dfvm.c:555:41: warning: ‘upper’ may be used uninitialized [-Wmaybe-uninitialized] 555 | if (num >= lower && num <= upper) { /* inclusive */ | ~~~~^~~~~~~~ /home/jpv/code/wireshark/wireshark/epan/dfilter/dfvm.c: In function ‘filter_finfo_fvalues’: /home/jpv/code/wireshark/wireshark/epan/dfilter/dfvm.c:537:20: note: ‘upper’ was declared here 537 | int lower, upper; | ^~~~~	2022-05-13 13:22:29 +01:00
João Valverde	b602911b31	dfilter: Add support for universal quantifiers Adds the keywords "any" and "all" to implement the quantification to any existing relational operator. Filter: all tcp.port in {100, 2000..3000} Syntax tree: 0 ALL TEST_IN: 1 FIELD(tcp.port) 1 SET(#2): 2 FVALUE(100 <FT_UINT16>) 2 FVALUE(2000 <FT_UINT16>) .. FVALUE(3000 <FT_UINT16>) Instructions: 00000 READ_TREE tcp.port -> reg#0 00001 IF_FALSE_GOTO 5 00002 ALL_EQ reg#0 === 100 <FT_UINT16> 00003 IF_TRUE_GOTO 5 00004 ALL_IN_RANGE reg#0 in { 2000 <FT_UINT16> .. 3000 <FT_UINT16> } 00005 RETURN	2022-05-12 14:26:54 +01:00
João Valverde	164f3ce9a2	dfilter: Improve syntax tree display format for sets	2022-05-12 14:06:33 +01:00
Joakim Karlsson	b75b8ca72e	dfilter: fix may be used uninitialized in this function [-Wmaybe-uninitialized]	2022-04-27 13:36:43 +02:00
João Valverde	4f3f507eee	dfilter: Add syntax to match specific layers in the protocol stack Add support to display filters for matching a specific layer within a frame. Layers are counted sequentially up the protocol stack. Each protocol (dissector) that appears in the stack is one layer. LINK-LAYER#1 <-> IP#1 <-> TCP#1 <-> IP#2 <-> TCP#2 <-> etc. The syntax allows for negative indexes and ranges with the usual semantics for slices (but note that counting starts at one): tcp.port#[2-4] == 1024 Matches layers 2 to 4 inclusive. Fixes #3791.	2022-04-26 16:50:59 +00:00
João Valverde	c0170dad42	dfilter: Rename "range" to "slice" The word range is used for different things with different meanings and that is confusing. Avoid using "range" in code to mean "slice". A range is one or more intervals with a lower and upper bound. A slice is a range applied to a bytes field. Replace range with slice wherever appropriate. This usage of "slice" instead of range is generally correct and consistent in the documentation.	2022-04-26 16:50:59 +00:00
Gerald Combs	a73fd872ad	dfilter: Add a null check. Try to fix *** CID 1504179: Null pointer dereferences (FORWARD_NULL) /builds/wireshark/wireshark/epan/dfilter/dfvm.c: 327 in dfvm_dump_str() 321 stack_print = dump_str_stack_push(stack_print, arg1_str); 322 break; 323 324 case STACK_POP: 325 wmem_strbuf_append_printf(buf, "%05d STACK_POP\t%s\n", id, arg1_str); 326 for (i = 0; i < arg1->value.numeric; i ++) { >>> CID 1504179: Null pointer dereferences (FORWARD_NULL) >>> Passing null pointer "stack_print" to "dump_str_stack_pop", which dereferences it. 327 stack_print = dump_str_stack_pop(stack_print); 328 } 329 break; 330 331 case MK_RANGE: 332 wmem_strbuf_append_printf(buf, "%05d MK_RANGE\t\t%s[%s] -> %s\n",	2022-04-21 17:10:44 +00:00
João Valverde	fab32ea0cb	dfilter: Allow arithmetic expressions as function arguments This allows writing moderately complex expressions, for example a float epsilon test (#16483): Filter: {abs(_ws.ftypes.double - 1) / max(abs(_ws.ftypes.double), abs(1))} < 0.01 Syntax tree: 0 TEST_LT: 1 OP_DIVIDE: 2 FUNCTION(abs#1): 3 OP_SUBTRACT: 4 FIELD(_ws.ftypes.double) 4 FVALUE(1 <FT_DOUBLE>) 2 FUNCTION(max#2): 3 FUNCTION(abs#1): 4 FIELD(_ws.ftypes.double) 3 FUNCTION(abs#1): 4 FVALUE(1 <FT_DOUBLE>) 1 FVALUE(0.01 <FT_DOUBLE>) Instructions: 00000 READ_TREE _ws.ftypes.double -> reg#1 00001 IF_FALSE_GOTO 3 00002 SUBRACT reg#1 - 1 <FT_DOUBLE> -> reg#2 00003 STACK_PUSH reg#2 00004 CALL_FUNCTION abs(reg#2) -> reg#0 00005 STACK_POP 1 00006 IF_FALSE_GOTO 24 00007 READ_TREE _ws.ftypes.double -> reg#1 00008 IF_FALSE_GOTO 9 00009 STACK_PUSH reg#1 00010 CALL_FUNCTION abs(reg#1) -> reg#4 00011 STACK_POP 1 00012 IF_FALSE_GOTO 13 00013 STACK_PUSH reg#4 00014 STACK_PUSH 1 <FT_DOUBLE> 00015 CALL_FUNCTION abs(1 <FT_DOUBLE>) -> reg#5 00016 STACK_POP 1 00017 IF_FALSE_GOTO 18 00018 STACK_PUSH reg#5 00019 CALL_FUNCTION max(reg#5, reg#4) -> reg#3 00020 STACK_POP 2 00021 IF_FALSE_GOTO 24 00022 DIVIDE reg#0 / reg#3 -> reg#6 00023 ANY_LT reg#6 < 0.01 <FT_DOUBLE> 00024 RETURN We now use a stack to pass arguments to the function. The stack is implemented as a list of lists (list of registers). Arguments may still be non-existent to functions (this is a feature). Functions must check for nil arguments (NULL lists) and handle that case. It's somewhat complicated to allow literal values and test compatibility for different types, both because of lack of type information with unparsed/literal and also because it is an underdeveloped area in the code. In my limited testing it was good enough and useful, further enhancements are left for future work.	2022-04-18 17:10:31 +01:00
João Valverde	eb2a9889c3	dfilter: Add abs() function Add an absolute value function for ftypes.	2022-04-18 17:09:00 +01:00
Alexis La Goutte	83959f77e3	dfvm: Fix Dead Store found by Clang Analyzer	2022-04-16 18:15:45 +00:00
João Valverde	af878388fe	dfilter: Fix scanning of strings The code was ignoring a SCAN_FAILED return value.	2022-04-15 22:51:15 +01:00
João Valverde	827d143e6e	dfilter: Allow function arguments to be non-existent. Instead of not calling the function if an argument is non-existent (read tree fails), call the function and let the function handle the condition.	2022-04-14 13:07:41 +00:00
João Valverde	cb2f085f14	dfilter: Add max() and min() functions Changes the function calling convention to pass the first register number plus the number of registers after that sequentially. This allows functions with any number of arguments. Functions can still only return one value. Adds max() and min() functions to select the maximum/minimum value from any number of arguments, all of the same type. The functions accept literals too. The return type is the same as the first argument (cannot be a literal).	2022-04-14 13:07:41 +00:00
João Valverde	8746eea297	dfilter: Try to resolve field reference instead of using a heuristic Instead of using a heuristic to decide whether the form ${...} is a macro or not, try to resolve the name to a registered protocol field and use that instead. This increases somewhat the surface for clobbering existing macro names with new field registrations but we'll cross that bridge when we get to it. Rejecting protocol field types reduces this probability again but it may not be intuitive to the user trying to mistakenly use a reference to a protocol why it is parsed as a macro. The reasons for rejecting FT_PROTOCOL types as not interesting field references are not very strong but it seems reasonable. $ dftest 'frame.number != ${frame.number}' Filter: frame.number != ${frame.number} Instructions: 00000 READ_TREE frame.number -> reg#0 00001 IF_FALSE_GOTO 5 00002 READ_REFERENCE ${frame.number} -> reg#1 00003 IF_FALSE_GOTO 5 00004 ALL_NE reg#0 != reg#1 00005 RETURN $ dftest 'frame != ${frame}' dftest: macro 'frame' does not exist	2022-04-12 14:03:18 +00:00
João Valverde	09696f1762	Try to fix a narrowing warning "C:\Development\wsbuild64\Wireshark.sln" (default target) (1) -> "C:\Development\wsbuild64\epan\dfilter\dfilter.vcxproj.metaproj" (default target) (18) -> "C:\Development\wsbuild64\epan\dfilter\dfilter.vcxproj" (default target) (108) -> (ClCompile target) -> C:/Development/wireshark/epan/dfilter/scanner.l(463,54): warning C4267: '+=': conversion from 'size_t' to 'int', possible loss of data [C:\Development\wsbuild64\epan\dfilter\dfilter.vcxproj] C:/Development/wireshark/epan/dfilter/scanner.l(463,54): warning C4267: state->location.col_start += state->location.col_len; [C:\Development\wsbuild64\epan\dfilter\dfilter.vcxproj] C:/Development/wireshark/epan/dfilter/scanner.l(463,54): warning C4267: ^ (compiling source file C:\Development\wsbuild64\epan\dfilter\scanner.c) [C:\Development\wsbuild64\epan\dfilter\dfilter.vcxproj]	2022-04-11 22:23:13 +01:00
João Valverde	2f02cd6e19	dfilter: Handle missing error location more gracefully If we don't have an offset, don't print anything with underline. Also it can underline filters using macros correctly now. $ tshark -Y 'ip and ${private_ipv4:ip.sr}' -r /dev/null tshark: Left side of "==" expression must be a field or function, not "ip.sr". ip and ip.sr == 192.168.0.0/16 or ip.sr == 172.16.0.0/12 or ip.sr == 10.0.0.0/8 ^~~~~	2022-04-11 21:03:06 +00:00
João Valverde	24443fa33a	tshark: Add underline to dfilter errors $ tshark -Y 'frame.number == 123foobar and ip' -r /dev/null tshark: "123foobar" is not a valid number. frame.number == 123foobar and ip ^~~~~~~~~	2022-04-11 19:25:37 +00:00
João Valverde	4d9470e7dd	dfilter: Add location tracking to scanner and use it to report errors Add location tracking as a column offset and length from offset to the scanner. Our input is a single line only so we don't need to track line offset. Record that information in the syntax tree. Return the error location in dfilter_compile(). Use it in dftest to mark the location of the error in the filter string. Later it would be nice to use the location in the GUI as well. $ dftest "ip.proto == aaaaaa and tcp.port == 123" Filter: ip.proto == aaaaaa and tcp.port == 123 dftest: "aaaaaa" cannot be found among the possible values for ip.proto. ip.proto == aaaaaa and tcp.port == 123 ^~~~~~	2022-04-10 10:09:51 +01:00
João Valverde	da19379eb5	dfilter: Create the syntax node in the scanner and pass that Revert to passing a syntax node from the lexical scanner to the grammar parser. Using a union is not having a discernible advantage and requires duplicating a lot of properties of syntax nodes.	2022-04-10 09:54:03 +01:00
João Valverde	fb9a176587	dfilter: Allow grouping arithmetical expressions with { } This removes the limitation of having only two terms in an arithmetic expression and allows setting the precedence using curly braces (like any basic calculator). Our grammar currently does not allow grouping arithmetic expressions using parentheses, because boolean expressions and arithmetic expressions are different and parentheses are used with the former.	2022-04-08 23:12:04 +01:00
João Valverde	cc5726b63f	dfilter: Remove leading colon special meaning Instead of saying a leading colon will make any token a literal value, say it is part of the syntax of bytes arrays. This is useful to write bytes without a separator, and other potentially ambiguous formats. The restriction in meaning to bytes and simple numeric values should make the rules for handling a leading colon (specifically omitting it or not) saner without much loss of functionality.	2022-04-07 00:16:07 +01:00
João Valverde	0313cd02bc	dfilter: Fix RHS bias for literal values Fixes `a3b76138f0`.	2022-04-06 23:46:22 +01:00
João Valverde	7429832db4	Fix a log message	2022-04-06 23:42:04 +01:00
João Valverde	5584aba326	dfilter: Fix slice using range [:j] Fixes: $ dftest 'frame[:10] contains 0xff' dftest: ":10" is not a valid range.	2022-04-06 18:35:10 +01:00
João Valverde	a6f37323e6	dfilter: Clean up lexical scanning	2022-04-06 18:11:27 +01:00
João Valverde	8108e67de7	dfilter: Fix memory leak with leading colon When retrying fvalue_from_literal() we were leaking the error message string. Refactor the code to avoid the retry. This assumes the only valid use of a leading ':' with a literal is for an IPv6 address. Bytes with leading ':' are supported but the colon is skipped, so the parser doesn't see it. Fixes `df0fc8b517`.	2022-04-06 18:09:12 +01:00
João Valverde	12c8cc32f0	dfilter: Fix parsing of some IPv6 compressed addresses Fix parsing of some IPv6 addresses and add tests. Also pass tokens as unparsed unless the user was specific about the semantic type. For example the IPv4 address 1.1.1.1 is also a valid field, but 1.1.1.1/128 is not (because of the slash). However choose not to enforce the distinction in the lexical scanner and pass everything as unparsed unless the meaning is explicit in the syntax with leading dot, colon, or between angle brackets.	2022-04-06 10:10:04 +01:00
João Valverde	7ed5d5036e	dfilter: restore support for identifiers using hyphen Restores support for filters such as "mac-lte", that was broken in `330d408328`. This means we are not able to support arithmetic expressions with binary minus without spaces. $ dftest 'tcp.port == 1-2' dftest: "1-2" is not a valid number.	2022-04-05 15:38:20 +01:00
João Valverde	8fb28f5161	dfilter: Minor grammar cleanup Remove duplication for arithmetic expressions.	2022-04-05 12:04:37 +01:00
João Valverde	20afbd46ec	dfilter: Remove existence test syntax tree nodes After some experimentation I don't think these two existence tests belong in the grammar, it's an implementation detail and removing it might avoid some artificial constraints.	2022-04-05 12:04:37 +01:00
João Valverde	fb08c4b4a8	dfilter: Replace bitwise sttype with arithmetic Most of the bitwise codepaths are just duplicating code for the arithmetic type. Parse bitwise expressions as arithmetic instead.	2022-04-05 12:04:37 +01:00
João Valverde	c98df5eef5	dfilter: Print syntax tree using dftest + format enhancements Add argument to dfilter_compile_real() to save syntax tree text representation. Use it with dftest to print syntax tree. Misc debug output format improvements.	2022-04-05 12:04:37 +01:00
João Valverde	d91734ab6a	dfilter: Fix range registers in DFVM dump	2022-04-05 12:04:37 +01:00
João Valverde	330d408328	dfilter: Allow arithmetic expressions without spaces To allow an arithmetic expressions without spaces, such as "1+2", we cannot match the expression in other lexical rules using "+". Because of longest match this becomes the token LITERAL or UNPARSED with semantic value "1+2". The same goes for all the other arithmetic operators. So we need to remove [+-/%] from "word chars" and add very specific patterns (that won't mistakenly match an arithmetic expression) for those literal or unparsed tokens we want to support using these characters. The plus was not a problem but right slash is used for CIDR, minus for mac address separator, etc. There are still some corner cases. 11-22-33-44-55-66 is a mac address and not the arithmetic expression with six terms "eleven minus twenty two minus etc." (if we ever support more than two terms in the grammar, which we don't currently). We lift some patterns from the flex manual to match on IPv4 and IPv6 (ugly) and add MAC address. Other hypothetical literal lexical values using [+-/%] are already supported enclosed in angle brackets but the cases of MAC/IPv4/IPv6 are very common and moreover we need to do the utmost to not break backward compatibility here. Before: $ dftest "_ws.ftypes.int32 == 1+2" dftest: "1+2" is not a valid number. After: $ dftest "_ws.ftypes.int32 == 1+2" Filter: _ws.ftypes.int32 == 1+2 Instructions: 00000 READ_TREE _ws.ftypes.int32 -> reg#0 00001 IF_FALSE_GOTO 4 00002 ADD 1 <FT_INT32> + 2 <FT_INT32> -> reg#1 00003 ANY_EQ reg#0 == reg#1 00004 RETURN	2022-04-04 20:28:55 +00:00
João Valverde	34ad6bb478	dfilter: Make logical AND higher precedence than logical OR In most, if not all, programming languages logical AND has higher precedence than logical OR. Apply the principle of least surprise and do the same for Wireshark display filters. Before: ip and tcp or udp => ip and (tcp or udp) Filter: ip and tcp or udp Instructions: 00000 CHECK_EXISTS ip 00001 IF_FALSE_GOTO 5 00002 CHECK_EXISTS tcp 00003 IF_TRUE_GOTO 5 00004 CHECK_EXISTS udp 00005 RETURN After: ip and tcp or udp => (ip and tcp) or udp Filter: ip and tcp or udp Instructions: 00000 CHECK_EXISTS ip 00001 IF_FALSE_GOTO 4 00002 CHECK_EXISTS tcp 00003 IF_TRUE_GOTO 5 00004 CHECK_EXISTS udp 00005 RETURN	2022-04-04 19:51:38 +00:00
João Valverde	f0ca30b60b	dfilter: More arithmetic fixes Fix a failed assertion with constant arithmetic expressions. Because we do not parse constants on the lexical level it is more complicated to handle constant expressions with unparsed values. We need to handle missing type information gracefully for any kind of arithmetic expression, not just unary minus.	2022-04-02 18:10:33 +00:00
João Valverde	67e5e5c3ab	dfilter: Fix arithmetic expressions on the LHS Filter: _ws.ftypes.framenum % 3 == 0 Instructions: 00000 READ_TREE _ws.ftypes.framenum -> reg#0 00001 IF_FALSE_GOTO 4 00002 MODULO reg#0 % 3 <FT_FRAMENUM> -> reg#1 00003 ANY_EQ reg#1 == 0 <FT_FRAMENUM> 00004 RETURN	2022-04-01 14:33:38 +01:00
João Valverde	8bc214b5bb	dfilter: Add remaining arithmetic integer ops	2022-03-31 16:49:42 +01:00
João Valverde	2a9cb588aa	dfilter: Add binary arithmetic (add/subtract) Add support for display filter binary addition and subtraction. The grammar is intentionally kept simple for now. The use case is to add a constant to a protocol field, or (maybe) add two fields in an expression. We use signed arithmetic with unsigned numbers, checking for overflow and casting where necessary to do the conversion. We could legitimately opt to use traditional modular arithmetic instead (like C) and if it turns out that that is more useful for some reason we may want to in the future. Fixes #15504.	2022-03-31 11:27:34 +01:00
João Valverde	5cd0e4cc97	dfilter: Fix use after free with references By the time we are using the reference fvalue the tree may have gone away and with it the fvalue. We need to duplicate the reference fvalues and take ownership of the memory.	2022-03-30 14:05:22 +01:00
João Valverde	260942e170	dfilter: Refactor macro tree references This replaces the current macro reference system with a completely different implementation. Instead of a macro, a reference is a syntax element. A reference is a constant that can be filled in the dfilter code after compilation from an existing protocol tree. It is best understood as a field value that can be read from a fixed tree that is not the frame being filtered. Usually this fixed tree is the currently selected frame when the filter is applied. This allows comparing fields in the filtered frame with fields in the selected frame. Because the field reference syntax uses the same sigil notation as a macro we have to use a heuristic to distinguish them: if the name has a dot it is a field reference, otherwise it is a macro name. The reference is syntactically validated at compile time. There are two main advantages to this implementation (and a couple of minor ones): The protocol tree for each selected frame is only walked if we have a display filter and if the display filter uses references. Also only the actual reference values are copied, instead of loading the entire tree into a hash table (in textual form even). The other advantage is that the reference is tested like a protocol field against all the values in the selected frame (if there is more than one). Currently the reference fields are not "primed" during dissection, so the entire tree is walked to find a particular reference (this is similar to the previous implementation). If the display filter contains a valid reference and the reference is not loaded at the time the filter is run the result is the same as a nonexistent field for a regular READ_TREE instruction. Fixes #17599.	2022-03-29 12:36:31 +00:00
João Valverde	431cb43b81	dfilter: Remove parenthesis deprecation warning This usage devalues a mechanism for warning users that deserves more attention than this minor suggestion. The warning is inconvenient for intermediate and advanced users.	2022-03-29 12:19:26 +00:00
João Valverde	d2907d91c0	dfilter: Add more logging for bytecode	2022-03-28 17:59:07 +01:00
João Valverde	9ee9b40b64	dfilter: Store expanded text	2022-03-28 17:22:01 +01:00
João Valverde	a1299d63d9	dfilter: Lower level of two debug messages	2022-03-28 17:20:00 +01:00
João Valverde	ac0a69636b	dfilter: Add support for unary arithmetic This change implements a unary minus operator. Filter: tcp.window_size_scalefactor == -tcp.dstport Instructions: 00000 READ_TREE tcp.window_size_scalefactor -> reg#0 00001 IF_FALSE_GOTO 6 00002 READ_TREE tcp.dstport -> reg#1 00003 IF_FALSE_GOTO 6 00004 MK_MINUS -reg#1 -> reg#2 00005 ANY_EQ reg#0 == reg#2 00006 RETURN It is supported for integer types, floats and relative time values. The unsigned integer types are promoted to a 32 bit signed integer. Unary plus is implemented as a no-op. The plus sign is simply ignored. Constant arithmetic expressions are computed during compilation. Overflow with constants is a compile time error. Overflow with variables is a run time error and silently ignored. Only a debug message will be printed to the console. Related to #15504.	2022-03-28 11:20:41 +00:00
João Valverde	a3b76138f0	dfilter: Fix memory leak Filter: tcp.srcport == udp.port Instructions: 00000 READ_TREE tcp.srcport -> reg#0 00001 IF_FALSE_GOTO 5 00002 READ_TREE udp.port -> reg#1 00003 IF_FALSE_GOTO 5 00004 ANY_EQ reg#0 == reg#1 00005 RETURN ================================================================= ==180444==ERROR: LeakSanitizer: detected memory leaks Direct leak of 34 byte(s) in 1 object(s) allocated from: #0 0x55f21e4a9ff9 (/home/jpv/projects/wireshark/wireshark/build-asan/run/dftest+0xcdff9) #1 0x7f95ea661338 (/usr/lib/libc.so.6+0x82338) SUMMARY: AddressSanitizer: 34 byte(s) leaked in 1 allocation(s). Fixes `a68b408a9f`.	2022-03-25 18:38:11 +00:00
João Valverde	2fc8c0e36b	dfilter: Handle a bitwise expr on the RHS	2022-03-23 11:04:41 +00:00
João Valverde	0335ebdc3a	dfilter: ftype_is_true -> ftype_is_zero	2022-03-23 11:04:41 +00:00
João Valverde	16729be2c1	dfilter: Add bitwise masking of bits Add support for masking of bits. Before the bitwise operator could only test bits, it did not support clearing bits. This allows testing if any combination of bits are set/unset more naturally with a single test. Previously this was only possible by combining several bitwise predicates. Bitwise is implemented as a test node, even though it is not. Maybe the test node should be renamed to something else. Fixes #17246.	2022-03-22 12:58:04 +00:00
João Valverde	631cf34f0c	dfilter: Use a function pointer array to free registers	2022-03-21 18:43:36 +00:00
João Valverde	6a0129a0e3	dfilter: Fix EditorConfig settings	2022-03-21 17:49:12 +00:00
João Valverde	54d8627c9a	dfilter: Add more comments to optimization pass	2022-03-21 17:36:41 +00:00
João Valverde	d60f2580ba	dfilter: Pass around constants in instructions The DFVM instruction arguments are generic boxed types but instead of using FVALUE and PCRE types the code passes around REGISTER types instead. Change that to pass constants in the instruction.	2022-03-21 17:09:56 +00:00
João Valverde	94d909103e	dfilter: Remove DFVM constant initialization	2022-03-21 17:09:43 +00:00
João Valverde	ae17e733ac	dfilter: Use more DFVM values in gencode	2022-03-21 17:09:29 +00:00
João Valverde	769f1f10de	dfilter: Add DFVM value constructor	2022-03-21 17:09:19 +00:00
João Valverde	1b574e7466	dfilter: Cleanup dfvm_apply()	2022-03-21 12:38:09 +00:00
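
The precedence change described in commit 34ad6bb478 (logical AND binds tighter than logical OR) can be illustrated with a small sketch. This is not Wireshark's grammar or code: it is a toy precedence-climbing parser written for illustration, and the names `PRECEDENCE`, `tokenize`, `parse`, `show` and `group` are all hypothetical. It only understands the word operators `and` and `or`, plain field names, and parentheses.

```python
import re

# Assumption for illustration: a higher number binds tighter. This mirrors the
# "After" behaviour in commit 34ad6bb478, where AND outranks OR.
PRECEDENCE = {"or": 1, "and": 2}


def tokenize(text):
    """Split a filter string into names, the words 'and'/'or', and parentheses."""
    return re.findall(r"[A-Za-z_][\w.]*|[()]", text)


def parse_primary(tokens):
    """Parse a single name or a parenthesised sub-expression."""
    tok = tokens.pop(0)
    if tok == "(":
        node = parse(tokens)
        tokens.pop(0)  # discard the closing ")"
        return node
    return tok


def parse(tokens, min_prec=1):
    """Precedence climbing: returns a bare name or an (op, left, right) tuple."""
    left = parse_primary(tokens)
    while tokens and tokens[0] in PRECEDENCE and PRECEDENCE[tokens[0]] >= min_prec:
        op = tokens.pop(0)
        # Requiring strictly higher precedence on the right makes operators
        # of equal precedence group left to right.
        right = parse(tokens, PRECEDENCE[op] + 1)
        left = (op, left, right)
    return left


def show(node):
    """Render the parse tree with explicit parentheses around every operator."""
    if isinstance(node, str):
        return node
    op, left, right = node
    return f"({show(left)} {op} {show(right)})"


def group(text):
    """Return the filter with its implicit grouping made explicit."""
    return show(parse(tokenize(text)))


print(group("ip and tcp or udp"))
print(group("ip or tcp and udp"))
```

Giving both operators the same number in `PRECEDENCE` makes the sketch group strictly left to right, which shows how much the result depends on the precedence table alone.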

758 Commits