wireshark

Commit Graph

Author	SHA1	Message	Date
João Valverde	e85f8d4cf1	dfilter: Do not jump when generating function arguments Instead of "jumping" with length zero to the next sequential instruction skip generating the no-op jump instruction entirely.	2022-12-27 21:09:04 +00:00
João Valverde	540b71d738	dfilter: Fix crash with a constant arithmetic expression	2022-12-26 23:55:27 +00:00
João Valverde	84e75be5c6	dfilter: Add optimization flag When we are just testing code to see if it compiles performing optimizations is wasteful. Add an option to disable them.	2022-11-30 17:36:17 +00:00
João Valverde	b83658d8a4	dfilter: Add support for raw addressing with references Extends raw addressing syntax to work with references. The syntax is @field1 == ${@field2} This requires replicating the logic to load field references, but using raw values instead. We use separate hash tables for that, namely "references" vs "raw_references".	2022-10-31 21:02:39 +00:00
João Valverde	0853ddd1cb	dfilter: Add support for raw (bytes) addressing mode This adds new syntax to read a field from the tree as bytes, instead of the actual type. This is a useful extension for example to match malformed strings that contain Unicode replacement characters. In this case it is not possible to match the raw value of the malformed string field. This extension fills this need and is generic enough that it should be useful in many other situations. The syntax used is to prefix the field name with "@". The following artificial example tests if the HTTP user agent contains a particular invalid UTF-8 sequence: @http.user_agent == "Mozill\xAA" Where simply using "http.user_agent" won't work because the invalid byte sequence will have been replaced with U+FFFD. Considering the following programs: $ dftest '_ws.ftypes.string == "ABC"' Filter: _ws.ftypes.string == "ABC" Syntax tree: 0 TEST_ANY_EQ: 1 FIELD(_ws.ftypes.string <FT_STRING>) 1 FVALUE("ABC" <FT_STRING>) Instructions: 00000 READ_TREE _ws.ftypes.string <FT_STRING> -> reg#0 00001 IF_FALSE_GOTO 3 00002 ANY_EQ reg#0 == "ABC" <FT_STRING> 00003 RETURN $ dftest '@_ws.ftypes.string == "ABC"' Filter: @_ws.ftypes.string == "ABC" Syntax tree: 0 TEST_ANY_EQ: 1 FIELD(_ws.ftypes.string <RAW>) 1 FVALUE(41:42:43 <FT_BYTES>) Instructions: 00000 READ_TREE @_ws.ftypes.string <FT_BYTES> -> reg#0 00001 IF_FALSE_GOTO 3 00002 ANY_EQ reg#0 == 41:42:43 <FT_BYTES> 00003 RETURN In the second case the field has a "raw" type, that equates directly to FT_BYTES, and the field value is read from the protocol raw data.	2022-10-31 21:02:39 +00:00
João Valverde	5e3a7e9ab8	dfilter: Small optimization for "not all zero" code Remove extra NOT instruction. Also remove unused ANY_ZERO opcode.	2022-07-05 09:58:43 +01:00
João Valverde	a877f2d5f3	dfilter: Allow existence check for slices Allow checking if a slice exists. The result is true if the slice has length greater than zero. The len() function is implemented as a DFVM instruction instead. The semantics are the same.	2022-07-04 22:45:14 +00:00
João Valverde	eb8acd088e	dfilter: Rename dfvm opcodes with a namespace prefix	2022-07-02 11:46:45 +01:00
João Valverde	fc5c81328e	dfilter: Rename test syntax tree node Test node also includes arithmetic operations so rename it to a generic "operator" node.	2022-07-02 11:39:17 +01:00
João Valverde	aaff0d21ae	dfilter: Add layer support for references This adds support for using the layers filter with field references. Before: $ dftest 'ip.src != ${ip.src#2}' dftest: invalid character in macro name After: $ dftest 'ip.src != ${ip.src#2}' Filter: ip.src != ${ip.src#2} Syntax tree: 0 TEST_ALL_NE: 1 FIELD(ip.src <FT_IPv4>) 1 REFERENCE(ip.src#[2:1] <FT_IPv4>) Instructions: 00000 READ_TREE ip.src <FT_IPv4> -> reg#0 00001 IF_FALSE_GOTO 5 00002 READ_REFERENCE_R ${ip.src <FT_IPv4>} #[2:1] -> reg#1 00003 IF_FALSE_GOTO 5 00004 ALL_NE reg#0 != reg#1 00005 RETURN This requires adding another level of complexity to references. When loading references we need to copy the 'proto_layer_num' and add the logic to filter on that. The "layer" sttype is removed and replaced by a new field sttype with support for a range. This is a nice cleanup for the semantic check and general simplification. The grammar is better too with this design. Range sttype is renamed to slice for clarity.	2022-06-25 14:57:40 +01:00
João Valverde	bebf7afa37	dfilter: Remove unused DFVM 4th instruction argument	2022-05-13 14:13:18 +01:00
João Valverde	b602911b31	dfilter: Add support for universal quantifiers Adds the keywords "any" and "all" to implement the quantification to any existing relational operator. Filter: all tcp.port in {100, 2000..3000} Syntax tree: 0 ALL TEST_IN: 1 FIELD(tcp.port) 1 SET(#2): 2 FVALUE(100 <FT_UINT16>) 2 FVALUE(2000 <FT_UINT16>) .. FVALUE(3000 <FT_UINT16>) Instructions: 00000 READ_TREE tcp.port -> reg#0 00001 IF_FALSE_GOTO 5 00002 ALL_EQ reg#0 === 100 <FT_UINT16> 00003 IF_TRUE_GOTO 5 00004 ALL_IN_RANGE reg#0 in { 2000 <FT_UINT16> .. 3000 <FT_UINT16> } 00005 RETURN	2022-05-12 14:26:54 +01:00
João Valverde	4f3f507eee	dfilter: Add syntax to match specific layers in the protocol stack Add support to display filters for matching a specific layer within a frame. Layers are counted sequentially up the protocol stack. Each protocol (dissector) that appears in the stack is one layer. LINK-LAYER#1 <-> IP#1 <-> TCP#1 <-> IP#2 <-> TCP#2 <-> etc. The syntax allows for negative indexes and ranges with the usual semantics for slices (but note that counting starts at one): tcp.port#[2-4] == 1024 Matches layers 2 to 4 inclusive. Fixes #3791.	2022-04-26 16:50:59 +00:00
João Valverde	c0170dad42	dfilter: Rename "range" to "slice" The word range is used for different things with different meanings and that is confusing. Avoid using "range" in code to mean "slice". A range is one or more intervals with a lower and upper bound. A slice is a range applied to a bytes field. Replace range with slice wherever appropriate. This usage of "slice" instead of range is generally correct and consistent in the documentation.	2022-04-26 16:50:59 +00:00
João Valverde	fab32ea0cb	dfilter: Allow arithmetic expressions as function arguments This allows writing moderately complex expressions, for example a float epsilon test (#16483): Filter: {abs(_ws.ftypes.double - 1) / max(abs(_ws.ftypes.double), abs(1))} < 0.01 Syntax tree: 0 TEST_LT: 1 OP_DIVIDE: 2 FUNCTION(abs#1): 3 OP_SUBTRACT: 4 FIELD(_ws.ftypes.double) 4 FVALUE(1 <FT_DOUBLE>) 2 FUNCTION(max#2): 3 FUNCTION(abs#1): 4 FIELD(_ws.ftypes.double) 3 FUNCTION(abs#1): 4 FVALUE(1 <FT_DOUBLE>) 1 FVALUE(0.01 <FT_DOUBLE>) Instructions: 00000 READ_TREE _ws.ftypes.double -> reg#1 00001 IF_FALSE_GOTO 3 00002 SUBRACT reg#1 - 1 <FT_DOUBLE> -> reg#2 00003 STACK_PUSH reg#2 00004 CALL_FUNCTION abs(reg#2) -> reg#0 00005 STACK_POP 1 00006 IF_FALSE_GOTO 24 00007 READ_TREE _ws.ftypes.double -> reg#1 00008 IF_FALSE_GOTO 9 00009 STACK_PUSH reg#1 00010 CALL_FUNCTION abs(reg#1) -> reg#4 00011 STACK_POP 1 00012 IF_FALSE_GOTO 13 00013 STACK_PUSH reg#4 00014 STACK_PUSH 1 <FT_DOUBLE> 00015 CALL_FUNCTION abs(1 <FT_DOUBLE>) -> reg#5 00016 STACK_POP 1 00017 IF_FALSE_GOTO 18 00018 STACK_PUSH reg#5 00019 CALL_FUNCTION max(reg#5, reg#4) -> reg#3 00020 STACK_POP 2 00021 IF_FALSE_GOTO 24 00022 DIVIDE reg#0 / reg#3 -> reg#6 00023 ANY_LT reg#6 < 0.01 <FT_DOUBLE> 00024 RETURN We now use a stack to pass arguments to the function. The stack is implemented as a list of lists (list of registers). Arguments may still be non-existent to functions (this is a feature). Functions must check for nil arguments (NULL lists) and handle that case. It's somewhat complicated to allow literal values and test compatibility for different types, both because of lack of type information with unparsed/literal and also because it is an underdeveloped area in the code. In my limited testing it was good enough and useful, further enhancements are left for future work.	2022-04-18 17:10:31 +01:00
João Valverde	827d143e6e	dfilter: Allow function arguments to be non-existent. Instead of not calling the function if an argument is non-existent (read tree fails), call the function and let the function handle the condition.	2022-04-14 13:07:41 +00:00
João Valverde	cb2f085f14	dfilter: Add max() and min() functions Changes the function calling convention to pass the first register number plus the number of registers after that sequentially. This allows functions with any number of arguments. Functions can still only return one value. Adds max() and min() functions to select the maximum/minimum value from any number of arguments, all of the same type. The functions accept literals too. The return type is the same as the first argument (cannot be a literal).	2022-04-14 13:07:41 +00:00
João Valverde	20afbd46ec	dfilter: Remove existence test syntax tree nodes After some experimentation I don't think these two existence tests belong in the grammar, it's an implementation detail and removing it might avoid some artificial constraints.	2022-04-05 12:04:37 +01:00
João Valverde	fb08c4b4a8	dfilter: Replace bitwise sttype with arithmetic Most of the bitwise codepaths are just duplicating code for the arithmetic type. Parse bitwise expressions as arithmetic instead.	2022-04-05 12:04:37 +01:00
João Valverde	8bc214b5bb	dfilter: Add remaining arithmetic integer ops	2022-03-31 16:49:42 +01:00
João Valverde	2a9cb588aa	dfilter: Add binary arithmetic (add/subtract) Add support for display filter binary addition and subtraction. The grammar is intentionally kept simple for now. The use case is to add a constant to a protocol field, or (maybe) add two fields in an expression. We use signed arithmetic with unsigned numbers, checking for overflow and casting where necessary to do the conversion. We could legitimately opt to use traditional modular arithmetic instead (like C) and if it turns out that that is more useful for some reason we may want to in the future. Fixes #15504.	2022-03-31 11:27:34 +01:00
João Valverde	260942e170	dfilter: Refactor macro tree references This replaces the current macro reference system with a completely different implementation. Instead of a macro, a reference is a syntax element. A reference is a constant that can be filled in the dfilter code after compilation from an existing protocol tree. It is best understood as a field value that can be read from a fixed tree that is not the frame being filtered. Usually this fixed tree is the currently selected frame when the filter is applied. This allows comparing fields in the filtered frame with fields in the selected frame. Because the field reference syntax uses the same sigil notation as a macro we have to use a heuristic to distinguish them: if the name has a dot it is a field reference, otherwise it is a macro name. The reference is syntactically validated at compile time. There are two main advantages to this implementation (and a couple of minor ones): The protocol tree for each selected frame is only walked if we have a display filter and if the display filter uses references. Also only the actual reference values are copied, instead of loading the entire tree into a hash table (in textual form even). The other advantage is that the reference is tested like a protocol field against all the values in the selected frame (if there is more than one). Currently the reference fields are not "primed" during dissection, so the entire tree is walked to find a particular reference (this is similar to the previous implementation). If the display filter contains a valid reference and the reference is not loaded at the time the filter is run the result is the same as a non existing field for a regular READ_TREE instruction. Fixes #17599.	2022-03-29 12:36:31 +00:00
João Valverde	ac0a69636b	dfilter: Add support for unary arithmetic This change implements a unary minus operator. Filter: tcp.window_size_scalefactor == -tcp.dstport Instructions: 00000 READ_TREE tcp.window_size_scalefactor -> reg#0 00001 IF_FALSE_GOTO 6 00002 READ_TREE tcp.dstport -> reg#1 00003 IF_FALSE_GOTO 6 00004 MK_MINUS -reg#1 -> reg#2 00005 ANY_EQ reg#0 == reg#2 00006 RETURN It is supported for integer types, floats and relative time values. The unsigned integer types are promoted to a 32 bit signed integer. Unary plus is implemented as a no-op. The plus sign is simply ignored. Constant arithmetic expressions are computed during compilation. Overflow with constants is a compile time error. Overflow with variables is a run time error and silently ignored. Only a debug message will be printed to the console. Related to #15504.	2022-03-28 11:20:41 +00:00
João Valverde	0335ebdc3a	dfilter: ftype_is_true -> ftype_is_zero	2022-03-23 11:04:41 +00:00
João Valverde	16729be2c1	dfilter: Add bitwise masking of bits Add support for masking of bits. Before the bitwise operator could only test bits, it did not support clearing bits. This allows testing if any combination of bits are set/unset more naturally with a single test. Previously this was only possible by combining several bitwise predicates. Bitwise is implemented as a test node, even though it is not. Maybe the test node should be renamed to something else. Fixes #17246.	2022-03-22 12:58:04 +00:00
João Valverde	54d8627c9a	dfilter: Add more comments to optimization pass	2022-03-21 17:36:41 +00:00
João Valverde	d60f2580ba	dfilter: Pass around constants in instructions The DFVM instruction arguments are generic boxed types but instead of using FVALUE and PCRE types the code passes around REGISTER types instead. Change that to pass constants in the instruction.	2022-03-21 17:09:56 +00:00
João Valverde	94d909103e	dfilter: Remove DFVM constant initialization	2022-03-21 17:09:43 +00:00
João Valverde	ae17e733ac	dfilter: Use more DFVM values in gencode	2022-03-21 17:09:29 +00:00
João Valverde	769f1f10de	dfilter: Add DFVM value constructor	2022-03-21 17:09:19 +00:00
João Valverde	50f04cb9da	dfilter: Remove dead code	2022-03-19 20:10:43 +00:00
João Valverde	588d22a82b	dfilter: Allow variable number of jumps during codegen Use a list to allow a variable number of jumps, instead of a fixed count. The flexibility in the number of jumps a given syntax tree node might need to handle is useful to add new kinds of operations.	2022-03-16 20:12:22 +00:00
João Valverde	8b23dd3a3c	dfilter: Add an "all equal" operator To complete the set of equality operators add an "all equal" operator that matches a frame if all fields match the condition. The symbol chosen for "all_eq" is "===".	2021-12-22 14:32:32 +00:00
João Valverde	274531820a	Move regex code to wsutil	2021-11-14 21:00:59 +00:00
João Valverde	0abe10e040	dfilter: Fix "!=" relation to be free of contradictions Wireshark defines the relation of equality A == B as A any_eq B <=> An == Bn for at least one An, Bn. More accurately I think this is (formally) an equivalence relation, not true equality. Whichever definition for "==" we choose we must keep the definition of "!=" as !(A == B), otherwise it will lead to logical contradictions like (A == B) AND (A != B) being true. Fix the '!=' relation to match the definition of equality: A != B <=> !(A == B) <=> A all_ne B <=> An != Bn, for every n. This has been the recommended way to write "not equal" for a long time in the documentation, even to the point where != was deprecated, but it just wasn't implemented consistently in the language, which has understandably been a persistent source of confusion. Even a field that is normally well-behaved with "!=" like "ip.src" or "ip.dst" will produce unexpected results with encapsulations like IP-over-IP. The opcode ALL_NE could have been implemented in the compiler instead using NOT and ANY_EQ but I chose to implement it in bytecode. It just seemed more elegant and efficient but the difference was not very significant. Keep around "~=" for any_ne relation, in case someone depends on that, and because we don't have an operator for true equality: A strict_equal B <=> A all_eq B <=> !(A any_ne B). If there is only one value then any_ne and all_ne are the same comparison operation. Implementing this change did not require fixing any tests so it is unlikely the relation "~=" (any_ne) will be very useful. Note that the behaviour of the '<' (less than) comparison relation is a separate, more subtle issue. In the general case the definition of '<' that is used is only a partial order.	2021-10-24 06:55:54 +00:00
João Valverde	e8800ff3c4	dfilter: Add a thin encapsulation layer for REs	2021-10-18 12:09:36 +00:00
João Valverde	0e50979b3f	Replace g_assert() with ws_assert()	2021-06-19 01:23:31 +00:00
Guy Harris	b61fd6d76a	dfilter, ftypes: get rid of FT_PCRE. It's not a valid field type, it's only a hack to support regular expression matching in packet-matching expressions. Instead, in the packet-matching code, have a separate syntax tree type for Perl-compatible regular expressions, and a separate instruction to load one into a register, and have the "matching" operator for field types take a GRegex * as the second argument.	2021-03-21 03:27:44 -07:00
Guy Harris	d78d50d5a1	Move some variables inside the block where they're used. They're not used outside a block, so move them inside the block. Also, they're set before they're used, so don't initialize them when they're declared. This should squelch some unreadVariable warnings from cppcheck.	2021-01-20 12:45:46 -08:00
Guy Harris	20800366dd	HTTPS (almost) everywhere. Change all wireshark.org URLs to use https. Fix some broken links while we're at it. Change-Id: I161bf8eeca43b8027605acea666032da86f5ea1c Reviewed-on: https://code.wireshark.org/review/34089 Reviewed-by: Guy Harris <guy@alum.mit.edu>	2019-07-26 18:44:40 +00:00
Peter Wu	eec3ce3bb2	dfilter: fix memory leaks on dfilter compile errors involving a set If a display filter contains a set for the set membership operator and an error occurs, then gen_relation_in() (called via dfw_gencode()) will not take ownership of the set and a memory leak occurs. Fix this by implementing a free callback for STTYPE_SET nodes which frees unclaimed data. Add tests to verify the effectiveness, ASAN no longer complains after this fix. Bug: 15442 Change-Id: If37cf047660464b2d0304748034d0bc22111e5d6 Reviewed-on: https://code.wireshark.org/review/31758 Petri-Dish: Peter Wu <peter@lekensteyn.nl> Tested-by: Petri Dish Buildbot Reviewed-by: Peter Wu <peter@lekensteyn.nl>	2019-01-28 11:09:35 +00:00
Peter Wu	e8e60df4ce	dfilter: fix memory leaks if a dfilter fails to compile A display filter can contain values such as strings, numbers, etc. These are internally stored in a fvalue_t structure. While compiling a display filter, it will store a fvalue_t in a node of type STTYPE_FVALUE. These nodes are created while parsing the dfilter in dfilter_compile(). If the semantic check and conversion (dfw_semcheck()) succeeds, it will transfer the values of the parsed tree to dfw_gencode(). After that, dfwork_free will dispose the tree while a compiled dfilter code remains. When the dfilter code is destroyed, it will free the values too. However, when dfw_semcheck() fails (for example, due to an illegal filter such as "len(badname)==1"), it will skip "dfw_gencode()" and consequently the fvalue data is not transferred nor freed. Fix this by always freeing the data (unless the data was stolen by dfw_gencode()). Fixes a memory leak reported for case_dfunction_string::test_fail_2 which was detected by ASAN. Bug: 15442 Change-Id: I9b1cb613659890c8ddcfa57f11f9d3f61a51a3f9 Reviewed-on: https://code.wireshark.org/review/31757 Petri-Dish: Peter Wu <peter@lekensteyn.nl> Tested-by: Petri Dish Buildbot Reviewed-by: Peter Wu <peter@lekensteyn.nl>	2019-01-28 11:09:17 +00:00
Peter Wu	1ff82572ca	dfilter: add range support to set membership operator ("f in {x .. y}") Allow "tcp.srcport in {1662 1663 1664}" to be abbreviated to "tcp.srcport in {1662 .. 1664}". The range operator is supported for any field value which supports the "<=" and ">=" operators and thus works for integers, IP addresses, etc. The naive mapping "tcp.srcport >= 1662 and tcp.srcport <= 1664" is not used because it does not have the intended effect with fields that have multiple occurrences (e.g. tcp.port). Each condition could be satisfied by another value. Therefore a new DFVM instruction (ANY_IN_RANGE) is added to test the range condition against each individual field value. Bug: 14180 Change-Id: I53c2d0f9bc9d4f0ffaabde9a83442122965c95f7 Reviewed-on: https://code.wireshark.org/review/26945 Petri-Dish: Peter Wu <peter@lekensteyn.nl> Tested-by: Petri Dish Buildbot Reviewed-by: Anders Broman <a.broman58@gmail.com>	2018-04-18 03:47:02 +00:00
Dario Lombardo	566d20f444	dfilter: use g_malloc0 to prevent uninitialized memory to be used. Found by clang. Change-Id: I89497bd0f32c79f82218c6d254a214364c930eb3 Reviewed-on: https://code.wireshark.org/review/25884 Petri-Dish: Dario Lombardo <lomato@gmail.com> Tested-by: Petri Dish Buildbot Reviewed-by: Anders Broman <a.broman58@gmail.com>	2018-02-21 17:14:30 +00:00
Dario Lombardo	55c68ee69c	epan: use SPDX identifiers. Skipping dissectors dir for now. Change-Id: I717b66bfbc7cc81b83f8c2cbc011fcad643796aa Reviewed-on: https://code.wireshark.org/review/25694 Petri-Dish: Dario Lombardo <lomato@gmail.com> Tested-by: Petri Dish Buildbot Reviewed-by: Anders Broman <a.broman58@gmail.com>	2018-02-08 19:29:45 +00:00
Michael Mann	1da1f945e2	Fix checkAPI.pl warnings about printf Many of the complaints from checkAPI.pl for use of printf are when it's embedded in an #ifdef and checkAPI isn't smart enough to figure that out. The other (non-ifdef) use is dumping internal structures (which is a type of debug functionality). Add a "ws_debug_printf" macro for printf to pacify the warnings. Change-Id: I63610e1adbbaf2feffb4ec9d4f817247d833f7fd Reviewed-on: https://code.wireshark.org/review/16623 Reviewed-by: Michael Mann <mmann78@netscape.net> Petri-Dish: Michael Mann <mmann78@netscape.net> Tested-by: Petri Dish Buildbot <buildbot-no-reply@wireshark.org> Reviewed-by: Anders Broman <a.broman58@gmail.com>	2016-07-25 04:26:50 +00:00
Jeffrey Smith	80322d88da	dfilter: Add membership operator Added a new relational test: 'x in {a b c}'. The only LHS entity supported at this time is a field. The generated DFVM operations are equivalent to an OR'ed series of =='s, but with the redundant existence tests removed. Change-Id: Iddc89b81cf7ad6319aef1a2a94f93314cb721a8a Reviewed-on: https://code.wireshark.org/review/10246 Reviewed-by: Hadriel Kaplan <hadrielk@yahoo.com> Petri-Dish: Hadriel Kaplan <hadrielk@yahoo.com> Tested-by: Petri Dish Buildbot <buildbot-no-reply@wireshark.org> Reviewed-by: Michael Mann <mmann78@netscape.net>	2015-09-11 06:31:33 +00:00
Bill Meier	e88a11f5c9	(Trivial) Fix printf-related 'Mismatch on sign' warnings Found by MSVC2013 Code Analysis Change-Id: I58063946dd558e98308c87b36eeac0ddbe1a6e79 Reviewed-on: https://code.wireshark.org/review/7045 Reviewed-by: Bill Meier <wmeier@newsguy.com>	2015-02-09 18:57:14 +00:00
Alexis La Goutte	296591399f	Remove all $Id$ from top of file (Using sed : sed -i '/^ \* \$Id\$/,+1 d') Manually fix some typos (in export_object_dicom.c and crc16-plain.c) Change-Id: I4c1ae68d1c4afeace8cb195b53c715cf9e1227a8 Reviewed-on: https://code.wireshark.org/review/497 Reviewed-by: Anders Broman <a.broman58@gmail.com>	2014-03-04 14:27:33 +00:00
Jakub Zawadzki	9cfac1227d	Replace hfinfo pointer to same_name_prev, with same_name_prev_id. svn path=/trunk/; revision=51175	2013-08-06 20:53:47 +00:00
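
The equality relations described in commit 0abe10e040 (and the "===" operator from 8b23dd3a3c) can be illustrated with a small Python sketch. This is not Wireshark code: the function names mirror the relation names used in the commit message, the field values are hypothetical, and the sketch assumes both operands are present in the frame and that every value of one operand is compared against every value of the other.

```python
from itertools import product

def any_eq(a, b):
    """'==': true if at least one pair of values is equal."""
    return any(x == y for x, y in product(a, b))

def any_ne(a, b):
    """'~=': true if at least one pair of values differs."""
    return any(x != y for x, y in product(a, b))

def all_ne(a, b):
    """'!=': true only if every pair of values differs."""
    return all(x != y for x, y in product(a, b))

def all_eq(a, b):
    """'===': true only if every pair of values is equal."""
    return all(x == y for x, y in product(a, b))

# Hypothetical IP-over-IP frame in which ip.src occurs twice.
ip_src = ["10.0.0.1", "192.168.0.1"]
wanted = ["10.0.0.1"]

print(any_eq(ip_src, wanted))  # True
print(any_ne(ip_src, wanted))  # True: any_ne does not negate "=="
print(all_ne(ip_src, wanted))  # False: all_ne is exactly not any_eq
print(all_eq(ip_src, wanted))  # False
```

With two occurrences of the field, any_eq and any_ne are both true, which is the contradiction the commit describes when "!=" is read as any_ne. Reading "!=" as all_ne keeps it the logical negation of "==" whenever both operands have at least one value.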

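Commit 1ff82572ca explains why a range test needs its own instruction instead of two separate comparisons. A minimal Python sketch of that reasoning follows. The function names and port values are hypothetical, and the code only models the semantics described in the commit message, not the DFVM implementation.

```python
def naive_in_range(values, low, high):
    """Models 'f >= low and f <= high': each comparison may be
    satisfied by a different occurrence of the field."""
    return any(v >= low for v in values) and any(v <= high for v in values)

def any_in_range(values, low, high):
    """Models the per-value test: one single occurrence must
    satisfy both bounds."""
    return any(low <= v <= high for v in values)

# Hypothetical frame: tcp.port matches both source and destination port.
tcp_port = [80, 5000]

print(naive_in_range(tcp_port, 1662, 1664))  # True: 5000 >= 1662 and 80 <= 1664
print(any_in_range(tcp_port, 1662, 1664))    # False: no single port is in range
```

The two functions agree for single-valued fields and diverge only when a field occurs more than once, which is the case the commit singles out.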

80 Commits