Commit Graph

1154 Commits

Author SHA1 Message Date
John Thacker c8f8bc82a7 epan: FT_FRAMENUM strings are special
When creating the text of a custom column, don't call hf_try_val_to_str,
etc. on FT_FRAMENUM fields that have hfinfo->strings.
It refers to the ft_framenum_type.

Prevents crashing on custom columns of FT_FRAMENUM fields of
a type different than FT_FRAMENUM_NONE.
2023-01-16 20:34:19 -05:00
Alexis La Goutte 5766002231 proto(.c): Fix Argument with 'nonnull' attribute passed null 2023-01-13 08:06:02 +00:00
João Valverde 4c9b0d846c CMake: Reverse debug macros
Originally WS_DISABLE_DEBUG was chosen to be
similar to G_DISABLE_ASSERT and NDEBUG.

However generator expressions are essential for modern CMake
but the syntax is weird and having to use negations makes it
ten-fold worse.

Remove the negation. Instead of changing the CMake variable
reverse the macro definition for WS_DISABLE_DEBUG.

The $<CONFIG:cgs> generator expression with multiple config arguments
requires CMake >= 3.19 so we can't use that yet for a further
syntactical simplification.
2023-01-12 00:59:15 +00:00
João Valverde b893616048 proto: Fix validity test for proto names
We want at least one letter.  Because protocol names can contain
dots and hyphens testing for !isdigit is not enough to make it
dissimilar to decimal numeric expressions.
2023-01-03 11:56:21 +00:00
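The point of b893616048 can be illustrated with a minimal sketch. The two helpers below are hypothetical and are not the functions in proto.c; they only contrast the weaker "contains a non-digit" test with the "contains at least one letter" test the message asks for.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical helper, not the Wireshark function: the weaker test,
 * true if any character is not a decimal digit. */
static bool name_has_non_digit(const char *name)
{
    for (const char *p = name; *p != '\0'; p++) {
        if (!isdigit((unsigned char)*p))
            return true;
    }
    return false;
}

/* Hypothetical helper: the stricter test, true only if the name
 * contains at least one ASCII letter. */
static bool name_has_letter(const char *name)
{
    for (const char *p = name; *p != '\0'; p++) {
        if (isalpha((unsigned char)*p))
            return true;
    }
    return false;
}
```

With these definitions, a string such as "1.5" or "-3" passes the non-digit test because of the dot or hyphen, yet still reads like a number; only the letter test rejects it.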
Martin Mathieson 5bbe533244 WIP: Check types for _add_bits_ functions, and ensure no mask 2022-12-27 12:10:03 +00:00
John Thacker f0f72927b4 epan: Allow FT_IPv4, FT_IPv6 custom columns to be resolved or not.
Similar to commit dbb9fe2a37, proto_item_fill_display_label
now uses address_to_display for FT_IPv4, FT_IPv6, and FT_FCWWN,
the other three address types that double as field types and which
have optional name resolution.

Add these to the list of types that, if present in a custom column,
have the GUI enable the checkbox to switch between "resolved" (names)
and not (values).

This allows adding custom columns with these field types with both
resolved and non resolved text. Note that the appropriate Name
Resolution preference settings must be enabled for the type as well.
2022-12-26 16:12:19 +00:00
John Thacker dbb9fe2a37 epan: Allow FT_ETHER custom columns to be resolved or not
Have proto_item_fill_display_label (which is used for custom
columns resolved type and packet diagrams) use address_to_display
for FT_ETHER. This is resolved when name resolution for MAC
Addresses is enabled.

Add FT_ETHER to the list of types that, if present in a custom
column, have the GUI enable the checkbox to switch between "resolved"
and "unresolved" text.

This allows FT_ETHER custom columns to be displayed as either
resolved addresses or unresolved. (Note that to be displayed
as resolved, the column resolved option must be checked and
the name resolution preference enabled.)

Fix #18665
2022-12-17 20:07:45 +00:00
John Thacker 7baa0ca0c4 proto: Custom column concatenation and truncation
Fix some issues regarding custom columns near the maximum size:

Fix a case where, near the column limit, the comma separating values
was not added but the first character of the next field was,
resulting in an invalid field.

Create the "result" and the "expr" (resolved and unresolved) separately
to address an issue where, for multifield custom columns of different
types, the "result" might be truncated without "expr" necessarily
being so. This created problems when concatenating the end of the
result to the expr for certain types later.

Avoid passing a NULL to snprintf for integer columns of BASE_NONE
of unexpected value.

Indicate when the custom column has been truncated, since after
commit e449b560c0 this string value is no longer
used to create the filter and is for display only. Also use
the label truncation function so that truncation is on UTF-8
boundaries.

Fix #17618
2022-12-16 21:08:47 +00:00
João Valverde 32f88ad22c wmem: Remove strbuf max size parameter
This parameter was introduced as a safeguard for bugs
that generate an unbounded string but its utility for
that purpose is doubtful and the way it is being used
creates problems with invalid truncation of UTF-8
strings.

Rename wmem_strbuf_sized_new() with a better name.
2022-12-03 01:54:52 +00:00
Martin Mathieson 709d65883f Fix some cppcheck issues 2022-11-18 10:07:57 +00:00
João Valverde 09718fb9b3 CMake: Move clang warnings
Move clang warnings to normal set. Let the CMake compatibility
check control the warning.

Fix or work-around -Wunreachable warnings in the code.
2022-11-17 01:35:16 +00:00
Dario Lombardo c2b59567d3 tshark: update man to explain why some fields are skipped in elastic-mapping. 2022-11-08 06:24:50 +00:00
John Thacker 3e0ee841b1 epan: Simplify construct_match_selected_string
Since fvalue_to_string_repr does take the field base
as a parameter and that affects the representation,
an existing comment is no longer true, and we can
get rid of a large amount of duplicative special
handling for integer-based types.
2022-11-02 18:16:59 -04:00
John Thacker e449b560c0 epan: Properly generate filter expressions for custom columns
Properly generate filter expressions for custom columns by
using proto_construct_match_selected_string on each value and
then joining them together later instead of trying to split
the column expression value.

This ensures that escaping is done properly for display filter
strings, that commas internal to field values are not confused
with commas between occurrences, that for multifield columns
we can distinguish which field each value matches, etc.

It's not entirely clear whether AND or OR logic is appropriate
for multiple occurrences; currently OR is used.

Bump glib requirement to 2.54 for g_ptr_array_find_with_equal_func
(this doesn't drop support for any major distribution that already
meets our other library requirements, like Qt.)

Fix #18001.
2022-11-02 19:46:11 +00:00
João Valverde 76a6e2a2bf ftypes: Do not sanitize strings for UTF-8 errors
The ftype itself is encoding agnostic. In the case of literal
display filter strings it is possible and legal to contain
invalid UTF-8.

Maybe it shouldn't be but that requires a user-friendly diagnostic
message, not silently sanitizing the string as is done currently
(only a debug message is printed in that case).

Do the debug checks in proto_tree_set_string() instead. That
still detects dissector code that might need fixing, which was
the purpose for this check.

Improve documentation and add admonition for proto_tree_add_string().

Ping #18521.
2022-10-26 16:23:55 +01:00
João Valverde 92e1357bb4 Rename ws_label_strcat() to ws_label_strcpy()
The semantics of ws_label_strcat() are closer to g_strlcpy() so
rename the function to reflect that.
2022-10-26 13:12:35 +01:00
João Valverde 40ec1adfb0 S7Comm: Fix invalid UTF-8 value string chars
Fixes #18533.
2022-10-26 01:42:43 +01:00
João Valverde 603354203b epan/proto: Replace format_text()
The proto.h APIs expect valid UTF-8 so replace uses of format_text()
with a label copy function that just does formatting and does not
check for encoding errors. Avoid multiple levels of temporary
string allocations.

Make sure the copy does not truncate a multibyte character and
produce invalid strings. Add debug checks for UTF-8 encoding errors
instead.

We escape C0 and C1 control codes (because control codes)
and ASCII whitespace (and bell).

Overall the goal is to be more efficient and optimized and help
detect misuse of APIs by passing invalid UTF-8.

Add a unit test for ws_label_strcat.
2022-10-20 20:05:15 +01:00
Brian Sipos 5bb756e203 epan: centralize SDNV processing along other similar varint types
This avoids having general-purpose decoding happening in
non-DLL-exported functions defined in a dissector for #18478,
and removes unused functions and avoids duplicate decoding.
This also removes unnecessary early exit conditions for #18145.
Unit test cases for varint decoding are added to verify this.
2022-10-19 15:27:42 +00:00
João Valverde f2cc1f2382 epan: Add BASE_STR_WSP and use it
This field display type formats the representation string of
FT_STRING by replacing every whitespace character with ' '.

Instead of "A line end\n" it will output "A line end ".

This allows cleaner code using proto_tree_add_item() and
avoids the problematic pattern

  proto_tree_add_string(..., tvb_format_text_wsp(...));

because we only want to affect the way the string value is displayed,
not the actual field value stored.
2022-09-28 19:32:46 +01:00
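A rough sketch of the idea in f2cc1f2382, assuming a hypothetical helper name and signature (this is not the code in proto.c): the representation string is a separate copy in which whitespace becomes ' ', while the stored field value is left alone.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not the Wireshark implementation: build a display
 * copy of `value` in `repr`, replacing each whitespace character with a
 * plain space. `value` itself is never modified. */
static void format_repr_wsp(char *repr, size_t repr_size, const char *value)
{
    size_t i = 0;

    if (repr_size == 0)
        return;

    for (; value[i] != '\0' && i < repr_size - 1; i++)
        repr[i] = isspace((unsigned char)value[i]) ? ' ' : value[i];

    repr[i] = '\0';
}
```

Applied to "A line end\n", the copy ends in a space while the original string still ends in a newline, which is the distinction the message draws between display and value.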
João Valverde 6d06d4e46b Add some UTF-8 debug checks with a compile time flag
Some older dissectors that predate Unicode and parse text protocols
are prone to generate invalid UTF-8 strings. This is a bug and can have
safety implications.

For example passing invalid UTF-8 to proto_tree_add_string() is a
common bug. There are safeguards in format_text() but this should
not be relied on as a general solution to the problem.

For one, as the name implies, it is only used with representation of a
field value, which is not the same as the value itself of an FT_STRING field.
Issue #18317 shows another reason why.

For now this compile flag only enables extra checks for string ftypes,
which covers a subset of proto.h APIs including
proto_tree_append_string(). Later it should be extended to other
interfaces.

This is also not expected to be disabled for release builds because
there are still many dissectors that do not correctly handle strings.
More work is needed to 1) identify them and 2) fix them.

Ping #18317
2022-09-27 17:04:44 +00:00
John Thacker cc61fe9d40 epan: Prevent crash when asserting on unvalidated UTF-8 strings
If UTF-8 validation fails, set the fvalue to a sanitized value so that calls
later to retrieve it don't dereference NULL and crash. We could,
especially for a release, disable the assertion and just sanitize
bad strings.

Related to #18363
2022-09-23 07:34:36 -04:00
John Thacker 30b309d24c proto: Validate add_string values as UTF-8
When a dissector directly adds a string value through
proto_tree_add_string[_format_value], validate that it is
UTF-8 so that only valid UTF-8 strings are used internally,
and written to output (whether text, JSON, or XML.)
(We were treating it as a UTF-8 string anyway, but not
validating it.)

If the string passed in is not UTF-8, that's a dissector bug.
Dissectors that use API functions like tvb_get_string_enc
will always produce valid UTF-8, but some do their own
processing.

Fix #18317
2022-09-21 07:53:01 -04:00
John Thacker a48298a93a proto: Ensure that representation strings are printable, valid UTF-8
The proto_item_XXX_text() routines and proto_tree_add_XXX_format[_value]
functions allow dissectors to alter the representation string for
a protocol tree item with data that may come from arbitrary packet data.
These values are displayed by tshark or wireshark, so they should be made
into printable, valid UTF-8.

This means that dissectors no longer need to call format_text before using
those functions (though, if they want to produce some other kind of
printable string, such as with format_text_wsp, they still can.)

Also, mark when appending and prepending text truncates a string that
was not previously truncated (except for a small number of cases where
it is difficult to determine if it was truncated before.)

Part of #18317
2022-09-11 15:32:03 +00:00
John Thacker e25f0508aa proto: Fix truncation of UTF-8 strings.
It is correct to pass in the memory address immediately past
the end of our buffer, as g_utf8_prev_char() does not dereference
it until after decrementing it once, and we want to find the final
UTF-8 character start. Starting one byte earlier truncates the string
more than necessary.

This effectively reverts 4b6224a673
which noted that Coverity flagged this as a memory access error,
although it is not. This is possibly because it was written as
&label_str[ITEM_LABEL_LENGTH]. All versions of the ISO C standard
starting with C99 have indicated (6.5.3.2) that in such a case
"neither the & operator nor the unary * that is implied by the [] is
evaluated and the result is as if the & operator were removed and the
[] operator were changed to a + operator" and (6.5.6) that referring
to the memory address one past the last element of an array object
"shall not produce an overflow" and is not undefined (so long as it
is not dereferenced.)

However, Coverity may not have been aware of this, so rewrite
the expression using the + operator in the hopes of avoiding
false positive Coverity errors.
2022-09-08 00:56:07 +00:00
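The pointer argument in e25f0508aa can be sketched without GLib. The helpers below are hypothetical stand-ins (not g_utf8_prev_char() and not the proto.c code); they assume a non-empty buffer and show that starting from the one-past-the-end address is safe because the pointer is decremented before it is read.

```c
#include <stddef.h>

/* Hypothetical stand-in for a "previous character" lookup: step back
 * once before reading, then keep stepping over UTF-8 continuation bytes
 * (10xxxxxx). Requires p > start. */
static char *utf8_prev_char_start(char *start, char *p)
{
    do {
        p--;
    } while (p > start && ((unsigned char)*p & 0xC0) == 0x80);
    return p;
}

/* Terminate a completely filled buffer at the start of its last
 * character, so a multibyte sequence is never cut in half.
 * Requires buf_size > 0. */
static void truncate_at_char_boundary(char *buf, size_t buf_size)
{
    /* One past the last element: legal to form, never dereferenced. */
    char *end = buf + buf_size;

    *utf8_prev_char_start(buf, end) = '\0';
}
```

Starting from `buf + buf_size - 1` instead would skip the final byte and, when that byte is itself a character start, drop one more character than necessary.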
Anders Broman a47830e56f Increase number of preallocated fields. 2022-08-16 09:43:42 +02:00
Jaap Keuter 8097d3e4a3 Streamline hfinfo retrieval in proto_tree_add_* functions
Instead of a function call, instantiate the PROTO_REGISTRAR_GET_NTH
macro directly, which contains the subsequent DISSECTOR_ASSERT macro
to test the result anyway.
2022-08-08 17:05:50 +00:00
João Valverde 80f16015e2 epan: Refactor floating point display types
Remove the redundant BASE_FLOAT field display type. The name
BASE_FLOAT is meaningless and the value aliased to BASE_NONE.

Require BASE_NONE instead of BASE_FLOAT (corresponding to
the printf() %g format).

Add new float display types using BASE_DEC, BASE_HEX and BASE_EXP
corresponding to %f, %a and %e respectively.

Add support for BASE_CUSTOM with floats.
2022-08-02 13:16:46 +00:00
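The mapping between display types and printf() conversions described in 80f16015e2 can be sketched as follows. The enum and function are hypothetical illustrations; the real BASE_* constants and formatting code live in Wireshark's headers and proto.c.

```c
#include <stdio.h>

/* Hypothetical enum for illustration only. */
enum float_display {
    DISPLAY_NONE,   /* %g */
    DISPLAY_DEC,    /* %f */
    DISPLAY_HEX,    /* %a */
    DISPLAY_EXP     /* %e */
};

/* Format `v` into `buf` according to the chosen display type and
 * return what snprintf() returns. */
static int format_float(char *buf, size_t size, enum float_display d, double v)
{
    switch (d) {
    case DISPLAY_DEC:
        return snprintf(buf, size, "%f", v);
    case DISPLAY_HEX:
        return snprintf(buf, size, "%a", v);
    case DISPLAY_EXP:
        return snprintf(buf, size, "%e", v);
    case DISPLAY_NONE:
    default:
        return snprintf(buf, size, "%g", v);
    }
}
```

For a value such as 1.5, the %g form is the shortest, %f pads to six decimals, %e adds an exponent, and %a yields a hexadecimal floating-point literal whose exact digits are implementation-defined.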
Guy Harris f15b7b0ccc proto: fix proto_tree_add_bitmask_list_ret_uint64 to always return a value.
A "proto_tree_add..._ret_..." routine *must* return the value through
the pointer, even if no protocol tree is being built, as there's no
guarantee that a protocol tree will be built under all circumstances
(for example, if the dissection is only being done to generate the
column values, no column is a custom column, there are no coloring
rules, etc., so that none of the named field values are of interest, and
the protocol tree isn't going to be displayed, no protocol tree will be
built).

Fixes #18203.
2022-07-15 00:24:58 -07:00
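The contract stated in f15b7b0ccc can be sketched with miniature, hypothetical types (none of these names are Wireshark's): the value is written through the pointer before any early return taken because no tree is being built.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature tree; not Wireshark's proto_tree. */
struct tree {
    int item_count;
};

/* Sketch of a "_ret_" style routine: always hand the decoded value back
 * through `retval`, even when `tree` is NULL. */
static void add_u8_ret(struct tree *tree, const uint8_t *data, size_t offset,
                       uint32_t *retval)
{
    uint32_t value = data[offset];

    if (retval != NULL)
        *retval = value;    /* set before any early return */

    if (tree == NULL)
        return;             /* nothing to display, caller still has the value */

    tree->item_count++;     /* stand-in for adding an item to the tree */
}
```

A caller that uses the returned value to decide how to keep dissecting then behaves the same whether or not a tree exists.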
Joakim Karlsson bf8577b88c pfcp: change to utilize proto_tree_add_bitmask_list 2022-07-14 12:46:09 +00:00
Chuck Craft e12954a637 epan: ws_debug log for heuristic that claims frame (len != 0)
It's possible for a dissector to claim a frame without adding to
the tree or being added to frame.protocols (see !6669)
Log a debug message showing the pinfo layers and the dissector that
claimed the tvb (frame/packet).
2022-07-12 14:15:33 +00:00
John Thacker 02b00a8ee5 epan: Copy multifield custom column undecoded values correctly
When writing a custom column, some field types can't have a resolved
value, and just copy the label from the expression to the value.
Only copy information from the most recent field when doing so,
so that with multifield custom columns the entire unresolved value
doesn't get overwritten with the resolved value (if some fields
have resolved values and some don't.) This also reduces copying
from O(N^2) to O(N).

Fixes the display "unresolved" value for multifield custom columns
that are a mix of field types.
2022-07-08 09:54:54 -04:00
John Thacker dd5e2f3b3f epan: Fix return value of proto_strlcpy when not enough room
proto_strlcpy in normal situations returns the number of bytes
copied (because the return value of g_strlcpy is strlen of the
source buffer). It can copy no more than dest_size - 1, because
dest_size is the size of the buffer, including the null terminator.
(https://docs.gtk.org/glib/func.strlcpy.html)

Returning dest_size can cause offsets to get off by one and reach
the end of the buffer, and can cause subsequent calls to have
buffer overflows. (See #16905 for an example in the comments.)
2022-07-05 22:12:41 +00:00
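The off-by-one described in dd5e2f3b3f can be sketched without GLib. The function below is a hypothetical stand-in, not proto_strlcpy itself: it returns the number of bytes actually written, which can never exceed dest_size - 1.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in, not the Wireshark source. Copies `src` into
 * `dest` (`dest_size` includes room for the NUL terminator) and returns
 * the number of bytes written, excluding the terminator. */
static size_t copy_and_count(char *dest, const char *src, size_t dest_size)
{
    if (dest_size == 0)
        return 0;

    size_t src_len = strlen(src);
    size_t n = src_len < dest_size - 1 ? src_len : dest_size - 1;

    memcpy(dest, src, n);
    dest[n] = '\0';
    return n;   /* at most dest_size - 1, never dest_size */
}
```

A caller that advances an offset by the return value and passes `size - offset` as the remaining room therefore always keeps the offset strictly inside the buffer, so the terminator has a place to go.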
David Perry 88a7bf9db2 Properly free range strings, ext strings, custom base 2022-07-05 20:43:31 +00:00
João Valverde b10db887ce dfilter: Remove unparsed syntax type and RHS literal bias
This removes unparsed name resolution during the semantic
check because it feels like a hack to work around limitations
in the language syntax, that should be solved at the lexical
level instead.

We were interpreting unparsed differently on the LHS and RHS.
Now an unparsed value is always a field if it matches a
registered field name (this matches the implementation in 3.6
and before).

This requires tightening a bit the allowed filter names for
protocols to avoid some common and potentially weird conflicting
cases.

Incidentally this extends set grammar to accept all entities.
That is experimental and may be reverted in the future.
2022-07-02 11:18:20 +01:00
João Valverde 0615ba6317 ftypes: Make accessor functions type safe 2022-06-20 17:29:57 +00:00
João Valverde 51de43cfd2 dfilter: Fix protocol slices with negative indexes
Field infos have a length property that was not stored with the
field value so when using a negative index the end was computed
from the captured length of the frame tvbuff, leading to incorrect
results. The documentation in wireshark-filter(5) describes how
this was supposed to work but as far as I can tell it never worked
properly.

We now store the length and use that (when it is different from -1)
to locate the end of the protocol data in the tvbuff. An extra wrinkle
is that sometimes the length is set after the field value is created.
This is the most common case as the majority of protocols have a
variable length and dissection generally proceeds with a TVB subset from
the current layer (with offset zero) through all remaining layers to the
end of the captured length. For that reason we must use an expedient to allow
changing the protocol length of an existing protocol fvalue, whenever
proto_item_set_len() is called.

Fixes #17772.
2022-05-23 23:04:07 +01:00
John Thacker f2fb1662b2 proto: Handle BASE_SPECIAL_VALS in add_bitmask_ title
Respect BASE_SPECIAL_VALS when adding to the title item of an
item added with the proto_tree_add_bitmask* functions.

Note that the documentation for the BMT_NO_INT flag has always
said that "only boolean flags are added to the title" and that
no integer based items are added, but the actual behavior has been
to add integer items with custom format functions and value strings.
2022-05-15 09:59:52 -04:00
John Thacker 1e7a600680 proto: Fix display of BASE_UNIT_STRING for 64 bit fields in bitmask
When integer fields are displayed in the bitmask header item in
proto_tree_add_bitmask_tree and hf->strings is set, only the string
from the value_string is used, not the integer value, to save space.

However, that means that BASE_UNIT_STRING fields have to be treated
differently from all the other fields with hf->strings set. If not,
then only the units are displayed instead of the number with the units.

Fields based on 32 bit integers were already being handled correctly.
Use that same logic for fields based on 64 bit integers.
(See commit 24d991dab4 for something similar.)
2022-05-14 15:14:22 -04:00
John Thacker a98391e316 proto: Fix reversed test for signed ints with unit strings
In proto_item_add_bitmask_tree, on the signed integer path, the
test for if the display uses a unit string is clearly reversed,
calling it only if BASE_UNIT_STRING is unset. Use the correct
test from the unsigned integer path.
2022-05-14 09:26:20 -04:00
John Thacker 8a872d6142 proto: Add support for BASE_SPECIAL_VALS to fields with bitmasks
Add support for BASE_SPECIAL_VALS to fill_label_bitfield[64], for
fields with a nonzero bitmask, using the same logic as
fill_label_number[64].

There's at least one dissector (packet-ipmi-se.c) that was trying
to use this already, but silently had no effect.
2022-05-13 21:02:54 -04:00
João Valverde b602911b31 dfilter: Add support for universal quantifiers
Adds the keywords "any" and "all" to implement the quantification
to any existing relational operator.

Filter: all tcp.port in {100, 2000..3000}

Syntax tree:
 0 ALL TEST_IN:
   1 FIELD(tcp.port)
   1 SET(#2):
     2 FVALUE(100 <FT_UINT16>)
     2 FVALUE(2000 <FT_UINT16>) .. FVALUE(3000 <FT_UINT16>)

Instructions:
00000 READ_TREE		tcp.port -> reg#0
00001 IF_FALSE_GOTO	5
00002 ALL_EQ		reg#0 === 100 <FT_UINT16>
00003 IF_TRUE_GOTO	5
00004 ALL_IN_RANGE	reg#0 in { 2000 <FT_UINT16> .. 3000 <FT_UINT16> }
00005 RETURN
2022-05-12 14:26:54 +01:00
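The difference between the two quantifiers in b602911b31 can be sketched for a single equality test over the values of a multi-valued field. The helpers are hypothetical and are not the display filter VM; the choice to return false when the field is absent is an assumption modeled on the IF_FALSE_GOTO that follows READ_TREE in the listing above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: "any field == x" is true if at least one
 * occurrence of the field matches. */
static bool any_eq(const uint16_t *values, size_t n, uint16_t x)
{
    for (size_t i = 0; i < n; i++) {
        if (values[i] == x)
            return true;
    }
    return false;
}

/* Hypothetical helper: "all field == x" is true only if every
 * occurrence matches. An absent field (n == 0) yields false here,
 * which is an assumption based on the instruction listing. */
static bool all_eq(const uint16_t *values, size_t n, uint16_t x)
{
    if (n == 0)
        return false;

    for (size_t i = 0; i < n; i++) {
        if (values[i] != x)
            return false;
    }
    return true;
}
```

For a TCP segment with source port 80 and destination port 443, the "any" form matches port 80 while the "all" form does not.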
João Valverde d517feee74 epan: Add more bookkeeping for layers
Packet info already contains the notion of layer depth for the
current protocol, among all the protocols in the frame. This
adds an extra layer number for the protocols that are the same
as the current one. Obviously this will only go above one if
the protocol is repeated in the stack, such as with IP tunneling.

Adds extra logic to track numbers for each protocol in the frame
and update them when calling a dissector.

The total layer number and protocol layer number are stored in
the field info structure so they can be used after dissection,
namely by display filters.
2022-04-26 16:50:59 +00:00
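The extra per-protocol numbering in d517feee74 can be sketched as a simple counter per protocol. The function, the small integer protocol IDs, and the fixed table size are all hypothetical; the real bookkeeping lives in the packet info and field info structures.

```c
#include <stddef.h>

/* Hypothetical upper bound on protocol IDs for this sketch. */
#define MAX_PROTO_IDS 8

/* For each layer in the frame (given as a protocol ID in stack order),
 * record how many times that same protocol has appeared so far.
 * Every ID must be in the range 0 .. MAX_PROTO_IDS - 1. */
static void number_proto_layers(const int *proto_ids, size_t n,
                                int *proto_layer_num)
{
    int seen[MAX_PROTO_IDS] = {0};

    for (size_t i = 0; i < n; i++)
        proto_layer_num[i] = ++seen[proto_ids[i]];
}
```

For a stack such as Ethernet, IP, IP, UDP (an IP-in-IP tunnel), the overall depth runs 1 to 4 while the per-protocol number is 1, 1, 2, 1, so only the inner IP header gets a value above one.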
John Thacker 7a97a1dc22 epan: Add comments about _get_parent, _set_len and faked items
If we're faking items, then proto_[item|tree]_get_parent[_nth] return
the parent of the faked item, which may not be what we want. We have
no way of knowing if the logical item meant was the faked item itself
or one of its children that share the same proto_item when faked.

Thus we don't know if we should return the proto_item itself or its
parent when called on a possibly faked item. Most of the time we will
be adding new items to what we return here, which means not faking items
that could be faked (since we might be returning the root node, which
doesn't have a field_info), hurting performance (see #8069).

It can also have some unusual effects on the protocol hierarchy stats,
particularly if we change things so that non-visible items can change
their length, which has a similar issue. (#17877)
2022-04-20 21:30:34 +00:00
Chuck Craft 4e0cd3dbd2 epan: add ENC_TIME_USECS timestamp encoding
Needed to format timestamp in #18038 - packet-cql.c
Mirrors changes made in !1924 - Add ENC_TIME_NSECS timestamp encoding
Documentation in README.dissector, proto.c, proto.h - could use
refresh in a different merge request.
2022-04-14 15:18:30 +00:00
João Valverde 260942e170 dfilter: Refactor macro tree references
This replaces the current macro reference system with
a completely different implementation. Instead of a macro a reference
is a syntax element. A reference is a constant that can be filled
in the dfilter code after compilation from an existing protocol tree.
It is best understood as a field value that can be read from a fixed
tree that is not the frame being filtered. Usually this fixed tree
is the currently selected frame when the filter is applied. This
allows comparing fields in the filtered frame with fields in the
selected frame.

Because the field reference syntax uses the same sigil notation
as a macro we have to use a heuristic to distinguish them:
if the name has a dot it is a field reference, otherwise
it is a macro name.

The reference is syntactically validated at compile time.

There are two main advantages to this implementation (and a couple of
minor ones):

The protocol tree for each selected frame is only walked if we have a
display filter and if the display filter uses references. Also only the
actual reference values are copied, instead of loading the entire tree
into a hash table (in textual form even).

The other advantage is that the reference is tested like a protocol
field against all the values in the selected frame (if there is more
than one).

Currently the reference fields are not "primed" during dissection, so
the entire tree is walked to find a particular reference (this is
similar to the previous implementation).

If the display filter contains a valid reference and the reference is
not loaded at the time the filter is run the result is the same as a
non existing field for a regular READ_TREE instruction.

Fixes #17599.
2022-03-29 12:36:31 +00:00
João Valverde b9b45a4a8f dfilter: Add ftypes pseudofields
This adds a _ws.ftypes namespace with protocol fields with all
the existing field types.

Currently this is only useful to debug the display filter compiler,
without having to find a real protocol field with the desired type.

Later it may find other uses.
2022-03-28 15:42:32 +01:00
John Thacker d7f3612613 proto: Fix comment on NTP Era 1 Epoch
NTP Era 1 begins on 7 February 2036, 06:28:16 UTC, exactly when
the 64 bit fixed point timestamp rolls over. See RFC 4330/5905 (and
the correct comments later in get_time_value). Fix the comment where
the constant is defined (the value is already correct, however.)
2022-03-25 17:16:54 -04:00
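The date in d7f3612613 can be checked with a little arithmetic. The helper below is a hypothetical illustration rather than the constant defined in proto.c; it relies only on the well-known offset of 2208988800 seconds between 1900-01-01 and 1970-01-01 and on the 32-bit seconds field rolling over every 2^32 seconds.

```c
#include <stdint.h>

/* Seconds between the NTP prime epoch (1900-01-01 00:00:00 UTC) and the
 * Unix epoch (1970-01-01 00:00:00 UTC). */
#define NTP_TO_UNIX_OFFSET UINT64_C(2208988800)

/* Hypothetical helper: Unix time at which NTP era `era` begins. Each era
 * spans 2^32 seconds, the range of the 32-bit seconds field. */
static int64_t ntp_era_start_unix(unsigned era)
{
    return (int64_t)((uint64_t)era << 32) - (int64_t)NTP_TO_UNIX_OFFSET;
}
```

For era 1 this gives 4294967296 - 2208988800 = 2085978496, which is day 24143 after the Unix epoch plus 23296 seconds, that is 7 February 2036 at 06:28:16 UTC.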
Dario Lombardo 9012722f9b elastic: fix mapping with recent es versions. 2022-03-14 08:34:48 +00:00
Gerald Combs 8575914213 epan: Make sure we always set our return values.
Make sure we always set a return value in our various
proto_tree_add_item_ret_* routines. Fixes #17994.
2022-03-12 01:52:56 +00:00