A "proto_tree_add..._ret_..." routine *must* return the value through
the pointer, even if no protocol tree is being built, as there's no
guarantee that a protocol tree will be built under all circumstances
(for example, if the dissection is only being done to generate the
column values, no column is a custom column, there are no coloring
rules, etc., so that none of the named field values are of interest, and
the protocol tree isn't going to be displayed, no protocol tree will be
built).
Fixes#18203.
It's possible for a dissector to claim a frame without adding to
the tree or being added to frame.protocols (see !6669)
Log a debug message showing the pinfo layers and the dissector that
claimed the tvb (frame/packet).
When writing a custom column, some field types can't have a resolved
value, and just copy the label from the expression to the value.
Only copy information from the most recent field when doing so,
so that with multifield custom columns the entire unresolved value
doesn't get overwritten with the resolved value (if some fields
have resolved values and some don't.) This also reduces copying
from O(N^2) to O(N).
Fixes the display "unresolved" value for multifield custom columns
that are a mix of field types.
proto_strlcpy in normal situations returns the number of bytes
copied (because the return value of g_strlcpy is strlen of the
source buffer). It can copy no more than dest_size - 1, because
dest_size is the size of the buffer, including the null terminator.
(https://docs.gtk.org/glib/func.strlcpy.html)
Returning dest_size can cause offsets to get off by one and reach
the end of the buffer, and can cause subsequent calls to have
buffer overflows. (See #16905 for an example in the comments.)
This removes unparsed name resolution during the semantic
check because it feels like a hack to work around limitations
in the language syntax, that should be solved at the lexical
level instead.
We were interpreting unparsed differently on the LHS and RHS.
Now an unparsed value is always a field if it matches a
registered field name (this matches the implementation in 3.6
and before).
This requires tightening a bit the allowed filter names for
protocols to avoid some common and potentially weird conflicting
cases.
Incidentally this extends set grammar to accept all entities.
That is experimental and may be reverted in the future.
Field infos have a length property that was not stored with the
field value so when using a negative index the end was computed
from the captured length of the frame tvbuff, leading to incorrect
results. The documentation in wireshark-filter(5) describes how
this was supposed to work but as far as I can tell it never worked
properly.
We now store the length and use that (when it is different from -1)
to locate the end of the protocol data in the tvbuff. An extra wrinkle
is that sometimes the length is set after the field value is created.
This is the most common case as the majority of protocols have a
variable length and dissection generally proceeds with a TVB subset from
the current layer (with offset zero) through all remaining layers to the
end of the captured length. For that reason we must use an expedient to allow
changing the protocol length of an existing protocol fvalue, whenever
proto_item_set_len() is called.
Fixes#17772.
Respect BASE_SPECIAL_VALS when adding to the title item of an
item added with the proto_tree_add_bitmask* functions.
Note that the documentation for the BMT_NO_INT flag has always
said that "only boolean flags are added to the title" and that
no integer based items are added, but the actual behavior has been
to add integer items with custom format functions and value strings.
When integer fields are displayed in the bitmask header item in
proto_tree_add_bitmask_tree and hf->strings is set, only the string
from the value_string is used, not the integer value, to save space.
However, that means that BASE_UNIT_STRING fields have to be treated
differently from all the other fields with hf->strings set. If not,
then only the units are displayed instead of the number with the units.
Fields based on 32 bit integers were already being handled correctly.
Use that same logic for fields based on 64 bit integers.
(See commit 24d991dab4 for something similar.)
In proto_item_add_bitmask_tree, on the signed integer path, the
test for if the display uses a unit string is clearly reversed,
calling it only if BASE_UNIT_STRING is unset. Use the correct
test from the unsigned integer path.
Add support for BASE_SPECIAL_VALS to fill_label_bitfield[64], for
fields with a nonzero bitmask, using the same logic as
fill_label_number[64].
There's at least one dissector (packet-ipmi-se.c) that was trying
to use this already, but silently had no effect.
Packet info already contains the notion of layer depth for the
current protocol, among all the protocols in the frame. This
adds an extra layer number for the protocols that are the same
as the current one. Obviously this will only go above one if
the protocol is repeated in the stack, such as with IP tunneling.
Adds extra logic to track numbers for each protocol in the frame
and update them when calling a dissector.
The total layer number and protocol layer number are store in
the field info structure so they can be used after dissection,
namely by display filters.
If we're faking items, then proto_[item|tree]_get_parent[_nth] return
the parent of the faked item, which may not be what we want. We have
no way of knowing if the logical item meant was the faked item itself
or one of its children that share the same proto_item when faked.
Thus we don't know if we should return the proto_item itself or its
parent when called on a possibly faked item. Most of the time we will
be adding new items to what we return here, which means not faking items
that could be faked (since we might be returning the root node, which
doesn't have a field_info), hurting performance (see #8069).
It can also have some unusual effects on the protocol hierarchy stats,
particularly if we change things so that non-visible items can change
their length, which has a similar issue. (#17877)
Needed to format timestamp in #18038 - packet-cql.c
Mirrors changes made in !1924 - Add ENC_TIME_NSECS timestamp encoding
Documentation in README.dissector, proto.c, proto.h - could use
refresh in a different merge request.
This replaces the current macro reference system with
a completely different implementation. Instead of a macro a reference
is a syntax element. A reference is a constant that can be filled
in the dfilter code after compilation from an existing protocol tree.
It is best understood as a field value that can be read from a fixed
tree that is not the frame being filtered. Usually this fixed tree
is the currently selected frame when the filter is applied. This
allows comparing fields in the filtered frame with fields in the
selected frame.
Because the field reference syntax uses the same sigil notation
as a macro we have to use a heuristic to distinguish them:
if the name has a dot it is a field reference, otherwise
it is a macro name.
The reference is synctatically validated at compile time.
There are two main advantages to this implementation (and a couple of
minor ones):
The protocol tree for each selected frame is only walked if we have a
display filter and if the display filter uses references. Also only the
actual reference values are copied, intead of loading the entire tree
into a hash table (in textual form even).
The other advantage is that the reference is tested like a protocol
field against all the values in the selected frame (if there is more
than one).
Currently the reference fields are not "primed" during dissection, so
the entire tree is walked to find a particular reference (this is
similar to the previous implementation).
If the display filter contains a valid reference and the reference is
not loaded at the time the filter is run the result is the same as a
non existing field for a regular READ_TREE instruction.
Fixes#17599.
This adds a _ws.ftypes namespace with protocol fields with all
the existing field types.
Currently this is only useful to debug the display filter compiler,
without having to find a real protocol field with the desired type.
Later it may find other uses.
NTP Era 1 begins on 7 February 2036, 06:28:16 UTC, exactly when
the 64 bit fixed point timestamp rolls over. See RFC 4330/5905 (and
the correct comments later in get_time_value). Fix the comment where
the constant is defined (the value is already correct, however.)
Return NULL when an item with length zero or lower -1 is added to
the tree.
With this the calling dissector doesn't have to check the length and
there is no Dissector bug reported.
Related to #17890
Add comments to indicate what types of display information various field
types are allowed.
Make the error messages for fields that only allow some particular
display information types specific to those types, rather than saying
"no field information allowed". This also gets rid of some
fallthroughs, one of which allows BASE_PROTOCOL_INFO for floating-point
types, which makes no sense.
Add BASE_SHOW_UTF_8_PRINTABLE and related function tvb_utf_8_isprint
for supporting fields of bytes that are "maybe UTF-8" (default or
SHOULD be UTF-8 but could be something else, with no encoding indicator),
such as SSID fields in IEEE 802.11 (See #16208), certain OctetString
fields in Diameter or PFCP, and other places where
BASE_SHOW_ASCII_PRINTABLE is currently used. Fix#5307
Report all registration errors with REPORT_DISSECTOR_BUG().
In the workers for register_all_protocols() and
register_all_protocol_handlers(), use TRY/CATCH/ENDTRY to catch
DissectorError exceptions thrown by REPORT_DISSECTOR_BUG() when
registering dissectors. Return the error message from the main thread
routine and, when joining the worker thread, if there's an error message
returned, throw it in the current thread, so that it gets caught by the
main libwireshark initialization code.
Fixes the crash in #17856.
This makes it easier to understand the code, avoids conflicts
and ugly and unnecessary casts.
The field display enum has evolved over time from integer types
to a type generic parameter.
Repeated words were found with:
egrep "(\b[a-zA-Z]+) +\1\b" . -Ir
and then manually reviewed.
Non-displayed strings (e.g., in comments)
were also corrected, to ease future review.
Replace:
g_snprintf() -> snprintf()
g_vsnprintf() -> vsnprintf()
g_strdup_printf() -> ws_strdup_printf()
g_strdup_vprintf() -> ws_strdup_vprintf()
This is more portable, user-friendly and faster on platforms
where GLib does not like the native I/O.
Adjust the format string to use macros from intypes.h.
These display bases work to replace unprintable characters so the
name is a misnomer. In addition they are the same option and this
display behaviour is not something that is configurable.
This does not affect encodings because all our internal text strings
need to be valid UTF-8 and the source encoding is specified using
ENC_*.
Remove the assertion for valid UTF-8 in proto.c because
tvb_get_*_string() must return a valid UTF-8 string, always, and we
don't need to assert that, it is expensive.
The header ftypes-int.h should not be used outside of epan/ftypes
because it is a private header.
The functions fvalue_free() and fvalue_cleanup() need not and should
not be macros either.
Registering a preference module for a protocol filter name with
upper case letters aborts the program. Relax this restriction to
conform with the rules for protocols. The recommendation is still
to use all lower-case letters.
Fixes 070aeddf76.
Unlike other header fields in filter expressions protocol names
cannot contain upper-case letters. Remove that restriction. This
should make start-up slightly faster as it remove an extra loop
for each protocol filter name.
This was added in 9ead15a6eb but
I don't see a reason to have different rules for protocols and
fields, it seems the README.developer was just being vague and
conflating PROTOABBREV with PROTOFILTERNAME.
The recommendation for lower case is a style recommendation,
and it's a good one, but it should be applied uniformly. As
long as we are not enforcing this for all field filter values
there is no point in enforcing it just for protocol names and
actually it is detrimental, e.g:
hi2operations
HI2Operations.IRIsContent
HI2Operations.UUS1_Content_element
HI2Operations.iRIContent
HI2Operations.iRISequence
HI2Operations.IRIContent
HI2Operations.iRI_Begin_record_element
HI2Operations.iRI_End_record_element
HI2Operations.iRI_Continue_record_element
HI2Operations.iRI_Report_record_element
(...)
It's weird and unexpected to have this difference and there is
no technical reason to require it. What we should probably do
is not include the protocol name in the FIELDFILTERNAME and
have the registration mechanism append it to the PROTOFILTERNAME.
Also disallow leading '-' everywhere in filter names, not just
protocol filter names. It's a universal requirement.
Wireshark defines the relation of equality A == B as
A any_eq B <=> An == Bn for at least one An, Bn.
More accurately I think this is (formally) an equivalence
relation, not true equality.
Whichever definition for "==" we choose we must keep the
definition of "!=" as !(A == B), otherwise it will
lead to logical contradictions like (A == B) AND (A != B)
being true.
Fix the '!=' relation to match the definition of equality:
A != B <=> !(A == B) <=> A all_ne B <=> An != Bn, for
every n.
This has been the recomended way to write "not equal" for a
long time in the documentation, even to the point where != was
deprecated, but it just wasn't implemented consistently in the
language, which has understandably been a persistent source
of confusion. Even a field that is normally well-behaved
with "!=" like "ip.src" or "ip.dst" will produce unexpected
results with encapsulations like IP-over-IP.
The opcode ALL_NE could have been implemented in the compiler
instead using NOT and ANY_EQ but I chose to implement it in
bytecode. It just seemed more elegant and efficient
but the difference was not very significant.
Keep around "~=" for any_ne relation, in case someone depends
on that, and because we don't have an operator for true equality:
A strict_equal B <=> A all_eq B <=> !(A any_ne B).
If there is only one value then any_ne and all_ne are the same
comparison operation.
Implementing this change did not require fixing any tests so it
is unlikely the relation "~=" (any_ne) will be very useful.
Note that the behaviour of the '<' (less than) comparison relation
is a separate, more subtle issue. In the general case the definition
of '<' that is used is only a partial order.