Properly generate filter expressions for custom columns by
using proto_construct_match_selected_string on each value and
then joining them together later instead of trying to split
the column expression value.
This ensures that escaping is done properly for display filter
strings, that commas internal to field values are not confused
with commas between occurrences, that for multifield columns
we can distinguish which field each value matches, etc.
It's not entirely clear whether AND or OR logic is appropriate
for multiple occurrences; currently OR is used.
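
The joining step described above can be sketched as follows. This is only an illustration, not the code in the commit: join_filters_or is a hypothetical helper, and it assumes each per-occurrence expression has already been produced (and escaped) by something like proto_construct_match_selected_string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: join n already-escaped per-occurrence filter
 * expressions with " or ", wrapping each one in parentheses.
 * Returns 0 on success, -1 if the result does not fit in out. */
static int join_filters_or(const char *const *exprs, size_t n,
                           char *out, size_t out_len)
{
    size_t used = 0;

    assert(out != NULL && out_len > 0);
    out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        const char *sep = (i == 0) ? "" : " or ";
        /* +2 for the surrounding parentheses */
        size_t need = strlen(sep) + strlen(exprs[i]) + 2;

        if (need >= out_len - used)
            return -1; /* no room for the text plus the terminating NUL */
        used += (size_t)snprintf(out + used, out_len - used,
                                 "%s(%s)", sep, exprs[i]);
    }
    return 0;
}
```

Because each value is a complete expression before joining, a comma inside a value (for example "a,b") never has to be told apart from a separator between occurrences.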
Bump glib requirement to 2.54 for g_ptr_array_find_with_equal_func
(this doesn't drop support for any major distribution that already
meets our other library requirements, like Qt.)
Fix #18001.
Since cbd3c447 ("ftypes: Add FT_UINT_STRING to IS_FT_STRING() macro")
proto_tree_add_string() accepts FT_UINT_STRING, but the API check still
fails. Update the API check to reflect that change.
The ftype itself is encoding agnostic. In the case of literal
display filter strings it is possible and legal to contain
invalid UTF-8.
Maybe it shouldn't be but that requires a user-friendly diagnostic
message, not silently sanitizing the string as is done currently
(only a debug message is printed in that case).
Do the debug checks in proto_tree_set_string() instead. That
still detects dissector code that might need fixing, which was
the purpose for this check.
Improve documentation and add admonition for proto_tree_add_string().
Ping #18521.
This avoids having general-purpose decoding happening in
non-DLL-exported functions defined in a dissector for #18478,
and removes unused functions and avoids duplicate decoding.
This also removes unnecessary early exit conditions for #18145.
Unit test cases for varint decoding are added to verify this.
This field display type formats the representation string of
FT_STRING by replacing every whitespace character with ' '.
Instead of "A line end\n" it will output "A line end ".
This allows cleaner code using proto_tree_add_item() and
avoids the problematic pattern
proto_tree_add_string(..., tvb_format_text_wsp(...));
because we only want to affect the way the string value is displayed,
not the actual field value stored.
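
A minimal sketch of the idea, assuming a standalone helper rather than the real display code: format_wsp_for_display is a hypothetical name, and the point is only that the label is rewritten while the stored value is left alone.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: copy value into label, replacing every whitespace
 * character (as classified by isspace) with a plain ' '. The input string
 * is never modified; output is truncated to fit label_len. */
static void format_wsp_for_display(const char *value, char *label,
                                   size_t label_len)
{
    size_t i = 0;

    assert(value != NULL && label != NULL && label_len > 0);
    for (; value[i] != '\0' && i + 1 < label_len; i++)
        label[i] = isspace((unsigned char)value[i]) ? ' ' : value[i];
    label[i] = '\0';
}
```

With the value "A line end\n" the label becomes "A line end " while the value itself still ends in a newline.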
When a dissector directly adds a string value through
proto_tree_add_string[_format_value], validate that it is
UTF-8 so that only valid UTF-8 strings are used internally,
and written to output (whether text, JSON, or XML.)
(We were treating it as a UTF-8 string anyway, but not
validating it.)
If the string passed in is not UTF-8, that's a dissector bug.
Dissectors that use API functions like tvb_get_string_enc
will always produce valid UTF-8, but some do their own
processing.
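
For illustration, a self-contained UTF-8 validity check might look like the sketch below. It is not the check used in proto.c (which may well rely on a library routine); is_valid_utf8 is a hypothetical name, and the rules applied are the usual ones: no overlong forms, no surrogates, nothing above U+10FFFF.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical standalone validator: returns true if the len bytes at s
 * form well-formed UTF-8. */
static bool is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    assert(s != NULL || len == 0);
    while (i < len) {
        unsigned char c = s[i];
        size_t n;              /* number of continuation bytes expected */
        unsigned long cp, min; /* decoded code point, smallest legal value */

        if (c < 0x80) { i++; continue; }
        else if ((c & 0xE0) == 0xC0) { n = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { n = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { n = 3; cp = c & 0x07; min = 0x10000; }
        else return false;     /* stray continuation byte or invalid lead */

        if (len - i <= n)
            return false;      /* sequence is truncated */
        for (size_t k = 1; k <= n; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return false;  /* not a continuation byte */
            cp = (cp << 6) | (s[i + k] & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;      /* overlong, out of range, or surrogate */
        i += n + 1;
    }
    return true;
}
```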
Fix #18317
Remove the redundant BASE_FLOAT field display type. The name
BASE_FLOAT is meaningless and the value aliased to BASE_NONE.
Require BASE_NONE instead of BASE_FLOAT (corresponding to
the printf() %g format).
Add new float display types using BASE_DEC, BASE_HEX and BASE_EXP
corresponding to %f, %a and %e respectively.
Add support for BASE_CUSTOM with floats.
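
The mapping to printf() conversions can be sketched as below. The enum and function names are hypothetical stand-ins (the real code uses the BASE_* display values), and BASE_CUSTOM is not covered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for BASE_NONE, BASE_DEC, BASE_HEX and BASE_EXP
 * when applied to a floating-point field. */
enum float_display { FLOAT_NONE, FLOAT_DEC, FLOAT_HEX, FLOAT_EXP };

/* Format v according to the display type; returns what snprintf returns. */
static int format_float(enum float_display display, double v,
                        char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    switch (display) {
    case FLOAT_DEC: return snprintf(buf, len, "%f", v);
    case FLOAT_HEX: return snprintf(buf, len, "%a", v);
    case FLOAT_EXP: return snprintf(buf, len, "%e", v);
    case FLOAT_NONE:
    default:        return snprintf(buf, len, "%g", v);
    }
}
```

For 1.5 this gives "1.5" for %g, "1.500000" for %f and "1.500000e+00" for %e on a typical C library; the exact %a text is implementation-defined beyond its "0x" prefix.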
It's possible for a dissector to claim a frame without adding to
the tree or being added to frame.protocols (see !6669)
Log a debug message showing the pinfo layers and the dissector that
claimed the tvb (frame/packet).
Port the script that creates init.lua to Python3. The generated init.lua
removes one newline and adds another, otherwise the output is identical
to the Perl version.
Ping #18152.
Packet info already contains the notion of layer depth for the
current protocol, among all the protocols in the frame. This
adds an extra layer number for the protocols that are the same
as the current one. Obviously this will only go above one if
the protocol is repeated in the stack, such as with IP tunneling.
Adds extra logic to track numbers for each protocol in the frame
and update them when calling a dissector.
The total layer number and protocol layer number are stored in
the field info structure so they can be used after dissection,
namely by display filters.
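
As a rough sketch of the two numbers, assuming both start at 1 and using hypothetical names (layer_info, number_layers) rather than the packet info or field info structures:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical record of the two numbers kept for each layer. */
struct layer_info {
    int layer_num;       /* position among all protocols in the frame */
    int proto_layer_num; /* position among layers of the same protocol */
};

/* Fill out[0..n-1] for a protocol stack given as n names, outermost first.
 * Assumes both numberings are 1-based. */
static void number_layers(const char *const *protos, int n,
                          struct layer_info *out)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        int same = 1;

        /* Count earlier layers that use the same protocol. */
        for (int j = 0; j < i; j++)
            if (strcmp(protos[j], protos[i]) == 0)
                same++;
        out[i].layer_num = i + 1;
        out[i].proto_layer_num = same;
    }
}
```

For a stack such as eth, ip, gre, ip, tcp the second ip layer gets layer number 4 and protocol layer number 2; every other layer has protocol layer number 1.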
Needed to format timestamp in #18038 - packet-cql.c
Mirrors changes made in !1924 - Add ENC_TIME_NSECS timestamp encoding
Documentation in README.dissector, proto.c, proto.h - could use
a refresh in a different merge request.
This replaces the current macro reference system with
a completely different implementation. Instead of a macro a reference
is a syntax element. A reference is a constant that can be filled
in the dfilter code after compilation from an existing protocol tree.
It is best understood as a field value that can be read from a fixed
tree that is not the frame being filtered. Usually this fixed tree
is the currently selected frame when the filter is applied. This
allows comparing fields in the filtered frame with fields in the
selected frame.
Because the field reference syntax uses the same sigil notation
as a macro we have to use a heuristic to distinguish them:
if the name has a dot it is a field reference, otherwise
it is a macro name.
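
The heuristic itself is small enough to sketch directly. The function name is hypothetical, and it is assumed that the sigil and any delimiters have already been stripped so that only the bare name is passed in.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: a name containing a dot is treated as a field
 * reference, anything else as a macro name. */
static bool is_field_reference(const char *name)
{
    assert(name != NULL);
    return strchr(name, '.') != NULL;
}
```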
The reference is syntactically validated at compile time.
There are two main advantages to this implementation (and a couple of
minor ones):
The protocol tree for each selected frame is only walked if we have a
display filter and if the display filter uses references. Also only the
actual reference values are copied, instead of loading the entire tree
into a hash table (in textual form even).
The other advantage is that the reference is tested like a protocol
field against all the values in the selected frame (if there is more
than one).
Currently the reference fields are not "primed" during dissection, so
the entire tree is walked to find a particular reference (this is
similar to the previous implementation).
If the display filter contains a valid reference and the reference is
not loaded at the time the filter is run the result is the same as a
non existing field for a regular READ_TREE instruction.
Fixes #17599.
Add BASE_SHOW_UTF_8_PRINTABLE and related function tvb_utf_8_isprint
for supporting fields of bytes that are "maybe UTF-8" (default or
SHOULD be UTF-8 but could be something else, with no encoding indicator),
such as SSID fields in IEEE 802.11 (See #16208), certain OctetString
fields in Diameter or PFCP, and other places where
BASE_SHOW_ASCII_PRINTABLE is currently used.
Fix #5307
This makes the code easier to understand and avoids conflicts
as well as ugly, unnecessary casts.
The field display enum has evolved over time from integer types
to a type generic parameter.
Replace:
g_snprintf() -> snprintf()
g_vsnprintf() -> vsnprintf()
g_strdup_printf() -> ws_strdup_printf()
g_strdup_vprintf() -> ws_strdup_vprintf()
This is more portable, user-friendly and faster on platforms
where GLib does not like the native I/O.
Adjust the format string to use macros from inttypes.h.
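
A small example of the resulting style, using plain snprintf() with the standard <inttypes.h> format macros. The function name and the fields being formatted are made up for the example.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical example: format fixed-width integers portably by using the
 * PRIu64 and PRId32 macros instead of platform-specific length modifiers. */
static int format_counter(char *buf, size_t len, uint64_t packets,
                          int32_t delta)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "packets=%" PRIu64 " delta=%" PRId32,
                    packets, delta);
}
```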
These display bases work to replace unprintable characters so the
name is a misnomer. In addition they are the same option and this
display behaviour is not something that is configurable.
This does not affect encodings because all our internal text strings
need to be valid UTF-8 and the source encoding is specified using
ENC_*.
Remove the assertion for valid UTF-8 in proto.c because
tvb_get_*_string() must return a valid UTF-8 string, always, and we
don't need to assert that, it is expensive.
Unlike other header fields in filter expressions protocol names
cannot contain upper-case letters. Remove that restriction. This
should make start-up slightly faster as it removes an extra loop
for each protocol filter name.
This was added in 9ead15a6eb but
I don't see a reason to have different rules for protocols and
fields, it seems the README.developer was just being vague and
conflating PROTOABBREV with PROTOFILTERNAME.
The recommendation for lower case is a style recommendation,
and it's a good one, but it should be applied uniformly. As
long as we are not enforcing this for all field filter values
there is no point in enforcing it just for protocol names and
actually it is detrimental, e.g.:
hi2operations
HI2Operations.IRIsContent
HI2Operations.UUS1_Content_element
HI2Operations.iRIContent
HI2Operations.iRISequence
HI2Operations.IRIContent
HI2Operations.iRI_Begin_record_element
HI2Operations.iRI_End_record_element
HI2Operations.iRI_Continue_record_element
HI2Operations.iRI_Report_record_element
(...)
It's weird and unexpected to have this difference and there is
no technical reason to require it. What we should probably do
is not include the protocol name in the FIELDFILTERNAME and
have the registration mechanism append it to the PROTOFILTERNAME.
Also disallow leading '-' everywhere in filter names, not just
protocol filter names. It's a universal requirement.
Add the heur_dtbl_entry_t entry as deregistered when deleting a
heuristics dissector. The UDP dissector stores a pointer to
this in proto_data and may access the entry while Lua plugins
are being reloaded, until all packets are redissected.
Implement little endian support for tvb_get_bits family of functions.
The big/little endian refers to bit numbering within an octet. In big
endian, the most significant bit is considered bit 0, while in little
endian the least significant bit is considered bit 0.
Add encoding parameters to proto tree bits format family functions.
Specify ENC_BIG_ENDIAN in all dissectors using these functions except in
USB HID that requires ENC_LITTLE_ENDIAN to work correctly.
When formatting bits values, always display most significant bit on the
leftmost position regardless of the encoding. This results in no gaps
between octets and makes the displayed value comprehensible.
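
The two bit-numbering conventions can be illustrated with a single-bit reader. This is a simplified sketch with a hypothetical name; it does not show how multi-bit values spanning octets are assembled in the tvb_get_bits functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the bit at bit_offset in data.
 * Big endian numbering: bit 0 is the most significant bit of an octet.
 * Little endian numbering: bit 0 is the least significant bit. */
static unsigned get_bit(const uint8_t *data, size_t bit_offset,
                        bool little_endian)
{
    assert(data != NULL);
    uint8_t octet = data[bit_offset / 8];
    unsigned pos = (unsigned)(bit_offset % 8);
    unsigned shift = little_endian ? pos : 7u - pos;

    return (octet >> shift) & 1u;
}
```

For the octet 0x80, bit 0 is 1 under big endian numbering and 0 under little endian numbering, where the set bit is bit 7 instead.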
Close #4478
Fix #17014
- Make sure reassembly requests & errors are properly propagated from
any point in the PDU, no matter how many sub-structure levels.
- Handle the sub-dissection methods as well:
- Ensure the sub-dissection methods handle errors from previous calls.
- Reduce the error handling needed in sub-dissector implementations.
- Add missing sub-dissection methods for list, set, and map.
- Add the handling of sub-structure.
- Handle Compact protocol in addition to the existing binary protocol.
- Include and improve MR !3171
- Handle reassembly the same way as for binary protocol.
- Handle sub-dissection with the same functions.
=> Sub-dissectors only depend on .thrift files.
Additional changes:
- Use of constants instead of hard-coded values.
- Removed U64 support (never supported by thrift code generator, only
referenced in the C++ thrift library header but not supported in reality).
- Removed references to UTF-8 and UTF-16 string for the same reason.
- Replaced references to UTF-7 string with just string (same reason).
- Replaced references to byte with i8 as the documentation explicitly
states that byte is a compatibility name.
Documentation reference:
- https://thrift.apache.org/developers
- https://thrift.apache.org/docs/idl.html
- https://github.com/apache/thrift/blob/master/doc/specs/thrift-compact-protocol.md
- https://erikvanoosten.github.io/thrift-missing-specification/
- https://diwakergupta.github.io/thrift-missing-guide/
Closes #16244
Additional changes:
- Add authors and improve consistency
- Fix typo and clarify documentation
This header was installed incorrectly to epan/wmem_scopes.h.
Instead of creating additional installation rules for a single
header in a subfolder (kept for backward compatibility) just
rename the standard "epan/wmem/wmem.h" include to
"epan/wmem_scopes.h" and fix the documentation.
Now the header is installed *correctly* to epan/wmem_scopes.h.
Add a new timestamp encoding format ENC_TIME_NSECS, like ENC_TIME_SEC but
for nanosecond values. Needed for my work-in-progress dissector for Apple
push notifications.
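
Conceptually, a nanosecond count is split into whole seconds plus a nanosecond remainder. The sketch below assumes an unsigned 64-bit count and uses a hypothetical struct as a stand-in for the real time type; the field width and signedness accepted by the actual encoding are not shown.

```c
#include <assert.h>
#include <stdint.h>

#define NSECS_PER_SEC 1000000000ULL

/* Hypothetical stand-in for a seconds-plus-nanoseconds time value. */
struct simple_time {
    int64_t secs;
    int32_t nsecs;
};

/* Split a count of nanoseconds into seconds and the sub-second part. */
static struct simple_time time_from_nsecs(uint64_t total_nsecs)
{
    struct simple_time t;

    t.secs = (int64_t)(total_nsecs / NSECS_PER_SEC);
    t.nsecs = (int32_t)(total_nsecs % NSECS_PER_SEC);
    assert(t.nsecs >= 0 && t.nsecs < 1000000000);
    return t;
}
```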
Instead *_register_plugin() is turned into a noop (with a warning).
The test suite is failing with ENABLE_PLUGINS=Off (it was already failing
before and this patch didn't affect that).
Closes #17202.
- Fix duplicate "are are".
- Fix NTP epoch year in ENC_TIME_NTP docs (572b80d2 fixed it in the README
but not in proto.h).
- Remove completely redundant "(ie. )" clauses.
ENC_TIME_MIP6 and ENC_TIME_CLASSIC_MAC_OS_SECS were added recently by
factoring them out of specific dissectors, but they weren't documented.
I added documentation, based on comments in the dissector code they came
from.
Replace the somewhat weird field format
"[Checksum: [missing]]"
with
"Checksum: 0x0000 [ignored or illegal value]"
Improve code readability and fix XXX comment.
Pull the value-formatting code in proto_custom_set into
proto_item_fill_display_label. Use that in FieldInformation::toString
instead of fvalue_to_string_repr. Fixes #16911.
Move the maximum number of tree items and maximum tree depth to
preferences instead of hardcoded values. Refer to issue #12584 for
an example VNC capture where real data exceeds the current limit.
Add internal support for using iconv (always present with glib) to convert
strings from various encodings to UTF-8 (using REPLACEMENT CHARACTER as
recommended), and use that to support GB 18030 and EUC-KR. Replace call
directly to iconv in ANSI 637 for EUC-KR to new API. Update comments
and documentation around character encodings. It is possible to replace
the calls to iconv with an internal decoder later. Tested on Linux and
on Windows (including with illegal characters). Closes #16630.
Add ui/urls.h to define some URLs on various of our websites. Use the
GitLab URL for the wiki. Add a macro to generate wiki URLs.
Update wiki URLs in comments etc.
Use the #defined URL for the docs page in
WelcomePage::on_helpLabel_clicked; that removes the last user of
topic_online_url(), so get rid of it and swallow it up into
topic_action_url().
Add an encoding for "unpacked" 3GPP TS 23.038 7-bit strings, in which
each code position is in a byte of its own, rather than with the code
positions packed into 7 bits. Rename the packed encoding to explicitly
indicate that it's packed.
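
The difference between the two layouts can be shown by turning packed septets into the unpacked form, one code position per byte. This is a sketch with a hypothetical function name; it uses the usual GSM packing order (first septet in the low bits of the first octet) and leaves out the mapping from code positions to characters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: expand n_septets packed 7-bit code positions into
 * out, one code position per byte. packed must hold at least
 * (n_septets * 7 + 7) / 8 octets. */
static void unpack_septets(const uint8_t *packed, size_t n_septets,
                           uint8_t *out)
{
    assert(packed != NULL && out != NULL);
    for (size_t i = 0; i < n_septets; i++) {
        size_t bit = i * 7;              /* first bit of septet i */
        size_t byte = bit / 8;
        unsigned shift = (unsigned)(bit % 8);
        unsigned v = packed[byte] >> shift;

        if (shift > 1)                   /* septet spills into next octet */
            v |= (unsigned)packed[byte + 1] << (8 - shift);
        out[i] = (uint8_t)(v & 0x7F);
    }
}
```

The packed octets E8 32 9B FD 06 unpack to 68 65 6C 6C 6F, which are the code positions for "hello" in the default alphabet.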
Add an encoding for ETSI TS 102 221 Annex A strings.
Use the new encodings.
FT_STRINGZPAD is for null-*padded* strings, where the field is in an
area of specified length, and, if the string is shorter than that
length, all bytes past the end of the string are NULs.
FT_STRINGZTRUNC is for null-*truncated* strings, where the field is in
an area of specified length and, if the string is shorter than that
length, there's a null character (which might be more than one byte, for
UCS-2, UTF-16, or UTF-32), and anything after that is not guaranteed to
have any particular value.
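
For a single-byte encoding, the distinction can be sketched as follows. Both helper names are hypothetical, and multi-byte null characters (UCS-2, UTF-16, UTF-32) are deliberately not handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Length of the string in a fixed-size field: up to the first NUL, or the
 * whole field if there is none. This applies to both field types. */
static size_t field_strlen(const uint8_t *field, size_t field_len)
{
    const uint8_t *nul = memchr(field, 0, field_len);

    return nul ? (size_t)(nul - field) : field_len;
}

/* True if every byte after the string is NUL, as a null-padded field
 * requires. A null-truncated field makes no such promise. */
static bool is_null_padded(const uint8_t *field, size_t field_len)
{
    for (size_t i = field_strlen(field, field_len); i < field_len; i++)
        if (field[i] != 0)
            return false;
    return true;
}
```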
Use IS_FT_STRING() in some places rather than enumerating all the string
types, so that those places get automatically changed if the set of
string types changes.
Export proto_item_set_bits_offset_len and fix:
In file included from ../epan/dfilter/dfilter.h:18:
../epan/proto.h:1113:11: warning: parameter 'bits_offset' is already documented [-Wdocumentation]
* @param bits_offset The new length in bits.
^~~~~~~~~~~
../epan/proto.h:1112:5: note: previous documentation
* @param bits_offset The number of bits from the beginning of the field.
^ ~~~~~~~~~~~
Change-Id: Ib171ce38607b9656baea5eb7a3e6aee3b99ddbac
Reviewed-on: https://code.wireshark.org/review/38115
Reviewed-by: Gerald Combs <gerald@wireshark.org>
Petri-Dish: Gerald Combs <gerald@wireshark.org>
Tested-by: Petri Dish Buildbot
Reviewed-by: Anders Broman <a.broman58@gmail.com>
Add a new top-level view that shows each packet as a series of diagrams
similar to what you'd find in a networking textbook or an RFC.
Add proto_item_set_bits_offset_len so that we can display some diagram
fields correctly.
Bugs / to do:
- Make this a separate dialog instead of a main window view?
- Handle bitfields / flags
Change-Id: Iba4897a5bf1dcd73929dde6210d5483cf07f54df
Reviewed-on: https://code.wireshark.org/review/37497
Reviewed-by: Gerald Combs <gerald@wireshark.org>
Petri-Dish: Gerald Combs <gerald@wireshark.org>
Tested-by: Petri Dish Buildbot
Reviewed-by: Anders Broman <a.broman58@gmail.com>