Commit Graph

519 Commits

Author SHA1 Message Date
Bernhard Dick 75fb2e770c DECT-NWK: Add basic support for DECT charsets 2022-12-21 21:30:20 +00:00
João Valverde 868313956f proto: Tweak admonition for proto_tree_add_string()
Try the clarify the distinction and implications of a string
value vs a string label.
2022-12-03 11:28:48 +00:00
John Thacker e449b560c0 epan: Properly generate filter expressions for custom columns
Properly generate filter expressions for custom columns by
using proto_construct_match_selected_string on each value and
then joining them together later instead of trying to split
the column expression value.

This ensures that escaping is done properly for display filter
strings, that commas internal to field values are not confused
with commas between occurrences, that for multifield columns
we can distinguish which field each value matches, etc.

It's not entirely clear whether AND or OR logic is appropriate
for multiple occurrences; currently OR is used.

Bump glib requirement to 2.54 for g_ptr_array_find_with_equal_func
(this doesn't drop support for any major distribution that already
meets our other library requirements, like Qt.)

Fix #18001.
2022-11-02 19:46:11 +00:00
Simon Holesch 8f6a640337 epan: Allow FT_UINT_STRING for proto_tree_add_string()
Since cbd3c447 ("ftypes: Add FT_UINT_STRING to IS_FT_STRING() macro")
proto_tree_add_string() accepts FT_UINT_STRING, but the API check still
fails. Update the API check to reflect that change.
2022-10-31 14:54:39 +00:00
João Valverde 76a6e2a2bf ftypes: Do not sanitize strings for UTF-8 errors
The ftype itself is encoding agnostic. In the case of literal
display filter strings it is possible and legal to contain
invalid UTF-8.

Maybe it shouldn't be but that requires a user-friendly diagnostic
message, not silently sanitizing the string as is done currently
(only a debug message is printed in that case).

Do the debug checks in proto_tree_set_string() instead. That
still detects dissector code that might need fixing, which was
the purpose for this check.

Improve documentation and add admonition for proto_tree_add_string().

Ping #18521.
2022-10-26 16:23:55 +01:00
João Valverde 40ec1adfb0 S7Comm: Fix invalid UTF-8 value string chars
Fixes #18533.
2022-10-26 01:42:43 +01:00
Brian Sipos 5bb756e203 epan: centralize SDNV processing along other similar varint types
This avoids having general-purpose decoding happening in
non-DLL-exported functions defined in a dissector for #18478,
and removes unused functions and avoids duplicate decoding.
This also removes unnecessary early exit conditions for #18145.
Unit test cases for varint decoding are added to verify this.
2022-10-19 15:27:42 +00:00
João Valverde 9faaf0ecff wslua: Use introspection API to generate constants 2022-10-07 10:28:47 +01:00
João Valverde f2cc1f2382 epan: Add BASE_STR_WSP and use it
This field display type formats the representation string of
FT_STRING by replacing all space character with ' '.

Instead of "A line end\n" it will output "A line end ".

This allows cleaner code using proto_tree_add_item() and
avoids the problematic pattern

  proto_tree_add_string(..., tvb_format_text_wsp(...));

because we only want to affect the way the string value is displayed,
not the actual field value stored.
2022-09-28 19:32:46 +01:00
John Thacker 30b309d24c proto: Validate add_string values as UTF-8
When a dissector directly adds a string value through
proto_tree_add_string[_format_value], validate that it is
UTF-8 so that only valid UTF-8 strings are used internally,
and written to output (whether text, JSON, or XML.)
(We were treating it as a UTF-8 string anyway, but not
validating it.)

If the string passed in is not UTF-8, that's a dissector bug
Dissectors that use API functions like tvb_get_string_enc
will always produce valid UTF-8, but some do their own
processing.

Fix #18317
2022-09-21 07:53:01 -04:00
João Valverde 80f16015e2 epan: Refactor floating point display types
Remove the redundant BASE_FLOAT field display type. The name
BASE_FLOAT is meaningless and the value aliased to BASE_NONE.

Require BASE_NONE instead of BASE_FLOAT (corresponding to
the printf() %g format).

Add new float display types using BASE_DEC, BASE_HEX and BASE_EXP
corresponfing to %f, %a and %e respectively.

Add support for BASE_CUSTOM with floats.
2022-08-02 13:16:46 +00:00
Alexis La Goutte 0ed87211da proto.h: Fix -Wdocumentation
proto.h:3004:9: warning: parameter 'return_value' not found in the function declaration [-Wdocumentation]
2022-07-21 21:01:25 +00:00
Joakim Karlsson bf8577b88c pfcp: change to utilize proto_tree_add_bitmask_list 2022-07-14 12:46:09 +00:00
Chuck Craft e12954a637 epan: ws_debug log for heuristic that claims frame (len != 0)
It's possible for a dissector to claim a frame without adding to
the tree or being added to frame.protocols (see !6669)
Log a debug message showing the pinfo layers and the dissector that
claimed the tvb (frame/packet).
2022-07-12 14:15:33 +00:00
Gerald Combs 4153af1dc7 wslua: Port make-init-lua to Python3
Port the script that creates init.lua to Python3. The generated init.lua
removes one newline and adds another, otherwise the output is identical
to the Perl version.
Ping #18152.
2022-06-27 16:28:36 +00:00
João Valverde d517feee74 epan: Add more bookkeeping for layers
Packet info already contains the notion of layer depth for the
current protocol, among all the protocols in the frame. This
adds an extra layer number for the protocols that are the same
as the current one. Obviously this will only go above one if
the protocol is repeated in the stack, such as with IP tunneling.

Adds extra logic to track numbers for each protocol in the frame
and update them when calling a dissector.

The total layer number and protocol layer number are store in
the field info structure so they can be used after dissection,
namely by display filters.
2022-04-26 16:50:59 +00:00
Chuck Craft 4e0cd3dbd2 epan: add ENC_TIME_USECS timestamp encoding
Needed to format timestamp in #18038 - packet-cql.c
Mirrors changes made in !1924 - Add ENC_TIME_NSECS timestamp encoding
Documentation in README.dissector, proto.c, proto.h - could use
refresh in a different merge request.
2022-04-14 15:18:30 +00:00
João Valverde 260942e170 dfilter: Refactor macro tree references
This replaces the current macro reference system with
a completely different implementation. Instead of a macro a reference
is a syntax element. A reference is a constant that can be filled
in the dfilter code after compilation from an existing protocol tree.
It is best understood as a field value that can be read from a fixed
tree that is not the frame being filtered. Usually this fixed tree
is the currently selected frame when the filter is applied. This
allows comparing fields in the filtered frame with fields in the
selected frame.

Because the field reference syntax uses the same sigil notation
as a macro we have to use a heuristic to distinguish them:
if the name has a dot it is a field reference, otherwise
it is a macro name.

The reference is synctatically validated at compile time.

There are two main advantages to this implementation (and a couple of
minor ones):

The protocol tree for each selected frame is only walked if we have a
display filter and if the display filter uses references. Also only the
actual reference values are copied, intead of loading the entire tree
into a hash table (in textual form even).

The other advantage is that the reference is tested like a protocol
field against all the values in the selected frame (if there is more
than one).

Currently the reference fields are not "primed" during dissection, so
the entire tree is walked to find a particular reference (this is
similar to the previous implementation).

If the display filter contains a valid reference and the reference is
not loaded at the time the filter is run the result is the same as a
non existing field for a regular READ_TREE instruction.

Fixes #17599.
2022-03-29 12:36:31 +00:00
John Thacker 25d0c88251 epan: Add BASE_SHOW_UTF_8_PRINTABLE
Add BASE_SHOW_UTF_8_PRINTABLE and related function tvb_utf_8_isprint
for supporting fields of bytes that are "maybe UTF-8" (default or
SHOULD be UTF-8 but could be something else, with no encoding indicator),
such as SSID fields in IEEE 802.11 (See #16208), certain OctetString
fields in Diameter or PFCP, and other places where
BASE_SHOW_ASCII_PRINTABLE is currently used. Fix #5307
2022-02-06 00:32:13 +00:00
João Valverde a566076839 epan: Move time display types to field_display_e
This makes it easier to understand the code, avoids conflicts
and ugly and unnecessary casts.

The field display enum has evolved over time from integer types
to a type generic parameter.
2021-12-27 22:31:31 +00:00
Jaap Keuter 1b5acc8d57 Replace ENC_VARIANT_MASK by ENC_VARINT_MASK 2021-12-22 20:14:31 +00:00
João Valverde c5a19582e4 epan: Convert to use stdio.h from GLib
Replace:
    g_snprintf() -> snprintf()
    g_vsnprintf() -> vsnprintf()
    g_strdup_printf() -> ws_strdup_printf()
    g_strdup_vprintf() -> ws_strdup_vprintf()

This is more portable, user-friendly and faster on platforms
where GLib does not like the native I/O.

Adjust the format string to use macros from intypes.h.
2021-12-19 19:29:53 +00:00
João Valverde 6c5d00a746 epan: Remove obsolete function proto_register_fields_manual()
Related with #17774.
2021-12-11 17:02:16 +00:00
João Valverde d2a9cb940a epan: Remove new proto tree API
Remove experimental new API.

Fix Netlink dissector to compile with normal proto tree API.

Closes #17774.
2021-12-10 14:37:01 +00:00
João Valverde 19dcb725b6 epan: Remove STR_ASCII and STR_UNICODE
These display bases work to replace unprintable characters so the
name is a misnomer. In addition they are the same option and this
display behaviour is not something that is configurable.

This does not affect encodings because all our internal text strings
need to be valid UTF-8 and the source encoding is specified using
ENC_*.

Remove the assertion for valid UTF-8 in proto.c because
tvb_get_*_string() must return a valid UTF-8 string, always, and we
don't need to assert that, it is expensive.
2021-12-03 04:35:56 +00:00
John Thacker aadf4efcbe epan: Add ENC_ISO_8601_DATE_TIME_BASIC
Add the ISO 8601 Basic date time format as another string time
option. This could be used for e.g. ASN.1 GeneralizedTime.
Add tests for it.
2021-12-02 14:19:49 +00:00
João Valverde 01f234571f epan: Optimize heuristic name validity check
Do the name check in one pass only, instead of two passes, one
for all letters and a second one to exclude upper case letters.
2021-11-04 14:03:37 +00:00
João Valverde 070aeddf76 Lift restriction on upper case protocol display filter names
Unlike other header fields in filter expressions protocol names
cannot contain upper-case letters. Remove that restriction. This
should make start-up slightly faster as it remove an extra loop
for each protocol filter name.

This was added in 9ead15a6eb but
I don't see a reason to have different rules for protocols and
fields, it seems the README.developer was just being vague and
conflating PROTOABBREV with PROTOFILTERNAME.

The recommendation for lower case is a style recommendation,
and it's a good one, but it should be applied uniformly. As
long as we are not enforcing this for all field filter values
there is no point in enforcing it just for protocol names and
actually it is detrimental, e.g:

hi2operations
HI2Operations.IRIsContent
HI2Operations.UUS1_Content_element
HI2Operations.iRIContent
HI2Operations.iRISequence
HI2Operations.IRIContent
HI2Operations.iRI_Begin_record_element
HI2Operations.iRI_End_record_element
HI2Operations.iRI_Continue_record_element
HI2Operations.iRI_Report_record_element
(...)

It's weird and unexpected to have this difference and there is
no technical reason to require it. What we should probably do
is not include the protocol name in the FIELDFILTERNAME and
have the registration mechanism append it to the PROTOFILTERNAME.

Also disallow leading '-' everywhere in filter names, not just
protocol filter names. It's a universal requirement.
2021-11-02 08:35:24 +00:00
Stig Bjørlykke e9ac4d3900 proto: Delay deleting heur_dtbl_entry_t in heur_dissector_delete
Add the heur_dtbl_entry_t entry as deregistered when deleting a
heuristics dissector. The UDP dissector is storing a pointer to
this in proto_data and may access the entry during reload Lua
plugins until all packets are redissected.
2021-09-29 07:08:52 +00:00
Tomasz Moń 7b82110092 USB HID: Parse bit fields with correct bit order
Implement little endian support for tvb_get_bits family of functions.
The big/little endian refers to bit numbering within an octet. In big
endian, the most significant bit is considered bit 0, while in little
endian the least significant bit is considered bit 0.

Add encoding parameters to proto tree bits format family functions.
Specify ENC_BIG_ENDIAN in all dissectors using these functions except in
USB HID that requires ENC_LITTLE_ENDIAN to work correctly.

When formatting bits values, always display most significant bit on the
leftmost position regardless of the encoding. This results in no gaps
between octets and makes the displayed value comprehensible.

Close #4478
Fix #17014
2021-09-26 18:16:28 +02:00
Triton Circonflexe d4de52690f Thrift: Complete handling of Binary & Compact protocols
- Make sure reassembly requests & errors are properly propagated from
  any point in the PDU, no matter how many sub-structure levels.
- Handle the sub-dissection methods as well:
  - Ensure the sub-dissection methods handle errors from previous calls.
  - Reduce the error handling needed in sub-dissector implementations.
  - Add missing sub-dissection methods for list, set, and map.
  - Add the handling of sub-structure.
- Handle Compact protocol in addition to the existing binary protocol.
  - Include and improve MR !3171
  - Handle reassembly the same way as for binary protocol.
  - Handle sub-dissection with the same functions.
    => Sub-dissectors only depend on .thrift files.

Additional changes:
- Use of constants instead of hard-coded values.
- Removed U64 support (never supported by thrift code generator, only
  referenced in the C++ thrift library header but not supported in reality.
- Removed references to UTF-8 and UTF-16 string for the same reason.
- Replaced references to UTF-7 string with just string (same reason).
- Replaced references to byte with i8 as the documentation explicitly
  states that byte is a compatibility name.

Documentation reference:
- https://thrift.apache.org/developers
- https://thrift.apache.org/docs/idl.html
- https://github.com/apache/thrift/blob/master/doc/specs/thrift-compact-protocol.md
- https://erikvanoosten.github.io/thrift-missing-specification/
- https://diwakergupta.github.io/thrift-missing-guide/

Closes #16244

Additional changes:
- Add authors and improve consistency
- Fix typo and clarify documentation
2021-08-27 06:04:17 +00:00
João Valverde 133b0c583f Move epan/wmem/wmem_scopes.h to epan/
This header was installed incorrectly to epan/wmem_scopes.h.

Instead of creating additional installation rules for a single
header in a subfolder (kept for backward compatibility) just
rename the standard "epan/wmem/wmem.h" include to
"epan/wmem_scopes.h" and fix the documentation.

Now the header is installed *correctly* to epan/wmem_scopes.h.
2021-07-26 14:56:11 +00:00
AndersBroman 754cce9531 Add ENC_APN_STR to handle APN strings 2021-05-20 09:27:53 +00:00
Nicolás Alvarez ebfbf958f6 Add ENC_TIME_NSECS timestamp encoding
Add a new timestamp encoding format ENC_TIME_NSECS, like ENC_TIME_SEC but
for nanosecond values. Needed for my work-in-progress dissector for Apple
push notifications.
2021-02-10 12:45:54 +00:00
João Valverde 89fee9321e Avoid exposing HAVE_PLUGINS in the public API
Instead *_register_plugin() is turned into a noop (with a warning).

The test suit is failing with ENABLE_PLUGINS=Off (it was already failing
before and this patch didn't affect that).

Closes #17202.
2021-02-06 16:35:51 +00:00
Nicolás Alvarez 981e662a0a Minor changes to ENC_TIME documentation
- Fix duplicate "are are".
- Fix NTP epoch year in ENC_TIME_NTP docs (572b80d2 fixed it in the README
  but not in proto.h).
- Remove completely redundant "(ie. )" clauses.
2021-02-04 10:13:36 +00:00
Nicolás Alvarez 0e86ea6c57 Update documentation for ENC_TIME_* constants
ENC_TIME_MIP6 and ENC_TIME_CLASSIC_MAC_OS_SECS were added recently by
factoring them out of specific dissectors, but they weren't documented.
I added documentation, based on comments in the dissector code they came
from.
2021-02-03 21:03:11 -03:00
João Valverde be0171019c UDP: Clean up handling of zero-valued UDP checksums
Replace the somewhat weird field format
    "[Checksum: [missing]]"
with
    "Checksum: 0x0000 [ignored or illegal value]"

Improve code redability and fix XXX comment.
2021-01-27 16:46:15 +00:00
Jaap Keuter a260f6a4e0 Correct comment on expert values 2021-01-17 20:08:10 +00:00
Anders Broman 9a46fabf52 Introduce ENC_BCD_ODD_NUM_DIG in order to handle odd number of digits 2020-12-10 16:02:10 +01:00
Alexis La Goutte f71458c601 proto(.h): fix -Wdocumentation
proto.h:2373:9: warning: parameter 'fi' not found in the function declaration [-Wdocumentation]
2020-11-23 20:06:49 +00:00
Gerald Combs 3a7966c716 Qt+epan: Print better-looking values in the packet diagram.
Pull the value-formatting code in proto_custom_set into
proto_item_fill_display_label. Use that in FieldInformation::toString
instead of fvalue_to_string_repr. Fixes #16911.
2020-11-13 19:41:51 +00:00
John Thacker 524a28c4b1 QT/CLI: Move max tree items and depth to prefs
Move the maximum number of tree items and maximum tree depth to
preferences instead of hardcoded values. Refer to issue #12584 for
an example VNC capture where real data exceeds the current limit.
2020-10-23 04:18:36 +00:00
John Thacker e20bd408de Use iconv to support GB 18030 and EUC-KR, allow future encodings
Add support internally to using iconv (always present with glib) to convert
strings from various encodings to UTF-8 (using REPLACEMENT CHARACTER as
recommended), and use that to support GB 18030 and EUC-KR. Replace call
directly to iconv in ANSI 637 for EUC-KR to new API. Update comments
and documentation around character encodings. It is possible to replace
the calls to iconv with an internal decoder later. Tested on Linux and
on Windows (including with illegal characters). Closes #16630.
2020-10-21 11:26:23 +00:00
Guy Harris 5dd6fc9459 Add proto_tree_add_item_ret_ipv4().
Change some guint32's to ws_in4_addr while we're at it.
2020-10-11 17:54:58 -07:00
Guy Harris e013c5ec7f Clean up URLs.
Add ui/urls.h to define some URLs on various of our websites.  Use the
GitLab URL for the wiki.  Add a macro to generate wiki URLs.

Update wiki URLs in comments etc.

Use the #defined URL for the docs page in
WelcomePage::on_helpLabel_clicked; that removes the last user of
topic_online_url(), so get rid of it and swallow it up into
topic_action_url().
2020-10-02 20:13:42 -07:00
Guy Harris c597927da8 Add some more string encodings.
Add an encoding for "unpacked" 3GPP TS 23.038 7-bit strings, in which
each code position is in a byte of its own, rather than with the code
positions packed into 7 bits.  Rename the packed encoding to explicitly
indicate that it's packed.

Add an encoding for ETSI TS 102 221 Annex A strings.

Use the new encodings.
2020-09-28 22:30:35 +00:00
Guy Harris 272502790b Add FT_STRINGZTRUNC.
FT_STRINGZPAD is for null-*padded* strings, where the field is in an
area of specified length, and, if the string is shorter than that
length, all bytes past the end of the string are NULs.

FT_STRINGZTRUNC is for null-*truncated* strings, where the field is in
an area of specified length and, if the string is shorter than that
length, there's a null character (which might be more than one byte, for
UCS-2, UTF-16, or UTF-32), and anything after that is not guaranteed to
have any particular value.

Use IS_FT_STRING() in some places rather than enumerating all the string
types, so that those places get automatically changed if the set of
string types changes.
2020-09-12 14:16:12 -07:00
Gerald Combs c2075185de epan: Fixup proto_item_set_bits_offset_len.
Export proto_item_set_bits_offset_len and fix

In file included from ../epan/dfilter/dfilter.h:18:
../epan/proto.h:1113:11: warning: parameter 'bits_offset' is already documented [-Wdocumentation]
 * @param bits_offset The new length in bits.
          ^~~~~~~~~~~
../epan/proto.h:1112:5: note: previous documentation
 * @param bits_offset The number of bits from the beginning of the field.
    ^     ~~~~~~~~~~~

Change-Id: Ib171ce38607b9656baea5eb7a3e6aee3b99ddbac
Reviewed-on: https://code.wireshark.org/review/38115
Reviewed-by: Gerald Combs <gerald@wireshark.org>
Petri-Dish: Gerald Combs <gerald@wireshark.org>
Tested-by: Petri Dish Buildbot
Reviewed-by: Anders Broman <a.broman58@gmail.com>
2020-08-11 03:25:17 +00:00
Gerald Combs 9b07412277 Qt: Add a packet diagram view.
Add a new top-level view that shows each packet as a series of diagrams
similar to what you'd find in a networking textook or an RFC.

Add proto_item_set_bits_offset_len so that we can display some diagram
fields correctly.

Bugs / to do:
  - Make this a separate dialog instead of a main window view?
  - Handle bitfields / flags

Change-Id: Iba4897a5bf1dcd73929dde6210d5483cf07f54df
Reviewed-on: https://code.wireshark.org/review/37497
Reviewed-by: Gerald Combs <gerald@wireshark.org>
Petri-Dish: Gerald Combs <gerald@wireshark.org>
Tested-by: Petri Dish Buildbot
Reviewed-by: Anders Broman <a.broman58@gmail.com>
2020-08-10 18:17:50 +00:00