Commit Graph

536 Commits

Author SHA1 Message Date
David Perry 4995e9a8d5 proto.c: `proto_tree_add_mac48_detail()` function
Create a public function in `epan/proto.c` to dissect a single MAC-48
address. Encapsulates the name and OUI resolution, and the LG and IG
bit parsing.

Created after observing that `packet-ieee80211.c` does not resolve the
OUI or IG/LG bits for WLAN fields (`wlan.ra`, `wlan.da`, `wlan.sa`,
`wlan.bssid`) the way that `packet-eth.c` does.

This change modifies `packet-eth.c` and `packet-ieee80211.c`
to use the new function.

Add IG/LG bits
2023-09-27 07:00:15 +00:00
Guy Harris c7886837f7 Give ENC_RFC_1123 the same value as ENC_IMF_DATE_TIME.
That way, code using either ENC_RFC_822 or ENC_RFC_1123 will get
ENC_IMF_DATE_TIME format, which is what they *should* get if they're
trying to parse Internet Message Format dates.
2023-09-07 20:16:49 -07:00
Guy Harris 5dbe4ee8a8 Clean up handling of string encodings for byte arrays and absolute times.
There is no reason to have separate "RFC 822" and "RFC 1123" parsing of
Internet Message Format date/time values.  RFC 1123 adds support for
4-digit years to RFC 822's 2-digit years; RFC 2822, which supplanted RFC
822, supports both, and RFC 5322, which supplanted RFC 2822, continues
to do so.  I know of no cases where *only* 2-digit years should be
supported, especially given that it has beenover 23 years since January
1, 2000.

Instead, have ENC_IMF_DATE_TIME for Internet Message Format date/time
values; keep ENC_RFC_1123 for backwards compatibility, but treat it
exactly the same as ENC_IMF_DATE_TIME.  Keep ENC_RFC_822 around as an
alias for ENC_IMF_DATE_TIME.

Have separate expert infos for errors parsing string-encoded byte arrays
and string-encoded dates and times, with the messages speaking of byte
arrays and dates/times rather than "numbers".  For now, have the only
error being "couldn't convert"; we can add more for specific cases of
what's wrong.

In tvb_get_string_bytes() and tvb_get_string_time(), don't use errno as
a way of indicating failure, use a return of NULL. Using errno 1) runs
the risk of getting errno overwritten by intermediate calls and 2)
would force assinging particular errno values, the set of which we don't
control, to particular errors if we were to distinguish different errors
in the future.

In tvb_get_string_time(), for Internet Message Format date/time values,
check for 2-digit and 3-digit years based on the length of the year
field in the string.  Handle them the way the RFCs say, not the way
strptime() does (strptime(), which was originally written to allow
programs to ask the user to specify a date and time and let them enter
it in the locale's date and time format, and to use 2-digit years in
case they were in the habit of doing so, which they might have been
given that this was the mid 1980's; the choice of 1969 as the first "not
21st century" year was based on the fact that it was the earliest year
for a date correspnding to a non-negative time_t value, which is
UNIX-specific, unlike the Internet Message Format).

Change tvb_get_string_time() to use guard clauses to handle errors; yes
it involves gotos, but it's easier to read than nested ifs, for example.

Update the Lua unit tests to reflect this.
2023-09-07 19:03:02 -07:00
Stig Bjørlykke 8e5929818f proto: Add a comment to proto_deregister_protocol
Add a note that this function should only be used internally
when reloading Lua plugins.

Close #19307
2023-08-26 07:15:38 +00:00
John Thacker da690738a1 epan: Fix up --disable-all-protocols
Commit af0691342b and
commit 80ae370811 both added
proto_disable_all.

Take the one from the 80ae370811,
which made changes to allow protocols to be enabled by
proto_enable_proto_by_name() even if they are not disabled by
default, which allows --only-protocols to work without changing
the disabled-by-default setting of a protocol.

We don't want to change the disabled by default setting, because
that causes the disabled status to be persistent even when
changing Configuration Profiles with the GUI, instead of being
reset. In general, command line options that temporarily override
the Configuration Profile at startup are cleared when switching
to a new profile via the profile, and we don't want to make an
exception.
2023-08-09 05:38:15 -04:00
Juanma Sanchez af0691342b Add --only-protocols and --disable-all-protocols to tshark and rawshark.
--disable-all-protocols will mark all protocols as disabled by default,
and then disable them. Certain protocols can then be enabled one by one
by using --enable-protocol.

--only-protocols is a helper option to make it easier to enable only
certain protocols It's equivalent to passing --disable-all-protocols and
then several --enable-protocol options. It accepts a comma separated
list of protocols. First all protocols will be disabled, and then all
protocols included in the list will be enabled one by one.

Side-note, it wouldn't make much sense to enable only "tcp" for example
without enabling the protocols in the lower layers (e.g: eth, sll, ip,
ipv6). In this case, something like --only-protocols eth,sll,ip,ipv6,tcp
will generally be needed in order to make sure that TCP is decoded.

Signed-off-by: Juanma Sanchez <juasanch@redhat.com>
2023-08-08 21:54:37 +00:00
Jonathan Landis b1b9ff27e7 add new add-and-return functions to C API 2023-08-08 21:37:28 +00:00
Ismael Mendez Matamoros 39a0efc3ad RTPS: Added CRC32 and MD5 checksum check and deleted unused hfs
Added an option for checking the expected RTPS message
checksum is the same as the received in the wire if
checksum is CRC-32C or MD5. Also delted unused header filters.
Introduced function proto_tree_add_checksum_bytes.
2023-07-16 15:14:52 +00:00
João Valverde 6730cc3a65 Add Unix time support for absolute time field type
Add ABSOLUTE_TIME_UNIX absolute time type, to allow
date and time values to be represented in Unix time,
besides other existing formats.
2023-07-03 21:45:50 +00:00
João Valverde d456cc761a ftypes: Rename IS_FT_* macros
Rename IS_FT_*() to FT_IS_*(). I find it to be more natural and
a better namespace for a public interface.
2023-06-20 00:22:57 +01:00
David Perry 80ae370811 Allow disabling unused dissectors from PHS dialog 2023-06-13 17:12:26 +00:00
David Perry 1bd8e05f54 tshark: show field abbrevs matching a prefix 2023-06-11 20:16:03 +00:00
John Thacker 1744ce4a0f epan: Add ENC_BOM modifier for UTF-16, UCS-2, UCS-4
Add ENC_BOM to the list of bitflag modifiers, and use it with
UTF-16, UCS-2, and UCS-4 (UTF-32). If set, this means that the
first 2 (or 4) octets, if present, are checked to see if they are
a Big-Endian BYTE ORDER MARK ("ZERO WIDTH NON-BREAKING SPACE"). If so,
those octets are skipped and the encoding is set to Little-Endian
or Big-Endian depending on endianness of the BOM.

If the BOM is absent, the passed in Endianness flag is used normally.

Related to #17991
2023-06-08 11:25:54 +00:00
John Thacker 8b3d214f41 epan: Update a comment
FT_UINT_STRING hasn't been FT_NSTRING_UINT8 since
commit 7c0421b146
2023-05-16 22:19:58 -04:00
John Thacker 9ea2b3db5e epan: Implement EBCDIC CP 500, for DRDA
EBCDIC Code Page 500 has exactly the same repertoire as CP 037,
covering all of ISO-8859-1, but has 7 bytes permuted. It is
the default code page for DRDA; use it there.
2023-04-26 12:30:46 +00:00
João Valverde 7595af96a0 ftypes: Hide fvalue implementation
Exposing the fvalue_t implementation is exposing internal
details of the implementation. Fix that by making the fvalue_t
internal to the ftypes implementation and using setters/getters
where necessary.
2023-04-19 15:12:25 +00:00
John Thacker 550001e161 epan: Update an encoding comment
We do, in fact, map illegal ASCII characters and invalid UTF-8
sequences to REPLACEMENT CHARACTERs.

[skip ci]
2023-04-02 21:45:26 -04:00
Bernhard Dick 75fb2e770c DECT-NWK: Add basic support for DECT charsets 2022-12-21 21:30:20 +00:00
João Valverde 868313956f proto: Tweak admonition for proto_tree_add_string()
Try the clarify the distinction and implications of a string
value vs a string label.
2022-12-03 11:28:48 +00:00
John Thacker e449b560c0 epan: Properly generate filter expressions for custom columns
Properly generate filter expressions for custom columns by
using proto_construct_match_selected_string on each value and
then joining them together later instead of trying to split
the column expression value.

This ensures that escaping is done properly for display filter
strings, that commas internal to field values are not confused
with commas between occurrences, that for multifield columns
we can distinguish which field each value matches, etc.

It's not entirely clear whether AND or OR logic is appropriate
for multiple occurrences; currently OR is used.

Bump glib requirement to 2.54 for g_ptr_array_find_with_equal_func
(this doesn't drop support for any major distribution that already
meets our other library requirements, like Qt.)

Fix #18001.
2022-11-02 19:46:11 +00:00
Simon Holesch 8f6a640337 epan: Allow FT_UINT_STRING for proto_tree_add_string()
Since cbd3c447 ("ftypes: Add FT_UINT_STRING to IS_FT_STRING() macro")
proto_tree_add_string() accepts FT_UINT_STRING, but the API check still
fails. Update the API check to reflect that change.
2022-10-31 14:54:39 +00:00
João Valverde 76a6e2a2bf ftypes: Do not sanitize strings for UTF-8 errors
The ftype itself is encoding agnostic. In the case of literal
display filter strings it is possible and legal to contain
invalid UTF-8.

Maybe it shouldn't be but that requires a user-friendly diagnostic
message, not silently sanitizing the string as is done currently
(only a debug message is printed in that case).

Do the debug checks in proto_tree_set_string() instead. That
still detects dissector code that might need fixing, which was
the purpose for this check.

Improve documentation and add admonition for proto_tree_add_string().

Ping #18521.
2022-10-26 16:23:55 +01:00
João Valverde 40ec1adfb0 S7Comm: Fix invalid UTF-8 value string chars
Fixes #18533.
2022-10-26 01:42:43 +01:00
Brian Sipos 5bb756e203 epan: centralize SDNV processing along other similar varint types
This avoids having general-purpose decoding happening in
non-DLL-exported functions defined in a dissector for #18478,
and removes unused functions and avoids duplicate decoding.
This also removes unnecessary early exit conditions for #18145.
Unit test cases for varint decoding are added to verify this.
2022-10-19 15:27:42 +00:00
João Valverde 9faaf0ecff wslua: Use introspection API to generate constants 2022-10-07 10:28:47 +01:00
João Valverde f2cc1f2382 epan: Add BASE_STR_WSP and use it
This field display type formats the representation string of
FT_STRING by replacing all space character with ' '.

Instead of "A line end\n" it will output "A line end ".

This allows cleaner code using proto_tree_add_item() and
avoids the problematic pattern

  proto_tree_add_string(..., tvb_format_text_wsp(...));

because we only want to affect the way the string value is displayed,
not the actual field value stored.
2022-09-28 19:32:46 +01:00
John Thacker 30b309d24c proto: Validate add_string values as UTF-8
When a dissector directly adds a string value through
proto_tree_add_string[_format_value], validate that it is
UTF-8 so that only valid UTF-8 strings are used internally,
and written to output (whether text, JSON, or XML.)
(We were treating it as a UTF-8 string anyway, but not
validating it.)

If the string passed in is not UTF-8, that's a dissector bug
Dissectors that use API functions like tvb_get_string_enc
will always produce valid UTF-8, but some do their own
processing.

Fix #18317
2022-09-21 07:53:01 -04:00
João Valverde 80f16015e2 epan: Refactor floating point display types
Remove the redundant BASE_FLOAT field display type. The name
BASE_FLOAT is meaningless and the value aliased to BASE_NONE.

Require BASE_NONE instead of BASE_FLOAT (corresponding to
the printf() %g format).

Add new float display types using BASE_DEC, BASE_HEX and BASE_EXP
corresponfing to %f, %a and %e respectively.

Add support for BASE_CUSTOM with floats.
2022-08-02 13:16:46 +00:00
Alexis La Goutte 0ed87211da proto.h: Fix -Wdocumentation
proto.h:3004:9: warning: parameter 'return_value' not found in the function declaration [-Wdocumentation]
2022-07-21 21:01:25 +00:00
Joakim Karlsson bf8577b88c pfcp: change to utilize proto_tree_add_bitmask_list 2022-07-14 12:46:09 +00:00
Chuck Craft e12954a637 epan: ws_debug log for heuristic that claims frame (len != 0)
It's possible for a dissector to claim a frame without adding to
the tree or being added to frame.protocols (see !6669)
Log a debug message showing the pinfo layers and the dissector that
claimed the tvb (frame/packet).
2022-07-12 14:15:33 +00:00
Gerald Combs 4153af1dc7 wslua: Port make-init-lua to Python3
Port the script that creates init.lua to Python3. The generated init.lua
removes one newline and adds another, otherwise the output is identical
to the Perl version.
Ping #18152.
2022-06-27 16:28:36 +00:00
João Valverde d517feee74 epan: Add more bookkeeping for layers
Packet info already contains the notion of layer depth for the
current protocol, among all the protocols in the frame. This
adds an extra layer number for the protocols that are the same
as the current one. Obviously this will only go above one if
the protocol is repeated in the stack, such as with IP tunneling.

Adds extra logic to track numbers for each protocol in the frame
and update them when calling a dissector.

The total layer number and protocol layer number are store in
the field info structure so they can be used after dissection,
namely by display filters.
2022-04-26 16:50:59 +00:00
Chuck Craft 4e0cd3dbd2 epan: add ENC_TIME_USECS timestamp encoding
Needed to format timestamp in #18038 - packet-cql.c
Mirrors changes made in !1924 - Add ENC_TIME_NSECS timestamp encoding
Documentation in README.dissector, proto.c, proto.h - could use
refresh in a different merge request.
2022-04-14 15:18:30 +00:00
João Valverde 260942e170 dfilter: Refactor macro tree references
This replaces the current macro reference system with
a completely different implementation. Instead of a macro a reference
is a syntax element. A reference is a constant that can be filled
in the dfilter code after compilation from an existing protocol tree.
It is best understood as a field value that can be read from a fixed
tree that is not the frame being filtered. Usually this fixed tree
is the currently selected frame when the filter is applied. This
allows comparing fields in the filtered frame with fields in the
selected frame.

Because the field reference syntax uses the same sigil notation
as a macro we have to use a heuristic to distinguish them:
if the name has a dot it is a field reference, otherwise
it is a macro name.

The reference is synctatically validated at compile time.

There are two main advantages to this implementation (and a couple of
minor ones):

The protocol tree for each selected frame is only walked if we have a
display filter and if the display filter uses references. Also only the
actual reference values are copied, intead of loading the entire tree
into a hash table (in textual form even).

The other advantage is that the reference is tested like a protocol
field against all the values in the selected frame (if there is more
than one).

Currently the reference fields are not "primed" during dissection, so
the entire tree is walked to find a particular reference (this is
similar to the previous implementation).

If the display filter contains a valid reference and the reference is
not loaded at the time the filter is run the result is the same as a
non existing field for a regular READ_TREE instruction.

Fixes #17599.
2022-03-29 12:36:31 +00:00
John Thacker 25d0c88251 epan: Add BASE_SHOW_UTF_8_PRINTABLE
Add BASE_SHOW_UTF_8_PRINTABLE and related function tvb_utf_8_isprint
for supporting fields of bytes that are "maybe UTF-8" (default or
SHOULD be UTF-8 but could be something else, with no encoding indicator),
such as SSID fields in IEEE 802.11 (See #16208), certain OctetString
fields in Diameter or PFCP, and other places where
BASE_SHOW_ASCII_PRINTABLE is currently used. Fix #5307
2022-02-06 00:32:13 +00:00
João Valverde a566076839 epan: Move time display types to field_display_e
This makes it easier to understand the code, avoids conflicts
and ugly and unnecessary casts.

The field display enum has evolved over time from integer types
to a type generic parameter.
2021-12-27 22:31:31 +00:00
Jaap Keuter 1b5acc8d57 Replace ENC_VARIANT_MASK by ENC_VARINT_MASK 2021-12-22 20:14:31 +00:00
João Valverde c5a19582e4 epan: Convert to use stdio.h from GLib
Replace:
    g_snprintf() -> snprintf()
    g_vsnprintf() -> vsnprintf()
    g_strdup_printf() -> ws_strdup_printf()
    g_strdup_vprintf() -> ws_strdup_vprintf()

This is more portable, user-friendly and faster on platforms
where GLib does not like the native I/O.

Adjust the format string to use macros from intypes.h.
2021-12-19 19:29:53 +00:00
João Valverde 6c5d00a746 epan: Remove obsolete function proto_register_fields_manual()
Related with #17774.
2021-12-11 17:02:16 +00:00
João Valverde d2a9cb940a epan: Remove new proto tree API
Remove experimental new API.

Fix Netlink dissector to compile with normal proto tree API.

Closes #17774.
2021-12-10 14:37:01 +00:00
João Valverde 19dcb725b6 epan: Remove STR_ASCII and STR_UNICODE
These display bases work to replace unprintable characters so the
name is a misnomer. In addition they are the same option and this
display behaviour is not something that is configurable.

This does not affect encodings because all our internal text strings
need to be valid UTF-8 and the source encoding is specified using
ENC_*.

Remove the assertion for valid UTF-8 in proto.c because
tvb_get_*_string() must return a valid UTF-8 string, always, and we
don't need to assert that, it is expensive.
2021-12-03 04:35:56 +00:00
John Thacker aadf4efcbe epan: Add ENC_ISO_8601_DATE_TIME_BASIC
Add the ISO 8601 Basic date time format as another string time
option. This could be used for e.g. ASN.1 GeneralizedTime.
Add tests for it.
2021-12-02 14:19:49 +00:00
João Valverde 01f234571f epan: Optimize heuristic name validity check
Do the name check in one pass only, instead of two passes, one
for all letters and a second one to exclude upper case letters.
2021-11-04 14:03:37 +00:00
João Valverde 070aeddf76 Lift restriction on upper case protocol display filter names
Unlike other header fields in filter expressions protocol names
cannot contain upper-case letters. Remove that restriction. This
should make start-up slightly faster as it remove an extra loop
for each protocol filter name.

This was added in 9ead15a6eb but
I don't see a reason to have different rules for protocols and
fields, it seems the README.developer was just being vague and
conflating PROTOABBREV with PROTOFILTERNAME.

The recommendation for lower case is a style recommendation,
and it's a good one, but it should be applied uniformly. As
long as we are not enforcing this for all field filter values
there is no point in enforcing it just for protocol names and
actually it is detrimental, e.g:

hi2operations
HI2Operations.IRIsContent
HI2Operations.UUS1_Content_element
HI2Operations.iRIContent
HI2Operations.iRISequence
HI2Operations.IRIContent
HI2Operations.iRI_Begin_record_element
HI2Operations.iRI_End_record_element
HI2Operations.iRI_Continue_record_element
HI2Operations.iRI_Report_record_element
(...)

It's weird and unexpected to have this difference and there is
no technical reason to require it. What we should probably do
is not include the protocol name in the FIELDFILTERNAME and
have the registration mechanism append it to the PROTOFILTERNAME.

Also disallow leading '-' everywhere in filter names, not just
protocol filter names. It's a universal requirement.
2021-11-02 08:35:24 +00:00
Stig Bjørlykke e9ac4d3900 proto: Delay deleting heur_dtbl_entry_t in heur_dissector_delete
Add the heur_dtbl_entry_t entry as deregistered when deleting a
heuristics dissector. The UDP dissector is storing a pointer to
this in proto_data and may access the entry during reload Lua
plugins until all packets are redissected.
2021-09-29 07:08:52 +00:00
Tomasz Moń 7b82110092 USB HID: Parse bit fields with correct bit order
Implement little endian support for tvb_get_bits family of functions.
The big/little endian refers to bit numbering within an octet. In big
endian, the most significant bit is considered bit 0, while in little
endian the least significant bit is considered bit 0.

Add encoding parameters to proto tree bits format family functions.
Specify ENC_BIG_ENDIAN in all dissectors using these functions except in
USB HID that requires ENC_LITTLE_ENDIAN to work correctly.

When formatting bits values, always display most significant bit on the
leftmost position regardless of the encoding. This results in no gaps
between octets and makes the displayed value comprehensible.

Close #4478
Fix #17014
2021-09-26 18:16:28 +02:00
Triton Circonflexe d4de52690f Thrift: Complete handling of Binary & Compact protocols
- Make sure reassembly requests & errors are properly propagated from
  any point in the PDU, no matter how many sub-structure levels.
- Handle the sub-dissection methods as well:
  - Ensure the sub-dissection methods handle errors from previous calls.
  - Reduce the error handling needed in sub-dissector implementations.
  - Add missing sub-dissection methods for list, set, and map.
  - Add the handling of sub-structure.
- Handle Compact protocol in addition to the existing binary protocol.
  - Include and improve MR !3171
  - Handle reassembly the same way as for binary protocol.
  - Handle sub-dissection with the same functions.
    => Sub-dissectors only depend on .thrift files.

Additional changes:
- Use of constants instead of hard-coded values.
- Removed U64 support (never supported by thrift code generator, only
  referenced in the C++ thrift library header but not supported in reality.
- Removed references to UTF-8 and UTF-16 string for the same reason.
- Replaced references to UTF-7 string with just string (same reason).
- Replaced references to byte with i8 as the documentation explicitly
  states that byte is a compatibility name.

Documentation reference:
- https://thrift.apache.org/developers
- https://thrift.apache.org/docs/idl.html
- https://github.com/apache/thrift/blob/master/doc/specs/thrift-compact-protocol.md
- https://erikvanoosten.github.io/thrift-missing-specification/
- https://diwakergupta.github.io/thrift-missing-guide/

Closes #16244

Additional changes:
- Add authors and improve consistency
- Fix typo and clarify documentation
2021-08-27 06:04:17 +00:00
João Valverde 133b0c583f Move epan/wmem/wmem_scopes.h to epan/
This header was installed incorrectly to epan/wmem_scopes.h.

Instead of creating additional installation rules for a single
header in a subfolder (kept for backward compatibility) just
rename the standard "epan/wmem/wmem.h" include to
"epan/wmem_scopes.h" and fix the documentation.

Now the header is installed *correctly* to epan/wmem_scopes.h.
2021-07-26 14:56:11 +00:00
AndersBroman 754cce9531 Add ENC_APN_STR to handle APN strings 2021-05-20 09:27:53 +00:00