Create a public function in `epan/proto.c` to dissect a single MAC-48
address. Encapsulates the name and OUI resolution, and the LG and IG
bit parsing.
Created after observing that `packet-ieee80211.c` does not resolve the
OUI or IG/LG bits for WLAN fields (`wlan.ra`, `wlan.da`, `wlan.sa`,
`wlan.bssid`) the way that `packet-eth.c` does.
This change modifies `packet-eth.c` and `packet-ieee80211.c`
to use the new function.
Add IG/LG bits
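As a rough illustration of the bit parsing described above, here is a
minimal standalone sketch. It is not the function added to `epan/proto.c`;
the `mac48_info` type and `mac48_parse()` name are hypothetical. It relies
only on the IEEE 802 layout: the OUI is the first three octets, the IG bit
is the least significant bit of the first octet, and the LG bit is the
next bit up.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type; not a Wireshark structure. */
typedef struct {
    uint32_t oui; /* first three octets of the address */
    bool     ig;  /* true: group (multicast/broadcast) address */
    bool     lg;  /* true: locally administered address */
} mac48_info;

/* Split a 6-octet MAC-48 address into its OUI and the IG/LG bits. */
static mac48_info mac48_parse(const uint8_t addr[6])
{
    mac48_info info;

    info.oui = ((uint32_t)addr[0] << 16) | ((uint32_t)addr[1] << 8) | addr[2];
    info.ig  = (addr[0] & 0x01) != 0; /* least significant bit of octet 0 */
    info.lg  = (addr[0] & 0x02) != 0; /* second least significant bit */
    return info;
}
```

The broadcast address ff:ff:ff:ff:ff:ff has both bits set, while a typical
vendor-assigned unicast address has both clear.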
That way, code using either ENC_RFC_822 or ENC_RFC_1123 will get
ENC_IMF_DATE_TIME format, which is what they *should* get if they're
trying to parse Internet Message Format dates.
There is no reason to have separate "RFC 822" and "RFC 1123" parsing of
Internet Message Format date/time values. RFC 1123 adds support for
4-digit years to RFC 822's 2-digit years; RFC 2822, which supplanted RFC
822, supports both, and RFC 5322, which supplanted RFC 2822, continues
to do so. I know of no cases where *only* 2-digit years should be
supported, especially given that it has been over 23 years since January
1, 2000.
Instead, have ENC_IMF_DATE_TIME for Internet Message Format date/time
values; keep ENC_RFC_1123 for backwards compatibility, but treat it
exactly the same as ENC_IMF_DATE_TIME. Keep ENC_RFC_822 around as an
alias for ENC_IMF_DATE_TIME.
Have separate expert infos for errors parsing string-encoded byte arrays
and string-encoded dates and times, with the messages speaking of byte
arrays and dates/times rather than "numbers". For now, have the only
error be "couldn't convert"; we can add more for specific cases of
what's wrong.
In tvb_get_string_bytes() and tvb_get_string_time(), don't use errno as
a way of indicating failure; use a return of NULL. Using errno 1) runs
the risk of getting errno overwritten by intermediate calls and 2)
would force assigning particular errno values, the set of which we don't
control, to particular errors if we were to distinguish different errors
in the future.
In tvb_get_string_time(), for Internet Message Format date/time values,
check for 2-digit and 3-digit years based on the length of the year
field in the string. Handle them the way the RFCs say, not the way
strptime() does. (strptime() was originally written to allow programs
to ask the user to specify a date and time and let them enter it in the
locale's date and time format, and to use 2-digit years in case they
were in the habit of doing so, which they might have been given that
this was the mid-1980s; the choice of 1969 as the first "not 21st
century" year was based on the fact that it was the earliest year for a
date corresponding to a non-negative time_t value, which is
UNIX-specific, unlike the Internet Message Format.)
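A minimal sketch of the year handling described above, assuming the
obsolete-syntax rule from RFC 5322 section 4.3: a 2-digit year of 00-49
gets 2000 added, a 2-digit year of 50-99 or any 3-digit year gets 1900
added. `imf_full_year()` is a hypothetical helper for illustration, not
the code in tvb_get_string_time().

```c
#include <stddef.h>

/*
 * Interpret the numeric value of a year field, given how many digits it
 * had in the string. Returns -1 for a year field shorter than 2 digits
 * or a negative value.
 */
static int imf_full_year(int value, size_t ndigits)
{
    if (ndigits < 2 || value < 0)
        return -1;
    if (ndigits == 2)
        return value < 50 ? value + 2000 : value + 1900;
    if (ndigits == 3)
        return value + 1900;
    return value; /* 4 or more digits: already a full year */
}
```

Under this rule a 2-digit "55" becomes 1955, whereas the POSIX strptime()
%y conversion, with its 1969 pivot, would produce 2055.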
Change tvb_get_string_time() to use guard clauses to handle errors; yes,
it involves gotos, but it's easier to read than nested ifs, for example.
Update the Lua unit tests to reflect this.
Commit af0691342b and commit 80ae370811 both added
proto_disable_all.
Take the one from 80ae370811,
which made changes to allow protocols to be enabled by
proto_enable_proto_by_name() even if they are not disabled by
default, which allows --only-protocols to work without changing
the disabled-by-default setting of a protocol.
We don't want to change the disabled by default setting, because
that causes the disabled status to be persistent even when
changing Configuration Profiles with the GUI, instead of being
reset. In general, command line options that temporarily override
the Configuration Profile at startup are cleared when switching
to a new profile, and we don't want to make an exception.
--disable-all-protocols will mark all protocols as disabled by default,
and then disable them. Certain protocols can then be enabled one by one
by using --enable-protocol.
--only-protocols is a helper option to make it easier to enable only
certain protocols. It's equivalent to passing --disable-all-protocols and
then several --enable-protocol options. It accepts a comma-separated
list of protocols. First all protocols will be disabled, and then all
protocols included in the list will be enabled one by one.
Side-note, it wouldn't make much sense to enable only "tcp" for example
without enabling the protocols in the lower layers (e.g: eth, sll, ip,
ipv6). In this case, something like --only-protocols eth,sll,ip,ipv6,tcp
will generally be needed in order to make sure that TCP is decoded.
Signed-off-by: Juanma Sanchez <juasanch@redhat.com>
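To make the "disable everything, then enable the listed protocols"
behaviour of --only-protocols concrete, here is a toy sketch. It does not
use the real protocol registry or option parser; `toy_proto` and the
`toy_*` functions are hypothetical stand-ins.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy protocol table standing in for the real registry (hypothetical). */
typedef struct {
    const char *name;
    bool        enabled;
} toy_proto;

static void toy_disable_all(toy_proto *protos, size_t n)
{
    for (size_t i = 0; i < n; i++)
        protos[i].enabled = false;
}

/* Enable the protocol whose name equals the first len bytes of name. */
static bool toy_enable_by_name(toy_proto *protos, size_t n,
                               const char *name, size_t len)
{
    for (size_t i = 0; i < n; i++) {
        if (strlen(protos[i].name) == len &&
            memcmp(protos[i].name, name, len) == 0) {
            protos[i].enabled = true;
            return true;
        }
    }
    return false;
}

/*
 * Disable everything, then enable each comma-separated name in list.
 * Returns the number of names that matched a known protocol.
 */
static size_t toy_only_protocols(toy_proto *protos, size_t n, const char *list)
{
    size_t matched = 0;

    toy_disable_all(protos, n);
    while (*list != '\0') {
        size_t len = strcspn(list, ",");

        if (len > 0 && toy_enable_by_name(protos, n, list, len))
            matched++;
        list += len;
        if (*list == ',')
            list++;
    }
    return matched;
}
```

With a table of eth, ip, tcp and udp, passing "eth,ip,tcp" leaves only udp
disabled.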
Added an option for checking that the expected RTPS message
checksum matches the one received on the wire when the
checksum is CRC-32C or MD5. Also deleted unused header filters.
Introduced function proto_tree_add_checksum_bytes.
Add ENC_BOM to the list of bitflag modifiers, and use it with
UTF-16, UCS-2, and UCS-4 (UTF-32). If set, this means that the
first 2 (or 4) octets, if present, are checked to see if they are
a BYTE ORDER MARK ("ZERO WIDTH NO-BREAK SPACE") in either byte order.
If so, those octets are skipped and the encoding is set to
Little-Endian or Big-Endian depending on endianness of the BOM.
If the BOM is absent, the passed in Endianness flag is used normally.
Related to #17991
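The BOM check can be sketched as follows for the 2-octet (UTF-16/UCS-2)
case. This is a standalone illustration, not the tvbuff code; the
`toy_endian` type and `utf16_check_bom()` name are hypothetical. The
4-octet UCS-4 case would be analogous, looking for 00 00 FE FF or
FF FE 00 00.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { TOY_BIG_ENDIAN, TOY_LITTLE_ENDIAN } toy_endian;

/*
 * Look for a UTF-16 byte order mark (U+FEFF) at the start of buf.
 * If one is found, store the endianness it implies in *endian and
 * return 2 (the number of octets to skip); otherwise leave *endian
 * as passed in by the caller and return 0.
 */
static size_t utf16_check_bom(const uint8_t *buf, size_t len,
                              toy_endian *endian)
{
    if (len >= 2) {
        if (buf[0] == 0xFE && buf[1] == 0xFF) {
            *endian = TOY_BIG_ENDIAN;
            return 2;
        }
        if (buf[0] == 0xFF && buf[1] == 0xFE) {
            *endian = TOY_LITTLE_ENDIAN;
            return 2;
        }
    }
    return 0;
}
```

Leaving `*endian` untouched when no BOM is present mirrors the rule that
the passed-in endianness flag is then used normally.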
EBCDIC Code Page 500 has exactly the same repertoire as CP 037,
covering all of ISO-8859-1, but has 7 bytes permuted. It is
the default code page for DRDA; use it there.
Exposing the fvalue_t implementation is exposing internal
details of the implementation. Fix that by making the fvalue_t
internal to the ftypes implementation and using setters/getters
where necessary.
Properly generate filter expressions for custom columns by
using proto_construct_match_selected_string on each value and
then joining them together later instead of trying to split
the column expression value.
This ensures that escaping is done properly for display filter
strings, that commas internal to field values are not confused
with commas between occurrences, that for multifield columns
we can distinguish which field each value matches, etc.
It's not entirely clear whether AND or OR logic is appropriate
for multiple occurrences; currently OR is used.
Bump glib requirement to 2.54 for g_ptr_array_find_with_equal_func
(this doesn't drop support for any major distribution that already
meets our other library requirements, like Qt.)
Fix #18001.
Since cbd3c447 ("ftypes: Add FT_UINT_STRING to IS_FT_STRING() macro")
proto_tree_add_string() accepts FT_UINT_STRING, but the API check still
fails. Update the API check to reflect that change.
The ftype itself is encoding agnostic. In the case of literal
display filter strings it is possible and legal to contain
invalid UTF-8.
Maybe it shouldn't be but that requires a user-friendly diagnostic
message, not silently sanitizing the string as is done currently
(only a debug message is printed in that case).
Do the debug checks in proto_tree_set_string() instead. That
still detects dissector code that might need fixing, which was
the purpose for this check.
Improve documentation and add admonition for proto_tree_add_string().
Ping #18521.
This avoids having general-purpose decoding happening in
non-DLL-exported functions defined in a dissector for #18478,
and removes unused functions and avoids duplicate decoding.
This also removes unnecessary early exit conditions for #18145.
Unit test cases for varint decoding are added to verify this.
This field display type formats the representation string of
FT_STRING by replacing every whitespace character with ' '.
Instead of "A line end\n" it will output "A line end ".
This allows cleaner code using proto_tree_add_item() and
avoids the problematic pattern
proto_tree_add_string(..., tvb_format_text_wsp(...));
because we only want to affect the way the string value is displayed,
not the actual field value stored.
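The effect on the displayed string can be sketched like this. It is a
standalone illustration using the C library's isspace(); `format_wsp()` is
a hypothetical name and not the formatting code in proto.c.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Write a display copy of src into dst (at most dstlen bytes, including
 * the terminator), with every whitespace character replaced by ' '.
 * The source string, i.e. the stored field value, is left untouched.
 */
static void format_wsp(char *dst, size_t dstlen, const char *src)
{
    size_t i = 0;

    if (dstlen == 0)
        return;
    for (; src[i] != '\0' && i + 1 < dstlen; i++)
        dst[i] = isspace((unsigned char)src[i]) ? ' ' : src[i];
    dst[i] = '\0';
}
```

Because only the copy is modified, a filter on the field still sees the
original value including its newline.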
When a dissector directly adds a string value through
proto_tree_add_string[_format_value], validate that it is
UTF-8 so that only valid UTF-8 strings are used internally,
and written to output (whether text, JSON, or XML.)
(We were treating it as a UTF-8 string anyway, but not
validating it.)
If the string passed in is not UTF-8, that's a dissector bug.
Dissectors that use API functions like tvb_get_string_enc
will always produce valid UTF-8, but some do their own
processing.
Fix #18317
Remove the redundant BASE_FLOAT field display type. The name
BASE_FLOAT is meaningless and the value aliased to BASE_NONE.
Require BASE_NONE instead of BASE_FLOAT (corresponding to
the printf() %g format).
Add new float display types using BASE_DEC, BASE_HEX and BASE_EXP
corresponding to %f, %a and %e respectively.
Add support for BASE_CUSTOM with floats.
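For reference, the four printf() conversions mentioned above behave as
follows. This sketch uses a hypothetical `toy_float_base` enum instead of
the real field display values and only shows the mapping to format
strings.

```c
#include <stdio.h>

/* Hypothetical stand-ins for BASE_NONE, BASE_DEC, BASE_HEX, BASE_EXP. */
typedef enum { TOY_NONE, TOY_DEC, TOY_HEX, TOY_EXP } toy_float_base;

/* Format v into buf using the conversion that matches the display type. */
static int format_float(char *buf, size_t len, toy_float_base base, double v)
{
    switch (base) {
    case TOY_DEC:
        return snprintf(buf, len, "%f", v); /* fixed notation */
    case TOY_HEX:
        return snprintf(buf, len, "%a", v); /* hexadecimal floating point */
    case TOY_EXP:
        return snprintf(buf, len, "%e", v); /* exponent notation */
    case TOY_NONE:
    default:
        return snprintf(buf, len, "%g", v); /* shortest of %f and %e */
    }
}
```

For the value 1.5 this gives "1.5" with %g, "1.500000" with %f and
"1.500000e+00" with %e; the exact digits produced by %a vary between C
libraries, though the result always carries a "0x" prefix.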
It's possible for a dissector to claim a frame without adding to
the tree or being added to frame.protocols (see !6669)
Log a debug message showing the pinfo layers and the dissector that
claimed the tvb (frame/packet).
Port the script that creates init.lua to Python3. The generated init.lua
removes one newline and adds another, otherwise the output is identical
to the Perl version.
Ping #18152.
Packet info already contains the notion of layer depth for the
current protocol, among all the protocols in the frame. This
adds an extra layer number for the protocols that are the same
as the current one. Obviously this will only go above one if
the protocol is repeated in the stack, such as with IP tunneling.
Adds extra logic to track numbers for each protocol in the frame
and update them when calling a dissector.
The total layer number and protocol layer number are stored in
the field info structure so they can be used after dissection,
namely by display filters.
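The two numbers can be illustrated with a toy protocol stack. This is only
a sketch of the counting idea: `number_proto_layers()` is a hypothetical
name, and the quadratic scan is for brevity rather than a claim about how
the dissection engine tracks the counts.

```c
#include <stddef.h>
#include <string.h>

/*
 * For each protocol name in stack[0..n-1], store in proto_layer[i] how
 * many times that protocol has appeared up to and including position i.
 * The overall layer number of entry i is simply i + 1.
 */
static void number_proto_layers(const char *const *stack, size_t n,
                                unsigned *proto_layer)
{
    for (size_t i = 0; i < n; i++) {
        unsigned count = 1;

        for (size_t j = 0; j < i; j++) {
            if (strcmp(stack[j], stack[i]) == 0)
                count++;
        }
        proto_layer[i] = count;
    }
}
```

For a tunneled stack such as eth, ip, gre, ip, tcp the two IP layers get
protocol layer numbers 1 and 2, while every other protocol stays at 1.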
Needed to format timestamp in #18038 - packet-cql.c
Mirrors changes made in !1924 - Add ENC_TIME_NSECS timestamp encoding
Documentation in README.dissector, proto.c, proto.h - could use
refresh in a different merge request.
This replaces the current macro reference system with
a completely different implementation. Instead of a macro a reference
is a syntax element. A reference is a constant that can be filled
in the dfilter code after compilation from an existing protocol tree.
It is best understood as a field value that can be read from a fixed
tree that is not the frame being filtered. Usually this fixed tree
is the currently selected frame when the filter is applied. This
allows comparing fields in the filtered frame with fields in the
selected frame.
Because the field reference syntax uses the same sigil notation
as a macro we have to use a heuristic to distinguish them:
if the name has a dot it is a field reference, otherwise
it is a macro name.
The reference is syntactically validated at compile time.
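The heuristic itself amounts to a single test on the name. The following
is a sketch only; `is_field_reference()` is a hypothetical name and the
real lexer may apply further checks.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Decide how to treat the name found inside the sigil notation:
 * a name containing a dot is taken to be a field reference,
 * anything else is taken to be a macro name.
 */
static bool is_field_reference(const char *name)
{
    return strchr(name, '.') != NULL;
}
```

So a name like frame.number is read as a field reference, while a name
without a dot is looked up as a macro.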
There are two main advantages to this implementation (and a couple of
minor ones):
The protocol tree for each selected frame is only walked if we have a
display filter and if the display filter uses references. Also only the
actual reference values are copied, instead of loading the entire tree
into a hash table (in textual form even).
The other advantage is that the reference is tested like a protocol
field against all the values in the selected frame (if there is more
than one).
Currently the reference fields are not "primed" during dissection, so
the entire tree is walked to find a particular reference (this is
similar to the previous implementation).
If the display filter contains a valid reference and the reference is
not loaded at the time the filter is run, the result is the same as for
a nonexistent field in a regular READ_TREE instruction.
Fixes #17599.
Add BASE_SHOW_UTF_8_PRINTABLE and related function tvb_utf_8_isprint
for supporting fields of bytes that are "maybe UTF-8" (default or
SHOULD be UTF-8 but could be something else, with no encoding indicator),
such as SSID fields in IEEE 802.11 (See #16208), certain OctetString
fields in Diameter or PFCP, and other places where
BASE_SHOW_ASCII_PRINTABLE is currently used. Fix #5307
This makes it easier to understand the code, avoids conflicts
and ugly and unnecessary casts.
The field display enum has evolved over time from integer types
to a type generic parameter.
Replace:
g_snprintf() -> snprintf()
g_vsnprintf() -> vsnprintf()
g_strdup_printf() -> ws_strdup_printf()
g_strdup_vprintf() -> ws_strdup_vprintf()
This is more portable, user-friendly and faster on platforms
where GLib does not like the native I/O.
Adjust the format string to use macros from inttypes.h.
These display bases work to replace unprintable characters so the
name is a misnomer. In addition they are the same option and this
display behaviour is not something that is configurable.
This does not affect encodings because all our internal text strings
need to be valid UTF-8 and the source encoding is specified using
ENC_*.
Remove the assertion for valid UTF-8 in proto.c because
tvb_get_*_string() must return a valid UTF-8 string, always, and we
don't need to assert that, it is expensive.
Unlike other header fields in filter expressions, protocol names
cannot contain upper-case letters. Remove that restriction. This
should make start-up slightly faster as it removes an extra loop
for each protocol filter name.
This was added in 9ead15a6eb but
I don't see a reason to have different rules for protocols and
fields, it seems the README.developer was just being vague and
conflating PROTOABBREV with PROTOFILTERNAME.
The recommendation for lower case is a style recommendation,
and it's a good one, but it should be applied uniformly. As
long as we are not enforcing this for all field filter values
there is no point in enforcing it just for protocol names and
actually it is detrimental, e.g:
hi2operations
HI2Operations.IRIsContent
HI2Operations.UUS1_Content_element
HI2Operations.iRIContent
HI2Operations.iRISequence
HI2Operations.IRIContent
HI2Operations.iRI_Begin_record_element
HI2Operations.iRI_End_record_element
HI2Operations.iRI_Continue_record_element
HI2Operations.iRI_Report_record_element
(...)
It's weird and unexpected to have this difference and there is
no technical reason to require it. What we should probably do
is not include the protocol name in the FIELDFILTERNAME and
have the registration mechanism append it to the PROTOFILTERNAME.
Also disallow leading '-' everywhere in filter names, not just
protocol filter names. It's a universal requirement.
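A uniform check along these lines might look like the sketch below. The
set of permitted characters (ASCII letters and digits, '-', '_' and '.')
is an assumption made for illustration, and `filter_name_is_valid()` is a
hypothetical name; only the leading '-' rule and the acceptance of
upper-case letters come from this change.

```c
#include <ctype.h>
#include <stdbool.h>

/*
 * Apply the same rules to protocol and field filter names:
 * no empty name, no leading '-', upper-case letters allowed.
 * The permitted character set is an assumption for this sketch.
 */
static bool filter_name_is_valid(const char *name)
{
    if (name[0] == '\0' || name[0] == '-')
        return false;
    for (const char *p = name; *p != '\0'; p++) {
        unsigned char c = (unsigned char)*p;

        if (!(isalnum(c) || c == '-' || c == '_' || c == '.'))
            return false;
    }
    return true;
}
```

Under this check HI2Operations and HI2Operations.iRIContent are both
accepted, and -foo is rejected whether it names a protocol or a field.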
Add the heur_dtbl_entry_t entry as deregistered when deleting a
heuristics dissector. The UDP dissector is storing a pointer to
this in proto_data and may access the entry during reload Lua
plugins until all packets are redissected.
Implement little endian support for tvb_get_bits family of functions.
The big/little endian refers to bit numbering within an octet. In big
endian, the most significant bit is considered bit 0, while in little
endian the least significant bit is considered bit 0.
Add encoding parameters to proto tree bits format family functions.
Specify ENC_BIG_ENDIAN in all dissectors using these functions except in
USB HID that requires ENC_LITTLE_ENDIAN to work correctly.
When formatting bits values, always display most significant bit on the
leftmost position regardless of the encoding. This results in no gaps
between octets and makes the displayed value comprehensible.
Close #4478
Fix #17014
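The difference between the two bit numberings can be shown with a
single-bit accessor. This is a standalone sketch, not the tvb_get_bits
implementation; `buf_get_bit()` is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the bit at bit_offset in buf. With big endian numbering,
 * bit 0 of each octet is its most significant bit; with little endian
 * numbering, bit 0 is its least significant bit.
 */
static unsigned buf_get_bit(const uint8_t *buf, size_t bit_offset,
                            bool little_endian)
{
    uint8_t  octet = buf[bit_offset / 8];
    unsigned bit   = (unsigned)(bit_offset % 8);

    if (little_endian)
        return (octet >> bit) & 1u;
    return (octet >> (7 - bit)) & 1u;
}
```

For the octet 0x80, bit 0 reads as 1 with big endian numbering and as 0
with little endian numbering, where the set bit is bit 7 instead.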
- Make sure reassembly requests & errors are properly propagated from
any point in the PDU, no matter how many sub-structure levels.
- Handle the sub-dissection methods as well:
- Ensure the sub-dissection methods handle errors from previous calls.
- Reduce the error handling needed in sub-dissector implementations.
- Add missing sub-dissection methods for list, set, and map.
- Add the handling of sub-structure.
- Handle Compact protocol in addition to the existing binary protocol.
- Include and improve MR !3171
- Handle reassembly the same way as for binary protocol.
- Handle sub-dissection with the same functions.
=> Sub-dissectors only depend on .thrift files.
Additional changes:
- Use of constants instead of hard-coded values.
- Removed U64 support (never supported by thrift code generator, only
referenced in the C++ thrift library header but not supported in reality).
- Removed references to UTF-8 and UTF-16 string for the same reason.
- Replaced references to UTF-7 string with just string (same reason).
- Replaced references to byte with i8 as the documentation explicitly
states that byte is a compatibility name.
Documentation reference:
- https://thrift.apache.org/developers
- https://thrift.apache.org/docs/idl.html
- https://github.com/apache/thrift/blob/master/doc/specs/thrift-compact-protocol.md
- https://erikvanoosten.github.io/thrift-missing-specification/
- https://diwakergupta.github.io/thrift-missing-guide/
Closes #16244
Additional changes:
- Add authors and improve consistency
- Fix typo and clarify documentation
This header was installed incorrectly to epan/wmem_scopes.h.
Instead of creating additional installation rules for a single
header in a subfolder (kept for backward compatibility) just
rename the standard "epan/wmem/wmem.h" include to
"epan/wmem_scopes.h" and fix the documentation.
Now the header is installed *correctly* to epan/wmem_scopes.h.