The function tvb_get_const_stringz() does not check for a string
encoding and returns a pointer to a byte array. For this reason
it should not be used. Prefer other functions that return a
valid UTF-8 string from a source encoding or use tvb_get_ptr()
to fetch a byte pointer.
Remove the redundant BASE_FLOAT field display type. The name
BASE_FLOAT is meaningless and the value aliased to BASE_NONE.
Require BASE_NONE instead of BASE_FLOAT (corresponding to
the printf() %g format).
Add new float display types using BASE_DEC, BASE_HEX and BASE_EXP
corresponding to %f, %a and %e respectively.
Add support for BASE_CUSTOM with floats.
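The mapping between float display types and printf() conversions
described above can be sketched in standalone C. This is only an
illustration: the enum float_display and the helper format_float() are
hypothetical names and do not reproduce Wireshark's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the display types named in the text. */
enum float_display { DISPLAY_NONE, DISPLAY_DEC, DISPLAY_HEX, DISPLAY_EXP };

/* Format value into buf with the printf() conversion that the text
 * associates with each display type. Returns snprintf()'s result. */
int format_float(enum float_display display, double value,
                 char *buf, size_t buf_len)
{
    assert(buf != NULL && buf_len > 0);

    switch (display) {
    case DISPLAY_DEC: return snprintf(buf, buf_len, "%f", value);
    case DISPLAY_HEX: return snprintf(buf, buf_len, "%a", value);
    case DISPLAY_EXP: return snprintf(buf, buf_len, "%e", value);
    case DISPLAY_NONE: break;
    }
    return snprintf(buf, buf_len, "%g", value);   /* DISPLAY_NONE */
}
```

The exact text produced by %a is implementation-defined, so only the
other three formats are portable enough to compare byte for byte.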
Add conversation_new_full and find_conversation_full, which take
arbitrary element lists instead of fixed addresses and ports.
Update the comments in conversation.h to be more Doxygen-conformant.
Update README.dissector.
Use the new functionality to add initial conversation support to the
Falco Bridge dissector.
Needed to format timestamp in #18038 - packet-cql.c
Mirrors changes made in !1924 - Add ENC_TIME_NSECS timestamp encoding
Documentation in README.dissector, proto.c, proto.h - could use
refresh in a different merge request.
Add BASE_SHOW_UTF_8_PRINTABLE and related function tvb_utf_8_isprint
for supporting fields of bytes that are "maybe UTF-8" (default or
SHOULD be UTF-8 but could be something else, with no encoding indicator),
such as SSID fields in IEEE 802.11 (See #16208), certain OctetString
fields in Diameter or PFCP, and other places where
BASE_SHOW_ASCII_PRINTABLE is currently used. Fix #5307
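A "maybe UTF-8" check of the kind described above could look roughly
like the following standalone sketch. The function
bytes_are_printable_utf8() is a hypothetical stand-in, not
tvb_utf_8_isprint() itself, and its notion of "printable" is a
simplifying assumption: well-formed UTF-8 with no C0, DEL or C1 control
characters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if buf is well-formed UTF-8 without
 * control characters. An empty buffer is trivially accepted. */
bool bytes_are_printable_utf8(const uint8_t *buf, size_t len)
{
    size_t i = 0;

    assert(buf != NULL || len == 0);
    while (i < len) {
        uint8_t b = buf[i];
        size_t extra;
        uint32_t cp, min;

        if (b < 0x80) {                       /* single-byte (ASCII) */
            if (b < 0x20 || b == 0x7F)
                return false;                 /* C0 control or DEL */
            i++;
            continue;
        } else if ((b & 0xE0) == 0xC0) {
            extra = 1; cp = b & 0x1F; min = 0x80;
        } else if ((b & 0xF0) == 0xE0) {
            extra = 2; cp = b & 0x0F; min = 0x800;
        } else if ((b & 0xF8) == 0xF0) {
            extra = 3; cp = b & 0x07; min = 0x10000;
        } else {
            return false;                     /* invalid lead byte */
        }
        if (len - i <= extra)
            return false;                     /* truncated sequence */
        for (size_t k = 1; k <= extra; k++) {
            if ((buf[i + k] & 0xC0) != 0x80)
                return false;                 /* not a continuation byte */
            cp = (cp << 6) | (uint32_t)(buf[i + k] & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;                     /* overlong, too large, surrogate */
        if (cp <= 0x9F)
            return false;                     /* C1 control (U+0080..U+009F) */
        i += extra + 1;
    }
    return true;
}
```

A field that fails this check would fall back to a byte-oriented
display rather than being shown as text.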
Repeated words were found with:
egrep "(\b[a-zA-Z]+) +\1\b" . -Ir
and then manually reviewed.
Non-displayed strings (e.g., in comments)
were also corrected, to ease future review.
These display bases work to replace unprintable characters so the
name is a misnomer. In addition they are the same option and this
display behaviour is not something that is configurable.
This does not affect encodings because all our internal text strings
need to be valid UTF-8 and the source encoding is specified using
ENC_*.
Remove the assertion for valid UTF-8 in proto.c because
tvb_get_*_string() must always return a valid UTF-8 string, so the
assertion is unnecessary and it is expensive.
Unlike other header fields in filter expressions, protocol names
cannot contain upper-case letters. Remove that restriction. This
should make start-up slightly faster as it removes an extra loop
for each protocol filter name.
This was added in 9ead15a6eb but
I don't see a reason to have different rules for protocols and
fields, it seems the README.developer was just being vague and
conflating PROTOABBREV with PROTOFILTERNAME.
The recommendation for lower case is a style recommendation,
and it's a good one, but it should be applied uniformly. As
long as we are not enforcing this for all field filter values
there is no point in enforcing it just for protocol names and
actually it is detrimental, e.g.:
hi2operations
HI2Operations.IRIsContent
HI2Operations.UUS1_Content_element
HI2Operations.iRIContent
HI2Operations.iRISequence
HI2Operations.IRIContent
HI2Operations.iRI_Begin_record_element
HI2Operations.iRI_End_record_element
HI2Operations.iRI_Continue_record_element
HI2Operations.iRI_Report_record_element
(...)
It's weird and unexpected to have this difference and there is
no technical reason to require it. What we should probably do
is not include the protocol name in the FIELDFILTERNAME and
have the registration mechanism append it to the PROTOFILTERNAME.
Also disallow leading '-' everywhere in filter names, not just
protocol filter names. It's a universal requirement.
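The uniform rule argued for above can be sketched as a single
validator applied to both protocol and field filter names. The
function name filter_name_is_valid() is hypothetical, and the allowed
character set (letters of either case, digits, '-', '_' and '.') is an
assumption made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical check: same rule for protocol and field filter names.
 * Upper-case letters are accepted; a leading '-' never is. */
bool filter_name_is_valid(const char *name)
{
    assert(name != NULL);

    if (name[0] == '\0' || name[0] == '-')
        return false;
    for (const char *p = name; *p != '\0'; p++) {
        unsigned char c = (unsigned char)*p;
        if (!isalnum(c) && c != '-' && c != '_' && c != '.')
            return false;
    }
    return true;
}
```

With one rule, a protocol name such as HI2Operations and a field name
such as HI2Operations.iRIContent are accepted or rejected on the same
terms.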
- Make sure reassembly requests & errors are properly propagated from
any point in the PDU, no matter how many sub-structure levels.
- Handle the sub-dissection methods as well:
- Ensure the sub-dissection methods handle errors from previous calls.
- Reduce the error handling needed in sub-dissector implementations.
- Add missing sub-dissection methods for list, set, and map.
- Add the handling of sub-structure.
- Handle Compact protocol in addition to the existing binary protocol.
- Include and improve MR !3171
- Handle reassembly the same way as for binary protocol.
- Handle sub-dissection with the same functions.
=> Sub-dissectors only depend on .thrift files.
Additional changes:
- Use of constants instead of hard-coded values.
- Removed U64 support (never supported by thrift code generator, only
referenced in the C++ thrift library header but not supported in reality).
- Removed references to UTF-8 and UTF-16 string for the same reason.
- Replaced references to UTF-7 string with just string (same reason).
- Replaced references to byte with i8 as the documentation explicitly
states that byte is a compatibility name.
Documentation reference:
- https://thrift.apache.org/developers
- https://thrift.apache.org/docs/idl.html
- https://github.com/apache/thrift/blob/master/doc/specs/thrift-compact-protocol.md
- https://erikvanoosten.github.io/thrift-missing-specification/
- https://diwakergupta.github.io/thrift-missing-guide/
Closes #16244
Additional changes:
- Add authors and improve consistency
- Fix typo and clarify documentation
Add a new timestamp encoding format ENC_TIME_NSECS, like ENC_TIME_SEC but
for nanosecond values. Needed for my work-in-progress dissector for Apple
push notifications.
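What a nanosecond-count timestamp means in practice can be shown with
a small standalone sketch that splits such a count into whole seconds
and a nanosecond remainder. The struct time_parts and the function
split_nanoseconds() are hypothetical names; how the real encoding
reads the integer from the packet is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: whole seconds plus nanosecond remainder. */
struct time_parts {
    uint64_t secs;
    uint32_t nsecs;
};

/* Split a count of nanoseconds into seconds and nanoseconds. */
struct time_parts split_nanoseconds(uint64_t total_nsecs)
{
    struct time_parts t;

    t.secs = total_nsecs / UINT64_C(1000000000);
    t.nsecs = (uint32_t)(total_nsecs % UINT64_C(1000000000));
    assert(t.nsecs < 1000000000u);   /* remainder always fits below 1 s */
    return t;
}
```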
- Fix duplicate "are are".
- Fix NTP epoch year in ENC_TIME_NTP docs (572b80d2 fixed it in the README
but not in proto.h).
- Remove completely redundant "(ie. )" clauses.
ENC_TIME_MIP6 and ENC_TIME_CLASSIC_MAC_OS_SECS were added recently by
factoring them out of specific dissectors, but they weren't documented.
I added documentation, based on comments in the dissector code they came
from.
Add internal support for using iconv (always present with glib) to convert
strings from various encodings to UTF-8 (using REPLACEMENT CHARACTER as
recommended), and use that to support GB 18030 and EUC-KR. Replace the
direct call to iconv in ANSI 637 for EUC-KR with the new API. Update comments
and documentation around character encodings. It is possible to replace
the calls to iconv with an internal decoder later. Tested on Linux and
on Windows (including with illegal characters). Closes #16630.
FT_STRINGZPAD is for null-*padded* strings, where the field is in an
area of specified length, and, if the string is shorter than that
length, all bytes past the end of the string are NULs.
FT_STRINGZTRUNC is for null-*truncated* strings, where the field is in
an area of specified length and, if the string is shorter than that
length, there's a null character (which might be more than one byte, for
UCS-2, UTF-16, or UTF-32), and anything after that is not guaranteed to
have any particular value.
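The difference between the two field types can be illustrated with a
standalone sketch for single-byte encodings. Both helpers are
hypothetical names: the string length is found the same way in either
case, but only a null-padded field guarantees that every byte after
the string is NUL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Length of the string held in a fixed-size field: the bytes before
 * the first NUL, or the whole field if there is no NUL. */
size_t fixed_field_strlen(const char *field, size_t field_len)
{
    size_t n = 0;

    assert(field != NULL || field_len == 0);
    while (n < field_len && field[n] != '\0')
        n++;
    return n;
}

/* True if every byte after the string is NUL, i.e. the field meets
 * the null-padded guarantee and not merely the null-truncated one. */
bool field_is_null_padded(const char *field, size_t field_len)
{
    for (size_t i = fixed_field_strlen(field, field_len); i < field_len; i++) {
        if (field[i] != '\0')
            return false;
    }
    return true;
}
```

For wider encodings such as UCS-2, UTF-16 or UTF-32 the terminator is
more than one byte, which this byte-wise sketch does not handle.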
Use IS_FT_STRING() in some places rather than enumerating all the string
types, so that those places get automatically changed if the set of
string types changes.
FT_STRINGZ should be used *ONLY* if the string is *ALWAYS* supposed to
have a null terminator, either because the length isn't otherwise
specified, so that it can only be determined by finding the terminating
null character, or because a character count *and* a null terminator are
both used (yes, there appear to be some cases where that's true).
FT_STRINGZPAD is null-padded rather than null-terminated; this is
typically used for fixed-length fields that contain a string value that
might be shorter than the fixed length.
Change-Id: Ifdf421ca666482583a4dfc76167eae6dc473f48a
Reviewed-on: https://code.wireshark.org/review/38137
Reviewed-by: Guy Harris <gharris@sonic.net>
The static arrays are supposed to be arrays of const pointers to int,
not arrays of non-const pointers to const int.
Fixing that means some bugs (scribbling on what's *supposed* to be a
const array) will be caught (see packet-ieee80211-radiotap.c for
examples, the first of which inspired this change and the second of
which was discovered while testing compiles with this change), and
removes the need for some annoying casts.
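The distinction between the two declarations can be seen in a
standalone sketch. All names here (hf_flag_a, hf_flag_b, flag_fields,
count_fields, assign_field_id) are hypothetical; the point is only
that "int * const" makes the array elements const while leaving the
pointed-to integers writable.

```c
#include <assert.h>
#include <stddef.h>

static int hf_flag_a = -1;
static int hf_flag_b = -1;

/* Array of const pointers to (non-const) int: the pointers cannot be
 * reassigned, but the ints they point to can still be written. */
static int * const flag_fields[] = { &hf_flag_a, &hf_flag_b, NULL };

/* Count the entries before the NULL terminator. */
size_t count_fields(int * const *fields)
{
    size_t n = 0;

    assert(fields != NULL);
    while (fields[n] != NULL)
        n++;
    return n;
}

/* Stand-in for a registration step that stores an ID through the
 * pointer held in the array. */
void assign_field_id(int * const *fields, size_t idx, int id)
{
    *fields[idx] = id;      /* allowed: the pointed-to int is not const */
    /* fields[idx] = NULL;     would not compile: the element is const */
}
```

With the other declaration, "const int *fields[]", the situation is
reversed: the array could be scribbled on, and writing through the
pointers would need a cast.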
Also make some of those arrays static while we're at it.
Update documentation and dissector-generator tools.
Change-Id: I789da5fc60aadc15797cefecfd9a9fbe9a130ccc
Reviewed-on: https://code.wireshark.org/review/37517
Petri-Dish: Guy Harris <gharris@sonic.net>
Tested-by: Petri Dish Buildbot
Reviewed-by: Anders Broman <a.broman58@gmail.com>
Add some ENC_ values for various flavors of packed BCD, and use that
instead of explicitly calling tvb_bcd_dig_to_wmem_packet_str() and
adding the result.
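For readers unfamiliar with packed BCD, the following standalone
sketch decodes one common flavor: two digits per byte, low nibble
first, with any nibble above 9 treated as a filler that ends the
number. The function bcd_to_digits() is a hypothetical helper, and
the nibble order and filler handling are assumptions that differ
between the flavors the ENC_ values distinguish.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode packed BCD (low nibble first) into a NUL-terminated digit
 * string. Stops at the first nibble above 9 or when out is full.
 * Returns the number of digits written. */
size_t bcd_to_digits(const uint8_t *bcd, size_t bcd_len,
                     char *out, size_t out_len)
{
    size_t n = 0;

    assert(out != NULL && out_len > 0);
    for (size_t i = 0; i < bcd_len; i++) {
        uint8_t nibbles[2];

        nibbles[0] = (uint8_t)(bcd[i] & 0x0F);   /* first digit */
        nibbles[1] = (uint8_t)(bcd[i] >> 4);     /* second digit */
        for (int k = 0; k < 2; k++) {
            if (nibbles[k] > 9 || n + 1 >= out_len) {
                out[n] = '\0';
                return n;
            }
            out[n++] = (char)('0' + nibbles[k]);
        }
    }
    out[n] = '\0';
    return n;
}
```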
Change-Id: I07511d9d09c9231b610c121cd6ffb3b16fb017a9
Reviewed-on: https://code.wireshark.org/review/36952
Reviewed-by: Guy Harris <gharris@sonic.net>
They were added in the code, but weren't documented.
Change-Id: Iaa12e2d33aa4a4b889c00a7f10b12b4c9b6e8197
Reviewed-on: https://code.wireshark.org/review/36953
Reviewed-by: Guy Harris <gharris@sonic.net>
Check is enabled by #ifdef ENABLE_CHECK_FILTER
Remaining issues found by this check are fixed here,
along with a documentation note that the entries
are checked in order and the first match is used.
The only issue not yet fixed is in packet-isup.c,
where the spec was not available to me.
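Why entry order matters under a first-match rule can be shown with a
standalone sketch. It assumes, for illustration only, that the entries
are value ranges mapped to labels; struct range_entry and
first_match() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: inclusive range and its label. A NULL
 * label terminates the table. */
struct range_entry {
    uint32_t min;
    uint32_t max;
    const char *label;
};

/* Return the label of the first entry whose range contains value,
 * or NULL if no entry matches. */
const char *first_match(const struct range_entry *table, uint32_t value)
{
    assert(table != NULL);

    for (size_t i = 0; table[i].label != NULL; i++) {
        if (value >= table[i].min && value <= table[i].max)
            return table[i].label;
    }
    return NULL;
}
```

If a catch-all range such as 0-255 precedes a specific entry for 5,
the specific entry can never be returned, which is the kind of
mistake an ordering check is meant to catch.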
Change-Id: Ife747cda9b91a265bc2b81ce0a53f55f3389919e
Reviewed-on: https://code.wireshark.org/review/36708
Petri-Dish: Martin Mathieson <martin.r.mathieson@googlemail.com>
Tested-by: Petri Dish Buildbot
Reviewed-by: Martin Mathieson <martin.r.mathieson@googlemail.com>
Generate a dissector based on doc/packet-PROTOABBREV.c.
Change-Id: I9233c1212acb30f7166ba91e39d98bc3fb123731
Reviewed-on: https://code.wireshark.org/review/35062
Reviewed-by: Graham Bloice <graham.bloice@trihedral.com>
Reviewed-by: Dario Lombardo <lomato@gmail.com>
proto_tree_add_bitmask_len also expects an expert information field
to display in the event that the decodable length is less than the
specified length.
Bug: 16061
Change-Id: If8061b0754cd6862799ab76bf9c10e16ed5d8f38
Reviewed-on: https://code.wireshark.org/review/34567
Reviewed-by: Alexis La Goutte <alexis.lagoutte@gmail.com>
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
We removed the "title" member from decode_as_t.
Update the sample code snippet accordingly.
Change-Id: I5d4ba979c955de50287f5b4deea7c64bf96f7d9b
Reviewed-on: https://code.wireshark.org/review/33574
Reviewed-by: Anders Broman <a.broman58@gmail.com>
Add a BASE_SHOW_ASCII_PRINTABLE flag for the "display" field, to use
with FT_BYTES and FT_UINT_BYTES fields; it specifies that, if the field
consists solely of printable ASCII characters, its value be displayed as
a string, in quotes. Have a routine hfinfo_format_bytes() to do that
formatting, depending on the display field value.
Add routines to fetch the display value of string and
FT_BYTES/FT_UINT_BYTES fields; for strings, it's the result of
hfinfo_format_text(), and for byte arrays, it's the result of
hfinfo_format_bytes().
Use BASE_SHOW_ASCII_PRINTABLE for extended attribute data in SMB and
SMB2. Use the routines in question for extended attribute names
(string) and data (bytes). That keeps us from displaying non-text
extended attribute data as if it were text.
Document BASE_SHOW_ASCII_PRINTABLE.
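The display rule described above can be sketched in standalone C:
show the bytes as a quoted string only if all of them are printable
ASCII, otherwise show hex. The helpers all_printable_ascii() and
format_bytes() are hypothetical and are not hfinfo_format_bytes(); the
unseparated lower-case hex output is an assumption for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* True if data is non-empty and every byte is printable ASCII. */
static bool all_printable_ascii(const uint8_t *data, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (data[i] < 0x20 || data[i] > 0x7E)
            return false;
    }
    return len > 0;
}

/* Write data to buf as a quoted string if it is printable ASCII,
 * otherwise as hex digits. Output is truncated if buf is too small. */
void format_bytes(const uint8_t *data, size_t len, char *buf, size_t buf_len)
{
    assert(buf != NULL && buf_len > 0);

    buf[0] = '\0';
    if (all_printable_ascii(data, len)) {
        snprintf(buf, buf_len, "\"%.*s\"", (int)len, (const char *)data);
        return;
    }
    /* Each byte needs two hex digits plus room for the final NUL. */
    for (size_t i = 0, n = 0; i < len && n + 3 <= buf_len; i++, n += 2)
        snprintf(buf + n, buf_len - n, "%02x", (unsigned)data[i]);
}
```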
Change-Id: I24dcf459c14f00985e4daaf9b58f5933964eabd8
Reviewed-on: https://code.wireshark.org/review/33517
Petri-Dish: Guy Harris <guy@alum.mit.edu>
Tested-by: Petri Dish Buildbot
Reviewed-by: Guy Harris <guy@alum.mit.edu>
Reorganize the lists of accessors, with a top-level heading for the byte
order and subheadings for each size.
Also document ENC_HOST_ENDIAN.
Change-Id: I10131e399f6c90624a387c89340f77ea769ab33f
Reviewed-on: https://code.wireshark.org/review/32701
Reviewed-by: Guy Harris <guy@alum.mit.edu>
This adds val64_string_ext to parallel value_string_ext in the
same way that val64_string parallels value_string.
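One benefit usually associated with an "extended" value table is
faster lookup when the entries are sorted. The following standalone
sketch shows a binary search over 64-bit values; struct val64_entry
and val64_lookup_sorted() are hypothetical names and do not reproduce
the val64_string_ext API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: 64-bit value and its label. */
struct val64_entry {
    uint64_t value;
    const char *label;
};

/* Binary search over a table sorted by ascending value. Returns the
 * label for value, or NULL if it is not present. */
const char *val64_lookup_sorted(const struct val64_entry *table,
                                size_t count, uint64_t value)
{
    size_t lo = 0, hi = count;

    assert(table != NULL || count == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (table[mid].value == value)
            return table[mid].label;
        if (table[mid].value < value)
            lo = mid + 1;
        else
            hi = mid;
    }
    return NULL;
}
```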
Change-Id: Iadbfc49f5a4540000ed92fd0469e8d273911e97e
Reviewed-on: https://code.wireshark.org/review/30385
Petri-Dish: Peter Wu <peter@lekensteyn.nl>
Tested-by: Petri Dish Buildbot
Reviewed-by: Anders Broman <a.broman58@gmail.com>
Add ENC_TIME_SECS_NSECS and ENC_TIME_SECS_USECS; they make it more
explicit (especially to those not familiar with UN*X data types) what
the representation is, allow for ENC_TIME_SECS_MSECS etc. if they're
needed, and match names such as ENC_TIME_SECS and ENC_TIME_MSECS.
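The representation that the SECS_NSECS name spells out, a seconds
value followed by a nanoseconds value, can be sketched in standalone C.
The layout assumed here (two 4-byte big-endian integers) is for
illustration only, and struct secs_nsecs and parse_secs_nsecs_be() are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded form: seconds followed by nanoseconds. */
struct secs_nsecs {
    uint32_t secs;
    uint32_t nsecs;
};

/* Read a 32-bit big-endian integer from p. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Parse an 8-byte field: 4 bytes of seconds, then 4 bytes of
 * nanoseconds, both big-endian (assumed layout). */
struct secs_nsecs parse_secs_nsecs_be(const uint8_t *buf, size_t len)
{
    struct secs_nsecs t;

    assert(buf != NULL && len >= 8);
    t.secs = read_be32(buf);
    t.nsecs = read_be32(buf + 4);
    return t;
}
```

A SECS_USECS variant would differ only in the unit of the second
integer.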
Change-Id: I6ab36fb4da70563587141cd65ffff8523477b0c4
Reviewed-on: https://code.wireshark.org/review/28564
Reviewed-by: Guy Harris <guy@alum.mit.edu>
It has been replaced by cmake.
Change-Id: I83a5eddb8645dbbf6bca9f026066d2e995d8e87a
Reviewed-on: https://code.wireshark.org/review/26969
Petri-Dish: Dario Lombardo <lomato@gmail.com>
Tested-by: Petri Dish Buildbot
Reviewed-by: Gerald Combs <gerald@wireshark.org>
Reviewed-by: Anders Broman <a.broman58@gmail.com>
First mention tvb_new_subset_remaining(), as that's good enough for
most uses.
Then mention tvb_new_subset_length(), which is what most of the
remaining cases should use; we weren't even documenting it.
Then mention tvb_new_subset_length_caplen(); we want that to be used
only when *absolutely* necessary.
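The first two kinds of subset can be modelled with a standalone
sketch over a plain byte view. struct byte_view and both functions are
hypothetical; real tvbuffs also track captured versus reported length
and raise exceptions on out-of-range access, whereas this sketch only
uses assert().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical non-owning view of a byte buffer. */
struct byte_view {
    const uint8_t *data;
    size_t len;
};

/* View of everything from offset to the end: the common case. */
struct byte_view view_subset_remaining(struct byte_view v, size_t offset)
{
    struct byte_view sub;

    assert(offset <= v.len);
    sub.data = v.data + offset;
    sub.len = v.len - offset;
    return sub;
}

/* View of exactly length bytes starting at offset. */
struct byte_view view_subset_length(struct byte_view v, size_t offset,
                                    size_t length)
{
    struct byte_view sub = view_subset_remaining(v, offset);

    assert(length <= sub.len);
    sub.len = length;
    return sub;
}
```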
Change-Id: I57a6c202d4a68b001ddca8bd4c7e1d271eb52ef9
Reviewed-on: https://code.wireshark.org/review/26864
Reviewed-by: Guy Harris <guy@alum.mit.edu>
Update invalid description for tvb_get_nstringz() and
tvb_get_nstringz0().
Change-Id: I03483bc1a2aa5a701b44cd895b91289716ef215d
Reviewed-on: https://code.wireshark.org/review/26598
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
Petri-Dish: Peter Wu <peter@lekensteyn.nl>
Tested-by: Petri Dish Buildbot
Reviewed-by: Anders Broman <a.broman58@gmail.com>