Thrift: Complete handling of Binary & Compact protocols

- Make sure reassembly requests & errors are properly propagated from
  any point in the PDU, no matter how many sub-structure levels.
- Handle the sub-dissection methods as well:
  - Ensure the sub-dissection methods handle errors from previous calls.
  - Reduce the error handling needed in sub-dissector implementations.
  - Add missing sub-dissection methods for list, set, and map.
  - Add the handling of sub-structure.
- Handle Compact protocol in addition to the existing binary protocol.
  - Include and improve MR !3171
  - Handle reassembly the same way as for binary protocol.
  - Handle sub-dissection with the same functions.
    => Sub-dissectors only depend on .thrift files.

Additional changes:
- Use of constants instead of hard-coded values.
- Removed U64 support (never supported by thrift code generator, only
  referenced in the C++ thrift library header but not supported in reality).
- Removed references to UTF-8 and UTF-16 string for the same reason.
- Replaced references to UTF-7 string with just string (same reason).
- Replaced references to byte with i8 as the documentation explicitly
  states that byte is a compatibility name.

Documentation reference:
- https://thrift.apache.org/developers
- https://thrift.apache.org/docs/idl.html
- https://github.com/apache/thrift/blob/master/doc/specs/thrift-compact-protocol.md
- https://erikvanoosten.github.io/thrift-missing-specification/
- https://diwakergupta.github.io/thrift-missing-guide/

Closes #16244

Additional changes:
- Add authors and improve consistency
- Fix typo and clarify documentation
Triton Circonflexe 2021-04-23 19:34:54 +02:00 committed by Wireshark GitLab Utility
parent b17f354304
commit d4de52690f
7 changed files with 3289 additions and 706 deletions
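
The propagation rule described in the commit message (a reassembly request or an error raised at any nesting depth reaches the top, and each dissection function tolerates a failure from the previous call) can be sketched as follows. This is a minimal, self-contained model and not the code from packet-thrift.c: `node_t` and `dissect_node` are hypothetical stand-ins, and only the two sentinel values are taken from packet-thrift.h.

```c
#include <assert.h>
#include <stddef.h>

/* Sentinel return values, as defined in packet-thrift.h by this commit. */
#define THRIFT_REQUEST_REASSEMBLY (-1)
#define THRIFT_SUBDISSECTOR_ERROR (-2)

/* Hypothetical stand-in for a nested Thrift value (not a Wireshark type):
 * a leaf needs `size` bytes, a container holds `count` children. */
typedef struct node {
    int size;                    /* bytes needed by a leaf */
    int malformed;               /* non-zero: the leaf cannot be decoded */
    const struct node *children; /* NULL for a leaf */
    size_t count;                /* number of children of a container */
} node_t;

/* Returns the offset of the first non-dissected byte, or a sentinel. */
int dissect_node(const node_t *n, int offset, int available)
{
    assert(n != NULL);
    if (offset < 0)
        return offset; /* failure from a previous call: pass it through */
    if (n->children == NULL) {
        if (n->malformed)
            return THRIFT_SUBDISSECTOR_ERROR;
        if (offset + n->size > available)
            return THRIFT_REQUEST_REASSEMBLY;
        return offset + n->size;
    }
    for (size_t i = 0; i < n->count; i++) {
        /* No per-call check here: once a child fails, every later call
         * returns the same sentinel unchanged. */
        offset = dissect_node(&n->children[i], offset, available);
    }
    return offset;
}
```

Because every call starts by checking the incoming offset, a caller can chain calls and test the result once at the end, which is the reduction in sub-dissector error handling that the commit message refers to.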


@ -3989,9 +3989,10 @@ James Lynch <lynch007[AT]gmail.com>
Chidambaram Arunachalam <carunach[AT]cisco.com>
João Valverde <j[AT]v6e.pt>
Benoît Canet <benoit[AT]scylladb.com>
Håkon Øye Amundsen <haakon.amundsen[AT]nordicsemi.no>
Håkon Øye Amundsen <haakon.amundsen[AT]nordicsemi.no>
Jeffrey Wildman <jeffrey.wildman[AT]ll.mit.edu>
Jan Schiefer <jan.schiefer[AT]keysight.com>
Triton Circonflexe <triton+enuiqr[AT]kumal.info>
= From git log =
@ -5022,7 +5023,7 @@ Mattia Cazzola <mattiac[AT]alinet.it> provided a patch to the hex dump display r
We use the exception module from Kazlib, a C library written by Kaz Kylheku <kaz[AT]kylheku.com>. Thanks go to him for his well-written library. The Kazlib home page can be found at http://www.kylheku.com/~kaz/kazlib.html
We use Lua BitOp, written by Mike Pall, for bitwise operations on numbers in Lua. The Lua BitOp home page can be found at https://bitop.luajit.org
We use Lua BitOp, written by Mike Pall, for bitwise operations on numbers in Lua. The Lua BitOp home page can be found at https://bitop.luajit.org/
snax <snax[AT]shmoo.com> gave permission to use his(?) weak key detection code from Airsnort.


@ -3989,9 +3989,10 @@ James Lynch <lynch007[AT]gmail.com>
Chidambaram Arunachalam <carunach[AT]cisco.com>
João Valverde <j[AT]v6e.pt>
Benoît Canet <benoit[AT]scylladb.com>
Håkon Øye Amundsen <haakon.amundsen[AT]nordicsemi.no>
Håkon Øye Amundsen <haakon.amundsen[AT]nordicsemi.no>
Jeffrey Wildman <jeffrey.wildman[AT]ll.mit.edu>
Jan Schiefer <jan.schiefer[AT]keysight.com>
Triton Circonflexe <triton+enuiqr[AT]kumal.info>
= Acknowledgements =
@ -4001,7 +4002,7 @@ Mattia Cazzola <mattiac[AT]alinet.it> provided a patch to the hex dump display r
We use the exception module from Kazlib, a C library written by Kaz Kylheku <kaz[AT]kylheku.com>. Thanks go to him for his well-written library. The Kazlib home page can be found at http://www.kylheku.com/~kaz/kazlib.html
We use Lua BitOp, written by Mike Pall, for bitwise operations on numbers in Lua. The Lua BitOp home page can be found at http://bitop.luajit.org/
We use Lua BitOp, written by Mike Pall, for bitwise operations on numbers in Lua. The Lua BitOp home page can be found at https://bitop.luajit.org/
Henrik Brix Andersen <brix[AT]gimp.org> gave permission for his webbrowser calling routine to be used.


@ -438,13 +438,20 @@ libwireshark.so.0 libwireshark0 #MINVER#
dissect_rpc_void@Base 1.99.8
dissect_rtp_shim_header@Base 2.9.1
dissect_tcp_payload@Base 1.99.0
dissect_thrift_t_byte@Base 2.5.0
dissect_thrift_t_bool@Base 3.5.0
dissect_thrift_t_i8@Base 3.5.0
dissect_thrift_t_i16@Base 3.5.0
dissect_thrift_t_i32@Base 2.5.0
dissect_thrift_t_i64@Base 2.5.1
dissect_thrift_t_double@Base 3.5.0
dissect_thrift_t_binary@Base 3.5.0
dissect_thrift_t_list@Base 3.5.0
dissect_thrift_t_map@Base 3.5.0
dissect_thrift_t_set@Base 3.5.0
dissect_thrift_t_string@Base 3.5.0
dissect_thrift_t_string_enc@Base 3.5.0
dissect_thrift_t_stop@Base 2.5.0
dissect_thrift_t_struct@Base 2.5.0
dissect_thrift_t_u64@Base 2.5.1
dissect_thrift_t_utf7@Base 2.5.0
dissect_tpkt_encap@Base 1.9.1
dissect_unknown_ber@Base 1.9.1
dissect_xdlc_control@Base 1.9.1


@ -2488,7 +2488,7 @@ Where:
next_tvb is the new TVBUFF_SUBSET.
offset is the byte offset of 'tvb' at which the new tvbuff
should start. The first byte is the 0th byte.
should start. The first byte is the byte at offset 0.
To create a new TVBUFF_SUBSET that begins at a specified offset in a
parent tvbuff, with a specified number of bytes in the payload, the
@ -2503,7 +2503,7 @@ Where:
next_tvb is the new TVBUFF_SUBSET.
offset is the byte offset of 'tvb' at which the new tvbuff
should start. The first byte is the 0th byte.
should start. The first byte is the byte at offset 0.
reported_length is the number of bytes that the current protocol
says should be in the payload.
@ -2522,7 +2522,7 @@ Where:
next_tvb is the new TVBUFF_SUBSET.
offset is the byte offset of 'tvb' at which the new tvbuff
should start. The first byte is the 0th byte.
should start. The first byte is the byte at offset 0.
length is the number of bytes in the new TVBUFF_SUBSET. A length
argument of -1 says to use as many bytes as are available in
@ -2560,7 +2560,7 @@ table using their unique identifier using one of the following APIs:
table, but it lets the user add it from the command line or, in Wireshark,
through the "Decode As" UI.
Then when the dissector hits the common identifier field, it will useone of the
Then when the dissector hits the common identifier field, it will use one of the
following APIs to invoke the subdissector:
int dissector_try_uint(dissector_table_t sub_dissectors,
@ -2696,7 +2696,7 @@ function. The ptype variable is used to differentiate between
conversations over different protocols, i.e. TCP and UDP. The options
variable is used to define a conversation that will accept any destination
address and/or port. Set options = 0 if the destination port and address
are know when conversation_new is called. See section 2.4 for more
are known when conversation_new is called. See section 2.4 for more
information on usage of the options parameter.
The conversation_new prototype:

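The offset and length conventions touched by the TVBUFF_SUBSET hunks in this file (the first byte is at offset 0, and a length of -1 means "as many bytes as are available") can be illustrated with a small model. This is only a sketch: `view_t` and `view_subset` are hypothetical names over a plain byte array, not the tvbuff API, and the model returns -1 on a range error, which is a simplification chosen for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer view (not tvbuff_t). */
typedef struct {
    const unsigned char *data;
    int length;
} view_t;

/* Build a sub-view of `parent` starting at byte `offset` (0-based).
 * A `length` of -1 selects every byte from `offset` to the end.
 * Returns 0 on success and -1 if the requested range does not fit. */
int view_subset(const view_t *parent, int offset, int length, view_t *out)
{
    assert(parent != NULL && out != NULL);
    if (offset < 0 || offset > parent->length)
        return -1;
    int available = parent->length - offset;
    if (length == -1)
        length = available;
    if (length < 0 || length > available)
        return -1;
    out->data = parent->data + offset;
    out->length = length;
    return 0;
}
```

With a 6-byte parent, `view_subset(&parent, 2, -1, &sub)` yields a 4-byte view whose first byte is the parent's byte at offset 2.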
File diff suppressed because it is too large


@ -1,4 +1,7 @@
/* packet-thrift.h
*
* Copyright 2015, Anders Broman <anders.broman[at]ericsson.com>
* Copyright 2019-2021, Triton Circonflexe <triton[at]kumal.info>
*
* Wireshark - Network traffic analyzer
* By Gerald Combs <gerald@wireshark.org>
@ -6,7 +9,7 @@
*
* SPDX-License-Identifier: GPL-2.0-or-later
*
* Note used by proprietarry dissectors (too).
* Note: used by proprietary dissectors (too).
*/
#ifndef __PACKET_THRIFT_H__
@ -15,61 +18,159 @@
#include "ws_symbol_export.h"
typedef enum
{
typedef enum {
DE_THRIFT_T_STOP = 0,
DE_THRIFT_T_VOID,
DE_THRIFT_T_BOL,
DE_THRIFT_T_BYTE,
DE_THRIFT_T_VOID, // DE_THRIFT_T_UNUSED_1?
DE_THRIFT_T_BOOL,
DE_THRIFT_T_I8,
DE_THRIFT_T_DOUBLE,
DE_THRIFT_T_UNUSED_5,
DE_THRIFT_T_UNUSED_5, // Intended for U16?
DE_THRIFT_T_I16,
DE_THRIFT_T_UNUSED_7,
DE_THRIFT_T_UNUSED_7, // Intended for U32?
DE_THRIFT_T_I32,
DE_THRIFT_T_U64,
DE_THRIFT_T_UNUSED_9, // Intended for U64?
DE_THRIFT_T_I64,
DE_THRIFT_T_UTF7,
DE_THRIFT_T_BINARY,
DE_THRIFT_T_STRUCT,
DE_THRIFT_T_MAP,
DE_THRIFT_T_SET,
DE_THRIFT_T_LIST,
DE_THRIFT_T_UTF8,
DE_THRIFT_T_UTF16
} trift_type_enum_t;
} thrift_type_enum_t;
typedef struct _thrift_struct_t {
const int *p_id; /* The hf field for the struct member*/
int fid; /* The Thrift field id of the stuct memeber*/
gboolean optional; /* TRUE if element is optional, FALSE otherwise */
trift_type_enum_t type; /* The thrift type of the struct member */
} thrift_struct_t;
typedef enum {
ME_THRIFT_T_CALL = 1,
ME_THRIFT_T_REPLY,
ME_THRIFT_T_EXCEPTION,
ME_THRIFT_T_ONEWAY,
} thrift_method_type_enum_t;
/*
These functions are to be used by dissectors dissecting Thrift based protocols similar to packet-ber.c
* This is a list of flags even though not all combinations are available.
* - Framed is compatible with everything;
* - Default (0x00) is old binary;
* - Binary can be augmented with Strict (message header is different but content is the same);
* - Compact is incompatible with Binary & Strict as everything is coded differently;
* - If Compact bit is set, Strict bit will be ignored (0x06 ~= 0x04).
*
* Valid values go from 0x00 (old binary format) to 0x05 (framed compact).
*
* Note: Compact is not fully supported yet.
*/
typedef enum {
PROTO_THRIFT_BINARY = 0x00,
PROTO_THRIFT_FRAMED = 0x01,
PROTO_THRIFT_STRICT = 0x02,
PROTO_THRIFT_COMPACT = 0x04
} thrift_protocol_enum_t;
*/
WS_DLL_PUBLIC int dissect_thrift_t_stop(tvbuff_t* tvb, packet_info* pinfo _U_, proto_tree* tree, int offset);
#define THRIFT_OPTION_DATA_CANARY 0x8001da7a
#define THRIFT_REQUEST_REASSEMBLY (-1)
#define THRIFT_SUBDISSECTOR_ERROR (-2)
WS_DLL_PUBLIC int dissect_thrift_t_byte(tvbuff_t* tvb, packet_info* pinfo _U_, proto_tree* tree, int offset, int field_id _U_, gint hf_id);
WS_DLL_PUBLIC int dissect_thrift_t_i32(tvbuff_t* tvb, packet_info* pinfo _U_, proto_tree* tree, int offset, int field_id _U_, gint hf_id);
WS_DLL_PUBLIC int dissect_thrift_t_u64(tvbuff_t* tvb, packet_info* pinfo _U_, proto_tree* tree, int offset, int field_id _U_, gint hf_id);
WS_DLL_PUBLIC int dissect_thrift_t_i64(tvbuff_t* tvb, packet_info* pinfo _U_, proto_tree* tree, int offset, int field_id _U_, gint hf_id);
WS_DLL_PUBLIC int dissect_thrift_t_utf7(tvbuff_t* tvb, packet_info* pinfo _U_, proto_tree* tree, int offset, int field_id _U_, gint hf_id);
typedef struct _thrift_option_data_t {
guint32 canary; /* Ensure that we don't read garbage.
* Sub-dissectors should check against THRIFT_OPTION_DATA_CANARY. */
thrift_method_type_enum_t mtype; /* Method type necessary to know how to decode the message. */
thrift_protocol_enum_t tprotocol; /* Type and version of Thrift TProtocol.
* Framed?((Strict? Binary)|Compact) */
gint64 reply_field_id; /* First (and theoretically only) field id of the current REPLY.
* This is useful for the sub-dissectors to handle exceptions. */
gint64 previous_field_id; /* Last field id that was present in the current struct.
* Set by dissect_thrift_t_struct after the field has been
* entirely read.
* Read by the next dissect_thrift_t_field_header if only
* a delta is available (for TCompactProtocol). */
proto_tree *reassembly_tree; /* Tree where the reassembly was requested. */
/* Useful if the caller can't reassemble (Framed). */
gint32 reassembly_offset; /* Where the incomplete data starts. */
gint32 reassembly_length; /* Expected size of the data. */
} thrift_option_data_t;
#define TMFILL NULL, { .m = { NULL, NULL } }
typedef struct _thrift_member_t thrift_member_t;
struct _thrift_member_t {
const gint *p_hf_id; /* The hf field for the struct member*/
const gint16 fid; /* The Thrift field id of the struct member */
const gboolean optional; /* TRUE if element is optional, FALSE otherwise */
const thrift_type_enum_t type; /* The thrift type of the struct member */
const gint *p_ett_id; /* An ett field used for the subtree created if the member is a compound type. */
union {
const guint encoding;
const thrift_member_t *element;
const thrift_member_t *members;
struct {
const thrift_member_t *key;
const thrift_member_t *value;
} m;
} u;
};
/** These functions are to be used by dissectors dissecting Thrift based protocols similar to packet-ber.c
*
* @param[in] tvb: Pointer to the tvbuff_t holding the captured data.
* @param[in] pinfo: Pointer to the packet_info holding information about the currently dissected packet.
* @param[in] tree: Pointer to the proto_tree used to hold the display tree in Wireshark's interface.
* @param[in] offset: Offset from the beginning of the tvbuff_t where the Thrift field is. Function will dissect type, id, & data.
* @param[in] thrift_opt: Options from the Thrift dissector that will be necessary for sub-dissection (binary vs. compact, ...)
* @param[in] is_field: Indicate if the offset points to a field element and if field type and field id must be dissected.
* Only for containers internal use. Sub-dissectors must always use TRUE except for struct (see below).
* @param[in] field_id: Thrift field identifier, to check that the right field is being dissected (in case of optional fields).
* @param[in] hf_id: Header field info that describes the field to display (display name, filter name, FT_TYPE, ...).
*
* @param[in] encoding: Encoding used for string display. (Only for dissect_thrift_t_string_enc)
*
* @return Offset of the first non-dissected byte in case of success,
* THRIFT_REQUEST_REASSEMBLY (-1) in case reassembly is required, or
* THRIFT_SUBDISSECTOR_ERROR (-2) in case of error.
* Sub-dissector must follow the same convention on return.
* Replacing THRIFT_SUBDISSECTOR_ERROR with a 0 return value has the same effect
* as activating "Fallback to generic Thrift dissector if sub-dissector fails"
* in this dissector (thrift.fallback_on_generic option).
*/
WS_DLL_PUBLIC int dissect_thrift_t_stop (tvbuff_t* tvb, packet_info* pinfo, proto_tree* tree, int offset);
WS_DLL_PUBLIC int dissect_thrift_t_bool (tvbuff_t* tvb, packet_info* pinfo, proto_tree* tree, int offset, thrift_option_data_t *thrift_opt, gboolean is_field, int field_id, gint hf_id);
WS_DLL_PUBLIC int dissect_thrift_t_i8 (tvbuff_t* tvb, packet_info* pinfo, proto_tree* tree, int offset, thrift_option_data_t *thrift_opt, gboolean is_field, int field_id, gint hf_id);
WS_DLL_PUBLIC int dissect_thrift_t_i16 (tvbuff_t* tvb, packet_info* pinfo, proto_tree* tree, int offset, thrift_option_data_t *thrift_opt, gboolean is_field, int field_id, gint hf_id);
WS_DLL_PUBLIC int dissect_thrift_t_i32 (tvbuff_t* tvb, packet_info* pinfo, proto_tree* tree, int offset, thrift_option_data_t *thrift_opt, gboolean is_field, int field_id, gint hf_id);
WS_DLL_PUBLIC int dissect_thrift_t_i64 (tvbuff_t* tvb, packet_info* pinfo, proto_tree* tree, int offset, thrift_option_data_t *thrift_opt, gboolean is_field, int field_id, gint hf_id);
WS_DLL_PUBLIC int dissect_thrift_t_double (tvbuff_t* tvb, packet_info* pinfo, proto_tree* tree, int offset, thrift_option_data_t *thrift_opt, gboolean is_field, int field_id, gint hf_id);
WS_DLL_PUBLIC int dissect_thrift_t_binary (tvbuff_t* tvb, packet_info* pinfo, proto_tree* tree, int offset, thrift_option_data_t *thrift_opt, gboolean is_field, int field_id, gint hf_id);
WS_DLL_PUBLIC int dissect_thrift_t_string (tvbuff_t* tvb, packet_info* pinfo, proto_tree* tree, int offset, thrift_option_data_t *thrift_opt, gboolean is_field, int field_id, gint hf_id);
WS_DLL_PUBLIC int dissect_thrift_t_string_enc(tvbuff_t* tvb, packet_info* pinfo, proto_tree* tree, int offset, thrift_option_data_t *thrift_opt, gboolean is_field, int field_id, gint hf_id, guint encoding);
/** Dissect a Thrift struct
* Dissect a Thrift struct by calling the struct member dissector in turn from the thrift_struct_t array
*
* @param[in] tvb tvb with the thrift data
* @param[in] pinfo The packet info struct
* @param[in] tree the packet tree
* @param[in] offset the offset where to start dissection in the given tvb
* @param[in] seq an array of thrift_struct_t's containing thrift type of the struct members the hf variable to use etc.
* @param[in] field_id the Thrift field id of the struct
* @param[in] hf_id a header field of FT_BYTES which will be the struct header field
* @param[in] ett_id an ett field used for the subtree created to list the struct members.
* @return The number of bytes dissected.
*/
WS_DLL_PUBLIC int dissect_thrift_t_struct(tvbuff_t* tvb, packet_info* pinfo, proto_tree* tree, int offset, const thrift_struct_t *seq,
int field_id _U_, gint hf_id, gint ett_id);
* Dissect a Thrift struct by calling the struct member dissector in turn from the thrift_member_t array
*
* @param[in] tvb: Pointer to the tvbuff_t holding the captured data.
* @param[in] pinfo: Pointer to the packet_info holding information about the currently dissected packet.
* @param[in] tree: Pointer to the proto_tree used to hold the display tree in Wireshark's interface.
* @param[in] offset: Offset from the beginning of the tvbuff_t where the Thrift field is. Function will dissect type, id, & data.
* @param[in] thrift_opt: Options from the Thrift dissector that will be necessary for sub-dissection (binary vs. compact, ...)
* @param[in] is_field: Indicate if the offset points to a field element and if field type and field id must be dissected.
* Only for internal use in containers. Sub-dissectors must always use TRUE except for struct (see below).
* Sub-dissectors should always use TRUE except in one case:
* - Define the parameters of the Thrift command as a struct (including T_STOP at the end)
* - Single call to dissect_thrift_t_struct with is_field = FALSE.
* @param[in] field_id: Thrift field identifier, to check that the right field is being dissected (in case of optional fields).
* @param[in] hf_id: A header field of FT_BYTES which will be the struct header field
*
* @param[in] ett_id: An ett field used for the subtree created to list the container's elements.
*
* @param[in] key: Description of the map's key elements.
* @param[in] val: Description of the map's value elements.
*
* @param[in] elt: Description of the list's or set's elements.
*
* @param[in] seq: Sequence of descriptions of the structure's members.
* An array of thrift_member_t's containing thrift type of the struct members the hf variable to use etc.
*
* @return Offset of the first non-dissected byte in case of success,
* Same error values and remarks as above.
*/
WS_DLL_PUBLIC int dissect_thrift_t_map (tvbuff_t* tvb, packet_info* pinfo, proto_tree* tree, int offset, thrift_option_data_t *thrift_opt, gboolean is_field, int field_id, gint hf_id, gint ett_id, const thrift_member_t *key, const thrift_member_t *val);
WS_DLL_PUBLIC int dissect_thrift_t_set (tvbuff_t* tvb, packet_info* pinfo, proto_tree* tree, int offset, thrift_option_data_t *thrift_opt, gboolean is_field, int field_id, gint hf_id, gint ett_id, const thrift_member_t *elt);
WS_DLL_PUBLIC int dissect_thrift_t_list (tvbuff_t* tvb, packet_info* pinfo, proto_tree* tree, int offset, thrift_option_data_t *thrift_opt, gboolean is_field, int field_id, gint hf_id, gint ett_id, const thrift_member_t *elt);
WS_DLL_PUBLIC int dissect_thrift_t_struct(tvbuff_t* tvb, packet_info* pinfo, proto_tree* tree, int offset, thrift_option_data_t *thrift_opt, gboolean is_field, int field_id, gint hf_id, gint ett_id, const thrift_member_t *seq);
#endif /*__PACKET_THRIFT_H__ */
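
The comment on thrift_protocol_enum_t above states that the values are flags, that Framed combines with everything, and that the Strict bit is ignored once the Compact bit is set (0x06 is treated like 0x04). A minimal sketch of that rule follows. The flag values are copied from the header, while `example_normalize_protocol` and `example_protocol_name` are hypothetical helpers written for this illustration and are not part of the dissector.

```c
#include <assert.h>
#include <stddef.h>

/* Flag values copied from thrift_protocol_enum_t in packet-thrift.h. */
enum {
    PROTO_THRIFT_BINARY  = 0x00,
    PROTO_THRIFT_FRAMED  = 0x01,
    PROTO_THRIFT_STRICT  = 0x02,
    PROTO_THRIFT_COMPACT = 0x04
};

/* Hypothetical helper: keep only the known bits and drop Strict when
 * Compact is set, so that the result is always in 0x00..0x05. */
unsigned example_normalize_protocol(unsigned flags)
{
    flags &= PROTO_THRIFT_FRAMED | PROTO_THRIFT_STRICT | PROTO_THRIFT_COMPACT;
    if (flags & PROTO_THRIFT_COMPACT)
        flags &= ~(unsigned)PROTO_THRIFT_STRICT;
    return flags;
}

/* Hypothetical helper: name following "Framed?((Strict? Binary)|Compact)". */
const char *example_protocol_name(unsigned flags)
{
    static const char *const names[] = {
        "Binary", "Framed Binary", "Strict Binary", "Framed Strict Binary",
        "Compact", "Framed Compact"
    };
    unsigned idx = example_normalize_protocol(flags);
    assert(idx < sizeof names / sizeof names[0]);
    return names[idx];
}
```

Normalizing first keeps the lookup table at six entries, which matches the header's statement that valid values go from 0x00 (old binary format) to 0x05 (framed compact).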

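The previous_field_id member of thrift_option_data_t exists because a TCompactProtocol field header may carry only a delta from the previous field id. The sketch below decodes such a header according to the compact protocol specification linked in the commit message: the high nibble of the first byte is the delta, the low nibble is the type, and a zero delta means the field id follows as a zigzag varint. `field_header_t`, `read_varint`, and `read_field_header` are hypothetical names; this is not the code from packet-thrift.c, and it works on a plain byte array instead of a tvbuff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define THRIFT_REQUEST_REASSEMBLY (-1)
#define THRIFT_SUBDISSECTOR_ERROR (-2)

/* Hypothetical result type for this sketch. */
typedef struct {
    uint8_t type;     /* compact type nibble, 0 for T_STOP */
    int64_t field_id; /* absolute field id */
} field_header_t;

/* Decode an unsigned varint; returns the bytes used or a sentinel. */
static int read_varint(const uint8_t *buf, size_t len, size_t pos, uint64_t *out)
{
    uint64_t value = 0;
    for (size_t i = 0; i < 10; i++) {
        if (pos + i >= len)
            return THRIFT_REQUEST_REASSEMBLY; /* varint is cut short */
        uint8_t b = buf[pos + i];
        value |= (uint64_t)(b & 0x7f) << (7 * i);
        if (!(b & 0x80)) {
            *out = value;
            return (int)i + 1;
        }
    }
    return THRIFT_SUBDISSECTOR_ERROR; /* longer than any 64-bit varint */
}

/* Returns the offset after the field header, or a sentinel. */
int read_field_header(const uint8_t *buf, size_t len, int offset,
                      int64_t previous_field_id, field_header_t *out)
{
    if (offset < 0)
        return offset; /* pass a previous failure through */
    if ((size_t)offset >= len)
        return THRIFT_REQUEST_REASSEMBLY;
    uint8_t head = buf[offset];
    uint8_t delta = (uint8_t)(head >> 4);
    out->type = head & 0x0f;
    if (head == 0) { /* T_STOP: no field id */
        out->field_id = 0;
        return offset + 1;
    }
    if (delta != 0) { /* short form: id = previous id + delta */
        out->field_id = previous_field_id + delta;
        return offset + 1;
    }
    uint64_t raw;
    int used = read_varint(buf, len, (size_t)offset + 1, &raw);
    if (used < 0)
        return used;
    /* Zigzag decoding maps 0, 1, 2, 3, ... to 0, -1, 1, -2, ... */
    out->field_id = (int64_t)(raw >> 1) ^ -(int64_t)(raw & 1);
    return offset + 1 + used;
}
```

In the short form, the byte 0x15 after field 0 describes field 1 of compact type 5. In the long form, the bytes 0x05 0x28 describe field 20 of the same type.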

@ -1115,7 +1115,7 @@ WS_DLL_PUBLIC void proto_item_prepend_text(proto_item *pi, const char *format, .
/** Set proto_item's length inside tvb, after it has already been created.
@param pi the item to set the length
@param length the new length ot the item */
@param length the new length of the item */
WS_DLL_PUBLIC void proto_item_set_len(proto_item *pi, const gint length);
/**
@ -1123,6 +1123,14 @@ WS_DLL_PUBLIC void proto_item_set_len(proto_item *pi, const gint length);
* offset, which is the offset past the end of the item; as the start
* in the item is relative to the beginning of the data source tvbuff,
* we need to pass in a tvbuff.
*
* Given an item created as:
* ti = proto_tree_add_item(*, *, tvb, offset, -1, *);
* then
* proto_item_set_end(ti, tvb, end);
* is equivalent to
* proto_item_set_len(ti, end - offset);
*
@param pi the item to set the length
@param tvb end is relative to this tvbuff
@param end this end offset is relative to the beginning of tvb
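
The equivalence added to the proto_item_set_end comment above can be checked with a small model that also shows why the function needs the tvbuff: the item start is stored relative to the data source, while `end` is relative to the tvbuff. The `fake_*` types and functions below are hypothetical stand-ins written for this illustration, not Wireshark's implementation.

```c
#include <assert.h>

/* Hypothetical stand-ins (not Wireshark types). */
typedef struct {
    int raw_offset; /* where this buffer starts within the data source */
} fake_tvb_t;

typedef struct {
    int start;  /* relative to the beginning of the data source */
    int length; /* -1 until the length is known */
} fake_item_t;

/* Model of adding an item at `offset` in `tvb` with length -1. */
fake_item_t fake_add_item(const fake_tvb_t *tvb, int offset)
{
    fake_item_t item = { tvb->raw_offset + offset, -1 };
    return item;
}

/* Model of proto_item_set_len: the length is given directly. */
void fake_set_len(fake_item_t *item, int length)
{
    item->length = length;
}

/* Model of proto_item_set_end: `end` is relative to `tvb`, so it is
 * first converted to a data-source offset. */
void fake_set_end(fake_item_t *item, const fake_tvb_t *tvb, int end)
{
    int absolute_end = tvb->raw_offset + end;
    assert(absolute_end >= item->start);
    item->length = absolute_end - item->start;
}
```

For an item added at offset 8 of a buffer that starts 100 bytes into the data source, `fake_set_end(&item, &tvb, 20)` and `fake_set_len(&item, 20 - 8)` both produce a length of 12.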