Update: Primarily to suggest calling conversation_set_dissector()
once a packet has been identified as being part of a particular protocol.

svn path=/trunk/; revision=47621
Bill Meier 2013-02-11 00:12:59 +00:00
parent 2f156d0edc
commit 32834b7881
1 changed files with 88 additions and 113 deletions


@@ -3,7 +3,7 @@ $Date: 2008-08-04 22:41:43 +0200 (Mo, 04 Aug 2008) $
 $Author: ulfl $

 This file is a HOWTO for Wireshark developers. It describes how Wireshark
 heuristic protocol dissectors work and how to write them.

 This file is compiled to give in depth information on Wireshark.
@@ -13,42 +13,42 @@ remarks and patches to the developer mailing list.
 Prerequisites
 -------------
 As this file is an addition to README.developer, it is essential to read
 and understand that document first.

 Why heuristic dissectors?
 -------------------------
 When Wireshark "receives" a packet, it has to find the right dissector to
 start decoding the packet data. Often this can be done by known conventions,
 e.g. the Ethernet type 0x0800 means "IP on top of Ethernet" - an easy and
 reliable match for Wireshark.

 Unfortunately, these conventions are not always available, or (accidentally
 or knowingly) some protocols don't care about those conventions and "reuse"
 existing "magic numbers / tokens".

 For example TCP defines port 80 only for the use of HTTP traffic. But, this
 convention doesn't prevent anyone from using TCP port 80 for some different
 protocol, or on the other hand using HTTP on a port number different than 80.

 To solve this problem, Wireshark introduced the so called heuristic dissector
 mechanism to try to deal with these problems.

 How Wireshark uses heuristic dissectors?
 ----------------------------------------
 While Wireshark starts, heuristic dissectors (HD) register themselves slightly
 different than "normal" dissectors, e.g. a HD can ask for any TCP packet, as
 it *may* contain interesting packet data for this dissector. In reality more
 than one HD will exist for e.g. TCP packet data.

 So if Wireshark has to decode TCP packet data, it will first try to find a
 dissector registered directly for the TCP port used in that packet. If it
 finds such a registered dissector it will just hand over the packet data to it.

 In case there is no such "normal" dissector, WS will hand over the packet data
 to the first matching HD. Now the HD will look into the data and decide if that
 data looks like something the dissector "is interested in". The return value
 signals WS if the HD processed the data (so WS can stop working on that packet)
 or if the heuristic didn't match (so WS tries the next HD until one matches -
@@ -64,27 +64,32 @@ SCTP, TCP, TIPC and UDP dissectors all provide this capability via their
 "Try heuristic sub-dissectors first" preference, but none of them have this
 option enabled by default.

+Once a packet for a particular "connection" has been identified as belonging
+to a particular protocol, wireshark should then be set up to always directly
+call the dissector for that protocol. This removes the overhead of having
+to identify each packet of the connection heuristically.
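
The effect of the added paragraph can be illustrated outside Wireshark with a
small standalone sketch. Everything in it (the pinned[] table, dispatch(),
dissect_proto(), dissect_proto_heur(), the integer conversation id) is
hypothetical and is not the Wireshark API; it only models the idea that the
heuristic runs once per "connection" and the chosen dissector is then called
directly.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CONVERSATIONS 16

/* Hypothetical "normal" dissector signature: returns bytes consumed. */
typedef int (*dissector_fn)(const uint8_t *buf, size_t len);

/* One slot per conversation; NULL means "not identified yet". */
static dissector_fn pinned[MAX_CONVERSATIONS];

/* Counts how often the heuristic had to run (for illustration only). */
static unsigned heuristic_calls;

/* Stand-in for the real dissection work. */
static int
dissect_proto(const uint8_t *buf, size_t len)
{
    (void)buf;
    return (int)len;
}

/* Heuristic entry point: on a match, pin the dissector and dissect. */
static bool
dissect_proto_heur(unsigned conv_id, const uint8_t *buf, size_t len)
{
    heuristic_calls++;

    /* Toy heuristic (assumption): first byte must be 0x42. */
    if (len < 1 || buf[0] != 0x42)
        return false;

    /* From now on this conversation bypasses the heuristic. */
    pinned[conv_id] = dissect_proto;
    dissect_proto(buf, len);
    return true;
}

/* Dispatcher: a pinned dissector wins, otherwise try the heuristic. */
static bool
dispatch(unsigned conv_id, const uint8_t *buf, size_t len)
{
    if (conv_id >= MAX_CONVERSATIONS)
        return false;

    if (pinned[conv_id] != NULL) {
        pinned[conv_id](buf, len);
        return true;
    }
    return dissect_proto_heur(conv_id, buf, len);
}
```

In this model, later packets of an identified conversation are accepted even
if they would not pass the heuristic on their own, which is the point of
pinning: only the first packet needs to carry a recognizable header.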

 How do these heuristics work?
 -----------------------------
-Difficult to give a general answer here. The usual heuristic works as follows:
+It's difficult to give a general answer here. The usual heuristic works as follows:

 A HD looks into the first few packet bytes and searches for common patterns that
 are specific to the protocol in question. Most protocols starts with a
 specific header, so a specific pattern may look like (synthetic example):

 1) first byte must be 0x42
 2) second byte is a type field and can only contain values between 0x20 - 0x33
 3) third byte is a flag field, where the lower 4 bits always contain the value 0
 4) fourth and fifth bytes contain a 16 bit length field, where the value can't
    be larger than 10000 bytes

 So the heuristic dissector will check incoming packet data for all of the
 4 above conditions, and only if all of the four conditions are true there is a
 good chance that the packet really contains the expected protocol - and the
 dissector continues to decode the packet data. If one condition fails, it's
 very certainly not the protocol in question and the dissector returns to WS
 immediately "this is not my protocol" - maybe some other heuristic dissector
 is interested!
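
As a rough illustration, the four synthetic conditions can be written as a
standalone C check on a raw byte buffer. This is only a sketch: the function
name looks_like_protoabbrev() is made up, no Wireshark API is used, and the
explicit length guard is an assumption added here because a plain buffer has
no bounds checking of its own.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical standalone check; not part of the Wireshark API. */
static bool
looks_like_protoabbrev(const uint8_t *buf, size_t len)
{
    uint16_t length_field;

    /* The checks below read bytes 0..4, so at least 5 bytes are needed. */
    if (buf == NULL || len < 5)
        return false;

    /* 1) first byte must be 0x42 */
    if (buf[0] != 0x42)
        return false;

    /* 2) second byte is a type field limited to 0x20..0x33 */
    if (buf[1] < 0x20 || buf[1] > 0x33)
        return false;

    /* 3) third byte is a flag field whose lower 4 bits must be 0 */
    if (buf[2] & 0x0f)
        return false;

    /* 4) bytes four and five: 16 bit length, network byte order, <= 10000 */
    length_field = (uint16_t)((buf[3] << 8) | buf[4]);
    if (length_field > 10000)
        return false;

    return true;
}
```

Ordering the checks from cheapest and most selective (the fixed first byte) to
the multi-byte length field lets most foreign packets be rejected after a
single comparison.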

 Obviously, this is *not* 100% bullet proof, but it's the best WS can offer to
@@ -96,58 +101,81 @@ Heuristic Code Example
 ----------------------
 You can find a lot of code examples in the wireshark sources, e.g.:
     grep -l heur_dissector_add epan/dissectors/*.c
-returns (currently) 68 files.
+returns 132 files (Feb 2013).

 For the above example criteria, the following code example might do the work
 (combine this with the dissector skeleton in README.developer):

 XXX - please note: The following code examples were not tried in reality,
 please report problems to the dev-list!
-static gboolean dissect_PROTOABBREV(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree)
+--------------------------------------------------------------------------------------------
+static dissector_handle_t PROTOABBREV_handle;
+static void
+dissect_PROTOABBREV(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree, void *data _U_)
+{
+    /* Dissection ... */
+    return;
+}
+static gboolean
+dissect_PROTOABBREV_heur(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree, void *data _U_)
 {
 ...
     /* 1) first byte must be 0x42 */
     if ( tvb_get_guint8(tvb, 0) != 0x42 )
         return (FALSE);
     /* 2) second byte is a type field and only can contain values between 0x20-0x33 */
     if ( tvb_get_guint8(tvb, 1) < 0x20 || tvb_get_guint8(tvb, 1) > 0x33 )
         return (FALSE);
     /* 3) third byte is a flag field, where the lower 4 bits always contain the value 0 */
     if ( tvb_get_guint8(tvb, 2) & 0x0f )
         return (FALSE);
     /* 4) fourth and fifth bytes contains a 16 bit length field, where the value can't be longer than 10000 bytes */
     /* Assumes network byte order */
     if ( tvb_get_ntohs(tvb, 3) > 10000 )
         return (FALSE);
-    /* Assume it's your packet and do dissection */
-    ...
+    /* Assume it's your packet ... */
+    /* specify that dissect_PROTOABBREV is to be called directly from now on for packets for this "connection" ... */
+    conversation = find_or_create_conversation(pinfo);
+    conversation_set_dissector(conversation, PROTOABBREV_handle);
+    /* and do the dissection */
+    dissect_PROTOABBREV(tvb, pinfo, tree, data);
     return (TRUE);
 }
 void
 proto_reg_handoff_PROTOABBREV(void)
 {
-    static int PROTOABBREV_inited = FALSE;
-    if ( !PROTOABBREV_inited )
-    {
+    PROTOABBREV_handle = create_dissector_handle(dissect_PROTOABBREV,
+                                                 proto_PROTOABBREV);
     /* register as heuristic dissector for both TCP and UDP */
-    heur_dissector_add("tcp", dissect_PROTOABBREV, proto_PROTOABBREV);
-    heur_dissector_add("udp", dissect_PROTOABBREV, proto_PROTOABBREV);
-    }
+    heur_dissector_add("tcp", dissect_PROTOABBREV_heur, proto_PROTOABBREV);
+    heur_dissector_add("udp", dissect_PROTOABBREV_heur, proto_PROTOABBREV);
+#ifdef OPTIONAL
+    /* It's possible to write a dissector to be a dual heuristic/normal dissector */
+    /* by also registering the dissector "normally". */
+    dissector_add_uint("ip.proto", IP_PROTO_PROTOABBREV, PROTOABBREV_handle);
+#endif
 }

 Please note, that registering a heuristic dissector is only possible for a
 small variety of protocols. In most cases a heuristic is not needed, and
 adding the support would only add unused code to the dissector.

 TCP and UDP are prominent examples that support HDs, as there seems to be a
@@ -155,58 +183,5 @@ tendency to reuse known port numbers for new protocols. But TCP and UDP are
 not the only dissectors that provide support for HDs. You can find more
 examples by searching the Wireshark sources as follows:
     grep -l register_heur_dissector_list epan/dissectors/packet-*.c
-returns (currently) 25 files.
-It's possible to write a dissector to be a dual heuristic/normal dissector.
-In that the case, dissect_PROTOABBREV should return an int with the number of
-bytes dissected by your protocol rather than simply returning TRUE. If
-heuristics fail, still just return 0.
-static int dissect_PROTOABBREV(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree)
-{
-...
-    /* 1) first byte must be 0x42 */
-    if ( tvb_get_guint8(tvb, 0) != 0x42 )
-        return 0;
-    /* 2) second byte is a type field and only can contain values between 0x20-0x33 */
-    if ( tvb_get_guint8(tvb, 1) < 0x20 || tvb_get_guint8(tvb, 1) > 0x33 )
-        return 0;
-    /* 3) third byte is a flag field, where the lower 4 bits always contain the value 0 */
-    if ( tvb_get_guint8(tvb, 2) & 0x0f )
-        return 0;
-    /* 4) fourth and fifth bytes contains a 16 bit length field, where the value can't be longer than 10000 bytes */
-    /* Assumes network byte order */
-    if ( tvb_get_ntohs(tvb, 3) > 10000 )
-        return 0;
-    /* Assume it's your packet and do dissection */
-    ...
-    return number_of_bytes_dissected;
-}
-void
-proto_reg_handoff_PROTOABBREV(void)
-{
-    static int PROTOABBREV_inited = FALSE;
-    dissector_handle_t PROTOABBREV_handle;
-    if ( !PROTOABBREV_inited )
-    {
-        /* register as heuristic dissector for both TCP and UDP */
-        heur_dissector_add("tcp", dissect_PROTOABBREV, proto_PROTOABBREV);
-        heur_dissector_add("udp", dissect_PROTOABBREV, proto_PROTOABBREV);
-        /* register as normal dissector for IP as well */
-        PROTOABBREV_handle = new_create_dissector_handle(dissect_PROTOABBREV,
-                                                         proto_PROTOABBREV);
-        dissector_add_uint("ip.proto", IP_PROTO_PROTOABBREV, PROTOABBREV_handle);
-        PROTOABBREV_inited = TRUE;
-    }
-}
+returns 38 files (Feb 2013).