A given protocol's packet format may depend, for example, on which
lower-level protocol is transporting the protocol in question. For
example, protocols that run atop both byte-stream protocols such as TCP
and TLS, and packet-oriented protocols such as UDP or DTLS, might begin
the packet with a length when running atop a byte-stream protocol, to
indicate where this packet ends and the next packet begins in the byte
stream, but not do so when running atop a packet-oriented protocol.
Dissectors can handle this in various ways:
For example, the dissector could attempt to determine the protocol over
which the packet was transported.
Unfortunately, many of those mechanisms do so by fetching data from the
packet_info structure, and many items in that structure act as global
variables, so that, for example, if there are two two PDUs for protocol
A inside a TCP segment, and the first protocol for PDU A contains a PDU
for protocol B, and protocol B's dissector, or a dissector it calls,
modifies the information in the packet_info structure so that it no
longer indicates that the parent protocol is TCP, the second PDU for
protocol A might not be correctly dissected.
Another such mechanism is to query the previous element in the layers
structure of the packet_info structure, which is a list of protocol IDs.
Unfortunately, that is not a list of earlier protocols in the protocol
stack, it's a list of earlier protocols in the dissection, which means
that, in the above example, when the second PDU for protocol A is
dissected, the list is {...,TCP,A,B,...,A}, which means that the
previous element in the list is not TCP, so, again, the second PDU for
protocol A will not be correctly dissected.
An alternative is to have multiple dissectors for the same protocol,
with the part of the protocol that's independent of the protocol
transporting the PDU being dissected by common code. Protocol B might
have an "over a byte-stream transport" dissector and an "over a packet
transport" dissector, with the first dissector being registered for use
over TCP and TLS and the other dissector being registered for use over
packet protocols. This mechanism, unlike the other mechanisms, is not
dependent on information in the packet_info structure that might be
affected by dissectors other than the one for the protocol that
transports protocol B.
Furthermore, in a LINKTYPE_WIRESHARK_UPPER_PDU pcap or pcapng packet for
protocol B, there might not be any information to indicate the protocol
that transports protocol B, so there would have to be separate
dissectors for protocol B, with separate names, so that a tag giving the
protocol name would differ for B-over-byte-stream and B-over-packets.
So:
We rename EXP_PDU_TAG_PROTO_NAME and EXP_PDU_TAG_HEUR_PROTO_NAME to
EXP_PDU_TAG_DISSECTOR_NAME and EXP_PDU_TAG_HEUR_DISSECTOR_NAME, to
emphasize that they are *not* protocol names, they are dissector names
(which has always been the case - if there's a protocol with that name,
but no dissector with that name, Wireshark will not be able to handle
the packet, as it will try to look up a dissector given that name and
fail).
We fix that exported PDU dissector to refer to those tags as dissector
names, not protocol names.
We update documentation to refer to them as DISSECTOR_NAME tags, not
PROTO_NAME tags. (If there is any documentation for this outside the
Wireshark source, it should be updated as well.)
We add comments for calls to dissector_handle_get_dissector_name() where
the dissector name is shown to the user, to indicate that it might be
that the protocol name should be used.
We update the TLS and DTLS dissectors to show the encapsulated protocol
as the string returned by dissector_handle_get_long_name(); as the
default is "Application Data", it appeaers that a descriptive name,
rather than a short API name, should be used. (We continue to use the
dissector name in debugging messages, to indicate which dissector was
called.)
Since Import from Hex Dump creates a pcapng temporary file, use
the list of encapsulations we can write to pcapng instead of pcap.
In particular, this makes WTAP_ENCAP_SYSTEMD_JOURNAL possible, so make
text_import capable of writing that encapsulation by using the proper
rec_type and block. It's not clear why someone would have a binary
hex dump of this text based format, but it works.
In text2pcap and Import from Hex Dump, allow fake IP headers with
the appropriate versions when the Raw IP, Raw IPv4, and Raw IPv6
encapsulations are specified. In such cases, do not add a dummy
Ethernet header.
Continue to reject other encapsulations besides these, Ethernet,
and Wireshark Upper PDU when appropriate. Add some checks for the
encapsulation type in text_import as well, instead of just assuming
that the callers handle it correctly.
Add some default IPv6 addresses, used in place of the unspecified
address. These are unique local addresses as in RFC 4193 with
a global ID generated using the pseudo-random algorithm mentioned
therein.
Move the parameter setup to text_import, so that later it can
be called from the GUI, including the interface name. (This has
to be a separate function because these parameters need to be
set before the call to wtap_dump_open, which is different for
regular files vs temp files vs stdout.)
Remove the '-d' option from text2pcap, and move the two levels
of debug messages in text2pcap and text_import to either
LOG_LEVEL_DEBUG or LOG_LEVEL_NOISY as appropriate.
Only warn about the parser getting an unexpected offset when
using OFFSET_NONE the first time. Use log warnings for subsequent
messages.
Strip off the whitespace/newline/colon from the offset when adding
it to the message, only output the offset number.
text2pcap used 10.1.1.1 and 10.2.2.2 for default IPv4 addresses,
and "Import From Hex Dump" used 1.1.1.1 and 2.2.2.2. The former
are a little bit better for defaults since they're RFC 1918
private IP addresses, so let's use them for the common code.
Encapsulate the feature requirements for strptime() in a
portability wrapper.
Use _GNU_SOURCE to expose strptime. It should be enough on glibc
without the side-effect of selecting a particular SUS version,
which we don't need and might hide other definitions.
Add a checkbox for the extra detection for ASCII in a hex+ASCII
hexdump even when the text looks like hexbytes to Import from Hex
Dump. Save and restore it from the settings. Work towards #16724.
If we're in the no offset mode and we parse an offset,
warn the user and ignore. At the very beginning of the file try
adding it to the preamble, maybe there's something unfortunate
like an all numeric time stamp format (ISO-8601 Basic).
A bunch of the globals are simply copied from the input parameter
text_import_info_t, just use them directly.
Move the count for packets read and written into the info type,
so that callers like text2pcap can access them as results.
Don't exit in the middle with unexpected values. Report a failure
and return a failed exit status when something goes really wrong.
Use warnings when appropriate, like when a time code value couldn't
be parsed.
Includes allowing the string "ISO" in the format string text box
in the GUI, so this works in "Import from Hex Dump" as well as
being for the text2pcap transition. Part of #16724.
Use report_message and report wtap_dump failures. Pass in
the output filename and keep track of the frame numbers for
the message parameters.
Report failure to initialize the lex scanner in text_import
instead of in the GUI, so that it would be reported from text2pcap,
and because text_import might have other failure cases that are
not the scanner.
The regex parser returns a positive number of packets processed
on success; save that number in text_import, and return zero on
success to our callers.
This is the special check for canonical hex+ASCII textdump
files that looks for the edge case where the beginning of the
ASCII column has strings that can be mistaken by the parser for
additional hex bytes. Not implemented in the GUI yet. Preparing
for text2pcap switchover. Related to #16724.
Add IPv6 handling to text_import, including the ability to
handle dummy IPv6 addresses instead of IPv4. GUI support is
still TBD. This further reduces the number of text2pcap features
that ui/text_import does not yet support. Related to #16724.
Add dummy IPv4 addresses to the text_import_info_t struct, and
use them if set in the same way text2pcap does. GUI support in
"Import from Hex Dump" is not added yet. This is also part of the
work for text2pcap to eventually call text_import. Related to #16724.
The encapsulation type that text_import expects and puts
directly into rec.rec_header.packet_header.pkt_encap is a
wiretap encap type, not a pcap link type. Fix the name and
comment appropriately.
Correctly handle when a minimum packet length forces fragmentation of
SCTP and we are generating dummy SCTP DATA chunk headers: mark fragmentation
in the chunk flags and set the transmission sequence number and
stream sequence number appropriately.
Port from text2pcap commit f8d48662c8
Part of #16724.
The "Import from Hex Dump" time delta for packets without a timestamp
was changed to be a nanosecond, but the time resolution for the file
created by import_text_dialog is the default, microseconds. Until
that is configurable, the time tick used needs to be microseconds like
it was before.
Clean up the code so that it's a little more consistent about when
and how the extra time tick is added, namely:
1. If there is no time format passed in.
2. If time format conversion for the packet fails for any reason.
We don't add an extra delta in other situations, e.g. if packets just
happen to have the same valid time value.
Fix#15562.
_parse_time, which uses g_strlcpy, expects that end_field points
to the position after the end of the field (such as the \0.)
text_import_regex handles this correctly, but when importing from
hex dumps the last character of the timestamp was being cut off,
which makes a big difference when fractional seconds are not used.
Those routines can't add an option if there's no block to add it to;
this meant that neither the direction nor the sequence number would be
set when importing a packet.
Most of the time, the return value tells us nothing useful, as we've
already decided that we're perfectly willing to live with string
truncation. Hopefully this keeps Coverity from whining that those
routines could return an error code (NARRATOR: They don't) and thus that
we're ignoring the possibility of failure (as indicated, we've already
decided that we can live with string truncation, so truncation is *NOT*
a failure).
The secs field is a time_t, which is not necessarily 32 bits. If it's
not, casting away the upper bits, by casting to guint32, introduces a
Y2.038K bug.
Either cast to time_t or, if you're assigning a time_t to it, don't
bother with the cast.
Remove the editor modeline blocks from the source files in ui that use 4
space indentation by running
perl -i -p0e 's{ \n+ /[ *\n]+ editor \s+ modelines .* shiftwidth= .* \*/ \s+ } {\n}gsix' $( ag -l shiftwidth=4 $( ag -g '\.(c|cpp|h|m|mm)') )
This gives us one source of indentation truth for these files, and it
*shouldn't* affect anyone since
- These files match the default in our top-level .editorconfig.
- The one notable editor that's likely to be used on these files and
*doesn't* support EditorConfig (Qt Creator) defaults to 4 space
indentation.
Modularized the parser backend slightly to have the needed hooks
Modified the timestamp format slightly to enable arbitrary postion for
second fractions
Added a regex based seeking parser for textfiles as frontend alternative
to text_import_scanner.l
Regex is using the GLib implementation
Supported frame-data formats are bin, hex, oct and base64
Regex based importing UI
Fixed Meory-leak in ImportTextDialog::exec()
A new tab was added to the text_import ui to accomodate the new fields
Hints are available and styled accordingly
During introduction of proper direction support this line was left over,
causing TCP dest port to remain independant of direction. This change
simply drops the line.
See CID 1444115
Change-Id: I4ff362925e422bc57cfa3842127ddaf8695cf303
Reviewed-on: https://code.wireshark.org/review/32902
Petri-Dish: Anders Broman <a.broman58@gmail.com>
Tested-by: Petri Dish Buildbot
Reviewed-by: Anders Broman <a.broman58@gmail.com>
This is a collection of routines, not a program.
Change-Id: I76296576443602b7ea016c5311e66a52a73ee941
Reviewed-on: https://code.wireshark.org/review/32491
Reviewed-by: Guy Harris <guy@alum.mit.edu>
Instead, add a new T_EOF token type, call parse_token() with it when we
get an EOF, and, in parse_token(), write the current packet if we get a
T_EOF token.
That's a bit simpler, and would let us treat EOFs in different places
differently, if, for example, we want to report warnings for
half-finished packets.
Change-Id: Ie41a8a1dedf91c34300468e073f18bf806e01892
Reviewed-on: https://code.wireshark.org/review/32489
Reviewed-by: Guy Harris <guy@alum.mit.edu>