doc: Update text2pcap and Import from Hexdump doc

Update the text2pcap man page and the Import from Hexdump WSUG
page to clarify how to use it, for grammar, and to remove a few
things that are no longer relevant. (E.g., it's no longer the case that
files without an EOL don't work.)
Fix #15563, #15564.
John Thacker 2022-02-22 07:32:01 -05:00 committed by A Wireshark GitLab Utility
parent 0e427ac837
commit 1d84a092cf
2 changed files with 88 additions and 73 deletions


@@ -44,7 +44,7 @@ text2pcap - Generate a capture file from an ASCII hexdump of packets
*Text2pcap* is a program that reads in an ASCII hex dump and writes the
data described into a capture file. *text2pcap* can read hexdumps with
multiple packets in them, and build a capture file of multiple packets.
*Text2pcap* is also capable of generating dummy Ethernet, IP and UDP, TCP,
*Text2pcap* is also capable of generating dummy Ethernet, IP, and UDP, TCP
or SCTP headers, in order to build fully processable packet dumps from
hexdumps of application-level data only.
@@ -56,56 +56,58 @@ file format.
*Text2pcap* understands a hexdump of the form generated by __od -Ax
-tx1 -v__. In other words, each byte is individually displayed, with
spaces separating the bytes from each other. Each line begins with an offset
describing the position in the packet, each new packet starts with an offset
of 0 and there is a space separating the offset from the following bytes.
The offset is a hex number (can also be octal or decimal - see *-o*),
of more than two hex digits.
spaces separating the bytes from each other. Hex digits can be upper
or lowercase.
Here is a sample dump that *text2pcap* can recognize:
In normal operation, each line must begin with an offset describing the
position in the packet, followed by a colon, space, or tab separating it from
the bytes. There is no limit on the width or number of bytes per line, but
lines with only hex bytes without a leading offset are ignored (in other words,
line breaks should not be inserted in long lines that wrap). Offsets are more
than two digits; they are in hex by default, but can also be in octal or
decimal - see *-o*. Each packet must begin with offset zero, and an offset
zero indicates the beginning of a new packet. Offset values must be correct;
an unexpected value causes the current packet to be aborted, and the parser
then waits for the start of the next packet. There is also a single-packet
mode with no offsets;
see *-o*.
000000 00 0e b6 00 00 02 00 0e b6 00 00 01 08 00 45 00
000010 00 28 00 00 00 00 ff 01 37 d1 c0 00 02 01 c0 00
000020 02 02 08 00 a6 2f 00 01 00 01 48 65 6c 6c 6f 20
000030 57 6f 72 6c 64 21
000036
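
The offset rules above can be illustrated with a minimal Python sketch that formats one packet in the layout of the sample dump. The function name `to_hexdump` and the 16-bytes-per-line default are assumptions chosen for illustration; they are not part of text2pcap.

```python
def to_hexdump(data: bytes, bytes_per_line: int = 16) -> str:
    """Format one packet as offset-prefixed hex lines (od -Ax -tx1 -v style).

    Hypothetical helper for illustration only; not part of text2pcap.
    """
    lines = []
    for start in range(0, len(data), bytes_per_line):
        chunk = data[start:start + bytes_per_line]
        hex_bytes = " ".join(f"{b:02x}" for b in chunk)
        # Six-digit hex offset, one space, then space-separated bytes.
        lines.append(f"{start:06x} {hex_bytes}")
    # Trailing offset line giving the total length, as in the sample dump.
    lines.append(f"{len(data):06x}")
    return "\n".join(lines) + "\n"


# The bytes of the sample packet shown above.
packet = bytes.fromhex(
    "00 0e b6 00 00 02 00 0e b6 00 00 01 08 00 45 00"
    "00 28 00 00 00 00 ff 01 37 d1 c0 00 02 01 c0 00"
    "02 02 08 00 a6 2f 00 01 00 01 48 65 6c 6c 6f 20"
    "57 6f 72 6c 64 21"
)

if __name__ == "__main__":
    print(to_hexdump(packet), end="")
```

Each call starts again at offset zero, so concatenating the output for several packets gives a multi-packet dump.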
Packets may be preceded by a direction indicator ('I' or 'O') and/or a
timestamp if indicated by the command line (see *-D* and *-t*). If both are
present, the direction indicator precedes the timestamp. The format of the
timestamps is specified as a mandatory parameter to *-t*. If no timestamp is
parsed, in the case of the first packet the current system time is used, while
subsequent packets are written with timestamps one microsecond later than that
of the previous packet.
Note the last byte must either be followed by the expected next offset value
as in the example above or a space or a line-end character(s).
Other text in the input data is ignored. Any text before the offset is
ignored, including email forwarding characters '>'. Any text on a line
after the bytes is ignored, e.g. an ASCII character dump (but see *-a* to
ensure that hex digits in the character dump are ignored). Any line where
the first non-whitespace character is a '#' will be ignored as a comment.
Any lines of text between the bytestring lines are considered preamble;
the beginning of the preamble is scanned for the direction indicator and
timestamp as mentioned above and otherwise ignored.
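
As a rough illustration of the rules above, here is a simplified Python sketch of a reader that skips comments, strips leading '>' characters, and tracks offsets. It is not text2pcap's actual parser. The names `parse_hexdump`, `OFFSET_RE`, and `BYTE_RE` are hypothetical, only hex offsets are handled, and discarding a packet on a bad offset is an assumption about what "aborted" means. It also stops at the first token that is not a two-digit hex byte, so a character dump that happens to contain such tokens would be misread (the case that *-a* addresses).

```python
import re

# Assumption for this sketch: an offset is a hex number of more than two digits.
OFFSET_RE = re.compile(r"^[0-9a-fA-F]{3,}$")
BYTE_RE = re.compile(r"^[0-9a-fA-F]{2}$")


def parse_hexdump(text: str) -> list[bytes]:
    """Return the packets found in an offset-prefixed hexdump (simplified)."""
    packets = []
    current = None  # bytes of the packet in progress; None while awaiting offset 0
    for line in text.splitlines():
        if line.strip().startswith("#"):
            continue  # comment line
        # Drop email forwarding characters and whitespace before the offset.
        tokens = re.split(r"[:\s]+", line.lstrip("> \t"))
        if not OFFSET_RE.match(tokens[0]):
            continue  # preamble or other text: ignored
        offset = int(tokens[0], 16)
        data = bytearray()
        for tok in tokens[1:]:
            if not BYTE_RE.match(tok):
                break  # trailing text such as a character dump
            data.append(int(tok, 16))
        if offset == 0:
            # Offset zero starts a new packet; finish the previous one first.
            if current:
                packets.append(bytes(current))
            current = data
        elif current is not None and offset == len(current):
            current.extend(data)
        else:
            # Unexpected offset: drop the packet and wait for the next offset 0.
            current = None
    if current:
        packets.append(bytes(current))
    return packets
```

The trailing offset-only line of a dump (such as `000036`) passes the offset check and adds no bytes.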
There is no limit on the width or number of bytes per line. Also the
text dump at the end of the line is ignored. Bytes/hex numbers can be
uppercase or lowercase. Any text before the offset is ignored,
including email forwarding characters '>'. Any lines of text between
the bytestring lines is ignored. The offsets are used to track the
bytes, so offsets must be correct. Any line which has only bytes
without a leading offset is ignored. An offset is recognized as being
a hex number longer than two characters. Any text after the bytes is
ignored (e.g. the character dump). Any hex numbers in this text are
also ignored. An offset of zero is indicative of starting a new
packet, so a single text file with a series of hexdumps can be
converted into a packet capture with multiple packets.
Packets may be preceded by a direction indicator and a timestamp if
indicated by the command line (see *-D* and *-t*). The format of the
timestamps is specified as a mandatory parameter to *-t*. If timestamp
parsing is not enabled or failed, the first packet is timestamped
with the current time the conversion takes place. Multiple packets
are written with timestamps differing by one microsecond each.
Any line beginning with #TEXT2PCAP is a directive and options
can be inserted after this command to be processed by *text2pcap*.
Currently there are no directives implemented; in the future, these may
be used to give more fine-grained control over the dump and the way it
should be processed, e.g. timestamps, encapsulation type, etc.
In general, short of these restrictions, *text2pcap* is pretty liberal
about reading in hexdumps and has been tested with a variety of
mangled outputs (including being forwarded through email multiple
times, with limited line wrap etc.)
There are a couple of other special features to note. Any line where
the first non-whitespace character is '#' will be ignored as a
comment. Any line beginning with #TEXT2PCAP is a directive and options
can be inserted after this command to be processed by
*text2pcap*. Currently there are no directives implemented; in the
future, these may be used to give more fine grained control on the
dump and the way it should be processed e.g. timestamps, encapsulation
type etc.
Here is a sample dump that *text2pcap* can recognize, with optional
direction indicator and timestamp:
I 2019-05-14T19:04:57Z
000000 00 0e b6 00 00 02 00 0e b6 00 00 01 08 00 45 00
000010 00 28 00 00 00 00 ff 01 37 d1 c0 00 02 01 c0 00
000020 02 02 08 00 a6 2f 00 01 00 01 48 65 6c 6c 6f 20
000030 57 6f 72 6c 64 21
000036
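
The timestamp fallback described earlier (current system time for the first packet, one microsecond after the previous packet otherwise) can be sketched in Python as follows. The function `assign_timestamps` and its `now` parameter are hypothetical names for illustration, and the `strptime` format is just one way to read the timestamp in the sample above; neither is taken from text2pcap itself.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def assign_timestamps(
    parsed: list[Optional[datetime]],
    now: Optional[datetime] = None,
) -> list[datetime]:
    """Fill in timestamps for packets whose timestamp could not be parsed.

    `parsed` holds one entry per packet; None means no timestamp was parsed.
    `now` stands in for the current system time and exists only for testing.
    """
    result: list[datetime] = []
    for ts in parsed:
        if ts is None:
            if not result:
                # First packet: fall back to the current system time.
                ts = now or datetime.now(timezone.utc)
            else:
                # Later packets: one microsecond after the previous packet.
                ts = result[-1] + timedelta(microseconds=1)
        result.append(ts)
    return result


# The timestamp from the sample dump, parsed with an explicit format string.
sample = datetime.strptime(
    "2019-05-14T19:04:57Z", "%Y-%m-%dT%H:%M:%SZ"
).replace(tzinfo=timezone.utc)

if __name__ == "__main__":
    for ts in assign_timestamps([sample, None, None]):
        print(ts.isoformat())
```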
*Text2pcap* is also capable of scanning a text input file using a custom Perl
compatible regular expression that matches a single packet. *text2pcap*


@@ -436,15 +436,53 @@ Two methods for converting the input are supported:
==== Standard ASCII Hexdumps
Wireshark understands a hexdump of the form generated by `od -Ax -tx1 -v`. In
other words, each byte is individually displayed and surrounded with a space.
Each line begins with an offset describing the position in the packet, each
new packet starts with an offset of 0 and there is a space separating the
offset from the following bytes. The offset is a hex number (can also be octal
or decimal), of more than two hex digits.
Here is a sample dump that can be imported:
Wireshark understands a hexdump of the form generated by `od -Ax -tx1 -v`.
In other words, each byte is individually displayed, with spaces separating
the bytes from each other. Hex digits can be upper or lowercase.
In normal operation, each line must begin with an offset describing the
position in the packet, followed by a colon, space, or tab separating it from
the bytes. There is no limit on the width or number of bytes per line, but
lines with only hex bytes without a leading offset are ignored (i.e.,
line breaks should not be inserted in long lines that wrap). Offsets are more
than two digits; they are in hex by default, but can also be in octal or
decimal. Each packet must begin with offset zero, and an offset
zero indicates the beginning of a new packet. Offset values must be correct;
an unexpected value causes the current packet to be aborted, and the parser
then waits for the start of the next packet. There is also a single-packet
mode with no offsets.
Packets may be preceded by a direction indicator ('I' or 'O') and/or a
timestamp if indicated. If both are present, the direction indicator precedes
the timestamp. The format of the timestamps must be specified. If no timestamp
is parsed, in the case of the first packet the current system time is used,
while subsequent packets are written with timestamps one microsecond later than
that of the previous packet.
Other text in the input data is ignored. Any text before the offset is
ignored, including email forwarding characters '>'. Any text on a line
after the bytes is ignored, e.g. an ASCII character dump (but see *-a* to
ensure that hex digits in the character dump are ignored). Any line where
the first non-whitespace character is a '#' will be ignored as a comment.
Any lines of text between the bytestring lines are considered preamble;
the beginning of the preamble is scanned for the direction indicator and
timestamp as mentioned above and otherwise ignored.
Any line beginning with #TEXT2PCAP is a directive and options
can be inserted after this command to be processed by Wireshark.
Currently there are no directives implemented; in the future, these may
be used to give more fine-grained control over the dump and the way it
should be processed, e.g. timestamps, encapsulation type, etc.
In general, short of these restrictions, Wireshark is pretty liberal
about reading in hexdumps and has been tested with a variety of
mangled outputs (including being forwarded through email multiple
times, with limited line wrap etc.)
Here is a sample dump that can be imported, including optional
direction indicator and timestamp:
----
I 2019-05-14T19:04:57Z
000000 00 e0 1e a7 05 6f 00 10 ........
000008 5a a0 b9 12 08 00 46 00 ........
000010 03 68 00 00 00 00 0a 2e ........
@@ -454,31 +492,6 @@ Here is a sample dump that can be imported:
000030 01 01 0f 19 03 80 11 01 ........
----
There is no limit on the width or number of bytes per line. Also the text dump
at the end of the line is ignored. Byte and hex numbers can be uppercase or
lowercase. Any text before the offset is ignored, including email forwarding
characters _>_. Any lines of text between the bytestring lines are ignored.
The offsets are used to track the bytes, so offsets must be correct. Any line
which has only bytes without a leading offset is ignored. An offset is
recognized as being a hex number longer than two characters. Any text after the
bytes is ignored (e.g. the character dump). Any hex numbers in this text are
also ignored. An offset of zero is indicative of starting a new packet, so a
single text file with a series of hexdumps can be converted into a packet
capture with multiple packets. Packets may be preceded by a timestamp. These are
interpreted according to the format given. If not the first packet is
timestamped with the current time the import takes place. Multiple packets are
written with timestamps differing by one nanosecond each. In general, short of
these restrictions, Wireshark is pretty liberal about reading in hexdumps and
has been tested with a variety of mangled outputs (including being forwarded
through email multiple times, with limited line wrap etc.)
There are a couple of other special features to note. Any line where the first
non-whitespace character is `#` will be ignored as a comment. Any line beginning
with `#TEXT2PCAP` is a directive and options can be inserted after this command to
be processed by Wireshark. Currently there are no directives implemented. In the
future these may be used to give more fine grained control on the dump and the
way it should be processed e.g. timestamps, encapsulation type etc.
==== Regular Text Dumps
Wireshark is also capable of scanning the input using a custom perl regular