include::../docbook/attributes.adoc[]
= text2pcap(1)
:doctype: manpage
:stylesheet: ws.css
:linkcss:
:copycss: ../docbook/{stylesheet}

== NAME

text2pcap - Generate a capture file from an ASCII hexdump of packets

== SYNOPSIS

[manarg]
*text2pcap*
[ *-a* ]
[ *-b* 2|8|16|64 ]
[ *-D* ]
[ *-e* <l3pid> ]
[ *-E* <encapsulation type> ]
[ *-F* <file format> ]
[ *-h* ]
[ *-i* <proto> ]
[ *-l* <typenum> ]
[ *-N* <intf-name> ]
[ *-m* <max-packet> ]
[ *-o* hex|oct|dec|none ]
[ *-P* <dissector> ]
[ *-q* ]
[ *-r* <regex> ]
[ *-s* <srcport>,<destport>,<tag> ]
[ *-S* <srcport>,<destport>,<ppi> ]
[ *-t* <timefmt> ]
[ *-T* <srcport>,<destport> ]
[ *-u* <srcport>,<destport> ]
[ *-v* ]
[ *-4* <srcip>,<destip> ]
[ *-6* <srcip>,<destip> ]
<__infile__>|-
<__outfile__>|-

== DESCRIPTION

*Text2pcap* is a program that reads in an ASCII hex dump and writes the
data described into a capture file.  *text2pcap* can read hexdumps with
multiple packets in them, and build a capture file of multiple packets.
*Text2pcap* is also capable of generating dummy Ethernet, IP, and UDP, TCP,
or SCTP headers, in order to build fully processable packet dumps from
hexdumps of application-level data only.

*Text2pcap* can write the file in several output formats.
The *-F* flag can be used to specify the format in which to write the
capture file; *text2pcap -F* provides a list of the available output
formats. By default, it writes the packets to __outfile__ in the *pcapng*
file format.

*Text2pcap* understands a hexdump of the form generated by
__od -Ax -tx1 -v__.  In other words, each byte is individually displayed, with
spaces separating the bytes from each other.  Hex digits can be upper
or lowercase.

In normal operation, each line must begin with an offset describing the
position in the packet, followed by a colon, space, or tab separating it from
the bytes.  There is no limit on the width or number of bytes per line, but
lines with only hex bytes and no leading offset are ignored (in other words,
line breaks should not be inserted in long lines that wrap).  Offsets must be
more than two digits long; they are in hex by default, but can also be in octal
or decimal - see *-o*.  Each packet must begin with offset zero, and an offset
of zero indicates the beginning of a new packet.  Offset values must be correct;
an unexpected value causes the current packet to be aborted and the next
packet start awaited.  There is also a single packet mode with no offsets;
see *-o*.
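
As an illustration of this input format, the following Python sketch renders
raw packet bytes as text that follows the rules above: six-digit hexadecimal
offsets, space-separated bytes, and a restart at offset zero for each packet.
The helper name `to_hexdump` and the width of sixteen bytes per line are
assumptions made for the example; they are not part of *text2pcap*.

```python
def to_hexdump(packets):
    """Render each packet as offset-prefixed hex lines, 16 bytes per line."""
    lines = []
    for packet in packets:
        for offset in range(0, len(packet), 16):
            chunk = packet[offset:offset + 16]
            hex_bytes = " ".join(f"{b:02x}" for b in chunk)
            lines.append(f"{offset:06x} {hex_bytes}")
        # Closing offset line, as od prints it; the next packet restarts at zero.
        lines.append(f"{len(packet):06x}")
    return "\n".join(lines) + "\n"


print(to_hexdump([b"Hello World!", bytes(range(20))]), end="")
```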

Packets may be preceded by a direction indicator ('I' or 'O') and/or a
timestamp if indicated by the command line (see *-D* and *-t*).  If both are
present, the direction indicator precedes the timestamp.  The format of the
timestamps is specified as a mandatory parameter to *-t*.  If no timestamp is
parsed, in the case of the first packet the current system time is used, while
subsequent packets are written with timestamps one microsecond later than that
of the previous packet.

Other text in the input data is ignored. Any text before the offset is
ignored, including email forwarding characters '>'. Any text on a line
after the bytes is ignored, e.g. an ASCII character dump (but see *-a* to
ensure that hex digits in the character dump are ignored).  Any line where
the first non-whitespace character is a '#' will be ignored as a comment.
Any lines of text between the bytestring lines are considered preamble;
the beginning of the preamble is scanned for the direction indicator and
timestamp as mentioned above and otherwise ignored.

Any line beginning with #TEXT2PCAP is a directive, and options
can be inserted after this command to be processed by *text2pcap*.
Currently there are no directives implemented; in the future, these may
be used to give more fine-grained control over the dump and the way it
should be processed, e.g. timestamps, encapsulation type, etc.

In general, short of these restrictions, *text2pcap* is pretty liberal
about reading in hexdumps and has been tested with a variety of
mangled outputs (including being forwarded through email multiple
times, with limited line wrap, etc.).

Here is a sample dump that *text2pcap* can recognize, with optional
directional indicator and timestamp:

    I 2019-05-14T19:04:57Z
    000000 00 0e b6 00 00 02 00 0e b6 00 00 01 08 00 45 00
    000010 00 28 00 00 00 00 ff 01 37 d1 c0 00 02 01 c0 00
    000020 02 02 08 00 a6 2f 00 01 00 01 48 65 6c 6c 6f 20
    000030 57 6f 72 6c 64 21
    000036
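
To make the offset rules concrete, here is a deliberately simplified Python
reader for the sample above. It is a sketch for illustration only and not the
parser *text2pcap* uses: it treats a leading token of three or more hex digits
as an offset, starts a new packet at offset zero, and stops collecting bytes
at the first token that is not a two-digit hex byte. Unlike *text2pcap*, it
does not validate that offsets are correct. The names `parse_hexdump`,
`OFFSET` and `BYTE` are made up for this example.

```python
import re

SAMPLE = """\
I 2019-05-14T19:04:57Z
000000 00 0e b6 00 00 02 00 0e b6 00 00 01 08 00 45 00
000010 00 28 00 00 00 00 ff 01 37 d1 c0 00 02 01 c0 00
000020 02 02 08 00 a6 2f 00 01 00 01 48 65 6c 6c 6f 20
000030 57 6f 72 6c 64 21
000036
"""

OFFSET = re.compile(r"[0-9a-fA-F]{3,}")  # offsets have more than two digits
BYTE = re.compile(r"[0-9a-fA-F]{2}")


def parse_hexdump(text):
    """Return a list of packets (bytes) found in an od-style hex dump."""
    packets = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens or not OFFSET.fullmatch(tokens[0]):
            continue  # preamble, comment, or other ignored text
        if int(tokens[0], 16) == 0:
            packets.append(bytearray())  # offset zero starts a new packet
        if not packets:
            continue  # bytes seen before any offset-zero line are dropped
        for token in tokens[1:]:
            if not BYTE.fullmatch(token):
                break  # stop at trailing text such as an ASCII dump
            packets[-1].append(int(token, 16))
    return [bytes(p) for p in packets]


packets = parse_hexdump(SAMPLE)
print(len(packets), len(packets[0]), packets[0][-12:])
```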

*Text2pcap* is also capable of scanning a text input file using a custom Perl
compatible regular expression that matches a single packet. *text2pcap*
searches the given file (which must end with '\n') for non-overlapping non-empty
strings matching the regex. Named capturing subgroups, which must match
exactly once per packet, are used to identify fields to import. The following
fields are supported in regex mode, one mandatory and three optional:

    "data"  Actual captured frame data to import
    "time"  Timestamp of packet
    "dir"   Direction of packet
    "seqno" Arbitrary ID of packet

The 'data' field is the captured data, which must be in the selected encoding:
hexadecimal (the default), octal, binary, or base64. Apart from whitespace,
the field must contain no characters outside the encoding's character set.
The 'time' field is parsed according to the format in the *-t* parameter.
The first character of the 'dir' field is compared against a set of characters
corresponding to inbound and outbound that default to "iI<" for inbound and
"oO>" for outbound to assign a direction. The 'seqno' field is assumed to
be a positive base-10 integer used as an arbitrary ID. An optional field's
information will only be written if the field is present in the regex and if
the capture file format supports it. (E.g., the pcapng format supports all
three fields, but the pcap format only supports timestamps.)

Here is a sample dump that the regex mode can process with the regex
'^(?<dir>[<>])\s(?<time>\d+:\d\d:\d\d.\d+)\s(?<data>[0-9a-fA-F]+)$' along
with timestamp format '%H:%M:%S.%f', directional indications of '<' and '>',
and hex encoding:

    > 0:00:00.265620 a130368b000000080060
    > 0:00:00.280836 a1216c8b00000000000089086b0b82020407
    < 0:00:00.295459 a2010800000000000000000800000000
    > 0:00:00.296982 a1303c8b00000008007088286b0bc1ffcbf0f9ff
    > 0:00:00.305644 a121718b0000000000008ba86a0b8008
    < 0:00:00.319061 a2010900000000000000001000600000
    > 0:00:00.330937 a130428b00000008007589186b0bb9ffd9f0fdfa3eb4295e99f3aaffd2f005
    > 0:00:00.356037 a121788b0000000000008a18

The regex is compiled with multiline support, and it is recommended to use
the anchors '^' and '$' for best results.
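
A pattern can be tried out before handing it to *text2pcap*. The Python sketch
below applies an equivalent pattern to lines shaped like the sample above, with
payloads truncated to four bytes for brevity. Two assumptions apply: Python's
`re` module spells named groups `(?P<name>...)` instead of `(?<name>...)`, and
its regex dialect is close to, but not identical to, the Perl-compatible one
that *text2pcap* expects, so this is a rough check only.

```python
import re
from datetime import datetime

# Same pattern as above, with Python's (?P<name>...) spelling for named groups.
PACKET = re.compile(
    r"^(?P<dir>[<>])\s(?P<time>\d+:\d\d:\d\d.\d+)\s(?P<data>[0-9a-fA-F]+)$",
    re.MULTILINE,
)

SAMPLE = """\
> 0:00:00.265620 a130368b
> 0:00:00.280836 a1216c8b
< 0:00:00.295459 a2010800
"""

records = []
for match in PACKET.finditer(SAMPLE):
    # First character of 'dir' decides the direction ("iI<" means inbound).
    direction = "inbound" if match["dir"][0] in "iI<" else "outbound"
    stamp = datetime.strptime(match["time"], "%H:%M:%S.%f")
    data = bytes.fromhex(match["data"])
    records.append((direction, stamp, data))

for direction, stamp, data in records:
    print(direction, stamp.time(), data.hex())
```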

*Text2pcap* also allows the user to read in dumps of application-level
data and insert dummy L2, L3 and L4 headers before each packet. This allows
Wireshark or any other full-packet decoder to handle these dumps.
If the encapsulation type is Ethernet, the user can elect to insert Ethernet
headers, Ethernet and IP, or Ethernet, IP and UDP/TCP/SCTP headers before
each packet. The fake headers can also be used with the Raw IP, Raw IPv4,
or Raw IPv6 encapsulations, with the Ethernet header omitted. These
encapsulation options can be used in both hexdump mode and regex mode.
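
The layering of the dummy headers can be pictured with the following Python
sketch, which wraps a payload in UDP, IPv4 and Ethernet headers. It is an
illustration under stated assumptions and not the code *text2pcap* runs: the
MAC and IP addresses are borrowed from the sample dump above, the TTL of 255 is
arbitrary, and the UDP checksum is left at zero (meaning "not computed" for
IPv4). The function names are hypothetical.

```python
import socket
import struct


def checksum16(data):
    """Internet checksum (RFC 1071): one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_header(payload_len, proto, src_ip, dst_ip):
    def pack(checksum):
        return struct.pack(
            "!BBHHHBBH4s4s",
            0x45, 0, 20 + payload_len,  # version/IHL, TOS, total length
            0, 0,                       # identification, flags/fragment offset
            255, proto, checksum,       # TTL, protocol, header checksum
            socket.inet_aton(src_ip), socket.inet_aton(dst_ip),
        )
    # Pack once with a zero checksum, then again with the real value.
    return pack(checksum16(pack(0)))


def wrap_udp(payload, src_port, dst_port,
             src_ip="192.0.2.1", dst_ip="192.0.2.2",  # example addresses
             src_mac=bytes.fromhex("000eb6000001"),
             dst_mac=bytes.fromhex("000eb6000002")):
    udp = struct.pack("!HHHH", src_port, dst_port, 8 + len(payload), 0)
    ip = ipv4_header(len(udp) + len(payload), 17, src_ip, dst_ip)
    eth = dst_mac + src_mac + struct.pack("!H", 0x0800)  # EtherType: IPv4
    return eth + ip + udp + payload


frame = wrap_udp(b"hello", 1000, 69)
print(len(frame), frame.hex())
```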

When <__infile__> or <__outfile__> is '-', standard input or standard
output, respectively, is used.

== OPTIONS

-a::
+
--
Enables ASCII text dump identification. It allows one to identify the start of
the ASCII text dump and not include it in the packet even if it looks like hex.
This parameter has no effect in regex mode.

*NOTE:* Do not enable it if the input file does not contain the ASCII text dump.
--

-b 2|8|16|64::
+
--
Specify the base (radix) of the encoding of the packet data in regex mode.
The supported options are 2 (binary), 8 (octal), 16 (hexadecimal), and 64
(base64 encoding), with hex as the default. This parameter has no effect
in hexdump mode.
--

-D::
+
--
Indicates that the text before each input packet may start with either an I
or an O, indicating that the packet is inbound or outbound. If both this flag
and the *-t* flag are used, the directional indicator is expected before
the time code.
This parameter has no effect in regex mode, where the presence of the `<dir>`
capturing group determines whether direction indicators are expected.

Direction indication is stored in the packet headers if the output format
supports it (e.g. pcapng), and is also used when generating dummy headers
to swap the source and destination addresses and ports as appropriate.
--

-e <l3pid>::
+
--
Include a dummy Ethernet header before each packet. Specify the L3PID
for the Ethernet header in hex. Use this option if your dump has a Layer
3 header and payload (e.g. an IP header), but no Layer 2
encapsulation. Example: __-e 0x806__ to specify an ARP packet.

For IP packets, instead of generating a fake Ethernet header you can
also use __-E rawip__ or __-l 101__ to indicate raw IP encapsulation.
Note that raw IP encapsulation does not work for any non-IP Layer 3 packet
(e.g. ARP), whereas generating a dummy Ethernet header with __-e__ works
for any sort of L3 packet.
--

-E <encapsulation type>::
+
--
Sets the packet encapsulation type of the output capture file.
*text2pcap -E* provides a list of the available types; note that not
all file formats support all encapsulation types.  The default type is
ether (Ethernet).

*NOTE:* This sets the encapsulation type of the output file, but does
not translate the packet headers or add additional headers. It is used
to specify the encapsulation that matches the input data.
--

-F <file format>::
+
--
Sets the file format of the output capture file. *Text2pcap* can write
the file in several formats; *text2pcap -F* provides a list of the
available output formats.  The default is the *pcapng* format.
--

-h::
Displays a help message.

-i <proto>::
+
--
Include dummy IP headers before each packet. Specify the IP protocol
for the packet in decimal. Use this option if your dump is the payload
of an IP packet (i.e. has complete L4 information) but does not have
an IP header with each packet. Note that an appropriate Ethernet header
is automatically included with each packet as well if the link-layer
type is Ethernet.
Example: __-i 46__ to specify an RSVP packet (IP protocol 46).  See
https://www.iana.org/assignments/protocol-numbers/protocol-numbers.xhtml for
the complete list of assigned internet protocol numbers.
--

-l <typenum>::
+
--
Sets the packet encapsulation type of the output capture file, using
pcap link-layer header type numbers.  Default is Ethernet (1).
See https://www.tcpdump.org/linktypes.html for the complete list
of possible encapsulations.
Example: __-l 7__ for ARCNet packets encapsulated BSD-style.
--

-m <max-packet>::
+
--
Set the maximum packet length. The default is 262144.
Useful for testing various packet boundaries when only an application
level datastream is available.  Example:

__od -Ax -tx1 -v stream | text2pcap -m1460 -T1234,1234 - stream.pcap__

will convert from plain datastream format to a sequence of Ethernet
TCP packets.
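
The effect on a plain datastream can be approximated in Python as follows.
The sketch assumes that a new packet is started whenever the current one
reaches the maximum length, which is how the example above reads; the helper
name `split_stream` is hypothetical.

```python
def split_stream(data, max_packet=262144):
    """Cut a byte stream into consecutive packets of at most max_packet bytes."""
    return [data[i:i + max_packet] for i in range(0, len(data), max_packet)]


stream = bytes(3000)  # stand-in for an application-level datastream
print([len(p) for p in split_stream(stream, 1460)])
```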
--

-N <intf-name>::
Specify a name for the interface included when writing a pcapng format file.

-o hex|oct|dec|none::
+
--
Specify the radix for the offsets (hex, octal, decimal, or none). Defaults to
hex. This corresponds to the `-A` option for __od__. This parameter has no
effect in regex mode.

*NOTE:* With __-o none__, only one packet will be created, ignoring any
direction indicators or timestamps after the first byte along with any offsets.
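
For comparison, this Python sketch prints the offsets of the same three
sixteen-byte lines in each radix. The zero-padded field widths are an
assumption for the example; any offset of more than two digits is accepted.

```python
FORMATS = {"hex": "{:06x}", "oct": "{:07o}", "dec": "{:07d}"}


def offsets(length, radix, width=16):
    """Offsets of each line of a dump of `length` bytes, `width` bytes per line."""
    return [FORMATS[radix].format(pos) for pos in range(0, length, width)]


for radix in FORMATS:
    print(radix, offsets(40, radix))
```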
--

-P <dissector>::
+
--
Include an EXPORTED_PDU header before each packet.  Specify, as a
string, the dissector to be called for the packet (DISSECTOR_NAME tag).
Use this option if your dump is the payload for a single upper layer
protocol (so specifying a link layer type would not work) and you wish
to create a capture file without a full dummy protocol stack.
Automatically sets the link layer type to Wireshark Upper PDU export.
Without this option, if the Upper PDU export link layer type (252) is
selected the dissector defaults to "data".
--

-q::
Don't display the summary of the options selected at the beginning, or the count of packets processed at the end.

-r <regex>::
+
--
Process the file in regex mode using __regex__ as described above.

*NOTE:* The regex mode uses memory-mapped I/O and does not work on
streams that do not support seeking, like terminals and pipes.
--

-s <srcport>,<destport>,<tag>::
+
--
Include dummy SCTP headers before each packet.  Specify, in decimal, the
source and destination SCTP ports, and verification tag, for the packet.
Use this option if your dump is the SCTP payload of a packet but does
not include any SCTP, IP or Ethernet headers.  Note that appropriate
Ethernet and IP headers are automatically also included with each
packet.  A CRC32C checksum will be put into the SCTP header.
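
For reference, CRC32C is the CRC-32 variant with the Castagnoli polynomial.
A minimal, unoptimized Python sketch of the algorithm follows; it shows the
checksum function only and makes no claim about how *text2pcap* lays the
result out in the SCTP header.

```python
def crc32c(data):
    """Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


# Standard check value for the ASCII string "123456789".
print(hex(crc32c(b"123456789")))  # 0xe3069283
```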
--

-S <srcport>,<destport>,<ppi>::
+
--
Include dummy SCTP headers before each packet.  Specify, in decimal, the
source and destination SCTP ports and the payload protocol identifier.
The verification tag is set to 0, and a dummy SCTP DATA chunk header with
a payload protocol identifier of __ppi__ is prepended.  Use this option if
your dump is the SCTP payload of a packet but does not include any SCTP,
IP or Ethernet headers.  Note that appropriate Ethernet and IP headers are
automatically included with each packet.  A CRC32C checksum will be put
into the SCTP header.
--

-t <timefmt>::
+
--
Treats the text before the packet as a date/time code; __timefmt__ is a
format string supported by strftime(3), supplemented with the field
descriptor '%f' for fractional seconds up to nanoseconds.
Example: The time "10:15:14.5476" has the format code "%H:%M:%S.%f".
The special format string __ISO__ indicates that the string should be
parsed according to the ISO-8601 specification. This parameter is used
in regex mode if and only if the `<time>` capturing group is present.

*NOTE:* Date/time fields from the current date/time are
used as the default for unspecified fields.
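
A format string can be checked against sample input with Python's
`datetime.strptime`, which accepts a similar set of descriptors. This is an
approximation only: Python's '%f' stops at microseconds, whereas *text2pcap*
accepts up to nanoseconds. The second step mimics the note above by taking
the date from the current day.

```python
from datetime import datetime

parsed = datetime.strptime("10:15:14.5476", "%H:%M:%S.%f")
# Only the time of day was given; take the date fields from today.
stamp = datetime.combine(datetime.now().date(), parsed.time())
print(parsed.time(), stamp.isoformat())
```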
--

-T <srcport>,<destport>::
+
--
Include dummy TCP headers before each packet. Specify the source and
destination TCP ports for the packet in decimal. Use this option if
your dump is the TCP payload of a packet but does not include any TCP,
IP or Ethernet headers. Note that appropriate Ethernet and IP headers
are automatically also included with each packet.
Sequence numbers will start at 0.
--

-u <srcport>,<destport>::
+
--
Include dummy UDP headers before each packet. Specify the source and
destination UDP ports for the packet in decimal. Use this option if
your dump is the UDP payload of a packet but does not include any UDP,
IP or Ethernet headers. Note that appropriate Ethernet and IP headers
are automatically also included with each packet.
Example: __-u1000,69__ to make the packets look like TFTP/UDP packets.
--

-v::
Print the version and exit.

-4 <srcip>,<destip>::
+
--
Prepend a dummy IP header with the specified IPv4 source and destination addresses.
This option should be accompanied by one of the following options: *-i*, *-s*, *-S*, *-T*, *-u*.
Use this option to apply "custom" IP addresses.
Example: __-4 10.0.0.1,10.0.0.2__ to use 10.0.0.1 and 10.0.0.2 for all IP packets.
--

-6 <srcip>,<destip>::
+
--
Prepend a dummy IP header with the specified IPv6 source and destination addresses.
This option should be accompanied by one of the following options: *-i*, *-s*, *-S*, *-T*, *-u*.
Use this option to apply "custom" IP addresses.
Example: __-6 2001:db8::b3ff:fe1e:8329,2001:0db8:85a3::8a2e:0370:7334__ to
use 2001:db8::b3ff:fe1e:8329 and 2001:0db8:85a3::8a2e:0370:7334 for all IP packets.
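
An address pair for *-4* or *-6* can be sanity-checked beforehand with
Python's `ipaddress` module. The helper `parse_ip_pair` below is hypothetical
and only checks that both halves of a `<srcip>,<destip>` argument parse as
addresses of the expected IP version.

```python
import ipaddress


def parse_ip_pair(arg, version):
    """Split '<srcip>,<destip>' and check that both are IPv4 or IPv6 as required."""
    src, dst = (ipaddress.ip_address(part) for part in arg.split(","))
    if src.version != version or dst.version != version:
        raise ValueError(f"expected two IPv{version} addresses: {arg!r}")
    return src, dst


print(parse_ip_pair("10.0.0.1,10.0.0.2", 4))
print(parse_ip_pair("2001:db8::b3ff:fe1e:8329,2001:0db8:85a3::8a2e:0370:7334", 6))
```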
--

include::diagnostic-options.adoc[]

== SEE ALSO

od(1), xref:https://www.tcpdump.org/manpages/pcap.3pcap.html[pcap](3), xref:wireshark.html[wireshark](1), xref:tshark.html[tshark](1), xref:dumpcap.html[dumpcap](1), xref:mergecap.html[mergecap](1),
xref:editcap.html[editcap](1), strftime(3), xref:https://www.tcpdump.org/manpages/pcap-filter.7.html[pcap-filter](7), xref:https://www.tcpdump.org/manpages/tcpdump.1.html[tcpdump](8)

== NOTES

This is the manual page for *Text2pcap* {wireshark-version}.
*Text2pcap* is part of the *Wireshark* distribution.
The latest version of *Wireshark* can be found at https://www.wireshark.org.

== AUTHORS

.Original Author
[%hardbreaks]
Ashok Narayanan <ashokn[AT]cisco.com>