/* text_import.c
 * State machine for text import
 * November 2010, Jaap Keuter
 * Modified March 2021, Paul Weiß
 *
 * Wireshark - Network traffic analyzer
 * By Gerald Combs
 * Copyright 1998 Gerald Combs
 *
 * Based on text2pcap.c by Ashok Narayanan
 *
 * SPDX-License-Identifier: GPL-2.0-or-later
 */

/*******************************************************************************
 *
 * This code reads in an ASCII hexdump of this common format:
 *
 * 00000000  00 E0 1E A7 05 6F 00 10 5A A0 B9 12 08 00 46 00  .....o..Z.....F.
 * 00000010  03 68 00 00 00 00 0A 2E EE 33 0F 19 08 7F 0F 19  .h.......3......
 * 00000020  03 80 94 04 00 00 10 01 16 A2 0A 00 03 50 00 0C  .............P..
 * 00000030  01 01 0F 19 03 80 11 01 1E 61 00 0C 03 01 0F 19  .........a......
 *
 * Each bytestring line consists of an offset, one or more bytes, and
 * text at the end. An offset is defined as a hex string of more than
 * two characters. A byte is defined as a hex string of exactly two
 * characters. The text at the end is ignored, as is any text before
 * the offset. Bytes read from a bytestring line are added to the
 * current packet only if all the following conditions are satisfied:
 *
 * - No text appears between the offset and the bytes (any bytes appearing after
 *   such text would be ignored)
 *
 * - The offset must be arithmetically correct, i.e. if the offset is 00000020,
 *   then exactly 32 bytes must have been read into this packet before this.
 *   If the offset is wrong, the packet is immediately terminated
 *
 * A packet start is signaled by a zero offset.
 *
 * Lines starting with #TEXT2PCAP are directives. These allow the user
 * to embed instructions in the capture file that tell text2pcap
 * to take some action (e.g. specifying the encapsulation).
 * Currently no directives are implemented.
 *
 * Lines beginning with # which are not directives are ignored as
 * comments.
 * Currently all non-hexdump text is ignored by text2pcap;
 * in the future, text processing may be added, but lines prefixed
 * with '#' will still be ignored.
 *
 * The output is a libpcap packet containing Ethernet frames by
 * default. This program takes options which allow the user to add
 * dummy Ethernet, IP, and UDP, TCP, or SCTP headers to the packets in order
 * to allow dumps of L3 or higher protocols to be decoded.
 *
 * Considerable flexibility is built into this code to read hexdumps
 * of slightly different formats. For example, any text prefixing the
 * hexdump line is dropped (including mail forwarding '>'). The offset
 * can be any hex number of four digits or greater.
 *
 * This converter cannot read a single packet larger than
 * WTAP_MAX_PACKET_SIZE_STANDARD. The snapshot length is automatically
 * set to WTAP_MAX_PACKET_SIZE_STANDARD.
 */

/*******************************************************************************
 * Alternatively, this parses a text file based on a Perl regex containing named
 * capturing groups like so:
 * (?\d+)\s*(?<|>)\s*(?