text2pcap: gracefully handle hexdump without trailing LF

When a hexdump is copied, the trailing newline might be missing, which
resulted in a capture file missing the last byte of its packet. Adjust
the grammar to recognize the two trailing hexadecimal characters as a
"byte".

This is safe because Flex picks the rule that matches the longest input
string. So given "01 ", it will always match all three characters. If
something like "01x" is given, then the "text" rule will be matched (as
before). Only when no more characters are available (such as at the end
of a file) will the rule match just the two hexdigits.
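
The longest-match argument above can be illustrated with a small Python
sketch. It is not the Flex scanner: the "text" pattern and the rule order
are assumptions made for illustration, and only the "byte" and "byte_eol"
patterns are taken from the grammar. Like Flex, it prefers the longest
match and, on a tie, the rule listed first.

```python
import re

# Simplified stand-ins for the scanner definitions. The "text" pattern and
# the ordering of the rules are assumptions made for this sketch.
RULES = [
    ("byte", re.compile(r"[0-9A-Fa-f][0-9A-Fa-f][ \t]?")),
    ("byte_eol", re.compile(r"[0-9A-Fa-f][0-9A-Fa-f]\r?\n")),
    ("text", re.compile(r"[^ \t\r\n]+")),
]


def first_token(data):
    """Return (rule name, matched text) for the longest match at the start.

    Ties are resolved in favour of the rule listed first, because a later
    rule only wins when its match is strictly longer.
    """
    best = None
    for name, pattern in RULES:
        match = pattern.match(data)
        if match and (best is None or len(match.group()) > len(best[1])):
            best = (name, match.group())
    return best


print(first_token("01 "))   # byte followed by a separator
print(first_token("01x"))   # not a byte, falls through to "text"
print(first_token("74"))    # two hexdigits at the very end of the input
print(first_token("74\n"))  # byte followed by a newline
```

With the optional separator, "74" at the end of the input is still
classified as a byte, while "01x" keeps matching the longer "text" rule.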

Remove the unnecessary hexdigit rule while at it.

Change-Id: I21dc37d684d1c410ce720cb27706a6e54f87f94d
Reviewed-on: https://code.wireshark.org/review/30190
Petri-Dish: Peter Wu <peter@lekensteyn.nl>
Tested-by: Petri Dish Buildbot
Reviewed-by: Anders Broman <a.broman58@gmail.com>
Peter Wu 2018-10-13 00:21:16 +02:00 committed by Anders Broman
parent 9b72da0cdd
commit 22cf80d30d
2 changed files with 8 additions and 2 deletions

@@ -332,6 +332,13 @@ class case_text2pcap_parsing(subprocesstest.SubprocessTestCase):
             "7f 00 00 01 ff 98 00 13 00 0d b5 48 66 69 72 73\n"
         self.check_rawip(pdata, 0, 0)

+    def test_text2pcap_eol_missing(self):
+        '''Verify that the last LF can be missing.'''
+        pdata = "0000 45 00 00 21 00 01 00 00 40 11 7c c9 7f 00 00 01\n" \
+                "0010 7f 00 00 01 ff 98 00 13 00 0d b5 48 66 69 72 73\n" \
+                "0020 74"
+        self.check_rawip(pdata, 1, 33)
+

 def run_text2pcap_content(test, content, args):
     testin_file = test.filename_from_id(testin_txt)
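
The expected length of 33 in the new test can be cross-checked by counting
the bytes in the hexdump by hand. The sketch below is a hypothetical helper
(`count_bytes` is not part of the test suite or of text2pcap) and assumes
that every line starts with an offset field followed by space-separated
bytes, and that the last argument of `check_rawip` is the expected number of
packet bytes.

```python
def count_bytes(hexdump):
    """Count the byte fields in a simple offset-prefixed hexdump."""
    total = 0
    for line in hexdump.splitlines():
        fields = line.split()
        # The first field is the offset; the rest are data bytes.
        total += len(fields[1:])
    return total


pdata = ("0000 45 00 00 21 00 01 00 00 40 11 7c c9 7f 00 00 01\n"
         "0010 7f 00 00 01 ff 98 00 13 00 0d b5 48 66 69 72 73\n"
         "0020 74")

# Two full lines of 16 bytes plus the final "74" without a trailing LF.
print(count_bytes(pdata))
```

The two full lines contribute 16 bytes each and the unterminated last line
contributes one, which is the byte that was dropped before this change.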

@@ -68,10 +68,9 @@ DIAG_OFF_FLEX

 %}

-hexdigit [0-9A-Fa-f]
 directive ^#TEXT2PCAP.*\r?\n
 comment ^[\t ]*#.*\r?\n
-byte [0-9A-Fa-f][0-9A-Fa-f][ \t]
+byte [0-9A-Fa-f][0-9A-Fa-f][ \t]?
 byte_eol [0-9A-Fa-f][0-9A-Fa-f]\r?\n
 offset [0-9A-Fa-f]+[: \t]
 offset_eol [0-9A-Fa-f]+\r?\n