478 Commits

Author SHA1 Message Date
João Valverde df0fc8b517 dfilter: Try to be more flexible with leading colons
For an expression starting with a colon (a literal) try to parse
the value with and without colon. This avoids excluding some
valid representations like the IPv6 address "::1".
2022-03-05 11:10:54 +00:00
João Valverde c4f9d8abda dfilter: Rename "unparsed" to "literal"
A literal value is a value that cannot be interpreted as a
registered protocol. An unparsed value can be a literal or
an identifier (protocol/field) according to context and the
current disambiguation rules.

Strictly literal here is to be understood to mean "numeric
literal, including numeric arrays, but not strings or character
constants".
2022-03-05 11:10:54 +00:00
João Valverde 1278e36152 dfilter: Add more debug code 2022-02-27 23:35:57 +00:00
João Valverde ef31431aeb dfilter: Add a true/false boolean representation
Minor code cleanup.
2022-02-23 23:37:47 +00:00
João Valverde 70d516368b Fix EditorConfig settings 2022-02-23 23:37:47 +00:00
João Valverde 9cc3e7e1bb dfilter: Add support for binary literal constants
Example: 0b1001, 0B111000, etc.
2022-02-23 22:27:59 +00:00
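
To illustrate the binary literal syntax from 9cc3e7e1bb (0b1001, 0B111000), here is a minimal sketch of a parser for such constants. The helper name parse_binary_literal() is hypothetical and this is not the code from the commit; it only shows one way the "0b"/"0B" prefix and the digits could be handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not Wireshark's implementation): parse a string of
 * the form "0b1001" or "0B111000" into a 64-bit unsigned value.
 * Returns false on malformed input or if the value does not fit. */
bool parse_binary_literal(const char *s, uint64_t *out)
{
    assert(s != NULL && out != NULL);

    /* Require the "0b" or "0B" prefix. */
    if (s[0] != '0' || (s[1] != 'b' && s[1] != 'B'))
        return false;
    s += 2;
    if (*s == '\0')
        return false;               /* prefix with no digits */

    uint64_t value = 0;
    for (; *s != '\0'; s++) {
        if (*s != '0' && *s != '1')
            return false;           /* not a binary digit */
        if (value >> 63)
            return false;           /* the next shift would overflow */
        value = (value << 1) | (uint64_t)(*s - '0');
    }
    *out = value;
    return true;
}
```

With this sketch "0b1001" yields 9 and "0B111000" yields 56, while "0b", "0b102" and "1001" are rejected.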
Guy Harris ec0aaf1811 ftype-time: check for NULL from gmtime() and localtime().
On Windows, they return NULL for times prior to the Epoch.
2022-01-04 15:35:18 -08:00
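
The NULL check described in ec0aaf1811 can be sketched as follows. The function name format_utc() and the output format are assumptions made for illustration; only gmtime() and strftime() are standard C calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: format a time_t as UTC, tolerating platforms where
 * gmtime() returns NULL for values it cannot represent.
 * Note: gmtime() uses static storage and is not thread-safe. */
bool format_utc(time_t t, char *buf, size_t buf_size)
{
    assert(buf != NULL && buf_size > 0);

    struct tm *tm = gmtime(&t);
    if (tm == NULL) {
        /* Conversion failed: leave an empty string instead of
         * dereferencing a NULL pointer. */
        buf[0] = '\0';
        return false;
    }
    /* strftime() returns 0 if the result does not fit in the buffer. */
    return strftime(buf, buf_size, "%Y-%m-%d %H:%M:%S", tm) != 0;
}
```

A caller checks the return value and falls back to a placeholder such as "Not representable" when the conversion fails.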
João Valverde 8501dc48dd dfilter: Accept byte arrays without separators
This relaxes the display filter syntax to accept byte arrays without
separators. An expression such as the following becomes valid:

    quic.dcid == b1f0b7cbe0897974

Previously it had to be written as:

    quic.dcid == b1:f0:b7:cb:e0:89:79:74

Partially fixes #17818.
2022-01-03 16:27:16 +00:00
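
The relaxed byte array syntax from 8501dc48dd can be illustrated with a small parser that accepts both spellings. This is a sketch under stated assumptions: parse_byte_array() and hex_nibble() are hypothetical names, and only ':' is handled as a separator, which may differ from what the real parser accepts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert one hex digit to its value, or return -1 if it is not hex. */
static int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper: parse pairs of hex digits with an optional ':'
 * between pairs. Returns the number of bytes written to out, or -1 on
 * malformed input, empty input, or insufficient space. */
int parse_byte_array(const char *s, uint8_t *out, size_t out_size)
{
    assert(s != NULL && out != NULL);

    size_t n = 0;
    while (*s != '\0') {
        int hi = hex_nibble(s[0]);
        if (hi < 0)
            return -1;
        int lo = hex_nibble(s[1]);  /* s[1] is at worst the NUL terminator */
        if (lo < 0)
            return -1;              /* odd number of digits */
        if (n == out_size)
            return -1;              /* output buffer full */
        out[n++] = (uint8_t)((hi << 4) | lo);
        s += 2;
        if (*s == ':') {
            s++;
            if (*s == '\0')
                return -1;          /* trailing separator */
        }
    }
    return n > 0 ? (int)n : -1;
}
```

Both "b1f0b7cbe0897974" and "b1:f0:b7:cb:e0:89:79:74" produce the same eight bytes with this sketch.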
João Valverde dd9ac15ff2 dfilter: Require separators with ISO 8601 time
Require date/time separators when entering a time value, e.g.:
    2014-07-04 12:34:56.789+00:00

Separators in the timezone offset are an exception: they are
never mandatory.

This excludes ISO basic format to avoid inputs that could
be entirely numbers indistinguishable from Epoch time, in case
we want to support that in the future.
2022-01-02 10:44:01 +00:00
João Valverde e724a4baf6 dfilter: Use ISO8601 as the default time format
Change from a default custom time representation to ISO8601.
All the existing formats are still supported for backward
compatibility.

Before:

  Filter: frame.time == "2011-07-04 12:34:56"

  Constants:
  00000 PUT_FVALUE	"Jul  4, 2011 12:34:56.000000000" <FT_ABSOLUTE_TIME> -> reg#1
  (...)

After:

  Filter: frame.time == "2011-07-04 12:34:56"

  Constants:
  00000 PUT_FVALUE	"2011-07-04 12:34:56+0100" <FT_ABSOLUTE_TIME> -> reg#1
  (...)
2021-12-31 15:01:41 +00:00
João Valverde 0047ca961f dfilter: Add support for entering time in UTC
Add the option to enter a filter with an absolute time
value in UTC. Otherwise the value is interpreted in
local time.

The syntax used is a "UTC" suffix, for example:

    frame.time == "Dec 31, 2002 13:55:31.3 UTC"

This also changes the behavior of "Apply Selected as filter".
Fields using a local time display type will use local time
and fields using UTC display type will be applied using UTC.

Fixes #13268.
2021-12-30 17:53:09 +00:00
João Valverde 64572a11f9 dfilter: Use better error messages for absolute times 2021-12-29 02:25:38 +00:00
João Valverde 445dcd3117 epan: Extend abs_time_to_str() with a flags argument 2021-12-28 04:05:20 +00:00
João Valverde a566076839 epan: Move time display types to field_display_e
This makes it easier to understand the code, avoids conflicts
and ugly and unnecessary casts.

The field display enum has evolved over time from integer types
to a type generic parameter.
2021-12-27 22:31:31 +00:00
João Valverde 0d5bfd44a8 Use a wrapper function to call strptime()
Encapsulate the feature requirements for strptime() in a
portability wrapper.

Use _GNU_SOURCE to expose strptime. It should be enough on glibc
without the side-effect of selecting a particular SUS version,
which we don't need and might hide other definitions.
2021-12-27 14:07:32 +00:00
João Valverde 36d5aad962 wsutil: Split ws_regex_matches() into two functions
Split ws_regex_matches() into two functions with better semantics
and remove the WS_REGEX_ZERO_TERMINATED symbol.

ws_regex_matches() matches zero terminated strings.

ws_regex_matches_length() matches a string length in code units.
2021-12-21 00:40:02 +00:00
João Valverde c5a19582e4 epan: Convert to use stdio.h from GLib
Replace:
    g_snprintf() -> snprintf()
    g_vsnprintf() -> vsnprintf()
    g_strdup_printf() -> ws_strdup_printf()
    g_strdup_vprintf() -> ws_strdup_vprintf()

This is more portable, user-friendly and faster on platforms
where GLib does not like the native I/O.

Adjust the format string to use macros from inttypes.h.
2021-12-19 19:29:53 +00:00
João Valverde fb0e1a4907 regex: Remove requirement for ssize_t
The type ssize_t is not available on Windows. Because this is
used in the public API we must provide a definition for it.
To avoid having to add a header to fix this use a size_t in
the API instead, and assign SIZE_MAX to represent a null
terminated string.
2021-12-13 23:57:32 +00:00
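
The approach in fb0e1a4907, an unsigned length parameter with SIZE_MAX standing for "null terminated", can be sketched as below. The names LEN_NUL_TERMINATED and count_char() are hypothetical and chosen only for illustration; a later commit in this log (36d5aad962) replaces the sentinel with two separate functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sentinel: no real buffer can be SIZE_MAX bytes long, so the
 * value is free to mean "compute the length with strlen()". */
#define LEN_NUL_TERMINATED SIZE_MAX

/* Hypothetical API taking a size_t length instead of ssize_t: counts the
 * occurrences of ch in the first 'length' bytes of subject. */
size_t count_char(const char *subject, size_t length, char ch)
{
    assert(subject != NULL);

    if (length == LEN_NUL_TERMINATED)
        length = strlen(subject);

    size_t count = 0;
    for (size_t i = 0; i < length; i++) {
        if (subject[i] == ch)
            count++;
    }
    return count;
}
```

Passing an explicit length also allows subjects with embedded NUL bytes, which a strlen()-based call would truncate.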
Moshe Kaplan a523135202 epan: Add header files to Doxygen
Add @file markers for epan
headers so that Doxygen will
generate documentation for them.
2021-11-30 08:46:49 +00:00
Moshe Kaplan 1c3a9af869 Add files with WS_DLL_PUBLIC to Doxygen
Add @file markers for most files that
contain functions exported with
WS_DLL_PUBLIC so that Doxygen will
generate documentation for them.
2021-11-29 21:27:45 +00:00
João Valverde 54bdc20e45 epan: Rewrite ws_escape_string() to use wmem
Return a wmem-allocated string.

Add boolean argument to enable/disable adding double quotes.
2021-11-29 17:47:53 +00:00
João Valverde 44121e2c3b Move escape_string() to wsutil
Move this utility function to wsutil. Rename to
ws_escape_string().

Also add tests.
2021-11-29 17:47:53 +00:00
João Valverde ef8125e3ae Move two functions from epan to wsutil/str_util
Move epan_memmem() and epan_strcasestr() to wsutil/str_util.
Rename to ws_memmem() and ws_strcasestr(). Add compile time
check for a system implementation and use that if available.

We invoke those functions using a wrapper to avoid exposing
_GNU_SOURCE outside of the implementation.
2021-11-28 12:32:51 +00:00
João Valverde 943c282009 dfilter: Parse character constants in lexer
Invalid character constants should be handled in the lexical scanner.

Todo: See if some code could be shared to parse double quoted strings.

It also fixes some unintuitive type coercions to string. Character
constants should be treated as characters, or maybe integers, or
maybe even throw an invalid comparison error, but converting to a
literal string or byte array is surprising and not particularly
useful:
  '\xFF' -> "'\xFF'" (equals)
  '\xFF' -> "FF"     (contains)

Before:

    Filter: http.request.method contains "\x63"

    Constants:
    00000 PUT_FVALUE	"c" <FT_STRING> -> reg#1
    (...)

    Filter: http.request.method contains '\x63'

    Constants:
    00000 PUT_FVALUE	"63" <FT_STRING> -> reg#1
    (...)

    Filter: http.request.method == "\x63"

    Constants:
    00000 PUT_FVALUE	"c" <FT_STRING> -> reg#1
    (...)

    Filter: http.request.method == '\x63'

    Constants:
    00000 PUT_FVALUE	"'\\x63'" <FT_STRING> -> reg#1
    (...)

After:

    Filter: http.request.method contains '\x63'

    Constants:
    00000 PUT_FVALUE	"c" <FT_STRING> -> reg#1
    (...)

    Filter: http.request.method == '\x63'

    Constants:
    00000 PUT_FVALUE	"c" <FT_STRING> -> reg#1
    (...)
2021-11-24 08:40:20 +00:00
João Valverde 7028646f9e dfilter: Fix invalid character constant error message
This reverts commit d635ff4933.

A charconst cannot be a value string, for that reason it is not
redundant with unparsed.

Maybe character constants should be parsed in the lexical scanner
instead.

Before:
  Filter: ip.proto == '\g'
  dftest: "'\g'" cannot be found among the possible values for ip.proto.

After:
  Filter: ip.proto == '\g'
  dftest: "'\g'" isn't a valid character constant.
2021-11-23 17:35:40 +00:00
João Valverde 72c5efea1b dfilter: Reject invalid character escape sequences
For double quoted strings. This is consistent with single quote
character constants and the C standard. It also avoids common
mistakes where the superfluous backslash is silently suppressed.
2021-11-23 16:48:02 +00:00
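
The stricter behavior described in 72c5efea1b can be sketched with a small unescaping routine that fails on an unknown escape instead of silently dropping the backslash. The name unescape_strict() is hypothetical, and the set of escapes handled here is deliberately a small subset chosen for illustration, not the set the display filter lexer supports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: copy src to dst while resolving the escapes
 * \\, \", \n, \t and \r. Any other escape, including a trailing
 * backslash, is an error rather than being passed through. */
bool unescape_strict(const char *src, char *dst, size_t dst_size)
{
    assert(src != NULL && dst != NULL && dst_size > 0);

    size_t n = 0;
    for (; *src != '\0'; src++) {
        char c = *src;
        if (c == '\\') {
            src++;
            switch (*src) {
            case '\\': c = '\\'; break;
            case '"':  c = '"';  break;
            case 'n':  c = '\n'; break;
            case 't':  c = '\t'; break;
            case 'r':  c = '\r'; break;
            default:
                return false;       /* unknown escape or end of input */
            }
        }
        if (n + 1 >= dst_size)
            return false;           /* no room for c plus the terminator */
        dst[n++] = c;
    }
    dst[n] = '\0';
    return true;
}
```

With this sketch the input a\tb is accepted, while a\gb is rejected instead of being quietly turned into "agb".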
João Valverde 274531820a Move regex code to wsutil 2021-11-14 21:00:59 +00:00
João Valverde b9f2e4b7fa Make PCRE2 a required dependency 2021-11-14 21:00:59 +00:00
João Valverde 9df5279af7 dfilter: Remove support for GRegex
PCRE2 is mature, widely used and widely available. Supporting two
different RE implementations, one of which is unmaintained, is
unnecessary and counter-productive.
2021-11-14 21:00:59 +00:00
João Valverde ed8a02af17 dfilter: Add support for PCRE2
PCRE2 is the future of PCRE. The only advantage of GRegex is that
it comes bundled with GLib, which is not an advantage at all.
PCRE2 is widely available. The GRegex abstraction layer is not a
good fit: it abstracts things that don't need abstracting or that we
could handle better ourselves, there are open bugs (#12997), and
maintenance is spotty at best.

GRegex comes with many of the problems of bundled code, aggravated by
the fact that it completely falls outside of our control.
2021-11-14 21:00:59 +00:00
João Valverde 526ccff3d0 ftypes: Remove unused function declarations
Some registration function declarations were left behind,
remove them. While here move function declarations after
typedefs as is customary.
2021-11-11 09:50:29 +00:00
João Valverde fd78f1ed02 ftypes: Clean up duplicate struct field 2021-11-11 09:50:12 +00:00
João Valverde 5503d5e131 ftypes: Optimize a memory allocation
Avoid pre-allocating a large worst case size character buffer.
2021-11-11 03:56:04 +00:00
João Valverde 1a32a75a62 ftypes: Internal headers need to be internal
The header ftypes-int.h should not be used outside of epan/ftypes
because it is a private header.

The functions fvalue_free() and fvalue_cleanup() need not and should
not be macros either.
2021-11-11 03:15:31 +00:00
João Valverde 6ad14ac4fa ftypes: Remove fvalue_string_repr_len()
The implementation is pre-computing the length and using that
to allocate a buffer. This doesn't have any practical advantage
and is inefficient because the code is mostly doing the same work
twice. Remove the unnecessary length pre-computation step.
2021-11-10 16:02:45 +00:00
João Valverde b49abcb215 epan: Remove fvalue_string_repr_len() from the public API
This function is unnecessary. Clients are receiving a wmem-allocated
buffer and have no need to know the length a priori.
2021-11-10 16:01:21 +00:00
João Valverde 084619088c ftypes: Bugfix missing return statement 2021-11-10 16:01:21 +00:00
João Valverde 4c800f2dba ftypes: Replace a g_snprintf() 2021-11-10 16:00:27 +00:00
João Valverde 7630577ffa ftypes: Bugfix a buffer size
The 'size' variable is not the size of the 'mantissa_str' buffer.
'size' is the output buffer size, sizeof(mantissa_str) is fixed
at 8 bytes.
2021-11-10 15:43:01 +00:00
João Valverde 69c850df51 ftypes: Simplify fvalue_can_*() interface
If an ftype can participate in equals assume it can also participate in
not equals. Use fvalue_can_eq() instead of fvalue_can_ne().

If it can participate in one order comparison it can participate in all.
Replace any comparison with fvalue_can_cmp().
2021-11-07 22:44:59 +00:00
João Valverde 353beb6c6d dfilter: Fixup a null return value 2021-11-01 10:46:53 +00:00
João Valverde d635ff4933 dfilter: Remove redundant STTYPE_CHARCONST syntax node
A charconst uses the same semantic rules as unparsed so just
use the latter to avoid redundancies.

We keep the use of TOKEN_CHARCONST as an optimization to avoid
an unnecessary name resolution (lookup for a registered field with
the same name as the charconst).
2021-10-31 20:33:31 +00:00
João Valverde db04d188e1 Remove some unnecessary casts.
Casts are best avoided unless they are truly required. Fix some
constness mismatches this revealed.
2021-10-27 10:24:20 +01:00
João Valverde e8800ff3c4 dfilter: Add a thin encapsulation layer for REs 2021-10-18 12:09:36 +00:00
João Valverde c484ad0e5c dfilter: Don't try to parse byte arrays as strings
It won't work with embedded null bytes so don't try. This is
not an additional restriction, it just removes a hidden failure
mode. To support matching embedded NUL bytes we would have
to use an internal string representation other than
null-terminated C strings (which doesn't seem very onerous with
GString).

Before:
  Filter: http.user_agent == 41:42:00:43

  Constants:
  00000 PUT_FVALUE	"AB" <FT_STRING> -> reg#1

  Instructions:
  00000 READ_TREE		http.user_agent -> reg#0
  00001 IF-FALSE-GOTO	3
  00002 ANY_EQ		reg#0 == reg#1
  00003 RETURN

After:
  Filter: http.user_agent == 41:42:00:43

  Constants:
  00000 PUT_FVALUE	"41:42:00:43" <FT_STRING> -> reg#1

  Instructions:
  00000 READ_TREE		http.user_agent -> reg#0
  00001 IF-FALSE-GOTO	3
  00002 ANY_EQ		reg#0 == reg#1
  00003 RETURN
2021-10-15 13:06:51 +01:00
João Valverde 144dc1e2ee dfilter: Use the same semantic rules for protocols and bytes
FT_PROTOCOL and FT_BYTES are the same semantic type, but one is
backed by a GByteArray and the other by a TVBuff. Use the same
semantic rules to parse both. In particular unparsed strings
are not converted to literal strings for protocols.

Before:
  Filter: frame contains 0x0000

  Constants:
  00000 PUT_FVALUE	30:78:30:30:30:30 <FT_PROTOCOL> -> reg#1

  Instructions:
  00000 READ_TREE		frame -> reg#0
  00001 IF-FALSE-GOTO	3
  00002 ANY_CONTAINS	reg#0 contains reg#1
  00003 RETURN

  Filter: frame[5:] contains 0x0000
  dftest: "0x0000" is not a valid byte string.

After:
  Filter: frame contains 0x0000
  dftest: "0x0000" is not a valid byte string.

  Filter: frame[5:] contains 0x0000
  dftest: "0x0000" is not a valid byte string.

Related to #17634.
2021-10-15 13:06:51 +01:00
João Valverde 041aa24a37 ftypes: Rewrite FT_PROTOCOL comparison operator
For efficiency do the comparison in a single function call
instead of trying to preserve exactly the previous semantics.

Still I tried not to deviate much.
2021-10-10 20:48:29 +00:00
João Valverde 13e9e7199c ftypes: Use an order function to compare ftypes
All the order operators can be defined in terms of 'lt'
and 'eq' so use that to reduce the number of required
methods from 6 to 2.

Further reduce to one by combining those two into a single
function that has memcmp semantics: negative return is
"less than", positive is "greater than" and zero is equal.
2021-10-10 20:48:29 +00:00
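
The reduction described in 13e9e7199c, from six comparison methods to a single order function with memcmp semantics, can be sketched as follows. The type demo_value_t and the demo_*() names are hypothetical stand-ins, not the ftypes API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical value type standing in for a field value. */
typedef struct {
    int value;
} demo_value_t;

/* The only method a type has to provide: negative if a < b, zero if
 * equal, positive if a > b (memcmp-style). */
int demo_order(const demo_value_t *a, const demo_value_t *b)
{
    assert(a != NULL && b != NULL);
    return (a->value > b->value) - (a->value < b->value);
}

/* All six operators are derived from the single order function. */
bool demo_eq(const demo_value_t *a, const demo_value_t *b) { return demo_order(a, b) == 0; }
bool demo_ne(const demo_value_t *a, const demo_value_t *b) { return demo_order(a, b) != 0; }
bool demo_lt(const demo_value_t *a, const demo_value_t *b) { return demo_order(a, b) <  0; }
bool demo_le(const demo_value_t *a, const demo_value_t *b) { return demo_order(a, b) <= 0; }
bool demo_gt(const demo_value_t *a, const demo_value_t *b) { return demo_order(a, b) >  0; }
bool demo_ge(const demo_value_t *a, const demo_value_t *b) { return demo_order(a, b) >= 0; }
```

Writing the order function as (a > b) - (a < b) avoids the overflow that a plain subtraction could cause for integers of large magnitude.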
João Valverde 5fcdf25697 dfilter: Generalize special case of one byte literal
Instead of only accepting a byte literal specification if the LHS is a
len-1 byte string, accept it everywhere bytes are wanted.

Before:
  $ dftest "frame[1] contains 0x01"
  Filter: frame[1] contains 0x01

  Constants:
  00000 PUT_FVALUE	01 <FT_BYTES> -> reg#2

  Instructions:
  (...)

  $ dftest "frame[1:4] contains 0x01"
  Filter: frame[1:4] contains 0x01
  dftest: "0x01" is not a valid byte string.

After:
  $ dftest "frame[1:4] contains 0x01"
  Filter: frame[1:4] contains 0x01

  Constants:
  00000 PUT_FVALUE	01 <FT_BYTES> -> reg#2

  Instructions:
  (...)
2021-10-07 23:01:50 +00:00
João Valverde 9dab2280ca dfilter: Fix parsing of octal character escape sequences
Octal escape sequences \NNN can have between 1 and 3 digits. If
the sequence had fewer than 3 digits, the parser got out of sync
because of an incorrect double increment of the pointer and errored
out when parsing sequences like \0, \2 or \33.

Before:
  Filter: ip.proto == '\33'
  dftest: "'\33'" is too long to be a valid character constant.

After:
  Filter: ip.proto == '\33'

  Constants:
  00000 PUT_FVALUE	27 <FT_UINT8> -> reg#1

  Instructions:
  00000 READ_TREE		ip.proto -> reg#0
  00001 IF-FALSE-GOTO	3
  00002 ANY_EQ		reg#0 == reg#1
  00003 RETURN

Fixes #16525.
2021-10-07 18:44:37 +00:00
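
The parsing rule behind 9dab2280ca, an octal escape of one to three digits with the pointer advanced only past the digits actually consumed, can be sketched like this. The helper parse_octal_escape() is hypothetical and is not the code from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: *pp points at the first character after the
 * backslash. Consumes between 1 and 3 octal digits and advances *pp past
 * exactly those digits. Returns false, leaving *pp unchanged, if there is
 * no octal digit or the value does not fit in one byte. */
bool parse_octal_escape(const char **pp, unsigned char *out)
{
    assert(pp != NULL && *pp != NULL && out != NULL);

    const char *p = *pp;
    unsigned value = 0;
    int ndigits = 0;

    while (ndigits < 3 && *p >= '0' && *p <= '7') {
        value = value * 8 + (unsigned)(*p - '0');
        p++;                        /* one increment per digit consumed */
        ndigits++;
    }
    if (ndigits == 0 || value > 0xFF)
        return false;

    *pp = p;
    *out = (unsigned char)value;
    return true;
}
```

For the example in the commit message, the digits 33 give the value 27 and leave the pointer on the closing quote.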