When decoding UCS-4/UTF-32, map Unicode code points above
0x10FFFFF to REPLACEMENT CHARACTER, as they are not legal,
and would create invalid UTF-8.
Also if the number of bytes given is not a multiple of 4,
insert a replacement character at the end as well.
This is two long standing todos. Fixes#18435.
The result of Base64 decoding might not be valid UTF-8, so
check it as such. Also add the decoded result as a new tvb data
source, so that it's easier to do other manipulations on it from
the GUI in case it isn't UTF-8.
Note that RFC 7617 says that the encoding is only known to be
UTF-8 if the charset parameter was used in the WWW-Authenticate
header, so perhaps this should be a FT_BYTES using
BASE_SHOW_UTF_8_PRINTABLE
Fix#18408
As the Qt6 QString::QString(const QByteArray &ba) documenation says:
"Note: any null ('\0') bytes in the byte array will be included in this
string, converted to Unicode null characters (U+0000). This behavior is
different from Qt 5.x."
Make sure FieldInformation::toString() truncates its display label byte
array before converting it to a QString.
Fixes#18428
This removes the last dependency of the logging subsystem on the
preferences module. The latter is started much later than the former
and this is an issue.
The Windows-only preference "gui.console_open" is stored in the
registry as HKEY_LOCAL_USER\Software\Wireshark\ConsoleOpen. The semantics
are exactly the same. The preference is read by the logging subsystem
for initialization and then again by the preferences (read/write) so
the user can configure it as before.
The code to store the preference also in the preferences file was
kept, for backward compatibility and because it is not incompatible
with using the Registry concurrently.
The elimination of the prefs dependency also allows moving the Windows
console logic to wsutil and add the functionality to wslog directly,
thereby eliminating the superfluous Wireshark/Logray custom log handler.
To be able to read the ws_log_console_open global variable from
libwireshark it becomes necessary to add a new export macro
symbol called WSUTIL_EXPORT.
If the formatted string generated by expert_add_info_format is
truncated by being larger than ITEM_LABEL_LENGTH, it might get
truncated in the middle of a multibyte UTF-8 character.
Check for that, and end the string where the partial character
starts.
Fix#18421
GMTLS is a non-official name, now that these is a Chinese National Standard called
"GB/T 38636-2020 Information security technology—Transport layer cryptography protocol(TLCP)"
so we replace GMTLS by TLCP
Removing the artificial shell prompt symbols does not hurt
legibility and makes is significantly easier to copy-paste
commands, either by double-clicking for a single line or
click and drag for a multiline block of text.
Make sure we fetch AWS_PROFILE if it exists. Don't add AWS_PROFILE or
AWS_REGION if they're already in the profile and region lists. Fix our
default values.
They're of type FT_NONE, meaning that they do not have values, they're
just present or not.
Handle the TCP analysis fields "tcp.analysis.retransmission" and
"tcp.analysis.keep_alive", both of which are expert infos, by just
seeing if they're present or not.
Fixes a problem mentioned in a comment in merge request !8412.
The function tvb_get_const_stringz() does not check for a string
encoding and returns a pointer to a byte array. For this reason
it should not be used. Prefer other functions that return a
valid UTF-8 string from a source encoding or use tvb_get_ptr()
to fetch a byte pointer.
This adds a new attribute that allows declaring Wireshark
functions as deprecated.
Also disabe -Werror with deprecated declarations Deprecated
declarations can be introduced suddenly with a new version
of an external dependency or a new internal deprecation and
that has its own timeline to fix. We should still be able to
build with -Werror in that case.
Replace strtol/strtoul with the glib functions that do
not have a locale dependency.
Cleanup some casts and print formats. Remove some code
duplication. Add some null checks.
Rename a function for consistency.
This patch fixes some bugs that occur with padded Ethernet frames and
frames, when a Ethernet FCS is present. In the past that lead to wrong
detection of the ICV by the Ethernet dissector (trailer).
Such errors did occur for example with frames that were padded and
MACsec was added later; thus, being bigger than expected for the
heuristics in the packet-eth.c: "pinfo->fd->pkt_len >= 60 ..."
Format is defined by Wi-Fi EasyMesh™ Specification Version 4.0
* 5.2.2 Backhaul STA Configuration (Table 4. Multi-AP Default 802.1Q Setting
subelement format)
* 7.1 AP configuration (Table 15. Multi-AP Profile subelement)
We need to convert strings to valid UTF-8 for internal Wireshark
use. Since we aren't storing the code set service context as
conversation data (it's sent only when initially connecting, not
on a per-request basis), use the default of ISO-8859-1.
Also some automatic fixes of allocation patterns
Fix#18410
We amend the :<numeric> pattern to not eat the leading
colon. Because the colon can be part of the value (with IPv6 addresses
for example) we want to avoid doing that.
IPv6 addresses are covered by their own rules but this removes the
requirement in the future to handle any special cases and avoids
surprises.
For this reason the colon-prefix syntax is already explicitly defined to
work only for byte arrays and there is currently no universal
syntax for all literal values or even all numbers.
Other numbers can keep using the lexical type "unparsed".
```
run/dftest "_ws.ftypes.uint8 == :fd"
Filter: _ws.ftypes.uint8 == :fd
dftest: ":fd" is not a valid number.
_ws.ftypes.uint8 == :fd
^~~
run/dftest "_ws.ftypes.uint8 == fd"
Filter: _ws.ftypes.uint8 == fd
dftest: "fd" is not a valid number.
_ws.ftypes.uint8 == fd
^~
run/dftest "_ws.ftypes.uint8 == 0xfd"
Filter: _ws.ftypes.uint8 == 0xfd
Syntax tree:
0 TEST_ANY_EQ:
1 FIELD(_ws.ftypes.uint8 <FT_UINT8>)
1 FVALUE(253 <FT_UINT8>)
Instructions:
00000 READ_TREE _ws.ftypes.uint8 <FT_UINT8> -> reg#0
00001 IF_FALSE_GOTO 3
00002 ANY_EQ reg#0 == 253 <FT_UINT8>
00003 RETURN
run/dftest "_ws.ftypes.bytes == fd"
Filter: _ws.ftypes.bytes == fd
Syntax tree:
0 TEST_ANY_EQ:
1 FIELD(_ws.ftypes.bytes <FT_BYTES>)
1 FVALUE(fd <FT_BYTES>)
Instructions:
00000 READ_TREE _ws.ftypes.bytes <FT_BYTES> -> reg#0
00001 IF_FALSE_GOTO 3
00002 ANY_EQ reg#0 == fd <FT_BYTES>
00003 RETURN
run/dftest "_ws.ftypes.bytes == :fd"
Filter: _ws.ftypes.bytes == :fd
Syntax tree:
0 TEST_ANY_EQ:
1 FIELD(_ws.ftypes.bytes <FT_BYTES>)
1 FVALUE(fd <FT_BYTES>)
Instructions:
00000 READ_TREE _ws.ftypes.bytes <FT_BYTES> -> reg#0
00001 IF_FALSE_GOTO 3
00002 ANY_EQ reg#0 == fd <FT_BYTES>
00003 RETURN
```
The <...> syntax for literals, intended to be as generic as
possible, unintentionally introduced an ambiguity with the
relational expression "a < b or a > c".
Literals are values like numbers, bytes, IPv6 addresses or, one
could imagine, UNC paths for example, if an FT_UNC type were to
be added in the future.
We could use a new unique symbol like @...@ but the <...>
syntax is very recent and may not be necessary with ":xxx" so
just remove it.
A byte array can be explicitly declared by prefixing with a colon. It
is not as generic but the main ambiguity that this new syntax attempted
to solve is bytes vs protocol names. We don't want to introduce a new
reserved symbol for now, until other requirements if any are more clear.
Fixes#18418.