The C language does not guarantee that "char" is signed or unsigned; it
just states that it's "implementation-defined".
At least some C compilers for some architectures make it unsigned, so
you need "signed char" to get a signed value. In particular, it's
unsigned for most ARM compilers (compilers for Darwin-based OSes such as
macOS make it signed on all platforms, including ARM), which causes a
warning about "ba[i] < '\0'" always being false.
The purpose of that test is to check for octets that correspond neither
to ASCII printable characters nor ASCII control characters; just test
with !g_ascii_isprint(ba[i]) && !g_ascii_iscntrl(ba[i]). (Those are
macros, so it's not as if that adds any subroutine call overhead.)
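As a rough illustration of the test described above, here is a minimal
sketch in plain C. The ascii_isprint() and ascii_iscntrl() helpers are
hypothetical stand-ins for the GLib macros (assumed to be true only for
0x20-0x7E and for 0x00-0x1F plus 0x7F respectively), and
count_non_ascii() is an invented name, not code from the dialog.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for g_ascii_isprint()/g_ascii_iscntrl(): both are
 * assumed to be false for every octet >= 0x80. */
static bool ascii_isprint(unsigned char c) { return c >= 0x20 && c <= 0x7e; }
static bool ascii_iscntrl(unsigned char c) { return c < 0x20 || c == 0x7f; }

/* Count octets that are neither ASCII printable nor ASCII control
 * characters. Nothing is compared against '\0', so the result does not
 * depend on whether plain char is signed on this platform. */
static size_t count_non_ascii(const char *buf, size_t len)
{
    size_t n = 0;

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)buf[i];
        if (!ascii_isprint(c) && !ascii_iscntrl(c))
            n++;
    }
    return n;
}
```

With this, the four octets 'A', '\n', 0x80, 0xFF contain two octets that
are outside ASCII, whether char is signed or unsigned.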
Add some comments to explain what's being done in
ShowPacketBytesDialog::symbolizeBuffer() while we're at it. (Not one of
the better uses of C++ polymorphism, giving "replace the octet at this
location with this sequence of octets" and "replace all octets equal to
this value with this sequence of octets" the same name, even though what
they do differs significantly. I would have called one replace_at and
the other replace_all or something such as that, but the Qt developers
didn't ask me....)
Add an RPMBUILD_EXTRA_ARGS variable to CMakeLists.txt and use it in
GitLab CI to define __cmake_builddir. This should let ccache work with
our RPM builds.
hex_str_to_bytes_encoding() consumes pairs of hex digits (and an
optional separator) and turns them into bytes. It can return a pointer
to the character after the last digit consumed. Don't advance
the end pointer past a single unpaired digit, since that digit is not
consumed as part of the hex string returned.
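The end-pointer behavior can be sketched as follows. This is not the
hex_str_to_bytes_encoding() implementation; parse_hex_pairs() and
hex_value() are hypothetical names, and the single-character separator
handling is a simplifying assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper: decode pairs of hex digits, optionally separated
 * by sep (pass '\0' for no separator), into out. Returns the number of
 * bytes written; *endptr is set to the character after the last digit
 * that was actually consumed. */
static size_t parse_hex_pairs(const char *s, char sep,
                              unsigned char *out, size_t out_len,
                              const char **endptr)
{
    size_t n = 0;
    const char *p = s;
    const char *end = s;

    assert(s != NULL && out != NULL && endptr != NULL);
    while (n < out_len) {
        int hi = hex_value(p[0]);
        if (hi < 0)
            break;
        int lo = hex_value(p[1]);
        if (lo < 0)
            break;              /* unpaired digit: leave it unconsumed */
        out[n++] = (unsigned char)((hi << 4) | lo);
        p += 2;
        end = p;                /* just past the last consumed digit */
        if (sep != '\0' && *p == sep)
            p++;                /* skip the separator; end does not move */
    }
    *endptr = end;
    return n;
}
```

For the input "0a:1b:2" with ':' as the separator, two bytes are
produced and the end pointer is left right after "1b", not after the
trailing "2".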
tvb_get_string_bytes() can pass back the end offset. If conversion
fails, return the initial offset instead of zero to make repeated
calls easier in cases where the full length is not decoded due to
errors.
Relatedly, no dissector currently uses this return value, because
until now it has not been useful.
GitLab CI builds RPMs in a different directory for each pipeline
($CI_PROJECT_DIR/build/packaging/rpm/BUILD/wireshark-<version>), so set
base_dir to the build directory and enable absolute_paths_in_stderr.
Fix our cache directory max sizes as well.
The proto.h APIs expect valid UTF-8 so replace uses of format_text()
with a label copy function that just does formatting and does not
check for encoding errors. Avoid multiple levels of temporary
string allocations.
Make sure the copy does not truncate a multibyte character and
produce invalid strings. Add debug checks for UTF-8 encoding errors
instead.
We escape C0 and C1 control codes and ASCII whitespace (and bell).
Overall the goal is to be more efficient and to help detect misuse
of the APIs by callers that pass invalid UTF-8.
Add a unit test for ws_label_strcat.
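To illustrate the "do not truncate a multibyte character" point, here is
a minimal sketch. It is not ws_label_strcat(): the function name
utf8_truncating_copy() is hypothetical, it does no escaping, and it
assumes the source string is already valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst (capacity dst_size, including
 * the terminating NUL), truncating only at a UTF-8 character boundary.
 * Assumes src is valid UTF-8. Returns the number of bytes copied. */
static size_t utf8_truncating_copy(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    size_t len = strlen(src);
    if (len > dst_size - 1) {
        len = dst_size - 1;
        /* If the first dropped byte is a continuation byte (10xxxxxx),
         * the cut falls inside a multibyte character; back up until the
         * cut sits in front of that character's lead byte. */
        while (len > 0 && ((unsigned char)src[len] & 0xC0) == 0x80)
            len--;
    }
    memcpy(dst, src, len);
    dst[len] = '\0';
    return len;
}
```

Copying "ab" followed by a two-byte character into a 4-byte buffer keeps
only "ab", rather than leaving a dangling lead byte at the end.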
Use the setup_frame_number to look for and create conversations
with srtcp_add_address, the same way as done in srtp_add_address.
This ensures that RTP and RTCP find the same conversation when
called back to back (as when handling them multiplexed on the
same conversation).
Related to #18460.
As far as I can tell, get_unicode_or_ascii_string() always
nul-terminates the string (as it should), so remove the g_strlcpy()
copy that can truncate the string and produce invalid UTF-8.
This avoids having general-purpose decoding happening in
non-DLL-exported functions defined in a dissector for #18478,
and removes unused functions and avoids duplicate decoding.
This also removes unnecessary early exit conditions for #18145.
Unit test cases for varint decoding are added to verify this.
If a character is not a valid Unicode codepoint, i.e. one of
the code points reserved for surrogate pairs or a code point
above 0x10FFFF, don't add it to a wmem_strbuf when converting
from other encodings but add a replacement character instead, by
using a new wmem_strbuf_append_unichar_validated() function.
Now we produce valid UTF-8 in various situations where UCS-2 or UTF-32
can encode unpaired surrogate codepoints. Consolidate some related
checks that are now redundant.
Also add a replacement character to the end of invalid UCS-2 strings
with an odd number of bytes, as done with UTF-16 and UTF-32.
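The validation rule above can be sketched in standalone C. This is not
the wmem_strbuf_append_unichar_validated() implementation; the names
is_valid_scalar() and encode_utf8_validated() are hypothetical, and the
sketch writes into a caller-supplied 4-byte buffer instead of a
wmem_strbuf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REPLACEMENT_CHAR 0xFFFDu   /* U+FFFD REPLACEMENT CHARACTER */

/* A code point is acceptable if it is at most 0x10FFFF and is not one
 * of the code points reserved for surrogate pairs (0xD800-0xDFFF). */
static int is_valid_scalar(uint32_t cp)
{
    return cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
}

/* Hypothetical helper: encode cp as UTF-8 into out, substituting U+FFFD
 * for invalid code points. Returns the number of bytes written (1-4). */
static size_t encode_utf8_validated(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (!is_valid_scalar(cp))
        cp = REPLACEMENT_CHAR;

    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}
```

An unpaired surrogate such as 0xD800, or a value such as 0x110000, comes
out as the three bytes EF BF BD (U+FFFD) instead of an invalid sequence.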
Fix #18508
QFontMetrics::leading() was zero for Consolas on Windows in Qt5, but is
nonzero in Qt6. This revealed that we were inconsistently using height()
and leading() to calculate our line height. Just use lineSpacing()
instead.
Fixes #18438.
Rename tvb_get_nstringz0() to tvb_get_raw_bytes_as_stringz()
to reflect the fact that this function does not return
a string (UTF-8 internal text string).
Remove tvb_get_stringz() because it is unused and just seems
dangerous.
In PacketCable MTA capabilities, the length of the capability
is stored as hex digits in ASCII. If bogus, the incorrect value
is added as an expert info. Ensure that it's valid UTF-8 and
formatted for display when added to the tree.
Fix #18437
3GPP came up with a special encoding of TDMA frame number, which reduces
the amount of bits needed to carry it from 32 to 16. This encoding is
not only employed on the radio interface (GSM RR), but also on the
A-bis/RSL interface which is used between BTS and BSC nodes.
From the user's perspective, the parsed RFN value is a lot more meaningful
than the T1/T2/T3 variables used on the wire. The GSM RR dissector
does show the parsed RFN value together with these variables, while the RSL
dissector does not. Let's show it in the RSL dissector too.
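For reference, the reduced frame number can be recovered from the three
variables roughly as sketched below. The formula (FN modulo 42432 =
51 * ((T3 - T2) mod 26) + T3 + 51 * 26 * T1') is the one given for the
Starting Time element in 3GPP TS 44.018, to the best of my knowledge;
the function name rfn_from_t1t2t3() is hypothetical and this is not the
dissector code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: compute the reduced frame number (FN mod 42432)
 * from T1' (FN div 1326, mod 32), T2 (FN mod 26) and T3 (FN mod 51).
 * Callers are assumed to have range-checked the values already. */
static uint32_t rfn_from_t1t2t3(unsigned t1, unsigned t2, unsigned t3)
{
    assert(t1 < 32 && t2 < 26 && t3 < 51);

    /* (T3 - T2) mod 26, kept non-negative: C's % can return a negative
     * result when the left operand is negative. */
    int d = ((int)t3 - (int)t2) % 26;
    if (d < 0)
        d += 26;

    return 51u * (uint32_t)d + t3 + 51u * 26u * t1;
}
```

For example, frame number 1000 is carried as T1' = 0, T2 = 12, T3 = 31,
and the function gives back 1000 for those values.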