ftypes: Do not sanitize strings for UTF-8 errors

The ftype itself is encoding agnostic. Literal display filter
strings, for example, can legally contain invalid UTF-8.

Maybe that shouldn't be legal, but rejecting it requires a
user-friendly diagnostic message, not silently sanitizing the string
as is done currently (only a debug message is printed in that case).

Do the debug checks in proto_tree_set_string() instead. That
still detects dissector code that might need fixing, which was
the purpose of this check.
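
The difference between checking and sanitizing can be sketched in
plain C. This is only an illustration, not Wireshark code:
utf8_is_valid(), check_string() and sanitize_in_place() are
hypothetical names, and the real WS_UTF_8_CHECK and
WS_UTF_8_SANITIZE_STRBUF macros may behave differently (for instance
in what they substitute for invalid bytes).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not Wireshark API): true if s[0..len) is
 * well-formed UTF-8. Rejects overlong forms, surrogates and code
 * points above U+10FFFF. */
static bool utf8_is_valid(const unsigned char *s, size_t len)
{
    static const unsigned int min_cp[] = { 0, 0x80, 0x800, 0x10000 };
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        size_t n;        /* continuation bytes expected */
        unsigned int cp; /* decoded code point */

        if (c < 0x80) { i++; continue; }
        else if ((c & 0xE0) == 0xC0) { n = 1; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { n = 2; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { n = 3; cp = c & 0x07; }
        else return false; /* stray continuation or invalid lead byte */

        if (len - i <= n)
            return false;  /* sequence is truncated */
        for (size_t k = 1; k <= n; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return false;
            cp = (cp << 6) | (s[i + k] & 0x3F);
        }
        if (cp < min_cp[n] || cp > 0x10FFFF ||
            (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
        i += n + 1;
    }
    return true;
}

/* Check only: report the problem, never touch the caller's bytes. */
static bool check_string(const char *value)
{
    assert(value != NULL);
    bool ok = utf8_is_valid((const unsigned char *)value, strlen(value));
    if (!ok)
        fprintf(stderr, "debug: string is not valid UTF-8\n");
    return ok;
}

/* Sanitize: crude stand-in that overwrites every non-ASCII byte of an
 * invalid string with '?', so the stored value no longer matches the
 * input. */
static void sanitize_in_place(char *value)
{
    assert(value != NULL);
    if (utf8_is_valid((const unsigned char *)value, strlen(value)))
        return;
    for (char *p = value; *p != '\0'; p++) {
        if ((unsigned char)*p >= 0x80)
            *p = '?';
    }
}
```

Given the four bytes "caf\xE9" (Latin-1, not valid UTF-8),
check_string() returns false and leaves the bytes alone, while
sanitize_in_place() rewrites them to "caf?". The first behaviour keeps
a filter literal comparing against exactly what the user typed.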

Improve documentation and add admonition for proto_tree_add_string().

Ping #18521.
João Valverde 2022-10-26 15:13:22 +01:00
parent c1cede8d7c
commit 76a6e2a2bf
3 changed files with 9 additions and 3 deletions

@@ -44,7 +44,6 @@ string_fvalue_set_strbuf(fvalue_t *fv, wmem_strbuf_t *value)
string_fvalue_free(fv);
fv->value.strbuf = value;
-WS_UTF_8_SANITIZE_STRBUF(fv->value.strbuf);
}
static char *
@@ -77,7 +76,6 @@ val_from_string(fvalue_t *fv, const char *s, size_t len, gchar **err_msg _U_)
else
fv->value.strbuf = wmem_strbuf_new(NULL, s);
-WS_UTF_8_SANITIZE_STRBUF(fv->value.strbuf);
return TRUE;
}

@@ -5010,6 +5010,8 @@ proto_tree_add_string(proto_tree *tree, int hfindex, tvbuff_t *tvb, gint start,
pi = proto_tree_add_pi(tree, hfinfo, tvb, start, &length);
DISSECTOR_ASSERT(length >= 0);
+WS_UTF_8_CHECK(value, -1);
proto_tree_set_string(PNODE_FINFO(pi), value);
return pi;
@@ -5059,7 +5061,6 @@ static void
proto_tree_set_string(field_info *fi, const char* value)
{
if (value) {
-/* String must be valid UTF-8. It is sanitized otherwise (if enabled at compile time). */
fvalue_set_string(&fi->value, value);
} else {
/*

@@ -1986,6 +1986,13 @@ proto_tree_add_oid_format(proto_tree *tree, int hfindex, tvbuff_t *tvb, gint sta
proto_tree. The value passed in should be a UTF-8 encoded null terminated
string, such as produced by tvb_get_string_enc(), regardless of the original
packet data.
String must be valid UTF-8 but do not format the string for display in any way,
for example by escaping unprintable characters, because this is packet data,
not a display string. Formatting is a concern of the UI. Doing that here would
change the meaning of the captured data and make display filtering very
unintuitive for special characters.
@param tree the tree to append this item to
@param hfindex field index
@param tvb the tv buffer of the current data