make-manuf.py: Expand a comment.

Change-Id: I545a63bb4a045ba93d1ad1ee82315315bdbb3c9e
Reviewed-on: https://code.wireshark.org/review/29508
Reviewed-by: Anders Broman <a.broman58@gmail.com>
Gerald Combs 2018-09-09 09:40:34 -07:00 committed by Anders Broman
parent ce3d7840c1
commit cba7dfb40b
1 changed file with 7 additions and 1 deletion

@@ -70,7 +70,13 @@ def shorten(manuf):
     # Remove all spaces
     manuf = re.sub('\s+', '', manuf)
     # Truncate all names to a reasonable length, say, 8 characters.
-    # If the string contains UTF-8, this may be substantially more than 8 bytes.
+    # If the string contains UTF-8, this may be substantially more than 8
+    # bytes. It might also be less than 8 visible characters. Python slices
+    # unicode strings by code point, which is better than raw bytes but not
+    # as good as grapheme clusters. https://bugs.python.org/issue30717
+    #
+    # In our case 'Savroni̇k Elektroni̇k' is truncated to 'Savroni̇', which
+    # is 7 visible characters, 8 code points, and 9 bytes.
     manuf = manuf[:8]
     if manuf.lower() == orig_manuf.lower():
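
The counts quoted in the expanded comment can be checked with a short sketch. It assumes the name is spelled with a plain 'i' followed by U+0307 COMBINING DOT ABOVE, and it slices the raw name rather than running the full shorten() logic. visible_len is a hypothetical helper, not part of make-manuf.py; it only approximates visible characters by skipping combining marks and is not full grapheme cluster segmentation.

```python
import unicodedata

# Assumption: each dotted 'i' is 'i' + U+0307 (COMBINING DOT ABOVE).
name = 'Savroni\u0307k Elektroni\u0307k'

# Python slices str objects by code point, as the comment describes.
truncated = name[:8]


def visible_len(text):
    """Hypothetical helper: count code points that are not combining marks.

    This is a rough stand-in for counting grapheme clusters, not a full
    implementation of Unicode text segmentation.
    """
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


code_points = len(truncated)
utf8_bytes = len(truncated.encode('utf-8'))
visible = visible_len(truncated)

print(visible, code_points, utf8_bytes)  # prints: 7 8 9
```

The combining mark occupies one code point and two UTF-8 bytes but adds no visible character, which is why the three counts differ.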