Commit Graph

331 Commits

Author SHA1 Message Date
Pascal Quantin cfe11b1097 Add ENC_ASCII_7BITS encoding
Change-Id: I01ec87ff4181afb5b2de487fd5f5200f8d62f17d
Reviewed-on: https://code.wireshark.org/review/1088
Reviewed-by: Pascal Quantin <pascal.quantin@gmail.com>
2014-04-13 20:02:52 +00:00
Guy Harris cb16dff992 Get rid of more tvb_get_nstringz* calls.
Add an FT_STRINGZPAD type, for null-padded strings (typically
fixed-length fields, where the string can be up to the length of the
field, and is null-padded if it's shorter than that), and use it.  Use
IS_FT_STRING() in more cases, so that less code needs to know what types
are string types.

Add a tvb_get_stringzpad() routine, which gets null-padded strings.
Currently, it does the same thing that tvb_get_string_enc() does, but
that might change if we don't store string values as null-terminated
strings.

Change-Id: I46f56e130de8f419a19b56ded914e24cc7518a66
Reviewed-on: https://code.wireshark.org/review/1082
Reviewed-by: Guy Harris <guy@alum.mit.edu>
2014-04-12 22:27:22 +00:00
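
The null-padded string handling described in cb16dff992 can be illustrated with a minimal sketch. This is not the Wireshark implementation: `get_stringzpad_sketch` is a hypothetical helper that works on a plain byte buffer rather than a tvbuff and does no character-encoding conversion. It only shows the rule stated in the commit message: the string may fill the whole field, and is null-padded if it is shorter.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a Wireshark API): copy a null-padded field of
 * exactly field_len bytes into a newly allocated, null-terminated string.
 * The caller frees the result. */
char *get_stringzpad_sketch(const unsigned char *field, size_t field_len)
{
    /* The string ends at the first NUL, or at the end of the field when the
     * string occupies the whole field and there is no padding. */
    const unsigned char *nul = memchr(field, '\0', field_len);
    size_t str_len = nul ? (size_t)(nul - field) : field_len;

    char *out = malloc(str_len + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, field, str_len);
    out[str_len] = '\0';
    return out;
}
```

For an 8-byte field holding `ab` followed by six NUL bytes, the sketch returns `ab`; for a field holding `abcdefgh` it returns all eight characters.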
Guy Harris ae127f23fa Add Mac Roman and DOS CP437.
Change-Id: Ib96f2cf4ea71cd0cc2c703d58b9d254bf4c1248a
Reviewed-on: https://code.wireshark.org/review/1077
Reviewed-by: Guy Harris <guy@alum.mit.edu>
2014-04-12 08:54:06 +00:00
AndersBroman df80f3133c Fix a typo
Change-Id: Ie32a140e49140a92c69cb6fa84cdc55402516830
Reviewed-on: https://code.wireshark.org/review/935
Reviewed-by: Anders Broman <a.broman58@gmail.com>
2014-04-03 04:39:06 +00:00
AndersBroman 67cc462941 Don't use external functions internally, to avoid multiple checks.
tvb_captured_length()                      -> tvb->length
tvb_captured_length_remaining(tvb, offset) -> (Inline) _tvb_captured_length_remaining()
tvb_get_ptr()                              -> ensure_contiguous()

Change-Id: I3540854c9b51ca9c3319b030c7d91b4aff976a26
Reviewed-on: https://code.wireshark.org/review/922
Reviewed-by: Anders Broman <a.broman58@gmail.com>
2014-04-03 04:08:45 +00:00
AndersBroman 84bc050a89 In the string handling routines, don't call tvb_get_...() inside the loops; instead, get the pointer and read
directly, avoiding the overhead of calling fast_ensure_contiguous()
repeatedly.

Change-Id: Ib5eee87ef9d49cb4e46b3b9c3d3db0134d3c4a32
Reviewed-on: https://code.wireshark.org/review/889
Reviewed-by: Anders Broman <a.broman58@gmail.com>
2014-04-03 04:08:00 +00:00
AndersBroman 01b65269bf Inlining some tvb functions gives a 6% performance gain according to
valgrind.

Change-Id: I7881f8c1407d422a3f1ad5bc17e975b45703db74
Reviewed-on: https://code.wireshark.org/review/909
Reviewed-by: Evan Huus <eapache@gmail.com>
Reviewed-by: Anders Broman <a.broman58@gmail.com>
2014-04-01 15:41:33 +00:00
Guy Harris d156deff04 Rename "size" variable to "length", to match other string routines.
Change-Id: I385345cfafc7e7b4d3e66713fb0fe570b39f438d
Reviewed-on: https://code.wireshark.org/review/865
Reviewed-by: Guy Harris <guy@alum.mit.edu>
2014-03-29 20:23:09 +00:00
Evan Huus 521bab1e1c Use sized strbufs when extracting tvb strings
We rarely know exactly how long a string will be, but we frequently have a good
lower bound (that's better than the default strbuf size of 16). Starting at that
size probably reduces the amount of allocation/copying needed.

Also make use of the new _finalize() method to save memory and avoid constness
problems.

Change-Id: I3f043bd12c1ccfce5990168fb6531ecd287bec5b
Reviewed-on: https://code.wireshark.org/review/856
Reviewed-by: Guy Harris <guy@alum.mit.edu>
Reviewed-by: Evan Huus <eapache@gmail.com>
2014-03-29 20:01:17 +00:00
Bill Meier 99b55eb7a6 Fix a typo in a comment; use consistent indentation matching that specified by the editor modelines.
Change-Id: I6d4ad3675ec9099913c8a32ad1f2758316158f68
Reviewed-on: https://code.wireshark.org/review/587
Reviewed-by: Bill Meier <wmeier@newsguy.com>
Tested-by: Bill Meier <wmeier@newsguy.com>
2014-03-10 13:27:03 +00:00
Alexis La Goutte 296591399f Remove all $Id$ from top of file
(Using sed : sed -i '/^ \* \$Id\$/,+1 d')

Manually fix some typos (in export_object_dicom.c and crc16-plain.c)

Change-Id: I4c1ae68d1c4afeace8cb195b53c715cf9e1227a8
Reviewed-on: https://code.wireshark.org/review/497
Reviewed-by: Anders Broman <a.broman58@gmail.com>
2014-03-04 14:27:33 +00:00
Guy Harris 8d234a0d8c More tvbuff API deprecation, comment expansion, and documentation updates.
Do with tvb_get_stringz() what was done with tvb_get_string().

Redo the comments for the string get routines to try to give more detail
in a fashion that's a bit less hard to read.

Warn, in comments, of the problems with using
tvb_get_string()/tvb_get_stringz() (i.e., if your strings are non-ASCII,
all bytes with the 8th bit set are going to be replaced by the Unicode
REPLACEMENT CHARACTER, and displayed as such).

Warn, in a comment, of the problems with tvb_get_const_stringz() (i.e.,
it gives you raw bytes, rather than guaranteed-to-be-valid UTF-8).

Update documentation and release notes appropriately.

Change-Id: Ibd3efb92a203861f507ce71bc8d04d19d9d38a93
Reviewed-on: https://code.wireshark.org/review/327
Reviewed-by: Guy Harris <guy@alum.mit.edu>
2014-02-26 22:04:08 +00:00
Bill Meier 11b5c15fdb Remove trailing whitespace
Change-Id: I8116f63ff88687c8db3fd6e8e23b22ab2f759af0
Reviewed-on: https://code.wireshark.org/review/385
Reviewed-by: Bill Meier <wmeier@newsguy.com>
Tested-by: Bill Meier <wmeier@newsguy.com>
2014-02-25 20:46:49 +00:00
Evan Huus 22149c5523 TVB API deprecations and cleanup
- rename tvb_length and similar to tvb_captured_length and similar; leave
  #defines in place for backwards-compat, but mark them clearly as deprecated in
  code comments and in checkAPI
- remove tvb_get_string as C code and just leave a #define in place for
  backwards-compat; mark it clearly as deprecated in code comment and checkAPI
- update READMEs and sample dissector for all of the above
- while in the neighbourhood, make checkAPI skip (and warn) for missing files
  instead of bailing on the whole check, so subsequent files still get checked

Change-Id: I32fc437896ca86ca73e9b49d5f50400adf8ec5ad
Reviewed-on: https://code.wireshark.org/review/311
Reviewed-by: Evan Huus <eapache@gmail.com>
2014-02-22 15:02:01 +00:00
Guy Harris 4d9475e4ef Get rid of tvb_get_faked_unicode() - tvb_get_string_enc() does the job
better.

We don't need eventlog_get_unicode_string_length() in the eventlog
dissector, either - tvb_unicode_strsize() does the job just as well.

svn path=/trunk/; revision=54874
2014-01-21 09:56:34 +00:00
Guy Harris 9cdf8dd5f5 Don't do the byte-with-8th-bit-set-to-REPLACEMENT-CHARACTER mapping for
UTF-8 strings.

Add that mapping for null-terminated ASCII strings.

Factor out some common parts of comments about string routines, and
clean up some other comments.

svn path=/trunk/; revision=54868
2014-01-21 01:23:29 +00:00
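
The mapping described in 9cdf8dd5f5 and 933e95c8ec (bytes with the 8th bit set become the Unicode REPLACEMENT CHARACTER) can be sketched as follows. This is an illustration only, assuming a plain byte buffer as input; `ascii_to_utf8_sketch` is a hypothetical name, not a function from the tvbuff API. U+FFFD is encoded in UTF-8 as the three bytes EF BF BD.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not a Wireshark API): convert bytes that are supposed
 * to be ASCII into valid UTF-8. Any byte with the 8th bit set is not ASCII
 * and is replaced by U+FFFD REPLACEMENT CHARACTER. The caller frees the
 * result. */
char *ascii_to_utf8_sketch(const unsigned char *src, size_t len)
{
    /* Worst case: every input byte expands to the 3-byte form of U+FFFD. */
    char *out = malloc(len * 3 + 1);
    size_t o = 0;

    if (out == NULL)
        return NULL;
    for (size_t i = 0; i < len; i++) {
        if (src[i] < 0x80) {
            out[o++] = (char)src[i];      /* 7-bit ASCII passes through */
        } else {
            out[o++] = (char)0xEF;        /* UTF-8 encoding of U+FFFD */
            out[o++] = (char)0xBF;
            out[o++] = (char)0xBD;
        }
    }
    out[o] = '\0';
    return out;
}
```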
Martin Kaiser 26701ed0f7 remove todo comments
svn path=/trunk/; revision=54865
2014-01-20 21:56:38 +00:00
Martin Kaiser 933e95c8ec tvb_get_string(): replace 8bit characters with the unicode replacement char
svn path=/trunk/; revision=54864
2014-01-20 21:39:00 +00:00
Guy Harris 9228c72ef0 Explain casting away const.
svn path=/trunk/; revision=54816
2014-01-15 08:35:55 +00:00
Jakub Zawadzki d1dcee936b Move defines for helping with UTF-16 surrogate pairs to wsutil/unicode-utils.h
tvbuff version was moved, but with 'or' optimization from packet-json.

svn path=/trunk/; revision=54632
2014-01-07 21:55:49 +00:00
Bill Meier b26f50cbb1 (Trivial) explicitely --> explicitly
svn path=/trunk/; revision=54594
2014-01-04 17:29:20 +00:00
Pascal Quantin 6ebc058f47 Add proto_tree_add_ts_23_038_7bits_item() / tvb_get_ts_23_038_7bits_string() functions and update dissectors to use it.
Remove gsm_sms_char_7bit_unpack() / gsm_sms_chars_to_utf8() functions.
Update documentation a bit.

svn path=/trunk/; revision=54534
2014-01-01 14:33:19 +00:00
Jakub Zawadzki a65cbe8e7b Add new function: tvb_skip_guint8()
svn path=/trunk/; revision=54505
2013-12-30 23:58:45 +00:00
Guy Harris a8ac118885 Use Unicode REPLACEMENT CHARACTER for TS 23.038 errors, as we do for
unassigned code points in some other character sets.

svn path=/trunk/; revision=54477
2013-12-27 23:55:23 +00:00
Guy Harris 5f91a0afc7 Oops, escape characters shouldn't cause anything to be added to the
string, they should just cause TRUE to be returned - it's the *next*
code point that gets treated specially and, after mapping, added to the
string.

svn path=/trunk/; revision=54431
2013-12-24 01:03:59 +00:00
Evan Huus 5a81522aa2 Make sure uchar is always initialized. Just use '?' since the comment indicates
that it's a weird (undefined?) case.

svn path=/trunk/; revision=54430
2013-12-24 00:54:30 +00:00
Guy Harris bd8aeb9054 Update some comments.
svn path=/trunk/; revision=54429
2013-12-24 00:23:09 +00:00
Guy Harris 0d7a48a8bf Add an ENC_3GPP_TS_23_038 encoding, for the standard SMS alphabet in a
bit-packed string, and use it in some places.

svn path=/trunk/; revision=54428
2013-12-24 00:20:09 +00:00
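
The "bit-packed string" mentioned in 0d7a48a8bf refers to the 7-bit packing used for the SMS default alphabet, where septets are packed back to back starting at the least significant bit of the first octet. A minimal sketch of the unpacking step follows. It is not the Wireshark code: `unpack_septets_sketch` is a hypothetical name, and the sketch returns raw septet values without mapping them through the TS 23.038 alphabet tables or handling escape characters.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a Wireshark API): unpack up to n_septets 7-bit
 * values from a packed octet buffer. Septet i starts at bit 7*i of the
 * stream, counting from the least significant bit of in[0].
 * Returns the number of septets actually written to out. */
size_t unpack_septets_sketch(const uint8_t *in, size_t in_len,
                             uint8_t *out, size_t n_septets)
{
    size_t i;

    for (i = 0; i < n_septets; i++) {
        size_t bit_off = i * 7;
        size_t byte = bit_off / 8;
        unsigned shift = (unsigned)(bit_off % 8);
        unsigned val;

        if (byte >= in_len)
            break;                          /* ran out of input */
        val = (unsigned)in[byte] >> shift;
        if (shift > 1) {
            /* The septet straddles two octets; take the rest from the next. */
            if (byte + 1 >= in_len)
                break;                      /* truncated septet: stop */
            val |= (unsigned)in[byte + 1] << (8 - shift);
        }
        out[i] = (uint8_t)(val & 0x7F);
    }
    return i;
}
```

With the five octets E8 32 9B FD 06 and a count of five septets, the sketch yields 0x68 0x65 0x6C 0x6C 0x6F, which are the default-alphabet values for "hello" (the lowercase Latin letters share their values with ASCII).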
Guy Harris eb3ff1396f Fix warning.
svn path=/trunk/; revision=54375
2013-12-23 02:19:27 +00:00
Guy Harris 8a5d226894 Fix the offset constant in SURROGATE_VALUE(), and add rather than OR it.
Expand a bunch of comments, discussing what various routines do and
should perhaps do.

Pull the core of tvb_get_ucs_2_string()/tvb_get_ucs_2_stringz() and
tvb_get_ucs_4_string()/tvb_get_ucs_4_stringz() into common routines, as
we did for tvb_get_utf_16_string()/tvb_get_utf_16_stringz().

svn path=/trunk/; revision=54374
2013-12-23 01:25:20 +00:00
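
The SURROGATE_VALUE() fix in 8a5d226894 concerns how a UTF-16 surrogate pair is combined into a code point. The sketch below shows the standard arithmetic; `surrogate_value_sketch` is a hypothetical function written for illustration, not the macro from the source tree. The offset is 0x10000, and it has to be added: the shifted lead-surrogate bits can themselves set bit 16, so OR-ing the offset in would lose it for code points at or above U+20000.

```c
#include <stdint.h>

/* Hypothetical helper (not the Wireshark macro): combine a UTF-16 lead
 * surrogate (0xD800..0xDBFF) and trail surrogate (0xDC00..0xDFFF) into a
 * Unicode code point in the range U+10000..U+10FFFF. */
uint32_t surrogate_value_sketch(uint16_t lead, uint16_t trail)
{
    return UINT32_C(0x10000)
         + (((uint32_t)lead - UINT32_C(0xD800)) << 10)
         + ((uint32_t)trail - UINT32_C(0xDC00));
}
```

For example, the pair D840 DC00 is U+20000: the lead contributes 0x40 << 10 = 0x10000, and adding the 0x10000 offset gives 0x20000, whereas OR-ing the two would give 0x10000.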
Bill Meier e348c13deb (Trivial)
- Minor whitespace changes;
- Fix a typo;
- Add editor modelines to tvbuff_subset.c

svn path=/trunk/; revision=54364
2013-12-22 15:47:17 +00:00
Bill Meier 400a1fcd60 Use G_GINT64_CONSTANT(n) rather than 'nLL' when defining constants;
Fix a typo in a comment.

svn path=/trunk/; revision=54357
2013-12-22 14:43:35 +00:00
Jakub Zawadzki 1f88687d3f tvb_get_ucs_4_string: increase offset by 4
copy&paste of tvb_get_ucs_2_string?

svn path=/trunk/; revision=54353
2013-12-22 10:45:22 +00:00
Guy Harris fc7a77189d Add UCS-4 support, and use it.
Shuffle the character ENC_ values around a bit, keeping the Unicode
encodings together, moving the Windows code pages (only one for now)
after the ISO 8859 encodings, and putting "I can't believe it's not
ASCII!" at the end.

Fix some comment typos, and update another comment, while we're at it.

svn path=/trunk/; revision=54351
2013-12-22 08:45:57 +00:00
Guy Harris f231a273f2 Add the rest of ISO-8859-n, thanks to Jakub's "generate a mapping table"
program.

Put the character-encoding cases in order.

svn path=/trunk/; revision=54344
2013-12-21 21:55:46 +00:00
Guy Harris 92f177ec97 Get rid of tvb_get_unicode_string() and tvb_get_unicode_stringz();
instead, have static routines to get UCS-2 (no surrogate pairs) and
UTF-16 (with surrogate pairs) strings, with the routines to handle
UTF-16 actually handling surrogate pairs.

Update some out-of-date comments while we're at it.

svn path=/trunk/; revision=54318
2013-12-21 01:42:41 +00:00
Evan Huus a6415ece0a Rename a couple of to_str functions to have ep_ in the name. This makes it
obvious that the returned string is ephemeral, and opens up the original names
in the API for versions that take a wmem pool (and thus can work in any scope).

svn path=/trunk/; revision=54249
2013-12-19 15:49:09 +00:00
Jakub Zawadzki 099294dd16 Add charset table for ISO/IEC 8859-9 (ENC_ISO_8859_9)
svn path=/trunk/; revision=54239
2013-12-18 23:32:06 +00:00
Evan Huus 8f665d9b36 Add a sixteenth element to all BCD digit sets to avoid garbage values when
decoding corrupt bytes. Some of these digit sets could probably be
deduplicated...

svn path=/trunk/; revision=54224
2013-12-18 15:54:32 +00:00
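
The reasoning in 8f665d9b36 is that a 4-bit nibble can take sixteen values, so a digit table with only fifteen entries is read out of bounds when a corrupt byte contains 0xF. A minimal sketch, under stated assumptions: the table contents, the '?' filler, the low-nibble-first order, and the names `bcd_digits_sketch` and `bcd_to_string_sketch` are all hypothetical and not taken from the Wireshark digit sets.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-entry digit set: every possible nibble value 0x0..0xF has
 * an entry, so no nibble can index past the end of the table. */
static const char bcd_digits_sketch[16] = {
    '0', '1', '2', '3', '4', '5', '6', '7',
    '8', '9', '?', '?', '?', '?', '?', '?'
};

/* Decode len BCD octets, low nibble first (an assumption of this sketch),
 * into out, which must have room for 2 * len + 1 characters.
 * Returns the number of characters written, excluding the terminator. */
size_t bcd_to_string_sketch(const uint8_t *in, size_t len, char *out)
{
    size_t o = 0;

    for (size_t i = 0; i < len; i++) {
        out[o++] = bcd_digits_sketch[in[i] & 0x0F];
        out[o++] = bcd_digits_sketch[(in[i] >> 4) & 0x0F];
    }
    out[o] = '\0';
    return o;
}
```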
Jakub Zawadzki 0de43ce2dd Create sign extension routines in <wsutil/sign_ext.h>, use them in a few places.
svn path=/trunk/; revision=54197
2013-12-17 21:36:33 +00:00
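
Sign extension, as referenced in 0de43ce2dd, widens a value stored in fewer bits than its container by copying its sign bit into the unused upper bits. The sketch below shows one way to do it for a 32-bit container; `sign_extend32_sketch` is a hypothetical name and signature chosen for illustration, and the routines in <wsutil/sign_ext.h> may differ.

```c
#include <stdint.h>

/* Hypothetical helper: treat the low no_of_bits bits of val as a two's
 * complement number and extend its sign through bit 31. Values of
 * no_of_bits outside 1..31 leave val unchanged. */
uint32_t sign_extend32_sketch(uint32_t val, int no_of_bits)
{
    if (no_of_bits <= 0 || no_of_bits >= 32)
        return val;
    /* If the sign bit of the narrow value is set, fill the upper bits. */
    if (val & (UINT32_C(1) << (no_of_bits - 1)))
        val |= UINT32_MAX << no_of_bits;
    return val;
}
```

For a 12-bit field, 0x800 (the most negative value) becomes 0xFFFFF800, while 0x7FF stays 0x7FF.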
Michael Mann 79d336c664 Handle signed integers > 32 bits. Bug 8454 (https://bugs.wireshark.org/bugzilla/show_bug.cgi?id=8454)
svn path=/trunk/; revision=54183
2013-12-17 16:50:33 +00:00
Martin Kaiser a07c0ff146 add support for ISO 8859-5
svn path=/trunk/; revision=54132
2013-12-15 19:13:31 +00:00
Guy Harris 30ab019f2b In tvb_get_unicode_string(), if the byte count is odd, ignore the last
byte.  (We should perhaps add an expert info indication in those cases.)

svn path=/trunk/; revision=54074
2013-12-13 22:35:50 +00:00
Guy Harris 562348fbb8 Add ENC_ISO_8859_1.
Move the Wikipedia links for the code page layouts in front of the
tables whose contents reflect the code page layouts.

svn path=/trunk/; revision=53837
2013-12-08 01:05:35 +00:00
Jakub Zawadzki 0e5bc8a49c Add string encoding for ISO/IEC 8859-2 (ENC_ISO_8859_2)
svn path=/trunk/; revision=53826
2013-12-07 15:02:55 +00:00
Jakub Zawadzki 113b078a4d Add new string proto encoding for windows-1250 (ENC_WINDOWS_1250)
- Move windows-1250 to unicode encoding table to charset.c
- Add tvb_get_string_unichar2, tvb_get_stringz_unichar2 functions which recode tvb-string to UTF-8.

svn path=/trunk/; revision=53819
2013-12-07 10:10:03 +00:00
Jakub Zawadzki b3c93326bc Remove #if 0 inverse_bit_mask8 array.
It was only used by tvb_get_bits_buf (removed in r53183).

svn path=/trunk/; revision=53818
2013-12-07 09:14:35 +00:00
Jakub Zawadzki c1ef044de5 Move tvb_uncompress() to tvbuff_zlib.c
svn path=/trunk/; revision=53815
2013-12-06 23:23:44 +00:00
Jakub Zawadzki 5ac6474c94 Rename some of pint.h macros to match common style (bits number on the end).
pntohs  -> pntoh16
pntohl  -> pntoh32
pletohs -> pletoh16
pletohl -> pletoh32
phtons  -> phton16
phtonl  -> phton32

svn path=/trunk/; revision=53652
2013-11-29 18:59:06 +00:00
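
The renamed macros in 5ac6474c94 follow a pattern where the trailing number gives the width in bits and the prefix gives the byte order ("n" for network, that is big-endian, and "le" for little-endian). A rough sketch of what such pointer-to-host readers do is given below. These are hypothetical functions with a `_sketch` suffix, written as an illustration; they are not the pint.h macros themselves.

```c
#include <stdint.h>

/* Read a big-endian (network order) 16-bit value from an unaligned pointer. */
static inline uint16_t pntoh16_sketch(const void *p)
{
    const uint8_t *b = (const uint8_t *)p;
    return (uint16_t)(((unsigned)b[0] << 8) | (unsigned)b[1]);
}

/* Read a big-endian (network order) 32-bit value from an unaligned pointer. */
static inline uint32_t pntoh32_sketch(const void *p)
{
    const uint8_t *b = (const uint8_t *)p;
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* Read a little-endian 16-bit value from an unaligned pointer. */
static inline uint16_t pletoh16_sketch(const void *p)
{
    const uint8_t *b = (const uint8_t *)p;
    return (uint16_t)(((unsigned)b[1] << 8) | (unsigned)b[0]);
}

/* Read a little-endian 32-bit value from an unaligned pointer. */
static inline uint32_t pletoh32_sketch(const void *p)
{
    const uint8_t *b = (const uint8_t *)p;
    return ((uint32_t)b[3] << 24) | ((uint32_t)b[2] << 16) |
           ((uint32_t)b[1] << 8)  |  (uint32_t)b[0];
}
```

Reading byte by byte keeps the result independent of the host's own byte order and avoids unaligned word loads.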
Alexis La Goutte f482d8737f Fix unused-const-variable error when building with clang 3.4
svn path=/trunk/; revision=53512
2013-11-22 14:52:25 +00:00