forked from osmocom/wireshark
febc842521
Change-Id: Iac5066ff97d26de1660c38b9cd3f17781a521823 Reviewed-on: https://code.wireshark.org/review/6949 Reviewed-by: Evan Huus <eapache@gmail.com>
358 lines
14 KiB
Text
358 lines
14 KiB
Text
1. Introduction
|
|
|
|
The 'wmem' memory manager is Wireshark's memory management framework, replacing
|
|
the old 'emem' framework which was removed in Wireshark 2.0.
|
|
|
|
In order to make memory management easier and to reduce the probability of
|
|
memory leaks, Wireshark provides its own memory management API. This API is
|
|
implemented inside epan/wmem/ and provides memory pools and functions that make
|
|
it easy to manage memory even in the face of exceptions (which many dissector
|
|
functions can raise).
|
|
|
|
Correct use of these functions will make your code faster, and greatly reduce
|
|
the chances that it will leak memory in exceptional cases.
|
|
|
|
Wmem was originally conceived in this email to the wireshark-dev mailing list:
|
|
https://www.wireshark.org/lists/wireshark-dev/201210/msg00178.html
|
|
|
|
2. Usage for Consumers
|
|
|
|
If you're writing a dissector, or other "userspace" code, then using wmem
|
|
should be very similar to using emem. All you need to do is include the header
|
|
(epan/wmem/wmem.h) and get a handle to a memory pool (if you want to *create*
|
|
a memory pool, see the section "3. Usage for Producers" below).
|
|
|
|
A memory pool is an opaque pointer to an object of type wmem_allocator_t, and
|
|
it is the very first parameter passed to almost every call you make to wmem.
|
|
Other than that parameter (and the fact that functions are prefixed wmem_
|
|
instead of ep_ or se_) usage is exactly like that of emem. For example:
|
|
|
|
wmem_alloc(myPool, 20);
|
|
|
|
allocates 20 bytes in the pool pointed to by myPool.
|
|
|
|
2.1 Available Pools
|
|
|
|
2.1.1 (Sort Of) Global Pools
|
|
|
|
Dissectors that include the wmem header file will have three pools available
|
|
to them automatically: wmem_packet_scope(), wmem_file_scope() and
|
|
wmem_epan_scope();
|
|
|
|
The packet pool is scoped to the dissection of each packet, meaning that any
|
|
memory allocated in it will be automatically freed at the end of the current
|
|
packet. The file pool is similarly scoped to the dissection of each file,
|
|
meaning that any memory allocated in it will be automatically freed when the
|
|
current capture file is closed.
|
|
|
|
NB: Using these pools outside of the appropriate scope (eg using the packet
|
|
pool when there isn't a packet being dissected) will throw an assertion.
|
|
See the comment in epan/wmem/wmem_scopes.c for details.
|
|
|
|
The epan pool is scoped to the library's lifetime - memory allocated in it is
|
|
not freed until epan_cleanup() is called, which is typically at the very end of
|
|
the program.
|
|
|
|
2.1.2 Pinfo Pool
|
|
|
|
Certain allocations (such as AT_STRINGZ address allocations and anything that
|
|
might end up being passed to add_new_data_source) need their memory to stick
|
|
around a little longer than the usual packet scope - basically until the
|
|
next packet is dissected. This is, in fact, the scope of Wireshark's pinfo
|
|
structure, so the pinfo struct has a 'pool' member which is a wmem pool scoped
|
|
to the lifetime of the pinfo struct.
|
|
|
|
2.1.3 NULL Pool
|
|
|
|
Any function that takes a pointer to a wmem_allocator_t can also be passed NULL
|
|
instead. In this case the memory is not allocated in a pool at all - it is
|
|
manually managed, and *must* eventually be passed to wmem_free() in order to
|
|
prevent memory leaks. Note that passing wmem_allocated memory directly to free()
|
|
or g_free() is not safe; the backing type of manually managed memory may be
|
|
changed without warning.
|
|
|
|
2.2 API
|
|
|
|
Full documentation for each function (parameters, return values, behaviours)
|
|
lives (or will live) in Doxygen-format in the header files for those functions.
|
|
This is just an overview of which header files you should be looking at.
|
|
|
|
2.2.1 Core API
|
|
|
|
wmem_core.h
|
|
- Basic memory management functions (wmem_alloc, wmem_realloc, wmem_free).
|
|
|
|
2.2.2 Strings
|
|
|
|
wmem_strutl.h
|
|
- Utility functions for manipulating null-terminated C-style strings.
|
|
Functions like strdup and strdup_printf.
|
|
|
|
wmem_strbuf.h
|
|
- A managed string object implementation, similar to std::string in C++ or
|
|
GString from Glib.
|
|
|
|
2.2.3 Container Data Structures
|
|
|
|
wmem_array.h
|
|
- A growable array (AKA vector) implementation.
|
|
|
|
wmem_list.h
|
|
- A doubly-linked list implementation.
|
|
|
|
wmem_map.h
|
|
- A hash map (AKA hash table) implementation.
|
|
|
|
wmem_queue.h
|
|
- A queue implementation (first-in, first-out).
|
|
|
|
wmem_stack.h
|
|
- A stack implementation (last-in, first-out).
|
|
|
|
wmem_tree.h
|
|
- A balanced binary tree (red-black tree) implementation.
|
|
|
|
2.2.4 Miscellaneous Utilities
|
|
|
|
wmem_miscutl.h
|
|
- Misc. utility functions like memdup.
|
|
|
|
2.3 Callbacks
|
|
|
|
WARNING: You probably don't actually need these; use them only when you're
|
|
sure you understand the dangers.
|
|
|
|
Sometimes (though hopefully rarely) it may be necessary to store data in a wmem
|
|
pool that requires additional cleanup before it is freed. For example, perhaps
|
|
you have a pointer to a file-handle that needs to be closed. In this case, you
|
|
can register a callback with the wmem_register_callback function
|
|
declared in wmem_user_cb.h. Every time the memory in a pool is freed, all
|
|
registered cleanup functions are called first.
|
|
|
|
Note that callback calling order is not defined, you cannot rely on a
|
|
certain callback being called before or after another.
|
|
|
|
WARNING: Manually freeing or moving memory (with wmem_free or wmem_realloc)
|
|
will NOT trigger any callbacks. It is an error to call either of
|
|
those functions on memory if you have a callback registered to deal
|
|
with the contents of that memory.
|
|
|
|
3. Usage for Producers
|
|
|
|
NB: If you're just writing a dissector, you probably don't need to read
|
|
this section.
|
|
|
|
One of the problems with the old emem framework was that there were basically
|
|
two allocator backends (glib and mmap) that were all mixed together in a mess
|
|
of if statements, environment variables and #ifdefs. In wmem the different
|
|
allocator backends are cleanly separated out, and it's up to the owner of the
|
|
pool to pick one.
|
|
|
|
3.1 Available Allocator Back-Ends
|
|
|
|
Each available allocator type has a corresponding entry in the
|
|
wmem_allocator_type_t enumeration defined in wmem_core.h. See the doxygen
|
|
comments in that header file for details on each type.
|
|
|
|
3.2 Creating a Pool
|
|
|
|
To create a pool, include the regular wmem header and call the
|
|
wmem_allocator_new() function with the appropriate type value.
|
|
For example:
|
|
|
|
#include "wmem/wmem.h"
|
|
|
|
wmem_allocator_t *myPool;
|
|
myPool = wmem_allocator_new(WMEM_ALLOCATOR_SIMPLE);
|
|
|
|
From here on in, you don't need to remember which type of allocator you used
|
|
(although allocator authors are welcome to expose additional allocator-specific
|
|
helper functions in their headers). The "myPool" variable can be passed around
|
|
and used as normal in allocation requests as described in section 2 of this
|
|
document.
|
|
|
|
3.3 Destroying a Pool
|
|
|
|
Regardless of which allocator you used to create a pool, it can be destroyed
|
|
with a call to the function wmem_destroy_allocator(). For example:
|
|
|
|
#include "wmem/wmem.h"
|
|
|
|
wmem_allocator_t *myPool;
|
|
|
|
myPool = wmem_allocator_new(WMEM_ALLOCATOR_SIMPLE);
|
|
|
|
/* Allocate some memory in myPool ... */
|
|
|
|
wmem_destroy_allocator(myPool);
|
|
|
|
Destroying a pool will free all the memory allocated in it.
|
|
|
|
3.4 Reusing a Pool
|
|
|
|
It is possible to free all the memory in a pool without destroying it,
|
|
allowing it to be reused later. Depending on the type of allocator, doing this
|
|
(by calling wmem_free_all()) can be significantly cheaper than fully destroying
|
|
and recreating the pool. This method is therefore recommended, especially when
|
|
the pool would otherwise be scoped to a single iteration of a loop. For example:
|
|
|
|
#include "wmem/wmem.h"
|
|
|
|
wmem_allocator_t *myPool;
|
|
|
|
myPool = wmem_allocator_new(WMEM_ALLOCATOR_SIMPLE);
|
|
for (...) {
|
|
|
|
/* Allocate some memory in myPool ... */
|
|
|
|
/* Free the memory, faster than destroying and recreating
|
|
the pool each time through the loop. */
|
|
wmem_free_all(myPool);
|
|
}
|
|
wmem_destroy_allocator(myPool);
|
|
|
|
4. Internal Design
|
|
|
|
Despite being written in Wireshark's standard C90, wmem follows a fairly
|
|
object-oriented design pattern. Although efficiency is always a concern, the
|
|
primary goals in writing wmem were maintainability and preventing memory
|
|
leaks.
|
|
|
|
4.1 struct _wmem_allocator_t
|
|
|
|
The heart of wmem is the _wmem_allocator_t structure defined in the
|
|
wmem_allocator.h header file. This structure uses C function pointers to
|
|
implement a common object-oriented design pattern known as an interface (also
|
|
known as an abstract class to those who are more familiar with C++).
|
|
|
|
Different allocator implementations can provide exactly the same interface by
|
|
assigning their own functions to the members of an instance of the structure.
|
|
The structure has eight members in three groups.
|
|
|
|
4.1.1 Implementation Details
|
|
|
|
- private_data
|
|
- type
|
|
|
|
The private_data pointer is a void pointer that the allocator implementation can
|
|
use to store whatever internal structures it needs. A pointer to private_data is
|
|
passed to almost all of the other functions that the allocator implementation
|
|
must define.
|
|
|
|
The type field is an enumeration of type wmem_allocator_type_t (see
|
|
section 3.1). Its value is set by the wmem_allocator_new() function, not
|
|
by the implementation-specific constructor. This field should be considered
|
|
read-only by the allocator implementation.
|
|
|
|
4.1.2 Consumer Functions
|
|
|
|
- alloc()
|
|
- free()
|
|
- realloc()
|
|
|
|
These function pointers should be set to functions with semantics obviously
|
|
similar to their standard-library namesakes. Each one takes an extra parameter
|
|
that is a copy of the allocator's private_data pointer.
|
|
|
|
Note that realloc() and free() are not expected to be called directly by user
|
|
code in most cases - they are primarily optimisations for use by data
|
|
structures that wmem might want to implement (it's inefficient, for example, to
|
|
implement a dynamically sized array without some form of realloc).
|
|
|
|
Also note that allocators do not have to handle NULL pointers or 0-length
|
|
requests in any way - those checks are done in an allocator-agnostic way
|
|
higher up in wmem. Allocator authors can assume that all incoming pointers
|
|
(to realloc and free) are non-NULL, and that all incoming lengths (to malloc
|
|
and realloc) are non-0.
|
|
|
|
4.1.3 Producer/Manager Functions
|
|
|
|
- free_all()
|
|
- gc()
|
|
- cleanup()
|
|
|
|
All of these functions take only one parameter, which is the allocator's
|
|
private_data pointer.
|
|
|
|
The free_all() function should free all the memory currently allocated in the
|
|
pool. Note that this is not necessarily exactly the same as calling free()
|
|
on all the allocated blocks - free_all() is allowed to do additional cleanup
|
|
or to make use of optimizations not available when freeing one block at a time.
|
|
|
|
The gc() function should do whatever it can to reduce excess memory usage in
|
|
the dissector by returning unused blocks to the OS, optimizing internal data
|
|
structures, etc.
|
|
|
|
The cleanup() function should do any final cleanup and free any and all memory.
|
|
It is basically the equivalent of a destructor function. For simplicity, wmem
|
|
is guaranteed to call free_all() immediately before calling this function. There
|
|
is no such guarantee that gc() has (ever) been called.
|
|
|
|
4.2 Pool-Agnostic API
|
|
|
|
One of the issues with emem was that the API (including the public data
|
|
structures) required wrapper functions for each scope implemented. Even
|
|
if there was a stack implementation in emem, it wasn't necessarily available
|
|
for use with file-scope memory unless someone took the time to write se_stack_
|
|
wrapper functions for the interface.
|
|
|
|
In wmem, all public APIs take the pool as the first argument, so that they can
|
|
be written once and used with any available memory pool. Data structures like
|
|
wmem's stack implementation only take the pool when created - the provided
|
|
pointer is stored internally with the data structure, and subsequent calls
|
|
(like push and pop) will take the stack itself instead of the pool.
|
|
|
|
4.3 Debugging
|
|
|
|
The primary debugging control for wmem is the WIRESHARK_DEBUG_WMEM_OVERRIDE
|
|
environment variable. If set, this value forces all calls to
|
|
wmem_allocator_new() to return the same type of allocator, regardless of which
|
|
type is requested normally by the code. It currently has three valid values:
|
|
|
|
- The value "simple" forces the use of WMEM_ALLOCATOR_SIMPLE. The valgrind
|
|
script currently sets this value, since the simple allocator is the only
|
|
one whose memory allocations are trackable properly by valgrind.
|
|
|
|
- The value "strict" forces the use of WMEM_ALLOCATOR_STRICT. The fuzz-test
|
|
script currently sets this value, since the goal when fuzz-testing is to find
|
|
as many errors as possible.
|
|
|
|
- The value "block" forces the use of WMEM_ALLOCATOR_BLOCK. This is not
|
|
currently used by any scripts, but is useful for stress-testing the block
|
|
allocator.
|
|
|
|
- The value "block_fast" forces the use of WMEM_ALLOCATOR_BLOCK_FAST. This is
|
|
not currently used by any scripts, but is useful for stress-testing the fast
|
|
block allocator.
|
|
|
|
Note that regardless of the value of this variable, it will always be safe to
|
|
call allocator-specific helpers functions. They are required to be safe no-ops
|
|
if the allocator argument is of the wrong type.
|
|
|
|
4.4 Testing
|
|
|
|
There is a simple test suite for wmem that lives in the file wmem_test.c and
|
|
should get automatically built into the binary 'wmem_test' when building
|
|
Wireshark. It contains at least basic tests for all existing functionality.
|
|
The suite is run automatically by the build-bots via the shell script
|
|
test/test.sh which calls out to test/suite-unittests.sh.
|
|
|
|
New features added to wmem (allocators, data structures, utility
|
|
functions, etc.) MUST also have tests added to this suite.
|
|
|
|
The test suite could potentially use a clean-up by someone more
|
|
intimately familiar with Glib's testing framework, but it does the job.
|
|
|
|
/*
|
|
* Editor modelines - https://www.wireshark.org/tools/modelines.html
|
|
*
|
|
* Local variables:
|
|
* c-basic-offset: 4
|
|
* tab-width: 8
|
|
* indent-tabs-mode: nil
|
|
* End:
|
|
*
|
|
* vi: set shiftwidth=4 tabstop=8 expandtab:
|
|
* :indentSize=4:tabSize=8:noTabs=true:
|
|
*/
|