5a58a1435c
Fix an assortment of typos and other minor errors in various README files svn path=/trunk/; revision=23166
118 lines
4.6 KiB
Text
118 lines
4.6 KiB
Text
$Id$
|
|
|
|
XXX - move this file to epan?
|
|
|
|
1. How the Display Filter Engine works.
|
|
|
|
code:
|
|
epan/dfilter/* - the display filter engine, including
|
|
scanner, parser, syntax-tree semantics checker, DFVM bytecode
|
|
generator, and DFVM engine.
|
|
epan/ftypes/* - the definitions of the various FT_* field types.
|
|
epan/proto.c - proto_tree-related routines
|
|
|
|
1.1 Parsing text.
|
|
|
|
The scanner/parser pair read the string representing the display filter
|
|
and convert it into a very simple syntax tree. The syntax tree is very
|
|
simple in that it is possible that many of the nodes contain unparsed
|
|
chunks of text from the display filter.
|
|
|
|
1.1 Enhancing the syntax tree.
|
|
|
|
The semantics of the simple syntax tree are checked to make sure that
|
|
the fields that are being compared are being compared to appropriate
|
|
values. For example, if a field is an integer, it can't be compared to
|
|
a string, unless a value_string has been defined for that field.
|
|
|
|
During the process of checking the semantics, the simple syntax tree is
|
|
fleshed out and no longer contains nodes with unparsed information. The
|
|
syntax tree is no longer in its simple form, but in its complete form.
|
|
|
|
1.2 Converting to DFVM bytecode.
|
|
|
|
The syntax tree is analyzed to create a sequence of bytecodes in the
|
|
"DFVM" language. "DFVM" stands for Display Filter Virtual Machine. The
|
|
DFVM is similar in spirit, but not in definition, to the BPF VM that
|
|
libpcap uses to analyze packets.
|
|
|
|
A virtual bytecode is created and used so that the actual process of
|
|
filtering packets will be fast. That is, it should be faster to process
|
|
a list of VM bytecodes than to attempt to filter packets directly from
|
|
the syntax tree. (heh... no measurement has been made to support this
|
|
supposition)
|
|
|
|
1.3 Filtering.
|
|
|
|
Once the DFVM bytecode has been produced, it's a simple matter of
|
|
running the DFVM engine against the proto_tree from the packet
|
|
dissection, using the DFVM bytecodes as instructions. If the DFVM
|
|
bytecode is known before packet dissection occurs, the
|
|
proto_tree-related code can be "primed" to store away pointers to
|
|
field_info structures that are interesting to the display filter. This
|
|
makes lookup of those field_info structures during the filtering process
|
|
faster.
|
|
|
|
1.4 Display Filter Functions.
|
|
|
|
You define a display filter function by adding an entry to
|
|
the df_functions table in epan/dfilter/dfunctions.c. The record struct
|
|
is defined in dfunctions.h, and shown here:
|
|
|
|
typedef struct {
|
|
char *name;
|
|
DFFuncType function;
|
|
ftenum_t retval_ftype;
|
|
guint min_nargs;
|
|
guint max_nargs;
|
|
DFSemCheckType semcheck_param_function;
|
|
} df_func_def_t;
|
|
|
|
name - the name of the function; this is how the user will call your
|
|
function in the display filter language
|
|
|
|
function - this is the run-time processing of your function.
|
|
|
|
retval_ftype - what type of FT_* type does your function return?
|
|
|
|
min_nargs - minimum number of arguments your function accepts
|
|
max_nargs - maximum number of arguments your function accepts
|
|
|
|
semcheck_param_function - called during the semantic check of the
|
|
display filter string.
|
|
|
|
DFFuncType function
|
|
-------------------
|
|
typedef gboolean (*DFFuncType)(GList *arg1list, GList *arg2list, GList **retval);
|
|
|
|
The return value of your function is a gboolean; TRUE if processing went fine,
|
|
or FALSE if there was some sort of exception.
|
|
|
|
For now, display filter functions can accept a maximum of 2 arguments.
|
|
The "arg1list" parameter is the GList for the first argument. The
|
|
'arg2list" parameter is the GList for the second argument. All arguments
|
|
to display filter functions are lists. This is because in the display
|
|
filter language a protocol field may have multiple instances. For example,
|
|
a field like "ip.addr" will exist more than once in a single frame. So
|
|
when the user invokes this display filter:
|
|
|
|
somefunc(ip.addr) == TRUE
|
|
|
|
even though "ip.addr" is a single argument, the "somefunc" function will
|
|
receive a GList of *all* the values of "ip.addr" in the frame.
|
|
|
|
Similarly, the return value of the function needs to be a GList, since all
|
|
values in the display filter language are lists. The GList** retval argument
|
|
is passed to your function so you can set the pointer to your return value.
|
|
|
|
DFSemCheckType
|
|
--------------
|
|
typedef void (*DFSemCheckType)(int param_num, stnode_t *st_node);
|
|
|
|
For each parameter in the syntax tree, this function will be called.
|
|
"param_num" will indicate the number of the parameter, starting with 0.
|
|
The "stnode_t" is the syntax-tree node representing that parameter.
|
|
If everything is okay with the value of that stnode_t, your function
|
|
does nothing --- it merely returns. If something is wrong, however,
|
|
it should THROW a TypeError exception.
|