Manually editing etc/libnl/classid before adding tc objects is a pain.
This patch adds code to attempt auto generating a unique tc id which
will then be assigned to the provided name and added to the classid
file.
This will make the following commands work with prior definitions of
the names "top" and "test"
sudo sbin/nl-qdisc-add --dev eth0 --parent root --id top htb
sudo sbin/nl-class-add --dev eth0 --parent top --id test htb --rate 100mbit
It will generate the following ids automatically:
4001: top
4001:1 test
- Fixes a bunch of bugs related to ematches
- Adds support for the nbyte ematch
- Adds a bison/flex parser for ematch expressions, expressions
may look like this:
ip.length > 256 && pattern(ip6.src = 3ffe::/16)
documenation on syntax follows
- adds ematch support to the basic classifier (--ematch EXPR)
The netlink message buffer is preallocated to a page and later
expanded as needed. Everything was properly paded and zeroed
out except for the unused part at the end. Use calloc() to
allocate the buffer.
This patch includes various bugfixes in the packet location parser.
Namely it removes two memory leaks if parsing fails. The parser is
correctly quit if an allocation error occurs and it is no longer
possible to add duplicates.
It removes the possibility to differ between net and host byteorder.
This is better done in the actual classifiers as it makes more sense
to specify this together with the value to compare against.
The patch also extends the API to add new packet locations via
rtnl_pktloc_add().
It introduces reference counting, therefore you now have to give
back packet locations with rtnl_pktloc_put() after looking them up
with rtnl_pktloc_lookup(). But you are allowed to keep using them
if the packet location file has been reread.
The packet location file now also understands "eth", "ip", and
"tcp" for "link", "net", and "transport".
A --list option has been added to nl-pktloc-lookup to list all
packet location definitions
A --u32=VALUE option has been added to let nl-pktloc-lookup print
the definition in iproute2's u32 selector style.
A manual page has been written for nl-pktloc-lookup.
Finally, nl-pktloc-lookup has been made installable.
So far all common tc atttributes were accessed via specific functions, i.e.
rtnl_class_set_parent(), rtnl_qdisc_set_parent(), rtnl_cls_set_parent()
which implied a lot of code duplication. Since all tc objects are derived
from struct rtnl_tc and these common attributes are already stored in there
this patch removes all type specific functions and makes rtnl_tc_* attribute
functions public.
rtnl_qdisc_set_parent(qdisc, 10);
becomes:
rtnl_tc_set_parent((struct rtnl_tc *) qdisc, 10);
This patch also adds the following new attributes to tc objects therefore
removing them as tc specific attributes:
- mtu
- mpu
- overhead
This allows for the rate table calculations to be unified as well taking into
account the new kernel behavior to take care of overhead automatically.
Dumping objects as environment variables has never been implemented
completely and only increases the size of the library for no real
purpose. Integration into scripts is better achieved by implementing
a python module anyway.
Adds a cli based tool to add/update traffic classes. This tool requires
each class to be supported via the respetive qdisc module in
pkglibdir/cli/qdisc/$name.so.
Syntax:
nl-class-add --dev eth2 --parent 1: --id 1:1 htb --rate 100mbit
nl-class-add --update --dev eth2 --id 1:1 htb --rate 200mbit
A database to resolve qdisc/class names to classid values and vice versa.
The function rtnl_tc_handle2str() and rtnl_tc_str2handle() will resolve
names automatically.
A CLI based tool nl-classid-lookup is provided to integrate the database
into existing iproute2 scripts.
Adds a cli based tool to add/update/replace qdiscs. This tool requires
each qdisc to be supported via a dynamic loadable module in
pkglibdir/cli/qdisc/$name.so.
So far HTB and blackhole have been implemented.
Syntax:
nl-qdisc-add --dev eth2 --parent root --id 1: htb --r2q=5
nl-qdisc-add --update-only --dev eth2 --id 1: htb --r2q=10
I have a patch against commit d378220c96
extending libnl with a facility to receive generic netlink messages sent
to multicast groups.
Essentially it add one new function genl_ctrl_resolve_grp which
prototype looks like this
int genl_ctrl_resolve_grp(struct nl_sock *sk, const char *family_name,
const char *grp_name)
It resolves the family name and the group name to group id. Then
the returned id can be used in nl_socket_add_membership to subscribe
to multicast messages.
Besides that it adds two more functions
uint32_t nl_socket_get_peer_groups(struct nl_sock *sk)
void nl_socket_set_peer_groups(struct nl_sock *sk, uint32_t groups)
allowing to modify the socket peer groups field. So it's possible to
multicast messages from the user space using the legacy interface.
Looks like there is no way (or I was not able to find one?) to modify
the netlink socket destination group from the user space, when the
group id is greater then 32.
don't try to give the kernel an empty RTA_DST attribute. this would
previously happening on trying to delete the default route as returned
from the kernel. the kernel doesn't add a RTA_DST atttribute, so libnl
does nl_addr_alloc(0) and inserts a zero-length RTA_DST attribute into
the deletion request, which the kernel then refuses with ERANGE.
Signed-off-by: David Lamparter <equinox@diac24.net>
This patch enables out-of-source builds like this
$ cd builddir && src_dir/configure && make
Before this patch there was an error about missing netlink/version.h which
is built by automake in top_builddir rather than top_srcdir which is already
in include search path.
Signed-off-by: Andreas Bießmann <biessmann@corscience.de>
When an alternate kernel header include directory is added in
CPPFLAGS, the libnl build fails. This is because the local copy of
kernel headers is added in AM_CFLAGS, which gets included after
CPPFLAGS in the automake-generated makefile. Switching to AM_CPPFLAGS
fixes the problems.
the patch below adds the possibility to
pass user data to callbacks of type
change_func_t when using the nl_cache_mngr_*
family of functions.
If there is any better way to do this,
without duplicating the code in
cache_mngr.c please let me know.
Without this patch, running alloc / free cache loop will lead to huge memory
leaks on machine with 3000 interfaces with tbf qdiscs.
Here was valgrind output:
==5580== 18,070,728 bytes in 347,514 blocks are definitely lost in loss record
32 of 32
==5580== at 0x4025485: calloc (in /lib/valgrind/vgpreload_memcheck-x86-
linux.so)
==5580== by 0x405F410: tbf_msg_parser (tbf.c:46)
==5580== by 0x405302B: qdisc_msg_parser (qdisc.c:119)
==5580== by 0x4033DC9: nl_cache_parse (cache.c:643)
==5580== by 0x4033E7C: update_msg_parser (cache.c:460)
==5580== by 0x4038A11: nl_recvmsgs (netlink-local.h:112)
==5580== by 0x4034175: __cache_pickup (cache.c:483)
==5580== by 0x40343FF: nl_cache_pickup (cache.c:516)
==5580== by 0x403447D: nl_cache_refill (cache.c:698)
==5580== by 0x4034AB7: nl_cache_alloc_and_fill (cache.c:198)
==5580== by 0x4053216: rtnl_qdisc_alloc_cache (qdisc.c:388)
==5580== by 0x80489DB: main (in /home/root/nltest)
Patch complied and tested for same test case, no more leaks anymore.
Rules don't have unique identifiers, so all attributes are compared
by initializing the ID mask to ~0. This doesn't work however since
nl_object_identical verifies whether the ID attributes are actually
present before comparing the objects, which is never the case.
Work around by using the intersection of present attributes when
comparing two rule objects.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Neighbour entries for the same destination may exist on multiple
interfaces. Include the interface in the ID attributes.
Signed-off-by: Patrick McHardy <kaber@trash.net>
When resyncing a cache, there are no delete messages, so they need to
be synthesized for deleted objects.
Signed-off-by: Patrick McHardy <kaber@trash.net>
I've noticed a wrong behavior when setting up some delays in a netem
qdisc. I will try to make the things easier for the reader describing
the calls path.
To set up a delay (or jitter...) I use 'rtnl_netem_set_delay' which
requires an int parameter that tells the delay in micro seconds. Inside
this func, the delay is set up with the help of 'nl_us2ticks', which is
just an arithmetic operation (us * ticks_per_usec), where us is the
input parameter and ticks_per_usec is a global variable initialized in
'get_psched_settings'. And here is the problem:
If this variable is going to be calculated using '/proc/net/psched', I
think the file scan is not done properly.
I don't understand what the meaning of the asterisk is here:
int r = fscanf(fd, "%08x%08x%08x%*08x", &tick, &us, &nom);
if (4 == r && nom == 1000000 && !got_tick)
ticks_per_usec = (double)tick/(double)us;
The execution path never gets in the if statement, because r is always
3, and if the fourth parameter is read (avoiding the asterisk), there is
no variable to store it in, so it comes a segv. In my opinion we can get
rid of the if statement, because I think the proc psched file has always
a fixed format of 4 parameters, and 'nom' is always 1000000
(http://lxr.linux.no/#linux+v2.6.32/net/sched/sch_api.c#L1678).
Find attached a patch I did, if I am correct.
nfnl_queue_msg_send_verdict_payload() will to send the verdict, mark,
and possibly changed payload through the netlink socket.
Add a few docbook comments in other funcs.
Signed-off-by: Karl Hiramoto <karl@hiramoto.org>
Create new function nl_send_iovec() to be used to send multiple 'struct iovec'
through the netlink socket. This will be used for NF_QUEUE, to send
packet payload of a modified packet.
Refactor nl_send() to use nl_send_iovec() sending a single struct iovec.
Create new function nl_auto_complete() by refactoring nl_send_auto_complete(),
so other functions that call nl_send may also use nl_auto_complete()
Signed-off-by: Karl Hiramoto <karl@hiramoto.org>
libnl-route must be handled before libnl-nf in lib_LTLIBRARIES since
the later depends on the former. Additionally nf-monitor, nl-list-caches,
nl-list-sockets and nl-util-addr have been dropped from the Makefile.
Signed-off-by: Patrick McHardy <kaber@trash.net>
addr_obj.ops.oo_id_attrs included ADDR_ATTR_PEER, so any address that
didn't have a peer address set would compare as unequal to itself,
meaning it could never be removed from a cache after it was added, etc.
I found the following bug, where nlmsg_ok() in lib/msg.c would
incorrectly return 'true' when the input argument 'remaining' was a negative
number. This happens when the message is not aligned the way that libnl
expects (although it is still legal).
In the comparison of the signed and unsigned numbers on line 284, the signed
number gets converted to an unsigned number, which is unexpected and
naturally produces a bug. My patch is below. The cast is ugly, but it
fixes the problem.
Issues solved:
* PACKAGE_VERSION was abused for SOVERSION
* unneeded DEP stage
* did not support out-of-tree builds
* no way to turn off silent mode
* overriding CFLAGS at make time was not supported
* no static libs were provided
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Current calculation is always off, not reflecting the right position
in the bitmap, which results in failures due to conflicts (detected at
the kernel level) when trying to open a new handle.
Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Instead of calling the membership functions several times it is
helpfull to extend the API and make the single group functions a
special case.
The value 0 (NFNLGRP_NONE) terminates this list.
Example use:
nl_socket_add_memberships(sock, group_1, group_2, 0);
nl_socket_drop_memberships(sock, group_1, group_2, 0);
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
commit e92539843a0c7e5116254382626cce226bf2135e
Author: Patrick McHardy <kaber@trash.net>
Date: Thu Oct 23 13:46:16 2008 +0200
libnl: nfqueue: add nfqueue specific socket allocation function
nfqueue users usually send verdict messages from the receive callback.
When waiting for ACKs, the receive callback might be called again
recursively until the stack blows up.
Add a nfqueue specific socket allocation function that automatically
disables ACKing for the socket.
Signed-off-by: Patrick McHardy <kaber@trash.net>
we're using libnl-1.1 for a project. When trying to delete all
addresses of an interface by only setting interface index and
address family of an rtnl_addr and executing rtnl_addr_delete()
we received some error (I don't remember what it was).
The bug(?) is in build_addr_msg() in lib/route/addr.c:
IFA_ADDRESS is set to a_local when a_peer is not set,
without checking if a_local was set. We just added
if (tmpl->ce_mask & ADDR_ATTR_LOCAL)
after the "else" (line 496 in the current git).
This changes make nfnl_ct_get_src_port() and others return the value
in host byte order rather than in network byte order.
Also splits printing into details and statistical section and
improves readability.
The idea of a common handle is long revised and only misleading,
nl_handle really represents a socket with some additional
action handlers assigned to it.
Alias for nl_handle is kept for backwards compatibility.
Replaces obsolete calls to nla_get_addr() and nla_get_data()
with nl_addr_alloc_attr() respectively nl_data_alloc_attr().
Also fixes missing error handling while parsing routing multipath
configuration.
In order for the interface to become more thread safe, the error
handling was revised to no longer depend on a static errno and
error string buffer.
This patch converts all error paths to return a libnl specific
error code which can be translated to a error message using
nl_geterror(int error). The functions nl_error() and
nl_get_errno() are therefore obsolete.
This change required various sets of function prototypes to be
changed in order to return an error code, the most prominent
are:
struct nl_cache *foo_alloc_cache(...);
changed to:
int foo_alloc_cache(..., struct nl_cache **);
struct nl_msg *foo_build_request(...);
changed to:
int foo_build_request(..., struct nl_msg **);
struct foo *foo_parse(...);
changed to:
int foo_parse(..., struct foo **);
This pretty much only leaves trivial allocation functions to
still return a pointer object which can still return NULL to
signal out of memory.
This change is a serious API and ABI breaker, sorry!
Added rtnl_route_foreach_nexthop() to walk the list of nexthops invoking a
caller-provided callback for each nexthop entry, and added rtnl_route_nexthop_n()
to retrieve the Nth nexthop entry in the list.
Using rtnl_route_get_metric() for route comparison became a bottleneck
because each metric which was not available resulted in the generation
of an error message. This changeset avoids this by accessing rt_metrics
and rt_metrics_mask directly while comparing route objects.
As pointed out by Regis Hanna, a considerable performance gain can be
achieved by using malloc() over calloc() when allocating netlink message
buffers. This is likely due to the fact that we use a complete page for
each message.
This changesets adds the possibility to fill a nl_cache with
the contents of the route cache. It also adds the possibility
to limit route caches to certain address families.
New netem-related functionality:
Added ability to save new settings to the kernel. In netem.c, the
netem_get_opts() stub has been replaced with netem_build_msg() which
manipulates the nl_msg data directly and returns an error code instead
of a new nl_msg. Modifications to qdisc_build() in qdisc.c and struct
rtnl_qdisc_ops were necessary for this.
Added support for getting/setting corruption probability/correlation.
Added support for setting a delay distribution.
Fixed tbf_msg_parser() to call tbf_alloc() instead of tbf_qdisc() to
prevent a seg fault.
I stepped over libnl always freeing the messages and it
kind of made it awkward to reuse the message data without
reallocating.
The basic idea is: if a callback return value has a bit set,
don't free that message. The calling application owns it.
By default, things stay as before (messages are freed).
Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Adds all missing routing attributes and brings the routing
related code to a working state. In the process the API
was broken several times with the justification that nobody
is using this code yet.
The changes include new example code which is also a prototype
for how plain CLI tools could look like to control routes.
[LIBNL]: Fix nfnl_queue_msg_get_packetid() return type
The packet-ID is a 32 bit value, but nfnl_queue_msg_get_packetid() returns
an uint16_t. Makes queueing fail after 2^16 packets.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Split the nfnetlink_log code into two seperate objects, "netfilter/log"
to represent logging instances and "netfilter/log_msg" to represent
log messages. Also perform some function name unification for consistency
with other libnl object types, mainly renaming nfnl_log_build_*_msg
to nfnl_log_build_*_request.
This changes the API in an incompatible way, but since this feature is
new and the libnl netfilter headers haven't been installed so far,
there shouldn't be any users affected by this.
Signed-off-by: Patrick McHardy <kaber@trash.net>
The NUFLA_GID attribute (currently only in net-2.6.25) contains the
gid of the sending process for locally generated packets.
Signed-off-by: Patrick McHardy <kaber@trash.net>
The hwproto doesn't have its own attribute and is also present when
not set. Don't set the attribute if its value is zero.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Caches allocated by the cache manager must be freed again when the cache
manager itself is freed. However, the netlink socket is allocated
indepdently so it should not be freed.
Patrick McHardy reported a problem where pointers to the
payload of a netlink message as returned by f.e. the
nesting helpers become stale when the payload data
chunk is reallocated.
In order to avoid further problems, the payload chunk is
no longer extended on the fly. Instead the allocation is
made during netlink message object allocation time with
a default size of a page which should be fine for the
majority of all users. Additionally the functions
nlmsg_alloc_size() and nlmsg_set_default_size() have been
added to allocate messages of a particular length and to
modify the default message size.
So far the destination address for default routes was NULL
which complicated the handling of routes in general. By
assigning a address of length zero they can be compared
to each other.
This allows the cache manager to properly handle default
routes.
Fixes an off-by-one when releasing local ports. Fixes nl_connect()
to properly close the socket upon failure. Return EBADFD if
operations are performed on unconnected sockets where appropriate.
Makes nl_handle_alloc() return an error if all local ports are
used up.
This interface was internal so far which required all code defining
caches to be compiled with the sources available.
In order to simplify the interface, the co_msg_parser prototype was
changed to take the struct nl_parser_param directly instead of a
void *. It used to be void * because the co_msg_parser was directly
passed as the NL_CB_VALID callback function.
This interface was internal so far which required all code defining
objects to be compiled with the sources available.
This change exposes struct nl_object_ops which seems safe as it
is not supposed to be embedded in other structures.
Patch contains extensive documentation to help with the creation
of own object implementations.