dect
/
linux-2.6
Archived
13
0
Fork 0
Commit Graph

424 Commits

Author SHA1 Message Date
Jozsef Kadlecsik 6c02788969 netfilter: ipset: hash:ip set type support
The module implements the hash:ip type support in four flavours:
for IPv4 or IPv6, both without and with timeout support.

All the hash types are based on the "array hash" or ahash structure
and functions as a good compromise between minimal memory footprint
and speed. The hashing uses arrays to resolve clashes. The hash table
is resized (doubled) when searching becomes too long. Resizing can be
triggered by userspace add commands only and those are serialized by
the nfnl mutex. During resizing the set is read-locked, so the only
possible concurrent operations are the kernel side readers. Those are
protected by RCU locking.

Because of the four flavours and the other hash types, the functions
are implemented in general forms in the ip_set_ahash.h header file
and the real functions are generated before compiling by macro expansion.
Thus the dereferencing of low-level functions and void pointer arguments
could be avoided: the low-level functions are inlined, the function
arguments are pointers of type-specific structures.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-01 15:38:36 +01:00
Jozsef Kadlecsik 72205fc68b netfilter: ipset: bitmap:ip set type support
The module implements the bitmap:ip set type in two flavours, without
and with timeout support. In this kind of set one can store IPv4
addresses (or network addresses) from a given range.

In order not to waste memory, the timeout version does not rely on
the kernel timer for every element to be timed out but on garbage
collection. All set types use this mechanism.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-01 15:33:17 +01:00
Jozsef Kadlecsik a7b4f989a6 netfilter: ipset: IP set core support
The patch adds the IP set core support to the kernel.

The IP set core implements a netlink (nfnetlink) based protocol by which
one can create, destroy, flush, rename, swap, list, save, restore sets,
and add, delete, test elements from userspace. For simplicity (and backward
compatibilty and for not to force ip(6)tables to be linked with a netlink
library) reasons a small getsockopt-based protocol is also kept in order
to communicate with the ip(6)tables match and target.

The netlink protocol passes all u16, etc values in network order with
NLA_F_NET_BYTEORDER flag. The protocol enforces the proper use of the
NLA_F_NESTED and NLA_F_NET_BYTEORDER flags.

For other kernel subsystems (netfilter match and target) the API contains
the functions to add, delete and test elements in sets and the required calls
to get/put refereces to the sets before those operations can be performed.

The set types (which are implemented in independent modules) are stored
in a simple RCU protected list. A set type may have variants: for example
without timeout or with timeout support, for IPv4 or for IPv6. The sets
(i.e. the pointers to the sets) are stored in an array. The sets are
identified by their index in the array, which makes possible easy and
fast swapping of sets. The array is protected indirectly by the nfnl
mutex from nfnetlink. The content of the sets are protected by the rwlock
of the set.

There are functional differences between the add/del/test functions
for the kernel and userspace:

- kernel add/del/test: works on the current packet (i.e. one element)
- kernel test: may trigger an "add" operation  in order to fill
  out unspecified parts of the element from the packet (like MAC address)
- userspace add/del: works on the netlink message and thus possibly
  on multiple elements from the IPSET_ATTR_ADT container attribute.
- userspace add: may trigger resizing of a set

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-01 15:28:35 +01:00
Jozsef Kadlecsik f703651ef8 netfilter: NFNL_SUBSYS_IPSET id and NLA_PUT_NET* macros
The patch adds the NFNL_SUBSYS_IPSET id and NLA_PUT_NET* macros to the
vanilla kernel.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-01 15:20:14 +01:00
Jan Engelhardt 06988b0693 netfilter: xtables: add missing header inclusions for headers_check
Resolve these warnings on `make headers_check`:

usr/include/linux/netfilter/xt_CT.h:7: found __[us]{8,16,32,64} type
without #include <linux/types.h>
...

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2011-01-20 17:50:17 +01:00
Patrick McHardy 4cda47d2e7 Merge branch 'connlimit' of git://dev.medozas.de/linux 2011-01-20 14:17:32 +01:00
Jan Engelhardt ba12b130a6 netfilter: xtables: remove duplicate member
Accidentally missed removing the old out-of-union "inverse" member,
which caused the struct size to change which then gives size mismatch
warnings when using an old iptables.

It is interesting to see that gcc did not warn about this before.
(Filed http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47376 )

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2011-01-20 14:01:12 +01:00
Patrick McHardy 82d800d8e7 Merge branch 'connlimit' of git://dev.medozas.de/linux
Conflicts:
	Documentation/feature-removal-schedule.txt

Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-01-20 10:33:55 +01:00
Jan Engelhardt 5d84492864 netfilter: xtables: remove extraneous header that slipped in
Commit 0b8ad87 (netfilter: xtables: add missing header files to export
list) erroneously added this.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-01-20 08:48:15 +01:00
Patrick McHardy 14f0290ba4 Merge branch 'master' of /repos/git/net-next-2.6 2011-01-19 23:51:37 +01:00
Jan Engelhardt cc4fc02257 netfilter: xtables: connlimit revision 1
This adds destination address-based selection. The old "inverse"
member is overloaded (memory-wise) with a new "flags" variable,
similar to how J.Park did it with xt_string rev 1. Since revision 0
userspace only sets flag 0x1, no great changes are made to explicitly
test for different revisions.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2011-01-19 18:27:46 +01:00
Pablo Neira Ayuso a992ca2a04 netfilter: nf_conntrack_tstamp: add flow-based timestamp extension
This patch adds flow-based timestamping for conntracks. This
conntrack extension is disabled by default. Basically, we use
two 64-bits variables to store the creation timestamp once the
conntrack has been confirmed and the other to store the deletion
time. This extension is disabled by default, to enable it, you
have to:

echo 1 > /proc/sys/net/netfilter/nf_conntrack_timestamp

This patch allows to save memory for user-space flow-based
loogers such as ulogd2. In short, ulogd2 does not need to
keep a hashtable with the conntrack in user-space to know
when they were created and destroyed, instead we use the
kernel timestamp. If we want to have a sane IPFIX implementation
in user-space, this nanosecs resolution timestamps are also
useful. Other custom user-space applications can benefit from
this via libnetfilter_conntrack.

This patch modifies the /proc output to display the delta time
in seconds since the flow start. You can also obtain the
flow-start date by means of the conntrack-tools.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-01-19 16:00:07 +01:00
Jiri Olsa 93557f53e1 netfilter: nf_conntrack: nf_conntrack snmp helper
Adding support for SNMP broadcast connection tracking. The SNMP
broadcast requests are now paired with the SNMP responses.
Thus allowing using SNMP broadcasts with firewall enabled.

Please refer to the following conversation:
http://marc.info/?l=netfilter-devel&m=125992205006600&w=2

Patrick McHardy wrote:
> > The best solution would be to add generic broadcast tracking, the
> > use of expectations for this is a bit of abuse.
> > The second best choice I guess would be to move the help() function
> > to a shared module and generalize it so it can be used for both.
This patch implements the "second best choice".

Since the netbios-ns conntrack module uses the same helper
functionality as the snmp, only one helper function is added
for both snmp and netbios-ns modules into the new object -
nf_conntrack_broadcast.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-01-18 18:12:24 +01:00
Patrick McHardy a8fc0d9b34 Merge branch 'master' of git://dev.medozas.de/linux 2011-01-18 16:20:53 +01:00
Florian Westphal 94b27cc361 netfilter: allow NFQUEUE bypass if no listener is available
If an skb is to be NF_QUEUE'd, but no program has opened the queue, the
packet is dropped.

This adds a v2 target revision of xt_NFQUEUE that allows packets to
continue through the ruleset instead.

Because the actual queueing happens outside of the target context, the
'bypass' flag has to be communicated back to the netfilter core.

Unfortunately the only choice to do this without adding a new function
argument is to use the target function return value (i.e. the verdict).

In the NF_QUEUE case, the upper 16bit already contain the queue number
to use.  The previous patch reduced NF_VERDICT_MASK to 0xff, i.e.
we now have extra room for a new flag.

If a hook issued a NF_QUEUE verdict, then the netfilter core will
continue packet processing if the queueing hook
returns -ESRCH (== "this queue does not exist") and the new
NF_VERDICT_FLAG_QUEUE_BYPASS flag is set in the verdict value.

Note: If the queue exists, but userspace does not consume packets fast
enough, the skb will still be dropped.

Signed-off-by: Florian Westphal <fwestphal@astaro.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-01-18 16:08:30 +01:00
Jan Engelhardt 0b8ad87627 netfilter: xtables: add missing header files to export list
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2011-01-18 11:25:02 +01:00
Thomas Graf 43f393caec netfilter: audit target to record accepted/dropped packets
This patch adds a new netfilter target which creates audit records
for packets traversing a certain chain.

It can be used to record packets which are rejected administraively
as follows:

  -N AUDIT_DROP
  -A AUDIT_DROP -j AUDIT --type DROP
  -A AUDIT_DROP -j DROP

a rule which would typically drop or reject a packet would then
invoke the new chain to record packets before dropping them.

  -j AUDIT_DROP

The module is protocol independant and works for iptables, ip6tables
and ebtables.

The following information is logged:
 - netfilter hook
 - packet length
 - incomming/outgoing interface
 - MAC src/dst/proto for ethernet packets
 - src/dst/protocol address for IPv4/IPv6
 - src/dst port for TCP/UDP/UDPLITE
 - icmp type/code

Cc: Patrick McHardy <kaber@trash.net>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Thomas Graf <tgraf@redhat.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-01-16 18:10:28 +01:00
Eric Dumazet 255d0dc340 netfilter: x_table: speedup compat operations
One iptables invocation with 135000 rules takes 35 seconds of cpu time
on a recent server, using a 32bit distro and a 64bit kernel.

We eventually trigger NMI/RCU watchdog.

INFO: rcu_sched_state detected stall on CPU 3 (t=6000 jiffies)

COMPAT mode has quadratic behavior and consume 16 bytes of memory per
rule.

Switch the xt_compat algos to use an array instead of list, and use a
binary search to locate an offset in the sorted array.

This halves memory need (8 bytes per rule), and removes quadratic
behavior [ O(N*N) -> O(N*log2(N)) ]

Time of iptables goes from 35 s to 150 ms.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-01-13 12:05:12 +01:00
Patrick McHardy b017900aac netfilter: xt_conntrack: support matching on port ranges
Add a new revision 3 that contains port ranges for all of origsrc,
origdst, replsrc and repldst. The high ports are appended to the
original v2 data structure to allow sharing most of the code with
v1 and v2. Use of the revision specific port matching function is
made dependant on par->match->revision.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-01-13 12:05:12 +01:00
Jan Engelhardt 5df15196a2 netfilter: xt_comment: drop unneeded unsigned qualifier
Since a string is stored, and not something like a MAC address that
would rely on (un)signedness, drop the qualifier.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-01-13 12:05:11 +01:00
Eric Dumazet 83723d6071 netfilter: x_tables: dont block BH while reading counters
Using "iptables -L" with a lot of rules have a too big BH latency.
Jesper mentioned ~6 ms and worried of frame drops.

Switch to a per_cpu seqlock scheme, so that taking a snapshot of
counters doesnt need to block BH (for this cpu, but also other cpus).

This adds two increments on seqlock sequence per ipt_do_table() call,
its a reasonable cost for allowing "iptables -L" not block BH
processing.

Reported-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McHardy <kaber@trash.net>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-01-10 20:11:38 +01:00
Jan Engelhardt f1c722295e netfilter: xtables: use guarded types
We are supposed to use the kernel's own types in userspace exports.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-12-15 22:58:53 +01:00
Linus Torvalds 5f05647dd8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1699 commits)
  bnx2/bnx2x: Unsupported Ethtool operations should return -EINVAL.
  vlan: Calling vlan_hwaccel_do_receive() is always valid.
  tproxy: use the interface primary IP address as a default value for --on-ip
  tproxy: added IPv6 support to the socket match
  cxgb3: function namespace cleanup
  tproxy: added IPv6 support to the TPROXY target
  tproxy: added IPv6 socket lookup function to nf_tproxy_core
  be2net: Changes to use only priority codes allowed by f/w
  tproxy: allow non-local binds of IPv6 sockets if IP_TRANSPARENT is enabled
  tproxy: added tproxy sockopt interface in the IPV6 layer
  tproxy: added udp6_lib_lookup function
  tproxy: added const specifiers to udp lookup functions
  tproxy: split off ipv6 defragmentation to a separate module
  l2tp: small cleanup
  nf_nat: restrict ICMP translation for embedded header
  can: mcp251x: fix generation of error frames
  can: mcp251x: fix endless loop in interrupt handler if CANINTF_MERRF is set
  can-raw: add msg_flags to distinguish local traffic
  9p: client code cleanup
  rds: make local functions/variables static
  ...

Fix up conflicts in net/core/dev.c, drivers/net/pcmcia/smc91c92_cs.c and
drivers/net/wireless/ath/ath9k/debug.c as per David
2010-10-23 11:47:02 -07:00
Balazs Scheidler 6ad7889327 tproxy: added IPv6 support to the TPROXY target
This requires a new revision as the old target structure was
IPv4 specific.

Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-10-21 16:17:26 +02:00
Eric Paris 1cc63249ad conntrack: export lsm context rather than internal secid via netlink
The conntrack code can export the internal secid to userspace.  These are
dynamic, can change on lsm changes, and have no meaning in userspace.  We
should instead be sending lsm contexts to userspace instead.  This patch sends
the secctx (rather than secid) to userspace over the netlink socket.  We use a
new field CTA_SECCTX and stop using the the old CTA_SECMARK field since it did
not send particularly useful information.

Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21 10:12:51 +11:00
Eric Paris 2606fd1fa5 secmark: make secmark object handling generic
Right now secmark has lots of direct selinux calls.  Use all LSM calls and
remove all SELinux specific knowledge.  The only SELinux specific knowledge
we leave is the mode.  The only point is to make sure that other LSMs at
least test this generic code before they assume it works.  (They may also
have to make changes if they do not represent labels as strings)

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Paul Moore <paul.moore@hp.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21 10:12:48 +11:00
Pablo Neira Ayuso ebbf41df4a netfilter: ctnetlink: add expectation deletion events
This patch allows to listen to events that inform about
expectations destroyed.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-10-19 10:19:06 +02:00
Jan Engelhardt 75f0a0fd78 netfilter: xtables: unify {ip,ip6,arp}t_error_target
Unification of struct *_error_target was forgotten in
v2.6.16-1689-g1e30a01.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-10-13 18:00:50 +02:00
Simon Horman 001985b2c0 netfilter: nf_conntrack_sip: Add callid parser
Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: Julian Anastasov <ja@ssi.bg>
2010-10-04 22:45:23 +09:00
Pablo Neira Ayuso bc01befdcf netfilter: ctnetlink: add support for user-space expectation helpers
This patch adds the basic infrastructure to support user-space
expectation helpers via ctnetlink and the netfilter queuing
infrastructure NFQUEUE. Basically, this patch:

* adds NF_CT_EXPECT_USERSPACE flag to identify user-space
  created expectations. I have also added a sanity check in
  __nf_ct_expect_check() to avoid that kernel-space helpers
  may create an expectation if the master conntrack has no
  helper assigned.
* adds some branches to check if the master conntrack helper
  exists, otherwise we skip the code that refers to kernel-space
  helper such as the local expectation list and the expectation
  policy.
* allows to set the timeout for user-space expectations with
  no helper assigned.
* a list of expectations created from user-space that depends
  on ctnetlink (if this module is removed, they are deleted).
* includes USERSPACE in the /proc output for expectations
  that have been created by a user-space helper.

This patch also modifies ctnetlink to skip including the helper
name in the Netlink messages if no kernel-space helper is set
(since no user-space expectation has not kernel-space kernel
assigned).

You can access an example user-space FTP conntrack helper at:
http://people.netfilter.org/pablo/userspace-conntrack-helpers/nf-ftp-helper-userspace-POC.tar.bz

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-09-28 21:06:34 +02:00
Pablo Neira Ayuso 8b008faf92 netfilter: ctnetlink: allow to specify the expectation flags
With this patch, you can specify the expectation flags for user-space
created expectations.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-09-22 08:36:59 +02:00
Linus Torvalds d1b113bb02 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (27 commits)
  netfilter: fix CONFIG_COMPAT support
  isdn/avm: fix build when PCMCIA is not enabled
  header: fix broken headers for user space
  e1000e: don't check for alternate MAC addr on parts that don't support it
  e1000e: disable ASPM L1 on 82573
  ll_temac: Fix poll implementation
  netxen: fix a race in netxen_nic_get_stats()
  qlnic: fix a race in qlcnic_get_stats()
  irda: fix a race in irlan_eth_xmit()
  net: sh_eth: remove unused variable
  netxen: update version 4.0.74
  netxen: fix inconsistent lock state
  vlan: Match underlying dev carrier on vlan add
  ibmveth: Fix opps during MTU change on an active device
  ehea: Fix synchronization between HW and SW send queue
  bnx2x: Update bnx2x version to 1.52.53-4
  bnx2x: Fix PHY locking problem
  rds: fix a leak of kernel memory
  netlink: fix compat recvmsg
  netfilter: fix userspace header warning
  ...
2010-08-23 18:30:30 -07:00
David S. Miller 2d4833aae6 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	include/linux/if_pppox.h

Fix conflict between Changli's __packed header file fixes and
the new PPTP driver.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-22 22:37:04 -07:00
Changli Gao 09cd2b99c6 header: fix broken headers for user space
__packed is only defined in kernel space, so we should use
__attribute__((packed)) for the code shared between kernel and user space.

Two __attribute() annotations are replaced with __attribute__() too.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-22 21:15:39 -07:00
David S. Miller d3c6e7ad09 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-08-21 23:32:24 -07:00
Sam Ravnborg e243f5b6de netfilter: fix userspace header warning
"make headers_check" issued the following warning:

  CHECK   include/linux/netfilter (64 files)
usr/include/linux/netfilter/xt_ipvs.h:19: found __[us]{8,16,32,64} type without #include <linux/types.h>

Fix this by as suggested including linux/types.h.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-18 23:34:26 -07:00
Sam Ravnborg 60641aa1f3 include: replace unifdef-y with header-y
unifdef-y and header-y has same semantic.
So there is no need to have both.

Drop the unifdef-y variant and sort all lines again

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2010-08-14 22:26:51 +02:00
Changli Gao 49daf6a226 xt_quota: report initial quota value instead of current value to userspace
We should copy the initial value to userspace for iptables-save and
to allow removal of specific quota rules.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-23 14:07:47 +02:00
Eric Dumazet e8648a1fdb netfilter: add xt_cpu match
In some situations a CPU match permits a better spreading of
connections, or select targets only for a given cpu.

With Remote Packet Steering or multiqueue NIC and appropriate IRQ
affinities, we can distribute trafic on available cpus, per session.
(all RX packets for a given flow is handled by a given cpu)

Some legacy applications being not SMP friendly, one way to scale a
server is to run multiple copies of them.

Instead of randomly choosing an instance, we can use the cpu number as a
key so that softirq handler for a whole instance is running on a single
cpu, maximizing cache effects in TCP/UDP stacks.

Using NAT for example, a four ways machine might run four copies of
server application, using a separate listening port for each instance,
but still presenting an unique external port :

iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 0 \
        -j REDIRECT --to-port 8080

iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 1 \
        -j REDIRECT --to-port 8081

iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 2 \
        -j REDIRECT --to-port 8082

iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 3 \
        -j REDIRECT --to-port 8083

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-23 12:59:36 +02:00
Hannes Eder 9c3e1c3967 netfilter: xt_ipvs (netfilter matcher for IPVS)
This implements the kernel-space side of the netfilter matcher xt_ipvs.

[ minor fixes by Simon Horman <horms@verge.net.au> ]
Signed-off-by: Hannes Eder <heder@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
[ Patrick: added xt_ipvs.h to Kbuild ]
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-23 12:42:58 +02:00
Michael S. Tsirkin 22cb516696 netfilter: correct CHECKSUM header and export it
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-16 14:08:20 +02:00
Michael S. Tsirkin edf0e1fb0d netfilter: add CHECKSUM target
This adds a `CHECKSUM' target, which can be used in the iptables mangle
table.

You can use this target to compute and fill in the checksum in
a packet that lacks a checksum.  This is particularly useful,
if you need to work around old applications such as dhcp clients,
that do not work well with checksum offloads, but don't want to
disable checksum offload in your device.

The problem happens in the field with virtualized applications.
For reference, see Red Hat bz 605555, as well as
http://www.spinics.net/lists/kvm/msg37660.html

Typical expected use (helps old dhclient binary running in a VM):
iptables -A POSTROUTING -t mangle -p udp --dport bootpc \
	-j CHECKSUM --checksum-fill

Includes fixes by Jan Engelhardt <jengelh@medozas.de>

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-15 17:20:46 +02:00
Pablo Neira Ayuso cca5cf91c7 nfnetlink_log: do not expose NFULNL_COPY_DISABLED to user-space
This patch moves NFULNL_COPY_PACKET definition from
linux/netfilter/nfnetlink_log.h to net/netfilter/nfnetlink_log.h
since this copy mode is only for internal use.

I have also changed the value from 0x03 to 0xff. Thus, we avoid
a gap from user-space that may confuse users if we add new
copy modes in the future.

This change was introduced in:
http://www.spinics.net/lists/netfilter-devel/msg13535.html

Since this change is not included in any stable Linux kernel,
I think it's safe to make this change now. Anyway, this copy
mode does not make any sense from user-space, so this patch
should not break any existing setup.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-07-15 11:27:41 +02:00
Patrick McHardy f9181f4ffc Merge branch 'master' of /repos/git/net-next-2.6
Conflicts:
	include/net/netfilter/xt_rateest.h
	net/bridge/br_netfilter.c
	net/netfilter/nf_conntrack_core.c

Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-15 17:31:06 +02:00
Luciano Coelho 0902b469bd netfilter: xtables: idletimer target implementation
This patch implements an idletimer Xtables target that can be used to
identify when interfaces have been idle for a certain period of time.

Timers are identified by labels and are created when a rule is set with a new
label.  The rules also take a timeout value (in seconds) as an option.  If
more than one rule uses the same timer label, the timer will be restarted
whenever any of the rules get a hit.

One entry for each timer is created in sysfs.  This attribute contains the
timer remaining for the timer to expire.  The attributes are located under
the xt_idletimer class:

/sys/class/xt_idletimer/timers/<label>

When the timer expires, the target module sends a sysfs notification to the
userspace, which can then decide what to do (eg. disconnect to save power).

Cc: Timo Teras <timo.teras@iki.fi>
Signed-off-by: Luciano Coelho <luciano.coelho@nokia.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-15 15:04:00 +02:00
Eric Dumazet f5c5440d40 netfilter: nfnetlink_log: RCU conversion, part 2
- must use atomic_inc_not_zero() in instance_lookup_get()

- must use hlist_add_head_rcu() instead of hlist_add_head()

- must use hlist_del_rcu() instead of hlist_del()

- Introduce NFULNL_COPY_DISABLED to stop lockless reader from using an
instance, before we do final instance_put() on it.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-14 16:15:23 +02:00
Eric Dumazet 5bfddbd46a netfilter: nf_conntrack: IPS_UNTRACKED bit
NOTRACK makes all cpus share a cache line on nf_conntrack_untracked
twice per packet. This is bad for performance.
__read_mostly annotation is also a bad choice.

This patch introduces IPS_UNTRACKED bit so that we can use later a
per_cpu untrack structure more easily.

A new helper, nf_ct_untracked_get() returns a pointer to
nf_conntrack_untracked.

Another one, nf_ct_untracked_status_or() is used by nf_nat_init() to add
IPS_NAT_DONE_MASK bits to untracked status.

nf_ct_is_untracked() prototype is changed to work on a nf_conn pointer.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-08 16:09:52 +02:00
Eric Dumazet 7489aec8ee netfilter: xtables: stackptr should be percpu
commit f3c5c1bfd4 (netfilter: xtables: make ip_tables reentrant)
introduced a performance regression, because stackptr array is shared by
all cpus, adding cache line ping pongs. (16 cpus share a 64 bytes cache
line)

Fix this using alloc_percpu()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-By: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-05-31 16:41:35 +02:00
Luciano Coelho 7ea7b858f4 netfilter: fix description of expected checkentry return code on xt_target
The text describing the return codes that are expected on calls to
checkentry() was incorrect.  Instead of returning true or false, or an error
code, it should return 0 or an error code.

Signed-off-by: Luciano Coelho <luciano.coelho@nokia.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-05-20 15:59:16 +02:00
Jan Engelhardt b4ba26119b netfilter: xtables: change hotdrop pointer to direct modification
Since xt_action_param is writable, let's use it. The pointer to
'bool hotdrop' always worried (8 bytes (64-bit) to write 1 byte!).
Surprisingly results in a reduction in size:

   text    data     bss filename
5457066  692730  357892 vmlinux.o-prev
5456554  692730  357892 vmlinux.o

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-05-11 18:35:27 +02:00
Jan Engelhardt 62fc805108 netfilter: xtables: deconstify struct xt_action_param for matches
In future, layer-3 matches will be an xt module of their own, and
need to set the fragoff and thoff fields. Adding more pointers would
needlessy increase memory requirements (esp. so for 64-bit, where
pointers are wider).

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-05-11 18:33:37 +02:00
Jan Engelhardt 4b560b447d netfilter: xtables: substitute temporary defines by final name
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-05-11 18:31:17 +02:00
Jan Engelhardt de74c16996 netfilter: xtables: combine struct xt_match_param and xt_target_param
The structures carried - besides match/target - almost the same data.
It is possible to combine them, as extensions are evaluated serially,
and so, the callers end up a little smaller.

  text  data  bss  filename
-15318   740  104  net/ipv4/netfilter/ip_tables.o
+15286   740  104  net/ipv4/netfilter/ip_tables.o
-15333   540  152  net/ipv6/netfilter/ip6_tables.o
+15269   540  152  net/ipv6/netfilter/ip6_tables.o

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-05-11 18:23:43 +02:00
Jan Engelhardt 4b2cbd42be netfilter: x_tables: rectify XT_FUNCTION_MAXNAMELEN usage
There has been quite a confusion in userspace about
XT_FUNCTION_MAXNAMELEN; because struct xt_entry_match used MAX-1,
userspace would have to do an awkward MAX-2 for maximum length
checking (due to '\0'). This patch adds a new define that matches the
definition of XT_TABLE_MAXNAMELEN - being the size of the actual
struct member, not one off.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-04-27 15:34:34 +02:00
Jesper Dangaard Brouer af740b2c8f netfilter: nf_conntrack: extend with extra stat counter
I suspect an unfortunatly series of events occuring under a DDoS
attack, in function __nf_conntrack_find() nf_contrack_core.c.

Adding a stats counter to see if the search is restarted too often.

Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-04-23 12:34:56 +02:00
Patrick McHardy 6291055465 Merge branch 'master' of /repos/git/net-next-2.6
Conflicts:
	Documentation/feature-removal-schedule.txt
	net/ipv6/netfilter/ip6t_REJECT.c
	net/netfilter/xt_limit.c

Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-04-20 16:02:01 +02:00
Patrick McHardy 22265a5c3c netfilter: xt_TEE: resolve oif using netdevice notifiers
Replace the runtime oif name resolving by netdevice notifier based
resolving. When an oif is given, a netdevice notifier is registered
to resolve the name on NETDEV_REGISTER or NETDEV_CHANGE and unresolve
it again on NETDEV_UNREGISTER or NETDEV_CHANGE to a different name.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-04-20 15:07:32 +02:00
Jan Engelhardt f3c5c1bfd4 netfilter: xtables: make ip_tables reentrant
Currently, the table traverser stores return addresses in the ruleset
itself (struct ip6t_entry->comefrom). This has a well-known drawback:
the jumpstack is overwritten on reentry, making it necessary for
targets to return absolute verdicts. Also, the ruleset (which might
be heavy memory-wise) needs to be replicated for each CPU that can
possibly invoke ip6t_do_table.

This patch decouples the jumpstack from struct ip6t_entry and instead
puts it into xt_table_info. Not being restricted by 'comefrom'
anymore, we can set up a stack as needed. By default, there is room
allocated for two entries into the traverser.

arp_tables is not touched though, because there is just one/two
modules and further patches seek to collapse the table traverser
anyhow.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-04-19 16:05:10 +02:00
Jan Engelhardt e281b19897 netfilter: xtables: inclusion of xt_TEE
xt_TEE can be used to clone and reroute a packet. This can for
example be used to copy traffic at a router for logging purposes
to another dedicated machine.

References: http://www.gossamer-threads.com/lists/iptables/devel/68781
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-04-19 14:17:47 +02:00
Alexey Dobriyan a79ff731a1 netfilter: xtables: make XT_ALIGN() usable in exported headers by exporting __ALIGN_KERNEL()
XT_ALIGN() was rewritten through ALIGN() by commit 42107f5009
"netfilter: xtables: symmetric COMPAT_XT_ALIGN definition".
ALIGN() is not exported in userspace headers, which created compile problem for tc(8)
and will create problem for iptables(8).

We can't export generic looking name ALIGN() but we can export less generic
__ALIGN_KERNEL() (suggested by Ben Hutchings).
Google knows nothing about __ALIGN_KERNEL().

COMPAT_XT_ALIGN() changed for symmetry.

Reported-by: Andreas Henriksson <andreas@fatal.se>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-04-13 11:21:46 +02:00
Jan Engelhardt 135367b8f6 netfilter: xtables: change xt_target.checkentry return type
Restore function signatures from bool to int so that we can report
memory allocation failures or similar using -ENOMEM rather than
always having to pass -EINVAL back.

// <smpl>
@@
type bool;
identifier check, par;
@@
-bool check
+int check
 (struct xt_tgchk_param *par) { ... }
// </smpl>

Minus the change it does to xt_ct_find_proto.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-25 16:04:33 +01:00
Jan Engelhardt b0f38452ff netfilter: xtables: change xt_match.checkentry return type
Restore function signatures from bool to int so that we can report
memory allocation failures or similar using -ENOMEM rather than
always having to pass -EINVAL back.

This semantic patch may not be too precise (checking for functions
that use xt_mtchk_param rather than functions referenced by
xt_match.checkentry), but reviewed, it produced the intended result.

// <smpl>
@@
type bool;
identifier check, par;
@@
-bool check
+int check
 (struct xt_mtchk_param *par) { ... }
// </smpl>

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-25 16:03:13 +01:00
Jan Engelhardt fd0ec0e621 netfilter: xtables: consolidate code into xt_request_find_match
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-25 15:02:19 +01:00
Pablo Neira Ayuso 37b7ef7203 netfilter: ctnetlink: fix reliable event delivery if message building fails
This patch fixes a bug that allows to lose events when reliable
event delivery mode is used, ie. if NETLINK_BROADCAST_SEND_ERROR
and NETLINK_RECV_NO_ENOBUFS socket options are set.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-20 14:29:03 -07:00
Jan Engelhardt f5c511c67a netfilter: xtables: reduce holes in struct xt_target
This will save one full padding chunk (8 bytes on x86_64) per target.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-18 14:20:07 +01:00
Jan Engelhardt 4f948db191 netfilter: xtables: remove almost-unused xt_match_param.data member
This member is taking up a "long" per match, yet is only used by one
module out of the roughly 90 modules, ip6t_hbh. ip6t_hbh can be
restructured a little to accomodate for the lack of the .data member.
This variant uses checking the par->match address, which should avoid
having to add two extra functions, including calls, i.e.

(hbh_mt6: call hbhdst_mt6(skb, par, NEXTHDR_OPT),
dst_mt6: call hbhdst_mt6(skb, par, NEXTHDR_DEST))

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-18 14:20:07 +01:00
Jan Engelhardt 16599786ae netfilter: update documentation fields of x_tables.h
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-18 14:20:07 +01:00
Patrick McHardy e8a96f6938 Merge branch 'master' of ../nf-2.6 2010-03-18 12:55:50 +01:00
Tim Gardner 606a9a0263 netfilter: xt_recent: check for unsupported user space flags
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-03-17 16:18:56 +01:00
Tim Gardner 0079c5aee3 netfilter: xt_recent: add an entry reaper
One of the problems with the way xt_recent is implemented is that
there is no efficient way to remove expired entries. Of course,
one can write a rule '-m recent --remove', but you have to know
beforehand which entry to delete. This commit adds reaper
logic which checks the head of the LRU list when a rule
is invoked that has a '--seconds' value and XT_RECENT_REAP set. If an
entry ceases to accumulate time stamps, then it will eventually bubble
to the top of the LRU list where it is then reaped.

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-03-17 15:53:12 +01:00
Jan Engelhardt b8f00ba27e netfilter: xtables: merge xt_CONNMARK into xt_connmark
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-17 15:48:36 +01:00
Jan Engelhardt 28b949885f netfilter: xtables: merge xt_MARK into xt_mark
Two arguments for combining the two:
- xt_mark is pretty useless without xt_MARK
- the actual code is so small anyway that the kmod metadata and the module
  in its loaded state totally outweighs the combined actual code size.

i586-before:
-rw-r--r-- 1 jengelh users 3821 Feb 10 01:01 xt_MARK.ko
-rw-r--r-- 1 jengelh users 2592 Feb 10 00:04 xt_MARK.o
-rw-r--r-- 1 jengelh users 3274 Feb 10 01:01 xt_mark.ko
-rw-r--r-- 1 jengelh users 2108 Feb 10 00:05 xt_mark.o
   text    data     bss     dec     hex filename
    354     264       0     618     26a xt_MARK.o
    223     176       0     399     18f xt_mark.o
And the runtime size is like 14 KB.

i586-after:
-rw-r--r-- 1 jengelh users 3264 Feb 18 17:28 xt_mark.o

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-03-17 15:48:36 +01:00
Andrea Gelmini 63e690caf2 netfilter: include/linux/netfilter/nf_conntrack_tuple_common.h: Checkpatch cleanup
include/linux/netfilter/nf_conntrack_tuple_common.h:5: ERROR: open brace '{' following enum go on the same line

Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-03-08 13:13:07 +01:00
Jan Engelhardt dcea992aca netfilter: xtables: replace XT_MATCH_ITERATE macro
The macro is replaced by a list.h-like foreach loop. This makes
the code more inspectable.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-24 18:34:48 +01:00
Jan Engelhardt 72b2b1dd77 netfilter: xtables: replace XT_ENTRY_ITERATE macro
The macro is replaced by a list.h-like foreach loop. This makes
the code much more inspectable.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-24 18:32:59 +01:00
Florian Westphal 3e5e524ffb netfilter: CONFIG_COMPAT: allow delta to exceed 32767
with 32 bit userland and 64 bit kernels, it is unlikely but possible
that insertion of new rules fails even tough there are only about 2000
iptables rules.

This happens because the compat delta is using a short int.
Easily reproducible via "iptables -m limit" ; after about 2050
rules inserting new ones fails with -ELOOP.

Note that compat_delta included 2 bytes of padding on x86_64, so
structure size remains the same.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-15 18:17:10 +01:00
Patrick McHardy ef00f89f1e netfilter: ctnetlink: add zone support
Parse and dump the conntrack zone in ctnetlink.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-15 18:14:57 +01:00
Patrick McHardy 5d0aa2ccd4 netfilter: nf_conntrack: add support for "conntrack zones"
Normally, each connection needs a unique identity. Conntrack zones allow
to specify a numerical zone using the CT target, connections in different
zones can use the same identity.

Example:

iptables -t raw -A PREROUTING -i veth0 -j CT --zone 1
iptables -t raw -A OUTPUT -o veth1 -j CT --zone 1

Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-15 18:13:33 +01:00
Jan Engelhardt 739674fb7f netfilter: xtables: constify args in compat copying functions
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-02-15 16:59:28 +01:00
Patrick McHardy 9d288dffe3 netfilter: nf_conntrack_sip: add T.38 FAX support
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-11 12:30:21 +01:00
Patrick McHardy 48f8ac2653 netfilter: nf_nat_sip: add TCP support
Add support for mangling TCP SIP packets.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-11 12:29:38 +01:00
Patrick McHardy f5b321bd37 netfilter: nf_conntrack_sip: add TCP support
Add TCP support, which is mandated by RFC3261 for all SIP elements.

SIP over TCP is similar to UDP, except that messages are delimited
by Content-Length: headers and multiple messages may appear in one
packet.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-11 12:26:19 +01:00
Patrick McHardy 3b6b9fab42 netfilter: nf_conntrack_sip: pass data offset to NAT functions
When using TCP multiple SIP messages might be present in a single packet.
A following patch will parse them by setting the dptr to the beginning of
each message. The NAT helper needs to reload the dptr value after mangling
the packet however, so it needs to know the offset of the message to the
beginning of the packet.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-11 12:23:53 +01:00
Patrick McHardy a8c28d0515 Merge branch 'master' of git://dev.medozas.de/linux 2010-02-10 17:56:46 +01:00
Jan Engelhardt 2b95efe7f6 netfilter: xtables: use xt_table for hook instantiation
The respective xt_table structures already have most of the metadata
needed for hook setup. Add a 'priority' field to struct xt_table so
that xt_hook_link() can be called with a reduced number of arguments.

So should we be having more tables in the future, it comes at no
static cost (only runtime, as before) - space saved:
6807373->6806555.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-02-10 17:13:33 +01:00
Alexey Dobriyan 42107f5009 netfilter: xtables: symmetric COMPAT_XT_ALIGN definition
Rewrite COMPAT_XT_ALIGN in terms of dummy structure hack.
Compat counters logically have nothing to do with it.
Use ALIGN() macro while I'm at it for same types.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-10 15:03:27 +01:00
Alexey Dobriyan 0a02604628 netfilter: xtables: consistent struct compat_xt_counters definition
There is compat_u64 type which deals with different u64 type alignment
on different compat-capable platforms, so use it and removed some
hardcoded assumptions.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-10 15:00:32 +01:00
Patrick McHardy 84f3bb9ae9 netfilter: xtables: add CT target
Add a new target for the raw table, which can be used to specify conntrack
parameters for specific connections, f.i. the conntrack helper.

The target attaches a "template" connection tracking entry to the skb, which
is used by the conntrack core when initializing a new conntrack.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-03 17:17:06 +01:00
Patrick McHardy b2a15a604d netfilter: nf_conntrack: support conntrack templates
Support initializing selected parameters of new conntrack entries from a
"conntrack template", which is a specially marked conntrack entry attached
to the skb.

Currently the helper and the event delivery masks can be initialized this
way.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-03 14:40:17 +01:00
Patrick McHardy 0cebe4b416 netfilter: ctnetlink: support selective event delivery
Add two masks for conntrack end expectation events to struct nf_conntrack_ecache
and use them to filter events. Their default value is "all events" when the
event sysctl is on and "no events" when it is off. A following patch will add
specific initializations. Expectation events depend on the ecache struct of
their master conntrack.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-03 13:51:51 +01:00
Patrick McHardy add6746124 netfilter: add struct net * to target parameters
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-03 13:45:12 +01:00
Alexey Dobriyan c30f540b63 netfilter: xtables: CONFIG_COMPAT redux
Ifdef out
	struct nf_sockopt_ops::compat_set
	struct nf_sockopt_ops::compat_get
	struct xt_match::compat_from_user
	struct xt_match::compat_to_user
	struct xt_match::compatsize
to make structures smaller on COMPAT=n kernels.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-02 15:03:24 +01:00
Alexey Dobriyan f54e9367f8 netfilter: xtables: add struct xt_mtdtor_param::net
Add ->net to match destructor list like ->net in constructor list.

Make sure it's set in ebtables/iptables/ip6tables, this requires to
propagate netns up to *_unregister_table().

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-01-18 08:25:47 +01:00
Alexey Dobriyan a83d8e8d09 netfilter: xtables: add struct xt_mtchk_param::net
Some complex match modules (like xt_hashlimit/xt_recent) want netns
information at constructor and destructor time. We propably can play
games at match destruction time, because netns can be passed in object,
but I think it's cleaner to explicitly pass netns.

Add ->net, make sure it's set from ebtables/iptables/ip6tables code.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-01-18 08:21:13 +01:00
Alexey Dobriyan cd8c20b650 netfilter: nfnetlink: netns support
Make nfnl socket per-petns.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-01-13 16:02:14 +01:00
David S. Miller 424eff9751 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6 2009-12-03 13:23:12 -08:00
Pablo Neira Ayuso c4832c7bbc netfilter: nf_ct_tcp: improve out-of-sync situation in TCP tracking
Without this patch, if we receive a SYN packet from the client while
the firewall is out-of-sync, we let it go through. Then, if we see
the SYN/ACK reply coming from the server, we destroy the conntrack
entry and drop the packet to trigger a new retransmission. Then,
the retransmision from the client is used to start a new clean
session.

This patch improves the current handling. Basically, if we see an
unexpected SYN packet, we annotate the TCP options. Then, if we
see the reply SYN/ACK, this means that the firewall was indeed
out-of-sync. Therefore, we set a clean new session from the existing
entry based on the annotated values.

This patch adds two new 8-bits fields that fit in a 16-bits gap of
the ip_ct_tcp structure.

This patch is particularly useful for conntrackd since the
asynchronous nature of the state-synchronization allows to have
backup nodes that are not perfect copies of the master. This helps
to improve the recovery under some worst-case scenarios.

I have tested this by creating lots of conntrack entries in wrong
state:

for ((i=1024;i<65535;i++)); do conntrack -I -p tcp -s 192.168.2.101 -d 192.168.2.2 --sport $i --dport 80 -t 800 --state ESTABLISHED -u ASSURED,SEEN_REPLY; done

Then, I make some TCP connections:

$ echo GET / | nc 192.168.2.2 80

The events show the result:

 [UPDATE] tcp      6 60 SYN_RECV src=192.168.2.101 dst=192.168.2.2 sport=33220 dport=80 src=192.168.2.2 dst=192.168.2.101 sport=80 dport=33220 [ASSURED]
 [UPDATE] tcp      6 432000 ESTABLISHED src=192.168.2.101 dst=192.168.2.2 sport=33220 dport=80 src=192.168.2.2 dst=192.168.2.101 sport=80 dport=33220 [ASSURED]
 [UPDATE] tcp      6 120 FIN_WAIT src=192.168.2.101 dst=192.168.2.2 sport=33220 dport=80 src=192.168.2.2 dst=192.168.2.101 sport=80 dport=33220 [ASSURED]
 [UPDATE] tcp      6 30 LAST_ACK src=192.168.2.101 dst=192.168.2.2 sport=33220 dport=80 src=192.168.2.2 dst=192.168.2.101 sport=80 dport=33220 [ASSURED]
 [UPDATE] tcp      6 120 TIME_WAIT src=192.168.2.101 dst=192.168.2.2 sport=33220 dport=80 src=192.168.2.2 dst=192.168.2.101 sport=80 dport=33220 [ASSURED]

and tcpdump shows no retransmissions:

20:47:57.271951 IP 192.168.2.101.33221 > 192.168.2.2.www: S 435402517:435402517(0) win 5840 <mss 1460,sackOK,timestamp 4294961827 0,nop,wscale 6>
20:47:57.273538 IP 192.168.2.2.www > 192.168.2.101.33221: S 3509927945:3509927945(0) ack 435402518 win 5792 <mss 1460,sackOK,timestamp 235681024 4294961827,nop,wscale 4>
20:47:57.273608 IP 192.168.2.101.33221 > 192.168.2.2.www: . ack 3509927946 win 92 <nop,nop,timestamp 4294961827 235681024>
20:47:57.273693 IP 192.168.2.101.33221 > 192.168.2.2.www: P 435402518:435402524(6) ack 3509927946 win 92 <nop,nop,timestamp 4294961827 235681024>
20:47:57.275492 IP 192.168.2.2.www > 192.168.2.101.33221: . ack 435402524 win 362 <nop,nop,timestamp 235681024 4294961827>
20:47:57.276492 IP 192.168.2.2.www > 192.168.2.101.33221: P 3509927946:3509928082(136) ack 435402524 win 362 <nop,nop,timestamp 235681025 4294961827>
20:47:57.276515 IP 192.168.2.101.33221 > 192.168.2.2.www: . ack 3509928082 win 108 <nop,nop,timestamp 4294961828 235681025>
20:47:57.276521 IP 192.168.2.2.www > 192.168.2.101.33221: F 3509928082:3509928082(0) ack 435402524 win 362 <nop,nop,timestamp 235681025 4294961827>
20:47:57.277369 IP 192.168.2.101.33221 > 192.168.2.2.www: F 435402524:435402524(0) ack 3509928083 win 108 <nop,nop,timestamp 4294961828 235681025>
20:47:57.279491 IP 192.168.2.2.www > 192.168.2.101.33221: . ack 435402525 win 362 <nop,nop,timestamp 235681025 4294961828>

I also added a rule to log invalid packets, with no occurrences  :-) .

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-11-23 10:37:34 +01:00
Eric Dumazet d94d9fee9f net: cleanup include/linux
This cleanup patch puts struct/union/enum opening braces,
in first line to ease grep games.

struct something
{

becomes :

struct something {

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-04 09:50:58 -08:00
Patrick McHardy 3993832464 netfilter: nfnetlink: constify message attributes and headers
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-08-25 16:07:58 +02:00
Jan Engelhardt 35aad0ffdf netfilter: xtables: mark initial tables constant
The inputted table is never modified, so should be considered const.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-08-24 14:56:30 +02:00
Jan Engelhardt 98d89b4198 netfilter: xtables: realign struct xt_target_param
This commit gets rid of a padding hole as reported by pahole(1).
Saves 8 bytes on x86_64.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2009-08-10 13:35:24 +02:00
Jan Engelhardt 4725c7287e netfilter: xtables: remove xt_mark v0
Superseded by xt_mark v1 (v2.6.24-2922-g17b0d7e).

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2009-08-10 13:09:45 +02:00
Jan Engelhardt 9e05ec4b18 netfilter: xtables: remove xt_conntrack v0
Superseded by xt_conntrack v1 (v2.6.24-2921-g64eb12f).

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2009-08-10 13:09:44 +02:00
Jan Engelhardt 84899a2b9a netfilter: xtables: remove xt_connmark v0
Superseded by xt_connmark v1 (v2.6.24-2919-g96e3227).

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2009-08-10 12:25:12 +02:00
Jan Engelhardt c8001f7fd5 netfilter: xtables: remove xt_MARK v0, v1
Superseded by xt_MARK v2 (v2.6.24-2918-ge0a812a).

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2009-08-10 12:25:12 +02:00
Jan Engelhardt e973a70ca0 netfilter: xtables: remove xt_CONNMARK v0
Superseded by xt_CONNMARK v1 (v2.6.24-2917-g0dc8c76).

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2009-08-10 12:25:11 +02:00
Jan Engelhardt d6d3f08b0f netfilter: xtables: conntrack match revision 2
As reported by Philip, the UNTRACKED state bit does not fit within
the 8-bit state_mask member. Enlarge state_mask and give status_mask
a few more bits too.

Reported-by: Philip Craig <philipc@snapgear.com>
References: http://markmail.org/thread/b7eg6aovfh4agyz7
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-06-29 14:31:46 +02:00
Jaswinder Singh Rajput 8a3af79361 netfilter: headers_check fix: linux/netfilter/xt_osf.h
fix the following 'make headers_check' warnings:

  usr/include/linux/netfilter/xt_osf.h:40: found __[us]{8,16,32,64} type without #include <linux/types.h>

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-06-29 14:28:27 +02:00
Patrick McHardy 36432dae73 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 2009-06-11 16:00:49 +02:00
Laszlo Attila Toth a31e1ffd22 netfilter: xt_socket: added new revision of the 'socket' match supporting flags
If the XT_SOCKET_TRANSPARENT flag is set, enabled 'transparent'
socket option is required for the socket to be matched.

Signed-off-by: Laszlo Attila Toth <panther@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-06-09 15:16:34 +02:00
Evgeniy Polyakov 11eeef41d5 netfilter: passive OS fingerprint xtables match
Passive OS fingerprinting netfilter module allows to passively detect
remote OS and perform various netfilter actions based on that knowledge.
This module compares some data (WS, MSS, options and it's order, ttl, df
and others) from packets with SYN bit set with dynamically loaded OS
fingerprints.

Fingerprint matching rules can be downloaded from OpenBSD source tree
or found in archive and loaded via netfilter netlink subsystem into
the kernel via special util found in archive.

Archive contains library file (also attached), which was shipped
with iptables extensions some time ago (at least when ipt_osf existed
in patch-o-matic).

Following changes were made in this release:
 * added NLM_F_CREATE/NLM_F_EXCL checks
 * dropped _rcu list traversing helpers in the protected add/remove calls
 * dropped unneded structures, debug prints, obscure comment and check

Fingerprints can be downloaded from
http://www.openbsd.org/cgi-bin/cvsweb/src/etc/pf.os
or can be found in archive

Example usage:
-d switch removes fingerprints

Please consider for inclusion.
Thank you.

Passive OS fingerprint homepage (archives, examples):
http://www.ioremap.net/projects/osf

Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-06-08 17:01:51 +02:00
Florian Westphal 10662aa308 netfilter: xt_NFQUEUE: queue balancing support
Adds support for specifying a range of queues instead of a single queue
id. Flows will be distributed across the given range.

This is useful for multicore systems: Instead of having a single
application read packets from a queue, start multiple
instances on queues x, x+1, .. x+n. Each instance can process
flows independently.

Packets for the same connection are put into the same queue.

Signed-off-by: Holger Eitzenberger <heitzenberger@astaro.com>
Signed-off-by: Florian Westphal <fwestphal@astaro.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-06-05 13:24:24 +02:00
Evgeniy Polyakov a5e7882096 netfilter: x_tables: added hook number into match extension parameter structure.
Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-06-04 16:54:42 +02:00
Pablo Neira Ayuso e34d5c1a4f netfilter: conntrack: replace notify chain by function pointer
This patch removes the notify chain infrastructure and replace it
by a simple function pointer. This issue has been mentioned in the
mailing list several times: the use of the notify chain adds
too much overhead for something that is only used by ctnetlink.

This patch also changes nfnetlink_send(). It seems that gfp_any()
returns GFP_KERNEL for user-context request, like those via
ctnetlink, inside the RCU read-side section which is not valid.
Using GFP_KERNEL is also evil since netlink may schedule(),
this leads to "scheduling while atomic" bug reports.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2009-06-03 10:32:06 +02:00
Pablo Neira Ayuso 6bfea1984a netfilter: conntrack: remove events flags from userspace exposed file
This patch moves the event flags from linux/netfilter/nf_conntrack_common.h
to net/netfilter/nf_conntrack_ecache.h. This flags are not of any use
from userspace.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2009-06-02 20:08:44 +02:00
Jozsef Kadlecsik 874ab9233e netfilter: nf_ct_tcp: TCP simultaneous open support
The patch below adds supporting TCP simultaneous open to conntrack. The
unused LISTEN state is replaced by a new state (SYN_SENT2) denoting the
second SYN sent from the reply direction in the new case. The state table
is updated and the function tcp_in_window is modified to handle
simultaneous open.

The functionality can fairly easily be tested by socat. A sample tcpdump
recording

23:21:34.244733 IP (tos 0x0, ttl 64, id 49224, offset 0, flags [DF], proto TCP (6), length 60) 192.168.0.254.2020 > 192.168.0.1.2020: S, cksum 0xe75f (correct), 3383710133:3383710133(0) win 5840 <mss 1460,sackOK,timestamp 173445629 0,nop,wscale 7>
23:21:34.244783 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40) 192.168.0.1.2020 > 192.168.0.254.2020: R, cksum 0x0253 (correct), 0:0(0) ack 3383710134 win 0
23:21:36.038680 IP (tos 0x0, ttl 64, id 28092, offset 0, flags [DF], proto TCP (6), length 60) 192.168.0.1.2020 > 192.168.0.254.2020: S, cksum 0x704b (correct), 2634546729:2634546729(0) win 5840 <mss 1460,sackOK,timestamp 824213 0,nop,wscale 1>
23:21:36.038777 IP (tos 0x0, ttl 64, id 49225, offset 0, flags [DF], proto TCP (6), length 60) 192.168.0.254.2020 > 192.168.0.1.2020: S, cksum 0xb179 (correct), 3383710133:3383710133(0) ack 2634546730 win 5840 <mss 1460,sackOK,timestamp 173447423 824213,nop,wscale 7>
23:21:36.038847 IP (tos 0x0, ttl 64, id 28093, offset 0, flags [DF], proto TCP (6), length 52) 192.168.0.1.2020 > 192.168.0.254.2020: ., cksum 0xebad (correct), ack 3383710134 win 2920 <nop,nop,timestamp 824213 173447423>

and the corresponding netlink events:

    [NEW] tcp      6 120 SYN_SENT src=192.168.0.254 dst=192.168.0.1 sport=2020 dport=2020 [UNREPLIED] src=192.168.0.1 dst=192.168.0.254 sport=2020 dport=2020
 [UPDATE] tcp      6 120 LISTEN src=192.168.0.254 dst=192.168.0.1 sport=2020 dport=2020 src=192.168.0.1 dst=192.168.0.254 sport=2020 dport=2020
 [UPDATE] tcp      6 60 SYN_RECV src=192.168.0.254 dst=192.168.0.1 sport=2020 dport=2020 src=192.168.0.1 dst=192.168.0.254 sport=2020 dport=2020
 [UPDATE] tcp      6 432000 ESTABLISHED src=192.168.0.254 dst=192.168.0.1 sport=2020 dport=2020 src=192.168.0.1 dst=192.168.0.254 sport=2020 dport=2020 [ASSURED]

The RST packet was dropped in the raw table, thus it did not reach
conntrack.  nfnetlink_conntrack is unpatched so it shows the new SYN_SENT2
state as the old unused LISTEN.

With TCP simultaneous open support we satisfy REQ-2 in RFC 5382  ;-) .

Additional minor correction in this patch is that in order to catch
uninitialized reply directions, "td_maxwin == 0" is used instead of
"td_end == 0" because the former can't be true except in uninitialized
state while td_end may accidentally be equal to zero in the mid of a
connection.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-06-02 13:58:56 +02:00
David S. Miller 4d3383d0ad Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6 2009-05-27 15:51:25 -07:00
Pablo Neira Ayuso a17c859849 netfilter: conntrack: add support for DCCP handshake sequence to ctnetlink
This patch adds CTA_PROTOINFO_DCCP_HANDSHAKE_SEQ that exposes
the u64 handshake sequence number to user-space.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-05-27 17:50:35 +02:00
Jozsef Kadlecsik bfcaa50270 netfilter: nf_ct_tcp: fix accepting invalid RST segments
Robert L Mathews discovered that some clients send evil TCP RST segments,
which are accepted by netfilter conntrack but discarded by the
destination. Thus the conntrack entry is destroyed but the destination
retransmits data until timeout.

The same technique, i.e. sending properly crafted RST segments, can easily
be used to bypass connlimit/connbytes based restrictions (the sample
script written by Robert can be found in the netfilter mailing list
archives).

The patch below adds a new flag and new field to struct ip_ct_tcp_state so
that checking RST segments can be made more strict and thus TCP conntrack
can catch the invalid ones: the RST segment is accepted only if its
sequence number higher than or equal to the highest ack we seen from the
other direction. (The last_ack field cannot be reused because it is used
to catch resent packets.)

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-05-25 17:23:15 +02:00
David S. Miller 356d6c2d55 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6 2009-05-05 12:00:53 -07:00
Pablo Neira Ayuso 280f37afa2 netfilter: xt_cluster: fix use of cluster match with 32 nodes
This patch fixes a problem when you use 32 nodes in the cluster
match:

% iptables -I PREROUTING -t mangle -i eth0 -m cluster \
  --cluster-total-nodes  32  --cluster-local-node  32 \
  --cluster-hash-seed 0xdeadbeef -j MARK --set-mark 0xffff
iptables: Invalid argument. Run `dmesg' for more information.
% dmesg | tail -1
xt_cluster: this node mask cannot be higher than the total number of nodes

The problem is related to this checking:

if (info->node_mask >= (1 << info->total_nodes)) {
	printk(KERN_ERR "xt_cluster: this node mask cannot be "
			"higher than the total number of nodes\n");
	return false;
}

(1 << 32) is 1. Thus, the checking fails.

BTW, I said this before but I insist: I have only tested the cluster
match with 2 nodes getting ~45% extra performance in an active-active setup.
The maximum limit of 32 nodes is still completely arbitrary. I'd really
appreciate if people that have more nodes in their setups let me know.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-05-05 17:46:07 +02:00
Patrick McHardy a7ca7fccac netfilter: add missing linux/types.h include to xt_LED.h
Pointed out by Dave Miller:

  CHECK   include/linux/netfilter (57 files)
/home/davem/src/GIT/net-2.6/usr/include/linux/netfilter/xt_LED.h:6: found __[us]{8,16,32,64} type without #include <linux/types.h>

Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-05-05 14:31:12 +02:00
Eric Dumazet 0f3d042ed2 netfilter: use likely() in xt_info_rdlock_bh()
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-01 09:10:46 -07:00
Stephen Hemminger 942e4a2bd6 netfilter: revised locking for x_tables
The x_tables are organized with a table structure and a per-cpu copies
of the counters and rules. On older kernels there was a reader/writer 
lock per table which was a performance bottleneck. In 2.6.30-rc, this
was converted to use RCU and the counters/rules which solved the performance
problems for do_table but made replacing rules much slower because of
the necessary RCU grace period.

This version uses a per-cpu set of spinlocks and counters to allow to
table processing to proceed without the cache thrashing of a global
reader lock and keeps the same performance for table updates.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-28 22:36:33 -07:00
Pablo Neira Ayuso 71951b64a5 netfilter: nf_ct_dccp: add missing role attributes for DCCP
This patch adds missing role attribute to the DCCP type, otherwise
the creation of entries is not of any use.

The attribute added is CTA_PROTOINFO_DCCP_ROLE which contains the
role of the conntrack original tuple.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-04-24 16:58:41 +02:00
David S. Miller 01e6de64d9 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6 2009-03-26 22:45:23 -07:00
Linus Torvalds ba1eb95cf3 Merge branch 'header-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'header-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (50 commits)
  x86: headers cleanup - setup.h
  emu101k1.h: fix duplicate include of <linux/types.h>
  compiler-gcc4: conditionalize #error on __KERNEL__
  remove __KERNEL_STRICT_NAMES
  make netfilter use strict integer types
  make drm headers use strict integer types
  make MTD headers use strict integer types
  make most exported headers use strict integer types
  make exported headers use strict posix types
  unconditionally include asm/types.h from linux/types.h
  make linux/types.h as assembly safe
  Neither asm/types.h nor linux/types.h is required for arch/ia64/include/asm/fpu.h
  headers_check fix cleanup: linux/reiserfs_fs.h
  headers_check fix cleanup: linux/nubus.h
  headers_check fix cleanup: linux/coda_psdev.h
  headers_check fix: x86, setup.h
  headers_check fix: x86, prctl.h
  headers_check fix: linux/reinserfs_fs.h
  headers_check fix: linux/socket.h
  headers_check fix: linux/nubus.h
  ...

Manually fix trivial conflicts in:
	include/linux/netfilter/xt_limit.h
	include/linux/netfilter/xt_statistic.h
2009-03-26 16:11:41 -07:00
Ingo Molnar 5a54bd1307 Merge commit 'v2.6.29' into core/header-fixes 2009-03-26 18:29:40 +01:00
Arnd Bergmann 60c195c729 make netfilter use strict integer types
Netfilter traditionally uses BSD integer types in its
interface headers. This changes it to use the Linux
strict integer types, like everyone else.

Cc: netfilter-devel@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-26 18:14:20 +01:00
Eric Dumazet b8dfe49877 netfilter: factorize ifname_compare()
We use same not trivial helper function in four places. We can factorize it.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-03-25 17:31:52 +01:00
David S. Miller b5bb14386e Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6 2009-03-24 13:24:36 -07:00
Pablo Neira Ayuso dd5b6ce6fd nefilter: nfnetlink: add nfnetlink_set_err and use it in ctnetlink
This patch adds nfnetlink_set_err() to propagate the error to netlink
broadcast listener in case of memory allocation errors in the
message building.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-03-23 13:21:06 +01:00
Pablo Neira Ayuso 0269ea4937 netfilter: xtables: add cluster match
This patch adds the iptables cluster match. This match can be used
to deploy gateway and back-end load-sharing clusters. The cluster
can be composed of 32 nodes maximum (although I have only tested
this with two nodes, so I cannot tell what is the real scalability
limit of this solution in terms of cluster nodes).

Assuming that all the nodes see all packets (see below for an
example on how to do that if your switch does not allow this), the
cluster match decides if this node has to handle a packet given:

	(jhash(source IP) % total_nodes) & node_mask

For related connections, the master conntrack is used. The following
is an example of its use to deploy a gateway cluster composed of two
nodes (where this is the node 1):

iptables -I PREROUTING -t mangle -i eth1 -m cluster \
	--cluster-total-nodes 2 --cluster-local-node 1 \
	--cluster-proc-name eth1 -j MARK --set-mark 0xffff
iptables -A PREROUTING -t mangle -i eth1 \
	-m mark ! --mark 0xffff -j DROP
iptables -A PREROUTING -t mangle -i eth2 -m cluster \
	--cluster-total-nodes 2 --cluster-local-node 1 \
	--cluster-proc-name eth2 -j MARK --set-mark 0xffff
iptables -A PREROUTING -t mangle -i eth2 \
	-m mark ! --mark 0xffff -j DROP

And the following commands to make all nodes see the same packets:

ip maddr add 01:00:5e:00:01:01 dev eth1
ip maddr add 01:00:5e:00:01:02 dev eth2
arptables -I OUTPUT -o eth1 --h-length 6 \
	-j mangle --mangle-mac-s 01:00:5e:00:01:01
arptables -I INPUT -i eth1 --h-length 6 \
	--destination-mac 01:00:5e:00:01:01 \
	-j mangle --mangle-mac-d 00:zz:yy:xx:5a:27
arptables -I OUTPUT -o eth2 --h-length 6 \
	-j mangle --mangle-mac-s 01:00:5e:00:01:02
arptables -I INPUT -i eth2 --h-length 6 \
	--destination-mac 01:00:5e:00:01:02 \
	-j mangle --mangle-mac-d 00:zz:yy:xx:5a:27

In the case of TCP connections, pickup facility has to be disabled
to avoid marking TCP ACK packets coming in the reply direction as
valid.

echo 0 > /proc/sys/net/netfilter/nf_conntrack_tcp_loose

BTW, some final notes:

 * This match mangles the skbuff pkt_type in case that it detects
PACKET_MULTICAST for a non-multicast address. This may be done in
a PKTTYPE target for this sole purpose.
 * This match supersedes the CLUSTERIP target.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-03-16 17:10:36 +01:00
Jan Engelhardt acc738fec0 netfilter: xtables: avoid pointer to self
Commit 784544739a (netfilter: iptables:
lock free counters) broke a number of modules whose rule data referenced
itself. A reallocation would not reestablish the correct references, so
it is best to use a separate struct that does not fall under RCU.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-03-16 15:35:29 +01:00
Jan Engelhardt d060ffc184 netfilter: install missing headers
iptables imports headers from (the unifdefed headers of a)
kernel tree, but some headers happened to not be installed.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-02-24 15:23:58 +01:00
Adam Nielsen 268cb38e18 netfilter: x_tables: add LED trigger target
Kernel module providing implementation of LED netfilter target.  Each
instance of the target appears as a led-trigger device, which can be
associated with one or more LEDs in /sys/class/leds/

Signed-off-by: Adam Nielsen <a.nielsen@shikadi.net>
Acked-by: Richard Purdie <rpurdie@linux.intel.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-02-20 10:55:14 +01:00
Stephen Hemminger 784544739a netfilter: iptables: lock free counters
The reader/writer lock in ip_tables is acquired in the critical path of
processing packets and is one of the reasons just loading iptables can cause
a 20% performance loss. The rwlock serves two functions:

1) it prevents changes to table state (xt_replace) while table is in use.
   This is now handled by doing rcu on the xt_table. When table is
   replaced, the new table(s) are put in and the old one table(s) are freed
   after RCU period.

2) it provides synchronization when accesing the counter values.
   This is now handled by swapping in new table_info entries for each cpu
   then summing the old values, and putting the result back onto one
   cpu.  On a busy system it may cause sampling to occur at different
   times on each cpu, but no packet/byte counts are lost in the process.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

Sucessfully tested on my dual quad core machine too, but iptables only (no ipv6 here)
BTW, my new "tbench 8" result is 2450 MB/s, (it was 2150 MB/s not so long ago)

Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-02-20 10:35:32 +01:00
Stephen Hemminger 4a2f965ca5 netfilter: x_tables: change elements in x_tables
Change to proper type on private pointer rather than anonymous void.
Keep active elements on same cache line.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-02-18 16:29:44 +01:00
Eric Leblond 5ca431f9ae netfilter: nfnetlink_log: fix per-rule qthreshold override
In NFLOG the per-rule qthreshold should overrides per-instance only
it is set. With current code, the per-rule qthreshold is 1 if not set
and it overrides the per-instance qthreshold.

This patch modifies the default xt_NFLOG threshold from 1 to
0. Thus a value of 0 means there is no per-rule setting and the instance
parameter has to apply.

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-02-18 15:29:23 +01:00
Jaswinder Singh Rajput 9df27bab62 headers_check fix: netfilter/xt_conntrack.h
fix the following 'make headers_check' warning:

  usr/include/linux/netfilter/xt_conntrack.h:40: found __[us]{8,16,32,64} type without #include <linux/types.h>

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
2009-01-30 20:00:47 +05:30
Richard Kennedy daaf83d2b9 netfilter 09/09: remove padding from struct xt_match on 64bit builds
reorder struct xt_match to remove 8 bytes of padding and make its size
128 bytes.

This saves a small amount of data space in each of the xt netfilter
modules and fits xt_match in one 128 byte cache line.

Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-12 21:18:37 -08:00
Pablo Neira Ayuso 092cab7e2c netfilter: ctnetlink: fix missing CTA_NAT_SEQ_UNSPEC
This patch fixes an inconsistency in nfnetlink_conntrack.h that
I introduced myself. The problem is that CTA_NAT_SEQ_UNSPEC is
missing from enum ctattr_natseq. This inconsistency may lead to
problems in the message parsing in userspace (if the message
contains the CTA_NAT_SEQ_* attributes, of course).

This patch breaks backward compatibility, however, the only known
client of this code is libnetfilter_conntrack which indeed crashes
because it assumes the existence of CTA_NAT_SEQ_UNSPEC to do
the parsing.

The CTA_NAT_SEQ_* attributes were introduced in 2.6.25.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-12-16 01:19:41 -08:00
Jan Engelhardt f79fca55f9 netfilter: xtables: add missing const qualifier to xt_tgchk_param
When entryinfo was a standalone parameter to functions, it used to be
"const void *". Put the const back in.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-24 16:06:17 -08:00
Pablo Neira Ayuso e6a7d3c04f netfilter: ctnetlink: remove bogus module dependency between ctnetlink and nf_nat
This patch removes the module dependency between ctnetlink and
nf_nat by means of an indirect call that is initialized when
nf_nat is loaded. Now, nf_conntrack_netlink only requires
nf_conntrack and nfnetlink.

This patch puts nfnetlink_parse_nat_setup_hook into the
nf_conntrack_core to avoid dependencies between ctnetlink,
nf_conntrack_ipv4 and nf_conntrack_ipv6.

This patch also introduces the function ctnetlink_change_nat
that is only invoked from the creation path. Actually, the
nat handling cannot be invoked from the update path since
this is not allowed. By introducing this function, we remove
the useless nat handling in the update path and we avoid
deadlock-prone code.

This patch also adds the required EAGAIN logic for nfnetlink.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-14 11:58:31 -07:00
Jan Engelhardt 916a917dfe netfilter: xtables: provide invoked family value to extensions
By passing in the family through which extensions were invoked, a bit
of data space can be reclaimed. The "family" member will be added to
the parameter structures and the check functions be adjusted.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08 11:35:20 +02:00
Jan Engelhardt a2df1648ba netfilter: xtables: move extension arguments into compound structure (6/6)
This patch does this for target extensions' destroy functions.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08 11:35:19 +02:00
Jan Engelhardt af5d6dc200 netfilter: xtables: move extension arguments into compound structure (5/6)
This patch does this for target extensions' checkentry functions.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08 11:35:19 +02:00
Jan Engelhardt 7eb3558655 netfilter: xtables: move extension arguments into compound structure (4/6)
This patch does this for target extensions' target functions.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08 11:35:19 +02:00
Jan Engelhardt 6be3d8598e netfilter: xtables: move extension arguments into compound structure (3/6)
This patch does this for match extensions' destroy functions.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08 11:35:19 +02:00
Jan Engelhardt 9b4fce7a35 netfilter: xtables: move extension arguments into compound structure (2/6)
This patch does this for match extensions' checkentry functions.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-10-08 11:35:18 +02:00