dect
/
linux-2.6
Archived
13
0
Fork 0
Commit Graph

694 Commits

Author SHA1 Message Date
Adrian Bunk 90d41122f7 [IPV6] ip6_fib.c: make code static
Make the following needlessly global code static:
- fib6_walker_lock
- struct fib6_walker_list
- fib6_walk_continue()
- fib6_walk()

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:54:38 -07:00
Patrick McHardy abcab26830 [DECNET]: Increase number of possible routing tables to 2^32
Increase the number of possible routing tables to 2^32 by replacing the
fixed sized array of pointers by a hash table and replacing iterations
over all possible table IDs by hash table walking.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:54:28 -07:00
Patrick McHardy 1b43af5480 [IPV6]: Increase number of possible routing tables to 2^32
Increase number of possible routing tables to 2^32 by replacing iterations
over all possible table IDs by hash table walking.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:54:27 -07:00
Patrick McHardy 1af5a8c4a1 [IPV4]: Increase number of possible routing tables to 2^32
Increase the number of possible routing tables to 2^32 by replacing the
fixed sized array of pointers by a hash table and replacing iterations
over all possible table IDs by hash table walking.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:54:26 -07:00
Patrick McHardy 9e762a4a89 [NET]: Introduce RTA_TABLE/FRA_TABLE attributes
Introduce RTA_TABLE route attribute and FRA_TABLE routing rule attribute
to hold 32 bit routing table IDs. Usespace compatibility is provided by
continuing to accept and send the rtm_table field, but because of its
limited size it can only carry the low 8 bits of the table ID. This
implies that if larger IDs are used, _all_ userspace programs using them
need to use RTA_TABLE.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:54:25 -07:00
Patrick McHardy 2dfe55b47e [NET]: Use u32 for routing table IDs
Use u32 for routing table IDs in net/ipv4 and net/decnet in preparation of
support for a larger number of routing tables. net/ipv6 already uses u32
everywhere and needs no further changes. No functional changes are made by
this patch.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:54:24 -07:00
Stephen Hemminger d924424aae [NEIGHBOUR]: Use ALIGN() macro.
Rather than opencoding the mask, it looks better to use ALIGN()
macro from kernel.h.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:54:23 -07:00
Steven Whitehouse a8731cbf61 [DECNET]: Covert rules to use generic code
This patch converts the DECnet rules code to use the generic
rules system created by Thomas Graf <tgraf@suug.ch>.

Signed-off-by: Steven Whitehouse <steve@chygwyn.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:54:15 -07:00
Herbert Xu 8f491069b4 [IPV4]: Use network-order dport for all visible inet_lookup_*
Right now most inet_lookup_* functions take a host-order hnum instead
of a network-order dport because that's how it is represented
internally.

This means that users of these functions have to be careful about
using the right byte-order.  To add more confusion, inet_lookup takes
a network-order dport unlike all other functions.

So this patch changes all visible inet_lookup functions to take a
dport and move all dport->hnum conversion inside them.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:54:14 -07:00
Herbert Xu 99a92ff504 [IPV4]: Uninline inet_lookup_listener
By modern standards this function is way too big to be inlined.  It's
even bigger than __inet_lookup_listener :)

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:54:11 -07:00
Louis Nyffenegger 1a01912ae0 [INET]: Remove is_setbyuser patch
The value is_setbyuser from struct ip_options is never used and set
only one time (http://linux-net.osdl.org/index.php/TODO#IPV4).
This little patch removes it from the kernel source.

Signed-off-by: Louis Nyffenegger <louis.nyffenegger@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:54:10 -07:00
David S. Miller 0298f36a57 [IPV4]: Kill fib4_rules_clean().
As noted by Adrian Bunk this function is totally unused.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:54:09 -07:00
Adrian Bunk 8ce11e6a9f [NET]: Make code static.
This patch makes needlessly global code static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:54:07 -07:00
Thomas Graf 9067c722cf [NEIGH]: Move netlink neighbour bits to linux/neighbour.h
Moves netlink neighbour bits to linux/neighbour.h. Also
moves bits to be exported to userspace from net/neighbour.h
to linux/neighbour.h and removes __KERNEL__ guards, userspace
is not supposed to be using it.

rtnetlink_rcv_msg() is not longer required to parse attributes
for the neighbour layer, remove dependency on obsolete and
buggy rta_buf.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:54:01 -07:00
Thomas Graf fe4944e59c [NETLINK]: Extend netlink messaging interface
Adds:
 nlmsg_get_pos()                 return current position in message
 nlmsg_trim()                    trim part of message
 nla_reserve_nohdr(skb, len)     reserve room for an attribute w/o hdr
 nla_put_nohdr(skb, len, data)   add attribute w/o hdr
 nla_find_nested()               find attribute in nested attributes

Fixes nlmsg_new() to take allocation flags and consider size.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:53:43 -07:00
Thomas Graf e1ef4bf23b [IPV4]: Use Protocol Independant Policy Routing Rules Framework
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:53:42 -07:00
Thomas Graf 101367c2f8 [IPV6]: Policy Routing Rules
Adds support for policy routing rules including a new
local table for routes with a local destination.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:53:41 -07:00
Thomas Graf 14c0b97ddf [NET]: Protocol Independant Policy Routing Rules Framework
Derived from net/ipv/fib_rules.c

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:53:40 -07:00
Thomas Graf c71099acce [IPV6]: Multiple Routing Tables
Adds the framework to support multiple IPv6 routing tables.
Currently all automatically generated routes are put into the
same table. This could be changed at a later point after
considering the produced locking overhead.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:53:39 -07:00
Paul Moore 11a03f78fb [NetLabel]: core network changes
Changes to the core network stack to support the NetLabel subsystem.  This
includes changes to the IPv4 option handling to support CIPSO labels.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:53:32 -07:00
Venkat Yekkirala 4237c75c0a [MLSXFRM]: Auto-labeling of child sockets
This automatically labels the TCP, Unix stream, and dccp child sockets
as well as openreqs to be at the same MLS level as the peer. This will
result in the selection of appropriately labeled IPSec Security
Associations.

This also uses the sock's sid (as opposed to the isec sid) in SELinux
enforcement of secmark in rcv_skb and postroute_last hooks.

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:53:29 -07:00
Venkat Yekkirala cb969f072b [MLSXFRM]: Default labeling of socket specific IPSec policies
This defaults the label of socket-specific IPSec policies to be the
same as the socket they are set on.

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:53:28 -07:00
Venkat Yekkirala beb8d13bed [MLSXFRM]: Add flow labeling
This labels the flows that could utilize IPSec xfrms at the points the
flows are defined so that IPSec policy and SAs at the right label can
be used.

The following protos are currently not handled, but they should
continue to be able to use single-labeled IPSec like they currently
do.

ipmr
ip_gre
ipip
igmp
sit
sctp
ip6_tunnel (IPv6 over IPv6 tunnel device)
decnet

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:53:27 -07:00
Venkat Yekkirala e0d1caa7b0 [MLSXFRM]: Flow based matching of xfrm policy and state
This implements a seemless mechanism for xfrm policy selection and
state matching based on the flow sid. This also includes the necessary
SELinux enforcement pieces.

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:53:24 -07:00
Venkat Yekkirala b6340fcd76 [MLSXFRM]: Add security sid to flowi
This adds security to flow key for labeling of flows as also to allow
for making flow cache lookups based on the security label seemless.

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:53:23 -07:00
Venkat Yekkirala 892c141e62 [MLSXFRM]: Add security sid to sock
This adds security for IP sockets at the sock level. Security at the
sock level is needed to enforce the SELinux security policy for
security associations even when a sock is orphaned (such as in the TCP
LAST_ACK state).

This will also be used to enforce SELinux controls over data arriving
at or leaving a child socket while it's still waiting to be accepted.

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-22 14:53:22 -07:00
Herbert Xu e4d5b79c66 [CRYPTO] users: Use crypto_comp and crypto_has_*
This patch converts all users to use the new crypto_comp type and the
crypto_has_* functions.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-09-21 11:46:22 +10:00
Herbert Xu 1b489e11d4 [SCTP]: Use HMAC template and hash interface
This patch converts SCTP to use the new HMAC template and hash interface.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-21 11:46:19 +10:00
Herbert Xu 07d4ee583e [IPSEC]: Use HMAC template and hash interface
This patch converts IPsec to use the new HMAC template.  The names of
existing simple digest algorithms may still be used to refer to their
HMAC composites.

The same structure can be used by other MACs such as AES-XCBC-MAC.

This patch also switches from the digest interface to hash.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-09-21 11:46:18 +10:00
Herbert Xu 6b7326c849 [IPSEC] ESP: Use block ciphers where applicable
This patch converts IPSec/ESP to use the new block cipher type where
applicable.  Similar to the HMAC conversion, existing algorithm names
have been kept for compatibility.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-09-21 11:46:14 +10:00
Herbert Xu 04ff126094 [IPSEC]: Add compatibility algorithm name support
This patch adds a compatibility name field for each IPsec algorithm.  This
is needed when parameterised algorithms are used.  For example, "md5" will
become "hmac(md5)", and "aes" will become "cbc(aes)".

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-09-21 11:46:14 +10:00
Herbert Xu 9409f38a0c [IPSEC]: Move linux/crypto.h inclusion out of net/xfrm.h
The header file linux/crypto.h is only needed by a few files so including
it in net/xfrm.h (which is included by half of the networking stack) is a
waste.  This patch moves it out of net/xfrm.h and into the specific header
files that actually need it.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-09-21 11:16:30 +10:00
Jeff Garzik a422142cfd Merge branch 'master' into upstream 2006-08-29 17:20:55 -04:00
Sridhar Samudrala c164a9ba0a Fix sctp privilege elevation (CVE-2006-3745)
sctp_make_abort_user() now takes the msg_len along with the msg
so that we don't have to recalculate the bytes in iovec.
It also uses memcpy_fromiovec() so that we don't go beyond the
length allocated.

It is good to have this fix even if verify_iovec() is fixed to
return error on overflow.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-08-22 12:52:23 -07:00
Jeff Garzik 242898be7a Merge branch 'master' into upstream 2006-08-07 06:38:41 -04:00
Ilpo Järvinen c4c0ce5c57 [PKT_SCHED] RED: Fix overflow in calculation of queue average
Overflow can occur very easily with 32 bits, e.g., with 1 second
us_idle is approx. 2^20, which leaves only 11-Wlog bits for queue
length. Since the EWMA exponent is typically around 9, queue
lengths larger than 2^2 cause overflow. Whether the affected
branch is taken when us_idle is as high as 1 second, depends on
Scell_log, but with rather reasonable configuration Scell_log is
large enough to cause p->Stab to have zero index, which always
results zero shift (typically also few other small indices result
in zero shift).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-04 22:59:51 -07:00
Jeff Garzik 23946a8a98 Merge branch 'upstream-fixes' into upstream 2006-08-03 17:20:37 -04:00
Alexey Dobriyan 29bbd72d6e [NET]: Fix more per-cpu typos
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02 15:02:31 -07:00
Catherine Zhang dc49c1f94e [AF_UNIX]: Kernel memory leak fix for af_unix datagram getpeersec patch
From: Catherine Zhang <cxzhang@watson.ibm.com>

This patch implements a cleaner fix for the memory leak problem of the
original unix datagram getpeersec patch.  Instead of creating a
security context each time a unix datagram is sent, we only create the
security context when the receiver requests it.

This new design requires modification of the current
unix_getsecpeer_dgram LSM hook and addition of two new hooks, namely,
secid_to_secctx and release_secctx.  The former retrieves the security
context and the latter releases it.  A hook is required for releasing
the security context because it is up to the security module to decide
how that's done.  In the case of Selinux, it's a simple kfree
operation.

Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02 14:12:06 -07:00
Tom Tucker 792d1932e3 [NET]: Network Event Notifier Mechanism.
This patch uses notifier blocks to implement a network event
notifier mechanism.

Clients register their callback function by calling
register_netevent_notifier() like this:

static struct notifier_block nb = {
        .notifier_call = my_callback_func
};

...

register_netevent_notifier(&nb);

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02 13:38:20 -07:00
Wei Yongjun 3687b1dc6f [TCP]: SNMPv2 tcpAttemptFails counter error
Refer to RFC2012, tcpAttemptFails is defined as following:
  tcpAttemptFails OBJECT-TYPE
      SYNTAX      Counter32
      MAX-ACCESS  read-only
      STATUS      current
      DESCRIPTION
              "The number of times TCP connections have made a direct
              transition to the CLOSED state from either the SYN-SENT
              state or the SYN-RCVD state, plus the number of times TCP
              connections have made a direct transition to the LISTEN
              state from the SYN-RCVD state."
      ::= { tcp 7 }

When I lookup into RFC793, I found that the state change should occured
under following condition:
  1. SYN-SENT -> CLOSED
     a) Received ACK,RST segment when SYN-SENT state.

  2. SYN-RCVD -> CLOSED
     b) Received SYN segment when SYN-RCVD state(came from LISTEN).
     c) Received RST segment when SYN-RCVD state(came from SYN-SENT).
     d) Received SYN segment when SYN-RCVD state(came from SYN-SENT).

  3. SYN-RCVD -> LISTEN
     e) Received RST segment when SYN-RCVD state(came from LISTEN).

In my test, those direct state transition can not be counted to
tcpAttemptFails.

Signed-off-by: Wei Yongjun <yjwei@nanjing-fnst.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02 13:38:19 -07:00
Herbert Xu 497c615aba [IPV6]: Audit all ip6_dst_lookup/ip6_dst_store calls
The current users of ip6_dst_lookup can be divided into two classes:

1) The caller holds no locks and is in user-context (UDP).
2) The caller does not want to lookup the dst cache at all.

The second class covers everyone except UDP because most people do
the cache lookup directly before calling ip6_dst_lookup.  This patch
adds ip6_sk_dst_lookup for the first class.

Similarly ip6_dst_store users can be divded into those that need to
take the socket dst lock and those that don't.  This patch adds
__ip6_dst_store for those (everyone except UDP/datagram) that don't
need an extra lock.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-08-02 13:38:14 -07:00
Daniel Drake f2060f039e [PATCH] ieee80211: Make ieee80211_rx_any usable
ieee80211_rx_any is new to 2.6.18-rc1, even though it appears this function
was never completed:

http://lists.sipsolutions.net/pipermail/softmac-dev/2006-February/000103.html

This patch changes ieee80211_rx_any to always claim the skb, which avoids
further driver complexity and the possibility of leaking management frames.
It also exports the function so that people can actually use it.

Signed-off-by: Daniel Drake <dsd@gentoo.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-07-27 16:17:28 -04:00
Daniel Drake d7712ac254 [PATCH] softmac: export highest_supported_rate function
zd1211 needs this functionality, no point duplicating it.

Signed-off-by: Daniel Drake <dsd@gentoo.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-07-27 16:17:28 -04:00
Daniel Drake 5acd0c4153 [PATCH] softmac: ERP handling and driver-level notifications
This patch implements ERP handling in softmac so that the drivers can support
protection and preambles properly.

I added a new struct, ieee80211softmac_bss_info, which is used for
BSS-dependent variables like these.

A new hook has been added (bssinfo_change), which allows the drivers to be
notified when anything in bssinfo changes.

I modified the txrates_change API to match the bssinfo_change API. The
existing one is a little messy and the usefulness of providing the old rates
is questionable (and can be implemented at driver level if really necessary).
No drivers are using this API (yet), so this should be safe.

Signed-off-by: Daniel Drake <dsd@gentoo.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-07-27 16:17:28 -04:00
Daniel Drake d8e2be90d3 [PATCH] ieee80211: small ERP handling additions
This adds a flag to the ieee80211_network structure which indicates whether
the stored erp_value is valid (a check against 0 is not enough, since an ERP
of 0 is valid and very meaningful).

I also added the ERP IE bit-definitions to ieee80211.h.

This is needed by some upcoming softmac patches.

Signed-off-by: Daniel Drake <dsd@gentoo.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-07-27 16:17:27 -04:00
Guillaume Chazarain 2266d8886f [PKT_SCHED]: Fix regression in PSCHED_TADD{,2}.
In PSCHED_TADD and PSCHED_TADD2, if delta is less than tv.tv_usec (so,
less than USEC_PER_SEC too) then tv_res will be smaller than tv. The
affectation "(tv_res).tv_usec = __delta;" is wrong.  The fix is to
revert to the original code before
4ee303dfea and change the 'if' in
'while'.

[Shuya MAEDA: "while (__delta >= USEC_PER_SEC){ ... }" instead of
"while (__delta > USEC_PER_SEC){ ... }"]

Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-24 12:44:23 -07:00
Adrian Bunk 64d2f0855e [I/OAT]: net/core/user_dma.c should #include <net/netdma.h>
Every file should #include the headers containing the prototypes for
its global functions.

Especially in cases like this one where gcc can tell us through a
compile error that the prototype was wrong...

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-21 14:49:49 -07:00
Sridhar Samudrala dc022a9874 [SCTP]: ADDIP: Don't use an address as source until it is ASCONF-ACKed
This implements Rules D1 and D4 of Sec 4.3 in the ADDIP draft.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-21 14:49:25 -07:00
Sridhar Samudrala ad8fec1720 [SCTP]: Verify all the paths to a peer via heartbeat before using them.
This patch implements Path Initialization procedure as described in
Sec 2.36 of RFC4460.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-21 14:48:50 -07:00
Balbir Singh fb0ba6bd02 [PATCH] per-task-delay-accounting: utilities for genetlink usage
Two utilities for simplifying usage of NETLINK_GENERIC interface.

Signed-off-by: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com>
Cc: Jes Sorensen <jes@sgi.com>
Cc: Peter Chubb <peterc@gelato.unsw.edu.au>
Cc: Erich Focht <efocht@ess.nec.de>
Cc: Levent Serinol <lserinol@gmail.com>
Cc: Jay Lan <jlan@engr.sgi.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-14 21:53:56 -07:00
Herbert Xu a430a43d08 [NET] gso: Fix up GSO packets with broken checksums
Certain subsystems in the stack (e.g., netfilter) can break the partial
checksum on GSO packets.  Until they're fixed, this patch allows this to
work by recomputing the partial checksums through the GSO mechanism.

Once they've all been converted to update the partial checksum instead of
clearing it, this workaround can be removed.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-08 13:34:56 -07:00
Joseph Jezak cb74c432e3 [PATCH] SoftMAC: Prevent multiple authentication attempts on the same network
This patch addresses the "No queue exists" messages commonly seen during
authentication and associating.  These appear due to scheduling multiple
authentication attempts on the same network.  To prevent this, I added a
flag to stop multiple authentication attempts by the association layer.
I also added a check to the wx handler to see if we're connecting to a
different network than the one already in progress.  This scenario was
causing multiple requests on the same network because the network BSSID
was not being updated despite the fact that the ESSID changed.

Signed-off-by: Joseph Jezak <josejx@gentoo.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-07-05 13:42:58 -04:00
Marcel Holtmann a91f2e396f [Bluetooth] Use real devices for host controllers
This patch converts the Bluetooth class devices into real devices. The
Bluetooth class is kept and the driver core provides the appropriate
symlinks for backward compatibility.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2006-07-03 19:54:02 -07:00
Marcel Holtmann 27d3528425 [Bluetooth] Add platform device for virtual and serial devices
This patch adds a generic Bluetooth platform device that can be used
as parent device by virtual and serial devices.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2006-07-03 19:54:00 -07:00
Marcel Holtmann 04837f6447 [Bluetooth] Add automatic sniff mode support
This patch introduces the automatic sniff mode feature. This allows
the host to switch idle connections into sniff mode to safe power.

Signed-off-by: Ulisses Furquim <ulissesf@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2006-07-03 19:53:58 -07:00
Marcel Holtmann da1f519851 [Bluetooth] Correct SCO buffer size on request
This patch introduces a quirk that allows the drivers to tell the host
to correct the SCO buffer size values.

Signed-off-by: Olivier Galibert <galibert@pobox.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2006-07-03 19:53:56 -07:00
Ralf Baechle DL5RB 006f68b84f [AX.25]: Reference counting for AX.25 routes.
In the past routes could be freed even though the were possibly in use ...

Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-07-03 19:30:18 -07:00
Ingo Molnar a5b5bb9a05 [PATCH] lockdep: annotate sk_locks
Teach sk_lock semantics to the lock validator.  In the softirq path the
slock has mutex_trylock()+mutex_unlock() semantics, in the process context
sock_lock() case it has mutex_lock()/mutex_unlock() semantics.

Thus we treat sock_owned_by_user() flagged areas as an exclusion area too,
not just those areas covered by a held sk_lock.slock.

Effect on non-lockdep kernels: minimal, sk_lock_sock_init() has been turned
into an inline function.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:10 -07:00
Ingo Molnar c636618485 [PATCH] lockdep: annotate bh_lock_sock()
Teach special (recursive) locking code to the lock validator.  Has no effect
on non-lockdep kernels.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:08 -07:00
Ingo Molnar a09785a241 [PATCH] lockdep: annotate af_unix locking
Teach special (recursive) locking code to the lock validator.  Also splits
af_unix's sk_receive_queue.lock class from the other networking skb-queue
locks.  Has no effect on non-lockdep kernels.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:07 -07:00
Ingo Molnar da21f24dd7 [PATCH] lockdep: annotate sock_lock_init()
Teach special (multi-initialized, per-address-family) locking code to the lock
validator.  Has no effect on non-lockdep kernels.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:07 -07:00
Thomas Gleixner 1fb9df5d30 [PATCH] irq-flags: drivers/net: Use the new IRQF_ constants
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-02 13:58:51 -07:00
Herbert Xu f83ef8c0b5 [IPV6]: Added GSO support for TCPv6
This patch adds GSO support for IPv6 and TCPv6.  This is based on a patch
by Ananda Raju <Ananda.Raju@neterion.com>.  His original description is:

	This patch enables TSO over IPv6. Currently Linux network stacks
	restricts TSO over IPv6 by clearing of the NETIF_F_TSO bit from
	"dev->features". This patch will remove this restriction.

	This patch will introduce a new flag NETIF_F_TSO6 which will be used
	to check whether device supports TSO over IPv6. If device support TSO
	over IPv6 then we don't clear of NETIF_F_TSO and which will make the
	TCP layer to create TSO packets. Any device supporting TSO over IPv6
	will set NETIF_F_TSO6 flag in "dev->features" along with NETIF_F_TSO.

	In case when user disables TSO using ethtool, NETIF_F_TSO will get
	cleared from "dev->features". So even if we have NETIF_F_TSO6 we don't
	get TSO packets created by TCP layer.

	SKB_GSO_TCPV4 renamed to SKB_GSO_TCP to make it generic GSO packet.
	SKB_GSO_UDPV4 renamed to SKB_GSO_UDP as UFO is not a IPv4 feature.
	UFO is supported over IPv6 also

	The following table shows there is significant improvement in
	throughput with normal frames and CPU usage for both normal and jumbo.

	--------------------------------------------------
	|          |     1500        |      9600         |
	|          ------------------|-------------------|
	|          | thru     CPU    |  thru     CPU     |
	--------------------------------------------------
	| TSO OFF  | 2.00   5.5% id  |  5.66   20.0% id  |
	--------------------------------------------------
	| TSO ON   | 2.63   78.0 id  |  5.67   39.0% id  |
	--------------------------------------------------

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-30 14:12:10 -07:00
Herbert Xu bcd7611117 [NET]: Generalise TSO-specific bits from skb_setup_caps
This patch generalises the TSO-specific bits from sk_setup_caps by adding
the sk_gso_type member to struct sock.  This makes sk_setup_caps generic
so that it can be used by TCPv6 or UFO.

The only catch is that whoever uses this must provide a GSO implementation
for their protocol which I think is a fair deal :) For now UFO continues to
live without a GSO implementation which is OK since it doesn't use the sock
caps field at the moment.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-30 14:12:08 -07:00
Herbert Xu adcfc7d0b4 [IPV6]: Added GSO support for TCPv6
This patch adds GSO support for IPv6 and TCPv6.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-30 14:12:06 -07:00
Michael Chan b0da853703 [NET]: Add ECN support for TSO
In the current TSO implementation, NETIF_F_TSO and ECN cannot be
turned on together in a TCP connection.  The problem is that most
hardware that supports TSO does not handle CWR correctly if it is set
in the TSO packet.  Correct handling requires CWR to be set in the
first packet only if it is set in the TSO header.

This patch adds the ability to turn on NETIF_F_TSO and ECN using
GSO if necessary to handle TSO packets with CWR set.  Hardware
that handles CWR correctly can turn on NETIF_F_TSO_ECN in the dev->
features flag.

All TSO packets with CWR set will have the SKB_GSO_TCPV4_ECN set.  If
the output device does not have the NETIF_F_TSO_ECN feature set, GSO
will split the packet up correctly with CWR only set in the first
segment.

With help from Herbert Xu <herbert@gondor.apana.org.au>.

Since ECN can always be enabled with TSO, the SOCK_NO_LARGESEND sock
flag is completely removed.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29 16:58:08 -07:00
Catherine Zhang 877ce7c1b3 [AF_UNIX]: Datagram getpeersec
This patch implements an API whereby an application can determine the
label of its peer's Unix datagram sockets via the auxiliary data mechanism of
recvmsg.

Patch purpose:

This patch enables a security-aware application to retrieve the
security context of the peer of a Unix datagram socket.  The application
can then use this security context to determine the security context for
processing on behalf of the peer who sent the packet.

Patch design and implementation:

The design and implementation is very similar to the UDP case for INET
sockets.  Basically we build upon the existing Unix domain socket API for
retrieving user credentials.  Linux offers the API for obtaining user
credentials via ancillary messages (i.e., out of band/control messages
that are bundled together with a normal message).  To retrieve the security
context, the application first indicates to the kernel such desire by
setting the SO_PASSSEC option via getsockopt.  Then the application
retrieves the security context using the auxiliary data mechanism.

An example server application for Unix datagram socket should look like this:

toggle = 1;
toggle_len = sizeof(toggle);

setsockopt(sockfd, SOL_SOCKET, SO_PASSSEC, &toggle, &toggle_len);
recvmsg(sockfd, &msg_hdr, 0);
if (msg_hdr.msg_controllen > sizeof(struct cmsghdr)) {
    cmsg_hdr = CMSG_FIRSTHDR(&msg_hdr);
    if (cmsg_hdr->cmsg_len <= CMSG_LEN(sizeof(scontext)) &&
        cmsg_hdr->cmsg_level == SOL_SOCKET &&
        cmsg_hdr->cmsg_type == SCM_SECURITY) {
        memcpy(&scontext, CMSG_DATA(cmsg_hdr), sizeof(scontext));
    }
}

sock_setsockopt is enhanced with a new socket option SOCK_PASSSEC to allow
a server socket to receive security context of the peer.

Testing:

We have tested the patch by setting up Unix datagram client and server
applications.  We verified that the server can retrieve the security context
using the auxiliary data mechanism of recvmsg.

Signed-off-by: Catherine Zhang <cxzhang@watson.ibm.com>
Acked-by: Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29 16:58:06 -07:00
Shuya MAEDA 4ee303dfea [PKT_SCHED]: PSCHED_TADD() and PSCHED_TADD2() can result,tv_usec >= 1000000
Signed-off-by: Shuya MAEDA <maeda-sxb@necst.nec.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29 16:58:01 -07:00
Herbert Xu 576a30eb64 [NET]: Added GSO header verification
When GSO packets come from an untrusted source (e.g., a Xen guest domain),
we need to verify the header integrity before passing it to the hardware.

Since the first step in GSO is to verify the header, we can reuse that
code by adding a new bit to gso_type: SKB_GSO_DODGY.  Packets with this
bit set can only be fed directly to devices with the corresponding bit
NETIF_F_GSO_ROBUST.  If the device doesn't have that bit, then the skb
is fed to the GSO engine which will allow the packet to be sent to the
hardware if it passes the header check.

This patch changes the sg flag to a full features flag.  The same method
can be used to implement TSO ECN support.  We simply have to mark packets
with CWR set with SKB_GSO_ECN so that only hardware with a corresponding
NETIF_F_TSO_ECN can accept them.  The GSO engine can either fully segment
the packet, or segment the first MTU and pass the rest to the hardware for
further segmentation.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29 16:57:53 -07:00
Allan Stephens 4938450789 [TIPC]: Corrected potential misuse of tipc_media_addr structure.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:38:29 -07:00
Randy Dunlap f4b8ea7849 [NET]: fix net-core kernel-doc
Warning(/var/linsrc/linux-2617-g4//include/linux/skbuff.h:304): No description found for parameter 'dma_cookie'
Warning(/var/linsrc/linux-2617-g4//include/net/sock.h:1274): No description found for parameter 'copied_early'
Warning(/var/linsrc/linux-2617-g4//net/core/dev.c:3309): No description found for parameter 'chan'
Warning(/var/linsrc/linux-2617-g4//net/core/dev.c:3309): No description found for parameter 'event'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 02:07:42 -07:00
Herbert Xu 37c3185a02 [NET]: Added GSO toggle
This patch adds a generic segmentation offload toggle that can be turned
on/off for each net device.  For now it only supports in TCPv4.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 02:07:36 -07:00
Herbert Xu f4c50d990d [NET]: Add software TSOv4
This patch adds the GSO implementation for IPv4 TCP.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 02:07:33 -07:00
Herbert Xu 7967168cef [NET]: Merge TSO/UFO fields in sk_buff
Having separate fields in sk_buff for TSO/UFO (tso_size/ufo_size) is not
going to scale if we add any more segmentation methods (e.g., DCCP).  So
let's merge them.

They were used to tell the protocol of a packet.  This function has been
subsumed by the new gso_type field.  This is essentially a set of netdev
feature bits (shifted by 16 bits) that are required to process a specific
skb.  As such it's easy to tell whether a given device can process a GSO
skb: you just have to and the gso_type field and the netdev's features
field.

I've made gso_type a conjunction.  The idea is that you have a base type
(e.g., SKB_GSO_TCPV4) that can be modified further to support new features.
For example, if we add a hardware TSO type that supports ECN, they would
declare NETIF_F_TSO | NETIF_F_TSO_ECN.  All TSO packets with CWR set would
have a gso_type of SKB_GSO_TCPV4 | SKB_GSO_TCPV4_ECN while all other TSO
packets would be SKB_GSO_TCPV4.  This means that only the CWR packets need
to be emulated in software.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 02:07:29 -07:00
Jeff Garzik dbe1ab9514 Merge branch 'master' into upstream 2006-06-22 22:51:46 -04:00
Linus Torvalds a4cfae13ce Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [ATM]: fix broken uses of NIPQUAD in net/atm
  [SCTP]: sctp_unpack_cookie() fix
  [SCTP]: Fix unintentional change to SCTP_ASSERT when !SCTP_DEBUG
  [NET]: Prevent multiple qdisc runs
  [CONNECTOR]: Initialize subsystem earlier.
  [NETFILTER]: xt_sctp: fix endless loop caused by 0 chunk length
2006-06-20 17:39:53 -07:00
Linus Torvalds cee4cca740 Merge git://git.infradead.org/hdrcleanup-2.6
* git://git.infradead.org/hdrcleanup-2.6: (63 commits)
  [S390] __FD_foo definitions.
  Switch to __s32 types in joystick.h instead of C99 types for consistency.
  Add <sys/types.h> to headers included for userspace in <linux/input.h>
  Move inclusion of <linux/compat.h> out of user scope in asm-x86_64/mtrr.h
  Remove struct fddi_statistics from user view in <linux/if_fddi.h>
  Move user-visible parts of drivers/s390/crypto/z90crypt.h to include/asm-s390
  Revert include/media changes: Mauro says those ioctls are only used in-kernel(!)
  Include <linux/types.h> and use __uXX types in <linux/cramfs_fs.h>
  Use __uXX types in <linux/i2o_dev.h>, include <linux/ioctl.h> too
  Remove private struct dx_hash_info from public view in <linux/ext3_fs.h>
  Include <linux/types.h> and use __uXX types in <linux/affs_hardblocks.h>
  Use __uXX types in <linux/divert.h> for struct divert_blk et al.
  Use __u32 for elf_addr_t in <asm-powerpc/elf.h>, not u32. It's user-visible.
  Remove PPP_FCS from user view in <linux/ppp_defs.h>, remove __P mess entirely
  Use __uXX types in user-visible structures in <linux/nbd.h>
  Don't use 'u32' in user-visible struct ip_conntrack_old_tuple.
  Use __uXX types for S390 DASD volume label definitions which are user-visible
  S390 BIODASDREADCMB ioctl should use __u64 not u64 type.
  Remove unneeded inclusion of <linux/time.h> from <linux/ufs_fs.h>
  Fix private integer types used in V4L2 ioctls.
  ...

Manually resolve conflict in include/linux/mtd/physmap.h
2006-06-20 15:10:08 -07:00
Jeff Garzik 4b2d9cf009 Merge branch 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into upstream 2006-06-20 04:46:02 -04:00
David S. Miller 65fd28f743 [SCTP]: Fix unintentional change to SCTP_ASSERT when !SCTP_DEBUG
A local debugging change slipped into a previous changeset.

When SCTP_DEBUG is off SCTP_ASSERT should do nothing.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-20 00:07:52 -07:00
Herbert Xu 48d83325b6 [NET]: Prevent multiple qdisc runs
Having two or more qdisc_run's contend against each other is bad because
it can induce packet reordering if the packets have to be requeued.  It
appears that this is an unintended consequence of relinquinshing the queue
lock while transmitting.  That in turn is needed for devices that spend a
lot of time in their transmit routine.

There are no advantages to be had as devices with queues are inherently
single-threaded (the loopback device is not but then it doesn't have a
queue).

Even if you were to add a queue to a parallel virtual device (e.g., bolt
a tbf filter in front of an ipip tunnel device), you would still want to
process the queue in sequence to ensure that the packets are ordered
correctly.

The solution here is to steal a bit from net_device to prevent this.

BTW, as qdisc_restart is no longer used by anyone as a module inside the
kernel (IIRC it used to with netif_wake_queue), I have not exported the
new __qdisc_run function.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-19 23:57:59 -07:00
Linus Torvalds d0b952a983 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (109 commits)
  [ETHTOOL]: Fix UFO typo
  [SCTP]: Fix persistent slowdown in sctp when a gap ack consumes rx buffer.
  [SCTP]: Send only 1 window update SACK per message.
  [SCTP]: Don't do CRC32C checksum over loopback.
  [SCTP] Reset rtt_in_progress for the chunk when processing its sack.
  [SCTP]: Reject sctp packets with broadcast addresses.
  [SCTP]: Limit association max_retrans setting in setsockopt.
  [PFKEYV2]: Fix inconsistent typing in struct sadb_x_kmprivate.
  [IPV6]: Sum real space for RTAs.
  [IRDA]: Use put_unaligned() in irlmp_do_discovery().
  [BRIDGE]: Add support for NETIF_F_HW_CSUM devices
  [NET]: Add NETIF_F_GEN_CSUM and NETIF_F_ALL_CSUM
  [TG3]: Convert to non-LLTX
  [TG3]: Remove unnecessary tx_lock
  [TCP]: Add tcp_slow_start_after_idle sysctl.
  [BNX2]: Update version and reldate
  [BNX2]: Use CPU native page size
  [BNX2]: Use compressed firmware
  [BNX2]: Add firmware decompression
  [BNX2]: Allow WoL settings on new 5708 chips
  ...

Manual fixup for conflict in drivers/net/tulip/winbond-840.c
2006-06-19 18:55:56 -07:00
Vlad Yasevich 4c9f5d5305 [SCTP] Reset rtt_in_progress for the chunk when processing its sack.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 22:56:08 -07:00
Vlad Yasevich 5636bef732 [SCTP]: Reject sctp packets with broadcast addresses.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 22:55:35 -07:00
David S. Miller 35089bb203 [TCP]: Add tcp_slow_start_after_idle sysctl.
A lot of people have asked for a way to disable tcp_cwnd_restart(),
and it seems reasonable to add a sysctl to do that.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:53 -07:00
Stephen Hemminger bdeb04c6d9 [NET]: net.ipv4.ip_autoconfig sysctl removal
The sysctl net.ipv4.ip_autoconfig is a legacy value that is not used.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:30 -07:00
Herbert Xu b38dfee3d6 [NET]: skb_trim audit
I found a few more spots where pskb_trim_rcsum could be used but were not.
This patch changes them to use it.

Also, sk_filter can get paged skb data.  Therefore we must use pskb_trim
instead of skb_trim.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:20 -07:00
James Morris 7c9728c393 [SECMARK]: Add secmark support to conntrack
Add a secmark field to IP and NF conntracks, so that security markings
on packets can be copied to their associated connections, and also
copied back to packets as required.  This is similar to the network
mark field currently used with conntrack, although it is intended for
enforcement of security policy rather than network policy.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:01 -07:00
Alexey Dobriyan 6d74165350 [IPV4]: Right prototype of __raw_v4_lookup()
All users pass 32-bit values as addresses and internally they're
compared with 32-bit entities. So, change "laddr" and "raddr" types to
__be32.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:39 -07:00
Stephen Hemminger 72dc5b9225 [TCP]: Minimum congestion window consolidation.
Many of the TCP congestion methods all just use ssthresh
as the minimum congestion window on decrease.  Rather than
duplicating the code, just have that be the default if that
handle in the ops structure is not set.

Minor behaviour change to TCP compound.  It probably wants
to use this (ssthresh) as lower bound, rather than ssthresh/2
because the latter causes undershoot on loss.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:29 -07:00
Patrick McHardy 39a27a35c5 [NETFILTER]: conntrack: add sysctl to disable checksumming
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:57 -07:00
Herbert Xu 73654d61e5 [IPSEC] xfrm: Use IPPROTO_MAX instead of 256
The size of the type_map array (256) comes from the number of IP protocols,
i.e., IPPROTO_MAX.  This patch is based on a suggestion from Ingo Oeser.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:43 -07:00
Herbert Xu b59f45d0b2 [IPSEC] xfrm: Abstract out encapsulation modes
This patch adds the structure xfrm_mode.  It is meant to represent
the operations carried out by transport/tunnel modes.

By doing this we allow additional encapsulation modes to be added
without clogging up the xfrm_input/xfrm_output paths.

Candidate modes include 4-to-6 tunnel mode, 6-to-4 tunnel mode, and
BEET modes.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:39 -07:00
Herbert Xu 546be2405b [IPSEC] xfrm: Undo afinfo lock proliferation
The number of locks used to manage afinfo structures can easily be reduced
down to one each for policy and state respectively.  This is based on the
observation that the write locks are only held by module insertion/removal
which are very rare events so there is no need to further differentiate
between the insertion of modules like ipv6 versus esp6.

The removal of the read locks in xfrm4_policy.c/xfrm6_policy.c might look
suspicious at first.  However, after you realise that nobody ever takes
the corresponding write lock you'll feel better :)

As far as I can gather it's an attempt to guard against the removal of
the corresponding modules.  Since neither module can be unloaded at all
we can leave it to whoever fixes up IPv6 unloading :)

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:37 -07:00
Stephen Hemminger bc0e646796 [LLC]: add multicast support for datagrams
Allow mulitcast reception of datagrams (similar to UDP).
All sockets bound to the same SAP receive a clone.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:26:08 -07:00
Stephen Hemminger aecbd4e45c [LLC]: use more efficient ether address routines
Use more cache efficient Ethernet address manipulation functions
in etherdevice.h.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
2006-06-17 21:26:00 -07:00
Chris Leech 9593782585 [I/OAT]: Add a sysctl for tuning the I/OAT offloaded I/O threshold
Any socket recv of less than this ammount will not be offloaded

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:25:54 -07:00
Chris Leech 624d116473 [I/OAT]: Make sk_eat_skb I/OAT aware.
Add an extra argument to sk_eat_skb, and make it move early copied
packets to the async_wait_queue instead of freeing them.

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:25:52 -07:00
Chris Leech 0e4b4992b8 [I/OAT]: Rename cleanup_rbuf to tcp_cleanup_rbuf and make non-static
Needed to be able to call tcp_cleanup_rbuf in tcp_input.c for I/OAT

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:25:50 -07:00
Chris Leech 97fc2f0848 [I/OAT]: Structure changes for TCP recv offload to I/OAT
Adds an async_wait_queue and some additional fields to tcp_sock, and a
dma_cookie_t to sk_buff.

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:25:48 -07:00
Chris Leech de5506e155 [I/OAT]: Utility functions for offloading sk_buff to iovec copies
Provides for pinning user space pages in memory, copying to iovecs,
and copying from sk_buffs including fragmented and chained sk_buffs.

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:25:46 -07:00
Chris Leech db21733488 [I/OAT]: Setup the networking subsystem as a DMA client
Attempts to allocate per-CPU DMA channels

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:24:58 -07:00
Larry Finger 74f4903363 [PATCH] wireless: Changes to ieee80211.h for user space regulatory daemon
Attached are two small patches for include/net/ieee80211.h to prepare
for later submission of code to implement a user-space daemon that
supplies 802.11 regulatory information.

The first change adds a bit indicating that 802.11h rules are to be
applied to a channel. As discussed earlier in this list, a single bit
is unlikely to be sufficient; however, at this time I have been unable
to find any regulations implementing differences between 802.11a and
802.11h other than DFS, radar detection and passive scanning. A single
bit is thus sufficient to convey to the driver that these rules should
be obeyed.

The second change adds comments to the freq and max_power fields of
struct ieee80211_channel to indicate the units that are used.

Signed-Off-By: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-06-15 15:48:13 -04:00
Daniel Drake 6ae15df16e [PATCH] softmac: Fix handling of authentication failure
My router blew up earlier, but exhibited some interesting behaviour during
its dying moments. It was broadcasting beacons but wouldn't respond to
any authentication requests.

I noticed that softmac wasn't playing nice with this, as I couldn't make it try
to connect to other networks after it had timed out authenticating to my ill
router.

To resolve this, I modified the softmac event/notify API to pass the event
code to the callback, so that callbacks being notified from
IEEE80211SOFTMAC_EVENT_ANY masks can make some judgement. In this case, the
ieee80211softmac_assoc callback needs to make a decision based upon whether
the association passed or failed.

Signed-off-by: Daniel Drake <dsd@gentoo.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-06-05 15:51:30 -04:00
Daniel Drake 76ea4c7f4c [PATCH] softmac: complete shared key authentication
This patch finishes of the partially-complete shared key authentication
implementation in softmac.

The complication here is that we need to encrypt a management frame during
the authentication process. I don't think there are any other scenarios where
this would have to happen.

To get around this without causing too many headaches, we decided to just use
software encryption for this frame. The softmac config option now selects
IEEE80211_CRYPT_WEP so that we can ensure this available. This also involved
a modification to some otherwise unused ieee80211 API.

Signed-off-by: Daniel Drake <dsd@gentoo.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-06-05 15:51:29 -04:00
John W. Linville dea58b80f2 Merge branch 'from-linus' into upstream 2006-06-05 14:42:27 -04:00
Andrew Morton 29f767a254 [PATCH] net/compat.h build fix
From: Andrew Morton <akpm@osdl.org>

Move the forward decl outside the ifdef, since we use it in both legs.

Should fix the spacr64 build error reported in
	http://bugzilla.kernel.org/show_bug.cgi?id=6625

Acked-by: "David S. Miller" <davem@davemloft.net>
Cc: Cedric Pellerin <cedric@bidouillesoft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-31 16:27:11 -07:00
John W. Linville f587fb74b2 Merge branch 'from-linus' into upstream 2006-05-26 16:06:58 -04:00
David Woodhouse 66643de455 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	include/asm-powerpc/unistd.h
	include/asm-sparc/unistd.h
	include/asm-sparc64/unistd.h

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-24 09:22:21 +01:00
Alexey Dobriyan f5565f4a90 [IRDA]: fixup type of ->lsap_state
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-22 16:54:30 -07:00
John W. Linville 3b38f317e5 Merge branch 'from-linus' into upstream 2006-05-22 14:26:25 -04:00
Vladislav Yasevich dd2d1c6f29 [SCTP]: Respect the real chunk length when walking parameters.
When performing bound checks during the parameter processing, we
want to use the real chunk and paramter lengths for bounds instead
of the rounded ones.  This prevents us from potentially walking of
the end if the chunk length was miscalculated.  We still use rounded
lengths when advancing the pointer. This was found during a
conformance test that changed the chunk length without modifying
parameters.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-05-19 11:52:20 -07:00
Sridhar Samudrala 8de8c87380 [SCTP]: Set sk_err so that poll wakes up after a non-blocking connect failure.
Also fix some other cases where sk_err is not set for 1-1 style sockets.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-05-19 10:58:12 -07:00
John W. Linville 5dd8816aeb Merge branch 'from-linus' into upstream 2006-05-17 14:51:24 -04:00
Simon Kelley bd89efc532 [NEIGH]: Fix IP-over-ATM and ARP interaction.
The classical IP over ATM code maintains its own IPv4 <-> <ATM stuff>
ARP table, using the standard neighbour-table code. The
neigh_table_init function adds this neighbour table to a linked list
of all neighbor tables which is used by the functions neigh_delete()
neigh_add() and neightbl_set(), all called by the netlink code.

Once the ATM neighbour table is added to the list, there are two
tables with family == AF_INET there, and ARP entries sent via netlink
go into the first table with matching family. This is indeterminate
and often wrong.

To see the bug, on a kernel with CLIP enabled, create a standard IPv4
ARP entry by pinging an unused address on a local subnet. Then attempt
to complete that entry by doing

ip neigh replace <ip address> lladdr <some mac address> nud reachable

Looking at the ARP tables by using 

ip neigh show

will reveal two ARP entries for the same address. One of these can be
found in /proc/net/arp, and the other in /proc/net/atm/arp.

This patch adds a new function, neigh_table_init_no_netlink() which
does everything the neigh_table_init() does, except add the table to
the netlink all-arp-tables chain. In addition neigh_table_init() has a
check that all tables on the chain have a distinct address family.
The init call in clip.c is changed to call
neigh_table_init_no_netlink().

Since ATM ARP tables are rather more complicated than can currently be
handled by the available rtattrs in the netlink protocol, no
functionality is lost by this patch, and non-ATM ARP manipulation via
netlink is rescued. A more complete solution would involve a rtattr
for ATM ARP entries and some way for the netlink code to give
neigh_add and friends more information than just address family with
which to find the correct ARP table.

[ I've changed the assertion checking in neigh_table_init() to not
  use BUG_ON() while holding neigh_tbl_lock.  Instead we remember that
  we found an existing tbl with the same family, and after dropping
  the lock we'll give a diagnostic kernel log message and a stack dump.
  -DaveM ]

Signed-off-by: Simon Kelley <simon@thekelleys.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-12 14:56:08 -07:00
Stephen Hemminger 23aee82e75 Merge branch 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2006-05-08 16:01:20 -07:00
David Woodhouse 5047f09b56 Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-06 19:59:18 +01:00
Neil Horman 7c3ceb4fb9 [SCTP]: Allow spillover of receive buffer to avoid deadlock.
This patch fixes a deadlock situation in the receive path by allowing
temporary spillover of the receive buffer.

- If the chunk we receive has a tsn that immediately follows the ctsn,
  accept it even if we run out of receive buffer space and renege data with
  higher TSNs.
- Once we accept one chunk in a packet, accept all the remaining chunks
  even if we run out of receive buffer space.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Mark Butler <butlerm@middle.net>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05 17:02:09 -07:00
Daniel Drake 8462fe3cd9 [PATCH] softmac: suggest per-frame-type TX rate
This patch is the first step towards rate control inside softmac.

The txrates substructure has been extended to provide
different fields for different types of packets (management/data,
unicast/multicast). These fields are updated on association to values
compatible with the access point we are associating to.

Drivers can then use the new ieee80211softmac_suggest_txrate() function
call when deciding which rate to transmit each frame at. This is
immensely useful for ZD1211, and bcm can use it too.

The user can still specify a rate through iwconfig, which is matched
for all transmissions (assuming the rate they have specified is in
the rate set required by the AP).

At a later date, we can incorporate automatic rate management into
the ieee80211softmac_recalc_txrates() function.

This patch also removes the mcast_fallback field. Sam Leffler pointed
out that this field is meaningless, because no driver will ever be
retransmitting mcast frames (they are not acked).

Signed-off-by: Daniel Drake <dsd@gentoo.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-05-05 17:10:41 -04:00
John W. Linville fd5226a726 Merge branch 'upstream-fixes' into upstream 2006-05-05 16:56:24 -04:00
Jean Delvare f21709d70a [PATCH] ieee80211: Fix A band channel count (resent)
The channel count for 802.11a is still not right. We better
compute it from the min and max channel numbers, rather than
hardcoding it.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-05-05 16:55:23 -04:00
Daniel Drake d57336e3f2 [PATCH] softmac: make non-operational after being stopped
zd1211 with softmac and wpa_supplicant revealed an issue with softmac
and the use of workqueues. Some of the work functions actually
reschedule themselves, so this meant that there could still be
pending work after flush_scheduled_work() had been called during
ieee80211softmac_stop().

This patch introduces a "running" flag which is used to ensure that
rescheduling does not happen in this situation.

I also used this flag to ensure that softmac's hooks into ieee80211 are
non-operational once the stop operation has been started. This simply
makes softmac a little more robust, because I could crash it easily
by receiving frames in the short timeframe after shutting down softmac
and before turning off the ZD1211 radio. (ZD1211 is now fixed as well!)

Signed-off-by: Daniel Drake <dsd@gentoo.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-05-05 16:55:22 -04:00
John W. Linville aad61439e6 Merge branch 'from-linus' into upstream 2006-05-05 16:50:23 -04:00
Ralf Baechle 82e84249f0 [ROSE]: Eleminate HZ from ROSE kernel interfaces
Convert all ROSE sysctl time values from jiffies to ms as units.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:28:20 -07:00
Ralf Baechle 4d8937d0b1 [NETROM]: Eleminate HZ from NET/ROM kernel interfaces
Convert all NET/ROM sysctl time values from jiffies to ms as units.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:27:47 -07:00
Ralf Baechle e1fdb5b396 [AX.25]: Eleminate HZ from AX.25 kernel interfaces
Convert all AX.25 sysctl time values from jiffies to ms as units.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:27:16 -07:00
Akinobu Mita da753beaeb [NET]: use hlist_unhashed()
Use hlist_unhashed() rather than accessing inside data structure.

Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29 18:33:15 -07:00
David Woodhouse d6754b401a Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2006-04-29 01:42:26 +01:00
David Woodhouse 62c4f0a2d5 Don't include linux/config.h from anywhere else in include/
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-04-26 12:56:16 +01:00
Johannes Berg 9a1771e867 [PATCH] softmac: add SIOCSIWMLME
This patch adds the SIOCSIWMLME wext to softmac, this functionality
appears to be used by wpa_supplicant and is softmac-specific.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Jouni Malinen <jkm@devicescape.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-24 16:15:58 -04:00
Zhu Yi 45a62ab3d6 [PATCH] ieee80211: update version stamp to 1.1.13
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-24 16:15:54 -04:00
Zhu Yi 73858062b6 [PATCH] ieee80211: Fix TX code doesn't enable QoS when using WPA + QoS
Fix ieee80211 TX code when using WPA+QOS. TKIP/CCMP will use
the TID field of qos_ctl in 802.11 frame header to do encryption. We
cannot ignore this field when doing host encryption and add the qos_ctl
field later.

Signed-off-by: Hong Liu <hong.liu@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-24 16:15:53 -04:00
Zhu Yi ea2841521a [PATCH] ieee80211: Fix TKIP MIC calculation for QoS frames
Fix TKIP MIC verification failure when receiving QoS frames from AP.

Signed-off-by: Hong Liu <hong.liu@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-24 16:15:53 -04:00
Johannes Berg 818667f7c4 [PATCH] softmac: fix SIOCSIWAP
There are some bugs in the current implementation of the SIOCSIWAP wext,
for example that when you do it twice and it fails, it may still try
another access point for some reason. This patch fixes this by introducing
a new flag that tells the association code that the bssid that is in use
was fixed by the user and shouldn't be deviated from.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-24 15:20:23 -04:00
Linus Torvalds f4ffaa452e Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (21 commits)
  [PATCH] wext: Fix RtNetlink ENCODE security permissions
  [PATCH] bcm43xx: iw_priv_args names should be <16 characters
  [PATCH] bcm43xx: sysfs code cleanup
  [PATCH] bcm43xx: fix pctl slowclock limit calculation
  [PATCH] bcm43xx: fix dyn tssi2dbm memleak
  [PATCH] bcm43xx: fix config menu alignment
  [PATCH] bcm43xx wireless: fix printk format warnings
  [PATCH] softmac: report when scanning has finished
  [PATCH] softmac: fix event sending
  [PATCH] softmac: handle iw_mode properly
  [PATCH] softmac: dont send out packets while scanning
  [PATCH] softmac: return -EAGAIN from getscan while scanning
  [PATCH] bcm43xx: set trans_start on TX to prevent bogus timeouts
  [PATCH] orinoco: fix truncating commsquality RID with the latest Symbol firmware
  [PATCH] softmac: fix spinlock recursion on reassoc
  [PATCH] Revert NET_RADIO Kconfig title change
  [PATCH] wext: Fix IWENCODEEXT security permissions
  [PATCH] wireless/atmel: send WEXT scan completion events
  [PATCH] wireless/airo: clean up WEXT association and scan events
  [PATCH] softmac uses Wiress Ext.
  ...
2006-04-20 15:26:25 -07:00
David S. Miller dc6de33674 [NET]: Add skb->truesize assertion checking.
Add some sanity checking.  truesize should be at least sizeof(struct
sk_buff) plus the current packet length.  If not, then truesize is
seriously mangled and deserves a kernel log message.

Currently we'll do the check for release of stream socket buffers.

But we can add checks to more spots over time.

Incorporating ideas from Herbert Xu.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-20 00:10:50 -07:00
Johannes Berg feeeaa87e8 [PATCH] softmac: fix event sending
Softmac is sending custom events to userspace already, but it
should _really_ be sending the right WEXT events instead. This
patch fixes that.

Signed-off-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-19 17:25:39 -04:00
YOSHIFUJI Hideaki b809739a1b [IPV6]: Clean up hop-by-hop options handler.
- Removed unused argument (nhoff) for ipv6_parse_hopopts().
- Make ipv6_parse_hopopts() to align with other extension header
  handlers.
- Removed pointless assignment (hdr), which is not used afterwards.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-18 15:57:53 -07:00
Jamal Hadi Salim 2717096ab4 [XFRM]: Fix aevent timer.
Send aevent immediately if we have sent nothing since last timer and
this is the first packet.

Fixes a corner case when packet threshold is very high, the timer low
and a very low packet rate input which is bursty.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-14 15:03:05 -07:00
Adrian Bunk 6c97e72a16 [IPV4]: Possible cleanups.
This patch contains the following possible cleanups:
- make the following needlessly global function static:
  - arp.c: arp_rcv()
- remove the following unused EXPORT_SYMBOL's:
  - devinet.c: devinet_ioctl
  - fib_frontend.c: ip_rt_ioctl
  - inet_hashtables.c: inet_bind_bucket_create
  - inet_hashtables.c: inet_bind_hash
  - tcp_input.c: sysctl_tcp_abc
  - tcp_ipv4.c: sysctl_tcp_tw_reuse
  - tcp_output.c: sysctl_tcp_mtu_probing
  - tcp_output.c: sysctl_tcp_base_mss

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-14 15:00:20 -07:00
Denis Vlasenko b1a7ffcb7a [IPV6]: Deinline few large functions in inet6 code
Deinline a few functions which produce 200+ bytes of code.

Size  Uses Wasted Name and definition
===== ==== ====== ================================================
  429    3    818 __inet6_lookup        include/net/inet6_hashtables.h
  404    2    384 __inet6_lookup_established    include/net/inet6_hashtables.h
  206    3    372 __inet6_hash  include/net/inet6_hashtables.h

Signed-off-by: Denis Vlasenko <vda@ilport.com.ua>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:48:59 -07:00
David S. Miller 9b591cbd4e [X25]: Restore skb->dev setting in x25_type_trans().
Noticed by Pascal Schlafer.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:37:18 -07:00
Patrick McHardy 2e2f7aefa8 [NETFILTER]: Fix fragmentation issues with bridge netfilter
The conntrack code doesn't do re-fragmentation of defragmented packets
anymore but relies on fragmentation in the IP layer. Purely bridged
packets don't pass through the IP layer, so the bridge netfilter code
needs to take care of fragmentation itself.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:25:23 -07:00
Herbert Xu dbe5b4aaaf [IPSEC]: Kill unused decap state structure
This patch removes the *_decap_state structures which were previously
used to share state between input/post_input.  This is no longer
needed.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-01 00:54:16 -08:00
Herbert Xu e695633e21 [IPSEC]: Kill unused decap state argument
This patch removes the decap_state argument from the xfrm input hook.
Previously this function allowed the input hook to share state with
the post_input hook.  The latter has since been removed.

The only purpose for it now is to check the encap type.  However, it
is easier and better to move the encap type check to the generic
xfrm_rcv function.  This allows us to get rid of the decap state
argument altogether.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-01 00:52:46 -08:00
David S. Miller 0803dbed7a [TCP]: Kill unused extern decl for tcp_v4_hash_connecting()
Noticed by Alan Menegotto.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-31 02:25:46 -08:00
Herbert Xu d2acc3479c [INET]: Introduce tunnel4/tunnel6
Basically this patch moves the generic tunnel protocol stuff out of
xfrm4_tunnel/xfrm6_tunnel and moves it into the new files of tunnel4.c
and tunnel6 respectively.

The reason for this is that the problem that Hugo uncovered is only
the tip of the iceberg.  The real problem is that when we removed the
dependency of ipip on xfrm4_tunnel we didn't really consider the module
case at all.

For instance, as it is it's possible to build both ipip and xfrm4_tunnel
as modules and if the latter is loaded then ipip simply won't load.

After considering the alternatives I've decided that the best way out of
this is to restore the dependency of ipip on the non-xfrm-specific part
of xfrm4_tunnel.  This is acceptable IMHO because the intention of the
removal was really to be able to use ipip without the xfrm subsystem.
This is still preserved by this patch.

So now both ipip/xfrm4_tunnel depend on the new tunnel4.c which handles
the arbitration between the two.  The order of processing is determined
by a simple integer which ensures that ipip gets processed before
xfrm4_tunnel.

The situation for ICMP handling is a little bit more complicated since
we may not have enough information to determine who it's for.  It's not
a big deal at the moment since the xfrm ICMP handlers are basically
no-ops.  In future we can deal with this when we look at ICMP caching
in general.

The user-visible change to this is the removal of the TUNNEL Kconfig
prompts.  This makes sense because it can only be used through IPCOMP
as it stands.

The addition of the new modules shouldn't introduce any problems since
module dependency will cause them to be loaded.

Oh and I also turned some unnecessary pskb's in IPv6 related to this
patch to skb's.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-28 17:02:46 -08:00
Denis Vlasenko f0088a50e7 [NET]: deinline 200+ byte inlines in sock.h
Sizes in bytes (allyesconfig, i386) and files where those inlines
are used:

238 sock_queue_rcv_skb 2.6.16/net/x25/x25_in.o
238 sock_queue_rcv_skb 2.6.16/net/rose/rose_in.o
238 sock_queue_rcv_skb 2.6.16/net/packet/af_packet.o
238 sock_queue_rcv_skb 2.6.16/net/netrom/nr_in.o
238 sock_queue_rcv_skb 2.6.16/net/llc/llc_sap.o
238 sock_queue_rcv_skb 2.6.16/net/llc/llc_conn.o
238 sock_queue_rcv_skb 2.6.16/net/irda/af_irda.o
238 sock_queue_rcv_skb 2.6.16/net/ipx/af_ipx.o
238 sock_queue_rcv_skb 2.6.16/net/ipv6/udp.o
238 sock_queue_rcv_skb 2.6.16/net/ipv6/raw.o
238 sock_queue_rcv_skb 2.6.16/net/ipv4/udp.o
238 sock_queue_rcv_skb 2.6.16/net/ipv4/raw.o
238 sock_queue_rcv_skb 2.6.16/net/ipv4/ipmr.o
238 sock_queue_rcv_skb 2.6.16/net/econet/econet.o
238 sock_queue_rcv_skb 2.6.16/net/econet/af_econet.o
238 sock_queue_rcv_skb 2.6.16/net/bluetooth/sco.o
238 sock_queue_rcv_skb 2.6.16/net/bluetooth/l2cap.o
238 sock_queue_rcv_skb 2.6.16/net/bluetooth/hci_sock.o
238 sock_queue_rcv_skb 2.6.16/net/ax25/ax25_in.o
238 sock_queue_rcv_skb 2.6.16/net/ax25/af_ax25.o
238 sock_queue_rcv_skb 2.6.16/net/appletalk/ddp.o
238 sock_queue_rcv_skb 2.6.16/drivers/net/pppoe.o

276 sk_receive_skb 2.6.16/net/decnet/dn_nsp_in.o
276 sk_receive_skb 2.6.16/net/dccp/ipv6.o
276 sk_receive_skb 2.6.16/net/dccp/ipv4.o
276 sk_receive_skb 2.6.16/net/dccp/dccp_ipv6.o
276 sk_receive_skb 2.6.16/drivers/net/pppoe.o

209 sk_dst_check 2.6.16/net/ipv6/ip6_output.o
209 sk_dst_check 2.6.16/net/ipv4/udp.o
209 sk_dst_check 2.6.16/net/decnet/dn_nsp_out.o

Large inlines with multiple callers:
Size  Uses Wasted Name and definition
===== ==== ====== ================================================
  238   21   4360 sock_queue_rcv_skb    include/net/sock.h
  109   10    801 sock_recv_timestamp   include/net/sock.h
  276    4    768 sk_receive_skb        include/net/sock.h
   94    8    518 __sk_dst_check        include/net/sock.h
  209    3    378 sk_dst_check  include/net/sock.h
  131    4    333 sk_setup_caps include/net/sock.h
  152    2    132 sk_stream_alloc_pskb  include/net/sock.h
  125    2    105 sk_stream_writequeue_purge    include/net/sock.h

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-28 17:02:45 -08:00
Linus Torvalds fdccffc6b7 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [NET]: drop duplicate assignment in request_sock
  [IPSEC]: Fix tunnel error handling in ipcomp6
2006-03-27 08:47:29 -08:00
Alan Stern e041c68341 [PATCH] Notifier chain update: API changes
The kernel's implementation of notifier chains is unsafe.  There is no
protection against entries being added to or removed from a chain while the
chain is in use.  The issues were discussed in this thread:

    http://marc.theaimsgroup.com/?l=linux-kernel&m=113018709002036&w=2

We noticed that notifier chains in the kernel fall into two basic usage
classes:

	"Blocking" chains are always called from a process context
	and the callout routines are allowed to sleep;

	"Atomic" chains can be called from an atomic context and
	the callout routines are not allowed to sleep.

We decided to codify this distinction and make it part of the API.  Therefore
this set of patches introduces three new, parallel APIs: one for blocking
notifiers, one for atomic notifiers, and one for "raw" notifiers (which is
really just the old API under a new name).  New kinds of data structures are
used for the heads of the chains, and new routines are defined for
registration, unregistration, and calling a chain.  The three APIs are
explained in include/linux/notifier.h and their implementation is in
kernel/sys.c.

With atomic and blocking chains, the implementation guarantees that the chain
links will not be corrupted and that chain callers will not get messed up by
entries being added or removed.  For raw chains the implementation provides no
guarantees at all; users of this API must provide their own protections.  (The
idea was that situations may come up where the assumptions of the atomic and
blocking APIs are not appropriate, so it should be possible for users to
handle these things in their own way.)

There are some limitations, which should not be too hard to live with.  For
atomic/blocking chains, registration and unregistration must always be done in
a process context since the chain is protected by a mutex/rwsem.  Also, a
callout routine for a non-raw chain must not try to register or unregister
entries on its own chain.  (This did happen in a couple of places and the code
had to be changed to avoid it.)

Since atomic chains may be called from within an NMI handler, they cannot use
spinlocks for synchronization.  Instead we use RCU.  The overhead falls almost
entirely in the unregister routine, which is okay since unregistration is much
less frequent that calling a chain.

Here is the list of chains that we adjusted and their classifications.  None
of them use the raw API, so for the moment it is only a placeholder.

  ATOMIC CHAINS
  -------------
arch/i386/kernel/traps.c:		i386die_chain
arch/ia64/kernel/traps.c:		ia64die_chain
arch/powerpc/kernel/traps.c:		powerpc_die_chain
arch/sparc64/kernel/traps.c:		sparc64die_chain
arch/x86_64/kernel/traps.c:		die_chain
drivers/char/ipmi/ipmi_si_intf.c:	xaction_notifier_list
kernel/panic.c:				panic_notifier_list
kernel/profile.c:			task_free_notifier
net/bluetooth/hci_core.c:		hci_notifier
net/ipv4/netfilter/ip_conntrack_core.c:	ip_conntrack_chain
net/ipv4/netfilter/ip_conntrack_core.c:	ip_conntrack_expect_chain
net/ipv6/addrconf.c:			inet6addr_chain
net/netfilter/nf_conntrack_core.c:	nf_conntrack_chain
net/netfilter/nf_conntrack_core.c:	nf_conntrack_expect_chain
net/netlink/af_netlink.c:		netlink_chain

  BLOCKING CHAINS
  ---------------
arch/powerpc/platforms/pseries/reconfig.c:	pSeries_reconfig_chain
arch/s390/kernel/process.c:		idle_chain
arch/x86_64/kernel/process.c		idle_notifier
drivers/base/memory.c:			memory_chain
drivers/cpufreq/cpufreq.c		cpufreq_policy_notifier_list
drivers/cpufreq/cpufreq.c		cpufreq_transition_notifier_list
drivers/macintosh/adb.c:		adb_client_list
drivers/macintosh/via-pmu.c		sleep_notifier_list
drivers/macintosh/via-pmu68k.c		sleep_notifier_list
drivers/macintosh/windfarm_core.c	wf_client_list
drivers/usb/core/notify.c		usb_notifier_list
drivers/video/fbmem.c			fb_notifier_list
kernel/cpu.c				cpu_chain
kernel/module.c				module_notify_list
kernel/profile.c			munmap_notifier
kernel/profile.c			task_exit_notifier
kernel/sys.c				reboot_notifier_list
net/core/dev.c				netdev_chain
net/decnet/dn_dev.c:			dnaddr_chain
net/ipv4/devinet.c:			inetaddr_chain

It's possible that some of these classifications are wrong.  If they are,
please let us know or submit a patch to fix them.  Note that any chain that
gets called very frequently should be atomic, because the rwsem read-locking
used for blocking chains is very likely to incur cache misses on SMP systems.
(However, if the chain's callout routines may sleep then the chain cannot be
atomic.)

The patch set was written by Alan Stern and Chandra Seetharaman, incorporating
material written by Keith Owens and suggestions from Paul McKenney and Andrew
Morton.

[jes@sgi.com: restructure the notifier chain initialization macros]
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:50 -08:00
Norbert Kiesel 3eb4801d7b [NET]: drop duplicate assignment in request_sock
Just noticed that request_sock.[ch] contain a useless assignment of
rskq_accept_head to itself.  I assume this is a typo and the 2nd one
was supposed to be _tail.  However, setting _tail to NULL is not
needed, so the patch below just drops the 2nd assignment.

Signed-off-By: Norbert Kiesel <nkiesel@tbdnetworks.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-26 17:39:55 -08:00
Ilia Sotnikov cef2685e00 [IPV4]: Aggregate route entries with different TOS values
When we get an ICMP need-to-frag message, the original TOS value in the
ICMP payload cannot be used as a key to look up the routes to update.
This is because the TOS field may have been modified by routers on the
way.  Similarly, ip_rt_redirect should also ignore the TOS as the router
that gave us the message may have modified the TOS value.

The patch achieves this objective by aggregating entries with different
TOS values (but are otherwise identical) into the same bucket.  This
makes it easy to update them at the same time when an ICMP message is
received.

In future we should use a twin-hashing scheme where teh aggregation
occurs at the entry level.  That is, the TOS goes back into the hash
for normal lookups while ICMP lookups will end up with a node that
gives us a list that contains all other route entries that differ
only by TOS.

Signed-off-by: Ilia Sotnikov <hostcc@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-25 01:38:55 -08:00
David S. Miller 9932cf9546 [NET]: Fill in a 32-bit hole in struct sock on 64-bit platforms.
This makes struct sock 8 bytes smaller on 64-bit.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-24 15:45:00 -08:00
Jeff Garzik 9b7c84899e Merge branch 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2006-03-23 17:15:41 -05:00
Jean Tourrilhes 711e2c33ac [PATCH] WE-20 for kernel 2.6.16
This is version 20 of the Wireless Extensions. This is the
completion of the RtNetlink work I started early 2004, it enables the
full Wireless Extension API over RtNetlink.

	Few comments on the patch :
	o totally driver transparent, no change in drivers needed.
	o iwevent were already RtNetlink based since they were created
(around 2.5.7). This adds all the regular SET and GET requests over
RtNetlink, using the exact same mechanism and data format as iwevents.
	o This is a Kconfig option, as currently most people have no
need for it. Surprisingly, patch is actually small and well
encapsulated.
	o Tested on SMP, attention as been paid to make it 64 bits clean.
	o Code do probably too many checks and could be further
optimised, but better safe than sorry.
	o RtNetlink based version of the Wireless Tools available on
my web page for people inclined to try out this stuff.

	I would also like to thank Alexey Kuznetsov for his helpful
suggestions to make this patch better.

Signed-off-by: Jean Tourrilhes <jt@hpl.hp.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-23 07:12:57 -05:00
Jeff Garzik fa4fa40a99 Merge branch 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2006-03-22 22:55:57 -05:00
Johannes Berg 4855d25b1e [PATCH] softmac: add copyright and license headers
add copyright and license headers to all softmac files

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-22 22:16:56 -05:00
Johannes Berg 5c4df6da58 [PATCH] softmac: convert to use global workqueue
Convert softmac to use global workqueue instead of private one...

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-22 22:16:52 -05:00
Johannes Berg 370121e519 [PATCH] wireless: Add softmac layer to the kernel
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-22 22:16:50 -05:00
Dmitry Mishin 1e30a014e3 [NETFILTER]: futher {ip,ip6,arp}_tables unification
This patch moves {ip,ip6,arp}t_entry_{match,target} definitions to
x_tables.h. This move simplifies code and future compatibility fixes.

Signed-off-by: Dmitry Mishin <dim@openvz.org>
Acked-off-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-22 13:56:56 -08:00
Pablo Neira Ayuso b9f78f9fca [NETFILTER]: nf_conntrack: support for layer 3 protocol load on demand
x_tables matches and targets that require nf_conntrack_ipv[4|6] to work
don't have enough information to load on demand these modules. This
patch introduces the following changes to solve this issue:

o nf_ct_l3proto_try_module_get: try to load the layer 3 connection
tracker module and increases the refcount.
o nf_ct_l3proto_module put: drop the refcount of the module.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-22 13:56:08 -08:00
Shaun Pereira a64b7b936d [X25]: allow ITU-T DTE facilities for x25
Allows use of the optional user facility to insert ITU-T
(http://www.itu.int/ITU-T/) specified DTE facilities in call set-up x25
packets.  This feature is optional; no facilities will be added if the ioctl
is not used, and call setup packet remains the same as before.

If the ioctls provided by the patch are used, then a facility marker will be
added to the x25 packet header so that the called dte address extension
facility can be differentiated from other types of facilities (as described in
the ITU-T X.25 recommendation) that are also allowed in the x25 packet header.

Facility markers are made up of two octets, and may be present in the x25
packet headers of call-request, incoming call, call accepted, clear request,
and clear indication packets.  The first of the two octets represents the
facility code field and is set to zero by this patch.  The second octet of the
marker represents the facility parameter field and is set to 0x0F because the
marker will be inserted before ITU-T type DTE facilities.

Since according to ITU-T X.25 Recommendation X.25(10/96)- 7.1 "All networks
will support the facility markers with a facility parameter field set to all
ones or to 00001111", therefore this patch should work with all x.25 networks.

While there are many ITU-T DTE facilities, this patch implements only the
called and calling address extension, with placeholders in the
x25_dte_facilities structure for the rest of the facilities.

Testing:

This patch was tested using a cisco xot router connected on its serial ports
to an X.25 network, and on its lan ports to a host running an xotd daemon.

It is also possible to test this patch using an xotd daemon and an x25tap
patch, where the xotd daemons work back-to-back without actually using an x.25
network.  See www.fyonne.net for details on how to do this.

Signed-off-by: Shaun Pereira <spereira@tusc.com.au>
Acked-by: Andrew Hendry <ahendry@tusc.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-22 00:01:31 -08:00
Shaun Pereira f0ac261441 [NET]: socket timestamp 32 bit handler for 64 bit kernel
Get socket timestamp handler function that does not use the
ioctl32_hash_table.

Signed-off-by: Shaun Pereira <spereira@tusc.com.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-21 23:59:39 -08:00
Stephen Hemminger f4ad2b162d [LLC]: llc_mac_hdr_init const arguments
Cleanup of LLC.  llc_mac_hdr_init can take constant arguments,
and it is defined twice once in llc_output.h that is otherwise unused.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Acked-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:59:36 -08:00
Arnaldo Carvalho de Melo dec73ff029 [ICSK] compat: Introduce inet_csk_compat_[gs]etsockopt
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:46:16 -08:00
Dmitry Mishin 3fdadf7d27 [NET]: {get|set}sockopt compatibility layer
This patch extends {get|set}sockopt compatibility layer in order to
move protocol specific parts to their place and avoid huge universal
net/compat.c file in the future.

Signed-off-by: Dmitry Mishin <dim@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:45:21 -08:00
Steven Whitehouse c4ea94ab37 [DECnet]: Endian annotation and fixes for DECnet.
The typedef for dn_address has been removed in favour of using __le16
or __u16 directly as appropriate. All the DECnet header files are
updated accordingly.

The byte ordering of dn_eth2dn() and dn_dn2eth() are both changed
since just about all their callers wanted network order rather than
host order, so the conversion is now done in the functions themselves.

Several missed endianess conversions have been picked up during the
conversion process. The nh_gw field in struct dn_fib_info has been
changed from a 32 bit field to 16 bits as it ought to be.

One or two cases of using htons rather than dn_htons in the routing
code have been found and fixed.

There are still a few warnings to fix, but this patch deals with the
important cases.

Signed-off-by: Steven Whitehouse <steve@chygwyn.com>
Signed-off-by: Patrick Caulfield <patrick@tykepenguin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:42:39 -08:00
Patrick McHardy be33690d8f [XFRM]: Fix aevent related crash
When xfrm_user isn't loaded xfrm_nl is NULL, which makes IPsec crash because
xfrm_aevent_is_on passes the NULL pointer to netlink_has_listeners as socket.
A second problem is that the xfrm_nl pointer is not cleared when the socket
is releases at module unload time.

Protect references of xfrm_nl from outside of xfrm_user by RCU, check
that the socket is present in xfrm_aevent_is_on and set it to NULL
when unloading xfrm_user.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:40:54 -08:00
Rick Jones 15d99e02ba [TCP]: sysctl to allow TCP window > 32767 sans wscale
Back in the dark ages, we had to be conservative and only allow 15-bit
window fields if the window scale option was not negotiated.  Some
ancient stacks used a signed 16-bit quantity for the window field of
the TCP header and would get confused.

Those days are long gone, so we can use the full 16-bits by default
now.

There is a sysctl added so that we can still interact with such old
stacks

Signed-off-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:40:29 -08:00
Ingo Molnar 57b47a53ec [NET]: sem2mutex part 2
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:35:41 -08:00
Arjan van de Ven 4a3e2f711a [NET] sem2mutex: net/
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:33:17 -08:00
Michael S. Tsirkin c5ecd62c25 [NET]: Move destructor from neigh->ops to neigh_params
struct neigh_ops currently has a destructor field, which no in-kernel
drivers outside of infiniband use.  The infiniband/ulp/ipoib in-tree
driver stashes some info in the neighbour structure (the results of
the second-stage lookup from ARP results to real link-level path), and
it uses neigh->ops->destructor to get a callback so it can clean up
this extra info when a neighbour is freed.  We've run into problems
with this: since the destructor is in an ops field that is shared
between neighbours that may belong to different net devices, there's
no way to set/clear it safely.

The following patch moves this field to neigh_parms where it can be
safely set, together with its twin neigh_setup.  Two additional
patches in the patch series update ipoib to use this new interface.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:25:41 -08:00
Arnaldo Carvalho de Melo c4d9390941 [ICSK]: Introduce inet_csk_ctl_sock_create
Consolidating open coded sequences in tcp and dccp, v4 and v6.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:01:03 -08:00
John Heffner 0e7b13685f [TCP] mtu probing: move tcp-specific data out of inet_connection_sock
This moves some TCP-specific MTU probing state out of
inet_connection_sock back to tcp_sock.

Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 21:32:58 -08:00
Benjamin LaHaise 1d541ddd74 [AF_UNIX]: scm: better initialization
Instead of doing a memset then initialization of the fields of the scm
structure, just initialize all the members explicitly.  Prevent reloading
of current on x86 and x86-64 by storing the value in a local variable for
subsequent dereferences.  This is worth a ~7KB/s increase in af_unix
bandwidth.  Note that we avoid the issues surrounding potentially
uninitialized members of the ucred structure by constructing a struct
ucred instead of assigning the members individually, which forces the
compiler to zero any padding.

[ I modified the patch not to use the aggregate assignment since
  gcc-3.4.x and earlier cannot optimize that properly at all even
  though gcc-4.0.x and later can -DaveM ]

Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 21:31:51 -08:00
Jamal Hadi Salim 6c5c8ca7ff [IPSEC]: Sync series - policy expires
This is similar to the SA expire insertion patch - only it inserts
expires for SP.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:17:25 -08:00
Jamal Hadi Salim 53bc6b4d29 [IPSEC]: Sync series - SA expires
This patch allows a user to insert SA expires. This is useful to
do on an HA backup for the case of byte counts but may not be very
useful for the case of time based expiry.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:17:03 -08:00
Jamal Hadi Salim 980ebd2579 [IPSEC]: Sync series - acquire insert
This introduces a feature similar to the one described in RFC 2367:
"
   ... the application needing an SA sends a PF_KEY
   SADB_ACQUIRE message down to the Key Engine, which then either
   returns an error or sends a similar SADB_ACQUIRE message up to one or
   more key management applications capable of creating such SAs.
   ...
   ...
   The third is where an application-layer consumer of security
   associations (e.g.  an OSPFv2 or RIPv2 daemon) needs a security
   association.

        Send an SADB_ACQUIRE message from a user process to the kernel.

        <base, address(SD), (address(P),) (identity(SD),) (sensitivity,)
          proposal>

        The kernel returns an SADB_ACQUIRE message to registered
          sockets.

        <base, address(SD), (address(P),) (identity(SD),) (sensitivity,)
          proposal>

        The user-level consumer waits for an SADB_UPDATE or SADB_ADD
        message for its particular type, and then can use that
        association by using SADB_GET messages.

 "
An app such as OSPF could then use ipsec KM to get keys

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:16:40 -08:00
Jamal Hadi Salim f8cd54884e [IPSEC]: Sync series - core changes
This patch provides the core functionality needed for sync events
for ipsec. Derived work of Krisztian KOVACS <hidden@balabit.hu>

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:15:11 -08:00
Patrick McHardy f2ffd9eeda [NETFILTER]: Move ip6_masked_addrcmp to include/net/ipv6.h
Replace netfilter's ip6_masked_addrcmp by a more efficient version
in include/net/ipv6.h to make it usable without module dependencies.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 18:03:16 -08:00
Harald Welte dc808fe28d [NETFILTER] nf_conntrack: clean up to reduce size of 'struct nf_conn'
This patch moves all helper related data fields of 'struct nf_conn'
into a separate structure 'struct nf_conn_help'.  This new structure
is only present in conntrack entries for which we actually have a
helper loaded.

Also, this patch cleans up the nf_conntrack 'features' mechanism to
resemble what the original idea was: Just glue the feature-specific
data structures at the end of 'struct nf_conn', and explicitly
re-calculate the pointer to it when needed rather than keeping
pointers around.

Saves 20 bytes per conntrack on my x86_64 box. A non-helped conntrack
is 276 bytes. We still need to save another 20 bytes in order to fit
into to target of 256bytes.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:56:32 -08:00
John Heffner 5d424d5a67 [TCP]: MTU probing
Implementation of packetization layer path mtu discovery for TCP, based on
the internet-draft currently found at
<http://www.ietf.org/internet-drafts/draft-ietf-pmtud-method-05.txt>.

Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:53:41 -08:00
YOSHIFUJI Hideaki 70ceb4f539 [IPV6]: ROUTE: Add experimental support for Route Information Option in RA (RFC4191).
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:06:24 -08:00
YOSHIFUJI Hideaki ebacaaa0fd [IPV6]: ROUTE: Add support for Router Preference (RFC4191).
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:04:53 -08:00
YOSHIFUJI Hideaki 554cfb7ee5 [IPV6]: ROUTE: Eliminate lock for default route pointer.
And prepare for more advanced router selection.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:00:26 -08:00
YOSHIFUJI Hideaki 955189efb4 [IPV6]: ADDRCONF: Use our standard algorithm for randomized ifid.
RFC 3041 describes an algorithm to generate random interface
identifier.  In RFC 3041bis, it is allowed to use different
algorithm than one described in RFC 3041.

So, let's use our standard pseudo random algorithm to simplify
our implementation.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 16:54:09 -08:00
Jeff Garzik d378aca6ec Merge branch 'master' 2006-03-20 04:38:03 -05:00
Ralf Baechle DL5RB c7c694d196 [AX.25]: Fix potencial memory hole.
If the AX.25 dialect chosen by the sysadmin is set to DAMA master / 3
(or DAMA slave / 2, if CONFIG_AX25_DAMA_SLAVE=n) ax25_kick() will fall
through the switch statement without calling ax25_send_iframe() or any
other function that would eventually free skbn thus leaking the packet.

Fix by restricting the sysctl inferface to allow only actually supported
AX.25 dialects.

The system administration mistake needed for this to happen is rather
unlikely, so this is an uncritical hole.

Coverity #651.

Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-19 13:20:06 -08:00
Alexey Kuznetsov 265a92856b [NET]: Fix race condition in sk_wait_event().
It is broken, the condition is checked out of socket lock. It is
wonderful the bug survived for so long time.

[ This fixes bugzilla #6233:
  race condition in tcp_sendmsg when connection became established ]

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-17 16:05:43 -08:00
Jeff Garzik 68727fed54 Merge branch 'upstream-fixes' 2006-03-01 01:58:38 -05:00
Herbert Xu 752c1f4c78 [IPSEC]: Kill post_input hook and do NAT-T in esp_input directly
The only reason post_input exists at all is that it gives us the
potential to adjust the checksums incrementally in future which
we ought to do.

However, after thinking about it for a bit we can adjust the
checksums without using this post_input stuff at all.  The crucial
point is that only the inner-most NAT-T SA needs to be considered
when adjusting checksums.  What's more, the checksum adjustment
comes down to a single u32 due to the linearity of IP checksums.

We just happen to have a spare u32 lying around in our skb structure :)
When ip_summed is set to CHECKSUM_NONE on input, the value of skb->csum
is currently unused.  All we have to do is to make that the checksum
adjustment and voila, there goes all the post_input and decap structures!

I've left in the decap data structures for now since it's intricately
woven into the sec_path stuff.  We can kill them later too.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-27 13:00:40 -08:00
Jeff Garzik dbfedbb981 Merge branch 'master' 2006-02-27 11:33:51 -05:00
Herbert Xu 21380b81ef [XFRM]: Eliminate refcounting confusion by creating __xfrm_state_put().
We often just do an atomic_dec(&x->refcnt) on an xfrm_state object
because we know there is more than 1 reference remaining and thus
we can elide the heavier xfrm_state_put() call.

Do this behind an inline function called __xfrm_state_put() so that is
more obvious and also to allow us to more cleanly add refcount
debugging later.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-23 16:10:53 -08:00
Jeff Garzik b04a92e160 Merge branch 'upstream-fixes' 2006-02-17 16:20:30 -05:00
Patrick McHardy 48d5cad87c [XFRM]: Fix SNAT-related crash in xfrm4_output_finish
When a packet matching an IPsec policy is SNATed so it doesn't match any
policy anymore it looses its xfrm bundle, which makes xfrm4_output_finish
crash because of a NULL pointer dereference.

This patch directs these packets to the original output path instead. Since
the packets have already passed the POST_ROUTING hook, but need to start at
the beginning of the original output path which includes another
POST_ROUTING invocation, a flag is added to the IPCB to indicate that the
packet was rerouted and doesn't need to pass the POST_ROUTING hook again.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-15 15:10:22 -08:00
David S. Miller 15c38c6ecd Merge master.kernel.org:/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6 2006-02-13 15:40:55 -08:00
Joe Perches 7a11c4d063 [IRDA]: Ratelimit messages.
From: Joe Perches <joe@perches.com>

Based upon a patch by Dave Jones.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-13 15:34:11 -08:00
Marcel Holtmann 56f3a40a5e [Bluetooth] Reduce L2CAP MTU for RFCOMM connections
This patch reduces the default L2CAP MTU for all RFCOMM connections
from 1024 to 1013 to improve the interoperability with some broken
RFCOMM implementations. To make this more flexible the L2CAP MTU
becomes also a module parameter and so it can changed at runtime.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2006-02-13 11:39:57 +01:00
Samuel Ortiz d93077fb0e [IRDA]: Set proper IrLAP device address length
This patch set IrDA's addr_len properly, i.e to 4 bytes, the size of the
IrLAP device address.

Signed-off-by: Samuel Ortiz <samuel.ortiz@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-09 16:58:46 -08:00
Jeff Garzik 3c9b3a8575 Merge branch 'master' 2006-02-07 01:47:12 -05:00