linux-2.6

dect

Archived

Author	SHA1	Message	Date
Johannes Berg	618f356b95	mac80211: rename WLAN_STA_SUSPEND to WLAN_STA_BLOCK_BA I want to use it during station destruction as well so rename it to WLAN_STA_BLOCK_BA which is also the only use of it now. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-07 14:38:04 -04:00
Johannes Berg	66b0470aee	mac80211: remove ieee80211_sta_stop_rx_ba_session All callers of ieee80211_sta_stop_rx_ba_session can just call __ieee80211_stop_rx_ba_session instead because they already have the station struct, so do that and remove ieee80211_sta_stop_rx_ba_session. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-07 14:38:03 -04:00
Johannes Berg	2b43ae6daf	mac80211: remove irq disabling for sta lock All other places except one in the TX path, which has BHs disabled, and it also cannot be locked from interrupts so disabling IRQs is not necessary. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-07 14:38:01 -04:00
Johannes Berg	e64b379574	mac80211: fix station destruction problem When a station w/o a key is destroyed, or when a driver submits work for a station and thereby references it again, it seems like potentially we could reference the station structure while it is being destroyed. Wait for an RCU grace period to elapse before finishing destroying the station after we have removed the station from the driver and from the hash table etc., even in the case where no key is associated with the station. Also, there's no point in deleting the plink timer here since it'll be properly deleted just a bit later. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-07 14:38:00 -04:00
Jouni Malinen	d5cdfacb35	cfg80211: Add local-state-change-only auth/deauth/disassoc cfg80211 is quite strict on allowing authentication and association commands only in certain states. In order to meet these requirements, user space applications may need to clear authentication or association state in some cases. Currently, this can be done with deauth/disassoc command, but that ends up sending out Deauthentication or Disassociation frame unnecessarily. Add a new nl80211 attribute to allow this sending of the frame be skipped, but with all other deauth/disassoc operations being completed. Similar state change is also needed for IEEE 802.11r FT protocol in the FT-over-DS case which does not use Authentication frame exchange in a transition to another BSS. For this to work with cfg80211, an authentication entry needs to be created for the target BSS without sending out an Authentication frame. The nl80211 authentication command can be used for this purpose, too, with the new attribute to indicate that the command is only for changing local state. This enables wpa_supplicant to complete FT-over-DS transition successfully. Signed-off-by: Jouni Malinen <j@w1.fi> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-07 14:37:56 -04:00
Timo Teräs	8e4795605d	flow: delayed deletion of flow cache entries Speed up lookups by freeing flow cache entries later. After virtualizing flow cache entry operations, the flow cache may now end up calling policy or bundle destructor which can be slowish. As gc_list is more effective with double linked list, the flow cache is converted to use common hlist and list macroes where appropriate. Signed-off-by: Timo Teras <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-07 03:43:20 -07:00
Timo Teräs	285ead175c	xfrm: remove policy garbage collection Policies are now properly reference counted and destroyed from all code paths. The delayed gc is just an overhead now and can be removed. Signed-off-by: Timo Teras <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-07 03:43:19 -07:00
Timo Teräs	80c802f307	xfrm: cache bundles instead of policies for outgoing flows __xfrm_lookup() is called for each packet transmitted out of system. The xfrm_find_bundle() does a linear search which can kill system performance depending on how many bundles are required per policy. This modifies __xfrm_lookup() to store bundles directly in the flow cache. If we did not get a hit, we just create a new bundle instead of doing slow search. This means that we can now get multiple xfrm_dst's for same flow (on per-cpu basis). Signed-off-by: Timo Teras <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-07 03:43:19 -07:00
Timo Teräs	fe1a5f031e	flow: virtualize flow cache entry methods This allows to validate the cached object before returning it. It also allows to destruct object properly, if the last reference was held in flow cache. This is also a prepartion for caching bundles in the flow cache. In return for virtualizing the methods, we save on: - not having to regenerate the whole flow cache on policy removal: each flow matching a killed policy gets refreshed as the getter function notices it smartly. - we do not have to call flow_cache_flush from policy gc, since the flow cache now properly deletes the object if it had any references Signed-off-by: Timo Teras <timo.teras@iki.fi> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-07 03:43:18 -07:00
David S. Miller	4a35ecf8bf	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/bonding/bond_main.c drivers/net/via-velocity.c drivers/net/wireless/iwlwifi/iwl-agn.c	2010-04-06 23:53:30 -07:00
Hagen Paul Pfeifer	842509b859	socket: remove duplicate declaration of struct timespec struct timespec ts was alreay defined. Reuse the previously defined one and reduce the memory footprint on the stack by 16 bytes. Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-06 19:50:20 -07:00
Jon Paul Maloy	c6537d6742	TIPC: Updated topology subscription protocol according to latest spec This patch makes it explicit in the API that all fields in subscriptions and events exchanged with the Topology Server must be in network byte order. It also ensures that all fields of a subscription are compared when cancelling a subscription, in order to avoid inadvertent cancelling of the wrong subscription. Finally, the tipc module version is updated to 2.0.0, to reflect the API change. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-06 19:50:19 -07:00
Jouni Malinen	d211e90e28	mac80211: Fix robust management frame handling (MFP) Commit e34e09401ee9888dd662b2fca5d607794a56daf2 incorrectly removed use of ieee80211_has_protected() from the management frame case and in practice, made this validation drop all Action frames when MFP is enabled. This should have only been done for frames with Protected field set to zero. Signed-off-by: Jouni Malinen <j@w1.fi> Cc: stable@kernel.org Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-06 16:49:33 -04:00
Johannes Berg	0379185b6c	mac80211: annotate station rcu dereferences The new RCU lockdep support warns about these in some contexts -- make it aware of the locks used to protect all this. Different locks are used in different contexts which unfortunately means we can't get perfect checking. Also remove rcu_dereference() from two places that don't actually dereference the pointers. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-06 15:53:30 -04:00
Javier Cardona	1cb561f837	mac80211: Handle mesh action frames in ieee80211_rx_h_action This fixes the problem introduced in commit `8404080568` which broke mesh peer link establishment. changes: v2 Added missing break (Johannes) v3 Broke original patch into two (Johannes) Signed-off-by: Javier Cardona <javier@cozybit.com> Cc: stable@kernel.org Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-04-06 15:53:28 -04:00
Linus Torvalds	cb4361c1dc	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (37 commits) smc91c92_cs: fix the problem of "Unable to find hardware address" r8169: clean up my printk uglyness net: Hook up cxgb4 to Kconfig and Makefile cxgb4: Add main driver file and driver Makefile cxgb4: Add remaining driver headers and L2T management cxgb4: Add packet queues and packet DMA code cxgb4: Add HW and FW support code cxgb4: Add register, message, and FW definitions netlabel: Fix several rcu_dereference() calls used without RCU read locks bonding: fix potential deadlock in bond_uninit() net: check the length of the socket address passed to connect(2) stmmac: add documentation for the driver. stmmac: fix kconfig for crc32 build error be2net: fix bug in vlan rx path for big endian architecture be2net: fix flashing on big endian architectures be2net: fix a bug in flashing the redboot section bonding: bond_xmit_roundrobin() fix drivers/net: Add missing unlock net: gianfar - align BD ring size console messages net: gianfar - initialize per-queue statistics ...	2010-04-06 08:34:06 -07:00
YOSHIFUJI Hideaki / 吉藤英明	2f787b0b76	mac80211: Ensure initializing private mc_list in prepare_multicast(). Fix kernel panic by NULL pointer dereference in the context of ieee80211_ops->prepare_multicast(). This bug was introduced by commit 22bedad3c.. ("net: convert multicast list to list_head"). Call __hw_addr_init() in ieee80211_alloc_hw() to initialize list_head of private device multicast list, like we do in bond_init(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Reviewed-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-06 00:12:30 -07:00
Eric Dumazet	e4008276fd	net: Add a missing local_irq_enable() As noticed by Changli Gao, we must call local_irq_enable() after rps_unlock() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-05 15:42:39 -07:00
Tom Herbert	5a6d234e73	rps: fixed missed rps_unlock Fix spin_unlock_irq which needs to be rps_unlock. Signed-off-by: Tom Herbert <therbert@google.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-05 14:37:55 -07:00
Linus Torvalds	749d229761	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: 9p: saving negative to unsigned char 9p: return on mutex_lock_interruptible() 9p: Creating files with names too long should fail with ENAMETOOLONG. 9p: Make sure we are able to clunk the cached fid on umount 9p: drop nlink remove fs/9p: Clunk the fid resulting from partial walk of the name 9p: documentation update 9p: Fix setting of protocol flags in v9fs_session_info structure.	2010-04-05 13:42:54 -07:00
Dan Carpenter	3dc9fef67f	9p: saving negative to unsigned char Saving -EINVAL as unsigned char truncates the high bits and changes it into 234 instead of -22. This breaks the test for "if (ret == -EINVAL)" in parse_opts(). Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2010-04-05 14:37:28 -05:00
Tom Tucker	bade732a28	svcrdma: RDMA support not yet compatible with RPC6 RPC6 requires that it be possible to create endpoints that listen exclusively for IPv4 or IPv6 connection requests. This is not currently supported by the RDMA API. This fixes a server RDMA regression introduced by `37498292a` "NFSD: Create PF_INET6 listener in write_ports". Signed-off-by: Tom Tucker<tom@opengridcomputing.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-05 12:10:22 -04:00
Aneesh Kumar K.V	6d96d3ab7a	9p: Make sure we are able to clunk the cached fid on umount dcache prune happen on umount. So we cannot mark the client satus disconnect. That will prevent a 9p call to the server Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2010-04-05 10:37:36 -05:00
Eric Dumazet	7bddd0db62	l2tp: unmanaged L2TPv3 tunnels fixes Followup to commit `789a4a2c` (l2tp: Add support for static unmanaged L2TPv3 tunnels) One missing init in l2tp_tunnel_sock_create() could access random kernel memory, and a bit field should be unsigned. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-04 01:02:46 -07:00
Brian Haley	486f50ca79	SCTP: Change to use ipv6_addr_copy() Change SCTP IPv6 code to use ipv6_addr_copy() Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 15:10:21 -07:00
Eric Dumazet	1f8438a853	icmp: Account for ICMP out errors When ip_append() fails because of socket limit or memory shortage, increment ICMP_MIB_OUTERRORS counter, so that "netstat -s" can report these errors. LANG=C netstat -s \| grep "ICMP messages failed" 0 ICMP messages failed For IPV6, implement ICMP6_MIB_OUTERRORS counter as well. # grep Icmp6OutErrors /proc/net/dev_snmp6/* /proc/net/dev_snmp6/eth0:Icmp6OutErrors 0 /proc/net/dev_snmp6/lo:Icmp6OutErrors 0 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 15:09:04 -07:00
David S. Miller	f66ef2d064	l2tp: Fix L2TP_DEBUGFS ifdef tests. We have to check CONFIG_L2TP_DEBUGFS_MODULE as well as CONFIG_L2TP_DEBUGFS. Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 15:01:37 -07:00
David S. Miller	f481c0d862	l2tp: Add missing semicolon to MODULE_ALIAS() in l2tp_netlink.c Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 14:58:07 -07:00
James Chapman	789a4a2c61	l2tp: Add support for static unmanaged L2TPv3 tunnels This patch adds support for static (unmanaged) L2TPv3 tunnels, where the tunnel socket is created by the kernel rather than being created by userspace. This means L2TP tunnels and sessions can be created manually, without needing an L2TP control protocol implemented in userspace. This might be useful where the user wants a simple ethernet over IP tunnel. A patch to iproute2 adds a new command set under "ip l2tp" to make use of this feature. This will be submitted separately. Signed-off-by: James Chapman <jchapman@katalix.com> Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 14:56:08 -07:00
James Chapman	0ad6614048	l2tp: Add debugfs files for dumping l2tp debug info The existing pppol2tp driver exports debug info to /proc/net/pppol2tp. Rather than adding info to that file for the new functionality added in this patch series, we add new files in debugfs, leaving the old /proc file for backwards compatibility (L2TPv2 only). Currently only one file is provided: l2tp/tunnels, which lists internal debug info for all l2tp tunnels and sessions. More files may be added later. The info is for debug and problem analysis only - userspace apps should use netlink to obtain status about l2tp tunnels and sessions. Although debugfs does not support net namespaces, the tunnels and sessions dumped in l2tp/tunnels are only those in the net namespace of the process reading the file. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 14:56:07 -07:00
James Chapman	d9e31d17ce	l2tp: Add L2TP ethernet pseudowire support This driver presents a regular net_device for each L2TP ethernet pseudowire instance. These interfaces are named l2tpethN by default, though userspace can specify an alternative name when the L2TP session is created, if preferred. When the pseudowire is established, regular Linux networking utilities may be used to configure the interface, i.e. give it IP address info or add it to a bridge. Any data passed over the interface is carried over an L2TP tunnel. Signed-off-by: James Chapman <jchapman@katalix.com> Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 14:56:06 -07:00
James Chapman	e02d494d2c	l2tp: Convert rwlock to RCU Reader/write locks are discouraged because they are slower than spin locks. So this patch converts the rwlocks used in the per_net structs to rcu. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 14:56:06 -07:00
James Chapman	309795f4be	l2tp: Add netlink control API for L2TP In L2TPv3, we need to create/delete/modify/query L2TP tunnel and session contexts. The number of parameters is significant. So let's use netlink. Userspace uses this API to control L2TP tunnel/session contexts in the kernel. The previous pppol2tp driver was managed using [gs]etsockopt(). This API is retained for backwards compatibility. Unlike L2TPv2 which carries only PPP frames, L2TPv3 can carry raw ethernet frames or other frame types and these do not always have an associated socket family. Therefore, we need a way to use L2TP sessions that doesn't require a socket type for each supported frame type. Hence netlink is used. Signed-off-by: James Chapman <jchapman@katalix.com> Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 14:56:05 -07:00
James Chapman	f408e0ce40	netlink: Export genl_lock() API for use by modules This lets kernel modules which use genl netlink APIs serialize netlink processing. Signed-off-by: James Chapman <jchapman@katalix.com> Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 14:56:05 -07:00
James Chapman	0d76751fad	l2tp: Add L2TPv3 IP encapsulation (no UDP) support This patch adds a new L2TPIP socket family and modifies the core to handle the case where there is no UDP header in the L2TP packet. L2TP/IP uses IP protocol 115. Since L2TP/UDP and L2TP/IP packets differ in layout, the datapath packet handling code needs changes too. Userspace uses an L2TPIP socket instead of a UDP socket when IP encapsulation is required. We can't use raw sockets for this because the semantics of raw sockets don't lend themselves to the socket-per-tunnel model - we need to Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 14:56:04 -07:00
James Chapman	e0d4435f93	l2tp: Update PPP-over-L2TP driver to work over L2TPv3 This patch makes changes to the L2TP PPP code for L2TPv3. The existing code has some assumptions about the L2TP header which are broken by L2TPv3. Also the sockaddr_pppol2tp structure of the original code is too small to support the increased size of the L2TPv3 tunnel and session id, so a new sockaddr_pppol2tpv3 structure is needed. In the socket calls, the size of this structure is used to tell if the operation is for L2TPv2 or L2TPv3. Signed-off-by: James Chapman <jchapman@katalix.com> Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 14:56:04 -07:00
James Chapman	f7faffa3ff	l2tp: Add L2TPv3 protocol support The L2TPv3 protocol changes the layout of the L2TP packet header. Tunnel and session ids change from 16-bit to 32-bit values, data sequence numbers change from 16-bit to 24-bit values and PPP-specific fields are moved into protocol-specific subheaders. Although this patch introduces L2TPv3 protocol support, there are no userspace interfaces to create L2TPv3 sessions yet. Signed-off-by: James Chapman <jchapman@katalix.com> Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 14:56:03 -07:00
James Chapman	9345471bca	l2tp: Add ppp device name to L2TP ppp session data When dumping L2TP PPP sessions using /proc/net/pppol2tp, get the assigned PPP device name from PPP using ppp_dev_name(). Signed-off-by: James Chapman <jchapman@katalix.com> Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 14:56:03 -07:00
James Chapman	fd558d186d	l2tp: Split pppol2tp patch into separate l2tp and ppp parts This patch splits the pppol2tp driver into separate L2TP and PPP parts to prepare for L2TPv3 support. In L2TPv3, protocols other than PPP can be carried, so this split creates a common L2TP core that will handle the common L2TP bits which protocol support modules such as PPP will use. Note that the existing pppol2tp module is split into l2tp_core and l2tp_ppp by this change. There are no feature changes here. Internally, however, there are significant changes, mostly to handle the separation of PPP-specific data from the L2TP session and to provide hooks in the core for modules like PPP to access. Signed-off-by: James Chapman <jchapman@katalix.com> Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 14:56:02 -07:00
James Chapman	21b4aaa143	l2tp: Relocate pppol2tp driver to new net/l2tp directory This patch moves the existing pppol2tp driver from drivers/net into a new net/l2tp directory, which is where the upcoming L2TPv3 code will live. The existing CONFIG_PPPOL2TP config option is left in its current place to avoid "make oldconfig" issues when an existing pppol2tp user takes this change. (This is the same approach used for the pppoatm driver, which moved to net/atm.) There are no code changes. The existing drivers/net/pppol2tp.c is simply moved to net/l2tp. Signed-off-by: James Chapman <jchapman@katalix.com> Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 14:56:01 -07:00
Jiri Pirko	22bedad3ce	net: convert multicast list to list_head Converts the list and the core manipulating with it to be the same as uc_list. +uses two functions for adding/removing mc address (normal and "global" variant) instead of a function parameter. +removes dev_mcast.c completely. +exposes netdev_hw_addr_list_* macros along with __hw_addr_* functions for manipulation with lists on a sandbox (used in bonding and 80211 drivers) Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 14:22:15 -07:00
Jiri Pirko	a748ee2426	net: move address list functions to a separate file +little renaming of unicast functions to be smooth with multicast ones Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-03 14:22:11 -07:00
Eric Dumazet	9092c658ba	net: illegal_highdma() fix Followup to commit `5acbbd428d` (net: change illegal_highdma to use dma_mask) If dev->dev.parent is NULL, we should not try to dereference it. Dont force inline illegal_highdma() as its pretty big now. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-02 13:34:49 -07:00
FUJITA Tomonori	5acbbd428d	net: change illegal_highdma to use dma_mask Robert Hancock pointed out two problems about NETIF_F_HIGHDMA: -Many drivers only set the flag when they detect they can use 64-bit DMA, since otherwise they could receive DMA addresses that they can't handle (which on platforms without IOMMU/SWIOTLB support is fatal). This means that if 64-bit support isn't available, even buffers located below 4GB will get copied unnecessarily. -Some drivers set the flag even though they can't actually handle 64-bit DMA, which would mean that on platforms without IOMMU/SWIOTLB they would get a DMA mapping error if the memory they received happened to be located above 4GB. http://lkml.org/lkml/2010/3/3/530 We can use the dma_mask if we need bouncing or not here. Then we can safely fix drivers that misuse NETIF_F_HIGHDMA. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-01 19:53:12 -07:00
Timo Teräs	d7997fe1f4	flow: structurize flow cache Group all per-cpu data to one structure instead of having many globals. Also prepare the internals so that we can have multiple instances of the flow cache if needed. Only the kmem_cache is left as a global as all flow caches share the same element size, and benefit from using a common cache. Signed-off-by: Timo Teras <timo.teras@iki.fi> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-01 19:41:36 -07:00
Timo Teräs	ea2dea9dac	xfrm: remove policy lock when accessing policy->walk.dead All of the code considers ->dead as a hint that the cached policy needs to get refreshed. The read side can just drop the read lock without any side effects. The write side needs to make sure that it's written only exactly once. Only possible race is at xfrm_policy_kill(). This is fixed by checking result of __xfrm_policy_unlink() when needed. It will always succeed if the policy object is looked up from the hash list (so some checks are removed), but it needs to be checked if we are trying to unlink policy via a reference (appropriate checks added). Since policy->walk.dead is written exactly once, it no longer needs to be protected with a write lock. Signed-off-by: Timo Teras <timo.teras@iki.fi> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-01 19:41:35 -07:00
Timo Teräs	c8bf4d04f9	xfrm_user: verify policy direction at XFRM_MSG_POLEXPIRE handler Add missing check for policy direction verification. This is especially important since without this xfrm_user may end up deleting per-socket policy which is not allowed. Signed-off-by: Timo Teras <timo.teras@iki.fi> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-01 19:41:35 -07:00
Herbert Xu	34996cb91d	xfrm: Remove xfrm_state_genid The xfrm state genid only needs to be matched against the copy saved in xfrm_dst. So we don't need a global genid at all. In fact, we don't even need to initialise it. Based on observation by Timo Teräs. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-01 19:41:34 -07:00
Changli Gao	152102c7f2	rps: keep the old behavior on SMP without rps keep the old behavior on SMP without rps RPS introduces a lock operation to per cpu variable input_pkt_queue on SMP whenever rps is enabled or not. On SMP without RPS, this lock isn't needed at all. Signed-off-by: Changli Gao <xiaosuo@gmail.com> ---- net/core/dev.c \| 42 ++++++++++++++++++++++++++++-------------- 1 file changed, 28 insertions(+), 14 deletions(-) Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-01 18:41:40 -07:00
Eric Dumazet	5d944c640b	gen_estimator: deadlock fix One of my test machine got a deadlock during "tc" sessions, adding/deleting classes & filters, using traffic estimators. After some analysis, I believe we have a potential use after free case in est_timer() : spin_lock(e->stats_lock); << HERE >> read_lock(&est_lock); if (e->bstats == NULL) << TEST >> goto skip; Test is done a bit late, because after estimator is killed, and before rcu grace period elapsed, we might already have freed/reuse memory where e->stats_locks points to (some qdisc->q.lock) A possible fix is to respect a rcu grace period at Qdisc dismantle time. On 64bit, sizeof(struct Qdisc) is exactly 192 bytes. Adding 16 bytes to it (for struct rcu_head) is a problem because it might change performance, given QDISC_ALIGNTO is 32 bytes. This is why I also change QDISC_ALIGNTO to 64 bytes, to satisfy most current alignment requirements. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-01 18:38:48 -07:00
Hagen Paul Pfeifer	d4fc6dbb5a	ipv4: remove redundant verification code The check if error signaling is wanted (inet->recverr != 0) is done by the caller: raw.c:raw_err() and udp.c:__udp4_lib_err(), so there is no need to check this condition again. Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-01 18:38:47 -07:00
Paul Moore	b914f3a2a3	netlabel: Fix several rcu_dereference() calls used without RCU read locks The recent changes to add RCU lock verification to rcu_dereference() calls caught out a problem with netlbl_unlhsh_hash(), see below. =================================================== [ INFO: suspicious rcu_dereference_check() usage. ] --------------------------------------------------- net/netlabel/netlabel_unlabeled.c:246 invoked rcu_dereference_check() without protection! This patch fixes this problem as well as others like it in the NetLabel code. Also included in this patch is the identification of future work to eliminate the RCU read lock in netlbl_domhsh_add(), but in the interest of getting this patch out quickly that work will happen in another patch to be finished later. Thanks to Eric Dumazet and Paul McKenney for their help in understanding the recent RCU changes. Signed-off-by: Paul Moore <paul.moore@hp.com> Reported-by: David Howells <dhowells@redhat.com> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-01 18:32:08 -07:00
Changli Gao	6503d96168	net: check the length of the socket address passed to connect(2) check the length of the socket address passed to connect(2). Check the length of the socket address passed to connect(2). If the length is invalid, -EINVAL will be returned. Signed-off-by: Changli Gao <xiaosuo@gmail.com> ---- net/bluetooth/l2cap.c \| 3 ++- net/bluetooth/rfcomm/sock.c \| 3 ++- net/bluetooth/sco.c \| 3 ++- net/can/bcm.c \| 3 +++ net/ieee802154/af_ieee802154.c \| 3 +++ net/ipv4/af_inet.c \| 5 +++++ net/netlink/af_netlink.c \| 3 +++ 7 files changed, 20 insertions(+), 3 deletions(-) Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-01 17:26:01 -07:00
Stephen Rothwell	6c57990696	net-caif: using kmalloc/kfree requires the include of slab.h Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-01 00:28:49 -07:00
David S. Miller	d5dc056cce	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6	2010-03-31 19:32:50 -07:00
Jouni Malinen	e3efca0a63	mac80211: Fix drop_unencrypted for MFP with hwaccel Commit `bef5d1c70d` split ieee80211_drop_unencrypted() into separate functions that are used for Data and Management frames. However, it did not handle the RX_FLAG_DECRYPTED correctly for Management frames: ieee80211_drop_unencrypted() can only return 0 for Management frames, so there is no point in calling it here. Instead, just check the status->flag directly. Signed-off-by: Jouni Malinen <j@w1.fi> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-03-31 14:52:15 -04:00
Marco Porsch	d5d9de024c	nl80211: reenable station del for mesh iw dev <devname> station del <MAC address> is quiet useful in mesh mode and should be possible. Signed-off-by: Marco Porsch <marco.porsch@siemens.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-03-31 14:49:12 -04:00
Jouni Malinen	fa83a21898	mac80211: Fix dropping of unprotected robust multicast frames When selecting the RX key for group-addressed robust management frames, we do not actually select any BIP key if the frame is unprotected (since we cannot find the key index from MMIE). This results in the drop_unencrypted check in failing to drop the frame. It is enough to verify that we have a STA entry for the transmitter and that MFP is enabled for that STA; we do not need to check rx->key here. This fixes BIP processing for unprotected, group-addressed, robust management frames. Signed-off-by: Jouni Malinen <j@w1.fi> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-03-31 14:46:42 -04:00
Jouni Malinen	ecbcd32436	mac80211: Fix BIP to be used only with group-addressed frames BIP (part of IEEE 802.11w) is only supposed to be used with group-addressed frames. We ended up picking it as a default mechanism for every management whenever we did not have a STA entry for the destination (e.g., for Probe Response to a STA that is not associated). While the extra MMIE in the end of management frames should not break frames completed in most cases, there is no point in doing this. Fix key selection to pick the default management key only if the frame is sent to multicast/broadcast address and the frame is a robust management frame. Signed-off-by: Jouni Malinen <j@w1.fi> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-03-31 14:46:42 -04:00
Jouni Malinen	e69e95dbec	mac80211: Send deauth/disassoc prior to dropping STA entry When management frame protection (IEEE 802.11w) is used, the deauthentication and disassociation frames must be protected whenever the encryption keys are configured. We were removing the STA entry and with it, the keys, just before actually sending out these frames which meant that the frames went out unprotected. The AP will drop them in such a case. Fix this by reordering the operations a bit so that sta_info_destroy_addr() gets called only after ieee80211_send_deauth_disassoc(). Signed-off-by: Jouni Malinen <j@w1.fi> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-03-31 14:46:42 -04:00
Jouni Malinen	17e4ec147f	mac80211: Track Beacon signal strength and implement cqm events Calculate a running average of the signal strength reported for Beacon frames and indicate cqm events if the average value moves below or above the configured threshold value (and filter out repetitive events with by using the configured hysteresis). Signed-off-by: Jouni Malinen <j@w1.fi> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-03-31 14:46:42 -04:00
Stanislaw Gruszka	0af26b278b	mac80211: enable QoS explicitly in AP mode Enable QoS explicitly, when user space AP program will setup a QoS queues. Currently this is not needed as iwlwifi not work in AP mode and no other driver implement enable/disable QoS. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-03-31 14:46:38 -04:00
Stanislaw Gruszka	e1b3ec1a2a	mac80211: explicitly disable/enable QoS Add interface to disable/enable QoS (aka WMM or WME). Currently drivers enable it explicitly when ->conf_tx method is called, and newer disable. Disabling is needed for some APs, which do not support QoS, such we should send QoS frames to them. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-03-31 14:43:59 -04:00
Zhu Yi	e3cf8b3f7b	mac80211: support paged rx SKBs Mac80211 drivers can now pass paged SKBs to mac80211 via ieee80211_rx{_irqsafe}. The implementation currently use skb_linearize() in a few places i.e. management frame handling, software decryption, defragmentation and A-MSDU process. We will optimize them one by one later. Signed-off-by: Zhu Yi <yi.zhu@intel.com> Cc: Kalle Valo <kalle.valo@iki.fi> Cc: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-03-31 14:39:34 -04:00
Frans Pop	55f98938b5	wireless: remove trailing space in messages Also correct indentation in net/wireless/reg.c. Signed-off-by: Frans Pop <elendil@planet.nl> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-03-31 14:38:51 -04:00
Hagen Paul Pfeifer	b68c92460d	sctp: eliminate useless code Remove duplicate declaration of symbol: struct hlist_node *node was already declared, the seconds declaration shadows the first one. CC: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-30 23:58:22 -07:00
Hagen Paul Pfeifer	8379d07031	tipc: define needless global scoped variable static struct _zone *tipc_zones has local scope level and should defined with the correct scoping. CC: Per Liden <per.liden@nospam.ericsson.com> Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-30 23:58:22 -07:00
laurent chavey	598ed9367a	fix net/core/dst.c coding style error and warnings Fix coding style errors and warnings output while running checkpatch.pl on the file net/core/dst.c. Signed-off-by: chavey <chavey@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-30 23:51:08 -07:00
stephen hemminger	b00fabb402	netdev: ethtool RXHASH flag This adds ethtool and device feature flag to allow control of receive hashing offload. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-30 23:51:08 -07:00
YOSHIFUJI Hideaki / 吉藤英明	02cdce53f3	ipv6 fib: Use "Sweezle" to optimize addr_bit_test(). addr_bit_test() is used in various places in IPv6 routing table subsystem. It checks if the given fn_bit is set, where fn_bit counts bits from MSB in words in network-order. fn_bit : 0 .... 31 32 .... 64 65 .... 95 96 ....127 fn_bit >> 5 gives offset of word, and (~fn_bit & 0x1f) gives count from LSB in the network-endian word in question. fn_bit >> 5 : 0 1 2 3 ~fn_bit & 0x1f: 31 .... 0 31 .... 0 31 .... 0 31 .... 0 Thus, the mask was generated as htonl(1 << (~fn_bit & 0x1f)). This can be optimized by "sweezle" (See include/asm-generic/bitops/le.h). In little-endian, htonl(1 << bit) = 1 << (bit ^ BITOP_BE32_SWIZZLE) where BITOP_BE32_SWIZZLE is (0x1f & ~7) So, htonl(1 << (~fn_bit & 0x1f)) = 1 << ((~fn_bit & 0x1f) ^ (0x1f & ~7)) = 1 << ((~fn_bit ^ ~7) & 0x1f) = 1 << ((~fn_bit ^ BITOP_BE32_SWIZZLE) & 0x1f) In big-endian, BITOP_BE32_SWIZZLE is equal to 0. 1 << ((~fn_bit ^ BITOP_BE32_SWIZZLE) & 0x1f) = 1 << ((~fn_bit) & 0x1f) = htonl(1 << (~fn_bit & 0x1f)) Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-30 23:28:47 -07:00
YOSHIFUJI Hideaki / 吉藤英明	de7737e056	sctp: Use ipv6_addr_diff() in sctp_v6_addr_match_len(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-30 23:28:47 -07:00
Tom Goff	7e5ab15781	net_sched: minor netns related cleanup These changes were suggested by Alexey Dobriyan <adobriyan@gmail.com>: - psched_show() does not use any private data so just pass NULL to psched_open() - remove unnecessary return statement Signed-off-by: Tom Goff <thomas.goff@boeing.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-30 19:44:56 -07:00
Sjur Braendeland	3908c69023	net-caif: add CAIF Kconfig and Makefiles Kconfig and Makefiles with options for: CAIF: Including caif CAIF_DEBUG: CAIF Debug CAIF_NETDEV: CAIF Network Device for GPRS Contexts Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-30 19:08:49 -07:00
Sjur Braendeland	cc36a070b5	net-caif: add CAIF netdevice Adding GPRS Net Device for PDP Contexts. The device can be managed by RTNL as defined in if_caif.h. Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-30 19:08:48 -07:00
Sjur Braendeland	e6f95ec8db	net-caif: add CAIF socket implementation Implementation of CAIF sockets for protocol and address family PF_CAIF and AF_CAIF. CAIF socket is connection oriented implementing SOCK_SEQPACKET and SOCK_STREAM interface with supporting blocking and non-blocking mode. Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-30 19:08:48 -07:00
Sjur Braendeland	c72dfae2f7	net-caif: add CAIF device registration functionality Registration and deregistration of CAIF Link Layer. Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-30 19:08:47 -07:00
Sjur Braendeland	15c9ac0c80	net-caif: add CAIF generic caif support functions Support functions for the caif protocol stack: cfcnfg.c - CAIF Configuration Module used for adding and removing drivers and connection cfpkt_skbuff.c - CAIF Packet layer (SKB helper functions) Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-30 19:08:47 -07:00
Sjur Braendeland	b482cd2053	net-caif: add CAIF core protocol stack CAIF generic protocol implementation. This layer is somewhat generic in order to be able to use and test it outside the Linux Kernel. cfctrl.c - CAIF control protocol layer cfdbgl.c - CAIF debug protocol layer cfdgml.c - CAIF datagram protocol layer cffrml.c - CAIF framing protocol layer cfmuxl.c - CAIF mux protocol layer cfrfml.c - CAIF remote file manager protocol layer cfserl.c - CAIF serial (fragmentation) protocol layer cfsrvl.c - CAIF generic service layer functions cfutill.c - CAIF utility protocol layer cfveil.c - CAIF AT protocol layer cfvidl.c - CAIF video protocol layer Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-30 19:08:46 -07:00
Steven J. Magnani	baff42ab14	net: Fix oops from tcp_collapse() when using splice() tcp_read_sock() can have a eat skbs without immediately advancing copied_seq. This can cause a panic in tcp_collapse() if it is called as a result of the recv_actor dropping the socket lock. A userspace program that splices data from a socket to either another socket or to a file can trigger this bug. Signed-off-by: Steven J. Magnani <steve@digidescorp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-30 13:56:01 -07:00
Johannes Berg	7236fe29fd	mac80211: move netdev queue enabling to correct spot "mac80211: fix skb buffering issue" still left a race between enabling the hardware queues and the virtual interface queues. In hindsight it's totally obvious that enabling the netdev queues for a hardware queue when the hardware queue is enabled is wrong, because it could well possible that we can fill the hw queue with packets we already have pending. Thus, we must only enable the netdev queues once all the pending packets have been processed and sent off to the device. In testing, I haven't been able to trigger this race condition, but it's clearly there, possibly only when aggregation is being enabled. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Cc: stable@kernel.org Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-03-30 15:37:28 -04:00
Porsch, Marco	533866b12c	mac80211: fix PREQ processing and one small bug 1st) a PREQ should only be processed, if it has the same SN and better metric (instead of better or equal). 2nd) next_hop[ETH_ALEN] now actually used to buffer mpath->next_hop->sta.addr for use out of lock. Signed-off-by: Marco Porsch <marco.porsch@siemens.com> Acked-by: Javier Cardona <javier@cozybit.com> Cc: stable@kernel.org Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-03-30 15:37:23 -04:00
John W. Linville	c7a00dc73b	mac80211: correct typos in "unavailable upon resume" warning Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-03-30 15:37:22 -04:00
John W. Linville	368d06f5b0	wireless: convert reg_regdb_search_lock to mutex Stanse discovered that kmalloc is being called with GFP_KERNEL while holding this spinlock. The spinlock can be a mutex instead, which also enables the removal of the unlock/lock around the lock/unlock of cfg80211_mutex and the call to set_regdom. Reported-by: Jiri Slaby <jirislaby@gmail.com> Cc: stable@kernel.org Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-03-30 15:37:20 -04:00
Tejun Heo	5a0e3ad6af	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>	2010-03-30 22:02:32 +09:00
Linus Torvalds	6631424fd2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (33 commits) r8169: offical fix for CVE-2009-4537 (overlength frame DMAs) ipv6: Don't drop cache route entry unless timer actually expired. tulip: Add missing parens. r8169: fix broken register writes pcnet_cs: add new id bonding: fix broken multicast with round-robin mode drivers/net: Fix continuation lines e1000: do not modify tx_queue_len on link speed change net: ipmr/ip6mr: prevent out-of-bounds vif_table access ixgbe: Do not run all Diagnostic offline tests when VFs are active igb: use correct bits to identify if managability is enabled benet: Fix compile warnnings in drivers/net/benet/be_ethtool.c net: Add MSG_WAITFORONE flag to recvmmsg e1000e: do not modify tx_queue_len on link speed change igbvf: do not modify tx_queue_len on link speed change ipv4: Restart rt_intern_hash after emergency rebuild (v2) ipv4: Cleanup struct net dereference in rt_intern_hash net: fix netlink address dumping in IPv4/IPv6 tulip: Fix null dereference in uli526x_rx_packet() gianfar: fix undo of reserve() ...	2010-03-29 14:41:18 -07:00
David S. Miller	7905e357eb	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6	2010-03-29 13:50:10 -07:00
Stephen Rothwell	30bde1f507	rps: fix net-sysfs build for !CONFIG_RPS Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-29 01:00:44 -07:00
Eric Dumazet	10f744d205	net: __netif_receive_skb should be static Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-28 23:07:20 -07:00
YOSHIFUJI Hideaki / 吉藤英明	54c1a859ef	ipv6: Don't drop cache route entry unless timer actually expired. This is ipv6 variant of the commit 5e016cbf6.. ("ipv4: Don't drop redirected route cache entry unless PTMU actually expired") by Guenter Roeck <guenter.roeck@ericsson.com>. Remove cache route entry in ipv6_negative_advice() only if the timer is expired. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-28 19:34:26 -07:00
Jan Engelhardt	adcfe1964e	net: increase preallocated size of nlmsg to accomodate for IFLA_STATS64 When more data is stuffed into an nlmsg than initially projected, an extra allocation needs to be done. Reserve enough for IFLA_STATS64 so that this does not to needlessy happen. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-27 17:15:29 -07:00
Jan Engelhardt	14a4b42bd6	net: fix unaligned access in IFLA_STATS64 Tony Luck observes that the original IFLA_STATS64 submission causes unaligned accesses. This is because nla_data() returns a pointer to a memory region that is only aligned to 32 bits. Do some memcpying to workaround this. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-27 16:35:50 -07:00
Nicolas Dichtel	7438189baa	net: ipmr/ip6mr: prevent out-of-bounds vif_table access When cache is unresolved, c->mf[6]c_parent is set to 65535 and minvif, maxvif are not initialized, hence we must avoid to parse IIF and OIF. A second problem can happen when the user dumps a cache entry where a VIF, that was referenced at creation time, has been removed. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-27 08:33:21 -07:00
Brandon L Black	71c5c1595c	net: Add MSG_WAITFORONE flag to recvmmsg Add new flag MSG_WAITFORONE for the recvmmsg() syscall. When this flag is specified for a blocking socket, recvmmsg() will only block until at least 1 packet is available. The default behavior is to block until all vlen packets are available. This flag has no effect on non-blocking sockets or when used in combination with MSG_DONTWAIT. Signed-off-by: Brandon L Black <blblack@gmail.com> Acked-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-27 08:29:01 -07:00
Pavel Emelyanov	6a2bad70d5	ipv4: Restart rt_intern_hash after emergency rebuild (v2) The the rebuild changes the genid which in turn is used at the hash calculation. Thus if we don't restart and go on with inserting the rt will happen in wrong chain. (Fixed Neil's comment about the index passed into the rt_intern_hash) Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Reviewed-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-26 20:57:35 -07:00
Pavel Emelyanov	b35ecb5d40	ipv4: Cleanup struct net dereference in rt_intern_hash There's no need in getting it 3 times and gcc isn't smart enough to understand this himself. This is just a cleanup before the fix (next patch). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-26 20:57:35 -07:00
Patrick McHardy	4b97efdf39	net: fix netlink address dumping in IPv4/IPv6 When a dump is interrupted at the last device in a hash chain and then continued, "idx" won't get incremented past s_idx, so s_ip_idx is not reset when moving on to the next device. This means of all following devices only the last n - s_ip_idx addresses are dumped. Tested-by: Pawel Staszewski <pstaszewski@itcare.pl> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-03-26 20:27:49 -07:00
Tom Goff	66aa4a55fe	netlink: use the appropriate namespace pid This was included in OpenVZ kernels but wasn't integrated upstream. >From git://git.openvz.org/pub/linux-2.6.24-openvz: commit 5c69402f18adf7276352e051ece2cf31feefab02 Author: Alexey Dobriyan <adobriyan@openvz.org> Date: Mon Dec 24 14:37:45 2007 +0300 netlink: fixup ->tgid to work in multiple PID namespaces Signed-off-by: Tom Goff <thomas.goff@boeing.com> Acked-by: Alexey Dobriyan <adobriyan@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-26 20:13:58 -07:00
David S. Miller	b79d1d54cf	ipv6: Fix result generation in ipv6_get_ifaddr(). Finishing naturally from hlist_for_each_entry(x, ...) does not result in 'x' being NULL. Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-25 21:39:21 -07:00
David S. Miller	b54c9b98bb	ipv6: Preserve pervious behavior in ipv6_link_dev_addr(). Use list_add_tail() to get the behavior we had before the list_head conversion for ipv6 address lists. Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-25 21:25:30 -07:00
Linus Torvalds	126a031e43	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (25 commits) TIPC: Removed inactive maintainer isdn: Cleanup Sections in PCMCIA driver elsa isdn: Cleanup Sections in PCMCIA driver avma1 isdn: Cleanup Sections in PCMCIA driver teles isdn: Cleanup Sections in PCMCIA driver sedlbauer via-velocity: Fix FLOW_CNTL_TX_RX handling in set_mii_flow_control() netfilter: xt_hashlimit: IPV6 bugfix netfilter: ip6table_raw: fix table priority netfilter: xt_hashlimit: dl_seq_stop() fix af_key: return error if pfkey_xfrm_policy2msg_prep() fails skbuff: remove unused dma_head & dma_maps fields vlan: updates vlan real_num_tx_queues vlan: adds vlan_dev_select_queue igb: only use vlan_gro_receive if vlans are registered igb: do not modify tx_queue_len on link speed change igb: count Rx FIFO errors correctly bnx2: Use proper handler during netpoll. bnx2: Fix netpoll crash. ksz884x: fix return value of netdev_set_eeprom cgroups: net_cls as module ...	2010-03-25 14:06:29 -07:00
Eric Dumazet	df3345457a	rps: add CONFIG_RPS RPS currently depends on SMP and SYSFS Adding a CONFIG_RPS makes sense in case this requirement changes in the future. This patch saves about 1500 bytes of kernel text in case SMP is on but SYSFS is off. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-25 12:07:00 -07:00
David S. Miller	80bb3a00fa	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6	2010-03-25 11:48:58 -07:00
Eric Dumazet	8f59922914	netfilter: xt_hashlimit: IPV6 bugfix A missing break statement in hashlimit_ipv6_mask(), and masks between /64 and /95 are not working at all... Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-03-25 17:25:11 +01:00
Jozsef Kadlecsik	9c13886665	netfilter: ip6table_raw: fix table priority The order of the IPv6 raw table is currently reversed, that makes impossible to use the NOTRACK target in IPv6: for example if someone enters ip6tables -t raw -A PREROUTING -p tcp --dport 80 -j NOTRACK and if we receive fragmented packets then the first fragment will be untracked and thus skip nf_ct_frag6_gather (and conntrack), while all subsequent fragments enter nf_ct_frag6_gather and reassembly will never successfully be finished. Singed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-03-25 11:17:26 +01:00
Eric Dumazet	55e0d7cf27	netfilter: xt_hashlimit: dl_seq_stop() fix If dl_seq_start() memory allocation fails, we crash later in dl_seq_stop(), trying to kfree(ERR_PTR(-ENOMEM)) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-03-25 11:00:22 +01:00
Linus Torvalds	6c75969e22	Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: NFS: don't try to decode GETATTR if DELEGRETURN returned error sunrpc: handle allocation errors from __rpc_lookup_create() SUNRPC: Fix the return value of rpc_run_bc_task() SUNRPC: Fix a use after free bug with the NFSv4.1 backchannel SUNRPC: Fix a potential memory leak in auth_gss NFS: Prevent another deadlock in nfs_release_page()	2010-03-24 16:50:46 -07:00
Frans Pop	a570f095ea	tipc: remove trailing space in messages Signed-off-by: Frans Pop <elendil@planet.nl> Cc: Per Liden <per.liden@ericsson.com> Cc: Jon Maloy <jon.maloy@ericsson.com> Cc: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-24 14:01:54 -07:00
Frans Pop	b138338056	net: remove trailing space in messages Signed-off-by: Frans Pop <elendil@planet.nl> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-24 14:01:54 -07:00
Dan Carpenter	18062ca947	rds: cleanup: remove unneeded variable We never use "sk" so this patch removes it. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-24 13:34:09 -07:00
Dan Carpenter	a424077a0a	wimax: remove unneeded variable We never actually use "dev" so I removed it. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-24 13:34:09 -07:00
Dan Carpenter	a3dcce97b2	llc: cleanup: remove dead code from llc_init() We don't need "dev" any more after: `a5a04819c5` [LLC]: station source mac address Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-24 13:34:08 -07:00
Dan Carpenter	9a127aad4d	af_key: return error if pfkey_xfrm_policy2msg_prep() fails The original code saved the error value but just returned 0 in the end. Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Jamal Hadi Salim <hadi@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-24 13:28:27 -07:00
Dan Carpenter	14b44974d5	mac80211: remove unneed variable from ieee80211_tx_pending() We don't need "sdata" any more after: `d84f323477` mac80211: remove dev_hold/put calls Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-03-24 16:04:33 -04:00
Juuso Oikarinen	a97c13c345	mac80211: Add support for connection quality monitoring Add support for the set_cqm_config op. This op function configures the requested connection quality monitor rssi threshold and rssi hysteresis values to the hardware if the hardware supports IEEE80211_HW_SUPPORTS_CQM. For unsupported hardware, currently -EOPNOTSUPP is returned, so the mac80211 is currently not doing connection quality monitoring on the host. This could be added later, if needed. Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-03-24 16:04:33 -04:00
Juuso Oikarinen	d6dc1a3863	cfg80211: Add connection quality monitoring support to nl80211 Add support for basic configuration of a connection quality monitoring to the nl80211 interface, and basic support for notifying about triggered monitoring events. Via this interface a user-space connection manager may configure and receive pre-warning events of deteriorating WLAN connection quality, and start preparing for roaming in advance, before the connection is already lost. An example usage of such a trigger is starting scanning for nearby AP's in an attempt to find one with better connection quality, and associate to it before the connection characteristics of the existing connection become too bad or the association is even lost, leading in a prolonged delay in connectivity. The interface currently supports only RSSI, but it could be later extended to include other parameters, such as signal-to-noise ratio, if need for that arises. Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-03-24 16:02:37 -04:00
Vasu Dev	f6b9f4b263	vlan: updates vlan real_num_tx_queues Updates real_num_tx_queues in case underlying real device has changed real_num_tx_queues. -v2 As per Eric Dumazet<eric.dumazet@gmail.com> comment:- -- adds BUG_ON to catch case of real_num_tx_queues exceeding num_tx_queues. -- created this self contained patch to just update real_num_tx_queues. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-24 11:11:38 -07:00
Vasu Dev	669d3e0bab	vlan: adds vlan_dev_select_queue This is required to correctly select vlan tx queue for a driver supporting multi tx queue with ndo_select_queue implemented since currently selected vlan tx queue is unaligned to selected queue by real net_devce ndo_select_queue. Unaligned vlan tx queue selection causes thrash with higher vlan tx lock contention for least fcoe traffic and wrong socket tx queue_mapping for ixgbe having ndo_select_queue implemented. -v2 As per Eric Dumazet<eric.dumazet@gmail.com> comments, mirrored vlan net_device_ops to have them with and without vlan_dev_select_queue and then select according to real dev ndo_select_queue present or not for a vlan net_device. This is to completely skip vlan_dev_select_queue calling for real net_device not supporting ndo_select_queue. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-24 11:11:38 -07:00
Tom Herbert	e51d739ab7	net: Fix locking in flush_backlog Need to take spinlocks when dequeuing from input_pkt_queue in flush_backlog. Also, flush_backlog can now be called directly from netdev_run_todo. Signed-off-by: Tom Herbert <therbert@google.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-23 23:17:18 -07:00
Juuso Oikarinen	1e4dcd0124	mac80211: Add support for connection monitor in hardware This patch is based on a RFC patch by Kalle Valo. The wl1271 has a feature which handles the connection monitor logic in hardware, basically sending periodically nullfunc frames and reporting to the host if AP is lost, after attempting to recover by sending probe-requests to the AP. Add support to mac80211 by adding a new flag IEEE80211_HW_CONNECTION_MONITOR which prevents conn_mon_timer from triggering during idle periods, and prevents sending probe-requests to the AP if beacon-loss is indicated by the hardware. Cc: Kalle Valo <kalle.valo@nokia.com> Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-03-23 16:51:42 -04:00
Joe Perches	76326f1d4c	net/wireless/wext-core.c: Use IW_EVENT_IDX macro There's a wireless.h macro for this, might as well use it. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-03-23 16:50:27 -04:00
Joe Perches	44608f8012	net/wireless/wext_core.c: Use IW_IOCTL_IDX macro There's a wireless.h macro for this, might as well use it. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-03-23 16:50:27 -04:00
Ben Blum	8e039d84b3	cgroups: net_cls as module Allows the net_cls cgroup subsystem to be compiled as a module This patch modifies net/sched/cls_cgroup.c to allow the net_cls subsystem to be optionally compiled as a module instead of builtin. The cgroup_subsys struct is moved around a bit to allow the subsys_id to be either declared as a compile-time constant by the cgroup_subsys.h include in cgroup.h, or, if it's a module, initialized within the struct by cgroup_load_subsys. Signed-off-by: Ben Blum <bblum@andrew.cmu.edu> Acked-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Paul Menage <menage@google.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-23 13:06:14 -07:00
Tom Goff	7316ae88c4	net_sched: make traffic control network namespace aware Mostly minor changes to add a net argument to various functions and remove initial network namespace checks. Make /proc/net/psched per network namespace. Signed-off-by: Tom Goff <thomas.goff@boeing.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-22 20:26:25 -07:00
Amerigo Wang	5fc05f8764	netpoll: warn when there are spaces in parameters v2: update according to Frans' comments. Currently, if we leave spaces before dst port, netconsole will silently accept it as 0. Warn about this. Also, when spaces appear in other places, make them visible in error messages. Signed-off-by: WANG Cong <amwang@redhat.com> Cc: David Miller <davem@davemloft.net> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-22 20:05:45 -07:00
David S. Miller	33e2bf6aa1	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 Conflicts: Documentation/feature-removal-schedule.txt drivers/net/wireless/ath/ath5k/phy.c	2010-03-22 18:15:15 -07:00
Tom Herbert	e880eb6c5c	rps: Fix build with CONFIG_SYSFS enabled Fix build with CONFIG_SYSFS not enabled. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-22 18:06:47 -07:00
Patrick McHardy	ef1691504c	netfilter: xt_recent: fix regression in rules using a zero hit_count Commit `8ccb92ad` (netfilter: xt_recent: fix false match) fixed supposedly false matches in rules using a zero hit_count. As it turns out there is nothing false about these matches and people are actually using entries with a hit_count of zero to make rules dependant on addresses inserted manually through /proc. Since this slipped past the eyes of three reviewers, instead of reverting the commit in question, this patch explicitly checks for a hit_count of zero to make the intentions more clear. Reported-by: Thomas Jarosch <thomas.jarosch@intra2net.com> Tested-by: Thomas Jarosch <thomas.jarosch@intra2net.com> Cc: stable@kernel.org Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-03-22 18:25:20 +01:00
Linus Torvalds	258152acc0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (38 commits) ip_gre: include route header_len in max_headroom calculation if_tunnel.h: add missing ams/byteorder.h include ipv4: Don't drop redirected route cache entry unless PTMU actually expired net: suppress lockdep-RCU false positive in FIB trie. Bluetooth: Fix kernel crash on L2CAP stress tests Bluetooth: Convert debug files to actually use debugfs instead of sysfs Bluetooth: Fix potential bad memory access with sysfs files netfilter: ctnetlink: fix reliable event delivery if message building fails netlink: fix NETLINK_RECV_NO_ENOBUFS in netlink_set_err() NET_DMA: free skbs periodically netlink: fix unaligned access in nla_get_be64() tcp: Fix tcp_mark_head_lost() with packets == 0 net: ipmr/ip6mr: fix potential out-of-bounds vif_table access KS8695: update ksp->next_rx_desc_read at the end of rx loop igb: Add support for 82576 ET2 Quad Port Server Adapter ixgbevf: Message formatting cleanups ixgbevf: Shorten up delay timer for watchdog task ixgbevf: Fix VF Stats accounting after reset ixgbe: Set IXGBE_RSC_CB(skb)->DMA field to zero after unmapping the address ixgbe: fix for real_num_tx_queues update issue ...	2010-03-22 10:01:58 -07:00
Tetsuo Handa	c3824d21eb	rxrpc: Check allocation failure. alloc_skb() can return NULL. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-03-22 09:57:19 -07:00
Dan Carpenter	f1f0abe192	sunrpc: handle allocation errors from __rpc_lookup_create() __rpc_lookup_create() can return ERR_PTR(-ENOMEM). Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org	2010-03-22 05:34:13 -04:00
Trond Myklebust	ff0901f803	SUNRPC: Fix the return value of rpc_run_bc_task() Currently rpc_run_bc_task() will return NULL if the task allocation failed. However the only caller is bc_send, which assumes that the return value will be an ERR_PTR. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-03-22 05:34:12 -04:00
Trond Myklebust	c9acb42ef1	SUNRPC: Fix a use after free bug with the NFSv4.1 backchannel The ->release_request() callback was designed to allow the transport layer to do housekeeping after the RPC call is done. It cannot be used to free the request itself, and doing so leads to a use-after-free bug in xprt_release(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-03-22 05:32:44 -04:00
Timo Teräs	243aad830e	ip_gre: include route header_len in max_headroom calculation Taking route's header_len into account, and updating gre device needed_headroom will give better hints on upper bound of required headroom. This is useful if the gre traffic is xfrm'ed. Signed-off-by: Timo Teras <timo.teras@iki.fi> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-21 21:23:28 -07:00
Dan Carpenter	7668448ea9	bridge: cleanup: remove unused assignment We never actually use iph again so this assignment can be removed. Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-21 21:21:58 -07:00
Guenter Roeck	5e016cbf6c	ipv4: Don't drop redirected route cache entry unless PTMU actually expired TCP sessions over IPv4 can get stuck if routers between endpoints do not fragment packets but implement PMTU instead, and we are using those routers because of an ICMP redirect. Setup is as follows MTU1 MTU2 MTU1 A--------B------C------D with MTU1 > MTU2. A and D are endpoints, B and C are routers. B and C implement PMTU and drop packets larger than MTU2 (for example because DF is set on all packets). TCP sessions are initiated between A and D. There is packet loss between A and D, causing frequent TCP retransmits. After the number of retransmits on a TCP session reaches tcp_retries1, tcp calls dst_negative_advice() prior to each retransmit. This results in route cache entries for the peer to be deleted in ipv4_negative_advice() if the Path MTU is set. If the outstanding data on an affected TCP session is larger than MTU2, packets sent from the endpoints will be dropped by B or C, and ICMP NEEDFRAG will be returned. A and D receive NEEDFRAG messages and update PMTU. Before the next retransmit, tcp will again call dst_negative_advice(), causing the route cache entry (with correct PMTU) to be deleted. The retransmitted packet will be larger than MTU2, causing it to be dropped again. This sequence repeats until the TCP session aborts or is terminated. Problem is fixed by removing redirected route cache entries in ipv4_negative_advice() only if the PMTU is expired. Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-21 20:55:13 -07:00
Robert Olsson	e99b99b471	pktgen node allocation Here is patch to manipulate packet node allocation and implicitly how packets are DMA'd etc. The flag NODE_ALLOC enables the function and numa_node_id(); when enabled it can also be explicitly controlled via a new node parameter Tested this with 10 Intel 82599 ports w. TYAN S7025 E5520 CPU's. Was able to TX/DMA ~80 Gbit/s to Ethernet wires. Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-21 20:33:36 -07:00
Eric Dumazet	99fe3c391d	net: dev_getfirstbyhwtype() optimization Use RCU to avoid RTNL use in dev_getfirstbyhwtype() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-21 20:33:36 -07:00
Eric Dumazet	ec733b15a3	net: snmp mib cleanup There is no point to align or pad mibs to cache lines, they are per cpu allocated with a 8 bytes alignment anyway. This wastes space for no gain. This patch removes __SNMP_MIB_ALIGN__ Since SNMP mibs contain "unsigned long" fields only, we can relax the allocation alignment from "unsigned long long" to "unsigned long" Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-21 18:34:16 -07:00
Eric Dumazet	62c97ac04a	atm: Use kasprintf Use kasprintf in atm_proc_dev_register() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-21 18:34:15 -07:00
Eric Dumazet	283f2fe87e	net: speedup netdev_set_master() We currently force a synchronize_net() in netdev_set_master() This seems necessary only when a slave had a master and we dismantle it. In the other case ("ifenslave bond0 ethO"), we dont need this long delay. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-21 18:34:15 -07:00
Eric Dumazet	907cdda520	tcp: Add SNMP counter for DEFER_ACCEPT Its currently hard to diagnose when ACK frames are dropped because an application set TCP_DEFER_ACCEPT on its listening socket. See http://bugzilla.kernel.org/show_bug.cgi?id=15507 This patch adds a SNMP value, named TCPDeferAcceptDrop netstat -s \| grep TCPDeferAcceptDrop TCPDeferAcceptDrop: 0 This counter is incremented every time we drop a pure ACK frame received by a socket in SYN_RECV state because its SYNACK retrans count is lower than defer_accept value. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-21 18:31:35 -07:00
Jiri Pirko	32a806c194	bonding: flush unicast and multicast lists when changing type After the type change, addresses in unicast and multicast lists wouldn't make sense, not to mention possible different lenghts. So flush both lists here. Note "dev_addr_discard" will be very soon replaced by "dev_mc_flush" (once mc_list conversion will be done). Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-21 18:31:34 -07:00
Patrick McHardy	755d0e77ac	net: rtnetlink: ignore NETDEV_PRE_TYPE_CHANGE in rtnetlink_event() Ignore the new NETDEV_PRE_TYPE_CHANGE event in rtnetlink_event() since there have been no changes userspace needs to be notified of. Also add a comment to the netdev notifier event definitions to remind people to update the exclusion list when adding new event types. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-21 18:31:34 -07:00
David S. Miller	e3a61d47cc	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6	2010-03-21 18:03:11 -07:00
Paul E. McKenney	634a4b2038	net: suppress lockdep-RCU false positive in FIB trie. Allow fib_find_node() to be called either under rcu_read_lock() protection or with RTNL held. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-21 18:01:05 -07:00
Trond Myklebust	cdead7cf12	SUNRPC: Fix a potential memory leak in auth_gss The function alloc_enc_pages() currently fails to release the pointer rqstp->rq_enc_pages in the error path. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: stable@kernel.org	2010-03-21 12:22:49 -04:00
Andrei Emeltchenko	c2c77ec83b	Bluetooth: Fix kernel crash on L2CAP stress tests Added very simple check that req buffer has enough space to fit configuration parameters. Shall be enough to reject packets with configuration size more than req buffer. Crash trace below [ 6069.659393] Unable to handle kernel paging request at virtual address 02000205 [ 6069.673034] Internal error: Oops: 805 [#1] PREEMPT ... [ 6069.727172] PC is at l2cap_add_conf_opt+0x70/0xf0 [l2cap] [ 6069.732604] LR is at l2cap_recv_frame+0x1350/0x2e78 [l2cap] ... [ 6070.030303] Backtrace: [ 6070.032806] [<bf1c2880>] (l2cap_add_conf_opt+0x0/0xf0 [l2cap]) from [<bf1c6624>] (l2cap_recv_frame+0x1350/0x2e78 [l2cap]) [ 6070.043823] r8:dc5d3100 r7:df2a91d6 r6:00000001 r5:df2a8000 r4:00000200 [ 6070.050659] [<bf1c52d4>] (l2cap_recv_frame+0x0/0x2e78 [l2cap]) from [<bf1c8408>] (l2cap_recv_acldata+0x2bc/0x350 [l2cap]) [ 6070.061798] [<bf1c814c>] (l2cap_recv_acldata+0x0/0x350 [l2cap]) from [<bf0037a4>] (hci_rx_task+0x244/0x478 [bluetooth]) [ 6070.072631] r6:dc647700 r5:00000001 r4:df2ab740 [ 6070.077362] [<bf003560>] (hci_rx_task+0x0/0x478 [bluetooth]) from [<c006b9fc>] (tasklet_action+0x78/0xd8) [ 6070.087005] [<c006b984>] (tasklet_action+0x0/0xd8) from [<c006c160>] Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@nokia.com> Acked-by: Gustavo F. Padovan <gustavo@padovan.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2010-03-21 05:49:36 +01:00
Marcel Holtmann	aef7d97cc6	Bluetooth: Convert debug files to actually use debugfs instead of sysfs Some of the debug files ended up wrongly in sysfs, because at that point of time, debugfs didn't exist. Convert these files to use debugfs and also seq_file. This patch converts all of these files at once and then removes the exported symbol for the Bluetooth sysfs class. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2010-03-21 05:49:35 +01:00
Marcel Holtmann	101545f6fe	Bluetooth: Fix potential bad memory access with sysfs files When creating a high number of Bluetooth sockets (L2CAP, SCO and RFCOMM) it is possible to scribble repeatedly on arbitrary pages of memory. Ensure that the content of these sysfs files is always less than one page. Even if this means truncating. The files in question are scheduled to be moved over to debugfs in the future anyway. Based on initial patches from Neil Brown and Linus Torvalds Reported-by: Neil Brown <neilb@suse.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2010-03-21 05:49:32 +01:00
David S. Miller	3e81c6da39	ipv6: Fix bug in ipv6_chk_same_addr(). hlist_for_each_entry(p...) will not necessarily initialize 'p' to anything if the hlist is empty. GCC notices this and emits a warning. Just return true explicitly when we hit a match, and return false is we fall out of the loop without one. Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-20 16:18:00 -07:00
YOSHIFUJI Hideaki	b2db756449	ipv6: Reduce timer events for addrconf_verify(). This patch reduces timer events while keeping accuracy by rounding our timer and/or batching several address validations in addrconf_verify(). addrconf_verify() is called at earliest timeout among interface addresses' timeouts, but at maximum ADDR_CHECK_FREQUENCY (120 secs). In most cases, all of timeouts of interface addresses are long enough (e.g. several hours or days vs 2 minutes), this timer is usually called every ADDR_CHECK_FREQUENCY, and it is okay to be lazy. (Note this timer could be eliminated if all code paths which modifies variables related to timeouts call us manually, but it is another story.) However, in other least but important cases, we try keeping accuracy. When the real interface address timeout is coming, and the timeout is just before the rounded timeout, we accept some error. When a timeout has been reached, we also try batching other several events in very near future. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-20 16:11:12 -07:00
stephen hemminger	88949cf484	IPv6: addrconf cleanup addrconf_verify The variable regen_advance is only used in the privacy case. Move it to simplify code and eliminate ifdef's Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-20 16:09:11 -07:00
Stephen Hemminger	e21e8467d3	addrconf: checkpatch fixes Fix some of the checkpatch complaints. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-20 16:09:01 -07:00
Stephen Hemminger	bcdd553fd3	IPv6: addrconf cleanups Some minor stuff, reformat comments and add whitespace for clarity Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-20 16:08:18 -07:00
stephen hemminger	502a2ffd73	ipv6: convert idev_list to list macros Convert to list macro's for the list of addresses per interface in IPv6. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-20 15:45:09 -07:00
stephen hemminger	3a88a81d89	ipv6: user better hash for addrconf The existing hash function has a couple of issues: * it is hardwired to 16 for IN6_ADDR_HSIZE * limited to 256 and callers using int * use jhash2 rather than some old BSD algorithm No need for random seed since this is local only (based on assigned addresses) table. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-20 15:44:35 -07:00
stephen hemminger	5c578aedcb	IPv6: convert addrconf hash list to RCU Convert from reader/writer lock to RCU and spinlock for addrconf hash list. Adds an additional helper macro for hlist_for_each_entry_continue_rcu to handle the continue case. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-20 15:44:35 -07:00
stephen hemminger	c2e21293c0	ipv6: convert addrconf list to hlist Using hash list macros, simplifies code and helps later RCU. This patch includes some initialization that is not strictly necessary, since an empty hlist node/list is all zero; and list is in BSS and node is allocated with kzalloc. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-20 15:44:34 -07:00
stephen hemminger	372e6c8f1f	ipv6: convert temporary address list to list macros Use list macros instead of open coded linked list. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-20 15:44:34 -07:00
David S. Miller	e77c8e83dd	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	2010-03-20 15:24:29 -07:00
Pablo Neira Ayuso	37b7ef7203	netfilter: ctnetlink: fix reliable event delivery if message building fails This patch fixes a bug that allows to lose events when reliable event delivery mode is used, ie. if NETLINK_BROADCAST_SEND_ERROR and NETLINK_RECV_NO_ENOBUFS socket options are set. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-20 14:29:03 -07:00
Pablo Neira Ayuso	1a50307ba1	netlink: fix NETLINK_RECV_NO_ENOBUFS in netlink_set_err() Currently, ENOBUFS errors are reported to the socket via netlink_set_err() even if NETLINK_RECV_NO_ENOBUFS is set. However, that should not happen. This fixes this problem and it changes the prototype of netlink_set_err() to return the number of sockets that have set the NETLINK_RECV_NO_ENOBUFS socket option. This return value is used in the next patch in these bugfix series. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-20 14:29:03 -07:00
Steven J. Magnani	73852e8151	NET_DMA: free skbs periodically Under NET_DMA, data transfer can grind to a halt when userland issues a large read on a socket with a high RCVLOWAT (i.e., 512 KB for both). This appears to be because the NET_DMA design queues up lots of memcpy operations, but doesn't issue or wait for them (and thus free the associated skbs) until it is time for tcp_recvmesg() to return. The socket hangs when its TCP window goes to zero before enough data is available to satisfy the read. Periodically issue asynchronous memcpy operations, and free skbs for ones that have completed, to prevent sockets from going into zero-window mode. Signed-off-by: Steven J. Magnani <steve@digidescorp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-20 14:29:02 -07:00
Lennart Schulte	6830c25b7d	tcp: Fix tcp_mark_head_lost() with packets == 0 A packet is marked as lost in case packets == 0, although nothing should be done. This results in a too early retransmitted packet during recovery in some cases. This small patch fixes this issue by returning immediately. Signed-off-by: Lennart Schulte <lennart.schulte@nets.rwth-aachen.de> Signed-off-by: Arnd Hannemann <hannemann@nets.rwth-aachen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-19 22:47:22 -07:00
Patrick McHardy	a50436f2cd	net: ipmr/ip6mr: fix potential out-of-bounds vif_table access mfc_parent of cache entries is used to index into the vif_table and is initialised from mfcctl->mfcc_parent. This can take values of to 2^16-1, while the vif_table has only MAXVIFS (32) entries. The same problem affects ip6mr. Refuse invalid values to fix a potential out-of-bounds access. Unlike the other validity checks, this is checked in ipmr_mfc_add() instead of the setsockopt handler since its unused in the delete path and might be uninitialized. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-19 22:47:22 -07:00
stephen hemminger	97e3ecd112	TCP: check min TTL on received ICMP packets This adds RFC5082 checks for TTL on received ICMP packets. It adds some security against spoofed ICMP packets disrupting GTSM protected sessions. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-19 21:00:42 -07:00
Herbert Xu	10414444cb	ipv6: Remove redundant dst NULL check in ip6_dst_check As the only path leading to ip6_dst_check makes an indirect call through dst->ops, dst cannot be NULL in ip6_dst_check. This patch removes this check in case it misleads people who come across this code. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-19 21:00:42 -07:00
Timo Teräs	d11a4dc18b	ipv4: check rt_genid in dst_check Xfrm_dst keeps a reference to ipv4 rtable entries on each cached bundle. The only way to renew xfrm_dst when the underlying route has changed, is to implement dst_check for this. This is what ipv6 side does too. The problems started after `87c1e12b5e` ("ipsec: Fix bogus bundle flowi") which fixed a bug causing xfrm_dst to not get reused, until that all lookups always generated new xfrm_dst with new route reference and path mtu worked. But after the fix, the old routes started to get reused even after they were expired causing pmtu to break (well it would occationally work if the rtable gc had run recently and marked the route obsolete causing dst_check to get called). Signed-off-by: Timo Teras <timo.teras@iki.fi> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-19 21:00:41 -07:00
florian@mickler.org	819bfecc4f	rename new rfkill sysfs knobs This patch renames the (never officially released) sysfs-knobs "blocked_hw" and "blocked_sw" to "hard" and "soft", as the hardware vs software conotation is misleading. It also gets rid of not needed locks around u32-read-access. Signed-off-by: Florian Mickler <florian@mickler.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-03-19 15:48:25 -04:00
Eric Dumazet	0641e4fbf2	net: Potential null skb->dev dereference When doing "ifenslave -d bond0 eth0", there is chance to get NULL dereference in netif_receive_skb(), because dev->master suddenly becomes NULL after we tested it. We should use ACCESS_ONCE() to avoid this (or rcu_dereference()) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-18 21:16:45 -07:00
Alexandra Kossovsky	b634f87522	tcp: Fix OOB POLLIN avoidance. From: Alexandra.Kossovsky@oktetlabs.ru Fixes kernel bugzilla #15541 Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-18 20:29:24 -07:00
Jiri Pirko	1c01fe14a8	net: forbid underlaying devices to change its type It's not desired for underlaying devices to change type. At the time, there is for example possible to have bond with changed type from Ethernet to Infiniband as a port of a bridge. This patch fixes this. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-18 20:00:02 -07:00
Jiri Pirko	3ca5b4042e	bonding: check return value of nofitier when changing type This patch adds the possibility to refuse the bonding type change for other subsystems (such as for example bridge, vlan, etc.) Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-18 20:00:02 -07:00
Jiri Pirko	93d9b7d7a8	net: rename notifier defines for netdev type change Since generally there could be more netdevices changing type other than bonding, making this event type name "bonding-unrelated" Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-18 20:00:01 -07:00
Tom Herbert	1e94d72fea	rps: Fixed build with CONFIG_SMP not enabled. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-18 17:45:44 -07:00
Linus Torvalds	7c34691abe	Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: NFS: ensure bdi_unregister is called on mount failure. NFS: Avoid a deadlock in nfs_release_page NFSv4: Don't ignore the NFS_INO_REVAL_FORCED flag in nfs_revalidate_inode() nfs4: Make the v4 callback service hidden nfs: fix unlikely memory leak rpc client can not deal with ENOSOCK, so translate it into ENOCONN	2010-03-18 16:50:09 -07:00
Jiri Pirko	ff6e2163f2	net: convert multiple drivers to use netdev_for_each_mc_addr, part7 In mlx4, using char * to store mc address in private structure instead. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 21:23:25 -07:00
Neil Horman	ca50910185	tipc: Allow retransmission of cloned buffers Forward port commit fc477e160af086f6e30c3d4fdf5f5c000d29beb5 from git://tipc.cslab.ericsson.net/pub/git/people/allan/tipc.git Origional commit message: Allow retransmission of cloned buffers This patch fixes an issue with TIPC's message retransmission logic that prevented retransmission of clone sk_buffs. Originally intended as a means of avoiding wasted work in retransmitting messages that were still on the driver's outbound queue, it also prevented TIPC from retransmitting messages through other means -- such as the secondary bearer of the broadcast link, or another interface in a set of bonded interfaces. This fix removes existing checks for cloned sk_buffs that prevented such retransmission. Origionally-Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 21:23:24 -07:00
Neil Horman	1a624832a0	tipc: Increase frequency of load distribution over broadcast link Forward port commit 29eb572941501c40ac6e62dbc5043bf9ee76ee56 from git://tipc.cslab.ericsson.net/pub/git/people/allan/tipc.git Origional commit message: Increase frequency of load distribution over broadcast link This patch enhances the behavior of TIPC's broadcast link so that it alternates between redundant bearers (if available) after every message sent, rather than after every 10 messages. This change helps to speed up delivery of retransmitted messages by ensuring that they are not sent repeatedly over a bearer that is no longer working, but not yet recognized as failed. Tested by myself in the latest net-2.6 tree using the tipc sanity test suite Origionally-signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Neil Horman <nhorman@tuxdriver.com> bcast.c \| 35 ++++++++++++++--------------------- 1 file changed, 14 insertions(+), 21 deletions(-) Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 21:23:23 -07:00
Jan Engelhardt	10708f37ae	net: core: add IFLA_STATS64 support `ip -s link` shows interface counters truncated to 32 bit. This is because interface statistics are transported only in 32-bit quantity to userspace. This commit adds a new IFLA_STATS64 attribute that exports them in full 64 bit. References: http://lkml.indiana.edu/hypermail/linux/kernel/0307.3/0215.html Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 21:23:22 -07:00
Jan Engelhardt	6ce1a6df6e	net: tcp: make veno selectable as default congestion module Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 21:23:21 -07:00
Jan Engelhardt	dd2acaa7bc	net: tcp: make hybla selectable as default congestion module Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 21:23:20 -07:00
Eric Dumazet	2fb3573dfb	net: remove rcu locking from fib_rules_event() We hold RTNL at this point and dont use RCU variants of list traversals, we dont need rcu_read_lock()/rcu_read_unlock() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 21:23:19 -07:00
stephen hemminger	14bb478983	bridge: per-cpu packet statistics (v3) The shared packet statistics are a potential source of slow down on bridged traffic. Convert to per-cpu array, but only keep those statistics which change per-packet. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 21:23:19 -07:00
Tom Herbert	0a9627f264	rps: Receive Packet Steering This patch implements software receive side packet steering (RPS). RPS distributes the load of received packet processing across multiple CPUs. Problem statement: Protocol processing done in the NAPI context for received packets is serialized per device queue and becomes a bottleneck under high packet load. This substantially limits pps that can be achieved on a single queue NIC and provides no scaling with multiple cores. This solution queues packets early on in the receive path on the backlog queues of other CPUs. This allows protocol processing (e.g. IP and TCP) to be performed on packets in parallel. For each device (or each receive queue in a multi-queue device) a mask of CPUs is set to indicate the CPUs that can process packets. A CPU is selected on a per packet basis by hashing contents of the packet header (e.g. the TCP or UDP 4-tuple) and using the result to index into the CPU mask. The IPI mechanism is used to raise networking receive softirqs between CPUs. This effectively emulates in software what a multi-queue NIC can provide, but is generic requiring no device support. Many devices now provide a hash over the 4-tuple on a per packet basis (e.g. the Toeplitz hash). This patch allow drivers to set the HW reported hash in an skb field, and that value in turn is used to index into the RPS maps. Using the HW generated hash can avoid cache misses on the packet when steering it to a remote CPU. The CPU mask is set on a per device and per queue basis in the sysfs variable /sys/class/net/<device>/queues/rx-<n>/rps_cpus. This is a set of canonical bit maps for receive queues in the device (numbered by <n>). If a device does not support multi-queue, a single variable is used for the device (rx-0). Generally, we have found this technique increases pps capabilities of a single queue device with good CPU utilization. Optimal settings for the CPU mask seem to depend on architectures and cache hierarcy. Below are some results running 500 instances of netperf TCP_RR test with 1 byte req. and resp. Results show cumulative transaction rate and system CPU utilization. e1000e on 8 core Intel Without RPS: 108K tps at 33% CPU With RPS: 311K tps at 64% CPU forcedeth on 16 core AMD Without RPS: 156K tps at 15% CPU With RPS: 404K tps at 49% CPU bnx2x on 16 core AMD Without RPS 567K tps at 61% CPU (4 HW RX queues) Without RPS 738K tps at 96% CPU (8 HW RX queues) With RPS: 854K tps at 76% CPU (4 HW RX queues) Caveats: - The benefits of this patch are dependent on architecture and cache hierarchy. Tuning the masks to get best performance is probably necessary. - This patch adds overhead in the path for processing a single packet. In a lightly loaded server this overhead may eliminate the advantages of increased parallelism, and possibly cause some relative performance degradation. We have found that masks that are cache aware (share same caches with the interrupting CPU) mitigate much of this. - The RPS masks can be changed dynamically, however whenever the mask is changed this introduces the possibility of generating out of order packets. It's probably best not change the masks too frequently. Signed-off-by: Tom Herbert <therbert@google.com> include/linux/netdevice.h \| 32 ++++- include/linux/skbuff.h \| 3 + net/core/dev.c \| 335 +++++++++++++++++++++++++++++++++++++-------- net/core/net-sysfs.c \| 225 ++++++++++++++++++++++++++++++- net/core/skbuff.c \| 2 + 5 files changed, 538 insertions(+), 59 deletions(-) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 21:23:18 -07:00
Tina Yang	768bbedf9c	RDS: Enable per-cpu workqueue threads Create per-cpu workqueue threads instead of a single krdsd thread. This is a step towards better scalability. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 21:17:02 -07:00
Andy Grover	561c7df63e	RDS: Do not call set_page_dirty() with irqs off set_page_dirty() unconditionally re-enables interrupts, so if we call it with irqs off, they will be on after the call, and that's bad. This patch moves the call after we've re-enabled interrupts in send_drop_to(), so it's safe. Also, add BUG_ONs to let us know if we ever do call set_page_dirty with interrupts off. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 21:17:01 -07:00
Sherman Pun	450d06c020	RDS: Properly unmap when getting a remote access error If the RDMA op has aborted with a remote access error, in addition to what we already do (tell userspace it has completed with an error) also unmap it and put() the rm. Otherwise, hangs may occur on arches that track maps and will not exit without proper cleanup. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 21:17:00 -07:00
Andy Grover	b98ba52f96	RDS: only put sockets that have seen congestion on the poll_waitq rds_poll_waitq's listeners will be awoken if we receive a congestion notification. Bad performance may result because all polled sockets contend for this single lock. However, it should not be necessary to wake pollers when a congestion update arrives if they have never experienced congestion, and not putting these on the waitq will hopefully greatly reduce contention. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 21:16:59 -07:00
Tina Yang	550a8002e4	RDS: Fix locking in rds_send_drop_to() It seems rds_send_drop_to() called __rds_rdma_send_complete(rs, rm, RDS_RDMA_CANCELED) with only rds_sock lock, but not rds_message lock. It raced with other threads that is attempting to modify the rds_message as well, such as from within rds_rdma_send_complete(). Signed-off-by: Tina Yang <tina.yang@oracle.com> Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 21:16:58 -07:00
Andy Grover	97069788d6	RDS: Turn down alarming reconnect messages RDS's error messages when a connection goes down are a little extreme. A connection may go down, and it will be re-established, and everything is fine. This patch links these messages through rdsdebug(), instead of to printk directly. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 21:16:57 -07:00
Andy Grover	571c02fa81	RDS: Workaround for in-use MRs on close causing crash if a machine is shut down without closing sockets properly, and freeing all MRs, then a BUG_ON will bring it down. This patch changes these to WARN_ONs -- leaking MRs is not fatal (although not ideal, and there is more work to do here for a proper fix.) Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 21:16:56 -07:00
Tina Yang	048c15e641	RDS: Fix send locking issue Fix a deadlock between rds_rdma_send_complete() and rds_send_remove_from_sock() when rds socket lock and rds message lock are acquired out-of-order. Signed-off-by: Tina Yang <Tina.Yang@oracle.com> Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 21:16:55 -07:00
Andy Grover	2e7b3b9945	RDS: Fix congestion issues for loopback We have two kinds of loopback: software (via loop transport) and hardware (via IB). sw is used for 127.0.0.1, and doesn't support rdma ops. hw is used for sends to local device IPs, and supports rdma. Both are used in different cases. For both of these, when there is a congestion map update, we want to call rds_cong_map_updated() but not actually send anything -- since loopback local and foreign congestion maps point to the same spot, they're already in sync. The old code never called sw loop's xmit_cong_map(),so rds_cong_map_updated() wasn't being called for it. sw loop ports would not work right with the congestion monitor. Fixing that meant that hw loopback now would send congestion maps to itself. This is also undesirable (racy), so we check for this case in the ib-specific xmit code. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 21:16:55 -07:00
Andy Grover	8e82376e5f	RDS/TCP: Wait to wake thread when write space available Instead of waking the send thread whenever any send space is available, wait until it is at least half empty. This is modeled on how sock_def_write_space() does it, and may help to minimize context switches. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 21:16:55 -07:00
Andy Grover	b075cfdb66	RDS: update copy_to_user state in tcp transport Other transports use rds_page_copy_user, which updates our s_copy_to_user counter. TCP doesn't, so it needs to explicity call rds_stats_add(). Reported-by: Richard Frank <richard.frank@oracle.com> Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 21:16:54 -07:00
Andy Grover	1123fd734d	RDS: sendmsg() should check sndtimeo, not rcvtimeo Most likely cut n paste error - sendmsg() was checking sock_rcvtimeo. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 21:16:54 -07:00
Andy Grover	735f61e626	RDS: Do not BUG() on error returned from ib_post_send BUGging on a runtime error code should be avoided. This patch also eliminates all other BUG()s that have no real reason to exist. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 21:16:53 -07:00
David S. Miller	87faf3ccf1	bridge: Make first arg to deliver_clone const. Otherwise we get a warning from the call in br_forward(). Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 14:37:47 -07:00
YOSHIFUJI Hideaki / 吉藤英明	32dec5dd02	bridge br_multicast: Don't refer to BR_INPUT_SKB_CB(skb)->mrouters_only without IGMP snooping. Without CONFIG_BRIDGE_IGMP_SNOOPING, BR_INPUT_SKB_CB(skb)->mrouters_only is not appropriately initialized, so we can see garbage. A clear option to fix this is to set it even without that config, but we cannot optimize out the branch. Let's introduce a macro that returns value of mrouters_only and let it return 0 without CONFIG_BRIDGE_IGMP_SNOOPING. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 14:34:23 -07:00

... 2 3 4 5 6 ...

15162 Commits