--author Harald Welte <laforge@netfilter.org>
--title What's been happening in the netfilter world
--date 16 Jul 2005
This is an overview of what has been going on in the netfilter world recently. The main purpose is to keep the rest of the Linux kernel networking crowd informed.
--footer This presentation is made with tpp http://synflood.at/tpp.html
--newpage
--footer netconf'05 - netfilter update
--header Overview
rustynat
nfnetlink
ctnetlink
flow-based accounting
conntrack tool
helpers (pptp, h.323, sip)
pkttables
ipset
ct_sync
transparent proxies
misc
--newpage
--footer netconf'05 - netfilter update
--header rustynat
Three years ago, the "newnat" design was adopted as architecture and API for conntrack/nat helpers. This is what most people are using, and what's in kernel 2.4.x and 2.6.x (for x < 11).
In 2.6.11, a new scheme (which I call "rustynat") was integrated.
Fundamental changes:
struct ip_conntrack no longer has sibling_list
struct ip_conntrack_expect is killed when expected conntrack comes in
NAT helpers are now called by callback functions from conntrack helpers
cleanup of NAT manip data structures to reduce size of ip_conntrack
Problems:
All existing helpers need to be ported (non-trivial port)
Some fallout related to sequence number updates in NAT helper case
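The changed expectation lifetime can be illustrated with a toy model. This is only a minimal Python sketch, not kernel code: ExpectTable, expect() and on_new_connection() are hypothetical names, the key is simplified to (dst_ip, dst_port), and the callback merely stands in for whatever a helper wants to run when the expected connection shows up.

```python
class ExpectTable:
    """Toy model of rustynat-style expectations (all names hypothetical)."""

    def __init__(self):
        self._expects = {}  # (dst_ip, dst_port) -> callback

    def expect(self, dst_ip, dst_port, callback):
        # A conntrack helper registers an expectation together with a
        # callback to run once the expected connection arrives.
        self._expects[(dst_ip, dst_port)] = callback

    def on_new_connection(self, dst_ip, dst_port):
        # pop() removes the expectation as soon as it is fulfilled;
        # nothing stays linked to the new conntrack afterwards.
        callback = self._expects.pop((dst_ip, dst_port), None)
        if callback is None:
            return False  # not an expected connection
        callback(dst_ip, dst_port)
        return True

    def __len__(self):
        return len(self._expects)
```

A second connection to the same address and port is no longer treated as expected, because the entry is gone after the first match.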
--newpage
--footer netconf'05 - netfilter update
--header nfnetlink
Fundamental idea is to have a generic layer for all netfilter related netlink messages. It basically adds another layer of abstraction/multiplexing on top of netlink. Is it really needed?
Looking at the real users, they are extremely different:
ctnetlink
dump/read/flush/update connection tracking table
dump/read/flush/update connection tracking expectation table
ulog-ng
log arbitrary (even non-ip) packets to userspace
nf_queue
queue arbitrary (even non-ip) packets to userspace
pkttnetlink
ruleset management
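The extra multiplexing layer can be sketched as follows. This is an illustration in Python, assuming the subsystem ID lives in the upper 8 bits of the 16-bit netlink message type and the per-subsystem message type in the lower 8 bits. The function names and the subsystem numbers used in the example are made up.

```python
def nfnl_pack(subsys_id, msg_type):
    """Combine subsystem ID and message type into one 16-bit type field."""
    if not (0 <= subsys_id <= 0xFF and 0 <= msg_type <= 0xFF):
        raise ValueError("subsys_id and msg_type must each fit in 8 bits")
    return (subsys_id << 8) | msg_type


def nfnl_unpack(nlmsg_type):
    """Split a 16-bit type field back into (subsys_id, msg_type)."""
    return (nlmsg_type >> 8) & 0xFF, nlmsg_type & 0xFF


def dispatch(handlers, nlmsg_type, payload):
    """Route a message to the handler registered for its subsystem."""
    subsys_id, msg_type = nfnl_unpack(nlmsg_type)
    return handlers[subsys_id](msg_type, payload)
```

With handlers = {1: ctnetlink_handler, 2: queue_handler} (hypothetical IDs), very different users share one netlink family while keeping their own message type namespaces.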
--newpage
--footer netconf'05 - netfilter update
--header ctnetlink
Purpose of ctnetlink is to have a userspace interface to the conntrack table
message types
IPCTNL_MSG_CT_NEW - create a new conntrack
IPCTNL_MSG_CT_DELETE - delete a conntrack, flush table
IPCTNL_MSG_CT_GET - read one or more conntracks
IPCTNL_MSG_CT_GET_CTRZERO - read conntrack and zero counters
IPCTNL_MSG_EXP_NEW - create a new expect
IPCTNL_MSG_EXP_DELETE - delete an expect
IPCTNL_MSG_EXP_GET - read one or more expects
IPCTNL_MSG_CONFIG - configuration of masks (see later)
--newpage
--footer netconf'05 - netfilter update
--header conntrack event cache
ctnetlink also wants to have events, i.e. inform userspace about updates
ip_conntrack was extended to build an 'event cache', i.e. a list of events that have happened while one specific packet passes through the stack:
IPCT_DESTROY
IPCT_NEW
IPCT_RELATED
IPCT_STATUS
IPCT_PROTOINFO
IPCT_HELPER
IPCT_HELPINFO
IPCT_NATINFO
When packet traversal finishes, a notifier is called with the bitmask of accumulated events for this packet (skb->nfcache)
Event API is used by ct_sync and ctnetlink
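The accumulate-then-notify pattern can be sketched like this. It is a simplified Python model, not the kernel implementation: the bit values are illustrative, and EventCache, cache() and deliver() are hypothetical names.

```python
from enum import IntFlag


class Event(IntFlag):
    # Event names follow the list above; the bit values are made up.
    DESTROY = 1 << 0
    NEW = 1 << 1
    RELATED = 1 << 2
    STATUS = 1 << 3
    PROTOINFO = 1 << 4
    HELPER = 1 << 5
    HELPINFO = 1 << 6
    NATINFO = 1 << 7


class EventCache:
    def __init__(self):
        self.pending = Event(0)  # events accumulated for the current packet
        self.notifiers = []

    def register(self, notifier):
        self.notifiers.append(notifier)

    def cache(self, event):
        # Called while the packet traverses the stack: only OR in a bit.
        self.pending |= event

    def deliver(self):
        # Called once when packet traversal finishes.
        events, self.pending = self.pending, Event(0)
        if events:
            for notifier in self.notifiers:
                notifier(events)
```

Each subscriber gets at most one call per packet, however many individual events were recorded along the way.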
--newpage
--footer netconf'05 - netfilter update
--header ctnetlink
ctnetlink registers with the event API and sends ctnetlink multicast msgs
ctnetlink event messages are either NEW, NEW with F_UPDATE or DELETE
Problem:
There can be lots of events.
We can easily see 200,000 NEW conntracks per second
Interim Solution:
Have userspace app specify the bitmask of interesting events via
IPCTNL_MSG_CONFIG. This defeats use by multiple uncooperative apps.
--newpage
--footer netconf'05 - netfilter update
--header ctnetlink
Proposed Real Solution:
Have generic netlink event message filters.
- Every socket can set its local bitmask of events using setsockopt()
- netlink core maintains ORed event mask that is used by ctnetlink
- Whenever a socket disappears (or changes its mask), we recalculate
the global mask
This scheme should really be generic, since other subsystems with potentially many messages can profit from it.
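The proposed bookkeeping can be sketched as follows. This is a minimal Python model under stated assumptions: NetlinkCore, set_mask() and close() are hypothetical names standing in for the netlink core, the setsockopt() call and socket teardown.

```python
class NetlinkCore:
    def __init__(self):
        self._masks = {}      # socket id -> per-socket event mask
        self.global_mask = 0  # OR of all per-socket masks

    def _recalculate(self):
        mask = 0
        for socket_mask in self._masks.values():
            mask |= socket_mask
        self.global_mask = mask

    def set_mask(self, sock_id, mask):
        # Stand-in for setsockopt(): a socket sets or changes its mask.
        self._masks[sock_id] = mask
        self._recalculate()

    def close(self, sock_id):
        # A socket disappears, so the global mask may shrink.
        self._masks.pop(sock_id, None)
        self._recalculate()

    def wanted(self, events):
        # Cheap check for the event producer (e.g. ctnetlink):
        # is anybody interested in these events at all?
        return bool(self.global_mask & events)

    def listeners(self, events):
        # Sockets whose own mask matches the events.
        return sorted(s for s, m in self._masks.items() if m & events)
```

The producer only needs the single wanted() test on the hot path; per-socket filtering happens at delivery time.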
--newpage
--footer netconf'05 - netfilter update
--header conntrack tool
To test and use ctnetlink, Pablo Neira wrote the "conntrack" tool
Basically "iproute2" for conntrack:
-L [table] [-z] List conntrack or expect table
-G [table] params Show conntrack or expect
-D [table] params Delete conntrack or expect
-I [table] params Create conntrack or expect
-E [table] [options] Show events (equals "ip route monitor")
--newpage
--footer netconf'05 - netfilter update
--header flow-based accounting
Linux lacks a good accounting solution.
Lots of people use inefficient net-acct/nacctd, ip-acct, ulog-acct, ...
Specialized solutions exist (ipt_ACCOUNT, ...) but are limited in scope
Most people want to have flow-based instead of packet-based logs
NETFLOW (or now IPFIX) format can be used by standard tools for analysis
Idea: We already have a flow cache in the kernel
Problem: It's read-only per packet
But: ip_conntrack already has per-packet write access
So: We can put counters in same already-written-to ip_conntrack cache line
Userspace interface is ctnetlink (either polling or event-based)
Simplistic implementation can use "conntrack" tool and pipe to perl script
Fully-featured logging daemon (ulogd2) is in the final implementation stage
See my OLS 2005 paper for more details
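The idea of keeping counters inside the per-flow state can be sketched as follows. This is a toy Python model, not the conntrack code: FlowTable, account() and destroy() are hypothetical names, and the record layout is invented for illustration rather than taken from NETFLOW or IPFIX.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    packets: int = 0
    nbytes: int = 0


class FlowTable:
    def __init__(self):
        self._flows = {}  # flow key (e.g. a 5-tuple) -> Flow

    def account(self, key, length):
        # The per-packet path already writes to the flow entry,
        # so bumping two counters there adds little extra cost.
        flow = self._flows.setdefault(key, Flow())
        flow.packets += 1
        flow.nbytes += length

    def destroy(self, key):
        # When the flow ends, emit one record per flow
        # instead of one log entry per packet.
        flow = self._flows.pop(key)
        return {"flow": key, "packets": flow.packets, "bytes": flow.nbytes}
```

An event-based consumer would collect the record returned on destroy; a polling consumer would read the counters of live flows instead.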
--newpage
--footer netconf'05 - netfilter update
--header helpers
PPTP
helper is now finally ported to rustynat
will be merged soon since I'm tired of syncing it with core changes
H.323
now has a simplified ASN.1 parser instead of brute-force replace
needs more testing but could probably be merged soon, too
SIP
first development version showed up
extremely complex protocol, helper can only cover common cases
some features (like host names in SDP) cannot be solved in-kernel
--newpage
--footer netconf'05 - netfilter update
--header pkttables
Sorry, no real progress since last year. Too much other work :(
We'll have to wait a bit longer until we see the next linux packet filter..
--newpage
--footer netconf'05 - netfilter update
--header nf_conntrack
nf_conntrack is the layer3-independent connection tracking code (ipv4+ipv6)
- Code is still kept in-sync with ip_conntrack changes
- We still don't have IPv4-NAT on top of it
- Should already have been submitted a long time ago
- Problem: you can only have one of ip_conntrack or nf_conntrack loaded at a time
- All the existing users ('state' and 'conntrack' iptables match, ..)
can't deal with it transparently.
- Should get fixed up, but like many ipv6 issues it has low prio :(
--newpage
--footer netconf'05 - netfilter update
--header ipset
http://ipset.netfilter.org/
- Supersedes old ippool code
- Idea is to have certain groups of addresses (called "sets")
- Instead of having 100 iptables rules to match on 100 addresses, you have
1 iptables rule and an ipset with 100 addresses
- It's more efficient since it uses compact data types (such as a 256-bit
bitmask for any N addresses out of a /24)
- Should IMHO get merged soon, too.
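The /24 bitmask idea can be sketched like this. It is a minimal Python illustration of the data type only, not ipset's actual code. The class name Slash24Set is hypothetical.

```python
import ipaddress


class Slash24Set:
    """Set of addresses inside one /24, stored as a 256-bit bitmask."""

    def __init__(self, network):
        self.network = ipaddress.ip_network(network)
        if self.network.version != 4 or self.network.prefixlen != 24:
            raise ValueError("expected an IPv4 /24 network")
        self.bits = 0  # bit i set <=> host number i is a member

    def add(self, addr):
        ip = ipaddress.ip_address(addr)
        if ip not in self.network:
            raise ValueError("address outside of the /24")
        self.bits |= 1 << (int(ip) & 0xFF)

    def __contains__(self, addr):
        ip = ipaddress.ip_address(addr)
        if ip not in self.network:
            return False
        # Membership is one shift and one mask, however many
        # addresses the set holds.
        return bool((self.bits >> (int(ip) & 0xFF)) & 1)
```

One rule matching against such a set costs the same for 1 or for 200 member addresses, which is the gain over one rule per address.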
--newpage
--footer netconf'05 - netfilter update
--header ct_sync
- Development of 2.6.x port seems to have stabilized now
- We haven't seen any oopses for quite some time
- Still doesn't support working failover for 'helped' connections
- 2.6.x branch allows one node to participate in multiple virtual clusters
- Currently working on real active-active failover
- Current code based on 2.6.10, so no "rustynat" port yet
--newpage
--footer netconf'05 - netfilter update
--header transparent proxying
In 2.2.x we had the kludgy bind-to-foreign-address code
In 2.4.x it was removed because netfilter had to clean up core networking code
Now we have huge bloaty TPROXY patches out-of-tree instead:
- they do DNAT of incoming connection
- SNAT on outgoing connection
- use SO_GETORIGDST on incoming connection to retrieve un-nat'ed addr
While the code is working fine, I think it's just not worth the effort:
- NATing _twice_ just to route packets to local sockets, plus
- kludgy socket options and other nasty stuff....
All we need is
- route certain packets to local sockets (based on destip/destport)
- bind local processes to foreign addresses (already works)
- send packets from sockets bound to foreign addresses
Transparent proxies with ctnetlink-issued expectations are what you want to enable conntrack helpers in userspace!
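For reference, this is roughly how a proxy retrieves the un-NATed address today. A hedged Python sketch: the socket option is defined as SO_ORIGINAL_DST (value 80) in <linux/netfilter_ipv4.h>. Python's socket module does not export it, so the constant is spelled out here. The function names are hypothetical, and get_original_dst() only works on a Linux socket whose connection was actually redirected by NAT.

```python
import socket
import struct

SO_ORIGINAL_DST = 80  # from <linux/netfilter_ipv4.h>


def parse_sockaddr_in(raw):
    # struct sockaddr_in: 2 bytes family (skipped), 2 bytes port and
    # 4 bytes address in network byte order, then 8 bytes of padding.
    port, addr = struct.unpack_from("!2xH4s", raw)
    return socket.inet_ntoa(addr), port


def get_original_dst(conn):
    # conn is an accepted TCP socket of the proxy (IPv4 only).
    raw = conn.getsockopt(socket.SOL_IP, SO_ORIGINAL_DST, 16)
    return parse_sockaddr_in(raw)
```

The proxy would call get_original_dst(conn) right after accept() and then open its own connection to the returned (address, port) pair.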
--newpage
--footer netconf'05 - netfilter update
--header misc
- new sourcecode directory structure: /net/netfilter/* for core stuff
- ipsec interaction -> Patrick
- conntrack reference issue (rmmod ip_conntrack vs. nf_reset() vs.
local nat vs. GETORIGDST)
not netfilter-related
- would somebody mind 'alias' devices that had their own mac address?