import of old now defunct presentation slides svn repo

Harald Welte 2015-10-25 21:00:20 +01:00
commit fca59bea77
2136 changed files with 370466 additions and 0 deletions


@ -0,0 +1,336 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD>
<META NAME="GENERATOR" CONTENT="SGML-Tools 1.0.9">
<TITLE>The netfilter framework in Linux 2.4</TITLE>
</HEAD>
<BODY>
<H1>The netfilter framework in Linux 2.4</H1>
<H2>Harald Welte <CODE>laforge@gnumonks.org</CODE></H2>$Date: 2004-10-10 15:04:54 +0200 (Sun, 10 Oct 2004) $
<P><HR>
<EM>This is the paper on which my talk about netfilter at Linux-Kongress 2000, CCC Congress 2000 (and probably some more occasions where I give this talk) is based. It describes the netfilter infrastructure, as well as the systems for packet filtering, NAT and packet mangling on top of it.</EM>
<HR>
<H2><A NAME="s1">1. PART I - Netfilter basics / concepts</A></H2>
<H2>1.1 What is netfilter?</H2>
<P>Netfilter is definitely more than any of the firewall subsystems in past Linux kernels. Netfilter provides an abstract, generalized framework, of which the packet filtering subsystem is one particular incarnation. So don't expect a talk about "how to set up a firewall or a masquerading gateway in 2.4". This would only cover a part of netfilter.
<P>The netfilter framework consists of three parts:
<P>
<P>
<OL>
<LI>Each protocol defines a set of 'hooks' (IPv4 defines 5), which are well-defined points in a packet's traversal of that protocol stack. At each of these points, the protocol stack will call the netfilter framework with the packet and the hook number.
</LI>
<LI>Parts of the kernel can register to listen to the different hooks for each protocol. So when a packet is passed to the netfilter framework, it checks to see if anyone has registered for that protocol and hook; if so, they get a chance to examine (and possibly alter) the packet, discard it, allow it to pass or ask netfilter to queue the packet for userspace.
</LI>
<LI>Packets that have been queued are collected for sending to userspace; these packets are handled asynchronously. A userspace process can examine the packet, alter it, and reinject it at the same hook where it left the kernel.</LI>
</OL>
<P>
<P>All the packet filtering / NAT / ... stuff is based on this framework. There is no more dirty packet altering code spread all over the network stack.
<P>
<P>The netfilter framework is currently implemented for IPv4, IPv6 and DECnet.
<P>
<H2>1.2 Why did we need netfilter?</H2>
<P>This chapter could be called 'What is wrong with ipchains?', too. So why did we need this change? (I only give a few examples here)
<P>
<UL>
<LI>No infrastructure for passing packets to userspace, so all code which does some packet fiddling must be done as kernel code. Kernel programming is hard, must be done in C, and is dangerous.
</LI>
<LI>Transparent proxying is extremely difficult.
We have to look up _every_ packet to see if there's a socket bound to that address. There is no clean interface, just 34 #ifdefs in 11 different files of the network stack.
</LI>
<LI>Creating packet filter rules independent of the interface address is impossible.
We must know the local interface address to distinguish locally generated or locally terminated packets from packets passing through. The forward chain only has information on the outgoing interface, so we must try to figure out where the packet came from.
</LI>
<LI>Masquerading and packet filtering are implemented as one part.
This makes the firewalling code way too complex.
</LI>
<LI>The ipchains code is neither modular nor extensible (e.g. for MAC address filtering).</LI>
</UL>
<P>
<H2>1.3 The authors of netfilter</H2>
<P>The concept of the netfilter framework and most of its implementation were done by Rusty Russell. He is co-author of ipchains and is the current Linux kernel IP firewall maintainer. Rusty got paid for one year by Watchguard (a firewall company) to do nothing, so he had enough time to do it :)
<P>
<P>The official netfilter core team consists of Rusty Russell, Marc Boucher, James Morris and Harald Welte. Of course there are various other hackers who have contributed some stuff (for more information see
<A HREF="http://netfilter.samba.org/scoreboard.html">http://netfilter.samba.org/scoreboard.html</A>).
<P>
<H2>1.4 Netfilter architecture in IPv4</H2>
<P>A Packet Traversing the Netfilter System:
<BLOCKQUOTE><CODE>
<PRE>
   --->[1]--->[ROUTE]--->[3]--->[4]--->
                 |            ^
                 |            |
                 |         [ROUTE]
                 v            |
                [2]          [5]
                 |            ^
                 |            |
                 v            |
</PRE>
</CODE></BLOCKQUOTE>
<P>
<P>
<P>Packets come in from the left. After verification of the IP checksum, the packets hit the NF_IP_PRE_ROUTING [1] hook.
<P>Next they enter the routing code, which decides if the packets are local or have to be passed to another interface.
<P>If the packets are considered to be local, they traverse the NF_IP_LOCAL_IN [2] hook and get passed to the process (if any) afterwards.
<P>If the packets are routed to another interface, they pass the NF_IP_FORWARD [3] hook.
<P>Packets pass a final netfilter hook, NF_IP_POST_ROUTING [4], before they get transmitted on the target interface.
<P>The NF_IP_LOCAL_OUT [5] hook is called for locally generated packets. The diagram shows routing occurring after this hook is called; in fact, the routing code is called first (to figure out the source IP address and some IP options), and called again if the packet is altered.
<P>Locally generated packets hit NF_IP_POST_ROUTING [4], too.
<P>
<H2>1.5 Netfilter base</H2>
<P>Kernel modules can register a callback function for each one of these hooks. This callback function is called for each packet traversing the hook. The module is free to alter the packet. It has to return one of these constants to netfilter:
<P>
<UL>
<LI>NF_ACCEPT continue traversal as normal</LI>
<LI>NF_DROP drop the packet; do not continue traversal</LI>
<LI>NF_STOLEN I've taken over the packet; do not continue traversal</LI>
<LI>NF_QUEUE queue the packet (usually for userspace handling)</LI>
<LI>NF_REPEAT call this hook again</LI>
</UL>
<P>
<P>
<H2>1.6 Packet selection: IP tables</H2>
<P>A packet selection system called IP tables has been built. It is a direct descendant of ipchains, with extensibility.
<P>Kernel modules can create a new table utilizing the IP tables core, and ask for a packet to traverse a given table.
<P>IP tables are used for packet filtering (the 'filter' table), Network Address Translation (the 'nat' table) and general packet mangling (the 'mangle' table).
<P>The three big parts of Linux 2.4 packet handling are built using netfilter hooks and IP tables. They are separate modules and are independent of each other. They all plug nicely into the infrastructure provided by netfilter.
<P>
<OL>
<LI>Packet filtering
<P>This table 'filter' should never alter packets, only filter them.
One of the advantages of iptables over ipchains is that it is small and fast, and it hooks into netfilter at the NF_IP_LOCAL_IN, NF_IP_FORWARD and NF_IP_LOCAL_OUT hooks.
<P>Therefore, for each packet there is one, and only one, place to filter it. This is one big change compared to ipchains, where a forwarded packet used to traverse three chains.
<P>
</LI>
<LI> NAT
<P>The nat table listens at three netfilter hooks: NF_IP_PRE_ROUTING and NF_IP_POST_ROUTING to do destination and source NAT, respectively, for routed packets. To alter the destination of locally generated packets, the NF_IP_LOCAL_OUT hook is used.
<P>This table is different from the 'filter' table, in that only the first packet of a new connection will traverse the table. The result of this traversal is then applied to all future packets of the same connection.
<P>The nat table is used for source NAT, destination NAT, masquerading (which is a special case of source NAT) and transparent proxying (which is a special case of destination NAT).
<P>
</LI>
<LI> Packet mangling
<P>The 'mangle' table registers at the NF_IP_PRE_ROUTING and NF_IP_LOCAL_OUT hooks.
<P>Using the mangle table you can modify the packet itself or some of the out-of-band data attached to the packet. Currently, altering the TOS bits and setting the nfmark field inside the skb are implemented on top of the mangle table.
</LI>
</OL>
<P>
<H2>1.7 Connection tracking</H2>
<P>Connection tracking is fundamental to NAT, but has been implemented as a separate module. This allows an extension to the packet filtering code (the 'state' match) to simply use connection tracking for "stateful firewalling".
<P>
<P>
<H2><A NAME="s2">2. PART II - packet filtering using iptables and netfilter</A></H2>
<H2>2.1 Overview</H2>
<P>I assume you are familiar with TCP/IP, routing, firewall concepts and packet filtering in general.
<P>As already explained in Part I, the filter table listens on three hooks, thus providing us three chains for packet filtering.
<P>All packets coming from the network and destined for the local box traverse the INPUT chain.
<P>All packets which are forwarded (routed) by us traverse the FORWARD chain (and only the FORWARD chain). Please note again how this differs from the previous Linux firewall implementations!
<P>Finally, the packets originating from the local box traverse the OUTPUT chain.
<P>
<H2>2.2 Inserting rules into chains</H2>
<P>To insert/delete/modify any rules in Linux 2.4 IP tables we have a neat and powerful command line tool called 'iptables'. I don't want to get too deep into all its features and extensibility. Here are some of its major features:
<UL>
<LI>It handles all different kinds of IP tables: currently the filter, nat and mangle tables, but also all future table modules.
</LI>
<LI>It supports plugins for new matches and new targets. Thus, nobody ever needs to patch anything to provide a netfilter extension. You have a kernel module doing the real work and an iptables plugin (dynamic library) to add the necessary configuration options.
</LI>
<LI>It comes in two incarnations: iptables (IPv4) and ip6tables (IPv6). Both of them are based on the same library and mostly the same code.</LI>
</UL>
<P>
<H3>Basic iptables commands</H3>
<P>An iptables command usually consists of 5 parts:
<OL>
<LI>which table we want to work with</LI>
<LI>which chain in this table we want it to use</LI>
<LI>an operation (insert, add, delete, modify)</LI>
<LI>a target for this particular rule</LI>
<LI>a description of which packets we want to match this rule</LI>
</OL>
<P>The basic syntax is
<PRE>
iptables -t table -Operation chain -j target match(es)
</PRE>
<P>To add a rule allowing all traffic from anywhere to our local smtp port:
<PRE>
iptables -t filter -A INPUT -j ACCEPT -p tcp --dport smtp
</PRE>
<P>Of course there are various other commands like flush chain, set the default policy of a chain, add a user-defined chain, ...
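<P>As a hedged illustration of those other commands, the sketch below flushes the filter table, sets default policies and adds a user-defined chain. The chain name <CODE>logdrop</CODE> and the chosen policies are assumptions made for this example only. By default the commands are merely printed; they are executed only if <CODE>IPT</CODE> is set to the real <CODE>iptables</CODE> binary and the script runs as root.
<PRE>
```shell
# Dry run by default: every command is printed instead of executed.
# Assumption: export IPT=iptables and run as root to really apply the rules
# (a DROP policy on INPUT will cut off a remote login session).
IPT="${IPT:-echo iptables}"

reset_filter_table() {
    $IPT -t filter -F                  # flush the rules of all chains in the table
    $IPT -t filter -X                  # delete all (now empty) user-defined chains
    $IPT -t filter -P INPUT DROP       # default policy for packets no rule accepted
    $IPT -t filter -P FORWARD DROP
    $IPT -t filter -P OUTPUT ACCEPT
    $IPT -t filter -N logdrop          # hypothetical user-defined chain
    $IPT -t filter -A logdrop -j DROP  # rules can jump to it with "-j logdrop"
}

reset_filter_table
```
</PRE>
<P>Policies can only be set on the built-in chains; a user-defined chain such as the one above simply returns to its caller when no rule matches.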
<P>Basic Operations:
<PRE>
-A append rule
-I insert rule
-D delete rule
-R replace rule
-L list rules
</PRE>
<P>Basic Targets, common to all chains:
<PRE>
ACCEPT accept the packet
DROP drop the packet
QUEUE queue packet to userspace
RETURN return to the previous (calling) chain
foobar user defined chain
</PRE>
<P>
<P>Basic matches, common to all chains:
<PRE>
-p protocol (tcp/icmp/udp/...)
-s source address (ip address/masklen)
-d destination address (ip address/masklen)
-i incoming interface
-o outgoing interface
</PRE>
<P>Apart from these basic operations, matches and targets there are various extensions, which I'll describe in the appropriate chapters.
<P>
<H2>2.3 iptables match extensions for filtering</H2>
<P>There are various extensions which are useful for packet filtering. Describing them all in detail would take way too much time, so the following list is just meant to give you an impression of their power :)
<P>First, there are some match extensions, which give us more power to describe which packets to match:
<UL>
<LI>TCP match extensions to match source port, destination port, arbitrary combinations of TCP flags, tcp options.</LI>
<LI>UDP match extensions to match source port and destination port</LI>
<LI>ICMP match extension to match icmp type</LI>
<LI>MAC match extension to match incoming mac (ethernet) address</LI>
<LI>MARK match extension to match the nfmark </LI>
<LI>OWNER match extension (for locally generated packets only) to match user id, group id, process id, session id</LI>
<LI>LIMIT match extension to match only a certain limit of packets per time frame. Very useful to prevent forwarding of any kind of flooding.</LI>
<LI>STATE match extension to match packets of a certain state (decided by the connection tracking subsystem). Possible states are
<UL>
<LI>INVALID (not matched against a connection), </LI>
<LI>ESTABLISHED (packet belongs to an already established connection), </LI>
<LI>NEW (packet would establish a new connection) and </LI>
<LI>RELATED (packet is in some way related to an already established connection. For example an ICMP error message or a ftp data connection)</LI>
</UL>
</LI>
<LI>TOS match extension to match the value of the TOS IP header field</LI>
<LI>TTL match extension to match the value of the TTL IP header field</LI>
</UL>
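<P>A minimal sketch of how some of these match extensions combine into a stateful INPUT rule set follows. The interface name, port number, rate limit and MAC address are assumptions picked for illustration. As before, the commands are only printed unless <CODE>IPT</CODE> points at the real <CODE>iptables</CODE> binary.
<PRE>
```shell
# Dry run by default: every command is printed instead of executed.
# Assumption: export IPT=iptables and run as root to really apply the rules.
IPT="${IPT:-echo iptables}"

stateful_input_rules() {
    # STATE: drop packets that connection tracking could not assign to a connection
    $IPT -A INPUT -m state --state INVALID -j DROP
    # STATE: accept packets of known connections and packets related to them
    $IPT -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
    # STATE + TCP: allow new connections to the local SSH port
    $IPT -A INPUT -m state --state NEW -p tcp --dport 22 -j ACCEPT
    # ICMP + LIMIT: accept about one echo request per second, with a burst of 5
    $IPT -A INPUT -p icmp --icmp-type echo-request -m limit --limit 1/second --limit-burst 5 -j ACCEPT
    # MAC: trust one made-up ethernet address on the internal interface
    $IPT -A INPUT -i eth0 -m mac --mac-source 00:11:22:33:44:55 -j ACCEPT
}

stateful_input_rules
```
</PRE>
<P>Packets exceeding the limit do not match the LIMIT rule and fall through to the following rules and finally to the chain policy.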
<P>
<P>
<H2>2.4 iptables target extensions for filtering</H2>
<P>
<UL>
<LI>LOG log matched packets via syslog()</LI>
<LI>ULOG log matched packets via userspace logging daemon
(supports interpreter and output plugins for flexible logging)</LI>
<LI>REJECT not only drop the packet, but also send some kind of error
message to the sender (which message is configurable)</LI>
<LI>MIRROR retransmit the packet after exchanging source and destination
IP address </LI>
</UL>
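<P>The LOG and REJECT targets can be sketched as follows. The port and the log prefix are assumptions chosen for this example. The commands are only printed unless <CODE>IPT</CODE> is set to the real <CODE>iptables</CODE> binary.
<PRE>
```shell
# Dry run by default: every command is printed instead of executed.
# Assumption: export IPT=iptables and run as root to really apply the rules.
IPT="${IPT:-echo iptables}"

log_and_reject_telnet() {
    # LOG does not end rule traversal: the packet continues with the next rule
    $IPT -A INPUT -p tcp --dport 23 -j LOG --log-prefix "telnet-attempt:"
    # REJECT: answer with a TCP reset instead of silently dropping the packet
    $IPT -A INPUT -p tcp --dport 23 -j REJECT --reject-with tcp-reset
}

log_and_reject_telnet
```
</PRE>
<P>Because LOG only records the packet, it is normally paired with a second rule, as here, that decides the packet's fate.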
<P>
<H2><A NAME="s3">3. PART III - NAT using iptables and netfilter</A></H2>
<P>Regarding NAT (Network Address Translation), the previous Linux kernels only supported one special case called "Masquerading".
<P>Netfilter now enables Linux to do any kind of NAT.
<P>NAT is divided into `source NAT' and `destination NAT'.
<P>Source NAT alters the source address of a packet while passing the NF_IP_POST_ROUTING hook. Masquerading is a special application of SNAT.
<P>Destination NAT alters the destination address of a packet while passing the NF_IP_PRE_ROUTING hook (incoming packets) or the NF_IP_LOCAL_OUT hook (locally generated packets). Port forwarding and transparent proxying are forms of DNAT.
<P>
<H2>3.1 iptables target extensions for NAT</H2>
<P>
<P>
<DL>
<P>
<DT><B>SNAT</B><DD><P>Change the source address to something different
<P>Example:
<PRE>
iptables -t nat -A POSTROUTING -j SNAT --to-source 1.2.3.4
</PRE>
<P>
<DT><B>MASQUERADE</B><DD><P>SNAT for dialup connections with dynamic ip address
<P>Does almost the same as SNAT, but if the link goes down, all connection tracking information is dropped. The connections are lost anyway, because we get a different IP address at reconnect.
<P>Example:
<PRE>
iptables -t nat -A POSTROUTING -j MASQUERADE -o ppp0
</PRE>
<P>
<DT><B>DNAT</B><DD><P>Change the destination address to something different
<P>This is done at the PREROUTING chain, just as the packet comes in. Therefore, anything else on the Linux box itself (routing, packet filtering) will see the packet going to its real (new) destination.
<P>Example:
<PRE>
iptables -t nat -A PREROUTING -j DNAT --to-destination 1.2.3.4:8080 -p tcp --dport 80 -i eth1
</PRE>
<P>
<DT><B>REDIRECT</B><DD><P>Redirect packets to local destination
<P>Exactly the same as doing DNAT to the address of the incoming interface
<P>Example:
<PRE>
iptables -t nat -A PREROUTING -j REDIRECT --to-port 3128 -i eth1 -p tcp --dport 80
</PRE>
<P>
</DL>
<P>
<H2><A NAME="s4">4. PART IV - Packet mangling using iptables and netfilter</A></H2>
<P>The `mangle' table enables us to alter the packet itself or some data accompanying the packet.
<P>
<H2>4.1 iptables target extensions for packet mangling</H2>
<P>
<DL>
<P>
<DT><B>MARK</B><DD><P>set the value of the nfmark field
<P>We can change the value of the nfmark field. The nfmark is just a user defined mark (anything within the range of an unsigned long) of the packet. The mark value is used to do policy routing, tell ipqmpd (the userspace queue multiplex daemon) which process to queue the packet to, etc.
<P>Example:
<BLOCKQUOTE><CODE>
<PRE>
iptables -t mangle -A PREROUTING -j MARK --set-mark 0x0a -p tcp
</PRE>
</CODE></BLOCKQUOTE>
<P>
<DT><B>TOS</B><DD><P>set the value of the TOS bits inside the IP header
<P>We can change the value of the type of service bits inside the IP header. This is useful if you are using TOS based packet scheduling / routing.
<P>Example:
<BLOCKQUOTE><CODE>
<PRE>
iptables -t mangle -A PREROUTING -j TOS --set-tos 0x10 -p tcp --dport ssh
</PRE>
</CODE></BLOCKQUOTE>
<P>
<DT><B>TTL</B><DD><P>alter the value of the TTL field inside the IP header
<P>Enables the user to set, increase or decrease the TTL field.
<P>Example:
<BLOCKQUOTE><CODE>
<PRE>
iptables -t mangle -A PREROUTING -j TTL --ttl-dec 2 -i eth0
</PRE>
</CODE></BLOCKQUOTE>
</DL>
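<P>To illustrate how the nfmark can drive policy routing, here is a hedged sketch that marks web traffic and routes marked packets through a separate routing table using the <CODE>ip</CODE> tool of the iproute2 package. The table number 100, the gateway 192.0.2.1 and the interface eth1 are assumptions made up for this example. The commands are only printed unless <CODE>IPT</CODE> and <CODE>IPR</CODE> are set to the real binaries.
<PRE>
```shell
# Dry run by default: every command is printed instead of executed.
# Assumption: export IPT=iptables IPR=ip and run as root to really apply this.
IPT="${IPT:-echo iptables}"
IPR="${IPR:-echo ip}"

route_marked_traffic() {
    # Mark incoming web traffic in the mangle table, before the routing decision
    $IPT -t mangle -A PREROUTING -p tcp --dport 80 -j MARK --set-mark 0x0a
    # Policy routing: packets carrying mark 0x0a consult routing table 100
    $IPR rule add fwmark 0x0a table 100
    # Table 100 sends them out via a hypothetical second uplink
    $IPR route add default via 192.0.2.1 dev eth1 table 100
}

route_marked_traffic
```
</PRE>
<P>The mark lives only in the kernel's packet metadata; it is never written into the packet that leaves the box.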
<P>
<H2><A NAME="s5">5. Queueing packets to userspace</A></H2>
<P>As I already mentioned, at any time in any netfilter chain, the packet can be queued to userspace. The actual queuing is done by a kernel module (ip_queue.o).
<P>The packets (including metadata like nfmark and MAC address) are sent to a userspace process using netlink sockets. This process can do whatever it wants to do with the packet.
<P>After the userspace process is done with its work on the packet, it can either reinject the packet into the kernel, or set a verdict (DROP, ...) that tells the kernel what to do with the packet.
<P>This is one key technology of netfilter: it allows complicated packet handling to be done by userspace processes, thus keeping additional complexity out of kernel space.
<P>
<P>Userspace packet handling processes can be easily developed using a netfilter-provided library called 'libipq'.
<P>
<P>Currently only one userspace process is supported, but the first beta release of a userspace IP queueing multiplex daemon (ipqmpd) is available. ipqmpd provides a compatibility library (libipqmpd) which makes upgrading from the raw ipqueue interface to the new ipqmpd as easy as relinking to another library.
<P>
<H2><A NAME="s6">6. PART V Credits</A></H2>
<P>Credits to all the netfilter hackers, especially the core team.
<P>Namely: <B>Paul 'Rusty' Russell</B>, <B>Marc Boucher</B> and <B>James Morris</B>.
<P>Additional special thanks to Rusty for his `netfilter-hacking-HOWTO', `packet-filtering-HOWTO' and `NAT-HOWTO' which I heavily used as a basis for this presentation.
<P>
</BODY>
</HTML>


@ -0,0 +1,18 @@
Tutorial: Firewalling using netfilter/iptables in Linux 2.4
One of the major advantages of the new Linux 2.4.x kernel series is the
new packet filtering / NAT / packet mangling subsystem, called iptables.
Iptables is the successor of ipchains and ipfwadm in 2.2 and 2.0 kernels.
Major new features are stateful firewalling, extensibility and better NAT
(Network Address Translation) support.
Topics:
- concepts behind new netfilter/iptables infrastructure
- usage of iptables
- case example of a real-world firewall
- current (experimental) netfilter work - or "what is patch-o-matic"
- writing netfilter/iptables extension modules
The tutorial will be presented by two of the netfilter core team members,
Rusty Russell <rusty@rustcorp.com.au> and Harald Welte <laforge@gnumonks.org>


@ -0,0 +1,9 @@
Technical Presentation: A tour through the Linux 2.4 network stack
Linux based systems are known for performance and reliability in the area of
networking. This presentation will give a tour through the Linux 2.4 kernel
network stack, its structure and implementation. Some of the topics covered
are: Network hardware drivers, core network functions, IPv4 protocol stack,
sockets implementation, zero-copy TCP.
The Author of this Presentation is Harald Welte <laforge@gnumonks.org>


@ -0,0 +1,116 @@
<!doctype linuxdoc system>
<article>
<title>The journey of a packet through the linux 2.4 network stack</title>
<author>Harald Welte <tt>laforge@gnumonks.org</tt>
<date>$Revision: 537 $, $Date: 2004-10-10 15:04:54 +0200 (Sun, 10 Oct 2004) $</date>
<!-- $Id: packet-journey-2.4.sgml 537 2004-10-10 13:04:54Z laforge $ -->
<abstract>
This document describes the journey of a network packet inside the linux kernel 2.4.x. This has changed drastically since 2.2 because the globally serialized bottom half was abandoned in favor of the new softirq system.
<toc>
<sect>Preface
<p>
I have to apologize for my ignorance, but this document has a strong focus on the "default case": x86 architecture and IP packets which get forwarded.
<p>
I am definitely no kernel guru and the information provided by this document may be wrong. So don't expect too much; I'll always appreciate your comments and bugfixes.
<sect>Receiving the packet
<sect1>The receive interrupt
<p>
If the network card receives an ethernet frame which matches the local MAC address or is a linklayer broadcast, it issues an interrupt.
The network driver for this particular card handles the interrupt, fetches the packet data via DMA / PIO / whatever into RAM. It then allocates a skb and calls a function of the protocol independent device support routines: <tt>net/core/dev.c:netif_rx(skb)</tt>.
<p>
If the driver didn't already timestamp the skb, it is timestamped now. Afterwards the skb gets enqueued in the appropriate queue for the processor handling this packet. If the queue backlog is full, the packet is dropped at this point. After enqueuing the skb, the receive softirq is marked for execution via <tt>include/linux/interrupt.h:__cpu_raise_softirq()</tt>.
<p>
The interrupt handler exits and all interrupts are reenabled.
<sect1>The network RX softirq
<p>
Now we encounter one of the big changes between 2.2 and 2.4: The whole network stack is no longer a bottom half, but a softirq. Softirqs have the major advantage that they may run on more than one CPU simultaneously. Bottom halves were guaranteed to run only on one CPU at a time.
<p>
Our network receive softirq is registered in <tt>net/core/dev.c:net_init()</tt> using the function <tt>kernel/softirq.c:open_softirq()</tt> provided by the softirq subsystem.
<p>
Further handling of our packet is done in the network receive softirq (NET_RX_SOFTIRQ) which is called from <tt>kernel/softirq.c:do_softirq()</tt>. do_softirq() itself is called from three places within the kernel:
<enum>
<item>from <tt>arch/i386/kernel/irq.c:do_IRQ()</tt>, which is the generic IRQ handler
<item>from <tt>arch/i386/kernel/entry.S</tt> in case the kernel just returned from a syscall
<item>inside the main process scheduler in <tt>kernel/sched.c:schedule()</tt>
</enum>
<p>
So if execution passes one of these points, do_softirq() is called; it detects that NET_RX_SOFTIRQ is marked and calls <tt>net/core/dev.c:net_rx_action()</tt>. Here the skb is dequeued from this CPU's receive queue and afterwards handed to the appropriate packet handler. In case of IPv4 this is the IPv4 packet handler.
<sect1>The IPv4 packet handler
<p>
The IP packet handler is registered via <tt>net/core/dev.c:dev_add_pack()</tt> called from <tt>net/ipv4/ip_output.c:ip_init()</tt>.
<p>
The IPv4 packet handling function is <tt>net/ipv4/ip_input.c:ip_rcv()</tt>. After some initial checks (if the packet is for this host, ...) the ip checksum is calculated. Additional checks are done on the length and IP protocol version 4.
<p>
Every packet failing one of the sanity checks is dropped at this point.
<p>
If the packet passes the tests, we determine the size of the ip packet and trim the skb in case the transport medium has appended some padding.
<p>
Now it is the first time one of the netfilter hooks is called.
<p>
Netfilter provides a generic and abstract interface to the standard routing code. This is currently used for packet filtering, mangling, NAT and queuing packets to userspace. For further reference see my conference paper 'The netfilter subsystem in Linux 2.4' or one of Rusty's unreliable guides, e.g. the netfilter-hacking-guide.
<p>
After successful traversal of the netfilter hook, <tt>net/ipv4/ip_input.c:ip_rcv_finish()</tt> is called.
<p>
Inside ip_rcv_finish(), the packet's destination is determined by calling the routing function <tt>net/ipv4/route.c:ip_route_input()</tt>. Furthermore, if our IP packet has IP options, they are processed now. Depending on the routing decision made by <tt>net/ipv4/route.c:ip_route_input_slow()</tt>, the journey of our packet continues in one of the following functions:
<descrip>
<tag>net/ipv4/ip_input.c:ip_local_deliver()</tag>
The packet's destination is local, so we have to process the layer 4 protocol and pass it to a userspace process.
<tag>net/ipv4/ip_forward.c:ip_forward()</tag>
The packet's destination is not local, we have to forward it to another network
<tag>net/ipv4/route.c:ip_error()</tag>
An error occurred; we are unable to find an appropriate routing table entry for this packet.
<tag>net/ipv4/ipmr.c:ip_mr_input()</tag>
It is a Multicast packet and we have to do some multicast routing.
</descrip>
<sect>Packet forwarding to another device
<p>
If the routing decided that this packet has to be forwarded to another device, the function <tt>net/ipv4/ip_forward.c:ip_forward()</tt> is called.
<p>
The first task of this function is to check the ip header's TTL. If it is &lt;= 1 we drop the packet and return an ICMP time exceeded message to the sender.
<p>
We check whether the skb has enough headroom for the destination device's link layer header and expand the skb if necessary.
<p>
Next the TTL is decremented by one.
<p>
If our new packet is bigger than the MTU of the destination device and the don't fragment bit in the IP header is set, we drop the packet and send an ICMP frag needed message to the sender.
<p>
Finally it is time to call another one of the netfilter hooks - this time it is the NF_IP_FORWARD hook.
<p>
Assuming that the netfilter hook returns an NF_ACCEPT verdict, the function <tt>net/ipv4/ip_forward.c:ip_forward_finish()</tt> is the next step in our packet's journey.
<p>
ip_forward_finish() itself checks if we need to set any additional options in the IP header, and has <tt>net/ipv4/ip_options.c:ip_forward_options()</tt> do this. Afterwards it calls <tt>include/net/ip.h:ip_send()</tt>.
<p>
If we need some fragmentation, <tt>net/ipv4/ip_output.c:ip_fragment()</tt> gets called, otherwise we continue in <tt>net/ipv4/ip_output.c:ip_finish_output()</tt>.
<p>
ip_finish_output() again does nothing more than call the netfilter postrouting hook NF_IP_POST_ROUTING and, on successful traversal of this hook, ip_finish_output2().
<p>
ip_finish_output2() prepends the hardware (link layer) header to our skb and calls <tt>dst->hh->hh_output()</tt>, which usually seems to be <tt>net/core/dev.c:dev_queue_xmit()</tt>.
<p>
dev_queue_xmit() enqueues the packet for transmission by the network device.
</article>


@ -0,0 +1,397 @@
%include "cnc-style.mgp"
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%pcache 1 1 0 1
%size 7, font "standard", fore "white", vgap 20, back "black"
%bimage "fundo-cnc.png" 1024x768
%center
%size 7
Quality of Service in IP Networks
%center
%size 4
by
Harald Welte <laforge@gnumonks.org>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
Contents
Definition of QoS
Why QoS
IP Networks are not designed for QoS
How to do the impossible
How can Linux based systems help
Advanced Concepts (DiffServ, IntServ, RSVP, ...)
References / Further Reading
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
Definition of QoS
Provide Service Differentiation
Performance Assurance by
Bandwidth guarantees
for streaming multimedia traffic
prioritizing certain important applications
Latency guarantees
for voice over IP
for interactive character-oriented applications (ssh,telnet)
Packet-loss guarantees
for unreliable layer-4 protocols
to avoid retransmits
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
Why QoS
Decide how, and between whom, available bandwidth is divided
Limit available bandwidth for certain users / applications
Guarantee bandwidth for certain users / applications
Divide bandwidth more equally between users / applications
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
IP networks not designed for QoS
Properties of IP-based networks:
offer a "best-effort" service
make NO guarantees about
bandwidth
latency
packet loss
provide a non-reliable packet transport
Conclusion: IP networks are not suitable for QoS
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
How to do the Impossible
%size 4
As IP Networks including Hardware (Routers, ...) are widely deployed, all QoS efforts have to layer on top of the existing technology.
There's no real solution to control latency
latency widely dependent on routing, which may be dynamic
There's no real solution to control packet loss
packet loss may occur on any intermediate router
But we can control bandwidth usage!
The sender can limit bandwidth for outgoing streams
Intermediate routers BEFORE a bottleneck can control bandwidth usage
%size 5
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
What can Linux systems do?
Bandwidth limiting at the sender application
not many applications support it
server often out of control (on Internet, ...)
server doesn't know what's between it and the client
Bandwidth control on intermediate router before bottleneck
Ideal case because this is where packet loss would occur
Sophisticated queue scheduling on the outgoing queue
Variety of different queue scheduling algorithms
Flow throttling at the Receiver
Worst case, because influence is limited
Theoretically possible for TCP, no implementation yet.
Ingress qdisc might help
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
Bandwidth limiting at server
Some Internet Servers support bandwidth limiting
ProFTPd (builtin support)
Apache (using contributed mod_bandwidth)
Using those features it is easy to limit
maximum bandwidth used per connection
maximum bandwidth used per client (IP/network)
maximum bandwidth used by one virtual host (webserver/ftpserver)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
Router before bottleneck
%size 4
The router receives more packets on its incoming interface(s) than it can send out on the outgoing interface. It has to build a queue of packets (usually a FIFO one) and starts dropping packets as soon as the queue is full.
%image "qos-1.png" 0 100 30
The idea is to change this queue, thus decide
which packets get enqueued in which order
how many packets get queued
which packets get dropped in case of a filling queue
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
The Linux 2.2 / 2.4 Solution
Packet Scheduling algorithms in the Kernel
CBQ - Class Based Queue
RED - Random Early Drop
SFQ - Stochastic Fairness Queueing
TEQL - True Link Equalizer
TBF - Token Bucket Filter
tc command of iproute2 package for configuration
almost no documentation
very few examples on the internet
Packet Classification
tc builtin classifiers (route, u32, ...)
all iptables/netfilter matches by using fwmark
Conclusion: Linux is the best suited general-purpose operating system for QoS, but almost nobody is using it because of a lack of knowledge.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
Available queuing algorithms
CBQ - Class Based Queue
hierarchical bandwidth classes
used as basis in almost all cases
TBF - Token Bucket Filter
really accurate algorithm
uses a lot of CPU
not possible for high bandwidth links (>1MBit)
SFQ - Stochastic Fairness Queueing
less accurate algorithm
tries to distinguish between individual streams
does round robin between those streams
TEQL - True Link Equalizer
allows to 'bundle' interfaces
RED - Random Early Detect / Drop
simulates a congested link by statistical packet dropping
uses almost no CPU
recommended for high-bandwidth backbones
others (WRR, TCINDEX, DSMARK, ..)
WRR not officially included in kernel, similar to CBQ
others mostly used for DiffServ
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
The big picture
Overview of a packet's journey
%size 3
%font "typewriter"
Incoming Packets
%size 3
%font "typewriter"
|
%size 3
%font "typewriter"
V
%size 3
%font "typewriter"
Packet Classification classify
%size 3
%font "typewriter"
(ipchains/iptables) set nfmark
%size 3
%font "typewriter"
|
%size 3
%font "typewriter"
V
%size 3
%font "typewriter"
Routing decision
%size 3
%font "typewriter"
|
%size 3
%font "typewriter"
V
%size 3
%font "typewriter"
TC filter select classes based on nfmark
%size 3
%font "typewriter"
/ | \
%size 3
%font "typewriter"
/ | \
%size 3
%font "typewriter"
/ | \
%size 3
%font "typewriter"
Different Bandwidth classes bandwidth classes (CBQ)
%size 3
%font "typewriter"
\ | /
%size 3
%font "typewriter"
\ | /
%size 3
%font "typewriter"
\ | /
%size 3
%font "typewriter"
Enqueuing output queue discipline
%size 3
%font "typewriter"
|
%size 3
%font "typewriter"
V
%size 3
%font "typewriter"
Outgoing packets
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
Example scenario using CBQ
%size 4
Let's assume we have a link with 10 MBit maximum available bandwidth.
We offer two major services to the outside world: Anonymous FTP and a Webserver offering important Information.
FTP Bulk data transfers are using up almost all available bandwidth, thus slowing down accesses to our website :(
We want to have FTP transfers use up to 8MBit and reserve 2MBit for WWW.
Implementation uses CBQ for bandwidth divisions.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
Example scenario
%size 3
attach a CBQ to the device
%size 3
%font "typewriter"
tc qdisc add dev eth0 root handle 10: cbq
bandwidth 10Mbit avpkt 1000
%size 3
%font "standard"
create CBQ classes
%size 3
%font "typewriter"
tc class add dev eth0 parent 10:0 classid 10:1 cbq
bandwidth 10MBit rate 10MBit allot 1514
weight 1Mbit prio 8 maxburst 20 avpkt 1000
tc class add dev eth0 parent 10:1 classid 10:100 cbq
bandwidth 10MBit rate 8MBit allot 1514
weight 800kbit prio 5 maxburst 20 avpkt 1000 bounded
tc class add dev eth0 parent 10:1 classid 10:200 cbq
bandwidth 10MBit rate 2MBit allot 1514
weight 200kbit prio 5 maxburst 20 avpkt 1000 bounded
%size 3
%font "standard"
add filter rules
%size 3
%font "typewriter"
tc filter add dev eth0 parent 10:1 protocol ip handle 6 fw classid 10:100
tc filter add dev eth0 parent 10:1 protocol ip handle 7 fw classid 10:200
iptables -t mangle -A PREROUTING -j MARK -p tcp --sport 20 --set-mark 6
iptables -t mangle -A PREROUTING -j MARK -p tcp ! --sport 20 --set-mark 7
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
Further optimization
%size 4
Now we have achieved bandwidth division between two services.
Within one service, however, one individual user with a high-bandwidth link can still use up most of our bandwidth, slowing down other users.
We can improve this behaviour by changing the scheduling algorithm from its default (FIFO):
%size 3
%font "typewriter"
tc qdisc add dev eth0 parent 10:100 sfq quantum 1514b perturb 15
tc qdisc add dev eth0 parent 10:200 sfq quantum 1514b perturb 15
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
Further reading / Links
Bandwidth limiting on Servers
ProFTPd
http://www.proftpd.net/
Apache mod_bandwidth / mod_bwshare
ftp://ftp.cohprog.com/pub/apache/module/mod_bandwidth.c
http://www.topology.org/src/bwshare/
Queue scheduling
Advanced Routing HOWTO
http://www.ds9a.nl/2.4Routing/
Linux QoS HOWTO
http://www.ittc.ukans.edu/~rsarav/howto/
iproute2+tc
This presentation
Author's Homepage
http://www.gnumonks.org/


View File

@ -0,0 +1,23 @@
Quality of Service in IP Networks
IP networks were designed some 25 years ago. Networks based on TCP/IP are
widely deployed, both as organization-local intranets and as the Internet
itself. The usage patterns of those networks are changing. In particular, new
technologies like voice-over-IP and streaming multimedia applications
place different requirements on the underlying network infrastructure than
bulk data transfers like ftp/www or interactive traffic like telnet/ssh.
Organizations usually run a mixture of different services on their Internet
uplinks or on their organization-internal wide area networks. Bandwidth is
usually a limited resource, so everybody wants to divide bandwidth between
different services according to their specific needs.
Linux has always had a very strong focus on network functionality and has
offered sophisticated means for bandwidth control / QoS since kernel 2.2.
The presentation is organized in the following parts:
Basics of QoS in IP networks
How can Linux help with QoS
Sample scenarios of Linux-based QoS solutions
Overview of advanced concepts (DiffServ, IntServ, RSVP, ...)

View File

@ -0,0 +1,23 @@
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%
%deffont "standard" tfont "VERDANA.TTF"
%deffont "standard-i" tfont "VERDANAI.TTF"
%deffont "thick" tfont "ARIBLK.TTF"
%deffont "typewriter" xfont "courier-medium-r", tfont "courbd.ttf", tmfont "wadalab-gothic.ttf"
%%
%% Default settings per each line numbers.
%%
%default 1 leftfill, size 2, fore "white", back "black", font "thick"
%default 1 bimage "fundo-cnc.png" 1024x768
%default 1 pcache 1 1 0 0
%default 2 size 7, vgap 10, prefix " "
%default 3 size 2, bar "midnightblue", vgap 30
%default 4 size 5, fore "lemon chiffon", vgap 30, prefix " ", font "standard"
%%
%% Default settings that are applied to TAB-indented lines.
%%
%tab 1 size 4, vgap 40, prefix " ", icon arc "tomato" 40
%tab 2 size 4, vgap 20, prefix " ", icon box "spring green" 40
%tab 3 size 3, vgap 20, prefix " ", icon delta3 "white" 40
%tab 4 size 3, vgap 20, prefix " ", icon delta3 "white" 40
%%


View File

@ -0,0 +1,397 @@
%include "cnc-style.mgp"
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%pcache 1 1 0 1
%size 7, font "standard", fore "white", vgap 20, back "black"
%bimage "fundo-cnc.png" 1024x768
%center
%size 7
Quality of Service in IP Networks
%center
%size 4
by
Harald Welte <laforge@conectiva.com>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
Contents
Definition of QoS
Why QoS
IP Networks are not designed for QoS
How to do the impossible
How can Linux-based systems help
Advanced Concepts (DiffServ, IntServ, RSVP, ...)
References / Further Reading
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
Definition of QoS
Provide Service Differentiation
Performance Assurance by
Bandwidth guarantees
for streaming multimedia traffic
prioritizing certain important applications
Latency guarantees
for voice over IP
for interactive character-oriented applications (ssh,telnet)
Packet-loss guarantees
for unreliable layer-4 protocols
to avoid retransmits
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
Why QoS
Decide how and between whom available bandwidth is divided
Limit available bandwidth for certain users / applications
Guarantee bandwidth for certain users / applications
Divide bandwidth more equally between users / applications
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
IP networks not designed for QoS
Properties of IP-based networks:
offer a "best-effort" service
make NO guarantees about
bandwidth
latency
packet loss
provide a non-reliable packet transport
Conclusion: IP networks are not suitable for QoS
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
How to do the Impossible
%size 4
As IP Networks including Hardware (Routers, ...) are widely deployed, all QoS efforts have to layer on top of the existing technology.
There's no real solution to control latency
latency largely depends on routing, which may be dynamic
There's no real solution to control packet loss
packet loss may occur on any intermediate router
But we can control bandwidth usage!
The sender can limit bandwidth for outgoing streams
Intermediate routers BEFORE a bottleneck can control bandwidth usage
%size 5
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
What can Linux systems do?
Bandwidth limiting at the sender application
not many applications support it
server often outside our control (on the Internet, ...)
server doesn't know what's between it and the client
Bandwidth control on intermediate router before bottleneck
Ideal case because this is where packet loss would occur
Sophisticated queue scheduling on the outgoing queue
Variety of different queue scheduling algorithms
Flow throttling at the Receiver
Worst case, because influence is limited
Theoretically possible for TCP, no implementation yet.
Ingress qdisc might help
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
Bandwidth limiting at server
Some Internet Servers support bandwidth limiting
ProFTPd (builtin support)
Apache (using contributed mod_bandwidth)
Using those features it is easy to limit
maximum bandwidth used per connection
maximum bandwidth used per client (IP/network)
maximum bandwidth used by one virtual host (webserver/ftpserver)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
Router before bottleneck
%size 4
The router receives more packets on its incoming interface(s) than it can send out on the outgoing interface. It has to build a queue of packets (usually a FIFO one) and starts dropping packets as soon as the queue is full.
%image "qos-1.png" 0 100 30
The idea is to change this queue and thus decide
which packets get enqueued in which order
how many packets get queued
which packets get dropped in case of a filling queue
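The tail-drop behaviour of a plain FIFO described above can be sketched as a small simulation. This is only an illustrative sketch, not kernel code: the function name taildrop_sim and its four parameters (queue capacity, packets arriving per tick, packets sent per tick, number of ticks) are assumptions made up for this example.

```shell
# Simulate a FIFO with tail drop: arrivals that do not fit are dropped.
# Usage: taildrop_sim CAPACITY ARRIVALS_PER_TICK SENT_PER_TICK TICKS
# Prints: queued=<n> dropped=<n>
# The body runs in a subshell ( ... ) so its variables stay local.
taildrop_sim() (
    capacity=$1; arrive=$2; send=$3; ticks=$4
    queued=0; dropped=0; t=0
    while [ "$t" -lt "$ticks" ]; do
        # Enqueue as many arrivals as fit; the rest are dropped.
        free=$((capacity - queued))
        if [ "$arrive" -gt "$free" ]; then
            dropped=$((dropped + arrive - free))
            queued=$capacity
        else
            queued=$((queued + arrive))
        fi
        # The outgoing interface drains at most SENT_PER_TICK packets.
        if [ "$send" -gt "$queued" ]; then
            queued=0
        else
            queued=$((queued - send))
        fi
        t=$((t + 1))
    done
    echo "queued=$queued dropped=$dropped"
)

# 5 packets in and 3 packets out per tick, queue of 10, for 10 ticks
taildrop_sim 10 5 3 10
```

Once the queue has filled, every tick drops the difference between arrival and departure rate, regardless of which connection the packets belong to. Changing the queue discipline is what lets us influence that choice.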
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
The Linux 2.2 / 2.4 Solution
Packet Scheduling algorithms in the Kernel
CBQ - Class Based Queueing
RED - Random Early Detection
SFQ - Stochastic Fairness Queueing
TEQL - True Link Equalizer
TBF - Token Bucket Filter
tc command of iproute2 package for configuration
almost no documentation
very few examples on the internet
Packet Classification
tc builtin classifiers (route, u32, ...)
all iptables/netfilter matches by using fwmark
Conclusion: Linux is the best-suited general-purpose operating system for QoS, but almost nobody is using it because of a lack of knowledge.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
Available queuing algorithms
CBQ - Class Based Queueing
hierarchical bandwidth classes
used as basis in almost all cases
TBF - Token Bucket Filter
really accurate algorithm
uses a lot of CPU
not possible for high bandwidth links (>1MBit)
SFQ - Stochastic Fairness Queueing
less accurate algorithm
tries to distinguish between individual streams
does round robin between those streams
TEQL - True Link Equalizer
allows 'bundling' of interfaces
RED - Random Early Detection / Drop
simulates a congested link by statistical packet dropping
uses almost no CPU
recommended for high-bandwidth backbones
others (WRR, TCINDEX, DSMARK, ..)
WRR not officially included in kernel, similar to CBQ
others mostly used for DiffServ
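The token bucket idea behind TBF can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the kernel algorithm: the function name tbf_sim is hypothetical, one packet arrives per tick, and a packet that does not fit is only reported as "hold", whereas a real shaper would delay it until enough tokens have accumulated.

```shell
# Token bucket sketch: tokens refill by RATE per tick, capped at BURST.
# Usage: tbf_sim RATE BURST SIZE...   (one packet of SIZE per tick)
# Prints one word per packet: "send" if enough tokens, otherwise "hold".
# The body runs in a subshell ( ... ) so its variables stay local.
tbf_sim() (
    rate=$1; burst=$2; shift 2
    tokens=$burst; out=""
    for size in "$@"; do
        if [ "$size" -le "$tokens" ]; then
            # Enough tokens: spend them and let the packet through.
            tokens=$((tokens - size))
            out="$out send"
        else
            out="$out hold"
        fi
        # Refill for the next tick, never exceeding the bucket size.
        tokens=$((tokens + rate))
        if [ "$tokens" -gt "$burst" ]; then
            tokens=$burst
        fi
    done
    echo "${out# }"
)

# rate 100 per tick, bucket of 300, four packets
tbf_sim 100 300 200 200 200 50
```

The bucket size bounds how large a burst may pass at once, while the refill rate bounds the long-term average.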
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
The big picture
Overview of a packet's journey
%size 3
%font "typewriter"
Incoming Packets
%size 3
%font "typewriter"
|
%size 3
%font "typewriter"
V
%size 3
%font "typewriter"
Packet Classification classify
%size 3
%font "typewriter"
(ipchains/iptables) set nfmark
%size 3
%font "typewriter"
|
%size 3
%font "typewriter"
V
%size 3
%font "typewriter"
Routing decision
%size 3
%font "typewriter"
|
%size 3
%font "typewriter"
V
%size 3
%font "typewriter"
TC filter select classes based on nfmark
%size 3
%font "typewriter"
/ | \
%size 3
%font "typewriter"
/ | \
%size 3
%font "typewriter"
/ | \
%size 3
%font "typewriter"
Different Bandwidth classes bandwidth classes (CBQ)
%size 3
%font "typewriter"
\ | /
%size 3
%font "typewriter"
\ | /
%size 3
%font "typewriter"
\ | /
%size 3
%font "typewriter"
Enqueuing output queue discipline
%size 3
%font "typewriter"
|
%size 3
%font "typewriter"
V
%size 3
%font "typewriter"
Outgoing packets
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
Example scenario using CBQ
%size 4
Let's assume we have a link with 10 MBit maximum available bandwidth.
We offer two major services to the outside world: Anonymous FTP and a Webserver offering important Information.
FTP Bulk data transfers are using up almost all available bandwidth, thus slowing down accesses to our website :(
We want to have FTP transfers use up to 8MBit and reserve 2MBit for WWW.
Implementation uses CBQ for bandwidth divisions.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
Example scenario
%size 3
attach a CBQ to the device
%size 3
%font "typewriter"
tc qdisc add dev eth0 root handle 10: cbq
bandwidth 10Mbit avpkt 1000
%size 3
%font "standard"
create CBQ classes
%size 3
%font "typewriter"
tc class add dev eth0 parent 10:0 classid 10:1 cbq
bandwidth 10MBit rate 10MBit allot 1514
weight 1Mbit prio 8 maxburst 20 avpkt 1000
tc class add dev eth0 parent 10:1 classid 10:100 cbq
bandwidth 10MBit rate 8MBit allot 1514
weight 800kbit prio 5 maxburst 20 avpkt 1000 bounded
tc class add dev eth0 parent 10:1 classid 10:200 cbq
bandwidth 10MBit rate 2MBit allot 1514
weight 200kbit prio 5 maxburst 20 avpkt 1000 bounded
%size 3
%font "standard"
add filter rules
%size 3
%font "typewriter"
tc filter add dev eth0 parent 10:1 protocol ip handle 6 fw classid 10:100
tc filter add dev eth0 parent 10:1 protocol ip handle 7 fw classid 10:200
iptables -t mangle -A PREROUTING -j MARK -p tcp --sport 20 --set-mark 6
iptables -t mangle -A PREROUTING -j MARK -p tcp ! --sport 20 --set-mark 7
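In the class commands above, each class is given a weight of one tenth of its rate. A dry-run helper that only prints the tc command for one bounded child class makes it easy to check these numbers before applying anything. This is a minimal sketch: the function name cbq_class_cmd and its argument order are assumptions made for this example, while the tc options are copied from the commands above.

```shell
# Print (do not execute) the tc command for one bounded CBQ child class.
# Usage: cbq_class_cmd DEVICE CLASSID RATE_IN_KBIT
# Assumes the parent class 10:1 and the 10 Mbit link from the example.
# The body runs in a subshell ( ... ) so its variables stay local.
cbq_class_cmd() (
    dev=$1; classid=$2; rate_kbit=$3
    # Same ratio as on the slide: weight is one tenth of the rate.
    weight_kbit=$((rate_kbit / 10))
    echo "tc class add dev $dev parent 10:1 classid $classid cbq" \
         "bandwidth 10Mbit rate ${rate_kbit}kbit allot 1514" \
         "weight ${weight_kbit}kbit prio 5 maxburst 20 avpkt 1000 bounded"
)

# FTP class (8 Mbit) and WWW class (2 Mbit) from the example scenario
cbq_class_cmd eth0 10:100 8000
cbq_class_cmd eth0 10:200 2000
```

Because the helper only echoes the command, it can be run without root privileges; the output can be reviewed and then pasted into a shell on the router.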
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
Further optimization
%size 4
Now we have achieved bandwidth division between two services.
Within one service, however, one individual user with a high-bandwidth link can still use up most of our bandwidth, slowing down other users.
We can improve this behaviour by changing the scheduling algorithm from its default (FIFO):
%size 3
%font "typewriter"
tc qdisc add dev eth0 parent 10:100 sfq quantum 1514b perturb 15
tc qdisc add dev eth0 parent 10:200 sfq quantum 1514b perturb 15
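The effect of SFQ inside one class can be sketched as round robin between per-stream queues. This is a deliberately simplified illustration: real SFQ hashes streams into a limited number of buckets and periodically changes the hash (the perturb 15 above), whereas this sketch assumes the streams are already separated. The function name rr_order and the NAME:COUNT argument format are hypothetical.

```shell
# Round robin between streams: one packet per stream per round.
# Usage: rr_order NAME:PACKETS...   (names must not contain spaces)
# Prints the order in which the packets would be sent.
# The body runs in a subshell ( ... ) so its variables stay local.
rr_order() (
    out=""
    while [ "$#" -gt 0 ]; do
        next=""
        for spec in "$@"; do
            name=${spec%%:*}
            count=${spec##*:}
            # Send one packet of this stream, if it has any left.
            if [ "$count" -gt 0 ]; then
                out="$out $name"
                count=$((count - 1))
            fi
            # Streams with packets left take part in the next round.
            if [ "$count" -gt 0 ]; then
                next="$next $name:$count"
            fi
        done
        # Word splitting is intended: rebuild the argument list.
        set -- $next
    done
    echo "${out# }"
)

# Stream A has 3 packets queued, B has 1, C has 2
rr_order A:3 B:1 C:2
```

With a plain FIFO, stream A's three packets could leave back to back; with round robin, B and C each get a turn before A sends its second packet.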
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
QoS in IP Networks
Further reading / Links
Bandwidth limiting on Servers
ProFTPd
http://www.proftpd.net/
Apache mod_bandwidth / mod_bwshare
ftp://ftp.cohprog.com/pub/apache/module/mod_bandwidth.c
http://www.topology.org/src/bwshare/
Queue scheduling
Advanced Routing HOWTO
http://www.ds9a.nl/2.4Routing/
Linux QoS HOWTO
http://www.ittc.ukans.edu/~rsarav/howto/
iproute2+tc
This presentation
Author's Homepage
http://www.gnumonks.org/

View File

@ -0,0 +1,611 @@
%!PS-Adobe-2.0 EPSF-2.0
%%Title: /laforge/home/laforge/incoming/qos-1
%%Creator: Dia v0.86
%%CreationDate: Mon Apr 2 16:14:45 2001
%%For: a user
%%Magnification: 1.0000
%%Orientation: Portrait
%%BoundingBox: 0 0 1356 288
%%Pages: 1
%%BeginSetup
%%EndSetup
%%EndComments
[ /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
/.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
/.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
/.notdef /.notdef /space /exclam /quotedbl /numbersign /dollar /percent /ampersand /quoteright
/parenleft /parenright /asterisk /plus /comma /hyphen /period /slash /zero /one
/two /three /four /five /six /seven /eight /nine /colon /semicolon
/less /equal /greater /question /at /A /B /C /D /E
/F /G /H /I /J /K /L /M /N /O
/P /Q /R /S /T /U /V /W /X /Y
/Z /bracketleft /backslash /bracketright /asciicircum /underscore /quoteleft /a /b /c
/d /e /f /g /h /i /j /k /l /m
/n /o /p /q /r /s /t /u /v /w
/x /y /z /braceleft /bar /braceright /asciitilde /.notdef /.notdef /.notdef
/.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
/.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
/.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
/space /exclamdown /cent /sterling /currency /yen /brokenbar /section /dieresis /copyright
/ordfeminine /guillemotleft /logicalnot /hyphen /registered /macron /degree /plusminus /twosuperior /threesuperior
/acute /mu /paragraph /periodcentered /cedilla /onesuperior /ordmasculine /guillemotright /onequarter /onehalf
/threequarters /questiondown /Agrave /Aacute /Acircumflex /Atilde /Adieresis /Aring /AE /Ccedilla
/Egrave /Eacute /Ecircumflex /Edieresis /Igrave /Iacute /Icircumflex /Idieresis /Eth /Ntilde
/Ograve /Oacute /Ocircumflex /Otilde /Odieresis /multiply /Oslash /Ugrave /Uacute /Ucircumflex
/Udieresis /Yacute /Thorn /germandbls /agrave /aacute /acircumflex /atilde /adieresis /aring
/ae /ccedilla /egrave /eacute /ecircumflex /edieresis /igrave /iacute /icircumflex /idieresis
/eth /ntilde /ograve /oacute /ocircumflex /otilde /odieresis /divide /oslash /ugrave
/uacute /ucircumflex /udieresis /yacute /thorn /ydieresis] /isolatin1encoding exch def
/Times-Roman-latin1
/Times-Roman findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/Times-Italic-latin1
/Times-Italic findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/Times-Bold-latin1
/Times-Bold findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/Times-BoldItalic-latin1
/Times-BoldItalic findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/AvantGarde-Book-latin1
/AvantGarde-Book findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/AvantGarde-BookOblique-latin1
/AvantGarde-BookOblique findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/AvantGarde-Demi-latin1
/AvantGarde-Demi findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/AvantGarde-DemiOblique-latin1
/AvantGarde-DemiOblique findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/Bookman-Light-latin1
/Bookman-Light findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/Bookman-LightItalic-latin1
/Bookman-LightItalic findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/Bookman-Demi-latin1
/Bookman-Demi findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/Bookman-DemiItalic-latin1
/Bookman-DemiItalic findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/Courier-latin1
/Courier findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/Courier-Oblique-latin1
/Courier-Oblique findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/Courier-Bold-latin1
/Courier-Bold findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/Courier-BoldOblique-latin1
/Courier-BoldOblique findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/Helvetica-latin1
/Helvetica findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/Helvetica-Oblique-latin1
/Helvetica-Oblique findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/Helvetica-Bold-latin1
/Helvetica-Bold findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/Helvetica-BoldOblique-latin1
/Helvetica-BoldOblique findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/Helvetica-Narrow-latin1
/Helvetica-Narrow findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/Helvetica-Narrow-Oblique-latin1
/Helvetica-Narrow-Oblique findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/Helvetica-Narrow-Bold-latin1
/Helvetica-Narrow-Bold findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/Helvetica-Narrow-BoldOblique-latin1
/Helvetica-Narrow-BoldOblique findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/NewCenturySchoolbook-Roman-latin1
/NewCenturySchoolbook-Roman findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/NewCenturySchoolbook-Italic-latin1
/NewCenturySchoolbook-Italic findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/NewCenturySchoolbook-Bold-latin1
/NewCenturySchoolbook-Bold findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/NewCenturySchoolbook-BoldItalic-latin1
/NewCenturySchoolbook-BoldItalic findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/Palatino-Roman-latin1
/Palatino-Roman findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/Palatino-Italic-latin1
/Palatino-Italic findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/Palatino-Bold-latin1
/Palatino-Bold findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/Palatino-BoldItalic-latin1
/Palatino-BoldItalic findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/Symbol-latin1
/Symbol findfont
definefont pop
/ZapfChancery-MediumItalic-latin1
/ZapfChancery-MediumItalic findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/ZapfDingbats-latin1
/ZapfDingbats findfont
dup length dict begin
{1 index /FID ne {def} {pop pop} ifelse} forall
/Encoding isolatin1encoding def
currentdict end
definefont pop
/cp {closepath} bind def
/c {curveto} bind def
/f {fill} bind def
/a {arc} bind def
/ef {eofill} bind def
/ex {exch} bind def
/gr {grestore} bind def
/gs {gsave} bind def
/sa {save} bind def
/rs {restore} bind def
/l {lineto} bind def
/m {moveto} bind def
/rm {rmoveto} bind def
/n {newpath} bind def
/s {stroke} bind def
/sh {show} bind def
/slc {setlinecap} bind def
/slj {setlinejoin} bind def
/slw {setlinewidth} bind def
/srgb {setrgbcolor} bind def
/rot {rotate} bind def
/sc {scale} bind def
/sd {setdash} bind def
/ff {findfont} bind def
/sf {setfont} bind def
/scf {scalefont} bind def
/sw {stringwidth pop} bind def
/tr {translate} bind def
/ellipsedict 8 dict def
ellipsedict /mtrx matrix put
/ellipse
{ ellipsedict begin
/endangle exch def
/startangle exch def
/yrad exch def
/xrad exch def
/y exch def
/x exch def /savematrix mtrx currentmatrix def
x y tr xrad yrad sc
0 0 1 startangle endangle arc
savematrix setmatrix
end
} def
/mergeprocs {
dup length
3 -1 roll
dup
length
dup
5 1 roll
3 -1 roll
add
array cvx
dup
3 -1 roll
0 exch
putinterval
dup
4 2 roll
putinterval
} bind def
28.346000 -28.346000 scale
17.845207 437.856740 translate
%%EndProlog
1.000000 1.000000 1.000000 srgb
n -13.077546 -447.950612 m -13.077546 -439.950612 l -7.985470 -439.950612 l -7.985470 -447.950612 l f
0.100000 slw
[] 0 sd
[] 0 sd
0 slj
0.000000 0.000000 0.000000 srgb
n -13.077546 -447.950612 m -13.077546 -439.950612 l -7.985470 -439.950612 l -7.985470 -447.950612 l cp s
1.000000 1.000000 1.000000 srgb
n -2.033689 -446.022040 m -2.033689 -441.959540 l 10.048900 -441.959540 l 10.048900 -446.022040 l f
0.100000 slw
[] 0 sd
[] 0 sd
0 slj
0.000000 0.000000 0.000000 srgb
n -2.033689 -446.022040 m -2.033689 -441.959540 l 10.048900 -441.959540 l 10.048900 -446.022040 l cp s
1.000000 1.000000 1.000000 srgb
n 16.050852 -447.942862 m 16.050852 -439.949112 l 25.116894 -439.949112 l 25.116894 -447.942862 l f
0.100000 slw
[] 0 sd
[] 0 sd
0 slj
0.000000 0.000000 0.000000 srgb
n 16.050852 -447.942862 m 16.050852 -439.949112 l 25.116894 -439.949112 l 25.116894 -447.942862 l cp s
0.100000 slw
[] 0 sd
[] 0 sd
0 slc
0.000000 0.000000 0.000000 srgb
n -16.038172 -440.048589 m -13.070281 -440.011807 l s
0.100000 slw
[] 0 sd
[] 0 sd
0 slc
0.000000 0.000000 0.000000 srgb
n -15.997245 -447.906733 m -13.077546 -447.950612 l s
0.100000 slw
0 slc
[] 0 sd
1.000000 0.000000 0.000000 srgb
n -13.560988 -443.913577 m -16.406523 -443.936733 l s
0 slj
1.000000 0.000000 0.000000 srgb
n -14.362995 -443.670095 m -13.560988 -443.913577 l -14.358927 -444.170079 l f
/Helvetica-Oblique-latin1 ff 0.600000 scf sf
/Helvetica-Oblique-latin1 ff 0.600000 scf sf
1.000000 0.000000 0.000000 srgb
() dup sw 2 div -14.855953 ex sub -444.948149 m gs 1 -1 sc sh gr
1.000000 1.000000 1.000000 srgb
n -4.993883 -443.984335 1.990755 1.984000 0 360 ellipse f
0.100000 slw
[] 0 sd
[] 0 sd
0.000000 0.000000 0.000000 srgb
n -4.993883 -443.984335 1.990755 1.984000 0 360 ellipse cp s
1.000000 1.000000 1.000000 srgb
n 13.037714 -443.959407 2.005400 2.000000 0 360 ellipse f
0.100000 slw
[] 0 sd
[] 0 sd
0.000000 0.000000 0.000000 srgb
n 13.037714 -443.959407 2.005400 2.000000 0 360 ellipse cp s
0.100000 slw
[] 0 sd
[] 0 sd
0 slc
0.000000 0.000000 0.000000 srgb
n -7.985470 -443.950612 m -6.984638 -443.984335 l s
0.100000 slw
[] 0 sd
[] 0 sd
0 slc
0.000000 0.000000 0.000000 srgb
n -3.018922 -443.977661 m -2.033689 -443.990790 l s
0.100000 slw
[] 0 sd
[] 0 sd
0 slc
0.000000 0.000000 0.000000 srgb
n 9.967044 -444.004735 m 11.032314 -443.959407 l s
0.100000 slw
[] 0 sd
[] 0 sd
0 slc
0.000000 0.000000 0.000000 srgb
n 15.043114 -443.959407 m 16.016589 -443.977661 l s
/Courier-latin1 ff 0.800000 scf sf
0.000000 0.000000 0.000000 srgb
(Router) dup sw 2 div -4.832496 ex sub -443.824335 m gs 1 -1 sc sh gr
/Courier-latin1 ff 0.800000 scf sf
0.000000 0.000000 0.000000 srgb
(Router) dup sw 2 div 13.087201 ex sub -443.751407 m gs 1 -1 sc sh gr
1.000000 1.000000 1.000000 srgb
n -8.995481 -447.879407 m -8.995481 -440.039407 l -8.067481 -440.039407 l -8.067481 -447.879407 l f
0.100000 slw
[] 0 sd
[] 0 sd
0 slj
0.000000 0.000000 0.000000 srgb
n -8.995481 -447.879407 m -8.995481 -440.039407 l -8.067481 -440.039407 l -8.067481 -447.879407 l cp s
1.000000 1.000000 1.000000 srgb
n -9.998281 -447.883807 m -9.998281 -440.043807 l -9.070281 -440.043807 l -9.070281 -447.883807 l f
0.100000 slw
[] 0 sd
[] 0 sd
0 slj
0.000000 0.000000 0.000000 srgb
n -9.998281 -447.883807 m -9.998281 -440.043807 l -9.070281 -440.043807 l -9.070281 -447.883807 l cp s
1.000000 1.000000 1.000000 srgb
n -11.022281 -447.851807 m -11.022281 -440.011807 l -10.094281 -440.011807 l -10.094281 -447.851807 l f
0.100000 slw
[] 0 sd
[] 0 sd
0 slj
0.000000 0.000000 0.000000 srgb
n -11.022281 -447.851807 m -11.022281 -440.011807 l -10.094281 -440.011807 l -10.094281 -447.851807 l cp s
1.000000 1.000000 1.000000 srgb
n -12.046281 -447.851807 m -12.046281 -440.011807 l -11.118281 -440.011807 l -11.118281 -447.851807 l f
0.100000 slw
[] 0 sd
[] 0 sd
0 slj
0.000000 0.000000 0.000000 srgb
n -12.046281 -447.851807 m -12.046281 -440.011807 l -11.118281 -440.011807 l -11.118281 -447.851807 l cp s
1.000000 1.000000 1.000000 srgb
n -13.070281 -447.851807 m -13.070281 -440.011807 l -12.142281 -440.011807 l -12.142281 -447.851807 l f
0.100000 slw
[] 0 sd
[] 0 sd
0 slj
0.000000 0.000000 0.000000 srgb
n -13.070281 -447.851807 m -13.070281 -440.011807 l -12.142281 -440.011807 l -12.142281 -447.851807 l cp s
1.000000 1.000000 1.000000 srgb
n 18.115732 -447.883807 m 18.115732 -440.043807 l 19.043732 -440.043807 l 19.043732 -447.883807 l f
0.100000 slw
[] 0 sd
[] 0 sd
0 slj
0.000000 0.000000 0.000000 srgb
n 18.115732 -447.883807 m 18.115732 -440.043807 l 19.043732 -440.043807 l 19.043732 -447.883807 l cp s
1.000000 1.000000 1.000000 srgb
n 21.123732 -447.851807 m 21.123732 -440.011807 l 22.051732 -440.011807 l 22.051732 -447.851807 l f
0.100000 slw
[] 0 sd
[] 0 sd
0 slj
0.000000 0.000000 0.000000 srgb
n 21.123732 -447.851807 m 21.123732 -440.011807 l 22.051732 -440.011807 l 22.051732 -447.851807 l cp s
1.000000 1.000000 1.000000 srgb
n 24.195732 -447.883807 m 24.195732 -440.043807 l 25.123732 -440.043807 l 25.123732 -447.883807 l f
0.100000 slw
[] 0 sd
[] 0 sd
0 slj
0.000000 0.000000 0.000000 srgb
n 24.195732 -447.883807 m 24.195732 -440.043807 l 25.123732 -440.043807 l 25.123732 -447.883807 l cp s
0.100000 slw
[] 0 sd
[] 0 sd
0 slc
0.000000 0.000000 0.000000 srgb
n 25.116894 -439.949112 m 28.034661 -439.966733 l s
0.100000 slw
[] 0 sd
[] 0 sd
0 slc
0.000000 0.000000 0.000000 srgb
n 25.116894 -447.942862 m 27.993733 -447.947661 l s
0.100000 slw
0 slc
[] 0 sd
1.000000 0.000000 0.000000 srgb
n 29.115480 -443.994786 m 25.988269 -444.018589 l s
0 slj
1.000000 0.000000 0.000000 srgb
n 28.313600 -443.750882 m 29.115480 -443.994786 l 28.317406 -444.250868 l f
/Helvetica-Oblique-latin1 ff 0.600000 scf sf
/Helvetica-Oblique-latin1 ff 0.600000 scf sf
1.000000 0.000000 0.000000 srgb
() dup sw 2 div 27.554158 ex sub -444.306679 m gs 1 -1 sc sh gr
/Courier-latin1 ff 0.800000 scf sf
0.000000 0.000000 0.000000 srgb
(receiver) dup sw 2 div 27.851695 ex sub -444.836756 m gs 1 -1 sc sh gr
/Courier-latin1 ff 0.800000 scf sf
0.000000 0.000000 0.000000 srgb
(sender) dup sw 2 div -16.290807 ex sub -444.767845 m gs 1 -1 sc sh gr
1.000000 1.000000 1.000000 srgb
n -1.950016 -445.936335 m -1.950016 -442.064335 l 1.044776 -442.064335 l 1.044776 -445.936335 l f
0.100000 slw
[] 0 sd
[] 0 sd
0 slj
0.000000 0.000000 0.000000 srgb
n -1.950016 -445.936335 m -1.950016 -442.064335 l 1.044776 -442.064335 l 1.044776 -445.936335 l cp s
1.000000 1.000000 1.000000 srgb
n 1.098648 -445.940735 m 1.098648 -442.068735 l 4.073436 -442.068735 l 4.073436 -445.940735 l f
0.100000 slw
[] 0 sd
[] 0 sd
0 slj
0.000000 0.000000 0.000000 srgb
n 1.098648 -445.940735 m 1.098648 -442.068735 l 4.073436 -442.068735 l 4.073436 -445.940735 l cp s
1.000000 1.000000 1.000000 srgb
n 4.117184 -445.940735 m 4.117184 -442.068735 l 7.061168 -442.068735 l 7.061168 -445.940735 l f
0.100000 slw
[] 0 sd
[] 0 sd
0 slj
0.000000 0.000000 0.000000 srgb
n 4.117184 -445.940735 m 4.117184 -442.068735 l 7.061168 -442.068735 l 7.061168 -445.940735 l cp s
1.000000 1.000000 1.000000 srgb
n 7.143023 -445.940735 m 7.143023 -442.068735 l 9.967044 -442.068735 l 9.967044 -445.940735 l f
0.100000 slw
[] 0 sd
[] 0 sd
0 slj
0.000000 0.000000 0.000000 srgb
n 7.143023 -445.940735 m 7.143023 -442.068735 l 9.967044 -442.068735 l 9.967044 -445.940735 l cp s
/Courier-latin1 ff 0.800000 scf sf
0.000000 0.000000 0.000000 srgb
(3) dup sw 2 div 24.644229 ex sub -446.919407 m gs 1 -1 sc sh gr
/Courier-latin1 ff 0.800000 scf sf
0.000000 0.000000 0.000000 srgb
(4) dup sw 2 div 21.892229 ex sub -446.983407 m gs 1 -1 sc sh gr
/Courier-latin1 ff 0.800000 scf sf
0.000000 0.000000 0.000000 srgb
(5) dup sw 2 div 18.564229 ex sub -447.079407 m gs 1 -1 sc sh gr
/Courier-latin1 ff 0.800000 scf sf
0.000000 0.000000 0.000000 srgb
(6) dup sw 2 div 8.485208 ex sub -443.856335 m gs 1 -1 sc sh gr
/Courier-latin1 ff 0.800000 scf sf
0.000000 0.000000 0.000000 srgb
(7) dup sw 2 div 5.522128 ex sub -443.856335 m gs 1 -1 sc sh gr
/Courier-latin1 ff 0.800000 scf sf
0.000000 0.000000 0.000000 srgb
(8) dup sw 2 div 2.544520 ex sub -443.888335 m gs 1 -1 sc sh gr
/Courier-latin1 ff 0.800000 scf sf
0.000000 0.000000 0.000000 srgb
(9) dup sw 2 div -0.305843 ex sub -443.895805 m gs 1 -1 sc sh gr
/Courier-latin1 ff 0.800000 scf sf
0.000000 0.000000 0.000000 srgb
(10) dup sw 2 div -8.541268 ex sub -447.111407 m gs 1 -1 sc sh gr
/Courier-latin1 ff 0.800000 scf sf
0.000000 0.000000 0.000000 srgb
(11) dup sw 2 div -9.501268 ex sub -447.111407 m gs 1 -1 sc sh gr
/Courier-latin1 ff 0.800000 scf sf
0.000000 0.000000 0.000000 srgb
(12) dup sw 2 div -10.557268 ex sub -447.111407 m gs 1 -1 sc sh gr
/Courier-latin1 ff 0.800000 scf sf
0.000000 0.000000 0.000000 srgb
(13) dup sw 2 div -11.581268 ex sub -447.111407 m gs 1 -1 sc sh gr
/Courier-latin1 ff 0.800000 scf sf
0.000000 0.000000 0.000000 srgb
(14) dup sw 2 div -12.605268 ex sub -447.111407 m gs 1 -1 sc sh gr
/Courier-latin1 ff 0.800000 scf sf
0.000000 0.000000 0.000000 srgb
(low bandwidth link) dup sw 2 div 7.664387 ex sub -438.064335 m gs 1 -1 sc sh gr
/Courier-latin1 ff 0.800000 scf sf
0.000000 0.000000 0.000000 srgb
(high bandwidth link) dup sw 2 div -10.882479 ex sub -438.023407 m gs 1 -1 sc sh gr
/Courier-latin1 ff 0.800000 scf sf
0.000000 0.000000 0.000000 srgb
(high bandwidth link) dup sw 2 div 21.653603 ex sub -438.052474 m gs 1 -1 sc sh gr
showpage

Binary file not shown.

View File

@ -0,0 +1,48 @@
Firewalling Basics - Security in IP Networks
============================================
What is a firewall?
What exactly does a firewall do?
What differences are there between firewalls?
What is the difference between packet filters and proxies?
These (and other) questions are addressed by the KNF talk on the
basics of firewalling.
Starting from a basic knowledge of TCP/IP networks and routers,
this talk describes the concepts and strategies underlying firewalling,
as well as their capabilities.
The talk deliberately does not cover specific firewall products or
implementations, so no prior knowledge of using / administering
a firewall is required.
Outline:
- Brief overview of IP routing, TCP, UDP, ICMP
- Packet filters
- How they work
- traditional packet filters (without state)
- stateful firewalling (connection tracking)
- Proxies
- How they work
- 'normal' proxies
- transparent proxies
- Comparison packet filter/proxy
- derived from this
- capabilities
- areas of application
- Network Address Translation (NAT)
- static NAT
- static NAPT
- symmetric NAPT
- masquerading
About the speaker:
Harald Welte has been an active KNF member since 1995 and is currently the
deputy technical contact of the KNF. He is the maintainer of the
netfilter/iptables firewalling subsystem in the Linux 2.4.x and
2.5.x kernels and was substantially involved in its development.

View File

@ -0,0 +1,312 @@
%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%back "blue"
%center
%size 7
TCP/IP Firewalling Basics
%center
%size 4
by
Harald Welte <laforge@sunbeam.franken.de>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalling Basics
Contents
Introduction
Networking Basics
Potential Security Problems
Solution 1: Packet Filters
Solution 2: Proxies
Comparison
Summary
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalling Basics
Introduction
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalling Basics
Networking Basics
7 layer OSI model used to abstract networking protocols
layer 7: application layer: e.g. telnet/ftp
layer 6: presentation layer:
layer 5: session layer:
layer 4: transport layer: e.g. TCP/UDP
layer 3: network layer: e.g. IP
layer 2: data link layer: e.g. Ethernet
layer 1: physical layer: e.g. Wire
Layer 1 + 2 embedded in hardware
Layer 3 + 4 implemented in operating system
Layer 5+ embedded in application program
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalling Basics
Networking Basics
Layer 2: Ethernet
enables two hosts within same physical net to exchange packets
unreliable
addressing granularity: host
fixed hardware addresses (MAC address, 48bit)
Layer 3: Internet Protocol (IP)
enables two hosts in different physical networks to exchange packets
unreliable, best effort
packet reordering
packet loss
addressing granularity: host
logical addresses (IP address, 32bit)
checksum protects only IP header
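The header-only checksum can be illustrated with a minimal Python sketch of the Internet checksum (the one's-complement sum described in RFC 1071). The function names `internet_checksum` and `ipv4_header_ok` and the sample header bytes are assumptions made for this illustration, not code from any particular stack.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement sum over 16-bit big-endian words (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_header_ok(packet: bytes) -> bool:
    """Verify only the IPv4 header; the payload is never looked at."""
    header_len = (packet[0] & 0x0F) * 4      # IHL field counts 32-bit words
    return internet_checksum(packet[:header_len]) == 0
```

Since only the first `header_len` bytes are summed, a corrupted payload passes this check unnoticed; catching that is left to the layer 4 checksums.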
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalling Basics
Networking Basics
Layer 4: User Datagram Protocol (UDP)
unreliable, best effort
addressing granularity: ports (16bit = 65536 values)
optional payload checksum
Layer 4: Transmission Control Protocol (TCP)
provides connection abstraction
reliable
ordering guarantee
retransmissions correct packet loss
flow control
payload checksum protects payload from data corruption
Layer 4: Internet Control Message Protocol (ICMP)
used internally by TCP/IP protocol suite
error messages (e.g. host unreachable)
diagnostics (e.g. ping/pong)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalling Basics
Potential Security Problems
Security issues arise at interconnection of two networks
Traditional Case: IP Router connecting an organization internal network to the Internet
What Security Problem?
organization-internal services exposed to outside network
spoofed (forged) packets to circumvent 'security by address'
even if all internal services secured by authentication, difficult to guarantee security on all internal hosts
Why Firewalling?
to restrict which internal services are exposed to the outside
to restrict which outside services are used by internal users
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalling Basics
Solution 1: Packet Filters
Filter individual packets at network interconnection (Router)
Filter criteria traditionally include
IP source + destination address
TCP/UDP source + destination port
TCP header flags
Filtering rules determine if
packet is allowed to transit interconnection
packet is silently dropped
packet is dropped and error message returned to sender
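The criteria and verdicts above can be sketched as a first-match rule list. This is a toy model in Python, not the interface of any real packet filter; the names `Packet`, `Rule` and `filter_packet` and the default policy are assumptions for this example, and TCP flag matching is left out to keep it short.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    proto: str          # "tcp" or "udp"
    sport: int
    dport: int


@dataclass(frozen=True)
class Rule:
    verdict: str                      # "ACCEPT", "DROP" or "REJECT"
    src_net: str = "0.0.0.0/0"
    dst_net: str = "0.0.0.0/0"
    proto: Optional[str] = None       # None matches any protocol
    dport: Optional[int] = None       # None matches any destination port

    def matches(self, pkt: Packet) -> bool:
        return (
            ip_address(pkt.src) in ip_network(self.src_net)
            and ip_address(pkt.dst) in ip_network(self.dst_net)
            and self.proto in (None, pkt.proto)
            and self.dport in (None, pkt.dport)
        )


def filter_packet(pkt: Packet, rules: list, policy: str = "DROP") -> str:
    """Return the verdict of the first matching rule, else the policy."""
    for rule in rules:
        if rule.matches(pkt):
            return rule.verdict
    return policy
```

In this model "DROP" stands for silently discarding the packet and "REJECT" for discarding it and returning an error message to the sender.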
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalling Basics
Solution 1: Packet Filters
Capabilities
disallow communication between certain IP addresses
disallow communication between certain port numbers
disallow malicious packets, like packets
using source routing IP option
impossible combination of features, like tcp xmas scan
generate log of malicious and/or filtered packets
Limitations
scope limited to individual packets
no ability to look inside packet payload (HTTP 1.1 virtual hosts)
no abstraction of connection, filtering rules needed for both directions
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalling Basics
Solution 1: Packet Filters
Extensions
stateful packet filters (connection tracking)
filtering only needed for connection-initiating packets
all other packets within connection are accepted as part of an already established connection
TCP window tracking
allow filtering not only on source/dest port but also on TCP sequence number
NAT (Network Address Translation)
manipulation of source / destination address
redirect packets to other hosts
'share' one ip address at dialup accounts (masquerading)
connect two networks with overlapping address ranges
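The stateful extension on this slide can be sketched as follows: the rule decision is made only for the connection-initiating packet, and every later packet of the same flow (in either direction) is accepted from the state table. This is a simplified Python model; the class name `StatefulFilter` and the `allow_new` callback are assumptions for this example, and timeouts, TCP state and window tracking are left out.

```python
class StatefulFilter:
    def __init__(self, allow_new):
        # allow_new(proto, src, sport, dst, dport) -> bool is consulted
        # only for packets that would start a new connection.
        self.allow_new = allow_new
        self.connections = set()

    def handle(self, proto, src, sport, dst, dport):
        forward = (proto, src, sport, dst, dport)
        reverse = (proto, dst, dport, src, sport)
        # Packets of a known connection pass in both directions.
        if forward in self.connections or reverse in self.connections:
            return "ACCEPT"
        # Everything else is a connection attempt and must pass the rules.
        if self.allow_new(proto, src, sport, dst, dport):
            self.connections.add(forward)
            return "ACCEPT"
        return "DROP"
```

Only one rule direction has to be written; the reply direction is derived from the tracked state, which is the configuration advantage over a stateless filter.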
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalling Basics
Solution 2: Proxies
A proxy operates at layer 5 and above
Mode of operation
client connects to proxy instead of server
proxy initiates a second, separate connection to server
Proxies are just normal programs implementing a server and a client for a particular application protocol (e.g. HTTP) using operating system mechanisms (like sockets API, winsock, ...)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalling Basics
Solution 2: Proxies
Capabilities
disallow communication between certain IP addresses
disallow communication between certain ports
disallow communication based on packet payload
e.g. pathnames / filenames within HTTP and FTP
e.g. email addresses within SMTP
e.g. hostnames within DNS (www.netzzensur.de)
e.g. badwords ('sex' and 'teen' within same file)
manipulation of packet payload
everything possible...
Limitations
somebody needs to tell client app to connect to proxy instead of server
separate proxies for all used protocols needed
not possible to filter on packet options, etc.
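Payload-based filtering, as listed under the capabilities above, might look like this inside an HTTP proxy. It is a minimal Python sketch, assuming the proxy has already read the complete request head from the client socket; `proxy_allows` and its parameters are hypothetical names for this example.

```python
def proxy_allows(raw_request: bytes, blocked_hosts, blocked_path_prefixes) -> bool:
    """Decide on an HTTP request by looking at its payload, not its packets."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    lines = head.split("\r\n")
    try:
        method, path, version = lines[0].split(" ", 2)
    except ValueError:
        return False                          # not a well-formed request line
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # HTTP/1.1 virtual host: only visible inside the payload.
    if headers.get("host", "").lower() in blocked_hosts:
        return False
    # Pathname-based filtering.
    return not any(path.startswith(prefix) for prefix in blocked_path_prefixes)
```

A call such as `proxy_allows(request, {"www.netzzensur.de"}, ["/private/"])` rejects requests by virtual host name or path, something a per-packet filter cannot do.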
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalling Basics
Solution 2: Proxies
Extensions
Transparent Proxies
accept connections from client independent of dest IP
make reply packets to the client look as if sent by server
possible to implement same transparency towards server
no need to tell clients about proxies anymore!
SOCKS
application protocol independent proxy
one proxy for all application protocols
uses separate protocol between client and proxy
needs explicit support from client application
integrated username/password authentication
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalling Basics
Comparison
Packet Filter
pro
total control on lowest per-packet level
very high performance
possible to implement failover / load balancing
NAT as extension solves address space problem
contra
configuration requires sophisticated knowledge
problems when no state / window tracking used
support for complex protocols (H.323, SIP) difficult to implement
Proxy
pro
no knowledge about layer3/4 protocol needed
configuration very easy
address space automatically separated
integrates easily with other applications like IDS
easy implementation, just normal application programs
contra
separate proxies needed for almost every protocol
bad performance
uses lots of resources (e.g. sockets) on gateway
horribly breaks end-to-end
needs explicit configuration of client apps if not transparent proxy
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalling Basics
Comparison
Transparent Proxy
uses ideas/methods of packet filtering (NAT) to achieve protocol transparence
horrible violation of layering
Stateful Packet Filter
uses ideas of proxies (tracking of higher layer state) to achieve better security and easier configuration
horrible violation of layering
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalling Basics
Conclusion
Conclusion
proxies work for small installations where number of used protocols is small and administrative staff not very experienced
packet filters without state tracking are difficult to configure correctly
packet filters with state tracking are good solution for most usage scenarios: powerful but yet easy to configure correctly
for highest security, best of both worlds can be combined
imagine a stateful bridging packet filter in front of a proxy :)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalling Basics
Thanks
Thanks to
the BBS people, Z-Netz, FIDO, ...
for heavily increasing my computer usage in 1992
KNF
for bringing me in touch with the internet as early as 1995
for providing a playground for technical people
for telling me about the existence of Linux!
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
for implementing (one of?) the world's best TCP/IP stacks
Paul 'Rusty' Russell
for starting the netfilter/iptables project
for trusting me to maintain it today
Linux User Group Nuernberg (ALIGN, LUG-N)
for helping me with my initial Linux problems

View File

@ -0,0 +1,100 @@
- Introduction
- since 1995 member of KNF, now 2nd TC, newsmaster + other stuff
- learned lots of stuff while playing with KNF and own networks
- done weird stuff like UUCP-over-SSL HOWTO :)
- now maintainer of linux firewalling code
- Basics
- Internet as packet switched network
- 7-layer-OSI
- Internetworking using IP
- unreliable, best effort, no ordering guarantees
- Routing within IP
- UDP as stateless protocol
- same characteristics as IP, but
- added ports to multiplex between apps
- optional payload checksum
- TCP as session layer
- providing abstraction of connection
- reliable (payload checksum, retransmissions)
- ordering guarantees
- flow control
- ICMP as helper
- error messages / diagnostics
- absolutely necessary !!
- potential security problems
- spoofed packets
- connections to internal, private services
- difficult to guarantee security on all internal hosts
- restrictions the other way around (for outbound connections)
-
- solution 1: packet filters
- operates on layer 3
- filter packets based on packet header/content
- alternatively also generate ICMP errors, RST packets
- extensions / derivatives
- stateful firewalling
- transparent firewalls (firewalling bridges)
- solution 2: proxies
- layer 5+
- description
- explicit configuration of all clients
- extensions / derivatives
- transparent proxies
- SOCKS
- needs explicit application support
- solves authentication problem
- not used widely
- should be offered in addition to proxies
to give users a chance of running 'weird' protocols
without httptunnel.
- comparison
- proxy
+ no knowledge about protocol headers needed
+ configuration extremely easy
+ address space separated (no need for NAT)
+ integrates easily with other applications like IDS
+ easy implementation, just normal programs
- separate proxies needed for almost every protocol
- bad performance
- uses lots of resources (i.e. sockets) on gateway
- horribly breaks end-to-end
- needs configuration of enduser applications, if not
used as transparent proxies
- packet filter
+ total control on lowest per-packet level
+ very high performance
+ possible to implement failover / load balancing
+ NAT as extension solves address space problem
- configuration requires high knowledge on TCP/IP protocols
- problems when no state/window tracking is done
- support for complex protocols (h.323, SIP, ...) difficult
to implement
- transparent proxies
- use some ideas of packet filtering / NAT to achieve
transparency
- stateful packet filters
- use some ideas of proxies (tracking of higher layer state)
to achieve better security and easier configuration
- summary:
- proxies work for small installations where number of
to-be-supported protocols is small and administrative staff
not very experienced
- packet filters without state tracking difficult to be
configured correctly
- packet filters with state tracking are a good solution for
most usage scenarios: powerful, but yet easy to configure
right :)
- for highest security, they can be combined: imagine a
stateful bridging packet filter in front of proxies :)

View File

@ -0,0 +1,243 @@
%include "default.mgp"
%default 1 bgrad
%deffont "typewriter" tfont "MONOTYPE.TTF"
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%back "blue"
%center
%size 7
IPv6 Introduction
%center
%size 4
by
Harald Welte <laforge@rfc2460.org>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
IPv6 Introduction
What? Why?
What is IPv6?
Successor of currently used IP Version 4
Specified 1995 in RFC 1883, revised 1998 as RFC 2460
Why?
Address space in IPv4 too small
Routing tables too large
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
IPv6 Introduction
Advantages
Advantages
stateless autoconfiguration
multicast obligatory
IPsec obligatory
Mobile IP
Address renumbering
Multihoming
Multiple address scopes
smaller routing tables through aggregatable allocation
simplified l3 header
64bit aligned
no checksum (l4 or l2)
no fragmentation at router
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
IPv6 Introduction
Disadvantages
Disadvantages
Not widely deployed yet
In most cases access only possible using manual tunnel
OS support not ideal in most cases
W2k: IPv6 available from MSi
Windows XP: IPv6 included
Linux has support, but some flaws (no IPsec, ndisc not fully implemented, ...)
*BSD: full support (KAME)
Solaris: full support
Application support not ideal in most cases
not supported: postfix, current squid, inn, proftpd,
supported: bind8/9, apache, openssh, xinetd, rsync, squid-2.5(CVS), exim, zmailer, sendmail, qmail, inn-2.4(CVS), zebra
Conclusion: Circular dependencies
no application support without OS support
no good OS support without applications
no wide deployment without applications
no applications without deployment
no deployment without applications
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
IPv6 Introduction
Deployment
Experimental (6bone)
Experimental 6bone (3ffe::) has been active since 1995.
Uses slightly different Addressing Architecture (RFC2471)
Production (2001::)
Initial TLA's and sub-TLA's assigned in Sept 2000
Mostly used in education+research
Some commercial ISP's in .de are offering production prefixes
Why isn't IPv6 widely used yet?
No immediate need in Europe / North America
Big deployment cost at ISP's (Training, Routers, ..)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
IPv6 Introduction
Technical: Address Space
IP Version 6 Addressing Architecture (RFC2373)
Format prefix, variable length
001: RFC2374 addresses, 1/8 of address space
0000 001: Reserved for NSAP (1/128)
0000 010: Reserved for IPX (1/128)
1111 1110 10: link-local unicast addresses (1/1024)
1111 1110 11: site-local unicast addresses (1/1024)
1111 1111 flgs scop: multicast addresses
flgs (0: well-known, 1:transient)
scop (0: reserved, 1: node-local, 2: link-local, 5: site-local, 8: organization-local, e: global scope, f: reserved)
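The format prefixes and the multicast flgs/scop nibbles listed above can be checked mechanically. The following Python sketch uses only the standard `ipaddress` module; the table contents are taken from this slide, while the function names `classify` and `multicast_info` are assumptions for this example.

```python
from ipaddress import IPv6Address

# Binary format prefixes as listed on the slide (RFC2373).
FORMAT_PREFIXES = [
    ("1111111010", "link-local unicast"),
    ("1111111011", "site-local unicast"),
    ("11111111", "multicast"),
    ("0000001", "reserved for NSAP"),
    ("0000010", "reserved for IPX"),
    ("001", "aggregatable global unicast"),
]

SCOPES = {0x1: "node-local", 0x2: "link-local", 0x5: "site-local",
          0x8: "organization-local", 0xE: "global"}


def classify(addr: str) -> str:
    """Match the leading bits of the address against the format prefixes."""
    bits = format(int(IPv6Address(addr)), "0128b")
    for prefix, name in FORMAT_PREFIXES:
        if bits.startswith(prefix):
            return name
    return "other"


def multicast_info(addr: str) -> tuple:
    """Decode the flgs and scop nibbles that follow the 1111 1111 prefix."""
    value = int(IPv6Address(addr))
    if value >> 120 != 0xFF:
        raise ValueError("not a multicast address")
    flags = (value >> 116) & 0xF
    scope = (value >> 112) & 0xF
    lifetime = "transient" if flags & 0x1 else "well-known"
    return lifetime, SCOPES.get(scope, "reserved or unassigned")
```

For example, `ff02::1` decodes as a well-known group with link-local scope.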
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
IPv6 Introduction
Technical: Address Space
Aggregatable Global Unicast Address Format (RFC2374)
3bit FP (format prefix = 001)
13bit TLA ID - Top-Level Aggregation ID
13bit Sub-TLA - Sub-TLA Aggregation ID
19bit NLA - Next-Level Aggregation ID
16bit SLA - Site-Level Aggregation ID
64bit Interface ID - derived from 48bit ethernet MAC
Initial subTLA-Assignments
2001:0000::/29 - 2001:01f8::/29 IANA
2001:0200::/29 - 2001:03f8::/29 APNIC
2001:0400::/29 - 2001:05f8::/29 ARIN
2001:0600::/29 - 2001:07f8::/29 RIPE
loopback ::1
unspecified: ::0
embedded ipv4
IPv4-compatible address: 0::xxxx:xxxx
IPv4-mapped IPv6 address (IPv4 only node): 0::ffff:xxxx:xxxx
anycast
allocated from unicast addresses
only subnet-router anycast address predefined (prefix::0000)
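The aggregatable global unicast layout at the top of this slide (3 + 13 + 13 + 19 + 16 + 64 = 128 bits) can be made concrete with a small Python sketch that splits an address into those fields. The field widths come from the slide; the function name `split_global_unicast` and the sample address in the note below are assumptions for illustration.

```python
from ipaddress import IPv6Address

# (field name, width in bits), most significant field first.
FIELDS = [("fp", 3), ("tla", 13), ("sub_tla", 13),
          ("nla", 19), ("sla", 16), ("interface_id", 64)]


def split_global_unicast(addr: str) -> dict:
    """Cut a 128-bit address into the aggregation fields listed above."""
    value = int(IPv6Address(addr))
    shift = 128
    fields = {}
    for name, width in FIELDS:
        shift -= width
        fields[name] = (value >> shift) & ((1 << width) - 1)
    return fields
```

With this layout an address such as `2001:638:500:101::1` yields FP 1 (binary 001), TLA 1, and a Sub-TLA value that falls into the 2001:0600::/29 - 2001:07f8::/29 block listed above.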
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
IPv6 Introduction
Technical: Header
%font "typewriter"
%size 3
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version| Traffic Class | Flow Label |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Payload Length | Next Header | Hop Limit |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ Source Address +
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ Destination Address +
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
%font "standard"
4bit Version: 6
8bit Traffic Class
20bit Flow Label
16bit Payload Length (incl. extension hdrs)
8bit next header (same values as in IPv4, RFC1700 et seq.)
8bit hop limit (TTL)
128bit source address
128bit dest address
extension headers:
hop-by-hop options
routing
fragment
destination options
IPsec (AH/ESP)
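The fixed 40-byte header described above can be decoded with a few lines of Python. This is a minimal sketch using the standard `struct` and `ipaddress` modules; the function name `parse_ipv6_header` and the dictionary keys are assumptions for this example, and extension headers are not followed.

```python
import struct
from ipaddress import IPv6Address


def parse_ipv6_header(packet: bytes) -> dict:
    """Decode the fixed IPv6 header; the rest of the packet is ignored."""
    if len(packet) < 40:
        raise ValueError("IPv6 header is 40 bytes long")
    first_word, payload_len, next_header, hop_limit = struct.unpack(
        "!IHBB", packet[:8])
    return {
        "version": first_word >> 28,                  # 4 bit
        "traffic_class": (first_word >> 20) & 0xFF,   # 8 bit
        "flow_label": first_word & 0xFFFFF,           # 20 bit
        "payload_length": payload_len,                # incl. extension headers
        "next_header": next_header,
        "hop_limit": hop_limit,
        "src": IPv6Address(packet[8:24]),
        "dst": IPv6Address(packet[24:40]),
    }
```

All fields sit at fixed offsets and there is no header checksum to verify, which is part of what "simplified l3 header" means in practice.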
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
IPv6 Introduction
Technical: Layer 2 <-> Address mapping
Ethernet: No more ARP, everything within ICMPv6
No Broadcast, everything built using multicast.
all-nodes multicast address ff02::1
all-routers multicast address ff02::2
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
IPv6 Introduction
Technical: Address Configuration
router discovery
routers periodically send router advertisements
hosts can send router solicitation to explicitly request RADV
prefix discovery
router includes prefix(es) in ICMPv6 router advertisements
other nodes receive prefix advertisements and derive their final address from prefix + EUI64 of MAC address
neighbour discovery
machines can discover their neighbours without advertising router
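The "prefix + EUI64 of MAC address" step can be sketched in Python. The interface ID is built in the modified EUI-64 way: `ff:fe` is inserted in the middle of the 48bit MAC and the universal/local bit of the first octet is inverted. The function names and the example prefix and MAC below are assumptions for illustration.

```python
from ipaddress import IPv6Address, IPv6Network


def eui64_interface_id(mac: str) -> bytes:
    """Build the 64-bit interface ID from a 48-bit MAC address."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Flip the universal/local bit and insert ff:fe in the middle.
    return bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:]


def autoconfigured_address(prefix: str, mac: str) -> IPv6Address:
    """Combine an advertised /64 prefix with the EUI-64 interface ID."""
    net = IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("stateless autoconfiguration expects a /64 prefix")
    interface_id = int.from_bytes(eui64_interface_id(mac), "big")
    return IPv6Address(int(net.network_address) | interface_id)
```

For a host with MAC `00:11:22:33:44:55` and an advertised prefix `2001:db8:1:2::/64`, this gives `2001:db8:1:2:211:22ff:fe33:4455`; the same interface ID under `fe80::/64` gives the link-local address.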
%page
IPv6 Introduction
How to get connected
In case of static IPv4 address
SIT (ipv6-in-ipv4) tunnel possible
http://www.join.uni-muenster.de/
In case of dynamic IPv4 address
ppp (ipv6 over ppp) tunnel (pptp, l2tp) possible
sitctrl (linux <-> linux)
atncp (*NIX), http://www.dhis.org/atncp/
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
IPv6 Introduction
Further Reading
http://www.ipv6-net.org/ (German IPv6 forum)
http://www.6bone.net/ (ipv6 testing backbone)
http://www.freenet6.net/ (free tunnel broker)
http://hs247.com/ (list of tunnel brokers)
http://www.bieringer.de/ (ipv6 for linux)
http://www.linux-ipv6.org/ (improved ipv6 for linux)
http://www.kame.net/ (ipv6 for *BSD)
http://www.join.uni-muenster.de/ (ipv6 at DFN/WiN)
http://www.gnumonks.org/ (slides of this presentation)
And of course, all relevant RFC's

2002/ipv6-ccc2002/topics Normal file
View File

@ -0,0 +1,114 @@
What is IPv6?
Successor of currently used IP Version 4
Specified 1995 in RFC 1883, revised 1998 as RFC 2460
Why?
Address space in IPv4 too small
Advantages?
stateless autoconfiguration
multicast obligatory
IPsec obligatory
Mobile IP
QoS ?
Address Renumbering?
Multihoming?
AddressScopes?
smaller routing tables through G
simplified l3 header
64bit aligned
no checksum (l4 or l2)
no fragmentation at router
Disadvantages
Not widely deployed yet
In most cases access only possible using manual tunnel
OS support not ideal in most cases
W2k?
Linux has support, but no IPsec in official tree -> USAGI
*BSD: full support (KAME)
Application support not ideal in most cases
not supported:
supported: bind8/9, apache
Deployment
Experimental 6bone (3ffe::) has been active since 199x.
Uses slightly different Addressing Architecture (RFC2471)
Why isn't it widely used yet?
No immediate need in Europe / North America
Big deployment cost at ISP's (Training, Routers, ..)
Technical: Address Space
IP Version 6 Addressing Architecture (RFC2373)
Format prefix, variable length
001: RFC2374 addresses, 1/8 of address space
0000 001: Reserved for NSAP (1/128)
0000 010: Reserved for IPX (1/128)
1111 1110 10: link-local unicast addresses (1/1024)
1111 1110 11: site-local unicast addresses (1/1024)
1111 1111: multicast addresses
1111 1111 flgs scop
flgs (0: well-known, 1:transient)
scop (0: reserved, 1: node-local, 2: link-local, 5: site-local, 8: organization-local, e: global scope, f: reserved)
Aggregatable Global Unicast Address Format (RFC2374)
3bit FP (format prefix = 001)
13bit TLA ID - Top-Level Aggregation ID
13bit Sub-TLA - Sub-TLA Aggregation ID
19bit NLA - Next-Level Aggregation ID
16bit SLA - Site-Level Aggregation ID
64bit Interface ID - derived from 48bit ethernet MAC
2001:0000::/29 - 2001:01f8::/29 IANA
2001:0200::/29 - 2001:03f8::/29 APNIC
2001:0400::/29 - 2001:05f8::/29 ARIN
2001:0600::/29 - 2001:07f8::/29 RIPE
loopback
::1
unspecified:
::0
embedded ipv4
IPv4-compatible address: 0::xxxx:xxxx
IPv4-mapped IPv6 address (IPv4 only node): 0::ffff:xxxx:xxxx
anycast
allocated from unicast addresses
only subnet-router anycast address predefined (prefix::0000)
Technical: Header
4bit Version: 6
8bit Traffic Class
20bit Flow Label
16bit Payload Length (incl. extension hdrs)
8bit next header (same values as in IPv4, RFC1700 et seq.)
8bit hop limit (TTL)
128bit source address
128bit dest address
extension headers:
hop-by-hop options
routing
fragment
destination options
authentication
encapsulating security payload
Technical: Layer 2 <-> Address mapping
Ethernet: No more ARP, everything within ICMPv6
No Broadcast, everything built using multicast.
all-nodes multicast address ff02::1
all-routers multicast address ff02::2
Technical: Address Configuration
router discovery
routers periodically send router advertisements
hosts can send router solicitation to explicitly request RADV
prefix discovery
router includes prefix(es) in ICMPv6 router advertisements
other nodes receive prefix advertisements and derive their final address from prefix + EUI64 of MAC address

View File

@ -0,0 +1,25 @@
Future directions of linux firewalling
Harald Welte, netfilter core team & Astaro AG
The Linux 2.4.x series provided a fundamental redesign of the packet filtering
and NAT framework, called netfilter/iptables. This flexible and modular
framework still had its limitations. This BOF will discuss the recent and
upcoming changes during the 2.4.x kernel series, as well as planned and
partially implemented changes/extensions for the 2.5.x kernel series.
Topics covered:
2.4.x stuff:
- The newnat API; supporting connection tracking and NAT for complex protocols
like H.323
- Accessing connection tracking table entries from userspace: ctnetlink
- Packet filtering and even NAT on a bridge
2.5.x stuff:
- libiptables: Providing a flexible and extensible API towards all iptables
features
- pkttables: Creating a layer-3-protocol independent layer for rule tables;
unifying iptables, ip6tables and arptables.
- nfnetlink: Move all netfilter/iptables related kernel/userspace communication
towards netlink

View File

@ -0,0 +1,374 @@
%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%back "blue"
%center
%size 7
The future of Linux packet filtering
targeted for kernel 2.6 and beyond
%center
%size 4
by
Harald Welte <laforge@gnumonks.org>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Contents
Problems with current 2.4.x netfilter/iptables
Solution to code replication
Solution for dynamic rulesets
Solution for API to GUI's and other management programs
HA for stateful firewalling
What's special about firewalling HA
Poor man's failover
Real state replication
Other current work
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Problems with 2.4.x netfilter/iptables
code replication between iptables/ip6tables/arptables
iptables was never meant for other protocols, but people did copy+paste 'ports'
replication of
core kernel code
layer 3 independent matches (mac, interface, ...)
userspace library (libiptc)
userspace tool (iptables)
userspace plugins (libipt_xxx.so)
doesn't suit the needs for dynamically changing rulesets
dynamic rulesets becoming more common (service selection, IDS)
a whole table is created in userspace and sent as blob to kernel
for every ruleset change the whole table needs to be copied to userspace and back
inside kernel consistency checks on whole table, loop detection
too extensible for writing any forward-compatible GUI
new extensions showing up all the time
a frontend would need to know about the options and use of a new extension
thus frontends are always incomplete and out-of-date
no high-level API other than piping to iptables-restore
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Reducing code replication
code replication is a real problem: unclean, bugfixes missed
we need layer 3 independent layer for
submitting rules to the kernel
traversing packet-rulesets supporting match/target modules
registering matches/targets
layer 3 specific (like matching ipv4 address)
layer 3 independent (like matching MAC address)
solution
pkt_tables inside kernel
pkt_tables_ipv4 registers layer 3 handler with pkt_tables
pkt_tables_ipv6 registers layer 3 handler with pkt_tables
everybody registering a pkt_table (like iptable_filter) needs to specify the l3 protocol
libraries in userspace (see later)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Supporting dynamic rulesets
atomic table-replacement turned out to be bad idea
need new interface for sending individual rules to kernel
policy routing has the same problem and good solution: rtnetlink
solution: nfnetlink
multicast-netlink based packet-oriented socket between kernel and userspace
has extra benefit that other userspace processes get notified of rule changes [just like routing daemons]
nfnetlink is a low-layer below all kernel/userspace communication
pkttnetlink [aka iptnetlink]
ctnetlink
ulog
ip_queue
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Communication with other programs
whole set of libraries
libnfnetlink for low-layer communication
libpkttnetlink for rule modifications
will handle all plugins [which are currently part of iptables]
query functions about available matches/targets
query functions about parameters
query functions for help messages about specific match/parameter of a match
generic structure from which rules can be built
conversion functions to parse generic structure into in-kernel structure
conversion functions to parse kernel structure into generic structure
functions to convert generic structure in plain text
libipq will stay API-compatible to current version
libipulog will stay API-compatible to current version
libiptc will go away [compatibility layer extremely difficult]
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Introduction
What is special about firewall failover?
Nothing, in case of the stateless packet filter
Common IP takeover solutions can be used
VRRP
Heartbeat
Distribution of packet filtering ruleset no problem
can be done manually
or implemented with simple userspace process
Problems arise with stateful packet filters
Connection state only on active node
NAT mappings only on active node
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Connection tracking...
implemented separately from NAT
enables stateful filtering
implementation
hooks into NF_IP_PRE_ROUTING to track packets
hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
protocol modules (currently TCP/UDP/ICMP)
application helpers (currently FTP, IRC, H.323, talk, SNMP)
divides packets in the following four categories
NEW - would establish new connection
ESTABLISHED - part of already established connection
RELATED - is related to established connection
INVALID - (multicast, errors...)
does _NOT_ filter packets itself
can be utilized by iptables using the 'state' match
is used by NAT Subsystem
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Common structures
struct ip_conntrack_tuple, representing unidirectional flow
layer 3 src + dst
layer 4 protocol
layer 4 src + dst
connections represented as struct ip_conntrack
original tuple
reply tuple
timeout
l4 state private data
app helper
app helper private data
expected connections
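The two structures above can be modelled roughly as follows. This is a Python sketch for illustration, not the kernel's C definitions; the class and field names are assumptions chosen to mirror the bullet points, and the default timeout value is arbitrary.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class FlowTuple:
    """One direction of a flow: layer 3 src/dst, layer 4 protocol and src/dst."""
    l3_src: str
    l3_dst: str
    l4_proto: str
    l4_src: int
    l4_dst: int

    def inverted(self):
        """Tuple expected for packets travelling in the opposite direction."""
        return FlowTuple(self.l3_dst, self.l3_src, self.l4_proto,
                         self.l4_dst, self.l4_src)


@dataclass
class Conntrack:
    """One tracked connection, with the fields listed on the slide."""
    original: FlowTuple
    reply: FlowTuple
    timeout: float
    l4_state: dict = field(default_factory=dict)     # l4 state private data
    helper: Optional[str] = None                     # app helper, e.g. "ftp"
    helper_data: dict = field(default_factory=dict)  # app helper private data
    expected: list = field(default_factory=list)     # expected connections


def new_conntrack(original, timeout=30.0):
    """Create an entry whose reply tuple is the inverted original tuple."""
    return Conntrack(original=original, reply=original.inverted(),
                     timeout=timeout)
```

Because the tuple is immutable and hashable, it can serve directly as a hash table key, which is how the lookups on the following slides are expressed.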
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Flow of events for new packet
packet enters NF_IP_PRE_ROUTING
tuple is derived from packet
lookup conntrack hash table with hash(tuple) -> fails
new ip_conntrack is allocated
fill in original and reply == inverted(original) tuple
initialize timer
assign app helper if applicable
see if we've been expected -> fails
call layer 4 helper 'new' function
...
packet enters NF_IP_POST_ROUTING
do hashtable lookup for packet -> fails
place struct ip_conntrack in hashtable
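The two-stage handling above (allocate at PRE_ROUTING, insert into the hash table only at POST_ROUTING) can be sketched as follows. This is a simplified Python model, not kernel code: packets are plain dicts, a Python dict stands in for the hash table, and indexing the entry under both the original and the reply tuple is an assumption made here so that return traffic finds the same entry.

```python
class ConntrackTable:
    def __init__(self):
        self.table = {}  # tuple -> conntrack entry

    def pre_routing(self, packet):
        """Look up the packet's tuple; allocate an unconfirmed entry on a miss."""
        tup = packet["tuple"]  # (src, dst, proto, sport, dport)
        entry = self.table.get(tup)
        if entry is None:
            src, dst, proto, sport, dport = tup
            entry = {
                "original": tup,
                "reply": (dst, src, proto, dport, sport),  # inverted(original)
                "confirmed": False,
            }
        packet["nfct"] = entry  # associate the entry with the packet
        return entry

    def post_routing(self, packet):
        """The packet survived filtering: place a new entry in the table."""
        entry = packet["nfct"]
        if entry["original"] not in self.table:
            self.table[entry["original"]] = entry
            self.table[entry["reply"]] = entry
            entry["confirmed"] = True
        return entry
```

A packet dropped by the filter rules never reaches post_routing, so its entry is never placed in the table.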
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Flow of events for packet part of existing connection
packet enters NF_IP_PRE_ROUTING
tuple is derived from packet
lookup conntrack hash table with hash(tuple)
associate conntrack entry with skb->nfct
call l4 protocol helper 'packet' function
do l4 state tracking
update timeouts as needed [e.g. TCP TIME_WAIT, ...]
...
packet enters NF_IP_POST_ROUTING
do hashtable lookup for packet -> succeeds
do nothing else
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Poor man's failover
Poor man's failover
principle
let every node do its own tracking rather than replicating state
two possible implementations
connect every node to shared media (i.e. real ethernet)
forwarding only turned on on active node
slave nodes use promiscuous mode to sniff packets
copy all traffic to slave nodes
active master needs to copy all traffic to other nodes
disadvantage: high load, sync traffic == payload traffic
IMHO stupid way of solving the problem
advantages
very easy implementation
only addition of sniffing mode to conntrack needed
existing means of address takeover can be used
same load on active master and slave nodes
no additional load on active master
disadvantages
can only be used with real shared media (no switches, ...)
can not be used with NAT
remaining problem
no initial state sync after reboot of slave node!
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Real state replication
Parts needed
state replication protocol
multicast based
sequence numbers for detection of packet loss
NACK-based retransmission
no security, since private ethernet segment to be used
event interface on active node
calling out to callback function at all state changes
exported interface to manipulate conntrack hash table
kernel thread for sending conntrack state protocol messages
registers with event interface
creates and accumulates state replication packets
sends them via in-kernel sockets api
kernel thread for receiving conntrack state replication messages
receives state replication packets via in-kernel sockets
uses conntrack hashtable manipulation interface
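The combination of sequence numbers and NACK-based retransmission listed above can be illustrated with a small receiver model. This is a Python sketch under stated assumptions, not the planned wire protocol: sequence numbers start at 0, and the class and method names are hypothetical.

```python
class ReplicationReceiver:
    """Delivers messages in order and reports which sequence numbers to NACK."""

    def __init__(self):
        self.next_seq = 0    # next sequence number expected in order
        self.pending = {}    # out-of-order messages waiting for a gap to close
        self.delivered = []  # payloads handed on, in sequence order

    def receive(self, seq, payload):
        """Accept one message; return the sequence numbers still missing."""
        if seq < self.next_seq or seq in self.pending:
            return []  # duplicate, nothing to request
        self.pending[seq] = payload
        # Deliver everything that is now contiguous.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        if not self.pending:
            return []
        # Anything below the highest buffered number that has not arrived
        # was lost (or reordered) and should be requested again.
        highest = max(self.pending)
        return [s for s in range(self.next_seq, highest)
                if s not in self.pending]
```

The sender stays silent unless it receives a NACK, so in the loss-free case the protocol adds no acknowledgement traffic on the sync segment.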
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Real state replication
Flow of events in chronological order:
on active node, inside the network RX softirq
connection tracking code is analyzing a forwarded packet
connection tracking gathers some new state information
connection tracking updates local connection tracking database
connection tracking sends event message to event API
on active node, inside the conntrack-sync kernel thread
conntrack sync daemon receives event through event API
conntrack sync daemon aggregates multiple event messages into a state replication protocol message, removing possible redundancy
conntrack sync daemon generates state replication protocol message
conntrack sync daemon sends state replication protocol message
on slave node(s), inside network RX softirq
connection tracking code ignores packets coming from the interface attached to the private conntrack sync network
state replication protocol message is appended to the socket receive queue of the conntrack-sync kernel thread
on slave node(s), inside conntrack-sync kernel thread
conntrack sync daemon receives state replication message
conntrack sync daemon creates/updates conntrack entry
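The aggregation step on the active node might look like the sketch below. It assumes that "removing possible redundancy" means only the most recent state per connection needs to be sent; that interpretation, the event format and the function name are assumptions for illustration.

```python
def aggregate_events(events):
    """Collapse a batch of (conn_id, state) events to one entry per connection.

    Later events overwrite earlier ones for the same connection; the result
    keeps the order in which connections first appeared in the batch.
    """
    latest = {}
    for conn_id, state in events:
        latest[conn_id] = state
    return list(latest.items())
```

For a TCP handshake that completes within one batch, the slave nodes would receive only the ESTABLISHED state instead of every intermediate transition.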
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Necessary changes to kernel
Necessary changes to current conntrack core
event generation (callback functions) for all state changes
conntrack hashtable manipulation API
is needed (and already implemented) for 'ctnetlink' API
conntrack exemptions
needed to _not_ track conntrack state replication packets
is needed for other cases as well
currently being developed by Jozsef Kadlecsik
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Other current work
optimizing the conntrack code
hash function optimization
current hash function not good for even hash bucket count
other hash functions in development
hash function evaluation tool [cttest] available
introduce per-system randomness to prevent hash attack
code optimization (locking/timers/...)
getting our work submitted into the mainstream kernel
turns out to be more difficult
e.g. newnat api now waiting for three months
discussions about multiple targets/actions per rule
technical implementation easy
however, not everybody convinced that it fits into the concept
using tc for firewalling
Jamal Hadi Salim uses iptables targets from within TC
leads to discussion of generic classification engine API in kernel
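The "per-system randomness" item under hash function optimization can be illustrated as follows. This is a Python sketch only: keyed BLAKE2b from the standard library stands in for whatever hash the conntrack code ends up using, and the function names are hypothetical.

```python
import hashlib
import os


def make_bucket_fn(num_buckets, seed=None):
    """Return a function mapping a tuple to a bucket index.

    The seed is drawn once per system (here: once per call) so that an
    attacker cannot precompute tuples that all land in the same bucket.
    """
    seed = os.urandom(16) if seed is None else seed

    def bucket(tup):
        data = repr(tup).encode()
        digest = hashlib.blake2b(data, digest_size=8, key=seed).digest()
        return int.from_bytes(digest, "big") % num_buckets

    return bucket
```

With a fixed seed the mapping is deterministic, which is what a hash table needs. With a seed unknown to the sender of the traffic, the bucket for a given tuple cannot be predicted from outside.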
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Thanks
The slides and an accompanying paper for this presentation are available at http://www.gnumonks.org/
The netfilter homepage http://www.netfilter.org/
Thanks to
the BBS people, Z-Netz, FIDO, ...
for heavily increasing my computer usage in 1992
KNF
for bringing me in touch with the internet as early as 1994
for providing a playground for technical people
for telling me about the existence of Linux!
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
for implementing (one of?) the world's best TCP/IP stacks
Paul 'Rusty' Russell
for starting the netfilter/iptables project
for trusting me to maintain it today
Astaro AG
for sponsoring parts of my netfilter work

View File

@ -0,0 +1,374 @@
%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%back "blue"
%center
%size 7
The future of Linux packet filtering
targeted for kernel 2.6
%center
%size 4
by
Harald Welte <laforge@gnumonks.org>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Contents
Problems with current 2.4.x netfilter/iptables
Solution to code replication
Solution for dynamic rulesets
Solution for API to GUIs and other management programs
HA for stateful firewalling
What's special about firewalling HA
Poor man's failover
Real state replication
Other current work
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Problems with 2.4.x netfilter/iptables
code replication between iptables/ip6tables/arptables
iptables was never meant for other protocols, but people did copy+paste 'ports'
replication of
core kernel code
layer 3 independent matches (mac, interface, ...)
userspace library (libiptc)
userspace tool (iptables)
userspace plugins (libipt_xxx.so)
doesn't suit the needs for dynamically changing rulesets
dynamic rulesets becoming more common (service selection, IDS)
a whole table is created in userspace and sent as blob to kernel
for every ruleset change the table needs to be copied to userspace and back
inside kernel consistency checks on whole table, loop detection
too extensible for writing any forward-compatible GUI
new extensions showing up all the time
a frontend would need to know about the options and use of a new extension
thus frontends are always incomplete and out-of-date
no high-level API other than piping to iptables-restore
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Reducing code replication
code replication is a real problem: unclean, bugfixes missed
we need layer 3 independent layer for
submitting rules to the kernel
traversing packet-rulesets supporting match/target modules
registering matches/targets
layer 3 specific (like matching ipv4 address)
layer 3 independent (like matching MAC address)
solution
pkt_tables inside kernel
pkt_tables_ipv4 registers layer 3 handler with pkt_tables
pkt_tables_ipv6 registers layer 3 handler with pkt_tables
everybody registering a pkt_table (like iptable_filter) needs to specify the l3 protocol
libraries in userspace (see later)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Supporting dynamic rulesets
atomic table-replacement turned out to be bad idea
need new interface for sending individual rules to kernel
policy routing has the same problem and good solution: rtnetlink
solution: nfnetlink
multicast-netlink based packet-oriented socket between kernel and userspace
has extra benefit that other userspace processes get notified of rule changes [just like routing daemons]
nfnetlink will be low-layer below all kernel/userspace communication
pkttnetlink [aka iptnetlink]
ctnetlink
ulog
ip_queue
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Communication with other programs
whole set of libraries
libnfnetlink for low-layer communication
libpkttnetlink for rule modifications
will handle all plugins [which are currently part of iptables]
query functions about available matches/targets
query functions about parameters
query functions for help messages about specific match/parameter of a match
generic structure from which rules can be built
conversion functions to parse generic structure into in-kernel structure
conversion functions to parse kernel structure into generic structure
functions to convert generic structure into plain text
libipq will stay API-compatible to current version
libipulog will stay API-compatible to current version
libiptc will go away [compatibility layer extremely difficult]
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Introduction
What is special about firewall failover?
Nothing, in case of the stateless packet filter
Common IP takeover solutions can be used
VRRP
Heartbeat
Distribution of packet filtering ruleset no problem
can be done manually
or implemented with simple userspace process
Problems arise with stateful packet filters
Connection state only on active node
NAT mappings only on active node
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Connection tracking...
implemented separately from NAT
enables stateful filtering
implementation
hooks into NF_IP_PRE_ROUTING to track packets
hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
protocol modules (currently TCP/UDP/ICMP)
application helpers (currently FTP, IRC, H.323, talk, SNMP)
divides packets into the following four categories
NEW - would establish new connection
ESTABLISHED - part of already established connection
RELATED - is related to established connection
INVALID - (multicast, errors...)
does _NOT_ filter packets itself
can be utilized by iptables using the 'state' match
is used by NAT Subsystem
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Common structures
struct ip_conntrack_tuple, representing unidirectional flow
layer 3 src + dst
layer 4 protocol
layer 4 src + dst
connections represented as struct ip_conntrack
original tuple
reply tuple
timeout
l4 state private data
app helper
app helper private data
expected connections
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Flow of events for new packet
packet enters NF_IP_PRE_ROUTING
tuple is derived from packet
lookup conntrack hash table with hash(tuple) -> fails
new ip_conntrack is allocated
fill in original and reply == inverted(original) tuple
initialize timer
assign app helper if applicable
see if we've been expected -> fails
call layer 4 helper 'new' function
...
packet enters NF_IP_POST_ROUTING
do hashtable lookup for packet -> fails
place struct ip_conntrack in hashtable
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Flow of events for packet part of existing connection
packet enters NF_IP_PRE_ROUTING
tuple is derived from packet
lookup conntrack hash table with hash(tuple)
associate conntrack entry with skb->nfct
call l4 protocol helper 'packet' function
do l4 state tracking
update timeouts as needed [e.g. TCP TIME_WAIT, ...]
...
packet enters NF_IP_POST_ROUTING
do hashtable lookup for packet -> succeeds
do nothing else
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Poor man's failover
Poor man's failover
principle
let every node do its own tracking rather than replicating state
two possible implementations
connect every node to shared media (i.e. real ethernet)
forwarding only turned on on active node
slave nodes use promiscuous mode to sniff packets
copy all traffic to slave nodes
active master needs to copy all traffic to other nodes
disadvantage: high load, sync traffic == payload traffic
IMHO stupid way of solving the problem
advantages
very easy implementation
only addition of sniffing mode to conntrack needed
existing means of address takeover can be used
same load on active master and slave nodes
no additional load on active master
disadvantages
can only be used with real shared media (no switches, ...)
can not be used with NAT
remaining problem
no initial state sync after reboot of slave node!
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Real state replication
Parts needed
state replication protocol
multicast based
sequence numbers for detection of packet loss
NACK-based retransmission
no security, since private ethernet segment to be used
event interface on active node
calling out to callback function at all state changes
exported interface to manipulate conntrack hash table
kernel thread for sending conntrack state protocol messages
registers with event interface
creates and accumulates state replication packets
sends them via in-kernel sockets api
kernel thread for receiving conntrack state replication messages
receives state replication packets via in-kernel sockets
uses conntrack hashtable manipulation interface
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Real state replication
Flow of events in chronological order:
on active node, inside the network RX softirq
connection tracking code is analyzing a forwarded packet
connection tracking gathers some new state information
connection tracking updates local connection tracking database
connection tracking sends event message to event API
on active node, inside the conntrack-sync kernel thread
conntrack sync daemon receives event through event API
conntrack sync daemon aggregates multiple event messages into a state replication protocol message, removing possible redundancy
conntrack sync daemon generates state replication protocol message
conntrack sync daemon sends state replication protocol message
on slave node(s), inside network RX softirq
connection tracking code ignores packets coming from the interface attached to the private conntrack sync network
state replication protocol message is appended to the socket receive queue of the conntrack-sync kernel thread
on slave node(s), inside conntrack-sync kernel thread
conntrack sync daemon receives state replication message
conntrack sync daemon creates/updates conntrack entry
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Necessary changes to kernel
Necessary changes to current conntrack core
event generation (callback functions) for all state changes
conntrack hashtable manipulation API
is needed (and already implemented) for 'ctnetlink' API
conntrack exemptions
needed to _not_ track conntrack state replication packets
is needed for other cases as well
currently being developed by Jozsef Kadlecsik
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Other current work
optimizing the conntrack code
hash function optimization
current hash function not good for even hash bucket count
other hash functions in development
hash function evaluation tool [cttest] available
introduce per-system randomness to prevent hash attack
code optimization (locking/timers/...)
getting our work submitted into the mainstream kernel
turns out to be more difficult
e.g. newnat api now waiting for three months
discussions about multiple targets/actions per rule
technical implementation easy
however, not everybody convinced that it fits into the concept
using tc for firewalling
Jamal Hadi Salim uses iptables targets from within TC
leads to discussion of generic classification engine API in kernel
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Thanks
The slides and an accompanying paper for this presentation are available at http://www.gnumonks.org/
The netfilter homepage http://www.netfilter.org/
Thanks to
the BBS people, Z-Netz, FIDO, ...
for heavily increasing my computer usage in 1992
KNF
for bringing me in touch with the internet as early as 1994
for providing a playground for technical people
for telling me about the existence of Linux!
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
for implementing (one of?) the world's best TCP/IP stacks
Paul 'Rusty' Russell
for starting the netfilter/iptables project
for trusting me to maintain it today
Astaro AG
for sponsoring parts of my netfilter work

View File

@ -0,0 +1,31 @@
How to replicate the fire - HA for netfilter based firewalls.
With traditional, stateless firewalling (such as ipfwadm, ipchains) there is
no need for special HA support in the firewalling subsystem. As long as all
packet filtering rules and routing table entries are configured in exactly the
same way, one can use any available tool for IP-Address takeover to accomplish
the goal of failing over from one node to the other.
With Linux 2.4.x netfilter/iptables, the Linux firewalling code moves beyond
traditional packet filtering. Netfilter provides a modular connection tracking
subsystem which can be employed for stateful firewalling. The connection
tracking subsystem gathers information about the state of all current network
flows (connections). Packet filtering decisions and NAT information are
associated with this state information.
In a high availability scenario, this connection tracking state needs to be
replicated from the currently active firewall node to all standby slave
firewall nodes. Only when all connection tracking state is replicated will
the slave node have all necessary state information at the time a failover
event occurs.
netfilter/iptables does not currently have any functionality for
replicating connection tracking state across multiple nodes. However,
the author of this presentation, Harald Welte, has started a project for
connection tracking state replication with netfilter/iptables.
The presentation will cover the architectural design and implementation
of the connection tracking failover system. Given the date of
the conference, the project is expected to still be a
work in progress at that time.

View File

@ -0,0 +1,22 @@
<a href="http://www.gnumonks.org/users/laforge/">Harald Welte</a> is one
of the five <a href="http://www.netfilter.org/">netfilter/iptables</a> core
team members, and the current Linux 2.4.x firewalling maintainer.
His main interest in computing has always been networking. In the little time
left besides netfilter/iptables related work, he writes obscure documents
like the <a href="http://www.gnumonks.org/ftp/pub/doc/uucp-over-ssl.html">UUCP
over SSL HOWTO</a>. Other kernel-related projects he has contributed to are
User Mode Linux and the international (crypto) kernel patch.
In the past he worked as an independent IT consultant on
closed-source projects for various companies ranging from banks to
manufacturers of networking gear. During the year 2001 he was living in
Curitiba (Brazil), where he got sponsored for his Linux related work by
<a href="http://www.conectiva.com/">Conectiva Inc.</a>.
Since February 2002, Harald has been contracted part-time by
<a href="http://www.astaro.com/">Astaro AG</a>, who are sponsoring him for his
current netfilter/iptables work.
Harald is living in Erlangen, Germany.

View File

@ -0,0 +1,294 @@
%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%back "blue"
%center
%size 7
How to replicate the fire
HA for netfilter-based firewalls
%center
%size 4
by
Harald Welte <laforge@gnumonks.org>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Contents
Introduction
Connection Tracking Subsystem
Packet selection based on IP Tables
The Connection Tracking Subsystem
The NAT Subsystem
Poor man's failover
Real state replication
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Introduction
What is special about firewall failover?
Nothing, in case of the stateless packet filter
Common IP takeover solutions can be used
VRRP
Heartbeat
Distribution of packet filtering ruleset no problem
can be done manually
or implemented with simple userspace process
Problems arise with stateful packet filters
Connection state only on active node
NAT mappings only on active node
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Connection tracking...
implemented separately from NAT
enables stateful filtering
implementation
hooks into NF_IP_PRE_ROUTING to track packets
hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
protocol modules (currently TCP/UDP/ICMP)
application helpers (currently FTP, IRC, H.323, talk, SNMP)
divides packets into the following four categories
NEW - would establish new connection
ESTABLISHED - part of already established connection
RELATED - is related to established connection
INVALID - (multicast, errors...)
does _NOT_ filter packets itself
can be utilized by iptables using the 'state' match
is used by NAT Subsystem
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Common structures
struct ip_conntrack_tuple, representing unidirectional flow
layer 3 src + dst
layer 4 protocol
layer 4 src + dst
connections represented as struct ip_conntrack
original tuple
reply tuple
timeout
l4 state private data
app helper
app helper private data
expected connections
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Flow of events for new packet
packet enters NF_IP_PRE_ROUTING
tuple is derived from packet
lookup conntrack hash table with hash(tuple) -> fails
new ip_conntrack is allocated
fill in original and reply == inverted(original) tuple
initialize timer
assign app helper if applicable
see if we've been expected -> fails
call layer 4 helper 'new' function
...
packet enters NF_IP_POST_ROUTING
do hashtable lookup for packet -> fails
place struct ip_conntrack in hashtable
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Flow of events for packet part of existing connection
packet enters NF_IP_PRE_ROUTING
tuple is derived from packet
lookup conntrack hash table with hash(tuple)
associate conntrack entry with skb->nfct
call l4 protocol helper 'packet' function
do l4 state tracking
update timeouts as needed [e.g. TCP TIME_WAIT, ...]
...
packet enters NF_IP_POST_ROUTING
do hashtable lookup for packet -> succeeds
do nothing else
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Network Address Translation
Overview
Previous Linux Kernels only implemented one special case of NAT: Masquerading
Linux 2.4.x can do any kind of NAT.
NAT subsystem implemented on top of netfilter, iptables and conntrack
NAT subsystem registers with all five netfilter hooks
'nat' Table registers chains PREROUTING, POSTROUTING and OUTPUT
Following targets available within 'nat' Table
SNAT changes the packet's source while passing NF_IP_POST_ROUTING
DNAT changes the packet's destination while passing NF_IP_PRE_ROUTING
MASQUERADE is a special case of SNAT
REDIRECT is a special case of DNAT
NAT bindings determined only for NEW packet and saved in ip_conntrack
Further packets within connection NATed according to NAT bindings
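The last two points (binding chosen once for the NEW packet, then reused) can be sketched as below. This is a simplified Python model of source NAT, not the kernel implementation: ports are kept unchanged, tuples are reduced to (src, sport, dst, dport), and all names are hypothetical.

```python
class NatTable:
    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.bindings = {}  # original tuple -> translated tuple
        self.reverse = {}   # reply tuple on the wire -> tuple to restore

    def snat(self, tup):
        """Translate an outgoing packet; tup = (src, sport, dst, dport)."""
        if tup not in self.bindings:  # NEW packet: decide the binding once
            src, sport, dst, dport = tup
            self.bindings[tup] = (self.public_ip, sport, dst, dport)
            self.reverse[(dst, dport, self.public_ip, sport)] = \
                (dst, dport, src, sport)
        return self.bindings[tup]  # later packets reuse the saved binding

    def un_snat(self, reply_tup):
        """Restore the original destination on a reply packet."""
        return self.reverse.get(reply_tup, reply_tup)
```

The bindings live only in memory on the node that created them, so a standby node without a copy cannot translate replies for connections it did not see start.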
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Poor man's failover
Poor man's failover
principle
let every node do its own tracking rather than replicating state
two possible implementations
connect every node to shared media (i.e. real ethernet)
forwarding only turned on on active node
slave nodes use promiscuous mode to sniff packets
copy all traffic to slave nodes
active master needs to copy all traffic to other nodes
disadvantage: high load, sync traffic == payload traffic
IMHO stupid way of solving the problem
advantages
very easy implementation
only addition of sniffing mode to conntrack needed
existing means of address takeover can be used
same load on active master and slave nodes
no additional load on active master
disadvantages
can only be used with real shared media (no switches, ...)
can not be used with NAT
remaining problem
no initial state sync after reboot of slave node!
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Real state replication
Parts needed
state replication protocol
multicast based
sequence numbers for detection of packet loss
NACK-based retransmission
no security, since private ethernet segment to be used
event interface on active node
calling out to callback function at all state changes
exported interface to manipulate conntrack hash table
kernel thread for sending conntrack state protocol messages
registers with event interface
creates and accumulates state replication packets
sends them via in-kernel sockets api
kernel thread for receiving conntrack state replication messages
receives state replication packets via in-kernel sockets
uses conntrack hashtable manipulation interface
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Real state replication
Flow of events in chronological order:
on active node, inside the network RX softirq
connection tracking code is analyzing a forwarded packet
connection tracking gathers some new state information
connection tracking updates local connection tracking database
connection tracking sends event message to event API
on active node, inside the conntrack-sync kernel thread
conntrack sync daemon receives event through event API
conntrack sync daemon aggregates multiple event messages into a state replication protocol message, removing possible redundancy
conntrack sync daemon generates state replication protocol message
conntrack sync daemon sends state replication protocol message
on slave node(s), inside network RX softirq
connection tracking code ignores packets coming from the interface attached to the private conntrack sync network
state replication protocol message is appended to the socket receive queue of the conntrack-sync kernel thread
on slave node(s), inside conntrack-sync kernel thread
conntrack sync daemon receives state replication message
conntrack sync daemon creates/updates conntrack entry
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Necessary changes to kernel
Necessary changes to current conntrack core
event generation (callback functions) for all state changes
conntrack hashtable manipulation API
is needed (and already implemented) for 'ctnetlink' API
conntrack exemptions
needed to _not_ track conntrack state replication packets
is needed for other cases as well
currently being developed by Jozsef Kadlecsik
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Thanks
Thanks to
the BBS people, Z-Netz, FIDO, ...
for heavily increasing my computer usage in 1992
KNF
for bringing me in touch with the internet as early as 1994
for providing a playground for technical people
for telling me about the existence of Linux!
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
for implementing (one of?) the world's best TCP/IP stacks
Paul 'Rusty' Russell
for starting the netfilter/iptables project
for trusting me to maintain it today
Astaro AG
for sponsoring parts of my netfilter work
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Availability of slides / Links
The slides and an accompanying paper for this presentation are available at
http://www.gnumonks.org/
The netfilter homepage
http://www.netfilter.org/

View File

@ -0,0 +1,504 @@
\documentclass[twocolumn]{article}
\usepackage{ols}
\begin{document}
\date{}
\title{\Large \bf How to replicate the fire - HA for netfilter based firewalls}
\author{
Harald Welte\\
{\em Netfilter Core Team + Astaro AG}\\
{\normalsize laforge@gnumonks.org/laforge@astaro.com, http://www.gnumonks.org/}
}
\maketitle
\thispagestyle{empty}
\subsection*{Abstract}
With traditional, stateless firewalling (such as ipfwadm, ipchains) there is
no need for special HA support in the firewalling subsystem. As long as all
packet filtering rules and routing table entries are configured in exactly the
same way, one can use any available tool for IP-Address takeover to accomplish
the goal of failing over from one node to the other.
With Linux 2.4.x netfilter/iptables, the Linux firewalling code moves beyond
traditional packet filtering. Netfilter provides a modular connection tracking
subsystem which can be employed for stateful firewalling. The connection
tracking subsystem gathers information about the state of all current network
flows (connections). Packet filtering decisions and NAT information are
associated with this state information.
In a high availability scenario, this connection tracking state needs to be
replicated from the currently active firewall node to all standby slave
firewall nodes. Only when all connection tracking state is replicated will
the slave node have all necessary state information at the time a failover
event occurs.
netfilter/iptables does not currently have any functionality for
replicating connection tracking state across multiple nodes. However,
the author of this presentation, Harald Welte, has started a project for
connection tracking state replication with netfilter/iptables.
The presentation will cover the architectural design and implementation
of the connection tracking failover system. With respect to the date of
the conference, it is to be expected that the project is still a
work-in-progress at that time.
\section{Failover of stateless firewalls}
No special precautions are needed when installing a highly available
stateless packet filter. Since no state is kept, all information
needed for filtering is the ruleset and the individual, separate packets.
Building a set of highly available stateless packet filters can thus be
achieved by using any traditional means of IP-address takeover, such
as Heartbeat or VRRPd.
The only remaining issue is to make sure the firewalling ruleset is
exactly the same on both machines. This should be ensured by the firewall
administrator every time he updates the ruleset.
If this is not practical because a very dynamic ruleset is employed, one
can build a very simple solution using the iptables-save and iptables-restore
tools shipped with iptables. The output of iptables-save can be piped over ssh
to iptables-restore on a different host.
Limitations
\begin{itemize}
\item
no state tracking
\item
not possible in combination with NAT
\item
no counter consistency of per-rule packet/byte counters
\end{itemize}
\section{Failover of stateful firewalls}
Modern firewalls implement state tracking (aka connection tracking) in order
to keep some state about the currently active sessions. The amount of
per-connection state kept at the firewall depends on the particular
implementation.
As soon as {\bf any} state is kept at the packet filter, this state information
needs to be replicated to the slave/backup nodes within the failover setup.
In Linux 2.4.x, all relevant state is kept within the {\it connection tracking
subsystem}. In order to understand how this state could possibly be
replicated, we need to understand the architecture of this conntrack subsystem.
\subsection{Architecture of the Linux Connection Tracking Subsystem}
Connection tracking within Linux is implemented as a netfilter module, called
ip\_conntrack.o.
Before describing the connection tracking subsystem, we need to describe a
couple of definitions and primitives used throughout the conntrack code.
A connection is represented within the conntrack subsystem using {\it struct
ip\_conntrack}, also called {\it connection tracking entry}.
Connection tracking uses {\it conntrack tuples}, which are tuples
consisting of (srcip, srcport, dstip, dstport, l4prot). A connection is
uniquely identified by two tuples: The tuple in the original direction
(IP\_CT\_DIR\_ORIGINAL) and the tuple for the reply direction
(IP\_CT\_DIR\_REPLY).
Connection tracking itself does not drop packets\footnote{well, in some rare
cases in combination with NAT it needs to drop. But don't tell anyone, this is
secret.} or impose any policy. It just associates every packet with a
connection tracking entry, which in turn has a particular state. All other
kernel code can use this state information\footnote{state information is
internally represented via the {\it struct sk\_buff.nfct} structure member of a
packet.}.
\subsubsection{Integration of conntrack with netfilter}
If the ip\_conntrack.o module is registered with netfilter, it attaches to the
NF\_IP\_PRE\_ROUTING, NF\_IP\_POST\_ROUTING, NF\_IP\_LOCAL\_IN and
NF\_IP\_LOCAL\_OUT hooks.
Because forwarded packets are the most common case on firewalls, I will only
describe how connection tracking works for forwarded packets. The two relevant
hooks for forwarded packets are NF\_IP\_PRE\_ROUTING and NF\_IP\_POST\_ROUTING.
Every time a packet arrives at the NF\_IP\_PRE\_ROUTING hook, connection
tracking creates a conntrack tuple from the packet. It then compares this
tuple to the original and reply tuples of all already-seen connections
\footnote{Of course this is not implemented as a linear search over all existing connections.} to find out if this just-arrived packet belongs to any existing
connection. If there is no match, a new conntrack table entry (struct
ip\_conntrack) is created.
Let's assume the case where we do not have any existing connections yet and are
starting from scratch.
The first packet comes in, we derive the tuple from the packet headers, look up
the conntrack hash table, and don't find any matching entry. As a result, we
create a new struct ip\_conntrack. This struct ip\_conntrack is filled with
all necessary data, like the original and reply tuple of the connection.
How do we know the reply tuple? By inverting the source and destination
parts of the original tuple.\footnote{So why do we need two tuples, if they can
be derived from each other? Wait until we discuss NAT.}
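The inversion of a tuple can be sketched in a few lines of Python. This is
only an illustration and not the kernel code: the names ConnTuple and invert
are made up for this example, and the addresses are arbitrary sample values.

```python
from typing import NamedTuple


class ConnTuple(NamedTuple):
    """Hypothetical stand-in for a conntrack tuple (not the kernel layout)."""
    srcip: str
    srcport: int
    dstip: str
    dstport: int
    l4prot: str


def invert(t: ConnTuple) -> ConnTuple:
    """Derive the reply tuple by swapping the source and destination parts."""
    return ConnTuple(t.dstip, t.dstport, t.srcip, t.srcport, t.l4prot)


# Original direction: a client asking a nameserver on UDP port 53.
original = ConnTuple("10.0.0.1", 1025, "192.0.2.53", 53, "udp")
reply = invert(original)
print(reply)
```

Inverting twice yields the original tuple again, which is why, without NAT,
one tuple would be enough to describe both directions.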
Please note that this new struct ip\_conntrack is {\bf not} yet placed
into the conntrack hash table.
The packet is now passed on to other callback functions which have registered
with a lower priority at NF\_IP\_PRE\_ROUTING. It then continues traversal of
the network stack as usual, including all respective netfilter hooks.
If the packet survives (i.e. is not dropped by the routing code, network stack,
firewall ruleset, ...), it re-appears at NF\_IP\_POST\_ROUTING. In this case,
we can now safely assume that this packet will be sent off on the outgoing
interface, and thus put the connection tracking entry which we created at
NF\_IP\_PRE\_ROUTING into the conntrack hash table. This process is called
{\it confirming the conntrack}.
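The two-step flow (create the entry at the first hook, insert it into the hash
table only at the last hook) can be modelled roughly as follows. This is a toy
sketch under simplifying assumptions: a Python dict stands in for the conntrack
hash table, and the class and method names are hypothetical, not kernel
functions.

```python
class ConntrackTable:
    """Toy model of the create/confirm flow (all names are hypothetical)."""

    def __init__(self):
        # Maps a tuple (either direction) to its confirmed entry.
        self.confirmed = {}

    def pre_routing(self, tup):
        """Look up the packet's tuple; build an unconfirmed entry on a miss."""
        entry = self.confirmed.get(tup)
        if entry is None:
            src, sport, dst, dport, proto = tup
            entry = {
                "original": tup,
                "reply": (dst, dport, src, sport, proto),
                "confirmed": False,
            }
        return entry  # travels along with the packet

    def post_routing(self, entry):
        """The packet survived: make the entry visible under both tuples."""
        if not entry["confirmed"]:
            entry["confirmed"] = True
            self.confirmed[entry["original"]] = entry
            self.confirmed[entry["reply"]] = entry


table = ConntrackTable()
pkt = ("10.0.0.1", 1025, "192.0.2.53", 53, "udp")
entry = table.pre_routing(pkt)       # new entry, not yet in the table
dropped = table.pre_routing(("10.0.0.9", 4000, "192.0.2.80", 80, "tcp"))
table.post_routing(entry)            # only the surviving packet is confirmed
```

The entry belonging to the packet that never reaches the last hook is simply
discarded, so dropped packets leave no trace in the table.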
The connection tracking code itself is not monolithic, but consists of a
couple of separate modules\footnote{They don't actually have to be separate
kernel modules; e.g. TCP, UDP and ICMP tracking modules are all part of
the Linux kernel module ip\_conntrack.o}. Besides the conntrack core, there
are two important kinds of modules: Protocol helpers and application helpers.
Protocol helpers implement the layer-4-protocol specific parts. They currently
exist for TCP, UDP and ICMP (an experimental helper for GRE exists).
\subsubsection{TCP connection tracking}
As TCP is a connection oriented protocol, it is not very difficult to imagine
how connection tracking for this protocol could work. There are well-defined
state transitions possible, and conntrack can decide which state transitions
are valid within the TCP specification. In reality it's not all that easy,
since we cannot assume that all packets that pass the packet filter actually
arrive at the receiving end, ...
It is noteworthy that the standard connection tracking code does {\bf not}
do TCP sequence number and window tracking. A well-maintained patch to add
this feature exists almost as long as connection tracking itself. It will
be integrated with the 2.5.x kernel. The problem with window tracking is
its bad interaction with connection pickup. The TCP conntrack code is able to
pick up already existing connections, e.g. in case your firewall was rebooted.
However, connection pickup is conflicting with TCP window tracking: The TCP
window scaling option is only transferred at connection setup time, and we
don't know about it in case of pickup...
\subsubsection{ICMP tracking}
ICMP is not really a connection oriented protocol. So how is it possible to
do connection tracking for ICMP?
The ICMP protocol can be split into two groups of messages:
\begin{itemize}
\item
ICMP error messages, which sort-of belong to a different connection.
ICMP error messages are associated as {\it RELATED} with that connection
(ICMP\_DEST\_UNREACH, ICMP\_SOURCE\_QUENCH, ICMP\_TIME\_EXCEEDED,
ICMP\_PARAMETERPROB, ICMP\_REDIRECT).
\item
ICMP queries, which have a request/reply character. The conntrack code
lets the request have a state of {\it NEW}, and the reply a state of
{\it ESTABLISHED}. The reply closes the connection immediately
(ICMP\_ECHO, ICMP\_TIMESTAMP, ICMP\_INFO\_REQUEST, ICMP\_ADDRESS).
\end{itemize}
\subsubsection{UDP connection tracking}
UDP is designed as a connectionless datagram protocol. But most common
protocols using UDP as layer 4 protocol have bi-directional UDP communication.
Imagine a DNS query, where the client sends a UDP packet to port 53 of the
nameserver, and the nameserver sends back a DNS reply packet from its UDP
port 53 to the client.
Netfilter treats this as a connection. The first packet (the DNS request) is
assigned a state of {\it NEW}, because the packet is expected to create a new
'connection'. The DNS server's reply packet is marked as {\it ESTABLISHED}.
\subsubsection{conntrack application helpers}
More complex application protocols involving multiple connections need special
support by a so-called ``conntrack application helper module''. Modules in
the stock kernel come for FTP and IRC(DCC). Netfilter CVS currently contains
patches for PPTP, H.323, Eggdrop botnet, tftp and talk. We're still lacking
a lot of protocols (e.g. SIP, SMB/CIFS) - but they are unlikely to appear
until somebody really needs them and either develops them on his own or
funds development.
\subsubsection{Integration of connection tracking with iptables}
As stated earlier, conntrack doesn't impose any policy on packets. It just
determines the relation of a packet to already existing connections. To base
packet filtering decisions on this state information, the iptables {\it state}
match can be used. Every packet is within one of the following categories:
\begin{itemize}
\item
{\bf NEW}: packet would create a new connection, if it survives
\item
{\bf ESTABLISHED}: packet is part of an already established connection
(either direction)
\item
{\bf RELATED}: packet is in some way related to an already established connection, e.g. ICMP errors or FTP data sessions
\item
{\bf INVALID}: conntrack is unable to derive conntrack information from this packet. Please note that all multicast or broadcast packets fall in this category.
\end{itemize}
\subsection{Poor man's conntrack failover}
When thinking about failover of stateful firewalls, one usually thinks about
replication of state. This presumes that the state is gathered at one
firewalling node (the currently active node), and replicated to several other
passive standby nodes. There is, however, a very different approach to
replication: concurrent state tracking on all firewalling nodes.
The basic assumption of this approach is: In a setup where all firewalling
nodes receive exactly the same traffic, all nodes will deduce the same state
information.
The implementability of this approach is totally dependent on fulfillment of
this assumption.
\begin{itemize}
\item
{\it All packets need to be seen by all nodes}. This is not always true, but
can be achieved by using shared media like traditional ethernet (no switches!!)
and promiscuous mode on all ethernet interfaces.
\item
{\it All nodes need to be able to process all packets}. This cannot be
universally guaranteed. Even if the hardware (CPU, RAM, Chipset, NICs) and
software (Linux kernel) are exactly the same, they might behave differently,
especially under high load. To avoid those effects, the hardware should be
able to deal with way more traffic than seen during operation. Also, there
should be no userspace processes (like proxies, etc.) running on the firewalling
nodes at all. WARNING: Nobody guarantees this behaviour. However, the poor
man is usually not interested in scientific proof but in usability in his
particular practical setup.
\end{itemize}
However, even if those conditions are fulfilled, there are remaining issues:
\begin{itemize}
\item
{\it No resynchronization after reboot}. If a node is rebooted (because of
a hardware fault, software bug, software update, ..) it will lose all state
information gathered up to the reboot. This means the state information
of this node after reboot will not contain any old state gathered before the
reboot. The effect depends on the traffic. Generally, it is only assured that
state information about all connections initiated after the reboot will be
present. If there are short-lived connections (like HTTP), the state
information on the just rebooted node will approximate the state information of
an older node. Only after all sessions active at the time of reboot have
terminated is the state information guaranteed to be resynchronized.
\item
{\it Only possible with shared medium}. The practical implication is that no
switched ethernet (and thus no full duplex) can be used.
\end{itemize}
The major advantage of the poor man's approach is implementation simplicity.
No state transfer mechanism needs to be developed. Only very few changes
to the existing conntrack code would be needed in order to be able to
do tracking based on packets received from promiscuous interfaces. The active
node would have packet forwarding turned on, the passive nodes off.
I'm not proposing this as a real solution to the failover problem. It's
hackish, buggy and likely to break very easily. But considering it can be
implemented in very little programming time, it could be an option for very
small installations with low reliability criteria.
\subsection{Conntrack state replication}
The preferred solution to the failover problem is, without any doubt,
replication of the connection tracking state.
The proposed conntrack state replication solution consists of several
parts:
\begin{itemize}
\item
A connection tracking state replication protocol
\item
An event interface generating event messages as soon as state information
changes on the active node
\item
An interface for explicit generation of connection tracking table entries on
the standby slaves
\item
Some code (preferably a kernel thread) running on the active node, receiving
state updates by the event interface and generating conntrack state replication
protocol messages
\item
Some code (preferably a kernel thread) running on the slave node(s), receiving
conntrack state replication protocol messages and updating the local conntrack
table accordingly
\end{itemize}
Flow of events in chronological order:
\begin{itemize}
\item
{\it on active node, inside the network RX softirq}
\begin{itemize}
\item
connection tracking code is analyzing a forwarded packet
\item
connection tracking gathers some new state information
\item
connection tracking updates local connection tracking database
\item
connection tracking sends event message to event API
\end{itemize}
\item
{\it on active node, inside the conntrack-sync kernel thread}
\begin{itemize}
\item
conntrack sync daemon receives event through event API
\item
conntrack sync daemon aggregates multiple event messages into a state replication protocol message, removing possible redundancy
\item
conntrack sync daemon generates state replication protocol message
\item
conntrack sync daemon sends state replication protocol message
(private network between firewall nodes)
\end{itemize}
\item
{\it on slave node(s), inside network RX softirq}
\begin{itemize}
\item
connection tracking code ignores packets coming from the interface attached to the private conntrack sync network
\item
state replication protocol message is appended to socket receive queue of conntrack-sync kernel thread
\end{itemize}
\item
{\it on slave node(s), inside conntrack-sync kernel thread}
\begin{itemize}
\item
conntrack sync daemon receives state replication message
\item
conntrack sync daemon creates/updates conntrack entry
\end{itemize}
\end{itemize}
\subsubsection{Connection tracking state replication protocol}
In order to be able to replicate the state between two or more firewalls, a
state replication protocol is needed. This protocol is used over a private
network segment shared by all nodes for state replication. It is designed to
work over IP unicast and IP multicast transport. IP unicast will be used for
direct point-to-point communication between one active firewall and one
standby firewall. IP multicast will be used when the state needs to be
replicated to more than one standby firewall.
The principal design criteria of this protocol are:
\begin{itemize}
\item
{\bf reliable against data loss}, as the underlying UDP layer does only
provide checksumming against data corruption, but doesn't employ any
means against data loss
\item
{\bf lightweight}, since generating the state update messages is
already a very expensive process for the sender, eating additional CPU,
memory and IO bandwidth.
\item
{\bf easy to parse}, to minimize overhead at the receiver(s)
\end{itemize}
The protocol does not employ any security mechanism like encryption,
authentication or reliability against spoofing attacks. It is
assumed that the private conntrack sync network is a secure communications
channel, not accessible to any malicious 3rd party.
To achieve the reliability against data loss, an easy sequence numbering
scheme is used. All protocol messages are prefixed by a sequence number,
determined by the sender. If the slave detects packet loss by discontinuous
sequence numbers, it can request the retransmission of the missing packets
by stating the missing sequence number(s). Since there is no acknowledgement
for successfully received packets, the sender has to keep a reasonably-sized
backlog of recently-sent packets in order to be able to fulfill retransmission
requests.
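The sequence numbering and retransmission scheme can be sketched as follows.
This is a minimal model, not the actual protocol implementation: the class
names, the backlog size of four messages and the choice to buffer out-of-order
messages at the receiver are assumptions made for the example.

```python
from collections import OrderedDict


class Sender:
    """Numbers outgoing messages and keeps a bounded backlog of them."""

    def __init__(self, backlog_size=4):
        self.next_seq = 0
        self.backlog_size = backlog_size
        self.backlog = OrderedDict()   # seq -> payload, oldest first

    def send(self, payload):
        seq = self.next_seq
        self.next_seq += 1
        self.backlog[seq] = payload
        while len(self.backlog) > self.backlog_size:
            self.backlog.popitem(last=False)   # forget the oldest message
        return seq, payload

    def retransmit(self, seq):
        """Answer a retransmission request; None if it left the backlog."""
        payload = self.backlog.get(seq)
        return None if payload is None else (seq, payload)


class Receiver:
    """Delivers messages in order and reports gaps in the sequence numbers."""

    def __init__(self):
        self.expected = 0      # next sequence number to deliver
        self.pending = {}      # out-of-order messages waiting for a gap to close
        self.delivered = []

    def receive(self, seq, payload):
        """Accept one message; return the sequence numbers still missing."""
        if seq >= self.expected:
            self.pending[seq] = payload
        while self.expected in self.pending:
            self.delivered.append(self.pending.pop(self.expected))
            self.expected += 1
        if not self.pending:
            return []
        return [s for s in range(self.expected, max(self.pending))
                if s not in self.pending]


sender, receiver = Sender(), Receiver()
msgs = [sender.send(p) for p in ("new A", "update A", "expire A")]
receiver.receive(*msgs[0])
missing = receiver.receive(*msgs[2])       # message 1 was lost on the wire
for seq in missing:
    receiver.receive(*sender.retransmit(seq))
```

If a requested message has already dropped out of the backlog, retransmit()
returns None; how the slave recovers in that case is outside the scope of this
sketch.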
The different state replication protocol message types are:
\begin{itemize}
\item
{\bf NF\_CTSRP\_NEW}: New conntrack entry has been created (and
confirmed\footnote{See the above description of the conntrack code for what is
meant by {\it confirming} a conntrack entry})
\item
{\bf NF\_CTSRP\_UPDATE}: State information of existing conntrack entry has
changed
\item
{\bf NF\_CTSRP\_EXPIRE}: Existing conntrack entry has been expired
\end{itemize}
To uniquely identify (and later reference) a conntrack entry, a
{\it conntrack\_id} is assigned to every conntrack entry transferred
using a NF\_CTSRP\_NEW message. This conntrack\_id must be saved at the
receiver(s) together with the conntrack entry, since it is used by the sender
for subsequent NF\_CTSRP\_UPDATE and NF\_CTSRP\_EXPIRE messages.
The protocol itself does not care about the source of this conntrack\_id,
but since the current netfilter connection tracking implementation never
changes the address of a conntrack entry, the memory address of the entry can be
used, since it comes for free.
\subsubsection{Connection tracking state synchronization sender}
Maximum care needs to be taken for the implementation of the ctsyncd sender.
The normal workload of the active firewall node is likely to be already very
high, so generating and sending the conntrack state replication messages needs
to be highly efficient.
\begin{itemize}
\item
{\bf NF\_CTSRP\_NEW} will be generated at the NF\_IP\_POST\_ROUTING
hook, at the time ip\_conntrack\_confirm() is called. Delaying
this message until conntrack confirmation happens saves us from
replicating otherwise unneeded state information.
\item
{\bf NF\_CTSRP\_UPDATE} needs to be created automagically by the
conntrack core. It is not possible to have any failover-specific
code within conntrack protocol and/or application helpers.
The easiest way involving the least changes to the conntrack core
code is to copy parts of the conntrack entry before calling any
helper functions, and then use memcmp() to find out if the helper
has changed any information.
\item
{\bf NF\_CTSRP\_EXPIRE} can be added very easily to the existing
conntrack destroy function.
\end{itemize}
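The copy-and-compare idea for detecting updates can be illustrated like this.
The sketch is written in Python, so a comparison of copied fields stands in for
memcmp() on a copied part of the C structure; the field names and the helper
functions are hypothetical.

```python
import copy


def run_helper_and_detect_update(entry, helper, watched=("state", "timeout")):
    """Call helper(entry) and report whether any watched field changed."""
    # Snapshot only the parts of the entry that are worth replicating.
    before = {k: copy.deepcopy(entry[k]) for k in watched}
    helper(entry)
    return any(entry[k] != before[k] for k in watched)


def tcp_helper(entry):
    # Hypothetical protocol helper: the handshake completed.
    entry["state"] = "ESTABLISHED"


def counting_helper(entry):
    # Hypothetical helper that only touches a field we do not replicate.
    entry["packets"] += 1


entry = {"state": "SYN_RECV", "timeout": 60, "packets": 2}
changed = run_helper_and_detect_update(entry, tcp_helper)
unchanged = run_helper_and_detect_update(entry, counting_helper)
```

Only the first call would trigger an update message; the helpers themselves
contain no failover-specific code.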
\subsubsection{Connection tracking state synchronization receiver}
Implementation of the receiver is very straightforward.
Apart from dealing with lost CTSRP packets, it just needs to call the
respective conntrack add/modify/delete functions offered by the core.
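A rough sketch of such a receiver, ignoring packet loss, might look like the
following. The message layout (a dict with type, conntrack\_id and entry
fields) is an assumption for the example, since the paper does not define a
wire format, and a Python dict keyed by conntrack\_id stands in for the
add/modify/delete functions of the core.

```python
def apply_message(table, msg):
    """Apply one replication message to a table keyed by conntrack_id."""
    kind, ct_id = msg["type"], msg["conntrack_id"]
    if kind == "NF_CTSRP_NEW":
        table[ct_id] = dict(msg["entry"])
    elif kind == "NF_CTSRP_UPDATE":
        if ct_id in table:             # ignore updates for unknown entries
            table[ct_id].update(msg["entry"])
    elif kind == "NF_CTSRP_EXPIRE":
        table.pop(ct_id, None)
    else:
        raise ValueError("unknown message type: %r" % kind)


standby = {}
messages = [
    {"type": "NF_CTSRP_NEW", "conntrack_id": 0xC0FFEE,
     "entry": {"state": "SYN_SENT"}},
    {"type": "NF_CTSRP_UPDATE", "conntrack_id": 0xC0FFEE,
     "entry": {"state": "ESTABLISHED"}},
]
for m in messages:
    apply_message(standby, m)
```

The conntrack\_id chosen by the sender is the only key the receiver needs; it
never has to interpret it.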
\subsubsection{Necessary changes within netfilter conntrack core}
To be able to implement the described conntrack state replication mechanism,
the following changes to the conntrack core are needed:
\begin{itemize}
\item
Ability to exclude certain packets from being tracked. This is a
long-wanted feature on the TODO list of the netfilter project and will
be implemented by having a ``prestate'' table in combination with a
``NOTRACK'' target.
\item
Ability to register callback functions to be called every time a new
conntrack entry is created or an existing entry modified.
\item
Export an API to externally add, modify and remove conntrack
entries. Since the needed ip\_conntrack\_lock is exported,
implementation could even reside outside the conntrack core code.
\end{itemize}
Since the number of changes is very low, it is very likely that the
modifications will go into the mainstream kernel without any big hassle.
\end{document}

@ -0,0 +1,56 @@
% TEMPLATE for Usenix papers, specifically to meet requirements of
% TCL97 committee.
% originally a template for producing IEEE-format articles using LaTeX.
% written by Matthew Ward, CS Department, Worcester Polytechnic Institute.
% adapted by David Beazley for his excellent SWIG paper in Proceedings,
% Tcl 96
% turned into a smartass generic template by De Clarke, with thanks to
% both the above pioneers
% use at your own risk. Complaints to /dev/null.
% make it two column with no page numbering, default is 10 point
% adapted for Ottawa Linux Symposium
% include following in document.
%\documentclass[twocolumn]{article}
%\usepackage{usits,epsfig}
\pagestyle{empty}
%set dimensions of columns, gap between columns, and space between paragraphs
%\setlength{\textheight}{8.75in}
\setlength{\textheight}{9.0in}
\setlength{\columnsep}{0.25in}
\setlength{\textwidth}{6.45in}
\setlength{\footskip}{0.0in}
\setlength{\topmargin}{0.0in}
\setlength{\headheight}{0.0in}
\setlength{\headsep}{0.0in}
\setlength{\oddsidemargin}{0in}
%\setlength{\oddsidemargin}{-.065in}
%\setlength{\oddsidemargin}{-.17in}
\setlength{\parindent}{0pc}
\setlength{\parskip}{\baselineskip}
% started out with art10.sty and modified params to conform to IEEE format
% further mods to conform to Usenix standard
\makeatletter
%as Latex considers descenders in its calculation of interline spacing,
%to get 12 point spacing for normalsize text, must set it to 10 points
\def\@normalsize{\@setsize\normalsize{12pt}\xpt\@xpt
\abovedisplayskip 10pt plus2pt minus5pt\belowdisplayskip \abovedisplayskip
\abovedisplayshortskip \z@ plus3pt\belowdisplayshortskip 6pt plus3pt
minus3pt\let\@listi\@listI}
%need a 12 pt font size for subsection and abstract headings
\def\subsize{\@setsize\subsize{12pt}\xipt\@xipt}
%make section titles bold and 12 point, 2 blank lines before, 1 after
\def\section{\@startsection {section}{1}{\z@}{24pt plus 2pt minus 2pt}
{12pt plus 2pt minus 2pt}{\large\bf}}
%make subsection titles bold and 11 point, 1 blank line before, 1 after
\def\subsection{\@startsection {subsection}{2}{\z@}{12pt plus 2pt minus 2pt}
{12pt plus 2pt minus 2pt}{\subsize\bf}}
\makeatother

@ -0,0 +1,33 @@
Linux packet filtering in the 2.6.x kernel series
Linux 2.4.x provided a complete rewrite of the firewalling subsystem,
called netfilter/iptables. It was a major improvement over the previous
ipchains subsystem. The major advantages are its modularity and flexibility.
However, as with any project, as soon as you are sort-of finished, you become
aware of potential improvements and extensions.
The firewalling subsystem within the Linux kernel will undergo some fundamental design changes during the 2.5.x development kernel series.
Some of the changes from 2.4.x are:
- Have an independent pkt_tables subsystem, as a layer3 independent replacement
for iptables, ip6tables and arptables. This will allow adding support for
other layer 3 protocols very easily
- Move all kernel/userspace communication to netlink sockets. There will be
a generic nfnetlink layer, with pkttnetlink (for managing pkt_tables) and
ctnetlink (for manipulating the connection tracking database from userspace).
- Change the internal data structure of an ip_table to a linked list of chains,
which in turn are linked lists of rules, which are linked lists of
matches + targets. This way it is _way_ more performant in the case of
dynamic firewalling rulesets.
- Provide a generic high-level API to userspace applications for manipulation
of packet filtering rules. This will enable generic GUIs, which need no
changes in case new matches or targets are added.
Optionally, the netfilter core team is planning to have support for connection
tracking state replication - something necessary for failover of stateful
firewalls.
The talk assumes prior knowledge about the netfilter/iptables architecture.

@ -0,0 +1,374 @@
%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%back "blue"
%center
%size 7
The future of Linux packet filtering
targeted for kernel 2.6
%center
%size 4
by
Harald Welte <laforge@gnumonks.org>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Contents
Problems with current 2.4.x netfilter/iptables
Solution to code replication
Solution for dynamic rulesets
Solution for API to GUIs and other management programs
HA for stateful firewalling
What's special about firewalling HA
Poor man's failover
Real state replication
Other current work
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Problems with 2.4.x netfilter/iptables
code replication between iptables/ip6tables/arptables
iptables was never meant for other protocols, but people did copy+paste 'ports'
replication of
core kernel code
layer 3 independent matches (mac, interface, ...)
userspace library (libiptc)
userspace tool (iptables)
userspace plugins (libipt_xxx.so)
doesn't suit the needs for dynamically changing rulesets
dynamic rulesets becoming more common (service selection, IDS)
a whole table is created in userspace and sent as blob to kernel
for every ruleset change the table needs to be copied to userspace and back
inside kernel consistency checks on whole table, loop detection
too extensible for writing any forward-compatible GUI
new extensions showing up all the time
a frontend would need to know about the options and use of a new extension
thus frontends are always incomplete and out-of-date
no high-level API other than piping to iptables-restore
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Reducing code replication
code replication is a real problem: unclean, bugfixes missed
we need layer 3 independent layer for
submitting rules to the kernel
traversing packet-rulesets supporting match/target modules
registering matches/targets
layer 3 specific (like matching ipv4 address)
layer 3 independent (like matching MAC address)
solution
pkt_tables inside kernel
pkt_tables_ipv4 registers layer 3 handler with pkt_tables
pkt_tables_ipv6 registers layer 3 handler with pkt_tables
everybody registering a pkt_table (like iptable_filter) needs to specify the l3 protocol
libraries in userspace (see later)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Supporting dynamic rulesets
atomic table-replacement turned out to be bad idea
need new interface for sending individual rules to kernel
policy routing has the same problem and good solution: rtnetlink
solution: nfnetlink
multicast-netlink based packet-oriented socket between kernel and userspace
has extra benefit that other userspace processes get notified of rule changes [just like routing daemons]
nfnetlink will be low-layer below all kernel/userspace communication
pkttnetlink [aka iptnetlink]
ctnetlink
ulog
ip_queue
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Communication with other programs
whole set of libraries
libnfnetlink for low-layer communication
libpkttnetlink for rule modifications
will handle all plugins [which are currently part of iptables]
query functions about available matches/targets
query functions about parameters
query functions for help messages about specific match/parameter of a match
generic structure from which rules can be built
conversion functions to parse generic structure into in-kernel structure
conversion functions to parse kernel structure into generic structure
functions to convert generic structure into plain text
libipq will stay API-compatible to current version
libipulog will stay API-compatible to current version
libiptc will go away [compatibility layer extremely difficult]
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Introduction
What is special about firewall failover?
Nothing, in case of the stateless packet filter
Common IP takeover solutions can be used
VRRP
Heartbeat
Distribution of packet filtering ruleset no problem
can be done manually
or implemented with simple userspace process
Problems arise with stateful packet filters
Connection state only on active node
NAT mappings only on active node
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Connection tracking...
implemented separately from NAT
enables stateful filtering
implementation
hooks into NF_IP_PRE_ROUTING to track packets
hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
protocol modules (currently TCP/UDP/ICMP)
application helpers (currently FTP, IRC, H.323, talk, SNMP)
divides packets in the following four categories
NEW - would establish new connection
ESTABLISHED - part of already established connection
RELATED - is related to established connection
INVALID - (multicast, errors...)
does _NOT_ filter packets itself
can be utilized by iptables using the 'state' match
is used by NAT Subsystem
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Common structures
struct ip_conntrack_tuple, representing unidirectional flow
layer 3 src + dst
layer 4 protocol
layer 4 src + dst
connections represented as struct ip_conntrack
original tuple
reply tuple
timeout
l4 state private data
app helper
app helper private data
expected connections
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Flow of events for new packet
packet enters NF_IP_PRE_ROUTING
tuple is derived from packet
lookup conntrack hash table with hash(tuple) -> fails
new ip_conntrack is allocated
fill in original and reply == inverted(original) tuple
initialize timer
assign app helper if applicable
see if we've been expected -> fails
call layer 4 helper 'new' function
...
packet enters NF_IP_POST_ROUTING
do hashtable lookup for packet -> fails
place struct ip_conntrack in hashtable
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Flow of events for packet part of existing connection
packet enters NF_IP_PRE_ROUTING
tuple is derived from packet
lookup conntrack hash table with hash(tuple)
associate conntrack entry with skb->nfct
call l4 protocol helper 'packet' function
do l4 state tracking
update timeouts as needed [i.e. TCP TIME_WAIT,...]
...
packet enters NF_IP_POST_ROUTING
do hashtable lookup for packet -> succeeds
do nothing else
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Poor man's failover
Poor man's failover
principle
let every node do its own tracking rather than replicating state
two possible implementations
connect every node to shared media (i.e. real ethernet)
forwarding only turned on on active node
slave nodes use promiscuous mode to sniff packets
copy all traffic to slave nodes
active master needs to copy all traffic to other nodes
disadvantage: high load, sync traffic == payload traffic
IMHO stupid way of solving the problem
advantages
very easy implementation
only addition of sniffing mode to conntrack needed
existing means of address takeover can be used
same load on active master and slave nodes
no additional load on active master
disadvantages
can only be used with real shared media (no switches, ...)
can not be used with NAT
remaining problem
no initial state sync after reboot of slave node!
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Real state replication
Parts needed
state replication protocol
multicast based
sequence numbers for detection of packet loss
NACK-based retransmission
no security, since private ethernet segment to be used
event interface on active node
calling out to callback function at all state changes
exported interface to manipulate conntrack hash table
kernel thread for sending conntrack state protocol messages
registers with event interface
creates and accumulates state replication packets
sends them via in-kernel sockets api
kernel thread for receiving conntrack state replication messages
receives state replication packets via in-kernel sockets
uses conntrack hashtable manipulation interface
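One way the receiving side of such a protocol might detect loss and decide what to NACK, sketched in Python. The class ReplicationReceiver and its message format (an integer sequence number plus an opaque payload) are assumptions of this example; the slides do not specify a wire format.
%font "typewriter"
```python
class ReplicationReceiver:
    """Detects gaps in sequence numbers and reports what should be NACKed."""

    def __init__(self):
        self.next_seq = 0       # next sequence number expected in order
        self.pending = {}       # out-of-order messages, keyed by sequence number
        self.delivered = []     # payloads handed on in order

    def receive(self, seq, payload):
        """Accept one message; return the sequence numbers to NACK."""
        if seq < self.next_seq or seq in self.pending:
            return []                       # duplicate, nothing to request
        self.pending[seq] = payload
        while self.next_seq in self.pending:
            # deliver everything that has become contiguous
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        if not self.pending:
            return []
        highest = max(self.pending)
        # every number between next_seq and the highest buffered one is missing
        return [s for s in range(self.next_seq, highest)
                if s not in self.pending]
```
%font "standard"
With NACKs the sender only hears from a receiver when something was lost, which keeps the sync traffic low in the common case.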
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Real state replication
Flow of events in chronological order:
on active node, inside the network RX softirq
connection tracking code is analyzing a forwarded packet
connection tracking gathers some new state information
connection tracking updates local connection tracking database
connection tracking sends event message to event API
on active node, inside the conntrack-sync kernel thread
conntrack sync daemon receives event through event API
conntrack sync daemon aggregates multiple event messages into a state replication protocol message, removing possible redundancy
conntrack sync daemon generates state replication protocol message
conntrack sync daemon sends state replication protocol message
on slave node(s), inside network RX softirq
connection tracking code ignores packets coming from the interface attached to the private conntrack sync network
state replication protocol message is appended to socket receive queue of conntrack-sync kernel thread
on slave node(s), inside conntrack-sync kernel thread
conntrack sync daemon receives state replication message
conntrack sync daemon creates/updates conntrack entry
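The aggregation step on the active node could, for example, keep only the most recent state per connection. The following Python sketch assumes events are (connection id, state) pairs, which is an invention of this example rather than the actual event API.
%font "typewriter"
```python
def aggregate_events(events):
    """Collapse a batch of (conn_id, state) events to the latest state each.

    Connections keep the order in which they first appeared in the batch.
    """
    latest = {}
    for conn_id, state in events:
        latest[conn_id] = state     # a later event overrides an earlier one
    return list(latest.items())
```
%font "standard"
Three events for two connections thus shrink to two entries in the replication message, one per connection.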
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Necessary changes to kernel
Necessary changes to current conntrack core
event generation (callback functions) for all state changes
conntrack hashtable manipulation API
is needed (and already implemented) for 'ctnetlink' API
conntrack exemptions
needed to _not_ track conntrack state replication packets
is needed for other cases as well
currently being developed by Jozsef Kadlecsik
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Other current work
optimizing the conntrack code
hash function optimization
current hash function not good for even hash bucket count
other hash functions in development
hash function evaluation tool [cttest] available
introduce per-system randomness to prevent hash attack
code optimization (locking/timers/...)
getting our work submitted into the mainstream kernel
turns out to be more difficult
e.g. newnat api now waiting for three months
discussions about multiple targets/actions per rule
technical implementation easy
however, not everybody convinced that it fits into the concept
using tc for firewalling
Jamal Hadi Salim uses iptables targets from within TC
leads to discussion of generic classification engine API in kernel
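To illustrate the idea of per-system randomness mentioned under hash function optimization, the sketch below keys the bucket computation with a secret chosen once per system, so that an attacker who does not know the secret cannot precompute tuples that land in the same bucket. It uses Python's keyed BLAKE2 from hashlib purely as a stand-in; it is not the hash function used or proposed for conntrack, and make_bucket_fn is a hypothetical name.
%font "typewriter"
```python
import hashlib
import os


def make_bucket_fn(n_buckets, secret=None):
    """Return a function mapping a tuple to a bucket, keyed by a secret."""
    if secret is None:
        secret = os.urandom(16)     # chosen once, e.g. at boot

    def bucket(flow_tuple):
        data = repr(flow_tuple).encode()
        digest = hashlib.blake2b(data, key=secret, digest_size=8).digest()
        return int.from_bytes(digest, "big") % n_buckets

    return bucket
```
%font "standard"
Two systems with different secrets distribute the same set of tuples differently, while each system on its own stays deterministic.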
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Thanks
The slides and the accompanying paper of this presentation are available at http://www.gnumonks.org/
The netfilter homepage http://www.netfilter.org/
Thanks to
the BBS people, Z-Netz, FIDO, ...
for heavily increasing my computer usage in 1992
KNF
for bringing me in touch with the internet as early as 1994
for providing a playground for technical people
for telling me about the existence of Linux!
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
for implementing (one of?) the world's best TCP/IP stacks
Paul 'Rusty' Russell
for starting the netfilter/iptables project
for trusting me to maintain it today
Astaro AG
for sponsoring parts of my netfilter work


@ -0,0 +1,49 @@
Linux 2.4.x netfilter/iptables firewalling internals (lt-690870524)
The Linux 2.4.x kernel series has introduced a totally new kernel firewalling subsystem. It is much more than a plain successor of ipfwadm or ipchains.
The netfilter/iptables project has a very modular design and its
sub-projects can be split in several parts: netfilter, iptables, connection
tracking, NAT and packet mangling.
While most users will already have learned how to use the basic functions
of netfilter/iptables in order to convert their old ipchains firewalls to
iptables, there's more advanced but less used functionality in
netfilter/iptables.
The presentation covers the design principles behind the netfilter/iptables
implementation. This knowledge enables us to understand how the individual
parts of netfilter/iptables fit together, and for which potential applications
this is useful.
Topics covered:
- overview of the internal netfilter/iptables architecture
- the netfilter hooks inside the network protocol stacks
- packet selection with IP tables
- how are connection tracking and NAT integrated into the framework
- the connection tracking system
- how well does it track the TCP state?
- how does it track ICMP and UDP state at all?
- layer 4 protocol helpers (GRE, ...)
- application helpers (ftp, irc, h323, ...)
- restrictions/limitations
- the NAT system
- how does it interact with connection tracking?
- layer 4 protocol helpers
- application helpers (ftp, irc, ...)
- misc
- how far along is IPv6 firewalling with ip6tables?
- advances in failover/HA of stateful firewalls
- invisible firewalls with iptables on a bridge
- userspace packet queueing with QUEUE
- userspace packet logging with ULOG
Requirements:
- knowledge about the TCP/IP protocol family
- knowledge about general firewalling and packet filtering concepts
- prior experience with linux packet filters
Audience:
- firewall administrators
- network developers


@ -0,0 +1,520 @@
%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%back "blue"
%center
%size 7
Linux 2.4.x netfilter/iptables
firewalling internals
%center
%size 4
by
Harald Welte <laforge@gnumonks.org>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Contents
Introduction
Netfilter hooks in protocol stacks
Packet selection based on IP Tables
The Connection Tracking Subsystem
The NAT Subsystem based on netfilter + iptables
Packet filtering using the 'filter' table
Packet mangling using the 'mangle' table
Advanced netfilter concepts
Current development and Future
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Introduction
Why did we need netfilter/iptables?
Because ipchains...
has no infrastructure for passing packets to userspace
makes transparent proxying extremely difficult
has interface address dependent Packet filter rules
has Masquerading implemented as part of packet filtering
code is too complex and intermixed with core ipv4 stack
is neither modular nor extensible
only barely supports one special case of NAT (masquerading)
has only stateless packet filtering
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Introduction
Who's behind netfilter/iptables
Paul 'Rusty' Russell
co-author of ipchains in Linux 2.2
was paid by Watchguard for about one year of development
James Morris
userspace queuing (kernel, library and tools)
REJECT target
Marc Boucher
NAT and packet filtering controlled by one command
Mangle table
Harald Welte
Conntrack+NAT helper infrastructure (newnat)
Userspace packet logging (ULOG)
PPTP and IRC conntrack/NAT helpers
Jozsef Kadlecsik
TCP window tracking
H.323 conntrack + NAT helper
Continued newnat development
Non-core team contributors
http://www.netfilter.org/scoreboard/
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Netfilter Hooks
What is netfilter?
System of callback functions within network stack
Callback function to be called for every packet traversing certain point (hook) within network stack
Protocol independent framework
Hooks in layer 3 stacks (IPv4, IPv6, DECnet, ARP)
Multiple kernel modules can register with each of the hooks
Asynchronous packet handling in userspace (ip_queue)
Traditional packet filtering, NAT, ... is implemented on top of this framework
Can be used for other stuff interfacing with the core network stack, like DECnet routing daemon.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Netfilter Hooks
Netfilter architecture in IPv4
%font "courier"
--->[1]--->[ROUTE]--->[3]--->[4]--->
| ^
| |
| [ROUTE]
v |
[2] [5]
| ^
| |
v |
%font "standard"
1=NF_IP_PRE_ROUTING
2=NF_IP_LOCAL_IN
3=NF_IP_FORWARD
4=NF_IP_POST_ROUTING
5=NF_IP_LOCAL_OUT
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Netfilter Hooks
Netfilter Hooks
Any kernel module may register a callback function at any of the hooks
The module has to return one of the following constants
NF_ACCEPT continue traversal as normal
NF_DROP drop the packet, do not continue
NF_STOLEN I've taken over the packet; do not continue
NF_QUEUE enqueue packet to userspace
NF_REPEAT call this hook again
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
IP tables
Packet selection using IP tables
The kernel provides generic IP tables support
Each kernel module may create its own IP table
The three major parts of 2.4 firewalling subsystem are implemented using IP tables
Packet filtering table 'filter'
NAT table 'nat'
Packet mangling table 'mangle'
Can potentially be used for other stuff, e.g. IPsec SPDB
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
IP Tables
Managing chains and tables
An IP table consists of multiple chains
A chain consists of a list of rules
Every single rule in a chain consists of
match[es] (rule executed if all matches true)
target (what to do if the rule is matched)
%size 4
matches and targets can either be builtin or implemented as kernel modules
%size 6
The userspace tool iptables is used to control IP tables
handles all different kinds of IP tables
supports a plugin/shlib interface for target/match specific options
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
IP Tables
Basic iptables commands
To build a complete iptables command, we must specify
which table to work with
which chain in this table to use
an operation (insert, add, delete, modify)
one or more matches (optional)
a target
The syntax is
%font "typewriter"
%size 3
iptables -t table -Operation chain -j target match(es)
%font "standard"
%size 5
Example:
%font "typewriter"
%size 3
iptables -t filter -A INPUT -j ACCEPT -p tcp --dport smtp
%font "standard"
%size 5
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
IP Tables
Matches
Basic matches
-p protocol (tcp/udp/icmp/...)
-s source address (ip/mask)
-d destination address (ip/mask)
-i incoming interface
-o outgoing interface
Match extensions (examples)
tcp/udp TCP/udp source/destination port
icmp ICMP code/type
ah/esp AH/ESP SPI match
mac source MAC address
mark nfmark
length match on length of packet
limit rate limiting (n packets per timeframe)
owner owner uid of the socket sending the packet
tos TOS field of IP header
ttl TTL field of IP header
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
IP Tables
Targets
very dependent on the particular table.
Table specific targets will be discussed later
Generic Targets, always available
ACCEPT accept packet within chain
DROP silently drop packet
QUEUE enqueue packet to userspace
LOG log packet via syslog
ULOG log packet via ulogd
RETURN return to previous (calling) chain
foobar jump to user defined chain
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Packet Filtering
Overview
Implemented as 'filter' table
Registers with three netfilter hooks
NF_IP_LOCAL_IN (packets destined for the local host)
NF_IP_FORWARD (packets forwarded by local host)
NF_IP_LOCAL_OUT (packets from the local host)
Each of the three hooks has one chain attached (INPUT, FORWARD, OUTPUT)
Every packet passes exactly one of the three chains. Note that this is very different compared to the old 2.2.x ipchains behaviour.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Packet Filtering
Targets available within 'filter' table
Builtin Targets to be used in filter table
ACCEPT accept the packet
DROP silently drop the packet
QUEUE enqueue packet to userspace
RETURN return to previous (calling) chain
foobar user defined chain
Targets implemented as loadable modules
REJECT drop the packet but inform sender
MIRROR change source/destination IP and resend
LOG log via syslog
ULOG log via userspace
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Connection Tracking Subsystem
Connection tracking...
implemented separately from NAT
enables stateful filtering
implementation
hooks into NF_IP_PRE_ROUTING to track packets
hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
protocol modules (currently TCP/UDP/ICMP)
application helpers (currently FTP, IRC, H.323, talk, SNMP)
divides packets in the following four categories
NEW - would establish new connection
ESTABLISHED - part of already established connection
RELATED - is related to established connection
INVALID - (multicast, errors...)
does _NOT_ filter packets itself
can be utilized by iptables using the 'state' match
is used by NAT Subsystem
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Common structures
struct ip_conntrack_tuple, representing unidirectional flow
layer 3 src + dst
layer 4 protocol
layer 4 src + dst
connections represented as struct ip_conntrack
original tuple
reply tuple
timeout
l4 state private data
app helper
app helper private data
expected connections
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Flow of events for new packet
packet enters NF_IP_PRE_ROUTING
tuple is derived from packet
lookup conntrack hash table with hash(tuple) -> fails
new ip_conntrack is allocated
fill in original and reply == inverted(original) tuple
initialize timer
assign app helper if applicable
see if we've been expected -> fails
call layer 4 helper 'new' function
...
packet enters NF_IP_POST_ROUTING
do hashtable lookup for packet -> fails
place struct ip_conntrack in hashtable
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Flow of events for packet part of existing connection
packet enters NF_IP_PRE_ROUTING
tuple is derived from packet
lookup conntrack hash table with hash(tuple)
associate conntrack entry with skb->nfct
call l4 protocol helper 'packet' function
do l4 state tracking
update timeouts as needed [e.g. TCP TIME_WAIT,...]
...
packet enters NF_IP_POST_ROUTING
do hashtable lookup for packet -> succeeds
do nothing else
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Network Address Translation
Overview
Previous Linux Kernels only implemented one special case of NAT: Masquerading
Linux 2.4.x can do any kind of NAT.
NAT subsystem implemented on top of netfilter, iptables and conntrack
NAT subsystem registers with all five netfilter hooks
'nat' Table registers chains PREROUTING, POSTROUTING and OUTPUT
Following targets available within 'nat' Table
SNAT changes the packet's source while passing NF_IP_POST_ROUTING
DNAT changes the packet's destination while passing NF_IP_PRE_ROUTING
MASQUERADE is a special case of SNAT
REDIRECT is a special case of DNAT
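Why a connection carries two tuples becomes visible with SNAT: the reply tuple is no longer the plain inverse of the original tuple. A minimal Python sketch, assuming tuples of the form (src, dst, proto, sport, dport) and leaving out port remapping; the function names are hypothetical.
%font "typewriter"
```python
def invert(t):
    """Swap source and destination of a (src, dst, proto, sport, dport) tuple."""
    src, dst, proto, sport, dport = t
    return (dst, src, proto, dport, sport)


def apply_snat(original, new_src_ip):
    """Return (translated tuple, reply tuple) for a source-NATed connection."""
    src, dst, proto, sport, dport = original
    translated = (new_src_ip, dst, proto, sport, dport)
    # replies are addressed to the NAT address, not to the original source
    return translated, invert(translated)
```
%font "standard"
For a connection from 10.0.0.2 that is rewritten to 1.2.3.4, replies arrive with destination 1.2.3.4, so that is the address the reply tuple has to carry.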
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Network Address Translation
Source NAT
SNAT Example:
%font "typewriter"
%size 3
iptables -t nat -A POSTROUTING -j SNAT --to-source 1.2.3.4 -s 10.0.0.0/8
%font "standard"
%size 4
MASQUERADE Example:
%font "typewriter"
%size 3
iptables -t nat -A POSTROUTING -j MASQUERADE -o ppp0
%font "standard"
%size 5
Destination NAT
DNAT example
%font "typewriter"
%size 3
iptables -t nat -A PREROUTING -j DNAT --to-destination 1.2.3.4:8080 -p tcp --dport 80 -i eth1
%font "standard"
%size 4
REDIRECT example
%font "typewriter"
%size 3
iptables -t nat -A PREROUTING -j REDIRECT --to-port 3128 -i eth1 -p tcp --dport 80
%font "standard"
%size 5
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Packet Mangling
Purpose of mangle table
packet manipulation except address manipulation
Integration with netfilter
'mangle' table hooks in all five netfilter hooks
priority: after conntrack
Targets specific to the 'mangle' table:
DSCP - manipulate DSCP field
IPV4OPTSSTRIP - strip IPv4 options
MARK - change the nfmark field of the skb
TCPMSS - set TCP MSS option
TOS - manipulate the TOS bits
TTL - set / increase / decrease TTL field
Simple example:
%font "typewriter"
%size 3
iptables -t mangle -A PREROUTING -j MARK --set-mark 10 -p tcp --dport 80
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Advanced Netfilter concepts
%size 4
Userspace logging
flexible replacement for old syslog-based logging
packets to userspace via multicast netlink sockets
easy-to-use library (libipulog)
plugin-extensible userspace logging daemon (ulogd)
Can even be used to directly log into MySQL
Queuing
reliable asynchronous packet handling
packets to userspace via unicast netlink socket
easy-to-use library (libipq)
provides Perl bindings
experimental queue multiplex daemon (ipqmpd)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Current Development and Future
Netfilter (although it proved very stable) is still work in progress.
Areas of current development
infrastructure for conntrack manipulation from userspace
failover of stateful firewalls
making iptables layer3 independent (pkttables)
new userspace library (libiptables) to hide plugins from apps
more matches and targets for advanced functions (pool, hashslot)
more conntrack and NAT modules (RPC, SNMP, SMB, ...)
better IPv6 support (conntrack, more matches / targets)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Thanks
The slides and the accompanying paper of this presentation are available at http://www.gnumonks.org/
The netfilter homepage http://www.netfilter.org/
Thanks to
the BBS people, Z-Netz, FIDO, ...
for heavily increasing my computer usage in 1992
KNF
for bringing me in touch with the internet as early as 1994
for providing a playground for technical people
for telling me about the existence of Linux!
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
for implementing (one of?) the world's best TCP/IP stacks
Paul 'Rusty' Russell
for starting the netfilter/iptables project
for trusting me to maintain it today
Astaro AG
for sponsoring parts of my netfilter work


@ -0,0 +1,537 @@
\documentclass{article}
\usepackage{german}
\usepackage{fancyheadings}
\usepackage{a4}
\setlength{\oddsidemargin}{0in}
\setlength{\evensidemargin}{0in}
\setlength{\topmargin}{0.0in}
\setlength{\headheight}{0in}
\setlength{\headsep}{0in}
\setlength{\textwidth}{6.5in}
\setlength{\textheight}{9.5in}
\setlength{\parindent}{0in}
\setlength{\parskip}{0.05in}
\begin{document}
\title{Linux 2.4.x netfilter/iptables firewalling internals}
\author{Harald Welte\\
laforge@gnumonks.org\\
\copyright{}2002 H. Welte}
\date{25. April 2002}
\maketitle
\setcounter{section}{0}
\setcounter{subsection}{0}
\setcounter{subsubsection}{0}
\section{Introduction}
The Linux 2.4.x kernel series has introduced a totally new kernel firewalling
subsystem. It is much more than a plain successor of ipfwadm or ipchains.
The netfilter/iptables project has a very modular design and its
sub-projects can be split in several parts: netfilter, iptables, connection
tracking, NAT and packet mangling.
While most users will already have learned how to use the basic functions
of netfilter/iptables in order to convert their old ipchains firewalls to
iptables, there's more advanced but less used functionality in
netfilter/iptables.
The presentation covers the design principles behind the netfilter/iptables
implementation. This knowledge enables us to understand how the individual
parts of netfilter/iptables fit together, and for which potential applications
this is useful.
\section{Internal netfilter/iptables architecture}
\subsection{Netfilter hooks in protocol stacks}
One of the major motivations behind the redesign of the linux packet
filtering and NAT system during the 2.3.x kernel series was the widespread
firewall specific code parts within the core IPv4 stack. Ideally the core
IPv4 stack (as used by regular hosts and routers) shouldn't contain any
firewalling specific code, resulting in no unwanted interaction and less
code complexity. This desire led to the invention of {\it netfilter}.
\subsubsection{Architecture of netfilter}
Netfilter is basically a system of callback functions within the network
stack. It provides a non-portable API towards in-kernel networking
extensions.
What we call {\it netfilter hook} is a well-defined call-out point within a
layer three protocol stack, such as IPv4, IPv6 or DECnet. Any layer three
network stack can define an arbitrary number of hooks, usually placed at
strategic points within the packet flow.
Any other kernel code can subsequently register callback functions for
any of these hooks. As most systems will have more than one callback
function registered for a particular hook, a {\it priority} is specified upon
registration of the callback function. This priority defines the order in
which the individual callback functions at a particular hook are called.
The return value of any registered callback functions can be:
\begin{itemize}
\item
{\bf NF\_ACCEPT}: continue traversal as usual
\item
{\bf NF\_DROP}: drop the packet; do not continue traversal
\item
{\bf NF\_STOLEN}: callback function has taken over the packet; do not continue
\item
{\bf NF\_QUEUE}: enqueue the packet to userspace
\item
{\bf NF\_REPEAT}: call this hook again
\end{itemize}
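The dispatch logic described above can be sketched as follows. This is a userspace Python model, not kernel code: the class name {\it Hook} and its methods are assumptions of the sketch, the numeric verdict values are chosen for the example only, and the sketch assumes that callbacks with a numerically lower priority are called first.
\begin{verbatim}
```python
NF_DROP, NF_ACCEPT, NF_STOLEN, NF_QUEUE, NF_REPEAT = range(5)


class Hook:
    """One netfilter hook with its registered callbacks."""

    def __init__(self):
        self.callbacks = []                 # (priority, function) pairs

    def register(self, priority, fn):
        """Add a callback; lower priority values are called first."""
        self.callbacks.append((priority, fn))
        self.callbacks.sort(key=lambda item: item[0])

    def run(self, packet):
        """Call the callbacks in priority order until one ends traversal."""
        for _, fn in self.callbacks:
            verdict = fn(packet)
            while verdict == NF_REPEAT:     # call this callback again
                verdict = fn(packet)
            if verdict != NF_ACCEPT:        # DROP, STOLEN and QUEUE stop here
                return verdict
        return NF_ACCEPT
```
\end{verbatim}
Registering one callback with priority $-200$ and another with priority $100$ makes the first run before the second, whatever the order of registration.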
\subsubsection{Netfilter hooks within IPv4}
The IPv4 stack provides five netfilter hooks, which are placed at the
following particular places within the code:
\begin{verbatim}
--->[1]--->[ROUTE]--->[3]--->[4]--->
| ^
| |
| [ROUTE]
v |
[2] [5]
| ^
| |
v |
local processes
\end{verbatim}
Packets received on any network interface arrive at the left side of the
diagram. After the verification of the IP header checksum, the
NF\_IP\_PRE\_ROUTING [1] hook is traversed.
If the packet ``survives'' (i.e. NF\_ACCEPT is returned), it enters the
routing code. Where we continue from here depends on the destination of the
packet.
Packets with a local destination (i.e. packets where the destination address is
one of the own IP addresses of the host) traverse the NF\_IP\_LOCAL\_IN [2]
hook. If all callback functions return NF\_ACCEPT, the packet is finally passed
to the socket code, which eventually passes the packet to a local process.
Packets with a remote destination (i.e. packets which are forwarded by the
local machine) traverse the NF\_IP\_FORWARD [3] hook. If they ``survive'',
they finally pass the NF\_IP\_POST\_ROUTING [4] hook and are sent off the
outgoing network interface.
Locally generated packets first traverse the NF\_IP\_LOCAL\_OUT [5] hook, then
enter the routing code, and finally go through the NF\_IP\_POST\_ROUTING [4]
hook before being sent off the outgoing network interface.
\subsubsection{Netfilter hooks within IPv6}
As the IPv4 and IPv6 protocols are very similar, the netfilter hooks within the
IPv6 stack are placed at exactly the same locations as in the IPv4 stack. The
only difference is the hook names: NF\_IP6\_PRE\_ROUTING, NF\_IP6\_LOCAL\_IN,
NF\_IP6\_FORWARD, NF\_IP6\_POST\_ROUTING, NF\_IP6\_LOCAL\_OUT.
\subsubsection{Netfilter hooks within DECnet}
There are seven DECnet hooks. The first five hooks (NF\_DN\_PRE\_ROUTING,
NF\_DN\_LOCAL\_IN, NF\_DN\_FORWARD, NF\_DN\_LOCAL\_OUT, NF\_DN\_POST\_ROUTING)
are pretty much the same as in IPv4. The last two hooks (NF\_DN\_HELLO,
NF\_DN\_ROUTE) are used in conjunction with DECnet Hello and Routing packets.
\subsubsection{Netfilter hooks within ARP}
Recent kernels\footnote{IIRC, starting with 2.4.19-pre3} have added support for netfilter hooks within the ARP code.
There are two hooks: NF\_ARP\_IN and NF\_ARP\_OUT, for incoming and outgoing
ARP packets respectively.
\subsubsection{Netfilter hooks within IPX}
There have been experimental patches to add netfilter hooks to the IPX code,
but they never got integrated into the kernel source.
\subsection{Packet selection using IP Tables}
The IP tables core (ip\_tables.o) provides a generic layer for evaluation
of rulesets.
An IP table consists of an arbitrary number of {\it chains}, which in turn
consist of a linear list of {\it rules}, which again consist of any
number of {\it matches} and one {\it target}.
{\it Chains} can further be divided into two classes: {\it builtin
chains} and {\it user-defined chains}. Builtin chains are always present, they
are created upon table registration. They are also the entry points for table
iteration. User defined chains are created at runtime upon user interaction.
{\it Matches} specify the matching criteria; there can be zero or more matches per rule.
{\it Targets} specify the action which is to be executed in case {\bf all}
matches match. There can only be a single target per rule.
Matches and targets can either be {\it builtin} or {\it linux kernel modules}.
There are two special targets:
\begin{itemize}
\item
By using a chain name as target, it is possible to jump to the respective chain
in case the matches match.
\item
By using the RETURN target, it is possible to return to the previous (calling)
chain
\end{itemize}
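Rule evaluation, including jumps to user-defined chains and the RETURN target, could look roughly like this in Python. The table layout (a dictionary mapping chain names to lists of (matches, target) pairs) and the function name {\it traverse} are assumptions made for illustration; targets that do not end traversal, such as LOG, are not modelled.
\begin{verbatim}
```python
def traverse(table, chain_name, packet):
    """Evaluate one chain; return a verdict, or None if the chain returns."""
    for matches, target in table[chain_name]:
        if not all(match(packet) for match in matches):
            continue                # a rule applies only if ALL matches match
        if target == "RETURN":
            return None             # go back to the previous (calling) chain
        if target in table:         # the target names a chain: jump to it
            verdict = traverse(table, target, packet)
            if verdict is not None:
                return verdict
            continue                # the called chain returned; next rule
        return target               # terminal target such as ACCEPT or DROP
    return None                     # end of chain reached without a verdict
```
\end{verbatim}
A rule with an empty list of matches applies to every packet, which is how a catch-all rule at the end of a chain can be expressed in this model.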
The IP tables core handles the following functions
\begin{itemize}
\item
Registering and unregistering tables
\item
Registering and unregistering matches and targets (can be implemented as linux kernel modules)
\item
Kernel / userspace interface for manipulation of IP tables
\item
Traversal of IP tables
\end{itemize}
\subsubsection{Packet filtering using the ``filter'' table}
Traditional packet filtering (i.e. the successor to ipfwadm/ipchains) takes
place in the ``filter'' table. Packet filtering works like a sieve: A packet
is (in the end) either dropped or accepted - but never modified.
The ``filter'' table is implemented in the {\it iptable\_filter.o} module
and contains three builtin chains:
\begin{itemize}
\item
{\bf INPUT} attaches to NF\_IP\_LOCAL\_IN
\item
{\bf FORWARD} attaches to NF\_IP\_FORWARD
\item
{\bf OUTPUT} attaches to NF\_IP\_LOCAL\_OUT
\end{itemize}
The placement of the chains / hooks is done in such a way that every conceivable
packet always traverses only one of the built-in chains. Packets destined for
the local host traverse only INPUT, packets forwarded only FORWARD and
locally-originated packets only OUTPUT.
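That every packet passes exactly one built-in chain of the ``filter'' table follows from the hook layout. A small Python sketch makes this checkable; the dictionary names and the three packet kinds are assumptions of the sketch, and traffic over the loopback interface is ignored.
\begin{verbatim}
```python
# Hooks a surviving packet passes, in order, for each kind of packet.
HOOK_PATHS = {
    "to_local":   ["NF_IP_PRE_ROUTING", "NF_IP_LOCAL_IN"],
    "forwarded":  ["NF_IP_PRE_ROUTING", "NF_IP_FORWARD", "NF_IP_POST_ROUTING"],
    "from_local": ["NF_IP_LOCAL_OUT", "NF_IP_POST_ROUTING"],
}

# Built-in chains of the filter table and the hooks they attach to.
FILTER_CHAINS = {
    "NF_IP_LOCAL_IN": "INPUT",
    "NF_IP_FORWARD": "FORWARD",
    "NF_IP_LOCAL_OUT": "OUTPUT",
}


def filter_chains_for(kind):
    """Return the filter chains a packet of the given kind traverses."""
    return [FILTER_CHAINS[h] for h in HOOK_PATHS[kind] if h in FILTER_CHAINS]
```
\end{verbatim}
Because neither NF\_IP\_PRE\_ROUTING nor NF\_IP\_POST\_ROUTING carries a filter chain, each of the three paths contains exactly one hook with a chain attached.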
\subsubsection{Packet mangling using the ``mangle'' table}
As stated above, operations which would modify a packet do not belong in the
``filter'' table. The ``mangle'' table is available for all kinds of packet
manipulation - but not manipulation of addresses (which is NAT).
The mangle table attaches to all five netfilter hooks and provides the
respective builtin chains (PREROUTING, INPUT, FORWARD, OUTPUT, POSTROUTING)
\footnote{This has changed through recent 2.4.x kernel series, old kernels may
only support three (PREROUTING, POSTROUTING, OUTPUT) chains.}.
\subsection{Connection Tracking Subsystem}
Traditional packet filters can only match on matching criteria within the
currently processed packet, like source/destination IP address, port numbers,
TCP flags, etc. As most applications have a notion of connections or at least
a request/response style protocol, there is a lot of information which can not
be derived from looking at a single packet.
Thus, modern (stateful) packet filters attempt to track connections (flows)
and their respective protocol states for all traffic through the packet
filter.
Connection tracking within linux is implemented as a netfilter module, called
ip\_conntrack.o.
Before describing the connection tracking subsystem, we need to describe a couple of definitions and primitives used throughout the conntrack code.
A connection is represented within the conntrack subsystem using {\it struct
ip\_conntrack}, also called {\it connection tracking entry}.
Connection tracking utilizes {\it conntrack tuples}, which are tuples
consisting of (srcip, srcport, dstip, dstport, l4prot). A connection is
uniquely identified by two tuples: The tuple in the original direction
(IP\_CT\_DIR\_ORIGINAL) and the tuple for the reply direction
(IP\_CT\_DIR\_REPLY).
Connection tracking itself does not drop packets\footnote{well, in some rare
cases in combination with NAT it needs to drop. But don't tell anyone, this is
secret.} or impose any policy. It just associates every packet with a
connection tracking entry, which in turn has a particular state. All other
kernel code can use this state information\footnote{state information is
internally represented via the {\it struct sk\_buff.nfct} structure member of a
packet.}.
\subsubsection{Integration of conntrack with netfilter}
If the ip\_conntrack.o module is registered with netfilter, it attaches to the
NF\_IP\_PRE\_ROUTING, NF\_IP\_POST\_ROUTING, NF\_IP\_LOCAL\_IN and
NF\_IP\_LOCAL\_OUT hooks.
Because forwarded packets are the most common case on firewalls, I will only
describe how connection tracking works for forwarded packets. The two relevant
hooks for forwarded packets are NF\_IP\_PRE\_ROUTING and NF\_IP\_POST\_ROUTING.
Every time a packet arrives at the NF\_IP\_PRE\_ROUTING hook, connection
tracking creates a conntrack tuple from the packet. It then compares this
tuple to the original and reply tuples of all already-seen connections
\footnote{Of course this is not implemented as a linear search over all existing connections.} to find out if this just-arrived packet belongs to any existing
connection. If there is no match, a new conntrack table entry (struct
ip\_conntrack) is created.
Let's assume the case where we have no existing connections and are
starting from scratch.
The first packet comes in. We derive the tuple from the packet headers, look
it up in the conntrack hash table, and don't find any matching entry. As a result, we
create a new struct ip\_conntrack. This struct ip\_conntrack is filled with
all necessary data, like the original and reply tuple of the connection.
How do we know the reply tuple? By inverting the source and destination
parts of the original tuple.\footnote{So why do we need two tuples, if they can
be derived from each other? Wait until we discuss NAT.}
Please note that this new struct ip\_conntrack is {\bf not} yet placed
into the conntrack hash table.
The packet is now passed on to other callback functions which have registered
with a lower priority at NF\_IP\_PRE\_ROUTING. It then continues traversal of
the network stack as usual, including all respective netfilter hooks.
If the packet survives (i.e. is not dropped by the routing code, network stack,
firewall ruleset, ...), it re-appears at NF\_IP\_POST\_ROUTING. In this case,
we can now safely assume that this packet will be sent off on the outgoing
interface, and thus put the connection tracking entry which we created at
NF\_IP\_PRE\_ROUTING into the conntrack hash table. This process is called
{\it confirming the conntrack}.
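The lookup, creation and confirmation steps described above can be sketched as a small userspace model. This is only an illustration in Python, not the kernel code: the names ConnTuple, Conntrack, ConntrackTable, pre\_routing and post\_routing are hypothetical, port numbers and protocols are ignored, and a plain dictionary stands in for the conntrack hash table.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ConnTuple:
    """Simplified conntrack tuple: addresses only (hypothetical model)."""
    src: str
    dst: str

    def invert(self) -> "ConnTuple":
        # The reply tuple is the original tuple with source and destination swapped.
        return ConnTuple(src=self.dst, dst=self.src)


@dataclass
class Conntrack:
    original: ConnTuple
    reply: ConnTuple
    confirmed: bool = False


class ConntrackTable:
    def __init__(self) -> None:
        # A confirmed entry is reachable via both its original and its reply tuple.
        self._hash: Dict[ConnTuple, Conntrack] = {}

    def lookup(self, t: ConnTuple) -> Optional[Conntrack]:
        return self._hash.get(t)

    def pre_routing(self, src: str, dst: str) -> Conntrack:
        """Model of the NF_IP_PRE_ROUTING step: find or create an entry."""
        t = ConnTuple(src, dst)
        ct = self.lookup(t)
        if ct is None:
            # New connection: created, but NOT yet placed into the hash table.
            ct = Conntrack(original=t, reply=t.invert())
        return ct

    def post_routing(self, ct: Conntrack) -> None:
        """Model of the NF_IP_POST_ROUTING step: confirm the conntrack."""
        if not ct.confirmed:
            self._hash[ct.original] = ct
            self._hash[ct.reply] = ct
            ct.confirmed = True


table = ConntrackTable()
ct = table.pre_routing("1.1.1.1", "2.2.2.2")
# Before confirmation the entry cannot be found by a lookup.
unconfirmed_visible = table.lookup(ConnTuple("1.1.1.1", "2.2.2.2")) is not None
table.post_routing(ct)
# A packet in the reply direction now maps to the same entry.
reply_ct = table.pre_routing("2.2.2.2", "1.1.1.1")
```

If the packet were dropped between the two hooks, post\_routing would never run and the unconfirmed entry would simply be discarded in this model.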
The connection tracking code itself is not monolithic, but consists of a
couple of separate modules\footnote{They don't actually have to be separate
kernel modules; e.g. TCP, UDP and ICMP tracking modules are all part of
the linux kernel module ip\_conntrack.o}. Besides the conntrack core, there
are two important kinds of modules: Protocol helpers and application helpers.
Protocol helpers implement the layer-4-protocol specific parts. They currently
exist for TCP, UDP and ICMP (an experimental helper for GRE exists).
\subsubsection{TCP connection tracking}
As TCP is a connection oriented protocol, it is not very difficult to imagine
how connection tracking for this protocol could work. There are well-defined
state transitions possible, and conntrack can decide which state transitions
are valid within the TCP specification. In reality it's not all that easy,
since we cannot assume that all packets that pass the packet filter actually
arrive at the receiving end, ...
It is noteworthy that the standard connection tracking code does {\bf not}
do TCP sequence number and window tracking. A well-maintained patch to add
this feature has existed almost as long as connection tracking itself. It will
be integrated with the 2.5.x kernel. The problem with window tracking is
its bad interaction with connection pickup. The TCP conntrack code is able to
pick up already existing connections, e.g. in case your firewall was rebooted.
However, connection pickup is conflicting with TCP window tracking: The TCP
window scaling option is only transferred at connection setup time, and we
don't know about it in case of pickup...
\subsubsection{ICMP tracking}
ICMP is not really a connection oriented protocol. So how is it possible to
do connection tracking for ICMP?
The ICMP protocol can be split in two groups of messages:
\begin{itemize}
\item
ICMP error messages, which sort-of belong to a different connection.
ICMP error messages are marked as {\it RELATED} to that connection
(ICMP\_DEST\_UNREACH, ICMP\_SOURCE\_QUENCH, ICMP\_TIME\_EXCEEDED,
ICMP\_PARAMETERPROB, ICMP\_REDIRECT).
\item
ICMP queries, which have a request->reply character. So what the conntrack
code does, is let the request have a state of {\it NEW}, and the reply
{\it ESTABLISHED}. The reply closes the connection immediately.
(ICMP\_ECHO, ICMP\_TIMESTAMP, ICMP\_INFO\_REQUEST, ICMP\_ADDRESS)
\end{itemize}
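The split into the two groups can be illustrated with a small classifier. This is a sketch in Python under stated assumptions: the type numbers are the ones from the linux/icmp.h header, while the function classify\_icmp, its parameters, and the choice to return INVALID for anything that cannot be associated are hypothetical simplifications.

```python
# ICMP type numbers as defined in <linux/icmp.h>
ICMP_ECHOREPLY = 0
ICMP_DEST_UNREACH = 3
ICMP_SOURCE_QUENCH = 4
ICMP_REDIRECT = 5
ICMP_ECHO = 8
ICMP_TIME_EXCEEDED = 11
ICMP_PARAMETERPROB = 12
ICMP_TIMESTAMP = 13
ICMP_TIMESTAMPREPLY = 14
ICMP_INFO_REQUEST = 15
ICMP_INFO_REPLY = 16
ICMP_ADDRESS = 17
ICMP_ADDRESSREPLY = 18

# Error messages refer to a packet of some other connection.
ERROR_TYPES = {
    ICMP_DEST_UNREACH, ICMP_SOURCE_QUENCH, ICMP_TIME_EXCEEDED,
    ICMP_PARAMETERPROB, ICMP_REDIRECT,
}

# Queries and the reply type that answers each of them.
QUERY_REPLY = {
    ICMP_ECHO: ICMP_ECHOREPLY,
    ICMP_TIMESTAMP: ICMP_TIMESTAMPREPLY,
    ICMP_INFO_REQUEST: ICMP_INFO_REPLY,
    ICMP_ADDRESS: ICMP_ADDRESSREPLY,
}


def classify_icmp(icmp_type: int, request_seen: bool = False,
                  inner_conn_known: bool = False) -> str:
    """Return the conntrack state a packet of this ICMP type would get.

    request_seen:     a matching query was tracked before (for replies)
    inner_conn_known: the packet quoted inside an error message belongs
                      to a connection we already track
    """
    if icmp_type in ERROR_TYPES:
        return "RELATED" if inner_conn_known else "INVALID"
    if icmp_type in QUERY_REPLY:
        return "NEW"
    if icmp_type in QUERY_REPLY.values():
        return "ESTABLISHED" if request_seen else "INVALID"
    return "INVALID"
```

For example, an echo request is classified as NEW, and the echo reply that answers a tracked request as ESTABLISHED.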
\subsubsection{UDP connection tracking}
UDP is designed as a connectionless datagram protocol. But most common
protocols using UDP as layer 4 protocol have bi-directional UDP communication.
Imagine a DNS query, where the client sends a UDP datagram to port 53 of the
nameserver, and the nameserver sends back a DNS reply packet from its UDP
port 53 to the client.
Netfilter treats this as a connection. The first packet (the DNS request) is
assigned a state of {\it NEW}, because the packet is expected to create a new
'connection'. The DNS server's reply packet is marked as {\it ESTABLISHED}.
\subsubsection{conntrack application helpers}
More complex application protocols involving multiple connections need special
support by a so-called ``conntrack application helper module''. Modules in
the stock kernel come for FTP and IRC(DCC). Netfilter CVS currently contains
patches for PPTP, H.323, Eggdrop botnet, tftp and talk. We're still lacking
a lot of protocols (e.g. SIP, SMB/CIFS) - but they are unlikely to appear
until somebody really needs them and either develops them on his own or
funds development.
\subsubsection{Integration of connection tracking with iptables}
As stated earlier, conntrack doesn't impose any policy on packets. It just
determines the relation of a packet to already existing connections. To base
packet filtering decisions on this state information, the iptables {\it state}
match can be used. Every packet is within one of the following categories:
\begin{itemize}
\item
{\bf NEW}: packet would create a new connection, if it survives
\item
{\bf ESTABLISHED}: packet is part of an already established connection
(either direction)
\item
{\bf RELATED}: packet is in some way related to an already established connection, e.g. ICMP errors or FTP data sessions
\item
{\bf INVALID}: conntrack is unable to derive conntrack information from this packet. Please note that all multicast or broadcast packets fall in this category.
\end{itemize}
\subsection{NAT Subsystem}
The NAT (Network Address Translation) subsystem is probably the worst
documented subsystem within the whole framework. This has two reasons: NAT is
nasty and complicated. The Linux 2.4.x NAT implementation is easy to use, so
nobody needs to know the nasty details.
Nonetheless, as I was traditionally concentrating most on the conntrack and NAT
systems, I will give a short overview.
NAT uses almost all of the previously described subsystems:
\begin{itemize}
\item
IP tables to specify which packets to NAT in which particular way. NAT
registers a ``nat'' table with PREROUTING, POSTROUTING and OUTPUT chains.
\item
Connection tracking to associate NAT state with the connection.
\item
Netfilter to do the actual packet manipulation transparent to the rest of the
kernel. NAT registers with NF\_IP\_PRE\_ROUTING, NF\_IP\_POST\_ROUTING,
NF\_IP\_LOCAL\_IN and NF\_IP\_LOCAL\_OUT.
\end{itemize}
The NAT implementation supports all kinds of different NAT: Source NAT,
Destination NAT, NAT to address/port ranges, 1:1 NAT, ...
This fundamental design principle is still frequently misunderstood:\\
The information about which NAT mappings apply to a certain connection
is only gathered once - with the first packet of every connection.
So let's start to look at the life of a poor to-be-nat'ed packet.
For ease of understanding, I have chosen to describe the most frequently
used NAT scenario: Source NAT of a forwarded packet. Let's assume the
packet has an original source address of 1.1.1.1, an original destination
address of 2.2.2.2, and is going to be SNAT'ed to 9.9.9.9. Let's further
ignore the fact that there are port numbers.
Once upon a time, our poor packet arrives at NF\_IP\_PRE\_ROUTING, where
conntrack has registered with highest priority. This means that a conntrack
entry with the following two tuples is created:
\begin{verbatim}
IP_CT_DIR_ORIGINAL: 1.1.1.1 -> 2.2.2.2
IP_CT_DIR_REPLY: 2.2.2.2 -> 1.1.1.1
\end{verbatim}
After conntrack, the packet traverses the PREROUTING chain of the ``nat''
IP table. Since only destination NAT happens at PREROUTING, no action
occurs. After its lengthy way through the rest of the network stack,
the packet arrives at the NF\_IP\_POST\_ROUTING hook, where it traverses
the POSTROUTING chain of the ``nat'' table. Here it hits a SNAT rule,
causing the following actions:
\begin{itemize}
\item
Fill in a {\it struct ip\_nat\_manip}, indicating the new source address
and the type of NAT (source NAT at POSTROUTING). This struct is part of the
conntrack entry.
\item
Automatically derive the inverse NAT transformation for the reply packets:
Destination NAT at PREROUTING. Fill in another {\it struct ip\_nat\_manip}.
\item
Alter the REPLY tuple of the conntrack entry to
\begin{verbatim}
IP_CT_DIR_REPLY: 2.2.2.2 -> 9.9.9.9
\end{verbatim}
\item
Apply the SNAT transformation to the packet
\end{itemize}
Every other packet within this connection, independent of its direction,
will only execute the last step. Since all NAT information is connected
with the conntrack entry, there is no need to do anything but to apply
the same transformations to all packets within the same connection.
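The SNAT example above (1.1.1.1 to 2.2.2.2, source rewritten to 9.9.9.9) can be walked through in a short Python model. It is a sketch under stated assumptions, not the kernel's data layout: the classes Packet and NatConntrack and the functions setup\_snat and apply\_nat are hypothetical, a single snat\_to field stands in for the two struct ip\_nat\_manip entries, and port numbers are ignored as in the text.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Packet:
    src: str
    dst: str


@dataclass
class NatConntrack:
    original: Tuple[str, str]       # (source, destination), original direction
    reply: Tuple[str, str]          # (source, destination), reply direction
    snat_to: Optional[str] = None   # stands in for the ip_nat_manip information


def setup_snat(ct: NatConntrack, new_src: str) -> None:
    """Executed once, when the first packet hits the SNAT rule."""
    ct.snat_to = new_src
    # Replies will be addressed to the new source, so the REPLY tuple changes.
    ct.reply = (ct.reply[0], new_src)


def apply_nat(ct: NatConntrack, pkt: Packet) -> Packet:
    """Executed for every packet of the connection, in both directions."""
    if ct.snat_to is None:
        return pkt
    if (pkt.src, pkt.dst) == ct.original:
        # Original direction: source NAT.
        pkt.src = ct.snat_to
    elif (pkt.src, pkt.dst) == ct.reply:
        # Reply direction: the inverse mapping, i.e. destination NAT.
        pkt.dst = ct.original[0]
    return pkt


ct = NatConntrack(original=("1.1.1.1", "2.2.2.2"), reply=("2.2.2.2", "1.1.1.1"))
setup_snat(ct, "9.9.9.9")
outgoing = apply_nat(ct, Packet("1.1.1.1", "2.2.2.2"))
incoming = apply_nat(ct, Packet("2.2.2.2", "9.9.9.9"))
```

Note that the rule set is consulted only in setup\_snat; apply\_nat needs nothing but the conntrack entry, which is the design principle stated above.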
\subsection{IPv6 Firewalling with ip6tables}
Yes, Linux 2.4.x comes with a usable, though incomplete system to secure
your IPv6 network.
The parts ported to IPv6 are
\begin{itemize}
\item
IP tables (called IP6 tables)
\item
The ``filter'' table
\item
The ``mangle'' table
\item
The userspace library (libip6tc)
\item
The command line tool (ip6tables)
\end{itemize}
Due to the lack of conntrack and NAT\footnote{for god's sake we don't have NAT
with IPv6}, only traditional, stateless packet filtering is possible. Apart
from the obvious matches/targets, ip6tables can match on
\begin{itemize}
\item
{\it EUI64 checker}; verifies if the MAC address of the sender is the same as in the EUI64 64 least significant bits of the source IPv6 address
\item
{\it frag6 match}, matches on IPv6 fragmentation header
\item
{\it route6 match}, matches on IPv6 routing header
\item
{\it ahesp6 match}, matches on SPIDs within AH or ESP over IPv6 packets
\end{itemize}
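To illustrate what the EUI64 check compares, here is a Python sketch. It assumes the standard modified EUI-64 construction used for IPv6 interface identifiers (flip the universal/local bit of the first MAC octet and insert ff:fe in the middle); the function names are hypothetical and the kernel match may perform additional checks.

```python
import ipaddress


def mac_to_eui64(mac: str) -> bytes:
    """Build the 64-bit modified EUI-64 interface identifier from a MAC address."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Flip the universal/local bit and insert ff:fe between the two halves.
    return bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:6]


def eui64_matches(mac: str, ipv6_src: str) -> bool:
    """True if the 64 least significant bits of the address derive from the MAC."""
    addr = ipaddress.IPv6Address(ipv6_src)
    return addr.packed[8:] == mac_to_eui64(mac)
```

With this model, the MAC address 00:11:22:33:44:55 matches the source address fe80::211:22ff:fe33:4455 but not fe80::1.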
However, the ip6tables code doesn't seem to be used very widely (yet?).
So please expect some potential remaining issues, since it is not tested
as heavily as iptables.
\subsection{Recent Development}
Please refer to the spoken word at the presentation. Development at the
time this paper was written can be quite different from development at the
time the presentation is held.
\section{Thanks}
I'd like to thank
\begin{itemize}
\item
{\it Linus Torvalds} for starting this interesting UNIX-like kernel
\item
{\it Alan Cox, David Miller, Alexey Kuznetsov, Andi Kleen} for building
(one of?) the world's best TCP/IP stacks.
\item
{\it Paul ``Rusty'' Russell} for starting the netfilter/iptables project
\item
{\it The Netfilter Core Team} for continuing the netfilter/iptables effort
\item
{\it Astaro AG} for partially funding my current netfilter/iptables work
\item
{\it Conectiva Inc.} for partially funding parts of my past netfilter/iptables
work and for inviting me to live in Brazil
\item
{\it samba.org and Kommunikationsnetz Franken e.V.} for hosting the netfilter
homepage, CVS, mailing lists, ...
\end{itemize}
\end{document}


@ -0,0 +1,49 @@
Linux 2.4.x netfilter/iptables firewalling internals (lt-690870524)
The Linux 2.4.x kernel series has introduced a totally new kernel firewalling subsystem. It is much more than a plain successor of ipfwadm or ipchains.
The netfilter/iptables project has a very modular design and its
sub-projects can be split in several parts: netfilter, iptables, connection
tracking, NAT and packet mangling.
While most users will already have learned how to use the basic functions
of netfilter/iptables in order to convert their old ipchains firewalls to
iptables, there's more advanced but less used functionality in
netfilter/iptables.
The presentation covers the design principles behind the netfilter/iptables
implementation. This knowledge enables us to understand how the individual
parts of netfilter/iptables fit together, and for which potential applications
this is useful.
Topics covered:
- overview about the internal netfilter/iptables architecture
- the netfilter hooks inside the network protocol stacks
- packet selection with IP tables
- how are connection tracking and NAT integrated into the framework
- the connection tracking system
- how well does it track the TCP state?
- how does it track ICMP and UDP state at all?
- layer 4 protocol helpers (GRE, ...)
- application helpers (ftp, irc, h323, ...)
- restrictions/limitations
- the NAT system
- how does it interact with connection tracking?
- layer 4 protocol helpers
- application helpers (ftp, irc, ...)
- misc
- how far is IPv6 firewalling with ip6tables?
- advances in failover/HA of stateful firewalls
- invisible firewalls with iptables on a bridge
- userspace packet queueing with QUEUE
- userspace packet logging with ULOG
Requirements:
- knowledge about the TCP/IP protocol family
- knowledge about general firewalling and packet filtering concepts
- prior experience with linux packet filters
Audience:
- firewall administrators
- network developers


@ -0,0 +1,22 @@
<a href="http://www.gnumonks.org/users/laforge/">Harald Welte</a> is one
of the five <a href="http://www.netfilter.org/">netfilter/iptables</a> core
team members, and the current Linux 2.4.x firewalling maintainer.
His main interest in computing has always been networking. In the little time
left besides netfilter/iptables related work, he's writing obscure documents
like the <a href="http://www.gnumonks.org/ftp/pub/doc/uucp-over-ssl.html"> UUCP
over SSL HOWTO</a>. Other kernel-related projects he has been contributing to are
user mode linux and the international (crypto) kernel patch.
In the past he has been working as an independent IT Consultant on
closed-source projects for various companies ranging from banks to
manufacturers of networking gear. During the year 2001 he was living in
Curitiba (Brazil), where he got sponsored for his Linux related work by
<a href="http://www.conectiva.com/">Conectiva Inc.</a>.
Starting with February 2002, Harald has been contracted part-time by
<a href="http://www.astaro.com/">Astaro AG</a>, who are sponsoring him for his
current netfilter/iptables work.
Harald is living in Erlangen, Germany.


@ -0,0 +1,466 @@
%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%back "blue"
%center
%size 7
Linux 2.4.x netfilter/iptables
firewalling internals
%center
%size 4
by
Harald Welte <laforge@gnumonks.org>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Contents
Introduction
Netfilter hooks in protocol stacks
Packet selection based on IP Tables
The Connection Tracking Subsystem
The NAT Subsystem based on netfilter + iptables
Packet filtering using the 'filter' table
Packet mangling using the 'mangle' table
Advanced netfilter concepts
Current development and Future
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Introduction
Why did we need netfilter/iptables?
Because ipchains...
has no infrastructure for passing packets to userspace
makes transparent proxying extremely difficult
has interface address dependent Packet filter rules
has Masquerading implemented as part of packet filtering
code is too complex and intermixed with core ipv4 stack
is neither modular nor extensible
only barely supports one special case of NAT (masquerading)
has only stateless packet filtering
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Introduction
Who's behind netfilter/iptables
Paul 'Rusty' Russell
co-author of ipchains in Linux 2.2
was paid by Watchguard for about one year of development
James Morris
userspace queuing (kernel, library and tools)
REJECT target
Marc Boucher
NAT and packet filtering controlled by one command
Mangle table
Harald Welte
Conntrack+NAT helper infrastructure (newnat)
Userspace packet logging (ULOG)
PPTP and IRC conntrack/NAT helpers
Jozsef Kadlecsik
TCP window tracking
H.323 conntrack + NAT helper
Continued newnat development
Non-core team contributors
http://www.netfilter.org/scoreboard/
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Netfilter Hooks
What is netfilter?
System of callback functions within network stack
Callback function to be called for every packet traversing certain point (hook) within network stack
Protocol independent framework
Hooks in layer 3 stacks (IPv4, IPv6, DECnet, ARP)
Multiple kernel modules can register with each of the hooks
Asynchronous packet handling in userspace (ip_queue)
Traditional packet filtering, NAT, ... is implemented on top of this framework
Can be used for other stuff interfacing with the core network stack, like DECnet routing daemon.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Netfilter Hooks
Netfilter architecture in IPv4
%font "courier"
--->[1]--->[ROUTE]--->[3]--->[4]--->
| ^
| |
| [ROUTE]
v |
[2] [5]
| ^
| |
v |
%font "standard"
1=NF_IP_PRE_ROUTING
2=NF_IP_LOCAL_IN
3=NF_IP_FORWARD
4=NF_IP_POST_ROUTING
5=NF_IP_LOCAL_OUT
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Netfilter Hooks
Netfilter Hooks
Any kernel module may register a callback function at any of the hooks
The module has to return one of the following constants
NF_ACCEPT continue traversal as normal
NF_DROP drop the packet, do not continue
NF_STOLEN I've taken over the packet, do not continue
NF_QUEUE enqueue packet to userspace
NF_REPEAT call this hook again
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
IP tables
Packet selection using IP tables
The kernel provides generic IP tables support
Each kernel module may create its own IP table
The three major parts of 2.4 firewalling subsystem are implemented using IP tables
Packet filtering table 'filter'
NAT table 'nat'
Packet mangling table 'mangle'
Can potentially be used for other stuff, e.g. IPsec SPDB
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
IP Tables
Managing chains and tables
An IP table consists of multiple chains
A chain consists of a list of rules
Every single rule in a chain consists of
match[es] (rule executed if all matches true)
target (what to do if the rule is matched)
%size 4
matches and targets can either be builtin or implemented as kernel modules
%size 6
The userspace tool iptables is used to control IP tables
handles all different kinds of IP tables
supports a plugin/shlib interface for target/match specific options
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
IP Tables
Basic iptables commands
To build a complete iptables command, we must specify
which table to work with
which chain in this table to use
an operation (insert, add, delete, modify)
one or more matches (optional)
a target
The syntax is
%font "typewriter"
%size 3
iptables -t table -Operation chain -j target match(es)
%font "standard"
%size 5
Example:
%font "typewriter"
%size 3
iptables -t filter -A INPUT -j ACCEPT -p tcp --dport smtp
%font "standard"
%size 5
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
IP Tables
Matches
Basic matches
-p protocol (tcp/udp/icmp/...)
-s source address (ip/mask)
-d destination address (ip/mask)
-i incoming interface
-o outgoing interface
Match extensions (examples)
tcp/udp TCP/udp source/destination port
icmp ICMP code/type
ah/esp AH/ESP SPID match
mac source MAC address
mark nfmark
length match on length of packet
limit rate limiting (n packets per timeframe)
owner owner uid of the socket sending the packet
tos TOS field of IP header
ttl TTL field of IP header
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
IP Tables
Targets
very dependent on the particular table.
Table specific targets will be discussed later
Generic Targets, always available
ACCEPT accept packet within chain
DROP silently drop packet
QUEUE enqueue packet to userspace
LOG log packet via syslog
ULOG log packet via ulogd
RETURN return to previous (calling) chain
foobar jump to user defined chain
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Packet Filtering
Overview
Implemented as 'filter' table
Registers with three netfilter hooks
NF_IP_LOCAL_IN (packets destined for the local host)
NF_IP_FORWARD (packets forwarded by local host)
NF_IP_LOCAL_OUT (packets from the local host)
Each of the three hooks has attached one chain (INPUT, FORWARD, OUTPUT)
Every packet passes exactly one of the three chains. Note that this is very different compared to the old 2.2.x ipchains behaviour.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Packet Filtering
Targets available within 'filter' table
Builtin Targets to be used in filter table
ACCEPT accept the packet
DROP silently drop the packet
QUEUE enqueue packet to userspace
RETURN return to previous (calling) chain
foobar user defined chain
Targets implemented as loadable modules
REJECT drop the packet but inform sender
MIRROR change source/destination IP and resend
LOG log via syslog
ULOG log via userspace
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Connection Tracking Subsystem
Connection tracking...
implemented separately from NAT
enables stateful filtering
implementation
hooks into NF_IP_PRE_ROUTING to track packets
hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
protocol modules (currently TCP/UDP/ICMP)
application helpers (currently FTP, IRC, H.323, talk, SNMP)
divides packets in the following four categories
NEW - would establish new connection
ESTABLISHED - part of already established connection
RELATED - is related to established connection
INVALID - (multicast, errors...)
does _NOT_ filter packets itself
can be utilized by iptables using the 'state' match
is used by NAT Subsystem
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Network Address Translation
Overview
Previous Linux Kernels only implemented one special case of NAT: Masquerading
Linux 2.4.x can do any kind of NAT.
NAT subsystem implemented on top of netfilter, iptables and conntrack
NAT subsystem registers with four netfilter hooks (all except NF_IP_FORWARD)
'nat' Table registers chains PREROUTING, POSTROUTING and OUTPUT
Following targets available within 'nat' Table
SNAT changes the packet's source while passing NF_IP_POST_ROUTING
DNAT changes the packet's destination while passing NF_IP_PRE_ROUTING
MASQUERADE is a special case of SNAT
REDIRECT is a special case of DNAT
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Network Address Translation
Source NAT
SNAT Example:
%font "typewriter"
%size 3
iptables -t nat -A POSTROUTING -j SNAT --to-source 1.2.3.4 -s 10.0.0.0/8
%font "standard"
%size 4
MASQUERADE Example:
%font "typewriter"
%size 3
iptables -t nat -A POSTROUTING -j MASQUERADE -o ppp0
%font "standard"
%size 5
Destination NAT
DNAT example
%font "typewriter"
%size 3
iptables -t nat -A PREROUTING -j DNAT --to-destination 1.2.3.4:8080 -p tcp --dport 80 -i eth1
%font "standard"
%size 4
REDIRECT example
%font "typewriter"
%size 3
iptables -t nat -A PREROUTING -j REDIRECT --to-port 3128 -i eth1 -p tcp --dport 80
%font "standard"
%size 5
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Packet Mangling
Purpose of mangle table
packet manipulation except address manipulation
Integration with netfilter
'mangle' table hooks in all five netfilter hooks
priority: after conntrack
Targets specific to the 'mangle' table:
DSCP - manipulate DSCP field
IPV4OPTSSTRIP - strip IPv4 options
MARK - change the nfmark field of the skb
TCPMSS - set TCP MSS option
TOS - manipulate the TOS bits
TTL - set / increase / decrease TTL field
Simple example:
%font "typewriter"
%size 3
iptables -t mangle -A PREROUTING -j MARK --set-mark 10 -p tcp --dport 80
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Advanced Netfilter concepts
%size 4
Userspace logging
flexible replacement for old syslog-based logging
packets to userspace via multicast netlink sockets
easy-to-use library (libipulog)
plugin-extensible userspace logging daemon (ulogd)
Can even be used to directly log into MySQL
Queuing
reliable asynchronous packet handling
packets to userspace via unicast netlink socket
easy-to-use library (libipq)
provides Perl bindings
experimental queue multiplex daemon (ipqmpd)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Current Development and Future
Netfilter (although it proved very stable) is still work in progress.
Areas of current development
infrastructure for conntrack manipulation from userspace
failover of stateful firewalls
making iptables layer3 independent (pkttables)
new userspace library (libiptables) to hide plugins from apps
more matches and targets for advanced functions (pool, hashslot)
more conntrack and NAT modules (RPC, SNMP, SMB, ...)
better IPv6 support (conntrack, more matches / targets)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Thanks
Thanks to
the BBS people, Z-Netz, FIDO, ...
for heavily increasing my computer usage in 1992
KNF
for bringing me in touch with the internet as early as 1994
for providing a playground for technical people
for telling me about the existence of Linux!
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
for implementing (one of?) the world's best TCP/IP stacks
Paul 'Rusty' Russell
for starting the netfilter/iptables project
for trusting me to maintain it today
Astaro AG
for sponsoring parts of my netfilter work
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Availability of slides / Links
The slides and the accompanying paper for this presentation are available at
http://www.gnumonks.org/
The netfilter homepage
http://www.netfilter.org/


@ -0,0 +1,537 @@
\documentclass{article}
\usepackage{german}
\usepackage{fancyheadings}
\usepackage{a4}
\setlength{\oddsidemargin}{0in}
\setlength{\evensidemargin}{0in}
\setlength{\topmargin}{0.0in}
\setlength{\headheight}{0in}
\setlength{\headsep}{0in}
\setlength{\textwidth}{6.5in}
\setlength{\textheight}{9.5in}
\setlength{\parindent}{0in}
\setlength{\parskip}{0.05in}
\begin{document}
\title{Linux 2.4.x netfilter/iptables firewalling internals}
\author{Harald Welte\\
laforge@gnumonks.org\\
\copyright{}2002 H. Welte}
\date{25. April 2002}
\maketitle
\setcounter{section}{0}
\setcounter{subsection}{0}
\setcounter{subsubsection}{0}
\section{Introduction}
The Linux 2.4.x kernel series has introduced a totally new kernel firewalling
subsystem. It is much more than a plain successor of ipfwadm or ipchains.
The netfilter/iptables project has a very modular design and its
sub-projects can be split in several parts: netfilter, iptables, connection
tracking, NAT and packet mangling.
While most users will already have learned how to use the basic functions
of netfilter/iptables in order to convert their old ipchains firewalls to
iptables, there's more advanced but less used functionality in
netfilter/iptables.
The presentation covers the design principles behind the netfilter/iptables
implementation. This knowledge enables us to understand how the individual
parts of netfilter/iptables fit together, and for which potential applications
this is useful.
\section{Internal netfilter/iptables architecture}
\subsection{Netfilter hooks in protocol stacks}
One of the major motivations behind the redesign of the linux packet
filtering and NAT system during the 2.3.x kernel series was the widespread
firewall specific code parts within the core IPv4 stack. Ideally the core
IPv4 stack (as used by regular hosts and routers) shouldn't contain any
firewalling specific code, resulting in no unwanted interaction and less
code complexity. This desire led to the invention of {\it netfilter}.
\subsubsection{Architecture of netfilter}
Netfilter is basically a system of callback functions within the network
stack. It provides a non-portable API towards in-kernel networking
extensions.
What we call {\it netfilter hook} is a well-defined call-out point within a
layer three protocol stack, such as IPv4, IPv6 or DECnet. Any layer three
network stack can define an arbitrary number of hooks, usually placed at
strategic points within the packet flow.
Any other kernel code can subsequently register callback functions for
any of these hooks. As in most systems there will be more than one callback
function registered for a particular hook, a {\it priority} is specified upon
registration of the callback function. This priority defines the order in
which the individual callback functions at a particular hook are called.
The return value of any registered callback functions can be:
\begin{itemize}
\item
{\bf NF\_ACCEPT}: continue traversal as usual
\item
{\bf NF\_DROP}: drop the packet; do not continue traversal
\item
{\bf NF\_STOLEN}: callback function has taken over the packet; do not continue
\item
{\bf NF\_QUEUE}: enqueue the packet to userspace
\item
{\bf NF\_REPEAT}: call this hook again
\end{itemize}
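The interplay of priorities and return values can be modelled in a few lines of Python. This is a userspace sketch, not the kernel API: the class Hook and the example callbacks are hypothetical, the numeric verdict values are only illustrative, and the model assumes that a lower priority value means the callback is called earlier.

```python
# Verdicts a callback may return (numeric values are only illustrative).
NF_DROP, NF_ACCEPT, NF_STOLEN, NF_QUEUE, NF_REPEAT = range(5)


class Hook:
    """One netfilter hook with its list of registered callbacks."""

    def __init__(self):
        self._callbacks = []  # entries: (priority, registration order, function)

    def register(self, func, priority):
        self._callbacks.append((priority, len(self._callbacks), func))
        # Lower priority value first; registration order breaks ties.
        self._callbacks.sort(key=lambda entry: entry[:2])

    def run(self, packet):
        for _priority, _order, func in self._callbacks:
            verdict = func(packet)
            while verdict == NF_REPEAT:
                # NF_REPEAT: call the same callback again.
                verdict = func(packet)
            if verdict != NF_ACCEPT:
                # NF_DROP, NF_STOLEN and NF_QUEUE all end the traversal here.
                return verdict
        return NF_ACCEPT


calls = []


def conntrack_cb(packet):
    calls.append("conntrack")
    return NF_ACCEPT


def filter_cb(packet):
    calls.append("filter")
    return NF_DROP if packet.get("dport") == 23 else NF_ACCEPT


def late_cb(packet):
    calls.append("late")
    return NF_ACCEPT


hook = Hook()
hook.register(late_cb, priority=100)
hook.register(filter_cb, priority=0)
hook.register(conntrack_cb, priority=-200)

verdict = hook.run({"dport": 23})
calls_after_drop = list(calls)
```

In this run the callback registered with priority 100 is never called, because the filter callback has already dropped the packet.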
\subsubsection{Netfilter hooks within IPv4}
The IPv4 stack provides five netfilter hooks, which are placed at the
following peculiar places within the code:
\begin{verbatim}
--->[1]--->[ROUTE]--->[3]--->[4]--->
| ^
| |
| [ROUTE]
v |
[2] [5]
| ^
| |
v |
local processes
\end{verbatim}
Packets received on any network interface arrive at the left side of the
diagram. After the verification of the IP header checksum, the
NF\_IP\_PRE\_ROUTING [1] hook is traversed.
If the packet ``survives'' (i.e. NF\_ACCEPT is returned), it enters the
routing code. Where we continue from here depends on the destination of the
packet.
Packets with a local destination (i.e. packets where the destination address is
one of the own IP addresses of the host) traverse the NF\_IP\_LOCAL\_IN [2]
hook. If all callback functions return NF\_ACCEPT, the packet is finally passed
to the socket code, which eventually passes the packet to a local process.
Packets with a remote destination (i.e. packets which are forwarded by the
local machine) traverse the NF\_IP\_FORWARD [3] hook. If they ``survive'',
they finally pass the NF\_IP\_POST\_ROUTING [4] hook and are sent off the
outgoing network interface.
Locally generated packets first traverse the NF\_IP\_LOCAL\_OUT [5] hook, then
enter the routing code, and finally go through the NF\_IP\_POST\_ROUTING [4]
hook before being sent off the outgoing network interface.
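The three packet paths just described can be summarized in a tiny Python function. It is a sketch under the assumption that every callback at every hook returns NF\_ACCEPT; the function name ipv4\_hook\_path and its parameters are hypothetical.

```python
from typing import List


def ipv4_hook_path(locally_generated: bool, local_destination: bool) -> List[str]:
    """Return the IPv4 netfilter hooks a packet traverses, in order.

    Assumes the packet is never dropped, stolen or queued on its way.
    """
    if locally_generated:
        # Local process -> [5] -> routing -> [4] -> outgoing interface
        return ["NF_IP_LOCAL_OUT", "NF_IP_POST_ROUTING"]
    if local_destination:
        # Incoming interface -> [1] -> routing -> [2] -> local process
        return ["NF_IP_PRE_ROUTING", "NF_IP_LOCAL_IN"]
    # Incoming interface -> [1] -> routing -> [3] -> [4] -> outgoing interface
    return ["NF_IP_PRE_ROUTING", "NF_IP_FORWARD", "NF_IP_POST_ROUTING"]
```

A forwarded packet is the only one that sees three hooks; both other cases see exactly two.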
\subsubsection{Netfilter hooks within IPv6}
As the IPv4 and IPv6 protocols are very similar, the netfilter hooks within the
IPv6 stack are placed at exactly the same locations as in the IPv4 stack. The
only change are the hook names: NF\_IP6\_PRE\_ROUTING, NF\_IP6\_LOCAL\_IN,
NF\_IP6\_FORWARD, NF\_IP6\_POST\_ROUTING, NF\_IP6\_LOCAL\_OUT.
\subsubsection{Netfilter hooks within DECnet}
There are seven DECnet hooks. The first five hooks (NF\_DN\_PRE\_ROUTING,
NF\_DN\_LOCAL\_IN, NF\_DN\_FORWARD, NF\_DN\_LOCAL\_OUT, NF\_DN\_POST\_ROUTING)
are pretty much the same as in IPv4. The last two hooks (NF\_DN\_HELLO,
NF\_DN\_ROUTE) are used in conjunction with DECnet Hello and Routing packets.
\subsubsection{Netfilter hooks within ARP}
Recent kernels\footnote{IIRC, starting with 2.4.19-pre3} have added support for netfilter hooks within the ARP code.
There are two hooks: NF\_ARP\_IN and NF\_ARP\_OUT, for incoming and outgoing
ARP packets respectively.
\subsubsection{Netfilter hooks within IPX}
There have been experimental patches to add netfilter hooks to the IPX code,
but they never got integrated into the kernel source.
\subsection{Packet selection using IP Tables}
The IP tables core (ip\_tables.o) provides a generic layer for evaluation
of rulesets.
An IP table consists of an arbitrary number of {\it chains}, which in turn
consist of a linear list of {\it rules}, which again consist of any
number of {\it matches} and one {\it target}.
{\it Chains} can further be divided into two classes: Either {\it builtin
chains} or {\it user-defined chains}. Builtin chains are always present, they
are created upon table registration. They are also the entry points for table
iteration. User defined chains are created at runtime upon user interaction.
{\it Matches} specify the matching criteria, there can be zero or more matches
{\it Targets} specify the action which is to be executed in case {\bf all}
matches match. There can only be a single target per rule.
Matches and targets can either be {\it builtin} or implemented as {\it Linux kernel modules}.
There are two special targets:
\begin{itemize}
\item
By using a chain name as target, it is possible to jump to the respective chain
in case the matches match.
\item
By using the RETURN target, it is possible to return to the previous (calling)
chain.
\end{itemize}
The IP tables core handles the following functions:
\begin{itemize}
\item
Registering and unregistering tables
\item
Registering and unregistering matches and targets (can be implemented as linux kernel modules)
\item
Kernel / userspace interface for manipulation of IP tables
\item
Traversal of IP tables
\end{itemize}
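
How chains, rules, matches and targets fit together during traversal can be sketched as follows. This is a simplified Python model and not the kernel algorithm: the dict-based table layout, the function names and the fallback {\it policy} argument are assumptions made for this example.

```python
# Minimal model of IP tables traversal (illustrative only, not kernel code).
# A rule is (matches, target); a match is a predicate on the packet.
# traverse(), run_builtin_chain() and the table layout are assumptions.

def traverse(table, chain, packet):
    """Walk one chain; return a verdict, or None if the chain ends or RETURNs."""
    for matches, target in table[chain]:
        if not all(match(packet) for match in matches):
            continue  # the target only applies if ALL matches match
        if target == "RETURN":
            return None  # go back to the calling chain
        if target in table:
            verdict = traverse(table, target, packet)  # jump to another chain
            if verdict is not None:
                return verdict
            continue  # the called chain returned; resume with the next rule
        return target  # terminal target such as ACCEPT or DROP
    return None


def run_builtin_chain(table, chain, packet, policy="ACCEPT"):
    """Builtin chains are the entry points; fall back to a policy at the end."""
    verdict = traverse(table, chain, packet)
    return policy if verdict is None else verdict


def is_tcp(packet):
    return packet["proto"] == "tcp"


def to_smtp(packet):
    return packet["dport"] == 25


# Example table: one builtin chain and one user-defined chain.
EXAMPLE_TABLE = {
    "INPUT": [([is_tcp], "tcp_rules"), ([], "DROP")],
    "tcp_rules": [([to_smtp], "ACCEPT"), ([], "RETURN")],
}
```

A rule with an empty list of matches applies to every packet, which corresponds to the ``zero or more matches'' case above.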
\subsubsection{Packet filtering using the ``filter'' table}
Traditional packet filtering (i.e. the successor to ipfwadm/ipchains) takes
place in the ``filter'' table. Packet filtering works like a sieve: A packet
is (in the end) either dropped or accepted - but never modified.
The ``filter'' table is implemented in the {\it iptable\_filter.o} module
and contains three builtin chains:
\begin{itemize}
\item
{\bf INPUT} attaches to NF\_IP\_LOCAL\_IN
\item
{\bf FORWARD} attaches to NF\_IP\_FORWARD
\item
{\bf OUTPUT} attaches to NF\_IP\_LOCAL\_OUT
\end{itemize}
The placement of the chains / hooks is done in such a way that every conceivable
packet always traverses only one of the built-in chains. Packets destined for
the local host traverse only INPUT, packets forwarded only FORWARD and
locally-originated packets only OUTPUT.
\subsubsection{Packet mangling using the ``mangle'' table}
As stated above, operations which would modify a packet do not belong in the
``filter'' table. The ``mangle'' table is available for all kinds of packet
manipulation - but not manipulation of addresses (which is NAT).
The mangle table attaches to all five netfilter hooks and provides the
respective builtin chains (PREROUTING, INPUT, FORWARD, OUTPUT, POSTROUTING)
\footnote{This has changed through recent 2.4.x kernel series, old kernels may
only support three (PREROUTING, POSTROUTING, OUTPUT) chains.}.
\subsection{Connection Tracking Subsystem}
Traditional packet filters can only match on matching criteria within the
currently processed packet, like source/destination IP address, port numbers,
TCP flags, etc. As most applications have a notion of connections or at least
a request/response style protocol, there is a lot of information which can not
be derived from looking at a single packet.
Thus, modern (stateful) packet filters attempt to track connections (flows)
and their respective protocol states for all traffic through the packet
filter.
Connection tracking within Linux is implemented as a netfilter module, called
ip\_conntrack.o.
Before describing the connection tracking subsystem, we need to introduce a couple of definitions and primitives used throughout the conntrack code.
A connection is represented within the conntrack subsystem using {\it struct
ip\_conntrack}, also called {\it connection tracking entry}.
Connection tracking uses {\it conntrack tuples}, which are tuples
consisting of (srcip, srcport, dstip, dstport, l4prot). A connection is
uniquely identified by two tuples: The tuple in the original direction
(IP\_CT\_DIR\_ORIGINAL) and the tuple for the reply direction
(IP\_CT\_DIR\_REPLY).
Connection tracking itself does not drop packets\footnote{well, in some rare
cases in combination with NAT it needs to drop. But don't tell anyone, this is
secret.} or impose any policy. It just associates every packet with a
connection tracking entry, which in turn has a particular state. All other
kernel code can use this state information\footnote{state information is
internally represented via the {\it struct sk\_buff.nfct} structure member of a
packet.}.
\subsubsection{Integration of conntrack with netfilter}
If the ip\_conntrack.o module is registered with netfilter, it attaches to the
NF\_IP\_PRE\_ROUTING, NF\_IP\_POST\_ROUTING, NF\_IP\_LOCAL\_IN and
NF\_IP\_LOCAL\_OUT hooks.
Because forwarded packets are the most common case on firewalls, I will only
describe how connection tracking works for forwarded packets. The two relevant
hooks for forwarded packets are NF\_IP\_PRE\_ROUTING and NF\_IP\_POST\_ROUTING.
Every time a packet arrives at the NF\_IP\_PRE\_ROUTING hook, connection
tracking creates a conntrack tuple from the packet. It then compares this
tuple to the original and reply tuples of all already-seen connections
\footnote{Of course this is not implemented as a linear search over all existing connections.} to find out if this just-arrived packet belongs to any existing
connection. If there is no match, a new conntrack table entry (struct
ip\_conntrack) is created.
Let's assume the case where we have no already existing connections and are
starting from scratch.
The first packet comes in, we derive the tuple from the packet headers, look up
the conntrack hash table, don't find any matching entry. As a result, we
create a new struct ip\_conntrack. This struct ip\_conntrack is filled with
all necessary data, like the original and reply tuple of the connection.
How do we know the reply tuple? By inverting the source and destination
parts of the original tuple.\footnote{So why do we need two tuples, if they can
be derived from each other? Wait until we discuss NAT.}
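
The inversion step can be sketched in a few lines of Python. The {\it Tuple} type and the {\it invert} function are hypothetical names for this example; the field names follow the (srcip, srcport, dstip, dstport, l4prot) description given earlier.

```python
from collections import namedtuple

# Illustrative stand-in for a conntrack tuple (not the kernel structure).
Tuple = namedtuple("Tuple", "srcip srcport dstip dstport l4prot")


def invert(t):
    """Derive the reply tuple by swapping the source and destination parts."""
    return Tuple(t.dstip, t.dstport, t.srcip, t.srcport, t.l4prot)
```

Inverting a tuple twice yields the original tuple again, as long as no NAT is involved.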
Please note that this new struct ip\_conntrack is {\bf not} yet placed
into the conntrack hash table.
The packet is now passed on to other callback functions which have registered
with a lower priority at NF\_IP\_PRE\_ROUTING. It then continues traversal of
the network stack as usual, including all respective netfilter hooks.
If the packet survives (i.e. is not dropped by the routing code, network stack,
firewall ruleset, ...), it re-appears at NF\_IP\_POST\_ROUTING. In this case,
we can now safely assume that this packet will be sent off on the outgoing
interface, and thus put the connection tracking entry which we created at
NF\_IP\_PRE\_ROUTING into the conntrack hash table. This process is called
{\it confirming the conntrack}.
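
The difference between creating and confirming a conntrack entry can be sketched as follows. This is a simplified Python model, not the kernel implementation: tuples are reduced to (source, destination) pairs, and the class and method names are assumptions made for this example.

```python
# Illustrative model of "confirming" a conntrack entry (not kernel code).
# Tuples are (src, dst) pairs here; Conntrack, ConntrackTable, pre_routing()
# and post_routing() are hypothetical names for this sketch.

class Conntrack:
    def __init__(self, original):
        self.original = original
        self.reply = (original[1], original[0])  # inverted tuple
        self.confirmed = False


class ConntrackTable:
    def __init__(self):
        self.hash = {}  # tuple -> Conntrack, holds confirmed entries only

    def pre_routing(self, tup):
        """Look up the packet's tuple; create an unconfirmed entry on a miss."""
        entry = self.hash.get(tup)
        if entry is None:
            entry = Conntrack(tup)  # NOT yet placed in the hash table
        return entry

    def post_routing(self, entry):
        """The packet survived: insert the entry under both of its tuples."""
        if not entry.confirmed:
            self.hash[entry.original] = entry
            self.hash[entry.reply] = entry
            entry.confirmed = True
        return entry
```

If a packet is dropped somewhere between the two hooks, {\it post\_routing} is never called for it and the hash table stays untouched.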
The connection tracking code itself is not monolithic, but consists of a
couple of separate modules\footnote{They don't actually have to be separate
kernel modules; e.g. TCP, UDP and ICMP tracking modules are all part of
the Linux kernel module ip\_conntrack.o}. Besides the conntrack core, there
are two important kinds of modules: Protocol helpers and application helpers.
Protocol helpers implement the layer-4-protocol specific parts. They currently
exist for TCP, UDP and ICMP (an experimental helper for GRE exists).
\subsubsection{TCP connection tracking}
As TCP is a connection oriented protocol, it is not very difficult to imagine
how connection tracking for this protocol could work. There are well-defined
state transitions possible, and conntrack can decide which state transitions
are valid within the TCP specification. In reality it's not all that easy,
since we cannot assume that all packets that pass the packet filter actually
arrive at the receiving end, ...
It is noteworthy that the standard connection tracking code does {\bf not}
do TCP sequence number and window tracking. A well-maintained patch to add
this feature exists almost as long as connection tracking itself. It will
be integrated with the 2.5.x kernel. The problem with window tracking is
its bad interaction with connection pickup. The TCP conntrack code is able to
pick up already existing connections, e.g. in case your firewall was rebooted.
However, connection pickup is conflicting with TCP window tracking: The TCP
window scaling option is only transferred at connection setup time, and we
don't know about it in case of pickup...
\subsubsection{ICMP tracking}
ICMP is not really a connection oriented protocol. So how is it possible to
do connection tracking for ICMP?
The ICMP protocol can be split in two groups of messages:
\begin{itemize}
\item
ICMP error messages, which sort of belong to a different connection.
ICMP error messages are assigned the state {\it RELATED} with respect to that connection.
(ICMP\_DEST\_UNREACH, ICMP\_SOURCE\_QUENCH, ICMP\_TIME\_EXCEEDED,
ICMP\_PARAMETERPROB, ICMP\_REDIRECT).
\item
ICMP queries, which have a request->reply character. So what the conntrack
code does, is let the request have a state of {\it NEW}, and the reply
{\it ESTABLISHED}. The reply closes the connection immediately.
(ICMP\_ECHO, ICMP\_TIMESTAMP, ICMP\_INFO\_REQUEST, ICMP\_ADDRESS)
\end{itemize}
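
The split into the two groups can be sketched as a simple lookup. The numeric ICMP type values are the ones from the Linux header linux/icmp.h; the function {\it icmp\_group} and the {\it STATE\_FOR\_GROUP} mapping are hypothetical names for this example. The sketch also leaves out one important step: a real error message only becomes {\it RELATED} if the packet quoted inside it belongs to a known connection.

```python
# ICMP type numbers as defined in <linux/icmp.h>.
ICMP_ECHOREPLY, ICMP_DEST_UNREACH, ICMP_SOURCE_QUENCH, ICMP_REDIRECT = 0, 3, 4, 5
ICMP_ECHO, ICMP_TIME_EXCEEDED, ICMP_PARAMETERPROB = 8, 11, 12
ICMP_TIMESTAMP, ICMP_TIMESTAMPREPLY = 13, 14
ICMP_INFO_REQUEST, ICMP_INFO_REPLY = 15, 16
ICMP_ADDRESS, ICMP_ADDRESSREPLY = 17, 18

# Error messages refer to some other connection.
ICMP_ERRORS = {ICMP_DEST_UNREACH, ICMP_SOURCE_QUENCH, ICMP_TIME_EXCEEDED,
               ICMP_PARAMETERPROB, ICMP_REDIRECT}

# Queries, each mapped to the reply type that answers it.
ICMP_QUERIES = {ICMP_ECHO: ICMP_ECHOREPLY,
                ICMP_TIMESTAMP: ICMP_TIMESTAMPREPLY,
                ICMP_INFO_REQUEST: ICMP_INFO_REPLY,
                ICMP_ADDRESS: ICMP_ADDRESSREPLY}

# Simplified state per group (hypothetical mapping for this sketch).
STATE_FOR_GROUP = {"error": "RELATED", "query": "NEW", "reply": "ESTABLISHED"}


def icmp_group(icmp_type):
    """Classify an ICMP type as error, query, reply or unknown."""
    if icmp_type in ICMP_ERRORS:
        return "error"
    if icmp_type in ICMP_QUERIES:
        return "query"
    if icmp_type in ICMP_QUERIES.values():
        return "reply"
    return "unknown"
```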
\subsubsection{UDP connection tracking}
UDP is designed as a connectionless datagram protocol. But most common
protocols using UDP as layer 4 protocol have bi-directional UDP communication.
Imagine a DNS query, where the client sends a UDP datagram to port 53 of the
nameserver, and the nameserver sends back a DNS reply packet from its UDP
port 53 to the client.
Netfilter treats this as a connection. The first packet (the DNS request) is
assigned a state of {\it NEW}, because the packet is expected to create a new
'connection'. The DNS server's reply packet is marked as {\it ESTABLISHED}.
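
The DNS example can be modelled in a few lines of Python. This is a sketch under stated assumptions, not kernel code: {\it udp\_state} is a hypothetical helper, timeouts are ignored, and the handling of a repeated request that arrives before any reply (it stays {\it NEW}) is an assumption of this model.

```python
# Illustrative model of UDP "connection" states (not kernel code).
# A packet is (srcip, srcport, dstip, dstport); `flows` maps the tuple of
# the first packet of a flow to a flag "reply seen".

def udp_state(flows, pkt):
    """Return the state of pkt and update the flow table."""
    src, sport, dst, dport = pkt
    reverse = (dst, dport, src, sport)
    if reverse in flows:
        flows[reverse] = True  # this packet answers a flow we have seen
        return "ESTABLISHED"
    if flows.get(pkt):
        return "ESTABLISHED"  # original direction, reply already seen
    flows.setdefault(pkt, False)  # remember the flow, no reply yet
    return "NEW"
```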
\subsubsection{conntrack application helpers}
More complex application protocols involving multiple connections need special
support by a so-called ``conntrack application helper module''. Modules in
the stock kernel come for FTP and IRC(DCC). Netfilter CVS currently contains
patches for PPTP, H.323, Eggdrop botnet, TFTP and talk. We're still lacking
a lot of protocols (e.g. SIP, SMB/CIFS) - but they are unlikely to appear
until somebody really needs them and either develops them on his own or
funds development.
\subsubsection{Integration of connection tracking with iptables}
As stated earlier, conntrack doesn't impose any policy on packets. It just
determines the relation of a packet to already existing connections. To base
packet filtering decisions on this state information, the iptables {\it state}
match can be used. Every packet is within one of the following categories:
\begin{itemize}
\item
{\bf NEW}: packet would create a new connection, if it survives
\item
{\bf ESTABLISHED}: packet is part of an already established connection
(either direction)
\item
{\bf RELATED}: packet is in some way related to an already established connection, e.g. ICMP errors or FTP data sessions
\item
{\bf INVALID}: conntrack is unable to derive conntrack information from this packet. Please note that all multicast or broadcast packets fall in this category.
\end{itemize}
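
A typical way to use these four categories in a ruleset can be sketched as a small decision function. The policy shown here (only the trusted side may open connections) is just one example chosen for illustration; {\it filter\_verdict} and its {\it from\_trusted\_side} parameter are hypothetical names, not part of iptables.

```python
# Illustrative stateful policy built on the four conntrack states.
# filter_verdict() and from_trusted_side are assumptions for this sketch.

def filter_verdict(state, from_trusted_side):
    """Decide what to do with a packet based on its conntrack state."""
    if state in ("ESTABLISHED", "RELATED"):
        return "ACCEPT"  # belongs to, or is related to, a known connection
    if state == "NEW" and from_trusted_side:
        return "ACCEPT"  # only the trusted side may open connections
    return "DROP"  # INVALID packets and unsolicited NEW packets
```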
\subsection{NAT Subsystem}
The NAT (Network Address Translation) subsystem is probably the worst
documented subsystem within the whole framework. There are two reasons for this:
NAT is nasty and complicated, and the Linux 2.4.x NAT implementation is easy to
use, so nobody needs to know the nasty details.
Nonetheless, as I have traditionally concentrated most on the conntrack and NAT
systems, I will give a short overview.
NAT uses almost all of the previously described subsystems:
\begin{itemize}
\item
IP tables to specify which packets to NAT in which particular way. NAT
registers a ``nat'' table with PREROUTING, POSTROUTING and OUTPUT chains.
\item
Connection tracking to associate NAT state with the connection.
\item
Netfilter to do the actual packet manipulation transparent to the rest of the
kernel. NAT registers with NF\_IP\_PRE\_ROUTING, NF\_IP\_POST\_ROUTING,
NF\_IP\_LOCAL\_IN and NF\_IP\_LOCAL\_OUT.
\end{itemize}
The NAT implementation supports all kinds of different NAT: Source NAT,
Destination NAT, NAT to address/port ranges, 1:1 NAT, ...
This fundamental design principle is still frequently misunderstood:\\
The information about which NAT mappings apply to a certain connection
is only gathered once - with the first packet of every connection.
So let's start to look at the life of a poor to-be-nat'ed packet.
For ease of understanding, I have chosen to describe the most frequently
used NAT scenario: Source NAT of a forwarded packet. Let's assume the
packet has an original source address of 1.1.1.1, an original destination
address of 2.2.2.2, and is going to be SNAT'ed to 9.9.9.9. Let's further
ignore the fact that there are port numbers.
Once upon a time, our poor packet arrives at NF\_IP\_PRE\_ROUTING, where
conntrack has registered with highest priority. This means that a conntrack
entry with the following two tuples is created:
\begin{verbatim}
IP_CT_DIR_ORIGINAL: 1.1.1.1 -> 2.2.2.2
IP_CT_DIR_REPLY: 2.2.2.2 -> 1.1.1.1
\end{verbatim}
After conntrack, the packet traverses the PREROUTING chain of the ``nat''
IP table. Since only destination NAT happens at PREROUTING, no action
occurs. After its lengthy way through the rest of the network stack,
the packet arrives at the NF\_IP\_POST\_ROUTING hook, where it traverses
the POSTROUTING chain of the ``nat'' table. Here it hits a SNAT rule,
causing the following actions:
\begin{itemize}
\item
Fill in a {\it struct ip\_nat\_manip}, indicating the new source address
and the type of NAT (source NAT at POSTROUTING). This struct is part of the
conntrack entry.
\item
Automatically derive the inverse NAT transformation for the reply packets:
Destination NAT at PREROUTING. Fill in another {\it struct ip\_nat\_manip}.
\item
Alter the REPLY tuple of the conntrack entry to
\begin{verbatim}
IP_CT_DIR_REPLY: 2.2.2.2 -> 9.9.9.9
\end{verbatim}
\item
Apply the SNAT transformation to the packet
\end{itemize}
Every other packet within this connection, independent of its direction,
will only execute the last step. Since all NAT information is connected
with the conntrack entry, there is no need to do anything but to apply
the same transformations to all packets within the same connection.
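
The walk-through above can be replayed with a small Python model. It is only a sketch, not the kernel implementation: tuples are (source, destination) pairs because ports are ignored as in the text, and {\it Conn}, {\it setup\_snat} and {\it translate} are hypothetical names. The single {\it snat\_to} field stands in for the two {\it struct ip\_nat\_manip} entries.

```python
# Illustrative model of the SNAT walk-through above (not kernel code).
# Conn, setup_snat() and translate() are hypothetical names.

class Conn:
    def __init__(self, src, dst):
        self.original = (src, dst)  # IP_CT_DIR_ORIGINAL
        self.reply = (dst, src)     # IP_CT_DIR_REPLY
        self.snat_to = None         # stands in for the ip_nat_manip entries


def setup_snat(conn, new_src):
    """Done once, for the first packet of a connection."""
    conn.snat_to = new_src
    conn.reply = (conn.reply[0], new_src)  # replies are addressed to new_src


def translate(conn, pkt):
    """Done for every packet: apply the stored mapping for its direction."""
    if conn.snat_to is None:
        return pkt
    src, dst = pkt
    if pkt == conn.original:
        return (conn.snat_to, dst)      # source NAT in the original direction
    if pkt == conn.reply:
        return (src, conn.original[0])  # inverse: destination NAT for replies
    return pkt
```

With the addresses from the example, the reply tuple becomes 2.2.2.2 to 9.9.9.9, and a reply packet addressed to 9.9.9.9 is rewritten back to 1.1.1.1.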
\subsection{IPv6 Firewalling with ip6tables}
Yes, Linux 2.4.x comes with a usable, though incomplete system to secure
your IPv6 network.
The parts ported to IPv6 are
\begin{itemize}
\item
IP tables (called IP6 tables)
\item
The ``filter'' table
\item
The ``mangle'' table
\item
The userspace library (libip6tc)
\item
The command line tool (ip6tables)
\end{itemize}
Due to the lack of conntrack and NAT\footnote{for god's sake we don't have NAT
with IPv6}, only traditional, stateless packet filtering is possible. Apart
from the obvious matches/targets, ip6tables can match on
\begin{itemize}
\item
{\it EUI64 checker}; verifies if the MAC address of the sender is the same as in the EUI64 64 least significant bits of the source IPv6 address
\item
{\it frag6 match}, matches on IPv6 fragmentation header
\item
{\it route6 match}, matches on IPv6 routing header
\item
{\it ahesp6 match}, matches on SPIs within AH or ESP over IPv6 packets
\end{itemize}
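
The idea behind the EUI64 check can be sketched in Python. This example assumes the usual modified EUI-64 construction for interface identifiers (insert ff:fe in the middle of the MAC address and flip the universal/local bit); the function names are hypothetical and the code is not taken from the ip6tables module.

```python
import ipaddress


def eui64_from_mac(mac):
    """Build the modified EUI-64 interface identifier for a MAC address."""
    b = bytes(int(part, 16) for part in mac.split(":"))
    # flip the universal/local bit and insert ff:fe in the middle
    return bytes([b[0] ^ 0x02]) + b[1:3] + b"\xff\xfe" + b[3:6]


def eui64_matches(mac, ipv6_src):
    """True if the low 64 bits of the address equal the EUI-64 of the MAC."""
    low64 = ipaddress.IPv6Address(ipv6_src).packed[8:]
    return low64 == eui64_from_mac(mac)
```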
However, the ip6tables code doesn't seem to be used very widely (yet?).
So please expect some potential remaining issues, since it is not tested
as heavily as iptables.
\subsection{Recent Development}
Please refer to the spoken word at the presentation. Development at the
time this paper was written can be quite different from development at the
time the presentation is held.
\section{Thanks}
I'd like to thank
\begin{itemize}
\item
{\it Linus Torvalds} for starting this interesting UNIX-like kernel
\item
{\it Alan Cox, David Miller, Alexey Kuznetsov, Andi Kleen} for building
(one of?) the world's best TCP/IP stacks.
\item
{\it Paul ``Rusty'' Russell} for starting the netfilter/iptables project
\item
{\it The Netfilter Core Team} for continuing the netfilter/iptables effort
\item
{\it Astaro AG} for partially funding my current netfilter/iptables work
\item
{\it Conectiva Inc.} for partially funding parts of my past netfilter/iptables
work and for inviting me to live in Brazil
\item
{\it samba.org and Kommunikationsnetz Franken e.V.} for hosting the netfilter
homepage, CVS, mailing lists, ...
\end{itemize}
\end{document}

View File

@ -0,0 +1,50 @@
Firewalling with netfilter/iptables under Linux 2.4.x
The Linux 2.4.x kernel offers an advanced infrastructure, called
netfilter, on top of which a packet filter, NAT and other
packet manipulations are implemented.
The entire firewalling subsystem was redeveloped compared to kernel 2.2.x.
The netfilter/iptables system makes everything that previously existed under Linux
(ipfwadm, ipchains) look like something from the dim and distant past.
Besides the traditional packet filter, netfilter/iptables optionally offers
connection tracking, which makes it possible to build a stateful
firewall in no time. The NAT (Network Address Translation)
system is now also flexible enough to offer all forms of
NAT: source NAT, destination NAT, static NAT, NAPT, ...
The high degree of modularity makes the system very easy to extend,
so that new extensions to the firewalling system
can be developed in a simple way.
The talk describes the different parts of the netfilter/iptables
system and thereby gives an overview of its capabilities and
application scenarios. It covers the following topics:
- netfilter/iptables architecture
- netfilter hooks in the network stack
- IP tables as rule description
- Packet filter
- Connection Tracking
- Network Address Translation
- source NAT
- destination NAT
- Masquerading
- transparent proxy support
- Packet mangling
- Userspace packet queuing
- Userspace packet logging
Prerequisites:
- Knowledge of TCP/IP and routing
- Basics of firewalling (in particular packet filters)
- Some basic knowledge of the Linux/Unix architecture
About the speaker:
Harald Welte has been an active KNF member since 1995 and is currently the
deputy technical contact of KNF. He is the maintainer of the
netfilter/iptables firewalling subsystem in the Linux 2.4.x and
2.5.x kernels and played a major part in its development.

View File

@ -0,0 +1,466 @@
%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%back "blue"
%center
%size 7
The netfilter/iptables framework in
Linux 2.4.x
%center
%size 4
by
Harald Welte <laforge@gnumonks.org>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Contents
Introduction
Netfilter hooks in protocol stacks
Packet selection based on IP Tables
The Connection Tracking Subsystem
The NAT Subsystem based on netfilter + iptables
Packet filtering using the 'filter' table
Packet mangling using the 'mangle' table
Advanced netfilter concepts
Current development and Future
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Introduction
Why did we need netfilter/iptables?
Because ipchains...
has no infrastructure for passing packets to userspace
makes transparent proxying extremely difficult
has interface address dependent Packet filter rules
has Masquerading implemented as part of packet filtering
code is too complex and intermixed with core ipv4 stack
is neither modular nor extensible
only barely supports one special case of NAT (masquerading)
has only stateless packet filtering
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Introduction
Who's behind netfilter/iptables
Paul 'Rusty' Russell
co-author of ipchains in Linux 2.2
was paid by Watchguard for about one year of development
James Morris
userspace queuing (kernel, library and tools)
REJECT target
Marc Boucher
NAT and packet filtering controlled by one command
Mangle table
Harald Welte
Conntrack+NAT helper infrastructure (newnat)
Userspace packet logging (ULOG)
PPTP and IRC conntrack/NAT helpers
Jozsef Kadlecsik
TCP window tracking
H.323 conntrack + NAT helper
Continued newnat development
Non-core team contributors
http://www.netfilter.org/scoreboard/
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Netfilter Hooks
What is netfilter?
System of callback functions within network stack
Callback function to be called for every packet traversing certain point (hook) within network stack
Protocol independent framework
Hooks in layer 3 stacks (IPv4, IPv6, DECnet, ARP)
Multiple kernel modules can register with each of the hooks
Asynchronous packet handling in userspace (ip_queue)
Traditional packet filtering, NAT, ... is implemented on top of this framework
Can be used for other stuff interfacing with the core network stack, like the DECnet routing daemon.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Netfilter Hooks
Netfilter architecture in IPv4
%font "courier"
--->[1]--->[ROUTE]--->[3]--->[4]--->
| ^
| |
| [ROUTE]
v |
[2] [5]
| ^
| |
v |
%font "standard"
1=NF_IP_PRE_ROUTING
2=NF_IP_LOCAL_IN
3=NF_IP_FORWARD
4=NF_IP_POST_ROUTING
5=NF_IP_LOCAL_OUT
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Netfilter Hooks
Netfilter Hooks
Any kernel module may register a callback function at any of the hooks
The module has to return one of the following constants
NF_ACCEPT continue traversal as normal
NF_DROP drop the packet, do not continue
NF_STOLEN I've taken over the packet, do not continue
NF_QUEUE enqueue packet to userspace
NF_REPEAT call this hook again
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
IP tables
Packet selection using IP tables
The kernel provides generic IP tables support
Each kernel module may create its own IP table
The three major parts of 2.4 firewalling subsystem are implemented using IP tables
Packet filtering table 'filter'
NAT table 'nat'
Packet mangling table 'mangle'
Can potentially be used for other stuff, e.g. IPsec SPDB
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
IP Tables
Managing chains and tables
An IP table consists of multiple chains
A chain consists of a list of rules
Every single rule in a chain consists of
match[es] (rule executed if all matches true)
target (what to do if the rule is matched)
%size 4
matches and targets can either be builtin or implemented as kernel modules
%size 6
The userspace tool iptables is used to control IP tables
handles all different kinds of IP tables
supports a plugin/shlib interface for target/match specific options
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
IP Tables
Basic iptables commands
To build a complete iptables command, we must specify
which table to work with
which chain in this table to use
an operation (insert, add, delete, modify)
one or more matches (optional)
a target
The syntax is
%font "typewriter"
%size 3
iptables -t table -Operation chain -j target match(es)
%font "standard"
%size 5
Example:
%font "typewriter"
%size 3
iptables -t filter -A INPUT -j ACCEPT -p tcp --dport smtp
%font "standard"
%size 5
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
IP Tables
Matches
Basic matches
-p protocol (tcp/udp/icmp/...)
-s source address (ip/mask)
-d destination address (ip/mask)
-i incoming interface
-o outgoing interface
Match extensions (examples)
tcp/udp TCP/UDP source/destination port
icmp ICMP code/type
ah/esp AH/ESP SPI match
mac source MAC address
mark nfmark
length match on length of packet
limit rate limiting (n packets per timeframe)
owner owner uid of the socket sending the packet
tos TOS field of IP header
ttl TTL field of IP header
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
IP Tables
Targets
very dependent on the particular table.
Table specific targets will be discussed later
Generic Targets, always available
ACCEPT accept packet within chain
DROP silently drop packet
QUEUE enqueue packet to userspace
LOG log packet via syslog
ULOG log packet via ulogd
RETURN return to previous (calling) chain
foobar jump to user defined chain
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Packet Filtering
Overview
Implemented as 'filter' table
Registers with three netfilter hooks
NF_IP_LOCAL_IN (packets destined for the local host)
NF_IP_FORWARD (packets forwarded by local host)
NF_IP_LOCAL_OUT (packets from the local host)
Each of the three hooks has attached one chain (INPUT, FORWARD, OUTPUT)
Every packet passes exactly one of the three chains. Note that this is very different compared to the old 2.2.x ipchains behaviour.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Packet Filtering
Targets available within 'filter' table
Builtin Targets to be used in filter table
ACCEPT accept the packet
DROP silently drop the packet
QUEUE enqueue packet to userspace
RETURN return to previous (calling) chain
foobar user defined chain
Targets implemented as loadable modules
REJECT drop the packet but inform sender
MIRROR change source/destination IP and resend
LOG log via syslog
ULOG log via userspace
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Connection Tracking Subsystem
Connection tracking...
implemented separately from NAT
enables stateful filtering
implementation
hooks into NF_IP_PRE_ROUTING to track packets
hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
protocol modules (currently TCP/UDP/ICMP)
application helpers (currently FTP, IRC, H.323, talk, SNMP)
divides packets in the following four categories
NEW - would establish new connection
ESTABLISHED - part of already established connection
RELATED - is related to established connection
INVALID - (multicast, errors...)
does _NOT_ filter packets itself
can be utilized by iptables using the 'state' match
is used by NAT Subsystem
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Network Address Translation
Overview
Previous Linux Kernels only implemented one special case of NAT: Masquerading
Linux 2.4.x can do any kind of NAT.
NAT subsystem implemented on top of netfilter, iptables and conntrack
NAT subsystem registers with all five netfilter hooks
'nat' Table registers chains PREROUTING, POSTROUTING and OUTPUT
Following targets available within 'nat' Table
SNAT changes the packet's source while passing NF_IP_POST_ROUTING
DNAT changes the packet's destination while passing NF_IP_PRE_ROUTING
MASQUERADE is a special case of SNAT
REDIRECT is a special case of DNAT
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Network Address Translation
Source NAT
SNAT Example:
%font "typewriter"
%size 3
iptables -t nat -A POSTROUTING -j SNAT --to-source 1.2.3.4 -s 10.0.0.0/8
%font "standard"
%size 4
MASQUERADE Example:
%font "typewriter"
%size 3
iptables -t nat -A POSTROUTING -j MASQUERADE -o ppp0
%font "standard"
%size 5
Destination NAT
DNAT example
%font "typewriter"
%size 3
iptables -t nat -A PREROUTING -j DNAT --to-destination 1.2.3.4:8080 -p tcp --dport 80 -i eth1
%font "standard"
%size 4
REDIRECT example
%font "typewriter"
%size 3
iptables -t nat -A PREROUTING -j REDIRECT --to-port 3128 -i eth1 -p tcp --dport 80
%font "standard"
%size 5
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Packet Mangling
Purpose of mangle table
packet manipulation except address manipulation
Integration with netfilter
'mangle' table hooks in all five netfilter hooks
priority: after conntrack
Targets specific to the 'mangle' table:
DSCP - manipulate DSCP field
IPV4OPTSSTRIP - strip IPv4 options
MARK - change the nfmark field of the skb
TCPMSS - set TCP MSS option
TOS - manipulate the TOS bits
TTL - set / increase / decrease TTL field
Simple example:
%font "typewriter"
%size 3
iptables -t mangle -A PREROUTING -j MARK --set-mark 10 -p tcp --dport 80
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Advanced Netfilter concepts
%size 4
Userspace logging
flexible replacement for old syslog-based logging
packets to userspace via multicast netlink sockets
easy-to-use library (libipulog)
plugin-extensible userspace logging daemon (ulogd)
Can even be used to directly log into MySQL
Queuing
reliable asynchronous packet handling
packets to userspace via unicast netlink socket
easy-to-use library (libipq)
provides Perl bindings
experimental queue multiplex daemon (ipqmpd)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Current Development and Future
Netfilter (although it proved very stable) is still work in progress.
Areas of current development
infrastructure for conntrack manipulation from userspace
failover of stateful firewalls
making iptables layer3 independent (pkttables)
new userspace library (libiptables) to hide plugins from apps
more matches and targets for advanced functions (pool, hashslot)
more conntrack and NAT modules (RPC, SNMP, SMB, ...)
better IPv6 support (conntrack, more matches / targets)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Thanks
Thanks to
the BBS people, Z-Netz, FIDO, ...
for heavily increasing my computer usage in 1992
KNF
for bringing me in touch with the internet as early as 1995
for providing a playground for technical people
for telling me about the existence of Linux!
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
for implementing (one of?) the world's best TCP/IP stacks
Paul 'Rusty' Russell
for starting the netfilter/iptables project
for trusting me to maintain it today
Linux User Group Nuernberg (ALIGN, LUG-N)
for helping me with my initial Linux problems
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Availability of slides / Links
The slides and the accompanying paper of this presentation are available at
http://www.gnumonks.org/
The netfilter homepage
http://www.netfilter.org/

View File

@ -0,0 +1,201 @@
%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%
%deffont "typewriter" tfont "MONOTYPE.TTF"
%page
%nodefault
%back "blue"
%center
%size 7
TCP state + windowtracking
%center
%size 4
by
Harald Welte <laforge@gnumonks.org>
%page
TCP state + window tracking
What? Why?
TCP is a stateful protocol, each endpoint is a state machine
What is TCP state / windowtracking?
Some intermediate System (Router/Firewall) is trying to derive the current state of the two TCP endpoints
Why does somebody want TCP state / windowtracking
Evaluation of TCP stack implementations
Hide/Protect broken implementations from a public network
%page
TCP state + windowtracking
TCP basics
states of a TCP endpoint:
LISTEN: port waiting for connection request from remote end
SYN_SENT: we've sent a SYN packet and not received anything yet
SYN_RECEIVED: we've received a SYN and sent our own SYN, waiting for the ACK
ESTABLISHED: fully established TCP connection
FIN_WAIT1: waiting for FIN from remote end or ACK of sent FIN
FIN_WAIT2: waiting for FIN from remote end
TIME_WAIT: waiting for enough time to pass to be sure the remote end received the ACK of its FIN
CLOSED: no connection state at all
CLOSE_WAIT: waiting for a connection termination request from local user
CLOSING: waiting for a connection termination request acknowledgement from the remote end
LAST_ACK: Waiting for ACK of the FIN previously sent to remote end
%page
TCP state + windowtracking
TCP basics
sequence numbers
every octet has a corresponding sequence number
sequence number is increased by one for every payload octet sent
receiver acknowledges last received contiguous sequence number (cumulative ack)
EXTENSION: selective acknowledgement (SACK) option, RFC2018
receiver can specify separate sequence number blocks it has received
sliding window protocol
receiver advertises the size of the receive window
sender can only send up to 'window' number of octets which are not ACK'ed yet
EXTENSION: window scaling, RFC1323
a 16bit window size is too small for high bandwidth links, thus window scaling was introduced
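The sequence number and window rules above can be sketched in C. This is only an illustration, not code from any real TCP stack: the function names are made up for this example, and the window check ignores details such as zero windows and SACK.

```c
#include <assert.h>
#include <stdint.h>

/* Sequence numbers wrap around at 2^32, so "a before b" is decided
 * on the signed difference, not with a plain '<'. */
static int seq_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/* True if the segment [seq, seq+len) lies completely inside the window
 * [win_start, win_start+win_size). Unsigned subtraction handles wraparound. */
static int seq_in_window(uint32_t seq, uint32_t len,
                         uint32_t win_start, uint32_t win_size)
{
    uint32_t offset = seq - win_start;

    return offset <= win_size && len <= win_size - offset;
}

/* Window scaling (RFC1323): the 16bit header field is shifted left by the
 * scale factor negotiated during the handshake (at most 14). */
static uint32_t scaled_window(uint16_t raw, unsigned int shift)
{
    return (uint32_t)raw << shift;
}
```

With a scale factor of 7, the largest 16bit window (65535) becomes roughly 8 MB, which is why the option matters on high bandwidth links.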
%page
TCP state + windowtracking
TCP state tracking
Where do we do TCP state tracking?
state tracker needs to see _all_ packets in both directions
problems with asymmetric routing!
So where's the Problem?
IP is an unreliable, best-effort protocol
If the man in the middle observes a packet, it cannot assume that the packet actually arrives at the receiver.
%page
TCP state + window tracking
Problems
Example scenario 1
A sends SYN to B
man in the middle saves state as SYN_SENT
B sends SYN/ACK to A
man in the middle detects state transition to SYN_RECEIVED
SYN/ACK doesn't arrive at A
somebody spoofs ACK A->B to firewall
man in the middle detects state transition to ESTABLISHED
==> Any traffic between A and B will be accepted (wrong!)
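Scenario 1 can be replayed against a deliberately naive tracker. The sketch below is hypothetical (the names trk_state and naive_track are invented here, and this is not how netfilter's conntrack is written); it only looks at direction and flags, which is exactly why it cannot tell a spoofed ACK from a genuine one.

```c
#include <assert.h>

enum trk_state { TRK_NONE, TRK_SYN_SENT, TRK_SYN_RECV, TRK_ESTABLISHED };

#define F_SYN 0x02
#define F_ACK 0x10

/* dir: 0 = original direction (A->B), 1 = reply direction (B->A).
 * The tracker only sees packets passing by; it has no idea whether
 * they were actually delivered to the other endpoint. */
static enum trk_state naive_track(enum trk_state s, int dir, unsigned int flags)
{
    switch (s) {
    case TRK_NONE:
        if (dir == 0 && flags == F_SYN)
            return TRK_SYN_SENT;
        break;
    case TRK_SYN_SENT:
        if (dir == 1 && flags == (F_SYN | F_ACK))
            return TRK_SYN_RECV;
        break;
    case TRK_SYN_RECV:
        /* A spoofed ACK is accepted here just like a real one. */
        if (dir == 0 && flags == F_ACK)
            return TRK_ESTABLISHED;
        break;
    case TRK_ESTABLISHED:
        break;
    }
    return s;
}
```

Feeding it SYN, SYN/ACK and then an ACK from anybody claiming to be A ends in TRK_ESTABLISHED, even though A never received the SYN/ACK.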
%page
TCP state + window tracking
Problems
Example scenario 2
fully established TCP connection
A sends FIN to B
man in the middle saves state to FIN_WAIT1
B sends FIN/ACK to A
man in the middle saves state CLOSING/TIME_WAIT
FIN/ACK doesn't arrive at A
B retransmits FIN/ACK to A
man in the middle doesn't accept any further packets
.oO(booom)Oo.
%page
TCP state + window tracking
Problems
Example scenario 3 (FIN/RST spoofing without windowtracking)
fully established TCP connection
evil guy spoofs FIN A->B (with guessed sequence number)
man in the middle saves state as FIN_WAIT1
B ignores the FIN because of the wrong sequence number
A sends further segments to B
man in the middle doesn't accept further segments after FIN was sent in this direction
.oO(booom)Oo.
Solution: Real Window tracking
See paper by Guido van Rooij
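A much simplified sketch of how window tracking defuses scenario 3: the tracker only follows a FIN if its sequence number falls inside the window the receiver has advertised. The struct and function names are assumptions made for this example; the scheme in the referenced paper tracks considerably more per-direction state.

```c
#include <assert.h>
#include <stdint.h>

/* Per-direction view of the middle box (hypothetical fields). */
struct flow_dir {
    uint32_t acked;   /* highest ACK seen from the receiver */
    uint32_t window;  /* window last advertised by the receiver */
};

enum close_state { CS_ESTABLISHED, CS_FIN_WAIT1 };

static int in_window(const struct flow_dir *d, uint32_t seq, uint32_t len)
{
    uint32_t off = seq - d->acked;  /* wraps correctly modulo 2^32 */

    return off <= d->window && len <= d->window - off;
}

/* A FIN occupies one sequence number. Out-of-window FINs are ignored,
 * just as the real endpoint B would ignore them. */
static enum close_state on_fin(enum close_state s, const struct flow_dir *d,
                               uint32_t seq)
{
    if (s == CS_ESTABLISHED && in_window(d, seq, 1))
        return CS_FIN_WAIT1;
    return s;
}
```

A spoofed FIN with a guessed sequence number far outside the window leaves the tracked state at CS_ESTABLISHED, so A's later segments are still accepted.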
%page
TCP state + window tracking
Further Problems
pickup of already established connections
window scaling is a problem in this case: the scale factor is only exchanged in the SYN segments, which the tracker never saw
window tracking has to be disabled in that case
selective acknowledgements
man-in-the-middle needs to track all selectively acknowledged segments
this can consume lots of resources at the man in the middle and is prone to DoS
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
TCP state + window tracking
conntrack subsystem of netfilter
Netfilter architecture in IPv4
%font "typewriter"
%size 3
--->[1]--->[ROUTE]--->[3]--->[4]--->
              |            ^
              |            |
              |         [ROUTE]
              v            |
             [2]          [5]
              |            ^
              |            |
              v            |
%font "standard"
1=NF_IP_PRE_ROUTING
2=NF_IP_LOCAL_IN
3=NF_IP_FORWARD
4=NF_IP_POST_ROUTING
5=NF_IP_LOCAL_OUT
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
TCP state + window tracking
conntrack subsystem of netfilter
Connection tracking...
implemented separately from NAT
enables stateful filtering
implementation
hooks into NF_IP_PRE_ROUTING to track packets
hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
protocol modules (currently TCP/UDP/ICMP)
application helpers (currently FTP, IRC, H.323, talk, SNMP)
divides packets into the following four categories
NEW - would establish new connection
ESTABLISHED - part of already established connection
RELATED - is related to established connection
INVALID - (multicast, errors...)
does _NOT_ filter packets itself
can be utilized by iptables using the 'state' match
is used by NAT Subsystem
%page
TCP state + window tracking
Further Reading
RFC793 (Transmission Control Protocol), RFC2018 (SACK), RFC1323 (window scaling)
http://www.netfilter.org/ - netfilter hacking howto contains some info


@ -0,0 +1,147 @@
0. Introduction / history
1. Basics (.tex file, latex, .dvi, dvips, PostScript, ...)
2. What does a LaTeX document look like?
\documentstyle[german]{article}
\pagestyle{empty}
\begin{document}
..
\end{document}
3. Entering text
- the number of spaces / line breaks is irrelevant
- new paragraph via a blank line or \par
- line break with \\
Hyphenation hints:
- local (only here): Donau\-dampf\-schiff
- local (additionally): Donau"-dampf"-schiff
- global: \hyphenation{Donau-dampf-schiff}
- no hyphenation, local: \mbox{Untrennbar}
- no hyphenation, global: \hyphenation{Untrennbar}
- ~ == space at which no line break may occur
Special characters:
- German: "a "o "u "A "O "U "s
- other: \'o \`o \^o \=o \.o \u{o} \v{o} \H{o} \t{oo} \c{c} \d{o}
e.g. C'est \c{c}a!
- \$ \& \% \# \{ \} [ ] \_ @ \S \pounds $<$ $>$ $\backslash$ ?` !`
Dashes and quotation marks:
- Dashes: - -- ---
- Quotation marks: \glq \grq \glqq \grqq \flq \frq \flqq \frqq
(german left quote, french right quote, ...)
Short forms: "` "' "> "<
Spacing
- \hspace{length}
- \vspace{length}
- \hfill \vfill
Ellipses
- \ldots{}
- \dotfill{}
- \vdots
Text emphasis
- emphasize {\em emphasized}
- underline \underline{ }
- verbatim \verb|text|
Fonts:
- Roman (\rm)
- Bold (\bf)
- Italic (\it) (italic correction \/)
- Slanted (\sl)
- Sans Serif (\sf)
- Small Caps (\sc)
- Typewriter (\tt)
Font sizes:
- \tiny
- \scriptsize
- \footnotesize
- \small
- \normalsize
- \large
- \Large
- \LARGE
- \huge
- \Huge
NFSS:
- \family{cmr|cmss|cmtt}
- \series{ul|el|l|sl|m|sb|b|eb|ub}
- \series{uc|ec|c|sc|m|sx|x|ex|ux}
- \shape{n|it|sl|sc}
- \size{size}{linespacing}
- \selectfont
Document structure
Sectioning:
- \part{Part title} \part*
- \chapter{Chapter title} \chapter*
- \section{Section title}
- \subsection{}
- \subsubsection{}
- \paragraph{} [text continues on the same line as the heading]
- \subparagraph{}
- \setcounter{page|part|chapter|section|...}{value}
Title page:
- \title{Title}
- \thanks{Acknowledgement}
- \author{Author}
- \date{Date}
- \maketitle
Abstract:
- \begin{abstract} \end{abstract}
Table of contents:
- \tableofcontents
- Depth of inclusion: \setcounter{tocdepth}{depth}
depth: -1 no heading
0 chapter
1 chapter and section
2 chapter to subsection
3 chapter to subsubsection
4 chapter to paragraph
5 all
Document layout:
- \documentstyle[german, option, ...]{style}
Common styles: article, report, book
Style options: 10pt,11pt,12pt,twoside,twocolumn,titlepage
- \pagestyle{plain|headings|empty|myheadings}
- \noindent
- Centered: \begin{center} \end{center}
- Flush left / flush right: \begin{flushleft|flushright}
- Quotation: \begin{quotation} \end{quotation}
- Lines of verse: \begin{verse} \end{verse}
- Verbatim: \begin{verbatim} \end{verbatim}
- Margin note: \marginpar{margin note}
- Footnote: \footnote{foobaR}
Lists and enumerations:
- List: \begin{itemize} \item \end{itemize}
- Enumeration: \begin{enumerate} \item \end{enumerate}
- Description: \begin{description} \item[what] \end{description}
Cross references:
- Target of the reference: \label{name}
- Reference: \ref{name}, page \pageref{name}
Bibliography:
- "As described in \cite[pages 12,13]{1} ..."
\begin{thebibliography}{99}
\bibitem{1} Markus Mueller, {\sl Mahlzeit - das Kochbuch, Addison-Wesley, 1995}
...
\end{thebibliography}


@ -0,0 +1,430 @@
\documentstyle[german,a4]{article}
\pagestyle{plain}
\setlength{\oddsidemargin}{0in}
\setlength{\evensidemargin}{0in}
\setlength{\topmargin}{0.0in}
\setlength{\headheight}{0in}
\setlength{\headsep}{0in}
\setlength{\textwidth}{6.5in}
\setlength{\textheight}{9.5in}
\setlength{\parindent}{0in}
\setlength{\parskip}{0.05in}
\begin{document}
\title{The UNIX way of text processing: \LaTeX}
\author{Harald Welte $<$laforge@gnumonks.org$>$}
\maketitle
\begin{abstract}
This document is meant as a short companion handout to my introductory course
on {\LaTeX}. Copying is permitted and encouraged.
\end{abstract}
\tableofcontents
\section{Introduction}
Many users today work with so-called {\em WYSIWYG} word processors.
This relatively new concept lets you edit a text document on screen the way
it will later come out of the printer (at least that is what the vendors
claim; reality usually looks different).
Typesetting systems such as \TeX{} or its extension {\LaTeX} follow a
completely different philosophy. Here the text is first written with an
arbitrary text editor, with certain commands and control characters embedded
in it. Afterwards the text processor is invoked, which generates the
actual result.
This procedure strongly resembles programming: the author writes a
source text, which a compiler translates into machine language.
This analogy is no coincidence, since \TeX{} was written by {\em Donald E. Knuth},
one of the most renowned computer scientists of all. Besides two
professorships, Knuth holds, among other things, 27 honorary doctorates and
numerous honours from distinguished institutions.
For decades Knuth has been writing authoritative reference works of computer
science, for example the multi-volume series {\em The Art of Computer Programming}.
Knuth could not find satisfactory software for writing his books properly,
so he simply developed one himself.
The first \TeX{} versions were published in 1978, and the last known
(actually manifesting) bug was fixed in 1985. The author offers a financial
reward for every bug found, which doubles with each bug
(starting at US\$ 1.28, today close to the cap of
US\$327.68).
\section{Why should I use {\LaTeX}?}
\begin{itemize}
\item
Professional typesetting system, completely free of charge
\item
Absolutely identical output regardless of computer hardware, operating system, software version or printer
\item
Compatibility across decades. Which proprietary word processor has existed since 1978 and can still work with the old documents today without problems?
\item
Excellent support for everything needed in scientific documents: mathematical typesetting, bibliographies, glossaries, ...
\end{itemize}
\section{My first {\LaTeX} document}
\subsection{Editing the .tex file in an editor}
Take your favourite text editor and write the following:
\begin{verbatim}
\documentstyle[german,a4]{article}
\pagestyle{empty}
\begin{document}
...
\end{document}
\end{verbatim}
The individual elements are discussed in detail later. For now, simply use
the template above. Where the ``...'' is, you can now write your own
text.
\subsection{The actual Tex(t) processing}
After the source text has been entered, the processor is invoked, which then produces the result. Under Unix/Linux the command looks like this:
\begin{verbatim}
latex meinedatei.tex
\end{verbatim}
Afterwards a file {\em meinedatei.dvi} is located in the same directory.
DVI stands for device-independent: the file describes the document in a
format that does not depend on the output device.
Under Linux this DVI file is best viewed with the program {\em xdvi}:
\begin{verbatim}
xdvi meinedatei.dvi
\end{verbatim}
If you want to fix something in the document, edit the .tex file again, run {\LaTeX} and look at the new .dvi.
\subsection{Printing}
The .dvi can now be converted into the final output format. Usually that is PostScript. The conversion to PostScript is done with {\em dvips}:
\begin{verbatim}
dvips -o meinedatei.ps meinedatei.dvi
\end{verbatim}
or send it straight to the printer:
\begin{verbatim}
dvips -f < meinedatei.dvi | lpr
\end{verbatim}
\section{General remarks on syntax}
\subsection{Whitespace}
Multiple consecutive spaces and a single line break are treated by {\LaTeX} like one space.
A new paragraph is started by a blank line (double line break).
Line breaks can be forced with $\backslash\backslash$ at the end of a
line. Spacing can also be inserted manually:
\begin{verbatim}
\hspace{length} % horizontal space
\vspace{length} % vertical space
\end{verbatim}
\subsection{Hyphenation}
{\LaTeX} hyphenates automatically, using the hyphenation rules of the
respective language ({\em german.sty}). The author can, however, give
hyphenation hints:
\begin{itemize}
\item
Exclusive hyphenation points (the word is broken only here): Donau$\backslash$-dampf$\backslash$-schiff
\item
Additional hyphenation points: Donau\dq-dampf\dq-schiff
\item
Global hyphenation rule for a word: $\backslash$hyphenation\{Donau-dampf-schiff\}
\item
No hyphenation (within the text): $\backslash$mbox\{Untrennbar\}
\item
No hyphenation (global): $\backslash$hyphenation\{Untrennbar\}
\item
Space at which no line break may occur: \~{}
\end{itemize}
\subsection{Special characters}
Special characters cannot simply be typed in running text, because they
often differ from character set to character set. Some other
symbols are used by {\LaTeX} itself as control and command characters.
\subsubsection{German umlauts}
\begin{verbatim}
"a "o "u "A "O "U "s
\end{verbatim}
"a "o "u "A "O "U "s
\subsubsection{Foreign accented characters}
\begin{verbatim}
\'o \`o \^o \=o \.o \u{o} \v{o} \H{o} \t{oo} \c{c} \d{o}
\end{verbatim}
\'o \`o \^o \=o \.o \u{o} \v{o} \H{o} \t{oo} \c{c} \d{o}
\subsubsection{Special characters/symbols}
\begin{verbatim}
\$ \& \% \# \{ \} [ ] \_ @ \S \pounds $<$ $>$ $\backslash$ ?` !`
\end{verbatim}
\$ \& \% \# \{ \} [ ] \_ @ \S \pounds $<$ $>$ $\backslash$ ?` !`
\section{Document structure}
\subsection{Sectioning}
Larger document styles such as book.sty have parts and chapters. They can be used as follows:
\begin{verbatim}
\part{Part title}
\chapter{Chapter title}
\end{verbatim}
Smaller document styles such as article.sty offer a subdivision into
sections, subsections, subsubsections, paragraphs and subparagraphs.
Of course these sectioning elements can also be used in larger
document styles.
\begin{verbatim}
\section{Section title}
\subsection{Subsection title}
\subsubsection{Subsubsection title}
\paragraph{Paragraph title}
\subparagraph{Subparagraph title}
\end{verbatim}
The individual sectioning elements are numbered automatically. How deep the numbering is displayed can be configured. By default, article numbers down to subsubsection, while report and book stop at subsection; paragraph and subparagraph get no displayed number.
\begin{verbatim}
\setcounter{secnumdepth}{value}
\end{verbatim}
\label{gliederungswerte}
Here {\em value} can take the following values:
\begin{description}
\item[-1] no numbers
\item[0] chapter only
\item[1] chapter and section
\item[2] chapter to subsection
\item[3] chapter to subsubsection
\item[4] chapter to paragraph
\item[5] all
\end{description}
\subsection{Title page}
Most document styles offer the possibility of generating a title page automatically. The following definitions are used for this:
\begin{verbatim}
\title{Title}
\thanks{Acknowledgement}
\author{Author}
\date{Date}
\maketitle
\end{verbatim}
\subsection{Abstract}
An abstract can be placed in front of the actual document:
\begin{verbatim}
\begin{abstract}
...
\end{abstract}
\end{verbatim}
\subsection{Table of contents}
A table of contents can be generated automatically from the sectioning elements. This is done with the command
\begin{verbatim}
\tableofcontents
\end{verbatim}
You can also determine down to which sectioning level entries are made in the table of contents:
\begin{verbatim}
\setcounter{tocdepth}{depth}
\end{verbatim}
Here {\em depth} can take the same values as described in section \ref{gliederungswerte} on page \pageref{gliederungswerte}.
\section{Formatting}
\subsection{Fonts}
You can of course choose between various fonts. For simplicity, only the standard fonts are described here:
\begin{itemize}
\item
{\rm Roman}: $\backslash$rm
\item
{\bf Bold}: $\backslash$bf
\item
{\it Italic}: $\backslash$it (italic correction $\backslash$/)
\item
{\sl Slanted}: $\backslash$sl
\item
{\sf Sans Serif}: $\backslash$sf
\item
{\sc Small Caps}: $\backslash$sc
\item
{\tt Typewriter}: $\backslash$tt
\end{itemize}
\subsection{Font sizes}
\begin{itemize}
\item {\tiny $\backslash$tiny}
\item {\scriptsize $\backslash$scriptsize}
\item {\footnotesize $\backslash$footnotesize}
\item {\small $\backslash$small}
\item {\normalsize $\backslash$normalsize}
\item {\large $\backslash$large}
\item {\Large $\backslash$Large}
\item {\LARGE $\backslash$LARGE}
\item {\huge $\backslash$huge}
\item {\Huge $\backslash$Huge}
\end{itemize}
\subsection{Text alignment}
By default {\LaTeX} always sets text justified. This can be
changed:
\begin{verbatim}
\begin{center}
Centered
\end{center}
\end{verbatim}
Besides {\em center}, {\em flushleft} or {\em flushright} can also be
used to obtain left-aligned or right-aligned output.
\subsection{Quotations}
Quotations can be included as follows:
\begin{verbatim}
\begin{quotation}
Mahlzeit
\end{quotation}
\end{verbatim}
\section{Lists and enumerations}
\subsection{Lists}
A list can be created as follows:
\begin{verbatim}
\begin{itemize}
\item first entry
\item second entry
\end{itemize}
\end{verbatim}
\subsection{Enumerations}
An enumeration can be created as follows:
\begin{verbatim}
\begin{enumerate}
\item first entry
\item second entry
\end{enumerate}
\end{verbatim}
\subsection{Descriptions/definitions}
A description can be created as follows:
\begin{verbatim}
\begin{description}
\item[Donald E. Knuth] Author of the well-known TeX system
\item[Donald Becker] Author of countless Linux network drivers
\end{description}
\end{verbatim}
\section{Footnotes, cross references, bibliography}
\subsection{Footnotes}
A footnote\footnote{This is what footnotes look like *g*} is simply written
into the text at the place where it should appear:
\begin{verbatim}
If, on the other hand, you use the Sub-Etha-Sens-O-Matic\footnote{A device
for detecting nearby spaceships}, then ....
\end{verbatim}
\subsection{Cross references}
Cross references can be inserted, which then automatically point to the
current section number / page, even if the target of the
cross reference moves.
At the target of the cross reference (that is, where you want to point to),
the following command is inserted:
\begin{verbatim}
\label{namedeslabels}
\end{verbatim}
A cross reference to it then looks as follows:
\begin{verbatim}
As described in section \ref{namedeslabels} on page \pageref{namedeslabels}
\end{verbatim}
\subsection{Bibliography}
A bibliography is needed above all in scientific documents.
There are two different ways of using a
bibliography in {\LaTeX}.
The simple variant described here is suitable for small, individual
documents. If you frequently write documents on the same topics, you should
look into {\em bibtex}, which lets you maintain a central
bibliography that all documents can refer
to.
A reference to a bibliography entry looks like this:
\begin{verbatim}
As described in \cite[pages 12 ff.]{1}, ...
\end{verbatim}
The bibliography at the end then looks like this:
\begin{verbatim}
\begin{thebibliography}{99}
\bibitem{1} Markus M"uller, {\sl Mahlzeit - das Kochbuch, Addison-Wesley, 1995}
\end{thebibliography}
\end{verbatim}
\section{Document styles}
All examples so far have used the document style {\em article.sty}. The document style to be used is specified in the head of the document with the {\em $\backslash$documentstyle} command.
Common document styles:
\begin{description}
\item[article] For writing structured documents of limited length
\item[book] For writing an entire book
\item[dinbrief] For writing a letter that follows the DIN standard exactly
\end{description}
In addition, options are given in the square brackets of the {\em documentstyle}
command.
Common options:
\begin{description}
\item[10pt, 11pt, 12pt] Default font size
\item[german] Support for German umlauts and hyphenation rules
\item[twocolumn] Two-column layout
\item[twoside] Two-sided printing (page numbers, etc.)
\end{description}
\end{document}



@ -0,0 +1 @@
http://www.allnet.de/ftp/pub/allnet/wireless/all0277/ALL0277_1.02.6_ETSI_0703_code.bin


@ -0,0 +1,113 @@
%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%back "blue"
%center
%size 7
Reverse Engineering
%size 5
of Linux-Based Firmware Images
%center
%size 4
by
Harald Welte <laforge@gnumonks.org>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Linux Firmware Reverse Engineering
Overview
Linux has gained ground in the commercial market
Embedded hardware is getting cheaper
Network Appliances become more popular
802.11(abg) Access Points, Bridges, Routers
DSL 'Routers' (in reality NAT-gateways)
Users demand more and more CPU-intensive functions
PPPoE, PPTP
NAT with ALG's for H.323, PPTP
IPsec
Many vendors seem to conclude:
Why not use Linux?
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Linux Firmware Reverse Engineering
Why is this worth a presentation?
Vendors tend to forget about their GPL obligations
They have to
redistribute or make available the sourcecode
redistribute or make available build scripts
inform their users about their rights and obligations under the GPL
They are not allowed to link with GPL-incompatible code
Vendors tend to forget about security issues
Most people don't know that their appliance runs Linux
Thus they won't even know that they're affected by a vulnerability
Vendors of consumer-class equipment tend to be lazy
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Linux Firmware Reverse Engineering
How to start (from a technical point of view)
In most cases you don't even need the device
Firmware images are available for download from the vendors
Reverse engineering starts by looking at that binary
In a number of cases, you will either find
a gzip signature for a compressed kernel
a signature of a cramfs disk image
a configuration file to enable/disable features
some other (arj/lzh/zip/...) image
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Linux Firmware Reverse Engineering
How to start from a technical point of view (cont'd)
Useful tools for looking at that image
'strings' (from gnu binutils)
your favourite hex editor
'file' (especially its 'magic' signature file)
libmagic (library for accessing 'magic' signatures)
Strings to look for:
'piggy' (compressed kernel image)
0x28cd3d45 (cramfs, the compressed ROM filesystem)
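Looking for these signatures by hand can be sketched in a few lines of C. The helper name find_sig is invented for this example. The cramfs magic is stored in the byte order of the target CPU, so the byte sequence below assumes a little-endian image; a big-endian image would contain 28 cd 3d 45 instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* 0x28cd3d45 as it appears in a little-endian cramfs image */
static const unsigned char CRAMFS_MAGIC_LE[4] = { 0x45, 0x3d, 0xcd, 0x28 };
/* first two bytes of every gzip stream */
static const unsigned char GZIP_MAGIC[2] = { 0x1f, 0x8b };

/* Return the offset of the first occurrence of sig in buf, or -1. */
static long find_sig(const unsigned char *buf, size_t len,
                     const unsigned char *sig, size_t siglen)
{
    size_t i;

    if (siglen == 0 || len < siglen)
        return -1;
    for (i = 0; i + siglen <= len; i++) {
        if (memcmp(buf + i, sig, siglen) == 0)
            return (long)i;
    }
    return -1;
}
```

A two-byte signature such as gzip's will produce false positives in a large image, so every hit still needs to be checked, for example by trying to decompress from that offset.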
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Linux Firmware Reverse Engineering
Practical Example
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Linux Firmware Reverse Engineering
Thanks
The slides of this presentation are available at http://www.gnumonks.org/
Thanks to
the BBS people, Z-Netz, FIDO, ...
for heavily increasing my computer usage in 1992
KNF
for bringing me in touch with the internet as early as 1994
for providing a playground for technical people
for telling me about the existence of Linux!
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
for implementing (one of?) the world's best TCP/IP stacks
Paul 'Rusty' Russell
for starting the netfilter/iptables project
for trusting me to maintain it today
Astaro AG
for sponsoring parts of my netfilter work


@ -0,0 +1,79 @@
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <magic.h>
/* magic_ofs - check for 'file' magic at any possible offset within a file
 *
 * (C) 2003 by Harald Welte <laforge@gnumonks.org>
 *
 * This code is subject to the GNU GPL v2
 */
int main(int argc, char **argv)
{
	struct stat st;
	magic_t cookie;
	int fd;
	off_t i;
	void *mem;

	if (argc < 2) {
		fprintf(stderr, "you have to name a file\n");
		exit(2);
	}
	if (!strlen(argv[1])) {
		fprintf(stderr, "empty argument\n");
		exit(2);
	}
	fd = open(argv[1], O_RDONLY);
	if (fd < 0) {
		fprintf(stderr, "unable to open file\n");
		exit(1);
	}
	if (fstat(fd, &st)) {
		fprintf(stderr, "unable to stat file\n");
		exit(1);
	}
	/* map the whole image read-only; mmap reports errors via MAP_FAILED */
	mem = mmap(0, st.st_size, PROT_READ, MAP_SHARED, fd, (off_t) 0);
	if (mem == MAP_FAILED) {
		fprintf(stderr, "unable to mmap file\n");
		exit(1);
	}
	cookie = magic_open(MAGIC_CONTINUE);
	if (!cookie) {
		fprintf(stderr, "error opening libmagic\n");
		exit(1);
	}
	/* NULL selects the default magic database */
	if (magic_load(cookie, NULL)) {
		fprintf(stderr, "error during magic_load\n");
		magic_close(cookie);
		exit(1);
	}
	/* ask libmagic about the data starting at every single offset */
	for (i = 0; i < st.st_size; i++) {
		const char *desc;

		desc = magic_buffer(cookie, (const char *) mem + i,
				    (size_t) (st.st_size - i));
		if (!desc) {
			break;
		}
		/* "data" means: nothing recognized at this offset */
		if (!strcmp(desc, "data")) {
			continue;
		}
		printf("%8.8llu: %s\n", (unsigned long long) i, desc);
	}
	magic_close(cookie);
	exit(0);
}


@ -0,0 +1,26 @@
How about the following title:
"Introduction to the architecture of the Linux kernel - a look beyond the
syscall horizon of userspace processes"
Part 1: Theoretical foundations
- kernel/userspace: tasks, boundaries, points of contact
- Execution context: User, Syscall, Softirq, Hardirq, Kernelthread, Tasklet
- The scheduler
- Primitives: Spinlocks, rwlocks, Mutex, Waitqueues
Part 2: Exemplary look at individual subsystems
- Network stack: from the reception of the packet on the network card to
its reception in the userspace process
- Filesystem: from the read syscall to reading the disk and back
- tasks
- virtual memory management
- process management
- filesystem
- networking
- hardware abstraction
- inter-process communication
- interfaces for userspace programs
- syscalls
-


@ -0,0 +1,300 @@
%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%back "blue"
%center
%size 7
Architecture of the Linux kernel
%size 5
or: The world beyond the syscall barrier
%center
%size 4
by
Harald Welte <laforge@gnumonks.org>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Architecture of the Linux kernel
Prerequisites
Due to the technical nature of this presentation, the audience should be familiar with the following subjects
experience in programming on a Linux/*NIX system
C language preferred
general knowledge about computer hardware
interrupts / IO / DMA
general knowledge about modern CPU architecture
address space / MMU
'protected mode' / supervisor mode / ...
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Architecture of the Linux kernel
Kernel / Userspace
OS kernel provides
hardware abstraction (file I/O, network I/O, ...)
resource allocation / limiting
address space separation
privilege separation
IPC
the traditional process model in *NIX operating systems
processes reside in separate virtual address spaces
kernel only executes one process (init) at bootup
all other processes descend from init
processes are scheduled and preempted by the kernel
processes invoke system functions via syscalls.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Architecture of the Linux kernel
System calls
Definition
a userspace process enters the kernel
mechanism is CPU architecture dependent
can be software interrupt (int 0x80)
can be special asm instruction (sysenter)
arguments are passed in CPU registers (on i386), not on the userspace stack
common examples
open/close/read/write
exit/fork/execve/kill
socketcall (multiplexes socket/bind/connect/listen)
about 270 system calls in 2.6.x kernels
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Architecture of the Linux kernel
Invocation of system call
chronological order of events in case of a system call
userspace process calls library function
library function is executed within the process' address space
library will eventually prepare a system call, placing the arguments where the kernel expects them (registers on i386)
library will issue syscall (int 0x80 / sysenter / ...)
execution will switch to syscall context in kernel mode
kernel will look up systemcall table and dispatch to respective function
syscall function in the kernel will handle the syscall
all data between kernel/userspace needs to be copied between address spaces
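The difference between "library function" and "system call" can be made visible from userspace. The sketch below assumes Linux with glibc, whose syscall() wrapper loads the system call number and arguments and then enters the kernel; the helper name raw_getpid is made up for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Enter the kernel directly with the getpid system call number,
 * bypassing the normal getpid() library function. */
static pid_t raw_getpid(void)
{
    return (pid_t)syscall(SYS_getpid);
}
```

Both paths end up in the same kernel function, so raw_getpid() and getpid() return the same value; running such a program under strace shows the system call as it crosses the boundary.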
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Architecture of the Linux kernel
Execution contexts
apart from scheduling between different userspace processes, the kernel has different jobs like reacting to an external event
hardirq
hardware interrupt line was triggered
softirq
the workhorse behind a hardirq
userspace
executing within userspace process
syscall
invoked by a system call from userspace
vsyscall
virtual system calls, executed in userspace context
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Architecture of the Linux kernel
hardirq context
interrupt generated by hardware is received + handled
can be interrupted by other hardirq's
does only minimal job and returns
examples
packet has arrived on network board
character was received on serial port
dma read/write to disk drive has completed
timer interrupt went off
in most cases, a hardirq is followed by softirq or tasklet.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Architecture of the Linux kernel
softirq context
softirqs are run after hardirq
do the real work associated with a hardirq
multithreaded (can run simultaneously on multiple cpus)
examples
network receive softirq
timer softirq
prior to softirq's, linux had so-called 'bottom halves'
softirq introduced in 2.4.x (net rx/tx softirq)
bottom halves removed in 2.6.x
difference: only one BH can be run at a time
BH's have to be converted to tasklets in 2.6.x
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Architecture of the Linux kernel
tasklets
tasklets are somewhere in between softirqs and bottom halves
one particular tasklet cannot run on multiple CPUs simultaneously
different tasklets can run on different CPUs simultaneously
otherwise, same as softirq context
tasklets are implemented inside the 'tasklet softirq'
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Architecture of the Linux kernel
syscall / userspace context
userspace context
in userspace, executing a process
syscall context
inside kernel, when userspace process issues syscall()
vsyscalls (virtual syscalls)
first introduced with the x86-64 (AMD Opteron) arch
fast read-only access to kernel data structures
can do stuff like gettimeofday() without context switch
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Architecture of the Linux kernel
synchronization
Due to reentrancy and SMP, synchronization issues arise:
simple case: UP system
softirq can be interrupted by hardirq
thus, shared structures (queues, ...) need to be protected
complex case: SMP system
softirq can run at the same time on multiple CPU's
as softirqs are multithreaded, synchronization between threads has to be implemented
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Architecture of the Linux kernel
synchronization primitives
busy-waiting locks
spinlocks
if lock was not taken, take it and continue
if lock was taken, busy-loop until it is free
rwlocks
special case of spinlocks
useful when structure protected by lock is often read but rarely updated/written to
allows either
multiple readers simultaneously, or
only one writer [and no readers]
brlocks
super-fast read/write locks, with write-side penalty
avoid cache ping-pong in multi reader case
only in kernel 2.4.x
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Architecture of the Linux kernel
synchronization primitives (cont'd)
sleeper locks
semaphores
if semaphore can be acquired, continue
if semaphore cannot be acquired, put current process to sleep
once semaphore is available again, wakeup process
WARNING: can only be used in userspace/syscall (process) context, where sleeping is allowed
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Architecture of the Linux kernel
new locking primitives in 2.6.x
seqlocks
introduced with vsyscalls in 2.5/2.6
reader/writer consistent mechanism without starving writers
readers never block but may have to retry if write in progress
read copy update
new lockless mechanism in kernel 2.5/2.6
defers freeing the old version of a data structure until all CPU's have scheduled, so that nobody holds a reference to it any more
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Architecture of the Linux kernel
example: incoming network packet
hardirq context
NIC raises its interrupt line after a packet was received
kernel enters (arch/i386/kernel/entry.S:common_interrupt)
core interrupt handler (arch/i386/kernel/irq.c:do_IRQ)
hardirq handler of network driver (drivers/net/tulip/interrupt.c:tulip_interrupt)
net/core/dev.c:netif_rx(): append skb to backlog queue
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Architecture of the Linux kernel
example: incoming network packet
softirq context
net/core/dev.c:net_rx_action()
net/core/dev.c:process_backlog()
net/core/dev.c:netif_receive_skb()
net/core/dev.c:deliver_skb()
net/ipv4/ip_input.c:ip_rcv()
netfilter prerouting hook
net/ipv4/ip_input.c:ip_rcv_finish()
call routing code
net/ipv4/ip_input.c:ip_local_deliver()
netfilter localin hook
net/ipv4/ip_input.c:ip_local_deliver_finish()
call l4 protocol
net/ipv4/udp.c:udp_rcv()
lookup socket, if any
include/net/sock.h:sock_queue_rcv_skb()
enqueue into socket receiver queue
net/core/sock.c:sock_def_readable()
wake_up_interruptible() on socket waitqueue
return from recv() via socketcall
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Architecture of the Linux kernel
example: reading of a file
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Thanks
The slides and the accompanying paper for this presentation are available at http://www.gnumonks.org/
Thanks to
the BBS people, Z-Netz, FIDO, ...
for heavily increasing my computer usage in 1992
KNF
for bringing me in touch with the internet as early as 1994
for providing a playground for technical people
for telling me about the existence of Linux!
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
for implementing (one of?) the world's best TCP/IP stacks
Paul 'Rusty' Russell
for starting the netfilter/iptables project
for trusting me to maintain it today
Astaro AG
for sponsoring parts of my netfilter work


@ -0,0 +1,315 @@
%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%back "blue"
%center
%size 7
Linux Kernel Architecture
%size 5
SMP issues, locking primitives
%center
%size 4
by
Harald Welte <laforge@gnumonks.org>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Linux Kernel Architecture
Prerequisites
Due to the technical nature of this presentation, the audience should be familiar with the following subjects
experience in programming on a Linux/*NIX system
C language preferred
general knowledge about computer hardware
interrupts / IO / DMA
general knowledge about modern CPU architecture
address space / MMU
'protected mode' / supervisor mode / ...
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Linux Kernel Architecture
Kernel / Userspace
OS kernel provides
hardware abstraction (file I/O, network I/O, ...)
resource allocation / limiting
address separation
privilege separation
IPC
the traditional process model in *NIX operating systems
processes reside in separate virtual address spaces
kernel only executes one process (init) at bootup
all other processes descend from init
processes are scheduled and preempted by the kernel
processes invoke system functions via syscalls.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Linux Kernel Architecture
System calls
Definition
a userspace process enters the kernel
mechanism is CPU architecture dependent
can be software interrupt (int 0x80)
can be special asm instruction (sysenter)
arguments are passed in CPU registers (Linux/i386)
common examples
open/close/read/write
exit/fork/execve/kill
socketcall (implements socket/bind/connect/listen)
about 270 system calls in 2.6.x kernels
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Linux Kernel Architecture
Invocation of system call
chronological order of events in case of a system call
userspace process calls library function
library function is executed within the process' address space
library will eventually issue a system call, loading the arguments into registers
library will issue syscall (int 0x80 / sysenter / ...)
execution will switch to syscall context in kernel mode
kernel will look up system call table and dispatch to respective function
syscall function in the kernel will handle the syscall
all data between kernel/userspace needs to be copied between address spaces
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Linux Kernel Architecture
Execution contexts
apart from scheduling between different userspace processes, the kernel has different jobs like reacting to an external event
hardirq
hardware interrupt line was triggered
softirq
the workhorse behind a hardirq
userspace
executing within userspace process
syscall
invoked by a system call from userspace
vsyscall
virtual system calls, executed in userspace context
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Linux Kernel Architecture
hardirq context
interrupt generated by hardware is received + handled
can be interrupted by other hardirq's
does only minimal job and returns
examples
packet has arrived on network board
character was received on serial port
dma read/write to disk drive has completed
timer interrupt went off
in most cases, a hardirq is followed by softirq or tasklet.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Linux Kernel Architecture
softirq context
softirqs are run after hardirq
do the real work associated with a hardirq
multithreaded (can run simultaneously on multiple cpus)
examples
network receive softirq
timer softirq
prior to softirq's, linux had so-called 'bottom halves'
softirq introduced in 2.4.x (net rx/tx softirq)
bottom halves removed in 2.6.x
difference: only one BH can be run at a time
BH's have to be converted to tasklets in 2.6.x
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Linux Kernel Architecture
tasklets
tasklets are somewhat in between softirq's and bottom halves
one particular tasklet cannot run on multiple CPUs simultaneously
different tasklets can run on different CPUs simultaneously
otherwise, same as softirq context
tasklets are implemented inside the 'tasklet softirq'
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Linux Kernel Architecture
syscall / userspace context
userspace context
in userspace, executing a process
syscall context
inside kernel, when userspace process issues syscall()
vsyscalls (virtual syscalls)
first introduced with the x86-64 (AMD Opteron) arch
fast read-only access to kernel data structures
can do stuff like gettimeofday() without context switch
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Linux Kernel Architecture
synchronization
Due to reentrancy and SMP, synchronization issues arise:
simple case: UP system
softirq can be interrupted by hardirq
thus, shared structures (queues, ...) need to be protected
complex case: SMP system
softirq can run at the same time on multiple CPU's
as softirqs are multithreaded, synchronization between threads has to be implemented
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Linux Kernel Architecture
synchronization primitives
busy-waiting locks
spinlocks
if lock was not taken, take it and continue
if lock was taken, busy-loop until it is free
rwlocks
special case of spinlocks
useful when structure protected by lock is often read but rarely updated/written to
allows either
multiple readers simultaneously, or
only one writer [and no readers]
brlocks
super-fast read/write locks, with write-side penalty
avoid cache ping-pong in multi reader case
only in kernel 2.4.x
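The spinlock behaviour described on this slide can be sketched in userspace with C11 atomics. This is only an illustration under stated assumptions: the `demo_spin_*` names are hypothetical and this is not the kernel's `spinlock_t` implementation, which additionally deals with interrupts and preemption.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace illustration; not the kernel's spinlock_t. */
struct demo_spinlock {
    atomic_flag taken;
};

#define DEMO_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once: returns true if the lock was free and is now ours. */
bool demo_spin_trylock(struct demo_spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->taken,
                                              memory_order_acquire);
}

/* If the lock was taken, busy-loop until it is free, then take it. */
void demo_spin_lock(struct demo_spinlock *l)
{
    while (!demo_spin_trylock(l))
        ; /* spin */
}

/* Release the lock so that a spinning CPU can take it. */
void demo_spin_unlock(struct demo_spinlock *l)
{
    atomic_flag_clear_explicit(&l->taken, memory_order_release);
}
```

Because the waiter never sleeps, a lock of this kind only makes sense for short critical sections.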
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Linux Kernel Architecture
synchronization primitives (cont'd)
sleeper locks
semaphores
if semaphore can be acquired, continue
if semaphore cannot be acquired, put current process to sleep
once semaphore is available again, wakeup process
WARNING: can only be used in userspace/syscall (process) context, where sleeping is allowed
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Linux Kernel Architecture
new locking primitives in 2.6.x
seqlocks
introduced with vsyscalls in 2.5/2.6
reader/writer consistent mechanism without starving writers
readers never block but may have to retry if write in progress
read copy update
new lockless mechanism in kernel 2.5/2.6
defers freeing the old version of a data structure until all CPU's have scheduled, so that nobody holds a reference to it any more
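The seqlock idea (readers never block, but retry if a write was in progress) can be sketched in userspace with C11 atomics. The names `demo_seqclock`, `demo_clock_write` and `demo_clock_read` are hypothetical, and the sketch assumes a single writer; it is not the kernel's `seqlock_t` API.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical names; a userspace sketch, not the kernel's seqlock_t. */
struct demo_seqclock {
    atomic_uint seq;   /* odd while a write is in progress */
    atomic_long sec;
    atomic_long usec;
};

/* Assumes a single writer; several writers would need an extra lock. */
void demo_clock_write(struct demo_seqclock *c, long sec, long usec)
{
    unsigned s = atomic_load_explicit(&c->seq, memory_order_relaxed);

    atomic_store_explicit(&c->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&c->sec, sec, memory_order_relaxed);
    atomic_store_explicit(&c->usec, usec, memory_order_relaxed);
    atomic_store_explicit(&c->seq, s + 2, memory_order_release);
}

/* Never blocks; retries while a write is in progress or has intervened. */
void demo_clock_read(struct demo_seqclock *c, long *sec, long *usec)
{
    unsigned before, after;

    do {
        before = atomic_load_explicit(&c->seq, memory_order_acquire);
        *sec = atomic_load_explicit(&c->sec, memory_order_relaxed);
        *usec = atomic_load_explicit(&c->usec, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        after = atomic_load_explicit(&c->seq, memory_order_relaxed);
    } while ((before & 1u) || before != after);
}
```

The writer never waits for readers, which is why writers cannot be starved; the cost is that a reader may have to repeat its read.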
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Linux Kernel Architecture
example: incoming network packet
hardirq context
NIC raises its interrupt line after a packet was received
kernel enters (arch/i386/kernel/entry.S:common_interrupt)
core interrupt handler (arch/i386/kernel/irq.c:do_IRQ)
hardirq handler of network driver (drivers/net/tulip/interrupt.c:tulip_interrupt)
net/core/dev.c:netif_rx(): append skb to backlog queue
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Linux Kernel Architecture
example: incoming network packet
softirq context
net/core/dev.c:net_rx_action()
net/core/dev.c:process_backlog()
net/core/dev.c:netif_receive_skb()
net/core/dev.c:deliver_skb()
net/ipv4/ip_input.c:ip_rcv()
netfilter prerouting hook
net/ipv4/ip_input.c:ip_rcv_finish()
call routing code
net/ipv4/ip_input.c:ip_local_deliver()
netfilter localin hook
net/ipv4/ip_input.c:ip_local_deliver_finish()
call l4 protocol
net/ipv4/udp.c:udp_rcv()
lookup socket, if any
include/net/sock.h:sock_queue_rcv_skb()
enqueue into socket receiver queue
net/core/sock.c:sock_def_readable()
wake_up_interruptible() on socket waitqueue
return from recv() via socketcall
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Linux Kernel Architecture
Cache Effects
SMP systems have multiple CPU's
Every CPU has its own cache(s) / cache hierarchies
Most modern CPU archs are cache coherent in hardware
This means a certain chunk of memory can only be write-cached on one CPU at a given time
Frequently updated data structures will ping-pong between CPU caches
Data structures have to be organized to avoid cache issues
Cacheline alignment
very easy by using SLAB_HWCACHE_ALIGN
per-cpu data structures
e.g. packet counters: have one for every CPU
structure layout
put all writeable/updated members together
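The per-CPU packet counter idea can be sketched in plain C11 as follows. The names, the CPU count and the 64-byte cache line size are assumptions made for this illustration; kernel code would use its own per-CPU and alignment helpers instead.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

#define DEMO_NR_CPUS   4   /* assumption for this sketch */
#define DEMO_CACHELINE 64  /* assumption: 64-byte cache lines */

/* Each CPU's counter lives in its own cache line, so an update on one
 * CPU does not invalidate the line holding another CPU's counter. */
struct demo_pkt_counter {
    alignas(DEMO_CACHELINE) uint64_t packets;
};

struct demo_pkt_counter demo_counters[DEMO_NR_CPUS];

/* Fast path: touches only the slot of the given CPU. */
void demo_count_packet(unsigned cpu)
{
    demo_counters[cpu].packets++;
}

/* Slow path: sum all slots when somebody asks for the total. */
uint64_t demo_total_packets(void)
{
    uint64_t total = 0;

    for (unsigned i = 0; i < DEMO_NR_CPUS; i++)
        total += demo_counters[i].packets;
    return total;
}
```

The trade-off is memory for speed: every counter occupies a full cache line, and reading the total becomes more expensive than updating it.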
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Thanks
The slides and the accompanying paper for this presentation are available at http://www.gnumonks.org/
Thanks to
the BBS people, Z-Netz, FIDO, ...
for heavily increasing my computer usage in 1992
KNF
for bringing me in touch with the internet as early as 1994
for providing a playground for technical people
for telling me about the existence of Linux!
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
for implementing (one of?) the world's best TCP/IP stacks
Paul 'Rusty' Russell
for starting the netfilter/iptables project
for trusting me to maintain it today
Astaro AG
for sponsoring parts of my netfilter work
linux-bangalore
for sponsoring my trip to this conference


@ -0,0 +1,71 @@
- rule loadtime performance
- loading 10k rules in 1k chains takes 4 min 30 s on a P3-733
- 27 seconds in kernelspace: mark_source_chains()
- reimplementation finished, needs more testing
- 4 minutes in userspace: Two n^2 complexity functions
- one of them could be removed in old chain_cache framework
- other function needs reimplementation (underway)
- ctnetlink still under development, used by a couple of large sites
- pkt_tables to be merged later in 2.6.x
- change to linked lists of rules in linked lists of chains
- use netlink-based kernel/userspace interface
- iptables2/pkttables userspace
- libnfnetlink / libpkttnetlink as low-layer interface
- move all iptables functionality into libpkttables
- libpkttables provides query-interface
- what matches/targets does this system support?
- what parameters does match 'foo' support?
- what values are acceptable for param 'bar' of match 'foo'?
- what is the help message for param 'bar' of match 'foo'?
- nf-hipac as high-performance alternative to iptables
- very complex multi-dimensional tree structure
- 530 kilobyte patch, 180k kernel module
- algorithm well-proven and regression-tested in userspace
- scales really well even with 100k rules
- now supports all iptables matches/targets
- cannot replace iptables because
- large footprint
- high memory usage
- most likely to be integrated after pkt_tables / pkttnetlink merge
- Session logging
- different implementations (SLOG one of them)
- best solution: ctnetlink event API
- problem: per-connection byte/packet counters in conntrack are a performance hit
- ipv6 connection tracking
- usagi people are working on this
- non-linear skb support (removal of skb_linearize())
- thanks to rusty, 2.5.x/2.6.x now has support
- changes in almost any netfilter/iptables API :(
- stateful failover / state synchronization
- no sponsor yet, but most likely in Q4/2003
- conntrack optimization
- new hashing algorithm in 2.4.21, should improve significantly
- locking optimization
- don't use timer per conntrack, but an expiration kernel thread
- TRACE target / raw table
- experimental patch in patch-o-matic
- enables tracing of packet through ruleset
- netfilter workshop, August 2003, Budapest, Hungary
- about 20 people will attend
- sponsored by Astaro Inc and KFKI Research Institute
- open to the public, registration needed
- we need more community
- developer diaries on netfilter homepage?
- wiki or similar tool ?
- announcement of IRC channel(s) on website
- patch-o-matic 2.6.x future?
- I will only maintain patch-o-matic for 2.6.x
- maybe somebody wants to backport patches?
- maybe an official 2.4.x maintainer?
- development of testing tools
- simple packet generator not suitable for stateful filtering
- even simple packet generators are very expensive
- connection generator
- user can specify profile of a connection
- e.g. HTTP: TCP, 500 bytes one direction, 10k other
- user can specify quantity and distribution
- i.e. 10k 'HTTP', from random source to single dest.
- first implementation will be userspace-only, may change later
- work will start in September/October, I'll post an RFC
- deprecate ipfwadm


@ -0,0 +1,368 @@
%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%back "blue"
%center
%size 7
The future of Linux packet filtering
targeted for kernel 2.6 and beyond
%center
%size 4
by
Harald Welte <laforge@gnumonks.org>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Contents
Problems with current 2.4.x netfilter/iptables
Solution to code replication
Solution for dynamic rulesets
Solution for API to GUI's and other management programs
HA for stateful firewalling
What's special about firewalling HA
Poor man's failover
Real state replication
Other current work
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Problems with 2.4.x netfilter/iptables
code replication between iptables/ip6tables/arptables
iptables was never meant for other protocols, but people did copy+paste 'ports'
replication of
core kernel code
layer 3 independent matches (mac, interface, ...)
userspace library (libiptc)
userspace tool (iptables)
userspace plugins (libipt_xxx.so)
doesn't suit the needs for dynamically changing rulesets
dynamic rulesets becoming more common (service selection, IDS)
a whole table is created in userspace and sent as blob to kernel
for every ruleset change the whole table needs to be copied to userspace and back
inside kernel consistency checks on whole table, loop detection
too extensible for writing any forward-compatible GUI
new extensions showing up all the time
a frontend would need to know about the options and use of a new extension
thus frontends are always incomplete and out-of-date
no high-level API other than piping to iptables-restore
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Reducing code replication
code replication is a real problem: unclean, bugfixes missed
we need layer 3 independent layer for
submitting rules to the kernel
traversing packet-rulesets supporting match/target modules
registering matches/targets
layer 3 specific (like matching ipv4 address)
layer 3 independent (like matching MAC address)
solution
pkt_tables inside kernel
pkt_tables_ipv4 registers layer 3 handler with pkt_tables
pkt_tables_ipv6 registers layer 3 handler with pkt_tables
everybody registering a pkt_table (like iptable_filter) needs to specify the l3 protocol
libraries in userspace (see later)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Supporting dynamic rulesets
atomic table-replacement turned out to be a bad idea
need new interface for sending individual rules to kernel
policy routing has the same problem and good solution: rtnetlink
solution: nfnetlink
multicast-netlink based packet-oriented socket between kernel and userspace
has extra benefit that other userspace processes get notified of rule changes [just like routing daemons]
nfnetlink is a low-layer below all kernel/userspace communication
pkttnetlink [aka iptnetlink]
ctnetlink
ulog
ip_queue
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Communication with other programs
whole set of libraries
libnfnetlink for low-layer communication
libpkttnetlink for rule modifications
will handle all plugins [which are currently part of iptables]
query functions about available matches/targets
query functions about parameters
query functions for help messages about specific match/parameter of a match
generic structure from which rules can be built
conversion functions to parse generic structure into in-kernel structure
conversion functions to parse kernel structure into generic structure
functions to convert generic structure into plain text
libipq will stay API-compatible to current version
libipulog will stay API-compatible to current version
libiptc will go away [compatibility layer extremely difficult]
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Introduction
What is special about firewall failover?
Nothing, in case of the stateless packet filter
Common IP takeover solutions can be used
VRRP
Heartbeat
Distribution of packet filtering ruleset no problem
can be done manually
or implemented with simple userspace process
Problems arise with stateful packet filters
Connection state only on active node
NAT mappings only on active node
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Connection tracking...
implemented separately from NAT
enables stateful filtering
implementation
hooks into NF_IP_PRE_ROUTING to track packets
hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
protocol modules (currently TCP/UDP/ICMP)
application helpers (currently FTP, IRC, H.323, talk, SNMP)
divides packets into the following four categories
NEW - would establish new connection
ESTABLISHED - part of already established connection
RELATED - is related to established connection
INVALID - (multicast, errors...)
does _NOT_ filter packets itself
can be utilized by iptables using the 'state' match
is used by NAT Subsystem
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Common structures
struct ip_conntrack_tuple, representing unidirectional flow
layer 3 src + dst
layer 4 protocol
layer 4 src + dst
connections represented as struct ip_conntrack
original tuple
reply tuple
timeout
l4 state private data
app helper
app helper private data
expected connections
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Flow of events for new packet
packet enters NF_IP_PRE_ROUTING
tuple is derived from packet
lookup conntrack hash table with hash(tuple) -> fails
new ip_conntrack is allocated
fill in original and reply == inverted(original) tuple
initialize timer
assign app helper if applicable
see if we've been expected -> fails
call layer 4 helper 'new' function
...
packet enters NF_IP_POST_ROUTING
do hashtable lookup for packet -> fails
place struct ip_conntrack in hashtable
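The tuple handling in this flow (derive tuple, hash lookup, fill in original and inverted reply tuple, place in hashtable) can be sketched as below. All `demo_*` names are hypothetical and the structures are heavily simplified stand-ins for `struct ip_conntrack_tuple` and `struct ip_conntrack`. As a further assumption, the sketch uses a toy symmetric hash so that both directions of a flow land in the same bucket; it makes no claim about the hash used by the real code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct ip_conntrack_tuple. */
struct demo_tuple {
    uint32_t src_ip, dst_ip;     /* layer 3 src + dst */
    uint8_t  proto;              /* layer 4 protocol */
    uint16_t src_port, dst_port; /* layer 4 src + dst */
};

struct demo_conn {
    struct demo_tuple original, reply;
    struct demo_conn *next;      /* hash chain */
};

#define DEMO_BUCKETS 64
struct demo_conn *demo_table[DEMO_BUCKETS];

/* reply == inverted(original): swap source and destination. */
struct demo_tuple demo_invert(struct demo_tuple t)
{
    struct demo_tuple r = { t.dst_ip, t.src_ip, t.proto,
                            t.dst_port, t.src_port };
    return r;
}

bool demo_tuple_equal(const struct demo_tuple *a, const struct demo_tuple *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->proto == b->proto &&
           a->src_port == b->src_port && a->dst_port == b->dst_port;
}

/* Toy hash, symmetric in src/dst so both directions share a bucket. */
unsigned demo_hash(const struct demo_tuple *t)
{
    uint32_t h = t->src_ip ^ t->dst_ip ^ t->proto ^
                 (uint32_t)(t->src_port ^ t->dst_port);
    return h % DEMO_BUCKETS;
}

/* Returns the connection a packet tuple belongs to, or NULL. */
struct demo_conn *demo_lookup(const struct demo_tuple *t)
{
    for (struct demo_conn *c = demo_table[demo_hash(t)]; c; c = c->next)
        if (demo_tuple_equal(t, &c->original) ||
            demo_tuple_equal(t, &c->reply))
            return c;
    return NULL;
}

/* Fill in both tuples and place the connection in the hash table. */
void demo_confirm(struct demo_conn *c, struct demo_tuple t)
{
    unsigned b = demo_hash(&t);

    c->original = t;
    c->reply = demo_invert(t);
    c->next = demo_table[b];
    demo_table[b] = c;
}
```

After `demo_confirm()`, a packet in either direction resolves to the same connection, which is what makes state tracking of replies possible.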
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Flow of events for packet part of existing connection
packet enters NF_IP_PRE_ROUTING
tuple is derived from packet
lookup conntrack hash table with hash(tuple)
associate conntrack entry with skb->nfct
call l4 protocol helper 'packet' function
do l4 state tracking
update timeouts as needed [i.e. TCP TIME_WAIT,...]
...
packet enters NF_IP_POST_ROUTING
do hashtable lookup for packet -> succeeds
do nothing else
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Poor man's failover
principle
every node does its own tracking, no state replication
two possible implementations
connect every node to shared media (i.e. real ethernet)
forwarding only turned on on active node
slave nodes use promiscuous mode to sniff packets
copy all traffic to slave nodes
active master needs to copy all traffic to other nodes
disadvantage: high load, sync traffic == payload traffic
advantages
very easy implementation
only addition of sniffing mode to conntrack needed
existing means of address takeover can be used
same load on active master and slave nodes
disadvantages
can only be used with real shared media (no switches, ...)
can not be used with NAT
remaining problem
no initial state sync after reboot of slave node!
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Real state replication
Parts needed
state replication protocol
multicast based
sequence numbers for detection of packet loss
NACK-based retransmission
no security, since private ethernet segment to be used
event interface on active node
calling out to callback function at all state changes
exported interface to manipulate conntrack hash table
kernel thread for sending conntrack state protocol messages
registers with event interface
creates and accumulates state replication packets
sends them via in-kernel sockets api
kernel thread for receiving conntrack state replication messages
receives state replication packets via in-kernel sockets
uses conntrack hashtable manipulation interface
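The receive side of "sequence numbers for detection of packet loss" plus "NACK-based retransmission" might look roughly like the following sketch. The slides do not define a wire format, so the `demo_rx` state, the `demo_nack` result and the inclusive range reported in it are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical receiver state; no wire format is given in the slides. */
struct demo_rx {
    uint32_t next_seq;    /* sequence number expected next */
};

struct demo_nack {
    bool     needed;
    uint32_t first, last; /* inclusive range of missing messages */
};

/* Account for an arriving sequence number and report any gap before it. */
struct demo_nack demo_rx_seq(struct demo_rx *rx, uint32_t seq)
{
    struct demo_nack n = { false, 0, 0 };

    /* Signed difference so that sequence number wraparound is tolerated. */
    if ((int32_t)(seq - rx->next_seq) < 0)
        return n;               /* duplicate or retransmission: no new NACK */

    if (seq != rx->next_seq) {  /* gap: ask the sender for the missing range */
        n.needed = true;
        n.first = rx->next_seq;
        n.last = seq - 1;
    }
    rx->next_seq = seq + 1;
    return n;
}
```

With NACKs the receiver stays silent as long as nothing is lost, so the replication traffic on the private segment carries no per-message acknowledgements.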
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Real state replication
Flow of events in chronological order:
on active node, inside the network RX softirq
connection tracking code is analyzing a forwarded packet
connection tracking gathers some new state information
connection tracking updates local connection tracking database
connection tracking sends event message to event API
on active node, inside the conntrack-sync kernel thread
conntrack sync daemon receives event through event API
conntrack sync daemon aggregates multiple event messages into a state replication protocol message, removing possible redundancy
conntrack sync daemon generates state replication protocol message
conntrack sync daemon sends state replication protocol message
on slave node(s), inside network RX softirq
connection tracking code ignores packets coming from the interface attached to the private conntrack sync network
state replication protocol message is appended to socket receive queue of conntrack-sync kernel thread
on slave node(s), inside conntrack-sync kernel thread
conntrack sync daemon receives state replication message
conntrack sync daemon creates/updates conntrack entry
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Necessary changes to kernel
Necessary changes to current conntrack core
event generation (callback functions) for all state changes
conntrack hashtable manipulation API
is needed (and already implemented) for 'ctnetlink' API
conntrack exemptions
needed to _not_ track conntrack state replication packets
is needed for other cases as well
currently being developed by Jozsef Kadlecsik
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Other current work
conntrack hash function optimization
current hash function not good for even hash bucket count
other hash functions in development
hash function evaluation tool [cttest] available
introduce per-system randomness to prevent hash attack
conntrack code optimization (locking/timers/...)
conntrack exemptions
not useable when NAT is active
SLOG (session log)
maybe netflow compatible logs?
getting our work submitted into the mainstream kernel
turns out to be more difficult than expected
newnat has finally made it into 2.4.19
discussions about multiple targets/actions per rule
technical implementation easy
however, not everybody convinced that it fits into the concept
using tc for firewalling
Jamal Hadi Salim uses iptables targets from within TC
leads to discussion of generic classification engine API in kernel
netfilter for MPLS
implementation of mpls-ping-draft as netfilter module
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Thanks
The slides and the accompanying paper for this presentation are available at http://www.gnumonks.org/
The netfilter homepage http://www.netfilter.org/
Thanks to
the BBS people, Z-Netz, FIDO, ...
for heavily increasing my computer usage in 1992
KNF (http://www.franken.de/)
for bringing me in touch with the internet as early as 1994
for providing a playground for technical people
for telling me about the existence of Linux!
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
for implementing (one of?) the world's best TCP/IP stacks
Paul 'Rusty' Russell
for starting the netfilter/iptables project
for trusting me to maintain it today
Astaro AG
for sponsoring parts of my netfilter work
for sponsoring my flight ticket to this conference


@ -0,0 +1,12 @@
The netfilter/iptables system is about three years old. With Linux kernel 2.4.x being deployed widely during the last two years, lots of systems worldwide are using netfilter/iptables as their packet filtering subsystem.
netfilter/iptables is no doubt a big improvement over the old ipchains system in the 2.2.x kernels. However, as with any project, after wide deployment for some time we start to discover aspects that can be implemented more cleanly and more efficiently.
The constant innovation and development of new applications and protocols (like SIP) on the internet also raise new requirements towards the linux packet filter.
So the question is: Is it time for yet another generation of the linux packet filtering subsystem? Will the tradition of change (ipfwadm->ipchains->iptables->?) be continued? Or can we integrate all necessary changes within the current framework?
The presentation will cover a summary of the problems with the current netfilter/iptables implementation and describe the proposed solutions.
Intended Audience: System and Network Administrators
Prerequisites: Knowledge about Packet Filters. Usage of iptables.


@ -0,0 +1,22 @@
<a href="http://www.gnumonks.org/users/laforge/">Harald Welte</a> is one
of the five <a href="http://www.netfilter.org/">netfilter/iptables</a> core
team members, and the current Linux 2.4.x firewalling maintainer.
His main interest in computing has always been networking. In the little time
left besides netfilter/iptables related work, he's writing obscure documents
like the <a href="http://www.gnumonks.org/ftp/pub/doc/uucp-over-ssl.html"> UUCP
over SSL HOWTO</a>. Other kernel-related projects he has been contributing to are
user mode linux and the international (crypto) kernel patch.
In the past he has been working as an independent IT Consultant working on
closed-source projects for various companies ranging from banks to
manufacturers of networking gear. During the year 2001 he was living in
Curitiba (Brazil), where he got sponsored for his Linux related work by
<a href="http://www.conectiva.com/">Conectiva Inc.</a>.
Starting with February 2002, Harald has been contracted part-time by
<a href="http://www.astaro.com/">Astaro AG</a>, who are sponsoring him for his
current netfilter/iptables work.
Harald is living in Berlin, Germany.


@ -0,0 +1,19 @@
- pkttables
- linked lists instead of blob
- explain current situation
- dynamic rulesets are slow with iptables
- independent of layer 3 protocol
- current code duplication between [ip|ip6|arp]tables
- some matches (mac, interface, ...) are independent anyway
- nfnetlink
- idea
- ctnetlink
- iptnetlink / pkttnetlink
- ulog/queue port to it
- libnfnetlink, libctnetlink, libpkttnetlink
- libiptables / libpkttnetlink
- high-level API for rule-manipulation
- covering all the plugins which are currently part of iptables
- failover / load balancing for stateful firewalls
- slides from OLS


@ -0,0 +1,299 @@
%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%back "blue"
%center
%size 7
The future of Linux packet filtering
%center
%size 4
by
Harald Welte <laforge@netfilter.org>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Contents
Problems with current 2.4/2.5 netfilter/iptables
Solution to code replication
Solution for dynamic rulesets
Solution for API to GUI's and other management programs
Other current work
Optimizing Rule load time of large rulesets
Making netfilter/iptables compatible with zerocopy tcp
HA for stateful firewalling
What's special about firewalling HA
Poor man's failover
Real state replication
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Problems with 2.4.x netfilter/iptables
code replication between iptables/ip6tables/arptables
iptables not meant for other protocols, but people did copy+paste 'ports'
replication of
core kernel code
layer 3 independent matches (mac, interface, ...)
userspace library (libiptc)
userspace tool (iptables)
userspace plugins (libipt_xxx.so)
doesn't suit the needs for dynamically changing rulesets
dynamic rulesets becoming more common (service selection, IDS)
a whole table is created in userspace and sent as blob to kernel
for every ruleset the table needs to be copied to userspace and back
inside kernel consistency checks on whole table, loop detection
too extensible for writing any forward-compatible GUI
new extensions showing up all the time
a frontend would need to know about the options and use of a new extension
thus frontends are always incomplete and out-of-date
no high-level API other than piping to iptables-restore
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Reducing code replication
code replication is a real problem: unclean, bugfixes missed
we need layer 3 independent layer for
submitting rules to the kernel
traversing packet-rulesets supporting match/target modules
registering matches/targets
layer 3 specific (like matching ipv4 address)
layer 3 independent (like matching MAC address)
solution
pkt_tables inside kernel
pkt_tables_ipv4 registers layer 3 handler with pkt_tables
pkt_tables_ipv6 registers layer 3 handler with pkt_tables
everybody registering a pkt_table (like iptable_filter) needs to specify the l3 protocol
libraries in userspace (see later)
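The registration scheme on this slide can be sketched roughly as follows. This is a Python illustration rather than kernel C, and every name in it (`PktTables`, `register_l3_handler`, `register_match`, `register_table`, `find_match`) is hypothetical. It only shows the idea of one layer 3 independent core that per-protocol handlers, matches and tables register with.

```python
class PktTables:
    """Layer 3 independent core: keeps handlers, matches and tables."""

    ANY_L3 = "any"  # marker for layer 3 independent matches

    def __init__(self):
        self.l3_handlers = {}   # protocol name -> handler object
        self.matches = {}       # (protocol or ANY_L3, match name) -> function
        self.tables = {}        # (protocol, table name) -> list of rules

    def register_l3_handler(self, proto, handler):
        self.l3_handlers[proto] = handler

    def register_match(self, name, func, proto=ANY_L3):
        self.matches[(proto, name)] = func

    def register_table(self, proto, name):
        # A table must name its layer 3 protocol, and that protocol
        # must already have registered a handler.
        if proto not in self.l3_handlers:
            raise KeyError(f"no layer 3 handler for {proto!r}")
        self.tables[(proto, name)] = []

    def find_match(self, proto, name):
        # Prefer a protocol-specific match, fall back to a generic one.
        key = (proto, name)
        if key in self.matches:
            return self.matches[key]
        return self.matches[(self.ANY_L3, name)]


core = PktTables()
core.register_l3_handler("ipv4", object())
core.register_l3_handler("ipv6", object())
# Layer 3 independent match, written once and shared by all protocols.
core.register_match("mac", lambda pkt, arg: pkt["mac"] == arg)
# Layer 3 specific match, only valid for IPv4 tables.
core.register_match("addr", lambda pkt, arg: pkt["src"] == arg, "ipv4")
core.register_table("ipv4", "filter")
core.register_table("ipv6", "filter")
```

The point of the design is that the MAC match exists once and is found for both protocols, instead of being copied into each of the three tools.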
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Supporting dynamic rulesets
atomic table-replacement turned out to be bad idea
need new interface for sending individual rules to kernel
policy routing has the same problem and good solution: rtnetlink
solution: nfnetlink
multicast-netlink based packet-oriented socket between kernel and userspace
has extra benefit that other userspace processes get notified of rule changes [just like routing daemons]
nfnetlink will be low-layer below all kernel/userspace communication
pkttnetlink [aka iptnetlink]
ctnetlink
ulog
ip_queue
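As a toy model of the per-rule interface proposed here: the sketch below sends individual rule changes instead of a whole table and notifies every subscribed listener, which is the property borrowed from rtnetlink. It is plain Python, not netlink code, and `RuleChannel` and its methods are invented for illustration only.

```python
class RuleChannel:
    """Toy stand-in for a multicast, message-oriented rule interface."""

    def __init__(self):
        self.rules = []        # the kernel-side ruleset
        self.listeners = []    # callbacks of subscribed userspace processes

    def subscribe(self, callback):
        self.listeners.append(callback)

    def _notify(self, op, rule):
        # Every subscriber sees every change, like a multicast group.
        for callback in self.listeners:
            callback(op, rule)

    def add_rule(self, rule):
        # Only the new rule crosses the boundary, not the whole table.
        self.rules.append(rule)
        self._notify("add", rule)

    def del_rule(self, rule):
        self.rules.remove(rule)
        self._notify("del", rule)


seen = []
chan = RuleChannel()
chan.subscribe(lambda op, rule: seen.append((op, rule)))
chan.add_rule("-p tcp --dport 22 -j ACCEPT")
chan.add_rule("-p tcp --dport 80 -j ACCEPT")
chan.del_rule("-p tcp --dport 22 -j ACCEPT")
```

The cost of one change is then proportional to the size of that change, not to the size of the table.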
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Communication with other programs
libnfnetlink for low-layer communication
libpkttnetlink for rule modifications
will handle all plugins [which are currently part of iptables]
query functions about available matches/targets
query functions about parameters
query functions for help messages about specific match/parameter of a match
generic structure from which rules can be built
conversion functions to parse generic structure into in-kernel structure
conversion functions to parse kernel structure into generic structure
functions to convert generic structure into plain text
libipq will stay API-compatible to current version
libipulog will stay API-compatible to current version
libiptc will go away [compatibility layer extremely difficult]
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Optimizing rule load time
Current situation
loading 10,000 rules in 1,000 chains takes about 4 minutes on a PIII 733 MHz
this is caused by two bottlenecks
loop detection algorithm on kernel side inefficient
a couple of O(n^2) complexity functions in libiptc
Solution
efficient loop detection and mark_source_chains() algorithm (graph coloring)
current CVS libiptc with only one O(n^2) function: 2 minutes 37 seconds
whole reimplementation of libiptc needed for removing the last O(n^2) function
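For readers unfamiliar with the graph coloring approach mentioned above, here is a minimal sketch of loop detection over chain jumps using a three-colour depth-first search. It is an illustration in Python, not the kernel's mark_source_chains() code. The input format (a dict from chain name to jump targets) and the name `find_loop` are assumptions.

```python
WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, fully explored


def find_loop(jumps):
    """Return True if the chain jump graph contains a cycle.

    `jumps` maps a chain name to the list of chains its rules jump to.
    Recursive for brevity; very deep chain nesting would need an
    explicit stack instead.
    """
    colour = {chain: WHITE for chain in jumps}

    def visit(chain):
        colour[chain] = GREY              # chain is on the current path
        for target in jumps.get(chain, ()):
            state = colour.get(target, WHITE)
            if state == GREY:             # back edge: loop found
                return True
            if state == WHITE and visit(target):
                return True
        colour[chain] = BLACK             # done, never visited again
        return False

    return any(colour[c] == WHITE and visit(c) for c in list(jumps))
```

Each chain is coloured black once and each jump is followed once, so the work grows linearly with the number of chains plus jumps rather than quadratically.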
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Optimizing the connection tracking code
Conntrack hash function optimization
old hash function not good for even hash bucket count
hash function evaluation tool [cttest] available
other hash functions in development (already in 2.4.21)
introduce per-system randomness to prevent hash attack
code optimization (locking/timers/...)
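The idea of per-system randomness can be illustrated with a keyed hash: without knowing the seed, an attacker cannot precompute connection tuples that all land in one bucket. The sketch below uses BLAKE2b from Python's standard library purely as a convenient keyed hash. It is not the function used by the kernel, and the tuple encoding, seed length and the name `conntrack_bucket` are assumptions.

```python
import hashlib
import os

# Per-system secret, chosen once at startup (assumption: 16 random bytes).
HASH_SEED = os.urandom(16)


def conntrack_bucket(src, dst, sport, dport, proto, buckets, seed=HASH_SEED):
    """Map a connection tuple to a bucket index using a keyed hash."""
    tuple_bytes = f"{src}|{dst}|{sport}|{dport}|{proto}".encode()
    digest = hashlib.blake2b(tuple_bytes, key=seed, digest_size=8).digest()
    # Works for even and odd bucket counts because the digest bits are
    # already well mixed before the modulo is taken.
    return int.from_bytes(digest, "big") % buckets
```

The same tuple always maps to the same bucket on one system, while two systems with different seeds distribute the same tuples differently.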
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
netfilter and zerocopy TCP
Current situation (2.4.x)
skb_linearize() at each netfilter hook effectively prevents zerocopy TCP from working if netfilter/iptables is enabled
this is a big performance loss on stand-alone servers which filter packets locally
Solution
remove skb_linearize() from conntrack, nat and ip_tables core
all iptables extensions and conntrack/nat helpers have to use skb_copy_bits() if they want to access data beyond layer 4 header
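To make the difference concrete: a linearized packet is one contiguous buffer, while a zerocopy packet may be spread over several fragments, so code that wants payload bytes has to copy the range it needs. The following Python model mimics what a helper like skb_copy_bits() does conceptually. The class `FragmentedPacket` and its `copy_bits` method are illustrative inventions, not kernel API.

```python
class FragmentedPacket:
    """Packet whose bytes live in several separate buffers."""

    def __init__(self, fragments):
        self.fragments = [bytes(f) for f in fragments]

    def __len__(self):
        return sum(len(f) for f in self.fragments)

    def copy_bits(self, offset, length):
        """Copy `length` bytes starting at `offset` into a new buffer."""
        if offset < 0 or length < 0 or offset + length > len(self):
            raise ValueError("range outside packet")
        out = bytearray()
        for frag in self.fragments:
            if offset >= len(frag):        # range starts in a later fragment
                offset -= len(frag)
                continue
            out += frag[offset:offset + length - len(out)]
            offset = 0                     # later fragments are read from 0
            if len(out) == length:
                break
        return bytes(out)
```

Only the requested range is copied, so a match that looks at a few payload bytes no longer forces the whole packet into one buffer.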
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Introduction
What is special about firewall failover?
Nothing, in case of the stateless packet filter
Common IP takeover solutions can be used
VRRP
Heartbeat
Distribution of packet filtering ruleset no problem
can be done manually
or implemented with simple userspace process
Problems arise with stateful packet filters
Connection state only on active node
NAT mappings only on active node
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Poor man's failover
principle
let every node do its own tracking rather than replicating state
two possible implementations
connect every node to shared media (i.e. real ethernet)
forwarding only turned on on active node
slave nodes use promiscuous mode to sniff packets
copy all traffic to slave nodes
active master needs to copy all traffic to other nodes
disadvantage: high load, sync traffic == payload traffic
IMHO stupid way of solving the problem
advantages
very easy implementation
only addition of sniffing mode to conntrack needed
existing means of address takeover can be used
same load on active master and slave nodes
no additional load on active master
disadvantages
can only be used with real shared media (no switches, ...)
can not be used with NAT
remaining problem
no initial state sync after reboot of slave node!
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Real state replication
Parts needed
state replication protocol
multicast based
sequence numbers for detection of packet loss
NACK-based retransmission
no security, since private ethernet segment to be used
event interface on active node
calling out to callback function at all state changes
exported interface to manipulate conntrack hash table
kernel thread for sending conntrack state protocol messages
registers with event interface
creates and accumulates state replication packets
sends them via in-kernel sockets api
kernel thread for receiving conntrack state replication messages
receives state replication packets via in-kernel sockets
uses conntrack hashtable manipulation interface
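The loss detection part of such a protocol can be sketched as below: the receiver tracks the next expected sequence number, requests missing messages with a NACK and delivers messages strictly in order. This is a simplified Python model under stated assumptions (sequence numbers start at 0, no wraparound, no NACK timers). `ReplicationReceiver` and `send_nack` are hypothetical names, not part of any existing implementation.

```python
class ReplicationReceiver:
    """Receiver that detects gaps by sequence number and asks for resends."""

    def __init__(self, send_nack):
        self.expected = 0          # next sequence number to deliver
        self.pending = {}          # out-of-order messages, keyed by seq
        self.delivered = []        # messages handed on in order
        self.send_nack = send_nack

    def receive(self, seq, message):
        if seq < self.expected or seq in self.pending:
            return                           # duplicate, ignore
        self.pending[seq] = message
        if seq > self.expected:
            # Gap detected: request every missing message.
            missing = [s for s in range(self.expected, seq)
                       if s not in self.pending]
            if missing:
                self.send_nack(missing)
        # Deliver everything that is now contiguous.
        while self.expected in self.pending:
            self.delivered.append(self.pending.pop(self.expected))
            self.expected += 1
```

With NACKs the sender stays silent in the common case where nothing is lost, which keeps the load on the active node low.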
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Real state replication
Flow of events in chronological order:
on active node, inside the network RX softirq
connection tracking code is analyzing a forwarded packet
connection tracking gathers some new state information
connection tracking updates local connection tracking database
connection tracking sends event message to event API
on active node, inside the conntrack-sync kernel thread
conntrack sync daemon receives event through event API
conntrack sync daemon aggregates multiple event messages into a state replication protocol message, removing possible redundancy
conntrack sync daemon generates state replication protocol message
conntrack sync daemon sends state replication protocol message
on slave node(s), inside network RX softirq
connection tracking code ignores packets coming from the interface attached to the private conntrack sync network
state replication protocol message is appended to socket receive queue of conntrack-sync kernel thread
on slave node(s), inside conntrack-sync kernel thread
conntrack sync daemon receives state replication message
conntrack sync daemon creates/updates conntrack entry
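The aggregation step on the active node ("removing possible redundancy") could look like the following sketch: when several events for the same connection arrive before the next replication message is sent, only the latest state needs to go on the wire. The event format `(conn_id, state)` and the function name are assumptions made for illustration.

```python
def aggregate_events(events):
    """Collapse a batch of (conn_id, state) events to the latest per connection.

    The order of first appearance is kept so the output is deterministic.
    """
    latest = {}
    for conn_id, state in events:
        latest[conn_id] = state      # a newer event overwrites an older one
    return list(latest.items())
```

For a connection that moves through several TCP states within one batch, the slave node then receives a single update instead of one per transition.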
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Necessary changes to kernel
Necessary changes to current conntrack core
event generation (callback functions) for all state changes
conntrack hashtable manipulation API
is needed (and already implemented) for 'ctnetlink' API
conntrack exemptions
needed to _not_ track conntrack state replication packets
is needed for other cases as well
currently being developed by Jozsef Kadlecsik
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Thanks
The slides of this presentation are available at http://www.gnumonks.org/
Visit the netfilter homepage http://www.netfilter.org/
Thanks to
the BBS people, Z-Netz, FIDO, ...
for heavily increasing my computer usage in 1992
KNF
for bringing me in touch with the internet as early as 1994
for providing a playground for technical people
for telling me about the existence of Linux!
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
for implementing (one of?) the world's best TCP/IP stacks
Paul 'Rusty' Russell
for starting the netfilter/iptables project
for trusting me to maintain it today
Astaro AG
for sponsoring most of my current netfilter work

@ -0,0 +1,304 @@
%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%back "blue"
%center
%size 7
The future of Linux packet filtering
%center
%size 4
by
Harald Welte <laforge@netfilter.org>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Contents
Problems with current 2.4/2.5 netfilter/iptables
Solution to code replication
Solution for dynamic rulesets
Solution for API to GUIs and other management programs
Other current work
Optimizing Rule load time of large rulesets
Making netfilter/iptables compatible with zerocopy tcp
HA for stateful firewalling
What's special about firewalling HA
Poor man's failover
Real state replication
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Problems with 2.4.x netfilter/iptables
code replication between iptables/ip6tables/arptables
iptables was never meant for other protocols, but people did copy+paste 'ports'
replication of
core kernel code
layer 3 independent matches (mac, interface, ...)
userspace library (libiptc)
userspace tool (iptables)
userspace plugins (libipt_xxx.so)
doesn't suit the needs for dynamically changing rulesets
dynamic rulesets becoming more common (service selection, IDS)
a whole table is created in userspace and sent as blob to kernel
for every ruleset change the whole table needs to be copied to userspace and back
inside kernel consistency checks on whole table, loop detection
too extensible for writing any forward-compatible GUI
new extensions showing up all the time
a frontend would need to know about the options and use of a new extension
thus frontends are always incomplete and out-of-date
no high-level API other than piping to iptables-restore
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Reducing code replication
code replication is a real problem: unclean, bugfixes missed
we need layer 3 independent layer for
submitting rules to the kernel
traversing packet-rulesets supporting match/target modules
registering matches/targets
layer 3 specific (like matching ipv4 address)
layer 3 independent (like matching MAC address)
solution
pkt_tables inside kernel
pkt_tables_ipv4 registers layer 3 handler with pkt_tables
pkt_tables_ipv6 registers layer 3 handler with pkt_tables
everybody registering a pkt_table (like iptable_filter) needs to specify the l3 protocol
libraries in userspace (see later)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Supporting dynamic rulesets
atomic table-replacement turned out to be bad idea
need new interface for sending individual rules to kernel
policy routing has the same problem and good solution: rtnetlink
solution: nfnetlink
multicast-netlink based packet-oriented socket between kernel and userspace
has extra benefit that other userspace processes get notified of rule changes [just like routing daemons]
nfnetlink will be low-layer below all kernel/userspace communication
pkttnetlink [aka iptnetlink]
ctnetlink
ulog
ip_queue
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Communication with other programs
whole set of libraries
libnfnetlink for low-layer communication
libpkttnetlink for rule modifications
will handle all plugins [which are currently part of iptables]
query functions about available matches/targets
query functions about parameters
query functions for help messages about specific match/parameter of a match
generic structure from which rules can be built
conversion functions to parse generic structure into in-kernel structure
conversion functions to parse kernel structure into generic structure
functions to convert generic structure into plain text
libipq will stay API-compatible to current version
libipulog will stay API-compatible to current version
libiptc will go away [compatibility layer extremely difficult]
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Optimizing rule load time
Current situation
loading 10,000 rules in 1,000 chains takes about 4 minutes on a PIII 733 MHz
this is caused by two bottlenecks
loop detection algorithm on kernel side inefficient
a couple of O(n^2) complexity functions in libiptc
Solution
efficient loop detection and mark_source_chains() algorithm (graph coloring)
current CVS libiptc with only one O(n^2) function: 2 minutes 37 seconds
whole reimplementation of libiptc needed for removing the last O(n^2) function
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Optimizing the connection tracking code
Conntrack hash function optimization
old hash function not good for even hash bucket count
hash function evaluation tool [cttest] available
other hash functions in development (already in 2.4.21)
introduce per-system randomness to prevent hash attack
code optimization (locking/timers/...)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
netfilter and zerocopy TCP
Current situation (2.4.x)
skb_linearize() at each netfilter hook effectively prevents zerocopy TCP from working if netfilter/iptables is enabled
this is a big performance loss on stand-alone servers which filter packets locally
Solution
remove skb_linearize() from conntrack, nat and ip_tables core
all iptables extensions and conntrack/nat helpers have to use skb_copy_bits() if they want to access data beyond layer 4 header
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Introduction
What is special about firewall failover?
Nothing, in case of the stateless packet filter
Common IP takeover solutions can be used
VRRP
Heartbeat
Distribution of packet filtering ruleset no problem
can be done manually
or implemented with simple userspace process
Problems arise with stateful packet filters
Connection state only on active node
NAT mappings only on active node
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Poor man's failover
Poor man's failover
principle
let every node do its own tracking rather than replicating state
two possible implementations
connect every node to shared media (i.e. real ethernet)
forwarding only turned on on active node
slave nodes use promiscuous mode to sniff packets
copy all traffic to slave nodes
active master needs to copy all traffic to other nodes
disadvantage: high load, sync traffic == payload traffic
IMHO stupid way of solving the problem
advantages
very easy implementation
only addition of sniffing mode to conntrack needed
existing means of address takeover can be used
same load on active master and slave nodes
no additional load on active master
disadvantages
can only be used with real shared media (no switches, ...)
can not be used with NAT
remaining problem
no initial state sync after reboot of slave node!
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Real state replication
Parts needed
state replication protocol
multicast based
sequence numbers for detection of packet loss
NACK-based retransmission
no security, since private ethernet segment to be used
event interface on active node
calling out to callback function at all state changes
exported interface to manipulate conntrack hash table
kernel thread for sending conntrack state protocol messages
registers with event interface
creates and accumulates state replication packets
sends them via in-kernel sockets api
kernel thread for receiving conntrack state replication messages
receives state replication packets via in-kernel sockets
uses conntrack hashtable manipulation interface
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Real state replication
Flow of events in chronological order:
on active node, inside the network RX softirq
connection tracking code is analyzing a forwarded packet
connection tracking gathers some new state information
connection tracking updates local connection tracking database
connection tracking sends event message to event API
on active node, inside the conntrack-sync kernel thread
conntrack sync daemon receives event through event API
conntrack sync daemon aggregates multiple event messages into a state replication protocol message, removing possible redundancy
conntrack sync daemon generates state replication protocol message
conntrack sync daemon sends state replication protocol message
on slave node(s), inside network RX softirq
connection tracking code ignores packets coming from the interface attached to the private conntrack sync network
state replication protocol message is appended to socket receive queue of conntrack-sync kernel thread
on slave node(s), inside conntrack-sync kernel thread
conntrack sync daemon receives state replication message
conntrack sync daemon creates/updates conntrack entry
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Necessary changes to kernel
Necessary changes to current conntrack core
event generation (callback functions) for all state changes
conntrack hashtable manipulation API
is needed (and already implemented) for 'ctnetlink' API
conntrack exemptions
needed to _not_ track conntrack state replication packets
is needed for other cases as well
currently being developed by Jozsef Kadlecsik
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Thanks
The slides of this presentation are available at http://www.gnumonks.org/
Visit the netfilter homepage http://www.netfilter.org/
Thanks to
the BBS people, Z-Netz, FIDO, ...
for heavily increasing my computer usage in 1992
KNF
for bringing me in touch with the internet as early as 1994
for providing a playground for technical people
for telling me about the existence of Linux!
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
for implementing (one of?) the world's best TCP/IP stacks
Paul 'Rusty' Russell
for starting the netfilter/iptables project
for trusting me to maintain it today
Astaro AG
for sponsoring most of my current netfilter work

@ -0,0 +1,318 @@
\documentclass{seminar}
\begin{document}
\vspace{3mm}
\begin{slide}
\vspace{3mm}
\begin{center}
\vspace{3mm}
\vspace{3mm}
The future of Linux packet filtering\\
\vspace{3mm}
\vspace{3mm}
\vspace{3mm}
\end{center}
\begin{center}
by\\
\vspace{3mm}
Harald Welte $<$laforge@netfilter.org$>$\\
\vspace{3mm}
\vspace{3mm}
\end{center}
\end{slide}
\begin{slide}
Future of Linux packet filtering\\
Contents\\
\vspace{3mm}
\vspace{3mm}
Problems with current 2.4/2.5 netfilter/iptables\\
Solution to code replication\\
Solution for dynamic rulesets\\
Solution for API to GUIs and other management programs\\
\vspace{3mm}
Other current work\\
Optimizing Rule load time of large rulesets\\
Making netfilter/iptables compatible with zerocopy tcp\\
\vspace{3mm}
HA for stateful firewalling\\
What's special about firewalling HA\\
Poor man's failover\\
Real state replication\\
\vspace{3mm}
\vspace{3mm}
\vspace{3mm}
\end{slide}
\begin{slide}
Future of Linux packet filtering\\
Problems with 2.4.x netfilter/iptables\\
\vspace{3mm}
code replication between iptables/ip6tables/arptables\\
iptables was never meant for other protocols, but people did copy+paste 'ports'\\
replication of\\
core kernel code\\
layer 3 independent matches (mac, interface, ...)\\
userspace library (libiptc)\\
userspace tool (iptables)\\
userspace plugins (libipt\_xxx.so)\\
\vspace{3mm}
doesn't suit the needs for dynamically changing rulesets\\
dynamic rulesets becoming more common (service selection, IDS)\\
a whole table is created in userspace and sent as blob to kernel\\
for every ruleset change the whole table needs to be copied to userspace and back\\
inside kernel consistency checks on whole table, loop detection\\
\vspace{3mm}
too extensible for writing any forward-compatible GUI\\
new extensions showing up all the time\\
a frontend would need to know about the options and use of a new extension\\
thus frontends are always incomplete and out-of-date\\
no high-level API other than piping to iptables-restore\\
\vspace{3mm}
\vspace{3mm}
\end{slide}
\begin{slide}
Future of Linux packet filtering\\
Reducing code replication\\
\vspace{3mm}
code replication is a real problem: unclean, bugfixes missed\\
we need layer 3 independent layer for\\
submitting rules to the kernel\\
traversing packet-rulesets supporting match/target modules\\
registering matches/targets\\
layer 3 specific (like matching ipv4 address)\\
layer 3 independent (like matching MAC address)\\
\vspace{3mm}
solution\\
pkt\_tables inside kernel\\
pkt\_tables\_ipv4 registers layer 3 handler with pkt\_tables\\
pkt\_tables\_ipv6 registers layer 3 handler with pkt\_tables\\
everybody registering a pkt\_table (like iptable\_filter) needs to specify the l3 protocol\\
libraries in userspace (see later)\\
\vspace{3mm}
\vspace{3mm}
\end{slide}
\begin{slide}
Future of Linux packet filtering\\
Supporting dynamic rulesets\\
\vspace{3mm}
atomic table-replacement turned out to be bad idea\\
need new interface for sending individual rules to kernel\\
policy routing has the same problem and good solution: rtnetlink\\
solution: nfnetlink\\
multicast-netlink based packet-oriented socket between kernel and userspace\\
has extra benefit that other userspace processes get notified of rule changes [just like routing daemons]\\
nfnetlink will be low-layer below all kernel/userspace communication\\
pkttnetlink [aka iptnetlink]\\
ctnetlink\\
ulog\\
ip\_queue\\
\vspace{3mm}
\vspace{3mm}
\end{slide}
\begin{slide}
Future of Linux packet filtering\\
Communication with other programs\\
\vspace{3mm}
whole set of libraries\\
libnfnetlink for low-layer communication\\
libpkttnetlink for rule modifications\\
will handle all plugins [which are currently part of iptables]\\
query functions about available matches/targets\\
query functions about parameters\\
query functions for help messages about specific match/parameter of a match\\
generic structure from which rules can be built\\
conversion functions to parse generic structure into in-kernel structure\\
conversion functions to parse kernel structure into generic structure\\
functions to convert generic structure into plain text\\
libipq will stay API-compatible to current version\\
libipulog will stay API-compatible to current version\\
libiptc will go away [compatibility layer extremely difficult]\\
\vspace{3mm}
\vspace{3mm}
\end{slide}
\begin{slide}
Future of Linux packet filtering\\
Optimizing rule load time\\
\vspace{3mm}
Current situation\\
loading 10,000 rules in 1,000 chains takes about 4 minutes on a PIII 733 MHz\\
this is caused by two bottlenecks\\
loop detection algorithm on kernel side inefficient\\
a couple of $O(n^2)$ complexity functions in libiptc\\
\vspace{3mm}
Solution\\
efficient loop detection and mark\_source\_chains() algorithm (graph coloring)\\
current CVS libiptc with only one $O(n^2)$ function: 2 minutes 37 seconds\\
whole reimplementation of libiptc needed for removing the last $O(n^2)$ function\\
\vspace{3mm}
\vspace{3mm}
\end{slide}
\begin{slide}
HA for netfilter/iptables\\
Optimizing the connection tracking code\\
\vspace{3mm}
Conntrack hash function optimization\\
old hash function not good for even hash bucket count\\
hash function evaluation tool [cttest] available\\
other hash functions in development (already in 2.4.21)\\
introduce per-system randomness to prevent hash attack\\
code optimization (locking/timers/...)\\
\vspace{3mm}
\vspace{3mm}
\vspace{3mm}
\end{slide}
\begin{slide}
Future of Linux packet filtering\\
netfilter and zerocopy TCP\\
\vspace{3mm}
Current situation (2.4.x)\\
skb\_linearize() at each netfilter hook effectively prevents zerocopy TCP from working if netfilter/iptables is enabled\\
this is a big performance loss on stand-alone servers which filter packets locally\\
\vspace{3mm}
Solution\\
remove skb\_linearize() from conntrack, nat and ip\_tables core\\
all iptables extensions and conntrack/nat helpers have to use skb\_copy\_bits() if they want to access data beyond layer 4 header\\
\vspace{3mm}
\vspace{3mm}
\end{slide}
\begin{slide}
HA for netfilter/iptables\\
Introduction\\
\vspace{3mm}
What is special about firewall failover?\\
\vspace{3mm}
Nothing, in case of the stateless packet filter\\
Common IP takeover solutions can be used\\
VRRP\\
Heartbeat\\
\vspace{3mm}
Distribution of packet filtering ruleset no problem\\
can be done manually\\
or implemented with simple userspace process\\
\vspace{3mm}
Problems arise with stateful packet filters\\
Connection state only on active node\\
NAT mappings only on active node\\
\vspace{3mm}
\vspace{3mm}
\vspace{3mm}
\end{slide}
\begin{slide}
HA for netfilter/iptables\\
Poor man's failover\\
\vspace{3mm}
Poor man's failover\\
principle\\
let every node do its own tracking rather than replicating state\\
two possible implementations\\
connect every node to shared media (i.e. real ethernet)\\
forwarding only turned on on active node\\
slave nodes use promiscuous mode to sniff packets\\
copy all traffic to slave nodes\\
active master needs to copy all traffic to other nodes\\
disadvantage: high load, sync traffic == payload traffic\\
IMHO stupid way of solving the problem \\
advantages\\
very easy implementation\\
only addition of sniffing mode to conntrack needed\\
existing means of address takeover can be used\\
same load on active master and slave nodes\\
no additional load on active master\\
disadvantages\\
can only be used with real shared media (no switches, ...)\\
can not be used with NAT\\
remaining problem\\
no initial state sync after reboot of slave node!\\
\vspace{3mm}
\vspace{3mm}
\vspace{3mm}
\end{slide}
\begin{slide}
HA for netfilter/iptables\\
Real state replication\\
\vspace{3mm}
Parts needed\\
state replication protocol\\
multicast based\\
sequence numbers for detection of packet loss\\
NACK-based retransmission\\
no security, since private ethernet segment to be used\\
event interface on active node\\
calling out to callback function at all state changes\\
exported interface to manipulate conntrack hash table\\
kernel thread for sending conntrack state protocol messages\\
registers with event interface\\
creates and accumulates state replication packets\\
sends them via in-kernel sockets api\\
kernel thread for receiving conntrack state replication messages\\
receives state replication packets via in-kernel sockets\\
uses conntrack hashtable manipulation interface\\
\vspace{3mm}
\vspace{3mm}
\vspace{3mm}
\end{slide}
\begin{slide}
HA for netfilter/iptables\\
Real state replication\\
\vspace{3mm}
Flow of events in chronological order:\\
on active node, inside the network RX softirq\\
connection tracking code is analyzing a forwarded packet\\
connection tracking gathers some new state information\\
connection tracking updates local connection tracking database\\
connection tracking sends event message to event API\\
on active node, inside the conntrack-sync kernel thread\\
conntrack sync daemon receives event through event API\\
conntrack sync daemon aggregates multiple event messages into a state replication protocol message, removing possible redundancy\\
conntrack sync daemon generates state replication protocol message\\
conntrack sync daemon sends state replication protocol message\\
on slave node(s), inside network RX softirq\\
connection tracking code ignores packets coming from the interface attached to the private conntrack sync network\\
state replication protocol message is appended to socket receive queue of conntrack-sync kernel thread\\
on slave node(s), inside conntrack-sync kernel thread\\
conntrack sync daemon receives state replication message\\
conntrack sync daemon creates/updates conntrack entry\\
\vspace{3mm}
\vspace{3mm}
\end{slide}
\begin{slide}
HA for netfilter/iptables\\
Necessary changes to kernel\\
\vspace{3mm}
Necessary changes to current conntrack core\\
\vspace{3mm}
event generation (callback functions) for all state changes\\
\vspace{3mm}
conntrack hashtable manipulation API\\
is needed (and already implemented) for 'ctnetlink' API\\
\vspace{3mm}
conntrack exemptions\\
needed to \emph{not} track conntrack state replication packets\\
is needed for other cases as well\\
currently being developed by Jozsef Kadlecsik\\
\vspace{3mm}
\vspace{3mm}
\vspace{3mm}
\end{slide}
\begin{slide}
Future of Linux packet filtering\\
Thanks\\
The slides of this presentation are available at http://www.gnumonks.org/\\
\vspace{3mm}
Visit the netfilter homepage http://www.netfilter.org/\\
\vspace{3mm}
Thanks to\\
the BBS people, Z-Netz, FIDO, ...\\
for heavily increasing my computer usage in 1992\\
KNF\\
for bringing me in touch with the internet as early as 1994\\
for providing a playground for technical people\\
for telling me about the existence of Linux!\\
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen\\
for implementing (one of?) the world's best TCP/IP stacks\\
Paul 'Rusty' Russell\\
for starting the netfilter/iptables project\\
for trusting me to maintain it today\\
Astaro AG\\
for sponsoring most of my current netfilter work\\
\vspace{3mm}
\end{slide}
\end{document}

@ -0,0 +1,73 @@
0 - introduction/definition: Firewalls, Proxies, Packet Filters
- present myself and my function within the netfilter coreteam
- what is a firewall
- packet filters at networking layer
- inspect each packet and make a choice based on the packet
- traditionally don't know about connections (== layer 4)
- advantage: fast, transparent
- disadvantage: filtering limited to l3+l4 (sometimes l2)
- proxies at application layer
- terminate two connections (client->proxy and proxy->server)
- advantage: can base policy decision on application protocol
- disadvantage: not transparent at all (not even transparent proxies)
- result: both of them have their application.
- history of linux packet filtering
- ipfwadm (2.0)
- ipchains (2.2)
- iptables (2.4+2.6)
- pkttables (2.6+)
- iptables was developed together with netfilter in the 2.3.x kernel series
1 - Why a free software firewall?
- the internet was built on free/open standards and software
- security-relevant open source code gets more auditing because more people read it (and thus report bugs)
- users can put more trust in FOSS, since they can check for hidden backdoors
- packet filters are used like routers. They are core infrastructure of the internet. Infrastructure should be open/free for the public, just like roads.
- Everybody should be able to learn and understand how packet filtering works
- Infrastructure should not depend on monopolistic companies.
- problem if company goes bankrupt
- dependent on 'upgrade pressure' and future license changes
- no possibility to adapt it to new standards if vendor doesn't want to support it
2 - What can you do with netfilter/iptables
- stateless packet filtering
- matches: mac, src/dst ip, src/dst port,
- stateful packet filtering by using connection tracking
- keeps state table about all ongoing connections
- supports l4 TCP,UDP,ICMP,GRE,PPTP
- supports l5+ complex protocols like ftp,pptp,h323,talk,...
- IP accounting (every rule has a packet/byte counter)
- Network Address Translation (NAT/NAPT)
- Stateful, based on Connection tracking
- Source NAT / Masquerading
- Destination NAT / Redirect
- 1:1 NAT of whole networks (NETMAP)
- supports l5+ complex protocols like ftp,pptp,h323,talk,...
- Packet Mangling
- Clamp TCP MSS to PMTU
- Manipulate packet header (TTL, ECN, DSCP, ...)
- Combine with policy routing / traffic shaping systems
- stateless IPv6 packet filtering using ip6tables
3 - Who is behind the project? How to get involved?
- started by Paul 'Rusty' Russell from Australia (co-author of ipchains)
- Marc Boucher (Canada) and James Morris (Australia) dropped in
- Harald Welte (Germany), Jozsef Kadlecsik (Hungary), Martin Josefsson (Sweden) joined coreteam
- Countless contributions from hundreds of people all over the world
- used to keep a scoreboard, but it was eating too much time
- Project internet presence:
- HTTP (www.netfilter.org)
- FTP (ftp.netfilter.org)
- RSYNC (rsync.netfilter.org)
- CVS (pserver.netfilter.org)
- 5 mailinglists (lists.netfilter.org)
- Bugzilla (bugzilla.netfilter.org)
- CVSweb (http://cvs.netfilter.org)
- Anybody can contribute, as long as the contribution is GPL licensed
- development happens on netfilter-devel@lists.netfilter.org
- user questions belong to netfilter@lists.netfilter.org
- security relevant findings to coreteam@netfilter.org
Iptables is used by a lot of commercial [and also proprietary] products. Companies like Astaro and Smoothwall are offering iptables-based firewall appliances. Other companies (like Linksys, Belkin, ...) are embedding iptables into their wireless LAN access points, and users don't even know that they are using iptables.


@ -0,0 +1,220 @@
%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%back "blue"
%center
%size 7
The netfilter/iptables project
%center
%size 4
by
Harald Welte <laforge@netfilter.org>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables project
Contents
Introduction: Firewalls, Proxies, Packet Filters
Why a free software firewall?
What can you do with netfilter/iptables?
Who is behind the project? How to get involved?
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables project
Introduction: Firewalls, Proxies, Packet Filters
Firewalls are security gateways between networks
Can be implemented in different ways, at different layers
Packet filters at networking layer (3)
inspect each packet and make decision based on the packet contents
traditionally don't know about connections
advantage: fast, transparent
disadvantage: filtering limited to l3 and l4 headers
Proxies at application layer (5-7)
terminate two connections (client->proxy and proxy->server)
advantage: can base decision on application protocol
disadvantage: not transparent, need application support
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables project
Introduction: Firewalls, Proxies, Packet Filters
However, the world is not that easy anymore since new techniques are blending those two concepts
stateful packet filters
keep state about existing connections/flows
allow even state tracking beyond l4 state
thus give packet filters some features of proxies
transparent proxies
can be implemented without application support
how 'transparent' do you want to be? to the client? the server? the network?
thus give proxies some of the transparency of packet filters
In reality it is sometimes hard to tell. netfilter/iptables implements a packet filter (stateless/stateful) and some support for transparent proxying.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables project
History of linux packet filtering
%size 3
1994: kernel 1.2.x (BSD4.4 ipfw)
first packet filter in the linux kernel
%size 3
1995: kernel 2.0.x (ipfwadm)
enhanced version of the old ipfw
first support for masquerading
%size 3
1997: kernel 2.2.x (ipchains)
enhanced version of ipfwadm
support for multiple lists of rules (chains)
support for transparent proxying
masquerading helpers for ftp/irc/quake/...
%size 3
2000: kernel 2.4.x (iptables)
totally new implementation (based on netfilter API)
allows for multiple tables (which each have multiple chains)
first support for stateful packet filtering
support for fully symmetric NAT (SNAT/DNAT/...)
%size 3
2003: kernel 2.6.0-testX (iptables)
breaking a tradition: no new packet filter (not yet...)
support for non-linear skb's (zerocopy TCP path)
%size 3
2003/4: kernel 2.7.x and later 2.6.x backport (pkttables)
totally new implementation
layer 3 independent packet filtering framework
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables project
Why a free software firewall?
Tradition
The internet was built on free/open standards and software
Code Quality
Security relevant open sourcecode gets more auditing because more people read it (and thus report/fix bugs)
Trust
Users can have more trust in FOSS, since they can check for hidden backdoors
Public infrastructure
Packet Filters (like routers) are core infrastructure of the internet.
Infrastructure should be open/free for the public, just like roads.
Arguments against proprietary software in infrastructure
What if the vendor of your product goes bankrupt?
Users are dependent on 'upgrade pressure' and future license changes
No possibility to adopt new standards if the vendor has no interest
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables project
What can you do using netfilter/iptables?
stateless packet filtering
provides matches for almost any criteria in the universe
stateful packet filtering (using connection tracking)
keeps state table about all ongoing connections
currently supports TCP/UDP/ICMP/GRE
currently supports l5+ helpers for ftp,irc,pptp,h323,talk,mms,tftp,...
network address translation
stateful, based on connection tracking
source NAT / Masquerading
destination NAT / redirect
1:1 nat of whole networks (NETMAP)
packet mangling
clamp TCP MSS to PMTU for broken PMTU discovery
manipulate packet header (TTL, ECN, DSCP, ...)
combine with policy routing / traffic shaping
stateless IPv6 packet filtering (ip6tables)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables project
Who is behind netfilter/iptables?
Project started by Paul 'Rusty' Russell
Coreteam
Rusty, Marc Boucher, James Morris, Harald Welte, Jozsef Kadlecsik, Martin Josefsson
Elects a head of coreteam
Countless contributions from hundreds of people all over the world
In the past we had a scoreboard to keep track of the contributions
We are always short of volunteers, even for listadmin/webmaster/...
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables project
How to get involved?
Internet services:
Homepage - http://www.netfilter.org/
FTP Server - ftp://ftp.netfilter.org/
rsync server - rsync.netfilter.org
CVS server - pserver.netfilter.org
Bugzilla - http://bugzilla.netfilter.org/
CVSweb - http://cvs.netfilter.org/
Mailinglist - http://lists.netfilter.org/
Anybody can contribute, code has to be GPL licensed
Development discussion at netfilter-devel@lists.netfilter.org
User questions at netfilter@lists.netfilter.org
Security relevant issues at coreteam@netfilter.org
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables project
Areas of current development
pkttables (kernel part, pkttnetlink, libpkttnetlink, libpkttables)
make ULOG and ip_queue l3 independent (and move to nfnetlink)
optimizing connection tracking SMP performance
conntrack: support for more protocols (SCTP,...)
nf-hipac: highly optimized packet matching engine
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables project
Thanks
%size 4
The slides of this presentation are available at http://www.gnumonks.org/
Visit the netfilter homepage http://www.netfilter.org/
Thanks to
the BBS people, Z-Netz, FIDO, ...
for heavily increasing my computer usage in 1992
KNF (http://www.franken.de/)
for bringing me in touch with the internet as early as 1994
for providing a playground for technical people
for telling me about the existence of Linux!
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
for implementing (one of?) the world's best TCP/IP stacks
Paul 'Rusty' Russell
for starting the netfilter/iptables project
for trusting me to maintain it today
Astaro AG
for sponsoring most of my current netfilter work


@ -0,0 +1,511 @@
%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%back "blue"
%center
%size 7
Linux 2.4.x netfilter/iptables
firewalling internals
%center
%size 4
by
Harald Welte <laforge@netfilter.org>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Contents
Introduction
Netfilter hooks in protocol stacks
Packet selection based on IP Tables
The Connection Tracking Subsystem
The NAT Subsystem based on netfilter + iptables
Packet filtering using the 'filter' table
Packet mangling using the 'mangle' table
Advanced netfilter concepts
Current development and Future
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Introduction
Why did we need netfilter/iptables?
Because ipchains...
has no infrastructure for passing packets to userspace
makes transparent proxying extremely difficult
has interface address dependent Packet filter rules
has Masquerading implemented as part of packet filtering
code is too complex and intermixed with core ipv4 stack
is neither modular nor extensible
only barely supports one special case of NAT (masquerading)
has only stateless packet filtering
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Introduction
Who's behind netfilter/iptables
Paul 'Rusty' Russell
co-author of ipchains in Linux 2.2
was paid by Watchguard for about one year of development
James Morris
userspace queuing (kernel, library and tools)
Marc Boucher
NAT and packet filtering controlled by one command
Mangle table
Harald Welte
Conntrack+NAT helper infrastructure (newnat)
Userspace packet logging (ULOG)
Jozsef Kadlecsik
TCP window tracking
H.323 conntrack + NAT helper
Non-core team contributors
http://www.netfilter.org/scoreboard/
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Netfilter Hooks
What is netfilter?
System of callback functions within network stack
Callback function to be called for every packet traversing certain point (hook) within network stack
Protocol independent framework
Hooks in layer 3 stacks (IPv4, IPv6, DECnet, ARP)
Multiple kernel modules can register with each of the hooks
Asynchronous packet handling in userspace (ip_queue)
Traditional packet filtering, NAT, ... is implemented on top of this framework
Can be used for other stuff interfacing with the core network stack, like DECnet routing daemon.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Netfilter Hooks
Netfilter architecture in IPv4
%font "typewriter"
%size 4
--->[1]--->[ROUTE]--->[3]--->[4]--->
              |            ^
              |            |
              |         [ROUTE]
              v            |
             [2]          [5]
              |            ^
              |            |
              v            |
%font "standard"
1=NF_IP_PRE_ROUTING
2=NF_IP_LOCAL_IN
3=NF_IP_FORWARD
4=NF_IP_POST_ROUTING
5=NF_IP_LOCAL_OUT
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Netfilter Hooks
Netfilter Hooks
Any kernel module may register a callback function at any of the hooks
The module has to return one of the following constants
NF_ACCEPT continue traversal as normal
NF_DROP drop the packet, do not continue
NF_STOLEN I've taken over the packet, do not continue
NF_QUEUE enqueue packet to userspace
NF_REPEAT call this hook again
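The verdict handling above can be illustrated with a small userspace toy model. This is only a sketch in Python, not kernel code: the class `HookPoint`, its methods and the packet dictionaries are hypothetical names invented for this example, and only the verdict names are taken from the slide.

```python
from enum import Enum


class Verdict(Enum):
    # Names mirror the constants on the slide; the numeric values are
    # not significant for this sketch.
    NF_DROP = 0
    NF_ACCEPT = 1
    NF_STOLEN = 2
    NF_QUEUE = 3
    NF_REPEAT = 4


class HookPoint:
    """One hook (e.g. NF_IP_PRE_ROUTING) with callbacks ordered by priority."""

    def __init__(self, name):
        self.name = name
        self.callbacks = []  # list of (priority, function)

    def register(self, priority, func):
        self.callbacks.append((priority, func))
        self.callbacks.sort(key=lambda entry: entry[0])

    def run(self, packet, userspace_queue):
        """Return True if the stack should keep processing the packet."""
        for _priority, func in self.callbacks:
            verdict = func(packet)
            while verdict is Verdict.NF_REPEAT:
                verdict = func(packet)  # call the same callback again
            if verdict is Verdict.NF_ACCEPT:
                continue  # hand the packet to the next callback
            if verdict is Verdict.NF_QUEUE:
                userspace_queue.append(packet)
            # NF_DROP, NF_STOLEN and NF_QUEUE all end traversal here
            return False
        return True


pre_routing = HookPoint("NF_IP_PRE_ROUTING")
pre_routing.register(0, lambda pkt: Verdict.NF_ACCEPT)
pre_routing.register(
    10, lambda pkt: Verdict.NF_DROP if pkt["dport"] == 23 else Verdict.NF_ACCEPT
)
queue = []
```

With these two callbacks registered, a packet to port 80 passes the hook, while a packet to port 23 is dropped by the second callback and never reaches the rest of the stack.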
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
IP tables
Packet selection using IP tables
The kernel provides generic IP tables support
Each kernel module may create its own IP table
The three major parts of 2.4 firewalling subsystem are implemented using IP tables
Packet filtering table 'filter'
NAT table 'nat'
Packet mangling table 'mangle'
Can potentially be used for other stuff, e.g. IPsec SPDB
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
IP Tables
Managing chains and tables
An IP table consists of multiple chains
A chain consists of a list of rules
Every single rule in a chain consists of
match[es] (rule executed if all matches true)
target (what to do if the rule is matched)
%size 4
matches and targets can either be builtin or implemented as kernel modules
%size 5
The userspace tool iptables is used to control IP tables
handles all different kinds of IP tables
supports a plugin/shlib interface for target/match specific options
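The table / chain / rule structure described above can be sketched as a toy model. This is a minimal illustration in Python under stated assumptions, not the kernel's data layout: `Rule`, `Table` and `traverse` are hypothetical names, and the fall-back to a per-chain policy is ordinary iptables behaviour for built-in chains that the slide does not mention.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Packet = Dict[str, object]


@dataclass
class Rule:
    matches: List[Callable[[Packet], bool]]
    target: str  # "ACCEPT", "DROP", "RETURN" or the name of a user-defined chain

    def applies_to(self, packet: Packet) -> bool:
        # A rule fires only if all of its matches are true;
        # a rule without matches applies to every packet.
        return all(match(packet) for match in self.matches)


@dataclass
class Table:
    chains: Dict[str, List[Rule]] = field(default_factory=dict)
    policy: Dict[str, str] = field(default_factory=dict)

    def traverse(self, chain: str, packet: Packet) -> str:
        verdict = self._walk(chain, packet)
        return verdict if verdict is not None else self.policy[chain]

    def _walk(self, chain: str, packet: Packet) -> Optional[str]:
        for rule in self.chains[chain]:
            if not rule.applies_to(packet):
                continue
            if rule.target in ("ACCEPT", "DROP"):
                return rule.target
            if rule.target == "RETURN":
                return None  # back to the calling chain
            verdict = self._walk(rule.target, packet)  # jump to user-defined chain
            if verdict is not None:
                return verdict
        return None


filter_table = Table(
    chains={
        "INPUT": [Rule([lambda p: p["proto"] == "tcp"], "tcp_in")],
        "tcp_in": [
            Rule([lambda p: p["dport"] == 25], "ACCEPT"),
            Rule([], "RETURN"),
        ],
    },
    policy={"INPUT": "DROP"},
)
```

Here TCP packets jump from INPUT into the user-defined chain `tcp_in`; SMTP is accepted there, everything else returns to INPUT and ends up at the chain policy.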
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
IP Tables
Basic iptables commands
To build a complete iptables command, we must specify
which table to work with
which chain in this table to use
an operation (insert, add, delete, modify)
one or more matches (optional)
a target
The syntax is
%font "typewriter"
%size 3
iptables -t table -Operation chain -j target match(es)
%font "standard"
%size 5
Example:
%font "typewriter"
%size 3
iptables -t filter -A INPUT -j ACCEPT -p tcp --dport smtp
%font "standard"
%size 5
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
IP Tables
Matches
Basic matches
-p protocol (tcp/udp/icmp/...)
-s source address (ip/mask)
-d destination address (ip/mask)
-i incoming interface
-o outgoing interface
Match extensions (examples)
tcp/udp TCP/UDP source/destination port
icmp ICMP code/type
ah/esp AH/ESP SPI match
mac source MAC address
mark nfmark
length match on length of packet
limit rate limiting (n packets per timeframe)
owner owner uid of the socket sending the packet
tos TOS field of IP header
ttl TTL field of IP header
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
IP Tables
Targets
very dependent on the particular table.
Table specific targets will be discussed later
Generic Targets, always available
ACCEPT accept packet within chain
DROP silently drop packet
QUEUE enqueue packet to userspace
LOG log packet via syslog
ULOG log packet via ulogd
RETURN return to previous (calling) chain
foobar jump to user defined chain
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Packet Filtering
Overview
Implemented as 'filter' table
Registers with three netfilter hooks
NF_IP_LOCAL_IN (packets destined for the local host)
NF_IP_FORWARD (packets forwarded by local host)
NF_IP_LOCAL_OUT (packets from the local host)
Each of the three hooks has one chain attached (INPUT, FORWARD, OUTPUT)
Every packet passes exactly one of the three chains. Note that this is very different compared to the old 2.2.x ipchains behaviour.
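The "exactly one of the three chains" property follows from the hook layout shown earlier. A rough sketch in Python, assuming only the three simple cases (received for the local host, forwarded, locally generated); the function names are hypothetical and loopback traffic is deliberately not modelled:

```python
FILTER_CHAINS = {
    "NF_IP_LOCAL_IN": "INPUT",
    "NF_IP_FORWARD": "FORWARD",
    "NF_IP_LOCAL_OUT": "OUTPUT",
}


def hooks_traversed(origin, destination_is_local=False):
    """Return the IPv4 hooks a packet visits, following the hook diagram.

    origin is "network" for received packets and "local" for packets
    generated by a local process.
    """
    if origin == "local":
        return ["NF_IP_LOCAL_OUT", "NF_IP_POST_ROUTING"]
    if destination_is_local:
        return ["NF_IP_PRE_ROUTING", "NF_IP_LOCAL_IN"]
    return ["NF_IP_PRE_ROUTING", "NF_IP_FORWARD", "NF_IP_POST_ROUTING"]


def filter_chains_seen(origin, destination_is_local=False):
    """Return the chains of the 'filter' table that see the packet."""
    hooks = hooks_traversed(origin, destination_is_local)
    return [FILTER_CHAINS[hook] for hook in hooks if hook in FILTER_CHAINS]
```

In each of the three cases the list returned by `filter_chains_seen` has exactly one element, which is the difference to ipchains that the slide points out: a forwarded packet is seen by FORWARD only, not by INPUT and OUTPUT as well.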
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Packet Filtering
Targets available within 'filter' table
Builtin Targets to be used in filter table
ACCEPT accept the packet
DROP silently drop the packet
QUEUE enqueue packet to userspace
RETURN return to previous (calling) chain
foobar user defined chain
Targets implemented as loadable modules
REJECT drop the packet but inform sender
MIRROR change source/destination IP and resend
LOG log via syslog
ULOG log via userspace
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Connection Tracking Subsystem
Connection tracking...
implemented separately from NAT
enables stateful filtering
implementation
hooks into NF_IP_PRE_ROUTING to track packets
hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
protocol modules (currently TCP/UDP/ICMP)
application helpers (currently FTP, IRC, H.323, talk, SNMP)
divides packets into the following four categories
NEW - would establish new connection
ESTABLISHED - part of already established connection
RELATED - is related to established connection
INVALID - (multicast, errors...)
does _NOT_ filter packets itself
can be utilized by iptables using the 'state' match
is used by NAT Subsystem
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Connection Tracking Subsystem
Common structures
struct ip_conntrack_tuple, representing unidirectional flow
layer 3 src + dst
layer 4 protocol
layer 4 src + dst
connections represented as struct ip_conntrack
original tuple
reply tuple
timeout
l4 state private data
app helper
app helper private data
expected connections
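As a rough illustration of these structures, the two records can be written down as Python dataclasses. This is a sketch only: the field names are invented for readability and do not match the C struct members, and protocols without ports (ICMP, GRE) would need different layer 4 fields than the ones assumed here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class FlowTuple:
    """One direction of a flow, loosely modelled on struct ip_conntrack_tuple."""

    src_ip: str
    dst_ip: str
    protocol: str
    src_port: int
    dst_port: int

    def inverted(self) -> "FlowTuple":
        # The reply direction swaps source and destination on both layers.
        return FlowTuple(self.dst_ip, self.src_ip, self.protocol,
                         self.dst_port, self.src_port)


@dataclass
class Connection:
    """Loosely modelled on struct ip_conntrack."""

    original: FlowTuple
    reply: FlowTuple
    timeout_seconds: float
    l4_state: Optional[str] = None
    helper: Optional[str] = None
    expected: List[FlowTuple] = field(default_factory=list)


def new_connection(first_packet: FlowTuple, timeout_seconds: float) -> Connection:
    return Connection(first_packet, first_packet.inverted(), timeout_seconds)


conn = new_connection(FlowTuple("10.0.0.2", "192.0.2.1", "tcp", 40000, 80), 120.0)
```

Because `FlowTuple` is frozen it is hashable, so it can serve directly as a key for a hash table lookup in either direction.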
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Connection Tracking Subsystem
Flow of events for new packet
packet enters NF_IP_PRE_ROUTING
tuple is derived from packet
lookup conntrack hash table with hash(tuple) -> fails
new ip_conntrack is allocated
fill in original and reply == inverted(original) tuple
initialize timer
assign app helper if applicable
see if we've been expected -> fails
call layer 4 helper 'new' function
...
packet enters NF_IP_POST_ROUTING
do hashtable lookup for packet -> fails
place struct ip_conntrack in hashtable
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Connection Tracking Subsystem
Flow of events for packet part of existing connection
packet enters NF_IP_PRE_ROUTING
tuple is derived from packet
lookup conntrack hash table with hash(tuple)
associate conntrack entry with skb->nfct
call l4 protocol helper 'packet' function
do l4 state tracking
update timeouts as needed [i.e. TCP TIME_WAIT,...]
...
packet enters NF_IP_POST_ROUTING
do hashtable lookup for packet -> succeeds
do nothing else
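The two flows of events (new packet, packet of an existing connection) can be sketched as a toy model. This is a simplified Python illustration, not the kernel implementation: tuples are plain 5-tuples, the class and method names are hypothetical, and timers, helpers, expectations and layer 4 state tracking are left out. The point it tries to show is that a new connection only enters the hash table once its first packet has survived to the last hook.

```python
def invert(flow):
    """Swap source and destination of a (src, dst, proto, sport, dport) tuple."""
    src, dst, proto, sport, dport = flow
    return (dst, src, proto, dport, sport)


class ConntrackTable:
    def __init__(self):
        self.table = {}  # tuple -> connection, one entry per direction

    def pre_routing(self, packet):
        """Look up or create the connection and return a simplified state."""
        flow = packet["tuple"]
        conn = self.table.get(flow)
        if conn is None:
            # Lookup failed: allocate a new, not yet confirmed connection.
            conn = {"original": flow, "reply": invert(flow), "seen_reply": False}
        elif flow == conn["reply"]:
            conn["seen_reply"] = True
        packet["nfct"] = conn  # stands in for skb->nfct
        return "ESTABLISHED" if conn["seen_reply"] else "NEW"

    def confirm(self, packet):
        """Called at the last hook: insert the connection if it is not there yet."""
        conn = packet["nfct"]
        if conn["original"] not in self.table:
            self.table[conn["original"]] = conn
            self.table[conn["reply"]] = conn


ct = ConntrackTable()
syn = {"tuple": ("10.0.0.2", "192.0.2.1", "tcp", 40000, 80)}
first_state = ct.pre_routing(syn)   # lookup fails, connection allocated
entries_before = len(ct.table)      # still empty: packet may yet be filtered
ct.confirm(syn)                     # packet passed the filter rules
reply = {"tuple": invert(syn["tuple"])}
reply_state = ct.pre_routing(reply)  # lookup succeeds in reply direction
```

If a filter rule drops the first packet between the two calls, `confirm` is never reached and no state is left behind for the rejected connection.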
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Network Address Translation
Overview
Previous Linux Kernels only implemented one special case of NAT: Masquerading
Linux 2.4.x can do any kind of NAT.
NAT subsystem implemented on top of netfilter, iptables and conntrack
NAT subsystem registers with all five netfilter hooks
'nat' Table registers chains PREROUTING, POSTROUTING and OUTPUT
Following targets available within 'nat' Table
SNAT changes the packet's source while passing NF_IP_POST_ROUTING
DNAT changes the packet's destination while passing NF_IP_PRE_ROUTING
MASQUERADE is a special case of SNAT
REDIRECT is a special case of DNAT
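How SNAT relies on connection tracking can be sketched as follows. This is a toy model in Python with hypothetical names (`SnatGateway`, `post_routing`, `pre_routing`); it assumes plain 5-tuples and ignores port collisions, which a real NAT implementation has to resolve by choosing a unique tuple.

```python
def invert(flow):
    """Swap source and destination of a (src, dst, proto, sport, dport) tuple."""
    src, dst, proto, sport, dport = flow
    return (dst, src, proto, dport, sport)


class SnatGateway:
    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.by_original = {}  # original tuple -> translated tuple
        self.by_reply = {}     # expected reply tuple -> tuple to restore

    def post_routing(self, flow):
        """Rewrite the source address of an outgoing packet (SNAT)."""
        if flow not in self.by_original:
            _src, dst, proto, sport, dport = flow
            translated = (self.public_ip, dst, proto, sport, dport)
            self.by_original[flow] = translated
            # Replies will be addressed to the public IP; remember how to
            # map them back to the internal host.
            self.by_reply[invert(translated)] = invert(flow)
        return self.by_original[flow]

    def pre_routing(self, flow):
        """Undo the translation for replies; other packets pass unchanged."""
        return self.by_reply.get(flow, flow)


gw = SnatGateway("1.2.3.4")
outgoing = gw.post_routing(("10.0.0.2", "192.0.2.1", "tcp", 40000, 80))
incoming = gw.pre_routing(("192.0.2.1", "1.2.3.4", "tcp", 80, 40000))
```

The mapping is chosen once, for the first packet of a connection, and every later packet in either direction reuses it. DNAT is the mirror image: the destination is rewritten on the way in and restored on replies.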
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Network Address Translation
Source NAT
SNAT Example:
%font "typewriter"
%size 3
iptables -t nat -A POSTROUTING -j SNAT --to-source 1.2.3.4 -s 10.0.0.0/8
%font "standard"
%size 4
MASQUERADE Example:
%font "typewriter"
%size 3
iptables -t nat -A POSTROUTING -j MASQUERADE -o ppp0
%font "standard"
%size 5
Destination NAT
DNAT example
%font "typewriter"
%size 3
iptables -t nat -A PREROUTING -j DNAT --to-destination 1.2.3.4:8080 -p tcp --dport 80 -i eth1
%font "standard"
%size 4
REDIRECT example
%font "typewriter"
%size 3
iptables -t nat -A PREROUTING -j REDIRECT --to-port 3128 -i eth1 -p tcp --dport 80
%font "standard"
%size 5
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Packet Mangling
Purpose of mangle table
packet manipulation except address manipulation
Integration with netfilter
'mangle' table hooks in all five netfilter hooks
priority: after conntrack
Targets specific to the 'mangle' table:
DSCP - manipulate DSCP field
IPV4OPTSSTRIP - strip IPv4 options
MARK - change the nfmark field of the skb
TCPMSS - set TCP MSS option
TOS - manipulate the TOS bits
TTL - set / increase / decrease TTL field
Simple example:
%font "typewriter"
%size 3
iptables -t mangle -A PREROUTING -j MARK --set-mark 10 -p tcp --dport 80
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Advanced Netfilter concepts
%size 4
Userspace logging
flexible replacement for old syslog-based logging
packets to userspace via multicast netlink sockets
easy-to-use library (libipulog)
plugin-extensible userspace logging daemon (ulogd)
Can even be used to directly log into MySQL
Queuing
reliable asynchronous packet handling
packets to userspace via unicast netlink socket
easy-to-use library (libipq)
provides Perl bindings
experimental queue multiplex daemon (ipqmpd)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Current Development and Future
Netfilter (although it proved very stable) is still work in progress.
Areas of current development
infrastructure for conntrack manipulation from userspace
failover of stateful firewalls (conntrack sync)
making iptables layer3 independent (pkttables)
new userspace library (libiptables) to hide plugins from apps
more matches and targets for advanced functions (pool, hashslot)
more conntrack and NAT modules (RPC, SNMP, SMB, ...)
better IPv6 support (conntrack, more matches / targets)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Thanks
Thanks to
the BBS people, Z-Netz, FIDO, ...
for heavily increasing my computer usage in 1992
KNF
for bringing me in touch with the internet as early as 1994
for providing a playground for technical people
for telling me about the existence of Linux!
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
for implementing (one of?) the world's best TCP/IP stacks
Paul 'Rusty' Russell
for starting the netfilter/iptables project
for trusting me to maintain it today
Astaro AG
for sponsoring parts of my netfilter work
sarovar.org
for sponsoring www.in.netfilter.org
%size 3
The slides and an accompanying paper for this presentation are available at http://www.gnumonks.org/
%size 3
The netfilter homepage http://www.netfilter.org/


@ -0,0 +1,49 @@
Linux 2.4.x netfilter/iptables firewalling internals
The Linux 2.4.x kernel series has introduced a totally new kernel firewalling subsystem. It is much more than a plain successor of ipfwadm or ipchains.
The netfilter/iptables project has a very modular design and its
sub-projects can be split in several parts: netfilter, iptables, connection
tracking, NAT and packet mangling.
While most users will already have learned how to use the basic functions
of netfilter/iptables in order to convert their old ipchains firewalls to
iptables, there's more advanced but less used functionality in
netfilter/iptables.
The presentation covers the design principles behind the netfilter/iptables
implementation. This knowledge enables us to understand how the individual
parts of netfilter/iptables fit together, and for which potential applications
this is useful.
Topics covered:
- overview about the internal netfilter/iptables architecture
- the netfilter hooks inside the network protocol stacks
- packet selection with IP tables
- how is connection tracking and NAT integrated into the framework
- the connection tracking system
- how well does it track the TCP state?
- how does it track ICMP and UDP state at all?
- layer 4 protocol helpers (GRE, ...)
- application helpers (ftp, irc, h323, ...)
- restrictions/limitations
- the NAT system
- how does it interact with connection tracking?
- layer 4 protocol helpers
- application helpers (ftp, irc, ...)
- misc
- how far is IPv6 firewalling with ip6tables?
- advances in failover/HA of stateful firewalls
- invisible firewalls with iptables on a bridge
- userspace packet queueing with QUEUE
- userspace packet logging with ULOG
Requirements:
- knowledge about the TCP/IP protocol family
- knowledge about general firewalling and packet filtering concepts
- prior experience with linux packet filters
Audience:
- firewall administrators
- network developers


@ -0,0 +1,22 @@
<a href="http://gnumonks.org/users/laforge/">Harald Welte</a> is one
of the five <a href="http://www.netfilter.org/">netfilter/iptables</a> core
team members, and the current Linux 2.4.x firewalling maintainer.
His main interest in computing has always been networking. In the little time
left besides netfilter/iptables related work, he's writing obscure documents
like the <a href="http://www.gnumonks.org/ftp/pub/doc/uucp-over-ssl.html"> UUCP
over SSL HOWTO</a>. Other kernel-related projects he has been contributing to are
user mode linux and the international (crypto) kernel patch.
In the past he has been working as an independent IT Consultant working on
closed-source projects for various companies ranging from banks to
manufacturers of networking gear. During the year 2001 he was living in
Curitiba (Brazil), where he got sponsored for his Linux related work by
<a href="http://www.conectiva.com/">Conectiva Inc.</a>.
Starting with February 2002, Harald has been contracted part-time by
<a href="http://www.astaro.com/">Astaro AG</a>, who are sponsoring him for his
current netfilter/iptables work.
Harald is living in Berlin, Germany.


@ -0,0 +1,509 @@
%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%back "blue"
%center
%size 7
Linux 2.4.x netfilter/iptables
firewalling internals
%center
%size 4
by
Harald Welte <laforge@netfilter.org>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Contents
Introduction
Netfilter hooks in protocol stacks
Packet selection based on IP Tables
The Connection Tracking Subsystem
The NAT Subsystem based on netfilter + iptables
Packet filtering using the 'filter' table
Packet mangling using the 'mangle' table
Advanced netfilter concepts
Current development and Future
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Introduction
Why did we need netfilter/iptables?
Because ipchains...
has no infrastructure for passing packets to userspace
makes transparent proxying extremely difficult
has interface address dependent Packet filter rules
has Masquerading implemented as part of packet filtering
code is too complex and intermixed with core ipv4 stack
is neither modular nor extensible
only barely supports one special case of NAT (masquerading)
has only stateless packet filtering
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Introduction
Who's behind netfilter/iptables
Paul 'Rusty' Russell
co-author of ipchains in Linux 2.2
was paid by Watchguard for about one year of development
James Morris
userspace queuing (kernel, library and tools)
Marc Boucher
NAT and packet filtering controlled by one command
Mangle table
Harald Welte
Conntrack+NAT helper infrastructure (newnat)
Userspace packet logging (ULOG)
Jozsef Kadlecsik
TCP window tracking
H.323 conntrack + NAT helper
Non-core team contributors
http://www.netfilter.org/scoreboard/
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Netfilter Hooks
What is netfilter?
System of callback functions within network stack
Callback function to be called for every packet traversing certain point (hook) within network stack
Protocol independent framework
Hooks in layer 3 stacks (IPv4, IPv6, DECnet, ARP)
Multiple kernel modules can register with each of the hooks
Asynchronous packet handling in userspace (ip_queue)
Traditional packet filtering, NAT, ... is implemented on top of this framework
Can be used for other stuff interfacing with the core network stack, like DECnet routing daemon.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Netfilter Hooks
Netfilter architecture in IPv4
%font "typewriter"
%size 4
--->[1]--->[ROUTE]--->[3]--->[4]--->
              |            ^
              |            |
              |         [ROUTE]
              v            |
             [2]          [5]
              |            ^
              |            |
              v            |
%font "standard"
1=NF_IP_PRE_ROUTING
2=NF_IP_LOCAL_IN
3=NF_IP_FORWARD
4=NF_IP_POST_ROUTING
5=NF_IP_LOCAL_OUT
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Netfilter Hooks
Netfilter Hooks
Any kernel module may register a callback function at any of the hooks
The module has to return one of the following constants
NF_ACCEPT continue traversal as normal
NF_DROP drop the packet, do not continue
NF_STOLEN I've taken over the packet, do not continue
NF_QUEUE enqueue packet to userspace
NF_REPEAT call this hook again
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
IP tables
Packet selection using IP tables
The kernel provides generic IP tables support
Each kernel module may create its own IP table
The three major parts of 2.4 firewalling subsystem are implemented using IP tables
Packet filtering table 'filter'
NAT table 'nat'
Packet mangling table 'mangle'
Can potentially be used for other stuff, e.g. IPsec SPDB
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
IP Tables
Managing chains and tables
An IP table consists of multiple chains
A chain consists of a list of rules
Every single rule in a chain consists of
match[es] (rule executed if all matches true)
target (what to do if the rule is matched)
%size 4
matches and targets can either be builtin or implemented as kernel modules
%size 5
The userspace tool iptables is used to control IP tables
handles all different kinds of IP tables
supports a plugin/shlib interface for target/match specific options
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
IP Tables
Basic iptables commands
To build a complete iptables command, we must specify
which table to work with
which chain in this table to use
an operation (insert, add, delete, modify)
one or more matches (optional)
a target
The syntax is
%font "typewriter"
%size 3
iptables -t table -Operation chain -j target match(es)
%font "standard"
%size 5
Example:
%font "typewriter"
%size 3
iptables -t filter -A INPUT -j ACCEPT -p tcp --dport smtp
%font "standard"
%size 5
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
IP Tables
Matches
Basic matches
-p protocol (tcp/udp/icmp/...)
-s source address (ip/mask)
-d destination address (ip/mask)
-i incoming interface
-o outgoing interface
Match extensions (examples)
tcp/udp TCP/UDP source/destination port
icmp ICMP code/type
ah/esp AH/ESP SPI match
mac source MAC address
mark nfmark
length match on length of packet
limit rate limiting (n packets per timeframe)
owner owner uid of the socket sending the packet
tos TOS field of IP header
ttl TTL field of IP header
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
IP Tables
Targets
very dependent on the particular table.
Table specific targets will be discussed later
Generic Targets, always available
ACCEPT accept packet within chain
DROP silently drop packet
QUEUE enqueue packet to userspace
LOG log packet via syslog
ULOG log packet via ulogd
RETURN return to previous (calling) chain
foobar jump to user defined chain
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Packet Filtering
Overview
Implemented as 'filter' table
Registers with three netfilter hooks
NF_IP_LOCAL_IN (packets destined for the local host)
NF_IP_FORWARD (packets forwarded by local host)
NF_IP_LOCAL_OUT (packets from the local host)
Each of the three hooks has attached one chain (INPUT, FORWARD, OUTPUT)
Every packet passes exactly one of the three chains. Note that this is very different compared to the old 2.2.x ipchains behaviour.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Packet Filtering
Targets available within 'filter' table
Builtin Targets to be used in filter table
ACCEPT accept the packet
DROP silently drop the packet
QUEUE enqueue packet to userspace
RETURN return to previous (calling) chain
foobar user defined chain
Targets implemented as loadable modules
REJECT drop the packet but inform sender
MIRROR change source/destination IP and resend
LOG log via syslog
ULOG log via userspace
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Connection Tracking Subsystem
Connection tracking...
implemented separately from NAT
enables stateful filtering
implementation
hooks into NF_IP_PRE_ROUTING to track packets
hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
protocol modules (currently TCP/UDP/ICMP)
application helpers (currently FTP, IRC, H.323, talk, SNMP)
divides packets into the following four categories
NEW - would establish new connection
ESTABLISHED - part of already established connection
RELATED - is related to established connection
INVALID - (multicast, errors...)
does _NOT_ filter packets itself
can be utilized by iptables using the 'state' match
is used by NAT Subsystem
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Common structures
struct ip_conntrack_tuple, representing unidirectional flow
layer 3 src + dst
layer 4 protocol
layer 4 src + dst
connections represented as struct ip_conntrack
original tuple
reply tuple
timeout
l4 state private data
app helper
app helper private data
expected connections
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Flow of events for new packet
packet enters NF_IP_PRE_ROUTING
tuple is derived from packet
lookup conntrack hash table with hash(tuple) -> fails
new ip_conntrack is allocated
fill in original and reply == inverted(original) tuple
initialize timer
assign app helper if applicable
see if we've been expected -> fails
call layer 4 helper 'new' function
...
packet enters NF_IP_POST_ROUTING
do hashtable lookup for packet -> fails
place struct ip_conntrack in hashtable
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Flow of events for packet part of existing connection
packet enters NF_IP_PRE_ROUTING
tuple is derived from packet
lookup conntrack hash table with hash(tuple)
associate conntrack entry with skb->nfct
call l4 protocol helper 'packet' function
do l4 state tracking
update timeouts as needed [e.g. TCP TIME_WAIT,...]
...
packet enters NF_IP_POST_ROUTING
do hashtable lookup for packet -> succeeds
do nothing else
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Network Address Translation
Overview
Previous Linux Kernels only implemented one special case of NAT: Masquerading
Linux 2.4.x can do any kind of NAT.
NAT subsystem implemented on top of netfilter, iptables and conntrack
NAT subsystem registers with all five netfilter hooks
'nat' Table registers chains PREROUTING, POSTROUTING and OUTPUT
Following targets available within 'nat' Table
SNAT changes the packet's source while passing NF_IP_POST_ROUTING
DNAT changes the packet's destination while passing NF_IP_PRE_ROUTING
MASQUERADE is a special case of SNAT
REDIRECT is a special case of DNAT
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Network Address Translation
Source NAT
SNAT Example:
%font "typewriter"
%size 3
iptables -t nat -A POSTROUTING -j SNAT --to-source 1.2.3.4 -s 10.0.0.0/8
%font "standard"
%size 4
MASQUERADE Example:
%font "typewriter"
%size 3
iptables -t nat -A POSTROUTING -j MASQUERADE -o ppp0
%font "standard"
%size 5
Destination NAT
DNAT example
%font "typewriter"
%size 3
iptables -t nat -A PREROUTING -j DNAT --to-destination 1.2.3.4:8080 -p tcp --dport 80 -i eth1
%font "standard"
%size 4
REDIRECT example
%font "typewriter"
%size 3
iptables -t nat -A PREROUTING -j REDIRECT --to-port 3128 -i eth1 -p tcp --dport 80
%font "standard"
%size 5
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Packet Mangling
Purpose of mangle table
packet manipulation except address manipulation
Integration with netfilter
'mangle' table hooks in all five netfilter hooks
priority: after conntrack
Targets specific to the 'mangle' table:
DSCP - manipulate DSCP field
IPV4OPTSSTRIP - strip IPv4 options
MARK - change the nfmark field of the skb
TCPMSS - set TCP MSS option
TOS - manipulate the TOS bits
TTL - set / increase / decrease TTL field
Simple example:
%font "typewriter"
%size 3
iptables -t mangle -A PREROUTING -j MARK --set-mark 10 -p tcp --dport 80
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Advanced Netfilter concepts
%size 4
Userspace logging
flexible replacement for old syslog-based logging
packets to userspace via multicast netlink sockets
easy-to-use library (libipulog)
plugin-extensible userspace logging daemon (ulogd)
Can even be used to directly log into MySQL
Queuing
reliable asynchronous packet handling
packets to userspace via unicast netlink socket
easy-to-use library (libipq)
provides Perl bindings
experimental queue multiplex daemon (ipqmpd)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
netfilter/iptables in Linux 2.4
Current Development and Future
Netfilter (although it has proved very stable) is still a work in progress.
Areas of current development
infrastructure for conntrack manipulation from userspace
failover of stateful firewalls (conntrack sync)
making iptables layer3 independent (pkttables)
new userspace library (libiptables) to hide plugins from apps
more matches and targets for advanced functions (pool, hashslot)
more conntrack and NAT modules (RPC, SNMP, SMB, ...)
better IPv6 support (conntrack, more matches / targets)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Thanks
Thanks to
the BBS people, Z-Netz, FIDO, ...
for heavily increasing my computer usage in 1992
KNF
for bringing me in touch with the internet as early as 1994
for providing a playground for technical people
for telling me about the existence of Linux!
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
for implementing (one of?) the world's best TCP/IP stacks
Paul 'Rusty' Russell
for starting the netfilter/iptables project
for trusting me to maintain it today
Astaro AG
for sponsoring parts of my netfilter work
%size 3
The slides and the accompanying paper of this presentation are available at http://www.gnumonks.org/
%size 3
The netfilter homepage http://www.netfilter.org/

View File

@ -0,0 +1,70 @@
Short paper for the talk "Programming netfilter/iptables extensions"
(Key 3b911575)
1. Why is this topic interesting for attendees?
The topic is interesting for several reasons.
For one, many advanced administrators simply do not realize what
possibilities open up once they write their own extension modules.
Too many still think in terms of the old, rigid, monolithic 'ipchains' world.
For another, it is an ideal opportunity to get a first taste of kernel
programming. Most of the complex issues (locking, etc.) are handled by the
netfilter/iptables core, so that only C programming skills are really
required, and attendees need not have touched the kernel before.
And last but not least, this talk would be only the second opportunity
to approach this topic through a workshop instead of RTFM ;)
2. Why are you working on this topic?
Because I am a member of the netfilter core team and the current
maintainer of the Linux firewalling subsystem.
How I came to be that is a longer story. In any case, I consider it
important to work on the further development of Linux firewalling.
Networks, and network security in particular, have always been my
favourite subject.
3. What structure/outline should the talk or workshop have?
It starts with a brief overview of the internal architecture
of the netfilter and iptables subsystems.
The second part discusses the APIs available within this architecture,
including code examples from existing iptables matches/targets as well as
conntrack and NAT helper modules.
The third part is a step-by-step development of an
iptables extension module.
4. Do you also plan a practical demonstration as part of the contribution?
Well, since this is a kind of programming tutorial, we will write such an
extension module together after the theoretical part (an introduction to
the APIs). I think that counts as a
"practical demonstration".
5. Which relevant websites exist on the topic?
The homepage of the netfilter/iptables project at http://www.netfilter.org/
is relevant, in particular the netfilter-hacking-HOWTO at
http://www.netfilter.org/documentation/HOWTO/netfilter-hacking-HOWTO.html
6. Have you spoken on this topic before?
I have given numerous talks and tutorials on
netfilter/iptables at international conferences (among others
Linuxtag, Linux-Kongress, Ottawa Linux Symposium, Sao Paulo Linux Expo, ...).
An incomplete list is available at http://www.netfilter.org/events.html
These included a one-day workshop that covered the complete subject area,
from the basics of usage through to programming extension
modules.
Miscellaneous:
The workshop can be offered in German or English.

View File

@ -0,0 +1,54 @@
#include <linux/module.h>
#include <linux/skbuff.h>
#include <linux/ip.h>
#include <linux/netfilter_ipv4/ip_tables.h>
#include <linux/netfilter_ipv4/ipt_workshop.h>

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Harald Welte <laforge@netfilter.org>");
MODULE_DESCRIPTION("5CLT workshop iptables module");

/* called for every packet that reaches a rule using this match;
 * returns 1 if the packet matches, 0 otherwise */
static int ws_match(const struct sk_buff *skb, const struct net_device *in,
		    const struct net_device *out, const void *matchinfo,
		    int offset, const void *hdr, u_int16_t datalen,
		    int *hotdrop)
{
	const struct ipt_ws_info *info = matchinfo;
	const struct iphdr *iph = skb->nh.iph;

	/* match if the TTL equals the value configured from userspace */
	if (iph->ttl == info->ttl)
		return 1;

	return 0;
}

/* called when a rule using this match is inserted;
 * returns 1 to accept the rule, 0 to reject it */
static int ws_checkentry(const char *tablename, const struct ipt_ip *ip,
			 void *matchinfo, unsigned int matchsize,
			 unsigned int hook_mask)
{
	/* reject rules whose match data has an unexpected size */
	if (matchsize != IPT_ALIGN(sizeof(struct ipt_ws_info)))
		return 0;

	return 1;
}

/* the structure needs a name different from the ws_match() function */
static struct ipt_match workshop_match = {
	.list		= { .prev = NULL, .next = NULL },
	.name		= "workshop",
	.match		= &ws_match,
	.checkentry	= &ws_checkentry,
	.destroy	= NULL,
	.me		= THIS_MODULE
};

static int __init init(void)
{
	return ipt_register_match(&workshop_match);
}

static void __exit fini(void)
{
	ipt_unregister_match(&workshop_match);
}

module_init(init);
module_exit(fini);

View File

@ -0,0 +1,6 @@
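#ifndef _IPT_WORKSHOP_H
#define _IPT_WORKSHOP_H

/* match data shared between the kernel module and the iptables plugin */
struct ipt_ws_info {
	u_int8_t ttl;	/* TTL value to compare against */
};

#endif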

View File

@ -0,0 +1,102 @@
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <getopt.h>
#include <iptables.h>
#include <linux/netfilter_ipv4/ip_tables.h>
#include <linux/netfilter_ipv4/ipt_workshop.h>

/* print the help text for this match */
static void help(void)
{
	printf(
"workshop match v%s options:\n"
" --ttl value	match packets with this TTL value (0-255)\n"
, IPTABLES_VERSION);
}

/* initialize the match data before any option is parsed */
static void init(struct ipt_entry_match *m, unsigned int *nfcache)
{
	/* caching not implemented yet */
	*nfcache |= NFC_UNKNOWN;
}

/* parse one getopt character; return 1 if it was ours, 0 otherwise */
static int parse(int c, char **argv, int invert, unsigned int *flags,
		 const struct ipt_entry *entry, unsigned int *nfcache,
		 struct ipt_entry_match **match)
{
	struct ipt_ws_info *info = (struct ipt_ws_info *) (*match)->data;
	char *end;
	long ttl;

	switch (c) {
	case 'z':
		/* only validate once we know the option belongs to us */
		check_inverse(optarg, &invert, &optind, 0);
		if (invert)
			exit_error(PARAMETER_PROBLEM, "invert not supported");

		if (*flags)
			exit_error(PARAMETER_PROBLEM,
				   "workshop: can't specify parameter twice");

		if (!optarg)
			exit_error(PARAMETER_PROBLEM,
				   "workshop: you must specify a value");

		/* the TTL is an 8 bit header field */
		ttl = strtol(optarg, &end, 10);
		if (end == optarg || *end != '\0' || ttl < 0 || ttl > 255)
			exit_error(PARAMETER_PROBLEM,
				   "workshop: TTL must be in the range 0-255");

		info->ttl = ttl;
		*flags = 1;
		break;
	default:
		/* not one of our options */
		return 0;
	}

	return 1;
}

/* called after all options are parsed: --ttl is mandatory */
static void final_check(unsigned int flags)
{
	if (!flags)
		exit_error(PARAMETER_PROBLEM,
			   "workshop match: you must specify --ttl");
}

/* print the match for "iptables -L" */
static void print(const struct ipt_ip *ip,
		  const struct ipt_entry_match *match,
		  int numeric)
{
	const struct ipt_ws_info *info =
		(const struct ipt_ws_info *) match->data;

	printf("workshop match TTL=%u ", info->ttl);
}

/* print the match in a form that iptables-restore can parse */
static void save(const struct ipt_ip *ip,
		 const struct ipt_entry_match *match)
{
	const struct ipt_ws_info *info =
		(const struct ipt_ws_info *) match->data;

	printf("--ttl %u ", info->ttl);
}

/* getopt-style options; 'z' is the character handed to parse() */
static struct option opts[] = {
	{ "ttl", 1, 0, 'z' },
	{ 0 }
};

static struct iptables_match ws = {
	.next		= NULL,
	.name		= "workshop",
	.version	= IPTABLES_VERSION,
	.size		= IPT_ALIGN(sizeof(struct ipt_ws_info)),
	.userspacesize	= IPT_ALIGN(sizeof(struct ipt_ws_info)),
	.help		= &help,
	.init		= &init,
	.parse		= &parse,
	.final_check	= &final_check,
	.print		= &print,
	.save		= &save,
	.extra_opts	= opts
};

/* runs when iptables loads the shared library */
void _init(void)
{
	register_match(&ws);
}

View File

@ -0,0 +1,636 @@
%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%deffont "typewriter" tfont "MONOTYPE.TTF"
%page
%nodefault
%back "blue"
%center
%size 7
Programming netfilter/iptables
extensions
%center
%size 4
by
Harald Welte <laforge@netfilter.org>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Contents
Introduction
The netfilter/iptables architecture
Netfilter hooks in protocol stacks
Packet selection based on IP Tables
The Connection Tracking Subsystem
The NAT Subsystem based on netfilter + iptables
Packet filtering using the 'filter' table
Packet mangling using the 'mangle' table
Advanced netfilter concepts
Current development and Future
Developing a netfilter module
Developing a new iptables match
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Introduction
Why did we need netfilter/iptables?
Because ipchains...
has no infrastructure for passing packets to userspace
makes transparent proxying extremely difficult
has interface address dependent Packet filter rules
has Masquerading implemented as part of packet filtering
code is too complex and intermixed with core ipv4 stack
is neither modular nor extensible
only barely supports one special case of NAT (masquerading)
has only stateless packet filtering
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Introduction
Who's behind netfilter/iptables
Paul 'Rusty' Russell
co-author of ipchains in Linux 2.2
was paid by Watchguard for about one year of development
James Morris
userspace queuing (kernel, library and tools)
REJECT target
Marc Boucher
NAT and packet filtering controlled by one command
Mangle table
Harald Welte
Conntrack+NAT helper infrastructure (newnat)
Userspace packet logging (ULOG)
PPTP and IRC conntrack/NAT helpers
Jozsef Kadlecsik
TCP window tracking
H.323 conntrack + NAT helper
Continued newnat development
Non-core team contributors
http://www.netfilter.org/scoreboard/
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Netfilter Hooks
What is netfilter?
System of callback functions within network stack
Callback function to be called for every packet traversing certain point (hook) within network stack
Protocol independent framework
Hooks in layer 3 stacks (IPv4, IPv6, DECnet, ARP)
Multiple kernel modules can register with each of the hooks
Asynchronous packet handling in userspace (ip_queue)
Traditional packet filtering, NAT, ... is implemented on top of this framework
Can be used for other code that interfaces with the core network stack, like the DECnet routing daemon.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Netfilter Hooks
Netfilter architecture in IPv4
%font "typewriter"
--->[1]--->[ROUTE]--->[3]--->[4]--->
| ^
| |
| [ROUTE]
v |
[2] [5]
| ^
| |
v |
%font "standard"
1=NF_IP_PRE_ROUTING
2=NF_IP_LOCAL_IN
3=NF_IP_FORWARD
4=NF_IP_POST_ROUTING
5=NF_IP_LOCAL_OUT
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Netfilter Hooks
Netfilter Hooks
Any kernel module may register a callback function at any of the hooks
The module has to return one of the following constants
NF_ACCEPT continue traversal as normal
NF_DROP drop the packet, do not continue
NF_STOLEN I've taken over the packet, do not continue
NF_QUEUE enqueue packet to userspace
NF_REPEAT call this hook again
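The verdict handling above can be pictured with a small userspace model. This is only a sketch under stated assumptions: struct sim_hook, sim_register() and sim_traverse() are invented names and not kernel API, the verdicts are redefined locally with a SIM_ prefix, and NF_QUEUE and NF_REPEAT are left out for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for the kernel verdicts; the values are illustrative. */
enum sim_verdict { SIM_DROP, SIM_ACCEPT, SIM_STOLEN };

struct sim_packet {
	unsigned char ttl;
};

typedef enum sim_verdict (*sim_hookfn)(struct sim_packet *pkt);

struct sim_hook {
	sim_hookfn fn;
	int priority;		/* lower value runs first */
};

#define SIM_MAX_HOOKS 8
static struct sim_hook sim_hooks[SIM_MAX_HOOKS];
static size_t sim_nhooks;

/* Insert a callback, keeping the array sorted by ascending priority. */
static void sim_register(sim_hookfn fn, int priority)
{
	size_t i;

	assert(sim_nhooks < SIM_MAX_HOOKS);
	for (i = sim_nhooks; i > 0 && sim_hooks[i - 1].priority > priority; i--)
		sim_hooks[i] = sim_hooks[i - 1];
	sim_hooks[i].fn = fn;
	sim_hooks[i].priority = priority;
	sim_nhooks++;
}

/* Call every callback in order; stop at the first non-ACCEPT verdict. */
static enum sim_verdict sim_traverse(struct sim_packet *pkt)
{
	size_t i;

	for (i = 0; i < sim_nhooks; i++) {
		enum sim_verdict v = sim_hooks[i].fn(pkt);

		if (v != SIM_ACCEPT)
			return v;
	}
	return SIM_ACCEPT;
}

/* Example callback: modify the packet and let it continue. */
static enum sim_verdict sim_dec_ttl(struct sim_packet *pkt)
{
	if (pkt->ttl > 0)
		pkt->ttl--;
	return SIM_ACCEPT;
}

/* Example callback: drop packets whose TTL has reached zero. */
static enum sim_verdict sim_drop_expired(struct sim_packet *pkt)
{
	return pkt->ttl == 0 ? SIM_DROP : SIM_ACCEPT;
}
```

Registering sim_drop_expired with priority 10 and sim_dec_ttl with priority -5 makes the decrement run first, which mirrors how the priority field decides the order of modules on one hook.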
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
IP tables
Packet selection using IP tables
The kernel provides generic IP tables support
Each kernel module may create its own IP table
The three major parts of 2.4 firewalling subsystem are implemented using IP tables
Packet filtering table 'filter'
NAT table 'nat'
Packet mangling table 'mangle'
Can potentially be used for other stuff, e.g. IPsec SPDB
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
IP Tables
Managing chains and tables
An IP table consists of multiple chains
A chain consists of a list of rules
Every single rule in a chain consists of
match[es] (rule executed if all matches true)
target (what to do if the rule is matched)
%size 4
matches and targets can either be builtin or implemented as kernel modules
%size 6
The userspace tool iptables is used to control IP tables
handles all different kinds of IP tables
supports a plugin/shlib interface for target/match specific options
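The table / chain / rule relationship described on this slide can be sketched as plain C data structures. This is a userspace illustration only: every sim_ name is hypothetical, the real kernel structures (struct ipt_entry and friends) are laid out differently, and a chain policy is assumed as the fallback when no rule matches.

```c
#include <assert.h>
#include <stddef.h>

struct sim_pkt {
	unsigned char proto;	/* e.g. 6 for TCP */
	unsigned short dport;	/* destination port */
};

enum sim_target { SIM_T_ACCEPT, SIM_T_DROP };

/* A match is a predicate plus its private configuration data. */
typedef int (*sim_matchfn)(const struct sim_pkt *pkt, const void *info);

struct sim_match {
	sim_matchfn fn;
	const void *info;
};

/* A rule: zero or more matches and exactly one target. */
struct sim_rule {
	const struct sim_match *matches;
	size_t nmatches;
	enum sim_target target;
};

/* A chain: an ordered list of rules plus a default policy. */
struct sim_chain {
	const struct sim_rule *rules;
	size_t nrules;
	enum sim_target policy;
};

/* A rule applies only if all of its matches are true. */
static int sim_rule_matches(const struct sim_rule *r, const struct sim_pkt *pkt)
{
	size_t i;

	for (i = 0; i < r->nmatches; i++)
		if (!r->matches[i].fn(pkt, r->matches[i].info))
			return 0;
	return 1;
}

/* The first matching rule decides; otherwise the policy applies. */
static enum sim_target sim_chain_eval(const struct sim_chain *c,
				      const struct sim_pkt *pkt)
{
	size_t i;

	assert(c != NULL && pkt != NULL);
	for (i = 0; i < c->nrules; i++)
		if (sim_rule_matches(&c->rules[i], pkt))
			return c->rules[i].target;
	return c->policy;
}

/* Two example matches, comparable to "-p" and "--dport". */
static int sim_match_proto(const struct sim_pkt *pkt, const void *info)
{
	return pkt->proto == *(const unsigned char *)info;
}

static int sim_match_dport(const struct sim_pkt *pkt, const void *info)
{
	return pkt->dport == *(const unsigned short *)info;
}
```

A chain with one rule (protocol 6 and destination port 25, target accept) and a drop policy would accept an SMTP packet and drop everything else.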
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
IP Tables
Basic iptables commands
To build a complete iptables command, we must specify
which table to work with
which chain in this table to use
an operation (insert, add, delete, modify)
one or more matches (optional)
a target
The syntax is
%font "typewriter"
%size 3
iptables -t table -Operation chain -j target match(es)
%font "standard"
%size 5
Example:
%font "typewriter"
%size 3
iptables -t filter -A INPUT -j ACCEPT -p tcp --dport smtp
%font "standard"
%size 5
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
IP Tables
Matches
Basic matches
-p protocol (tcp/udp/icmp/...)
-s source address (ip/mask)
-d destination address (ip/mask)
-i incoming interface
-o outgoing interface
Match extensions (examples)
tcp/udp TCP/udp source/destination port
icmp ICMP code/type
ah/esp AH/ESP SPI match
mac source MAC address
mark nfmark
length match on length of packet
limit rate limiting (n packets per timeframe)
owner owner uid of the socket sending the packet
tos TOS field of IP header
ttl TTL field of IP header
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
IP Tables
Targets
very dependent on the particular table.
Table specific targets will be discussed later
Generic Targets, always available
ACCEPT accept packet within chain
DROP silently drop packet
QUEUE enqueue packet to userspace
LOG log packet via syslog
ULOG log packet via ulogd
RETURN return to previous (calling) chain
foobar jump to user defined chain
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Packet Filtering
Overview
Implemented as 'filter' table
Registers with three netfilter hooks
NF_IP_LOCAL_IN (packets destined for the local host)
NF_IP_FORWARD (packets forwarded by local host)
NF_IP_LOCAL_OUT (packets from the local host)
Each of the three hooks has attached one chain (INPUT, FORWARD, OUTPUT)
Every packet passes exactly one of the three chains. Note that this is very different compared to the old 2.2.x ipchains behaviour.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Packet Filtering
Targets available within 'filter' table
Builtin Targets to be used in filter table
ACCEPT accept the packet
DROP silently drop the packet
QUEUE enqueue packet to userspace
RETURN return to previous (calling) chain
foobar user defined chain
Targets implemented as loadable modules
REJECT drop the packet but inform sender
MIRROR change source/destination IP and resend
LOG log via syslog
ULOG log via userspace
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Connection Tracking Subsystem
Connection tracking...
implemented separately from NAT
enables stateful filtering
implementation
hooks into NF_IP_PRE_ROUTING to track packets
hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
protocol modules (currently TCP/UDP/ICMP)
application helpers (currently FTP, IRC, H.323, talk, SNMP)
divides packets into the following four categories
NEW - would establish new connection
ESTABLISHED - part of already established connection
RELATED - is related to established connection
INVALID - (multicast, errors...)
does _NOT_ filter packets itself
can be utilized by iptables using the 'state' match
is used by NAT Subsystem
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Common structures
struct ip_conntrack_tuple, representing unidirectional flow
layer 3 src + dst
layer 4 protocol
layer 4 src + dst
connections represented as struct ip_conntrack
original tuple
reply tuple
timeout
l4 state private data
app helper
app helper private data
expected connections
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Flow of events for new packet
packet enters NF_IP_PRE_ROUTING
tuple is derived from packet
lookup conntrack hash table with hash(tuple) -> fails
new ip_conntrack is allocated
fill in original and reply == inverted(original) tuple
initialize timer
assign app helper if applicable
see if we've been expected -> fails
call layer 4 helper 'new' function
...
packet enters NF_IP_POST_ROUTING
do hashtable lookup for packet -> fails
place struct ip_conntrack in hashtable
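The flow above (lookup miss, allocation with an inverted reply tuple, insertion only after the packet has passed filtering) can be modelled in userspace roughly as follows. This is a sketch with assumptions: all sim_ names are invented, the hash is deliberately symmetric so that both directions land in the same bucket, and a confirmed flag stands in for the second hash lookup at NF_IP_POST_ROUTING. The real ip_conntrack code differs in all of these details.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* One direction of a flow; a simplified stand-in for ip_conntrack_tuple. */
struct sim_tuple {
	uint32_t src_ip, dst_ip;	/* layer 3 src + dst */
	uint8_t proto;			/* layer 4 protocol */
	uint16_t src_port, dst_port;	/* layer 4 src + dst */
};

/* A tracked connection: both directions plus bookkeeping. */
struct sim_conn {
	struct sim_tuple orig, reply;
	int confirmed;			/* set once placed in the table */
	struct sim_conn *next;		/* hash bucket chaining */
};

#define SIM_BUCKETS 64
static struct sim_conn *sim_table[SIM_BUCKETS];

/* The reply tuple is the original with source and destination swapped. */
static struct sim_tuple sim_invert(const struct sim_tuple *t)
{
	struct sim_tuple r = { t->dst_ip, t->src_ip, t->proto,
			       t->dst_port, t->src_port };
	return r;
}

/* Symmetric toy hash: a tuple and its inverse share one bucket. */
static unsigned int sim_hash(const struct sim_tuple *t)
{
	return (t->src_ip ^ t->dst_ip ^ t->proto ^
		t->src_port ^ t->dst_port) % SIM_BUCKETS;
}

static int sim_tuple_equal(const struct sim_tuple *a, const struct sim_tuple *b)
{
	return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
	       a->proto == b->proto &&
	       a->src_port == b->src_port && a->dst_port == b->dst_port;
}

/* Find a confirmed connection by either of its two tuples. */
static struct sim_conn *sim_lookup(const struct sim_tuple *t)
{
	struct sim_conn *c;

	for (c = sim_table[sim_hash(t)]; c; c = c->next)
		if (sim_tuple_equal(&c->orig, t) || sim_tuple_equal(&c->reply, t))
			return c;
	return NULL;
}

/* PRE_ROUTING step: look up, or allocate a not yet confirmed entry. */
static struct sim_conn *sim_pre_routing(const struct sim_tuple *t)
{
	struct sim_conn *c = sim_lookup(t);

	if (c)
		return c;
	c = calloc(1, sizeof(*c));
	if (!c)
		return NULL;
	c->orig = *t;
	c->reply = sim_invert(t);
	return c;
}

/* POST_ROUTING step: the packet survived filtering, so insert the entry. */
static void sim_post_routing(struct sim_conn *c)
{
	unsigned int h;

	if (c->confirmed)
		return;
	h = sim_hash(&c->orig);
	c->next = sim_table[h];
	sim_table[h] = c;
	c->confirmed = 1;
}
```

Deferring the insertion to the last hook means that a packet dropped by the filter table never leaves a connection entry behind, which is the point of the two-step flow on this slide.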
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
HA for netfilter/iptables
Connection Tracking Subsystem
Flow of events for packet part of existing connection
packet enters NF_IP_PRE_ROUTING
tuple is derived from packet
lookup conntrack hash table with hash(tuple)
associate conntrack entry with skb->nfct
call l4 protocol helper 'packet' function
do l4 state tracking
update timeouts as needed [e.g. TCP TIME_WAIT,...]
...
packet enters NF_IP_POST_ROUTING
do hashtable lookup for packet -> succeeds
do nothing else
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Network Address Translation
Overview
Previous Linux Kernels only implemented one special case of NAT: Masquerading
Linux 2.4.x can do any kind of NAT.
NAT subsystem implemented on top of netfilter, iptables and conntrack
NAT subsystem registers with all five netfilter hooks
'nat' Table registers chains PREROUTING, POSTROUTING and OUTPUT
Following targets available within 'nat' Table
SNAT changes the packet's source while passing NF_IP_POST_ROUTING
DNAT changes the packet's destination while passing NF_IP_PRE_ROUTING
MASQUERADE is a special case of SNAT
REDIRECT is a special case of DNAT
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Network Address Translation
flow of events for NEW packet:
packet enters NF_IP_PRE_ROUTING after conntrack
resolve conntrack entry for packet
if (expectfn of helper) call it
else iterate over rules in PREROUTING chain of nat table
save respective NAT mappings in conntrack
apply the NAT mappings to the packet
call NAT helper function, if there is one for this proto
...
packet enters NF_IP_POST_ROUTING
resolve conntrack entry for packet
iterate over rules in POSTROUTING chain of nat table
save respective NAT mappings in conntrack
apply the NAT mappings to the packet
call NAT helper function, if there is one for this proto
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Network Address Translation
flow of events for ESTABLISHED packets:
packet enters NF_IP_PRE_ROUTING after conntrack
resolve conntrack entry for packet
apply the NAT mappings (read from conntrack entry) to the packet
call NAT helper function, if there is one for this proto
...
packet enters NF_IP_POST_ROUTING
resolve conntrack entry for packet
apply the NAT mappings (read from conntrack entry) to the packet
call NAT helper function, if there is one for this proto
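For an established connection no rule is consulted any more; the mapping stored with the connection is simply applied again. A minimal userspace sketch for the source NAT case follows. The sim_ types and functions are hypothetical and only model the idea: the source is rewritten in the original direction and the same mapping is undone on the destination of reply packets.

```c
#include <assert.h>
#include <stdint.h>

struct sim_addr {
	uint32_t ip;
	uint16_t port;
};

struct sim_pkt {
	struct sim_addr src, dst;
};

/* Per-connection SNAT state, as it might be cached in the conntrack entry. */
struct sim_snat {
	struct sim_addr orig_src;	/* source before translation */
	struct sim_addr new_src;	/* source after translation */
};

static int sim_addr_equal(struct sim_addr a, struct sim_addr b)
{
	return a.ip == b.ip && a.port == b.port;
}

/* Original direction (POST_ROUTING): rewrite the source address. */
static void sim_snat_out(const struct sim_snat *m, struct sim_pkt *pkt)
{
	assert(sim_addr_equal(pkt->src, m->orig_src));
	pkt->src = m->new_src;
}

/* Reply direction (PRE_ROUTING): undo the mapping on the destination. */
static void sim_snat_in(const struct sim_snat *m, struct sim_pkt *pkt)
{
	assert(sim_addr_equal(pkt->dst, m->new_src));
	pkt->dst = m->orig_src;
}
```

With a mapping from 10.0.0.5:40000 to 1.2.3.4:40000, an outgoing packet leaves with source 1.2.3.4 and the reply addressed to 1.2.3.4:40000 is delivered to 10.0.0.5:40000 again.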
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Network Address Translation
Source NAT
SNAT Example:
%font "typewriter"
%size 3
iptables -t nat -A POSTROUTING -j SNAT --to-source 1.2.3.4 -s 10.0.0.0/8
%font "standard"
%size 4
MASQUERADE Example:
%font "typewriter"
%size 3
iptables -t nat -A POSTROUTING -j MASQUERADE -o ppp0
%font "standard"
%size 5
Destination NAT
DNAT example
%font "typewriter"
%size 3
iptables -t nat -A PREROUTING -j DNAT --to-destination 1.2.3.4:8080 -p tcp --dport 80 -i eth1
%font "standard"
%size 4
REDIRECT example
%font "typewriter"
%size 3
iptables -t nat -A PREROUTING -j REDIRECT --to-port 3128 -i eth1 -p tcp --dport 80
%font "standard"
%size 5
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Packet Mangling
Purpose of mangle table
packet manipulation except address manipulation
Integration with netfilter
'mangle' table hooks in all five netfilter hooks
priority: after conntrack
Targets specific to the 'mangle' table:
DSCP - manipulate DSCP field
IPV4OPTSSTRIP - strip IPv4 options
MARK - change the nfmark field of the skb
TCPMSS - set TCP MSS option
TOS - manipulate the TOS bits
TTL - set / increase / decrease TTL field
Simple example:
%font "typewriter"
%size 3
iptables -t mangle -A PREROUTING -j MARK --set-mark 10 -p tcp --dport 80
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Advanced Netfilter concepts
%size 4
Userspace logging
flexible replacement for old syslog-based logging
packets to userspace via multicast netlink sockets
easy-to-use library (libipulog)
plugin-extensible userspace logging daemon (ulogd)
Can even be used to directly log into MySQL
Queuing
reliable asynchronous packet handling
packets to userspace via unicast netlink socket
easy-to-use library (libipq)
provides Perl bindings
experimental queue multiplex daemon (ipqmpd)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Current Development and Future
Netfilter (although it has proved very stable) is still a work in progress.
Areas of current development
infrastructure for conntrack manipulation from userspace
failover of stateful firewalls
making iptables layer3 independent (pkttables)
new userspace library (libiptables) to hide plugins from apps
more matches and targets for advanced functions (pool, hashslot)
more conntrack and NAT modules (RPC, SNMP, SMB, ...)
better IPv6 support (conntrack, more matches / targets)
conntrack hash optimizations
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Developing netfilter/iptables extensions
Developing a netfilter module
Netfilter modules are very low-layer
Get called for every packet passing the hook in this l3prot
Examples of netfilter modules are: ip_tables, ip_conntrack, iptable_nat
API for netfilter <linux/netfilter.h>:
%font "typewriter"
nf_register_hook(struct nf_hook_ops *reg)
nf_unregister_hook(struct nf_hook_ops *reg)
struct nf_hook_ops:
struct list_head list; /* list header {NULL,NULL} */
nf_hookfn *hook; /* the callback function */
int pf; /* protocol family */
int hooknum; /* hook to register with */
int priority; /* priority, determines order */
%font "standard"
Example code see "nf_workshop.c"
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Developing netfilter/iptables extensions
Developing an ip_tables match module
ip_tables modules are at a high layer
Get called for every packet iterating a rule with this match
Examples of iptables modules are: ipt_ttl, ipt_tos, ipt_tcpmss
API for iptables matches <linux/netfilter_ipv4/ip_tables.h>:
%font "typewriter"
ipt_register_match(struct ipt_match *match)
ipt_unregister_match(struct ipt_match *match)
struct ipt_match:
struct list_head list; /* list header {NULL,NULL} */
const char name[]; /* name of the match */
int (*match); /* called when pkt is matched */
int (*checkentry); /* called when entry inserted */
void (*destroy); /* called when entry deleted */
struct module *me; /* set to THIS_MODULE */
%font "standard"
Example code see "ipt_workshop.c"
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Developing netfilter/iptables extensions
Developing an iptables match module
Something has to parse the command line options for ipt_workshop.c
Solution: libipt_workshop.c as iptables plugin
API for iptables-command plugins <iptables.h>:
%font "typewriter"
register_match(struct iptables_match *match)
struct iptables_match:
struct iptables_match *next; /* next one */
ipt_chainlabel name; /* name */
const char *version; /* version */
size_t size; /* size of match data */
size_t userspacesize; /* size for userspace */
void (*help); /* print help message */
void (*init); /* init the matchinfo */
int (*parse); /* parse getopt chars */
void (*final_check); /* consistency check */
void (*print); /* print (iptables -L) */
void (*save); /* iptables-save */
struct option *extra_opts; /* getopt-style opts */
%font "standard"
Example code see "libipt_workshop.c"
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Future of Linux packet filtering
Thanks
The slides and the accompanying paper of this presentation are available at http://www.gnumonks.org/
The netfilter homepage http://www.netfilter.org/
Thanks to
the BBS people, Z-Netz, FIDO, ...
for heavily increasing my computer usage in 1992
KNF
for bringing me in touch with the internet as early as 1994
for providing a playground for technical people
for telling me about the existence of Linux!
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
for implementing (one of?) the world's best TCP/IP stacks
Paul 'Rusty' Russell
for starting the netfilter/iptables project
for trusting me to maintain it today
Astaro AG
for sponsoring parts of my netfilter work
for sponsoring my travel costs to 5CLT

View File

@ -0,0 +1,57 @@
#include <linux/module.h>
#include <linux/config.h>
#include <linux/skbuff.h>
#include <linux/ip.h>
#include <linux/netfilter.h>
#include <linux/netfilter_ipv4.h>

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Harald Welte <laforge@netfilter.org>");
MODULE_DESCRIPTION("5CLT workshop module");

/* hook function: called for every IPv4 packet passing the hook */
static unsigned int
workshop_fn(unsigned int hooknum,
	    struct sk_buff **pskb,
	    const struct net_device *in,
	    const struct net_device *out,
	    int (*okfn)(struct sk_buff *))
{
	struct iphdr *iph = (*pskb)->nh.iph;

	/* do whatever we want to do */
	printk(KERN_NOTICE "packet from %u.%u.%u.%u received\n",
	       NIPQUAD(iph->saddr));

	/* let the packet continue its traversal */
	return NF_ACCEPT;
}

static struct nf_hook_ops workshop_ops = {
	.list		= { .prev = NULL, .next = NULL },
	.hook		= &workshop_fn,
	.pf		= PF_INET,
	.hooknum	= NF_IP_PRE_ROUTING,
	.priority	= NF_IP_PRI_LAST-1	/* run after almost everything else */
};

static int __init init(void)
{
	int ret;

	ret = nf_register_hook(&workshop_ops);
	if (ret < 0) {
		printk(KERN_ERR "something went wrong while registering\n");
		return ret;
	}

	printk(KERN_DEBUG "workshop netfilter module successfully loaded\n");
	return ret;
}

static void __exit fini(void)
{
	nf_unregister_hook(&workshop_ops);
}

module_init(init);
module_exit(fini);

View File

@ -0,0 +1,54 @@
#include <linux/module.h>
#include <linux/skbuff.h>
#include <linux/netfilter_ipv4/ip_tables.h>
#include <linux/netfilter_ipv4/ipt_workshop.h>
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Harald Welte <laforge@netfilter.org>");
MODULE_DESCRIPTION("OLS2003 workshop iptables module");
static int ws_match(const struct sk_buff *skb, const struct net_device *in,
const struct net_device *out, const void *matchinfo,
int offset, const void *hdr, u_int16_t datalen,
int *hotdrop)
{
const struct ipt_ws_info *info = matchinfo;
const struct iphdr *iph = skb->nh.iph;
if (iph->ttl == info->ttl)
return 1;
return 0;
}
static int ws_checkentry(const char *tablename, const struct ipt_ip *ip,
void *matchinfo, unsigned int matchsize,
unsigned int hook_mask)
{
if (matchsize != IPT_ALIGN(sizeof(struct ipt_ws_info)))
return 0;
return 1;
}
/* named ws_match_ops so it does not clash with the ws_match() function above */
static struct ipt_match ws_match_ops = {
.list = { .prev = NULL, .next = NULL },
.name = "workshop",
.match = &ws_match,
.checkentry = &ws_checkentry,
.destroy = NULL,
.me = THIS_MODULE
};
static int __init init(void)
{
return ipt_register_match(&ws_match_ops);
}
static void __exit fini(void)
{
ipt_unregister_match(&ws_match_ops);
}
module_init(init);
module_exit(fini);

View File

@ -0,0 +1,6 @@
#ifndef _IPT_WORKSHOP_H
#define _IPT_WORKSHOP_H
struct ipt_ws_info {
u_int8_t ttl;
};
#endif

View File

@ -0,0 +1,102 @@
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <getopt.h>
#include <iptables.h> /* register_match(), exit_error(), check_inverse() */
#include <linux/netfilter_ipv4/ip_tables.h>
#include <linux/netfilter_ipv4/ipt_workshop.h>
static void help(void)
{
printf(
"workshop match v%s options:\n"
" --ttl TTL value\n"
, IPTABLES_VERSION);
}
static void init(struct ipt_entry_match *m, unsigned int *nfcache)
{
/* caching not implemented yet */
*nfcache |= NFC_UNKNOWN;
}
static int parse(int c, char **argv, int invert, unsigned int *flags,
const struct ipt_entry *entry, unsigned int *nfcache,
struct ipt_entry_match **match)
{
struct ipt_ws_info *info = (struct ipt_ws_info *) (*match)->data;
check_inverse(optarg, &invert, &optind, 0);
if (invert)
exit_error(PARAMETER_PROBLEM, "invert not supported");
if (*flags)
exit_error(PARAMETER_PROBLEM,
"workshop: can't specify parameter twice");
if (!optarg)
exit_error(PARAMETER_PROBLEM,
"workshop: you must specify a value");
switch (c) {
case 'z':
info->ttl = atoi(optarg);
/* FIXME: range 0-255 */
*flags = 1;
break;
default:
return 0;
}
return 1;
}
static void final_check(unsigned int flags)
{
if (!flags)
exit_error(PARAMETER_PROBLEM,
"workshop match: you must specify ttl");
}
static void print(const struct ipt_ip *ip,
const struct ipt_entry_match *match,
int numeric)
{
const struct ipt_ws_info *info = (struct ipt_ws_info *) match->data;
printf("workshop match TTL=%u ", info->ttl);
}
static void save(const struct ipt_ip *ip,
const struct ipt_entry_match *match)
{
const struct ipt_ws_info *info = (struct ipt_ws_info *) match->data;
printf("--ttl %u ", info->ttl);
}
static struct option opts[] = {
{ "ttl", 1, 0, 'z' },
{ 0 }
};
static struct iptables_match ws = {
.next = NULL,
.name = "workshop",
.version = IPTABLES_VERSION,
.size = IPT_ALIGN(sizeof(struct ipt_ws_info)),
.userspacesize = IPT_ALIGN(sizeof(struct ipt_ws_info)),
.help = &help,
.init = &init,
.parse = &parse,
.final_check = &final_check,
.print = &print,
.save = &save,
.extra_opts = opts
};
void _init(void)
{
register_match(&ws);
}

View File

@ -0,0 +1,615 @@
%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
#%deffont "typewriter" tfont "MONOTYPE.TTF"
%page
%nodefault
%back "blue"
%center
%size 7
Developing netfilter/iptables
extensions
%center
%size 4
by
Harald Welte <laforge@netfilter.org>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Contents
Introduction
The netfilter/iptables architecture
Netfilter hooks in protocol stacks
Packet selection based on IP Tables
The Connection Tracking Subsystem
The NAT Subsystem based on netfilter + iptables
Packet filtering using the 'filter' table
Packet mangling using the 'mangle' table
Advanced netfilter concepts
Current development and Future
Developing a netfilter module
Developing a new iptables match
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Netfilter Hooks
What is netfilter?
System of callback functions within network stack
Callback function to be called for every packet traversing certain point (hook) within network stack
Protocol independent framework
Hooks in layer 3 stacks (IPv4, IPv6, DECnet, ARP)
Multiple kernel modules can register with each of the hooks
Asynchronous packet handling in userspace (ip_queue)
Traditional packet filtering, NAT, ... is implemented on top of this framework
Can be used for other stuff interfacing with the core network stack, like DECnet routing daemon.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Netfilter Hooks
Netfilter architecture in IPv4
%font "typewriter"
%size 3
--->[1]--->[ROUTE]--->[3]--->[4]--->
| ^
| |
| [ROUTE]
v |
[2] [5]
| ^
| |
v |
%font "standard"
1=NF_IP_PRE_ROUTING
2=NF_IP_LOCAL_IN
3=NF_IP_FORWARD
4=NF_IP_POST_ROUTING
5=NF_IP_LOCAL_OUT
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Netfilter Hooks
Netfilter Hooks
Any kernel module may register a callback function at any of the hooks
The module has to return one of the following constants
NF_ACCEPT continue traversal as normal
NF_DROP drop the packet, do not continue
NF_STOLEN I've taken over the packet, do not continue
NF_QUEUE enqueue packet to userspace
NF_REPEAT call this hook again
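The effect of the return values above can be sketched in plain userspace C. This is only a toy model, not the kernel's traversal code: the names run_hooks, hook_fn and the V_* constants are hypothetical, their numeric values are not taken from the kernel headers, and a "packet" is reduced to a single int.

```c
#include <stddef.h>

/* Verdict names mirror the slide; the values are illustrative only. */
enum verdict { V_DROP, V_ACCEPT, V_STOLEN, V_QUEUE, V_REPEAT };

/* Hypothetical callback type: a "packet" is just an int here. */
typedef enum verdict (*hook_fn)(int *pkt);

/* Run all callbacks registered at one hook, in registration order. */
enum verdict run_hooks(const hook_fn *hooks, size_t n, int *pkt)
{
    size_t i = 0;

    while (i < n) {
        enum verdict v = hooks[i](pkt);

        if (v == V_REPEAT)
            continue;       /* call the same callback again */
        if (v != V_ACCEPT)
            return v;       /* DROP, STOLEN and QUEUE end the traversal */
        i++;                /* ACCEPT: go on to the next callback */
    }
    return V_ACCEPT;        /* every callback accepted the packet */
}

/* Example callback: accepts everything. */
enum verdict accept_all(int *pkt) { (void)pkt; return V_ACCEPT; }

/* Example callback: drops odd "packets". */
enum verdict drop_odd(int *pkt) { return (*pkt & 1) ? V_DROP : V_ACCEPT; }

/* Example callback: fixes up a negative value, then asks to be re-run. */
enum verdict make_positive(int *pkt)
{
    if (*pkt < 0) {
        *pkt = -*pkt;
        return V_REPEAT;
    }
    return V_ACCEPT;
}
```

In this sketch only an accepted packet reaches the next callback; any other verdict except the repeat request stops the traversal at that point.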
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Developing netfilter/iptables extensions
Developing a netfilter module
Netfilter modules are very low-layer
Get called for every packet passing the hook in this l3prot
Examples of netfilter modules are: ip_tables, ip_conntrack, iptable_nat
%font "typewriter"
%size 2
#include <linux/netfilter.h>
%size 2
nf_register_hook(struct nf_hook_ops *reg)
%size 2
nf_unregister_hook(struct nf_hook_ops *reg)
%size 2
struct nf_hook_ops:
%size 2
struct list_head list; /* list header */
%size 2
nf_hookfn *hook; /* the callback function */
%size 2
int pf; /* protocol family */
%size 2
int hooknum; /* hook to register with */
%size 2
int priority; /* priority (ordering) */
%font "standard"
Example code see "nf_workshop.c"
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
IP tables
Packet selection using IP tables
The kernel provides generic IP tables support
Each kernel module may create its own IP table
The three major parts of 2.4 firewalling subsystem are implemented using IP tables
Packet filtering table 'filter'
NAT table 'nat'
Packet mangling table 'mangle'
Could potentially be used for other stuff, e.g. IPsec SPDB
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
IP Tables
Managing chains and tables
An IP table consists of multiple chains
A chain consists of a list of rules
Every single rule in a chain consists of
match[es] (rule executed if all matches true)
target (what to do if the rule is matched)
%size 4
matches and targets can either be builtin or implemented as kernel modules
%size 5
The userspace tool iptables is used to control IP tables
handles all different kinds of IP tables
supports a plugin/shlib interface for target/match specific options
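The table/chain/rule relationship described above can be sketched as a small userspace C model. It is a simplification under stated assumptions: struct pkt, struct match, struct rule, chain_eval and the fallback parameter are hypothetical names, every target is treated as terminating, and nothing here reflects the real ip_tables data layout.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy packet: only the fields the example matches look at. */
struct pkt { unsigned char proto; unsigned short dport; };

/* Hypothetical match callback: true if the packet satisfies the match. */
typedef bool (*match_fn)(const struct pkt *p, const void *info);

struct match { match_fn fn; const void *info; };

struct rule {
    const struct match *matches;  /* zero or more matches */
    size_t nmatches;
    int target;                   /* illustrative verdict code */
};

/* A rule applies only if ALL of its matches are true. */
bool rule_matches(const struct rule *r, const struct pkt *p)
{
    for (size_t i = 0; i < r->nmatches; i++)
        if (!r->matches[i].fn(p, r->matches[i].info))
            return false;
    return true;
}

/* Walk the chain in order; the first matching rule decides.
 * 'fallback' is returned when no rule matches (an assumption of this sketch). */
int chain_eval(const struct rule *rules, size_t n,
               const struct pkt *p, int fallback)
{
    for (size_t i = 0; i < n; i++)
        if (rule_matches(&rules[i], p))
            return rules[i].target;
    return fallback;
}

/* Example matches: protocol number and destination port. */
bool match_proto(const struct pkt *p, const void *info)
{
    return p->proto == *(const unsigned char *)info;
}

bool match_dport(const struct pkt *p, const void *info)
{
    return p->dport == *(const unsigned short *)info;
}
```

A rule built from match_proto (6) and match_dport (25) corresponds roughly to "-p tcp --dport smtp"; a packet that fails either match falls through to the next rule.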
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
IP Tables
Basic iptables commands
To build a complete iptables command, we must specify
which table to work with
which chain in this table to use
an operation (insert, add, delete, modify)
one or more matches (optional)
a target
The syntax is
%font "typewriter"
%size 3
iptables -t table -Operation chain -j target match(es)
%font "standard"
%size 5
Example:
%font "typewriter"
%size 3
iptables -t filter -A INPUT -j ACCEPT -p tcp --dport smtp
%font "standard"
%size 5
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
IP Tables
Matches
Basic matches
-p protocol (tcp/udp/icmp/...)
-s source address (ip/mask)
-d destination address (ip/mask)
-i incoming interface
-o outgoing interface
Match extensions (examples)
tcp/udp TCP/UDP source/destination port
icmp ICMP code/type
ah/esp AH/ESP SPI match
mac source MAC address
mark nfmark
length match on length of packet
limit rate limiting (n packets per timeframe)
owner owner uid of the socket sending the packet
tos TOS field of IP header
ttl TTL field of IP header
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
IP Tables
Targets
very dependent on the particular table.
Table specific targets will be discussed later
Generic Targets, always available
ACCEPT accept packet within chain
DROP silently drop packet
QUEUE enqueue packet to userspace
LOG log packet via syslog
ULOG log packet via ulogd
RETURN return to previous (calling) chain
foobar jump to user defined chain
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Packet Filtering
Overview
Implemented as 'filter' table
Registers with three netfilter hooks
NF_IP_LOCAL_IN (packets destined for the local host)
NF_IP_FORWARD (packets forwarded by local host)
NF_IP_LOCAL_OUT (packets from the local host)
Each of the three hooks has attached one chain (INPUT, FORWARD, OUTPUT)
Every packet passes exactly one of the three chains. Note that this is very different compared to the old 2.2.x ipchains behaviour.
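The "exactly one chain" rule above can be written down as a tiny decision function. This is a sketch of the slide's model only; filter_chain_for and the enum names are hypothetical, and traffic looped back to the local host is deliberately ignored.

```c
#include <stdbool.h>

enum chain { CHAIN_INPUT, CHAIN_FORWARD, CHAIN_OUTPUT };

/* Which 'filter' chain a packet traverses, following the slide's model. */
enum chain filter_chain_for(bool locally_generated, bool dst_is_local)
{
    if (locally_generated)
        return CHAIN_OUTPUT;   /* NF_IP_LOCAL_OUT */
    if (dst_is_local)
        return CHAIN_INPUT;    /* NF_IP_LOCAL_IN */
    return CHAIN_FORWARD;      /* NF_IP_FORWARD */
}
```

A forwarded packet therefore never sees INPUT or OUTPUT rules in this model.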
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Packet Filtering
Targets available within 'filter' table
Builtin Targets to be used in filter table
ACCEPT accept the packet
DROP silently drop the packet
QUEUE enqueue packet to userspace
RETURN return to previous (calling) chain
foobar user defined chain
Targets implemented as loadable modules
REJECT drop the packet but inform sender
MIRROR change source/destination IP and resend
LOG log via syslog
ULOG log via userspace
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Developing netfilter/iptables extensions
Developing an ip_tables match module
ip_tables modules are at a high layer
Get called for every packet iterating a rule with this match
Examples of iptables modules are: ipt_ttl, ipt_tos, ipt_tcpmss
%font "typewriter"
%size 2
#include <linux/netfilter_ipv4/ip_tables.h>
%size 2
ipt_register_match(struct ipt_match *match)
%size 2
ipt_unregister_match(struct ipt_match *match)
%size 2
struct ipt_match:
%size 2
struct list_head list; /* list header {NULL,NULL} */
%size 2
const char name[]; /* name of the match */
%size 2
int (*match); /* called when pkt is matched */
%size 2
int (*checkentry); /* called when entry inserted */
%size 2
void (*destroy); /* called when entry deleted */
%size 2
struct module *me; /* set to THIS_MODULE */
%font "standard"
Example code see "ipt_workshop.c"
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Developing netfilter/iptables extensions
Developing an iptables match module
Something has to parse the commandline options for ipt_workshop.c
Solution: libipt_workshop.c as iptables plugin
%font "typewriter"
%size 2
#include <iptables.h>
%size 2
register_match(struct iptables_match)
%size 2
struct iptables_match:
%size 2
struct iptables_match *next; /* next one */
%size 2
ipt_chainlabel name; /* name */
%size 2
const char *version; /* version */
%size 2
size_t size; /* size of match data */
%size 2
size_t userspacesize; /* size for userspace */
%size 2
void (*help); /* print help message */
%size 2
void (*init); /* init the matchinfo */
%size 2
int (*parse); /* parse getopt chars */
%size 2
void (*final_check); /* consistency check */
%size 2
void (*print); /* print (iptables -L) */
%size 2
void (*save); /* iptables-save */
%size 2
struct option extra_opts; /* getopt-style opts */
%font "standard"
Example code see "libipt_workshop.c"
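One thing a parse callback for an 8-bit value such as a TTL has to take care of is range checking. A minimal sketch using only the C standard library follows; parse_ttl is a hypothetical helper, not part of the iptables plugin API.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Convert a decimal string to a TTL value (0-255).
 * Returns 0 on success, -1 if the string is not a number in range. */
int parse_ttl(const char *arg, uint8_t *ttl)
{
    char *end;
    long v;

    if (arg == NULL || *arg == '\0')
        return -1;

    errno = 0;
    v = strtol(arg, &end, 10);

    /* reject overflow, trailing garbage and out-of-range values */
    if (errno != 0 || *end != '\0' || v < 0 || v > 255)
        return -1;

    *ttl = (uint8_t)v;
    return 0;
}
```

A plugin could call such a helper from its parse function and report a parameter problem when it returns -1.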
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Connection Tracking Subsystem
Connection tracking...
implemented separately from NAT
enables stateful filtering
implementation
hooks into NF_IP_PRE_ROUTING to track packets
hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
protocol modules (currently TCP/UDP/ICMP)
application helpers (currently FTP, IRC, H.323, talk, SNMP)
divides packets into the following four categories
NEW - would establish new connection
ESTABLISHED - part of already established connection
RELATED - is related to established connection
INVALID - (multicast, errors...)
does _NOT_ filter packets itself
can be utilized by iptables using the 'state' match
is used by NAT Subsystem
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Connection Tracking Subsystem
Common structures
struct ip_conntrack_tuple, representing unidirectional flow
layer 3 src + dst
layer 4 protocol
layer 4 src + dst
connections represented as struct ip_conntrack
original tuple
reply tuple
timeout
l4 state private data
app helper
app helper private data
expected connections
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Connection Tracking Subsystem
Flow of events for new packet
packet enters NF_IP_PRE_ROUTING
tuple is derived from packet
lookup conntrack hash table with hash(tuple) -> fails
new ip_conntrack is allocated
fill in original and reply == inverted(original) tuple
initialize timer
assign app helper if applicable
see if we've been expected -> fails
call layer 4 helper 'new' function
...
packet enters NF_IP_POST_ROUTING
do hashtable lookup for packet -> fails
place struct ip_conntrack in hashtable
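The "reply == inverted(original)" step above can be illustrated with a simplified tuple. This is a userspace sketch under assumptions: struct tuple is a flattened stand-in for struct ip_conntrack_tuple, and tuple_invert, tuple_equal and tuple_hash are hypothetical helpers; in particular the hash is a toy and not the one the kernel uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for a unidirectional flow description. */
struct tuple {
    uint32_t src_ip, dst_ip;      /* layer 3 src + dst */
    uint16_t src_port, dst_port;  /* layer 4 src + dst */
    uint8_t proto;                /* layer 4 protocol */
};

/* The reply tuple is the original with source and destination swapped. */
struct tuple tuple_invert(const struct tuple *t)
{
    struct tuple r;

    r.src_ip = t->dst_ip;
    r.dst_ip = t->src_ip;
    r.src_port = t->dst_port;
    r.dst_port = t->src_port;
    r.proto = t->proto;
    return r;
}

bool tuple_equal(const struct tuple *a, const struct tuple *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port &&
           a->proto == b->proto;
}

/* Toy hash for picking a bucket; 'buckets' must be non-zero. */
unsigned int tuple_hash(const struct tuple *t, unsigned int buckets)
{
    uint32_t ports = ((uint32_t)t->src_port << 16) | t->dst_port;
    uint32_t h = t->src_ip ^ t->dst_ip ^ ports ^ t->proto;

    return h % buckets;
}
```

Because a connection stores both the original and the reply tuple, a packet travelling in either direction can be looked up by deriving its own tuple and comparing it against the stored ones.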
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Connection Tracking Subsystem
Flow of events for packet part of existing connection
packet enters NF_IP_PRE_ROUTING
tuple is derived from packet
lookup conntrack hash table with hash(tuple)
associate conntrack entry with skb->nfct
call l4 protocol helper 'packet' function
do l4 state tracking
update timeouts as needed [e.g. TCP TIME_WAIT,...]
...
packet enters NF_IP_POST_ROUTING
do hashtable lookup for packet -> succeeds
do nothing else
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Writing extensions for the conntrack subsystem
new l4 protocol modules are very rare
more common: application helpers for ftp,irc,h.323,quake,mms,...
API for conntrack helper modules:
%font "typewriter"
%size 2
#include <linux/netfilter_ipv4/ip_conntrack_helper.h>
%size 2
struct ip_conntrack_helper
%size 2
struct list_head list;
%size 2
const char *name;
%size 2
unsigned char flags;
%size 2
struct module *me;
%size 2
unsigned int max_expected;
%size 2
unsigned int timeout;
%size 2
struct ip_conntrack_tuple tuple;
%size 2
struct ip_conntrack_tuple mask;
%size 2
int (*help)(const struct iphdr *iph, size_t, struct ip_conntrack *, enum ip_conntrack_info);
%size 2
int ip_conntrack_helper_register(struct ip_conntrack_helper *);
%size 2
void ip_conntrack_helper_unregister(struct ip_conntrack_helper *);
%size 2
int ip_conntrack_expect_related(struct ip_conntrack *, struct ip_conntrack_expect *);
%size 2
int ip_conntrack_change_expect(struct ip_conntrack_expect *, struct ip_conntrack_tuple *);
%size 2
void ip_conntrack_unexpect_related(struct ip_conntrack_expect *);
%font "standard"
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Network Address Translation
Overview
Previous Linux Kernels only implemented one special case of NAT: Masquerading
Linux 2.4.x can do any kind of NAT.
NAT subsystem implemented on top of netfilter, iptables and conntrack
NAT subsystem registers with all five netfilter hooks
'nat' Table registers chains PREROUTING, POSTROUTING and OUTPUT
Following targets available within 'nat' Table
SNAT changes the packet's source while passing NF_IP_POST_ROUTING
DNAT changes the packet's destination while passing NF_IP_PRE_ROUTING
MASQUERADE is a special case of SNAT
REDIRECT is a special case of DNAT
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Network Address Translation
flow of events for NEW packet:
packet enters NF_IP_PRE_ROUTING after conntrack
resolve conntrack entry for packet
if (expectfn of helper) call it
else iterate over rules in PREROUTING chain of nat table
save respective NAT mappings in conntrack
apply the NAT mappings to the packet
call NAT helper function, if there is one for this proto
...
packet enters NF_IP_POST_ROUTING
resolve conntrack entry for packet
iterate over rules in POSTROUTING chain of nat table
save respective NAT mappings in conntrack
apply the NAT mappings to the packet
call NAT helper function, if there is one for this proto
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Network Address Translation
flow of events for ESTABLISHED packets:
packet enters NF_IP_PRE_ROUTING after conntrack
resolve conntrack entry for packet
apply the NAT mappings (read from conntrack entry) to the packet
call NAT helper function, if there is one for this proto
...
packet enters NF_IP_POST_ROUTING
resolve conntrack entry for packet
apply the NAT mappings (read from conntrack entry) to the packet
call NAT helper function, if there is one for this proto
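Applying a stored mapping to packets of an established connection can be sketched for the source NAT case. This is a toy userspace model: struct addr_pair, struct snat_mapping, snat_original and snat_reply are hypothetical names, ports are left out, and the real mapping lives in the conntrack entry rather than in a standalone struct.

```c
#include <stdint.h>

/* Toy packet header: only the two IPv4 addresses. */
struct addr_pair { uint32_t src, dst; };

/* Mapping chosen when the first packet of the connection hit an SNAT rule. */
struct snat_mapping { uint32_t orig_src, new_src; };

/* Original direction: rewrite the source address. */
void snat_original(struct addr_pair *p, const struct snat_mapping *m)
{
    if (p->src == m->orig_src)
        p->src = m->new_src;
}

/* Reply direction: undo the mapping on the destination address. */
void snat_reply(struct addr_pair *p, const struct snat_mapping *m)
{
    if (p->dst == m->new_src)
        p->dst = m->orig_src;
}
```

The rule set is consulted only for the first packet; later packets in either direction just reuse the saved mapping, which is why NAT depends on connection tracking.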
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Developing a NAT helper module
Network Address Translation
%font "typewriter"
%size 2
#include <linux/netfilter_ipv4/ip_nat_helper.h>
%size 2
struct ip_nat_helper
%size 2
struct list_head list;
%size 2
const char *name;
%size 2
unsigned char flags;
%size 2
struct module *me;
%size 2
struct ip_conntrack_tuple tuple;
%size 2
struct ip_conntrack_tuple mask;
%size 2
unsigned int (*help)(struct ip_conntrack *, struct ip_conntrack_expect *, struct ip_nat_info *, enum ip_conntrack_info, unsigned int hooknum, struct sk_buff **)
%size 2
unsigned int (*expect)(struct sk_buff **, unsigned int hooknum, struct ip_conntrack *, struct ip_nat_info *)
%size 2
int ip_nat_helper_register(struct ip_nat_helper *);
%size 2
void ip_nat_helper_unregister(struct ip_nat_helper *);
%size 2
int ip_nat_mangle_tcp_packet();
%size 2
int ip_nat_mangle_udp_packet();
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Advanced Netfilter concepts
%size 4
Userspace logging
flexible replacement for old syslog-based logging
packets to userspace via multicast netlink sockets
easy-to-use library (libipulog)
plugin-extensible userspace logging daemon (ulogd)
Can even be used to directly log into MySQL
Queuing
reliable asynchronous packet handling
packets to userspace via unicast netlink socket
easy-to-use library (libipq)
provides Perl bindings
experimental queue multiplex daemon (ipqmpd)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Developing netfilter/iptables extensions
Thanks
The slides and an accompanying paper of this presentation are available at http://www.gnumonks.org/
The netfilter homepage: http://www.netfilter.org/
Thanks to
the BBS people, Z-Netz, FIDO, ...
for heavily increasing my computer usage in 1992
KNF
for bringing me in touch with the internet as early as 1994
for providing a playground for technical people
for telling me about the existence of Linux!
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
for implementing (one of?) the world's best TCP/IP stacks
Paul 'Rusty' Russell
for starting the netfilter/iptables project
for trusting me to maintain it today
Astaro AG (http://www.astaro.com/)
for sponsoring parts of my netfilter work
for sponsoring my travel cost to OLS

View File

@ -0,0 +1,615 @@
%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
#%deffont "typewriter" tfont "MONOTYPE.TTF"
%page
%nodefault
%back "blue"
%center
%size 7
Developing netfilter/iptables
extensions
%center
%size 4
by
Harald Welte <laforge@netfilter.org>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Contents
Introduction
The netfilter/iptables architecture
Netfilter hooks in protocol stacks
Packet selection based on IP Tables
The Connection Tracking Subsystem
The NAT Subsystem based on netfilter + iptables
Packet filtering using the 'filter' table
Packet mangling using the 'mangle' table
Advanced netfilter concepts
Current development and Future
Developing a netfilter module
Developing a new iptables match
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Netfilter Hooks
What is netfilter?
System of callback functions within network stack
Callback function to be called for every packet traversing certain point (hook) within network stack
Protocol independent framework
Hooks in layer 3 stacks (IPv4, IPv6, DECnet, ARP)
Multiple kernel modules can register with each of the hooks
Asynchronous packet handling in userspace (ip_queue)
Traditional packet filtering, NAT, ... is implemented on top of this framework
Can be used for other stuff interfacing with the core network stack, like DECnet routing daemon.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Netfilter Hooks
Netfilter architecture in IPv4
%font "typewriter"
%size 3
--->[1]--->[ROUTE]--->[3]--->[4]--->
| ^
| |
| [ROUTE]
v |
[2] [5]
| ^
| |
v |
%font "standard"
1=NF_IP_PRE_ROUTING
2=NF_IP_LOCAL_IN
3=NF_IP_FORWARD
4=NF_IP_POST_ROUTING
5=NF_IP_LOCAL_OUT
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Netfilter Hooks
Netfilter Hooks
Any kernel module may register a callback function at any of the hooks
The module has to return one of the following constants
NF_ACCEPT continue traversal as normal
NF_DROP drop the packet, do not continue
NF_STOLEN I've taken over the packet, do not continue
NF_QUEUE enqueue packet to userspace
NF_REPEAT call this hook again
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Developing netfilter/iptables extensions
Developing a netfilter module
Netfilter modules are very low-layer
Get called for every packet passing the hook in this l3prot
Examples of netfilter modules are: ip_tables, ip_conntrack, iptable_nat
%font "typewriter"
%size 3
#include <linux/netfilter.h>
%size 3
nf_register_hook(struct nf_hook_ops *reg)
%size 3
nf_unregister_hook(struct nf_hook_ops *reg)
%size 3
struct nf_hook_ops:
%size 3
struct list_head list; /* list header */
%size 3
nf_hookfn *hook; /* the callback function */
%size 3
int pf; /* protocol family */
%size 3
int hooknum; /* hook to register with */
%size 3
int priority; /* priority (ordering) */
%font "standard"
Example code see "nf_workshop.c"
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
IP tables
Packet selection using IP tables
The kernel provides generic IP tables support
Each kernel module may create its own IP table
The three major parts of 2.4 firewalling subsystem are implemented using IP tables
Packet filtering table 'filter'
NAT table 'nat'
Packet mangling table 'mangle'
Could potentially be used for other stuff, e.g. IPsec SPDB
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
IP Tables
Managing chains and tables
An IP table consists of multiple chains
A chain consists of a list of rules
Every single rule in a chain consists of
match[es] (rule executed if all matches true)
target (what to do if the rule is matched)
%size 4
matches and targets can either be builtin or implemented as kernel modules
%size 5
The userspace tool iptables is used to control IP tables
handles all different kinds of IP tables
supports a plugin/shlib interface for target/match specific options
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
IP Tables
Basic iptables commands
To build a complete iptables command, we must specify
which table to work with
which chain in this table to use
an operation (insert, add, delete, modify)
one or more matches (optional)
a target
The syntax is
%font "typewriter"
%size 3
iptables -t table -Operation chain -j target match(es)
%font "standard"
%size 5
Example:
%font "typewriter"
%size 3
iptables -t filter -A INPUT -j ACCEPT -p tcp --dport smtp
%font "standard"
%size 5
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
IP Tables
Matches
Basic matches
-p protocol (tcp/udp/icmp/...)
-s source address (ip/mask)
-d destination address (ip/mask)
-i incoming interface
-o outgoing interface
Match extensions (examples)
tcp/udp TCP/UDP source/destination port
icmp ICMP code/type
ah/esp AH/ESP SPI match
mac source MAC address
mark nfmark
length match on length of packet
limit rate limiting (n packets per timeframe)
owner owner uid of the socket sending the packet
tos TOS field of IP header
ttl TTL field of IP header
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
IP Tables
Targets
very dependent on the particular table.
Table specific targets will be discussed later
Generic Targets, always available
ACCEPT accept packet within chain
DROP silently drop packet
QUEUE enqueue packet to userspace
LOG log packet via syslog
ULOG log packet via ulogd
RETURN return to previous (calling) chain
foobar jump to user defined chain
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Packet Filtering
Overview
Implemented as 'filter' table
Registers with three netfilter hooks
NF_IP_LOCAL_IN (packets destined for the local host)
NF_IP_FORWARD (packets forwarded by local host)
NF_IP_LOCAL_OUT (packets from the local host)
Each of the three hooks has attached one chain (INPUT, FORWARD, OUTPUT)
Every packet passes exactly one of the three chains. Note that this is very different compared to the old 2.2.x ipchains behaviour.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Packet Filtering
Targets available within 'filter' table
Builtin Targets to be used in filter table
ACCEPT accept the packet
DROP silently drop the packet
QUEUE enqueue packet to userspace
RETURN return to previous (calling) chain
foobar user defined chain
Targets implemented as loadable modules
REJECT drop the packet but inform sender
MIRROR change source/destination IP and resend
LOG log via syslog
ULOG log via userspace
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Developing netfilter/iptables extensions
Developing an ip_tables match module
ip_tables modules are at a high layer
Get called for every packet iterating a rule with this match
Examples of iptables modules are: ipt_ttl, ipt_tos, ipt_tcpmss
%font "typewriter"
%size 3
#include <linux/netfilter_ipv4/ip_tables.h>
%size 3
ipt_register_match(struct ipt_match *match)
%size 3
ipt_unregister_match(struct ipt_match *match)
%size 3
struct ipt_match:
%size 3
struct list_head list; /* list header {NULL,NULL} */
%size 3
const char name[]; /* name of the match */
%size 3
int (*match); /* called when pkt is matched */
%size 3
int (*checkentry); /* called when entry inserted */
%size 3
void (*destroy); /* called when entry deleted */
%size 3
struct module *me; /* set to THIS_MODULE */
%font "standard"
Example code see "ipt_workshop.c"
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Developing netfilter/iptables extensions
Developing an iptables match module
Something has to parse the commandline options for ipt_workshop.c
Solution: libipt_workshop.c as iptables plugin
%font "typewriter"
%size 3
#include <iptables.h>
%size 3
register_match(struct iptables_match)
%size 3
struct iptables_match:
%size 3
struct iptables_match *next; /* next one */
%size 3
ipt_chainlabel name; /* name */
%size 3
const char *version; /* version */
%size 3
size_t size; /* size of match data */
%size 3
size_t userspacesize; /* size for userspace */
%size 3
void (*help); /* print help message */
%size 3
void (*init); /* init the matchinfo */
%size 3
int (*parse); /* parse getopt chars */
%size 3
void (*final_check); /* consistency check */
%size 3
void (*print); /* print (iptables -L) */
%size 3
void (*save); /* iptables-save */
%size 3
struct option extra_opts; /* getopt-style opts */
%font "standard"
Example code see "libipt_workshop.c"
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Connection Tracking Subsystem
Connection tracking...
implemented separately from NAT
enables stateful filtering
implementation
hooks into NF_IP_PRE_ROUTING to track packets
hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
protocol modules (currently TCP/UDP/ICMP)
application helpers (currently FTP, IRC, H.323, talk, SNMP)
divides packets into the following four categories
NEW - would establish new connection
ESTABLISHED - part of already established connection
RELATED - is related to established connection
INVALID - (multicast, errors...)
does _NOT_ filter packets itself
can be utilized by iptables using the 'state' match
is used by NAT Subsystem
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Connection Tracking Subsystem
Common structures
struct ip_conntrack_tuple, representing unidirectional flow
layer 3 src + dst
layer 4 protocol
layer 4 src + dst
connections represented as struct ip_conntrack
original tuple
reply tuple
timeout
l4 state private data
app helper
app helper private data
expected connections
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Connection Tracking Subsystem
Flow of events for new packet
packet enters NF_IP_PRE_ROUTING
tuple is derived from packet
lookup conntrack hash table with hash(tuple) -> fails
new ip_conntrack is allocated
fill in original and reply == inverted(original) tuple
initialize timer
assign app helper if applicable
see if we've been expected -> fails
call layer 4 helper 'new' function
...
packet enters NF_IP_POST_ROUTING
do hashtable lookup for packet -> fails
place struct ip_conntrack in hashtable
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Connection Tracking Subsystem
Flow of events for packet part of existing connection
packet enters NF_IP_PRE_ROUTING
tuple is derived from packet
lookup conntrack hash table with hash(tuple)
associate conntrack entry with skb->nfct
call l4 protocol helper 'packet' function
do l4 state tracking
update timeouts as needed [e.g. TCP TIME_WAIT, ...]
...
packet enters NF_IP_POST_ROUTING
do hashtable lookup for packet -> succeeds
do nothing else
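The two flows (first packet of a new connection, later packet of an existing one) can be sketched in userspace C. This is a toy model under stated assumptions: every name (toy_table, toy_pre_routing, toy_post_routing, ...) is hypothetical, and only the original direction is hashed here to keep the example short.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_BUCKETS 64

struct toy_tuple { uint32_t src, dst; uint16_t sport, dport; uint8_t proto; };

struct toy_conn {
    struct toy_tuple original, reply;
    int confirmed;           /* set once the entry has been placed in the table */
    struct toy_conn *next;   /* bucket chain */
};

static struct toy_conn *toy_table[TOY_BUCKETS];

/* Toy hash over the tuple fields (not the kernel's hash function). */
unsigned int toy_hash(const struct toy_tuple *t)
{
    uint32_t h = t->src ^ (t->dst * 31u) ^
                 (((uint32_t)t->sport << 16) | t->dport) ^ t->proto;
    return h % TOY_BUCKETS;
}

int toy_equal(const struct toy_tuple *a, const struct toy_tuple *b)
{
    return a->src == b->src && a->dst == b->dst &&
           a->sport == b->sport && a->dport == b->dport &&
           a->proto == b->proto;
}

/* Hash table lookup; returns NULL for an unknown tuple. */
struct toy_conn *toy_lookup(const struct toy_tuple *t)
{
    struct toy_conn *c;

    for (c = toy_table[toy_hash(t)]; c != NULL; c = c->next)
        if (toy_equal(&c->original, t))
            return c;
    return NULL;
}

/* PRE_ROUTING step: look up the tuple; on a miss allocate an unconfirmed entry. */
struct toy_conn *toy_pre_routing(const struct toy_tuple *t)
{
    struct toy_conn *c = toy_lookup(t);

    if (c != NULL)
        return c;            /* packet belongs to an existing connection */
    c = calloc(1, sizeof(*c));
    assert(c != NULL);
    c->original = *t;
    c->reply.src = t->dst;   /* reply == inverted(original) */
    c->reply.dst = t->src;
    c->reply.sport = t->dport;
    c->reply.dport = t->sport;
    c->reply.proto = t->proto;
    return c;
}

/* POST_ROUTING step: the packet survived filtering, so insert the entry. */
void toy_post_routing(struct toy_conn *c)
{
    unsigned int b;

    if (c->confirmed)
        return;              /* existing connection: do nothing else */
    b = toy_hash(&c->original);
    c->next = toy_table[b];
    toy_table[b] = c;
    c->confirmed = 1;
}
```

Deferring the insert until POST_ROUTING means a packet dropped by the filter rules never leaves an entry behind in the table.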
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Writing extensions for the conntrack subsystem
new l4 protocol modules are very rare
more common: application helpers for ftp,irc,h.323,quake,mms,...
API for conntrack helper modules:
%font "typewriter"
%size 3
#include <linux/netfilter_ipv4/ip_conntrack_helper.h>
%size 3
struct ip_conntrack_helper
%size 3
struct list_head list;
%size 3
const char *name;
%size 3
unsigned char flags;
%size 3
struct module *me;
%size 3
unsigned int max_expected;
%size 3
unsigned int timeout;
%size 3
struct ip_conntrack_tuple tuple;
%size 3
struct ip_conntrack_tuple mask;
%size 3
int (*help)(const struct iphdr *iph, size_t len, struct ip_conntrack *ct, enum ip_conntrack_info ctinfo);
%size 3
int ip_conntrack_helper_register(struct ip_conntrack_helper *);
%size 3
void ip_conntrack_helper_unregister(struct ip_conntrack_helper *);
%size 3
int ip_conntrack_expect_related(struct ip_conntrack *, struct ip_conntrack_expect *);
%size 3
int ip_conntrack_change_expect(struct ip_conntrack_expect *, struct ip_conntrack_tuple *);
%size 3
void ip_conntrack_unexpect_related(struct ip_conntrack_expect *);
%font "standard"
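How the tuple/mask pair of a helper selects the connections it wants to see can be sketched in userspace C. This is an illustration only: the toy_* types and functions are hypothetical, reduce the tuple to destination port and protocol, and do not use the kernel API listed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced tuple: only the fields a helper usually matches on. */
struct toy_tuple { uint16_t dst_port; uint8_t proto; };

struct toy_helper {
    const char *name;
    struct toy_tuple tuple;  /* values to match */
    struct toy_tuple mask;   /* which bits of the tuple are compared */
    int (*help)(const unsigned char *payload, size_t len);
};

#define TOY_MAX_HELPERS 8
static const struct toy_helper *toy_helpers[TOY_MAX_HELPERS];
static size_t toy_helper_count;

/* Register a helper; returns 0 on success, -1 if the table is full. */
int toy_helper_register(const struct toy_helper *h)
{
    assert(h != NULL);
    if (toy_helper_count == TOY_MAX_HELPERS)
        return -1;
    toy_helpers[toy_helper_count++] = h;
    return 0;
}

/* Find the first helper whose masked tuple equals the masked packet tuple. */
const struct toy_helper *toy_helper_find(const struct toy_tuple *t)
{
    size_t i;

    for (i = 0; i < toy_helper_count; i++) {
        const struct toy_helper *h = toy_helpers[i];

        if ((t->dst_port & h->mask.dst_port) ==
                (h->tuple.dst_port & h->mask.dst_port) &&
            (t->proto & h->mask.proto) == (h->tuple.proto & h->mask.proto))
            return h;
    }
    return NULL;
}

/* Example callback: inspects nothing and returns 1 as a stand-in verdict. */
int toy_ftp_help(const unsigned char *payload, size_t len)
{
    (void)payload;
    (void)len;
    return 1;
}

/* A helper interested in TCP (protocol 6) connections to port 21. */
const struct toy_helper toy_ftp = {
    "ftp", { 21, 6 }, { 0xFFFF, 0xFF }, toy_ftp_help
};
```

A mask of all ones demands an exact match; clearing mask bits would let one helper cover a range of ports or any protocol.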
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Network Address Translation
Overview
Previous Linux Kernels only implemented one special case of NAT: Masquerading
Linux 2.4.x can do any kind of NAT.
NAT subsystem implemented on top of netfilter, iptables and conntrack
NAT subsystem registers with all five netfilter hooks
'nat' Table registers chains PREROUTING, POSTROUTING and OUTPUT
Following targets available within 'nat' Table
SNAT changes the packet's source while passing NF_IP_POST_ROUTING
DNAT changes the packet's destination while passing NF_IP_PRE_ROUTING
MASQUERADE is a special case of SNAT
REDIRECT is a special case of DNAT
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Network Address Translation
flow of events for NEW packet:
packet enters NF_IP_PRE_ROUTING after conntrack
resolve conntrack entry for packet
if (expectfn of helper) call it
else iterate over rules in PREROUTING chain of nat table
save respective NAT mappings in conntrack
apply the NAT mappings to the packet
call NAT helper function, if there is one for this proto
...
packet enters NF_IP_POST_ROUTING
resolve conntrack entry for packet
iterate over rules in POSTROUTING chain of nat table
save respective NAT mappings in conntrack
apply the NAT mappings to the packet
call NAT helper function, if there is one for this proto
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Network Address Translation
flow of events for ESTABLISHED packets:
packet enters NF_IP_PRE_ROUTING after conntrack
resolve conntrack entry for packet
apply the NAT mappings (read from conntrack entry) to the packet
call NAT helper function, if there is one for this proto
...
packet enters NF_IP_POST_ROUTING
resolve conntrack entry for packet
apply the NAT mappings (read from conntrack entry) to the packet
call NAT helper function, if there is one for this proto
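The idea that NAT mappings are chosen once for the NEW packet, stored with the connection, and then simply re-applied can be sketched as follows. The types and the function toy_nat_apply are hypothetical and handle addresses only; the sketch assumes that reply packets get the inverse manipulation at the opposite hook.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum toy_hook { TOY_PRE_ROUTING, TOY_POST_ROUTING };
enum toy_dir  { TOY_DIR_ORIGINAL, TOY_DIR_REPLY };

struct toy_packet { uint32_t src, dst; };

/* Hypothetical per-connection NAT state, decided once for the NEW packet. */
struct toy_nat_info {
    int has_dnat;            /* mapping chosen in the PREROUTING chain */
    uint32_t dnat_orig_dst, dnat_new_dst;
    int has_snat;            /* mapping chosen in the POSTROUTING chain */
    uint32_t snat_orig_src, snat_new_src;
};

/* Apply the stored mappings to one packet at one hook. */
void toy_nat_apply(const struct toy_nat_info *n, enum toy_hook hook,
                   enum toy_dir dir, struct toy_packet *p)
{
    assert(n != NULL && p != NULL);
    if (hook == TOY_PRE_ROUTING) {
        if (dir == TOY_DIR_ORIGINAL && n->has_dnat &&
            p->dst == n->dnat_orig_dst)
            p->dst = n->dnat_new_dst;   /* DNAT */
        else if (dir == TOY_DIR_REPLY && n->has_snat &&
                 p->dst == n->snat_new_src)
            p->dst = n->snat_orig_src;  /* undo SNAT for replies */
    } else {
        if (dir == TOY_DIR_ORIGINAL && n->has_snat &&
            p->src == n->snat_orig_src)
            p->src = n->snat_new_src;   /* SNAT */
        else if (dir == TOY_DIR_REPLY && n->has_dnat &&
                 p->src == n->dnat_new_dst)
            p->src = n->dnat_orig_dst;  /* undo DNAT for replies */
    }
}
```

Because the mapping lives in the connection entry, packets of an ESTABLISHED connection never need to traverse the 'nat' table rules again.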
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Developing a NAT helper module
Network Address Translation
%font "typewriter"
%size 3
#include <linux/netfilter_ipv4/ip_nat_helper.h>
%size 3
struct ip_nat_helper
%size 3
struct list_head list;
%size 3
const char *name;
%size 3
unsigned char flags;
%size 3
struct module *me;
%size 3
struct ip_conntrack_tuple tuple;
%size 3
struct ip_conntrack_tuple mask;
%size 3
unsigned int (*help)(struct ip_conntrack *, struct ip_conntrack_expect *, struct ip_nat_info *, enum ip_conntrack_info, unsigned int hooknum, struct sk_buff **);
%size 3
unsigned int (*expect)(struct sk_buff **, unsigned int hooknum, struct ip_conntrack *, struct ip_nat_info *);
%size 3
int ip_nat_helper_register(struct ip_nat_helper *);
%size 3
void ip_nat_helper_unregister(struct ip_nat_helper *);
%size 3
int ip_nat_mangle_tcp_packet();
%size 3
int ip_nat_mangle_udp_packet();
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The netfilter/iptables architecture
Advanced Netfilter concepts
%size 4
Userspace logging
flexible replacement for old syslog-based logging
packets to userspace via multicast netlink sockets
easy-to-use library (libipulog)
plugin-extensible userspace logging daemon (ulogd)
Can even be used to directly log into MySQL
Queuing
reliable asynchronous packet handling
packets to userspace via unicast netlink socket
easy-to-use library (libipq)
provides Perl bindings
experimental queue multiplex daemon (ipqmpd)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Developing netfilter/iptables extensions
Thanks
The slides and the accompanying paper of this presentation are available at http://www.gnumonks.org/
The netfilter homepage: http://www.netfilter.org/
Thanks to
the BBS people, Z-Netz, FIDO, ...
for heavily increasing my computer usage in 1992
KNF
for bringing me in touch with the internet as early as 1994
for providing a playground for technical people
for telling me about the existence of Linux!
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
for implementing (one of?) the world's best TCP/IP stacks
Paul 'Rusty' Russell
for starting the netfilter/iptables project
for trusting me to maintain it today
Astaro AG (http://www.astaro.com/)
for sponsoring parts of my netfilter work
for sponsoring my travel cost to OLS

View File

@ -0,0 +1,57 @@
#include <linux/module.h>
#include <linux/config.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/skbuff.h>
#include <linux/ip.h>
#include <linux/netfilter.h>
#include <linux/netfilter_ipv4.h>

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Harald Welte <laforge@netfilter.org>");
MODULE_DESCRIPTION("OLS2003 workshop module");

/* Hook function: called for every IPv4 packet passing NF_IP_PRE_ROUTING. */
static unsigned int
workshop_fn(unsigned int hooknum,
	    struct sk_buff **pskb,
	    const struct net_device *in,
	    const struct net_device *out,
	    int (*okfn)(struct sk_buff *))
{
	struct iphdr *iph = (*pskb)->nh.iph;

	/* do whatever we want to do */
	printk(KERN_NOTICE "packet from %u.%u.%u.%u received\n",
	       NIPQUAD(iph->saddr));

	/* let the packet continue its way through the stack */
	return NF_ACCEPT;
}

/* Hook registration data: IPv4, PRE_ROUTING, close to the lowest priority. */
static struct nf_hook_ops workshop_ops = {
	.list		= { .prev = NULL, .next = NULL },
	.hook		= workshop_fn,
	.pf		= PF_INET,
	.hooknum	= NF_IP_PRE_ROUTING,
	.priority	= NF_IP_PRI_LAST - 1,
};

static int __init init(void)
{
	int ret;

	ret = nf_register_hook(&workshop_ops);
	if (ret < 0) {
		printk(KERN_ERR "something went wrong while registering\n");
		return ret;
	}
	printk(KERN_DEBUG "workshop netfilter module successfully loaded\n");
	return ret;
}

/* Unregister the hook again when the module is unloaded. */
static void __exit fini(void)
{
	nf_unregister_hook(&workshop_ops);
}

module_init(init);
module_exit(fini);

View File

@ -0,0 +1,105 @@
What is open source? How does it work?
Who writes code for nothing and why?
- traditional software model
- product-oriented
- company finances development of software
- same copy of software object code is sold under a very restrictive
license
- license fees refinance cost of development
- enforcement of restrictive license guarantees revenue
- advantages
- proven business model
- disadvantages
- have to develop everything on your own or buy licenses of 3rd
party software
- less flexibility for the customer
- does the customer trust the 'black box' you are selling?
- if vendor goes out of business, no bugfixes/updates
- open source model
- service based
- individual parties contribute code parts
- software is distributed for free
- software is distributed under very permissive license
- service / support / customization refinance development
- advantages
- vast amount of available FOSS can be used as foundation for
own products
- source code is available for peer review
- bug fixes for free, people just send you patches
- new features implemented by your users!
- disadvantage
- business model has yet to be proven to work
- important open source licenses
- BSD style license
- permits any use of the sourcecode as long as copyright notice
remains
- GPL (GNU General Public License)
- source for resulting binary has to be provided
- ensures that derivatives of free software are still free
- LGPL (GNU Lesser General Public License)
- permits linking with non-GPL code (mainly used for libraries)
- difference free software / open source
- term 'free software' (free as in freedom, not beer) introduced by
Stallman / FSF 1984.
- focus on political/ethical/philosophical freedom
- term 'open source software' (OSS) introduced by OSI in 1998
- focus on technological advantage by means of source review
- most FOSS licenses match both definitions, OSS less restrictive
- history of FOSS
- initially software always for free in source (e.g. IBM S/360)
- as hardware gets less expensive, companies start to license
software for money
- some people (Stallman et al.) didn't want to give up the freedom
they're used to.
- 1983: GNU project is founded, goal: Implementation of a free UNIX-like
operating system
- 1985: Free Software Foundation is established as non-profit legal
entity behind the GNU project
- 1991: Linus Torvalds releases the first version of the Linux Kernel
under the GNU GPL license. Together with the other parts from the
GNU project and others, a 100% free operating system is available
- 1994-2000: Free Software is increasingly recognized as reliable,
stable alternative to proprietary software
- Who is behind FOSS?
- in the beginning mostly computer enthusiasts with academic background
- motivation through
- fight: david <-> goliath
- to show how bad most proprietary software is
- to make the internet a better place
- to work together with _very_ good programmers
- to gain more experience / better reputation
- more and more commercial entities recognize the value of FOSS
- contributions to existing projects
- start of new projects
- contracting consultants and FOSS companies for implementation
of missing features
- experienced end-users
- independent consultants
- academic institutions (e.g. exim, cyrus)
- mixed FOSS / proprietary companies (like Astaro)
- use FOSS as foundation for their proprietary solutions
- have a vital need for a reliable and up-to-date foundation,
thus contribute back to and/or fund FOSS
- development process, communication
- everybody who agrees to the license can contribute code
- project is usually started by a single developer or a small group
- different actors:
- maintainer: official person to maintain the code, responsible
- core team: small group of leaders behind the project
- developers: people who write code on a regular basis
- contributors: people who contribute a single feature or a bug
fix from time to time
- users: people who use the software, often organized on
mailing lists, newsgroups, user groups, ...
- the main communication medium is mailing lists
- every developer can be contacted directly via email
- leaders/managers are the people with the best technical skills, unlike the 'commercial world' where you need certain diplomas, connections, ...
- communication is random. no manager <-> manager talk about technical
stuff they don't understand

View File

@ -0,0 +1,185 @@
%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%back "blue"
%center
%size 7
What is Open Source / Free Software ?
%center
%size 4
by
Harald Welte <hwelte@astaro.com>
Harald Welte <laforge@gnumonks.org>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
What is Free / Open Source Software (FOSS)
Contents
The traditional (proprietary) software model
The Free / Open Source software model
Important Free / Open Source software licenses
Difference Free Software / Open Source
History of Free / Open Source software
Who is behind FOSS?
Development Process
Thanks
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
What is Free / Open Source Software (FOSS)
The traditional (proprietary) software model
traditional software model
product-oriented
vendor finances development of software
business model of software based on secret source code
same copy of software object code is sold under a very restrictive license
license fees refinance cost of development
enforcement of restrictive license guarantees revenue
advantages
proven business model
disadvantages
vendor has to develop everything on his own or buy licenses of 3rd party software
less flexibility for the customer
does the customer trust the 'black box' you are selling?
if vendor goes out of business, no bugfixes/updates
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
What is Free / Open Source Software (FOSS)
The free / open source software model
Open Source / Free Software model
service based
individual parties contribute code parts
software is distributed for free
software source code is distributed under very permissive license
service / support / customization refinance development
advantages
vast amount of available FOSS can be used as foundation for own products
source code is available for peer review
bug fixes for free, people just send you patches
new features implemented by your users!
disadvantage
business model has yet to prove scalability
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
What is Free / Open Source Software (FOSS)
Difference Free Software / Open Source
difference free software / open source
free software
term 'free software' (free as in freedom, not beer) introduced by Richard Stallman / FSF 1984.
focus on political/ethical/philosophical freedom
open source
term 'open source' software (OSS) introduced by OSI in 1998
focus on technological advantage by means of source review
most FOSS licenses match both definitions, OSS less restrictive
FOSS is _not_ to be mistaken for freeware / shareware!
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
What is Free / Open Source Software (FOSS)
Important FOSS licenses
important free / open source licenses
BSD (Berkeley Software Distribution) style license
permits any use of the sourcecode as long as copyright notice remains
GPL (GNU General Public License)
source for resulting binary has to be provided
ensures that derivatives of free software are still free
LGPL (GNU Lesser General Public License)
permits linking with non-GPL code (mainly used for libraries)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
What is Free / Open Source Software (FOSS)
History of Free / Open Source Software
history of free / open source software
initially software always for free in source (e.g. IBM S/360)
as hardware gets less expensive, companies start to license software for money
some people (Stallman et al.) didn't want to give up the freedom they're used to.
1983: GNU project is founded, goal: Implementation of a free UNIX-like operating system
1985: Free Software Foundation is established as non-profit legal entity behind the GNU project
1991: Linus Torvalds releases the first version of the Linux Kernel under the GNU GPL license. Together with the other parts from the GNU project and others, a 100% free operating system is available
1994-1999: FOSS is increasingly recognized as reliable, stable alternative to proprietary software, esp. in the server + networking market
2000-2003: FOSS is increasingly considered as an alternative on the desktop (see recent decision by Munich city administration, respective laws in Latin America, ...)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
What is Free / Open Source Software (FOSS)
Who is behind FOSS?
individuals
computer enthusiasts motivated by
fight: david <-> goliath
ability to show how poorly implemented most proprietary software is
ability to gain more experience / better reputation
experienced end-users
independent consultants
looking for a solution to a particular problem and already have 95% by using existing FOSS
organizations
commercial entities who recognize the value of FOSS
contributions to existing projects
start of new projects
contracting consultants and FOSS companies for implementation of missing features
mixed FOSS / proprietary companies (like Astaro)
use FOSS as foundation for their proprietary solutions
have a vital need for a reliable and up-to-date foundation, thus contribute back to and/or fund FOSS
academic institutions (e.g. exim, cyrus)
are traditionally involved in the exchange of research results. Why treat software differently?
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
What is Free / Open Source Software (FOSS)
Development Process
development process, communication
everybody who agrees to the license can contribute code
project is usually started by a single developer or a small group
different actors in development process
maintainer: official person to maintain the code, responsible
core team: small group of leaders behind the project
developers: people who write code on a regular basis
contributors: people who contribute a single feature or a bug fix from time to time
users: people who use the software, often organized on mailing lists, newsgroups, user groups, ...
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
What is Free / Open Source Software (FOSS)
Development Process
the main communication medium is mailing lists
every developer can be contacted directly via email
leaders/managers are the people with the best technical skills, unlike the 'commercial world' where you need certain diplomas, connections, ...
communication is random. no manager <-> manager talk about technical stuff they don't understand
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
What is Free / Open Source Software (FOSS)
Thanks
in the name of the netfilter/iptables project, thanks to Astaro for funding
particular tasks on my schedule
equipment (dual Opteron below my desk)
my travel expenses to many FOSS conferences
the netfilter developer workshop in August 2003 (Budapest, HU)

View File

@ -0,0 +1,281 @@
%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%back "blue"
%center
%size 7
Firewalls, IPsec and Linux
%center
%size 4
by
Harald Welte <laforge@netfilter.org>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalls, IPsec and Linux
Contents
Introduction
Highly Scalable Linux Network Stack
Netfilter Hooks
Packet selection based on IP Tables
The Connection Tracking Subsystem
The NAT Subsystem
IPsec with Free S/WAN
IPsec with Kernel 2.6.x
Cipe, vtun, openvpn and others
Traffic Shaping, QoS, Policy Routing
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalls, IPsec and Linux
Introduction
What this is:
A broad overview of the advanced Linux networking features
Intended for a network-savvy audience that has little Linux background
What this presentation is not:
A tutorial on how to use iptables, tc, iproute2, brctl
An introduction to the cool code we write every day ;)
It will try to show you what you can do with Linux networking, not how.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalls, IPsec and Linux
Introduction
Linux and Networking
Linux is a true child of the Internet
Early adopters: ISPs, universities
Lots of work went into a highly scalable network stack
Not only for client/server, but also for routers
Features unheard of in other OSes
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalls, IPsec and Linux
Introduction
Did you know that a stock 2.6.5 Linux kernel can provide
a stateful packet filter ?
fully symmetric NA(P)T ?
policy routing ?
QoS / traffic shaping ?
IPv6 firewalling ?
packet filtering, NA(P)T on a bridge ?
layer 2 (mac) address translation ?
If not, chances are high that this presentation will tell you something new.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalls, IPsec and Linux
Netfilter Hooks
What is netfilter?
System of callback functions within network stack
Callback function to be called for every packet traversing certain point (hook) within network stack
Protocol independent framework
Hooks in layer 3 stacks (IPv4, IPv6, DECnet, ARP)
Multiple kernel modules can register with each of the hooks
Traditional packet filtering, NAT, ... is implemented on top of this framework
Can be used for other stuff interfacing with the core network stack, like DECnet routing daemon.
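The callback idea can be sketched in userspace C: functions register for a hook with a priority, and every packet passing the hook is handed to them in priority order until one of them drops it. All names (toy_register_hook, toy_run_hook, ...) are hypothetical and model a single hook only; this is not the kernel's nf_register_hook() interface.

```c
#include <assert.h>
#include <stddef.h>

enum toy_verdict { TOY_DROP = 0, TOY_ACCEPT = 1 };

struct toy_packet { unsigned int saddr; int seen_by; };

typedef enum toy_verdict (*toy_hookfn)(struct toy_packet *pkt);

struct toy_hook_ops { toy_hookfn hook; int priority; };

#define TOY_MAX_OPS 8
static struct toy_hook_ops toy_ops[TOY_MAX_OPS];
static size_t toy_nops;

/* Register a callback, keeping the list sorted by ascending priority. */
int toy_register_hook(toy_hookfn fn, int priority)
{
    size_t i;

    assert(fn != NULL);
    if (toy_nops == TOY_MAX_OPS)
        return -1;
    i = toy_nops;
    while (i > 0 && toy_ops[i - 1].priority > priority) {
        toy_ops[i] = toy_ops[i - 1];
        i--;
    }
    toy_ops[i].hook = fn;
    toy_ops[i].priority = priority;
    toy_nops++;
    return 0;
}

/* Pass the packet to every callback in order; stop at the first drop. */
enum toy_verdict toy_run_hook(struct toy_packet *pkt)
{
    size_t i;

    for (i = 0; i < toy_nops; i++)
        if (toy_ops[i].hook(pkt) == TOY_DROP)
            return TOY_DROP;
    return TOY_ACCEPT;
}

/* Example callback: counts the packet and accepts it. */
enum toy_verdict toy_count(struct toy_packet *pkt)
{
    pkt->seen_by++;
    return TOY_ACCEPT;
}

/* Example callback: drops packets with source address 0. */
enum toy_verdict toy_drop_zero(struct toy_packet *pkt)
{
    pkt->seen_by++;
    return pkt->saddr == 0 ? TOY_DROP : TOY_ACCEPT;
}
```

The priority ordering is what lets independent modules (connection tracking, NAT, filtering) share one hook in a defined sequence.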
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalls, IPsec and Linux
IP tables
Packet selection using IP tables
The kernel provides generic IP tables support
Each kernel module may create its own IP table
The three major parts of 2.4 firewalling subsystem are implemented using IP tables
Packet filtering table 'filter'
NAT table 'nat'
Packet mangling table 'mangle'
Could potentially be used for other stuff, e.g. IPsec SPDB
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalls, IPsec and Linux
IP Tables
Managing chains and tables
An IP table consists of multiple chains
A chain consists of a list of rules
Every single rule in a chain consists of
match[es] (rule executed if all matches true)
target (what to do if the rule is matched)
%size 4
matches and targets can either be builtin or implemented as kernel modules
%size 5
The userspace tool iptables is used to control IP tables
handles all different kinds of IP tables
supports a plugin/shlib interface for target/match specific options
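The rule semantics above (a rule fires only if all of its matches are true, then its target decides) can be sketched in userspace C. The toy_* names are hypothetical, the targets are reduced to ACCEPT and DROP, and a chain policy is assumed as the fallback when no rule matches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum toy_target { TOY_ACCEPT, TOY_DROP };

struct toy_packet { uint32_t saddr; uint8_t proto; uint16_t dport; };

/* A match is a predicate on the packet plus one argument. */
typedef int (*toy_matchfn)(const struct toy_packet *p, uint32_t arg);

struct toy_match { toy_matchfn fn; uint32_t arg; };

#define TOY_MAX_MATCHES 4
struct toy_rule {
    struct toy_match matches[TOY_MAX_MATCHES];
    size_t nmatches;
    enum toy_target target;  /* what to do if the rule is matched */
};

int toy_match_proto(const struct toy_packet *p, uint32_t arg)
{
    return p->proto == arg;
}

int toy_match_dport(const struct toy_packet *p, uint32_t arg)
{
    return p->dport == arg;
}

/* Walk the chain; the first rule whose matches are all true decides. */
enum toy_target toy_chain_eval(const struct toy_rule *rules, size_t nrules,
                               enum toy_target policy,
                               const struct toy_packet *p)
{
    size_t r, m;

    assert(rules != NULL || nrules == 0);
    for (r = 0; r < nrules; r++) {
        int all = 1;

        assert(rules[r].nmatches <= TOY_MAX_MATCHES);
        for (m = 0; m < rules[r].nmatches; m++) {
            if (!rules[r].matches[m].fn(p, rules[r].matches[m].arg)) {
                all = 0;
                break;
            }
        }
        if (all)
            return rules[r].target;
    }
    return policy;           /* no rule matched */
}
```

Keeping matches as function pointers mirrors the point that matches and targets can be supplied by separate modules.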
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalls, IPsec and Linux
Connection Tracking Subsystem
Connection tracking...
implemented separately from NAT
enables stateful filtering
protocol modules (currently TCP/UDP/ICMP/GRE/SCTP)
application helpers (currently FTP,IRC,H.323,talk,SNMP,RTSP)
does _NOT_ filter packets itself
can be utilized by iptables using the 'state' match
is used by NAT Subsystem
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalls, IPsec and Linux
Network Address Translation
Network Address Translation
Previous Linux Kernels only implemented one special case of NAT: Masquerading
Linux 2.4.x / 2.6.x can do any kind of NAT.
NAT subsystem implemented on top of netfilter, iptables and conntrack
Following targets available within 'nat' Table
SNAT changes the packet's source while passing NF_IP_POST_ROUTING
DNAT changes the packet's destination while passing NF_IP_PRE_ROUTING
MASQUERADE is a special case of SNAT
REDIRECT is a special case of DNAT
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalls, IPsec and Linux
Packet Mangling
Purpose of mangle table
packet manipulation except address manipulation
Targets specific to the 'mangle' table:
DSCP - manipulate DSCP field
IPV4OPTSSTRIP - strip IPv4 options
MARK - change the nfmark field of the skb
TCPMSS - set TCP MSS option
TOS - manipulate the TOS bits
TTL - set / increase / decrease TTL field
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalls, IPsec and Linux
Linux Bridging
Bridging (brctl)
Includes support for Spanning Tree
Fully supports packet filtering and NAT (!) on a bridge
Can also filter and translate layer 2 MAC addresses
Can implement a 'brouter' (bridge certain traffic, route other)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalls, IPsec and Linux
Linux Policy Routing
Policy Routing (iproute2)
Allows routing decisions on arbitrary information
Provides up to 255 different routing tables within one system
Combined with iptables via nfmark, any match of the packet filter can be used for the routing decision
Very useful in complex setups with multiple links (e.g. multiple DSL uplinks with dynamic addresses, asymmetric routing, ...)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalls, IPsec and Linux
Linux Traffic Shaping
Traffic Control (tc)
Framework for lots of algorithms like RED,SFQ,TBF,CBQ,CSZ,GRED,HTB
Very granular control, especially for very low bandwidth links
Present since Linux 2.2.x but still not used widely
Lack of documentation, but situation is improving (www.lartc.org)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalls, IPsec and Linux
Free S/WAN
Free S/WAN
Was a politically motivated effort to provide IPsec for Linux 2.0+
Goal was to encrypt as much Internet Traffic as possible
Software architecture didn't fit very well with Linux 2.4/2.6 network stack
Project has been shut down, however Open S/WAN continues support
Is in widespread production use and has received a lot of testing
Political motivation prevented U.S. citizens from contributing code
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalls, IPsec and Linux
Linux 2.6.x IPsec
Linux 2.6.x IPsec
Linux networking gods disapproved of Free S/WAN's political restrictions and software design
Thus, they decided to write their own IPsec stack
Result is in the stock 2.6.x kernel series
Offers complete support for transport and tunnel mode
Can be used with FreeSWAN (pluto) or KAME (isakmpd) userspace
Remaining problems
No integration with hardware crypto accelerators yet
No implementation of NAT traversal yet
Interaction with iptable_nat still has to be sorted out
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalls, IPsec and Linux
cipe, vtun, openvpn and others
Other VPN protocols/programs
Evolved as Linux-specific VPN implementations, since the Linux kernel lacked stock IPsec support for a long time
Are totally incompatible with IPsec and only compatible with themselves
Are of questionable security (at least in case of cipe, vtun)
Are mostly userspace implementations
Are way easier to configure
Can provide layer 2 tunnels to route (or bridge!) all kinds of protocols
openvpn with X.509 certificates is a very clean and easy solution for building strong VPN tunnels between two linux gateways
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
Firewalls, IPsec and Linux
Thanks
Thanks to
the BBS scene, Z-Netz, FIDO, ...
for heavily increasing my computer usage in 1992
KNF (http://www.franken.de/)
for bringing me in touch with the internet as early as 1994
for providing a playground for technical people
for telling me about the existence of Linux!
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
for implementing (one of?) the world's best TCP/IP stacks
Paul 'Rusty' Russell
for starting the netfilter/iptables project
for trusting me to maintain it today
Astaro AG
for sponsoring parts of my netfilter work
%size 3
The slides and the accompanying paper of this presentation are available at http://www.gnumonks.org/
%size 3

Binary file not shown.

Binary file not shown.

View File

@ -0,0 +1,21 @@
Harald Welte is the head of the netfilter core team and plays a major role in the development and maintenance of the netfilter/iptables packet filter.
Within the computing world, his focus has always been on networking.
His reason for taking up Linux in 1994, for example, arose from the task of
setting up a UUCP<->ZConnect<->FIDO gateway.
In the little time that netfilter/iptables leaves him today, he writes peculiar documents such as the UUCP-over-SSL-HOWTO.
Since 1997 he has worked as an independent IT consultant and developer on
numerous projects for a wide variety of companies (from banks to computer
hardware manufacturers).
In 2001 he accepted an offer to work for the Brazilian
Linux distributor in Curitiba (Brazil).
Since February 2002 his work on the netfilter/iptables project has been
supported by sponsorship from Astaro AG. Alongside this sponsorship he
continues to work as a freelance consultant and developer.
Harald has lived in Berlin since November 2002.

View File

@ -0,0 +1,30 @@
Legal Enforcement of the GPL
More and more companies use Linux and other GPL-licensed software in their
products, particularly in the area of network appliances such as routers,
NAT gateways and 802.11 access points.
On the one hand, this can be counted as a great success for Free Software.
On the other hand, there is unfortunately also a downside: quite a few of
these companies pay no attention, or insufficient attention, to the GPL
license terms.
The netfilter/iptables project has therefore made it its task to demand full
compliance with the GPL license terms from the companies concerned in all
known cases, in court if necessary.
These efforts have been under way since December 2003, and every case so far
has been successful. The result is 12 out-of-court settlements and one
preliminary injunction, which also withstood the objection proceedings.
The list of companies concerned consists almost exclusively of well-known
names such as Siemens, Asus and Belkin.
The author will give an overview of this successful GPL enforcement within
the German jurisdiction. He will also talk about exactly which conditions
have to be met in order to distribute software in compliance with the GPL.
In addition, he would like to give some recommendations to authors of Free
Software on how concrete measures taken during development can help ahead of
time with a possible later enforcement of their rights.

View File

@ -0,0 +1,253 @@
%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%back "blue"
%center
%size 7
Enforcing the GNU GPL
Copyright helps Copyleft
%center
%size 4
by
Harald Welte <laforge@netfilter.org>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
Contents
Introduction
The GNU GPL Revisited
Motivations for licensing under the GPL
Enforcing the GNU GPL
Thanks
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
Introduction
Who is speaking to you?
an independent Free Software developer
who earns his living off Free Software since 1997
who is one of the authors of the Linux kernel firewall system called netfilter/iptables
who IS NOT A LAWYER, although this presentation is the result of six months of dealing with lawyers on the GPL
Why is he speaking to you?
because he became aware of copyright (copyleft?) infringement and took legal action within German jurisdiction
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
What is copyrightable?
The GNU GPL is a copyright license, and thus only covers copyrighted code
Not everything is copyrightable (German: Schoepfungshoehe)
Small bugfixes are not copyrightable (similar to typo-fixes in a book)
As soon as the programmer has a choice in the implementation, there is significant indication of a copyrightable work
Choice in algorithm, not in formal representation.
Apparently, the level for copyrightable works is relatively low.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
The GNU GPL Revisited
Revisiting the GNU General Public License
Regulates distribution of copyrighted code, not usage
Allows distribution of source code and modified source code
Allows distribution of binaries or modified binaries, if
The license itself is mentioned
A copy of the license accompanies every copy
The complete source code is either
included with the copy
made available to any 3rd party
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
Complete Source Code
%size 3
"... complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable."
Our interpretation of this is:
Source Code
Makefiles
Tools for generating the firmware binary from the source
(even if they are technically no 'scripts')
General Rule:
Intent of License is to enable user to run modified versions of the program. They need to be enabled to do so.
Result: Signing binaries and only accepting signed versions without providing a signature key is not acceptable!
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
Derivative Works
What is a derivative work?
Not dependent on any particular kind of technology (static/dynamic linking, dlopen, whatever)
Even while the modification can itself be a copyrightable work, the combination with GPL-licensed code is subject to GPL.
No precedent in Germany so far
As soon as code is written for a specific non-standard API (such as the iptables plugin API), there is significant indication for a derivative work
This position has been successfully enforced out-of-court with two Vendors so far (iptables modules/plugins).
Result
Position of my lawyers (apparently also of IBM lawyers):
In-kernel proprietary code (binary kernel modules) is not compliant
Case-by-case analysis required, especially when drivers/filesystems are ported from other OS's.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
Confusion about the GPL
%size 4
Unfortunately, widespread misconceptions about copyright, free software and public domain (shared even by the Red Hat CEO!) lead people to unknowingly, or even wilfully, benefit from the freedoms of the GPL without fulfilling its obligations.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
Enforcing the GNU GPL
Enforcing the GPL
GPL violations are nothing new, as GPL licensed software is nothing new.
However, the recent Linux boom has led to GPL licensed software being used more often
The FSF enforces the GPL against violations involving code on which it holds the copyright
silently, without public notice
in lengthy negotiations
During 2003 the "Linksys" case drew a lot of attention
Linksys was selling 802.11 WLAN Access Points / Routers
Lots of GPL licensed software embedded in the device (including Linux, uClibc, busybox, iptables, ...)
FSF led alliance took the 'quiet' approach and it took about four months until the full source code was released
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
Enforcing the GNU GPL
The Linksys case
Some developers didn't agree with this approach
not enough publicity
violators don't lose anything by first not complying and waiting for the FSF
four months delay is too much for the short product lifecycles in the WLAN world
So the netfilter/iptables project started to do their own enforcement in more cases coming up
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
Enforcing the GNU GPL
Enforcing the GPL
chronological order
reverse engineering of firmware images
sending the infringing organization a warning notice
wait for them to sign a statement to cease and desist
applying for a preliminary injunction if they don't (max 4 weeks after reverse engineering)
Success so far
amicable agreement with Asus, Belkin, Allnet, Fujitsu-Siemens, Siemens, Securepoint, U.S. Robotics, ...
some of which made significant donations to charitable organizations of the free software community
preliminary injunction against Sitecom, Sitecom also lost appeals case
more settled cases (not public yet)
negotiating in more cases
public awareness
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
Enforcing the GNU GPL
Enforcing the GPL
remains an important issue for Free Software
will start to happen within the court
has to be made public in order to raise awareness
Problems
only the copyright holder (in most cases the author) can do it
users discovering GPL'd software need to communicate those issues to all copyright holders
The http://www.gpl-violations.org/ project was started
as a platform where users can report alleged violations
to verify those violations and inform all copyright holders
to inform the public about ongoing enforcement efforts
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GPL enforcement report
Cases so far
Cases so far
Allnet GmbH
Siemens AG
Fujitsu-Siemens Computers GmbH
Axis A.B.
Securepoint GmbH
U.S.Robotics Germany GmbH
undisclosed large vendor
Belkin Components GmbH
Asus GmbH
Gateprotect GmbH
Sitecom GmbH
TomTom B.V.
Gigabyte Technologies GmbH
D-Link GmbH
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
Make later enforcement easy
Practical rules for proof by reverse engineering
Don't fix typos in error messages and symbol names
Leave obscure error messages like 'Rusty needs more caffeine'
Make binary contain string of copyright message, not only source
Practical rules for potential damages claims
Use revision control system
Document source of each copyrightable contribution
Name+Email address in CVS commit message
Consider something like FSFE FLA (Fiduciary License Agreement)
Make sure that employers are fine with contributions of their employees
If you find out about violation
Don't make it public (has to be new/urgent for injunctive relief)
Contact lawyer immediately to send warning notice
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
Thanks
Thanks to
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
for implementing (one of?) the world's best TCP/IP stacks
Paul 'Rusty' Russell
for starting the netfilter/iptables project
for trusting me to maintain it today
Astaro AG
for sponsoring parts of my netfilter work
Free Software Foundation
for the GNU Project
for the GNU General Public License
%size 3
The slides of this presentation are available at http://www.gnumonks.org/
%size 3
The netfilter homepage http://www.netfilter.org/
%size 3
The http://www.gpl-violations.org/ project


@ -0,0 +1,21 @@
Enforcing the GNU GPL - Copyright helps Copyleft
More and more vendors of various computing devices, especially network-related
appliances such as Routers, NAT-Gateways and 802.11 Access Points are using
Linux and other GPL licensed free software in their products.
While the Linux community can look at this as a big success, there is a flip
side to that coin: A large number of those vendors have no idea about the GPL
license terms, and as a result do not fulfill their obligations under the GPL.
The netfilter/iptables project has started legal proceedings against a number of
companies in violation of the GPL since December 2003. Those legal proceedings
were quite successful so far, resulting in a number of amicable agreements and
one granted preliminary injunction.
The speaker will present an overview about his recent successful enforcement of
the GNU GPL within German jurisdiction.
In the end, it seems like the idea of the founding fathers of the GNU GPL
works: Guaranteeing Copyleft by using Copyright.


@ -0,0 +1,25 @@
Harald Welte is the chairman of the netfilter/iptables core team.
His main interest in computing has always been networking. In the little time
left besides netfilter/iptables related work, he's writing obscure documents
like the UUCP over SSL HOWTO. Other kernel-related projects he has been
contributing to are user mode linux and the international (crypto) kernel patch.
He has been working as an independent IT Consultant working on projects for
various companies ranging from banks to manufacturers of networking gear.
During the year 2001 he was living in Curitiba (Brazil), where he got
sponsored for his Linux related work by Conectiva Inc.
Starting with February 2002, Harald has been contracted part-time by
<a href="http://www.astaro.com/">Astaro AG</a>, who are sponsoring him for his
current netfilter/iptables work.
Aside from the Astaro sponsoring, he continues to work as a freelancing
kernel developer and network security consultant.
He licenses his software under the terms of the GNU GPL. He is determined to bring all users, distributors, value added resellers and vendors of netfilter/iptables based products into full compliance with the GPL, even if that means taking legal action.
Harald is living in Berlin, Germany.


@ -0,0 +1,228 @@
%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%back "blue"
%center
%size 7
Enforcing the GNU GPL
Copyright helps Copyleft
%center
%size 4
by
Harald Welte <laforge@netfilter.org>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
Contents
Introduction
The GNU GPL Revisited
Motivations for licensing under the GPL
Enforcing the GNU GPL
Thanks
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
Introduction
Who is speaking to you?
an independent Free Software developer
who earns his living off Free Software since 1997
who is one of the authors of the linux kernel firewall system called netfilter/iptables
who IS NOT A LAWYER, although this presentation is the result of dealing six months with lawyers on the GPL
Why is he speaking to you?
because he became aware of copyright (copyleft?) infringement and took legal action within German jurisdiction
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
What is copyrightable?
The GNU GPL is a copyright license, and thus only covers copyrighted code
Not everything is copyrightable (German: Schoepfungshoehe)
Small bugfixes are not copyrightable (similar to typo-fixes in a book)
As soon as the programmer has a choice in the implementation, there is significant indication of a copyrightable work
Choice in algorithm, not in formal representation.
Apparently, the level for copyrightable works is relatively low.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
The GNU GPL Revisited
Revisiting the GNU General Public License
Regulates distribution of copyrighted code, not usage
Allows distribution of source code and modified source code
Allows distribution of binaries or modified binaries, if
The license itself is mentioned
A copy of the license accompanies every copy
The complete source code is either
included with the copy
made available to any 3rd party
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
Complete Source Code
%size 3
"... complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable."
Our interpretation of this is:
Source Code
Makefiles
Tools for generating the firmware binary from the source
(even if they are technically no 'scripts')
General Rule:
Intent of License is to enable user to run modified versions of the program. They need to be enabled to do so.
Result: Signing binaries and only accepting signed versions without providing a signature key is not acceptable!
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
Derivative Works
What is a derivative work?
Not dependent on any particular kind of technology (static/dynamic linking, dlopen, whatever)
Even while the modification can itself be a copyrightable work, the combination with GPL-licensed code is subject to GPL.
No precedent in Germany so far
As soon as code is written for a specific non-standard API (such as the iptables plugin API), there is significant indication for a derivative work
This position has been successfully enforced out-of-court with two Vendors so far (iptables modules/plugins).
Result
Position of my lawyers and IBM lawyers:
In-kernel proprietary code (binary kernel modules) is not compliant
Case-by-case analysis required, especially when drivers/filesystems are ported from other OS's.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
Confusion about the GPL
Unfortunately, widespread misconceptions about copyright, free software and public
domain (shared even by the Red Hat CEO!) lead people to unknowingly, or even wilfully,
benefit from the freedoms of the GPL without fulfilling its obligations.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
Enforcing the GNU GPL
Enforcing the GPL
GPL violations are nothing new, as GPL licensed software is nothing new.
However, the recent Linux boom has led to GPL licensed software being used more often
The FSF enforces the GPL against violations involving code on which it holds the copyright
silently, without public notice
in lengthy negotiations
During 2003 the "Linksys" case drew a lot of attention
Linksys was selling 802.11 WLAN Access Points / Routers
Lots of GPL licensed software embedded in the device (including Linux, uClibc, busybox, iptables, ...)
FSF led alliance took the 'quiet' approach and it took about four months until the full source code was released
Some developers didn't agree with this approach
not enough publicity
violators don't lose anything by first not complying and waiting for the FSF
four months delay is too much for the short product lifecycles in the WLAN world
So the netfilter/iptables project started to do their own enforcement in more cases coming up
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
Enforcing the GNU GPL
Enforcing the GPL
chronological order
reverse engineering of firmware images
sending the infringing organization a warning notice
wait for them to sign a statement to cease and desist
applying for a preliminary injunction if they don't (max 4 weeks after reverse engineering)
Success so far
amicable agreement with Asus, Belkin, Allnet, Fujitsu-Siemens, Siemens, Securepoint, U.S. Robotics, ...
some of which made significant donations to charitable organizations of the free software community
preliminary injunction against Sitecom, Sitecom also lost appeals case
more settled cases (not public yet)
negotiating in more cases
public awareness
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
Enforcing the GNU GPL
Enforcing the GPL
remains an important issue for Free Software
will start to happen within the court
has to be made public in order to raise awareness
Problems
only the copyright holder (in most cases the author) can do it
users discovering GPL'd software need to communicate those issues to all copyright holders
The http://www.gpl-violations.org/ project was started
as a platform where users can report alleged violations
to verify those violations and inform all copyright holders
to inform the public about ongoing enforcement efforts
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
How to make later enforcement easy
Practical rules for proof by reverse engineering
Don't fix typos in error messages and symbol names
Leave obscure error messages like 'Rusty needs more caffeine'
Make binary contain string of copyright message, not only source
Practical rules for potential damages claims
Use revision control system
Document source of each copyrightable contribution
Name+Email address in CVS commit message
Consider something like FSFE FLA (Fiduciary License Agreement)
Make sure that employers are fine with contributions of their employees
If you find out about violation
Don't make it public (has to be new/urgent for injunctive relief)
Contact lawyer immediately to send warning notice
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
Thanks
Thanks to
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
for implementing (one of?) the world's best TCP/IP stacks
Paul 'Rusty' Russell
for starting the netfilter/iptables project
for trusting me to maintain it today
Astaro AG
for sponsoring parts of my netfilter work
Free Software Foundation
for the GNU Project
for the GNU General Public License
%size 3
The slides of this presentation are available at http://www.gnumonks.org/
%size 3
The netfilter homepage http://www.netfilter.org/
%size 3
The http://www.gpl-violations.org/ project


@ -0,0 +1,24 @@
Harald Welte is the chairman of the netfilter/iptables core team.
His main interest in computing has always been networking. In the little time
left besides netfilter/iptables related work, he's writing obscure documents
like the UUCP over SSL HOWTO. Other kernel-related projects he has been
contributing to are user mode linux, the international (crypto) kernel patch, device drivers and the neighbour cache.
He has been working as an independent IT Consultant working on projects for
various companies ranging from banks to manufacturers of networking gear.
During the year 2001 he was living in Curitiba (Brazil), where he got
sponsored for his Linux related work by Conectiva Inc.
Starting with February 2002, Harald has been contracted part-time by
<a href="http://www.astaro.com/">Astaro AG</a>, who are sponsoring him for his
current netfilter/iptables work.
Aside from the Astaro sponsoring, he continues to work as a freelancing
kernel developer and network security consultant.
He licenses his software under the terms of the GNU GPL. He is determined to bring all users, distributors, value added resellers and vendors of netfilter/iptables based products into full compliance with the GPL, even if that means taking legal action.
Harald is living in Berlin, Germany.


@ -0,0 +1,46 @@
21c3-content@cccv.de
* Name: Full name of speaker
Harald Welte
* Bio: Short biography of speaker
See Attachment 1
* Contact: E-Mail, phone, instant messaging etc.
email: laforge@gnumonks.org
Phone: +49-30-24033902
Fax: +49-30-24033904
* Title: Name of event or lecture
Enforcing the GNU GPL
* Subtitle: Additional title description (a couple of words, optional)
Copyright helps Copyleft
* Abstract: An abstract of the event's content (max. 250 letters)
Linux is used more and more, especially in the embedded market. Unfortunately,
a number of vendors do not comply with the GNU GPL. The author has enforced
the GPL numerous times in and out of court, and will talk about his experience.
* Description: A detailed description of the event's content (250 to 500 words)
See Attachment 2
* Attachments: more information
o Links to background information
http://www.gpl-violations.org/
http://www.netfilter.org/licensing.html
http://gnumonks.org/~laforge/weblog/linux/gpl-violations/
o Links to information on the lecture itself
o Slides, Paper in PDF or other formats
Not yet available.


@ -0,0 +1,29 @@
Enforcing the GNU GPL - Copyright helps Copyleft
More and more vendors of various computing devices, especially network-related
appliances such as Routers, NAT-Gateways and 802.11 Access Points are using
Linux and other GPL licensed free software in their products.
While the Linux community can look at this as a big success, there is a flip
side to that coin: A large number of those vendors have no idea about the GPL
license terms, and as a result do not fulfill their obligations under the GPL.
The netfilter/iptables project has started legal proceedings against a number of
companies in violation of the GPL since December 2003. Those legal proceedings
were quite successful so far, resulting in twelve amicable agreements and one
granted preliminary injunction. The list of companies includes large
corporations such as Siemens, Asus and Belkin.
The speaker will present an overview about his recent successful enforcement of
the GNU GPL within German jurisdiction.
He will go on to speak about what exactly is necessary to fully comply with
the GPL, including his legal position on corner cases such as cryptographic
signing.
Drawing on his experience in dealing with the German legal system, he will
give some hints to software authors about what they can do in order to make
eventual later license enforcement easier.
In the end, it seems like the idea of the founding fathers of the GNU GPL
works: Guaranteeing Copyleft by using Copyright.


@ -0,0 +1,406 @@
%include "default.mgp"
%default 1 bgrad
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
%nodefault
%back "blue"
%center
%size 7
The GPL is not Public Domain
%center
%size 4
by
Harald Welte <laforge@gnumonks.org>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The GNU GPL Revisited
Contents 1/2
Introduction
What is Copyrightable?
Terminology
Common FOSS Licenses
The GNU GPL Revisited
Complete Source Code
Derivative Works
Non-Public Modifications
GPL Violations
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
Contents 2/2
Past GPL Enforcement
The Linksys case
Typical enforcement timeline
Success so far
Cases so far
Future GPL Enforcement
Thanks
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The GNU GPL Revisited
Introduction
Who is speaking to you?
an independent Free Software developer
who earns his living off Free Software since 1997
who is one of the authors of the Linux kernel firewall system called netfilter/iptables
who IS NOT A LAWYER, although this presentation is the result of dealing almost a year with lawyers on the subject of the GPL
Why is he speaking to you?
because he thinks there is too much confusion about copyright and free software licenses. Even Red Hat CEO Matt Szulik stated in an interview that Red Hat puts investments into 'public domain' :(
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The GNU GPL Revisited
Disclaimer
Legal Disclaimer
All information presented here is provided on an as-is basis
There is no warranty for correctness of legal information
The author is not a lawyer
This does not constitute legal advice
The author's experience is limited to German copyright law
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The GNU GPL Revisited
What is copyrightable?
The GNU GPL is a copyright license, and thus only covers copyrighted works
Not everything is copyrightable (German: Schoepfungshoehe)
Small bugfixes are not copyrightable (similar to typo-fixes in a book)
As soon as the programmer has a choice in the implementation, there is significant indication of a copyrightable work
Choice in algorithm, not in formal representation
Apparently, the level for copyrightable works is relatively low
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The GNU GPL Revisited
Terminology
Public Domain
concept where copyright holder abandons all rights
same legal status as works whose author died more than 70 years ago (German: Gemeinfreie Werke)
Freeware
object code, free of cost. No source code
Shareware
proprietary "Try and Buy" model for object code.
Cardware/Beerware/...
Freeware that encourages users to send payment in kind
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The GNU GPL Revisited
Terminology
Free Software
source code freely distributed
must allow redistribution, modification, non-discriminatory use
mostly defined by Free Software Foundation
Open Source
source code freely distributed
must allow redistribution, modification, non-discriminatory use
defined in the "Open Source Definition" by OSI
The rest of this document will refer to Free and Open Source Software as FOSS.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The GNU GPL Revisited
Common FOSS licenses
Original BSD License
allows redistribution, modification
even allows proprietary extensions with no source code offer
all docs, advertisement materials have to mention copyright holder
Modified BSD License
same as "Original BSD License", but no copyright statements required in docs and advertisements
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The GNU GPL Revisited
Common FOSS licenses
GPL (GNU General Public License)
allows redistribution, including modified works
obliges distributor to supply source code including all modifications
usage rights are revoked if license conditions are not met
LGPL (GNU Library General Public License)
explicitly allows linking of proprietary applications
written as special case for libraries (such as glibc)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The GNU GPL Revisited
The GNU GPL Revisited
Revisiting the GNU General Public License
Regulates distribution of copyrighted code, not usage
Allows distribution of source code and modified source code, if
The license itself is mentioned
A copy of the license accompanies every copy
Allows distribution of binaries or modified binaries, if
The license itself is mentioned
A copy of the license accompanies every copy
The complete source code is either included with the copy or made available to any 3rd party
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The GNU GPL Revisited
Complete Source Code
%size 3
"... complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable."
Our interpretation of this is:
Source Code
Makefiles
Tools for generating the firmware binary from the source
(even if they are technically no 'scripts')
General Rule:
Intent of License is to enable user to run modified versions of the program. They need to be enabled to do so.
Result: Signing binaries and only accepting signed versions without providing a signature key is not acceptable!
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The GNU GPL Revisited
Derivative Works
What is a derivative work?
Not dependent on any particular kind of technology (static/dynamic linking, dlopen, whatever)
Even while the modification can itself be a copyrightable work, the combination with GPL-licensed code is subject to GPL.
No precedent in Germany so far
As soon as code is written for a specific non-standard API (such as the iptables plugin API), there is significant indication for a derivative work
This position has been successfully enforced out-of-court with two Vendors so far (iptables modules/plugins).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The GNU GPL Revisited
Derivative Works
Position of my lawyer:
In-kernel proprietary code (binary kernel modules) is hard to claim as GPL compliant
Case-by-case analysis required, especially when drivers/filesystems are ported from other OS's.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The GNU GPL Revisited
Collected Works
%size 3
"... it is not the intent .. to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works ..."
%size 3
"... mere aggregation of another work ... with the program on a volume of a storage or distribution medium does not bring the other work under the scope of this license"
GPL allows "mere aggregation"
like a general-purpose Linux distribution (SuSE, Red Hat, ...)
GPL disallows "collective works"
legal grey area
tends to depend a lot on jurisdiction
no precedent so far
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The GNU GPL Revisited
Non-Public modifications
Non-Public modifications
A common misconception is that if you develop code within a corporation, and the code never leaves this corporation, you don't have to ship the source code.
However, at least German law would count any handing-over beyond a small circle of close colleagues as distribution.
Therefore, if you don't go for section '3a' of the GPL and include the source code together with the binary, you have to distribute the source code to any third party.
Also, as soon as you hand code between two companies, or between a company and a consultant, the code has been distributed.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
The GNU GPL Revisited
GPL Violations
When do I violate the license
when one or more of the obligations are not fulfilled
What risk do I take if I violate the license?
the GPL automatically revokes any usage right
any copyright holder can obtain a preliminary injunction banning distribution of the infringing product
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
Past GPL enforcement
Past GPL enforcement
GPL violations are nothing new, as GPL licensed software is nothing new.
However, the recent Linux hype has led to GPL licensed software being used more often
The FSF enforces the GPL against violations involving code on which it holds the copyright
silently, without public notice
in lengthy negotiations
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
The Linksys case
During 2003 the "Linksys" case drew a lot of attention
Linksys was selling 802.11 WLAN Access Points / Routers
Lots of GPL licensed software embedded in the device (including Linux, uClibc, busybox, iptables, ...)
FSF led alliance took the usual "quiet" approach
Linksys bought itself a lot of time
Some source code was released two months later
About four months later, full GPL compliance was achieved
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
The Linksys case
Some developers didn't agree with this approach
not enough publicity
violators don't lose anything by first not complying and waiting for the FSF
four months delay is too much for the short product lifecycles in the WLAN world
The netfilter/iptables project started to do their own enforcement in more cases that were coming up
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
Enforcement case timeline
In chronological order
some user sends us a note he found our code somewhere
reverse engineering of firmware images
sending the infringing organization a warning notice
wait for them to sign a statement to cease and desist
if no statement is signed
contract technical expert to do a study
apply for a preliminary injunction
if statement was signed
try to work out the details
grace period for boxes in stock possible
try to indicate that a donation would be good PR
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
Success so far
Success so far
amicable agreements with a number of companies
some of which made significant donations to charitable organizations of the free software community
preliminary injunction against Sitecom, Sitecom also lost appeals case
more settled cases (not public yet)
negotiating in more cases
public awareness
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GPL enforcement report
Cases so far
Allnet GmbH
Siemens AG
Fujitsu-Siemens Computers GmbH
Axis A.B.
Securepoint GmbH
U.S.Robotics Germany GmbH
undisclosed large vendor
Belkin Components GmbH
Asus GmbH
Gateprotect GmbH
Sitecom GmbH
TomTom B.V.
Gigabyte Technologies GmbH
D-Link GmbH
Sun Deutschland GmbH
Open-E GmbH
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
Future GPL Enforcement
GPL Enforcement
remains an important issue for Free Software
will start to happen in court
has to be made public in order to raise awareness
Problems
only the copyright holder (in most cases the author) can do it
users discovering GPL'd software need to communicate those issues to all copyright holders
The http://www.gpl-violations.org/ project was started
as a platform where users can report alleged violations
to verify those violations and inform all copyright holders
to inform the public about ongoing enforcement efforts
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
Make later enforcement easy
Practical rules for proof by reverse engineering
Don't fix typos in error messages and symbol names
Leave obscure error messages like 'Rusty needs more caffeine'
Make binary contain string of copyright message, not only source
Practical rules for potential damages claims
Use revision control system
Document source of each copyrightable contribution
Name+Email address in CVS commit message
Consider something like FSFE FLA (Fiduciary License Agreement)
Make sure that employers are fine with contributions of their employees
If you find out about a violation
Don't make it public (has to be new/urgent for injunctive relief)
Contact a lawyer immediately to send a warning notice
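The reverse-engineering rules on this slide rely on distinctive strings surviving compilation into the shipped firmware. A minimal sketch of how such strings could be located, assuming the firmware image has already been unpacked into raw bytes. The function name `find_markers` and the sample marker list are hypothetical illustrations, not part of any existing enforcement tooling:

```python
def find_markers(image: bytes, markers: list[bytes]) -> dict[bytes, list[int]]:
    """Return every byte offset at which each marker string occurs in image."""
    hits: dict[bytes, list[int]] = {}
    for marker in markers:
        offsets = []
        start = image.find(marker)
        while start != -1:
            offsets.append(start)
            # Continue searching one byte past the previous hit.
            start = image.find(marker, start + 1)
        if offsets:
            hits[marker] = offsets
    return hits


# Hypothetical markers: an unusual error message and a copyright notice.
MARKERS = [b"Rusty needs more caffeine", b"(C) 2000-2004 Example Author"]

if __name__ == "__main__":
    # Stand-in for a real firmware image read with open(path, "rb").read().
    sample = b"\x00\x01junk" + MARKERS[0] + b"\x00more junk" + MARKERS[0]
    for marker, offsets in find_markers(sample, MARKERS).items():
        print(marker.decode(), offsets)
```

A typo that was "fixed" in the source, or a copyright notice that exists only in source comments, would not show up in such a scan, which is why the rules above ask for unusual strings to be left in place and for the copyright message to be compiled into the binary.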
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%page
GNU GPL - Copyright helps Copyleft
Thanks
Thanks to
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
for implementing (one of?) the world's best TCP/IP stacks
Paul 'Rusty' Russell
for starting the netfilter/iptables project
for trusting me to maintain it today
Astaro AG
for sponsoring parts of my netfilter work
Free Software Foundation
for the GNU Project
for the GNU General Public License
%size 3
The slides of this presentation are available at http://www.gnumonks.org/
Further Reading
%size 3
The netfilter homepage http://www.netfilter.org/
%size 3
The http://www.gpl-violations.org/ project


@ -0,0 +1,280 @@
<?xml version='1.0' encoding='ISO-8859-1'?>
<!DOCTYPE article PUBLIC '-//OASIS//DTD DocBook XML V4.3//EN' 'http://www.docbook.org/xml/4.3/docbookx.dtd'>
<article id="gpl-enforcement-ccc2004">
<articleinfo>
<title>Enforcing the GNU GPL - Copyright helps Copyleft</title>
<authorgroup>
<author>
<personname>
<firstname>Harald</firstname>
<surname>Welte</surname>
</personname>
<!--
<personblurb>Harald Welte</personblurb>
<affiliation>
<orgname>netfilter core team</orgname>
<address>
<email>laforge@netfilter.org</email>
</address>
</affiliation>
-->
<email>laforge@gpl-violations.org</email>
</author>
</authorgroup>
<copyright>
<year>2004</year>
<holder>Harald Welte &lt;laforge@gpl-violations.org&gt; </holder>
</copyright>
<date>Dec 01, 2004</date>
<edition>1</edition>
<orgname>netfilter core team</orgname>
<releaseinfo>
$Revision: 1.4 $
</releaseinfo>
<abstract>
<para>
More and more vendors of various computing devices, especially network-related
appliances such as routers, NAT gateways and 802.11 access points, are using
Linux and other GPL-licensed free software in their products.
</para>
<para>
While the Linux community can look at this as a big success, there is a back
side of that coin: A large number of those vendors have no idea about the GPL
license terms, and as a result do not fulfill their obligations under the GPL.
</para>
<para>
The netfilter/iptables project has been taking legal action against a number of
companies in violation of the GPL since December 2003. Those legal proceedings
have been quite successful so far, resulting in twelve amicable agreements and one
granted preliminary injunction. The list of companies includes large
corporations such as Siemens, Asus and Belkin.
</para>
<para>
This paper and the corresponding presentation will give an overview of the
author's recent successful enforcement of the GNU GPL within German
jurisdiction.
</para>
<para>
The paper will go on to describe what exactly is necessary to fully comply
with the GPL, including the author's legal position on corner cases such as
cryptographic signing.
</para>
<para>
In the end, it seems like the idea of the founding fathers of the GNU GPL
works: Guaranteeing Copyleft by using Copyright.
</para>
</abstract>
</articleinfo>
<section>
<title>Legal Disclaimer</title>
<para>
The author of this paper is a software developer, not a lawyer. The content of
this paper represents his knowledge after dealing with the legal issues of
about 20 GPL violation cases.
</para>
<para>
All information in this paper is presented on an as-is basis. There is no
warranty for correctness.
</para>
<para>
The paper does not constitute legal advice, and some details may be specific to German copyright law (UrhG).
</para>
</section>
<section>
<title>What is copyrightable</title>
<para>
Since the GNU GPL is a copyright license, it can only cover copyrightable
works. The exact definition of what is copyrightable and what is not may vary
from jurisdiction to jurisdiction.
</para>
<para>
Software is considered the immaterial result of a creative act, and is treated
very much like literary works. It can therefore be helpful to look at the
analogy of a printed book.
</para>
<para>
In order for a work to be copyrightable, it has to be non-trivial (German:
Sch&ouml;pfungsh&ouml;he). Much like the work of a book's copy editor,
contributions that merely correct spelling mistakes or compiler warnings, or
even functional fixes such as fixing a signedness bug or a typecast, are
unlikely to be seen as copyrightable contributions to an existing work.
</para>
<para>
An indication of copyrightability can be the question: Did the author have a
choice (e.g. between different algorithms)? As soon as there are multiple ways
of getting a particular job done, and the author has to decide which way to
go, this is an indication of copyrightability.
</para>
</section>
<section>
<title>The GNU GPL revisited</title>
<para>
As a copyright license, the GNU GPL mainly regulates distribution of a
copyrighted work, not usage. On the contrary, the GNU GPL does not allow an
author to impose any additional restrictions like <quote>must not be used for
military purpose</quote>.
</para>
<para>
As a summary, the license allows distribution of the source code (including
modifications, if any) if
<itemizedlist>
<listitem>The GPL license itself is mentioned</listitem>
<listitem>A copy of the full license text accompanies every copy</listitem>
</itemizedlist>
</para>
<para>
The GPL allows distribution of the object code (including modifications) if
<itemizedlist>
<listitem>The GPL license itself is mentioned</listitem>
<listitem>A copy of the full license text accompanies every copy</listitem>
<listitem>The <quote>complete corresponding source code</quote> or a written offer to ship it to any third party is included with every copy</listitem>
</itemizedlist>
</para>
</section>
<section>
<title>Complete Source Code</title>
<para>
The GPL contains a very specific definition of what the term <quote>complete
source code</quote> actually means in practice:
</para>
<quote><para>
... complete source code means all the source code for all modules it contains,
plus any associated interface definition files, plus the scripts used to
control compilation and installation of the executable.
</para></quote>
<para>
The author's interpretation of this definition for C programs is:
<itemizedlist>
<listitem>Source code</listitem>
<listitem>Header files</listitem>
<listitem>Makefiles</listitem>
<listitem>Tools for installation of a modified binary, even if they are not technically implemented as scripts</listitem>
</itemizedlist>
</para>
<para>
The general rule in case of any question is the intent of the license: To
enable the user to modify the source code and run modified versions.
</para>
<para>
This brings us to the conclusion that in the case of a bundle of hardware and
software, the hardware must not be implemented so that it accepts only
cryptographically signed software, unless either the original key or the
option of setting a new key in the hardware is provided.
</para>
</section>
<section>
<title>Derivative Work</title>
<para>
The question of derivative works is probably the hardest question with regard
to the GPL. According to the license text, any derivative work can only be
distributed under the GPL, too. However, the definition of a derivative work
is left to the legal framework of copyright.
</para>
<para>
The paper's author is convinced that a court would not look at the particular
technology used to integrate multiple pieces of software. It is much more a
question of how strongly the two pieces depend on each other.
</para>
<para>
If a program is written against a specific non-standard API, this can be
considered an indication of a derivative work. If a program is written
against standard APIs, and the GPL-licensed parts that provide those APIs can
be easily exchanged with other [existing] implementations, then this can be
considered an indication that it is not a derivative work.
</para>
<para>
Unfortunately there is no precedent on this issue, so it will be up to the
first court decisions on derivative works to settle the question.
</para>
</section>
<section>
<title>Collective Works</title>
<para>
<quote>... it is not the intent ... to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works ...</quote>
</para>
<para>
<quote>... mere aggregation of another work ... with the program on a volume of a storage or distribution medium does not bring the other work under the scope of this license</quote>
</para>
<para>
So the GPL allows <quote>mere aggregation</quote>, which is what GNU/Linux
distributors like Red Hat or SuSE do, for example, when they ship GPL-licensed
programs together with a proprietary Macromedia Flash player on a single CD or
DVD.
</para>
<para>
Further research is required to determine what exactly would be a collective
work, and how far this is backed by copyright law.
</para>
</section>
<section>
<title>Non-Public Modifications</title>
<para>
Since the GPL regulates distribution and not use, any modifications that are
not distributed in any form do not require offering the source code.
</para>
<para>
Special attention has to be paid to the question of when distribution happens
in the legal sense.
</para>
<para>
Undoubtedly, as soon as you distribute modifications to a third party, such as
a contractor or another company, you are bound by the GPL to include either the
full source code or a written offer. Please note that if you do not include
the source code itself, the written offer must be valid for any third party!
</para>
<para>
Interestingly, at least in German copyright law, distribution can also happen
within an organization. Apparently, as soon as a copy is distributed to a
group larger than a small number of close colleagues whom you know personally,
distribution happens - and thus the obligations of the GPL apply.
</para>
</section>
<section>
<title>GPL Violations</title>
<para>
The GPL is violated as soon as one or more of the obligations are not fulfilled.</para>
<para>
In this case, the GPL automatically terminates all rights, even the right to
use the original unmodified code. So not only is the distribution infringing;
mere use is no longer permitted either.
</para>
<para>
This very strong provision is quite common in copyright licenses, especially in
the world of proprietary software.
</para>
</section>
<section>
<title>Past GPL Enforcement</title>
</section>
<section>
<title>The Linksys Case</title>
</section>
<section>
<title>Enforcement Case Timeline</title>
</section>
<section>
<title>Success so far</title>
</section>
<section>
<title>Future GPL Enforcement</title>
</section>
</article>

View File

@ -0,0 +1,4 @@
Linux is used more and more, especially in the embedded market. Unfortunately,
a number of vendors do not comply with the GNU GPL. The author has enforced
the GPL numerous times in and out of court, and will talk about his experience.
