import of old now defunct presentation slides svn repo
commit
fca59bea77
|
@ -0,0 +1,336 @@
|
|||
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
|
||||
<HTML>
|
||||
<HEAD>
|
||||
<META NAME="GENERATOR" CONTENT="SGML-Tools 1.0.9">
|
||||
<TITLE>The netfilter framework in Linux 2.4</TITLE>
|
||||
|
||||
|
||||
</HEAD>
|
||||
<BODY>
|
||||
<H1>The netfilter framework in Linux 2.4</H1>
|
||||
|
||||
<H2>Harald Welte <CODE>laforge@gnumonks.org</CODE></H2>$Date: 2004-10-10 15:04:54 +0200 (Sun, 10 Oct 2004) $
|
||||
<P><HR>
|
||||
<EM>This is the paper on which my talk about netfilter at Linux-Kongress 2000, CCC Congress 2000 (and probably some more occasions where I give this talk) is based. It describes the netfilter infrastructure, as well as the systems for packet filtering, NAT and packet mangling built on top of it.</EM>
|
||||
<HR>
|
||||
<H2><A NAME="s1">1. PART I - Netfilter basics / concepts</A></H2>
|
||||
|
||||
<H2>1.1 What is netfilter?</H2>
|
||||
|
||||
<P>Netfilter is definitely more than any of the firewall subsystems in past Linux kernels. Netfilter provides an abstract, generalized framework, of which the packet filtering subsystem is only one particular incarnation. So don't expect a talk about "how to set up a firewall or a masquerading gateway in 2.4". This would only cover a part of netfilter.
|
||||
<P>The netfilter framework consists of three parts:
|
||||
<P>
|
||||
<P>
|
||||
<OL>
|
||||
<LI>Each protocol defines a set of 'hooks' (IPv4 defines 5), which are well-defined points in a packet's traversal of that protocol stack. At each of these points, the protocol stack will call the netfilter framework with the packet and the hook number.
|
||||
</LI>
|
||||
<LI>Parts of the kernel can register to listen to the different hooks for each protocol. So when a packet is passed to the netfilter framework, it checks to see if anyone has registered for that protocol and hook; if so, they get a chance to examine (and possibly alter) the packet, discard it, allow it to pass or ask netfilter to queue the packet for userspace.
|
||||
</LI>
|
||||
<LI>Packets that have been queued are collected for sending to userspace; these packets are handled asynchronously. A userspace process can examine the packet, alter it, and reinject it into the kernel at the same hook where it left.</LI>
|
||||
</OL>
|
||||
<P>
|
||||
<P>All the packet filtering / NAT / ... stuff is based on this framework. There is no more dirty packet altering code spread all over the network stack.
|
||||
<P>
|
||||
<P>The netfilter framework currently has been implemented for IPv4, IPv6 and DECnet.
|
||||
<P>
|
||||
<H2>1.2 Why did we need netfilter?</H2>
|
||||
|
||||
<P>This chapter could be called 'What is wrong with ipchains?', too. So why did we need this change? (I only give a few examples here)
|
||||
<P>
|
||||
<UL>
|
||||
<LI>No infrastructure for passing packets to userspace, so all code which does some packet fiddling must be done as kernel code. Kernel programming is hard, must be done in C, and is dangerous.
|
||||
</LI>
|
||||
<LI>Transparent proxying is extremely difficult
|
||||
We have to look up _every_ packet to see if there's a socket bound to that address. There is no clean interface: 34 #ifdefs in 11 different files of the network stack.
|
||||
</LI>
|
||||
<LI>Creating packet filter rules independent of the interface address is impossible.
|
||||
We must know the local interface address to distinguish locally-generated or locally-terminated packets from through packets. The forward chain only has information on the outgoing interface, so we must try to figure out where the packet came from.
|
||||
</LI>
|
||||
<LI>Masquerading and packet filtering are implemented as one part
|
||||
This makes the firewalling code way too complex.
|
||||
</LI>
|
||||
<LI>Ipchains code is neither modular nor extensible (e.g. for MAC address filtering)</LI>
|
||||
</UL>
|
||||
<P>
|
||||
<H2>1.3 The authors of netfilter</H2>
|
||||
|
||||
<P>The concept of the netfilter framework and most of its implementation were done by Rusty Russell. He is co-author of ipchains and is the current Linux Kernel IP firewall maintainer. Rusty got paid for one year by Watchguard (a firewall company) to do nothing, so he had enough time to do it :)
|
||||
<P>
|
||||
<P>The official netfilter core team consists of Rusty Russell, Marc Boucher, James Morris and Harald Welte. Of course there are various other hackers who have contributed some stuff (for more information see
|
||||
<A HREF="http://netfilter.samba.org/scoreboard.html">http://netfilter.samba.org/scoreboard.html</A>).
|
||||
<P>
|
||||
<H2>1.4 Netfilter architecture in IPv4</H2>
|
||||
|
||||
<P>A Packet Traversing the Netfilter System:
|
||||
<BLOCKQUOTE><CODE>
|
||||
<PRE>
|
||||
|
||||
   --->[1]--->[ROUTE]--->[3]--->[4]--->
                 |            ^
                 |            |
                 |         [ROUTE]
                 v            |
                [2]          [5]
                 |            ^
                 |            |
                 v            |
|
||||
</PRE>
|
||||
</CODE></BLOCKQUOTE>
|
||||
<P>
|
||||
<P>
|
||||
<P>Packets come in from the left. After verification of the IP checksum, the packets hit the NF_IP_PRE_ROUTING [1] hook.
|
||||
<P>Next they enter the routing code, which decides if the packets are local or have to be passed to another interface.
|
||||
<P>If the packets are considered to be local, they traverse the NF_IP_LOCAL_IN [2] hook and get passed to the process (if any) afterwards.
|
||||
<P>If the packets are routed to another interface, they pass the NF_IP_FORWARD [3] hook.
|
||||
<P>The packets pass a final netfilter hook, NF_IP_POST_ROUTING [4], before they get transmitted on the target interface.
|
||||
<P>The NF_IP_LOCAL_OUT [5] hook is called for locally generated packets. In the diagram, routing appears after this hook; in fact, the routing code is called first (to figure out the source IP address and some IP options), and called again if the packet is altered.
|
||||
<P>Locally generated packets hit NF_IP_POST_ROUTING [4], too.
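<P>The three paths described above can be written down as a small lookup. This is only a sketch in Python, not kernel code: the helper name <CODE>hooks_traversed</CODE> and its parameters are made up for illustration, and it ignores special cases such as loopback traffic.

```python
# Hook names and numbers follow the diagram in the text.
HOOKS = {
    1: "NF_IP_PRE_ROUTING",
    2: "NF_IP_LOCAL_IN",
    3: "NF_IP_FORWARD",
    4: "NF_IP_POST_ROUTING",
    5: "NF_IP_LOCAL_OUT",
}


def hooks_traversed(locally_generated, destined_for_local_box=False):
    """Return the hook names a packet passes, in order (hypothetical helper)."""
    if locally_generated:
        # Packets created by a local process: [5], then [4].
        return [HOOKS[5], HOOKS[4]]
    if destined_for_local_box:
        # Incoming packets that routing considers local: [1], then [2].
        return [HOOKS[1], HOOKS[2]]
    # Incoming packets routed to another interface: [1], [3], [4].
    return [HOOKS[1], HOOKS[3], HOOKS[4]]


if __name__ == "__main__":
    for label, args in [("forwarded", (False, False)),
                        ("local in", (False, True)),
                        ("local out", (True, False))]:
        print(label, "->", " ".join(hooks_traversed(*args)))
```

<P>Note that a forwarded packet never touches NF_IP_LOCAL_IN or NF_IP_LOCAL_OUT.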
|
||||
<P>
|
||||
<H2>1.5 Netfilter base</H2>
|
||||
|
||||
<P>Kernel modules can register a callback function for each one of these hooks. This callback function is called for each packet traversing the hook. The module is free to alter the packet. It has to return one of these constants to netfilter:
|
||||
<P>
|
||||
<UL>
|
||||
<LI>NF_ACCEPT continue traversal as normal</LI>
|
||||
<LI>NF_DROP drop the packet; do not continue traversal</LI>
|
||||
<LI>NF_STOLEN I've taken over the packet; do not continue traversal</LI>
|
||||
<LI>NF_QUEUE queue the packet (usually for userspace handling)</LI>
|
||||
<LI>NF_REPEAT call this hook again</LI>
|
||||
</UL>
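<P>How these return values steer the traversal of one hook can be sketched as follows. This is a simplified Python model, not the kernel implementation: <CODE>traverse_hook</CODE> is a made-up name, the numeric values of the constants are only illustrative, and the repeat limit is a safeguard of this sketch.

```python
# Verdict constants named as in the list above; the numeric values are
# illustrative and only need to be distinct for this sketch.
NF_DROP, NF_ACCEPT, NF_STOLEN, NF_QUEUE, NF_REPEAT = range(5)


def traverse_hook(callbacks, packet, max_repeats=16):
    """Call the registered callbacks in order and act on each verdict.

    Returns NF_ACCEPT if every callback accepted the packet, otherwise the
    first verdict that ended the traversal (NF_DROP, NF_STOLEN or NF_QUEUE).
    """
    for callback in callbacks:
        verdict = callback(packet)
        repeats = 0
        while verdict == NF_REPEAT:
            # "call this hook again": re-run the same callback.
            repeats += 1
            if repeats > max_repeats:
                # Assumption of this sketch: give up instead of looping forever.
                return NF_DROP
            verdict = callback(packet)
        if verdict != NF_ACCEPT:
            # Dropped, stolen or queued: later callbacks never see the packet.
            return verdict
    return NF_ACCEPT


if __name__ == "__main__":
    def accept_all(packet):
        return NF_ACCEPT

    def drop_telnet(packet):
        return NF_DROP if packet.get("dport") == 23 else NF_ACCEPT

    print(traverse_hook([accept_all, drop_telnet], {"dport": 23}) == NF_DROP)
```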
|
||||
<P>
|
||||
<P>
|
||||
<H2>1.6 Packet selection: IP tables</H2>
|
||||
|
||||
<P>A packet selection system called IP tables has been built. It is a direct descendant of ipchains, with extensibility.
|
||||
<P>Kernel modules can create a new table utilizing the IP tables core, and ask for a packet to traverse a given table.
|
||||
<P>IP tables are used for packet filtering (the 'filter' table), Network Address Translation (the 'nat' table) and general packet mangling (the 'mangle' table).
|
||||
<P>The three big parts of Linux 2.4 packet handling are built using netfilter hooks and IP tables. They are separate modules and are independent of each other. They all plug nicely into the infrastructure provided by netfilter.
|
||||
<P>
|
||||
<OL>
|
||||
<LI>Packet filtering
|
||||
<P>This table 'filter' should never alter packets, only filter them.
|
||||
One of the advantages of iptables over ipchains is that it is small and fast, and it hooks into netfilter at the NF_IP_LOCAL_IN, NF_IP_FORWARD and NF_IP_LOCAL_OUT hooks.
|
||||
<P>Therefore, for each packet there is one, and only one, place to filter it. This is one big change compared to ipchains, where a forwarded packet used to traverse three chains.
|
||||
<P>
|
||||
</LI>
|
||||
<LI> NAT
|
||||
<P>The nat table listens at three netfilter hooks: NF_IP_PRE_ROUTING and NF_IP_POST_ROUTING to do source and destination NAT for routed packets. For destination altering of local packets, the NF_IP_LOCAL_OUT hook is used.
|
||||
<P>This table is different from the 'filter' table, in that only the first packet of a new connection will traverse the table. The result of this traversal is then applied to all future packets of the same connection.
|
||||
<P>The NAT table is used for source NAT, destination NAT, masquerading (which is a special case of source NAT) and transparent proxying (which is a special case of destination NAT).
|
||||
<P>
|
||||
</LI>
|
||||
<LI> Packet mangling
|
||||
<P>The 'mangle' table registers at the NF_IP_PRE_ROUTING and NF_IP_LOCAL_OUT hooks.
|
||||
<P>Using the mangle table you can modify the packet itself or some of the out-of-band data attached to the packet. Currently the alteration of the TOS bits as well as setting the nfmark field inside the skb is implemented on top of the mangle table.
|
||||
</LI>
|
||||
</OL>
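<P>The statement that only the first packet of a connection traverses the 'nat' table can be illustrated with a small model. This is a Python sketch under simplifying assumptions, not the kernel's data structures: the class <CODE>NatTableSketch</CODE>, the packet dictionaries and the connection key are all invented for illustration, and only destination rewriting is shown.

```python
class NatTableSketch:
    """Toy model: look up the rules once per connection, then reuse the result."""

    def __init__(self, rules):
        # rules: list of (match_function, (new_address, new_port)); first match wins.
        self.rules = rules
        self.connections = {}       # connection key -> chosen destination or None
        self.table_traversals = 0   # how often the rule list was actually walked

    def _traverse_table(self, packet):
        self.table_traversals += 1
        for matches, new_destination in self.rules:
            if matches(packet):
                return new_destination
        return None  # no rule matched: leave the destination alone

    def translate(self, packet):
        key = (packet["proto"], packet["src"], packet["sport"],
               packet["dst"], packet["dport"])
        if key not in self.connections:
            # First packet of a new connection: walk the table and remember.
            self.connections[key] = self._traverse_table(packet)
        new_destination = self.connections[key]
        if new_destination is None:
            return dict(packet)
        address, port = new_destination
        # Return a rewritten copy; the caller's packet is left untouched.
        return dict(packet, dst=address, dport=port)


if __name__ == "__main__":
    nat = NatTableSketch([
        (lambda p: p["proto"] == "tcp" and p["dport"] == 80, ("1.2.3.4", 8080)),
    ])
    pkt = {"proto": "tcp", "src": "10.0.0.5", "sport": 40000,
           "dst": "192.0.2.1", "dport": 80}
    print(nat.translate(pkt), nat.table_traversals)
    print(nat.translate(pkt), nat.table_traversals)
```

<P>The second call returns the same translation without walking the rule list again, which is the property that distinguishes this table from 'filter'.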
|
||||
<P>
|
||||
<H2>1.7 Connection tracking</H2>
|
||||
|
||||
<P>Connection tracking is fundamental to NAT, but has been implemented as a separate module. This allows an extension to the packet filtering code (the 'state' match) to simply use connection tracking for "stateful firewalling".
|
||||
<P>
|
||||
<P>
|
||||
<H2><A NAME="s2">2. PART II - packet filtering using iptables and netfilter</A></H2>
|
||||
|
||||
<H2>2.1 Overview</H2>
|
||||
|
||||
<P>I expect you are familiar with TCP/IP, routing, firewall concepts and packet filtering in general.
|
||||
<P>As already explained in Part I, the filter table listens on three hooks, thus providing us three chains for packet filtering.
|
||||
<P>All packets coming from the network and destined for the local box traverse the INPUT chain.
|
||||
<P>All packets which are forwarded (routed) by us traverse the FORWARD chain (and only the FORWARD chain). Please note again that this differs from the previous Linux firewall implementations!
|
||||
<P>Finally, the packets originating from the local box traverse the OUTPUT chain.
|
||||
<P>
|
||||
<H2>2.2 Inserting rules into chains</H2>
|
||||
|
||||
<P>To insert/delete/modify any rules in Linux 2.4 IP tables we have a neat and powerful command line tool, called 'iptables'. I don't want to get too deep into all its features and extensibility. Here are some of its major features:
|
||||
<UL>
|
||||
<LI>It handles all different kinds of IP tables. Currently the filter, nat and mangle tables, but also all future table modules
|
||||
</LI>
|
||||
<LI>It supports plugins for new matches and new targets. Thus, nobody ever needs to patch anything to provide a netfilter extension. You have a kernel module doing the real work and an iptables plugin (dynamic library) to add the necessary configuration options.
|
||||
</LI>
|
||||
<LI>It comes in two incarnations: iptables (IPv4) and ip6tables (IPv6). Both of them are based on the same library and mostly the same code.</LI>
|
||||
</UL>
|
||||
<P>
|
||||
<H3>Basic iptables commands</H3>
|
||||
|
||||
<P>An iptables command usually consists of five parts:
|
||||
<OL>
|
||||
<LI>which table we want to work with</LI>
|
||||
<LI>which chain in this table we want it to use</LI>
|
||||
<LI>an operation (insert, add, delete, modify)</LI>
|
||||
<LI>a target for this particular rule</LI>
|
||||
<LI>a description of which packets we want to match this rule</LI>
|
||||
</OL>
|
||||
<P>The basic syntax is
|
||||
<PRE>
|
||||
iptables -t table -Operation chain -j target match(es)
|
||||
</PRE>
|
||||
<P>To add a rule allowing all traffic from anywhere to our local smtp port:
|
||||
<PRE>
|
||||
iptables -t filter -A INPUT -j ACCEPT -p tcp --dport smtp
|
||||
</PRE>
|
||||
<P>Of course there are various other commands like flush chain, set the default policy of a chain, add a user-defined chain, ...
|
||||
<P>Basic Operations:
|
||||
<PRE>
|
||||
-A append rule
|
||||
-I insert rule
|
||||
-D delete rule
|
||||
-R replace rule
|
||||
-L list rules
|
||||
</PRE>
|
||||
<P>Basic Targets, common to all chains:
|
||||
<PRE>
|
||||
ACCEPT accept the packet
|
||||
DROP drop the packet
|
||||
QUEUE queue packet to userspace
|
||||
RETURN return to the previous (calling) chain
|
||||
foobar user defined chain
|
||||
</PRE>
|
||||
<P>
|
||||
<P>Basic matches, common to all chains:
|
||||
<PRE>
|
||||
-p protocol (tcp/icmp/udp/...)
|
||||
-s source address (ip address/masklen)
|
||||
-d destination address (ip address/masklen)
|
||||
-i incoming interface
|
||||
-o outgoing interface
|
||||
</PRE>
|
||||
<P>Apart from these basic operations, matches and targets there are various extensions, which I'll describe in the appropriate chapters.
|
||||
<P>
|
||||
<H2>2.3 iptables match extensions for filtering</H2>
|
||||
|
||||
<P>There are various extensions which are useful for packet filtering. Describing them all in detail would take way too much time, so the following is just meant to give you an impression of the power :)
|
||||
<P>At first there are some match extensions, which give us more power to describe which packets to match:
|
||||
<UL>
|
||||
<LI>TCP match extensions to match source port, destination port, arbitrary combinations of TCP flags, tcp options.</LI>
|
||||
<LI>UDP match extensions to match source port and destination port</LI>
|
||||
<LI>ICMP match extension to match icmp type</LI>
|
||||
<LI>MAC match extension to match incoming mac (ethernet) address</LI>
|
||||
<LI>MARK match extension to match the nfmark </LI>
|
||||
<LI>OWNER match extension (for locally generated packets only) to match user id, group id, process id, session id</LI>
|
||||
<LI>LIMIT match extension to match only a certain limit of packets per time frame. Very useful to prevent forwarding of any kind of flooding.</LI>
|
||||
<LI>STATE match extension to match packets of a certain state (decided by the connection tracking subsystem). Possible states are
|
||||
<UL>
|
||||
<LI>INVALID (not matched against a connection), </LI>
|
||||
<LI>ESTABLISHED (packet belongs to an already established connection), </LI>
|
||||
<LI>NEW (packet would establish a new connection) and </LI>
|
||||
<LI>RELATED (packet is in some way related to an already established connection. For example an ICMP error message or a ftp data connection)</LI>
|
||||
</UL>
|
||||
</LI>
|
||||
<LI>TOS match extension to match the value of the TOS IP header field</LI>
|
||||
<LI>TTL match extension to match the value of the TTL IP header field</LI>
|
||||
</UL>
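<P>To give an idea of what "a certain limit of packets per time frame" means for the LIMIT match, here is a token bucket in Python. A token bucket is one common way to express such a limit; I am not claiming the kernel module is written this way. The class name <CODE>LimitMatchSketch</CODE> and its parameters <CODE>rate_per_second</CODE> and <CODE>burst</CODE> are assumptions of this sketch.

```python
class LimitMatchSketch:
    """Token bucket: match at most `burst` packets at once, refilled at `rate`."""

    def __init__(self, rate_per_second, burst):
        self.rate = float(rate_per_second)
        self.burst = float(burst)
        self.tokens = float(burst)   # start with a full bucket
        self.last = None             # timestamp of the previous packet

    def matches(self, now):
        """Return True if a packet arriving at time `now` (seconds) matches."""
        if self.last is not None:
            # Refill according to the elapsed time, capped at the burst size.
            elapsed = now - self.last
            self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # over the limit: the rule does not match this packet


if __name__ == "__main__":
    limit = LimitMatchSketch(rate_per_second=1, burst=2)
    for t in (0.0, 0.0, 0.0, 1.0, 1.5, 2.0):
        print(t, limit.matches(t))
```

<P>A rule using such a match would typically jump to LOG or ACCEPT only while the match returns true; packets over the limit fall through to the next rule.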
|
||||
<P>
|
||||
<P>
|
||||
<H2>2.4 iptables target extensions for filtering</H2>
|
||||
|
||||
<P>
|
||||
<UL>
|
||||
<LI>LOG log matched packets via syslog()</LI>
|
||||
<LI>ULOG log matched packets via userspace logging daemon
|
||||
(supports interpreter and output plugins for flexible logging)</LI>
|
||||
<LI>REJECT not only drop the packet, but also send some kind of error
|
||||
message to the sender (which message is configurable)</LI>
|
||||
<LI>MIRROR retransmit the packet after exchanging source and destination
|
||||
IP address </LI>
|
||||
</UL>
|
||||
<P>
|
||||
<H2><A NAME="s3">3. PART III - NAT using iptables and netfilter</A></H2>
|
||||
|
||||
<P>With regard to NAT (Network Address Translation), previous Linux kernels only supported one special case called "Masquerading".
|
||||
<P>Netfilter now enables Linux to do any kind of NAT.
|
||||
<P>NAT is divided into `source NAT' and `destination NAT'.
|
||||
<P>Source NAT alters the source address of a packet while passing the NF_IP_POST_ROUTING hook. Masquerading is a special application of SNAT.
|
||||
<P>Destination NAT alters the destination address of a packet while passing the NF_IP_PRE_ROUTING hook (incoming packets) or the NF_IP_LOCAL_OUT hook (locally generated packets). Port forwarding and transparent proxying are forms of DNAT.
|
||||
<P>
|
||||
<H2>3.1 iptables target extensions for NAT</H2>
|
||||
|
||||
<P>
|
||||
<P>
|
||||
<DL>
|
||||
<P>
|
||||
<DT><B>SNAT</B><DD><P>Change the source address to something different
|
||||
<P>Example:
|
||||
<PRE>
|
||||
iptables -t nat -A POSTROUTING -j SNAT --to-source 1.2.3.4
|
||||
</PRE>
|
||||
<P>
|
||||
<DT><B>MASQUERADE</B><DD><P>SNAT for dialup connections with dynamic ip address
|
||||
<P>Does almost the same as SNAT, but if the link goes down, all connection tracking information is dropped. The connections are lost anyway, because we get a different IP address at reconnect.
|
||||
<P>Example:
|
||||
<PRE>
|
||||
iptables -t nat -A POSTROUTING -j MASQUERADE -o ppp0
|
||||
</PRE>
|
||||
<P>
|
||||
<DT><B>DNAT</B><DD><P>Change the destination address to something different
|
||||
<P>This is done at the PREROUTING chain, just as the packet comes in. Therefore, anything else on the Linux box itself (routing, packet filtering) will see the packet going to its real (new) destination.
|
||||
<P>Example:
|
||||
<PRE>
|
||||
iptables -t nat -A PREROUTING -j DNAT --to-destination 1.2.3.4:8080 -p tcp --dport 80 -i eth1
|
||||
</PRE>
|
||||
<P>
|
||||
<DT><B>REDIRECT</B><DD><P>Redirect packets to local destination
|
||||
<P>Exactly the same as doing DNAT to the address of the incoming interface
|
||||
<P>Example:
|
||||
<PRE>
|
||||
iptables -t nat -A PREROUTING -j REDIRECT --to-port 3128 -i eth1 -p tcp --dport 80
|
||||
</PRE>
|
||||
<P>
|
||||
</DL>
|
||||
<P>
|
||||
<H2><A NAME="s4">4. PART IV - Packet mangling using iptables and netfilter</A></H2>
|
||||
|
||||
<P>The `mangle' table enables us to alter the packet itself or some data accompanying the packet.
|
||||
<P>
|
||||
<H2>4.1 iptables target extensions for packet mangling</H2>
|
||||
|
||||
<P>
|
||||
<DL>
|
||||
<P>
|
||||
<DT><B>MARK</B><DD><P>set the value of the nfmark field
|
||||
<P>We can change the value of the nfmark field. The nfmark is just a user defined mark (anything within the range of an unsigned long) of the packet. The mark value is used to do policy routing, tell ipqmpd (the userspace queue multiplex daemon) which process to queue the packet to, etc.
|
||||
<P>Example:
|
||||
<BLOCKQUOTE><CODE>
|
||||
<PRE>
|
||||
iptables -t mangle -A PREROUTING -j MARK --set-mark 0x0a -p tcp
|
||||
</PRE>
|
||||
</CODE></BLOCKQUOTE>
|
||||
<P>
|
||||
<DT><B>TOS</B><DD><P>set the value of the TOS bits inside the IP header
|
||||
<P>We can change the value of the type of service bits inside the IP header. This is useful if you are using TOS based packet scheduling / routing.
|
||||
<P>Example:
|
||||
<BLOCKQUOTE><CODE>
|
||||
<PRE>
|
||||
iptables -t mangle -A PREROUTING -j TOS --set-tos 0x10 -p tcp --dport ssh
|
||||
</PRE>
|
||||
</CODE></BLOCKQUOTE>
|
||||
<P>
|
||||
<DT><B>TTL</B><DD><P>alter the value of the TTL field inside the IP header
|
||||
<P>Enables the user to set, increase or decrease the TTL field.
|
||||
<P>Example:
|
||||
<BLOCKQUOTE><CODE>
|
||||
<PRE>
|
||||
iptables -t mangle -A PREROUTING -j TTL --ttl-dec 2 -i eth0
|
||||
</PRE>
|
||||
</CODE></BLOCKQUOTE>
|
||||
</DL>
|
||||
<P>
|
||||
<H2><A NAME="s5">5. Queueing packets to userspace</A></H2>
|
||||
|
||||
<P>As I already mentioned, at any time in any netfilter chain, the packet can be queued to userspace. The actual queuing is done by a kernel module (ip_queue.o).
|
||||
<P>The packets (including metadata like nfmark and mac address) are sent to a userspace process using netlink sockets. This process can do whatever it wants to do with the packet.
|
||||
<P>After the userspace process is done with its work on the packet, it can either reinject the packet into the kernel, or set a verdict (DROP, ...) telling the kernel what to do with the packet.
|
||||
<P>This is one key technology of netfilter: it enables complicated packet handling to be done by userspace processes, thus keeping additional complexity out of kernel space.
|
||||
<P>
|
||||
<P>Userspace packet handling processes can be easily developed using a netfilter-provided library called 'libipq'.
|
||||
<P>
|
||||
<P>Currently only one userspace process is supported, but the first beta release of a userspace ip queueing multiplex daemon (ipqmpd) is available. ipqmpd provides a compatibility library (libipqmpd) which makes upgrading from the raw ipqueue interface to the new ipqmpd as easy as relinking to another library.
|
||||
<P>
|
||||
<H2><A NAME="s6">6. PART V Credits</A></H2>
|
||||
|
||||
<P>Credits to all the netfilter hackers, especially the core team.
|
||||
<P>Namely: <B>Paul 'Rusty' Russell</B>, <B>Marc Boucher</B> and <B>James Morris</B>.
|
||||
<P>Additional special thanks to Rusty for his `netfilter-hacking-HOWTO', `packet-filtering-HOWTO' and `NAT-HOWTO' which I heavily used as a basis for this presentation.
|
||||
<P>
|
||||
</BODY>
|
||||
</HTML>
|
|
@ -0,0 +1,18 @@
|
|||
Tutorial: Firewalling using netfilter/iptables in Linux 2.4
|
||||
|
||||
One of the major advantages of the new Linux 2.4.x kernel series is the
|
||||
new packet filtering / NAT / packet mangling subsystem, called iptables.
|
||||
Iptables is the successor of ipchains and ipfwadm in 2.2 and 2.0 kernels.
|
||||
Major new features are stateful firewalling, extensibility and better NAT
|
||||
(Network Address Translation) support.
|
||||
|
||||
Topics:
|
||||
|
||||
- concepts behind new netfilter/iptables infrastructure
|
||||
- usage of iptables
|
||||
- case example of a real-world firewall
|
||||
- current (experimental) netfilter work - or "what is patch-o-matic"
|
||||
- writing netfilter/iptables extension modules
|
||||
|
||||
The tutorial will be presented by two of the netfilter core team members,
|
||||
Rusty Russell <rusty@rustcorp.com.au> and Harald Welte <laforge@gnumonks.org>
|
|
@ -0,0 +1,9 @@
|
|||
Technical Presentation: A tour through the Linux 2.4 network stack
|
||||
|
||||
Linux based systems are known for performance and reliability in the area of
|
||||
networking. This presentation will give a tour through the Linux 2.4 kernel
|
||||
network stack, its structure and implementation. Some of the topics covered
|
||||
are: Network hardware drivers, core network functions, IPv4 protocol stack,
|
||||
sockets implementation, zero-copy TCP.
|
||||
|
||||
The Author of this Presentation is Harald Welte <laforge@gnumonks.org>
|
|
@ -0,0 +1,116 @@
|
|||
<!doctype linuxdoc system>
|
||||
|
||||
<article>
|
||||
|
||||
<title>The journey of a packet through the linux 2.4 network stack</title>
|
||||
<author>Harald Welte <tt>laforge@gnumonks.org</tt>
|
||||
<date>$Revision: 537 $, $Date: 2004-10-10 15:04:54 +0200 (Sun, 10 Oct 2004) $</date>
|
||||
|
||||
<!-- $Id: packet-journey-2.4.sgml 537 2004-10-10 13:04:54Z laforge $ -->
|
||||
|
||||
<abstract>
|
||||
This document describes the journey of a network packet inside the linux kernel 2.4.x. This has changed drastically since 2.2 because the globally serialized bottom half was abandoned in favor of the new softirq system.
|
||||
|
||||
<toc>
|
||||
|
||||
<sect>Preface
|
||||
<p>
|
||||
I have to apologize for my ignorance, but this document has a strong focus on the "default case": x86 architecture and IP packets which get forwarded.
|
||||
|
||||
<p>
|
||||
I am definitely no kernel guru and the information provided by this document may be wrong. So don't expect too much; I'll always appreciate your comments and bugfixes.
|
||||
|
||||
<sect>Receiving the packet
|
||||
|
||||
<sect1>The receive interrupt
|
||||
<p>
|
||||
If the network card receives an ethernet frame which matches the local MAC address or is a linklayer broadcast, it issues an interrupt.
|
||||
The network driver for this particular card handles the interrupt, fetches the packet data via DMA / PIO / whatever into RAM. It then allocates a skb and calls a function of the protocol independent device support routines: <tt>net/core/dev.c:netif_rx(skb)</tt>.
|
||||
<p>
|
||||
If the driver didn't already timestamp the skb, it is timestamped now. Afterwards the skb gets enqueued in the appropriate queue for the processor handling this packet. If the queue backlog is full, the packet is dropped at this point. After enqueuing the skb, the receive softirq is marked for execution via <tt>include/linux/interrupt.h:__cpu_raise_softirq()</tt>.
|
||||
<p>
|
||||
The interrupt handler exits and all interrupts are reenabled.
|
||||
|
||||
<sect1>The network RX softirq
|
||||
<p>
|
||||
Now we encounter one of the big changes between 2.2 and 2.4: The whole network stack is no longer a bottom half, but a softirq. Softirqs have the major advantage that they may run on more than one CPU simultaneously, whereas bottom halves (bh's) were guaranteed to run on only one CPU at a time.
|
||||
<p>
|
||||
Our network receive softirq is registered in <tt>net/core/dev.c:net_dev_init()</tt> using the function <tt>kernel/softirq.c:open_softirq()</tt> provided by the softirq subsystem.
|
||||
<p>
|
||||
Further handling of our packet is done in the network receive softirq (NET_RX_SOFTIRQ) which is called from <tt>kernel/softirq.c:do_softirq()</tt>. do_softirq() itself is called from three places within the kernel:
|
||||
<enum>
|
||||
<item>from <tt>arch/i386/kernel/irq.c:do_IRQ()</tt>, which is the generic IRQ handler
|
||||
<item>from <tt>arch/i386/kernel/entry.S</tt> in case the kernel just returned from a syscall
|
||||
<item>inside the main process scheduler in <tt>kernel/sched.c:schedule()</tt>
|
||||
</enum>
|
||||
<p>
|
||||
So if execution passes one of these points, do_softirq() is called. It detects that NET_RX_SOFTIRQ is marked and calls <tt>net/core/dev.c:net_rx_action()</tt>. Here the skb is dequeued from this CPU's receive queue and afterwards handed to the appropriate packet handler. In case of IPv4 this is the IPv4 packet handler.
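<p>
The split between the receive interrupt and the receive softirq can be modelled in a few lines of Python. This is only a sketch of the control flow described above, not a transcription of the kernel code: the class <tt>CpuBacklogSketch</tt>, its methods and the dictionary used as a stand-in for an skb are all invented for illustration.

```python
from collections import deque


class CpuBacklogSketch:
    """Toy model of one CPU's receive queue and its pending-softirq flag."""

    def __init__(self, max_backlog):
        self.max_backlog = max_backlog
        self.queue = deque()
        self.softirq_pending = False
        self.dropped = 0

    def enqueue(self, skb):
        """Interrupt context: queue the packet and mark the softirq."""
        if len(self.queue) >= self.max_backlog:
            # Backlog full: the packet is dropped right here.
            self.dropped += 1
            return False
        self.queue.append(skb)
        self.softirq_pending = True
        return True

    def run_softirq(self, handlers):
        """Softirq context: dequeue packets and hand them to their handler."""
        if not self.softirq_pending:
            return 0
        handled = 0
        while self.queue:
            skb = self.queue.popleft()
            handler = handlers.get(skb["protocol"])
            if handler is not None:
                handler(skb)
                handled += 1
        self.softirq_pending = False
        return handled


if __name__ == "__main__":
    cpu0 = CpuBacklogSketch(max_backlog=2)
    for n in range(3):
        cpu0.enqueue({"protocol": "ipv4", "id": n})
    print(cpu0.run_softirq({"ipv4": print}), "handled,", cpu0.dropped, "dropped")
```

The point of the model is that the interrupt part only queues and marks, while all protocol processing happens later in the softirq part.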
|
||||
|
||||
<sect1>The IPv4 packet handler
|
||||
<p>
|
||||
The IP packet handler is registered via <tt>net/core/dev.c:dev_add_pack()</tt> called from <tt>net/ipv4/ip_output.c:ip_init()</tt>.
|
||||
<p>
|
||||
The IPv4 packet handling function is <tt>net/ipv4/ip_input.c:ip_rcv()</tt>. After some initial checks (if the packet is for this host, ...) the ip checksum is calculated. Additional checks are done on the length and IP protocol version 4.
|
||||
<p>
|
||||
Every packet failing one of the sanity checks is dropped at this point.
|
||||
<p>
|
||||
If the packet passes the tests, we determine the size of the ip packet and trim the skb in case the transport medium has appended some padding.
|
||||
<p>
|
||||
Now it is the first time one of the netfilter hooks is called.
|
||||
<p>
|
||||
Netfilter provides a generic and abstract interface to the standard routing code. This is currently used for packet filtering, mangling, NAT and queuing packets to userspace. For further reference see my conference paper 'The netfilter subsystem in Linux 2.4' or one of Rusty's unreliable guides, e.g. the netfilter-hacking-guide.
|
||||
<p>
|
||||
After successful traversal of the netfilter hook, <tt>net/ipv4/ip_input.c:ip_rcv_finish()</tt> is called.
|
||||
<p>
|
||||
Inside ip_rcv_finish(), the packet's destination is determined by calling the routing function <tt>net/ipv4/route.c:ip_route_input()</tt>. Furthermore, if our IP packet has IP options, they are processed now. Depending on the routing decision made by <tt>net/ipv4/route.c:ip_route_input_slow()</tt>, the journey of our packet continues in one of the following functions:
|
||||
|
||||
<descrip>
|
||||
<tag>net/ipv4/ip_input.c:ip_local_deliver()</tag>
|
||||
The packet's destination is local, so we have to process the layer 4 protocol and pass it to a userspace process.
|
||||
|
||||
<tag>net/ipv4/ip_forward.c:ip_forward()</tag>
|
||||
The packet's destination is not local, we have to forward it to another network
|
||||
|
||||
<tag>net/ipv4/route.c:ip_error()</tag>
|
||||
An error occurred; we are unable to find an appropriate routing table entry for this packet.
|
||||
|
||||
<tag>net/ipv4/ipmr.c:ip_mr_input()</tag>
|
||||
It is a Multicast packet and we have to do some multicast routing.
|
||||
</descrip>
|
||||
|
||||
<sect>Packet forwarding to another device
|
||||
|
||||
<p>
|
||||
If the routing decided that this packet has to be forwarded to another device, the function <tt>net/ipv4/ip_forward.c:ip_forward()</tt> is called.
|
||||
|
||||
<p>
|
||||
The first task of this function is to check the ip header's TTL. If it is <= 1 we drop the packet and return an ICMP time exceeded message to the sender.
|
||||
<p>
|
||||
We check whether the skb has enough headroom for the destination device's link layer header and expand the skb if necessary.
|
||||
<p>
|
||||
Next the TTL is decremented by one.
|
||||
<p>
|
||||
If our new packet is bigger than the MTU of the destination device and the don't fragment bit in the IP header is set, we drop the packet and send an ICMP frag needed message to the sender.
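<p>
The checks described so far can be summarized in a small decision function. This is a Python sketch of the order of the checks as given in the text, not the kernel function itself; the name <tt>forward_decision</tt>, its parameters and the returned tuples are made up for illustration.

```python
def forward_decision(ttl, packet_len, dont_fragment, mtu):
    """Return (action, icmp_message, ttl_after) for a packet to be forwarded."""
    if ttl <= 1:
        # TTL would expire on this hop: drop and tell the sender.
        return ("drop", "time exceeded", ttl)
    ttl -= 1  # the TTL is decremented by one before the packet goes on
    if packet_len > mtu and dont_fragment:
        # Too big for the outgoing device and we must not fragment.
        return ("drop", "frag needed", ttl)
    # Everything is fine: continue with the NF_IP_FORWARD hook.
    return ("forward", None, ttl)


if __name__ == "__main__":
    print(forward_decision(ttl=64, packet_len=1400, dont_fragment=True, mtu=1500))
    print(forward_decision(ttl=1, packet_len=1400, dont_fragment=False, mtu=1500))
    print(forward_decision(ttl=64, packet_len=1600, dont_fragment=True, mtu=1500))
```

A packet that is too big but does not have the don't fragment bit set is still forwarded here; fragmenting it happens later on the output path.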
|
||||
|
||||
<p>
|
||||
Finally it is time to call another one of the netfilter hooks - this time it is the NF_IP_FORWARD hook.
|
||||
|
||||
<p>
|
||||
Assuming that the netfilter hook returns an NF_ACCEPT verdict, the function <tt>net/ipv4/ip_forward.c:ip_forward_finish()</tt> is the next step in our packet's journey.
|
||||
|
||||
<p>
|
||||
ip_forward_finish() itself checks if we need to set any additional options in the IP header, and has <tt>net/ipv4/ip_options.c:ip_forward_options()</tt> do this. Afterwards it calls <tt>include/net/ip.h:ip_send()</tt>.
|
||||
|
||||
<p>
|
||||
If we need some fragmentation, <tt>net/ipv4/ip_output.c:ip_fragment()</tt> gets called, otherwise we continue in <tt>net/ipv4/ip_output.c:ip_finish_output()</tt>.
|
||||
|
||||
<p>
|
||||
ip_finish_output() in turn does nothing but call the netfilter postrouting hook NF_IP_POST_ROUTING and, on successful traversal of this hook, ip_finish_output2().
|
||||
|
||||
<p>
|
||||
ip_finish_output2() prepends the hardware (link layer) header to our skb and calls <tt>dst->hh->hh_output()</tt>, which seems to usually be <tt>net/core/dev.c:dev_queue_xmit()</tt>.
|
||||
<p>
|
||||
dev_queue_xmit() enqueues the packet for transmission by the network device.
|
||||
|
||||
</article>
|
||||
|
|
@ -0,0 +1,397 @@
|
|||
%include "cnc-style.mgp"
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
%nodefault
|
||||
%pcache 1 1 0 1
|
||||
%size 7, font "standard", fore "white", vgap 20, back "black"
|
||||
%bimage "fundo-cnc.png" 1024x768
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
Quality of Service in IP Networks
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <laforge@gnumonks.org>
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
Contents
|
||||
|
||||
Definition of QoS
|
||||
|
||||
Why QoS
|
||||
|
||||
IP Networks are not designed for QoS
|
||||
|
||||
How to do the impossible
|
||||
|
||||
What can Linux based systems do
|
||||
|
||||
Advanced Concepts (DiffServ, IntServ, RSVP, ...)
|
||||
|
||||
References / Further Reading
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
Definition of QoS
|
||||
|
||||
Provide Service Differentiation
|
||||
|
||||
Performance Assurance by
|
||||
|
||||
Bandwidth guarantees
|
||||
for streaming multimedia traffic
|
||||
prioritizing certain important applications
|
||||
|
||||
Latency guarantees
|
||||
for voice over IP
|
||||
for interactive character-oriented applications (ssh,telnet)
|
||||
|
||||
Packet-loss guarantees
|
||||
for unreliable layer-4 protocols
|
||||
to avoid retransmits
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
Why QoS
|
||||
|
||||
|
||||
Decide how and among whom available bandwidth is divided
|
||||
|
||||
Limit available bandwidth for certain users / applications
|
||||
|
||||
Guarantee bandwidth for certain users / applications
|
||||
|
||||
Divide bandwidth more equally between users / applications
|
||||
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
IP networks not designed for QoS
|
||||
|
||||
|
||||
Properties of IP-based networks:
|
||||
|
||||
offer a "best-effort" service
|
||||
|
||||
make NO guarantees about
|
||||
bandwidth
|
||||
latency
|
||||
packet loss
|
||||
|
||||
provide a non-reliable packet transport
|
||||
|
||||
Conclusion: IP networks are not suitable for QoS
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
How to do the Impossible
|
||||
|
||||
%size 4
|
||||
|
||||
As IP Networks including Hardware (Routers, ...) are widely deployed, all QoS efforts have to layer on top of the existing technology.
|
||||
|
||||
There's no real solution to control latency
|
||||
latency depends largely on routing, which may be dynamic
|
||||
|
||||
There's no real solution to control packet loss
|
||||
packet loss may occur on any intermediate router
|
||||
|
||||
But we can control bandwidth usage!
|
||||
The sender can limit bandwidth for outgoing streams
|
||||
Intermediate routers BEFORE a bottleneck can control bandwidth usage
|
||||
|
||||
%size 5
|
||||
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
What can Linux systems do?
|
||||
|
||||
Bandwidth limiting at the sender application
|
||||
not many applications support it
|
||||
server often outside our control (on the Internet, ...)
|
||||
server doesn't know what's between it and the client
|
||||
|
||||
Bandwidth control on intermediate router before bottleneck
|
||||
Ideal case because this is where packet loss would occur
|
||||
Sophisticated queue scheduling on the outgoing queue
|
||||
Variety of different queue scheduling algorithms
|
||||
|
||||
Flow throttling at the Receiver
|
||||
Worst case, because influence is limited
|
||||
Theoretically possible for TCP, no implementation yet.
|
||||
Ingress qdisc might help
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
Bandwidth limiting at server
|
||||
|
||||
Some Internet Servers support bandwidth limiting
|
||||
|
||||
ProFTPd (builtin support)
|
||||
|
||||
Apache (using contributed mod_bandwidth)
|
||||
|
||||
|
||||
Using those features it is easy to limit
|
||||
|
||||
maximum bandwidth used per connection
|
||||
|
||||
maximum bandwidth used per client (IP/network)
|
||||
|
||||
maximum bandwidth used by one virtual host (webserver/ftpserver)
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
Router before bottleneck
|
||||
|
||||
%size 4
|
||||
|
||||
The router receives more packets on its incoming interface(s) than it can send out on the outgoing interface. It has to build a queue of packets (usually a FIFO one) and starts dropping packets as soon as the queue is full.
|
||||
|
||||
%image "qos-1.png" 0 100 30
|
||||
|
||||
The idea is to change this queue, thus decide
|
||||
which packets get enqueued in which order
|
||||
how many packets get queued
|
||||
which packets get dropped in case of a filling queue
|
||||
|
||||
|
||||
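The default behaviour described on this slide (a bounded FIFO that drops arriving packets once it is full) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `simulate_fifo`, the tick-based timing and all parameter values are hypothetical and are not part of the Linux implementation.

```python
from collections import deque

def simulate_fifo(arrivals_per_tick, service_per_tick, queue_limit, ticks):
    """Simulate a bounded FIFO queue with tail drop.

    Returns (sent, dropped, backlog) after the given number of ticks.
    All parameters are illustrative assumptions, not kernel defaults.
    """
    queue = deque()
    sent = 0
    dropped = 0
    next_id = 0
    for _ in range(ticks):
        # Packets arrive on the incoming interface(s).
        for _ in range(arrivals_per_tick):
            if len(queue) < queue_limit:
                queue.append(next_id)   # room left: enqueue at the tail
            else:
                dropped += 1            # queue full: tail drop
            next_id += 1
        # The outgoing interface transmits at most service_per_tick packets.
        for _ in range(min(service_per_tick, len(queue))):
            queue.popleft()
            sent += 1
    return sent, dropped, len(queue)

# Three packets arrive per tick, but only two can leave: the queue fills
# up and the router starts dropping packets.
sent, dropped, backlog = simulate_fifo(3, 2, 10, 100)
print(sent, dropped, backlog)
```

Changing which packets are enqueued, in which order they leave, and which ones are dropped is exactly what the queueing disciplines on the following slides do.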
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
The Linux 2.2 / 2.4 Solution
|
||||
|
||||
Packet Scheduling algorithms in the Kernel
|
||||
CBQ - Class Based Queueing
|
||||
RED - Random Early Detection
|
||||
SFQ - Stochastic Fairness Queueing
|
||||
TEQL - True Link Equalizer
|
||||
TBF - Token Bucket Filter
|
||||
|
||||
tc command of iproute2 package for configuration
|
||||
almost no documentation
|
||||
very few examples on the internet
|
||||
|
||||
Packet Classification
|
||||
tc builtin classifiers (route, u32, ...)
|
||||
all iptables/netfilter matches by using fwmark
|
||||
|
||||
Conclusion: Linux is the best suited general-purpose operating system for QoS, but almost nobody is using it because of a lack of knowledge.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
Available queuing algorithms
|
||||
|
||||
CBQ - Class Based Queueing
|
||||
hierarchical bandwidth classes
|
||||
used as basis in almost all cases
|
||||
TBF - Token Bucket Filter
|
||||
really accurate algorithm
|
||||
uses a lot of CPU
|
||||
not possible for high bandwidth links (>1MBit)
|
||||
SFQ - Stochastic Fairness Queueing
|
||||
less accurate algorithm
|
||||
tries to distinguish between individual streams
|
||||
does round robin between those streams
|
||||
TEQL - True Link Equalizer
|
||||
allows 'bundling' of interfaces
|
||||
RED - Random Early Detection / Drop
|
||||
simulates congested link by statistical packet dropping
|
||||
uses almost no CPU
|
||||
recommended for high-bandwidth backbones
|
||||
others (WRR, TCINDEX, DSMARK, ..)
|
||||
WRR not officially included in kernel, similar to CBQ
|
||||
others mostly used for DiffServ
|
||||
|
||||
|
||||
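To make the token bucket idea behind TBF concrete, here is a minimal Python sketch. It is not the kernel's TBF code: the class name `TokenBucket`, the explicit `now` timestamps and the numbers in the usage lines are assumptions for illustration, and the sketch only answers whether a packet conforms, whereas a real shaper would delay a non-conforming packet instead of rejecting it.

```python
class TokenBucket:
    """Minimal token bucket: tokens are bytes, refilled at a fixed rate."""

    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s    # refill rate
        self.burst = burst_bytes        # bucket depth (maximum burst size)
        self.tokens = burst_bytes       # start with a full bucket
        self.last = 0.0                 # time of the previous update

    def allow(self, now, packet_len):
        """Return True if a packet of packet_len bytes conforms at time now."""
        elapsed = now - self.last
        self.last = now
        # Refill, but never beyond the bucket depth.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if packet_len <= self.tokens:
            self.tokens -= packet_len
            return True
        return False

bucket = TokenBucket(rate_bytes_per_s=1000, burst_bytes=1500)
print(bucket.allow(0.0, 1500))  # full bucket: the burst is allowed
print(bucket.allow(0.0, 1500))  # bucket is empty now
print(bucket.allow(1.0, 1000))  # one second later 1000 bytes were refilled
```

The bucket depth bounds how much traffic may leave in one burst, while the refill rate bounds the long-term average.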
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
The big picture
|
||||
|
||||
Overview of a packet's journey
|
||||
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
Incoming Packets
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
|
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
V
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
Packet Classification classify
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
(ipchains/iptables) set nfmark
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
|
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
V
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
Routing decision
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
|
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
V
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
TC filter select classes based on nfmark
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
/ | \
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
/ | \
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
/ | \
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
Different Bandwidth classes bandwidth classes (CBQ)
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
\ | /
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
\ | /
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
\ | /
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
Enqueuing output queue discipline
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
|
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
V
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
Outgoing packets
|
||||
|
||||
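The two classification steps in this picture (set an nfmark, then let the tc filter pick a class from it) can be sketched as two small Python functions. The mark values 6 and 7 and the class ids 10:100 and 10:200 are taken from the example scenario in this presentation; the function names and the choice to return `None` for unmatched packets are assumptions made for this sketch only.

```python
# nfmark -> CBQ class id, mirroring the 'fw' filters of the example scenario.
MARK_TO_CLASS = {6: "10:100", 7: "10:200"}

def set_nfmark(protocol, sport):
    """Emulate the two mangle rules: FTP data (tcp sport 20) gets mark 6,
    every other TCP packet gets mark 7. Non-TCP packets stay unmarked (0)."""
    if protocol != "tcp":
        return 0
    return 6 if sport == 20 else 7

def select_class(nfmark):
    """Emulate the 'fw' classifier: look up the class by nfmark.
    Returns None when no filter matches (assumption of this sketch)."""
    return MARK_TO_CLASS.get(nfmark)

for proto, sport in [("tcp", 20), ("tcp", 80), ("udp", 53)]:
    print(proto, sport, select_class(set_nfmark(proto, sport)))
```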
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
Example scenario using CBQ
|
||||
|
||||
%size 4
|
||||
Let's assume we have a link with 10 MBit maximum available bandwidth.
|
||||
We offer two major services to the outside world: Anonymous FTP and a Webserver offering important Information.
|
||||
|
||||
FTP Bulk data transfers are using up almost all available bandwidth, thus slowing down accesses to our website :(
|
||||
|
||||
We want to have FTP transfers use up to 8MBit and reserve 2MBit for WWW.
|
||||
|
||||
Implementation uses CBQ for bandwidth divisions.
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
Example scenario
|
||||
|
||||
%size 3
|
||||
attach a CBQ to the device
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
tc qdisc add dev eth0 root handle 10: cbq
|
||||
bandwidth 10Mbit avpkt 1000
|
||||
|
||||
%size 3
|
||||
%font "standard"
|
||||
create CBQ classes
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
tc class add dev eth0 parent 10:0 classid 10:1 cbq
|
||||
bandwidth 10MBit rate 10MBit allot 1514
|
||||
weight 1Mbit prio 8 maxburst 20 avpkt 1000
|
||||
|
||||
tc class add dev eth0 parent 10:1 classid 10:100 cbq
|
||||
bandwidth 10MBit rate 8MBit allot 1514
|
||||
weight 800kbit prio 5 maxburst 20 avpkt 1000 bounded
|
||||
|
||||
tc class add dev eth0 parent 10:1 classid 10:200 cbq
|
||||
bandwidth 10MBit rate 2MBit allot 1514
|
||||
weight 200kbit prio 5 maxburst 20 avpkt 1000 bounded
|
||||
|
||||
%size 3
|
||||
%font "standard"
|
||||
add filter rules
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
tc filter add dev eth0 parent 10:1 protocol ip handle 6 fw classid 10:100
|
||||
tc filter add dev eth0 parent 10:1 protocol ip handle 7 fw classid 10:200
|
||||
|
||||
iptables -t mangle -A PREROUTING -j MARK -p tcp --sport 20 --set-mark 6
|
||||
iptables -t mangle -A PREROUTING -j MARK -p tcp ! --sport 20 --set-mark 7
|
||||
|
||||
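The class definitions above follow a simple pattern: the child rates (8 MBit and 2 MBit) add up to the link bandwidth, and each weight is one tenth of the class rate. A hedged Python sketch can make that arithmetic explicit by generating the same kind of command lines. The helper names `cbq_class_cmd` and `check_rates` are hypothetical, the rates are written in kbit rather than MBit, and the weight = rate / 10 ratio is simply read off the values on this slide.

```python
def cbq_class_cmd(dev, parent, classid, link_kbit, rate_kbit, prio, bounded=False):
    """Build one 'tc class add ... cbq' command line as a string."""
    # Assumption: weight = rate / 10, the ratio used by the slide's values.
    weight_kbit = rate_kbit // 10
    parts = [
        f"tc class add dev {dev} parent {parent} classid {classid} cbq",
        f"bandwidth {link_kbit}kbit rate {rate_kbit}kbit allot 1514",
        f"weight {weight_kbit}kbit prio {prio} maxburst 20 avpkt 1000",
    ]
    if bounded:
        parts.append("bounded")  # class may not borrow unused bandwidth
    return " ".join(parts)

def check_rates(link_kbit, child_rates_kbit):
    """The child classes must not promise more than the link can carry."""
    return sum(child_rates_kbit) <= link_kbit

LINK = 10000  # 10 MBit link, expressed in kbit

commands = [
    cbq_class_cmd("eth0", "10:0", "10:1", LINK, 10000, prio=8),
    cbq_class_cmd("eth0", "10:1", "10:100", LINK, 8000, prio=5, bounded=True),
    cbq_class_cmd("eth0", "10:1", "10:200", LINK, 2000, prio=5, bounded=True),
]

if check_rates(LINK, [8000, 2000]):
    for cmd in commands:
        print(cmd)
```

The script only prints the commands; running them still requires root privileges and a kernel with CBQ support.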
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
Further optimization
|
||||
|
||||
%size 4
|
||||
Now we have achieved bandwidth division between two services.
|
||||
|
||||
Within one service, however, one individual user with a high bandwidth link can still use up most of our bandwidth, slowing down other users.
|
||||
|
||||
We can improve this behaviour by changing the scheduling algorithm from its default (FIFO)
|
||||
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
tc qdisc add dev eth0 parent 10:100 sfq quantum 1514b perturb 15
|
||||
tc qdisc add dev eth0 parent 10:200 sfq quantum 1514b perturb 15
|
||||
|
||||
|
||||
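Why SFQ helps here can be seen in a small Python sketch: flows are hashed into buckets and the buckets are served round robin, so a single heavy flow cannot starve the others. This is a simplification under stated assumptions and not the kernel algorithm: the class name `SfqSketch` is hypothetical, CRC32 stands in for the real hash function, one packet is sent per turn instead of `quantum` bytes, and `perturb` is a fixed salt rather than a value that changes every few seconds.

```python
import zlib
from collections import OrderedDict, deque

class SfqSketch:
    """Hash flows into buckets and serve the buckets round robin."""

    def __init__(self, buckets=128, perturb=0):
        self.buckets = buckets
        self.perturb = perturb       # salt mixed into the hash
        self.queues = OrderedDict()  # bucket index -> deque of packets

    def _bucket(self, flow):
        key = f"{self.perturb}:{flow}".encode()
        return zlib.crc32(key) % self.buckets

    def enqueue(self, flow, packet):
        self.queues.setdefault(self._bucket(flow), deque()).append(packet)

    def dequeue(self):
        """Send one packet from the bucket whose turn it is."""
        if not self.queues:
            return None
        bucket, queue = next(iter(self.queues.items()))
        packet = queue.popleft()
        del self.queues[bucket]
        if queue:
            self.queues[bucket] = queue  # re-append: back of the round
        return packet

sfq = SfqSketch()
for n in range(3):
    sfq.enqueue("fast-user", f"fast-{n}")   # heavy flow queues three packets
sfq.enqueue("slow-user", "slow-0")          # light flow queues one packet
while (pkt := sfq.dequeue()) is not None:
    print(pkt)
```

As long as the two flows land in different buckets, the light flow's packet is sent after the first packet of the heavy flow instead of waiting behind all of them.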
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
Further reading / Links
|
||||
|
||||
Bandwidth limiting on Servers
|
||||
ProFTPd
|
||||
http://www.proftpd.net/
|
||||
Apache mod_bandwidth / mod_bwshare
|
||||
ftp://ftp.cohprog.com/pub/apache/module/mod_bandwidth.c
|
||||
http://www.topology.org/src/bwshare/
|
||||
|
||||
Queue scheduling
|
||||
Advanced Routing HOWTO
|
||||
http://www.ds9a.nl/2.4Routing/
|
||||
Linux QoS HOWTO
|
||||
http://www.ittc.ukans.edu/~rsarav/howto/
|
||||
iproute2+tc
|
||||
|
||||
This presentation
|
||||
Author's Homepage
|
||||
http://www.gnumonks.org/
|
|
@ -0,0 +1,23 @@
|
|||
Quality of Service in IP Networks
|
||||
|
||||
IP networks were designed some 25 years ago. Networks based on TCP/IP are
|
||||
widely deployed, as organization-local Intranets as well as in the Internet
|
||||
itself. The usage patterns of those networks change. Especially new
|
||||
technologies like voice-over-IP as well as streaming multimedia applications
|
||||
have different requirements on the underlying network infrastructure than
|
||||
bulk data transfers like ftp/www or interactive traffic like telnet/ssh.
|
||||
|
||||
Organizations usually run a mixture of different services on their Internet
|
||||
uplinks or on their organization-internal wide area networks. Bandwidth is
|
||||
usually a limited resource, so everybody wants to divide bandwidth between
|
||||
different services according to their specific needs.
|
||||
|
||||
Linux has always had a very strong focus on network functionality and has had
|
||||
sophisticated means for bandwidth control / QoS since Kernel 2.2.
|
||||
|
||||
The presentation is organized in the following parts:
|
||||
Basics of QoS in IP networks
|
||||
How can Linux help with QoS
|
||||
Sample scenarios of Linux-based QoS solutions
|
||||
Overview of advanced concepts (DiffServ, IntServ, RSVP, ...)
|
||||
|
|
@ -0,0 +1,23 @@
|
|||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%
|
||||
%deffont "standard" tfont "VERDANA.TTF"
|
||||
%deffont "standard-i" tfont "VERDANAI.TTF"
|
||||
%deffont "thick" tfont "ARIBLK.TTF"
|
||||
%deffont "typewriter" xfont "courier-medium-r", tfont "courbd.ttf", tmfont "wadalab-gothic.ttf"
|
||||
%%
|
||||
%% Default settings for each line number.
|
||||
%%
|
||||
%default 1 leftfill, size 2, fore "white", back "black", font "thick"
|
||||
%default 1 bimage "fundo-cnc.png" 1024x768
|
||||
%default 1 pcache 1 1 0 0
|
||||
%default 2 size 7, vgap 10, prefix " "
|
||||
%default 3 size 2, bar "midnightblue", vgap 30
|
||||
%default 4 size 5, fore "lemon chiffon", vgap 30, prefix " ", font "standard"
|
||||
%%
|
||||
%% Default settings that are applied to TAB-indented lines.
|
||||
%%
|
||||
%tab 1 size 4, vgap 40, prefix " ", icon arc "tomato" 40
|
||||
%tab 2 size 4, vgap 20, prefix " ", icon box "spring green" 40
|
||||
%tab 3 size 3, vgap 20, prefix " ", icon delta3 "white" 40
|
||||
%tab 4 size 3, vgap 20, prefix " ", icon delta3 "white" 40
|
||||
%%
|
|
@ -0,0 +1,397 @@
|
|||
%include "cnc-style.mgp"
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
%nodefault
|
||||
%pcache 1 1 0 1
|
||||
%size 7, font "standard", fore "white", vgap 20, back "black"
|
||||
%bimage "fundo-cnc.png" 1024x768
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
Quality of Service in IP Networks
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <laforge@conectiva.com>
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
Contents
|
||||
|
||||
Definition of QoS
|
||||
|
||||
Why QoS
|
||||
|
||||
IP Networks are not designed for QoS
|
||||
|
||||
How to do the impossible
|
||||
|
||||
How Linux based systems can help
|
||||
|
||||
Advanced Concepts (DiffServ, IntServ, RSVP, ...)
|
||||
|
||||
References / Further Reading
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
Definition of QoS
|
||||
|
||||
Provide Service Differentiation
|
||||
|
||||
Performance Assurance by
|
||||
|
||||
Bandwidth guarantees
|
||||
for streaming multimedia traffic
|
||||
prioritizing certain important applications
|
||||
|
||||
Latency guarantees
|
||||
for voice over IP
|
||||
for interactive character-oriented applications (ssh,telnet)
|
||||
|
||||
Packet-loss guarantees
|
||||
for unreliable layer-4 protocols
|
||||
to avoid retransmits
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
Why QoS
|
||||
|
||||
|
||||
Decide how and among whom available bandwidth is divided
|
||||
|
||||
Limit available bandwidth for certain users / applications
|
||||
|
||||
Guarantee bandwidth for certain users / applications
|
||||
|
||||
Divide bandwidth more equally between users / applications
|
||||
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
IP networks not designed for QoS
|
||||
|
||||
|
||||
Properties of IP-based networks:
|
||||
|
||||
offer a "best-effort" service
|
||||
|
||||
make NO guarantees about
|
||||
bandwidth
|
||||
latency
|
||||
packet loss
|
||||
|
||||
provide a non-reliable packet transport
|
||||
|
||||
Conclusion: IP networks are not suitable for QoS
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
How to do the Impossible
|
||||
|
||||
%size 4
|
||||
|
||||
As IP Networks including Hardware (Routers, ...) are widely deployed, all QoS efforts have to layer on top of the existing technology.
|
||||
|
||||
There's no real solution to control latency
|
||||
latency depends largely on routing, which may be dynamic
|
||||
|
||||
There's no real solution to control packet loss
|
||||
packet loss may occur on any intermediate router
|
||||
|
||||
But we can control bandwidth usage!
|
||||
The sender can limit bandwidth for outgoing streams
|
||||
Intermediate routers BEFORE a bottleneck can control bandwidth usage
|
||||
|
||||
%size 5
|
||||
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
What can Linux systems do?
|
||||
|
||||
Bandwidth limiting at the sender application
|
||||
not many applications support it
|
||||
server often outside our control (on the Internet, ...)
|
||||
server doesn't know what's between it and the client
|
||||
|
||||
Bandwidth control on intermediate router before bottleneck
|
||||
Ideal case because this is where packet loss would occur
|
||||
Sophisticated queue scheduling on the outgoing queue
|
||||
Variety of different queue scheduling algorithms
|
||||
|
||||
Flow throttling at the Receiver
|
||||
Worst case, because influence is limited
|
||||
Theoretically possible for TCP, no implementation yet.
|
||||
Ingress qdisc might help
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
Bandwidth limiting at server
|
||||
|
||||
Some Internet Servers support bandwidth limiting
|
||||
|
||||
ProFTPd (builtin support)
|
||||
|
||||
Apache (using contributed mod_bandwidth)
|
||||
|
||||
|
||||
Using those features it is easy to limit
|
||||
|
||||
maximum bandwidth used per connection
|
||||
|
||||
maximum bandwidth used per client (IP/network)
|
||||
|
||||
maximum bandwidth used by one virtual host (webserver/ftpserver)
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
Router before bottleneck
|
||||
|
||||
%size 4
|
||||
|
||||
The router receives more packets on its incoming interface(s) than it can send out on the outgoing interface. It has to build a queue of packets (usually a FIFO one) and starts dropping packets as soon as the queue is full.
|
||||
|
||||
%image "qos-1.png" 0 100 30
|
||||
|
||||
The idea is to change this queue, thus decide
|
||||
which packets get enqueued in which order
|
||||
how many packets get queued
|
||||
which packets get dropped in case of a filling queue
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
The Linux 2.2 / 2.4 Solution
|
||||
|
||||
Packet Scheduling algorithms in the Kernel
|
||||
CBQ - Class Based Queueing
|
||||
RED - Random Early Detection
|
||||
SFQ - Stochastic Fairness Queueing
|
||||
TEQL - True Link Equalizer
|
||||
TBF - Token Bucket Filter
|
||||
|
||||
tc command of iproute2 package for configuration
|
||||
almost no documentation
|
||||
very few examples on the internet
|
||||
|
||||
Packet Classification
|
||||
tc builtin classifiers (route, u32, ...)
|
||||
all iptables/netfilter matches by using fwmark
|
||||
|
||||
Conclusion: Linux is the best suited general-purpose operating system for QoS, but almost nobody is using it because of a lack of knowledge.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
Available queuing algorithms
|
||||
|
||||
CBQ - Class Based Queueing
|
||||
hierarchical bandwidth classes
|
||||
used as basis in almost all cases
|
||||
TBF - Token Bucket Filter
|
||||
really accurate algorithm
|
||||
uses a lot of CPU
|
||||
not possible for high bandwidth links (>1MBit)
|
||||
SFQ - Stochastic Fairness Queueing
|
||||
less accurate algorithm
|
||||
tries to distinguish between individual streams
|
||||
does round robin between those streams
|
||||
TEQL - True Link Equalizer
|
||||
allows 'bundling' of interfaces
|
||||
RED - Random Early Detection / Drop
|
||||
simulates congested link by statistical packet dropping
|
||||
uses almost no CPU
|
||||
recommended for high-bandwidth backbones
|
||||
others (WRR, TCINDEX, DSMARK, ..)
|
||||
WRR not officially included in kernel, similar to CBQ
|
||||
others mostly used for DiffServ
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
The big picture
|
||||
|
||||
Overview of a packet's journey
|
||||
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
Incoming Packets
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
|
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
V
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
Packet Classification classify
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
(ipchains/iptables) set nfmark
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
|
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
V
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
Routing decision
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
|
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
V
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
TC filter select classes based on nfmark
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
/ | \
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
/ | \
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
/ | \
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
Different Bandwidth classes bandwidth classes (CBQ)
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
\ | /
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
\ | /
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
\ | /
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
Enqueuing output queue discipline
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
|
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
V
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
Outgoing packets
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
Example scenario using CBQ
|
||||
|
||||
%size 4
|
||||
Let's assume we have a link with 10 MBit maximum available bandwidth.
|
||||
We offer two major services to the outside world: Anonymous FTP and a Webserver offering important Information.
|
||||
|
||||
FTP Bulk data transfers are using up almost all available bandwidth, thus slowing down accesses to our website :(
|
||||
|
||||
We want to have FTP transfers use up to 8MBit and reserve 2MBit for WWW.
|
||||
|
||||
Implementation uses CBQ for bandwidth divisions.
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
Example scenario
|
||||
|
||||
%size 3
|
||||
attach a CBQ to the device
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
tc qdisc add dev eth0 root handle 10: cbq
|
||||
bandwidth 10Mbit avpkt 1000
|
||||
|
||||
%size 3
|
||||
%font "standard"
|
||||
create CBQ classes
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
tc class add dev eth0 parent 10:0 classid 10:1 cbq
|
||||
bandwidth 10MBit rate 10MBit allot 1514
|
||||
weight 1Mbit prio 8 maxburst 20 avpkt 1000
|
||||
|
||||
tc class add dev eth0 parent 10:1 classid 10:100 cbq
|
||||
bandwidth 10MBit rate 8MBit allot 1514
|
||||
weight 800kbit prio 5 maxburst 20 avpkt 1000 bounded
|
||||
|
||||
tc class add dev eth0 parent 10:1 classid 10:200 cbq
|
||||
bandwidth 10MBit rate 2MBit allot 1514
|
||||
weight 200kbit prio 5 maxburst 20 avpkt 1000 bounded
|
||||
|
||||
%size 3
|
||||
%font "standard"
|
||||
add filter rules
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
tc filter add dev eth0 parent 10:1 protocol ip handle 6 fw classid 10:100
|
||||
tc filter add dev eth0 parent 10:1 protocol ip handle 7 fw classid 10:200
|
||||
|
||||
iptables -t mangle -A PREROUTING -j MARK -p tcp --sport 20 --set-mark 6
|
||||
iptables -t mangle -A PREROUTING -j MARK -p tcp ! --sport 20 --set-mark 7
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
Further optimization
|
||||
|
||||
%size 4
|
||||
Now we have achieved bandwidth division between two services.
|
||||
|
||||
Within one service, however, one individual user with a high bandwidth link can still use up most of our bandwidth, slowing down other users.
|
||||
|
||||
We can improve this behaviour by changing the scheduling algorithm from its default (FIFO)
|
||||
|
||||
%size 3
|
||||
%font "typewriter"
|
||||
tc qdisc add dev eth0 parent 10:100 sfq quantum 1514b perturb 15
|
||||
tc qdisc add dev eth0 parent 10:200 sfq quantum 1514b perturb 15
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
QoS in IP Networks
|
||||
Further reading / Links
|
||||
|
||||
Bandwidth limiting on Servers
|
||||
ProFTPd
|
||||
http://www.proftpd.net/
|
||||
Apache mod_bandwidth / mod_bwshare
|
||||
ftp://ftp.cohprog.com/pub/apache/module/mod_bandwidth.c
|
||||
http://www.topology.org/src/bwshare/
|
||||
|
||||
Queue scheduling
|
||||
Advanced Routing HOWTO
|
||||
http://www.ds9a.nl/2.4Routing/
|
||||
Linux QoS HOWTO
|
||||
http://www.ittc.ukans.edu/~rsarav/howto/
|
||||
iproute2+tc
|
||||
|
||||
This presentation
|
||||
Author's Homepage
|
||||
http://www.gnumonks.org/
|
|
@ -0,0 +1,611 @@
|
|||
%!PS-Adobe-2.0 EPSF-2.0
|
||||
%%Title: /laforge/home/laforge/incoming/qos-1
|
||||
%%Creator: Dia v0.86
|
||||
%%CreationDate: Mon Apr 2 16:14:45 2001
|
||||
%%For: a user
|
||||
%%Magnification: 1.0000
|
||||
%%Orientation: Portrait
|
||||
%%BoundingBox: 0 0 1356 288
|
||||
%%Pages: 1
|
||||
%%BeginSetup
|
||||
%%EndSetup
|
||||
%%EndComments
|
||||
[ /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
|
||||
/.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
|
||||
/.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
|
||||
/.notdef /.notdef /space /exclam /quotedbl /numbersign /dollar /percent /ampersand /quoteright
|
||||
/parenleft /parenright /asterisk /plus /comma /hyphen /period /slash /zero /one
|
||||
/two /three /four /five /six /seven /eight /nine /colon /semicolon
|
||||
/less /equal /greater /question /at /A /B /C /D /E
|
||||
/F /G /H /I /J /K /L /M /N /O
|
||||
/P /Q /R /S /T /U /V /W /X /Y
|
||||
/Z /bracketleft /backslash /bracketright /asciicircum /underscore /quoteleft /a /b /c
|
||||
/d /e /f /g /h /i /j /k /l /m
|
||||
/n /o /p /q /r /s /t /u /v /w
|
||||
/x /y /z /braceleft /bar /braceright /asciitilde /.notdef /.notdef /.notdef
|
||||
/.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
|
||||
/.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
|
||||
/.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
|
||||
/space /exclamdown /cent /sterling /currency /yen /brokenbar /section /dieresis /copyright
|
||||
/ordfeminine /guillemotleft /logicalnot /hyphen /registered /macron /degree /plusminus /twosuperior /threesuperior
|
||||
/acute /mu /paragraph /periodcentered /cedilla /onesuperior /ordmasculine /guillemotright /onequarter /onehalf
|
||||
/threequarters /questiondown /Agrave /Aacute /Acircumflex /Atilde /Adieresis /Aring /AE /Ccedilla
|
||||
/Egrave /Eacute /Ecircumflex /Edieresis /Igrave /Iacute /Icircumflex /Idieresis /Eth /Ntilde
|
||||
/Ograve /Oacute /Ocircumflex /Otilde /Odieresis /multiply /Oslash /Ugrave /Uacute /Ucircumflex
|
||||
/Udieresis /Yacute /Thorn /germandbls /agrave /aacute /acircumflex /atilde /adieresis /aring
|
||||
/ae /ccedilla /egrave /eacute /ecircumflex /edieresis /igrave /iacute /icircumflex /idieresis
|
||||
/eth /ntilde /ograve /oacute /ocircumflex /otilde /odieresis /divide /oslash /ugrave
|
||||
/uacute /ucircumflex /udieresis /yacute /thorn /ydieresis] /isolatin1encoding exch def
|
||||
/Times-Roman-latin1
|
||||
/Times-Roman findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/Times-Italic-latin1
|
||||
/Times-Italic findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/Times-Bold-latin1
|
||||
/Times-Bold findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/Times-BoldItalic-latin1
|
||||
/Times-BoldItalic findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/AvantGarde-Book-latin1
|
||||
/AvantGarde-Book findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/AvantGarde-BookOblique-latin1
|
||||
/AvantGarde-BookOblique findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/AvantGarde-Demi-latin1
|
||||
/AvantGarde-Demi findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/AvantGarde-DemiOblique-latin1
|
||||
/AvantGarde-DemiOblique findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/Bookman-Light-latin1
|
||||
/Bookman-Light findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/Bookman-LightItalic-latin1
|
||||
/Bookman-LightItalic findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/Bookman-Demi-latin1
|
||||
/Bookman-Demi findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/Bookman-DemiItalic-latin1
|
||||
/Bookman-DemiItalic findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/Courier-latin1
|
||||
/Courier findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/Courier-Oblique-latin1
|
||||
/Courier-Oblique findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/Courier-Bold-latin1
|
||||
/Courier-Bold findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/Courier-BoldOblique-latin1
|
||||
/Courier-BoldOblique findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/Helvetica-latin1
|
||||
/Helvetica findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/Helvetica-Oblique-latin1
|
||||
/Helvetica-Oblique findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/Helvetica-Bold-latin1
|
||||
/Helvetica-Bold findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/Helvetica-BoldOblique-latin1
|
||||
/Helvetica-BoldOblique findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/Helvetica-Narrow-latin1
|
||||
/Helvetica-Narrow findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/Helvetica-Narrow-Oblique-latin1
|
||||
/Helvetica-Narrow-Oblique findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/Helvetica-Narrow-Bold-latin1
|
||||
/Helvetica-Narrow-Bold findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/Helvetica-Narrow-BoldOblique-latin1
|
||||
/Helvetica-Narrow-BoldOblique findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/NewCenturySchoolbook-Roman-latin1
|
||||
/NewCenturySchoolbook-Roman findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/NewCenturySchoolbook-Italic-latin1
|
||||
/NewCenturySchoolbook-Italic findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/NewCenturySchoolbook-Bold-latin1
|
||||
/NewCenturySchoolbook-Bold findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/NewCenturySchoolbook-BoldItalic-latin1
|
||||
/NewCenturySchoolbook-BoldItalic findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/Palatino-Roman-latin1
|
||||
/Palatino-Roman findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/Palatino-Italic-latin1
|
||||
/Palatino-Italic findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/Palatino-Bold-latin1
|
||||
/Palatino-Bold findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/Palatino-BoldItalic-latin1
|
||||
/Palatino-BoldItalic findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/Symbol-latin1
|
||||
/Symbol findfont
|
||||
definefont pop
|
||||
/ZapfChancery-MediumItalic-latin1
|
||||
/ZapfChancery-MediumItalic findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/ZapfDingbats-latin1
|
||||
/ZapfDingbats findfont
|
||||
dup length dict begin
|
||||
{1 index /FID ne {def} {pop pop} ifelse} forall
|
||||
/Encoding isolatin1encoding def
|
||||
currentdict end
|
||||
definefont pop
|
||||
/cp {closepath} bind def
|
||||
/c {curveto} bind def
|
||||
/f {fill} bind def
|
||||
/a {arc} bind def
|
||||
/ef {eofill} bind def
|
||||
/ex {exch} bind def
|
||||
/gr {grestore} bind def
|
||||
/gs {gsave} bind def
|
||||
/sa {save} bind def
|
||||
/rs {restore} bind def
|
||||
/l {lineto} bind def
|
||||
/m {moveto} bind def
|
||||
/rm {rmoveto} bind def
|
||||
/n {newpath} bind def
|
||||
/s {stroke} bind def
|
||||
/sh {show} bind def
|
||||
/slc {setlinecap} bind def
|
||||
/slj {setlinejoin} bind def
|
||||
/slw {setlinewidth} bind def
|
||||
/srgb {setrgbcolor} bind def
|
||||
/rot {rotate} bind def
|
||||
/sc {scale} bind def
|
||||
/sd {setdash} bind def
|
||||
/ff {findfont} bind def
|
||||
/sf {setfont} bind def
|
||||
/scf {scalefont} bind def
|
||||
/sw {stringwidth pop} bind def
|
||||
/tr {translate} bind def
|
||||
|
||||
/ellipsedict 8 dict def
|
||||
ellipsedict /mtrx matrix put
|
||||
/ellipse
|
||||
{ ellipsedict begin
|
||||
/endangle exch def
|
||||
/startangle exch def
|
||||
/yrad exch def
|
||||
/xrad exch def
|
||||
/y exch def
|
||||
/x exch def /savematrix mtrx currentmatrix def
|
||||
x y tr xrad yrad sc
|
||||
0 0 1 startangle endangle arc
|
||||
savematrix setmatrix
|
||||
end
|
||||
} def
|
||||
|
||||
/mergeprocs {
|
||||
dup length
|
||||
3 -1 roll
|
||||
dup
|
||||
length
|
||||
dup
|
||||
5 1 roll
|
||||
3 -1 roll
|
||||
add
|
||||
array cvx
|
||||
dup
|
||||
3 -1 roll
|
||||
0 exch
|
||||
putinterval
|
||||
dup
|
||||
4 2 roll
|
||||
putinterval
|
||||
} bind def
|
||||
28.346000 -28.346000 scale
|
||||
17.845207 437.856740 translate
|
||||
%%EndProlog
|
||||
|
||||
|
||||
1.000000 1.000000 1.000000 srgb
|
||||
n -13.077546 -447.950612 m -13.077546 -439.950612 l -7.985470 -439.950612 l -7.985470 -447.950612 l f
|
||||
0.100000 slw
|
||||
[] 0 sd
|
||||
[] 0 sd
|
||||
0 slj
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
n -13.077546 -447.950612 m -13.077546 -439.950612 l -7.985470 -439.950612 l -7.985470 -447.950612 l cp s
|
||||
1.000000 1.000000 1.000000 srgb
|
||||
n -2.033689 -446.022040 m -2.033689 -441.959540 l 10.048900 -441.959540 l 10.048900 -446.022040 l f
|
||||
0.100000 slw
|
||||
[] 0 sd
|
||||
[] 0 sd
|
||||
0 slj
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
n -2.033689 -446.022040 m -2.033689 -441.959540 l 10.048900 -441.959540 l 10.048900 -446.022040 l cp s
|
||||
1.000000 1.000000 1.000000 srgb
|
||||
n 16.050852 -447.942862 m 16.050852 -439.949112 l 25.116894 -439.949112 l 25.116894 -447.942862 l f
|
||||
0.100000 slw
|
||||
[] 0 sd
|
||||
[] 0 sd
|
||||
0 slj
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
n 16.050852 -447.942862 m 16.050852 -439.949112 l 25.116894 -439.949112 l 25.116894 -447.942862 l cp s
|
||||
0.100000 slw
|
||||
[] 0 sd
|
||||
[] 0 sd
|
||||
0 slc
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
n -16.038172 -440.048589 m -13.070281 -440.011807 l s
|
||||
0.100000 slw
|
||||
[] 0 sd
|
||||
[] 0 sd
|
||||
0 slc
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
n -15.997245 -447.906733 m -13.077546 -447.950612 l s
|
||||
0.100000 slw
|
||||
0 slc
|
||||
[] 0 sd
|
||||
1.000000 0.000000 0.000000 srgb
|
||||
n -13.560988 -443.913577 m -16.406523 -443.936733 l s
|
||||
0 slj
|
||||
1.000000 0.000000 0.000000 srgb
|
||||
n -14.362995 -443.670095 m -13.560988 -443.913577 l -14.358927 -444.170079 l f
|
||||
/Helvetica-Oblique-latin1 ff 0.600000 scf sf
|
||||
/Helvetica-Oblique-latin1 ff 0.600000 scf sf
|
||||
1.000000 0.000000 0.000000 srgb
|
||||
() dup sw 2 div -14.855953 ex sub -444.948149 m gs 1 -1 sc sh gr
|
||||
1.000000 1.000000 1.000000 srgb
|
||||
n -4.993883 -443.984335 1.990755 1.984000 0 360 ellipse f
|
||||
0.100000 slw
|
||||
[] 0 sd
|
||||
[] 0 sd
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
n -4.993883 -443.984335 1.990755 1.984000 0 360 ellipse cp s
|
||||
1.000000 1.000000 1.000000 srgb
|
||||
n 13.037714 -443.959407 2.005400 2.000000 0 360 ellipse f
|
||||
0.100000 slw
|
||||
[] 0 sd
|
||||
[] 0 sd
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
n 13.037714 -443.959407 2.005400 2.000000 0 360 ellipse cp s
|
||||
0.100000 slw
|
||||
[] 0 sd
|
||||
[] 0 sd
|
||||
0 slc
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
n -7.985470 -443.950612 m -6.984638 -443.984335 l s
|
||||
0.100000 slw
|
||||
[] 0 sd
|
||||
[] 0 sd
|
||||
0 slc
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
n -3.018922 -443.977661 m -2.033689 -443.990790 l s
|
||||
0.100000 slw
|
||||
[] 0 sd
|
||||
[] 0 sd
|
||||
0 slc
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
n 9.967044 -444.004735 m 11.032314 -443.959407 l s
|
||||
0.100000 slw
|
||||
[] 0 sd
|
||||
[] 0 sd
|
||||
0 slc
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
n 15.043114 -443.959407 m 16.016589 -443.977661 l s
|
||||
/Courier-latin1 ff 0.800000 scf sf
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
(Router) dup sw 2 div -4.832496 ex sub -443.824335 m gs 1 -1 sc sh gr
|
||||
/Courier-latin1 ff 0.800000 scf sf
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
(Router) dup sw 2 div 13.087201 ex sub -443.751407 m gs 1 -1 sc sh gr
|
||||
1.000000 1.000000 1.000000 srgb
|
||||
n -8.995481 -447.879407 m -8.995481 -440.039407 l -8.067481 -440.039407 l -8.067481 -447.879407 l f
|
||||
0.100000 slw
|
||||
[] 0 sd
|
||||
[] 0 sd
|
||||
0 slj
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
n -8.995481 -447.879407 m -8.995481 -440.039407 l -8.067481 -440.039407 l -8.067481 -447.879407 l cp s
|
||||
1.000000 1.000000 1.000000 srgb
|
||||
n -9.998281 -447.883807 m -9.998281 -440.043807 l -9.070281 -440.043807 l -9.070281 -447.883807 l f
|
||||
0.100000 slw
|
||||
[] 0 sd
|
||||
[] 0 sd
|
||||
0 slj
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
n -9.998281 -447.883807 m -9.998281 -440.043807 l -9.070281 -440.043807 l -9.070281 -447.883807 l cp s
|
||||
1.000000 1.000000 1.000000 srgb
|
||||
n -11.022281 -447.851807 m -11.022281 -440.011807 l -10.094281 -440.011807 l -10.094281 -447.851807 l f
|
||||
0.100000 slw
|
||||
[] 0 sd
|
||||
[] 0 sd
|
||||
0 slj
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
n -11.022281 -447.851807 m -11.022281 -440.011807 l -10.094281 -440.011807 l -10.094281 -447.851807 l cp s
|
||||
1.000000 1.000000 1.000000 srgb
|
||||
n -12.046281 -447.851807 m -12.046281 -440.011807 l -11.118281 -440.011807 l -11.118281 -447.851807 l f
|
||||
0.100000 slw
|
||||
[] 0 sd
|
||||
[] 0 sd
|
||||
0 slj
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
n -12.046281 -447.851807 m -12.046281 -440.011807 l -11.118281 -440.011807 l -11.118281 -447.851807 l cp s
|
||||
1.000000 1.000000 1.000000 srgb
|
||||
n -13.070281 -447.851807 m -13.070281 -440.011807 l -12.142281 -440.011807 l -12.142281 -447.851807 l f
|
||||
0.100000 slw
|
||||
[] 0 sd
|
||||
[] 0 sd
|
||||
0 slj
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
n -13.070281 -447.851807 m -13.070281 -440.011807 l -12.142281 -440.011807 l -12.142281 -447.851807 l cp s
|
||||
1.000000 1.000000 1.000000 srgb
|
||||
n 18.115732 -447.883807 m 18.115732 -440.043807 l 19.043732 -440.043807 l 19.043732 -447.883807 l f
|
||||
0.100000 slw
|
||||
[] 0 sd
|
||||
[] 0 sd
|
||||
0 slj
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
n 18.115732 -447.883807 m 18.115732 -440.043807 l 19.043732 -440.043807 l 19.043732 -447.883807 l cp s
|
||||
1.000000 1.000000 1.000000 srgb
|
||||
n 21.123732 -447.851807 m 21.123732 -440.011807 l 22.051732 -440.011807 l 22.051732 -447.851807 l f
|
||||
0.100000 slw
|
||||
[] 0 sd
|
||||
[] 0 sd
|
||||
0 slj
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
n 21.123732 -447.851807 m 21.123732 -440.011807 l 22.051732 -440.011807 l 22.051732 -447.851807 l cp s
|
||||
1.000000 1.000000 1.000000 srgb
|
||||
n 24.195732 -447.883807 m 24.195732 -440.043807 l 25.123732 -440.043807 l 25.123732 -447.883807 l f
|
||||
0.100000 slw
|
||||
[] 0 sd
|
||||
[] 0 sd
|
||||
0 slj
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
n 24.195732 -447.883807 m 24.195732 -440.043807 l 25.123732 -440.043807 l 25.123732 -447.883807 l cp s
|
||||
0.100000 slw
|
||||
[] 0 sd
|
||||
[] 0 sd
|
||||
0 slc
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
n 25.116894 -439.949112 m 28.034661 -439.966733 l s
|
||||
0.100000 slw
|
||||
[] 0 sd
|
||||
[] 0 sd
|
||||
0 slc
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
n 25.116894 -447.942862 m 27.993733 -447.947661 l s
|
||||
0.100000 slw
|
||||
0 slc
|
||||
[] 0 sd
|
||||
1.000000 0.000000 0.000000 srgb
|
||||
n 29.115480 -443.994786 m 25.988269 -444.018589 l s
|
||||
0 slj
|
||||
1.000000 0.000000 0.000000 srgb
|
||||
n 28.313600 -443.750882 m 29.115480 -443.994786 l 28.317406 -444.250868 l f
|
||||
/Helvetica-Oblique-latin1 ff 0.600000 scf sf
|
||||
/Helvetica-Oblique-latin1 ff 0.600000 scf sf
|
||||
1.000000 0.000000 0.000000 srgb
|
||||
() dup sw 2 div 27.554158 ex sub -444.306679 m gs 1 -1 sc sh gr
|
||||
/Courier-latin1 ff 0.800000 scf sf
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
(receiver) dup sw 2 div 27.851695 ex sub -444.836756 m gs 1 -1 sc sh gr
|
||||
/Courier-latin1 ff 0.800000 scf sf
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
(sender) dup sw 2 div -16.290807 ex sub -444.767845 m gs 1 -1 sc sh gr
|
||||
1.000000 1.000000 1.000000 srgb
|
||||
n -1.950016 -445.936335 m -1.950016 -442.064335 l 1.044776 -442.064335 l 1.044776 -445.936335 l f
|
||||
0.100000 slw
|
||||
[] 0 sd
|
||||
[] 0 sd
|
||||
0 slj
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
n -1.950016 -445.936335 m -1.950016 -442.064335 l 1.044776 -442.064335 l 1.044776 -445.936335 l cp s
|
||||
1.000000 1.000000 1.000000 srgb
|
||||
n 1.098648 -445.940735 m 1.098648 -442.068735 l 4.073436 -442.068735 l 4.073436 -445.940735 l f
|
||||
0.100000 slw
|
||||
[] 0 sd
|
||||
[] 0 sd
|
||||
0 slj
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
n 1.098648 -445.940735 m 1.098648 -442.068735 l 4.073436 -442.068735 l 4.073436 -445.940735 l cp s
|
||||
1.000000 1.000000 1.000000 srgb
|
||||
n 4.117184 -445.940735 m 4.117184 -442.068735 l 7.061168 -442.068735 l 7.061168 -445.940735 l f
|
||||
0.100000 slw
|
||||
[] 0 sd
|
||||
[] 0 sd
|
||||
0 slj
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
n 4.117184 -445.940735 m 4.117184 -442.068735 l 7.061168 -442.068735 l 7.061168 -445.940735 l cp s
|
||||
1.000000 1.000000 1.000000 srgb
|
||||
n 7.143023 -445.940735 m 7.143023 -442.068735 l 9.967044 -442.068735 l 9.967044 -445.940735 l f
|
||||
0.100000 slw
|
||||
[] 0 sd
|
||||
[] 0 sd
|
||||
0 slj
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
n 7.143023 -445.940735 m 7.143023 -442.068735 l 9.967044 -442.068735 l 9.967044 -445.940735 l cp s
|
||||
/Courier-latin1 ff 0.800000 scf sf
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
(3) dup sw 2 div 24.644229 ex sub -446.919407 m gs 1 -1 sc sh gr
|
||||
/Courier-latin1 ff 0.800000 scf sf
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
(4) dup sw 2 div 21.892229 ex sub -446.983407 m gs 1 -1 sc sh gr
|
||||
/Courier-latin1 ff 0.800000 scf sf
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
(5) dup sw 2 div 18.564229 ex sub -447.079407 m gs 1 -1 sc sh gr
|
||||
/Courier-latin1 ff 0.800000 scf sf
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
(6) dup sw 2 div 8.485208 ex sub -443.856335 m gs 1 -1 sc sh gr
|
||||
/Courier-latin1 ff 0.800000 scf sf
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
(7) dup sw 2 div 5.522128 ex sub -443.856335 m gs 1 -1 sc sh gr
|
||||
/Courier-latin1 ff 0.800000 scf sf
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
(8) dup sw 2 div 2.544520 ex sub -443.888335 m gs 1 -1 sc sh gr
|
||||
/Courier-latin1 ff 0.800000 scf sf
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
(9) dup sw 2 div -0.305843 ex sub -443.895805 m gs 1 -1 sc sh gr
|
||||
/Courier-latin1 ff 0.800000 scf sf
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
(10) dup sw 2 div -8.541268 ex sub -447.111407 m gs 1 -1 sc sh gr
|
||||
/Courier-latin1 ff 0.800000 scf sf
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
(11) dup sw 2 div -9.501268 ex sub -447.111407 m gs 1 -1 sc sh gr
|
||||
/Courier-latin1 ff 0.800000 scf sf
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
(12) dup sw 2 div -10.557268 ex sub -447.111407 m gs 1 -1 sc sh gr
|
||||
/Courier-latin1 ff 0.800000 scf sf
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
(13) dup sw 2 div -11.581268 ex sub -447.111407 m gs 1 -1 sc sh gr
|
||||
/Courier-latin1 ff 0.800000 scf sf
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
(14) dup sw 2 div -12.605268 ex sub -447.111407 m gs 1 -1 sc sh gr
|
||||
/Courier-latin1 ff 0.800000 scf sf
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
(low bandwidth link) dup sw 2 div 7.664387 ex sub -438.064335 m gs 1 -1 sc sh gr
|
||||
/Courier-latin1 ff 0.800000 scf sf
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
(high bandwidth link) dup sw 2 div -10.882479 ex sub -438.023407 m gs 1 -1 sc sh gr
|
||||
/Courier-latin1 ff 0.800000 scf sf
|
||||
0.000000 0.000000 0.000000 srgb
|
||||
(high bandwidth link) dup sw 2 div 21.653603 ex sub -438.052474 m gs 1 -1 sc sh gr
|
||||
showpage
|
Binary file not shown.
|
@ -0,0 +1,48 @@
|
|||
Firewalling Basics - Security in IP Networks
============================================

What is a firewall?
What exactly does a firewall do?
What kinds of firewalls are there, and how do they differ?
What is the difference between packet filters and proxies?

These (and other) questions are the subject of the KNF talk on the
basics of firewalling.

Starting from a basic knowledge of TCP/IP networks and routers, this talk
describes the concepts and strategies underlying firewalling, as well as
what they make possible.

The talk deliberately does not cover specific firewall products or
implementations, so no prior experience in using or administering a
firewall is required.

Outline:

- Brief overview of IP routing, TCP, UDP, ICMP
- Packet filters
  - How they work
  - traditional packet filters (stateless)
  - stateful firewalling (connection tracking)

- Proxies
  - How they work
  - 'normal' proxies
  - transparent proxies

- Comparison of packet filter and proxy
  - derived from this:
    - capabilities
    - areas of application

- Network Address Translation (NAT)
  - static NAT
  - static NAPT
  - symmetric NAPT
  - masquerading

About the speaker:
Harald Welte has been an active KNF member since 1995 and is currently the
KNF's deputy technical contact. He is the maintainer of the
netfilter/iptables firewalling subsystem in the Linux 2.4.x and 2.5.x
kernels and played a major role in its development.
|
|
@ -0,0 +1,312 @@
|
|||
%include "default.mgp"
|
||||
%default 1 bgrad
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
%nodefault
|
||||
%back "blue"
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
TCP/IP Firewalling Basics
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <laforge@sunbeam.franken.de>
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalling Basics
|
||||
Contents
|
||||
|
||||
Introduction
|
||||
|
||||
Networking Basics
|
||||
|
||||
Potential Security Problems
|
||||
|
||||
Solution 1: Packet Filters
|
||||
|
||||
Solution 2: Proxies
|
||||
|
||||
Comparison
|
||||
|
||||
Summary
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalling Basics
|
||||
Introduction
|
||||
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalling Basics
|
||||
Networking Basics
|
||||
|
||||
7 layer OSI model used to abstract networking protocols
|
||||
layer 7: application layer: e.g. telnet/ftp
|
||||
layer 6: presentation layer:
|
||||
layer 5: session layer:
|
||||
layer 4: transport layer: e.g. TCP/UDP
|
||||
layer 3: network layer: e.g. IP
|
||||
layer 2: data link layer: e.g. Ethernet
|
||||
layer 1: physical layer: e.g. Wire
|
||||
Layer 1 + 2 embedded in hardware
|
||||
Layer 3 + 4 implemented in operating system
|
||||
Layer 5+ embedded in application program
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalling Basics
|
||||
Networking Basics
|
||||
|
||||
Layer 2: Ethernet
|
||||
enables two hosts within the same physical network to exchange packets
|
||||
unreliable
|
||||
addressing granularity: host
|
||||
fixed hardware addresses (MAC address, 48 bit)
|
||||
|
||||
Layer 3: Internet Protocol (IP)
|
||||
enables two hosts in different physical networks to exchange packets
|
||||
unreliable, best effort
|
||||
packet reordering
|
||||
packet loss
|
||||
addressing granularity: host
|
||||
logical addresses (IP address, 32 bit)
|
||||
checksum protects only IP header
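
The point that the IP checksum covers only the header can be illustrated with a short Python sketch. It computes the 16-bit one's-complement Internet checksum (the algorithm described in RFC 1071) over a 20-byte sample header. The function names and the sample addresses are illustrative assumptions, not part of the talk.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum over 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def with_checksum(header: bytes) -> bytes:
    """Return the header with its checksum field (bytes 10-11) filled in."""
    zeroed = header[:10] + b"\x00\x00" + header[12:]
    return header[:10] + struct.pack("!H", internet_checksum(zeroed)) + header[12:]


def header_is_valid(header: bytes) -> bool:
    """An intact header, checksum field included, sums to zero."""
    return internet_checksum(header) == 0


# Hypothetical sample: IPv4 header of a UDP packet, 192.168.0.1 -> 192.168.0.199,
# with the checksum field still zero.
HEADER_NO_CSUM = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
```

Because only these 20 bytes enter the sum, a corrupted payload passes this check unnoticed, which is why UDP and TCP carry their own checksums.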
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalling Basics
|
||||
Networking Basics
|
||||
|
||||
Layer 4: User Datagram Protocol (UDP)
|
||||
unreliable, best effort
|
||||
addressing granularity: ports (16 bit = 65536 port numbers)
|
||||
optional payload checksum
|
||||
|
||||
Layer 4: Transmission Control Protocol (TCP)
|
||||
provides connection abstraction
|
||||
reliable
|
||||
ordering guarantee
|
||||
retransmissions correct packet loss
|
||||
flow control
|
||||
payload checksum protects payload from data corruption
|
||||
|
||||
Layer 3: Internet Control Message Protocol (ICMP, carried inside IP packets)
|
||||
used internally by TCP/IP protocol suite
|
||||
error messages (e.g. host unreachable)
|
||||
diagnostics (e.g. ping: echo request / echo reply)
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalling Basics
|
||||
Potential Security Problems
|
||||
|
||||
Security issues arise at interconnection of two networks
|
||||
Traditional Case: IP Router connecting an organization's internal network to the Internet
|
||||
|
||||
What Security Problem?
|
||||
organization-internal services exposed to outside network
|
||||
spoofed (forged) packets to circumvent 'security by address'
|
||||
even if all internal services secured by authentication, difficult to guarantee security on all internal hosts
|
||||
|
||||
Why Firewalling?
|
||||
to restrict which internal services are exposed to the outside
|
||||
to restrict which outside services are used by internal users
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalling Basics
|
||||
Solution 1: Packet Filters
|
||||
|
||||
Filter individual packets at network interconnection (Router)
|
||||
|
||||
Filter criteria traditionally include
|
||||
IP source + destination address
|
||||
TCP/UDP source + destination port
|
||||
TCP header flags
|
||||
|
||||
Filtering rules determine if
|
||||
packet is allowed to transit interconnection
|
||||
packet is silently dropped
|
||||
packet is dropped and error message returned to sender
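
The filter criteria and the three possible verdicts listed above can be sketched as a first-match rule list. This is a minimal illustration in Python, not how any particular firewall implements it; the `Packet` and `Rule` types, the verdict names and the documentation-range addresses are all assumptions made for the example.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional

ACCEPT, DROP, REJECT = "ACCEPT", "DROP", "REJECT"


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    proto: str                      # "tcp" or "udp"
    sport: int
    dport: int
    flags: frozenset = frozenset()  # TCP header flags, e.g. {"SYN"}


@dataclass(frozen=True)
class Rule:
    verdict: str
    src: str = "0.0.0.0/0"
    dst: str = "0.0.0.0/0"
    proto: Optional[str] = None       # None matches any protocol
    dport: Optional[int] = None       # None matches any destination port
    flags: Optional[frozenset] = None  # None matches any flag combination

    def matches(self, p: Packet) -> bool:
        return (ip_address(p.src) in ip_network(self.src)
                and ip_address(p.dst) in ip_network(self.dst)
                and self.proto in (None, p.proto)
                and self.dport in (None, p.dport)
                and (self.flags is None or self.flags == p.flags))


def filter_packet(rules, packet, default=DROP):
    """Return the verdict of the first matching rule, else the default."""
    for rule in rules:
        if rule.matches(packet):
            return rule.verdict
    return default


RULES = [
    # Impossible flag combination (Xmas scan): drop silently.
    Rule(DROP, proto="tcp", flags=frozenset({"FIN", "PSH", "URG"})),
    # Expose only the web server.
    Rule(ACCEPT, dst="192.0.2.10/32", proto="tcp", dport=80),
    # Telnet into the internal net: drop and tell the sender.
    Rule(REJECT, dst="192.0.2.0/24", proto="tcp", dport=23),
]
```

Each packet is judged in isolation here, so reply traffic would need its own rules in the opposite direction.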
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalling Basics
|
||||
Solution 1: Packet Filters
|
||||
|
||||
Capabilities
|
||||
disallow communication between certain IP addresses
|
||||
disallow communication between certain port numbers
|
||||
disallow malicious packets, like packets
|
||||
using source routing IP option
|
||||
with an impossible combination of TCP flags, like a TCP Xmas scan
|
||||
generate log of malicious and/or filtered packets
|
||||
|
||||
Limitations
|
||||
scope limited to individual packets
|
||||
no ability to look inside packet payload (HTTP 1.1 virtual hosts)
|
||||
no abstraction of connection, filtering rules needed for both directions
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalling Basics
|
||||
Solution 1: Packet Filters
|
||||
|
||||
Extensions
|
||||
stateful packet filters (connection tracking)
|
||||
filtering only needed for connection-initiating packets
|
||||
all other packets within connection are accepted as part of an already established connection
|
||||
|
||||
TCP window tracking
|
||||
allow filtering not only on source/dest port but also on TCP sequence number
|
||||
|
||||
NAT (Network Address Translation)
|
||||
manipulation of source / destination address
|
||||
redirect packets to other hosts
|
||||
'share' one IP address on a dialup account (masquerading)
|
||||
connect two networks with overlapping address ranges
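
The stateful idea from this slide, where only connection-initiating packets are checked against the rules and everything else is matched against known connections, can be sketched as follows. This is a simplified Python illustration: the class and field names are assumptions, and timeouts, connection teardown and TCP window tracking are deliberately left out.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Packet:
    src: str
    sport: int
    dst: str
    dport: int
    syn: bool = False
    ack: bool = False


class StatefulFilter:
    def __init__(self, allow_new: Callable[[Packet], bool]):
        # The rule set is consulted only for packets that open a connection.
        self.allow_new = allow_new
        self.connections = set()

    @staticmethod
    def _key(p: Packet) -> frozenset:
        # Direction-independent key: both endpoints of the connection.
        return frozenset({(p.src, p.sport), (p.dst, p.dport)})

    def accept(self, p: Packet) -> bool:
        key = self._key(p)
        if key in self.connections:
            return True                 # part of an established connection
        if p.syn and not p.ack and self.allow_new(p):
            self.connections.add(key)   # remember the new connection
            return True
        return False                    # neither known nor an allowed new one
```

With a policy such as `lambda p: p.dst == "192.0.2.10" and p.dport == 80` (a hypothetical web server), the server's replies are accepted without any rule for the reverse direction.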
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalling Basics
|
||||
Solution 2: Proxies
|
||||
|
||||
A proxy operates at layer 5 and above
|
||||
|
||||
Mode of operation
|
||||
client connects to proxy instead of server
|
||||
proxy initiates a second, separate connection to server
|
||||
|
||||
Proxies are just normal programs implementing a server and a client for a particular application protocol (e.g. HTTP) using operating system mechanisms (like sockets API, winsock, ...)
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalling Basics
|
||||
Solution 2: Proxies
|
||||
|
||||
Capabilities
|
||||
disallow communication between certain IP addresses
|
||||
disallow communication between certain ports
|
||||
disallow communication based on packet payload
|
||||
e.g. pathnames / filenames within HTTP and FTP
|
||||
e.g. email addresses within SMTP
|
||||
e.g. hostnames within DNS (www.netzzensur.de)
|
||||
e.g. badwords ('sex' and 'teen' within same file)
|
||||
manipulation of packet payload
|
||||
everything possible...
|
||||
|
||||
Limitations
|
||||
somebody needs to tell client app to connect to proxy instead of server
|
||||
separate proxies needed for every protocol in use
|
||||
not possible to filter on packet options, etc.
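
Payload-based filtering, the capability that sets a proxy apart from a packet filter, might look like the following for HTTP. This is a hedged Python sketch: the function names, the blocked host and the blocked path prefix are hypothetical, and a real proxy would also have to relay the request and handle malformed input.

```python
BLOCKED_HOSTS = {"blocked.example"}      # hypothetical policy
BLOCKED_PATH_PREFIXES = ("/private/",)   # hypothetical policy


def parse_request(raw: bytes):
    """Split an HTTP request head into method, path and lower-cased headers."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    lines = head.split("\r\n")
    method, path, _version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, path, headers


def proxy_allows(raw: bytes) -> bool:
    """Decide on application-layer data a plain packet filter never inspects."""
    _method, path, headers = parse_request(raw)
    host = headers.get("host", "").split(":")[0].lower()  # drop optional port
    if host in BLOCKED_HOSTS:
        return False
    if path.startswith(BLOCKED_PATH_PREFIXES):
        return False
    return True
```

The decision depends on the `Host` header and the request path, both of which live in the payload, so two HTTP/1.1 virtual hosts on the same IP address and port can be treated differently.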
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalling Basics
|
||||
Solution 2: Proxies
|
||||
|
||||
Extensions
|
||||
Transparent Proxies
|
||||
accept connections from client independent of dest IP
|
||||
make reply packets to the client look as if sent by the server
|
||||
possible to implement the same transparency towards the server
|
||||
no need to tell clients about proxies anymore!
|
||||
|
||||
SOCKS
|
||||
application-protocol-independent proxy
|
||||
one proxy for all application protocols
|
||||
uses a separate protocol between client and proxy
|
||||
needs explicit support from client application
|
||||
integrated username/password authentication
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalling Basics
|
||||
Comparison
|
||||
|
||||
Packet Filter
|
||||
pro
|
||||
total control on lowest per-packet level
|
||||
very high performance
|
||||
possible to implement failover / load balancing
|
||||
NAT as extension solves address space problem
|
||||
contra
|
||||
configuration requires sophisticated knowledge
|
||||
problems when no state / window tracking used
|
||||
support for complex protocols (H.323, SIP) difficult to implement
|
||||
Proxy
|
||||
pro
|
||||
no knowledge about layer3/4 protocol needed
|
||||
configuration very easy
|
||||
address space automatically separated
|
||||
integrates easily with other applications like IDS
|
||||
easy implementation, just normal application programs
|
||||
contra
|
||||
separate proxies needed for almost every protocol
|
||||
bad performance
|
||||
uses lots of resources (e.g. sockets) on the gateway
|
||||
horribly breaks end-to-end
|
||||
needs explicit configuration of client apps if not transparent proxy
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalling Basics
|
||||
Comparison
|
||||
|
||||
Transparent Proxy
|
||||
uses ideas/methods of packet filtering (NAT) to achieve protocol transparency
|
||||
horrible violation of layering
|
||||
|
||||
Stateful Packet Filter
|
||||
uses ideas of proxies (tracking of higher layer state) to achieve better security and easier configuration
|
||||
horrible violation of layering
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalling Basics
|
||||
Conclusion
|
||||
|
||||
Conclusion
|
||||
proxies work for small installations where number of used protocols is small and administrative staff not very experienced
|
||||
packet filters without state tracking are difficult to configure correctly
|
||||
packet filters with state tracking are good solution for most usage scenarios: powerful but yet easy to configure correctly
|
||||
for highest security, best of both worlds can be combined
|
||||
imagine a stateful bridging packet filter in front of a proxy :)
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalling Basics
|
||||
Thanks
|
||||
|
||||
Thanks to
|
||||
the BBS people, Z-Netz, FIDO, ...
|
||||
for heavily increasing my computer usage in 1992
|
||||
|
||||
KNF
|
||||
for bringing me in touch with the internet as early as 1995
|
||||
for providing a playground for technical people
|
||||
for telling me about the existence of Linux!
|
||||
|
||||
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
|
||||
for implementing (one of?) the world's best TCP/IP stacks
|
||||
|
||||
Paul 'Rusty' Russell
|
||||
for starting the netfilter/iptables project
|
||||
for trusting me to maintain it today
|
||||
|
||||
Linux User Group Nuernberg (ALIGN, LUG-N)
|
||||
for helping me with my initial Linux problems
|
||||
|
|
@ -0,0 +1,100 @@
|
|||
- Introduction
|
||||
- since 1995 member of KNF, now 2nd TC, newsmaster + other stuff
|
||||
- learned lots of stuff while playing with KNF and own networks
|
||||
- done weird stuff like UUCP-over-SSL HOWTO :)
|
||||
- now maintainer of linux firewalling code
|
||||
|
||||
- Basics
|
||||
- Internet as packet switched network
|
||||
- 7-layer-OSI
|
||||
- Internetworking using IP
|
||||
- unreliable, best effort, no ordering guarantees
|
||||
- Routing within IP
|
||||
- UDP as stateless protocol
|
||||
- same characteristics as IP, but
|
||||
- added ports to multiplex between apps
|
||||
- optional payload checksum
|
||||
- TCP as connection-oriented transport protocol
|
||||
- providing abstraction of connection
|
||||
- reliable (payload checksum, retransmissions)
|
||||
- ordering guarantees
|
||||
- flow control
|
||||
- ICMP as helper
|
||||
- error messages / diagnostics
|
||||
- absolutely necessary!
|
||||
|
||||
- potential security problems
|
||||
- spoofed packets
|
||||
- connections to internal, private services
|
||||
- difficult to guarantee security on all internal hosts
|
||||
- restrictions the other way around (for outbound connections)
|
||||
-
|
||||
|
||||
- solution 1: packet filters
|
||||
- operates on layer 3
|
||||
- filter packets based on packet header/content
|
||||
- alternatively also generate ICMP errors, RST packets
|
||||
|
||||
- extensions / derivatives
|
||||
- stateful firewalling
|
||||
- transparent firewalls (firewalling bridges)
|
||||
|
||||
- solution 2: proxies
|
||||
- layer 5+
|
||||
- description
|
||||
- explicit configuration of all clients
|
||||
|
||||
- extensions / derivatives
|
||||
- transparent proxies
|
||||
- SOCKS
|
||||
- needs explicit application support
|
||||
- solves authentication problem
|
||||
- not used widely
|
||||
- should be offered in addition to proxies
|
||||
to give users a chance of running 'weird' protocols
|
||||
without httptunnel.
|
||||
|
||||
- comparison
|
||||
- proxy
|
||||
+ no knowledge about protocol headers needed
|
||||
+ configuration extremely easy
|
||||
+ address space separated (no need for NAT)
|
||||
+ integrates easily with other applications like IDS
|
||||
+ easy implementation, just normal programs
|
||||
- separate proxies needed for almost every protocol
|
||||
- bad performance
|
||||
- uses lots of resources (e.g. sockets) on gateway
|
||||
- horribly breaks end-to-end
|
||||
- needs configuration of enduser applications, if not
|
||||
used as transparent proxies
|
||||
|
||||
- packet filter
|
||||
+ total control on lowest per-packet level
|
||||
+ very high performance
|
||||
+ possible to implement failover / load balancing
|
||||
+ NAT as extension solves address space problem
|
||||
- configuration requires in-depth knowledge of TCP/IP protocols
|
||||
- problems when no state/window tracking is done
|
||||
- support for complex protocols (h.323, SIP, ...) difficult
|
||||
to implement
|
||||
|
||||
- transparent proxies
|
||||
- use some ideas of packet filtering / NAT to achieve
|
||||
transparency
|
||||
|
||||
- stateful packet filters
|
||||
- use some ideas of proxies (tracking of higher layer state)
|
||||
to achieve better security and easier configuration
|
||||
|
||||
- summary:
|
||||
- proxies work for small installations where number of
|
||||
to-be-supported protocols is small and administrative staff
|
||||
not very experienced
|
||||
- packet filters without state tracking difficult to be
|
||||
configured correctly
|
||||
- packet filters with state tracking are a good solution for
|
||||
most usage scenarios: powerful, but yet easy to configure
|
||||
right :)
|
||||
- for highest security, they can be combined: imagine a
|
||||
stateful bridging packet filter in front of proxies :)
|
||||
|
|
@ -0,0 +1,243 @@
|
|||
%include "default.mgp"
|
||||
%default 1 bgrad
|
||||
%deffont "typewriter" tfont "MONOTYPE.TTF"
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
%nodefault
|
||||
%back "blue"
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
IPv6 Introduction
|
||||
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <laforge@rfc2460.org>
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
IPv6 Introduction
|
||||
What? Why?
|
||||
|
||||
|
||||
What is IPv6?
|
||||
|
||||
Successor of currently used IP Version 4
|
||||
Specified 1995 in RFC 1883, revised 1998 in RFC 2460
|
||||
|
||||
Why?
|
||||
|
||||
Address space in IPv4 too small
|
||||
Routing tables too large
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
IPv6 Introduction
|
||||
Advantages
|
||||
|
||||
|
||||
Advantages
|
||||
|
||||
stateless autoconfiguration
|
||||
multicast obligatory
|
||||
IPsec obligatory
|
||||
Mobile IP
|
||||
|
||||
Address renumbering
|
||||
Multihoming
|
||||
Multiple address scopes
|
||||
smaller routing tables through aggregatable allocation
|
||||
|
||||
simplified l3 header
|
||||
64bit aligned
|
||||
no header checksum (left to l4 or l2)
|
||||
no fragmentation at router
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
IPv6 Introduction
|
||||
Disadvantages
|
||||
|
||||
Disadvantages
|
||||
Not widely deployed yet
|
||||
In most cases access only possible using manual tunnel
|
||||
OS support not ideal in most cases
|
||||
W2k: IPv6 available from Microsoft
|
||||
Windows XP: IPv6 included
|
||||
Linux has support, but some flaws (no IPsec, ndisc not fully implemented, ...)
|
||||
*BSD: full support (KAME)
|
||||
Solaris: full support
|
||||
Application support not ideal in most cases
|
||||
not supported: postfix, current squid, inn, proftpd
|
||||
supported: bind8/9, apache, openssh, xinetd, rsync, squid-2.5(CVS), exim, zmailer, sendmail, qmail, inn-2.4(CVS), zebra
|
||||
|
||||
Conclusion: Circular dependencies
|
||||
no application support without OS support
|
||||
no good OS support without applications
|
||||
no wide deployment without applications
|
||||
no applications without deployment
|
||||
no deployment without applications
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
IPv6 Introduction
|
||||
Deployment
|
||||
|
||||
|
||||
Experimental (6bone)
|
||||
Experimental 6bone (3ffe::) has been active since 1995.
|
||||
Uses slightly different Addressing Architecture (RFC2471)
|
||||
|
||||
Production (2001::)
|
||||
Initial TLA's and sub-TLA's assigned in Sept 2000
|
||||
Mostly used in education+research
|
||||
Some commercial ISP's in .de are offering production prefixes
|
||||
|
||||
Why isn't IPv6 widely used yet?
|
||||
No immediate need in Europe / North America
|
||||
Big deployment cost at ISP's (Training, Routers, ..)
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
IPv6 Introduction
|
||||
Technical: Address Space
|
||||
|
||||
IP Version 6 Addressing Architecture (RFC2373)
|
||||
Format prefix, variable length
|
||||
001: RFC2374 addresses, 1/8 of address space
|
||||
0000 001: Reserved for NSAP (1/128)
|
||||
0000 010: Reserved for IPX (1/128)
|
||||
1111 1110 10: link-local unicast addresses (1/1024)
|
||||
1111 1110 11: site-local unicast addresses (1/1024)
|
||||
1111 1111 flgs scop: multicast addresses
|
||||
flgs (0: well-known, 1:transient)
|
||||
scop (0: reserved, 1: node-local, 2: link-local, 5: site-local, 8: organization-local, e: global scope, f: reserved)
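The format prefixes listed above translate directly into CIDR notation. The following Python sketch, assuming only the standard `ipaddress` module, classifies an address by its prefix and decodes the multicast scope nibble. The names `FORMAT_PREFIXES`, `classify` and `multicast_scope` are made up for this illustration.

```python
import ipaddress

# Format prefixes from the slide (RFC2373), written as CIDR networks.
FORMAT_PREFIXES = [
    ("multicast", ipaddress.ip_network("ff00::/8")),                    # 1111 1111
    ("link-local unicast", ipaddress.ip_network("fe80::/10")),          # 1111 1110 10
    ("site-local unicast", ipaddress.ip_network("fec0::/10")),          # 1111 1110 11
    ("aggregatable global unicast", ipaddress.ip_network("2000::/3")),  # 001
    ("reserved for NSAP", ipaddress.ip_network("0200::/7")),            # 0000 001
    ("reserved for IPX", ipaddress.ip_network("0400::/7")),             # 0000 010
]

MULTICAST_SCOPES = {
    0x1: "node-local",
    0x2: "link-local",
    0x5: "site-local",
    0x8: "organization-local",
    0xE: "global",
}


def classify(addr_text):
    """Return the name of the format prefix the address falls into."""
    addr = ipaddress.IPv6Address(addr_text)
    for name, net in FORMAT_PREFIXES:
        if addr in net:
            return name
    return "other"


def multicast_scope(addr_text):
    """Decode the 4-bit scop field that follows the 4-bit flgs field."""
    addr = ipaddress.IPv6Address(addr_text)
    if classify(addr_text) != "multicast":
        raise ValueError("not a multicast address")
    scop = addr.packed[1] & 0x0F
    return MULTICAST_SCOPES.get(scop, "reserved or unassigned")
```

For example, `classify("fe80::1")` yields the link-local prefix and `multicast_scope("ff02::1")` decodes scope 2, link-local.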
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
IPv6 Introduction
|
||||
Technical: Address Space
|
||||
|
||||
Aggregatable Global Unicast Address Format (RFC2374)
|
||||
3bit FP (format prefix = 001)
|
||||
13bit TLA ID - Top-Level Aggregation ID
|
||||
13bit Sub-TLA - Sub-TLA Aggregation ID
|
||||
19bit NLA - Next-Level Aggregation ID
|
||||
16bit SLA - Site-Level Aggregation ID
|
||||
64bit Interface ID - derived from 48bit ethernet MAC
|
||||
Initial subTLA-Assignments
|
||||
2001:0000::/29 - 2001:01f8::/29 IANA
|
||||
2001:0200::/29 - 2001:03f8::/29 APNIC
|
||||
2001:0400::/29 - 2001:05f8::/29 ARIN
|
||||
2001:0600::/29 - 2001:07f8::/29 RIPE
|
||||
loopback ::1
|
||||
unspecified: ::0
|
||||
embedded ipv4
|
||||
IPv4-compatible address: 0::xxxx:xxxx
|
||||
IPv4-mapped IPv6 address (IPv4 only node): 0::ffff:xxxx:xxxx
|
||||
anycast
|
||||
allocated from unicast addresses
|
||||
only subnet-router anycast address predefined (prefix::0000)
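The field widths of the aggregatable global unicast format on this slide add up to 128 bits, so an address can be cut into its fields with plain shifts and masks. A minimal Python sketch, assuming the widths exactly as listed above; `FIELDS` and `split_fields` are hypothetical names:

```python
import ipaddress

# (field name, width in bits), most significant field first, as on the slide.
FIELDS = [
    ("fp", 3),
    ("tla", 13),
    ("sub_tla", 13),
    ("nla", 19),
    ("sla", 16),
    ("interface_id", 64),
]


def split_fields(addr_text):
    """Split a 128-bit IPv6 address into the fields listed in FIELDS."""
    value = int(ipaddress.IPv6Address(addr_text))
    remaining = 128
    result = {}
    for name, width in FIELDS:
        remaining -= width
        result[name] = (value >> remaining) & ((1 << width) - 1)
    return result
```

Applied to an address from the first RIPE block, `split_fields("2001:0600:0:5::1")` gives format prefix 1 (binary 001), TLA 0x0001, sub-TLA 0x00c0, SLA 5 and interface ID 1.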
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
IPv6 Introduction
|
||||
Technical: Header
|
||||
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version| Traffic Class |           Flow Label                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Payload Length        |  Next Header  |   Hop Limit   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+                         Source Address                        +
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+                      Destination Address                      +
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
||||
%font "standard"
|
||||
4bit Version: 6
|
||||
8bit Traffic Class
|
||||
20bit Flow Label
|
||||
16bit Payload Length (incl. extension hdrs)
|
||||
8bit next header (same values as the IPv4 protocol field, RFC1700 et seq.)
|
||||
8bit hop limit (TTL)
|
||||
128bit source address
|
||||
128bit dest address
|
||||
extension headers:
|
||||
hop-by-hop options
|
||||
routing
|
||||
fragment
|
||||
destination options
|
||||
IPsec (AH/ESP)
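The fixed header fields listed above occupy exactly 40 bytes. As a sketch of that layout, the following Python code packs and parses the fixed header with the standard `struct` module. The function names `pack_header` and `parse_header` are illustrative, and extension headers are not handled.

```python
import ipaddress
import struct

# 32-bit word (version, traffic class, flow label), 16-bit payload length,
# 8-bit next header, 8-bit hop limit, two 128-bit addresses: 40 bytes total.
HEADER_FORMAT = "!IHBB16s16s"


def pack_header(traffic_class, flow_label, payload_len, next_header,
                hop_limit, src, dst):
    first_word = (6 << 28) | (traffic_class << 20) | flow_label
    return struct.pack(HEADER_FORMAT, first_word, payload_len, next_header,
                       hop_limit,
                       ipaddress.IPv6Address(src).packed,
                       ipaddress.IPv6Address(dst).packed)


def parse_header(data):
    if len(data) < 40:
        raise ValueError("IPv6 fixed header needs 40 bytes")
    first_word, payload_len, next_header, hop_limit, src, dst = struct.unpack(
        HEADER_FORMAT, data[:40])
    return {
        "version": first_word >> 28,                  # 4 bit
        "traffic_class": (first_word >> 20) & 0xFF,   # 8 bit
        "flow_label": first_word & 0xFFFFF,           # 20 bit
        "payload_len": payload_len,
        "next_header": next_header,
        "hop_limit": hop_limit,
        "src": str(ipaddress.IPv6Address(src)),
        "dst": str(ipaddress.IPv6Address(dst)),
    }
```

Packing a header and parsing it again returns the same field values, which is a quick check that the bit widths match the slide.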
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
IPv6 Introduction
|
||||
Technical: Layer 2 <-> Address mapping
|
||||
|
||||
|
||||
Ethernet: No more ARP, everything within ICMPv6
|
||||
No Broadcast, everything built using multicast.
|
||||
|
||||
all-nodes multicast address ff02::1
|
||||
all-routers multicast address ff02::2
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
IPv6 Introduction
|
||||
Technical: Address Configuration
|
||||
|
||||
|
||||
router discovery
|
||||
routers periodically send router advertisements
|
||||
hosts can send router solicitation to explicitly request RADV
|
||||
|
||||
prefix discovery
|
||||
router includes prefix(es) in ICMPv6 router advertisements
|
||||
other nodes receive prefix advertisements and derive their final address from prefix + EUI64 of MAC address
|
||||
|
||||
neighbour discovery
|
||||
machines can discover their neighbours without an advertising router
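The "prefix + EUI64 of MAC address" step from prefix discovery can be sketched as follows. This assumes the usual modified EUI-64 construction (insert ff:fe in the middle of the 48bit MAC and flip the universal/local bit) and a /64 prefix; `eui64_interface_id` and `autoconf_address` are hypothetical helper names.

```python
import ipaddress


def eui64_interface_id(mac):
    """Derive the 64-bit interface ID from a 48-bit ethernet MAC address."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Flip the universal/local bit and insert ff:fe between the two halves.
    eui = bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:6]
    return int.from_bytes(eui, "big")


def autoconf_address(prefix, mac):
    """Combine an advertised /64 prefix with the interface ID."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("stateless autoconfiguration expects a /64 prefix")
    return ipaddress.IPv6Address(int(net.network_address) | eui64_interface_id(mac))
```

A host with MAC 00:11:22:33:44:55 that receives the prefix 2001:db8:1:2::/64 would end up with 2001:db8:1:2:211:22ff:fe33:4455.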
|
||||
|
||||
|
||||
%page
|
||||
IPv6 Introduction
|
||||
How to get connected
|
||||
|
||||
In case of static IPv4 address
|
||||
SIT (ipv6-in-ipv4) tunnel possible
|
||||
http://www.join.uni-muenster.de/
|
||||
|
||||
In case of dynamic IPv4 address
|
||||
ppp (ipv6 over ppp) tunnel (pptp, l2tp) possible
|
||||
sitctrl (linux <-> linux)
|
||||
atncp (*NIX), http://www.dhis.org/atncp/
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
IPv6 Introduction
|
||||
Further Reading
|
||||
|
||||
http://www.ipv6-net.org/ (German IPv6 forum)
|
||||
http://www.6bone.net/ (ipv6 testing backbone)
|
||||
http://www.freenet6.net/ (free tunnel broker)
|
||||
http://hs247.com/ (list of tunnel brokers)
|
||||
|
||||
http://www.bieringer.de/ (ipv6 for linux)
|
||||
http://www.linux-ipv6.org/ (improved ipv6 for linux)
|
||||
http://www.kame.net/ (ipv6 for *BSD)
|
||||
http://www.join.uni-muenster.de/ (ipv6 at DFN/WiN)
|
||||
|
||||
http://www.gnumonks.org/ (slides of this presentation)
|
||||
|
||||
And of course, all relevant RFC's
|
||||
|
|
@ -0,0 +1,114 @@
|
|||
What is IPv6?
|
||||
Successor of currently used IP Version 4
|
||||
Specified 1995 in RFC 1883, revised 1998 in RFC 2460
|
||||
Why?
|
||||
Address space in IPv4 too small
|
||||
|
||||
Advantages?
|
||||
stateless autoconfiguration
|
||||
multicast obligatory
|
||||
IPsec obligatory
|
||||
Mobile IP
|
||||
QoS ?
|
||||
|
||||
Address Renumbering?
|
||||
Multihoming?
|
||||
AddressScopes?
|
||||
smaller routing tables through G
|
||||
|
||||
simplified l3 header
|
||||
64bit aligned
|
||||
no header checksum (left to l4 or l2)
|
||||
no fragmentation at router
|
||||
|
||||
Disadvantages
|
||||
Not widely deployed yet
|
||||
In most cases access only possible using manual tunnel
|
||||
OS support not ideal in most cases
|
||||
W2k?
|
||||
Linux has support, but no IPsec in official tree -> USAGI
|
||||
*BSD: full support (KAME)
|
||||
Application support not ideal in most cases
|
||||
not supported:
|
||||
supported: bind8/9, apache
|
||||
|
||||
Deployment
|
||||
Experimental 6bone (3ffe::) has been active since 199x.
|
||||
Uses slightly different Addressing Architecture (RFC2471)
|
||||
|
||||
Why isn't it widely used yet?
|
||||
No immediate need in Europe / North America
|
||||
Big deployment cost at ISP's (Training, Routers, ..)
|
||||
|
||||
Technical: Address Space
|
||||
IP Version 6 Addressing Architecture (RFC2373)
|
||||
Format prefix, variable length
|
||||
001: RFC2374 addresses, 1/8 of address space
|
||||
0000 001: Reserved for NSAP (1/128)
|
||||
0000 010: Reserved for IPX (1/128)
|
||||
1111 1110 10: link-local unicast addresses (1/1024)
|
||||
1111 1110 11: site-local unicast addresses (1/1024)
|
||||
1111 1111: multicast addresses
|
||||
1111 1111 flgs scop
|
||||
flgs (0: well-known, 1:transient)
|
||||
scop (0: reserved, 1: node-local, 2: link-local, 5: site-local, 8: organization-local, e: global scope, f: reserved)
|
||||
Aggregatable Global Unicast Address Format (RFC2374)
|
||||
3bit FP (format prefix = 001)
|
||||
13bit TLA ID - Top-Level Aggregation ID
|
||||
13bit Sub-TLA - Sub-TLA Aggregation ID
|
||||
19bit NLA - Next-Level Aggregation ID
|
||||
16bit SLA - Site-Level Aggregation ID
|
||||
64bit Interface ID - derived from 48bit ethernet MAC
|
||||
|
||||
2001:0000::/29 - 2001:01f8::/29 IANA
|
||||
2001:0200::/29 - 2001:03f8::/29 APNIC
|
||||
2001:0400::/29 - 2001:05f8::/29 ARIN
|
||||
2001:0600::/29 - 2001:07f8::/29 RIPE
|
||||
loopback
|
||||
::1
|
||||
unspecified:
|
||||
::0
|
||||
embedded ipv4
|
||||
IPv4-compatible address: 0::xxxx:xxxx
|
||||
IPv4-mapped IPv6 address (IPv4 only node): 0::ffff:xxxx:xxxx
|
||||
anycast
|
||||
allocated from unicast addresses
|
||||
only subnet-router anycast address predefined (prefix::0000)
|
||||
|
||||
|
||||
Technical: Header
|
||||
|
||||
4bit Version: 6
|
||||
8bit Traffic Class
|
||||
20bit Flow Label
|
||||
16bit Payload Length (incl. extension hdrs)
|
||||
8bit next header (same values as the IPv4 protocol field, RFC1700 et seq.)
|
||||
8bit hop limit (TTL)
|
||||
128bit source address
|
||||
128bit dest address
|
||||
|
||||
extension headers:
|
||||
hop-by-hop options
|
||||
routing
|
||||
fragment
|
||||
destination options
|
||||
authentication
|
||||
encapsulating security payload
|
||||
|
||||
Technical: Layer 2 <-> Address mapping
|
||||
Ethernet: No more ARP, everything within ICMPv6
|
||||
No Broadcast, everything built using multicast.
|
||||
|
||||
all-nodes multicast address ff02::1
|
||||
all-routers multicast address ff02::2
|
||||
|
||||
|
||||
Technical: Address Configuration
|
||||
router discovery
|
||||
routers periodically send router advertisements
|
||||
hosts can send router solicitation to explicitly request RADV
|
||||
prefix discovery
|
||||
router includes prefix(es) in ICMPv6 router advertisements
|
||||
other nodes receive prefix advertisements and derive their final address from prefix + EUI64 of MAC address
|
||||
|
||||
|
|
@ -0,0 +1,25 @@
|
|||
Future directions of linux firewalling
|
||||
|
||||
Harald Welte, netfilter core team & Astaro AG
|
||||
|
||||
The Linux 2.4.x series provided a fundamental redesign of the packet filtering
|
||||
and NAT framework, called netfilter/iptables. This flexible and modular
|
||||
framework still had its limitations. This BOF will discuss the recent and
|
||||
upcoming changes during the 2.4.x kernel series, as well as planned and
|
||||
partially implemented changes/extensions for the 2.5.x kernel series.
|
||||
|
||||
Topics covered:
|
||||
|
||||
2.4.x stuff:
|
||||
- The newnat API; supporting connection tracking and NAT for complex protocols
|
||||
like H.323
|
||||
- Accessing connection tracking table entries from userspace: ctnetlink
|
||||
- Packet filtering and even NAT on a bridge
|
||||
|
||||
2.5.x stuff:
|
||||
- libiptables: Providing a flexible and extensible API towards all iptables
|
||||
features
|
||||
- pkttables: Creating a layer-3-protocol independent layer for rule tables;
|
||||
unifying iptables, ip6tables and arptables.
|
||||
- nfnetlink: Move all netfilter/iptables related kernel/userspace communication
|
||||
towards netlink
|
|
@ -0,0 +1,374 @@
|
|||
%include "default.mgp"
|
||||
%default 1 bgrad
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
%nodefault
|
||||
%back "blue"
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
The future of Linux packet filtering
|
||||
targeted for kernel 2.6 and beyond
|
||||
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <laforge@gnumonks.org>
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Contents
|
||||
|
||||
|
||||
Problems with current 2.4.x netfilter/iptables
|
||||
Solution to code replication
|
||||
Solution for dynamic rulesets
|
||||
Solution for API to GUI's and other management programs
|
||||
|
||||
HA for stateful firewalling
|
||||
What's special about firewalling HA
|
||||
Poor man's failover
|
||||
Real state replication
|
||||
|
||||
Other current work
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Problems with 2.4.x netfilter/iptables
|
||||
|
||||
code replication between iptables/ip6tables/arptables
|
||||
iptables was never meant for other protocols, but people did copy+paste 'ports'
|
||||
replication of
|
||||
core kernel code
|
||||
layer 3 independent matches (mac, interface, ...)
|
||||
userspace library (libiptc)
|
||||
userspace tool (iptables)
|
||||
userspace plugins (libipt_xxx.so)
|
||||
|
||||
doesn't suit the needs for dynamically changing rulesets
|
||||
dynamic rulesets becoming more common (service selection, IDS)
|
||||
a whole table is created in userspace and sent as blob to kernel
|
||||
for every ruleset change the table needs to be copied to userspace and back
|
||||
inside kernel consistency checks on whole table, loop detection
|
||||
|
||||
too extensible for writing any forward-compatible GUI
|
||||
new extensions showing up all the time
|
||||
a frontend would need to know about the options and use of a new extension
|
||||
thus frontends are always incomplete and out-of-date
|
||||
no high-level API other than piping to iptables-restore
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Reducing code replication
|
||||
|
||||
code replication is a real problem: unclean, bugfixes missed
|
||||
we need layer 3 independent layer for
|
||||
submitting rules to the kernel
|
||||
traversing packet-rulesets supporting match/target modules
|
||||
registering matches/targets
|
||||
layer 3 specific (like matching ipv4 address)
|
||||
layer 3 independent (like matching MAC address)
|
||||
|
||||
solution
|
||||
pkt_tables inside kernel
|
||||
pkt_tables_ipv4 registers layer 3 handler with pkt_tables
|
||||
pkt_tables_ipv6 registers layer 3 handler with pkt_tables
|
||||
everybody registering a pkt_table (like iptable_filter) needs to specify the l3 protocol
|
||||
libraries in userspace (see later)
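The split between a layer 3 independent core and per-protocol handlers can be pictured with a small userspace model. This Python sketch is purely illustrative: the class `PktTables`, its methods and the dictionary-based packets are assumptions made for the example and do not reflect the planned kernel interface.

```python
# Matches that do not depend on the layer 3 protocol stay in the core.
GENERIC_FIELDS = {"mac_src", "in_iface"}


class PktTables:
    """Layer 3 independent core: owns tables and rule traversal."""

    def __init__(self):
        self._l3_handlers = {}   # l3 protocol -> function(packet, field)
        self._tables = {}        # (l3 protocol, table name) -> list of rules

    def register_l3_handler(self, l3proto, extract_field):
        if l3proto in self._l3_handlers:
            raise ValueError("handler already registered for " + l3proto)
        self._l3_handlers[l3proto] = extract_field

    def register_table(self, l3proto, name):
        # Every table has to name the layer 3 protocol it belongs to.
        if l3proto not in self._l3_handlers:
            raise LookupError("no layer 3 handler for " + l3proto)
        self._tables.setdefault((l3proto, name), [])

    def append_rule(self, l3proto, name, field, value, verdict):
        self._tables[(l3proto, name)].append((field, value, verdict))

    def traverse(self, l3proto, name, packet, policy="ACCEPT"):
        for field, value, verdict in self._tables[(l3proto, name)]:
            if field in GENERIC_FIELDS:
                actual = packet.get(field)
            else:
                actual = self._l3_handlers[l3proto](packet, field)
            if actual == value:
                return verdict
        return policy


def ipv4_field(packet, field):
    """Layer 3 specific part: knows where IPv4 fields live in the packet."""
    return packet["ipv4"].get(field)
```

The core never looks inside the IPv4 header itself. Adding IPv6 in this model would only mean registering a second handler, which is the point of the proposed design.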
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Supporting dynamic rulesets
|
||||
|
||||
atomic table-replacement turned out to be bad idea
|
||||
need new interface for sending individual rules to kernel
|
||||
policy routing has the same problem and good solution: rtnetlink
|
||||
solution: nfnetlink
|
||||
multicast-netlink based packet-oriented socket between kernel and userspace
|
||||
has extra benefit that other userspace processes get notified of rule changes [just like routing daemons]
|
||||
nfnetlink is a low-layer below all kernel/userspace communication
|
||||
pkttnetlink [aka iptnetlink]
|
||||
ctnetlink
|
||||
ulog
|
||||
ip_queue
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Communication with other programs
|
||||
|
||||
whole set of libraries
|
||||
libnfnetlink for low-layer communication
|
||||
libpkttnetlink for rule modifications
|
||||
will handle all plugins [which are currently part of iptables]
|
||||
query functions about available matches/targets
|
||||
query functions about parameters
|
||||
query functions for help messages about specific match/parameter of a match
|
||||
generic structure from which rules can be built
|
||||
conversion functions to parse generic structure into in-kernel structure
|
||||
conversion functions to parse kernel structure into generic structure
|
||||
functions to convert generic structure into plain text
|
||||
libipq will stay API-compatible to current version
|
||||
libipulog will stay API-compatible to current version
|
||||
libiptc will go away [compatibility layer extremely difficult]
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Introduction
|
||||
|
||||
What is special about firewall failover?
|
||||
|
||||
Nothing, in case of the stateless packet filter
|
||||
Common IP takeover solutions can be used
|
||||
VRRP
|
||||
Heartbeat
|
||||
|
||||
Distribution of packet filtering ruleset no problem
|
||||
can be done manually
|
||||
or implemented with simple userspace process
|
||||
|
||||
Problems arise with stateful packet filters
|
||||
Connection state only on active node
|
||||
NAT mappings only on active node
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Connection tracking...
|
||||
|
||||
implemented separately from NAT
|
||||
enables stateful filtering
|
||||
implementation
|
||||
hooks into NF_IP_PRE_ROUTING to track packets
|
||||
hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
|
||||
protocol modules (currently TCP/UDP/ICMP)
|
||||
application helpers (currently FTP, IRC, H.323, talk, SNMP)
|
||||
divides packets in the following four categories
|
||||
NEW - would establish new connection
|
||||
ESTABLISHED - part of already established connection
|
||||
RELATED - is related to established connection
|
||||
INVALID - (multicast, errors...)
|
||||
does _NOT_ filter packets itself
|
||||
can be utilized by iptables using the 'state' match
|
||||
is used by NAT Subsystem
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Common structures
|
||||
struct ip_conntrack_tuple, representing unidirectional flow
|
||||
layer 3 src + dst
|
||||
layer 4 protocol
|
||||
layer 4 src + dst
|
||||
|
||||
|
||||
connections represented as struct ip_conntrack
|
||||
original tuple
|
||||
reply tuple
|
||||
timeout
|
||||
l4 state private data
|
||||
app helper
|
||||
app helper private data
|
||||
expected connections
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Flow of events for new packet
|
||||
packet enters NF_IP_PRE_ROUTING
|
||||
tuple is derived from packet
|
||||
lookup conntrack hash table with hash(tuple) -> fails
|
||||
new ip_conntrack is allocated
|
||||
fill in original and reply == inverted(original) tuple
|
||||
initialize timer
|
||||
assign app helper if applicable
|
||||
see if we've been expected -> fails
|
||||
call layer 4 helper 'new' function
|
||||
|
||||
...
|
||||
|
||||
packet enters NF_IP_POST_ROUTING
|
||||
do hashtable lookup for packet -> fails
|
||||
place struct ip_conntrack in hashtable
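The flow of events above, where an entry is created at NF_IP_PRE_ROUTING but only placed in the hash table at NF_IP_POST_ROUTING, can be modelled in userspace Python. This is a simplified sketch, not the kernel code: the `Tuple`, `Conntrack` and `ConntrackTable` classes are stand-ins for struct ip_conntrack_tuple and struct ip_conntrack, and helpers, expectations and timers are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tuple:
    """Unidirectional flow: layer 3 src + dst, layer 4 protocol, layer 4 src + dst."""
    l3_src: str
    l3_dst: str
    l4_proto: str
    l4_src: int
    l4_dst: int

    def inverted(self):
        return Tuple(self.l3_dst, self.l3_src, self.l4_proto,
                     self.l4_dst, self.l4_src)


@dataclass
class Conntrack:
    original: Tuple
    reply: Tuple
    timeout: int
    confirmed: bool = False


class ConntrackTable:
    def __init__(self, default_timeout=30):
        self._hash = {}
        self._default_timeout = default_timeout

    def pre_routing(self, tup):
        """Look up the tuple; on failure allocate a new, unconfirmed entry."""
        ct = self._hash.get(tup)
        if ct is None:
            ct = Conntrack(tup, tup.inverted(), self._default_timeout)
        return ct

    def post_routing(self, ct):
        """Packet survived filtering: place a new entry in the hash table."""
        if ct.original not in self._hash:
            self._hash[ct.original] = ct
            self._hash[ct.reply] = ct   # replies must find the same entry
            ct.confirmed = True
        return ct
```

A packet that is dropped by the filter rules never reaches `post_routing`, so its unconfirmed entry never shows up in the table. Once confirmed, both the original and the reply direction resolve to the same entry.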
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Flow of events for packet part of existing connection
|
||||
packet enters NF_IP_PRE_ROUTING
|
||||
tuple is derived from packet
|
||||
lookup conntrack hash table with hash(tuple)
|
||||
associate conntrack entry with skb->nfct
|
||||
call l4 protocol helper 'packet' function
|
||||
do l4 state tracking
|
||||
update timeouts as needed [e.g. TCP TIME_WAIT,...]
|
||||
|
||||
...
|
||||
|
||||
packet enters NF_IP_POST_ROUTING
|
||||
do hashtable lookup for packet -> succeeds
|
||||
do nothing else
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Poor man's failover
|
||||
|
||||
Poor man's failover
|
||||
principle
|
||||
let every node do its own tracking rather than replicating state
|
||||
two possible implementations
|
||||
connect every node to shared media (i.e. real ethernet)
|
||||
forwarding only turned on on active node
|
||||
slave nodes use promiscuous mode to sniff packets
|
||||
copy all traffic to slave nodes
|
||||
active master needs to copy all traffic to other nodes
|
||||
disadvantage: high load, sync traffic == payload traffic
|
||||
IMHO stupid way of solving the problem
|
||||
advantages
|
||||
very easy implementation
|
||||
only addition of sniffing mode to conntrack needed
|
||||
existing means of address takeover can be used
|
||||
same load on active master and slave nodes
|
||||
no additional load on active master
|
||||
disadvantages
|
||||
can only be used with real shared media (no switches, ...)
|
||||
can not be used with NAT
|
||||
remaining problem
|
||||
no initial state sync after reboot of slave node!
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Real state replication
|
||||
|
||||
Parts needed
|
||||
state replication protocol
|
||||
multicast based
|
||||
sequence numbers for detection of packet loss
|
||||
NACK-based retransmission
|
||||
no security, since private ethernet segment to be used
|
||||
event interface on active node
|
||||
calling out to callback function at all state changes
|
||||
exported interface to manipulate conntrack hash table
|
||||
kernel thread for sending conntrack state protocol messages
|
||||
registers with event interface
|
||||
creates and accumulates state replication packets
|
||||
sends them via in-kernel sockets api
|
||||
kernel thread for receiving conntrack state replication messages
|
||||
receives state replication packets via in-kernel sockets
|
||||
uses conntrack hashtable manipulation interface
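The combination of sequence numbers and NACK-based retransmission listed above can be illustrated with a small in-memory model. The wire format, the class names `Sender` and `Receiver` and the bounded retransmission history are assumptions of this Python sketch; the slides do not specify them.

```python
from collections import OrderedDict


class Sender:
    """Active node: numbers messages and keeps a history for retransmission."""

    def __init__(self, history=1024):
        self._next_seq = 0
        self._sent = OrderedDict()
        self._history = history

    def send(self, payload):
        seq = self._next_seq
        self._next_seq += 1
        self._sent[seq] = payload
        while len(self._sent) > self._history:
            self._sent.popitem(last=False)      # forget the oldest message
        return (seq, payload)

    def handle_nack(self, missing):
        """Return the requested messages that are still in the history."""
        return [(seq, self._sent[seq]) for seq in missing if seq in self._sent]


class Receiver:
    """Slave node: detects gaps, requests them and applies updates in order."""

    def __init__(self):
        self._expected = 0
        self._pending = {}
        self.applied = []

    def receive(self, message):
        """Process one message; return the sequence numbers to NACK."""
        seq, payload = message
        if seq < self._expected or seq in self._pending:
            return []                            # duplicate, ignore
        self._pending[seq] = payload
        while self._expected in self._pending:
            self.applied.append(self._pending.pop(self._expected))
            self._expected += 1
        if not self._pending:
            return []
        highest = max(self._pending)
        return [s for s in range(self._expected, highest)
                if s not in self._pending]
```

If the second of three messages is lost, the receiver answers the third with a NACK for sequence number 1 and applies all three updates in order once the retransmission arrives.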
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Real state replication
|
||||
|
||||
Flow of events in chronological order:
|
||||
on active node, inside the network RX softirq
|
||||
connection tracking code is analyzing a forwarded packet
|
||||
connection tracking gathers some new state information
|
||||
connection tracking updates local connection tracking database
|
||||
connection tracking sends event message to event API
|
||||
on active node, inside the conntrack-sync kernel thread
|
||||
conntrack sync daemon receives event through event API
|
||||
conntrack sync daemon aggregates multiple event messages into a state replication protocol message, removing possible redundancy
|
||||
conntrack sync daemon generates state replication protocol message
|
||||
conntrack sync daemon sends state replication protocol message
|
||||
on slave node(s), inside network RX softirq
|
||||
connection tracking code ignores packets coming from the interface attached to the private conntrack sync network
|
||||
state replication protocol message is appended to socket receive queue of conntrack-sync kernel thread
|
||||
on slave node(s), inside conntrack-sync kernel thread
|
||||
conntrack sync daemon receives state replication message
|
||||
conntrack sync daemon creates/updates conntrack entry
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Necessary changes to kernel
|
||||
|
||||
Necessary changes to current conntrack core
|
||||
|
||||
event generation (callback functions) for all state changes
|
||||
|
||||
conntrack hashtable manipulation API
|
||||
is needed (and already implemented) for 'ctnetlink' API
|
||||
|
||||
conntrack exemptions
|
||||
needed to _not_ track conntrack state replication packets
|
||||
is needed for other cases as well
|
||||
currently being developed by Jozsef Kadlecsik
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Other current work
|
||||
|
||||
optimizing the conntrack code
|
||||
hash function optimization
|
||||
current hash function not good for even hash bucket count
|
||||
other hash functions in development
|
||||
hash function evaluation tool [cttest] available
|
||||
introduce per-system randomness to prevent hash attack
|
||||
code optimization (locking/timers/...)
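The idea of per-system randomness can be sketched as a keyed hash over the tuple: without knowing the seed, an attacker cannot precompute tuples that all land in the same bucket. The Python example below uses keyed BLAKE2b from `hashlib` purely as a stand-in. It is not the hash function used or proposed for conntrack, and `KeyedTupleHash` is a hypothetical name.

```python
import hashlib
import ipaddress
import os
import struct


class KeyedTupleHash:
    """Map a connection tuple to a bucket using a per-system secret seed."""

    def __init__(self, buckets, seed=None):
        self.buckets = buckets
        # A fresh random seed per boot; a fixed seed is only useful for tests.
        self._seed = os.urandom(16) if seed is None else seed

    def bucket(self, src, dst, proto, sport, dport):
        data = (ipaddress.IPv4Address(src).packed
                + ipaddress.IPv4Address(dst).packed
                + struct.pack("!BHH", proto, sport, dport))
        digest = hashlib.blake2b(data, key=self._seed, digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.buckets
```

Two instances with the same seed agree on every bucket, while an instance with a different seed distributes the same tuples differently.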
|
||||
|
||||
getting our work submitted into the mainstream kernel
|
||||
turns out to be more difficult
|
||||
e.g. newnat api now waiting for three months
|
||||
|
||||
discussions about multiple targets/actions per rule
|
||||
technical implementation easy
|
||||
however, not everybody convinced that it fits into the concept
|
||||
|
||||
using tc for firewalling
|
||||
Jamal Hadi Salim uses iptables targets from within TC
|
||||
leads to discussion of generic classification engine API in kernel
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Thanks
|
||||
The slides and the accompanying paper of this presentation are available at http://www.gnumonks.org/
|
||||
|
||||
The netfilter homepage http://www.netfilter.org/
|
||||
|
||||
Thanks to
|
||||
the BBS people, Z-Netz, FIDO, ...
|
||||
for heavily increasing my computer usage in 1992
|
||||
KNF
|
||||
for bringing me in touch with the internet as early as 1994
|
||||
for providing a playground for technical people
|
||||
for telling me about the existence of Linux!
|
||||
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
|
||||
for implementing (one of?) the world's best TCP/IP stacks
|
||||
Paul 'Rusty' Russell
|
||||
for starting the netfilter/iptables project
|
||||
for trusting me to maintain it today
|
||||
Astaro AG
|
||||
for sponsoring parts of my netfilter work
|
||||
|
|
@ -0,0 +1,374 @@
|
|||
%include "default.mgp"
|
||||
%default 1 bgrad
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
%nodefault
|
||||
%back "blue"
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
The future of Linux packet filtering
|
||||
targeted for kernel 2.6
|
||||
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <laforge@gnumonks.org>
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Contents
|
||||
|
||||
|
||||
Problems with current 2.4.x netfilter/iptables
|
||||
Solution to code replication
|
||||
Solution for dynamic rulesets
|
||||
Solution for API to GUI's and other management programs
|
||||
|
||||
HA for stateful firewalling
|
||||
What's special about firewalling HA
|
||||
Poor man's failover
|
||||
Real state replication
|
||||
|
||||
Other current work
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Problems with 2.4.x netfilter/iptables
|
||||
|
||||
code replication between iptables/ip6tables/arptables
|
||||
iptables was never meant for other protocols, but people did copy+paste 'ports'
|
||||
replication of
|
||||
core kernel code
|
||||
layer 3 independent matches (mac, interface, ...)
|
||||
userspace library (libiptc)
|
||||
userspace tool (iptables)
|
||||
userspace plugins (libipt_xxx.so)
|
||||
|
||||
doesn't suit the needs for dynamically changing rulesets
|
||||
dynamic rulesets becoming more common (service selection, IDS)
|
||||
a whole table is created in userspace and sent as blob to kernel
|
||||
for every ruleset change the table needs to be copied to userspace and back
|
||||
inside kernel consistency checks on whole table, loop detection
|
||||
|
||||
too extensible for writing any forward-compatible GUI
|
||||
new extensions showing up all the time
|
||||
a frontend would need to know about the options and use of a new extension
|
||||
thus frontends are always incomplete and out-of-date
|
||||
no high-level API other than piping to iptables-restore
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Reducing code replication
|
||||
|
||||
code replication is a real problem: unclean, bugfixes missed
|
||||
we need layer 3 independent layer for
|
||||
submitting rules to the kernel
|
||||
traversing packet-rulesets supporting match/target modules
|
||||
registering matches/targets
|
||||
layer 3 specific (like matching ipv4 address)
|
||||
layer 3 independent (like matching MAC address)
|
||||
|
||||
solution
|
||||
pkt_tables inside kernel
|
||||
pkt_tables_ipv4 registers layer 3 handler with pkt_tables
|
||||
pkt_tables_ipv6 registers layer 3 handler with pkt_tables
|
||||
everybody registering a pkt_table (like iptable_filter) needs to specify the l3 protocol
|
||||
libraries in userspace (see later)
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Supporting dynamic rulesets
|
||||
|
||||
atomic table-replacement turned out to be bad idea
|
||||
need new interface for sending individual rules to kernel
|
||||
policy routing has the same problem and good solution: rtnetlink
|
||||
solution: nfnetlink
|
||||
multicast-netlink based packet-oriented socket between kernel and userspace
|
||||
has extra benefit that other userspace processes get notified of rule changes [just like routing daemons]
|
||||
nfnetlink will be low-layer below all kernel/userspace communication
|
||||
pkttnetlink [aka iptnetlink]
|
||||
ctnetlink
|
||||
ulog
|
||||
ip_queue
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Communication with other programs
|
||||
|
||||
whole set of libraries
|
||||
libnfnetlink for low-layer communication
|
||||
libpkttnetlink for rule modifications
|
||||
will handle all plugins [which are currently part of iptables]
|
||||
query functions about available matches/targets
|
||||
query functions about parameters
|
||||
query functions for help messages about specific match/parameter of a match
|
||||
generic structure from which rules can be built
|
||||
conversion functions to parse generic structure into in-kernel structure
|
||||
conversion functions to parse kernel structure into generic structure
|
||||
functions to convert generic structure into plain text
|
||||
libipq will stay API-compatible to current version
|
||||
libipulog will stay API-compatible to current version
|
||||
libiptc will go away [compatibility layer extremely difficult]
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Introduction
|
||||
|
||||
What is special about firewall failover?
|
||||
|
||||
Nothing, in case of the stateless packet filter
|
||||
Common IP takeover solutions can be used
|
||||
VRRP
|
||||
Heartbeat
|
||||
|
||||
Distribution of packet filtering ruleset no problem
|
||||
can be done manually
|
||||
or implemented with simple userspace process
|
||||
|
||||
Problems arise with stateful packet filters
|
||||
Connection state only on active node
|
||||
NAT mappings only on active node
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Connection tracking...
|
||||
|
||||
implemented separately from NAT
|
||||
enables stateful filtering
|
||||
implementation
|
||||
hooks into NF_IP_PRE_ROUTING to track packets
|
||||
hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
|
||||
protocol modules (currently TCP/UDP/ICMP)
|
||||
application helpers (currently FTP, IRC, H.323, talk, SNMP)
|
||||
divides packets into the following four categories
|
||||
NEW - would establish new connection
|
||||
ESTABLISHED - part of already established connection
|
||||
RELATED - is related to established connection
|
||||
INVALID - (multicast, errors...)
|
||||
does _NOT_ filter packets itself
|
||||
can be utilized by iptables using the 'state' match
|
||||
is used by NAT Subsystem
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Common structures
|
||||
struct ip_conntrack_tuple, representing unidirectional flow
|
||||
layer 3 src + dst
|
||||
layer 4 protocol
|
||||
layer 4 src + dst
|
||||
|
||||
|
||||
connections represented as struct ip_conntrack
|
||||
original tuple
|
||||
reply tuple
|
||||
timeout
|
||||
l4 state private data
|
||||
app helper
|
||||
app helper private data
|
||||
expected connections
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Flow of events for new packet
|
||||
packet enters NF_IP_PRE_ROUTING
|
||||
tuple is derived from packet
|
||||
lookup conntrack hash table with hash(tuple) -> fails
|
||||
new ip_conntrack is allocated
|
||||
fill in original and reply == inverted(original) tuple
|
||||
initialize timer
|
||||
assign app helper if applicable
|
||||
see if we've been expected -> fails
|
||||
call layer 4 helper 'new' function
|
||||
|
||||
...
|
||||
|
||||
packet enters NF_IP_POST_ROUTING
|
||||
do hashtable lookup for packet -> fails
|
||||
place struct ip_conntrack in hashtable
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Flow of events for packet part of existing connection
|
||||
packet enters NF_IP_PRE_ROUTING
|
||||
tuple is derived from packet
|
||||
lookup conntrack hash table with hash(tuple)
|
||||
associate conntrack entry with skb->nfct
|
||||
call l4 protocol helper 'packet' function
|
||||
do l4 state tracking
|
||||
update timeouts as needed [i.e. TCP TIME_WAIT,...]
|
||||
|
||||
...
|
||||
|
||||
packet enters NF_IP_POST_ROUTING
|
||||
do hashtable lookup for packet -> succeeds
|
||||
do nothing else
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Poor man's failover
|
||||
|
||||
Poor man's failover
|
||||
principle
|
||||
let every node do its own tracking rather than replicating state
|
||||
two possible implementations
|
||||
connect every node to shared media (i.e. real ethernet)
|
||||
forwarding only turned on on active node
|
||||
slave nodes use promiscuous mode to sniff packets
|
||||
copy all traffic to slave nodes
|
||||
active master needs to copy all traffic to other nodes
|
||||
disadvantage: high load, sync traffic == payload traffic
|
||||
IMHO stupid way of solving the problem
|
||||
advantages
|
||||
very easy implementation
|
||||
only addition of sniffing mode to conntrack needed
|
||||
existing means of address takeover can be used
|
||||
same load on active master and slave nodes
|
||||
no additional load on active master
|
||||
disadvantages
|
||||
can only be used with real shared media (no switches, ...)
|
||||
can not be used with NAT
|
||||
remaining problem
|
||||
no initial state sync after reboot of slave node!
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Real state replication
|
||||
|
||||
Parts needed
|
||||
state replication protocol
|
||||
multicast based
|
||||
sequence numbers for detection of packet loss
|
||||
NACK-based retransmission
|
||||
no security, since private ethernet segment to be used
|
||||
event interface on active node
|
||||
calling out to callback function at all state changes
|
||||
exported interface to manipulate conntrack hash table
|
||||
kernel thread for sending conntrack state protocol messages
|
||||
registers with event interface
|
||||
creates and accumulates state replication packets
|
||||
sends them via in-kernel sockets api
|
||||
kernel thread for receiving conntrack state replication messages
|
||||
receives state replication packets via in-kernel sockets
|
||||
uses conntrack hashtable manipulation interface
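
The loss-detection part of the protocol above (sequence numbers plus NACK-based retransmission) can be sketched in userspace Python. This is only an illustrative model under assumptions: the class name `ReplicationReceiver`, its fields and the idea of buffering out-of-order messages are invented for this example and are not taken from the project's code.

```python
from typing import Dict, List


class ReplicationReceiver:
    """Hypothetical slave-side receiver: applies messages in order, NACKs gaps."""

    def __init__(self) -> None:
        self.next_seq = 0                     # next sequence number to apply
        self.pending: Dict[int, bytes] = {}   # out-of-order messages held back
        self.applied: List[bytes] = []        # payloads applied so far, in order

    def receive(self, seq: int, payload: bytes) -> List[int]:
        """Handle one message; return the sequence numbers to NACK (may be empty)."""
        if seq < self.next_seq or seq in self.pending:
            return []                         # duplicate, nothing to do
        self.pending[seq] = payload
        # Apply everything that has become contiguous.
        while self.next_seq in self.pending:
            self.applied.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        if not self.pending:
            return []
        # Messages are still held back, so something before them was lost.
        highest = max(self.pending)
        return [s for s in range(self.next_seq, highest) if s not in self.pending]
```

The receiver never acknowledges messages that arrive in order; it only speaks up when it sees a gap, which keeps the sync traffic low in the common case.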
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Real state replication
|
||||
|
||||
Flow of events in chronological order:
|
||||
on active node, inside the network RX softirq
|
||||
connection tracking code is analyzing a forwarded packet
|
||||
connection tracking gathers some new state information
|
||||
connection tracking updates local connection tracking database
|
||||
connection tracking sends event message to event API
|
||||
on active node, inside the conntrack-sync kernel thread
|
||||
conntrack sync daemon receives event through event API
|
||||
conntrack sync daemon aggregates multiple event messages into a state replication protocol message, removing possible redundancy
|
||||
conntrack sync daemon generates state replication protocol message
|
||||
conntrack sync daemon sends state replication protocol message
|
||||
on slave node(s), inside network RX softirq
|
||||
connection tracking code ignores packets coming from the interface attached to the private conntrack sync network
|
||||
state replication protocol message is appended to the socket receive queue of the conntrack-sync kernel thread
|
||||
on slave node(s), inside conntrack-sync kernel thread
|
||||
conntrack sync daemon receives state replication message
|
||||
conntrack sync daemon creates/updates conntrack entry
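
The active-node half of this flow (events delivered through callbacks, then aggregated so that redundant updates are dropped) might look roughly like the following Python model. All names here (`EventAPI`, `SyncAggregator`, the connection ids and state strings) are hypothetical and chosen only for illustration; the real design runs inside the kernel.

```python
from typing import Callable, Dict, List, Tuple

Callback = Callable[[str, str], None]


class EventAPI:
    """Hypothetical event interface: conntrack calls notify() on state changes."""

    def __init__(self) -> None:
        self._listeners: List[Callback] = []

    def register(self, callback: Callback) -> None:
        self._listeners.append(callback)

    def notify(self, conn_id: str, state: str) -> None:
        for callback in self._listeners:
            callback(conn_id, state)


class SyncAggregator:
    """Hypothetical sync daemon: keeps only the latest state per connection."""

    def __init__(self, events: EventAPI) -> None:
        self._latest: Dict[str, str] = {}
        events.register(self.on_event)

    def on_event(self, conn_id: str, state: str) -> None:
        self._latest[conn_id] = state     # a newer event overwrites an older one

    def flush(self) -> List[Tuple[str, str]]:
        """Build one replication message from the accumulated events."""
        message = list(self._latest.items())
        self._latest.clear()
        return message
```

Overwriting per connection is one simple way to remove redundancy: if a connection changes state twice between two flushes, only the final state is sent.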
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Necessary changes to kernel
|
||||
|
||||
Necessary changes to current conntrack core
|
||||
|
||||
event generation (callback functions) for all state changes
|
||||
|
||||
conntrack hashtable manipulation API
|
||||
is needed (and already implemented) for 'ctnetlink' API
|
||||
|
||||
conntrack exemptions
|
||||
needed to _not_ track conntrack state replication packets
|
||||
is needed for other cases as well
|
||||
currently being developed by Jozsef Kadlecsik
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Other current work
|
||||
|
||||
optimizing the conntrack code
|
||||
hash function optimization
|
||||
current hash function not good for even hash bucket count
|
||||
other hash functions in development
|
||||
hash function evaluation tool [cttest] available
|
||||
introduce per-system randomness to prevent hash attack
|
||||
code optimization (locking/timers/...)
|
||||
|
||||
getting our work submitted into the mainstream kernel
|
||||
turns out to be more difficult
|
||||
e.g. newnat api now waiting for three months
|
||||
|
||||
discussions about multiple targets/actions per rule
|
||||
technical implementation easy
|
||||
however, not everybody convinced that it fits into the concept
|
||||
|
||||
using tc for firewalling
|
||||
Jamal Hadi Salim uses iptables targets from within TC
|
||||
leads to discussion of generic classification engine API in kernel
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Thanks
|
||||
The slides and the accompanying paper for this presentation are available at http://www.gnumonks.org/
|
||||
|
||||
The netfilter homepage http://www.netfilter.org/
|
||||
|
||||
Thanks to
|
||||
the BBS people, Z-Netz, FIDO, ...
|
||||
for heavily increasing my computer usage in 1992
|
||||
KNF
|
||||
for bringing me in touch with the internet as early as 1994
|
||||
for providing a playground for technical people
|
||||
for telling me about the existence of Linux!
|
||||
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
|
||||
for implementing (one of?) the world's best TCP/IP stacks
|
||||
Paul 'Rusty' Russell
|
||||
for starting the netfilter/iptables project
|
||||
for trusting me to maintain it today
|
||||
Astaro AG
|
||||
for sponsoring parts of my netfilter work
|
||||
|
|
@ -0,0 +1,31 @@
|
|||
How to replicate the fire - HA for netfilter based firewalls.
|
||||
|
||||
With traditional, stateless firewalling (such as ipfwadm, ipchains) there is
|
||||
no need for special HA support in the firewalling subsystem. As long as all
|
||||
packet filtering rules and routing table entries are configured in exactly the
|
||||
same way, one can use any available tool for IP-Address takeover to accomplish
|
||||
the goal of failing over from one node to the other.
|
||||
|
||||
With Linux 2.4.x netfilter/iptables, the Linux firewalling code moves beyond
|
||||
traditional packet filtering. Netfilter provides a modular connection tracking
|
||||
subsystem which can be employed for stateful firewalling. The connection
|
||||
tracking subsystem gathers information about the state of all current network
|
||||
flows (connections). Packet filtering decisions and NAT information are
|
||||
associated with this state information.
|
||||
|
||||
In a high availability scenario, this connection tracking state needs to be
|
||||
replicated from the currently active firewall node to all standby slave
|
||||
firewall nodes. Only when all connection tracking state is replicated, the
|
||||
slave node will have all necessary state information at the time a failover
|
||||
event occurs.
|
||||
|
||||
The netfilter/iptables code currently does not have any functionality for
|
||||
replicating connection tracking state across multiple nodes. However,
|
||||
the author of this presentation, Harald Welte, has started a project for
|
||||
connection tracking state replication with netfilter/iptables.
|
||||
|
||||
The presentation will cover the architectural design and implementation
|
||||
of the connection tracking failover system. With respect to the date of
|
||||
the conference, it is to be expected that the project is still a
|
||||
work-in-progress at that time.
|
||||
|
|
@ -0,0 +1,22 @@
|
|||
<a href="http://www.gnumonks.org/users/laforge/">Harald Welte</a> is one
|
||||
of the five <a href="http://www.netfilter.org/">netfilter/iptables</a> core
|
||||
team members, and the current Linux 2.4.x firewalling maintainer.
|
||||
|
||||
His main interest in computing has always been networking. In the little time
|
||||
left besides netfilter/iptables related work, he's writing obscure documents
|
||||
like the <a href="http://www.gnumonks.org/ftp/pub/doc/uucp-over-ssl.html"> UUCP
|
||||
over SSL HOWTO</a>. Other kernel-related projects he has been contributing to are
|
||||
user mode linux and the international (crypto) kernel patch.
|
||||
|
||||
In the past he has been working as an independent IT Consultant working on
|
||||
closed-source projects for various companies ranging from banks to
|
||||
manufacturers of networking gear. During the year 2001 he was living in
|
||||
Curitiba (Brazil), where he got sponsored for his Linux related work by
|
||||
<a href="http://www.conectiva.com/">Conectiva Inc.</a>.
|
||||
|
||||
Starting with February 2002, Harald has been contracted part-time by
|
||||
<a href="http://www.astaro.com/">Astaro AG</a>, who are sponsoring him for his
|
||||
current netfilter/iptables work.
|
||||
|
||||
Harald is living in Erlangen, Germany.
|
||||
|
|
@ -0,0 +1,294 @@
|
|||
%include "default.mgp"
|
||||
%default 1 bgrad
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
%nodefault
|
||||
%back "blue"
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
How to replicate the fire
|
||||
HA for netfilter-based firewalls
|
||||
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <laforge@gnumonks.org>
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Contents
|
||||
|
||||
|
||||
Introduction
|
||||
Connection Tracking Subsystem
|
||||
Packet selection based on IP Tables
|
||||
The Connection Tracking Subsystem
|
||||
The NAT Subsystem
|
||||
Poor man's failover
|
||||
Real state replication
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Introduction
|
||||
|
||||
What is special about firewall failover?
|
||||
|
||||
Nothing, in case of the stateless packet filter
|
||||
Common IP takeover solutions can be used
|
||||
VRRP
|
||||
Heartbeat
|
||||
|
||||
Distribution of packet filtering ruleset no problem
|
||||
can be done manually
|
||||
or implemented with simple userspace process
|
||||
|
||||
Problems arise with stateful packet filters
|
||||
Connection state only on active node
|
||||
NAT mappings only on active node
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Connection tracking...
|
||||
|
||||
implemented separately from NAT
|
||||
enables stateful filtering
|
||||
implementation
|
||||
hooks into NF_IP_PRE_ROUTING to track packets
|
||||
hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
|
||||
protocol modules (currently TCP/UDP/ICMP)
|
||||
application helpers (currently FTP, IRC, H.323, talk, SNMP)
|
||||
divides packets into the following four categories
|
||||
NEW - would establish new connection
|
||||
ESTABLISHED - part of already established connection
|
||||
RELATED - is related to established connection
|
||||
INVALID - (multicast, errors...)
|
||||
does _NOT_ filter packets itself
|
||||
can be utilized by iptables using the 'state' match
|
||||
is used by NAT Subsystem
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Common structures
|
||||
struct ip_conntrack_tuple, representing unidirectional flow
|
||||
layer 3 src + dst
|
||||
layer 4 protocol
|
||||
layer 4 src + dst
|
||||
|
||||
|
||||
connections represented as struct ip_conntrack
|
||||
original tuple
|
||||
reply tuple
|
||||
timeout
|
||||
l4 state private data
|
||||
app helper
|
||||
app helper private data
|
||||
expected connections
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Flow of events for new packet
|
||||
packet enters NF_IP_PRE_ROUTING
|
||||
tuple is derived from packet
|
||||
lookup conntrack hash table with hash(tuple) -> fails
|
||||
new ip_conntrack is allocated
|
||||
fill in original and reply == inverted(original) tuple
|
||||
initialize timer
|
||||
assign app helper if applicable
|
||||
see if we've been expected -> fails
|
||||
call layer 4 helper 'new' function
|
||||
|
||||
...
|
||||
|
||||
packet enters NF_IP_POST_ROUTING
|
||||
do hashtable lookup for packet -> fails
|
||||
place struct ip_conntrack in hashtable
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Flow of events for packet part of existing connection
|
||||
packet enters NF_IP_PRE_ROUTING
|
||||
tuple is derived from packet
|
||||
lookup conntrack hash table with hash(tuple)
|
||||
associate conntrack entry with skb->nfct
|
||||
call l4 protocol helper 'packet' function
|
||||
do l4 state tracking
|
||||
update timeouts as needed [i.e. TCP TIME_WAIT,...]
|
||||
|
||||
...
|
||||
|
||||
packet enters NF_IP_POST_ROUTING
|
||||
do hashtable lookup for packet -> succeeds
|
||||
do nothing else
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Network Address Translation
|
||||
|
||||
Overview
|
||||
Previous Linux Kernels only implemented one special case of NAT: Masquerading
|
||||
Linux 2.4.x can do any kind of NAT.
|
||||
NAT subsystem implemented on top of netfilter, iptables and conntrack
|
||||
NAT subsystem registers with all five netfilter hooks
|
||||
'nat' Table registers chains PREROUTING, POSTROUTING and OUTPUT
|
||||
Following targets available within 'nat' Table
|
||||
SNAT changes the packet's source while passing NF_IP_POST_ROUTING
|
||||
DNAT changes the packet's destination while passing NF_IP_PRE_ROUTING
|
||||
MASQUERADE is a special case of SNAT
|
||||
REDIRECT is a special case of DNAT
|
||||
NAT bindings determined only for NEW packet and saved in ip_conntrack
|
||||
Further packets within connection NATed according to NAT bindings
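
How a NAT binding can be stored in the connection entry may be easier to see in a small Python model. It assumes that source NAT is recorded by changing the expected reply tuple, so that replies addressed to the translated address still match the connection. The names `FlowTuple`, `invert` and `snat_reply_tuple`, and all addresses, are invented for this sketch.

```python
from typing import NamedTuple


class FlowTuple(NamedTuple):
    """Userspace model of a conntrack tuple (hypothetical, not kernel code)."""
    srcip: str
    srcport: int
    dstip: str
    dstport: int
    l4proto: str


def invert(t: FlowTuple) -> FlowTuple:
    """Swap source and destination parts of a tuple."""
    return FlowTuple(t.dstip, t.dstport, t.srcip, t.srcport, t.l4proto)


def snat_reply_tuple(original: FlowTuple, nat_ip: str, nat_port: int) -> FlowTuple:
    """Reply tuple of a source-NATed connection.

    The binding is chosen once, for the NEW packet; replies then arrive
    addressed to the translated source, so the reply tuple is the inverse
    of the translated packet rather than of the original one.
    """
    translated = original._replace(srcip=nat_ip, srcport=nat_port)
    return invert(translated)


original = FlowTuple("10.0.0.2", 40000, "192.0.2.1", 80, "tcp")
reply = snat_reply_tuple(original, "198.51.100.1", 61000)
```

With NAT in place the reply tuple is no longer the plain inverse of the original tuple, which is why a connection entry has to store both.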
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Poor man's failover
|
||||
|
||||
Poor man's failover
|
||||
principle
|
||||
let every node do its own tracking rather than replicating state
|
||||
two possible implementations
|
||||
connect every node to shared media (i.e. real ethernet)
|
||||
forwarding only turned on on active node
|
||||
slave nodes use promiscuous mode to sniff packets
|
||||
copy all traffic to slave nodes
|
||||
active master needs to copy all traffic to other nodes
|
||||
disadvantage: high load, sync traffic == payload traffic
|
||||
IMHO stupid way of solving the problem
|
||||
advantages
|
||||
very easy implementation
|
||||
only addition of sniffing mode to conntrack needed
|
||||
existing means of address takeover can be used
|
||||
same load on active master and slave nodes
|
||||
no additional load on active master
|
||||
disadvantages
|
||||
can only be used with real shared media (no switches, ...)
|
||||
can not be used with NAT
|
||||
remaining problem
|
||||
no initial state sync after reboot of slave node!
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Real state replication
|
||||
|
||||
Parts needed
|
||||
state replication protocol
|
||||
multicast based
|
||||
sequence numbers for detection of packet loss
|
||||
NACK-based retransmission
|
||||
no security, since private ethernet segment to be used
|
||||
event interface on active node
|
||||
calling out to callback function at all state changes
|
||||
exported interface to manipulate conntrack hash table
|
||||
kernel thread for sending conntrack state protocol messages
|
||||
registers with event interface
|
||||
creates and accumulates state replication packets
|
||||
sends them via in-kernel sockets api
|
||||
kernel thread for receiving conntrack state replication messages
|
||||
receives state replication packets via in-kernel sockets
|
||||
uses conntrack hashtable manipulation interface
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Real state replication
|
||||
|
||||
Flow of events in chronological order:
|
||||
on active node, inside the network RX softirq
|
||||
connection tracking code is analyzing a forwarded packet
|
||||
connection tracking gathers some new state information
|
||||
connection tracking updates local connection tracking database
|
||||
connection tracking sends event message to event API
|
||||
on active node, inside the conntrack-sync kernel thread
|
||||
conntrack sync daemon receives event through event API
|
||||
conntrack sync daemon aggregates multiple event messages into a state replication protocol message, removing possible redundancy
|
||||
conntrack sync daemon generates state replication protocol message
|
||||
conntrack sync daemon sends state replication protocol message
|
||||
on slave node(s), inside network RX softirq
|
||||
connection tracking code ignores packets coming from the interface attached to the private conntrack sync network
|
||||
state replication protocol message is appended to the socket receive queue of the conntrack-sync kernel thread
|
||||
on slave node(s), inside conntrack-sync kernel thread
|
||||
conntrack sync daemon receives state replication message
|
||||
conntrack sync daemon creates/updates conntrack entry
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Necessary changes to kernel
|
||||
|
||||
Necessary changes to current conntrack core
|
||||
|
||||
event generation (callback functions) for all state changes
|
||||
|
||||
conntrack hashtable manipulation API
|
||||
is needed (and already implemented) for 'ctnetlink' API
|
||||
|
||||
conntrack exemptions
|
||||
needed to _not_ track conntrack state replication packets
|
||||
is needed for other cases as well
|
||||
currently being developed by Jozsef Kadlecsik
|
||||
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Thanks
|
||||
|
||||
Thanks to
|
||||
the BBS people, Z-Netz, FIDO, ...
|
||||
for heavily increasing my computer usage in 1992
|
||||
|
||||
KNF
|
||||
for bringing me in touch with the internet as early as 1994
|
||||
for providing a playground for technical people
|
||||
for telling me about the existence of Linux!
|
||||
|
||||
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
|
||||
for implementing (one of?) the world's best TCP/IP stacks
|
||||
|
||||
Paul 'Rusty' Russell
|
||||
for starting the netfilter/iptables project
|
||||
for trusting me to maintain it today
|
||||
|
||||
Astaro AG
|
||||
for sponsoring parts of my netfilter work
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Availability of slides / Links
|
||||
|
||||
The slides and the accompanying paper for this presentation are available at
|
||||
http://www.gnumonks.org/
|
||||
|
||||
The netfilter homepage
|
||||
http://www.netfilter.org/
|
||||
|
|
@ -0,0 +1,504 @@
|
|||
\documentclass[twocolumn]{article}
|
||||
\usepackage{ols}
|
||||
\begin{document}
|
||||
|
||||
\date{}
|
||||
|
||||
\title{\Large \bf How to replicate the fire - HA for netfilter based firewalls}
|
||||
|
||||
\author{
|
||||
Harald Welte\\
|
||||
{\em Netfilter Core Team + Astaro AG}\\
|
||||
{\normalsize laforge@gnumonks.org/laforge@astaro.com, http://www.gnumonks.org/}
|
||||
}
|
||||
|
||||
\maketitle
|
||||
|
||||
\thispagestyle{empty}
|
||||
|
||||
\subsection*{Abstract}
|
||||
With traditional, stateless firewalling (such as ipfwadm, ipchains) there is
|
||||
no need for special HA support in the firewalling subsystem. As long as all
|
||||
packet filtering rules and routing table entries are configured in exactly the
|
||||
same way, one can use any available tool for IP-Address takeover to accomplish
|
||||
the goal of failing over from one node to the other.
|
||||
|
||||
With Linux 2.4.x netfilter/iptables, the Linux firewalling code moves beyond
|
||||
traditional packet filtering. Netfilter provides a modular connection tracking
|
||||
subsystem which can be employed for stateful firewalling. The connection
|
||||
tracking subsystem gathers information about the state of all current network
|
||||
flows (connections). Packet filtering decisions and NAT information are
|
||||
associated with this state information.
|
||||
|
||||
In a high availability scenario, this connection tracking state needs to be
|
||||
replicated from the currently active firewall node to all standby slave
|
||||
firewall nodes. Only when all connection tracking state is replicated, the
|
||||
slave node will have all necessary state information at the time a failover
|
||||
event occurs.
|
||||
|
||||
The netfilter/iptables code currently does not have any functionality for
|
||||
replicating connection tracking state across multiple nodes. However,
|
||||
the author of this presentation, Harald Welte, has started a project for
|
||||
connection tracking state replication with netfilter/iptables.
|
||||
|
||||
The presentation will cover the architectural design and implementation
|
||||
of the connection tracking failover system. With respect to the date of
|
||||
the conference, it is to be expected that the project is still a
|
||||
work-in-progress at that time.
|
||||
|
||||
\section{Failover of stateless firewalls}
|
||||
|
||||
There are no special precautions when installing a highly available
|
||||
stateless packet filter. Since there is no state kept, all information
|
||||
needed for filtering is the ruleset and the individual, separate packets.
|
||||
|
||||
Building a set of highly available stateless packet filters can thus be
|
||||
achieved by using any traditional means of IP-address takeover, such
|
||||
as Heartbeat or VRRPd.
|
||||
|
||||
The only remaining issue is to make sure the firewalling ruleset is
|
||||
exactly the same on both machines. This should be ensured by the firewall
|
||||
administrator every time he updates the ruleset.
|
||||
|
||||
If this is not applicable, because a very dynamic ruleset is employed, one
|
||||
can build a very easy solution using iptables-supplied tools iptables-save
|
||||
and iptables-restore. The output of iptables-save can be piped over ssh
|
||||
to iptables-restore on a different host.
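
As a minimal sketch of that pipeline, the following Python wrapper connects iptables-save to iptables-restore over ssh. It assumes root privileges on both machines and a working non-interactive ssh login; the function names and the host name used in the usage note are made up for this example.

```python
import subprocess
from typing import List, Tuple


def sync_commands(standby_host: str) -> Tuple[List[str], List[str]]:
    """Return the local dump command and the remote restore command."""
    return ["iptables-save"], ["ssh", standby_host, "iptables-restore"]


def push_ruleset(standby_host: str) -> None:
    """Pipe the local ruleset into iptables-restore on the standby node."""
    save_cmd, restore_cmd = sync_commands(standby_host)
    save = subprocess.Popen(save_cmd, stdout=subprocess.PIPE)
    restore = subprocess.Popen(restore_cmd, stdin=save.stdout)
    # Drop our copy of the pipe so iptables-save notices if ssh exits early.
    save.stdout.close()
    if restore.wait() != 0 or save.wait() != 0:
        raise RuntimeError("ruleset synchronisation failed")
```

Calling `push_ruleset("fw2")` after every ruleset change would keep a standby node named fw2 in step with the active one; the Limitations listed below still apply.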
|
||||
|
||||
Limitations
|
||||
\begin{itemize}
|
||||
\item
|
||||
no state tracking
|
||||
\item
|
||||
not possible in combination with NAT
|
||||
\item
|
||||
no counter consistency of per-rule packet/byte counters
|
||||
\end{itemize}
|
||||
|
||||
\section{Failover of stateful firewalls}
|
||||
|
||||
Modern firewalls implement state tracking (aka connection tracking) in order
|
||||
to keep some state about the currently active sessions. The amount of
|
||||
per-connection state kept at the firewall depends on the particular
|
||||
implementation.
|
||||
|
||||
As soon as {\bf any} state is kept at the packet filter, this state information
|
||||
needs to be replicated to the slave/backup nodes within the failover setup.
|
||||
|
||||
In Linux 2.4.x, all relevant state is kept within the {\it connection tracking
|
||||
subsystem}. In order to understand how this state could possibly be
|
||||
replicated, we need to understand the architecture of this conntrack subsystem.
|
||||
|
||||
\subsection{Architecture of the Linux Connection Tracking Subsystem}
|
||||
|
||||
Connection tracking within Linux is implemented as a netfilter module, called
|
||||
ip\_conntrack.o.
|
||||
|
||||
Before describing the connection tracking subsystem, we need to describe a
|
||||
couple of definitions and primitives used throughout the conntrack code.
|
||||
|
||||
A connection is represented within the conntrack subsystem using {\it struct
|
||||
ip\_conntrack}, also called {\it connection tracking entry}.
|
||||
|
||||
Connection tracking is utilizing {\it conntrack tuples}, which are tuples
|
||||
consisting of (srcip, srcport, dstip, dstport, l4prot). A connection is
|
||||
uniquely identified by two tuples: The tuple in the original direction
|
||||
(IP\_CT\_DIR\_ORIGINAL) and the tuple for the reply direction
|
||||
(IP\_CT\_DIR\_REPLY).
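
To make the two directions concrete, here is a small userspace Python model of a conntrack tuple. It is only an illustration: the names `ConntrackTuple` and `invert` and the example addresses are assumptions of this sketch, and it covers only the case without NAT, where the reply tuple is the original tuple with source and destination swapped.

```python
from typing import NamedTuple


class ConntrackTuple(NamedTuple):
    """Userspace model of a conntrack tuple (hypothetical, not kernel code)."""
    srcip: str
    srcport: int
    dstip: str
    dstport: int
    l4prot: str


def invert(t: ConntrackTuple) -> ConntrackTuple:
    """Swap the source and destination parts to get the opposite direction."""
    return ConntrackTuple(t.dstip, t.dstport, t.srcip, t.srcport, t.l4prot)


# Original direction: client 10.0.0.2 connects to a web server.
original = ConntrackTuple("10.0.0.2", 40000, "192.0.2.1", 80, "tcp")
# Reply direction: what packets from the server back to the client look like.
reply = invert(original)
```

A packet belongs to the connection if its tuple equals either `original` or `reply`.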
|
||||
|
||||
Connection tracking itself does not drop packets\footnote{well, in some rare
|
||||
cases in combination with NAT it needs to drop. But don't tell anyone, this is
|
||||
secret.} or impose any policy. It just associates every packet with a
|
||||
connection tracking entry, which in turn has a particular state. All other
|
||||
kernel code can use this state information\footnote{state information is
|
||||
internally represented via the {\it struct sk\_buff.nfct} structure member of a
|
||||
packet.}.
|
||||
|
||||
\subsubsection{Integration of conntrack with netfilter}
|
||||
|
||||
If the ip\_conntrack.o module is registered with netfilter, it attaches to the
|
||||
NF\_IP\_PRE\_ROUTING, NF\_IP\_POST\_ROUTING, NF\_IP\_LOCAL\_IN and
|
||||
NF\_IP\_LOCAL\_OUT hooks.
|
||||
|
||||
Because forwarded packets are the most common case on firewalls, I will only
|
||||
describe how connection tracking works for forwarded packets. The two relevant
|
||||
hooks for forwarded packets are NF\_IP\_PRE\_ROUTING and NF\_IP\_POST\_ROUTING.
|
||||
|
||||
Every time a packet arrives at the NF\_IP\_PRE\_ROUTING hook, connection
|
||||
tracking creates a conntrack tuple from the packet. It then compares this
|
||||
tuple to the original and reply tuples of all already-seen connections
|
||||
\footnote{Of course this is not implemented as a linear search over all existing connections.} to find out if this just-arrived packet belongs to any existing
|
||||
connection. If there is no match, a new conntrack table entry (struct
|
||||
ip\_conntrack) is created.
|
||||
|
||||
Let's assume the case where we have no existing connections yet and are
|
||||
starting from scratch.
|
||||
|
||||
The first packet comes in, we derive the tuple from the packet headers, look up
|
||||
the conntrack hash table, and don't find any matching entry. As a result, we
|
||||
create a new struct ip\_conntrack. This struct ip\_conntrack is filled with
|
||||
all necessary data, like the original and reply tuple of the connection.
|
||||
How do we know the reply tuple? By inverting the source and destination
|
||||
parts of the original tuple.\footnote{So why do we need two tuples, if they can
|
||||
be derived from each other? Wait until we discuss NAT.}
|
||||
Please note that this new struct ip\_conntrack is {\bf not} yet placed
|
||||
into the conntrack hash table.
|
||||
|
||||
The packet is now passed on to other callback functions which have registered
|
||||
with a lower priority at NF\_IP\_PRE\_ROUTING. It then continues traversal of
|
||||
the network stack as usual, including all respective netfilter hooks.
|
||||
|
||||
If the packet survives (i.e. is not dropped by the routing code, network stack,
|
||||
firewall ruleset, ...), it re-appears at NF\_IP\_POST\_ROUTING. In this case,
|
||||
we can now safely assume that this packet will be sent off on the outgoing
|
||||
interface, and thus put the connection tracking entry which we created at
|
||||
NF\_IP\_PRE\_ROUTING into the conntrack hash table. This process is called
|
||||
{\it confirming the conntrack}.
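
The lookup, tuple inversion and confirmation steps described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the kernel implementation: the names {\tt Tuple}, {\tt Conntrack}, {\tt ConntrackTable}, {\tt pre\_routing} and {\tt post\_routing} are hypothetical, and a plain dictionary stands in for the conntrack hash table.

```python
from collections import namedtuple

# Hypothetical stand-in for a conntrack tuple; field names follow the text.
Tuple = namedtuple("Tuple", "srcip srcport dstip dstport l4prot")


def invert(t):
    """Derive the reply tuple by swapping source and destination parts."""
    return Tuple(t.dstip, t.dstport, t.srcip, t.srcport, t.l4prot)


class Conntrack:
    """One connection: original tuple, reply tuple, confirmation flag."""

    def __init__(self, original):
        self.tuples = (original, invert(original))  # ORIGINAL, REPLY
        self.confirmed = False


class ConntrackTable:
    def __init__(self):
        # Confirmed entries only, reachable through either tuple.
        self.by_tuple = {}

    def pre_routing(self, pkt_tuple):
        """Return the matching entry, or a new entry that is NOT yet stored."""
        ct = self.by_tuple.get(pkt_tuple)
        if ct is None:
            ct = Conntrack(pkt_tuple)
        return ct

    def post_routing(self, ct):
        """Confirm the conntrack: the packet survived, so store the entry."""
        if not ct.confirmed:
            for t in ct.tuples:
                self.by_tuple[t] = ct
            ct.confirmed = True
```

A first packet yields an unconfirmed entry at {\tt pre\_routing}; only after {\tt post\_routing} does a packet in the reply direction find the same entry. A packet that is dropped in between leaves no trace in the table.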
|
||||
|
||||
The connection tracking code itself is not monolithic, but consists of a
|
||||
couple of separate modules\footnote{They don't actually have to be separate
|
||||
kernel modules; e.g. TCP, UDP and ICMP tracking modules are all part of
|
||||
the Linux kernel module ip\_conntrack.o}. Besides the conntrack core, there
|
||||
are two important kinds of modules: protocol helpers and application helpers.
|
||||
|
||||
Protocol helpers implement the layer-4-protocol specific parts. They currently
|
||||
exist for TCP, UDP and ICMP (an experimental helper for GRE exists).
|
||||
|
||||
\subsubsection{TCP connection tracking}
|
||||
|
||||
As TCP is a connection oriented protocol, it is not very difficult to imagine
|
||||
how connection tracking for this protocol could work. There are well-defined
|
||||
state transitions possible, and conntrack can decide which state transitions
|
||||
are valid within the TCP specification. In reality it's not all that easy,
|
||||
since we cannot assume that all packets that pass the packet filter actually
|
||||
arrive at the receiving end, ...
|
||||
|
||||
It is noteworthy that the standard connection tracking code does {\bf not}
|
||||
do TCP sequence number and window tracking. A well-maintained patch to add
|
||||
this feature exists almost as long as connection tracking itself. It will
|
||||
be integrated with the 2.5.x kernel. The problem with window tracking is
|
||||
its bad interaction with connection pickup. The TCP conntrack code is able to
|
||||
pick up already existing connections, e.g. in case your firewall was rebooted.
|
||||
However, connection pickup conflicts with TCP window tracking: The TCP
|
||||
window scaling option is only transferred at connection setup time, and we
|
||||
don't know about it in case of pickup...
|
||||
|
||||
\subsubsection{ICMP tracking}
|
||||
|
||||
ICMP is not really a connection oriented protocol. So how is it possible to
|
||||
do connection tracking for ICMP?
|
||||
|
||||
ICMP messages can be split into two groups:
|
||||
|
||||
\begin{itemize}
|
||||
\item
|
||||
ICMP error messages, which sort-of belong to a different connection.
|
||||
ICMP error messages are classified as {\it RELATED} to a different connection.
|
||||
(ICMP\_DEST\_UNREACH, ICMP\_SOURCE\_QUENCH, ICMP\_TIME\_EXCEEDED,
|
||||
ICMP\_PARAMETERPROB, ICMP\_REDIRECT).
|
||||
\item
|
||||
ICMP queries, which have a request->reply character. So what the conntrack
|
||||
code does, is let the request have a state of {\it NEW}, and the reply
|
||||
{\it ESTABLISHED}. The reply closes the connection immediately.
|
||||
(ICMP\_ECHO, ICMP\_TIMESTAMP, ICMP\_INFO\_REQUEST, ICMP\_ADDRESS)
|
||||
\end{itemize}
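
The split into error messages and queries can be illustrated with a small classifier. The type numbers are the standard ICMP type values (named as in the Linux headers). The function {\tt icmp\_state}, the {\tt pending} set and the {\tt inner\_known} flag are hypothetical simplifications: the flag stands for "the packet embedded in the error message matches a known connection", which the real code has to determine itself.

```python
# ICMP type numbers (RFC 792, RFC 950), named as in <linux/icmp.h>.
ICMP_ECHOREPLY = 0
ICMP_DEST_UNREACH = 3
ICMP_SOURCE_QUENCH = 4
ICMP_REDIRECT = 5
ICMP_ECHO = 8
ICMP_TIME_EXCEEDED = 11
ICMP_PARAMETERPROB = 12
ICMP_TIMESTAMP = 13
ICMP_TIMESTAMPREPLY = 14
ICMP_INFO_REQUEST = 15
ICMP_INFO_REPLY = 16
ICMP_ADDRESS = 17
ICMP_ADDRESSREPLY = 18

ERRORS = {ICMP_DEST_UNREACH, ICMP_SOURCE_QUENCH, ICMP_TIME_EXCEEDED,
          ICMP_PARAMETERPROB, ICMP_REDIRECT}

# Query type -> type of the reply it expects.
REPLY_FOR = {ICMP_ECHO: ICMP_ECHOREPLY,
             ICMP_TIMESTAMP: ICMP_TIMESTAMPREPLY,
             ICMP_INFO_REQUEST: ICMP_INFO_REPLY,
             ICMP_ADDRESS: ICMP_ADDRESSREPLY}


def icmp_state(icmp_type, ident, pending, inner_known=False):
    """Classify one ICMP message; 'pending' holds expected (type, id) replies."""
    if icmp_type in ERRORS:
        # Errors never form a connection of their own.
        return "RELATED" if inner_known else "INVALID"
    if icmp_type in REPLY_FOR:
        pending.add((REPLY_FOR[icmp_type], ident))
        return "NEW"
    if (icmp_type, ident) in pending:
        # The reply closes the 'connection' immediately.
        pending.remove((icmp_type, ident))
        return "ESTABLISHED"
    return "INVALID"
```

An echo request is {\it NEW}, the matching echo reply is {\it ESTABLISHED}, and a second, unsolicited echo reply with the same id is no longer accepted because the first reply already closed the entry.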
|
||||
|
||||
\subsubsection{UDP connection tracking}
|
||||
|
||||
UDP is designed as a connectionless datagram protocol. But most common
|
||||
protocols using UDP as layer 4 protocol have bi-directional UDP communication.
|
||||
Imagine a DNS query, where the client sends a UDP datagram to port 53 of the
|
||||
nameserver, and the nameserver sends back a DNS reply packet from its UDP
|
||||
port 53 to the client.
|
||||
|
||||
Netfilter treats this as a connection. The first packet (the DNS request) is
|
||||
assigned a state of {\it NEW}, because the packet is expected to create a new
|
||||
'connection'. The DNS server's reply packet is marked as {\it ESTABLISHED}.
|
||||
|
||||
\subsubsection{conntrack application helpers}
|
||||
|
||||
More complex application protocols involving multiple connections need special
|
||||
support by a so-called ``conntrack application helper module''. Modules in
|
||||
the stock kernel exist for FTP and IRC (DCC). Netfilter CVS currently contains
|
||||
patches for PPTP, H.323, Eggdrop botnet, TFTP and talk. We're still lacking
|
||||
a lot of protocols (e.g. SIP, SMB/CIFS) - but they are unlikely to appear
|
||||
until somebody really needs them and either develops them on his own or
|
||||
funds development.
|
||||
|
||||
\subsubsection{Integration of connection tracking with iptables}
|
||||
|
||||
As stated earlier, conntrack doesn't impose any policy on packets. It just
|
||||
determines the relation of a packet to already existing connections. To base
|
||||
packet filtering decisions on this state information, the iptables {\it state}
|
||||
match can be used. Every packet is within one of the following categories:
|
||||
|
||||
\begin{itemize}
|
||||
\item
|
||||
{\bf NEW}: packet would create a new connection, if it survives
|
||||
\item
|
||||
{\bf ESTABLISHED}: packet is part of an already established connection
|
||||
(either direction)
|
||||
\item
|
||||
{\bf RELATED}: packet is in some way related to an already established connection, e.g. ICMP errors or FTP data sessions
|
||||
\item
|
||||
{\bf INVALID}: conntrack is unable to derive conntrack information from this packet. Please note that all multicast or broadcast packets fall in this category.
|
||||
\end{itemize}
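
To make the four categories concrete, here is a minimal sketch of how a packet could be sorted into them. It assumes that confirmed connections are kept as a set of original-direction tuples and that helpers (e.g. for FTP) contribute a set of expected tuples; the function {\tt classify} and both sets are hypothetical names, and the broadcast check only covers the limited broadcast address.

```python
import ipaddress


def classify(pkt, established, expected):
    """Sort a (srcip, srcport, dstip, dstport, l4prot) tuple into a category."""
    src, sport, dst, dport, proto = pkt
    if ipaddress.ip_address(dst).is_multicast or dst == "255.255.255.255":
        return "INVALID"        # multicast and broadcast are not tracked
    reply = (dst, dport, src, sport, proto)
    if pkt in established or reply in established:
        return "ESTABLISHED"    # either direction of a known connection
    if pkt in expected:
        return "RELATED"        # announced by a helper, e.g. an FTP data session
    return "NEW"
```

With the DNS example from above, the query is {\it NEW}; once its tuple is part of {\tt established}, the reply from port 53 is {\it ESTABLISHED} because its inverted tuple matches.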
|
||||
|
||||
|
||||
\subsection{Poor man's conntrack failover}
|
||||
|
||||
When thinking about failover of stateful firewalls, one usually thinks about
|
||||
replication of state. This presumes that the state is gathered at one
|
||||
firewalling node (the currently active node), and replicated to several other
|
||||
passive standby nodes. There is, however, a very different approach to
|
||||
replication: concurrent state tracking on all firewalling nodes.
|
||||
|
||||
The basic assumption of this approach is: In a setup where all firewalling
|
||||
nodes receive exactly the same traffic, all nodes will deduce the same state
|
||||
information.
|
||||
|
||||
The feasibility of this approach depends entirely on the fulfillment of
|
||||
this assumption.
|
||||
|
||||
\begin{itemize}
|
||||
\item
|
||||
{\it All packets need to be seen by all nodes}. This is not always true, but
|
||||
can be achieved by using shared media like traditional ethernet (no switches!!)
|
||||
and promiscuous mode on all ethernet interfaces.
|
||||
\item
|
||||
{\it All nodes need to be able to process all packets}. This cannot be
|
||||
universally guaranteed. Even if the hardware (CPU, RAM, chipset, NICs) and
|
||||
software (Linux kernel) are exactly the same, they might behave differently,
|
||||
especially under high load. To avoid those effects, the hardware should be
|
||||
able to deal with way more traffic than seen during operation. Also, there
|
||||
should be no userspace processes (like proxies, etc.) running on the firewalling
|
||||
nodes at all. WARNING: Nobody guarantees this behaviour. However, the poor
|
||||
man is usually not interested in scientific proof but in usability in his
|
||||
particular practical setup.
|
||||
\end{itemize}
|
||||
|
||||
However, even if those conditions are fulfilled, there are remaining issues:
|
||||
\begin{itemize}
|
||||
\item
|
||||
{\it No resynchronization after reboot}. If a node is rebooted (because of
|
||||
a hardware fault, software bug, software update, ..) it will lose all state
|
||||
information gathered up to the reboot. This means the state information
|
||||
of this node after reboot will not contain any old state, gathered before the
|
||||
reboot. The effect depends on the traffic. Generally, it is only assured that
|
||||
state information about all connections initiated after the reboot will be
|
||||
present. If there are short-lived connections (like http), the state
|
||||
information on the just rebooted node will approximate the state information of
|
||||
an older node. Only after all sessions active at the time of reboot have
|
||||
terminated, state information is guaranteed to be resynchronized.
|
||||
\item
|
||||
{\it Only possible with shared medium}. The practical implication is that no
|
||||
switched ethernet (and thus no full duplex) can be used.
|
||||
\end{itemize}
|
||||
|
||||
The major advantage of the poor man's approach is implementation simplicity.
|
||||
No state transfer mechanism needs to be developed. Only very few changes
|
||||
to the existing conntrack code would be needed in order to be able to
|
||||
do tracking based on packets received from promiscuous interfaces. The active
|
||||
node would have packet forwarding turned on, the passive nodes off.
|
||||
|
||||
I'm not proposing this as a real solution to the failover problem. It's
|
||||
hackish, buggy and likely to break very easily. But considering it can be
|
||||
implemented in very little programming time, it could be an option for very
|
||||
small installations with low reliability criteria.
|
||||
|
||||
\subsection{Conntrack state replication}
|
||||
|
||||
The preferred solution to the failover problem is, without any doubt,
|
||||
replication of the connection tracking state.
|
||||
|
||||
The proposed conntrack state replication solution consists of several
|
||||
parts:
|
||||
\begin{itemize}
|
||||
\item
|
||||
A connection tracking state replication protocol
|
||||
\item
|
||||
An event interface generating event messages as soon as state information
|
||||
changes on the active node
|
||||
\item
|
||||
An interface for explicit generation of connection tracking table entries on
|
||||
the standby slaves
|
||||
\item
|
||||
Some code (preferably a kernel thread) running on the active node, receiving
|
||||
state updates by the event interface and generating conntrack state replication
|
||||
protocol messages
|
||||
\item
|
||||
Some code (preferably a kernel thread) running on the slave node(s), receiving
|
||||
conntrack state replication protocol messages and updating the local conntrack
|
||||
table accordingly
|
||||
\end{itemize}
|
||||
|
||||
Flow of events in chronological order:
|
||||
\begin{itemize}
|
||||
\item
|
||||
{\it on active node, inside the network RX softirq}
|
||||
\begin{itemize}
|
||||
\item
|
||||
connection tracking code is analyzing a forwarded packet
|
||||
\item
|
||||
connection tracking gathers some new state information
|
||||
\item
|
||||
connection tracking updates local connection tracking database
|
||||
\item
|
||||
connection tracking sends event message to event API
|
||||
\end{itemize}
|
||||
\item
|
||||
{\it on active node, inside the conntrack-sync kernel thread}
|
||||
\begin{itemize}
|
||||
\item
|
||||
conntrack sync daemon receives event through event API
|
||||
\item
|
||||
conntrack sync daemon aggregates multiple event messages into a state replication protocol message, removing possible redundancy
|
||||
\item
|
||||
conntrack sync daemon generates state replication protocol message
|
||||
\item
|
||||
conntrack sync daemon sends state replication protocol message
|
||||
(private network between firewall nodes)
|
||||
\end{itemize}
|
||||
\item
|
||||
{\it on slave node(s), inside network RX softirq}
|
||||
\begin{itemize}
|
||||
\item
|
||||
connection tracking code ignores packets coming from the interface attached to the private conntrack sync network
|
||||
\item
|
||||
state replication protocol message is appended to socket receive queue of conntrack-sync kernel thread
|
||||
\end{itemize}
|
||||
\item
|
||||
{\it on slave node(s), inside conntrack-sync kernel thread}
|
||||
\begin{itemize}
|
||||
\item
|
||||
conntrack sync daemon receives state replication message
|
||||
\item
|
||||
conntrack sync daemon creates/updates conntrack entry
|
||||
\end{itemize}
|
||||
\end{itemize}
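
The aggregation step on the active node ("removing possible redundancy") could look roughly like the following sketch. The event representation as {\tt (kind, conntrack\_id, state)} triples and the function name {\tt aggregate} are assumptions made for illustration only; in particular, dropping an entry that is created and expired within one batch is a design choice of this sketch, not something the proposal prescribes.

```python
def aggregate(events):
    """Collapse a batch of (kind, conntrack_id, state) events.

    Multiple updates of one entry are merged into a single message, and an
    entry that is created and expired within the same batch is dropped.
    """
    merged = {}  # conntrack_id -> [kind, state]; insertion order is kept
    for kind, cid, state in events:
        prev = merged.get(cid)
        if prev is None or kind == "NEW":
            merged[cid] = [kind, dict(state or {})]
        elif kind == "UPDATE":
            prev[1].update(state)          # keep the kind, take newest values
        else:  # "EXPIRE"
            if prev[0] == "NEW":
                del merged[cid]            # the slaves never need to see it
            else:
                merged[cid] = ["EXPIRE", {}]
    return [(kind, cid, state) for cid, (kind, state) in merged.items()]
```

Merging this way trades a small delay (events wait until the batch is sent) for fewer and smaller replication messages on a node that is already busy forwarding packets.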
|
||||
|
||||
|
||||
\subsubsection{Connection tracking state replication protocol}
|
||||
|
||||
|
||||
In order to be able to replicate the state between two or more firewalls, a
|
||||
state replication protocol is needed. This protocol is used over a private
|
||||
network segment shared by all nodes for state replication. It is designed to
|
||||
work over IP unicast and IP multicast transport. IP unicast will be used for
|
||||
direct point-to-point communication between one active firewall and one
|
||||
standby firewall. IP multicast will be used when the state needs to be
|
||||
replicated to more than one standby firewall.
|
||||
|
||||
|
||||
The principal design criteria of this protocol are:
|
||||
\begin{itemize}
|
||||
\item
|
||||
{\bf reliable against data loss}, as the underlying UDP layer only
|
||||
provides checksumming against data corruption, but doesn't employ any
|
||||
means against data loss
|
||||
\item
|
||||
{\bf lightweight}, since generating the state update messages is
|
||||
already a very expensive process for the sender, eating additional CPU,
|
||||
memory and IO bandwidth.
|
||||
\item
|
||||
{\bf easy to parse}, to minimize overhead at the receiver(s)
|
||||
\end{itemize}
|
||||
|
||||
The protocol does not employ any security mechanism like encryption,
|
||||
authentication or reliability against spoofing attacks. It is
|
||||
assumed that the private conntrack sync network is a secure communications
|
||||
channel, not accessible to any malicious 3rd party.
|
||||
|
||||
To achieve reliability against data loss, a simple sequence numbering
|
||||
scheme is used. All protocol messages are prefixed by a sequence number,
|
||||
determined by the sender. If the slave detects packet loss by discontinuous
|
||||
sequence numbers, it can request the retransmission of the missing packets
|
||||
by stating the missing sequence number(s). Since there is no acknowledgement
|
||||
for successfully received packets, the sender has to keep a reasonably-sized
|
||||
backlog of recently-sent packets in order to be able to fulfill retransmission
|
||||
requests.
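
The sequence numbering and retransmission scheme can be sketched as follows. The classes {\tt Sender} and {\tt Receiver}, the backlog size and the in-order delivery buffer are hypothetical choices for this illustration; the proposal itself only fixes the principle (sequence numbers, negative acknowledgements, a bounded backlog on the sender).

```python
from collections import deque


class Sender:
    def __init__(self, backlog_size=64):
        self.seq = 0
        # Bounded backlog of recently sent (seq, payload) messages.
        self.backlog = deque(maxlen=backlog_size)

    def send(self, payload):
        self.seq += 1
        msg = (self.seq, payload)
        self.backlog.append(msg)
        return msg

    def retransmit(self, missing):
        """Answer a retransmission request, as far as the backlog reaches."""
        wanted = set(missing)
        return [m for m in self.backlog if m[0] in wanted]


class Receiver:
    def __init__(self):
        self.expected = 1     # next sequence number to deliver
        self.pending = {}     # messages that arrived ahead of a gap
        self.delivered = []

    def receive(self, msg):
        """Process one message; return the sequence numbers to request again."""
        seq, payload = msg
        if seq < self.expected:
            return []         # duplicate of something already delivered
        self.pending[seq] = payload
        while self.expected in self.pending:
            self.delivered.append(self.pending.pop(self.expected))
            self.expected += 1
        if not self.pending:
            return []
        return [s for s in range(self.expected, max(self.pending))
                if s not in self.pending]
```

If a requested message has already fallen out of the backlog, {\tt retransmit} simply returns nothing for it; what the slave should do in that case (for example, request a full resynchronization) is outside the scope of this sketch.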
|
||||
|
||||
The different state replication protocol message types are:
|
||||
\begin{itemize}
|
||||
\item
|
||||
{\bf NF\_CTSRP\_NEW}: New conntrack entry has been created (and
|
||||
confirmed\footnote{See the above description of the conntrack code for what is
|
||||
meant by {\it confirming} a conntrack entry})
|
||||
\item
|
||||
{\bf NF\_CTSRP\_UPDATE}: State information of existing conntrack entry has
|
||||
changed
|
||||
\item
|
||||
{\bf NF\_CTSRP\_EXPIRE}: Existing conntrack entry has expired
|
||||
\end{itemize}
|
||||
|
||||
To uniquely identify (and later reference) a conntrack entry, a
|
||||
{\it conntrack\_id} is assigned to every conntrack entry transferred
|
||||
using a NF\_CTSRP\_NEW message. This conntrack\_id must be saved at the
|
||||
receiver(s) together with the conntrack entry, since it is used by the sender
|
||||
for subsequent NF\_CTSRP\_UPDATE and NF\_CTSRP\_EXPIRE messages.
|
||||
|
||||
The protocol itself does not care about the source of this conntrack\_id,
|
||||
but since the current netfilter connection tracking implementation never
|
||||
changes the address of a conntrack entry, the memory address of the entry can be
|
||||
used, since it comes for free.
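
How a receiver might use the conntrack\_id is shown in the sketch below. The numeric values of the message types and the function {\tt apply\_message} are hypothetical; the point is only that the receiver treats the conntrack\_id as an opaque key. Even if it happens to be a memory address on the sender, it is never dereferenced on the slave.

```python
# Message type values are made up for this sketch.
NF_CTSRP_NEW, NF_CTSRP_UPDATE, NF_CTSRP_EXPIRE = range(3)


def apply_message(table, msg_type, conntrack_id, state=None):
    """Apply one replication message to a dict keyed by the sender's id."""
    if msg_type == NF_CTSRP_NEW:
        table[conntrack_id] = dict(state)
    elif msg_type == NF_CTSRP_UPDATE:
        # Raises KeyError if the NEW message was never seen.
        table[conntrack_id].update(state)
    elif msg_type == NF_CTSRP_EXPIRE:
        table.pop(conntrack_id, None)
    else:
        raise ValueError("unknown message type: %r" % (msg_type,))
```

Because UPDATE and EXPIRE carry nothing but the id and the changed state, they stay small, which fits the {\bf lightweight} and {\bf easy to parse} criteria.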
|
||||
|
||||
|
||||
\subsubsection{Connection tracking state synchronization sender}
|
||||
|
||||
Maximum care needs to be taken for the implementation of the ctsyncd sender.
|
||||
|
||||
The normal workload of the active firewall node is likely to be already very
|
||||
high, so generating and sending the conntrack state replication messages needs
|
||||
to be highly efficient.
|
||||
|
||||
\begin{itemize}
|
||||
\item
|
||||
{\bf NF\_CTSRP\_NEW} will be generated at the NF\_IP\_POST\_ROUTING
|
||||
hook, at the time ip\_conntrack\_confirm() is called. Delaying
|
||||
this message until conntrack confirmation happens saves us from
|
||||
replicating otherwise unneeded state information.
|
||||
\item
|
||||
{\bf NF\_CTSRP\_UPDATE} needs to be created automatically by the
|
||||
conntrack core. It is not possible to have any failover-specific
|
||||
code within conntrack protocol and/or application helpers.
|
||||
The easiest way involving the least changes to the conntrack core
|
||||
code is to copy parts of the conntrack entry before calling any
|
||||
helper functions, and then use memcmp() to find out if the helper
|
||||
has changed any information.
|
||||
\item
|
||||
{\bf NF\_CTSRP\_EXPIRE} can be added very easily to the existing
|
||||
conntrack destroy function.
|
||||
\end{itemize}
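
The copy-and-compare idea for NF\_CTSRP\_UPDATE can be illustrated like this. In the kernel the comparison would be a memcmp() over parts of the conntrack entry; the sketch uses a dictionary as the entry and compares selected fields instead. The function {\tt run\_helper} and the field names are hypothetical.

```python
import copy


def run_helper(entry, helper, packet, tracked=("state", "timeout")):
    """Call a helper and report which tracked fields it changed.

    The helper itself contains no failover-specific code; the change is
    detected purely by comparing a snapshot taken before the call.
    """
    before = {k: copy.deepcopy(entry[k]) for k in tracked}
    helper(entry, packet)
    # A non-empty result means an NF_CTSRP_UPDATE should be generated.
    return {k: entry[k] for k in tracked if entry[k] != before[k]}
```

A helper that moves a TCP entry from SYN\_SENT to ESTABLISHED produces a non-empty result; a helper that only inspects the packet produces an empty one, so no update message is needed.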
|
||||
|
||||
|
||||
\subsubsection{Connection tracking state synchronization receiver}
|
||||
|
||||
Implementation of the receiver is very straightforward.
|
||||
|
||||
Apart from dealing with lost CTSRP packets, it just needs to call the
|
||||
respective conntrack add/modify/delete functions offered by the core.
|
||||
|
||||
|
||||
\subsubsection{Necessary changes within netfilter conntrack core}
|
||||
|
||||
To be able to implement the described conntrack state replication mechanism,
|
||||
the following changes to the conntrack core are needed:
|
||||
\begin{itemize}
|
||||
\item
|
||||
Ability to exclude certain packets from being tracked. This is a
|
||||
long-wanted feature on the TODO list of the netfilter project and will
|
||||
be implemented by having a ``prestate'' table in combination with a
|
||||
``NOTRACK'' target.
|
||||
\item
|
||||
Ability to register callback functions to be called every time a new
|
||||
conntrack entry is created or an existing entry modified.
|
||||
\item
|
||||
Export an API to externally add, modify and remove conntrack
|
||||
entries. Since the needed ip\_conntrack\_lock is exported,
|
||||
implementation could even reside outside the conntrack core code.
|
||||
\end{itemize}
|
||||
|
||||
Since the number of changes is very low, it is very likely that the
|
||||
modifications will go into the mainstream kernel without any big hassle.
|
||||
|
||||
\end{document}
|
|
@ -0,0 +1,56 @@
|
|||
|
||||
% TEMPLATE for Usenix papers, specifically to meet requirements of
|
||||
% TCL97 committee.
|
||||
% originally a template for producing IEEE-format articles using LaTeX.
|
||||
% written by Matthew Ward, CS Department, Worcester Polytechnic Institute.
|
||||
% adapted by David Beazley for his excellent SWIG paper in Proceedings,
|
||||
% Tcl 96
|
||||
% turned into a smartass generic template by De Clarke, with thanks to
|
||||
% both the above pioneers
|
||||
% use at your own risk. Complaints to /dev/null.
|
||||
% make it two column with no page numbering, default is 10 point
|
||||
|
||||
% adapted for Ottawa Linux Symposium
|
||||
|
||||
% include following in document.
|
||||
%\documentclass[twocolumn]{article}
|
||||
%\usepackage{usits,epsfig}
|
||||
\pagestyle{empty}
|
||||
|
||||
%set dimensions of columns, gap between columns, and space between paragraphs
|
||||
%\setlength{\textheight}{8.75in}
|
||||
\setlength{\textheight}{9.0in}
|
||||
\setlength{\columnsep}{0.25in}
|
||||
\setlength{\textwidth}{6.45in}
|
||||
\setlength{\footskip}{0.0in}
|
||||
\setlength{\topmargin}{0.0in}
|
||||
\setlength{\headheight}{0.0in}
|
||||
\setlength{\headsep}{0.0in}
|
||||
\setlength{\oddsidemargin}{0in}
|
||||
%\setlength{\oddsidemargin}{-.065in}
|
||||
%\setlength{\oddsidemargin}{-.17in}
|
||||
\setlength{\parindent}{0pc}
|
||||
\setlength{\parskip}{\baselineskip}
|
||||
|
||||
% started out with art10.sty and modified params to conform to IEEE format
|
||||
% further mods to conform to Usenix standard
|
||||
|
||||
\makeatletter
|
||||
%as Latex considers descenders in its calculation of interline spacing,
|
||||
%to get 12 point spacing for normalsize text, must set it to 10 points
|
||||
\def\@normalsize{\@setsize\normalsize{12pt}\xpt\@xpt
|
||||
\abovedisplayskip 10pt plus2pt minus5pt\belowdisplayskip \abovedisplayskip
|
||||
\abovedisplayshortskip \z@ plus3pt\belowdisplayshortskip 6pt plus3pt
|
||||
minus3pt\let\@listi\@listI}
|
||||
|
||||
%need a 12 pt font size for subsection and abstract headings
|
||||
\def\subsize{\@setsize\subsize{12pt}\xipt\@xipt}
|
||||
|
||||
%make section titles bold and 12 point, 2 blank lines before, 1 after
|
||||
\def\section{\@startsection {section}{1}{\z@}{24pt plus 2pt minus 2pt}
|
||||
{12pt plus 2pt minus 2pt}{\large\bf}}
|
||||
|
||||
%make subsection titles bold and 11 point, 1 blank line before, 1 after
|
||||
\def\subsection{\@startsection {subsection}{2}{\z@}{12pt plus 2pt minus 2pt}
|
||||
{12pt plus 2pt minus 2pt}{\subsize\bf}}
|
||||
\makeatother
|
|
@ -0,0 +1,33 @@
|
|||
Linux packet filtering in the 2.6.x kernel series
|
||||
|
||||
The Linux 2.4.x kernel series provided a complete rewrite of the firewalling subsystem,
|
||||
called netfilter/iptables. It was a major improvement over the previous
|
||||
ipchains subsystem. The major advantages are its modularity and flexibility.
|
||||
|
||||
However, as with any project, as soon as you are sort-of finished, you become
|
||||
aware of potential improvements and extensions.
|
||||
|
||||
The firewalling subsystem within the Linux kernel will undergo some fundamental design changes during the 2.5.x development kernel series.
|
||||
|
||||
Some of the changes from 2.4.x are:
|
||||
|
||||
- Have an independent pkt_tables subsystem, as a layer3 independent replacement
|
||||
for iptables, ip6tables and arptables. This will allow adding support for
|
||||
other layer 3 protocols very easily
|
||||
- Move all kernel/userspace communication to netlink sockets. There will be
|
||||
a generic nfnetlink layer, with pkttnetlink (for managing pkt_tables) and
|
||||
ctnetlink (for manipulating the connection tracking database from userspace).
|
||||
- Change the internal data structure of an ip_table to a linked list of chains,
|
||||
which in turn are linked lists of rules, which are linked lists of
|
||||
matches + targets. This way it is _way_ more performant in the case of
|
||||
dynamic firewalling rulesets.
|
||||
- Provide a generic high-level API to userspace applications for manipulation
|
||||
of packet filtering rules. This will enable generic GUIs, which need no
|
||||
changes in case new matches or targets are added.
|
||||
|
||||
Optionally, the netfilter core team is planning to have support for connection
|
||||
tracking state replication - something necessary for failover of stateful
|
||||
firewalls.
|
||||
|
||||
The talk assumes prior knowledge about the netfilter/iptables architecture.
|
||||
|
|
@ -0,0 +1,374 @@
|
|||
%include "default.mgp"
|
||||
%default 1 bgrad
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
%nodefault
|
||||
%back "blue"
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
The future of Linux packet filtering
|
||||
targeted for kernel 2.6
|
||||
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <laforge@gnumonks.org>
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Contents
|
||||
|
||||
|
||||
Problems with current 2.4.x netfilter/iptables
|
||||
Solution to code replication
|
||||
Solution for dynamic rulesets
|
||||
Solution for API to GUI's and other management programs
|
||||
|
||||
HA for stateful firewalling
|
||||
What's special about firewalling HA
|
||||
Poor man's failover
|
||||
Real state replication
|
||||
|
||||
Other current work
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Problems with 2.4.x netfilter/iptables
|
||||
|
||||
code replication between iptables/ip6tables/arptables
|
||||
iptables was never meant for other protocols, but people did copy+paste 'ports'
|
||||
replication of
|
||||
core kernel code
|
||||
layer 3 independent matches (mac, interface, ...)
|
||||
userspace library (libiptc)
|
||||
userspace tool (iptables)
|
||||
userspace plugins (libipt_xxx.so)
|
||||
|
||||
doesn't suit the needs for dynamically changing rulesets
|
||||
dynamic rulesets becoming more common (service selection, IDS)
|
||||
a whole table is created in userspace and sent as blob to kernel
|
||||
for every ruleset change the table needs to be copied to userspace and back
|
||||
inside kernel consistency checks on whole table, loop detection
|
||||
|
||||
too extensible for writing any forward-compatible GUI
|
||||
new extensions showing up all the time
|
||||
a frontend would need to know about the options and use of a new extension
|
||||
thus frontends are always incomplete and out-of-date
|
||||
no high-level API other than piping to iptables-restore
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Reducing code replication
|
||||
|
||||
code replication is a real problem: unclean, bugfixes missed
|
||||
we need layer 3 independent layer for
|
||||
submitting rules to the kernel
|
||||
traversing packet-rulesets supporting match/target modules
|
||||
registering matches/targets
|
||||
layer 3 specific (like matching ipv4 address)
|
||||
layer 3 independent (like matching MAC address)
|
||||
|
||||
solution
|
||||
pkt_tables inside kernel
|
||||
pkt_tables_ipv4 registers layer 3 handler with pkt_tables
|
||||
pkt_tables_ipv6 registers layer 3 handler with pkt_tables
|
||||
everybody registering a pkt_table (like iptable_filter) needs to specify the l3 protocol
|
||||
libraries in userspace (see later)
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Supporting dynamic rulesets
|
||||
|
||||
atomic table-replacement turned out to be bad idea
|
||||
need new interface for sending individual rules to kernel
|
||||
policy routing has the same problem and good solution: rtnetlink
|
||||
solution: nfnetlink
|
||||
multicast-netlink based packet-oriented socket between kernel and userspace
|
||||
has extra benefit that other userspace processes get notified of rule changes [just like routing daemons]
|
||||
nfnetlink will be low-layer below all kernel/userspace communication
|
||||
pkttnetlink [aka iptnetlink]
|
||||
ctnetlink
|
||||
ulog
|
||||
ip_queue
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Communication with other programs
|
||||
|
||||
whole set of libraries
|
||||
libnfnetlink for low-layer communication
|
||||
libpkttnetlink for rule modifications
|
||||
will handle all plugins [which are currently part of iptables]
|
||||
query functions about available matches/targets
|
||||
query functions about parameters
|
||||
query functions for help messages about specific match/parameter of a match
|
||||
generic structure from which rules can be built
|
||||
conversion functions to parse generic structure into in-kernel structure
|
||||
conversion functions to parse kernel structure into generic structure
|
||||
functions to convert generic structure into plain text
|
||||
libipq will stay API-compatible to current version
|
||||
libipulog will stay API-compatible to current version
|
||||
libiptc will go away [compatibility layer extremely difficult]
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Introduction
|
||||
|
||||
What is special about firewall failover?
|
||||
|
||||
Nothing, in case of the stateless packet filter
|
||||
Common IP takeover solutions can be used
|
||||
VRRP
|
||||
Heartbeat
|
||||
|
||||
Distribution of packet filtering ruleset no problem
|
||||
can be done manually
|
||||
or implemented with simple userspace process
|
||||
|
||||
Problems arise with stateful packet filters
|
||||
Connection state only on active node
|
||||
NAT mappings only on active node
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Connection tracking...
|
||||
|
||||
implemented separately from NAT
|
||||
enables stateful filtering
|
||||
implementation
|
||||
hooks into NF_IP_PRE_ROUTING to track packets
|
||||
hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
|
||||
protocol modules (currently TCP/UDP/ICMP)
|
||||
application helpers (currently FTP, IRC, H.323, talk, SNMP)
|
||||
divides packets in the following four categories
|
||||
NEW - would establish new connection
|
||||
ESTABLISHED - part of already established connection
|
||||
RELATED - is related to established connection
|
||||
INVALID - (multicast, errors...)
|
||||
does _NOT_ filter packets itself
|
||||
can be utilized by iptables using the 'state' match
|
||||
is used by NAT Subsystem
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Common structures
|
||||
struct ip_conntrack_tuple, representing unidirectional flow
|
||||
layer 3 src + dst
|
||||
layer 4 protocol
|
||||
layer 4 src + dst
|
||||
|
||||
|
||||
connections represented as struct ip_conntrack
|
||||
original tuple
|
||||
reply tuple
|
||||
timeout
|
||||
l4 state private data
|
||||
app helper
|
||||
app helper private data
|
||||
expected connections
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Flow of events for new packet
|
||||
packet enters NF_IP_PRE_ROUTING
|
||||
tuple is derived from packet
|
||||
lookup conntrack hash table with hash(tuple) -> fails
|
||||
new ip_conntrack is allocated
|
||||
fill in original and reply == inverted(original) tuple
|
||||
initialize timer
|
||||
assign app helper if applicable
|
||||
see if we've been expected -> fails
|
||||
call layer 4 helper 'new' function
|
||||
|
||||
...
|
||||
|
||||
packet enters NF_IP_POST_ROUTING
|
||||
do hashtable lookup for packet -> fails
|
||||
place struct ip_conntrack in hashtable
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Flow of events for packet part of existing connection
|
||||
packet enters NF_IP_PRE_ROUTING
|
||||
tuple is derived from packet
|
||||
lookup conntrack hash table with hash(tuple)
|
||||
associate conntrack entry with skb->nfct
|
||||
call l4 protocol helper 'packet' function
|
||||
do l4 state tracking
|
||||
update timeouts as needed [i.e. TCP TIME_WAIT,...]
|
||||
|
||||
...
|
||||
|
||||
packet enters NF_IP_POST_ROUTING
|
||||
do hashtable lookup for packet -> succeeds
|
||||
do nothing else
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Poor man's failover
|
||||
|
||||
Poor man's failover
|
||||
principle
|
||||
let every node do its own tracking rather than replicating state
|
||||
two possible implementations
|
||||
connect every node to shared media (e.g. real ethernet)
|
||||
forwarding only turned on on active node
|
||||
slave nodes use promiscuous mode to sniff packets
|
||||
copy all traffic to slave nodes
|
||||
active master needs to copy all traffic to other nodes
|
||||
disadvantage: high load, sync traffic == payload traffic
|
||||
IMHO stupid way of solving the problem
|
||||
advantages
|
||||
very easy implementation
|
||||
only addition of sniffing mode to conntrack needed
|
||||
existing means of address takeover can be used
|
||||
same load on active master and slave nodes
|
||||
no additional load on active master
|
||||
disadvantages
|
||||
can only be used with real shared media (no switches, ...)
|
||||
can not be used with NAT
|
||||
remaining problem
|
||||
no initial state sync after reboot of slave node!
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Real state replication
|
||||
|
||||
Parts needed
|
||||
state replication protocol
|
||||
multicast based
|
||||
sequence numbers for detection of packet loss
|
||||
NACK-based retransmission
|
||||
no security, since private ethernet segment to be used
|
||||
event interface on active node
|
||||
calling out to callback function at all state changes
|
||||
exported interface to manipulate conntrack hash table
|
||||
kernel thread for sending conntrack state protocol messages
|
||||
registers with event interface
|
||||
creates and accumulates state replication packets
|
||||
sends them via in-kernel sockets api
|
||||
kernel thread for receiving conntrack state replication messages
|
||||
receives state replication packets via in-kernel sockets
|
||||
uses conntrack hashtable manipulation interface
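The combination of sequence numbers and NACK-based retransmission listed above can be illustrated with a receiver-side sketch. The class and method names are hypothetical, the message format is not taken from any real protocol, and sequence number wrap-around is ignored.

```python
class ReplicationReceiver:
    """Tracks sequence numbers and reports the gaps that would need a NACK."""

    def __init__(self):
        self.next_seq = 0    # next sequence number expected in order
        self.pending = {}    # out-of-order messages keyed by sequence number
        self.delivered = []  # payloads handed on in sequence order

    def receive(self, seq, payload):
        """Accept one message and return the missing sequence numbers."""
        if seq < self.next_seq or seq in self.pending:
            return []  # duplicate, nothing to do
        self.pending[seq] = payload
        # Deliver everything that has become contiguous.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        if not self.pending:
            return []
        # Anything between next_seq and the highest buffered message is lost.
        return [s for s in range(self.next_seq, max(self.pending))
                if s not in self.pending]
```

With a NACK scheme the sender stays silent as long as nothing is lost, which keeps the extra load on the active node low.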
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Real state replication
|
||||
|
||||
Flow of events in chronological order:
|
||||
on active node, inside the network RX softirq
|
||||
connection tracking code is analyzing a forwarded packet
|
||||
connection tracking gathers some new state information
|
||||
connection tracking updates local connection tracking database
|
||||
connection tracking sends event message to event API
|
||||
on active node, inside the conntrack-sync kernel thread
|
||||
conntrack sync daemon receives event through event API
|
||||
conntrack sync daemon aggregates multiple event messages into a state replication protocol message, removing possible redundancy
|
||||
conntrack sync daemon generates state replication protocol message
|
||||
conntrack sync daemon sends state replication protocol message
|
||||
on slave node(s), inside network RX softirq
|
||||
connection tracking code ignores packets coming from the interface attached to the private conntrack sync network
|
||||
state replication protocol message is appended to socket receive queue of conntrack-sync kernel thread
|
||||
on slave node(s), inside conntrack-sync kernel thread
|
||||
conntrack sync daemon receives state replication message
|
||||
conntrack sync daemon creates/updates conntrack entry
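One possible reading of the aggregation step ("removing possible redundancy") is that several events for the same connection collapse into the newest state before a replication message is built. The sketch below assumes exactly that; the function name and the (conn_id, state) event shape are hypothetical.

```python
def aggregate_events(events):
    """Coalesce (conn_id, state) events, keeping only the newest state per connection."""
    latest = {}
    for conn_id, state in events:
        # Remove and re-insert so the result is ordered by last update.
        latest.pop(conn_id, None)
        latest[conn_id] = state
    return list(latest.items())
```

A slave node applying the aggregated list ends up with the same final state as one applying every single event, but with fewer table updates.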
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Necessary changes to kernel
|
||||
|
||||
Necessary changes to current conntrack core
|
||||
|
||||
event generation (callback functions) for all state changes
|
||||
|
||||
conntrack hashtable manipulation API
|
||||
is needed (and already implemented) for 'ctnetlink' API
|
||||
|
||||
conntrack exemptions
|
||||
needed to _not_ track conntrack state replication packets
|
||||
is needed for other cases as well
|
||||
currently being developed by Jozsef Kadlecsik
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Other current work
|
||||
|
||||
optimizing the conntrack code
|
||||
hash function optimization
|
||||
current hash function not good for even hash bucket count
|
||||
other hash functions in development
|
||||
hash function evaluation tool [cttest] available
|
||||
introduce per-system randomness to prevent hash attack
|
||||
code optimization (locking/timers/...)
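The idea of per-system randomness in the hash can be sketched as follows. This is a model only: make_bucket_fn is a hypothetical name, and the keyed BLAKE2b hash from Python's standard library is used purely for convenience, not because the conntrack code uses it.

```python
import hashlib
import os


def make_bucket_fn(n_buckets, seed=None):
    """Return a function mapping a tuple to a bucket index using a secret seed."""
    seed = os.urandom(16) if seed is None else seed  # fresh secret per "boot"

    def bucket(tup):
        data = repr(tup).encode()
        digest = hashlib.blake2b(data, key=seed, digest_size=8).digest()
        return int.from_bytes(digest, "big") % n_buckets

    return bucket
```

Because the seed is unknown to a remote sender, it cannot precompute a set of tuples that all land in the same bucket.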
|
||||
|
||||
getting our work submitted into the mainstream kernel
|
||||
turns out to be more difficult
|
||||
e.g. newnat api now waiting for three months
|
||||
|
||||
discussions about multiple targets/actions per rule
|
||||
technical implementation easy
|
||||
however, not everybody convinced that it fits into the concept
|
||||
|
||||
using tc for firewalling
|
||||
Jamal Hadi Salim uses iptables targets from within TC
|
||||
leads to discussion of generic classification engine API in kernel
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Thanks
|
||||
The slides and the accompanying paper for this presentation are available at http://www.gnumonks.org/
|
||||
|
||||
The netfilter homepage http://www.netfilter.org/
|
||||
|
||||
Thanks to
|
||||
the BBS people, Z-Netz, FIDO, ...
|
||||
for heavily increasing my computer usage in 1992
|
||||
KNF
|
||||
for bringing me in touch with the internet as early as 1994
|
||||
for providing a playground for technical people
|
||||
for telling me about the existence of Linux!
|
||||
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
|
||||
for implementing (one of?) the world's best TCP/IP stacks
|
||||
Paul 'Rusty' Russell
|
||||
for starting the netfilter/iptables project
|
||||
for trusting me to maintain it today
|
||||
Astaro AG
|
||||
for sponsoring parts of my netfilter work
|
||||
|
|
@ -0,0 +1,49 @@
|
|||
Linux 2.4.x netfilter/iptables firewalling internals (lt-690870524)
|
||||
|
||||
The Linux 2.4.x kernel series has introduced a totally new kernel firewalling subsystem. It is much more than a plain successor of ipfwadm or ipchains.
|
||||
|
||||
The netfilter/iptables project has a very modular design and its
|
||||
sub-projects can be split in several parts: netfilter, iptables, connection
|
||||
tracking, NAT and packet mangling.
|
||||
|
||||
While most users will already have learned how to use the basic functions
|
||||
of netfilter/iptables in order to convert their old ipchains firewalls to
|
||||
iptables, there's more advanced but less used functionality in
|
||||
netfilter/iptables.
|
||||
|
||||
The presentation covers the design principles behind the netfilter/iptables
|
||||
implementation. This knowledge enables us to understand how the individual
|
||||
parts of netfilter/iptables fit together, and for which potential applications
|
||||
this is useful.
|
||||
|
||||
Topics covered:
|
||||
|
||||
- overview of the internal netfilter/iptables architecture
|
||||
- the netfilter hooks inside the network protocol stacks
|
||||
- packet selection with IP tables
|
||||
- how is connection tracking and NAT integrated into the framework
|
||||
- the connection tracking system
|
||||
- how well does it track the TCP state?
|
||||
- how does it track ICMP and UDP state at all?
|
||||
- layer 4 protocol helpers (GRE, ...)
|
||||
- application helpers (ftp, irc, h323, ...)
|
||||
- restrictions/limitations
|
||||
- the NAT system
|
||||
- how does it interact with connection tracking?
|
||||
- layer 4 protocol helpers
|
||||
- application helpers (ftp, irc, ...)
|
||||
- misc
|
||||
- how far is IPv6 firewalling with ip6tables?
|
||||
- advances in failover/HA of stateful firewalls
|
||||
- invisible firewalls with iptables on a bridge
|
||||
- userspace packet queueing with QUEUE
|
||||
- userspace packet logging with ULOG
|
||||
|
||||
Requirements:
|
||||
- knowledge about the TCP/IP protocol family
|
||||
- knowledge about general firewalling and packet filtering concepts
|
||||
- prior experience with linux packet filters
|
||||
|
||||
Audience:
|
||||
- firewall administrators
|
||||
- network developers
|
|
@ -0,0 +1,520 @@
|
|||
%include "default.mgp"
|
||||
%default 1 bgrad
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
%nodefault
|
||||
%back "blue"
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
Linux 2.4.x netfilter/iptables
|
||||
firewalling internals
|
||||
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <laforge@gnumonks.org>
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Contents
|
||||
|
||||
|
||||
Introduction
|
||||
Netfilter hooks in protocol stacks
|
||||
Packet selection based on IP Tables
|
||||
The Connection Tracking Subsystem
|
||||
The NAT Subsystem based on netfilter + iptables
|
||||
Packet filtering using the 'filter' table
|
||||
Packet mangling using the 'mangle' table
|
||||
Advanced netfilter concepts
|
||||
Current development and Future
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Introduction
|
||||
|
||||
Why did we need netfilter/iptables?
|
||||
Because ipchains...
|
||||
|
||||
has no infrastructure for passing packets to userspace
|
||||
makes transparent proxying extremely difficult
|
||||
has interface address dependent Packet filter rules
|
||||
has Masquerading implemented as part of packet filtering
|
||||
code is too complex and intermixed with core ipv4 stack
|
||||
is neither modular nor extensible
|
||||
only barely supports one special case of NAT (masquerading)
|
||||
has only stateless packet filtering
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Introduction
|
||||
|
||||
Who's behind netfilter/iptables
|
||||
Paul 'Rusty' Russell
|
||||
co-author of ipchains in Linux 2.2
|
||||
was paid by Watchguard for about one Year of development
|
||||
James Morris
|
||||
userspace queuing (kernel, library and tools)
|
||||
REJECT target
|
||||
Marc Boucher
|
||||
NAT and packet filtering controlled by one command
|
||||
Mangle table
|
||||
Harald Welte
|
||||
Conntrack+NAT helper infrastructure (newnat)
|
||||
Userspace packet logging (ULOG)
|
||||
PPTP and IRC conntrack/NAT helpers
|
||||
Jozsef Kadlecsik
|
||||
TCP window tracking
|
||||
H.323 conntrack + NAT helper
|
||||
Continued newnat development
|
||||
Non-core team contributors
|
||||
http://www.netfilter.org/scoreboard/
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Netfilter Hooks
|
||||
|
||||
What is netfilter?
|
||||
|
||||
System of callback functions within network stack
|
||||
Callback function to be called for every packet traversing certain point (hook) within network stack
|
||||
Protocol independent framework
|
||||
Hooks in layer 3 stacks (IPv4, IPv6, DECnet, ARP)
|
||||
Multiple kernel modules can register with each of the hooks
|
||||
Asynchronous packet handling in userspace (ip_queue)
|
||||
|
||||
Traditional packet filtering, NAT, ... is implemented on top of this framework
|
||||
|
||||
Can be used for other stuff interfacing with the core network stack, like DECnet routing daemon.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Netfilter Hooks
|
||||
|
||||
Netfilter architecture in IPv4
|
||||
%font "courier"
|
||||
|
||||
   --->[1]--->[ROUTE]--->[3]--->[4]--->
                 |            ^
                 |            |
                 |         [ROUTE]
                 v            |
                [2]          [5]
                 |            ^
                 |            |
                 v            |
|
||||
|
||||
%font "standard"
|
||||
1=NF_IP_PRE_ROUTING
|
||||
2=NF_IP_LOCAL_IN
|
||||
3=NF_IP_FORWARD
|
||||
4=NF_IP_POST_ROUTING
|
||||
5=NF_IP_LOCAL_OUT
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Netfilter Hooks
|
||||
|
||||
Netfilter Hooks
|
||||
|
||||
Any kernel module may register a callback function at any of the hooks
|
||||
|
||||
The module has to return one of the following constants
|
||||
|
||||
NF_ACCEPT continue traversal as normal
|
||||
NF_DROP drop the packet, do not continue
|
||||
NF_STOLEN I've taken over the packet, do not continue
|
||||
NF_QUEUE enqueue packet to userspace
|
||||
NF_REPEAT call this hook again
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
IP tables
|
||||
|
||||
Packet selection using IP tables
|
||||
|
||||
The kernel provides generic IP tables support
|
||||
|
||||
Each kernel module may create its own IP table
|
||||
|
||||
The three major parts of 2.4 firewalling subsystem are implemented using IP tables
|
||||
Packet filtering table 'filter'
|
||||
NAT table 'nat'
|
||||
Packet mangling table 'mangle'
|
||||
|
||||
Can potentially be used for other stuff, e.g. IPsec SPDB
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
IP Tables
|
||||
|
||||
Managing chains and tables
|
||||
|
||||
An IP table consists of multiple chains
|
||||
A chain consists of a list of rules
|
||||
Every single rule in a chain consists of
|
||||
match[es] (rule executed if all matches true)
|
||||
target (what to do if the rule is matched)
|
||||
|
||||
%size 4
|
||||
matches and targets can either be builtin or implemented as kernel modules
|
||||
|
||||
%size 6
|
||||
The userspace tool iptables is used to control IP tables
|
||||
handles all different kinds of IP tables
|
||||
supports a plugin/shlib interface for target/match specific options
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
IP Tables
|
||||
|
||||
Basic iptables commands
|
||||
|
||||
To build a complete iptables command, we must specify
|
||||
which table to work with
|
||||
which chain in this table to use
|
||||
an operation (insert, add, delete, modify)
|
||||
one or more matches (optional)
|
||||
a target
|
||||
|
||||
The syntax is
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t table -Operation chain -j target match(es)
|
||||
%font "standard"
|
||||
%size 5
|
||||
|
||||
Example:
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t filter -A INPUT -j ACCEPT -p tcp --dport smtp
|
||||
%font "standard"
|
||||
%size 5
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
IP Tables
|
||||
|
||||
Matches
|
||||
Basic matches
|
||||
-p protocol (tcp/udp/icmp/...)
|
||||
-s source address (ip/mask)
|
||||
-d destination address (ip/mask)
|
||||
-i incoming interface
|
||||
-o outgoing interface
|
||||
|
||||
Match extensions (examples)
|
||||
tcp/udp TCP/udp source/destination port
|
||||
icmp ICMP code/type
|
||||
ah/esp AH/ESP SPI match
|
||||
mac source MAC address
|
||||
mark nfmark
|
||||
length match on length of packet
|
||||
limit rate limiting (n packets per timeframe)
|
||||
owner owner uid of the socket sending the packet
|
||||
tos TOS field of IP header
|
||||
ttl TTL field of IP header
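The 'limit' match above (n packets per timeframe) behaves like a token bucket. The sketch below is a generic token bucket with an injected clock, given as an illustration of the idea; the class name and parameters are hypothetical and it is not a copy of the kernel implementation.

```python
class TokenBucket:
    """Generic token bucket: 'rate' tokens per second, at most 'burst' stored."""

    def __init__(self, rate_per_sec, burst):
        self.rate = rate_per_sec
        self.burst = burst
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0             # time of the previous call

    def allow(self, now):
        """Return True if a packet arriving at time 'now' matches the limit."""
        # Refill according to the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A rule using such a match stops matching once the bucket is empty, so the packet falls through to the following rules in the chain.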
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
IP Tables
|
||||
|
||||
Targets
|
||||
very dependent on the particular table.
|
||||
|
||||
Table specific targets will be discussed later
|
||||
|
||||
Generic Targets, always available
|
||||
ACCEPT accept packet within chain
|
||||
DROP silently drop packet
|
||||
QUEUE enqueue packet to userspace
|
||||
LOG log packet via syslog
|
||||
ULOG log packet via ulogd
|
||||
RETURN return to previous (calling) chain
|
||||
foobar jump to user defined chain
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Packet Filtering
|
||||
|
||||
Overview
|
||||
|
||||
Implemented as 'filter' table
|
||||
Registers with three netfilter hooks
|
||||
|
||||
NF_IP_LOCAL_IN (packets destined for the local host)
|
||||
NF_IP_FORWARD (packets forwarded by local host)
|
||||
NF_IP_LOCAL_OUT (packets from the local host)
|
||||
|
||||
Each of the three hooks has attached one chain (INPUT, FORWARD, OUTPUT)
|
||||
|
||||
Every packet passes exactly one of the three chains. Note that this is very different compared to the old 2.2.x ipchains behaviour.
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Packet Filtering
|
||||
|
||||
Targets available within 'filter' table
|
||||
|
||||
Builtin Targets to be used in filter table
|
||||
ACCEPT accept the packet
|
||||
DROP silently drop the packet
|
||||
QUEUE enqueue packet to userspace
|
||||
RETURN return to previous (calling) chain
|
||||
foobar user defined chain
|
||||
|
||||
Targets implemented as loadable modules
|
||||
REJECT drop the packet but inform sender
|
||||
MIRROR change source/destination IP and resend
|
||||
LOG log via syslog
|
||||
ULOG log via userspace
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Connection tracking...
|
||||
|
||||
implemented separately from NAT
|
||||
enables stateful filtering
|
||||
implementation
|
||||
hooks into NF_IP_PRE_ROUTING to track packets
|
||||
hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
|
||||
protocol modules (currently TCP/UDP/ICMP)
|
||||
application helpers (currently FTP, IRC, H.323, talk, SNMP)
|
||||
divides packets in the following four categories
|
||||
NEW - would establish new connection
|
||||
ESTABLISHED - part of already established connection
|
||||
RELATED - is related to established connection
|
||||
INVALID - (multicast, errors...)
|
||||
does _NOT_ filter packets itself
|
||||
can be utilized by iptables using the 'state' match
|
||||
is used by NAT Subsystem
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Common structures
|
||||
struct ip_conntrack_tuple, representing unidirectional flow
|
||||
layer 3 src + dst
|
||||
layer 4 protocol
|
||||
layer 4 src + dst
|
||||
|
||||
|
||||
connections represented as struct ip_conntrack
|
||||
original tuple
|
||||
reply tuple
|
||||
timeout
|
||||
l4 state private data
|
||||
app helper
|
||||
app helper private data
|
||||
expected connections
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Flow of events for new packet
|
||||
packet enters NF_IP_PRE_ROUTING
|
||||
tuple is derived from packet
|
||||
lookup conntrack hash table with hash(tuple) -> fails
|
||||
new ip_conntrack is allocated
|
||||
fill in original and reply == inverted(original) tuple
|
||||
initialize timer
|
||||
assign app helper if applicable
|
||||
see if we've been expected -> fails
|
||||
call layer 4 helper 'new' function
|
||||
|
||||
...
|
||||
|
||||
packet enters NF_IP_POST_ROUTING
|
||||
do hashtable lookup for packet -> fails
|
||||
place struct ip_conntrack in hashtable
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Flow of events for packet part of existing connection
|
||||
packet enters NF_IP_PRE_ROUTING
|
||||
tuple is derived from packet
|
||||
lookup conntrack hash table with hash(tuple)
|
||||
associate conntrack entry with skb->nfct
|
||||
call l4 protocol helper 'packet' function
|
||||
do l4 state tracking
|
||||
update timeouts as needed [e.g. TCP TIME_WAIT,...]
|
||||
|
||||
...
|
||||
|
||||
packet enters NF_IP_POST_ROUTING
|
||||
do hashtable lookup for packet -> succeeds
|
||||
do nothing else
|
||||
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Network Address Translation
|
||||
|
||||
Overview
|
||||
|
||||
Previous Linux Kernels only implemented one special case of NAT: Masquerading
|
||||
Linux 2.4.x can do any kind of NAT.
|
||||
NAT subsystem implemented on top of netfilter, iptables and conntrack
|
||||
NAT subsystem registers with all five netfilter hooks
|
||||
'nat' Table registers chains PREROUTING, POSTROUTING and OUTPUT
|
||||
Following targets available within 'nat' Table
|
||||
SNAT changes the packet's source while passing NF_IP_POST_ROUTING
|
||||
DNAT changes the packet's destination while passing NF_IP_PRE_ROUTING
|
||||
MASQUERADE is a special case of SNAT
|
||||
REDIRECT is a special case of DNAT
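Since NAT sits on top of conntrack, one way to picture SNAT and DNAT is as a rewrite of the connection's tuples: the translated tuple is what leaves the box, and its inverse is what replies will look like. The sketch below models only this bookkeeping; Tuple, snat and dnat are hypothetical names and no packet is actually modified.

```python
from collections import namedtuple

# Hypothetical stand-in for one direction of a tracked flow.
Tuple = namedtuple("Tuple", "src dst proto sport dport")


def invert(t):
    """Reply direction: swap source and destination at layer 3 and layer 4."""
    return Tuple(t.dst, t.src, t.proto, t.dport, t.sport)


def snat(original, new_src):
    """Return (translated tuple, reply tuple to expect) after source NAT."""
    translated = original._replace(src=new_src)
    return translated, invert(translated)


def dnat(original, new_dst, new_dport=None):
    """Return (translated tuple, reply tuple to expect) after destination NAT."""
    dport = original.dport if new_dport is None else new_dport
    translated = original._replace(dst=new_dst, dport=dport)
    return translated, invert(translated)
```

In this picture MASQUERADE is snat() with new_src taken from the outgoing interface, and REDIRECT is dnat() with new_dst set to a local address.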
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Network Address Translation
|
||||
|
||||
Source NAT
|
||||
SNAT Example:
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t nat -A POSTROUTING -j SNAT --to-source 1.2.3.4 -s 10.0.0.0/8
|
||||
%font "standard"
|
||||
%size 4
|
||||
|
||||
MASQUERADE Example:
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t nat -A POSTROUTING -j MASQUERADE -o ppp0
|
||||
%font "standard"
|
||||
%size 5
|
||||
|
||||
Destination NAT
|
||||
DNAT example
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t nat -A PREROUTING -j DNAT --to-destination 1.2.3.4:8080 -p tcp --dport 80 -i eth1
|
||||
%font "standard"
|
||||
%size 4
|
||||
|
||||
REDIRECT example
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t nat -A PREROUTING -j REDIRECT --to-port 3128 -i eth1 -p tcp --dport 80
|
||||
%font "standard"
|
||||
%size 5
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Packet Mangling
|
||||
|
||||
Purpose of mangle table
|
||||
packet manipulation except address manipulation
|
||||
|
||||
Integration with netfilter
|
||||
'mangle' table hooks in all five netfilter hooks
|
||||
priority: after conntrack
|
||||
|
||||
Targets specific to the 'mangle' table:
|
||||
DSCP - manipulate DSCP field
|
||||
IPV4OPTSSTRIP - strip IPv4 options
|
||||
MARK - change the nfmark field of the skb
|
||||
TCPMSS - set TCP MSS option
|
||||
TOS - manipulate the TOS bits
|
||||
TTL - set / increase / decrease TTL field
|
||||
|
||||
Simple example:
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t mangle -A PREROUTING -j MARK --set-mark 10 -p tcp --dport 80
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Advanced Netfilter concepts
|
||||
|
||||
%size 4
|
||||
Userspace logging
|
||||
flexible replacement for old syslog-based logging
|
||||
packets to userspace via multicast netlink sockets
|
||||
easy-to-use library (libipulog)
|
||||
plugin-extensible userspace logging daemon (ulogd)
|
||||
Can even be used to directly log into MySQL
|
||||
|
||||
Queuing
|
||||
reliable asynchronous packet handling
|
||||
packets to userspace via unicast netlink socket
|
||||
easy-to-use library (libipq)
|
||||
provides Perl bindings
|
||||
experimental queue multiplex daemon (ipqmpd)
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Current Development and Future
|
||||
|
||||
Netfilter (although it proved very stable) is still work in progress.
|
||||
|
||||
Areas of current development
|
||||
infrastructure for conntrack manipulation from userspace
|
||||
failover of stateful firewalls
|
||||
making iptables layer3 independent (pkttables)
|
||||
new userspace library (libiptables) to hide plugins from apps
|
||||
more matches and targets for advanced functions (pool, hashslot)
|
||||
more conntrack and NAT modules (RPC, SNMP, SMB, ...)
|
||||
better IPv6 support (conntrack, more matches / targets)
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Thanks
|
||||
The slides and the accompanying paper for this presentation are available at http://www.gnumonks.org/
|
||||
|
||||
The netfilter homepage http://www.netfilter.org/
|
||||
|
||||
Thanks to
|
||||
the BBS people, Z-Netz, FIDO, ...
|
||||
for heavily increasing my computer usage in 1992
|
||||
KNF
|
||||
for bringing me in touch with the internet as early as 1994
|
||||
for providing a playground for technical people
|
||||
for telling me about the existence of Linux!
|
||||
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
|
||||
for implementing (one of?) the world's best TCP/IP stacks
|
||||
Paul 'Rusty' Russell
|
||||
for starting the netfilter/iptables project
|
||||
for trusting me to maintain it today
|
||||
Astaro AG
|
||||
for sponsoring parts of my netfilter work
|
||||
|
|
@ -0,0 +1,537 @@
|
|||
\documentclass{article}
|
||||
\usepackage{german}
|
||||
\usepackage{fancyheadings}
|
||||
\usepackage{a4}
|
||||
|
||||
\setlength{\oddsidemargin}{0in}
|
||||
\setlength{\evensidemargin}{0in}
|
||||
\setlength{\topmargin}{0.0in}
|
||||
\setlength{\headheight}{0in}
|
||||
\setlength{\headsep}{0in}
|
||||
\setlength{\textwidth}{6.5in}
|
||||
\setlength{\textheight}{9.5in}
|
||||
\setlength{\parindent}{0in}
|
||||
\setlength{\parskip}{0.05in}
|
||||
|
||||
|
||||
\begin{document}
|
||||
\title{Linux 2.4.x netfilter/iptables firewalling internals}
|
||||
|
||||
\author{Harald Welte\\
|
||||
laforge@gnumonks.org\\
|
||||
\copyright{}2002 H. Welte}
|
||||
|
||||
\date{25. April 2002}
|
||||
|
||||
\maketitle
|
||||
|
||||
\setcounter{section}{0}
|
||||
\setcounter{subsection}{0}
|
||||
\setcounter{subsubsection}{0}
|
||||
|
||||
\section{Introduction}
|
||||
The Linux 2.4.x kernel series has introduced a totally new kernel firewalling
|
||||
subsystem. It is much more than a plain successor of ipfwadm or ipchains.
|
||||
|
||||
The netfilter/iptables project has a very modular design and its
|
||||
sub-projects can be split in several parts: netfilter, iptables, connection
|
||||
tracking, NAT and packet mangling.
|
||||
|
||||
While most users will already have learned how to use the basic functions
|
||||
of netfilter/iptables in order to convert their old ipchains firewalls to
|
||||
iptables, there's more advanced but less used functionality in
|
||||
netfilter/iptables.
|
||||
|
||||
The presentation covers the design principles behind the netfilter/iptables
|
||||
implementation. This knowledge enables us to understand how the individual
|
||||
parts of netfilter/iptables fit together, and for which potential applications
|
||||
this is useful.
|
||||
|
||||
\section{Internal netfilter/iptables architecture}
|
||||
|
||||
\subsection{Netfilter hooks in protocol stacks}
|
||||
|
||||
One of the major motivations behind the redesign of the linux packet
|
||||
filtering and NAT system during the 2.3.x kernel series was the widespread
|
||||
firewall specific code parts within the core IPv4 stack. Ideally the core
|
||||
IPv4 stack (as used by regular hosts and routers) shouldn't contain any
|
||||
firewalling specific code, resulting in no unwanted interaction and less
|
||||
code complexity. This desire led to the invention of {\it netfilter}.
|
||||
|
||||
\subsubsection{Architecture of netfilter}
|
||||
|
||||
Netfilter is basically a system of callback functions within the network
|
||||
stack. It provides a non-portable API towards in-kernel networking
|
||||
extensions.
|
||||
|
||||
What we call {\it netfilter hook} is a well-defined call-out point within a
|
||||
layer three protocol stack, such as IPv4, IPv6 or DECnet. Any layer three
|
||||
network stack can define an arbitrary number of hooks, usually placed at
|
||||
strategic points within the packet flow.
|
||||
|
||||
Any other kernel code can now subsequently register callback functions for
|
||||
any of these hooks. As most systems will have more than one callback
|
||||
function registered for a particular hook, a {\it priority} is specified upon
|
||||
registration of the callback function. This priority defines the order in
|
||||
which the individual callback functions at a particular hook are called.
|
||||
|
||||
The return value of any registered callback functions can be:
|
||||
\begin{itemize}
|
||||
\item
|
||||
{\bf NF\_ACCEPT}: continue traversal as usual
|
||||
\item
|
||||
{\bf NF\_DROP}: drop the packet; do not continue traversal
|
||||
\item
|
||||
{\bf NF\_STOLEN}: callback function has taken over the packet; do not continue
|
||||
\item
|
||||
{\bf NF\_QUEUE}: enqueue the packet to userspace
|
||||
\item
|
||||
{\bf NF\_REPEAT}: call this hook again
|
||||
\end{itemize}
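The interplay of priorities and return values can be modelled in user space as follows. This is a sketch under assumptions: the Hook class is hypothetical, callbacks with a lower priority value run first, NF\_QUEUE is simply returned to the caller instead of queueing anything, and NF\_REPEAT is capped so a misbehaving callback cannot loop forever.

```python
# Verdict constants, numbered for illustration.
NF_DROP, NF_ACCEPT, NF_STOLEN, NF_QUEUE, NF_REPEAT = range(5)


class Hook:
    """One call-out point with callbacks ordered by priority."""

    def __init__(self):
        self.callbacks = []  # (priority, function) pairs

    def register(self, priority, fn):
        self.callbacks.append((priority, fn))
        # Lower priority values run first; ties keep registration order.
        self.callbacks.sort(key=lambda entry: entry[0])

    def run(self, packet, max_repeats=8):
        for _, fn in self.callbacks:
            verdict = fn(packet)
            repeats = 0
            while verdict == NF_REPEAT and repeats < max_repeats:
                verdict = fn(packet)  # call the same callback again
                repeats += 1
            if verdict != NF_ACCEPT:
                return verdict  # any non-ACCEPT verdict ends the traversal
        return NF_ACCEPT
```

Only NF\_ACCEPT lets the next callback see the packet, which is why the order defined by the priority matters.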
|
||||
|
||||
\subsubsection{Netfilter hooks within IPv4}
|
||||
|
||||
The IPv4 stack provides five netfilter hooks, which are placed at the
|
||||
following particular places within the code:
|
||||
|
||||
\begin{verbatim}
|
||||
   --->[1]--->[ROUTE]--->[3]--->[4]--->
                 |            ^
                 |            |
                 |         [ROUTE]
                 v            |
                [2]          [5]
                 |            ^
                 |            |
                 v            |

               local processes
|
||||
\end{verbatim}
|
||||
|
||||
Packets received on any network interface arrive at the left side of the
|
||||
diagram. After the verification of the IP header checksum, the
|
||||
NF\_IP\_PRE\_ROUTING [1] hook is traversed.
|
||||
|
||||
If the packet ``survives'' (i.e. NF\_ACCEPT is returned), it enters the
|
||||
routing code. Where we continue from here depends on the destination of the
|
||||
packet.
|
||||
|
||||
Packets with a local destination (i.e. packets where the destination address is
|
||||
one of the own IP addresses of the host) traverse the NF\_IP\_LOCAL\_IN [2]
|
||||
hook. If all callback functions return NF\_ACCEPT, the packet is finally passed
|
||||
to the socket code, which eventually passes the packet to a local process.
|
||||
|
||||
Packets with a remote destination (i.e. packets which are forwarded by the
|
||||
local machine) traverse the NF\_IP\_FORWARD [3] hook. If they ``survive'',
|
||||
they finally pass the NF\_IP\_POST\_ROUTING [4] hook and are sent off the
|
||||
outgoing network interface.
|
||||
|
||||
Locally generated packets first traverse the NF\_IP\_LOCAL\_OUT [5] hook, then
|
||||
enter the routing code, and finally go through the NF\_IP\_POST\_ROUTING [4]
|
||||
hook before being sent off the outgoing network interface.
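The three paths described above can be summarised in a few lines. The function below is only a lookup of hook names as given in the text; hooks\_traversed and its parameters are hypothetical and the sketch assumes every callback returns NF\_ACCEPT.

```python
def hooks_traversed(origin, dst_is_local=False):
    """List the IPv4 hooks a packet passes, assuming nothing drops it.

    origin is "network" for received packets and "local" for packets
    generated by a local process.
    """
    if origin == "local":
        return ["NF_IP_LOCAL_OUT", "NF_IP_POST_ROUTING"]
    if origin != "network":
        raise ValueError("origin must be 'network' or 'local'")
    if dst_is_local:
        return ["NF_IP_PRE_ROUTING", "NF_IP_LOCAL_IN"]
    return ["NF_IP_PRE_ROUTING", "NF_IP_FORWARD", "NF_IP_POST_ROUTING"]
```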
|
||||
|
||||
\subsubsection{Netfilter hooks within IPv6}
|
||||
|
||||
As the IPv4 and IPv6 protocols are very similar, the netfilter hooks within the
|
||||
IPv6 stack are placed at exactly the same locations as in the IPv4 stack. The
|
||||
only difference is the hook names: NF\_IP6\_PRE\_ROUTING, NF\_IP6\_LOCAL\_IN,
|
||||
NF\_IP6\_FORWARD, NF\_IP6\_POST\_ROUTING, NF\_IP6\_LOCAL\_OUT.
|
||||
|
||||
\subsubsection{Netfilter hooks within DECnet}
|
||||
|
||||
There are seven DECnet hooks. The first five hooks (NF\_DN\_PRE\_ROUTING,
|
||||
NF\_DN\_LOCAL\_IN, NF\_DN\_FORWARD, NF\_DN\_LOCAL\_OUT, NF\_DN\_POST\_ROUTING)
|
||||
are pretty much the same as in IPv4. The last two hooks (NF\_DN\_HELLO,
|
||||
NF\_DN\_ROUTE) are used in conjunction with DECnet Hello and Routing packets.
|
||||
|
||||
\subsubsection{Netfilter hooks within ARP}
|
||||
|
||||
Recent kernels\footnote{IIRC, starting with 2.4.19-pre3} have added support for netfilter hooks within the ARP code.
|
||||
There are two hooks: NF\_ARP\_IN and NF\_ARP\_OUT, for incoming and outgoing
|
||||
ARP packets respectively.
|
||||
|
||||
\subsubsection{Netfilter hooks within IPX}
|
||||
|
||||
There have been experimental patches to add netfilter hooks to the IPX code,
|
||||
but they never got integrated into the kernel source.
|
||||
|
||||
\subsection{Packet selection using IP Tables}
|
||||
|
||||
The IP tables core (ip\_tables.o) provides a generic layer for evaluation
|
||||
of rulesets.
|
||||
|
||||
An IP table consists of an arbitrary number of {\it chains}, which in turn
|
||||
consist of a linear list of {\it rules}, which again consist of any
|
||||
number of {\it matches} and one {\it target}.
|
||||
|
||||
{\it Chains} can further be divided into two classes: either {\it builtin
|
||||
chains} or {\it user-defined chains}. Builtin chains are always present, they
|
||||
are created upon table registration. They are also the entry points for table
|
||||
iteration. User defined chains are created at runtime upon user interaction.
|
||||
|
||||
{\it Matches} specify the matching criteria; there can be zero or more matches per rule.
|
||||
|
||||
{\it Targets} specify the action which is to be executed in case {\bf all}
|
||||
matches match. There can only be a single target per rule.
|
||||
|
||||
Matches and targets can either be {\it builtin} or {\it linux kernel modules}.
|
||||
|
||||
There are two special targets:
|
||||
\begin{itemize}
|
||||
\item
|
||||
By using a chain name as target, it is possible to jump to the respective chain
|
||||
in case the matches match.
|
||||
\item
|
||||
By using the RETURN target, it is possible to return to the previous (calling)
|
||||
chain
|
||||
\end{itemize}
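The relationship between tables, chains, rules, matches and targets, including the two special targets, can be sketched in Python as follows. This is a simplified model under stated assumptions, not the kernel implementation: the class and attribute names are invented for illustration, matches are modelled as plain predicates on a dictionary, and a per-chain default policy (not discussed above) is assumed for builtin chains. Recursion is used here purely for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Rule:
    matches: List[Callable[[dict], bool]]  # zero or more match predicates
    target: str                            # a verdict, "RETURN", or a chain name


@dataclass
class Table:
    chains: Dict[str, List[Rule]] = field(default_factory=dict)
    # Assumption for this sketch: each builtin chain has a default policy.
    policy: Dict[str, str] = field(default_factory=dict)

    def traverse(self, builtin_chain: str, packet: dict) -> str:
        """Entry point: iteration always starts at a builtin chain."""
        verdict = self._run(builtin_chain, packet)
        # Running off the end of the builtin chain applies its policy.
        return verdict if verdict is not None else self.policy[builtin_chain]

    def _run(self, chain: str, packet: dict) -> Optional[str]:
        for rule in self.chains[chain]:
            # The target is executed only if ALL matches match.
            if not all(match(packet) for match in rule.matches):
                continue
            if rule.target == "RETURN":
                return None                # go back to the calling chain
            if rule.target in self.chains:
                verdict = self._run(rule.target, packet)  # jump to that chain
                if verdict is not None:
                    return verdict
                continue                   # called chain returned, resume here
            return rule.target             # final verdict such as ACCEPT or DROP
        return None
```

A rule with an empty match list matches every packet, which is how a catch-all rule at the end of a chain behaves in this model.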
|
||||
|
||||
The IP tables core handles the following functions
|
||||
\begin{itemize}
|
||||
\item
|
||||
Registering and unregistering tables
|
||||
\item
|
||||
Registering and unregistering matches and targets (can be implemented as linux kernel modules)
|
||||
\item
|
||||
Kernel / userspace interface for manipulation of IP tables
|
||||
\item
|
||||
Traversal of IP tables
|
||||
\end{itemize}
|
||||
|
||||
\subsubsection{Packet filtering using the ``filter'' table}
|
||||
|
||||
Traditional packet filtering (i.e. the successor to ipfwadm/ipchains) takes
|
||||
place in the ``filter'' table. Packet filtering works like a sieve: A packet
|
||||
is (in the end) either dropped or accepted - but never modified.
|
||||
|
||||
The ``filter'' table is implemented in the {\it iptable\_filter.o} module
|
||||
and contains three builtin chains:
|
||||
|
||||
\begin{itemize}
|
||||
\item
|
||||
{\bf INPUT} attaches to NF\_IP\_LOCAL\_IN
|
||||
\item
|
||||
{\bf FORWARD} attaches to NF\_IP\_FORWARD
|
||||
\item
|
||||
{\bf OUTPUT} attaches to NF\_IP\_LOCAL\_OUT
|
||||
\end{itemize}
|
||||
|
||||
The placement of the chains / hooks is done in such a way that every conceivable
|
||||
packet always traverses only one of the built-in chains. Packets destined for
|
||||
the local host traverse only INPUT, packets forwarded only FORWARD and
|
||||
locally-originated packets only OUTPUT.
|
||||
|
||||
\subsubsection{Packet mangling using the ``mangle'' table}
|
||||
|
||||
As stated above, operations which would modify a packet do not belong in the
|
||||
``filter'' table. The ``mangle'' table is available for all kinds of packet
|
||||
manipulation - but not manipulation of addresses (which is NAT).
|
||||
|
||||
The mangle table attaches to all five netfilter hooks and provides the
|
||||
respective builtin chains (PREROUTING, INPUT, FORWARD, OUTPUT, POSTROUTING)
|
||||
\footnote{This has changed through recent 2.4.x kernel series, old kernels may
|
||||
only support two (PREROUTING, OUTPUT) chains.}.
|
||||
|
||||
\subsection{Connection Tracking Subsystem}
|
||||
|
||||
Traditional packet filters can only match on matching criteria within the
|
||||
currently processed packet, like source/destination IP address, port numbers,
|
||||
TCP flags, etc. As most applications have a notion of connections or at least
|
||||
a request/response style protocol, there is a lot of information which can not
|
||||
be derived from looking at a single packet.
|
||||
|
||||
Thus, modern (stateful) packet filters attempt to track connections (flows)
|
||||
and their respective protocol states for all traffic through the packet
|
||||
filter.
|
||||
|
||||
Connection tracking within linux is implemented as a netfilter module, called
|
||||
ip\_conntrack.o.
|
||||
|
||||
Before describing the connection tracking subsystem, we need to describe a couple of definitions and primitives used throughout the conntrack code.
|
||||
|
||||
A connection is represented within the conntrack subsystem using {\it struct
|
||||
ip\_conntrack}, also called {\it connection tracking entry}.
|
||||
|
||||
Connection tracking is utilizing {\it conntrack tuples}, which are tuples
|
||||
consisting of (srcip, srcport, dstip, dstport, l4prot). A connection is
|
||||
uniquely identified by two tuples: The tuple in the original direction
|
||||
(IP\_CT\_DIR\_ORIGINAL) and the tuple for the reply direction
|
||||
(IP\_CT\_DIR\_REPLY).
|
||||
|
||||
Connection tracking itself does not drop packets\footnote{well, in some rare
|
||||
cases in combination with NAT it needs to drop. But don't tell anyone, this is
|
||||
secret.} or impose any policy. It just associates every packet with a
|
||||
connection tracking entry, which in turn has a particular state. All other
|
||||
kernel code can use this state information\footnote{state information is
|
||||
internally represented via the {\it struct sk\_buff.nfct} structure member of a
|
||||
packet.}.
|
||||
|
||||
\subsubsection{Integration of conntrack with netfilter}
|
||||
|
||||
If the ip\_conntrack.o module is registered with netfilter, it attaches to the
|
||||
NF\_IP\_PRE\_ROUTING, NF\_IP\_POST\_ROUTING, NF\_IP\_LOCAL\_IN and
|
||||
NF\_IP\_LOCAL\_OUT hooks.
|
||||
|
||||
Because forwarded packets are the most common case on firewalls, I will only
|
||||
describe how connection tracking works for forwarded packets. The two relevant
|
||||
hooks for forwarded packets are NF\_IP\_PRE\_ROUTING and NF\_IP\_POST\_ROUTING.
|
||||
|
||||
Every time a packet arrives at the NF\_IP\_PRE\_ROUTING hook, connection
|
||||
tracking creates a conntrack tuple from the packet. It then compares this
|
||||
tuple to the original and reply tuples of all already-seen connections
|
||||
\footnote{Of course this is not implemented as a linear search over all existing connections.} to find out if this just-arrived packet belongs to any existing
|
||||
connection. If there is no match, a new conntrack table entry (struct
|
||||
ip\_conntrack) is created.
|
||||
|
||||
Let's assume the case where we have no existing connections yet and are
|
||||
starting from scratch.
|
||||
|
||||
The first packet comes in, we derive the tuple from the packet headers, look up
|
||||
the conntrack hash table, don't find any matching entry. As a result, we
|
||||
create a new struct ip\_conntrack. This struct ip\_conntrack is filled with
|
||||
all necessary data, like the original and reply tuple of the connection.
|
||||
How do we know the reply tuple? By inverting the source and destination
|
||||
parts of the original tuple.\footnote{So why do we need two tuples, if they can
|
||||
be derived from each other? Wait until we discuss NAT.}
|
||||
Please note that this new struct ip\_conntrack is {\bf not} yet placed
|
||||
into the conntrack hash table.
|
||||
|
||||
The packet is now passed on to other callback functions which have registered
|
||||
with a lower priority at NF\_IP\_PRE\_ROUTING. It then continues traversal of
|
||||
the network stack as usual, including all respective netfilter hooks.
|
||||
|
||||
If the packet survives (i.e. is not dropped by the routing code, network stack,
|
||||
firewall ruleset, ...), it re-appears at NF\_IP\_POST\_ROUTING. In this case,
|
||||
we can now safely assume that this packet will be sent off on the outgoing
|
||||
interface, and thus put the connection tracking entry which we created at
|
||||
NF\_IP\_PRE\_ROUTING into the conntrack hash table. This process is called
|
||||
{\it confirming the conntrack}.
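The lookup, create and confirm steps for a forwarded packet can be sketched as follows. This is a minimal Python model, not the kernel code: {\tt Tuple5}, {\tt Conntrack}, {\tt ConntrackTable} and their methods are names made up for this illustration, and a plain dictionary stands in for the conntrack hash table.

```python
from collections import namedtuple

# Hypothetical stand-in for a conntrack tuple as described in the text.
Tuple5 = namedtuple("Tuple5", "srcip srcport dstip dstport l4proto")


def invert(t):
    """Derive the reply tuple by swapping the source and destination parts."""
    return Tuple5(t.dstip, t.dstport, t.srcip, t.srcport, t.l4proto)


class Conntrack:
    """Simplified stand-in for struct ip_conntrack."""

    def __init__(self, original):
        self.original = original       # IP_CT_DIR_ORIGINAL
        self.reply = invert(original)  # IP_CT_DIR_REPLY
        self.confirmed = False


class ConntrackTable:
    def __init__(self):
        self.hash = {}  # tuple (either direction) -> Conntrack

    def pre_routing(self, packet_tuple):
        """NF_IP_PRE_ROUTING: find the entry for this packet or create one."""
        ct = self.hash.get(packet_tuple)
        if ct is None:
            # New connection: created, but NOT yet placed into the hash table.
            ct = Conntrack(packet_tuple)
        return ct

    def post_routing(self, ct):
        """NF_IP_POST_ROUTING: the packet survived, so confirm the entry."""
        if not ct.confirmed:
            self.hash[ct.original] = ct
            self.hash[ct.reply] = ct
            ct.confirmed = True
```

Because the entry is hashed under both tuples once it is confirmed, a later packet in the reply direction finds the same entry with a single lookup. A packet that is dropped between the two hooks never reaches {\tt post\_routing}, so its unconfirmed entry never enters the table.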
|
||||
|
||||
The connection tracking code itself is not monolithic, but consists of a
|
||||
couple of separate modules\footnote{They don't actually have to be separate
|
||||
kernel modules; e.g. TCP, UDP and ICMP tracking modules are all part of
|
||||
the linux kernel module ip\_conntrack.o}. Besides the conntrack core, there
|
||||
are two important kinds of modules: protocol helpers and application helpers.
|
||||
|
||||
Protocol helpers implement the layer-4-protocol specific parts. They currently
|
||||
exist for TCP, UDP and ICMP (an experimental helper for GRE exists).
|
||||
|
||||
\subsubsection{TCP connection tracking}
|
||||
|
||||
As TCP is a connection oriented protocol, it is not very difficult to imagine
|
||||
how connection tracking for this protocol could work. There are well-defined
|
||||
state transitions possible, and conntrack can decide which state transitions
|
||||
are valid within the TCP specification. In reality it's not all that easy,
|
||||
since we cannot assume that all packets that pass the packet filter actually
|
||||
arrive at the receiving end, ...
|
||||
|
||||
It is noteworthy that the standard connection tracking code does {\bf not}
|
||||
do TCP sequence number and window tracking. A well-maintained patch to add
|
||||
this feature exists almost as long as connection tracking itself. It will
|
||||
be integrated with the 2.5.x kernel. The problem with window tracking is
|
||||
its bad interaction with connection pickup. The TCP conntrack code is able to
|
||||
pick up already existing connections, e.g. in case your firewall was rebooted.
|
||||
However, connection pickup is conflicting with TCP window tracking: The TCP
|
||||
window scaling option is only transferred at connection setup time, and we
|
||||
don't know about it in case of pickup...
|
||||
|
||||
\subsubsection{ICMP tracking}
|
||||
|
||||
ICMP is not really a connection oriented protocol. So how is it possible to
|
||||
do connection tracking for ICMP?
|
||||
|
||||
The ICMP protocol can be split into two groups of messages:
|
||||
|
||||
\begin{itemize}
|
||||
\item
|
||||
ICMP error messages, which sort-of belong to a different connection.
|
||||
ICMP error messages are marked as {\it RELATED} to that other connection.
|
||||
(ICMP\_DEST\_UNREACH, ICMP\_SOURCE\_QUENCH, ICMP\_TIME\_EXCEEDED,
|
||||
ICMP\_PARAMETERPROB, ICMP\_REDIRECT).
|
||||
\item
|
||||
ICMP queries, which have a request->reply character. So what the conntrack
|
||||
code does, is let the request have a state of {\it NEW}, and the reply
|
||||
{\it ESTABLISHED}. The reply closes the connection immediately.
|
||||
(ICMP\_ECHO, ICMP\_TIMESTAMP, ICMP\_INFO\_REQUEST, ICMP\_ADDRESS)
|
||||
\end{itemize}
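The split into error messages and queries can be sketched in Python as below. The numeric ICMP type values are the standard ones; the function name {\tt icmp\_state} and its two boolean parameters are invented for this illustration. Returning {\it INVALID} for an error message whose embedded connection is unknown, or for a reply without a preceding request, is an assumption of this sketch rather than something stated above.

```python
# ICMP error types (standard type numbers).
ICMP_ERRORS = {
    3: "ICMP_DEST_UNREACH",
    4: "ICMP_SOURCE_QUENCH",
    5: "ICMP_REDIRECT",
    11: "ICMP_TIME_EXCEEDED",
    12: "ICMP_PARAMETERPROB",
}

# ICMP query types, mapped to the type of the matching reply.
ICMP_QUERIES = {
    8: 0,    # ICMP_ECHO         -> echo reply
    13: 14,  # ICMP_TIMESTAMP    -> timestamp reply
    15: 16,  # ICMP_INFO_REQUEST -> info reply
    17: 18,  # ICMP_ADDRESS      -> address mask reply
}
ICMP_REPLIES = {reply: request for request, reply in ICMP_QUERIES.items()}


def icmp_state(icmp_type, request_seen=False, inner_connection_known=False):
    """Return the conntrack state a packet of this ICMP type would get.

    request_seen: a matching request was tracked earlier (hypothetical input).
    inner_connection_known: the packet quoted inside an error message belongs
    to a tracked connection (hypothetical input).
    """
    if icmp_type in ICMP_ERRORS:
        return "RELATED" if inner_connection_known else "INVALID"
    if icmp_type in ICMP_QUERIES:
        return "NEW"
    if icmp_type in ICMP_REPLIES:
        return "ESTABLISHED" if request_seen else "INVALID"
    return "INVALID"
```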
|
||||
|
||||
\subsubsection{UDP connection tracking}
|
||||
|
||||
UDP is designed as a connectionless datagram protocol. But most common
|
||||
protocols using UDP as layer 4 protocol have bi-directional UDP communication.
|
||||
Imagine a DNS query, where the client sends a UDP datagram to port 53 of the
|
||||
nameserver, and the nameserver sends back a DNS reply packet from its UDP
|
||||
port 53 to the client.
|
||||
|
||||
Netfilter treats this as a connection. The first packet (the DNS request) is
|
||||
assigned a state of {\it NEW}, because the packet is expected to create a new
|
||||
'connection'. The DNS server's reply packet is marked as {\it ESTABLISHED}.
|
||||
|
||||
\subsubsection{conntrack application helpers}
|
||||
|
||||
More complex application protocols involving multiple connections need special
|
||||
support by a so-called ``conntrack application helper module''. Modules in
|
||||
the stock kernel come for FTP and IRC(DCC). Netfilter CVS currently contains
|
||||
patches for PPTP, H.323, Eggdrop botnet, TFTP and talk. We're still lacking
|
||||
a lot of protocols (e.g. SIP, SMB/CIFS) - but they are unlikely to appear
|
||||
until somebody really needs them and either develops them on his own or
|
||||
funds development.
|
||||
|
||||
\subsubsection{Integration of connection tracking with iptables}
|
||||
|
||||
As stated earlier, conntrack doesn't impose any policy on packets. It just
|
||||
determines the relation of a packet to already existing connections. To base
|
||||
packet filtering decisions on this state information, the iptables {\it state}
|
||||
match can be used. Every packet is within one of the following categories:
|
||||
|
||||
\begin{itemize}
|
||||
\item
|
||||
{\bf NEW}: packet would create a new connection, if it survives
|
||||
\item
|
||||
{\bf ESTABLISHED}: packet is part of an already established connection
|
||||
(either direction)
|
||||
\item
|
||||
{\bf RELATED}: packet is in some way related to an already established connection, e.g. ICMP errors or FTP data sessions
|
||||
\item
|
||||
{\bf INVALID}: conntrack is unable to derive conntrack information from this packet. Please note that all multicast or broadcast packets fall in this category.
|
||||
\end{itemize}
|
||||
|
||||
\subsection{NAT Subsystem}
|
||||
|
||||
The NAT (Network Address Translation) subsystem is probably the worst
|
||||
documented subsystem within the whole framework. This has two reasons: NAT is
|
||||
nasty and complicated. The Linux 2.4.x NAT implementation is easy to use, so
|
||||
nobody needs to know the nasty details.
|
||||
|
||||
Nonetheless, as I was traditionally concentrating most on the conntrack and NAT
|
||||
systems, I will give a short overview.
|
||||
|
||||
NAT uses almost all of the previously described subsystems:
|
||||
\begin{itemize}
|
||||
\item
|
||||
IP tables to specify which packets to NAT in which particular way. NAT
|
||||
registers a ``nat'' table with PREROUTING, POSTROUTING and OUTPUT chains.
|
||||
\item
|
||||
Connection tracking to associate NAT state with the connection.
|
||||
\item
|
||||
Netfilter to do the actual packet manipulation transparently to the rest of the
|
||||
kernel. NAT registers with NF\_IP\_PRE\_ROUTING, NF\_IP\_POST\_ROUTING,
|
||||
NF\_IP\_LOCAL\_IN and NF\_IP\_LOCAL\_OUT.
|
||||
\end{itemize}
|
||||
|
||||
The NAT implementation supports all kinds of different NAT: Source NAT,
|
||||
Destination NAT, NAT to address/port ranges, 1:1 NAT, ...
|
||||
|
||||
This fundamental design principle is still frequently misunderstood:\\
|
||||
The information about which NAT mappings apply to a certain connection
|
||||
is only gathered once - with the first packet of every connection.
|
||||
|
||||
So let's start to look at the life of a poor to-be-nat'ed packet.
|
||||
For ease of understanding, I have chosen to describe the most frequently
|
||||
used NAT scenario: Source NAT of a forwarded packet. Let's assume the
|
||||
packet has an original source address of 1.1.1.1, an original destination
|
||||
address of 2.2.2.2, and is going to be SNAT'ed to 9.9.9.9. Let's further
|
||||
ignore the fact that there are port numbers.
|
||||
|
||||
Once upon a time, our poor packet arrives at NF\_IP\_PRE\_ROUTING, where
|
||||
conntrack has registered with highest priority. This means that a conntrack
|
||||
entry with the following two tuples is created:
|
||||
\begin{verbatim}
|
||||
IP_CT_DIR_ORIGINAL: 1.1.1.1 -> 2.2.2.2
|
||||
IP_CT_DIR_REPLY: 2.2.2.2 -> 1.1.1.1
|
||||
\end{verbatim}
|
||||
After conntrack, the packet traverses the PREROUTING chain of the ``nat''
|
||||
IP table. Since only destination NAT happens at PREROUTING, no action
|
||||
occurs. After its lengthy way through the rest of the network stack,
|
||||
the packet arrives at the NF\_IP\_POST\_ROUTING hook, where it traverses
|
||||
the POSTROUTING chain of the ``nat'' table. Here it hits a SNAT rule,
|
||||
causing the following actions:
|
||||
\begin{itemize}
|
||||
\item
|
||||
Fill in a {\it struct ip\_nat\_manip}, indicating the new source address
|
||||
and the type of NAT (source NAT at POSTROUTING). This struct is part of the
|
||||
conntrack entry.
|
||||
\item
|
||||
Automatically derive the inverse NAT transformation for the reply packets:
|
||||
Destination NAT at PREROUTING. Fill in another {\it struct ip\_nat\_manip}.
|
||||
\item
|
||||
Alter the REPLY tuple of the conntrack entry to
|
||||
\begin{verbatim}
|
||||
IP_CT_DIR_REPLY: 2.2.2.2 -> 9.9.9.9
|
||||
\end{verbatim}
|
||||
\item
|
||||
Apply the SNAT transformation to the packet
|
||||
\end{itemize}
|
||||
|
||||
Every other packet within this connection, independent of its direction,
|
||||
will only execute the last step. Since all NAT information is connected
|
||||
with the conntrack entry, there is no need to do anything but to apply
|
||||
the same transformations to all packets within the same connection.
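The walk-through above can be replayed with a small Python model, using the same example addresses and, as in the text, ignoring port numbers. This is only a sketch: {\tt Tuple2}, {\tt NatConntrack} and its methods are hypothetical names, and the single attribute {\tt snat\_to} stands in for the pair of {\it struct ip\_nat\_manip} entries.

```python
from collections import namedtuple

# Address-only tuple; port numbers are ignored as in the example above.
Tuple2 = namedtuple("Tuple2", "src dst")


class NatConntrack:
    """Simplified conntrack entry carrying NAT state (hypothetical model)."""

    def __init__(self, src, dst):
        self.original = Tuple2(src, dst)  # IP_CT_DIR_ORIGINAL
        self.reply = Tuple2(dst, src)     # IP_CT_DIR_REPLY
        self.snat_to = None               # stands in for struct ip_nat_manip

    def setup_snat(self, new_src):
        """Runs once, when the first packet hits the SNAT rule."""
        self.snat_to = new_src
        # Replies will be addressed to the translated source from now on.
        self.reply = Tuple2(self.original.dst, new_src)

    def translate(self, packet):
        """Runs for every packet of the connection, in either direction."""
        if self.snat_to is None:
            return packet
        if packet == self.original:
            # Source NAT at POSTROUTING
            return Tuple2(self.snat_to, packet.dst)
        if packet == self.reply:
            # Inverse transformation: destination NAT at PREROUTING
            return Tuple2(packet.src, self.original.src)
        raise ValueError("packet does not belong to this connection")
```

The rule lookup happens only in {\tt setup\_snat}; {\tt translate} never consults the ``nat'' table again, which mirrors the design principle that NAT mappings are gathered once per connection. The altered reply tuple is also the reason why a conntrack entry stores both tuples instead of deriving one from the other.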
|
||||
|
||||
\subsection{IPv6 Firewalling with ip6tables}
|
||||
|
||||
Yes, Linux 2.4.x comes with a usable, though incomplete system to secure
|
||||
your IPv6 network.
|
||||
|
||||
The parts ported to IPv6 are
|
||||
\begin{itemize}
|
||||
\item
|
||||
IP tables (called IP6 tables)
|
||||
\item
|
||||
The ``filter'' table
|
||||
\item
|
||||
The ``mangle'' table
|
||||
\item
|
||||
The userspace library (libip6tc)
|
||||
\item
|
||||
The command line tool (ip6tables)
|
||||
\end{itemize}
|
||||
|
||||
Due to the lack of conntrack and NAT\footnote{for god's sake we don't have NAT
|
||||
with IPv6}, only traditional, stateless packet filtering is possible. Apart
|
||||
from the obvious matches/targets, ip6tables can match on
|
||||
\begin{itemize}
|
||||
\item
|
||||
{\it EUI64 checker}; verifies if the MAC address of the sender is the same as in the EUI64 64 least significant bits of the source IPv6 address
|
||||
\item
|
||||
{\it frag6 match}, matches on IPv6 fragmentation header
|
||||
\item
|
||||
{\it route6 match}, matches on IPv6 routing header
|
||||
\item
|
||||
{\it ahesp6 match}, matches on SPIDs within AH or ESP over IPv6 packets
|
||||
\end{itemize}
|
||||
|
||||
However, the ip6tables code doesn't seem to be used very widely (yet?).
|
||||
So please expect some potential remaining issues, since it is not tested
|
||||
as heavily as iptables.
|
||||
|
||||
\subsection{Recent Development}
|
||||
|
||||
Please refer to the spoken word at the presentation. Development at the
|
||||
time this paper was written can be quite different from development at the
|
||||
time the presentation is held.
|
||||
|
||||
\section{Thanks}
|
||||
|
||||
I'd like to thank
|
||||
\begin{itemize}
|
||||
\item
|
||||
{\it Linus Torvalds} for starting this interesting UNIX-like kernel
|
||||
\item
|
||||
{\it Alan Cox, David Miller, Alexey Kuznetsov, Andi Kleen} for building
|
||||
(one of?) the world's best TCP/IP stacks.
|
||||
\item
|
||||
{\it Paul ``Rusty'' Russell} for starting the netfilter/iptables project
|
||||
\item
|
||||
{\it The Netfilter Core Team} for continuing the netfilter/iptables effort
|
||||
\item
|
||||
{\it Astaro AG} for partially funding my current netfilter/iptables work
|
||||
\item
|
||||
{\it Conectiva Inc.} for partially funding parts of my past netfilter/iptables
|
||||
work and for inviting me to live in Brazil
|
||||
\item
|
||||
{\it samba.org and Kommunikationsnetz Franken e.V.} for hosting the netfilter
|
||||
homepage, CVS, mailing lists, ...
|
||||
\end{itemize}
|
||||
|
||||
\end{document}
|
|
@ -0,0 +1,49 @@
|
|||
Linux 2.4.x netfilter/iptables firewalling internals (lt-690870524)
|
||||
|
||||
The Linux 2.4.x kernel series has introduced a totally new kernel firewalling subsystem. It is much more than a plain successor of ipfwadm or ipchains.
|
||||
|
||||
The netfilter/iptables project has a very modular design and its
|
||||
sub-projects can be split in several parts: netfilter, iptables, connection
|
||||
tracking, NAT and packet mangling.
|
||||
|
||||
While most users will already have learned how to use the basic functions
|
||||
of netfilter/iptables in order to convert their old ipchains firewalls to
|
||||
iptables, there's more advanced but less used functionality in
|
||||
netfilter/iptables.
|
||||
|
||||
The presentation covers the design principles behind the netfilter/iptables
|
||||
implementation. This knowledge enables us to understand how the individual
|
||||
parts of netfilter/iptables fit together, and for which potential applications
|
||||
this is useful.
|
||||
|
||||
Topics covered:
|
||||
|
||||
- overview about the internal netfilter/iptables architecture
|
||||
- the netfilter hooks inside the network protocol stacks
|
||||
- packet selection with IP tables
|
||||
- how is connection tracking and NAT integrated into the framework
|
||||
- the connection tracking system
|
||||
- how well does it track the TCP state?
|
||||
- how does it track ICMP and UDP state at all?
|
||||
- layer 4 protocol helpers (GRE, ...)
|
||||
- application helpers (ftp, irc, h323, ...)
|
||||
- restrictions/limitations
|
||||
- the NAT system
|
||||
- how does it interact with connection tracking?
|
||||
- layer 4 protocol helpers
|
||||
- application helpers (ftp, irc, ...)
|
||||
- misc
|
||||
- how far is IPv6 firewalling with ip6tables?
|
||||
- advances in failover/HA of stateful firewalls
|
||||
- invisible firewalls with iptables on a bridge
|
||||
- userspace packet queueing with QUEUE
|
||||
- userspace packet logging with ULOG
|
||||
|
||||
Requirements:
|
||||
- knowledge about the TCP/IP protocol family
|
||||
- knowledge about general firewalling and packet filtering concepts
|
||||
- prior experience with linux packet filters
|
||||
|
||||
Audience:
|
||||
- firewall administrators
|
||||
- network developers
|
|
@ -0,0 +1,22 @@
|
|||
<a href="http://www.gnumonks.org/users/laforge/">Harald Welte</a> is one
|
||||
of the five <a href="http://www.netfilter.org/">netfilter/iptables</a> core
|
||||
team members, and the current Linux 2.4.x firewalling maintainer.
|
||||
|
||||
His main interest in computing has always been networking. In the little time
|
||||
left besides netfilter/iptables related work, he's writing obscure documents
|
||||
like the <a href="http://www.gnumonks.org/ftp/pub/doc/uucp-over-ssl.html"> UUCP
|
||||
over SSL HOWTO</a>. Other kernel-related projects he has contributed to are
|
||||
user mode linux and the international (crypto) kernel patch.
|
||||
|
||||
In the past he has been working as an independent IT Consultant working on
|
||||
closed-source projects for various companies ranging from banks to
|
||||
manufacturers of networking gear. During the year 2001 he was living in
|
||||
Curitiba (Brazil), where he got sponsored for his Linux related work by
|
||||
<a href="http://www.conectiva.com/">Conectiva Inc.</a>.
|
||||
|
||||
Starting with February 2002, Harald has been contracted part-time by
|
||||
<a href="http://www.astaro.com/">Astaro AG</a>, who are sponsoring him for his
|
||||
current netfilter/iptables work.
|
||||
|
||||
Harald is living in Erlangen, Germany.
|
||||
|
|
@ -0,0 +1,466 @@
|
|||
%include "default.mgp"
|
||||
%default 1 bgrad
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
%nodefault
|
||||
%back "blue"
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
Linux 2.4.x netfilter/iptables
|
||||
firewalling internals
|
||||
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <laforge@gnumonks.org>
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Contents
|
||||
|
||||
|
||||
Introduction
|
||||
Netfilter hooks in protocol stacks
|
||||
Packet selection based on IP Tables
|
||||
The Connection Tracking Subsystem
|
||||
The NAT Subsystem based on netfilter + iptables
|
||||
Packet filtering using the 'filter' table
|
||||
Packet mangling using the 'mangle' table
|
||||
Advanced netfilter concepts
|
||||
Current development and Future
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Introduction
|
||||
|
||||
Why did we need netfilter/iptables?
|
||||
Because ipchains...
|
||||
|
||||
has no infrastructure for passing packets to userspace
|
||||
makes transparent proxying extremely difficult
|
||||
has interface address dependent Packet filter rules
|
||||
has Masquerading implemented as part of packet filtering
|
||||
code is too complex and intermixed with core ipv4 stack
|
||||
is neither modular nor extensible
|
||||
only barely supports one special case of NAT (masquerading)
|
||||
has only stateless packet filtering
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Introduction
|
||||
|
||||
Who's behind netfilter/iptables
|
||||
Paul 'Rusty' Russell
|
||||
co-author of ipchains in Linux 2.2
|
||||
was paid by Watchguard for about one Year of development
|
||||
James Morris
|
||||
userspace queuing (kernel, library and tools)
|
||||
REJECT target
|
||||
Marc Boucher
|
||||
NAT and packet filtering controlled by one command
|
||||
Mangle table
|
||||
Harald Welte
|
||||
Conntrack+NAT helper infrastructure (newnat)
|
||||
Userspace packet logging (ULOG)
|
||||
PPTP and IRC conntrack/NAT helpers
|
||||
Jozsef Kadlecsik
|
||||
TCP window tracking
|
||||
H.323 conntrack + NAT helper
|
||||
Continued newnat development
|
||||
Non-core team contributors
|
||||
http://www.netfilter.org/scoreboard/
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Netfilter Hooks
|
||||
|
||||
What is netfilter?
|
||||
|
||||
System of callback functions within network stack
|
||||
Callback function to be called for every packet traversing certain point (hook) within network stack
|
||||
Protocol independent framework
|
||||
Hooks in layer 3 stacks (IPv4, IPv6, DECnet, ARP)
|
||||
Multiple kernel modules can register with each of the hooks
|
||||
Asynchronous packet handling in userspace (ip_queue)
|
||||
|
||||
Traditional packet filtering, NAT, ... is implemented on top of this framework
|
||||
|
||||
Can be used for other stuff interfacing with the core network stack, like DECnet routing daemon.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Netfilter Hooks
|
||||
|
||||
Netfilter architecture in IPv4
|
||||
%font "courier"
|
||||
|
||||
   --->[1]--->[ROUTE]--->[3]--->[4]--->
                 |            ^
                 |            |
                 |         [ROUTE]
                 v            |
                [2]          [5]
                 |            ^
                 |            |
                 v            |
|
||||
|
||||
%font "standard"
|
||||
1=NF_IP_PRE_ROUTING
|
||||
2=NF_IP_LOCAL_IN
|
||||
3=NF_IP_FORWARD
|
||||
4=NF_IP_POST_ROUTING
|
||||
5=NF_IP_LOCAL_OUT
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Netfilter Hooks
|
||||
|
||||
Netfilter Hooks
|
||||
|
||||
Any kernel module may register a callback function at any of the hooks
|
||||
|
||||
The module has to return one of the following constants
|
||||
|
||||
NF_ACCEPT continue traversal as normal
|
||||
NF_DROP drop the packet, do not continue
|
||||
NF_STOLEN I've taken over the packet, do not continue
|
||||
NF_QUEUE enqueue packet to userspace
|
||||
NF_REPEAT call this hook again
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
IP tables
|
||||
|
||||
Packet selection using IP tables
|
||||
|
||||
The kernel provides generic IP tables support
|
||||
|
||||
Each kernel module may create its own IP table
|
||||
|
||||
The three major parts of 2.4 firewalling subsystem are implemented using IP tables
|
||||
Packet filtering table 'filter'
|
||||
NAT table 'nat'
|
||||
Packet mangling table 'mangle'
|
||||
|
||||
Can potentially be used for other stuff, e.g. IPsec SPDB
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
IP Tables
|
||||
|
||||
Managing chains and tables
|
||||
|
||||
An IP table consists of multiple chains
|
||||
A chain consists of a list of rules
|
||||
Every single rule in a chain consists of
|
||||
match[es] (rule executed if all matches true)
|
||||
target (what to do if the rule is matched)
|
||||
|
||||
%size 4
|
||||
matches and targets can either be builtin or implemented as kernel modules
|
||||
|
||||
%size 6
|
||||
The userspace tool iptables is used to control IP tables
|
||||
handles all different kinds of IP tables
|
||||
supports a plugin/shlib interface for target/match specific options
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
IP Tables
|
||||
|
||||
Basic iptables commands
|
||||
|
||||
To build a complete iptables command, we must specify
|
||||
which table to work with
|
||||
which chain in this table to use
|
||||
an operation (insert, add, delete, modify)
|
||||
one or more matches (optional)
|
||||
a target
|
||||
|
||||
The syntax is
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t table -Operation chain -j target match(es)
|
||||
%font "standard"
|
||||
%size 5
|
||||
|
||||
Example:
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t filter -A INPUT -j ACCEPT -p tcp --dport smtp
|
||||
%font "standard"
|
||||
%size 5
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
IP Tables
|
||||
|
||||
Matches
|
||||
Basic matches
|
||||
-p protocol (tcp/udp/icmp/...)
|
||||
-s source address (ip/mask)
|
||||
-d destination address (ip/mask)
|
||||
-i incoming interface
|
||||
-o outgoing interface
|
||||
|
||||
Match extensions (examples)
|
||||
tcp/udp TCP/udp source/destination port
|
||||
icmp ICMP code/type
|
||||
ah/esp AH/ESP SPID match
|
||||
mac source MAC address
|
||||
mark nfmark
|
||||
length match on length of packet
|
||||
limit rate limiting (n packets per timeframe)
|
||||
owner owner uid of the socket sending the packet
|
||||
tos TOS field of IP header
|
||||
ttl TTL field of IP header
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
IP Tables
|
||||
|
||||
Targets
|
||||
very dependent on the particular table.
|
||||
|
||||
Table specific targets will be discussed later
|
||||
|
||||
Generic Targets, always available
|
||||
ACCEPT accept packet within chain
|
||||
DROP silently drop packet
|
||||
QUEUE enqueue packet to userspace
|
||||
LOG log packet via syslog
|
||||
ULOG log packet via ulogd
|
||||
RETURN return to previous (calling) chain
|
||||
foobar jump to user defined chain
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Packet Filtering
|
||||
|
||||
Overview
|
||||
|
||||
Implemented as 'filter' table
|
||||
Registers with three netfilter hooks
|
||||
|
||||
NF_IP_LOCAL_IN (packets destined for the local host)
|
||||
NF_IP_FORWARD (packets forwarded by local host)
|
||||
NF_IP_LOCAL_OUT (packets from the local host)
|
||||
|
||||
Each of the three hooks has attached one chain (INPUT, FORWARD, OUTPUT)
|
||||
|
||||
Every packet passes exactly one of the three chains. Note that this is very different compared to the old 2.2.x ipchains behaviour.
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Packet Filtering
|
||||
|
||||
Targets available within 'filter' table
|
||||
|
||||
Builtin Targets to be used in filter table
|
||||
ACCEPT accept the packet
|
||||
DROP silently drop the packet
|
||||
QUEUE enqueue packet to userspace
|
||||
RETURN return to previous (calling) chain
|
||||
foobar user defined chain
|
||||
|
||||
Targets implemented as loadable modules
|
||||
REJECT drop the packet but inform sender
|
||||
MIRROR change source/destination IP and resend
|
||||
LOG log via syslog
|
||||
ULOG log via userspace
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Connection tracking...
|
||||
|
||||
implemented separately from NAT
|
||||
enables stateful filtering
|
||||
implementation
|
||||
hooks into NF_IP_PRE_ROUTING to track packets
|
||||
hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
|
||||
protocol modules (currently TCP/UDP/ICMP)
|
||||
application helpers (currently FTP, IRC, H.323, talk, SNMP)
|
||||
divides packets into the following four categories
|
||||
NEW - would establish new connection
|
||||
ESTABLISHED - part of already established connection
|
||||
RELATED - is related to established connection
|
||||
INVALID - (multicast, errors...)
|
||||
does _NOT_ filter packets itself
|
||||
can be utilized by iptables using the 'state' match
|
||||
is used by NAT Subsystem
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Network Address Translation
|
||||
|
||||
Overview
|
||||
|
||||
Previous Linux Kernels only implemented one special case of NAT: Masquerading
|
||||
Linux 2.4.x can do any kind of NAT.
|
||||
NAT subsystem implemented on top of netfilter, iptables and conntrack
|
||||
NAT subsystem registers with all five netfilter hooks
|
||||
'nat' Table registers chains PREROUTING, POSTROUTING and OUTPUT
|
||||
Following targets available within 'nat' Table
|
||||
SNAT changes the packet's source while passing NF_IP_POST_ROUTING
|
||||
DNAT changes the packet's destination while passing NF_IP_PRE_ROUTING
|
||||
MASQUERADE is a special case of SNAT
|
||||
REDIRECT is a special case of DNAT
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Network Address Translation
|
||||
|
||||
Source NAT
|
||||
SNAT Example:
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t nat -A POSTROUTING -j SNAT --to-source 1.2.3.4 -s 10.0.0.0/8
|
||||
%font "standard"
|
||||
%size 4
|
||||
|
||||
MASQUERADE Example:
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t nat -A POSTROUTING -j MASQUERADE -o ppp0
|
||||
%font "standard"
|
||||
%size 5
|
||||
|
||||
Destination NAT
|
||||
DNAT example
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t nat -A PREROUTING -j DNAT --to-destination 1.2.3.4:8080 -p tcp --dport 80 -i eth1
|
||||
%font "standard"
|
||||
%size 4
|
||||
|
||||
REDIRECT example
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t nat -A PREROUTING -j REDIRECT --to-port 3128 -i eth1 -p tcp --dport 80
|
||||
%font "standard"
|
||||
%size 5
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Packet Mangling
|
||||
|
||||
Purpose of mangle table
|
||||
packet manipulation except address manipulation
|
||||
|
||||
Integration with netfilter
|
||||
'mangle' table hooks in all five netfilter hooks
|
||||
priority: after conntrack
|
||||
|
||||
Targets specific to the 'mangle' table:
|
||||
DSCP - manipulate DSCP field
|
||||
IPV4OPTSSTRIP - strip IPv4 options
|
||||
MARK - change the nfmark field of the skb
|
||||
TCPMSS - set TCP MSS option
|
||||
TOS - manipulate the TOS bits
|
||||
TTL - set / increase / decrease TTL field
|
||||
|
||||
Simple example:
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t mangle -A PREROUTING -j MARK --set-mark 10 -p tcp --dport 80
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Advanced Netfilter concepts
|
||||
|
||||
%size 4
|
||||
Userspace logging
|
||||
flexible replacement for old syslog-based logging
|
||||
packets to userspace via multicast netlink sockets
|
||||
easy-to-use library (libipulog)
|
||||
plugin-extensible userspace logging daemon (ulogd)
|
||||
Can even be used to directly log into MySQL
|
||||
|
||||
Queuing
|
||||
reliable asynchronous packet handling
|
||||
packets to userspace via unicast netlink socket
|
||||
easy-to-use library (libipq)
|
||||
provides Perl bindings
|
||||
experimental queue multiplex daemon (ipqmpd)
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Current Development and Future
|
||||
|
||||
Netfilter (although it has proved very stable) is still a work in progress.
|
||||
|
||||
Areas of current development
|
||||
infrastructure for conntrack manipulation from userspace
|
||||
failover of stateful firewalls
|
||||
making iptables layer3 independent (pkttables)
|
||||
new userspace library (libiptables) to hide plugins from apps
|
||||
more matches and targets for advanced functions (pool, hashslot)
|
||||
more conntrack and NAT modules (RPC, SNMP, SMB, ...)
|
||||
better IPv6 support (conntrack, more matches / targets)
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Thanks
|
||||
|
||||
Thanks to
|
||||
the BBS people, Z-Netz, FIDO, ...
|
||||
for heavily increasing my computer usage in 1992
|
||||
|
||||
KNF
|
||||
for bringing me in touch with the internet as early as 1994
|
||||
for providing a playground for technical people
|
||||
for telling me about the existence of Linux!
|
||||
|
||||
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
|
||||
for implementing (one of?) the world's best TCP/IP stacks
|
||||
|
||||
Paul 'Rusty' Russell
|
||||
for starting the netfilter/iptables project
|
||||
for trusting me to maintain it today
|
||||
|
||||
Astaro AG
|
||||
for sponsoring parts of my netfilter work
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Availability of slides / Links
|
||||
|
||||
The slides and the accompanying paper for this presentation are available at
|
||||
http://www.gnumonks.org/
|
||||
|
||||
The netfilter homepage
|
||||
http://www.netfilter.org/
|
||||
|
|
@ -0,0 +1,537 @@
|
|||
\documentclass{article}
|
||||
\usepackage{german}
|
||||
\usepackage{fancyheadings}
|
||||
\usepackage{a4}
|
||||
|
||||
\setlength{\oddsidemargin}{0in}
|
||||
\setlength{\evensidemargin}{0in}
|
||||
\setlength{\topmargin}{0.0in}
|
||||
\setlength{\headheight}{0in}
|
||||
\setlength{\headsep}{0in}
|
||||
\setlength{\textwidth}{6.5in}
|
||||
\setlength{\textheight}{9.5in}
|
||||
\setlength{\parindent}{0in}
|
||||
\setlength{\parskip}{0.05in}
|
||||
|
||||
|
||||
\begin{document}
|
||||
\title{Linux 2.4.x netfilter/iptables firewalling internals}
|
||||
|
||||
\author{Harald Welte\\
|
||||
laforge@gnumonks.org\\
|
||||
\copyright{}2002 H. Welte}
|
||||
|
||||
\date{25. April 2002}
|
||||
|
||||
\maketitle
|
||||
|
||||
\setcounter{section}{0}
|
||||
\setcounter{subsection}{0}
|
||||
\setcounter{subsubsection}{0}
|
||||
|
||||
\section{Introduction}
|
||||
The Linux 2.4.x kernel series has introduced a totally new kernel firewalling
|
||||
subsystem. It is much more than a plain successor of ipfwadm or ipchains.
|
||||
|
||||
The netfilter/iptables project has a very modular design and its
|
||||
sub-projects can be split in several parts: netfilter, iptables, connection
|
||||
tracking, NAT and packet mangling.
|
||||
|
||||
While most users will already have learned how to use the basic functions
|
||||
of netfilter/iptables in order to convert their old ipchains firewalls to
|
||||
iptables, there's more advanced but less used functionality in
|
||||
netfilter/iptables.
|
||||
|
||||
The presentation covers the design principles behind the netfilter/iptables
|
||||
implementation. This knowledge enables us to understand how the individual
|
||||
parts of netfilter/iptables fit together, and for which potential applications
|
||||
this is useful.
|
||||
|
||||
\section{Internal netfilter/iptables architecture}
|
||||
|
||||
\subsection{Netfilter hooks in protocol stacks}
|
||||
|
||||
One of the major motivations behind the redesign of the linux packet
|
||||
filtering and NAT system during the 2.3.x kernel series was the widespread
|
||||
firewall specific code parts within the core IPv4 stack. Ideally the core
|
||||
IPv4 stack (as used by regular hosts and routers) shouldn't contain any
|
||||
firewalling specific code, resulting in no unwanted interaction and less
|
||||
code complexity. This desire led to the invention of {\it netfilter}.
|
||||
|
||||
\subsubsection{Architecture of netfilter}
|
||||
|
||||
Netfilter is basically a system of callback functions within the network
|
||||
stack. It provides a non-portable API towards in-kernel networking
|
||||
extensions.
|
||||
|
||||
What we call {\it netfilter hook} is a well-defined call-out point within a
|
||||
layer three protocol stack, such as IPv4, IPv6 or DECnet. Any layer three
|
||||
network stack can define an arbitrary number of hooks, usually placed at
|
||||
strategic points within the packet flow.
|
||||
|
||||
Any other kernel code can now subsequently register callback functions for
|
||||
any of these hooks. As most systems will have more than one callback
|
||||
function registered for a particular hook, a {\it priority} is specified upon
|
||||
registration of the callback function. This priority defines the order in
|
||||
which the individual callback functions at a particular hook are called.
|
||||
|
||||
The return value of any registered callback functions can be:
|
||||
\begin{itemize}
|
||||
\item
|
||||
{\bf NF\_ACCEPT}: continue traversal as usual
|
||||
\item
|
||||
{\bf NF\_DROP}: drop the packet; do not continue traversal
|
||||
\item
|
||||
{\bf NF\_STOLEN}: callback function has taken over the packet; do not continue
|
||||
\item
|
||||
{\bf NF\_QUEUE}: enqueue the packet to userspace
|
||||
\item
|
||||
{\bf NF\_REPEAT}: call this hook again
|
||||
\end{itemize}
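
The interplay of priorities and return values can be illustrated with a small userspace model. This is only a sketch in Python, not kernel code: the class HookRegistry and the callback names are hypothetical, the numeric verdict values follow the kernel headers as far as I remember them, and NF\_REPEAT is modelled by simply calling the same callback again. As in the kernel, a numerically lower priority is called first.

```python
# Userspace model of netfilter hook dispatch (illustrative only, not kernel code).
NF_DROP, NF_ACCEPT, NF_STOLEN, NF_QUEUE, NF_REPEAT = range(5)


class HookRegistry:
    """Hypothetical stand-in for the netfilter core: one callback list per hook."""

    def __init__(self):
        self._hooks = {}  # (protocol, hook) -> list of (priority, callback)

    def register(self, protocol, hook, priority, callback):
        entries = self._hooks.setdefault((protocol, hook), [])
        entries.append((priority, callback))
        # Lower priority value is called first; the sort is stable.
        entries.sort(key=lambda entry: entry[0])

    def run(self, protocol, hook, packet):
        for _priority, callback in self._hooks.get((protocol, hook), []):
            verdict = callback(packet)
            while verdict == NF_REPEAT:      # "call this hook again"
                verdict = callback(packet)
            if verdict != NF_ACCEPT:         # DROP, STOLEN, QUEUE end traversal
                return verdict
        return NF_ACCEPT                     # every callback accepted the packet


calls = []

def conntrack(packet):
    calls.append("conntrack")
    return NF_ACCEPT

def packet_filter(packet):
    calls.append("filter")
    return NF_DROP if packet.get("dport") == 23 else NF_ACCEPT

def late_callback(packet):
    calls.append("late")
    return NF_ACCEPT

registry = HookRegistry()
registry.register("ipv4", "PRE_ROUTING", 0, packet_filter)
registry.register("ipv4", "PRE_ROUTING", -200, conntrack)
registry.register("ipv4", "PRE_ROUTING", 100, late_callback)

telnet_verdict = registry.run("ipv4", "PRE_ROUTING", {"dport": 23})
```

In this model the callback registered with priority 100 never sees the telnet packet, because the filter callback at priority 0 already returned NF\_DROP.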
|
||||
|
||||
\subsubsection{Netfilter hooks within IPv4}
|
||||
|
||||
The IPv4 stack provides five netfilter hooks, which are placed at the
|
||||
following particular places within the code:
|
||||
|
||||
\begin{verbatim}
|
||||
--->[1]--->[ROUTE]--->[3]--->[4]--->
|
||||
| ^
|
||||
| |
|
||||
| [ROUTE]
|
||||
v |
|
||||
[2] [5]
|
||||
| ^
|
||||
| |
|
||||
v |
|
||||
|
||||
local processes
|
||||
\end{verbatim}
|
||||
|
||||
Packets received on any network interface arrive at the left side of the
|
||||
diagram. After the verification of the IP header checksum, the
|
||||
NF\_IP\_PRE\_ROUTING [1] hook is traversed.
|
||||
|
||||
If the packet ``survives'' (i.e. NF\_ACCEPT is returned), it enters the
|
||||
routing code. Where we continue from here depends on the destination of the
|
||||
packet.
|
||||
|
||||
Packets with a local destination (i.e. packets where the destination address is
|
||||
one of the own IP addresses of the host) traverse the NF\_IP\_LOCAL\_IN [2]
|
||||
hook. If all callback functions return NF\_ACCEPT, the packet is finally passed
|
||||
to the socket code, which eventually passes the packet to a local process.
|
||||
|
||||
Packets with a remote destination (i.e. packets which are forwarded by the
|
||||
local machine) traverse the NF\_IP\_FORWARD [3] hook. If they ``survive'',
|
||||
they finally pass the NF\_IP\_POST\_ROUTING [4] hook and are sent off the
|
||||
outgoing network interface.
|
||||
|
||||
Locally generated packets first traverse the NF\_IP\_LOCAL\_OUT [5] hook, then
|
||||
enter the routing code, and finally go through the NF\_IP\_POST\_ROUTING [4]
|
||||
hook before being sent off the outgoing network interface.
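
The three paths described above can be summarised in a few lines of Python. The function name ipv4\_hook\_path is hypothetical, and the sketch assumes that every callback returns NF\_ACCEPT; it also ignores the loopback case, where a locally generated packet is delivered to the local host again.

```python
def ipv4_hook_path(locally_generated, local_destination):
    """Return the IPv4 netfilter hooks a packet traverses, in order.

    Assumes no callback drops, steals or queues the packet.
    """
    if locally_generated:
        # [5] -> routing -> [4]
        return ["NF_IP_LOCAL_OUT", "NF_IP_POST_ROUTING"]
    if local_destination:
        # [1] -> routing -> [2]
        return ["NF_IP_PRE_ROUTING", "NF_IP_LOCAL_IN"]
    # [1] -> routing -> [3] -> [4]
    return ["NF_IP_PRE_ROUTING", "NF_IP_FORWARD", "NF_IP_POST_ROUTING"]


forwarded_path = ipv4_hook_path(locally_generated=False, local_destination=False)
```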
|
||||
|
||||
\subsubsection{Netfilter hooks within IPv6}
|
||||
|
||||
As the IPv4 and IPv6 protocols are very similar, the netfilter hooks within the
|
||||
IPv6 stack are placed at exactly the same locations as in the IPv4 stack. The
|
||||
only difference is the hook names: NF\_IP6\_PRE\_ROUTING, NF\_IP6\_LOCAL\_IN,
|
||||
NF\_IP6\_FORWARD, NF\_IP6\_POST\_ROUTING, NF\_IP6\_LOCAL\_OUT.
|
||||
|
||||
\subsubsection{Netfilter hooks within DECnet}
|
||||
|
||||
There are seven DECnet hooks. The first five hooks (NF\_DN\_PRE\_ROUTING,
|
||||
NF\_DN\_LOCAL\_IN, NF\_DN\_FORWARD, NF\_DN\_LOCAL\_OUT, NF\_DN\_POST\_ROUTING)
|
||||
are pretty much the same as in IPv4. The last two hooks (NF\_DN\_HELLO,
|
||||
NF\_DN\_ROUTE) are used in conjunction with DECnet Hello and Routing packets.
|
||||
|
||||
\subsubsection{Netfilter hooks within ARP}
|
||||
|
||||
Recent kernels\footnote{IIRC, starting with 2.4.19-pre3} have added support for netfilter hooks within the ARP code.
|
||||
There are two hooks: NF\_ARP\_IN and NF\_ARP\_OUT, for incoming and outgoing
|
||||
ARP packets respectively.
|
||||
|
||||
\subsubsection{Netfilter hooks within IPX}
|
||||
|
||||
There have been experimental patches to add netfilter hooks to the IPX code,
|
||||
but they never got integrated into the kernel source.
|
||||
|
||||
\subsection{Packet selection using IP Tables}
|
||||
|
||||
The IP tables core (ip\_tables.o) provides a generic layer for evaluation
|
||||
of rulesets.
|
||||
|
||||
An IP table consists of an arbitrary number of {\it chains}, which in turn
|
||||
consist of a linear list of {\it rules}, which again consist of any
|
||||
number of {\it matches} and one {\it target}.
|
||||
|
||||
{\it Chains} can further be divided into two classes: Either {\it builtin
|
||||
chains} or {\it user-defined chains}. Builtin chains are always present, they
|
||||
are created upon table registration. They are also the entry points for table
|
||||
iteration. User defined chains are created at runtime upon user interaction.
|
||||
|
||||
{\it Matches} specify the matching criteria; there can be zero or more matches per rule.
|
||||
|
||||
{\it Targets} specify the action which is to be executed in case {\bf all}
|
||||
matches match. There can only be a single target per rule.
|
||||
|
||||
Matches and targets can either be {\it builtin} or {\it linux kernel modules}.
|
||||
|
||||
There are two special targets:
|
||||
\begin{itemize}
|
||||
\item
|
||||
By using a chain name as target, it is possible to jump to the respective chain
|
||||
in case the matches match.
|
||||
\item
|
||||
By using the RETURN target, it is possible to return to the previous (calling)
|
||||
chain
|
||||
\end{itemize}
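
How chains, rules, matches and the two special targets fit together can be shown with a small model of table traversal. This is a Python sketch under simplifying assumptions: matches are plain predicates, only ACCEPT and DROP exist as terminating targets, and the names traverse, walk and tcp\_checks are made up for illustration. A rule with zero matches applies to every packet.

```python
ACCEPT, DROP, RETURN = "ACCEPT", "DROP", "RETURN"


def walk(table, chain, packet):
    """Walk one chain; return a verdict, or None to continue in the caller."""
    for matches, target in table[chain]:
        if not all(match(packet) for match in matches):
            continue                      # the target fires only if ALL matches match
        if target in (ACCEPT, DROP):
            return target
        if target == RETURN:
            return None                   # back to the previous (calling) chain
        verdict = walk(table, target, packet)   # chain name as target: jump
        if verdict is not None:
            return verdict
    return None                           # fell off the end of the chain


def traverse(table, builtin_chain, packet, policy=ACCEPT):
    """Start at a builtin chain; apply its policy if no rule decides."""
    verdict = walk(table, builtin_chain, packet)
    return policy if verdict is None else verdict


table = {
    "INPUT": [
        ([lambda p: p["proto"] == "tcp"], "tcp_checks"),   # jump to user chain
        ([], DROP),                                        # no matches: always applies
    ],
    "tcp_checks": [
        ([lambda p: p["dport"] == 22], ACCEPT),
        ([lambda p: p["dport"] == 25], RETURN),
        ([], ACCEPT),
    ],
}

ssh_verdict = traverse(table, "INPUT", {"proto": "tcp", "dport": 22})
smtp_verdict = traverse(table, "INPUT", {"proto": "tcp", "dport": 25})
```

The SMTP packet hits RETURN in the user-defined chain, continues with the next rule of INPUT and is dropped there.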
|
||||
|
||||
The IP tables core handles the following functions
|
||||
\begin{itemize}
|
||||
\item
|
||||
Registering and unregistering tables
|
||||
\item
|
||||
Registering and unregistering matches and targets (can be implemented as linux kernel modules)
|
||||
\item
|
||||
Kernel / userspace interface for manipulation of IP tables
|
||||
\item
|
||||
Traversal of IP tables
|
||||
\end{itemize}
|
||||
|
||||
\subsubsection{Packet filtering using the ``filter'' table}
|
||||
|
||||
Traditional packet filtering (i.e. the successor to ipfwadm/ipchains) takes
|
||||
place in the ``filter'' table. Packet filtering works like a sieve: A packet
|
||||
is (in the end) either dropped or accepted - but never modified.
|
||||
|
||||
The ``filter'' table is implemented in the {\it iptable\_filter.o} module
|
||||
and contains three builtin chains:
|
||||
|
||||
\begin{itemize}
|
||||
\item
|
||||
{\bf INPUT} attaches to NF\_IP\_LOCAL\_IN
|
||||
\item
|
||||
{\bf FORWARD} attaches to NF\_IP\_FORWARD
|
||||
\item
|
||||
{\bf OUTPUT} attaches to NF\_IP\_LOCAL\_OUT
|
||||
\end{itemize}
|
||||
|
||||
The placement of the chains / hooks is done in such a way that every conceivable
|
||||
packet always traverses only one of the built-in chains. Packets destined for
|
||||
the local host traverse only INPUT, packets forwarded only FORWARD and
|
||||
locally-originated packets only OUTPUT.
|
||||
|
||||
\subsubsection{Packet mangling using the ``mangle'' table}
|
||||
|
||||
As stated above, operations which would modify a packet do not belong in the
|
||||
``filter'' table. The ``mangle'' table is available for all kinds of packet
|
||||
manipulation - but not manipulation of addresses (which is NAT).
|
||||
|
||||
The mangle table attaches to all five netfilter hooks and provides the
|
||||
respective builtin chains (PREROUTING, INPUT, FORWARD, OUTPUT, POSTROUTING)
|
||||
\footnote{This has changed through recent 2.4.x kernel series, old kernels may
|
||||
only support three (PREROUTING, POSTROUTING, OUTPUT) chains.}.
|
||||
|
||||
\subsection{Connection Tracking Subsystem}
|
||||
|
||||
Traditional packet filters can only match on matching criteria within the
|
||||
currently processed packet, like source/destination IP address, port numbers,
|
||||
TCP flags, etc. As most applications have a notion of connections or at least
|
||||
a request/response style protocol, there is a lot of information which can not
|
||||
be derived from looking at a single packet.
|
||||
|
||||
Thus, modern (stateful) packet filters attempt to track connections (flows)
|
||||
and their respective protocol states for all traffic through the packet
|
||||
filter.
|
||||
|
||||
Connection tracking within linux is implemented as a netfilter module, called
|
||||
ip\_conntrack.o.
|
||||
|
||||
Before describing the connection tracking subsystem, we need to describe a couple of definitions and primitives used throughout the conntrack code.
|
||||
|
||||
A connection is represented within the conntrack subsystem using {\it struct
|
||||
ip\_conntrack}, also called {\it connection tracking entry}.
|
||||
|
||||
Connection tracking is utilizing {\it conntrack tuples}, which are tuples
|
||||
consisting of (srcip, srcport, dstip, dstport, l4prot). A connection is
|
||||
uniquely identified by two tuples: The tuple in the original direction
|
||||
(IP\_CT\_DIR\_ORIGINAL) and the tuple for the reply direction
|
||||
(IP\_CT\_DIR\_REPLY).
|
||||
|
||||
Connection tracking itself does not drop packets\footnote{well, in some rare
|
||||
cases in combination with NAT it needs to drop. But don't tell anyone, this is
|
||||
secret.} or impose any policy. It just associates every packet with a
|
||||
connection tracking entry, which in turn has a particular state. All other
|
||||
kernel code can use this state information\footnote{state information is
|
||||
internally represented via the {\it struct sk\_buff.nfct} structure member of a
|
||||
packet.}.
|
||||
|
||||
\subsubsection{Integration of conntrack with netfilter}
|
||||
|
||||
If the ip\_conntrack.o module is registered with netfilter, it attaches to the
|
||||
NF\_IP\_PRE\_ROUTING, NF\_IP\_POST\_ROUTING, NF\_IP\_LOCAL\_IN and
|
||||
NF\_IP\_LOCAL\_OUT hooks.
|
||||
|
||||
Because forwarded packets are the most common case on firewalls, I will only
|
||||
describe how connection tracking works for forwarded packets. The two relevant
|
||||
hooks for forwarded packets are NF\_IP\_PRE\_ROUTING and NF\_IP\_POST\_ROUTING.
|
||||
|
||||
Every time a packet arrives at the NF\_IP\_PRE\_ROUTING hook, connection
|
||||
tracking creates a conntrack tuple from the packet. It then compares this
|
||||
tuple to the original and reply tuples of all already-seen connections
|
||||
\footnote{Of course this is not implemented as a linear search over all existing connections.} to find out if this just-arrived packet belongs to any existing
|
||||
connection. If there is no match, a new conntrack table entry (struct
|
||||
ip\_conntrack) is created.
|
||||
|
||||
Let's assume the case where we have no existing connections yet and are
|
||||
starting from scratch.
|
||||
|
||||
The first packet comes in, we derive the tuple from the packet headers, look up
|
||||
the conntrack hash table, don't find any matching entry. As a result, we
|
||||
create a new struct ip\_conntrack. This struct ip\_conntrack is filled with
|
||||
all necessary data, like the original and reply tuple of the connection.
|
||||
How do we know the reply tuple? By inverting the source and destination
|
||||
parts of the original tuple.\footnote{So why do we need two tuples, if they can
|
||||
be derived from each other? Wait until we discuss NAT.}
|
||||
Please note that this new struct ip\_conntrack is {\bf not} yet placed
|
||||
into the conntrack hash table.
|
||||
|
||||
The packet is now passed on to other callback functions which have registered
|
||||
with a lower priority at NF\_IP\_PRE\_ROUTING. It then continues traversal of
|
||||
the network stack as usual, including all respective netfilter hooks.
|
||||
|
||||
If the packet survives (i.e. is not dropped by the routing code, network stack,
|
||||
firewall ruleset, ...), it re-appears at NF\_IP\_POST\_ROUTING. In this case,
|
||||
we can now safely assume that this packet will be sent off on the outgoing
|
||||
interface, and thus put the connection tracking entry which we created at
|
||||
NF\_IP\_PRE\_ROUTING into the conntrack hash table. This process is called
|
||||
{\it confirming the conntrack}.
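
The life cycle described above (tuple lookup, creation of an unconfirmed entry, confirmation at NF\_IP\_POST\_ROUTING) can be sketched as a small Python model. All names here (Tuple5, Conntrack, ConntrackTable, seen\_reply) are hypothetical, a plain dictionary stands in for the conntrack hash table, and the state logic is reduced to NEW and ESTABLISHED.

```python
from collections import namedtuple

# Conntrack tuple as described in the text: (srcip, srcport, dstip, dstport, l4prot)
Tuple5 = namedtuple("Tuple5", "src sport dst dport proto")


def invert(t):
    """Derive the reply tuple by swapping the source and destination parts."""
    return Tuple5(t.dst, t.dport, t.src, t.sport, t.proto)


class Conntrack:
    """Simplified stand-in for struct ip_conntrack."""

    def __init__(self, original):
        self.original = original
        self.reply = invert(original)
        self.confirmed = False
        self.seen_reply = False


class ConntrackTable:
    def __init__(self):
        self._hash = {}  # tuple -> Conntrack; holds confirmed entries only

    def pre_routing(self, t):
        """Associate a packet's tuple with an entry and return (entry, state)."""
        entry = self._hash.get(t)
        if entry is None:
            return Conntrack(t), "NEW"        # created, but NOT yet in the table
        if t == entry.reply:
            entry.seen_reply = True
        return entry, ("ESTABLISHED" if entry.seen_reply else "NEW")

    def post_routing(self, entry):
        """Confirm the conntrack: the packet survived, so store the entry."""
        if not entry.confirmed:
            self._hash[entry.original] = entry
            self._hash[entry.reply] = entry
            entry.confirmed = True


table = ConntrackTable()
query = Tuple5("10.0.0.2", 1024, "192.0.2.53", 53, "udp")

entry, first_state = table.pre_routing(query)    # first packet of the flow
table.post_routing(entry)                        # it was not dropped on the way
reply_entry, reply_state = table.pre_routing(invert(query))
```

If the first packet had been dropped by the filter rules, post\_routing would never have been called and the unconfirmed entry would simply have been discarded.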
|
||||
|
||||
The connection tracking code itself is not monolithic, but consists of a
|
||||
couple of separate modules\footnote{They don't actually have to be separate
|
||||
kernel modules; e.g. TCP, UDP and ICMP tracking modules are all part of
|
||||
the linux kernel module ip\_conntrack.o}. Besides the conntrack core, there
|
||||
are two important kinds of modules: Protocol helpers and application helpers.
|
||||
|
||||
Protocol helpers implement the layer-4-protocol specific parts. They currently
|
||||
exist for TCP, UDP and ICMP (an experimental helper for GRE exists).
|
||||
|
||||
\subsubsection{TCP connection tracking}
|
||||
|
||||
As TCP is a connection oriented protocol, it is not very difficult to imagine
|
||||
how connection tracking for this protocol could work. There are well-defined
|
||||
state transitions possible, and conntrack can decide which state transitions
|
||||
are valid within the TCP specification. In reality it's not all that easy,
|
||||
since we cannot assume that all packets that pass the packet filter actually
|
||||
arrive at the receiving end, ...
|
||||
|
||||
It is noteworthy that the standard connection tracking code does {\bf not}
|
||||
do TCP sequence number and window tracking. A well-maintained patch to add
|
||||
this feature exists almost as long as connection tracking itself. It will
|
||||
be integrated with the 2.5.x kernel. The problem with window tracking is
|
||||
its bad interaction with connection pickup. The TCP conntrack code is able to
|
||||
pick up already existing connections, e.g. in case your firewall was rebooted.
|
||||
However, connection pickup is conflicting with TCP window tracking: The TCP
|
||||
window scaling option is only transferred at connection setup time, and we
|
||||
don't know about it in case of pickup...
|
||||
|
||||
\subsubsection{ICMP tracking}
|
||||
|
||||
ICMP is not really a connection oriented protocol. So how is it possible to
|
||||
do connection tracking for ICMP?
|
||||
|
||||
The ICMP protocol can be split in two groups of messages
|
||||
|
||||
\begin{itemize}
|
||||
\item
|
||||
ICMP error messages, which sort-of belong to a different connection.
|
||||
ICMP error messages are classified as {\it RELATED} to a different connection.
|
||||
(ICMP\_DEST\_UNREACH, ICMP\_SOURCE\_QUENCH, ICMP\_TIME\_EXCEEDED,
|
||||
ICMP\_PARAMETERPROB, ICMP\_REDIRECT).
|
||||
\item
|
||||
ICMP queries, which have a request->reply character. So what the conntrack
|
||||
code does, is let the request have a state of {\it NEW}, and the reply
|
||||
{\it ESTABLISHED}. The reply closes the connection immediately.
|
||||
(ICMP\_ECHO, ICMP\_TIMESTAMP, ICMP\_INFO\_REQUEST, ICMP\_ADDRESS)
|
||||
\end{itemize}
|
||||
|
||||
\subsubsection{UDP connection tracking}
|
||||
|
||||
UDP is designed as a connectionless datagram protocol. But most common
|
||||
protocols using UDP as layer 4 protocol have bi-directional UDP communication.
|
||||
Imagine a DNS query, where the client sends a UDP packet to port 53 of the
|
||||
nameserver, and the nameserver sends back a DNS reply packet from its UDP
|
||||
port 53 to the client.
|
||||
|
||||
Netfilter treats this as a connection. The first packet (the DNS request) is
|
||||
assigned a state of {\it NEW}, because the packet is expected to create a new
|
||||
'connection'. The DNS server's reply packet is marked as {\it ESTABLISHED}.
|
||||
|
||||
\subsubsection{conntrack application helpers}
|
||||
|
||||
More complex application protocols involving multiple connections need special
|
||||
support by a so-called ``conntrack application helper module''. Modules in
|
||||
the stock kernel come for FTP and IRC(DCC). Netfilter CVS currently contains
|
||||
patches for PPTP, H.323, Eggdrop botnet, TFTP and talk. We're still lacking
|
||||
a lot of protocols (e.g. SIP, SMB/CIFS) - but they are unlikely to appear
|
||||
until somebody really needs them and either develops them on his own or
|
||||
funds development.
|
||||
|
||||
\subsubsection{Integration of connection tracking with iptables}
|
||||
|
||||
As stated earlier, conntrack doesn't impose any policy on packets. It just
|
||||
determines the relation of a packet to already existing connections. To base
|
||||
packet filtering decisions on this state information, the iptables {\it state}
|
||||
match can be used. Every packet is within one of the following categories:
|
||||
|
||||
\begin{itemize}
|
||||
\item
|
||||
{\bf NEW}: packet would create a new connection, if it survives
|
||||
\item
|
||||
{\bf ESTABLISHED}: packet is part of an already established connection
|
||||
(either direction)
|
||||
\item
|
||||
{\bf RELATED}: packet is in some way related to an already established connection, e.g. ICMP errors or FTP data sessions
|
||||
\item
|
||||
{\bf INVALID}: conntrack is unable to derive conntrack information from this packet. Please note that all multicast or broadcast packets fall in this category.
|
||||
\end{itemize}
|
||||
|
||||
\subsection{NAT Subsystem}
|
||||
|
||||
The NAT (Network Address Translation) subsystem is probably the worst
|
||||
documented subsystem within the whole framework. This has two reasons: NAT is
|
||||
nasty and complicated. The Linux 2.4.x NAT implementation is easy to use, so
|
||||
nobody needs to know the nasty details.
|
||||
|
||||
Nonetheless, as I was traditionally concentrating most on the conntrack and NAT
|
||||
systems, I will give a short overview.
|
||||
|
||||
NAT uses almost all of the previously described subsystems:
|
||||
\begin{itemize}
|
||||
\item
|
||||
IP tables to specify which packets to NAT in which particular way. NAT
|
||||
registers a ``nat'' table with PREROUTING, POSTROUTING and OUTPUT chains.
|
||||
\item
|
||||
Connection tracking to associate NAT state with the connection.
|
||||
\item
|
||||
Netfilter to do the actual packet manipulation transparently to the rest of the
|
||||
kernel. NAT registers with NF\_IP\_PRE\_ROUTING, NF\_IP\_POST\_ROUTING,
|
||||
NF\_IP\_LOCAL\_IN and NF\_IP\_LOCAL\_OUT.
|
||||
\end{itemize}
|
||||
|
||||
The NAT implementation supports all kinds of NAT: Source NAT,
|
||||
Destination NAT, NAT to address/port ranges, 1:1 NAT, ...
|
||||
|
||||
This fundamental design principle is still frequently misunderstood:\\
|
||||
The information about which NAT mappings apply to a certain connection
|
||||
is only gathered once - with the first packet of every connection.
|
||||
|
||||
So let's start to look at the life of a poor to-be-nat'ed packet.
|
||||
For ease of understanding, I have chosen to describe the most frequently
|
||||
used NAT scenario: Source NAT of a forwarded packet. Let's assume the
|
||||
packet has an original source address of 1.1.1.1, an original destination
|
||||
address of 2.2.2.2, and is going to be SNAT'ed to 9.9.9.9. Let's further
|
||||
ignore the fact that there are port numbers.
|
||||
|
||||
Once upon a time, our poor packet arrives at NF\_IP\_PRE\_ROUTING, where
|
||||
conntrack has registered with highest priority. This means that a conntrack
|
||||
entry with the following two tuples is created:
|
||||
\begin{verbatim}
|
||||
IP_CT_DIR_ORIGINAL: 1.1.1.1 -> 2.2.2.2
|
||||
IP_CT_DIR_REPLY: 2.2.2.2 -> 1.1.1.1
|
||||
\end{verbatim}
|
||||
After conntrack, the packet traverses the PREROUTING chain of the ``nat''
|
||||
IP table. Since only destination NAT happens at PREROUTING, no action
|
||||
occurs. After its lengthy trip through the rest of the network stack,
|
||||
the packet arrives at the NF\_IP\_POST\_ROUTING hook, where it traverses
|
||||
the POSTROUTING chain of the ``nat'' table. Here it hits a SNAT rule,
|
||||
causing the following actions:
|
||||
\begin{itemize}
|
||||
\item
|
||||
Fill in a {\it struct ip\_nat\_manip}, indicating the new source address
|
||||
and the type of NAT (source NAT at POSTROUTING). This struct is part of the
|
||||
conntrack entry.
|
||||
\item
|
||||
Automatically derive the inverse NAT transformation for the reply packets:
|
||||
Destination NAT at PREROUTING. Fill in another {\it struct ip\_nat\_manip}.
|
||||
\item
|
||||
Alter the REPLY tuple of the conntrack entry to
|
||||
\begin{verbatim}
|
||||
IP_CT_DIR_REPLY: 2.2.2.2 -> 9.9.9.9
|
||||
\end{verbatim}
|
||||
\item
|
||||
Apply the SNAT transformation to the packet
|
||||
\end{itemize}
|
||||
|
||||
Every other packet within this connection, independent of its direction,
|
||||
will only execute the last step. Since all NAT information is connected
|
||||
with the conntrack entry, there is no need to do anything but to apply
|
||||
the same transformations to all packets within the same connection.
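
The SNAT walk-through with the addresses 1.1.1.1, 2.2.2.2 and 9.9.9.9 can be replayed in a few lines of Python. This is a sketch only: port numbers are ignored as in the text, the names Tuple2, setup\_snat and apply\_nat are hypothetical, and a small dictionary stands in for the two struct ip\_nat\_manip instances.

```python
from collections import namedtuple

Tuple2 = namedtuple("Tuple2", "src dst")   # ports are ignored, as in the text


class Conntrack:
    """Simplified conntrack entry carrying the NAT manipulations."""

    def __init__(self, original):
        self.original = original
        self.reply = Tuple2(original.dst, original.src)
        self.manips = {}   # direction -> (field to rewrite, new address)


def setup_snat(ct, new_src):
    """Executed once, for the first packet of the connection."""
    ct.manips["original"] = ("src", new_src)          # source NAT at POSTROUTING
    ct.manips["reply"] = ("dst", ct.original.src)     # inverse: DNAT at PREROUTING
    ct.reply = ct.reply._replace(dst=new_src)         # replies now go to new_src


def apply_nat(ct, packet):
    """Executed for every packet of the connection, in either direction."""
    if packet == ct.original:
        direction = "original"
    elif packet == ct.reply:
        direction = "reply"
    else:
        raise ValueError("packet does not belong to this connection")
    field, address = ct.manips[direction]
    return packet._replace(**{field: address})


ct = Conntrack(Tuple2("1.1.1.1", "2.2.2.2"))
setup_snat(ct, "9.9.9.9")

outgoing = apply_nat(ct, Tuple2("1.1.1.1", "2.2.2.2"))   # leaves as 9.9.9.9 -> 2.2.2.2
incoming = apply_nat(ct, Tuple2("2.2.2.2", "9.9.9.9"))   # delivered as 2.2.2.2 -> 1.1.1.1
```

Because the reply tuple was altered to 2.2.2.2 -> 9.9.9.9, the answer from 2.2.2.2 is recognised as belonging to the same connection even though it is addressed to 9.9.9.9. This is the reason a conntrack entry stores two tuples instead of deriving one from the other.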
|
||||
|
||||
\subsection{IPv6 Firewalling with ip6tables}
|
||||
|
||||
Yes, Linux 2.4.x comes with a usable, though incomplete system to secure
|
||||
your IPv6 network.
|
||||
|
||||
The parts ported to IPv6 are
|
||||
\begin{itemize}
|
||||
\item
|
||||
IP tables (called IP6 tables)
|
||||
\item
|
||||
The ``filter'' table
|
||||
\item
|
||||
The ``mangle'' table
|
||||
\item
|
||||
The userspace library (libip6tc)
|
||||
\item
|
||||
The command line tool (ip6tables)
|
||||
\end{itemize}
|
||||
|
||||
Due to the lack of conntrack and NAT\footnote{for god's sake we don't have NAT
|
||||
with IPv6}, only traditional, stateless packet filtering is possible. Apart
|
||||
from the obvious matches/targets, ip6tables can match on
|
||||
\begin{itemize}
|
||||
\item
|
||||
{\it EUI64 checker}; verifies if the MAC address of the sender is the same as in the EUI64 64 least significant bits of the source IPv6 address
|
||||
\item
|
||||
{\it frag6 match}, matches on IPv6 fragmentation header
|
||||
\item
|
||||
{\it route6 match}, matches on IPv6 routing header
|
||||
\item
|
||||
{\it ahesp6 match}, matches on SPIs within AH or ESP over IPv6 packets
|
||||
\end{itemize}
|
||||
|
||||
However, the ip6tables code doesn't seem to be used very widely (yet?).
|
||||
So please expect some potential remaining issues, since it is not tested
|
||||
as heavily as iptables.
|
||||
|
||||
\subsection{Recent Development}
|
||||
|
||||
Please refer to the spoken word at the presentation. Development at the
|
||||
time this paper was written can be quite different from development at the
|
||||
time the presentation is held.
|
||||
|
||||
\section{Thanks}
|
||||
|
||||
I'd like to thank
|
||||
\begin{itemize}
|
||||
\item
|
||||
{\it Linus Torvalds} for starting this interesting UNIX-like kernel
|
||||
\item
|
||||
{\it Alan Cox, David Miller, Alexey Kuznetsov, Andi Kleen} for building
|
||||
(one of?) the world's best TCP/IP stacks.
|
||||
\item
|
||||
{\it Paul ``Rusty'' Russell} for starting the netfilter/iptables project
|
||||
\item
|
||||
{\it The Netfilter Core Team} for continuing the netfilter/iptables effort
|
||||
\item
|
||||
{\it Astaro AG} for partially funding my current netfilter/iptables work
|
||||
\item
|
||||
{\it Conectiva Inc.} for partially funding parts of my past netfilter/iptables
|
||||
work and for inviting me to live in Brazil
|
||||
\item
|
||||
{\it samba.org and Kommunikationsnetz Franken e.V.} for hosting the netfilter
|
||||
homepage, CVS, mailing lists, ...
|
||||
\end{itemize}
|
||||
|
||||
\end{document}
|
|
@ -0,0 +1,50 @@
|
|||
Firewalling with netfilter/iptables on Linux 2.4.x
|
||||
|
||||
The Linux 2.4.x kernel offers an advanced infrastructure, called
netfilter, on top of which a packet filter, NAT and other
packet manipulations are implemented.
|
||||
|
||||
The entire firewalling subsystem was redeveloped from scratch compared to kernel 2.2.x.
The netfilter/iptables system makes everything that previously existed on Linux
(ipfwadm, ipchains) look like a relic of the distant past.
|
||||
|
||||
Besides the traditional packet filter, netfilter/iptables optionally offers
connection tracking, which makes it easy to build a stateful
firewall. The NAT (Network Address Translation) system is now
also flexible enough to provide all forms of NAT:
source NAT, destination NAT, static NAT, NAPT, ...
|
||||
|
||||
The high degree of modularity makes the firewalling system very easy
to extend, so that new extensions can be developed
with little effort.
|
||||
|
||||
The talk describes the different parts of the netfilter/iptables
system and thereby gives an overview of its capabilities and
application scenarios. It covers the following topics:
|
||||
|
||||
- netfilter/iptables architecture
|
||||
- netfilter hooks in the network stack
|
||||
- IP tables as rule description
|
||||
- Packet filter
|
||||
- Connection Tracking
|
||||
- Network Address Translation
|
||||
- source NAT
|
||||
- destination NAT
|
||||
- Masquerading
|
||||
- transparent proxy support
|
||||
- Packet mangling
|
||||
- Userspace packet queuing
|
||||
- Userspace packet logging
|
||||
|
||||
|
||||
Prerequisites:
|
||||
- Knowledge of TCP/IP and routing
|
||||
- Firewalling basics (packet filters in particular)
|
||||
- Some basic knowledge of the Linux/Unix architecture
|
||||
|
||||
|
||||
About the speaker:
|
||||
Harald Welte has been an active KNF member since 1995 and is currently
the deputy technical contact of KNF. He is the maintainer of the
netfilter/iptables firewalling subsystem in the Linux 2.4.x and
2.5.x kernels and played a major role in its development.
|
|
@ -0,0 +1,466 @@
|
|||
%include "default.mgp"
|
||||
%default 1 bgrad
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
%nodefault
|
||||
%back "blue"
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
The netfilter/iptables framework in
|
||||
Linux 2.4.x
|
||||
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <laforge@gnumonks.org>
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Contents
|
||||
|
||||
|
||||
Introduction
|
||||
Netfilter hooks in protocol stacks
|
||||
Packet selection based on IP Tables
|
||||
The Connection Tracking Subsystem
|
||||
The NAT Subsystem based on netfilter + iptables
|
||||
Packet filtering using the 'filter' table
|
||||
Packet mangling using the 'mangle' table
|
||||
Advanced netfilter concepts
|
||||
Current development and Future
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Introduction
|
||||
|
||||
Why did we need netfilter/iptables?
|
||||
Because ipchains...
|
||||
|
||||
has no infrastructure for passing packets to userspace
|
||||
makes transparent proxying extremely difficult
|
||||
has interface address dependent Packet filter rules
|
||||
has Masquerading implemented as part of packet filtering
|
||||
code is too complex and intermixed with core ipv4 stack
|
||||
is neither modular nor extensible
|
||||
only barely supports one special case of NAT (masquerading)
|
||||
has only stateless packet filtering
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Introduction
|
||||
|
||||
Who's behind netfilter/iptables
|
||||
Paul 'Rusty' Russell
|
||||
co-author of ipchains in Linux 2.2
|
||||
was paid by Watchguard for about one year of development
|
||||
James Morris
|
||||
userspace queuing (kernel, library and tools)
|
||||
REJECT target
|
||||
Marc Boucher
|
||||
NAT and packet filtering controlled by one command
|
||||
Mangle table
|
||||
Harald Welte
|
||||
Conntrack+NAT helper infrastructure (newnat)
|
||||
Userspace packet logging (ULOG)
|
||||
PPTP and IRC conntrack/NAT helpers
|
||||
Jozsef Kadlecsik
|
||||
TCP window tracking
|
||||
H.323 conntrack + NAT helper
|
||||
Continued newnat development
|
||||
Non-core team contributors
|
||||
http://www.netfilter.org/scoreboard/
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Netfilter Hooks
|
||||
|
||||
What is netfilter?
|
||||
|
||||
System of callback functions within network stack
|
||||
Callback function to be called for every packet traversing a certain point (hook) within the network stack
|
||||
Protocol independent framework
|
||||
Hooks in layer 3 stacks (IPv4, IPv6, DECnet, ARP)
|
||||
Multiple kernel modules can register with each of the hooks
|
||||
Asynchronous packet handling in userspace (ip_queue)
|
||||
|
||||
Traditional packet filtering, NAT, ... is implemented on top of this framework
|
||||
|
||||
Can be used for other stuff interfacing with the core network stack, like DECnet routing daemon.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Netfilter Hooks
|
||||
|
||||
Netfilter architecture in IPv4
|
||||
%font "courier"
|
||||
|
||||
--->[1]--->[ROUTE]--->[3]--->[4]--->
|
||||
| ^
|
||||
| |
|
||||
| [ROUTE]
|
||||
v |
|
||||
[2] [5]
|
||||
| ^
|
||||
| |
|
||||
v |
|
||||
|
||||
%font "standard"
|
||||
1=NF_IP_PRE_ROUTING
|
||||
2=NF_IP_LOCAL_IN
|
||||
3=NF_IP_FORWARD
|
||||
4=NF_IP_POST_ROUTING
|
||||
5=NF_IP_LOCAL_OUT
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Netfilter Hooks
|
||||
|
||||
Netfilter Hooks
|
||||
|
||||
Any kernel module may register a callback function at any of the hooks
|
||||
|
||||
The module has to return one of the following constants
|
||||
|
||||
NF_ACCEPT continue traversal as normal
|
||||
NF_DROP drop the packet, do not continue
|
||||
NF_STOLEN I've taken over the packet, do not continue
|
||||
NF_QUEUE enqueue packet to userspace
|
||||
NF_REPEAT call this hook again
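
The verdict handling above can be sketched in a few lines of Python. This is a userspace toy, not kernel code: run_hook, the Verdict enum and the numeric priorities are assumptions made for illustration, and giving up after a few NF_REPEAT rounds is a safety net of this sketch only.

```python
from enum import Enum


class Verdict(Enum):
    ACCEPT = "NF_ACCEPT"
    DROP = "NF_DROP"
    STOLEN = "NF_STOLEN"
    QUEUE = "NF_QUEUE"
    REPEAT = "NF_REPEAT"


def run_hook(callbacks, packet, max_repeats=4):
    """Call every (priority, function) pair registered at one hook.

    Callbacks run in ascending priority order. Traversal continues only
    while callbacks answer ACCEPT; any other verdict ends it at once.
    """
    for _priority, func in sorted(callbacks, key=lambda entry: entry[0]):
        for _ in range(max_repeats + 1):
            verdict = func(packet)
            if verdict is not Verdict.REPEAT:
                break          # a final answer from this callback
        else:
            # Assumption of this sketch: drop instead of looping forever.
            verdict = Verdict.DROP
        if verdict is not Verdict.ACCEPT:
            return verdict     # DROP, STOLEN or QUEUE: stop here
    return Verdict.ACCEPT
```

A callback registered with a lower priority value sees the packet first, so a module that only counts packets can sit in front of a filter that may drop them.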
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
IP tables
|
||||
|
||||
Packet selection using IP tables
|
||||
|
||||
The kernel provides generic IP tables support
|
||||
|
||||
Each kernel module may create its own IP table
|
||||
|
||||
The three major parts of 2.4 firewalling subsystem are implemented using IP tables
|
||||
Packet filtering table 'filter'
|
||||
NAT table 'nat'
|
||||
Packet mangling table 'mangle'
|
||||
|
||||
Can potentially be used for other stuff, e.g. IPsec SPDB
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
IP Tables
|
||||
|
||||
Managing chains and tables
|
||||
|
||||
An IP table consists of multiple chains
|
||||
A chain consists of a list of rules
|
||||
Every single rule in a chain consists of
|
||||
match[es] (rule executed if all matches true)
|
||||
target (what to do if the rule is matched)
|
||||
|
||||
%size 4
|
||||
matches and targets can either be builtin or implemented as kernel modules
|
||||
|
||||
%size 6
|
||||
The userspace tool iptables is used to control IP tables
|
||||
handles all different kinds of IP tables
|
||||
supports a plugin/shlib interface for target/match specific options
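
The table / chain / rule structure described on this slide can be modelled roughly as below. This is a minimal sketch in Python under stated assumptions: a table is a plain dictionary of chains, a rule is a pair of (list of match functions, target name), and the names traverse and evaluate are hypothetical. The default policy of a built-in chain is not mentioned on the slide and is added here only so the sketch always yields a verdict.

```python
TERMINAL = {"ACCEPT", "DROP", "QUEUE"}


def traverse(table, chain, packet):
    """Walk one chain; return a terminal target or None if none applied."""
    for matches, target in table[chain]:
        # A rule is executed only if all of its matches are true.
        if not all(match(packet) for match in matches):
            continue
        if target in TERMINAL:
            return target
        if target == "RETURN":
            return None                      # back to the calling chain
        # Any other target name is treated as a user-defined chain.
        verdict = traverse(table, target, packet)
        if verdict is not None:
            return verdict
    return None


def evaluate(table, policies, chain, packet):
    """Verdict for a built-in chain, falling back to its default policy."""
    return traverse(table, chain, packet) or policies[chain]
```

A rule with an empty match list matches every packet, which is why an unconditional RETURN or DROP at the end of a chain works as a catch-all.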
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
IP Tables
|
||||
|
||||
Basic iptables commands
|
||||
|
||||
To build a complete iptables command, we must specify
|
||||
which table to work with
|
||||
which chain in this table to use
|
||||
an operation (insert, add, delete, modify)
|
||||
one or more matches (optional)
|
||||
a target
|
||||
|
||||
The syntax is
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t table -Operation chain -j target match(es)
|
||||
%font "standard"
|
||||
%size 5
|
||||
|
||||
Example:
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t filter -A INPUT -j ACCEPT -p tcp --dport smtp
|
||||
%font "standard"
|
||||
%size 5
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
IP Tables
|
||||
|
||||
Matches
|
||||
Basic matches
|
||||
-p protocol (tcp/udp/icmp/...)
|
||||
-s source address (ip/mask)
|
||||
-d destination address (ip/mask)
|
||||
-i incoming interface
|
||||
-o outgoing interface
|
||||
|
||||
Match extensions (examples)
|
||||
tcp/udp TCP/udp source/destination port
|
||||
icmp ICMP code/type
|
||||
ah/esp AH/ESP SPI match
|
||||
mac source MAC address
|
||||
mark nfmark
|
||||
length match on length of packet
|
||||
limit rate limiting (n packets per timeframe)
|
||||
owner owner uid of the socket sending the packet
|
||||
tos TOS field of IP header
|
||||
ttl TTL field of IP header
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
IP Tables
|
||||
|
||||
Targets
|
||||
very dependent on the particular table.
|
||||
|
||||
Table specific targets will be discussed later
|
||||
|
||||
Generic Targets, always available
|
||||
ACCEPT accept packet within chain
|
||||
DROP silently drop packet
|
||||
QUEUE enqueue packet to userspace
|
||||
LOG log packet via syslog
|
||||
ULOG log packet via ulogd
|
||||
RETURN return to previous (calling) chain
|
||||
foobar jump to user defined chain
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Packet Filtering
|
||||
|
||||
Overview
|
||||
|
||||
Implemented as 'filter' table
|
||||
Registers with three netfilter hooks
|
||||
|
||||
NF_IP_LOCAL_IN (packets destined for the local host)
|
||||
NF_IP_FORWARD (packets forwarded by local host)
|
||||
NF_IP_LOCAL_OUT (packets from the local host)
|
||||
|
||||
Each of the three hooks has one chain attached (INPUT, FORWARD, OUTPUT)
|
||||
|
||||
Every packet passes exactly one of the three chains. Note that this is very different compared to the old 2.2.x ipchains behaviour.
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Packet Filtering
|
||||
|
||||
Targets available within 'filter' table
|
||||
|
||||
Builtin Targets to be used in filter table
|
||||
ACCEPT accept the packet
|
||||
DROP silently drop the packet
|
||||
QUEUE enqueue packet to userspace
|
||||
RETURN return to previous (calling) chain
|
||||
foobar user defined chain
|
||||
|
||||
Targets implemented as loadable modules
|
||||
REJECT drop the packet but inform sender
|
||||
MIRROR change source/destination IP and resend
|
||||
LOG log via syslog
|
||||
ULOG log via userspace
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Connection tracking...
|
||||
|
||||
implemented separately from NAT
|
||||
enables stateful filtering
|
||||
implementation
|
||||
hooks into NF_IP_PRE_ROUTING to track packets
|
||||
hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
|
||||
protocol modules (currently TCP/UDP/ICMP)
|
||||
application helpers (currently FTP, IRC, H.323, talk, SNMP)
|
||||
divides packets into the following four categories
|
||||
NEW - would establish new connection
|
||||
ESTABLISHED - part of already established connection
|
||||
RELATED - is related to established connection
|
||||
INVALID - (multicast, errors...)
|
||||
does _NOT_ filter packets itself
|
||||
can be utilized by iptables using the 'state' match
|
||||
is used by NAT Subsystem
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Network Address Translation
|
||||
|
||||
Overview
|
||||
|
||||
Previous Linux Kernels only implemented one special case of NAT: Masquerading
|
||||
Linux 2.4.x can do any kind of NAT.
|
||||
NAT subsystem implemented on top of netfilter, iptables and conntrack
|
||||
NAT subsystem registers with all five netfilter hooks
|
||||
'nat' Table registers chains PREROUTING, POSTROUTING and OUTPUT
|
||||
Following targets available within 'nat' Table
|
||||
SNAT changes the packet's source while passing NF_IP_POST_ROUTING
|
||||
DNAT changes the packet's destination while passing NF_IP_PRE_ROUTING
|
||||
MASQUERADE is a special case of SNAT
|
||||
REDIRECT is a special case of DNAT
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Network Address Translation
|
||||
|
||||
Source NAT
|
||||
SNAT Example:
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t nat -A POSTROUTING -j SNAT --to-source 1.2.3.4 -s 10.0.0.0/8
|
||||
%font "standard"
|
||||
%size 4
|
||||
|
||||
MASQUERADE Example:
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t nat -A POSTROUTING -j MASQUERADE -o ppp0
|
||||
%font "standard"
|
||||
%size 5
|
||||
|
||||
Destination NAT
|
||||
DNAT example
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t nat -A PREROUTING -j DNAT --to-destination 1.2.3.4:8080 -p tcp --dport 80 -i eth1
|
||||
%font "standard"
|
||||
%size 4
|
||||
|
||||
REDIRECT example
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t nat -A PREROUTING -j REDIRECT --to-port 3128 -i eth1 -p tcp --dport 80
|
||||
%font "standard"
|
||||
%size 5
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Packet Mangling
|
||||
|
||||
Purpose of mangle table
|
||||
packet manipulation except address manipulation
|
||||
|
||||
Integration with netfilter
|
||||
'mangle' table hooks in all five netfilter hooks
|
||||
priority: after conntrack
|
||||
|
||||
Targets specific to the 'mangle' table:
|
||||
DSCP - manipulate DSCP field
|
||||
IPV4OPTSSTRIP - strip IPv4 options
|
||||
MARK - change the nfmark field of the skb
|
||||
TCPMSS - set TCP MSS option
|
||||
TOS - manipulate the TOS bits
|
||||
TTL - set / increase / decrease TTL field
|
||||
|
||||
Simple example:
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t mangle -A PREROUTING -j MARK --set-mark 10 -p tcp --dport 80
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Advanced Netfilter concepts
|
||||
|
||||
%size 4
|
||||
Userspace logging
|
||||
flexible replacement for old syslog-based logging
|
||||
packets to userspace via multicast netlink sockets
|
||||
easy-to-use library (libipulog)
|
||||
plugin-extensible userspace logging daemon (ulogd)
|
||||
Can even be used to directly log into MySQL
|
||||
|
||||
Queuing
|
||||
reliable asynchronous packet handling
|
||||
packets to userspace via unicast netlink socket
|
||||
easy-to-use library (libipq)
|
||||
provides Perl bindings
|
||||
experimental queue multiplex daemon (ipqmpd)
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Current Development and Future
|
||||
|
||||
Netfilter (although it proved very stable) is still work in progress.
|
||||
|
||||
Areas of current development
|
||||
infrastructure for conntrack manipulation from userspace
|
||||
failover of stateful firewalls
|
||||
making iptables layer3 independent (pkttables)
|
||||
new userspace library (libiptables) to hide plugins from apps
|
||||
more matches and targets for advanced functions (pool, hashslot)
|
||||
more conntrack and NAT modules (RPC, SNMP, SMB, ...)
|
||||
better IPv6 support (conntrack, more matches / targets)
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Thanks
|
||||
|
||||
Thanks to
|
||||
the BBS people, Z-Netz, FIDO, ...
|
||||
for heavily increasing my computer usage in 1992
|
||||
|
||||
KNF
|
||||
for bringing me in touch with the internet as early as 1995
|
||||
for providing a playground for technical people
|
||||
for telling me about the existence of Linux!
|
||||
|
||||
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
|
||||
for implementing (one of?) the world's best TCP/IP stacks
|
||||
|
||||
Paul 'Rusty' Russell
|
||||
for starting the netfilter/iptables project
|
||||
for trusting me to maintain it today
|
||||
|
||||
Linux User Group Nuernberg (ALIGN, LUG-N)
|
||||
for helping me with my initial Linux problems
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Availability of slides / Links
|
||||
|
||||
The slides and the accompanying paper of this presentation are available at
|
||||
http://www.gnumonks.org/
|
||||
|
||||
The netfilter homepage
|
||||
http://www.netfilter.org/
|
||||
|
|
@ -0,0 +1,201 @@
|
|||
%include "default.mgp"
|
||||
%default 1 bgrad
|
||||
%%%%%%%%%%%%%%%%
|
||||
%deffont "typewriter" tfont "MONOTYPE.TTF"
|
||||
|
||||
%page
|
||||
%nodefault
|
||||
%back "blue"
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
TCP state + windowtracking
|
||||
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <laforge@gnumonks.org>
|
||||
|
||||
|
||||
%page
|
||||
TCP state + window tracking
|
||||
What? Why?
|
||||
|
||||
|
||||
TCP is a stateful protocol, each endpoint is a state machine
|
||||
|
||||
What is TCP state / windowtracking?
|
||||
Some intermediate System (Router/Firewall) is trying to derive the current state of the two TCP endpoints
|
||||
|
||||
Why does somebody want TCP state / windowtracking?
|
||||
Evaluation of TCP stack implementations
|
||||
Hide/Protect broken implementations from a public network
|
||||
|
||||
|
||||
%page
|
||||
TCP state + windowtracking
|
||||
TCP basics
|
||||
|
||||
states of a TCP endpoint:
|
||||
LISTEN: port waiting for connection request from remote end
|
||||
SYN_SENT: we've sent a SYN packet and not received anything yet
|
||||
SYN_RECEIVED: we've received a SYN, replied with a SYN/ACK and are waiting for the final ACK
|
||||
ESTABLISHED: fully established TCP connection
|
||||
FIN_WAIT1: waiting for FIN from remote end or ACK of sent FIN
|
||||
FIN_WAIT2: waiting for FIN from remote end
|
||||
TIME_WAIT: waiting for enough time to pass to be sure the remote end received the ACK of its FIN
|
||||
CLOSED: no connection state at all
|
||||
CLOSE_WAIT: waiting for a connection termination request from local user
|
||||
CLOSING: waiting for a connection termination request acknowledgement from the remote end
|
||||
LAST_ACK: Waiting for ACK of the FIN previously sent to remote end
|
||||
|
||||
%page
|
||||
TCP state + windowtracking
|
||||
TCP basics
|
||||
|
||||
sequence numbers
|
||||
every octet has a corresponding sequence number
|
||||
sequence number is increased by one for every payload octet sent
|
||||
receiver acknowledges last received contiguous sequence number (cumulative ack)
|
||||
EXTENSION: selective acknowledgement (SACK) option, RFC2018
|
||||
receiver can specify separate sequence number blocks it has received
|
||||
|
||||
sliding window protocol
|
||||
receiver advertises the size of the receive window
|
||||
sender can only send up to 'window' number of octets which are not ACK'ed yet
|
||||
EXTENSION: window scaling, RFC1323
|
||||
window size of 16 bit is too small for high bandwidth links, thus window scaling was introduced
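
The window rule above boils down to a little modular arithmetic. The following Python sketch is an illustration only: the function name seq_in_window and its parameters are made up, and it assumes 32-bit sequence numbers that wrap around and a window scale given as a shift count in the sense of RFC1323.

```python
MOD = 1 << 32  # TCP sequence numbers are 32 bit wide and wrap around


def seq_in_window(seq, length, last_ack, window, wscale=0):
    """True if the octets [seq, seq + length) fit into the receive window.

    last_ack is the highest cumulative acknowledgement sent by the
    receiver, window the 16 bit value it advertised, and wscale the
    negotiated window scale shift (0 if the extension is not in use).
    """
    effective_window = window << wscale
    # Distance from the left window edge, computed modulo 2^32 so that
    # the check keeps working when sequence numbers wrap around.
    offset = (seq - last_ack) % MOD
    return offset + length <= effective_window
```

A segment that starts before the left edge yields a huge offset after the modulo operation and is therefore rejected, which is the behaviour one would want for stale or guessed sequence numbers.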
|
||||
|
||||
%page
|
||||
TCP state + windowtracking
|
||||
TCP state tracking
|
||||
|
||||
Where do we do TCP state tracking?
|
||||
state tracker needs to see _all_ packets in both directions
|
||||
problems with asymmetric routing!
|
||||
|
||||
So where's the Problem?
|
||||
IP is an unreliable, best-effort protocol
|
||||
If the man in the middle observes a packet, he can make no assumption on whether it actually arrives at the receiver.
|
||||
|
||||
%page
|
||||
TCP state + window tracking
|
||||
Problems
|
||||
|
||||
|
||||
Example scenario 1
|
||||
A sends SYN to B
|
||||
man in the middle saves state as SYN_SENT
|
||||
B sends SYN/ACK to A
|
||||
man in the middle detects state transition to SYN_RECEIVED
|
||||
SYN/ACK doesn't arrive at A
|
||||
somebody spoofs ACK A->B to firewall
|
||||
man in the middle detects state transition to ESTABLISHED
|
||||
==> Any traffic between A and B will be accepted (wrong!)
|
||||
|
||||
%page
|
||||
TCP state + window tracking
|
||||
Problems
|
||||
|
||||
|
||||
Example scenario 2
|
||||
fully established TCP connection
|
||||
A sends FIN to B
|
||||
man in the middle saves state to FIN_WAIT1
|
||||
B sends FIN/ACK to A
|
||||
man in the middle saves state CLOSING/TIME_WAIT
|
||||
FIN/ACK doesn't arrive at A
|
||||
B retransmits FIN/ACK to A
|
||||
man in the middle doesn't accept any further packets
|
||||
.oO(booom)Oo.
|
||||
|
||||
%page
|
||||
TCP state + window tracking
|
||||
Problems
|
||||
|
||||
|
||||
Example scenario 3 (FIN/RST spoofing without windowtracking)
|
||||
fully established TCP connection
|
||||
evil guy spoofs FIN A->B (with guessed sequence number)
|
||||
man in the middle saves state as FIN_WAIT1
|
||||
B ignores the spoofed FIN because of wrong sequence number
|
||||
A sends further segments to B
|
||||
man in the middle doesn't accept further segments after FIN was sent in this direction
|
||||
.oO(booom)Oo.
|
||||
|
||||
Solution: Real Window tracking
|
||||
See paper by Guido van Rooij
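
Scenario 3 can be replayed with a toy tracker. This Python sketch is not the netfilter code and not the algorithm from the paper mentioned above: the class Tracker, its fields and the single-direction view are simplifying assumptions, chosen only to show why a state tracker without a window check is easy to fool.

```python
MOD = 1 << 32


class Tracker:
    """Tracks only the A -> B direction of one established connection."""

    def __init__(self, next_seq, window, check_window):
        self.next_seq = next_seq          # next octet expected from A
        self.window = window              # receive window advertised by B
        self.check_window = check_window  # False models pure state tracking
        self.fin_seen = False

    def packet_from_a(self, seq, length=0, fin=False):
        """Return True if the firewall lets the segment through."""
        if self.fin_seen:
            return False  # nothing is accepted after a FIN in this direction
        if self.check_window and (seq - self.next_seq) % MOD + length > self.window:
            return False  # outside the window: B would ignore it as well
        if fin:
            self.fin_seen = True
        elif seq == self.next_seq:
            self.next_seq = (seq + length) % MOD
        return True
```

Without the window check, a single FIN with a wildly wrong sequence number is enough to cut off the genuine sender. With the check the same FIN is discarded, although a spoofed FIN whose guessed sequence number happens to fall inside the window would still get through.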
|
||||
|
||||
|
||||
%page
|
||||
TCP state + window tracking
|
||||
Further Problems
|
||||
|
||||
|
||||
pickup of already established connections
|
||||
window scaling sucks in this case
|
||||
window tracking has to be disabled in that case
|
||||
|
||||
selective acknowledgements
|
||||
man-in-the-middle needs to track all selectively acknowledged segments
|
||||
this can draw lots of resources at the man in the middle and is prone to DoS
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
TCP state + window tracking
|
||||
conntrack subsystem of netfilter
|
||||
|
||||
Netfilter architecture in IPv4
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
|
||||
--->[1]--->[ROUTE]--->[3]--->[4]--->
|
||||
| ^
|
||||
| |
|
||||
| [ROUTE]
|
||||
v |
|
||||
[2] [5]
|
||||
| ^
|
||||
| |
|
||||
v |
|
||||
|
||||
%font "standard"
|
||||
1=NF_IP_PRE_ROUTING
|
||||
2=NF_IP_LOCAL_IN
|
||||
3=NF_IP_FORWARD
|
||||
4=NF_IP_POST_ROUTING
|
||||
5=NF_IP_LOCAL_OUT
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
TCP state + window tracking
|
||||
conntrack subsystem of netfilter
|
||||
|
||||
Connection tracking...
|
||||
|
||||
implemented separately from NAT
|
||||
enables stateful filtering
|
||||
implementation
|
||||
hooks into NF_IP_PRE_ROUTING to track packets
|
||||
hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
|
||||
protocol modules (currently TCP/UDP/ICMP)
|
||||
application helpers (currently FTP, IRC, H.323, talk, SNMP)
|
||||
divides packets into the following four categories
|
||||
NEW - would establish new connection
|
||||
ESTABLISHED - part of already established connection
|
||||
RELATED - is related to established connection
|
||||
INVALID - (multicast, errors...)
|
||||
does _NOT_ filter packets itself
|
||||
can be utilized by iptables using the 'state' match
|
||||
is used by NAT Subsystem
|
||||
|
||||
%page
|
||||
TCP state + window tracking
|
||||
Further Reading
|
||||
|
||||
RFC793 (Transmission Control Protocol), RFC2018 (selective acknowledgement), RFC1323 (window scaling)
|
||||
http://www.netfilter.org/ - netfilter hacking howto contains some info
|
|
@ -0,0 +1,147 @@
|
|||
|
||||
0. Introduction / history
1. Fundamentals (.tex file, latex, .dvi, dvips, PostScript, ...)
2. What does a LaTeX document look like?
|
||||
\documentstyle[german]{article}
|
||||
\pagestyle{empty}
|
||||
\begin{document}
|
||||
..
|
||||
\end{document}
|
||||
|
||||
3. Entering text
- the number of consecutive spaces / blank lines is irrelevant
- new paragraph via a blank line or \par
- line break with \\
|
||||
|
||||
Hyphenation hints:
- local (only at these points): Donau\-dampf\-schiff
- local (additionally at these points): Donau"-dampf"-schiff
- global: \hyphenation{Donau-dampf-schiff}
- no hyphenation, local: \mbox{Untrennbar}
- no hyphenation, global: \hyphenation{Untrennbar}
- ~ == space at which no line break may occur
|
||||
|
||||
Special characters:
- German: "a "o "u "A "O "U "s
- others: \'o \`o \^o \=o \.o \u{o} \v{o} \H{o} \t{oo} \c{c} \d{o}
e.g. C'est \c{c}a!
- \$ \& \% \# \{ \} [ ] \_ @ \S \pounds $<$ $>$ $\backslash$ ?` !`
|
||||
|
||||
Dashes and quotation marks:
- Dashes: - -- ---
- Quotation marks: \glq \grq \glqq \grqq \flq \frq \flqq \frqq
(german left quote, french right quote, ...)
Short form: "` "' "> "<
|
||||
|
||||
Spacing
|
||||
- \hspace{length}
|
||||
- \vspace{length}
|
||||
- \hfill \vfill
|
||||
Ellipses
|
||||
- \ldots{}
|
||||
- \dotfill{}
|
||||
- \vdots
|
||||
|
||||
|
||||
Text emphasis
- emphasize: {\em emphasized text}
- underline: \underline{ }
- verbatim: \verb|text|
|
||||
|
||||
Typefaces:
- Roman (\rm)
- Bold (\bf)
- Italic (\it) (italic correction \/)
|
||||
- Slanted (\sl)
|
||||
- Sans Serif (\sf)
|
||||
- Small Caps (\sc)
|
||||
- Typewriter (\tt)
|
||||
|
||||
Font sizes:
|
||||
- \tiny
|
||||
- \scriptsize
|
||||
- \footnotesize
|
||||
- \small
|
||||
- \normalsize
|
||||
- \large
|
||||
- \Large
|
||||
- \LARGE
|
||||
- \huge
|
||||
- \Huge
|
||||
|
||||
NFSS:
|
||||
- \family{cmr|cmss|cmtt}
|
||||
- \series{ul|el|l|sl|m|sb|b|eb|ub}
|
||||
- \series{uc|ec|c|sc|m|sx|x|ex|ux}
|
||||
- \shape{n|it|sl|sc}
|
||||
- \size{size}{linespacing}
|
||||
- \selectfont
|
||||
|
||||
|
||||
Document structure

Sectioning:

- \part{part title} \part*
- \chapter{chapter title} \chapter*

- \section{section title}
- \subsection{}
- \subsubsection{}
- \paragraph{} [text continues on the same line as the heading]
|
||||
- \subparagraph{}
|
||||
|
||||
|
||||
- \setcounter{page|part|chapter|section|...}{value}
|
||||
|
||||
Title page:
- \title{title}
- \thanks{acknowledgement}
- \author{author}
- \date{date}
|
||||
- \maketitle
|
||||
|
||||
Abstract:
|
||||
- \begin{abstract} \end{abstract}
|
||||
|
||||
Table of contents:
- \tableofcontents
- Depth of included headings: \setcounter{tocdepth}{depth}
depth: -1 no headings
0 chapter
1 chapter and section
2 chapter through subsection
3 chapter through subsubsection
4 chapter through paragraph
5 all
|
||||
|
||||
Document layout:
- \documentstyle[german, option, ...]{style}
Common styles: article, report, book
Style options: 10pt,11pt,12pt,twoside,twocolumn,titlepage
|
||||
- \pagestyle{plain|headings|empty|myheadings}
|
||||
|
||||
- \noindent
|
||||
- Centered: \begin{center} \end{center}
- Flush left/right: \begin{flushleft|flushright}
- Quotation: \begin{quotation} \end{quotation}
- Verse: \begin{verse} \end{verse}
- Verbatim: \begin{verbatim} \end{verbatim}
- Margin note: \marginpar{margin note}
- Footnote: \footnote{foobar}
|
||||
|
||||
Lists and enumerations:
- Bulleted list: \begin{itemize} \item \end{itemize}
- Numbered list: \begin{enumerate} \item \end{enumerate}
- Description list: \begin{description} \item[term] \end{description}
|
||||
|
||||
Cross-references:
- Target of the reference: \label{name}
- Reference: \ref{name}, page \pageref{name}
|
||||
|
||||
Bibliography:
- "As described in \cite[pages 12,13]{1}..."
|
||||
\begin{thebibliography}{99}
|
||||
\bibitem{1} Markus Mueller, {\sl Mahlzeit - das Kochbuch}, Addison-Wesley, 1995
|
||||
...
|
||||
\end{thebibliography}
|
|
@ -0,0 +1,430 @@
|
|||
\documentstyle[german,a4]{article}
|
||||
\pagestyle{plain}
|
||||
|
||||
\setlength{\oddsidemargin}{0in}
|
||||
\setlength{\evensidemargin}{0in}
|
||||
\setlength{\topmargin}{0.0in}
|
||||
\setlength{\headheight}{0in}
|
||||
\setlength{\headsep}{0in}
|
||||
\setlength{\textwidth}{6.5in}
|
||||
\setlength{\textheight}{9.5in}
|
||||
\setlength{\parindent}{0in}
|
||||
\setlength{\parskip}{0.05in}
|
||||
|
||||
\begin{document}
|
||||
|
||||
\title{The UNIX way of text processing: \LaTeX}
|
||||
\author{Harald Welte $<$laforge@gnumonks.org$>$}
|
||||
\maketitle
|
||||
|
||||
\begin{abstract}
|
||||
This document is meant as a short companion to my introductory course on
{\LaTeX}. Copying is permitted and encouraged.
|
||||
\end{abstract}
|
||||
|
||||
\tableofcontents
|
||||
|
||||
\section{Introduction}
|
||||
Many users today work with so-called {\em WYSIWYG} word processors.
This relatively new concept makes it possible to edit a text document on screen
exactly as it will later come out of the printer (at least that is what
the vendors claim; reality usually looks different).
|
||||
|
||||
Typesetting systems such as \TeX{} or its extension {\LaTeX} follow
a completely different philosophy. Here the text is first written with an
arbitrary text editor, with certain commands and control characters embedded
in it. Afterwards the text processor is invoked, which generates the
actual result.
|
||||
|
||||
This procedure strongly resembles programming: the author writes source
code, which a compiler translates into machine language.
This analogy is no coincidence, since \TeX{} was written by {\em Donald E. Knuth},
one of the most renowned computer scientists of all. Besides two
professorships, Knuth holds, among other things, 27 honorary doctorates and
numerous honours from notable institutions.
|
||||
|
||||
Knuth schreibt seit Jahrzehnten ma"sgebliche Referenzwerke der Informatik, so
|
||||
z.B. die mehrteilige Reihe {\em The Art of Computer Programming}. Um seine B"ucher
|
||||
vern"unftig schreiben zu k"onnen, hat Knuth keine befriedigende Software
|
||||
gefunden, und hat so kurzerhand selbst eine entwickelt.
|
||||
|
||||
Die ersten TeX-Versionen wurden 1978 ver"offentlicht, und der letzte bekannte
|
||||
(in Erscheinung tretende) Bug wurde 1985 gefixed. Der Autor hat f"ur jeden
|
||||
gemeldeten Fehler eine finanzielle Belohnung ausgesetzt, welche sich anfangs
|
||||
j"ahrlich verdoppelte (beginnend bei US\$ 2.56) und schlie"slich bei
|
||||
US\$ 327.68 eingefroren wurde.
|
||||
|
||||
\section{Warum sollte ich {\LaTeX} verwenden?}
|
||||
\begin{itemize}
|
||||
\item
|
||||
Professionelles Textsatzsystem v"ollig kostenlos
|
||||
\item
|
||||
Absolut identische Ausgabe unabh"angig von verwendeter Computerhardware, Betriebssystem, Softwareversion, Drucker
|
||||
\item
|
||||
Kompatibilit"at "uber Jahrzehnte. Welches propriet"are Textsystem existiert seit 1978 und kann heute noch problemlos mit den alten Dokumenten arbeiten?
|
||||
\item
|
||||
Perfekte Unterst"utzung f"ur alles, was in wissenschaftlichen Dokumenten gebraucht wird: Formelsatz, Literaturverzeichnisse, Glossar, ...
|
||||
\end{itemize}
|
||||
|
||||
\section{Mein erstes {\LaTeX}-Dokument}
|
||||
\subsection{Bearbeiten des .tex Files im Editor}
|
||||
|
||||
Man nehme seinen Lieblings-Texteditor und schreibe folgendes:
|
||||
|
||||
\begin{verbatim}
|
||||
\documentstyle[german,a4]{article}
|
||||
\pagestyle{empty}
|
||||
\begin{document}
|
||||
...
|
||||
\end{document}
|
||||
\end{verbatim}
|
||||
|
||||
Die einzelnen Elemente werden sp"ater noch ausf"uhrlich besprochen. Vorerst
|
||||
sollte einfach das obige Template verwendet werden. Dort wo ``...'' steht,
|
||||
kann man jetzt seinen eigenen Text hinschreiben.
|
||||
|
||||
\subsection{Das eigentliche Tex(t)processing}
|
||||
|
||||
Nach Eingabe des Quelltextes wird der Prozessor aufgerufen, welcher dann das Ergebnis produziert. Unter Unix/Linux sieht der Befehl folgenderma"sen aus:
|
||||
|
||||
\begin{verbatim}
|
||||
latex meinedatei.tex
|
||||
\end{verbatim}
|
||||
|
||||
Anschlie"send liegt eine Datei {\em meinedatei.dvi} im gleichen Verzeichnis.
|
||||
Das DVI ist ein device-independent File. Das hei"st, da"s es das Dokument in
|
||||
einem vom Ausgabeger"at unabh"angigen Format beschreibt.
|
||||
|
||||
Dieses DVI-File kann man sich unter Linux am besten mit dem Programm {\em xdvi} ansehen:
|
||||
|
||||
\begin{verbatim}
|
||||
xdvi meinedatei.dvi
|
||||
\end{verbatim}
|
||||
|
||||
Sollte man noch etwas am Dokument nachbessern wollen, so editiert man wieder das .tex-File, ruft {\LaTeX} auf und sieht sich das neue .dvi an.
|
||||
|
||||
\subsection{Das Ausgeben auf dem Drucker}
|
||||
|
||||
Das .dvi kann nun in das endg"ultige Ausgabeformat "uberf"uhrt werden. Zumeist ist das Postscript. F"ur die Konvertierung nach Postscript wird {\em dvips} verwendet:
|
||||
|
||||
\begin{verbatim}
|
||||
dvips -o meinedatei.ps meinedatei.dvi
|
||||
\end{verbatim}
|
||||
|
||||
oder gleich zum Drucker schicken:
|
||||
|
||||
\begin{verbatim}
|
||||
dvips -f < meinedatei.dvi | lpr
|
||||
\end{verbatim}
|
||||
|
||||
\section{Allgemeines zur Syntax}
|
||||
|
||||
\subsection{Leerzeichen}
|
||||
|
||||
Leerzeichen und ein einfacher Zeilenumbruch werden von {\LaTeX} nicht beachtet.
|
||||
|
||||
Ein neuer Absatz wird durch eine Leerzeile (doppelter Zeilenumbruch) begonnen.
|
||||
Zeilenumbr"uche k"onnen durch $\backslash\backslash$ am Zeilenende erzeugt
|
||||
werden. Es k"onnen manuell Leerr"aume eingef"ugt werden:
|
||||
|
||||
\begin{verbatim}
|
||||
\hspace{length} % horizontaler Leerraum
|
||||
\vspace{length} % vertikaler Leerraum
|
||||
\end{verbatim}
|
||||
|
||||
\subsection{Trennung}
|
||||
|
||||
Die Trennung wird von {\LaTeX} automatisch vorgenommen. Hierbei werden die in
|
||||
der jeweiligen Landessprache ({\em german.sty}) geltenden Trennungsregeln
|
||||
verwendet. Es k"onnen jedoch vom Autor Trennungsvorgaben gemacht werden:
|
||||
|
||||
\begin{itemize}
|
||||
\item
|
||||
Zus"atzliche Trennungsm"oglichkeit: Donau\dq-dampf\dq-schiff
|
||||
\item
|
||||
Ausschlie"sliche Trennvorgabe: Donau$\backslash$-dampf$\backslash$-schiff
|
||||
\item
|
||||
Globale Trennvorgabe f"ur ein Wort: $\backslash$hyphenation\{Donau-dampf-schiff\}
|
||||
\item
|
||||
Keine Trennungsm"oglichkeit (im Text): $\backslash$mbox\{Untrennbar\}
|
||||
\item
|
||||
Keine Trennungsm"oglichkeit (global): $\backslash$hyphenation\{Untrennbar\}
|
||||
\item
|
||||
Leerzeichen, bei dem kein Zeilenumbruch erfolgen darf: \verb+~+
|
||||
\end{itemize}
|
||||
|
||||
|
||||
\subsection{Sonderzeichen}
|
||||
Sonderzeichen k"onnen nicht einfach im Flie"stext geschrieben werden, da sie
|
||||
h"aufig von Zeichensatz zu Zeichensatz unterschiedlich sind. Einige andere
|
||||
Symbole werden von {\LaTeX} selbst als Steuer- und Kommandozeichen verwendet.
|
||||
|
||||
\subsubsection{Deutsche Umlaute}
|
||||
\begin{verbatim}
|
||||
"a "o "u "A "O "U "s
|
||||
\end{verbatim}
|
||||
"a "o "u "A "O "U "s
|
||||
|
||||
\subsubsection{Ausl"andische Sonderzeichen}
|
||||
\begin{verbatim}
|
||||
\'o \`o \^o \=o \.o \u{o} \v{o} \H{o} \t{oo} \c{c} \d{o}
|
||||
\end{verbatim}
|
||||
\'o \`o \^o \=o \.o \u{o} \v{o} \H{o} \t{oo} \c{c} \d{o}
|
||||
|
||||
\subsubsection{Sonderzeichen/Symbole}
|
||||
\begin{verbatim}
|
||||
\$ \& \% \# \{ \} [ ] \_ @ \S \pounds $<$ $>$ $\backslash$ @' !'
|
||||
\end{verbatim}
|
||||
\$ \& \% \# \{ \} [ ] \_ @ \S \pounds $<$ $>$ $\backslash$ @' !'
|
||||
|
||||
\section{Dokumentstruktur}
|
||||
|
||||
\subsection{Gliederung}
|
||||
|
||||
Gr"o"sere Dokumentvorlagen wie book.sty haben Teile und Kapitel. Diese k"onnen wie folgt verwendet werden:
|
||||
|
||||
\begin{verbatim}
|
||||
\part{Teil"uberschrift}
|
||||
\chapter{Kapitel"uberschrift}
|
||||
\end{verbatim}
|
||||
|
||||
Kleinere Dokumentvorlagen wie article.sty bieten die Unterteilung in
|
||||
Abschnitte, Unterabschnitte, Unter-Unterabschnitte, Abs"atze und Unterabs"atze.
|
||||
Selbstverst"andlich sind diese Gliederungselemente auch in gr"o"seren
|
||||
Dokumentvorlagen verwendbar.
|
||||
|
||||
\begin{verbatim}
|
||||
\section{Abschnitts"uberschrift}
|
||||
\subsection{Unterabschnitts"uberschrift}
|
||||
\subsubsection{Unter-Unterabschnitts"uberschrift}
|
||||
\paragraph{Absatz"uberschrift}
|
||||
\subparagraph{Unterabsatz"uberschrift}
|
||||
\end{verbatim}
|
||||
|
||||
Die einzelnen Gliederungselemente werden automatisch durchnummeriert. Wie tief die Nummerierung angezeigt wird, ist einstellbar. Standardm"a"sig nummeriert {\em article} bis Subsubsection ({\em secnumdepth} 3), {\em book} hingegen nur bis Subsection ({\em secnumdepth} 2); tiefere Ebenen erhalten keine angezeigte Nummerierung.
|
||||
|
||||
\begin{verbatim}
|
||||
\setcounter{secnumdepth}{wert}
|
||||
\end{verbatim}
|
||||
|
||||
\label{gliederungswerte}
|
||||
Wobei {\em wert} die folgenden Werte annehmen kann:
|
||||
\begin{description}
|
||||
\item[-1] keine Nummern
|
||||
\item[0] nur Chapter
|
||||
\item[1] Chapter und Section
|
||||
\item[2] Chapter bis Subsection
|
||||
\item[3] Chapter bis Subsubsection
|
||||
\item[4] Chapter bis Paragraph
|
||||
\item[5] alle
|
||||
\end{description}
|
||||
|
||||
\subsection{Titelseite}
|
||||
|
||||
Die meisten Dokumentvorlagen bieten die M"oglichkeit, automatisch eine Titelseite zu generieren. Dazu werden die folgenden Definitionen verwendet:
|
||||
|
||||
\begin{verbatim}
|
||||
\title{Titel}
|
||||
\thanks{Danksagung}
|
||||
\author{Autor}
|
||||
\date{Datum}
|
||||
\maketitle
|
||||
\end{verbatim}
|
||||
|
||||
\subsection{Zusammenfassung}
|
||||
|
||||
Eine Zusammenfassung kann dem eigentlichen Dokument vorangestellt werden:
|
||||
|
||||
\begin{verbatim}
|
||||
\begin{abstract}
|
||||
...
|
||||
\end{abstract}
|
||||
\end{verbatim}
|
||||
|
||||
\subsection{Inhaltsverzeichnis}
|
||||
|
||||
Aus den Gliederungselementen kann auf Wunsch automatisch ein Inhaltsverzeichnis erzeugt werden. Hierzu verwendet man den Befehl
|
||||
\begin{verbatim}
|
||||
\tableofcontents
|
||||
\end{verbatim}
|
||||
|
||||
Man kann nun noch bestimmen, bis zu welcher Gliederungsebene Eintr"age im Inhaltsverzeichnis gemacht werden sollen:
|
||||
\begin{verbatim}
|
||||
\setcounter{tocdepth}{tiefe}
|
||||
\end{verbatim}
|
||||
|
||||
Wobei {\em tiefe} die gleichen Werte annehmen kann, wie in Teil \ref{gliederungswerte} auf Seite \pageref{gliederungswerte} beschrieben.
|
||||
|
||||
|
||||
\section{Formatierung}
|
||||
|
||||
\subsection{Schriftarten}
|
||||
|
||||
Man kann selbstverst"andlich auch zwischen diversen Schriftarten w"ahlen. Der Einfachheit halber werden hier jedoch nur die Standardschriften beschrieben:
|
||||
|
||||
\begin{itemize}
|
||||
\item
|
||||
{\rm Roman}: $\backslash$rm
|
||||
\item
|
||||
{\bf Fett}: $\backslash$bf
|
||||
\item
|
||||
{\it Kursiv}: $\backslash$it (Italic-Korrektur $\backslash$/)
|
||||
\item
|
||||
{\sl Slanted}: $\backslash$sl
|
||||
\item
|
||||
{\sf Sans Serif}: $\backslash$sf
|
||||
\item
|
||||
{\sc Small Caps}: $\backslash$sc
|
||||
\item
|
||||
{\tt Typewriter}: $\backslash$tt
|
||||
\end{itemize}
|
||||
|
||||
\subsection{Schriftgr"o"sen}
|
||||
|
||||
\begin{itemize}
|
||||
\item {\tiny $\backslash$tiny}
|
||||
\item {\scriptsize $\backslash$scriptsize}
|
||||
\item {\footnotesize $\backslash$footnotesize}
|
||||
\item {\small $\backslash$small}
|
||||
\item {\normalsize $\backslash$normalsize}
|
||||
\item {\large $\backslash$large}
|
||||
\item {\Large $\backslash$Large}
|
||||
\item {\LARGE $\backslash$LARGE}
|
||||
\item {\huge $\backslash$huge}
|
||||
\item {\Huge $\backslash$Huge}
|
||||
\end{itemize}
|
||||
|
||||
\subsection{Textausrichtung}
|
||||
|
||||
Standardm"a"sig formatiert {\LaTeX} immer im Blocksatz. Dies kann ge"andert
|
||||
werden:
|
||||
|
||||
\begin{verbatim}
|
||||
\begin{center}
|
||||
Zentriert
|
||||
\end{center}
|
||||
\end{verbatim}
|
||||
|
||||
Es k"onnen neben {\em center} auch {\em flushleft} oder {\em flushright}
|
||||
verwendet werden, um linksb"undige bzw. rechtsb"undige Ausgabe zu erhalten.
|
||||
|
||||
\subsection{Zitate}
|
||||
|
||||
Zitate k"onnen wie folgt eingebunden werden:
|
||||
\begin{verbatim}
|
||||
\begin{quotation}
|
||||
Mahlzeit
|
||||
\end{quotation}
|
||||
\end{verbatim}
|
||||
|
||||
|
||||
|
||||
\section{Listen und Aufz"ahlungen}
|
||||
|
||||
\subsection{Listen}
|
||||
|
||||
Eine Liste kann wie folgt erzeugt werden:
|
||||
|
||||
\begin{verbatim}
|
||||
\begin{itemize}
|
||||
\item erster eintrag
|
||||
\item zweiter eintrag
|
||||
\end{itemize}
|
||||
\end{verbatim}
|
||||
|
||||
\subsection{Aufz"ahlungen}
|
||||
|
||||
Eine Aufz"ahlung kann wie folgt erzeugt werden:
|
||||
\begin{verbatim}
|
||||
\begin{enumerate}
|
||||
\item erster Eintrag
|
||||
\item zweiter Eintrag
|
||||
\end{enumerate}
|
||||
\end{verbatim}
|
||||
|
||||
\subsection{Beschreibungen/Definitionen}
|
||||
|
||||
Eine Beschreibung kann wie folgt erzeugt werden:
|
||||
\begin{verbatim}
|
||||
\begin{description}
|
||||
\item[Donald E. Knuth] Autor des bekannten TeX Systems
|
||||
\item[Donald Becker] Autor von zahllosen Linux-Netzwerktreibern
|
||||
\end{description}
|
||||
\end{verbatim}
|
||||
|
||||
\section{Fu"snoten, Querverweise, Literaturverzeichnis}
|
||||
|
||||
\subsection{Fu"snoten}
|
||||
Eine Fu"snote\footnote{Fu"snoten sehen so aus *g*} wird einfach in den Text
|
||||
mit hineingeschrieben, an der Stelle an der sie erscheinen soll:
|
||||
\begin{verbatim}
|
||||
Benutzt man hingegen das Sub-Etha-Sens-O-Matic\footnote{Ein Ger"at zur
|
||||
Detektion in der N"ahe befindlicher Raumschiffe}, so ....
|
||||
\end{verbatim}
|
||||
|
||||
\subsection{Querverweise}
|
||||
|
||||
Es k"onnen Querverweise eingef"ugt werden, welche dann automatisch auf die
|
||||
jeweils aktuelle Abschnittsnummer / Seite verweisen, auch wenn sich das Ziel
|
||||
des Querverweises verschiebt.
|
||||
|
||||
An dem Ziel des Querverweises (also wohin man verweisen m"ochte), wird
|
||||
folgender Befehl eingef"ugt:
|
||||
\begin{verbatim}
|
||||
\label{namedeslabels}
|
||||
\end{verbatim}
|
||||
|
||||
Ein Querverweis dorthin sieht dann wie folgt aus:
|
||||
\begin{verbatim}
|
||||
Wie in Abschnitt \ref{namedeslabels} auf Seite \pageref{namedeslabels}
|
||||
beschrieben
|
||||
\end{verbatim}
|
||||
|
||||
\subsection{Literaturverzeichnis}
|
||||
|
||||
Vor allem in wissenschaftlichen Dokumenten wird ein Literaturverzeichnis
|
||||
gebraucht. Es gibt zwei unterschiedliche M"oglichkeiten, ein
|
||||
Literaturverzeichnis unter {\LaTeX} zu verwenden.
|
||||
|
||||
Die einfache, hier beschriebene Variante eignet sich f"ur kleine, einzelne
|
||||
Dokumente. Wer h"aufig zu den gleichen Themen Dokumente verfasst, sollte sich
|
||||
mit {\em bibtex} auseinandersetzen, hier kann man sich ein zentrales
|
||||
Literaturverzeichnis anlegen, worauf von allen Dokumenten aus verwiesen werden
|
||||
kann.
|
||||
|
||||
Ein Verweis auf ein Literaturverzeichnis sieht so aus:
|
||||
\begin{verbatim}
|
||||
Wie in \cite[Seiten 12 ff.]{1} beschrieben, ...
|
||||
\end{verbatim}
|
||||
|
||||
Das Literaturverzeichnis am Ende sieht dann so aus:
|
||||
\begin{verbatim}
|
||||
\begin{thebibliography}{99}
|
||||
\bibitem{1} Markus M"uller, {\sl Mahlzeit - das Kochbuch}, Addison-Wesley, 1995
|
||||
\end{thebibliography}
|
||||
\end{verbatim}
|
||||
|
||||
|
||||
\section{Dokumentvorlagen}
|
||||
|
||||
Wir haben in den bisherigen Beispielen immer die Dokumentvorlage {\em article.sty} verwendet. Die zu verwendende Dokumentvorlage wird im Kopf des Dokuments mit dem {\em $\backslash$documentstyle}-Befehl angegeben.
|
||||
|
||||
G"angige Dokumentvorlagen:
|
||||
\begin{description}
|
||||
\item[article] F"ur das Verfassen strukturierter Dokumente begrenzter L"ange
|
||||
\item[book] Zum Verfassen eines ganzen Buches
|
||||
\item[dinbrief] Zum Verfassen eines sich exakt an der DIN-Norm orientierenden Briefes
|
||||
\end{description}
|
||||
|
||||
Zus"atzlich werden beim {\em documentstyle}-Befehl in den eckigen Klammern noch
|
||||
Optionen angegeben.
|
||||
|
||||
G"angige Optionen:
|
||||
\begin{description}
|
||||
\item[10pt, 11pt, 12pt] Standardschriftgr"o"se
|
||||
\item[german] Unterst"utzung f"ur deutsche Umlaute und Trennregeln
|
||||
\item[twocolumn] Zweispaltige Formatierung
|
||||
\item[twoside] Zweiseitiger Druck (Seitennummern, etc.)
|
||||
\end{description}
|
||||
|
||||
\end{document}
|
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
File diff suppressed because it is too large
|
@ -0,0 +1 @@
|
|||
http://www.allnet.de/ftp/pub/allnet/wireless/all0277/ALL0277_1.02.6_ETSI_0703_code.bin
|
|
@ -0,0 +1,113 @@
|
|||
%include "default.mgp"
|
||||
%default 1 bgrad
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
%nodefault
|
||||
%back "blue"
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
Reverse Engineering
|
||||
%size 5
|
||||
of Linux-Based Firmware Images
|
||||
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <laforge@gnumonks.org>
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Linux Firmware Reverse Engineering
|
||||
Overview
|
||||
|
||||
Linux has gained ground in the commercial market
|
||||
Embedded hardware is getting cheaper
|
||||
Network Appliances become more popular
|
||||
802.11(abg) Access Points, Bridges, Routers
|
||||
DSL 'Routers' (in reality NAT-gateways)
|
||||
Users demand more and more CPU-intensive functions
|
||||
PPPoE, PPTP
|
||||
NAT with ALG's for H.323, PPTP
|
||||
IPsec
|
||||
|
||||
Many vendors seem to conclude:
|
||||
Why not use Linux?
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Linux Firmware Reverse Engineering
|
||||
Why is this worth a presentation?
|
||||
|
||||
Vendors tend to forget about their GPL obligations
|
||||
They have to
|
||||
redistribute or make available the sourcecode
|
||||
redistribute or make available build scripts
|
||||
inform their users about their rights and obligations under the GPL
|
||||
They are not allowed to link with GPL-incompatible code
|
||||
|
||||
Vendors tend to forget about security issues
|
||||
Most people don't know that their appliance runs Linux
|
||||
Thus they won't even know that they're affected by a vulnerability
|
||||
Vendors of consumer-class equipment tend to be lazy
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Linux Firmware Reverse Engineering
|
||||
How to start (from a technical point of view)
|
||||
|
||||
In most cases you don't even need the device
|
||||
Firmware images are available for download from the vendors
|
||||
Reverse engineering starts by looking at that binary
|
||||
In a number of cases, you will either find
|
||||
a gzip signature for a compressed kernel
|
||||
a signature of a cramfs disk image
|
||||
a configuration file to enable/disable features
|
||||
some other (arj/lzh/zip/...) image
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Linux Firmware Reverse Engineering
|
||||
How to start from a technical point of view (cont'd)
|
||||
|
||||
Useful tools for looking at that image
|
||||
'strings' (from GNU binutils)
|
||||
your favourite hex editor
|
||||
'file' (especially its 'magic' signature file)
|
||||
libmagic (library for accessing 'magic' signatures)
|
||||
|
||||
Strings to look for:
|
||||
'piggy' (compressed kernel image)
|
||||
0x28cd3d45 (compressed ram fs)
|
||||
|
||||
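The signatures named on this slide can also be searched for by hand. The following is a minimal sketch, assuming the firmware image is already in memory; `find_magic`, `GZIP_MAGIC` and `CRAMFS_MAGIC` are hypothetical names chosen for illustration, and the cramfs bytes assume a little-endian image.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte patterns to search for (names are illustrative, not from any library).
 * gzip: 0x1f 0x8b followed by 0x08 (deflate method).
 * cramfs: the magic 0x28cd3d45 as stored in a little-endian image;
 * a big-endian image stores the same bytes in reverse order. */
static const uint8_t GZIP_MAGIC[]   = { 0x1f, 0x8b, 0x08 };
static const uint8_t CRAMFS_MAGIC[] = { 0x45, 0x3d, 0xcd, 0x28 };

/* Return the offset of the first occurrence of pat in buf at or after
 * start, or -1 if the pattern does not occur. */
static long find_magic(const uint8_t *buf, size_t len, size_t start,
                       const uint8_t *pat, size_t patlen)
{
    size_t i;

    if (patlen == 0 || len < patlen)
        return -1;
    for (i = start; i + patlen <= len; i++) {
        if (memcmp(buf + i, pat, patlen) == 0)
            return (long) i;
    }
    return -1;
}
```

A hit only marks a candidate offset; short patterns such as the gzip header also occur by chance in compressed data, so each offset still has to be checked by actually trying to unpack from there.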
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Linux Firmware Reverse Engineering
|
||||
Practical Example
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Linux Firmware Reverse Engineering
|
||||
Thanks
|
||||
The slides of this presentation are available at http://www.gnumonks.org/
|
||||
|
||||
Thanks to
|
||||
the BBS people, Z-Netz, FIDO, ...
|
||||
for heavily increasing my computer usage in 1992
|
||||
KNF
|
||||
for bringing me in touch with the internet as early as 1994
|
||||
for providing a playground for technical people
|
||||
for telling me about the existence of Linux!
|
||||
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
|
||||
for implementing (one of?) the world's best TCP/IP stacks
|
||||
Paul 'Rusty' Russell
|
||||
for starting the netfilter/iptables project
|
||||
for trusting me to maintain it today
|
||||
Astaro AG
|
||||
for sponsoring parts of my netfilter work
|
|
@ -0,0 +1,79 @@
|
|||
#include <stdlib.h>
|
||||
#include <stdio.h>
|
||||
#include <unistd.h>
|
||||
#include <string.h>
|
||||
#include <fcntl.h>
|
||||
|
||||
#include <sys/types.h>
|
||||
#include <sys/mman.h>
|
||||
#include <sys/stat.h>
|
||||
|
||||
#include <magic.h>
|
||||
|
||||
/* magic_ofs - check for 'file' magic at any possible offset within a file
|
||||
*
|
||||
* (C) 2003 by Harald Welte <laforge@gnumonks.org>
|
||||
*
|
||||
* This code is subject to the GNU GPL v2
|
||||
*/
|
||||
|
||||
int main(int argc, char **argv)
|
||||
{
|
||||
struct stat st;
|
||||
magic_t cookie;
|
||||
int fd;
|
||||
off_t i;
|
||||
void *mem;
|
||||
|
||||
if (argc < 2) {
|
||||
fprintf(stderr, "you have to name a file\n");
|
||||
exit(2);
|
||||
}
|
||||
|
||||
if (!strlen(argv[1])) {
|
||||
fprintf(stderr, "empty argument\n");
|
||||
exit(2);
|
||||
}
|
||||
|
||||
fd = open(argv[1], 0);
|
||||
if (fd < 0) {
|
||||
fprintf(stderr, "unable to open file\n");
|
||||
exit(1);
|
||||
}
|
||||
|
||||
if (fstat(fd, &st)) {
|
||||
fprintf(stderr, "unable to stat file\n");
|
||||
exit(1);
|
||||
}
|
||||
|
||||
mem = mmap(0, st.st_size, PROT_READ, MAP_SHARED, fd, (off_t ) 0);
|
||||
if (mem == MAP_FAILED) {
|
||||
fprintf(stderr, "unable to mmap file\n");
|
||||
exit(1);
|
||||
}
|
||||
|
||||
cookie = magic_open(MAGIC_CONTINUE);
|
||||
if (!cookie) {
|
||||
fprintf(stderr, "error opening libmagic\n");
|
||||
exit(1);
|
||||
}
|
||||
|
||||
if (magic_load(cookie, NULL)) {
|
||||
fprintf(stderr, "error during magic_load\n");
|
||||
magic_close(cookie);
|
||||
exit(1);
|
||||
}
|
||||
|
||||
for (i = 0; i < st.st_size; i++) {
|
||||
const char *desc;
|
||||
desc = magic_buffer(cookie, (const char *) mem + i, st.st_size - i);
|
||||
if (!desc) {
|
||||
break;
|
||||
}
|
||||
if (!strcmp(desc, "data")) {
|
||||
continue;
|
||||
}
|
||||
printf("%8.8llu: %s\n", (unsigned long long) i, desc);
|
||||
}
|
||||
|
||||
magic_close(cookie);
|
||||
exit(0);
|
||||
}
|
|
@ -0,0 +1,26 @@
|
|||
How about the following title:
|
||||
"Introduction to the architecture of the Linux kernel - a look beyond the
|
||||
syscall horizon of userspace processes"
|
||||
|
||||
Part 1: Theoretical foundations
|
||||
- kernel/userspace: responsibilities, boundaries, points of contact
|
||||
- Execution context: User, Syscall, Softirq, Hardirq, Kernelthread, Tasklet
|
||||
- The scheduler
|
||||
- Primitives: Spinlocks, rwlocks, Mutex, Waitqueues
|
||||
|
||||
Part 2: A closer look at selected subsystems
|
||||
- Network stack: from reception of the packet on the network card to
|
||||
its reception in the userspace process
|
||||
- Filesystem: from the read syscall to reading the disk and back
|
||||
|
||||
- responsibilities
|
||||
- virtual memory management
|
||||
|
||||
- process management
|
||||
- filesystem
|
||||
- networking
|
||||
- hardware abstraction
|
||||
- interprocess communication
|
||||
|
||||
- interfaces for userspace programs
|
||||
- syscalls
|
||||
-
|
|
@ -0,0 +1,300 @@
|
|||
%include "default.mgp"
|
||||
%default 1 bgrad
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
%nodefault
|
||||
%back "blue"
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
Architecture of the Linux kernel
|
||||
%size 5
|
||||
or: The world beyond the syscall barrier
|
||||
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <laforge@gnumonks.org>
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Architecture of the Linux kernel
|
||||
Prerequisites
|
||||
|
||||
Due to the technical nature of this presentation, the audience should be familiar with the following subjects
|
||||
|
||||
experience in programming on a Linux/*NIX system
|
||||
C language preferred
|
||||
general knowledge about computer hardware
|
||||
interrupts / IO / DMA
|
||||
general knowledge about modern CPU architecture
|
||||
address space / MMU
|
||||
'protected mode' / supervisor mode / ...
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Architecture of the Linux kernel
|
||||
Kernel / Userspace
|
||||
|
||||
OS kernel provides
|
||||
hardware abstraction (file I/O, network I/O, ...)
|
||||
resource allocation / limiting
|
||||
address separation
|
||||
privilege separation
|
||||
IPC
|
||||
|
||||
the traditional process model in *NIX operating systems
|
||||
processes reside in separate virtual address spaces
|
||||
kernel only executes one process (init) at bootup
|
||||
all other processes descend from init
|
||||
processes are scheduled and preempted by the kernel
|
||||
processes invoke system functions via syscalls.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Architecture of the Linux kernel
|
||||
System calls
|
||||
|
||||
Definition
|
||||
|
||||
a userspace process enters the kernel
|
||||
mechanism is CPU architecture dependent
|
||||
can be software interrupt (int 0x80)
|
||||
can be special asm instruction (sysenter)
|
||||
arguments are passed in registers (on i386: ebx, ecx, edx, esi, edi, ebp)
|
||||
common examples
|
||||
open/close/read/write
|
||||
exit/fork/execve/kill
|
||||
socketcall, implements (socket/bind/connect/listen)
|
||||
about 270 system calls in 2.6.x kernels
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Architecture of the Linux kernel
|
||||
Invocation of system call
|
||||
|
||||
chronological order of events in case of a system call
|
||||
|
||||
userspace process calls library function
|
||||
library function is executed within the process' address space
|
||||
library prepares the system call, loading the syscall number and arguments into registers
|
||||
library will issue syscall (int 0x80 / sysenter / ...)
|
||||
execution will switch to syscall context in kernel mode
|
||||
kernel will look up systemcall table and dispatch to respective function
|
||||
syscall function in the kernel will handle the syscall
|
||||
all data between kernel/userspace needs to be copied between address spaces
|
||||
|
||||
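The sequence of events above can be tried from userspace. This is a minimal sketch, assuming Linux with glibc, where the `syscall()` wrapper from `<unistd.h>` and the `SYS_getpid` constant from `<sys/syscall.h>` are available; `raw_getpid` is a hypothetical helper name.

```c
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/types.h>

/* Ask the kernel for our PID without using the getpid() library wrapper.
 * syscall() places the syscall number and arguments where the
 * architecture's syscall convention expects them, traps into the kernel
 * and hands back the kernel's return value. */
static pid_t raw_getpid(void)
{
    return (pid_t) syscall(SYS_getpid);
}
```

Calling `raw_getpid()` should yield the same value as `getpid()`; the library function is essentially a thin layer over this mechanism.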
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Architecture of the Linux kernel
|
||||
Execution contexts
|
||||
|
||||
apart from scheduling between different userspace processes, the kernel has different jobs like reacting to an external event
|
||||
|
||||
hardirq
|
||||
hardware interrupt line was triggered
|
||||
softirq
|
||||
the workhorse behind a hardirq
|
||||
userspace
|
||||
executing within userspace process
|
||||
syscall
|
||||
invoked by a system call from userspace
|
||||
vsyscall
|
||||
virtual system calls, executed in userspace context
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Architecture of the Linux kernel
|
||||
hardirq context
|
||||
|
||||
interrupt generated by hardware is received + handled
|
||||
can be interrupted by other hardirq's
|
||||
does only minimal job and returns
|
||||
examples
|
||||
packet has arrived on network board
|
||||
character was received on serial port
|
||||
dma read/write to disk drive has completed
|
||||
timer interrupt went off
|
||||
|
||||
in most cases, a hardirq is followed by softirq or tasklet.
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Architecture of the Linux kernel
|
||||
softirq context
|
||||
|
||||
softirqs are run after hardirq
|
||||
do the real work associated with a hardirq
|
||||
multithreaded (can run simultaneously on multiple cpus)
|
||||
examples
|
||||
network receive softirq
|
||||
timer softirq
|
||||
|
||||
prior to softirq's, linux had so-called 'bottom halves'
|
||||
softirq introduced in 2.4.x (net rx/tx softirq)
|
||||
bottom halves removed in 2.6.x
|
||||
difference: only one BH can be run at a time
|
||||
BH's have to be converted to tasklets in 2.6.x
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Architecture of the Linux kernel
|
||||
tasklets
|
||||
|
||||
tasklets are somewhere in between softirqs and bottom halves
|
||||
one particular tasklet cannot run on multiple CPUs simultaneously
|
||||
different tasklets can run on different CPUs simultaneously
|
||||
|
||||
otherwise, same as softirq context
|
||||
tasklets are impl. inside the 'tasklet softirq'
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Architecture of the Linux kernel
|
||||
syscall / userspace context
|
||||
|
||||
userspace context
|
||||
in userspace, executing a process
|
||||
|
||||
syscall context
|
||||
inside kernel, when userspace process issues syscall()
|
||||
|
||||
vsyscalls (virtual syscalls)
|
||||
first introduced with the x86-64 (AMD Opteron) arch
|
||||
fast read-only access to kernel data structures
|
||||
can do stuff like gettimeofday() without context switch
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Architecture of the Linux kernel
|
||||
synchronization
|
||||
|
||||
Due to reentrancy and SMP, synchronization issues arise:
|
||||
|
||||
simple case: UP system
|
||||
softirq can be interrupted by hardirq
|
||||
thus, shared structures (queues, ...) need to be protected
|
||||
complex case: SMP system
|
||||
softirq can run at the same time on multiple CPU's
|
||||
as softirqs are multithreaded, synchronization between threads has to be implemented
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Architecture of the Linux kernel
|
||||
synchronization primitives
|
||||
|
||||
busy-waiting locks
|
||||
|
||||
spinlocks
|
||||
if lock was not taken, take it and continue
|
||||
if lock was taken, busy-loop until it is free
|
||||
rwlocks
|
||||
special case of spinlocks
|
||||
useful when structure protected by lock is often read but rarely updated/written to
|
||||
allows either
|
||||
multiple readers simultaneously, or
|
||||
only one writer [and no readers]
|
||||
brlocks
|
||||
super-fast read/write locks, with write-side penalty
|
||||
avoid cache ping-pong in multi reader case
|
||||
only in kernel 2.4.x
|
||||
|
||||
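The take-or-spin behaviour of a spinlock can be sketched in portable C11. This is a userspace illustration built on `atomic_flag`, not the kernel's implementation; `struct spinlock`, `SPINLOCK_INIT`, `spin_lock`, `spin_trylock` and `spin_unlock` are names defined here for the example only.

```c
#include <stdatomic.h>

/* Userspace stand-in for a spinlock (illustrative, not kernel code). */
struct spinlock {
    atomic_flag locked;
};

#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* If the lock is free, take it and continue; otherwise busy-loop. */
static void spin_lock(struct spinlock *l)
{
    /* test_and_set returns the previous state: true means the lock
     * was already held by somebody else, so keep spinning. */
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
        ;
}

/* Try once; return 1 if the lock was taken, 0 if it was already held. */
static int spin_trylock(struct spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire);
}

static void spin_unlock(struct spinlock *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

Because a waiter burns CPU time while spinning, the protected section should be short and must never sleep.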
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Architecture of the Linux kernel
|
||||
synchronization primitives (cont'd)
|
||||
|
||||
sleeper locks
|
||||
|
||||
semaphores
|
||||
if semaphore can be acquired, continue
|
||||
if semaphore cannot be acquired, put current process to sleep
|
||||
once semaphore is available again, wakeup process
|
||||
|
||||
WARNING: may sleep, so can only be used in process (userspace/syscall) context, never in hardirq/softirq context
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Architecture of the Linux kernel
|
||||
new locking primitives in 2.6.x
|
||||
|
||||
seqlocks
|
||||
introduced with vsyscalls in 2.5/2.6
|
||||
reader/writer consistent mechanism without starving writers
|
||||
readers never block but may have to retry if write in progress
|
||||
|
||||
read copy update
|
||||
new lockless mechanism in kernel 2.5/2.6
|
||||
defers freeing of the old copy of a data structure until all CPU's have scheduled and thus nobody has any references left
|
||||
|
||||
|
||||
|
||||
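The retry behaviour of seqlock readers can be sketched with C11 atomics. This is a simplified userspace illustration, assuming a single writer (or writers serialised by some other lock) and using sequentially consistent atomics for clarity rather than speed; `struct seq_time`, `seq_write` and `seq_read` are hypothetical names.

```c
#include <stdatomic.h>

/* Two-field record guarded by a sequence counter (illustrative only). */
struct seq_time {
    atomic_uint seq;    /* even: stable, odd: write in progress */
    atomic_long sec;
    atomic_long usec;
};

/* Writer: make the counter odd, update the data, make it even again.
 * Assumes there is only one writer at a time. */
static void seq_write(struct seq_time *t, long sec, long usec)
{
    unsigned s = atomic_load(&t->seq);

    atomic_store(&t->seq, s + 1);
    atomic_store(&t->sec, sec);
    atomic_store(&t->usec, usec);
    atomic_store(&t->seq, s + 2);
}

/* Reader: never blocks, but retries if a write was in progress or
 * completed while the fields were being read. */
static void seq_read(struct seq_time *t, long *sec, long *usec)
{
    unsigned before, after;

    do {
        before = atomic_load(&t->seq);
        *sec = atomic_load(&t->sec);
        *usec = atomic_load(&t->usec);
        after = atomic_load(&t->seq);
    } while ((before & 1u) || before != after);
}
```

The writer never waits for readers, which is why this scheme cannot starve writers; the cost is moved to readers, who may have to loop.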
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Architecture of the Linux kernel
|
||||
example: incoming network packet
|
||||
|
||||
hardirq context
|
||||
NIC asserts its interrupt line after a packet was received
|
||||
kernel enters (arch/i386/kernel/entry.S:common_interrupt)
|
||||
core interrupt handler (arch/i386/kernel/irq.c:do_IRQ)
|
||||
hardirq handler of network driver (drivers/net/tulip/interrupt.c:tulip_interrupt)
|
||||
net/core/dev.c:netif_rx(): append skb to backlog queue
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Architecture of the Linux kernel
|
||||
example: incoming network packet
|
||||
|
||||
softirq context
|
||||
net/core/dev.c:net_rx_action()
|
||||
net/core/dev.c:process_backlog()
|
||||
net/core/dev.c:netif_receive_skb()
|
||||
net/core/dev.c:deliver_skb()
|
||||
net/ipv4/ip_input.c:ip_rcv()
|
||||
netfilter prerouting hook
|
||||
net/ipv4/ip_input.c:ip_rcv_finish()
|
||||
call routing code
|
||||
net/ipv4/ip_input.c:ip_local_deliver()
|
||||
netfilter localin hook
|
||||
net/ipv4/ip_input.c:ip_local_deliver_finish()
|
||||
call l4 protocol
|
||||
net/ipv4/udp.c:udp_rcv()
|
||||
lookup socket, if any
|
||||
include/net/sock.h:sock_queue_rcv_skb()
|
||||
enqueue into socket receiver queue
|
||||
net/core/sock.c:sock_def_readable()
|
||||
wake_up_interruptible() on socket waitqueue
|
||||
return from recv() via socketcall
|
||||
|
||||
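The split between the two contexts in this example (queue the packet in hardirq context, process it later in softirq context) can be modelled in a few lines. This is a single-threaded toy model, assuming integers in place of skbs and no locking at all; `struct backlog`, `fake_netif_rx`, `fake_process_backlog` and `count_delivery` are hypothetical names, not kernel functions.

```c
#include <stddef.h>

#define BACKLOG_SIZE 8          /* hypothetical queue depth */

struct backlog {
    int pkts[BACKLOG_SIZE];     /* stand-in for queued skbs */
    size_t head;                /* next slot to dequeue */
    size_t count;               /* number of queued entries */
};

/* "hardirq" part: do the minimum, just append the packet to the queue.
 * Returns 0 on success, -1 if the queue is full and the packet is dropped. */
static int fake_netif_rx(struct backlog *q, int pkt)
{
    if (q->count == BACKLOG_SIZE)
        return -1;
    q->pkts[(q->head + q->count) % BACKLOG_SIZE] = pkt;
    q->count++;
    return 0;
}

/* "softirq" part: drain the queue and do the real work for each packet.
 * Returns the number of packets processed. */
static int fake_process_backlog(struct backlog *q, void (*deliver)(int pkt))
{
    int n = 0;

    while (q->count > 0) {
        int pkt = q->pkts[q->head];

        q->head = (q->head + 1) % BACKLOG_SIZE;
        q->count--;
        deliver(pkt);
        n++;
    }
    return n;
}

/* Example delivery callback: just accumulate what was delivered. */
static long delivered_sum;

static void count_delivery(int pkt)
{
    delivered_sum += pkt;
}
```

In a real kernel the queue is touched from both contexts, so access to it has to be protected, for example by disabling interrupts around the dequeue; the model leaves that out on purpose.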
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Architecture of the Linux kernel
|
||||
example: reading of a file
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Architecture of the Linux kernel
|
||||
Thanks
|
||||
The slides and an accompanying paper of this presentation are available at http://www.gnumonks.org/
|
||||
|
||||
Thanks to
|
||||
the BBS people, Z-Netz, FIDO, ...
|
||||
for heavily increasing my computer usage in 1992
|
||||
KNF
|
||||
for bringing me in touch with the internet as early as 1994
|
||||
for providing a playground for technical people
|
||||
for telling me about the existence of Linux!
|
||||
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
|
||||
for implementing (one of?) the world's best TCP/IP stacks
|
||||
Paul 'Rusty' Russell
|
||||
for starting the netfilter/iptables project
|
||||
for trusting me to maintain it today
|
||||
Astaro AG
|
||||
for sponsoring parts of my netfilter work
|
||||
|
|
@ -0,0 +1,315 @@
|
|||
%include "default.mgp"
|
||||
%default 1 bgrad
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
%nodefault
|
||||
%back "blue"
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
Linux Kernel Architecture
|
||||
%size 5
|
||||
SMP issues, locking primitives
|
||||
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <laforge@gnumonks.org>
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Linux Kernel Architecture
|
||||
Prerequisites
|
||||
|
||||
Due to the technical nature of this presentation, the audience should be familiar with the following subjects
|
||||
|
||||
experience in programming on a Linux/*NIX system
|
||||
C language preferred
|
||||
general knowledge about computer hardware
|
||||
interrupts / IO / DMA
|
||||
general knowledge about modern CPU architecture
|
||||
address space / MMU
|
||||
'protected mode' / supervisor mode / ...
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Linux Kernel Architecture
|
||||
Kernel / Userspace
|
||||
|
||||
OS kernel provides
|
||||
hardware abstraction (file I/O, network I/O, ...)
|
||||
resource allocation / limiting
|
||||
address space separation
|
||||
privilege separation
|
||||
IPC
|
||||
|
||||
the traditional process model in *NIX operating systems
|
||||
processes reside in separate virtual address spaces
|
||||
kernel only executes one process (init) at bootup
|
||||
all other processes descend from init
|
||||
processes are scheduled and preempted by the kernel
|
||||
processes invoke system functions via syscalls.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Linux Kernel Architecture
|
||||
System calls
|
||||
|
||||
Definition
|
||||
|
||||
a userspace process enters the kernel
|
||||
mechanism is CPU architecture dependent
|
||||
can be software interrupt (int 0x80)
|
||||
can be special asm instruction (sysenter)
|
||||
arguments are passed in registers on i386 (ebx, ecx, edx, ...); the exact convention is architecture dependent
|
||||
common examples
|
||||
open/close/read/write
|
||||
exit/fork/execve/kill
|
||||
socketcall (multiplexes socket/bind/connect/listen/...)
|
||||
about 270 system calls in 2.6.x kernels
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Linux Kernel Architecture
|
||||
Invocation of system call
|
||||
|
||||
chronological order of events in case of a system call
|
||||
|
||||
userspace process calls library function
|
||||
library function is executed within the process' address space
|
||||
library will eventually prepare a system call, placing the arguments where the kernel expects them (registers on i386)
|
||||
library will issue syscall (int 0x80 / sysenter / ...)
|
||||
execution will switch to syscall context in kernel mode
|
||||
kernel will look up systemcall table and dispatch to respective function
|
||||
syscall function in the kernel will handle the syscall
|
||||
all data between kernel/userspace needs to be copied between address spaces
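The sequence above can be observed from userspace with a small sketch. It assumes a POSIX system and uses Python's `os.pipe`, `os.write` and `os.read`, which are thin wrappers around the corresponding library functions and system calls. The helper name `roundtrip_through_kernel` is made up for this illustration.

```python
import os


def roundtrip_through_kernel(payload: bytes) -> bytes:
    """Send bytes into the kernel and read them back out again.

    os.write() ends up in the write(2) system call: the kernel copies the
    payload from this process' address space into a pipe buffer.
    os.read() ends up in read(2): the kernel copies the data back out.
    """
    read_fd, write_fd = os.pipe()  # pipe(2): kernel allocates the buffer
    try:
        written = os.write(write_fd, payload)  # userspace -> kernel copy
        return os.read(read_fd, written)       # kernel -> userspace copy
    finally:
        os.close(read_fd)
        os.close(write_fd)


echoed = roundtrip_through_kernel(b"hello kernel")
print(echoed)
```

The payload is kept small on purpose, so the write fits into the pipe buffer and does not block.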
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Linux Kernel Architecture
|
||||
Execution contexts
|
||||
|
||||
apart from scheduling between different userspace processes, the kernel has different jobs like reacting to an external event
|
||||
|
||||
hardirq
|
||||
hardware interrupt line was triggered
|
||||
softirq
|
||||
the workhorse behind a hardirq
|
||||
userspace
|
||||
executing within userspace process
|
||||
syscall
|
||||
invoked by a system call from userspace
|
||||
vsyscall
|
||||
virtual system calls, executed in userspace context
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Linux Kernel Architecture
|
||||
hardirq context
|
||||
|
||||
interrupt generated by hardware is received + handled
|
||||
can be interrupted by other hardirq's
|
||||
does only minimal job and returns
|
||||
examples
|
||||
packet has arrived on network board
|
||||
character was received on serial port
|
||||
dma read/write to disk drive has completed
|
||||
timer interrupt went off
|
||||
|
||||
in most cases, a hardirq is followed by softirq or tasklet.
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Linux Kernel Architecture
|
||||
softirq context
|
||||
|
||||
softirqs are run after hardirq
|
||||
do the real work associated with a hardirq
|
||||
multithreaded (can run simultaneously on multiple cpus)
|
||||
examples
|
||||
network receive softirq
|
||||
timer softirq
|
||||
|
||||
prior to softirq's, linux had so-called 'bottom halves'
|
||||
softirq introduced in 2.4.x (net rx/tx softirq)
|
||||
bottom halves removed in 2.6.x
|
||||
difference: only one BH can be run at a time
|
||||
BH's have to be converted to tasklets in 2.6.x
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Linux Kernel Architecture
|
||||
tasklets
|
||||
|
||||
tasklets sit somewhere between softirqs and bottom halves
|
||||
one particular tasklet cannot run on multiple CPUs simultaneously
|
||||
different tasklets can run on different CPUs simultaneously
|
||||
|
||||
otherwise, same as softirq context
|
||||
tasklets are implemented inside the 'tasklet softirq'
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Linux Kernel Architecture
|
||||
syscall / userspace context
|
||||
|
||||
userspace context
|
||||
in userspace, executing a process
|
||||
|
||||
syscall context
|
||||
inside kernel, when userspace process issues syscall()
|
||||
|
||||
vsyscalls (virtual syscalls)
|
||||
first introduced with the x86-64 (AMD Opteron) arch
|
||||
fast read-only access to kernel data structures
|
||||
can do stuff like gettimeofday() without context switch
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Linux Kernel Architecture
|
||||
synchronization
|
||||
|
||||
Due to reentrancy and SMP, synchronization issues arise:
|
||||
|
||||
simple case: UP system
|
||||
softirq can be interrupted by hardirq
|
||||
thus, shared structures (queues, ...) need to be protected
|
||||
complex case: SMP system
|
||||
softirq can run at the same time on multiple CPU's
|
||||
as softirqs are multithreaded, synchronization between threads has to be implemented
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Linux Kernel Architecture
|
||||
synchronization primitives
|
||||
|
||||
busy-waiting locks
|
||||
|
||||
spinlocks
|
||||
if lock was not taken, take it and continue
|
||||
if lock was taken, busy-loop until it is free
|
||||
rwlocks
|
||||
special case of spinlocks
|
||||
useful when structure protected by lock is often read but rarely updated/written to
|
||||
allows either
|
||||
multiple readers simultaneously, or
|
||||
only one writer [and no readers]
|
||||
brlocks
|
||||
super-fast read/write locks, with write-side penalty
|
||||
avoid cache ping-pong in multi reader case
|
||||
only in kernel 2.4.x
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Linux Kernel Architecture
|
||||
synchronization primitives (cont'd)
|
||||
|
||||
sleeper locks
|
||||
|
||||
semaphores
|
||||
if semaphore can be acquired, continue
|
||||
if semaphore cannot be acquired, put current process to sleep
|
||||
once semaphore is available again, wakeup process
|
||||
|
||||
WARNING: can only be used in userspace/syscall (process) context, because acquiring a semaphore may sleep
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Linux Kernel Architecture
|
||||
new locking primitives in 2.6.x
|
||||
|
||||
seqlocks
|
||||
introduced with vsyscalls in 2.5/2.6
|
||||
reader/writer consistent mechanism without starving writers
|
||||
readers never block but may have to retry if write in progress
|
||||
|
||||
read copy update
|
||||
new lockless mechanism in kernel 2.5/2.6
|
||||
defers reclaiming the old version of a data structure until all CPUs have scheduled and thus nobody has any references left
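The seqlock idea from this slide (readers never block, but retry if a write was in progress) can be sketched as follows. This is a simplified model in Python, not the kernel implementation: the class `SeqLock`, the helper `read_consistent` and the toy clock are all hypothetical names, and a real seqlock additionally needs memory barriers that this sketch leaves out.

```python
import threading


class SeqLock:
    """Toy seqlock: the sequence counter is odd while a write is in progress."""

    def __init__(self):
        self._seq = 0
        self._writer = threading.Lock()  # writers still exclude each other

    def write_begin(self):
        self._writer.acquire()
        self._seq += 1  # counter becomes odd: write in progress

    def write_end(self):
        self._seq += 1  # counter becomes even again: data is stable
        self._writer.release()

    def read_begin(self):
        return self._seq

    def read_retry(self, start):
        # Retry if a write was running when we started or finished since.
        return (start & 1) != 0 or self._seq != start


def read_consistent(lock, read_fn):
    """Call read_fn until it ran without a concurrent write."""
    retries = 0
    while True:
        start = lock.read_begin()
        value = read_fn()
        if not lock.read_retry(start):
            return value, retries
        retries += 1


# Demo: a two-field clock, with a writer sneaking in during the first read.
clock = {"sec": 1, "usec": 999_999}
lock = SeqLock()
interrupted = []


def tick():
    lock.write_begin()
    clock["sec"] += 1
    clock["usec"] = 0
    lock.write_end()


def read_clock():
    sec = clock["sec"]
    if not interrupted:      # only disturb the very first attempt
        interrupted.append(True)
        tick()               # simulated writer between the two field reads
    usec = clock["usec"]
    return (sec, usec)


value, retries = read_consistent(lock, read_clock)
print(value, retries)
```

The first attempt reads a torn pair (old seconds, new microseconds), notices the changed sequence number, and retries. The writer was never delayed by the reader.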
|
||||
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Linux Kernel Architecture
|
||||
example: incoming network packet
|
||||
|
||||
hardirq context
|
||||
NIC asserts its interrupt line after a packet has been received
|
||||
kernel enters (arch/i386/kernel/entry.S:common_interrupt)
|
||||
core interrupt handler (arch/i386/kernel/irq.c:do_IRQ)
|
||||
hardirq handler of network driver (drivers/net/tulip/interrupt.c:tulip_interrupt)
|
||||
net/core/dev.c:netif_rx(): append skb to backlog queue
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Linux Kernel Architecture
|
||||
example: incoming network packet
|
||||
|
||||
softirq context
|
||||
net/core/dev.c:net_rx_action()
|
||||
net/core/dev.c:process_backlog()
|
||||
net/core/dev.c:netif_receive_skb()
|
||||
net/core/dev.c:deliver_skb()
|
||||
net/ipv4/ip_input.c:ip_rcv()
|
||||
netfilter prerouting hook
|
||||
net/ipv4/ip_input.c:ip_rcv_finish()
|
||||
call routing code
|
||||
net/ipv4/ip_input.c:ip_local_deliver()
|
||||
netfilter localin hook
|
||||
net/ipv4/ip_input.c:ip_local_deliver_finish()
|
||||
call l4 protocol
|
||||
net/ipv4/udp.c:udp_rcv()
|
||||
lookup socket, if any
|
||||
include/net/sock.h:sock_queue_rcv_skb()
|
||||
enqueue into socket receiver queue
|
||||
net/core/sock.c:sock_def_readable()
|
||||
wake_up_interruptible() on socket waitqueue
|
||||
return from recv() via socketcall
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Linux Kernel Architecture
|
||||
Cache Effects
|
||||
|
||||
SMP systems have multiple CPU's
|
||||
Every CPU has its own cache(s) / cache hierarchies
|
||||
Most modern CPU archs are cache coherent in hardware
|
||||
This means a certain chunk of memory can only be write-cached on one CPU at a given time
|
||||
Frequently updated data structures will ping-pong between CPU caches
|
||||
Data structures have to be organized to avoid cache issues
|
||||
Cacheline alignment
|
||||
very easy by using SLAB_HWCACHE_ALIGN
|
||||
per-cpu data structures
|
||||
e.g. packet counters: have one for every CPU
|
||||
structure layout
|
||||
put all writeable/updated members together
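The per-CPU counter idea mentioned above can be sketched like this. It only models the access pattern (each CPU writes its own slot, a reader sums all slots); a Python list does not place the slots in separate cache lines. `PerCpuCounter` and its methods are hypothetical names for this illustration.

```python
class PerCpuCounter:
    """One counter slot per CPU: writers never touch another CPU's slot."""

    def __init__(self, num_cpus):
        self._slots = [0] * num_cpus

    def inc(self, cpu, amount=1):
        # Hot path: only the local slot is written, so in a real kernel
        # the cache line holding it stays on the local CPU.
        self._slots[cpu] += amount

    def read(self):
        # Slow path: summing all slots is more expensive, but rare.
        return sum(self._slots)


packets = PerCpuCounter(num_cpus=4)
for cpu, count in enumerate([10, 20, 30, 40]):
    for _ in range(count):
        packets.inc(cpu)

total = packets.read()
print(total)
```

The trade-off is a cheap, contention-free update against a more expensive read, which suits statistics such as packet counters.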
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Thanks
|
||||
The slides and the accompanying paper of this presentation are available at http://www.gnumonks.org/
|
||||
|
||||
Thanks to
|
||||
the BBS people, Z-Netz, FIDO, ...
|
||||
for heavily increasing my computer usage in 1992
|
||||
KNF
|
||||
for bringing me in touch with the internet as early as 1994
|
||||
for providing a playground for technical people
|
||||
for telling me about the existence of Linux!
|
||||
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
|
||||
for implementing (one of?) the world's best TCP/IP stacks
|
||||
Paul 'Rusty' Russell
|
||||
for starting the netfilter/iptables project
|
||||
for trusting me to maintain it today
|
||||
Astaro AG
|
||||
for sponsoring parts of my netfilter work
|
||||
linux-bangalore
|
||||
for sponsoring my trip to this conference
|
||||
|
|
@ -0,0 +1,71 @@
|
|||
- rule loadtime performance
|
||||
- loading 10k rules in 1k chains takes 4 min 30 s on a P3-733
|
||||
- 27seconds in kernelspace: mark_source_chains()
|
||||
- reimplementation finished, needs more testing
|
||||
- 4 minutes in userspace: Two n^2 complexity functions
|
||||
- one of them could be removed in old chain_cache framework
|
||||
- other function needs reimplementation (underway)
|
||||
- ctnetlink still under development, used by a couple of large sites
|
||||
- pkt_tables to be merged later in 2.6.x
|
||||
- change to linked lists of rules in linked lists of chains
|
||||
- use netlink-based kernel/userspace interface
|
||||
- iptables2/pkttables userspace
|
||||
- libnfnetlink / libpkttnetlink as low-layer interface
|
||||
- move all iptables functionality into libpkttables
|
||||
- libpkttables provides query-interface
|
||||
- what matches/targets does this system support?
|
||||
- what parameters does match 'foo' support?
|
||||
- what values are acceptable for param 'bar' of match 'foo'?
|
||||
- what is the help message for param 'bar' of match 'foo'?
|
||||
- nf-hipac as high-performance alternative to iptables
|
||||
- very complex multi-dimensional tree structure
|
||||
- 530kilobyte patch, 180k kernel module
|
||||
- algorithm well-proven and regression-tested in userspace
|
||||
- scales really well even with 100k rules
|
||||
- now supports all iptables matches/targets
|
||||
- cannot replace iptables because
|
||||
- large footprint
|
||||
- high memory usage
|
||||
- most likely to be integrated after pkt_tables / pkttnetlink merge
|
||||
- Session logging
|
||||
- different implementations (SLOG one of them)
|
||||
- best solution: ctnetlink event API
|
||||
- problem: per-connection byte/packet counters in conntrack are a performance hit
|
||||
- ipv6 connection tracking
|
||||
- usagi people are working on this
|
||||
- non-linear skb support (removal of skb_linearize())
|
||||
- thanks to rusty, 2.5.x/2.6.x now has support
|
||||
- changes in almost any netfilter/iptables API :(
|
||||
- stateful failover / state synchronization
|
||||
- no sponsor yet, but most likely in Q4/2003
|
||||
- conntrack optimization
|
||||
- new hashing algorithm in 2.4.21, should improve significantly
|
||||
- locking optimization
|
||||
- don't use timer per conntrack, but an expiration kernel thread
|
||||
- TRACE target / raw table
|
||||
- experimental patch in patch-o-matic
|
||||
- enables tracing of packet through ruleset
|
||||
- netfilter workshop, August 2003, Budapest, Hungary
|
||||
- about 20 people will attend
|
||||
- sponsored by Astaro Inc and KFKI Research Institute
|
||||
- open to the public, registration needed
|
||||
- we need more community
|
||||
- developer diaries on netfilter homepage?
|
||||
- wiki or similar tool ?
|
||||
- announcement of IRC channel(s) on website
|
||||
- patch-o-matic 2.6.x future?
|
||||
- I will only maintain patch-o-matic for 2.6.x
|
||||
- maybe somebody wants to backport patches?
|
||||
- maybe an official 2.4.x maintainer?
|
||||
- development of testing tools
|
||||
- simple packet generator not suitable for stateful filtering
|
||||
- even simple packet generators are very expensive
|
||||
- connection generator
|
||||
- user can specify profile of a connection
|
||||
- e.g. HTTP: TCP, 500 bytes one direction, 10k other
|
||||
- user can specify quantity and distribution
|
||||
- i.e. 10k 'HTTP', from random source to single dest.
|
||||
- first implementation will be userspace-only, may change later
|
||||
- work will start in September/October, I'll post an RFC
|
||||
- deprecate ipfwadm
|
|
@ -0,0 +1,368 @@
|
|||
%include "default.mgp"
|
||||
%default 1 bgrad
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
%nodefault
|
||||
%back "blue"
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
The future of Linux packet filtering
|
||||
targeted for kernel 2.6 and beyond
|
||||
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <laforge@gnumonks.org>
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Contents
|
||||
|
||||
|
||||
Problems with current 2.4.x netfilter/iptables
|
||||
Solution to code replication
|
||||
Solution for dynamic rulesets
|
||||
Solution for API to GUI's and other management programs
|
||||
|
||||
HA for stateful firewalling
|
||||
What's special about firewalling HA
|
||||
Poor man's failover
|
||||
Real state replication
|
||||
|
||||
Other current work
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Problems with 2.4.x netfilter/iptables
|
||||
|
||||
code replication between iptables/ip6tables/arptables
|
||||
iptables was never meant for other protocols, but people did copy+paste 'ports'
|
||||
replication of
|
||||
core kernel code
|
||||
layer 3 independent matches (mac, interface, ...)
|
||||
userspace library (libiptc)
|
||||
userspace tool (iptables)
|
||||
userspace plugins (libipt_xxx.so)
|
||||
doesn't suit the needs for dynamically changing rulesets
|
||||
dynamic rulesets are becoming more common (service selection, IDS)
|
||||
a whole table is created in userspace and sent as blob to kernel
|
||||
for every ruleset change the whole table needs to be copied to userspace and back
|
||||
inside kernel consistency checks on whole table, loop detection
|
||||
too extensible for writing any forward-compatible GUI
|
||||
new extensions showing up all the time
|
||||
a frontend would need to know about the options and use of a new extension
|
||||
thus frontends are always incomplete and out-of-date
|
||||
no high-level API other than piping to iptables-restore
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Reducing code replication
|
||||
|
||||
code replication is a real problem: unclean, bugfixes missed
|
||||
we need layer 3 independent layer for
|
||||
submitting rules to the kernel
|
||||
traversing packet-rulesets supporting match/target modules
|
||||
registering matches/targets
|
||||
layer 3 specific (like matching ipv4 address)
|
||||
layer 3 independent (like matching MAC address)
|
||||
|
||||
solution
|
||||
pkt_tables inside kernel
|
||||
pkt_tables_ipv4 registers layer 3 handler with pkt_tables
|
||||
pkt_tables_ipv6 registers layer 3 handler with pkt_tables
|
||||
everybody registering a pkt_table (like iptable_filter) needs to specify the l3 protocol
|
||||
libraries in userspace (see later)
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Supporting dynamic rulesets
|
||||
|
||||
atomic table-replacement turned out to be a bad idea
|
||||
need new interface for sending individual rules to kernel
|
||||
policy routing has the same problem and good solution: rtnetlink
|
||||
solution: nfnetlink
|
||||
multicast-netlink based packet-oriented socket between kernel and userspace
|
||||
has extra benefit that other userspace processes get notified of rule changes [just like routing daemons]
|
||||
nfnetlink is a low-layer below all kernel/userspace communication
|
||||
pkttnetlink [aka iptnetlink]
|
||||
ctnetlink
|
||||
ulog
|
||||
ip_queue
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Communication with other programs
|
||||
|
||||
whole set of libraries
|
||||
libnfnetlink for low-layer communication
|
||||
libpkttnetlink for rule modifications
|
||||
will handle all plugins [which are currently part of iptables]
|
||||
query functions about available matches/targets
|
||||
query functions about parameters
|
||||
query functions for help messages about specific match/parameter of a match
|
||||
generic structure from which rules can be built
|
||||
conversion functions to parse generic structure into in-kernel structure
|
||||
conversion functions to parse kernel structure into generic structure
|
||||
functions to convert generic structure in plain text
|
||||
libipq will stay API-compatible to current version
|
||||
libipulog will stay API-compatible to current version
|
||||
libiptc will go away [compatibility layer extremely difficult]
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Introduction
|
||||
|
||||
What is special about firewall failover?
|
||||
|
||||
Nothing, in case of the stateless packet filter
|
||||
Common IP takeover solutions can be used
|
||||
VRRP
|
||||
Heartbeat
|
||||
|
||||
Distribution of packet filtering ruleset no problem
|
||||
can be done manually
|
||||
or implemented with simple userspace process
|
||||
|
||||
Problems arise with stateful packet filters
|
||||
Connection state only on active node
|
||||
NAT mappings only on active node
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Connection tracking...
|
||||
implemented separately from NAT
|
||||
enables stateful filtering
|
||||
implementation
|
||||
hooks into NF_IP_PRE_ROUTING to track packets
|
||||
hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
|
||||
protocol modules (currently TCP/UDP/ICMP)
|
||||
application helpers currently (FTP,IRC,H.323,talk,SNMP)
|
||||
divides packets in the following four categories
|
||||
NEW - would establish new connection
|
||||
ESTABLISHED - part of already established connection
|
||||
RELATED - is related to established connection
|
||||
INVALID - (multicast, errors...)
|
||||
does _NOT_ filter packets itself
|
||||
can be utilized by iptables using the 'state' match
|
||||
is used by NAT Subsystem
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Common structures
|
||||
struct ip_conntrack_tuple, representing unidirectional flow
|
||||
layer 3 src + dst
|
||||
layer 4 protocol
|
||||
layer 4 src + dst
|
||||
|
||||
|
||||
connections represented as struct ip_conntrack
|
||||
original tuple
|
||||
reply tuple
|
||||
timeout
|
||||
l4 state private data
|
||||
app helper
|
||||
app helper private data
|
||||
expected connections
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Flow of events for new packet
|
||||
packet enters NF_IP_PRE_ROUTING
|
||||
tuple is derived from packet
|
||||
lookup conntrack hash table with hash(tuple) -> fails
|
||||
new ip_conntrack is allocated
|
||||
fill in original and reply == inverted(original) tuple
|
||||
initialize timer
|
||||
assign app helper if applicable
|
||||
see if we've been expected -> fails
|
||||
call layer 4 helper 'new' function
|
||||
|
||||
...
|
||||
|
||||
packet enters NF_IP_POST_ROUTING
|
||||
do hashtable lookup for packet -> fails
|
||||
place struct ip_conntrack in hashtable
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Flow of events for packet part of existing connection
|
||||
packet enters NF_IP_PRE_ROUTING
|
||||
tuple is derived from packet
|
||||
lookup conntrack hash table with hash(tuple)
|
||||
associate conntrack entry with skb->nfct
|
||||
call l4 protocol helper 'packet' function
|
||||
do l4 state tracking
|
||||
update timeouts as needed [i.e. TCP TIME_WAIT,...]
|
||||
|
||||
...
|
||||
|
||||
packet enters NF_IP_POST_ROUTING
|
||||
do hashtable lookup for packet -> succeeds
|
||||
do nothing else
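The two flows (first packet of a new connection, packet of an existing connection) can be modelled in a few lines. This is a heavily simplified sketch in Python rather than kernel C: `FlowTuple`, `Conntrack`, `ConntrackTable` and the method names are hypothetical, and timeouts, layer 4 state tracking, helpers, expectations and NAT are left out. Any packet that matches a table entry is simply reported as ESTABLISHED.

```python
from dataclasses import dataclass
from typing import Dict, NamedTuple


class FlowTuple(NamedTuple):
    """Unidirectional flow: layer 3 src/dst, layer 4 protocol and src/dst."""
    src_ip: str
    dst_ip: str
    proto: str
    src_port: int
    dst_port: int


def invert(t: FlowTuple) -> FlowTuple:
    """The reply direction is the original tuple with src and dst swapped."""
    return FlowTuple(t.dst_ip, t.src_ip, t.proto, t.dst_port, t.src_port)


@dataclass
class Conntrack:
    original: FlowTuple
    reply: FlowTuple
    confirmed: bool = False


class ConntrackTable:
    def __init__(self):
        self._hash: Dict[FlowTuple, Conntrack] = {}

    def pre_routing(self, t: FlowTuple):
        """Look the tuple up; allocate an unconfirmed entry on a miss."""
        ct = self._hash.get(t)
        if ct is not None:
            return ct, "ESTABLISHED"
        return Conntrack(original=t, reply=invert(t)), "NEW"

    def post_routing(self, ct: Conntrack):
        """Packet survived filtering: place a new entry in the hash table."""
        if ct.original not in self._hash:
            self._hash[ct.original] = ct  # findable by original direction
            self._hash[ct.reply] = ct     # ... and by reply direction
            ct.confirmed = True


table = ConntrackTable()
request = FlowTuple("10.0.0.1", "192.0.2.7", "udp", 1024, 53)

ct, state_first = table.pre_routing(request)   # lookup fails -> NEW
table.post_routing(ct)                         # entry enters the table
_, state_reply = table.pre_routing(invert(request))  # lookup succeeds
print(state_first, state_reply)
```

Inserting the entry only at the post-routing step mirrors the point made on the slides: a connection whose first packet is dropped by the filter never shows up in the hash table.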
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Poor man's failover
|
||||
|
||||
principle
|
||||
every node does its own tracking, no state replication
|
||||
two possible implementations
|
||||
connect every node to shared media (i.e. real ethernet)
|
||||
forwarding only turned on on active node
|
||||
slave nodes use promiscuous mode to sniff packets
|
||||
copy all traffic to slave nodes
|
||||
active master needs to copy all traffic to other nodes
|
||||
disadvantage: high load, sync traffic == payload traffic
|
||||
advantages
|
||||
very easy implementation
|
||||
only addition of sniffing mode to conntrack needed
|
||||
existing means of address takeover can be used
|
||||
same load on active master and slave nodes
|
||||
disadvantages
|
||||
can only be used with real shared media (no switches, ...)
|
||||
can not be used with NAT
|
||||
remaining problem
|
||||
no initial state sync after reboot of slave node!
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Real state replication
|
||||
|
||||
Parts needed
|
||||
state replication protocol
|
||||
multicast based
|
||||
sequence numbers for detection of packet loss
|
||||
NACK-based retransmission
|
||||
no security, since private ethernet segment to be used
|
||||
event interface on active node
|
||||
calling out to callback function at all state changes
|
||||
exported interface to manipulate conntrack hash table
|
||||
kernel thread for sending conntrack state protocol messages
|
||||
registers with event interface
|
||||
creates and accumulates state replication packets
|
||||
sends them via in-kernel sockets api
|
||||
kernel thread for receiving conntrack state replication messages
|
||||
receives state replication packets via in-kernel sockets
|
||||
uses conntrack hashtable manipulation interface
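How sequence numbers and NACK-based retransmission fit together on the receiving side can be sketched as follows. This is an illustration under simple assumptions (sequence numbers start at 0, out-of-order messages are dropped and requested again), not the protocol proposed here; `ReplicationReceiver` and its attributes are hypothetical names.

```python
class ReplicationReceiver:
    """Receiver side of a sequence-numbered, NACK-based replication stream."""

    def __init__(self):
        self.expected_seq = 0
        self.applied = []  # payloads applied in order
        self.nacks = []    # (first_missing, last_missing) ranges requested

    def receive(self, seq, payload):
        if seq < self.expected_seq:
            return  # duplicate of something already applied: ignore
        if seq > self.expected_seq:
            # Gap detected: at least one message was lost. Ask the sender
            # to retransmit the missing range and drop this message.
            self.nacks.append((self.expected_seq, seq - 1))
            return
        self.applied.append(payload)
        self.expected_seq += 1


rx = ReplicationReceiver()
rx.receive(0, "conn A: NEW")
rx.receive(2, "conn B: NEW")          # message 1 was lost on the wire
rx.receive(1, "conn A: ESTABLISHED")  # retransmission after the NACK
rx.receive(2, "conn B: NEW")
print(rx.nacks, rx.applied)
```

With NACKs the receiver stays silent while nothing is lost, so the sync traffic in the common case is only the state updates themselves.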
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Real state replication
|
||||
|
||||
Flow of events in chronological order:
|
||||
on active node, inside the network RX softirq
|
||||
connection tracking code is analyzing a forwarded packet
|
||||
connection tracking gathers some new state information
|
||||
connection tracking updates local connection tracking database
|
||||
connection tracking sends event message to event API
|
||||
on active node, inside the conntrack-sync kernel thread
|
||||
conntrack sync daemon receives event through event API
|
||||
conntrack sync daemon aggregates multiple event messages into a state replication protocol message, removing possible redundancy
|
||||
conntrack sync daemon generates state replication protocol message
|
||||
conntrack sync daemon sends state replication protocol message
|
||||
on slave node(s), inside network RX softirq
|
||||
connection tracking code ignores packets coming from the interface attached to the private conntrack sync network
|
||||
state replication protocol message is appended to socket receive queue of conntrack-sync kernel thread
|
||||
on slave node(s), inside conntrack-sync kernel thread
|
||||
conntrack sync daemon receives state replication message
|
||||
conntrack sync daemon creates/updates conntrack entry
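The aggregation step on the active node (several events folded into one replication message, with redundancy removed) can be sketched as below. It assumes that only the latest state per connection matters to the slave; the function name `aggregate` and the event format are made up for this illustration.

```python
def aggregate(events):
    """Collapse a batch of (connection_id, state) events.

    Only the most recent state of each connection is kept, ordered by
    the time of that last update.
    """
    latest = {}
    for conn_id, state in events:
        latest.pop(conn_id, None)  # forget the older state, if any
        latest[conn_id] = state    # re-insert so ordering follows last update
    return list(latest.items())


batch = [
    ("conn-1", "SYN_SENT"),
    ("conn-2", "SYN_SENT"),
    ("conn-1", "ESTABLISHED"),  # supersedes the first event for conn-1
]
message = aggregate(batch)
print(message)
```

Three events become two entries in the outgoing message, which is the kind of saving that matters when every state change on a busy firewall would otherwise cost one packet.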
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Necessary changes to kernel
|
||||
|
||||
Necessary changes to current conntrack core
|
||||
|
||||
event generation (callback functions) for all state changes
|
||||
|
||||
conntrack hashtable manipulation API
|
||||
is needed (and already implemented) for 'ctnetlink' API
|
||||
|
||||
conntrack exemptions
|
||||
needed to _not_ track conntrack state replication packets
|
||||
is needed for other cases as well
|
||||
currently being developed by Jozsef Kadlecsik
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Other current work
|
||||
|
||||
conntrack hash function optimization
|
||||
current hash function not good for even hash bucket count
|
||||
other hash functions in development
|
||||
hash function evaluation tool [cttest] available
|
||||
introduce per-system randomness to prevent hash attack
|
||||
conntrack code optimization (locking/timers/...)
|
||||
conntrack exemptions
|
||||
not usable when NAT is active
|
||||
SLOG (session log)
|
||||
maybe netflow compatible logs?
|
||||
getting our work submitted into the mainstream kernel
|
||||
turns out to be more difficult than expected
|
||||
newnat has finally made it into 2.4.19
|
||||
discussions about multiple targets/actions per rule
|
||||
technical implementation easy
|
||||
however, not everybody convinced that it fits into the concept
|
||||
using tc for firewalling
|
||||
Jamal Hadi Salim uses iptables targets from within TC
|
||||
leads to discussion of generic classification engine API in kernel
|
||||
netfilter for MPLS
|
||||
implementation of mpls-ping-draft as netfilter module
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Thanks
|
||||
The slides and the accompanying paper of this presentation are available at http://www.gnumonks.org/
|
||||
The netfilter homepage http://www.netfilter.org/
|
||||
Thanks to
|
||||
the BBS people, Z-Netz, FIDO, ...
|
||||
for heavily increasing my computer usage in 1992
|
||||
KNF (http://www.franken.de/)
|
||||
for bringing me in touch with the internet as early as 1994
|
||||
for providing a playground for technical people
|
||||
for telling me about the existence of Linux!
|
||||
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
|
||||
for implementing (one of?) the world's best TCP/IP stacks
|
||||
Paul 'Rusty' Russell
|
||||
for starting the netfilter/iptables project
|
||||
for trusting me to maintain it today
|
||||
Astaro AG
|
||||
for sponsoring parts of my netfilter work
|
||||
for sponsoring my flight ticket to this conference
|
||||
|
|
@ -0,0 +1,12 @@
|
|||
The netfilter/iptables system is about three years old. With Linux kernel 2.4.x being deployed widely during the last two years, lots of systems worldwide are using netfilter/iptables as their packet filtering subsystem.
|
||||
|
||||
netfilter/iptables is no doubt a big improvement over the old ipchains system in the 2.2.x kernels. However, as with any project that has been widely deployed for some time, we are starting to discover aspects that can be implemented more cleanly and more efficiently.
|
||||
|
||||
The constant innovation and development of new applications and protocols (like SIP) on the internet also raise new requirements towards the linux packet filter.
|
||||
|
||||
So the question is: Is it time for yet another generation of the linux packet filtering subsystem? Will the tradition of change (ipfwadm->ipchains->iptables->?) be continued? Or can we integrate all necessary changes within the current framework?
|
||||
|
||||
The presentation will cover a summary of the problems with the current netfilter/iptables implementation and describe the proposed solutions.
|
||||
|
||||
Intended Audience: System and Network Administrators
|
||||
Prerequisites: Knowledge about Packet Filters. Usage of iptables.
|
|
@ -0,0 +1,22 @@
|
|||
<a href="http://www.gnumonks.org/users/laforge/">Harald Welte</a> is one
|
||||
of the five <a href="http://www.netfilter.org/">netfilter/iptables</a> core
|
||||
team members, and the current Linux 2.4.x firewalling maintainer.
|
||||
|
||||
His main interest in computing has always been networking. In the little time
|
||||
left besides netfilter/iptables related work, he's writing obscure documents
|
||||
like the <a href="http://www.gnumonks.org/ftp/pub/doc/uucp-over-ssl.html"> UUCP
|
||||
over SSL HOWTO</a>. Other kernel-related projects he has been contributing to are
|
||||
user mode linux and the international (crypto) kernel patch.
|
||||
|
||||
In the past he has been working as an independent IT consultant on
|
||||
closed-source projects for various companies ranging from banks to
|
||||
manufacturers of networking gear. During the year 2001 he was living in
|
||||
Curitiba (Brazil), where he got sponsored for his Linux related work by
|
||||
<a href="http://www.conectiva.com/">Conectiva Inc.</a>.
|
||||
|
||||
Starting with February 2002, Harald has been contracted part-time by
|
||||
<a href="http://www.astaro.com/">Astaro AG</a>, who are sponsoring him for his
|
||||
current netfilter/iptables work.
|
||||
|
||||
Harald is living in Berlin, Germany.
|
||||
|
|
@ -0,0 +1,19 @@
|
|||
- pkttables
|
||||
- linked lists instead of blob
|
||||
- explain current situation
|
||||
- dynamic rulesets are slow with iptables
|
||||
- independent of layer 3 protocol
|
||||
- current code duplication between [ip|ip6|arp]tables
|
||||
- some matches (mac, interface, ...) are independent anyway
|
||||
- nfnetlink
|
||||
- idea
|
||||
- ctnetlink
|
||||
- iptnetlink / pkttnetlink
|
||||
- ulog/queue port to it
|
||||
- libnfnetlink, libctnetlink, libpkttnetlink
|
||||
- libiptables / libpkttnetlink
|
||||
- high-level API for rule-manipulation
|
||||
- covering all the plugins which are currently part of iptables
|
||||
|
||||
- failover / load balancing for stateful firewalls
|
||||
- slides from OLS
|
|
@ -0,0 +1,299 @@
|
|||
%include "default.mgp"
|
||||
%default 1 bgrad
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
%nodefault
|
||||
%back "blue"
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
The future of Linux packet filtering
|
||||
|
||||
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <laforge@netfilter.org>
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Contents
|
||||
|
||||
|
||||
Problems with current 2.4/2.5 netfilter/iptables
|
||||
Solution to code replication
|
||||
Solution for dynamic rulesets
|
||||
Solution for API to GUIs and other management programs
|
||||
|
||||
Other current work
|
||||
Optimizing Rule load time of large rulesets
|
||||
Making netfilter/iptables compatible with zerocopy TCP
|
||||
|
||||
HA for stateful firewalling
|
||||
What's special about firewalling HA
|
||||
Poor man's failover
|
||||
Real state replication
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Problems with 2.4.x netfilter/iptables
|
||||
|
||||
code replication between iptables/ip6tables/arptables
|
||||
iptables not meant for other protocols, but people did copy+paste 'ports'
|
||||
replication of
|
||||
core kernel code
|
||||
layer 3 independent matches (mac, interface, ...)
|
||||
userspace library (libiptc)
|
||||
userspace tool (iptables)
|
||||
userspace plugins (libipt_xxx.so)
|
||||
doesn't suit the needs for dynamically changing rulesets
|
||||
dynamic rulesets becoming more common (service selection, IDS)
|
||||
a whole table is created in userspace and sent as blob to kernel
|
||||
for every ruleset change the table needs to be copied to userspace and back
|
||||
inside kernel consistency checks on whole table, loop detection
|
||||
too extensible for writing any forward-compatible GUI
|
||||
new extensions showing up all the time
|
||||
a frontend would need to know about the options and use of a new extension
|
||||
thus frontends are always incomplete and out-of-date
|
||||
no high-level API other than piping to iptables-restore
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Reducing code replication
|
||||
|
||||
code replication is a real problem: unclean, bugfixes missed
|
||||
we need layer 3 independent layer for
|
||||
submitting rules to the kernel
|
||||
traversing packet-rulesets supporting match/target modules
|
||||
registering matches/targets
|
||||
layer 3 specific (like matching ipv4 address)
|
||||
layer 3 independent (like matching MAC address)
|
||||
solution
|
||||
pkt_tables inside kernel
|
||||
pkt_tables_ipv4 registers layer 3 handler with pkt_tables
|
||||
pkt_tables_ipv6 registers layer 3 handler with pkt_tables
|
||||
everybody registering a pkt_table (like iptable_filter) needs to specify the l3 protocol
|
||||
libraries in userspace (see later)
|
||||
|
||||
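The registration scheme on this slide can be sketched in a few lines. This is only an illustration in Python, not kernel code: the names `register_l3_handler`, `register_table`, `L3_HANDLERS` and `TABLES` are hypothetical, and the handler is an opaque object. The point it shows is that a table can only be registered for a layer 3 protocol that already has a handler.

```python
# Hypothetical registries, for illustration only.
L3_HANDLERS = {}   # layer 3 protocol name -> handler object
TABLES = {}        # (layer 3 protocol, table name) -> list of rules


def register_l3_handler(proto, handler):
    """Called once per layer 3 module (e.g. an ipv4 or ipv6 module)."""
    L3_HANDLERS[proto] = handler


def register_table(proto, name):
    """Register a table; the caller must name its layer 3 protocol."""
    if proto not in L3_HANDLERS:
        raise ValueError(f"no layer 3 handler registered for {proto}")
    TABLES[(proto, name)] = []
    return TABLES[(proto, name)]
```

With this shape, layer 3 independent code (rule traversal, generic matches) only ever looks at the registries, and protocol-specific code lives behind the handler.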
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Supporting dynamic rulesets
|
||||
|
||||
atomic table-replacement turned out to be bad idea
|
||||
need new interface for sending individual rules to kernel
|
||||
policy routing has the same problem and good solution: rtnetlink
|
||||
solution: nfnetlink
|
||||
multicast-netlink based packet-oriented socket between kernel and userspace
|
||||
has extra benefit that other userspace processes get notified of rule changes [just like routing daemons]
|
||||
nfnetlink will be low-layer below all kernel/userspace communication
|
||||
pkttnetlink [aka iptnetlink]
|
||||
ctnetlink
|
||||
ulog
|
||||
ip_queue
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Communication with other programs
|
||||
|
||||
libnfnetlink for low-layer communication
|
||||
libpkttnetlink for rule modifications
|
||||
will handle all plugins [which are currently part of iptables]
|
||||
query functions about available matches/targets
|
||||
query functions about parameters
|
||||
query functions for help messages about specific match/parameter of a match
|
||||
generic structure from which rules can be built
|
||||
conversion functions to parse generic structure into in-kernel structure
|
||||
conversion functions to parse kernel structure into generic structure
|
||||
functions to convert generic structure into plain text
|
||||
libipq will stay API-compatible to current version
|
||||
libipulog will stay API-compatible to current version
|
||||
libiptc will go away [compatibility layer extremely difficult]
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Optimizing rule load time
|
||||
|
||||
Current situation
|
||||
loading 10,000 rules in 1,000 chains takes about 4 minutes on a PIII 733 MHz
|
||||
this is caused by two bottlenecks
|
||||
loop detection algorithm on kernel side inefficient
|
||||
a couple of O(n^2) complexity functions in libiptc
|
||||
|
||||
Solution
|
||||
efficient loop detection and mark_source_chains() algorithm (graph coloring)
|
||||
current CVS libiptc with only one O(n^2) function: 2 min 37 s
|
||||
whole reimplementation of libiptc needed for removing the last O(n^2) function
|
||||
|
||||
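The graph-coloring idea behind the loop detection can be sketched as follows. This is a minimal illustration in Python, not the kernel's mark_source_chains() code: the ruleset is modelled as a hypothetical dict mapping each chain name to the chains its rules jump to, and `has_loop` is an invented name. A three-colour depth-first search reports a loop as soon as a jump leads back into a chain that is still on the current path. Every chain and every jump is looked at once, so the cost grows linearly with the ruleset.

```python
# Colours: unvisited, on the current path, fully checked.
WHITE, GREY, BLACK = 0, 1, 2


def has_loop(jumps):
    """Return True if the chain jump graph contains a cycle.

    jumps: dict mapping chain name -> iterable of target chain names.
    """
    colour = {}
    for root in jumps:
        if colour.get(root, WHITE) != WHITE:
            continue
        colour[root] = GREY
        # Explicit stack instead of recursion, so deep rulesets are fine.
        stack = [(root, iter(jumps.get(root, ())))]
        while stack:
            chain, targets = stack[-1]
            for target in targets:
                state = colour.get(target, WHITE)
                if state == GREY:      # jumped back into the current path
                    return True
                if state == WHITE:
                    colour[target] = GREY
                    stack.append((target, iter(jumps.get(target, ()))))
                    break
            else:
                colour[chain] = BLACK  # all jumps checked, never revisit
                stack.pop()
    return False
```

A chain coloured BLACK is never walked again, which is what avoids re-checking shared sub-chains for every rule that jumps to them.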
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Optimizing the connection tracking code
|
||||
|
||||
Conntrack hash function optimization
|
||||
old hash function not good for even hash bucket count
|
||||
hash function evaluation tool [cttest] available
|
||||
other hash functions in development (already in 2.4.21)
|
||||
introduce per-system randomness to prevent hash attack
|
||||
code optimization (locking/timers/...)
|
||||
|
||||
|
||||
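The point about per-system randomness can be illustrated with a keyed hash. This sketch is an assumption-laden stand-in, not the conntrack hash: it uses BLAKE2b from Python's hashlib as the keyed function, and `SEED` and `bucket_for` are hypothetical names. What it shows is that the bucket of a connection tuple depends on a secret chosen per system, so an attacker cannot precompute tuples that all collide in one bucket.

```python
import hashlib
import os

# Per-system secret, chosen once (for example at boot).
SEED = os.urandom(16)


def bucket_for(src, dst, sport, dport, proto, buckets, seed=SEED):
    """Map a connection tuple to a hash bucket using a keyed hash."""
    data = f"{src}|{dst}|{sport}|{dport}|{proto}".encode()
    digest = hashlib.blake2b(data, key=seed, digest_size=8).digest()
    return int.from_bytes(digest, "big") % buckets
```

The same tuple always lands in the same bucket on one system, but the placement differs between systems with different seeds.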
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
netfilter and zerocopy TCP
|
||||
|
||||
Current situation (2.4.x)
|
||||
skb_linearize() at each netfilter hook effectively prevents zerocopy TCP from working if netfilter/iptables is enabled
|
||||
this is a big performance loss on stand-alone servers which filter packets locally
|
||||
|
||||
Solution
|
||||
remove skb_linearize() from conntrack, nat and ip_tables core
|
||||
all iptables extensions and conntrack/nat helpers have to use skb_copy_bits() if they want to access data beyond layer 4 header
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Introduction
|
||||
|
||||
What is special about firewall failover?
|
||||
|
||||
Nothing, in case of the stateless packet filter
|
||||
Common IP takeover solutions can be used
|
||||
VRRP
|
||||
Heartbeat
|
||||
|
||||
Distribution of packet filtering ruleset no problem
|
||||
can be done manually
|
||||
or implemented with simple userspace process
|
||||
|
||||
Problems arise with stateful packet filters
|
||||
Connection state only on active node
|
||||
NAT mappings only on active node
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Poor man's failover
|
||||
|
||||
principle
|
||||
let every node do its own tracking rather than replicating state
|
||||
two possible implementations
|
||||
connect every node to shared media (i.e. real ethernet)
|
||||
forwarding only turned on on active node
|
||||
slave nodes use promiscuous mode to sniff packets
|
||||
copy all traffic to slave nodes
|
||||
active master needs to copy all traffic to other nodes
|
||||
disadvantage: high load, sync traffic == payload traffic
|
||||
IMHO stupid way of solving the problem
|
||||
advantages
|
||||
very easy implementation
|
||||
only addition of sniffing mode to conntrack needed
|
||||
existing means of address takeover can be used
|
||||
same load on active master and slave nodes
|
||||
no additional load on active master
|
||||
disadvantages
|
||||
can only be used with real shared media (no switches, ...)
|
||||
can not be used with NAT
|
||||
remaining problem
|
||||
no initial state sync after reboot of slave node!
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Real state replication
|
||||
|
||||
Parts needed
|
||||
state replication protocol
|
||||
multicast based
|
||||
sequence numbers for detection of packet loss
|
||||
NACK-based retransmission
|
||||
no security, since private ethernet segment to be used
|
||||
event interface on active node
|
||||
calling out to callback function at all state changes
|
||||
exported interface to manipulate conntrack hash table
|
||||
kernel thread for sending conntrack state protocol messages
|
||||
registers with event interface
|
||||
creates and accumulates state replication packets
|
||||
sends them via in-kernel sockets api
|
||||
kernel thread for receiving conntrack state replication messages
|
||||
receives state replication packets via in-kernel sockets
|
||||
uses conntrack hashtable manipulation interface
|
||||
|
||||
|
||||
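The sequence-number and NACK part of the replication protocol can be sketched from the receiver's point of view. This is a toy model in Python under stated assumptions, not the proposed kernel thread: sequence numbers start at 0, messages are plain payloads, and the class name `ReplicationReceiver` is hypothetical. It delivers messages in order and returns the sequence numbers it would ask the sender to retransmit.

```python
class ReplicationReceiver:
    """Delivers messages in sequence order and NACKs missing ones."""

    def __init__(self):
        self.expected = 0     # next sequence number to deliver
        self.pending = {}     # out-of-order messages, keyed by seq
        self.delivered = []   # payloads handed on, in order

    def receive(self, seq, payload):
        """Process one message; return the sequence numbers to NACK."""
        if seq < self.expected or seq in self.pending:
            return []                      # duplicate, ignore
        self.pending[seq] = payload
        # Deliver everything that is now contiguous.
        while self.expected in self.pending:
            self.delivered.append(self.pending.pop(self.expected))
            self.expected += 1
        if not self.pending:
            return []
        # Gaps below the highest buffered message are missing.
        return [s for s in range(self.expected, max(self.pending))
                if s not in self.pending]
```

Because only losses trigger traffic back to the sender, the common case (no loss) costs nothing on the return path. A real implementation would also rate-limit repeated NACKs for the same gap.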
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Real state replication
|
||||
|
||||
Flow of events in chronological order:
|
||||
on active node, inside the network RX softirq
|
||||
connection tracking code is analyzing a forwarded packet
|
||||
connection tracking gathers some new state information
|
||||
connection tracking updates local connection tracking database
|
||||
connection tracking sends event message to event API
|
||||
on active node, inside the conntrack-sync kernel thread
|
||||
conntrack sync daemon receives event through event API
|
||||
conntrack sync daemon aggregates multiple event messages into a state replication protocol message, removing possible redundancy
|
||||
conntrack sync daemon generates state replication protocol message
|
||||
conntrack sync daemon sends state replication protocol message
|
||||
on slave node(s), inside network RX softirq
|
||||
connection tracking code ignores packets coming from the interface attached to the private conntrack sync network
|
||||
state replication protocol message is appended to socket receive queue of conntrack-sync kernel thread
|
||||
on slave node(s), inside conntrack-sync kernel thread
|
||||
conntrack sync daemon receives state replication message
|
||||
conntrack sync daemon creates/updates conntrack entry
|
||||
|
||||
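The aggregation step on the active node can be sketched like this. The slide does not say how redundancy is removed, so this assumes the simplest rule: within one batch, only the most recent state per connection needs to be replicated. The function name `aggregate` and the (connection id, state) event shape are hypothetical.

```python
def aggregate(events):
    """Collapse a batch of (conn_id, state) events.

    Only the most recent state per connection is kept; the result is
    ordered by each connection's last update.
    """
    latest = {}
    for conn_id, state in events:
        latest.pop(conn_id, None)   # re-insert so dict order = last update
        latest[conn_id] = state
    return list(latest.items())
```

For a connection that goes through several state changes between two replication messages, the slave then receives one update instead of several.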
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Necessary changes to kernel
|
||||
|
||||
Necessary changes to current conntrack core
|
||||
|
||||
event generation (callback functions) for all state changes
|
||||
|
||||
conntrack hashtable manipulation API
|
||||
is needed (and already implemented) for 'ctnetlink' API
|
||||
|
||||
conntrack exemptions
|
||||
needed to _not_ track conntrack state replication packets
|
||||
is needed for other cases as well
|
||||
currently being developed by Jozsef Kadlecsik
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Thanks
|
||||
The slides of this presentation are available at http://www.gnumonks.org/
|
||||
|
||||
Visit the netfilter homepage http://www.netfilter.org/
|
||||
|
||||
Thanks to
|
||||
the BBS people, Z-Netz, FIDO, ...
|
||||
for heavily increasing my computer usage in 1992
|
||||
KNF
|
||||
for bringing me in touch with the internet as early as 1994
|
||||
for providing a playground for technical people
|
||||
for telling me about the existence of Linux!
|
||||
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
|
||||
for implementing (one of?) the world's best TCP/IP stacks
|
||||
Paul 'Rusty' Russell
|
||||
for starting the netfilter/iptables project
|
||||
for trusting me to maintain it today
|
||||
Astaro AG
|
||||
for sponsoring most of my current netfilter work
|
||||
|
|
@ -0,0 +1,304 @@
|
|||
%include "default.mgp"
|
||||
%default 1 bgrad
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
%nodefault
|
||||
%back "blue"
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
The future of Linux packet filtering
|
||||
|
||||
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <laforge@netfilter.org>
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Contents
|
||||
|
||||
|
||||
Problems with current 2.4/2.5 netfilter/iptables
|
||||
Solution to code replication
|
||||
Solution for dynamic rulesets
|
||||
Solution for API to GUIs and other management programs
|
||||
|
||||
Other current work
|
||||
Optimizing Rule load time of large rulesets
|
||||
Making netfilter/iptables compatible with zerocopy TCP
|
||||
|
||||
HA for stateful firewalling
|
||||
What's special about firewalling HA
|
||||
Poor man's failover
|
||||
Real state replication
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Problems with 2.4.x netfilter/iptables
|
||||
|
||||
code replication between iptables/ip6tables/arptables
|
||||
iptables was never meant for other protocols, but people did copy+paste 'ports'
|
||||
replication of
|
||||
core kernel code
|
||||
layer 3 independent matches (mac, interface, ...)
|
||||
userspace library (libiptc)
|
||||
userspace tool (iptables)
|
||||
userspace plugins (libipt_xxx.so)
|
||||
|
||||
doesn't suit the needs for dynamically changing rulesets
|
||||
dynamic rulesets becoming more common (service selection, IDS)
|
||||
a whole table is created in userspace and sent as blob to kernel
|
||||
for every ruleset change the table needs to be copied to userspace and back
|
||||
inside kernel consistency checks on whole table, loop detection
|
||||
|
||||
too extensible for writing any forward-compatible GUI
|
||||
new extensions showing up all the time
|
||||
a frontend would need to know about the options and use of a new extension
|
||||
thus frontends are always incomplete and out-of-date
|
||||
no high-level API other than piping to iptables-restore
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Reducing code replication
|
||||
|
||||
code replication is a real problem: unclean, bugfixes missed
|
||||
we need layer 3 independent layer for
|
||||
submitting rules to the kernel
|
||||
traversing packet-rulesets supporting match/target modules
|
||||
registering matches/targets
|
||||
layer 3 specific (like matching ipv4 address)
|
||||
layer 3 independent (like matching MAC address)
|
||||
|
||||
solution
|
||||
pkt_tables inside kernel
|
||||
pkt_tables_ipv4 registers layer 3 handler with pkt_tables
|
||||
pkt_tables_ipv6 registers layer 3 handler with pkt_tables
|
||||
everybody registering a pkt_table (like iptable_filter) needs to specify the l3 protocol
|
||||
libraries in userspace (see later)
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Supporting dynamic rulesets
|
||||
|
||||
atomic table-replacement turned out to be bad idea
|
||||
need new interface for sending individual rules to kernel
|
||||
policy routing has the same problem and good solution: rtnetlink
|
||||
solution: nfnetlink
|
||||
multicast-netlink based packet-oriented socket between kernel and userspace
|
||||
has extra benefit that other userspace processes get notified of rule changes [just like routing daemons]
|
||||
nfnetlink will be low-layer below all kernel/userspace communication
|
||||
pkttnetlink [aka iptnetlink]
|
||||
ctnetlink
|
||||
ulog
|
||||
ip_queue
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Communication with other programs
|
||||
|
||||
whole set of libraries
|
||||
libnfnetlink for low-layer communication
|
||||
libpkttnetlink for rule modifications
|
||||
will handle all plugins [which are currently part of iptables]
|
||||
query functions about available matches/targets
|
||||
query functions about parameters
|
||||
query functions for help messages about specific match/parameter of a match
|
||||
generic structure from which rules can be built
|
||||
conversion functions to parse generic structure into in-kernel structure
|
||||
conversion functions to parse kernel structure into generic structure
|
||||
functions to convert generic structure into plain text
|
||||
libipq will stay API-compatible to current version
|
||||
libipulog will stay API-compatible to current version
|
||||
libiptc will go away [compatibility layer extremely difficult]
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Optimizing rule load time
|
||||
|
||||
Current situation
|
||||
loading 10,000 rules in 1,000 chains takes about 4 minutes on a PIII 733 MHz
|
||||
this is caused by two bottlenecks
|
||||
loop detection algorithm on kernel side inefficient
|
||||
a couple of O(n^2) complexity functions in libiptc
|
||||
|
||||
Solution
|
||||
efficient loop detection and mark_source_chains() algorithm (graph coloring)
|
||||
current CVS libiptc with only one O(n^2) function: 2 min 37 s
|
||||
whole reimplementation of libiptc needed for removing the last O(n^2) function
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Optimizing the connection tracking code
|
||||
|
||||
Conntrack hash function optimization
|
||||
old hash function not good for even hash bucket count
|
||||
hash function evaluation tool [cttest] available
|
||||
other hash functions in development (already in 2.4.21)
|
||||
introduce per-system randomness to prevent hash attack
|
||||
code optimization (locking/timers/...)
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
netfilter and zerocopy TCP
|
||||
|
||||
Current situation (2.4.x)
|
||||
skb_linearize() at each netfilter hook effectively prevents zerocopy TCP from working if netfilter/iptables is enabled
|
||||
this is a big performance loss on stand-alone servers which filter packets locally
|
||||
|
||||
Solution
|
||||
remove skb_linearize() from conntrack, nat and ip_tables core
|
||||
all iptables extensions and conntrack/nat helpers have to use skb_copy_bits() if they want to access data beyond layer 4 header
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Introduction
|
||||
|
||||
What is special about firewall failover?
|
||||
|
||||
Nothing, in case of the stateless packet filter
|
||||
Common IP takeover solutions can be used
|
||||
VRRP
|
||||
Heartbeat
|
||||
|
||||
Distribution of packet filtering ruleset no problem
|
||||
can be done manually
|
||||
or implemented with simple userspace process
|
||||
|
||||
Problems arise with stateful packet filters
|
||||
Connection state only on active node
|
||||
NAT mappings only on active node
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Poor man's failover
|
||||
|
||||
Poor man's failover
|
||||
principle
|
||||
let every node do its own tracking rather than replicating state
|
||||
two possible implementations
|
||||
connect every node to shared media (i.e. real ethernet)
|
||||
forwarding only turned on on active node
|
||||
slave nodes use promiscuous mode to sniff packets
|
||||
copy all traffic to slave nodes
|
||||
active master needs to copy all traffic to other nodes
|
||||
disadvantage: high load, sync traffic == payload traffic
|
||||
IMHO stupid way of solving the problem
|
||||
advantages
|
||||
very easy implementation
|
||||
only addition of sniffing mode to conntrack needed
|
||||
existing means of address takeover can be used
|
||||
same load on active master and slave nodes
|
||||
no additional load on active master
|
||||
disadvantages
|
||||
can only be used with real shared media (no switches, ...)
|
||||
can not be used with NAT
|
||||
remaining problem
|
||||
no initial state sync after reboot of slave node!
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Real state replication
|
||||
|
||||
Parts needed
|
||||
state replication protocol
|
||||
multicast based
|
||||
sequence numbers for detection of packet loss
|
||||
NACK-based retransmission
|
||||
no security, since private ethernet segment to be used
|
||||
event interface on active node
|
||||
calling out to callback function at all state changes
|
||||
exported interface to manipulate conntrack hash table
|
||||
kernel thread for sending conntrack state protocol messages
|
||||
registers with event interface
|
||||
creates and accumulates state replication packets
|
||||
sends them via in-kernel sockets api
|
||||
kernel thread for receiving conntrack state replication messages
|
||||
receives state replication packets via in-kernel sockets
|
||||
uses conntrack hashtable manipulation interface
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Real state replication
|
||||
|
||||
Flow of events in chronological order:
|
||||
on active node, inside the network RX softirq
|
||||
connection tracking code is analyzing a forwarded packet
|
||||
connection tracking gathers some new state information
|
||||
connection tracking updates local connection tracking database
|
||||
connection tracking sends event message to event API
|
||||
on active node, inside the conntrack-sync kernel thread
|
||||
conntrack sync daemon receives event through event API
|
||||
conntrack sync daemon aggregates multiple event messages into a state replication protocol message, removing possible redundancy
|
||||
conntrack sync daemon generates state replication protocol message
|
||||
conntrack sync daemon sends state replication protocol message
|
||||
on slave node(s), inside network RX softirq
|
||||
connection tracking code ignores packets coming from the interface attached to the private conntrack sync network
|
||||
state replication protocol message is appended to socket receive queue of conntrack-sync kernel thread
|
||||
on slave node(s), inside conntrack-sync kernel thread
|
||||
conntrack sync daemon receives state replication message
|
||||
conntrack sync daemon creates/updates conntrack entry
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Necessary changes to kernel
|
||||
|
||||
Necessary changes to current conntrack core
|
||||
|
||||
event generation (callback functions) for all state changes
|
||||
|
||||
conntrack hashtable manipulation API
|
||||
is needed (and already implemented) for 'ctnetlink' API
|
||||
|
||||
conntrack exemptions
|
||||
needed to _not_ track conntrack state replication packets
|
||||
is needed for other cases as well
|
||||
currently being developed by Jozsef Kadlecsik
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Thanks
|
||||
The slides of this presentation are available at http://www.gnumonks.org/
|
||||
|
||||
Visit the netfilter homepage http://www.netfilter.org/
|
||||
|
||||
Thanks to
|
||||
the BBS people, Z-Netz, FIDO, ...
|
||||
for heavily increasing my computer usage in 1992
|
||||
KNF
|
||||
for bringing me in touch with the internet as early as 1994
|
||||
for providing a playground for technical people
|
||||
for telling me about the existence of Linux!
|
||||
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
|
||||
for implementing (one of?) the world's best TCP/IP stacks
|
||||
Paul 'Rusty' Russell
|
||||
for starting the netfilter/iptables project
|
||||
for trusting me to maintain it today
|
||||
Astaro AG
|
||||
for sponsoring most of my current netfilter work
|
||||
|
|
@ -0,0 +1,318 @@
|
|||
\documentstyle{seminar}
|
||||
\begin{document}
|
||||
\vspace{3mm}
|
||||
\begin{slide}
|
||||
\vspace{3mm}
|
||||
\begin{center}
|
||||
\vspace{3mm}
|
||||
\vspace{3mm}
|
||||
The future of Linux packet filtering\\
|
||||
\vspace{3mm}
|
||||
\vspace{3mm}
|
||||
\vspace{3mm}
|
||||
\end{center}
|
||||
\begin{center}
|
||||
by\\
|
||||
\vspace{3mm}
|
||||
Harald Welte $<$laforge@netfilter.org$>$\\
|
||||
\vspace{3mm}
|
||||
\vspace{3mm}
|
||||
\end{center}
|
||||
\end{slide}
|
||||
\begin{slide}
|
||||
Future of Linux packet filtering\\
|
||||
Contents\\
|
||||
\vspace{3mm}
|
||||
\vspace{3mm}
|
||||
Problems with current 2.4/2.5 netfilter/iptables\\
|
||||
Solution to code replication\\
|
||||
Solution for dynamic rulesets\\
|
||||
Solution for API to GUIs and other management programs\\
|
||||
\vspace{3mm}
|
||||
Other current work\\
|
||||
Optimizing Rule load time of large rulesets\\
|
||||
Making netfilter/iptables compatible with zerocopy TCP\\
|
||||
\vspace{3mm}
|
||||
HA for stateful firewalling\\
|
||||
What's special about firewalling HA\\
|
||||
Poor man's failover\\
|
||||
Real state replication\\
|
||||
\vspace{3mm}
|
||||
\vspace{3mm}
|
||||
\vspace{3mm}
|
||||
\end{slide}
|
||||
\begin{slide}
|
||||
Future of Linux packet filtering\\
|
||||
Problems with 2.4.x netfilter/iptables\\
|
||||
\vspace{3mm}
|
||||
code replication between iptables/ip6tables/arptables\\
|
||||
iptables was never meant for other protocols, but people did copy+paste 'ports'\\
|
||||
replication of\\
|
||||
core kernel code\\
|
||||
layer 3 independent matches (mac, interface, ...)\\
|
||||
userspace library (libiptc)\\
|
||||
userspace tool (iptables)\\
|
||||
userspace plugins (libipt\_xxx.so)\\
|
||||
\vspace{3mm}
|
||||
doesn't suit the needs for dynamically changing rulesets\\
|
||||
dynamic rulesets becoming more common (service selection, IDS)\\
|
||||
a whole table is created in userspace and sent as blob to kernel\\
|
||||
for every ruleset change the table needs to be copied to userspace and back\\
|
||||
inside kernel consistency checks on whole table, loop detection\\
|
||||
\vspace{3mm}
|
||||
too extensible for writing any forward-compatible GUI\\
|
||||
new extensions showing up all the time\\
|
||||
a frontend would need to know about the options and use of a new extension\\
|
||||
thus frontends are always incomplete and out-of-date\\
|
||||
no high-level API other than piping to iptables-restore\\
|
||||
\vspace{3mm}
|
||||
\vspace{3mm}
|
||||
\end{slide}
|
||||
\begin{slide}
|
||||
Future of Linux packet filtering\\
|
||||
Reducing code replication\\
|
||||
\vspace{3mm}
|
||||
code replication is a real problem: unclean, bugfixes missed\\
|
||||
we need layer 3 independent layer for\\
|
||||
submitting rules to the kernel\\
|
||||
traversing packet-rulesets supporting match/target modules\\
|
||||
registering matches/targets\\
|
||||
layer 3 specific (like matching ipv4 address)\\
|
||||
layer 3 independent (like matching MAC address)\\
|
||||
\vspace{3mm}
|
||||
solution\\
|
||||
pkt\_tables inside kernel\\
|
||||
pkt\_tables\_ipv4 registers layer 3 handler with pkt\_tables\\
|
||||
pkt\_tables\_ipv6 registers layer 3 handler with pkt\_tables\\
|
||||
everybody registering a pkt\_table (like iptable\_filter) needs to specify the l3 protocol\\
|
||||
libraries in userspace (see later)\\
|
||||
\vspace{3mm}
|
||||
\vspace{3mm}
|
||||
\end{slide}
|
||||
\begin{slide}
|
||||
Future of Linux packet filtering\\
|
||||
Supporting dynamic rulesets\\
|
||||
\vspace{3mm}
|
||||
atomic table-replacement turned out to be bad idea\\
|
||||
need new interface for sending individual rules to kernel\\
|
||||
policy routing has the same problem and good solution: rtnetlink\\
|
||||
solution: nfnetlink\\
|
||||
multicast-netlink based packet-oriented socket between kernel and userspace\\
|
||||
has extra benefit that other userspace processes get notified of rule changes [just like routing daemons]\\
|
||||
nfnetlink will be low-layer below all kernel/userspace communication\\
|
||||
pkttnetlink [aka iptnetlink]\\
|
||||
ctnetlink\\
|
||||
ulog\\
|
||||
ip\_queue\\
|
||||
\vspace{3mm}
|
||||
\vspace{3mm}
|
||||
\end{slide}
|
||||
\begin{slide}
|
||||
Future of Linux packet filtering\\
|
||||
Communication with other programs\\
|
||||
\vspace{3mm}
|
||||
whole set of libraries\\
|
||||
libnfnetlink for low-layer communication\\
|
||||
libpkttnetlink for rule modifications\\
|
||||
will handle all plugins [which are currently part of iptables]\\
|
||||
query functions about available matches/targets\\
|
||||
query functions about parameters\\
|
||||
query functions for help messages about specific match/parameter of a match\\
|
||||
generic structure from which rules can be built\\
|
||||
conversion functions to parse generic structure into in-kernel structure\\
|
||||
conversion functions to parse kernel structure into generic structure\\
|
||||
functions to convert generic structure into plain text\\
|
||||
libipq will stay API-compatible to current version\\
|
||||
libipulog will stay API-compatible to current version\\
|
||||
libiptc will go away [compatibility layer extremely difficult]\\
|
||||
\vspace{3mm}
|
||||
\vspace{3mm}
|
||||
\end{slide}
|
||||
\begin{slide}
|
||||
Future of Linux packet filtering\\
|
||||
Optimizing rule load time\\
|
||||
\vspace{3mm}
|
||||
Current situation\\
|
||||
loading 10,000 rules in 1,000 chains takes about 4 minutes on a PIII 733 MHz\\
|
||||
this is caused by two bottlenecks\\
|
||||
loop detection algorithm on kernel side inefficient\\
|
||||
a couple of functions with $O(n^2)$ complexity in libiptc\\
|
||||
\vspace{3mm}
|
||||
Solution\\
|
||||
efficient loop detection and mark_source_chains() algorithm (graph coloring)\\
|
||||
current CVS libiptc with only one $O(n^2)$ function: 2 minutes 37 seconds\\
|
||||
complete reimplementation of libiptc needed to remove the last $O(n^2)$ function\\
|
||||
\vspace{3mm}
|
||||
\vspace{3mm}
|
||||
\end{slide}
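The loop detection by graph coloring mentioned above can be sketched in userspace as follows. This is a minimal illustration, not the kernel's mark_source_chains() code: the `jumps` dictionary (chain name to the chains its rules jump to) and the function name `find_chain_loop` are assumptions made for this example. Each chain is white (unvisited), grey (on the current path) or black (finished); reaching a grey chain again means the ruleset contains a loop. Every jump is examined once, so the cost is linear in the number of chains plus jumps.

```python
# Colours for the depth-first search: unvisited, on the current path, finished.
WHITE, GREY, BLACK = 0, 1, 2


def find_chain_loop(jumps):
    """Return a list of chain names forming a loop, or None if there is none.

    `jumps` is a hypothetical, simplified view of a ruleset: it maps a chain
    name to the names of the chains that its rules jump to.
    """
    colour = {}
    for root in jumps:
        if colour.get(root, WHITE) != WHITE:
            continue
        colour[root] = GREY
        path = [root]                              # chains on the current path
        stack = [iter(jumps.get(root, ()))]        # one iterator per path entry
        while stack:
            for target in stack[-1]:
                state = colour.get(target, WHITE)
                if state == GREY:
                    # Back edge: target is already on the current path.
                    return path[path.index(target):] + [target]
                if state == WHITE:
                    colour[target] = GREY
                    path.append(target)
                    stack.append(iter(jumps.get(target, ())))
                    break
            else:
                # All jumps of the top chain were followed: it is finished.
                colour[path.pop()] = BLACK
                stack.pop()
    return None
```

The search is written iteratively so that a ruleset with a thousand nested chains does not run into Python's recursion limit.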
|
||||
\begin{slide}
|
||||
HA for netfilter/iptables\\
|
||||
Optimizing the connection tracking code\\
|
||||
\vspace{3mm}
|
||||
Conntrack hash function optimization\\
|
||||
old hash function not good for even hash bucket count\\
|
||||
hash function evaluation tool [cttest] available\\
|
||||
other hash functions in development (already in 2.4.21)\\
|
||||
introduce per-system randomness to prevent hash attack\\
|
||||
code optimization (locking/timers/...)\\
|
||||
\vspace{3mm}
|
||||
\vspace{3mm}
|
||||
\vspace{3mm}
|
||||
\end{slide}
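The idea of adding per-system randomness to the conntrack hash can be illustrated with a small userspace model. This is only a sketch under stated assumptions: it uses a keyed BLAKE2b digest from Python's standard library in place of whatever hash function the kernel uses, and the class name `TupleHasher`, the 16-byte seed and the tuple layout are invented for this example. The point is that an attacker who does not know the seed cannot precompute a set of tuples that all land in the same bucket.

```python
import hashlib
import os
import socket
import struct


class TupleHasher:
    """Map connection tuples to hash buckets using a secret per-system seed."""

    def __init__(self, buckets, seed=None):
        self.buckets = buckets
        # Assumption: 16 random bytes drawn once at start-up act as the secret.
        self.seed = os.urandom(16) if seed is None else seed

    def bucket(self, src, dst, proto, sport, dport):
        # Pack the tuple into a fixed binary layout (network byte order).
        packed = (socket.inet_aton(src) + socket.inet_aton(dst)
                  + struct.pack("!BHH", proto, sport, dport))
        # Keyed hash: the same tuple gives different buckets under different seeds.
        digest = hashlib.blake2b(packed, key=self.seed, digest_size=4).digest()
        return int.from_bytes(digest, "big") % self.buckets
```

Two instances created with the same seed agree on every bucket, while instances with different seeds distribute the same tuples differently.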
|
||||
\begin{slide}
|
||||
Future of Linux packet filtering\\
|
||||
netfilter and zerocopy TCP\\
|
||||
\vspace{3mm}
|
||||
Current situation (2.4.x)\\
|
||||
skb_linearize() at each netfilter hook effectively prevents zerocopy TCP from working if netfilter/iptables is enabled\\
|
||||
this is a big performance loss on stand-alone servers which filter packets locally\\
|
||||
\vspace{3mm}
|
||||
Solution\\
|
||||
remove skb_linearize() from conntrack, nat and ip_tables core\\
|
||||
all iptables extensions and conntrack/nat helpers have to use skb_copy_bits() if they want to access data beyond layer 4 header\\
|
||||
\vspace{3mm}
|
||||
\vspace{3mm}
|
||||
\end{slide}
|
||||
\begin{slide}
|
||||
HA for netfilter/iptables\\
|
||||
Introduction\\
|
||||
\vspace{3mm}
|
||||
What is special about firewall failover?\\
|
||||
\vspace{3mm}
|
||||
Nothing, in case of the stateless packet filter\\
|
||||
Common IP takeover solutions can be used\\
|
||||
VRRP\\
|
||||
Heartbeat\\
|
||||
\vspace{3mm}
|
||||
Distribution of packet filtering ruleset no problem\\
|
||||
can be done manually\\
|
||||
or implemented with simple userspace process\\
|
||||
\vspace{3mm}
|
||||
Problems arise with stateful packet filters\\
|
||||
Connection state only on active node\\
|
||||
NAT mappings only on active node\\
|
||||
\vspace{3mm}
|
||||
\vspace{3mm}
|
||||
\vspace{3mm}
|
||||
\end{slide}
|
||||
\begin{slide}
|
||||
HA for netfilter/iptables\\
|
||||
Poor man's failover\\
|
||||
\vspace{3mm}
|
||||
Poor man's failover\\
|
||||
principle\\
|
||||
let every node do its own tracking rather than replicating state\\
|
||||
two possible implementations\\
|
||||
connect every node to shared media (i.e. real ethernet)\\
|
||||
forwarding only turned on on active node\\
|
||||
slave nodes use promiscuous mode to sniff packets\\
|
||||
copy all traffic to slave nodes\\
|
||||
active master needs to copy all traffic to other nodes\\
|
||||
disadvantage: high load, sync traffic == payload traffic\\
|
||||
IMHO stupid way of solving the problem \\
|
||||
advantages\\
|
||||
very easy implementation\\
|
||||
only addition of sniffing mode to conntrack needed\\
|
||||
existing means of address takeover can be used\\
|
||||
same load on active master and slave nodes\\
|
||||
no additional load on active master\\
|
||||
disadvantages\\
|
||||
can only be used with real shared media (no switches, ...)\\
|
||||
can not be used with NAT\\
|
||||
remaining problem\\
|
||||
no initial state sync after reboot of slave node!\\
|
||||
\vspace{3mm}
|
||||
\vspace{3mm}
|
||||
\vspace{3mm}
|
||||
\end{slide}
|
||||
\begin{slide}
|
||||
HA for netfilter/iptables\\
|
||||
Real state replication\\
|
||||
\vspace{3mm}
|
||||
Parts needed\\
|
||||
state replication protocol\\
|
||||
multicast based\\
|
||||
sequence numbers for detection of packet loss\\
|
||||
NACK-based retransmission\\
|
||||
no security, since private ethernet segment to be used\\
|
||||
event interface on active node\\
|
||||
calling out to callback function at all state changes\\
|
||||
exported interface to manipulate conntrack hash table\\
|
||||
kernel thread for sending conntrack state protocol messages\\
|
||||
registers with event interface\\
|
||||
creates and accumulates state replication packets\\
|
||||
sends them via in-kernel sockets api\\
|
||||
kernel thread for receiving conntrack state replication messages\\
|
||||
receives state replication packets via in-kernel sockets\\
|
||||
uses conntrack hashtable manipulation interface\\
|
||||
\vspace{3mm}
|
||||
\vspace{3mm}
|
||||
\vspace{3mm}
|
||||
\end{slide}
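How sequence numbers and NACK-based retransmission fit together on a receiving node can be sketched as below. This is a hypothetical model, not the replication protocol itself: the class name `ReplicationReceiver`, the message layout (a sequence number plus an opaque payload) and the choice to start at sequence number 0 are assumptions, and sequence number wrap-around is ignored. The receiver applies messages strictly in order, holds back messages that arrive early, and reports the missing sequence numbers so that the sender can retransmit them.

```python
class ReplicationReceiver:
    """Apply state replication messages in order and detect lost ones."""

    def __init__(self):
        self.expected = 0    # next sequence number that may be applied
        self.pending = {}    # messages that arrived early, keyed by sequence number
        self.applied = []    # payloads applied so far, in order

    def receive(self, seq, payload):
        """Handle one message; return the sequence numbers to NACK."""
        if seq < self.expected or seq in self.pending:
            return []        # duplicate, e.g. a retransmission we already have
        self.pending[seq] = payload
        # Apply everything that has become contiguous.
        while self.expected in self.pending:
            self.applied.append(self.pending.pop(self.expected))
            self.expected += 1
        if not self.pending:
            return []
        # Everything between the next expected number and the highest
        # held-back message that we do not have yet is considered lost.
        return [s for s in range(self.expected, max(self.pending))
                if s not in self.pending]
```

For example, receiving messages 0, 2, 3 in that order yields a NACK for 1 after message 2 and again after message 3; once 1 arrives, messages 1 to 3 are applied in order.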
|
||||
\begin{slide}
|
||||
HA for netfilter/iptables\\
|
||||
Real state replication\\
|
||||
\vspace{3mm}
|
||||
Flow of events in chronological order:\\
|
||||
on active node, inside the network RX softirq\\
|
||||
connection tracking code is analyzing a forwarded packet\\
|
||||
connection tracking gathers some new state information\\
|
||||
connection tracking updates local connection tracking database\\
|
||||
connection tracking sends event message to event API\\
|
||||
on active node, inside the conntrack-sync kernel thread\\
|
||||
conntrack sync daemon receives event through event API\\
|
||||
conntrack sync daemon aggregates multiple event messages into a state replication protocol message, removing possible redundancy\\
|
||||
conntrack sync daemon generates state replication protocol message\\
|
||||
conntrack sync daemon sends state replication protocol message\\
|
||||
on slave node(s), inside network RX softirq\\
|
||||
connection tracking code ignores packets coming from the interface attached to the private conntrack sync network\\
|
||||
state replication protocol message is appended to socket receive queue of conntrack-sync kernel thread\\
|
||||
on slave node(s), inside conntrack-sync kernel thread\\
|
||||
conntrack sync daemon receives state replication message\\
|
||||
conntrack sync daemon creates/updates conntrack entry\\
|
||||
\vspace{3mm}
|
||||
\vspace{3mm}
|
||||
\end{slide}
|
||||
\begin{slide}
|
||||
HA for netfilter/iptables\\
|
||||
Necessary changes to kernel\\
|
||||
\vspace{3mm}
|
||||
Necessary changes to current conntrack core\\
|
||||
\vspace{3mm}
|
||||
event generation (callback functions) for all state changes\\
|
||||
\vspace{3mm}
|
||||
conntrack hashtable manipulation API\\
|
||||
is needed (and already implemented) for 'ctnetlink' API\\
|
||||
\vspace{3mm}
|
||||
conntrack exemptions\\
|
||||
needed to _not_ track conntrack state replication packets\\
|
||||
is needed for other cases as well\\
|
||||
currently being developed by Jozsef Kadlecsik\\
|
||||
\vspace{3mm}
|
||||
\vspace{3mm}
|
||||
\vspace{3mm}
|
||||
\end{slide}
|
||||
\begin{slide}
|
||||
Future of Linux packet filtering\\
|
||||
Thanks\\
|
||||
The slides of this presentation are available at http://www.gnumonks.org/\\
|
||||
\vspace{3mm}
|
||||
Visit the netfilter homepage http://www.netfilter.org/\\
|
||||
\vspace{3mm}
|
||||
Thanks to\\
|
||||
the BBS people, Z-Netz, FIDO, ...\\
|
||||
for heavily increasing my computer usage in 1992\\
|
||||
KNF\\
|
||||
for bringing me in touch with the internet as early as 1994\\
|
||||
for providing a playground for technical people\\
|
||||
for telling me about the existence of Linux!\\
|
||||
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen\\
|
||||
for implementing (one of?) the world's best TCP/IP stacks\\
|
||||
Paul 'Rusty' Russell\\
|
||||
for starting the netfilter/iptables project\\
|
||||
for trusting me to maintain it today\\
|
||||
Astaro AG\\
|
||||
for sponsoring most of my current netfilter work\\
|
||||
\vspace{3mm}
|
||||
\end{slide}
|
||||
\end{document}
|
|
@ -0,0 +1,73 @@
|
|||
|
||||
0 - introduction/definition: Firewalls, Proxies, Packet Filters
|
||||
- present myself and my function within the netfilter coreteam
|
||||
- what is a firewall
|
||||
- packet filters at networking layer
|
||||
- inspect each packet and make a choice based on the packet
|
||||
- traditionally don't know about connections (== layer 4)
|
||||
- advantage: fast, transparent
|
||||
- disadvantage: filtering limited to l3+l4 (sometimes l2)
|
||||
- proxies at application layer
|
||||
- terminate two connections (client->proxy and proxy->server)
|
||||
- advantage: can base policy decision on application protocol
|
||||
- disadvantage: not transparent at all (not even transparent proxies)
|
||||
- result: both of them have their application.
|
||||
- history of linux packet filtering
|
||||
- ipfwadm (2.0)
|
||||
- ipchains (2.2)
|
||||
- iptables (2.4+2.6)
|
||||
- pkttables (2.6+)
|
||||
- iptables was developed together with netfilter in the 2.3.x kernel series
|
||||
|
||||
1 - Why a free software firewall?
|
||||
- the internet was built on free/open standards and software
|
||||
- security relevant open sourcecode gets more auditing because more people read it (and thus report bugs)
|
||||
- users can put more trust in FOSS, since they can check for hidden backdoors
|
||||
- packet filters are used like routers. They are core infrastructure of the internet. Infrastructure should be open/free for the public, just like roads.
|
||||
- Everybody should be able to learn and understand how packet filtering works
|
||||
- Infrastructure should not depend on monopolistic companies.
|
||||
- problem if company goes bankrupt
|
||||
- dependent on 'upgrade pressure' and future license changes
|
||||
- no possibility to adapt it to new standards if vendor doesn't want to support it
|
||||
|
||||
2 - What can you do with netfilter/iptables
|
||||
- stateless packet filtering
|
||||
- matches: mac, src/dst ip, src/dst port,
|
||||
- stateful packet filtering by using connection tracking
|
||||
- keeps state table about all ongoing connections
|
||||
- supports l4 TCP,UDP,ICMP,GRE,PPTP
|
||||
- supports l5+ complex protocols like ftp,pptp,h323,talk,...
|
||||
- IP accounting (every rule has a packet/byte counter)
|
||||
- Network Address Translation (NAT/NAPT)
|
||||
- Stateful, based on Connection tracking
|
||||
- Source NAT / Masquerading
|
||||
- Destination NAT / Redirect
|
||||
- 1:1 NAT of whole networks (NETMAP)
|
||||
- supports l5+ complex protocols like ftp,pptp,h323,talk,...
|
||||
- Packet Mangling
|
||||
- Clamp TCP MSS to PMTU
|
||||
- Manipulate packet header (TTL, ECN, DSCP, ...)
|
||||
- Combine with policy routing / traffic shaping systems
|
||||
- stateless IPv6 packet filtering using ip6tables
|
||||
|
||||
3 - Who is behind the project? How to get involved?
|
||||
- started by Paul 'Rusty' Russell from Australia (co-author of ipchains)
|
||||
- Marc Boucher (Canada) and James Morris (Australia) dropped in
|
||||
- Harald Welte (Germany), Jozsef Kadlecsik (Hungary), Martin Josefsson (Sweden) joined coreteam
|
||||
- Countless contributions from hundreds of people all over the world
|
||||
- used to keep a scoreboard, but it was eating too much time
|
||||
- Project internet presence:
|
||||
- HTTP (www.netfilter.org)
|
||||
- FTP (ftp.netfilter.org)
|
||||
- RSYNC (rsync.netfilter.org)
|
||||
- CVS (pserver.netfilter.org)
|
||||
- 5 mailinglists (lists.netfilter.org)
|
||||
- Bugzilla (bugzilla.netfilter.org)
|
||||
- CVSweb (http://cvs.netfilter.org)
|
||||
- Anybody can contribute, as long as the contribution is GPL licensed
|
||||
- development happens on netfilter-devel@lists.netfilter.org
|
||||
- user questions belong to netfilter@lists.netfilter.org
|
||||
- security relevant findings to coreteam@netfilter.org
|
||||
|
||||
Iptables is used by a lot of commercial [and also proprietary] products. Companies like Astaro and Smoothwall are offering iptables-based firewall appliances. Other companies (like Linksys, Belkin, ...) are embedding iptables into their wavelan access points - and users don't even know that they are using iptables.
|
||||
|
|
@ -0,0 +1,220 @@
|
|||
%include "default.mgp"
|
||||
%default 1 bgrad
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
%nodefault
|
||||
%back "blue"
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
The netfilter/iptables project
|
||||
|
||||
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <laforge@netfilter.org>
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables project
|
||||
Contents
|
||||
|
||||
Introduction: Firewalls, Proxies, Packet Filters
|
||||
|
||||
Why a free software firewall?
|
||||
|
||||
What can you do with netfilter/iptables?
|
||||
|
||||
Who is behind the project? How to get involved?
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables project
|
||||
Introduction: Firewalls, Proxies, Packet Filters
|
||||
|
||||
Firewalls are security gateways between networks
|
||||
|
||||
Can be implemented in different ways, at different layers
|
||||
|
||||
Packet filters at networking layer (3)
|
||||
inspect each packet and make decision based on the packet contents
|
||||
traditionally don't know about connections
|
||||
advantage: fast, transparent
|
||||
disadvantage: filtering limited to l3 and l4 headers
|
||||
|
||||
Proxies at application layer (5-7)
|
||||
terminate two connections (client->proxy and proxy->server)
|
||||
advantage: can base decision on application protocol
|
||||
disadvantage: not transparent, need application support
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables project
|
||||
Introduction: Firewalls, Proxies, Packet Filters
|
||||
|
||||
However, the world is not that easy anymore since new techniques are blending those two concepts
|
||||
|
||||
stateful packet filters
|
||||
keep state about existing connections/flows
|
||||
allow even state tracking beyond l4 state
|
||||
thus give packet filters some features of proxies
|
||||
|
||||
transparent proxies
|
||||
can be implemented without application support
|
||||
how 'transparent' do you want to be? to the client? the server? the network?
|
||||
thus give proxies some of the transparency of packet filters
|
||||
|
||||
In reality it is sometimes hard to tell. netfilter/iptables implements a packet filter (stateless/stateful) and some support for transparent proxying.
|
||||
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables project
|
||||
History of linux packet filtering
|
||||
|
||||
%size 3
|
||||
1994: kernel 1.2.x (BSD4.4 ipfw)
|
||||
first packet filter in the linux kernel
|
||||
%size 3
|
||||
1995: kernel 2.0.x (ipfwadm)
|
||||
enhanced version of the old ipfw
|
||||
first support for masquerading
|
||||
%size 3
|
||||
1997: kernel 2.2.x (ipchains)
|
||||
enhanced version of ipfwadm
|
||||
support for multiple lists of rules (chains)
|
||||
support for transparent proxying
|
||||
masquerading helpers for ftp/irc/quake/...
|
||||
%size 3
|
||||
2000: kernel 2.4.x (iptables)
|
||||
totally new implementation (based on netfilter API)
|
||||
allows for multiple tables (which each have multiple chains)
|
||||
first support for stateful packet filtering
|
||||
support for fully symmetric NAT (SNAT/DNAT/...)
|
||||
%size 3
|
||||
2003: kernel 2.6.0-testX (iptables)
|
||||
breaking a tradition: no new packet filter (not yet...)
|
||||
support for non-linear skb's (zerocopy TCP path)
|
||||
%size 3
|
||||
2003/4: kernel 2.7.x and later 2.6.x backport (pkttables)
|
||||
totally new implementation
|
||||
layer 3 independent packet filtering framework
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables project
|
||||
Why a free software firewall?
|
||||
|
||||
Tradition
|
||||
The internet was built on free/open standards and software
|
||||
Code Quality
|
||||
Security relevant open sourcecode gets more auditing because more people read it (and thus report/fix bugs)
|
||||
Trust
|
||||
Users can have more trust in FOSS, since they can check for hidden backdoors
|
||||
Public infrastructure
|
||||
Packet Filters (like routers) are core infrastructure of the internet.
|
||||
Infrastructure should be open/free for the public, just like roads.
|
||||
Arguments against proprietary software in infrastructure
|
||||
What if the vendor of your product goes bankrupt?
|
||||
Users are dependent on 'upgrade pressure' and future license changes
|
||||
No possibility to adopt new standards if Vendor has no interest
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables project
|
||||
What can you do using netfilter/iptables?
|
||||
|
||||
stateless packet filtering
|
||||
provides matches for almost any criteria in the universe
|
||||
stateful packet filtering (using connection tracking)
|
||||
keeps state table about all ongoing connections
|
||||
currently supports TCP/UDP/ICMP/GRE
|
||||
currently supports l5+ helpers for ftp,irc,pptp,h323,talk,mms,tftp,...
|
||||
network address translation
|
||||
stateful, based on connection tracking
|
||||
source NAT / Masquerading
|
||||
destination NAT / redirect
|
||||
1:1 nat of whole networks (NETMAP)
|
||||
packet mangling
|
||||
clamp TCP MSS to PMTU for broken PMTU discovery
|
||||
manipulate packet header (TTL, ECN, DSCP, ...)
|
||||
combine with policy routing / traffic shaping
|
||||
stateless IPv6 packet filtering (ip6tables)
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Who is behind netfilter/iptables?
|
||||
|
||||
Project started by Paul 'Rusty' Russell
|
||||
Coreteam
|
||||
Rusty, Marc Boucher, James Morris, Harald Welte, Jozsef Kadlecsik, Martin Josefsson
|
||||
Elects a head of coreteam
|
||||
Countless contributions from hundreds of people all over the world
|
||||
In the past we had a scoreboard to keep track of the contributions
|
||||
|
||||
We are always short of volunteers, even for listadmin/webmaster/...
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables project
|
||||
How to get involved?
|
||||
|
||||
Internet services:
|
||||
Homepage - http://www.netfilter.org/
|
||||
FTP Server - ftp://ftp.netfilter.org/
|
||||
rsync server - rsync.netfilter.org
|
||||
CVS server - pserver.netfilter.org
|
||||
Bugzilla - http://bugzilla.netfilter.org/
|
||||
CVSweb - http://cvs.netfilter.org/
|
||||
Mailinglist - http://lists.netfilter.org/
|
||||
Anybody can contribute, code has to be GPL licensed
|
||||
Development discussion at netfilter-devel@lists.netfilter.org
|
||||
User questions at netfilter@lists.netfilter.org
|
||||
Security relevant issues at coreteam@netfilter.org
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables project
|
||||
Areas of current development
|
||||
|
||||
pkttables (kernel part, pkttnetlink, libpkttnetlink, libpkttables)
|
||||
make ULOG and ip_queue l3 independent (and move to nfnetlink)
|
||||
optimizing connection tracking SMP performance
|
||||
conntrack: support for more protocols (SCTP,...)
|
||||
nf-hipac: highly optimized packet matching engine
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables project
|
||||
Thanks
|
||||
|
||||
%size 4
|
||||
The slides of this presentation are available at http://www.gnumonks.org/
|
||||
Visit the netfilter homepage http://www.netfilter.org/
|
||||
Thanks to
|
||||
the BBS people, Z-Netz, FIDO, ...
|
||||
for heavily increasing my computer usage in 1992
|
||||
KNF (http://www.franken.de/)
|
||||
for bringing me in touch with the internet as early as 1994
|
||||
for providing a playground for technical people
|
||||
for telling me about the existence of Linux!
|
||||
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
|
||||
for implementing (one of?) the world's best TCP/IP stacks
|
||||
Paul 'Rusty' Russell
|
||||
for starting the netfilter/iptables project
|
||||
for trusting me to maintain it today
|
||||
Astaro AG
|
||||
for sponsoring most of my current netfilter work
|
||||
|
|
@ -0,0 +1,511 @@
|
|||
%include "default.mgp"
|
||||
%default 1 bgrad
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
%nodefault
|
||||
%back "blue"
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
Linux 2.4.x netfilter/iptables
|
||||
firewalling internals
|
||||
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <laforge@netfilter.org>
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Contents
|
||||
|
||||
|
||||
Introduction
|
||||
Netfilter hooks in protocol stacks
|
||||
Packet selection based on IP Tables
|
||||
The Connection Tracking Subsystem
|
||||
The NAT Subsystem based on netfilter + iptables
|
||||
Packet filtering using the 'filter' table
|
||||
Packet mangling using the 'mangle' table
|
||||
Advanced netfilter concepts
|
||||
Current development and Future
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Introduction
|
||||
|
||||
Why did we need netfilter/iptables?
|
||||
Because ipchains...
|
||||
|
||||
has no infrastructure for passing packets to userspace
|
||||
makes transparent proxying extremely difficult
|
||||
has interface address dependent Packet filter rules
|
||||
has Masquerading implemented as part of packet filtering
|
||||
code is too complex and intermixed with core ipv4 stack
|
||||
is neither modular nor extensible
|
||||
only barely supports one special case of NAT (masquerading)
|
||||
has only stateless packet filtering
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Introduction
|
||||
|
||||
Who's behind netfilter/iptables
|
||||
Paul 'Rusty' Russell
|
||||
co-author of ipchains in Linux 2.2
|
||||
was paid by Watchguard for about one year of development
|
||||
James Morris
|
||||
userspace queuing (kernel, library and tools)
|
||||
Marc Boucher
|
||||
NAT and packet filtering controlled by one command
|
||||
Mangle table
|
||||
Harald Welte
|
||||
Conntrack+NAT helper infrastructure (newnat)
|
||||
Userspace packet logging (ULOG)
|
||||
Jozsef Kadlecsik
|
||||
TCP window tracking
|
||||
H.323 conntrack + NAT helper
|
||||
Non-core team contributors
|
||||
http://www.netfilter.org/scoreboard/
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Netfilter Hooks
|
||||
|
||||
What is netfilter?
|
||||
|
||||
System of callback functions within network stack
|
||||
Callback function to be called for every packet traversing certain point (hook) within network stack
|
||||
Protocol independent framework
|
||||
Hooks in layer 3 stacks (IPv4, IPv6, DECnet, ARP)
|
||||
Multiple kernel modules can register with each of the hooks
|
||||
Asynchronous packet handling in userspace (ip_queue)
|
||||
|
||||
Traditional packet filtering, NAT, ... is implemented on top of this framework
|
||||
|
||||
Can be used for other stuff interfacing with the core network stack, like DECnet routing daemon.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Netfilter Hooks
|
||||
|
||||
Netfilter architecture in IPv4
|
||||
%font "typewriter"
|
||||
%size 4
|
||||
|
||||
--->[1]--->[ROUTE]--->[3]--->[4]--->
|
||||
| ^
|
||||
| |
|
||||
| [ROUTE]
|
||||
v |
|
||||
[2] [5]
|
||||
| ^
|
||||
| |
|
||||
v |
|
||||
|
||||
%font "standard"
|
||||
1=NF_IP_PRE_ROUTING
|
||||
2=NF_IP_LOCAL_IN
|
||||
3=NF_IP_FORWARD
|
||||
4=NF_IP_POST_ROUTING
|
||||
5=NF_IP_LOCAL_OUT
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Netfilter Hooks
|
||||
|
||||
Netfilter Hooks
|
||||
|
||||
Any kernel module may register a callback function at any of the hooks
|
||||
|
||||
The module has to return one of the following constants
|
||||
|
||||
NF_ACCEPT continue traversal as normal
|
||||
NF_DROP drop the packet, do not continue
|
||||
NF_STOLEN I've taken over the packet, do not continue
|
||||
NF_QUEUE enqueue packet to userspace
|
||||
NF_REPEAT call this hook again
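How the return values listed above steer a packet through the callbacks registered at one hook can be modelled in a few lines of userspace Python. This is a simplified sketch, not the kernel implementation: the function name `run_hook`, calling the callbacks in list order, and the `max_repeats` safety cap are assumptions made for this example. The numeric values of the constants follow the kernel's linux/netfilter.h.

```python
# Verdict constants as defined in linux/netfilter.h.
NF_DROP, NF_ACCEPT, NF_STOLEN, NF_QUEUE, NF_REPEAT = range(5)


def run_hook(callbacks, packet, max_repeats=8):
    """Run all callbacks registered at one hook and return the final verdict.

    `callbacks` is a list of functions taking the packet and returning one of
    the NF_* constants. `max_repeats` is a guard invented for this sketch so
    that a callback which always answers NF_REPEAT cannot loop forever.
    """
    for callback in callbacks:
        verdict = callback(packet)
        repeats = 0
        while verdict == NF_REPEAT and repeats < max_repeats:
            repeats += 1
            verdict = callback(packet)      # call the same callback again
        if verdict == NF_REPEAT:
            return NF_DROP                  # sketch-only policy: give up
        if verdict != NF_ACCEPT:
            # DROP, STOLEN and QUEUE all end traversal of this hook.
            return verdict
    return NF_ACCEPT                        # every callback accepted the packet
```

Only NF_ACCEPT lets the packet reach the next callback; any other verdict ends traversal at this hook, which is why a filter that returns NF_DROP hides the packet from everything registered after it.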
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
IP tables
|
||||
|
||||
Packet selection using IP tables
|
||||
|
||||
The kernel provides generic IP tables support
|
||||
|
||||
Each kernel module may create its own IP table
|
||||
|
||||
The three major parts of 2.4 firewalling subsystem are implemented using IP tables
|
||||
Packet filtering table 'filter'
|
||||
NAT table 'nat'
|
||||
Packet mangling table 'mangle'
|
||||
|
||||
Can potentially be used for other stuff, e.g. IPsec SPDB
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
IP Tables
|
||||
|
||||
Managing chains and tables
|
||||
|
||||
An IP table consists of multiple chains
|
||||
A chain consists of a list of rules
|
||||
Every single rule in a chain consists of
|
||||
match[es] (rule executed if all matches true)
|
||||
target (what to do if the rule is matched)
|
||||
|
||||
%size 4
|
||||
matches and targets can either be builtin or implemented as kernel modules
|
||||
|
||||
%size 5
|
||||
The userspace tool iptables is used to control IP tables
|
||||
handles all different kinds of IP tables
|
||||
supports a plugin/shlib interface for target/match specific options
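The relationship between tables, chains, rules, matches and targets described above can be made concrete with a small userspace model. It is a sketch under stated assumptions, not the in-kernel data structure: a chain is a Python list of `(matches, target)` pairs, a match is any function that takes a packet dictionary and returns a boolean, and the names `traverse` and `_walk` are invented for this example. Only the ACCEPT, DROP and RETURN targets and jumps to user-defined chains are modelled.

```python
def _walk(chains, chain_name, packet):
    """Walk one chain; return 'ACCEPT', 'DROP', or None if no verdict was reached."""
    for matches, target in chains[chain_name]:
        # A rule applies only if all of its matches are true
        # (a rule without matches applies to every packet).
        if not all(match(packet) for match in matches):
            continue
        if target in ("ACCEPT", "DROP"):
            return target
        if target == "RETURN":
            return None                  # back to the calling chain
        # Any other target name is treated as a jump to a user-defined chain.
        verdict = _walk(chains, target, packet)
        if verdict is not None:
            return verdict
        # The called chain returned without a verdict: continue with next rule.
    return None                          # fell off the end of the chain


def traverse(chains, builtin_chain, packet, policy="ACCEPT"):
    """Evaluate a packet against a built-in chain, falling back to its policy."""
    verdict = _walk(chains, builtin_chain, packet)
    return policy if verdict is None else verdict
```

A usage example with a hypothetical user-defined chain `tcp_in`:

```python
chains = {
    "INPUT": [([lambda p: p["proto"] == "tcp"], "tcp_in")],
    "tcp_in": [([lambda p: p["dport"] == 25], "ACCEPT"),
               ([], "RETURN")],
}
verdict = traverse(chains, "INPUT", {"proto": "tcp", "dport": 25}, policy="DROP")
```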
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
IP Tables
|
||||
|
||||
Basic iptables commands
|
||||
|
||||
To build a complete iptables command, we must specify
|
||||
which table to work with
|
||||
which chain in this table to use
|
||||
an operation (insert, add, delete, modify)
|
||||
one or more matches (optional)
|
||||
a target
|
||||
|
||||
The syntax is
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t table -Operation chain -j target match(es)
|
||||
%font "standard"
|
||||
%size 5
|
||||
|
||||
Example:
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t filter -A INPUT -j ACCEPT -p tcp --dport smtp
|
||||
%font "standard"
|
||||
%size 5
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
IP Tables
|
||||
|
||||
Matches
|
||||
Basic matches
|
||||
-p protocol (tcp/udp/icmp/...)
|
||||
-s source address (ip/mask)
|
||||
-d destination address (ip/mask)
|
||||
-i incoming interface
|
||||
-o outgoing interface
|
||||
|
||||
Match extensions (examples)
|
||||
tcp/udp TCP/udp source/destination port
|
||||
icmp ICMP code/type
|
||||
ah/esp AH/ESP SPI match
|
||||
mac source MAC address
|
||||
mark nfmark
|
||||
length match on length of packet
|
||||
limit rate limiting (n packets per timeframe)
|
||||
owner owner uid of the socket sending the packet
|
||||
tos TOS field of IP header
|
||||
ttl TTL field of IP header
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
IP Tables
|
||||
|
||||
Targets
|
||||
very dependent on the particular table.
|
||||
|
||||
Table specific targets will be discussed later
|
||||
|
||||
Generic Targets, always available
|
||||
ACCEPT accept packet within chain
|
||||
DROP silently drop packet
|
||||
QUEUE enqueue packet to userspace
|
||||
LOG log packet via syslog
|
||||
ULOG log packet via ulogd
|
||||
RETURN return to previous (calling) chain
|
||||
foobar jump to user defined chain
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Packet Filtering
|
||||
|
||||
Overview
|
||||
|
||||
Implemented as 'filter' table
|
||||
Registers with three netfilter hooks
|
||||
|
||||
NF_IP_LOCAL_IN (packets destined for the local host)
|
||||
NF_IP_FORWARD (packets forwarded by local host)
|
||||
NF_IP_LOCAL_OUT (packets from the local host)
|
||||
|
||||
Each of the three hooks has one chain attached (INPUT, FORWARD, OUTPUT)
|
||||
|
||||
Every packet passes exactly one of the three chains. Note that this is very different compared to the old 2.2.x ipchains behaviour.
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Packet Filtering
|
||||
|
||||
Targets available within 'filter' table
|
||||
|
||||
Builtin Targets to be used in filter table
|
||||
ACCEPT accept the packet
|
||||
DROP silently drop the packet
|
||||
QUEUE enqueue packet to userspace
|
||||
RETURN return to previous (calling) chain
|
||||
foobar user defined chain
|
||||
|
||||
Targets implemented as loadable modules
|
||||
REJECT drop the packet but inform sender
|
||||
MIRROR change source/destination IP and resend
|
||||
LOG log via syslog
|
||||
ULOG log via userspace
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Connection tracking...
|
||||
implemented separately from NAT
|
||||
enables stateful filtering
|
||||
implementation
|
||||
hooks into NF_IP_PRE_ROUTING to track packets
|
||||
hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
|
||||
protocol modules (currently TCP/UDP/ICMP)
|
||||
application helpers (currently FTP, IRC, H.323, talk, SNMP)
|
||||
divides packets into the following four categories
|
||||
NEW - would establish new connection
|
||||
ESTABLISHED - part of already established connection
|
||||
RELATED - is related to established connection
|
||||
INVALID - (multicast, errors...)
|
||||
does _NOT_ filter packets itself
|
||||
can be utilized by iptables using the 'state' match
|
||||
is used by NAT Subsystem
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Common structures
|
||||
struct ip_conntrack_tuple, representing unidirectional flow
|
||||
layer 3 src + dst
|
||||
layer 4 protocol
|
||||
layer 4 src + dst
|
||||
connections represented as struct ip_conntrack
|
||||
original tuple
|
||||
reply tuple
|
||||
timeout
|
||||
l4 state private data
|
||||
app helper
|
||||
app helper private data
|
||||
expected connections
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Flow of events for new packet
|
||||
packet enters NF_IP_PRE_ROUTING
|
||||
tuple is derived from packet
|
||||
lookup conntrack hash table with hash(tuple) -> fails
|
||||
new ip_conntrack is allocated
|
||||
fill in original and reply == inverted(original) tuple
|
||||
initialize timer
|
||||
assign app helper if applicable
|
||||
see if we've been expected -> fails
|
||||
call layer 4 helper 'new' function
|
||||
...
|
||||
packet enters NF_IP_POST_ROUTING
|
||||
do hashtable lookup for packet -> fails
|
||||
place struct ip_conntrack in hashtable
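The two-stage handling of a new connection (create the entry at NF_IP_PRE_ROUTING, insert it into the hash table only at NF_IP_POST_ROUTING) can be sketched as below. This is a userspace model under stated assumptions: a Python dictionary stands in for the conntrack hash table, and the names `FlowTuple`, `Conntrack` and `ConntrackTable` are invented for this example rather than taken from the kernel. Timers, helpers and expectations are left out.

```python
from collections import namedtuple

# One direction of a flow: layer 3 src/dst, layer 4 protocol and src/dst ports.
FlowTuple = namedtuple("FlowTuple", "src dst proto sport dport")


def invert(t):
    """Build the reply tuple by swapping source and destination."""
    return FlowTuple(t.dst, t.src, t.proto, t.dport, t.sport)


class Conntrack:
    """A connection: the original tuple plus the expected reply tuple."""

    def __init__(self, original):
        self.original = original
        self.reply = invert(original)


class ConntrackTable:
    def __init__(self):
        self.entries = {}    # tuple -> Conntrack, indexed in both directions

    def pre_routing(self, tuple_):
        """Look up the packet's tuple; allocate a new entry if the lookup fails."""
        found = self.entries.get(tuple_)
        if found is not None:
            return found, "ESTABLISHED"
        # Allocated but not yet placed in the table.
        return Conntrack(tuple_), "NEW"

    def post_routing(self, conntrack):
        """Place the entry in the table once the packet has passed all hooks."""
        if self.entries.get(conntrack.original) is None:
            self.entries[conntrack.original] = conntrack
            self.entries[conntrack.reply] = conntrack
```

Because insertion happens only at the last hook, a packet that is dropped by a filter rule in between never leaves an entry behind in the table.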
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Flow of events for packet part of existing connection
|
||||
packet enters NF_IP_PRE_ROUTING
|
||||
tuple is derived from packet
|
||||
lookup conntrack hash table with hash(tuple)
|
||||
associate conntrack entry with skb->nfct
|
||||
call l4 protocol helper 'packet' function
|
||||
do l4 state tracking
|
||||
update timeouts as needed [e.g. TCP TIME_WAIT,...]
|
||||
...
|
||||
packet enters NF_IP_POST_ROUTING
|
||||
do hashtable lookup for packet -> succeeds
|
||||
do nothing else
|
||||
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Network Address Translation
|
||||
|
||||
Overview
|
||||
|
||||
Previous Linux Kernels only implemented one special case of NAT: Masquerading
|
||||
Linux 2.4.x can do any kind of NAT.
|
||||
NAT subsystem implemented on top of netfilter, iptables and conntrack
|
||||
NAT subsystem registers with all five netfilter hooks
|
||||
'nat' Table registers chains PREROUTING, POSTROUTING and OUTPUT
|
||||
Following targets available within 'nat' Table
|
||||
SNAT changes the packet's source while passing NF_IP_POST_ROUTING
|
||||
DNAT changes the packet's destination while passing NF_IP_PRE_ROUTING
|
||||
MASQUERADE is a special case of SNAT
|
||||
REDIRECT is a special case of DNAT
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Network Address Translation
|
||||
|
||||
Source NAT
|
||||
SNAT Example:
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t nat -A POSTROUTING -j SNAT --to-source 1.2.3.4 -s 10.0.0.0/8
|
||||
%font "standard"
|
||||
%size 4
|
||||
|
||||
MASQUERADE Example:
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t nat -A POSTROUTING -j MASQUERADE -o ppp0
|
||||
%font "standard"
|
||||
%size 5
|
||||
|
||||
Destination NAT
|
||||
DNAT example
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t nat -A PREROUTING -j DNAT --to-destination 1.2.3.4:8080 -p tcp --dport 80 -i eth1
|
||||
%font "standard"
|
||||
%size 4
|
||||
|
||||
REDIRECT example
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t nat -A PREROUTING -j REDIRECT --to-port 3128 -i eth1 -p tcp --dport 80
|
||||
%font "standard"
|
||||
%size 5
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Packet Mangling
|
||||
|
||||
Purpose of mangle table
|
||||
packet manipulation except address manipulation
|
||||
Integration with netfilter
|
||||
'mangle' table hooks in all five netfilter hooks
|
||||
priority: after conntrack
|
||||
Targets specific to the 'mangle' table:
|
||||
DSCP - manipulate DSCP field
|
||||
IPV4OPTSSTRIP - strip IPv4 options
|
||||
MARK - change the nfmark field of the skb
|
||||
TCPMSS - set TCP MSS option
|
||||
TOS - manipulate the TOS bits
|
||||
TTL - set / increase / decrease TTL field
|
||||
Simple example:
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t mangle -A PREROUTING -j MARK --set-mark 10 -p tcp --dport 80
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Advanced Netfilter concepts
|
||||
|
||||
%size 4
|
||||
Userspace logging
|
||||
flexible replacement for old syslog-based logging
|
||||
packets to userspace via multicast netlink sockets
|
||||
easy-to-use library (libipulog)
|
||||
plugin-extensible userspace logging daemon (ulogd)
|
||||
Can even be used to directly log into MySQL
|
||||
|
||||
Queuing
|
||||
reliable asynchronous packet handling
|
||||
packets to userspace via unicast netlink socket
|
||||
easy-to-use library (libipq)
|
||||
provides Perl bindings
|
||||
experimental queue multiplex daemon (ipqmpd)
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Current Development and Future
|
||||
|
||||
Netfilter (although it has proved very stable) is still a work in progress.
|
||||
|
||||
Areas of current development
|
||||
infrastructure for conntrack manipulation from userspace
|
||||
failover of stateful firewalls (conntrack sync)
|
||||
making iptables layer3 independent (pkttables)
|
||||
new userspace library (libiptables) to hide plugins from apps
|
||||
more matches and targets for advanced functions (pool, hashslot)
|
||||
more conntrack and NAT modules (RPC, SNMP, SMB, ...)
|
||||
better IPv6 support (conntrack, more matches / targets)
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Thanks
|
||||
|
||||
Thanks to
|
||||
the BBS people, Z-Netz, FIDO, ...
|
||||
for heavily increasing my computer usage in 1992
|
||||
KNF
|
||||
for bringing me in touch with the internet as early as 1994
|
||||
for providing a playground for technical people
|
||||
for telling me about the existence of Linux!
|
||||
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
|
||||
for implementing (one of?) the world's best TCP/IP stacks
|
||||
Paul 'Rusty' Russell
|
||||
for starting the netfilter/iptables project
|
||||
for trusting me to maintain it today
|
||||
Astaro AG
|
||||
for sponsoring parts of my netfilter work
|
||||
sarovar.org
|
||||
for sponsoring www.in.netfilter.org
|
||||
%size 3
|
||||
The slides and the accompanying paper of this presentation are available at http://www.gnumonks.org/
|
||||
%size 3
|
||||
The netfilter homepage http://www.netfilter.org/
|
||||
|
|
@ -0,0 +1,49 @@
|
|||
Linux 2.4.x netfilter/iptables firewalling internals
|
||||
|
||||
The Linux 2.4.x kernel series has introduced a totally new kernel firewalling subsystem. It is much more than a plain successor of ipfwadm or ipchains.
|
||||
|
||||
The netfilter/iptables project has a very modular design and its
|
||||
sub-projects can be split in several parts: netfilter, iptables, connection
|
||||
tracking, NAT and packet mangling.
|
||||
|
||||
While most users will already have learned how to use the basic functions
|
||||
of netfilter/iptables in order to convert their old ipchains firewalls to
|
||||
iptables, there's more advanced but less used functionality in
|
||||
netfilter/iptables.
|
||||
|
||||
The presentation covers the design principles behind the netfilter/iptables
|
||||
implementation. This knowledge enables us to understand how the individual
|
||||
parts of netfilter/iptables fit together, and for which potential applications
|
||||
this is useful.
|
||||
|
||||
Topics covered:
|
||||
|
||||
- overview of the internal netfilter/iptables architecture
|
||||
- the netfilter hooks inside the network protocol stacks
|
||||
- packet selection with IP tables
|
||||
- how are connection tracking and NAT integrated into the framework
|
||||
- the connection tracking system
|
||||
- how well does it track the TCP state?
|
||||
- how does it track ICMP and UDP state at all?
|
||||
- layer 4 protocol helpers (GRE, ...)
|
||||
- application helpers (ftp, irc, h323, ...)
|
||||
- restrictions/limitations
|
||||
- the NAT system
|
||||
- how does it interact with connection tracking?
|
||||
- layer 4 protocol helpers
|
||||
- application helpers (ftp, irc, ...)
|
||||
- misc
|
||||
- how far is IPv6 firewalling with ip6tables?
|
||||
- advances in failover/HA of stateful firewalls
|
||||
- invisible firewalls with iptables on a bridge
|
||||
- userspace packet queueing with QUEUE
|
||||
- userspace packet logging with ULOG
|
||||
|
||||
Requirements:
|
||||
- knowledge about the TCP/IP protocol family
|
||||
- knowledge about general firewalling and packet filtering concepts
|
||||
- prior experience with linux packet filters
|
||||
|
||||
Audience:
|
||||
- firewall administrators
|
||||
- network developers
|
|
@ -0,0 +1,22 @@
|
|||
<a href="http://gnumonks.org/users/laforge/">Harald Welte</a> is one
|
||||
of the five <a href="http://www.netfilter.org/">netfilter/iptables</a> core
|
||||
team members, and the current Linux 2.4.x firewalling maintainer.
|
||||
|
||||
His main interest in computing has always been networking. In the little time
|
||||
left besides netfilter/iptables related work, he's writing obscure documents
|
||||
like the <a href="http://www.gnumonks.org/ftp/pub/doc/uucp-over-ssl.html"> UUCP
|
||||
over SSL HOWTO</a>. Other kernel-related projects he has been contributing to are
|
||||
user mode linux and the international (crypto) kernel patch.
|
||||
|
||||
In the past he has been working as an independent IT Consultant working on
|
||||
closed-source projects for various companies ranging from banks to
|
||||
manufacturers of networking gear. During the year 2001 he was living in
|
||||
Curitiba (Brazil), where he got sponsored for his Linux related work by
|
||||
<a href="http://www.conectiva.com/">Conectiva Inc.</a>.
|
||||
|
||||
Starting with February 2002, Harald has been contracted part-time by
|
||||
<a href="http://www.astaro.com/">Astaro AG</a>, who are sponsoring him for his
|
||||
current netfilter/iptables work.
|
||||
|
||||
Harald is living in Berlin, Germany.
|
||||
|
|
@ -0,0 +1,509 @@
|
|||
%include "default.mgp"
|
||||
%default 1 bgrad
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
%nodefault
|
||||
%back "blue"
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
Linux 2.4.x netfilter/iptables
|
||||
firewalling internals
|
||||
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <laforge@netfilter.org>
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Contents
|
||||
|
||||
|
||||
Introduction
|
||||
Netfilter hooks in protocol stacks
|
||||
Packet selection based on IP Tables
|
||||
The Connection Tracking Subsystem
|
||||
The NAT Subsystem based on netfilter + iptables
|
||||
Packet filtering using the 'filter' table
|
||||
Packet mangling using the 'mangle' table
|
||||
Advanced netfilter concepts
|
||||
Current development and Future
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Introduction
|
||||
|
||||
Why did we need netfilter/iptables?
|
||||
Because ipchains...
|
||||
|
||||
has no infrastructure for passing packets to userspace
|
||||
makes transparent proxying extremely difficult
|
||||
has interface address dependent Packet filter rules
|
||||
has Masquerading implemented as part of packet filtering
|
||||
code is too complex and intermixed with core ipv4 stack
|
||||
is neither modular nor extensible
|
||||
only barely supports one special case of NAT (masquerading)
|
||||
has only stateless packet filtering
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Introduction
|
||||
|
||||
Who's behind netfilter/iptables
|
||||
Paul 'Rusty' Russell
|
||||
co-author of ipchains in Linux 2.2
|
||||
was paid by Watchguard for about one year of development
|
||||
James Morris
|
||||
userspace queuing (kernel, library and tools)
|
||||
Marc Boucher
|
||||
NAT and packet filtering controlled by one command
|
||||
Mangle table
|
||||
Harald Welte
|
||||
Conntrack+NAT helper infrastructure (newnat)
|
||||
Userspace packet logging (ULOG)
|
||||
Jozsef Kadlecsik
|
||||
TCP window tracking
|
||||
H.323 conntrack + NAT helper
|
||||
Non-core team contributors
|
||||
http://www.netfilter.org/scoreboard/
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Netfilter Hooks
|
||||
|
||||
What is netfilter?
|
||||
|
||||
System of callback functions within network stack
|
||||
Callback function to be called for every packet traversing certain point (hook) within network stack
|
||||
Protocol independent framework
|
||||
Hooks in layer 3 stacks (IPv4, IPv6, DECnet, ARP)
|
||||
Multiple kernel modules can register with each of the hooks
|
||||
Asynchronous packet handling in userspace (ip_queue)
|
||||
|
||||
Traditional packet filtering, NAT, ... is implemented on top of this framework
|
||||
|
||||
Can be used for other stuff interfacing with the core network stack, like DECnet routing daemon.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Netfilter Hooks
|
||||
|
||||
Netfilter architecture in IPv4
|
||||
%font "typewriter"
|
||||
%size 4
|
||||
|
||||
--->[1]--->[ROUTE]--->[3]--->[4]--->
|
||||
| ^
|
||||
| |
|
||||
| [ROUTE]
|
||||
v |
|
||||
[2] [5]
|
||||
| ^
|
||||
| |
|
||||
v |
|
||||
|
||||
%font "standard"
|
||||
1=NF_IP_PRE_ROUTING
|
||||
2=NF_IP_LOCAL_IN
|
||||
3=NF_IP_FORWARD
|
||||
4=NF_IP_POST_ROUTING
|
||||
5=NF_IP_LOCAL_OUT
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Netfilter Hooks
|
||||
|
||||
Netfilter Hooks
|
||||
|
||||
Any kernel module may register a callback function at any of the hooks
|
||||
|
||||
The module has to return one of the following constants
|
||||
|
||||
NF_ACCEPT continue traversal as normal
|
||||
NF_DROP drop the packet, do not continue
|
||||
NF_STOLEN I've taken over the packet, do not continue
|
||||
NF_QUEUE enqueue packet to userspace
|
||||
NF_REPEAT call this hook again
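How the return constants above steer traversal can be sketched in plain userspace C. This is a simplified model, not the kernel's hook iteration code: `struct packet`, `hook_fn`, `run_hooks`, the two sample callbacks and the `MAX_REPEATS` cap are all hypothetical, and the numeric verdict values are chosen for illustration only.

```c
#include <stddef.h>

/* Verdict names as on the slide; the numeric values are illustrative. */
enum { NF_DROP = 0, NF_ACCEPT = 1, NF_STOLEN = 2, NF_QUEUE = 3, NF_REPEAT = 4 };

#define MAX_REPEATS 8	/* assumption: cap repeats so the sketch cannot loop forever */

struct packet { int ttl; };			/* hypothetical stand-in for an skb */
typedef int (*hook_fn)(struct packet *pkt);	/* callback registered at a hook */

/* Call each registered callback in order until one ends the traversal. */
static int run_hooks(const hook_fn *hooks, size_t n, struct packet *pkt)
{
	size_t i = 0;
	int repeats = 0;

	while (i < n) {
		int verdict = hooks[i](pkt);

		if (verdict == NF_REPEAT) {
			if (++repeats > MAX_REPEATS)
				return NF_DROP;
			continue;		/* call the same callback again */
		}
		if (verdict != NF_ACCEPT)
			return verdict;		/* DROP, STOLEN and QUEUE stop here */
		repeats = 0;
		i++;				/* ACCEPT: move on to the next callback */
	}
	return NF_ACCEPT;
}

/* Sample callbacks for the usage example. */
static int accept_all(struct packet *pkt) { (void)pkt; return NF_ACCEPT; }
static int drop_expired(struct packet *pkt) { return pkt->ttl == 0 ? NF_DROP : NF_ACCEPT; }
```

With `accept_all` and `drop_expired` registered in that order, a packet with a TTL of 64 passes every callback, while a packet with a TTL of 0 is dropped by the second one.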
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
IP tables
|
||||
|
||||
Packet selection using IP tables
|
||||
|
||||
The kernel provides generic IP tables support
|
||||
|
||||
Each kernel module may create its own IP table
|
||||
|
||||
The three major parts of 2.4 firewalling subsystem are implemented using IP tables
|
||||
Packet filtering table 'filter'
|
||||
NAT table 'nat'
|
||||
Packet mangling table 'mangle'
|
||||
|
||||
Can potentially be used for other stuff, e.g. IPsec SPDB
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
IP Tables
|
||||
|
||||
Managing chains and tables
|
||||
|
||||
An IP table consists of multiple chains
|
||||
A chain consists of a list of rules
|
||||
Every single rule in a chain consists of
|
||||
match[es] (rule executed if all matches true)
|
||||
target (what to do if the rule is matched)
|
||||
|
||||
%size 4
|
||||
matches and targets can either be builtin or implemented as kernel modules
|
||||
|
||||
%size 5
|
||||
The userspace tool iptables is used to control IP tables
|
||||
handles all different kinds of IP tables
|
||||
supports a plugin/shlib interface for target/match specific options
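The table/chain/rule relationship described above can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: `struct pkt`, `struct match`, `struct rule`, `chain_eval` and the two sample match functions are invented for illustration and do not mirror the kernel's ipt_entry layout. The fallback `policy` argument stands in for a chain's default policy.

```c
#include <stddef.h>
#include <stdint.h>

struct pkt { uint8_t proto; uint16_t dport; };	/* hypothetical packet summary */

typedef int (*match_fn)(const struct pkt *p, const void *info);

struct match { match_fn fn; const void *info; };	/* one match plus its parameters */

enum { T_ACCEPT = 1, T_DROP = 2 };			/* illustrative target codes */

struct rule {
	const struct match *matches;	/* all of these must be true */
	size_t nmatches;
	int target;			/* what to do if the rule matches */
};

/* A rule matches only if every one of its matches returns true. */
static int rule_matches(const struct rule *r, const struct pkt *p)
{
	size_t i;

	for (i = 0; i < r->nmatches; i++)
		if (!r->matches[i].fn(p, r->matches[i].info))
			return 0;
	return 1;
}

/* Walk the chain in order; the first matching rule decides the target. */
static int chain_eval(const struct rule *rules, size_t n,
		      const struct pkt *p, int policy)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (rule_matches(&rules[i], p))
			return rules[i].target;
	return policy;
}

/* Sample matches, comparable to "-p tcp" and "--dport 25". */
static int match_proto(const struct pkt *p, const void *info)
{
	return p->proto == *(const uint8_t *)info;
}

static int match_dport(const struct pkt *p, const void *info)
{
	return p->dport == *(const uint16_t *)info;
}
```

A chain holding one rule with both sample matches and target `T_ACCEPT` accepts TCP packets to port 25 and hands everything else to the policy.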
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
IP Tables
|
||||
|
||||
Basic iptables commands
|
||||
|
||||
To build a complete iptables command, we must specify
|
||||
which table to work with
|
||||
which chain in this table to use
|
||||
an operation (insert, add, delete, modify)
|
||||
one or more matches (optional)
|
||||
a target
|
||||
|
||||
The syntax is
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t table -Operation chain -j target match(es)
|
||||
%font "standard"
|
||||
%size 5
|
||||
|
||||
Example:
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t filter -A INPUT -j ACCEPT -p tcp --dport smtp
|
||||
%font "standard"
|
||||
%size 5
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
IP Tables
|
||||
|
||||
Matches
|
||||
Basic matches
|
||||
-p protocol (tcp/udp/icmp/...)
|
||||
-s source address (ip/mask)
|
||||
-d destination address (ip/mask)
|
||||
-i incoming interface
|
||||
-o outgoing interface
|
||||
|
||||
Match extensions (examples)
|
||||
tcp/udp TCP/udp source/destination port
|
||||
icmp ICMP code/type
|
||||
ah/esp AH/ESP SPID match
|
||||
mac source MAC address
|
||||
mark nfmark
|
||||
length match on length of packet
|
||||
limit rate limiting (n packets per timeframe)
|
||||
owner owner uid of the socket sending the packet
|
||||
tos TOS field of IP header
|
||||
ttl TTL field of IP header
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
IP Tables
|
||||
|
||||
Targets
|
||||
very dependent on the particular table.
|
||||
|
||||
Table specific targets will be discussed later
|
||||
|
||||
Generic Targets, always available
|
||||
ACCEPT accept packet within chain
|
||||
DROP silently drop packet
|
||||
QUEUE enqueue packet to userspace
|
||||
LOG log packet via syslog
|
||||
ULOG log packet via ulogd
|
||||
RETURN return to previous (calling) chain
|
||||
foobar jump to user defined chain
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Packet Filtering
|
||||
|
||||
Overview
|
||||
|
||||
Implemented as 'filter' table
|
||||
Registers with three netfilter hooks
|
||||
|
||||
NF_IP_LOCAL_IN (packets destined for the local host)
|
||||
NF_IP_FORWARD (packets forwarded by local host)
|
||||
NF_IP_LOCAL_OUT (packets from the local host)
|
||||
|
||||
Each of the three hooks has one chain attached (INPUT, FORWARD, OUTPUT)
|
||||
|
||||
Every packet passes exactly one of the three chains. Note that this is very different compared to the old 2.2.x ipchains behaviour.
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Packet Filtering
|
||||
|
||||
Targets available within 'filter' table
|
||||
|
||||
Builtin Targets to be used in filter table
|
||||
ACCEPT accept the packet
|
||||
DROP silently drop the packet
|
||||
QUEUE enqueue packet to userspace
|
||||
RETURN return to previous (calling) chain
|
||||
foobar user defined chain
|
||||
|
||||
Targets implemented as loadable modules
|
||||
REJECT drop the packet but inform sender
|
||||
MIRROR change source/destination IP and resend
|
||||
LOG log via syslog
|
||||
ULOG log via userspace
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Connection tracking...
|
||||
implemented separately from NAT
|
||||
enables stateful filtering
|
||||
implementation
|
||||
hooks into NF_IP_PRE_ROUTING to track packets
|
||||
hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
|
||||
protocol modules (currently TCP/UDP/ICMP)
|
||||
application helpers (currently FTP, IRC, H.323, talk, SNMP)
|
||||
divides packets into the following four categories
|
||||
NEW - would establish new connection
|
||||
ESTABLISHED - part of already established connection
|
||||
RELATED - is related to established connection
|
||||
INVALID - (multicast, errors...)
|
||||
does _NOT_ filter packets itself
|
||||
can be utilized by iptables using the 'state' match
|
||||
is used by NAT Subsystem
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Common structures
|
||||
struct ip_conntrack_tuple, representing unidirectional flow
|
||||
layer 3 src + dst
|
||||
layer 4 protocol
|
||||
layer 4 src + dst
|
||||
connections represented as struct ip_conntrack
|
||||
original tuple
|
||||
reply tuple
|
||||
timeout
|
||||
l4 state private data
|
||||
app helper
|
||||
app helper private data
|
||||
expected connections
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Flow of events for new packet
|
||||
packet enters NF_IP_PRE_ROUTING
|
||||
tuple is derived from packet
|
||||
lookup conntrack hash table with hash(tuple) -> fails
|
||||
new ip_conntrack is allocated
|
||||
fill in original and reply == inverted(original) tuple
|
||||
initialize timer
|
||||
assign app helper if applicable
|
||||
see if we've been expected -> fails
|
||||
call layer 4 helper 'new' function
|
||||
...
|
||||
packet enters NF_IP_POST_ROUTING
|
||||
do hashtable lookup for packet -> fails
|
||||
place struct ip_conntrack in hashtable
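The flow of events above (lookup fails, allocate, then insert only once the packet reaches NF_IP_POST_ROUTING) can be mimicked with a toy hash table in userspace. Everything here is an assumption for illustration: `struct flow_key`, `struct conn`, `HASH_SIZE`, the XOR hash and the function names are hypothetical, and the reply tuple, timers and helpers are left out.

```c
#include <stdint.h>
#include <stdlib.h>

#define HASH_SIZE 64

struct flow_key {			/* hypothetical, simplified tuple */
	uint32_t src_ip, dst_ip;
	uint16_t src_port, dst_port;
	uint8_t  protonum;
};

struct conn {				/* hypothetical stand-in for struct ip_conntrack */
	struct flow_key orig;
	struct conn *next;		/* hash bucket chaining */
};

static struct conn *conn_table[HASH_SIZE];

static unsigned int hash_key(const struct flow_key *k)
{
	uint32_t h = k->src_ip ^ k->dst_ip ^ k->protonum ^
		     ((uint32_t)k->src_port << 16) ^ k->dst_port;

	return h % HASH_SIZE;
}

static int key_equal(const struct flow_key *a, const struct flow_key *b)
{
	return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
	       a->src_port == b->src_port && a->dst_port == b->dst_port &&
	       a->protonum == b->protonum;
}

static struct conn *lookup(const struct flow_key *k)
{
	struct conn *c;

	for (c = conn_table[hash_key(k)]; c; c = c->next)
		if (key_equal(&c->orig, k))
			return c;
	return NULL;
}

/* PRE_ROUTING: reuse a known connection or allocate a new one (not yet in the table). */
static struct conn *pre_routing(const struct flow_key *k)
{
	struct conn *c = lookup(k);

	if (c)
		return c;
	c = calloc(1, sizeof(*c));
	if (c)
		c->orig = *k;
	return c;
}

/* POST_ROUTING: the packet survived filtering, so insert the entry if it is still missing. */
static void post_routing(struct conn *c)
{
	unsigned int h = hash_key(&c->orig);

	if (lookup(&c->orig))
		return;
	c->next = conn_table[h];
	conn_table[h] = c;
}
```

In this sketch, an entry allocated in `pre_routing` stays invisible to `lookup` until `post_routing` runs, so a connection whose first packet is dropped by the filter never occupies a slot in the table (the caller would free it instead).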
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Flow of events for packet part of existing connection
|
||||
packet enters NF_IP_PRE_ROUTING
|
||||
tuple is derived from packet
|
||||
lookup conntrack hash table with hash(tuple)
|
||||
associate conntrack entry with skb->nfct
|
||||
call l4 protocol helper 'packet' function
|
||||
do l4 state tracking
|
||||
update timeouts as needed [e.g. TCP TIME_WAIT,...]
|
||||
...
|
||||
packet enters NF_IP_POST_ROUTING
|
||||
do hashtable lookup for packet -> succeeds
|
||||
do nothing else
|
||||
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Network Address Translation
|
||||
|
||||
Overview
|
||||
|
||||
Previous Linux Kernels only implemented one special case of NAT: Masquerading
|
||||
Linux 2.4.x can do any kind of NAT.
|
||||
NAT subsystem implemented on top of netfilter, iptables and conntrack
|
||||
NAT subsystem registers with all five netfilter hooks
|
||||
'nat' Table registers chains PREROUTING, POSTROUTING and OUTPUT
|
||||
Following targets available within 'nat' Table
|
||||
SNAT changes the packet's source while passing NF_IP_POST_ROUTING
|
||||
DNAT changes the packet's destination while passing NF_IP_PRE_ROUTING
|
||||
MASQUERADE is a special case of SNAT
|
||||
REDIRECT is a special case of DNAT
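What SNAT and DNAT do to a flow's addresses, and why NAT builds on connection tracking, can be sketched as follows. This is a userspace illustration under assumptions: `struct nat_tuple`, `apply_snat`, `apply_dnat` and `expected_reply` are hypothetical names, not the NAT subsystem's API.

```c
#include <stdint.h>

struct nat_tuple {			/* hypothetical, simplified flow description */
	uint32_t src_ip, dst_ip;
	uint16_t src_port, dst_port;
};

/* SNAT: rewrite only the source address (applied at POST_ROUTING). */
static struct nat_tuple apply_snat(struct nat_tuple t, uint32_t new_src_ip)
{
	t.src_ip = new_src_ip;
	return t;
}

/* DNAT: rewrite only the destination address and port (applied at PRE_ROUTING). */
static struct nat_tuple apply_dnat(struct nat_tuple t, uint32_t new_dst_ip,
				   uint16_t new_dst_port)
{
	t.dst_ip = new_dst_ip;
	t.dst_port = new_dst_port;
	return t;
}

/* Replies arrive addressed to the translated tuple, so invert that one. */
static struct nat_tuple expected_reply(struct nat_tuple t)
{
	struct nat_tuple r;

	r.src_ip   = t.dst_ip;
	r.dst_ip   = t.src_ip;
	r.src_port = t.dst_port;
	r.dst_port = t.src_port;
	return r;
}
```

For a packet from 10.0.0.5 that is source-NATed to 1.2.3.4, the reply to expect is addressed to 1.2.3.4, not to 10.0.0.5; recording that expected reply per connection is what lets the reverse translation be applied to returning packets.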
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Network Address Translation
|
||||
|
||||
Source NAT
|
||||
SNAT Example:
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t nat -A POSTROUTING -j SNAT --to-source 1.2.3.4 -s 10.0.0.0/8
|
||||
%font "standard"
|
||||
%size 4
|
||||
|
||||
MASQUERADE Example:
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t nat -A POSTROUTING -j MASQUERADE -o ppp0
|
||||
%font "standard"
|
||||
%size 5
|
||||
|
||||
Destination NAT
|
||||
DNAT example
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t nat -A PREROUTING -j DNAT --to-destination 1.2.3.4:8080 -p tcp --dport 80 -i eth1
|
||||
%font "standard"
|
||||
%size 4
|
||||
|
||||
REDIRECT example
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t nat -A PREROUTING -j REDIRECT --to-port 3128 -i eth1 -p tcp --dport 80
|
||||
%font "standard"
|
||||
%size 5
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Packet Mangling
|
||||
|
||||
Purpose of mangle table
|
||||
packet manipulation except address manipulation
|
||||
Integration with netfilter
|
||||
'mangle' table hooks in all five netfilter hooks
|
||||
priority: after conntrack
|
||||
Targets specific to the 'mangle' table:
|
||||
DSCP - manipulate DSCP field
|
||||
IPV4OPTSSTRIP - strip IPv4 options
|
||||
MARK - change the nfmark field of the skb
|
||||
TCPMSS - set TCP MSS option
|
||||
TOS - manipulate the TOS bits
|
||||
TTL - set / increase / decrease TTL field
|
||||
Simple example:
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t mangle -A PREROUTING -j MARK --set-mark 10 -p tcp --dport 80
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Advanced Netfilter concepts
|
||||
|
||||
%size 4
|
||||
Userspace logging
|
||||
flexible replacement for old syslog-based logging
|
||||
packets to userspace via multicast netlink sockets
|
||||
easy-to-use library (libipulog)
|
||||
plugin-extensible userspace logging daemon (ulogd)
|
||||
Can even be used to directly log into MySQL
|
||||
|
||||
Queuing
|
||||
reliable asynchronous packet handling
|
||||
packets to userspace via unicast netlink socket
|
||||
easy-to-use library (libipq)
|
||||
provides Perl bindings
|
||||
experimental queue multiplex daemon (ipqmpd)
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
netfilter/iptables in Linux 2.4
|
||||
Current Development and Future
|
||||
|
||||
Netfilter (although it has proved very stable) is still a work in progress.
|
||||
|
||||
Areas of current development
|
||||
infrastructure for conntrack manipulation from userspace
|
||||
failover of stateful firewalls (conntrack sync)
|
||||
making iptables layer3 independent (pkttables)
|
||||
new userspace library (libiptables) to hide plugins from apps
|
||||
more matches and targets for advanced functions (pool, hashslot)
|
||||
more conntrack and NAT modules (RPC, SNMP, SMB, ...)
|
||||
better IPv6 support (conntrack, more matches / targets)
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Thanks
|
||||
|
||||
Thanks to
|
||||
the BBS people, Z-Netz, FIDO, ...
|
||||
for heavily increasing my computer usage in 1992
|
||||
KNF
|
||||
for bringing me in touch with the internet as early as 1994
|
||||
for providing a playground for technical people
|
||||
for telling me about the existence of Linux!
|
||||
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
|
||||
for implementing (one of?) the world's best TCP/IP stacks
|
||||
Paul 'Rusty' Russell
|
||||
for starting the netfilter/iptables project
|
||||
for trusting me to maintain it today
|
||||
Astaro AG
|
||||
for sponsoring parts of my netfilter work
|
||||
%size 3
|
||||
The slides and the accompanying paper of this presentation are available at http://www.gnumonks.org/
|
||||
%size 3
|
||||
The netfilter homepage http://www.netfilter.org/
|
||||
|
|
@ -0,0 +1,70 @@
|
|||
Short paper for the talk "Programming netfilter/iptables extensions"
(key 3b911575)

1. Why is this topic interesting for the attendees?

The topic is interesting for several reasons.

For one, many advanced administrators simply do not realize what
possibilities their own extension modules open up. Too many still think
in terms of the old, rigid, monolithic 'ipchains' world.

For another, it is an ideal opportunity to get a first taste of kernel
programming. Many of the complex issues (locking, etc.) are largely
handled by the netfilter/iptables core, so that really only C programming
skills are required, and one does not need to have touched the kernel before.

And last but not least, this talk would be only the second opportunity
to approach this topic through a workshop instead of RTFM ;)

2. Why are you working on this topic?

Because I am a member of the netfilter core team and the current
maintainer of the Linux firewalling subsystem.

Why that is so is a longer story. In any case, I consider it important
to work on the further development of Linux firewalling. Networks, and
network security in particular, have always been my favourite topic.

3. What structure/outline should the talk or workshop have?

First comes a brief overview of the internal architecture of the
netfilter and iptables subsystems.

The second part discusses the APIs available within this architecture,
including code examples from existing iptables matches/targets as well as
conntrack and NAT helper modules.

The third part then develops an iptables extension module step by step.

4. Are you also planning a practical demonstration as part of the contribution?

Well, since this is a kind of programming tutorial, after the theoretical
part (an introduction to the APIs) we will write such an extension module
together. I think that counts as a "practical demonstration".

5. Which relevant websites exist on the topic?

The homepage of the netfilter/iptables project at http://www.netfilter.org/,
and in particular the netfilter-hacking-HOWTO at
http://www.netfilter.org/documentation/HOWTO/netfilter-hacking-HOWTO.html,
are relevant.

6. Have you given a talk on this topic before?

I have given numerous talks and tutorials on netfilter/iptables at
international conferences (among others Linuxtag, Linux-Kongress,
Ottawa Linux Symposium, Sao Paulo Linux Expo, ...).
An incomplete list is available at http://www.netfilter.org/events.html

That included a one-day workshop that covered the complete subject area,
from the basics of usage to programming extension modules.

Miscellaneous:

The workshop can be offered in German or English.
|
|
@ -0,0 +1,54 @@
|
|||
#include <linux/module.h>
#include <linux/skbuff.h>
#include <linux/ip.h>
#include <linux/netfilter_ipv4/ip_tables.h>
#include <linux/netfilter_ipv4/ipt_workshop.h>

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Harald Welte <laforge@netfilter.org>");
MODULE_DESCRIPTION("5CLT workshop iptables module");

/* match callback: returns 1 if the packet's TTL equals the configured value */
static int ws_match(const struct sk_buff *skb, const struct net_device *in,
		    const struct net_device *out, const void *matchinfo,
		    int offset, const void *hdr, u_int16_t datalen,
		    int *hotdrop)
{
	const struct ipt_ws_info *info = matchinfo;
	const struct iphdr *iph = skb->nh.iph;

	if (iph->ttl == info->ttl)
		return 1;

	return 0;
}

/* called when a rule using this match is added; validates the rule data size */
static int ws_checkentry(const char *tablename, const struct ipt_ip *ip,
			 void *matchinfo, unsigned int matchsize,
			 unsigned int hook_mask)
{
	if (matchsize != IPT_ALIGN(sizeof(struct ipt_ws_info)))
		return 0;

	return 1;
}

/* named differently from the ws_match() function to avoid a symbol clash */
static struct ipt_match workshop_match = {
	.list = { .prev = NULL, .next = NULL },
	.name = "workshop",
	.match = &ws_match,
	.checkentry = &ws_checkentry,
	.destroy = NULL,
	.me = THIS_MODULE
};

static int __init init(void)
{
	return ipt_register_match(&workshop_match);
}

static void __exit fini(void)
{
	ipt_unregister_match(&workshop_match);
}

module_init(init);
module_exit(fini);
|
|
@ -0,0 +1,6 @@
|
|||
#ifndef _IPT_WORKSHOP_H
|
||||
#define _IPT_WORKSHOP_H
|
||||
struct ipt_ws_info {
|
||||
u_int8_t ttl;
|
||||
};
|
||||
#endif
|
|
@ -0,0 +1,102 @@
|
|||
#include <stdio.h>
|
||||
#include <stdlib.h>
|
||||
#include <string.h>
|
||||
#include <getopt.h>
#include <iptables.h>
|
||||
#include <iptables.h>
|
||||
#include <linux/netfilter_ipv4/ip_tables.h>
|
||||
#include <linux/netfilter_ipv4/ipt_workshop.h>
|
||||
|
||||
static void help(void)
|
||||
{
|
||||
printf(
|
||||
"workshop match v%s options:\n"
|
||||
" --ttl TTL value\n"
|
||||
, IPTABLES_VERSION);
|
||||
}
|
||||
|
||||
static void init(struct ipt_entry_match *m, unsigned int *nfcache)
|
||||
{
|
||||
/* caching not implemented yet */
|
||||
*nfcache |= NFC_UNKNOWN;
|
||||
}
|
||||
|
||||
static int parse(int c, char **argv, int invert, unsigned int *flags,
|
||||
const struct ipt_entry *entry, unsigned int *nfcache,
|
||||
struct ipt_entry_match **match)
|
||||
{
|
||||
struct ipt_ws_info *info = (struct ipt_ws_info *) (*match)->data;
|
||||
|
||||
check_inverse(optarg, &invert, &optind, 0);
|
||||
|
||||
if (invert)
|
||||
exit_error(PARAMETER_PROBLEM, "invert not supported");
|
||||
|
||||
if (*flags)
|
||||
exit_error(PARAMETER_PROBLEM,
|
||||
"workshop: can't specify parameter twice");
|
||||
|
||||
if (!optarg)
|
||||
exit_error(PARAMETER_PROBLEM,
|
||||
"workshop: you must specify a value");
|
||||
|
||||
switch (c) {
|
||||
case 'z':
|
||||
info->ttl = atoi(optarg);
|
||||
/* FIXME: range 0-255 */
|
||||
*flags = 1;
|
||||
break;
|
||||
default:
|
||||
return 0;
|
||||
}
|
||||
|
||||
return 1;
|
||||
}
|
||||
|
||||
static void final_check(unsigned int flags)
|
||||
{
|
||||
if (!flags)
|
||||
exit_error(PARAMETER_PROBLEM,
|
||||
"workshop match: you must specify --ttl");
|
||||
}
|
||||
|
||||
static void print(const struct ipt_ip *ip,
|
||||
const struct ipt_entry_match *match,
|
||||
int numeric)
|
||||
{
|
||||
const struct ipt_ws_info *info = (struct ipt_ws_info *) match->data;
|
||||
|
||||
printf("workshop match TTL=%u ", info->ttl);
|
||||
|
||||
}
|
||||
|
||||
static void save(const struct ipt_ip *ip,
|
||||
const struct ipt_entry_match *match)
|
||||
{
|
||||
const struct ipt_ws_info *info = (struct ipt_ws_info *) match->data;
|
||||
|
||||
printf("--ttl %u ", info->ttl);
|
||||
}
|
||||
|
||||
static struct option opts[] = {
|
||||
{ "ttl", 1, 0, 'z' },
|
||||
{ 0 }
|
||||
};
|
||||
|
||||
static struct iptables_match ws = {
|
||||
.next = NULL,
|
||||
.name = "workshop",
|
||||
.version = IPTABLES_VERSION,
|
||||
.size = IPT_ALIGN(sizeof(struct ipt_ws_info)),
|
||||
.userspacesize = IPT_ALIGN(sizeof(struct ipt_ws_info)),
|
||||
.help = &help,
|
||||
.init = &init,
|
||||
.parse = &parse,
|
||||
.final_check = &final_check,
|
||||
.print = &print,
|
||||
.save = &save,
|
||||
.extra_opts = opts
|
||||
};
|
||||
|
||||
void _init(void)
|
||||
{
|
||||
register_match(&ws);
|
||||
}
|
|
@ -0,0 +1,636 @@
|
|||
%include "default.mgp"
|
||||
%default 1 bgrad
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%deffont "typewriter" tfont "MONOTYPE.TTF"
|
||||
%page
|
||||
%nodefault
|
||||
%back "blue"
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
Programming netfilter/iptables
|
||||
extensions
|
||||
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <laforge@netfilter.org>
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Contents
|
||||
|
||||
|
||||
Introduction
|
||||
The netfilter/iptables architecture
|
||||
Netfilter hooks in protocol stacks
|
||||
Packet selection based on IP Tables
|
||||
The Connection Tracking Subsystem
|
||||
The NAT Subsystem based on netfilter + iptables
|
||||
Packet filtering using the 'filter' table
|
||||
Packet mangling using the 'mangle' table
|
||||
Advanced netfilter concepts
|
||||
Current development and Future
|
||||
Developing a netfilter module
|
||||
Developing a new iptables match
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Introduction
|
||||
|
||||
Why did we need netfilter/iptables?
|
||||
Because ipchains...
|
||||
|
||||
has no infrastructure for passing packets to userspace
|
||||
makes transparent proxying extremely difficult
|
||||
has packet filter rules that depend on interface addresses
|
||||
has Masquerading implemented as part of packet filtering
|
||||
code is too complex and intermixed with core ipv4 stack
|
||||
is neither modular nor extensible
|
||||
only barely supports one special case of NAT (masquerading)
|
||||
has only stateless packet filtering
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Introduction
|
||||
|
||||
Who's behind netfilter/iptables
|
||||
Paul 'Rusty' Russell
|
||||
co-author of ipchains in Linux 2.2
|
||||
was paid by Watchguard for about one year of development
|
||||
James Morris
|
||||
userspace queuing (kernel, library and tools)
|
||||
REJECT target
|
||||
Marc Boucher
|
||||
NAT and packet filtering controlled by one command
|
||||
Mangle table
|
||||
Harald Welte
|
||||
Conntrack+NAT helper infrastructure (newnat)
|
||||
Userspace packet logging (ULOG)
|
||||
PPTP and IRC conntrack/NAT helpers
|
||||
Jozsef Kadlecsik
|
||||
TCP window tracking
|
||||
H.323 conntrack + NAT helper
|
||||
Continued newnat development
|
||||
Non-core team contributors
|
||||
http://www.netfilter.org/scoreboard/
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Netfilter Hooks
|
||||
|
||||
What is netfilter?
|
||||
|
||||
System of callback functions within network stack
|
||||
Callback function to be called for every packet traversing a certain point (hook) within the network stack
|
||||
Protocol independent framework
|
||||
Hooks in layer 3 stacks (IPv4, IPv6, DECnet, ARP)
|
||||
Multiple kernel modules can register with each of the hooks
|
||||
Asynchronous packet handling in userspace (ip_queue)
|
||||
|
||||
Traditional packet filtering, NAT, ... is implemented on top of this framework
|
||||
|
||||
Can be used for other stuff interfacing with the core network stack, like DECnet routing daemon.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Netfilter Hooks
|
||||
|
||||
Netfilter architecture in IPv4
|
||||
%font "typewriter"
|
||||
|
||||
--->[1]--->[ROUTE]--->[3]--->[4]--->
|
||||
| ^
|
||||
| |
|
||||
| [ROUTE]
|
||||
v |
|
||||
[2] [5]
|
||||
| ^
|
||||
| |
|
||||
v |
|
||||
|
||||
%font "standard"
|
||||
1=NF_IP_PRE_ROUTING
|
||||
2=NF_IP_LOCAL_IN
|
||||
3=NF_IP_FORWARD
|
||||
4=NF_IP_POST_ROUTING
|
||||
5=NF_IP_LOCAL_OUT
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Netfilter Hooks
|
||||
|
||||
Netfilter Hooks
|
||||
|
||||
Any kernel module may register a callback function at any of the hooks
|
||||
|
||||
The module has to return one of the following constants
|
||||
|
||||
NF_ACCEPT continue traversal as normal
|
||||
NF_DROP drop the packet, do not continue
|
||||
NF_STOLEN I've taken over the packet, do not continue
|
||||
NF_QUEUE enqueue packet to userspace
|
||||
NF_REPEAT call this hook again
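
The verdict handling above can be sketched in plain userspace C. This is only a model under stated assumptions: `struct packet`, `run_hook()`, the two example callbacks and the `V_*` values are hypothetical names local to the sketch, not the kernel's definitions, and the repeat limit is an arbitrary guard.

```c
#include <assert.h>
#include <stddef.h>

/* Verdict names follow the slide; the numeric values are local to this sketch. */
enum verdict { V_DROP, V_ACCEPT, V_STOLEN, V_QUEUE, V_REPEAT };

/* Hypothetical stand-in for struct sk_buff. */
struct packet {
    unsigned char ttl;
    unsigned int mark;
};

typedef enum verdict (*hook_fn)(struct packet *pkt);

/* Example callback: tag the packet and let it continue. */
enum verdict mark_packet(struct packet *pkt)
{
    pkt->mark = 1;
    return V_ACCEPT;
}

/* Example callback: drop packets whose TTL is about to expire. */
enum verdict drop_expiring(struct packet *pkt)
{
    return pkt->ttl <= 1 ? V_DROP : V_ACCEPT;
}

/* Call each registered callback in order and act on its verdict. */
enum verdict run_hook(hook_fn *hooks, size_t n, struct packet *pkt)
{
    size_t i = 0;
    int repeats = 0;

    assert(pkt != NULL);
    while (i < n) {
        enum verdict v = hooks[i](pkt);

        if (v == V_REPEAT) {
            /* Call the same callback again; the limit of 8 is an
             * arbitrary guard for this sketch. */
            if (++repeats > 8)
                return V_DROP;
            continue;
        }
        if (v != V_ACCEPT)
            return v;       /* DROP, STOLEN and QUEUE end the traversal */
        i++;
        repeats = 0;
    }
    return V_ACCEPT;        /* every callback accepted the packet */
}
```

Only NF_ACCEPT lets the next callback see the packet; every other verdict stops the walk at the callback that returned it.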
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
IP tables
|
||||
|
||||
Packet selection using IP tables
|
||||
|
||||
The kernel provides generic IP tables support
|
||||
|
||||
Each kernel module may create its own IP table
|
||||
|
||||
The three major parts of 2.4 firewalling subsystem are implemented using IP tables
|
||||
Packet filtering table 'filter'
|
||||
NAT table 'nat'
|
||||
Packet mangling table 'mangle'
|
||||
|
||||
Can potentially be used for other stuff, e.g. IPsec SPDB
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
IP Tables
|
||||
|
||||
Managing chains and tables
|
||||
|
||||
An IP table consists of multiple chains
|
||||
A chain consists of a list of rules
|
||||
Every single rule in a chain consists of
|
||||
match[es] (rule executed if all matches true)
|
||||
target (what to do if the rule is matched)
|
||||
|
||||
%size 4
|
||||
matches and targets can either be builtin or implemented as kernel modules
|
||||
|
||||
%size 6
|
||||
The userspace tool iptables is used to control IP tables
|
||||
handles all different kinds of IP tables
|
||||
supports a plugin/shlib interface for target/match specific options
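
The table/chain/rule model on this slide can be sketched as a small userspace program. The types and functions below (`struct pkt`, `struct match`, `struct rule`, `rule_applies()`, `run_chain()`) are hypothetical and only illustrate the "all matches true, then take the target" logic; the kernel's real rule layout is different.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced packet description. */
struct pkt {
    unsigned char proto;
    unsigned short dport;
};

typedef int (*match_fn)(const struct pkt *p, const void *info);

struct match {
    match_fn fn;        /* returns non-zero if the packet matches */
    const void *info;   /* per-rule match data */
};

enum target { TARGET_ACCEPT, TARGET_DROP };

struct rule {
    const struct match *matches;
    size_t nmatches;
    enum target target;
};

int match_proto(const struct pkt *p, const void *info)
{
    return p->proto == *(const unsigned char *)info;
}

int match_dport(const struct pkt *p, const void *info)
{
    return p->dport == *(const unsigned short *)info;
}

/* A rule applies only if every one of its matches is true. */
int rule_applies(const struct rule *r, const struct pkt *p)
{
    size_t i;

    for (i = 0; i < r->nmatches; i++)
        if (!r->matches[i].fn(p, r->matches[i].info))
            return 0;
    return 1;
}

/* Walk the chain; the first applicable rule decides, else the policy. */
enum target run_chain(const struct rule *rules, size_t n,
                      const struct pkt *p, enum target policy)
{
    size_t i;

    assert(p != NULL);
    for (i = 0; i < n; i++)
        if (rule_applies(&rules[i], p))
            return rules[i].target;
    return policy;
}
```

A rule with no matches applies to every packet, which mirrors the fact that matches are optional on the iptables command line.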
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
IP Tables
|
||||
|
||||
Basic iptables commands
|
||||
|
||||
To build a complete iptables command, we must specify
|
||||
which table to work with
|
||||
which chain in this table to use
|
||||
an operation (insert, add, delete, modify)
|
||||
one or more matches (optional)
|
||||
a target
|
||||
|
||||
The syntax is
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t table -Operation chain -j target match(es)
|
||||
%font "standard"
|
||||
%size 5
|
||||
|
||||
Example:
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t filter -A INPUT -j ACCEPT -p tcp --dport smtp
|
||||
%font "standard"
|
||||
%size 5
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
IP Tables
|
||||
|
||||
Matches
|
||||
Basic matches
|
||||
-p protocol (tcp/udp/icmp/...)
|
||||
-s source address (ip/mask)
|
||||
-d destination address (ip/mask)
|
||||
-i incoming interface
|
||||
-o outgoing interface
|
||||
|
||||
Match extensions (examples)
|
||||
tcp/udp TCP/udp source/destination port
|
||||
icmp ICMP code/type
|
||||
ah/esp AH/ESP SPI match
|
||||
mac source MAC address
|
||||
mark nfmark
|
||||
length match on length of packet
|
||||
limit rate limiting (n packets per timeframe)
|
||||
owner owner uid of the socket sending the packet
|
||||
tos TOS field of IP header
|
||||
ttl TTL field of IP header
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
IP Tables
|
||||
|
||||
Targets
|
||||
very dependent on the particular table.
|
||||
|
||||
Table specific targets will be discussed later
|
||||
|
||||
Generic Targets, always available
|
||||
ACCEPT accept packet within chain
|
||||
DROP silently drop packet
|
||||
QUEUE enqueue packet to userspace
|
||||
LOG log packet via syslog
|
||||
ULOG log packet via ulogd
|
||||
RETURN return to previous (calling) chain
|
||||
foobar jump to user defined chain
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Packet Filtering
|
||||
|
||||
Overview
|
||||
|
||||
Implemented as 'filter' table
|
||||
Registers with three netfilter hooks
|
||||
|
||||
NF_IP_LOCAL_IN (packets destined for the local host)
|
||||
NF_IP_FORWARD (packets forwarded by local host)
|
||||
NF_IP_LOCAL_OUT (packets from the local host)
|
||||
|
||||
Each of the three hooks has one chain attached (INPUT, FORWARD, OUTPUT)
|
||||
|
||||
Every packet passes exactly one of the three chains. Note that this is very different compared to the old 2.2.x ipchains behaviour.
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Packet Filtering
|
||||
|
||||
Targets available within 'filter' table
|
||||
|
||||
Builtin Targets to be used in filter table
|
||||
ACCEPT accept the packet
|
||||
DROP silently drop the packet
|
||||
QUEUE enqueue packet to userspace
|
||||
RETURN return to previous (calling) chain
|
||||
foobar user defined chain
|
||||
|
||||
Targets implemented as loadable modules
|
||||
REJECT drop the packet but inform sender
|
||||
MIRROR change source/destination IP and resend
|
||||
LOG log via syslog
|
||||
ULOG log via userspace
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Connection tracking...
|
||||
|
||||
implemented separately from NAT
|
||||
enables stateful filtering
|
||||
implementation
|
||||
hooks into NF_IP_PRE_ROUTING to track packets
|
||||
hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
|
||||
protocol modules (currently TCP/UDP/ICMP)
|
||||
application helpers (currently FTP, IRC, H.323, talk, SNMP)
|
||||
divides packets into the following four categories
|
||||
NEW - would establish new connection
|
||||
ESTABLISHED - part of already established connection
|
||||
RELATED - is related to established connection
|
||||
INVALID - (multicast, errors...)
|
||||
does _NOT_ filter packets itself
|
||||
can be utilized by iptables using the 'state' match
|
||||
is used by NAT Subsystem
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Common structures
|
||||
struct ip_conntrack_tuple, representing unidirectional flow
|
||||
layer 3 src + dst
|
||||
layer 4 protocol
|
||||
layer 4 src + dst
|
||||
|
||||
|
||||
connections represented as struct ip_conntrack
|
||||
original tuple
|
||||
reply tuple
|
||||
timeout
|
||||
l4 state private data
|
||||
app helper
|
||||
app helper private data
|
||||
expected connections
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Flow of events for new packet
|
||||
packet enters NF_IP_PRE_ROUTING
|
||||
tuple is derived from packet
|
||||
lookup conntrack hash table with hash(tuple) -> fails
|
||||
new ip_conntrack is allocated
|
||||
fill in original and reply == inverted(original) tuple
|
||||
initialize timer
|
||||
assign app helper if applicable
|
||||
see if we've been expected -> fails
|
||||
call layer 4 helper 'new' function
|
||||
|
||||
...
|
||||
|
||||
packet enters NF_IP_POST_ROUTING
|
||||
do hashtable lookup for packet -> fails
|
||||
place struct ip_conntrack in hashtable
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
HA for netfilter/iptables
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Flow of events for packet part of existing connection
|
||||
packet enters NF_IP_PRE_ROUTING
|
||||
tuple is derived from packet
|
||||
lookup conntrack hash table with hash(tuple)
|
||||
associate conntrack entry with skb->nfct
|
||||
call l4 protocol helper 'packet' function
|
||||
do l4 state tracking
|
||||
update timeouts as needed [e.g. TCP TIME_WAIT,...]
|
||||
|
||||
...
|
||||
|
||||
packet enters NF_IP_POST_ROUTING
|
||||
do hashtable lookup for packet -> succeeds
|
||||
do nothing else
|
||||
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Network Address Translation
|
||||
|
||||
Overview
|
||||
|
||||
Previous Linux Kernels only implemented one special case of NAT: Masquerading
|
||||
Linux 2.4.x can do any kind of NAT.
|
||||
NAT subsystem implemented on top of netfilter, iptables and conntrack
|
||||
NAT subsystem registers with all five netfilter hooks
|
||||
'nat' Table registers chains PREROUTING, POSTROUTING and OUTPUT
|
||||
Following targets available within 'nat' Table
|
||||
SNAT changes the packet's source while passing NF_IP_POST_ROUTING
|
||||
DNAT changes the packet's destination while passing NF_IP_PRE_ROUTING
|
||||
MASQUERADE is a special case of SNAT
|
||||
REDIRECT is a special case of DNAT
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Network Address Translation
|
||||
|
||||
flow of events for NEW packet:
|
||||
packet enters NF_IP_PRE_ROUTING after conntrack
|
||||
resolve conntrack entry for packet
|
||||
if (expectfn of helper) call it
|
||||
else iterate over rules in PREROUTING chain of nat table
|
||||
save respective NAT mappings in conntrack
|
||||
apply the NAT mappings to the packet
|
||||
call NAT helper function, if there is one for this proto
|
||||
|
||||
...
|
||||
|
||||
packet enters NF_IP_POST_ROUTING
|
||||
resolve conntrack entry for packet
|
||||
iterate over rules in POSTROUTING chain of nat table
|
||||
save respective NAT mappings in conntrack
|
||||
apply the NAT mappings to the packet
|
||||
call NAT helper function, if there is one for this proto
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Network Address Translation
|
||||
|
||||
flow of events for ESTABLISHED packets:
|
||||
packet enters NF_IP_PRE_ROUTING after conntrack
|
||||
resolve conntrack entry for packet
|
||||
apply the NAT mappings (read from conntrack entry) to the packet
|
||||
call NAT helper function, if there is one for this proto
|
||||
|
||||
...
|
||||
|
||||
packet enters NF_IP_POST_ROUTING
|
||||
resolve conntrack entry for packet
|
||||
apply the NAT mappings (read from conntrack entry) to the packet
|
||||
call NAT helper function, if there is one for this proto
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Network Address Translation
|
||||
|
||||
Source NAT
|
||||
SNAT Example:
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t nat -A POSTROUTING -j SNAT --to-source 1.2.3.4 -s 10.0.0.0/8
|
||||
%font "standard"
|
||||
%size 4
|
||||
|
||||
MASQUERADE Example:
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t nat -A POSTROUTING -j MASQUERADE -o ppp0
|
||||
%font "standard"
|
||||
%size 5
|
||||
|
||||
Destination NAT
|
||||
DNAT example
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t nat -A PREROUTING -j DNAT --to-destination 1.2.3.4:8080 -p tcp --dport 80 -i eth1
|
||||
%font "standard"
|
||||
%size 4
|
||||
|
||||
REDIRECT example
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t nat -A PREROUTING -j REDIRECT --to-port 3128 -i eth1 -p tcp --dport 80
|
||||
%font "standard"
|
||||
%size 5
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Packet Mangling
|
||||
|
||||
Purpose of mangle table
|
||||
packet manipulation except address manipulation
|
||||
|
||||
Integration with netfilter
|
||||
'mangle' table hooks into all five netfilter hooks
|
||||
priority: after conntrack
|
||||
|
||||
Targets specific to the 'mangle' table:
|
||||
DSCP - manipulate DSCP field
|
||||
IPV4OPTSSTRIP - strip IPv4 options
|
||||
MARK - change the nfmark field of the skb
|
||||
TCPMSS - set TCP MSS option
|
||||
TOS - manipulate the TOS bits
|
||||
TTL - set / increase / decrease TTL field
|
||||
|
||||
Simple example:
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t mangle -A PREROUTING -j MARK --set-mark 10 -p tcp --dport 80
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Advanced Netfilter concepts
|
||||
|
||||
%size 4
|
||||
Userspace logging
|
||||
flexible replacement for old syslog-based logging
|
||||
packets to userspace via multicast netlink sockets
|
||||
easy-to-use library (libipulog)
|
||||
plugin-extensible userspace logging daemon (ulogd)
|
||||
Can even be used to directly log into MySQL
|
||||
|
||||
Queuing
|
||||
reliable asynchronous packet handling
|
||||
packets to userspace via unicast netlink socket
|
||||
easy-to-use library (libipq)
|
||||
provides Perl bindings
|
||||
experimental queue multiplex daemon (ipqmpd)
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Current Development and Future
|
||||
|
||||
Netfilter (although it has proved very stable) is still a work in progress.
|
||||
|
||||
Areas of current development
|
||||
infrastructure for conntrack manipulation from userspace
|
||||
failover of stateful firewalls
|
||||
making iptables layer3 independent (pkttables)
|
||||
new userspace library (libiptables) to hide plugins from apps
|
||||
more matches and targets for advanced functions (pool, hashslot)
|
||||
more conntrack and NAT modules (RPC, SNMP, SMB, ...)
|
||||
better IPv6 support (conntrack, more matches / targets)
|
||||
conntrack hash optimizations
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Developing netfilter/iptables extensions
|
||||
Developing a netfilter module
|
||||
|
||||
Netfilter modules are very low-layer
|
||||
Get called for every packet passing the hook in this l3prot
|
||||
Examples of netfilter modules are: ip_tables, ip_conntrack, iptable_nat
|
||||
API for netfilter <linux/netfilter.h>:
|
||||
%font "typewriter"
|
||||
nf_register_hook(struct nf_hook_ops *reg)
|
||||
nf_unregister_hook(struct nf_hook_ops *reg)
|
||||
struct nf_hook_ops:
|
||||
struct list_head list; /* list header {NULL,NULL} */
|
||||
nf_hookfn *hook; /* the callback function */
|
||||
int pf; /* protocol family */
|
||||
int hooknum; /* hook to register with */
|
||||
int priority; /* priority, determines order */
|
||||
%font "standard"
|
||||
Example code see "nf_workshop.c"
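
How the priority field determines the calling order can be modelled in userspace C. This is a sketch, not the kernel code: `struct hook_ops`, `register_hook()` and `unregister_hook()` are hypothetical names using a singly linked list, and it assumes that lower priority values are called first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced version of a hook registration record. */
struct hook_ops {
    struct hook_ops *next;
    const char *name;
    int priority;       /* lower values are called earlier */
};

/* Insert reg so that the list stays sorted by ascending priority.
 * Entries with equal priority keep their registration order. */
void register_hook(struct hook_ops **head, struct hook_ops *reg)
{
    struct hook_ops **pos = head;

    assert(head != NULL && reg != NULL);
    while (*pos != NULL && (*pos)->priority <= reg->priority)
        pos = &(*pos)->next;
    reg->next = *pos;
    *pos = reg;
}

/* Unlink reg from the list if it is registered. */
void unregister_hook(struct hook_ops **head, struct hook_ops *reg)
{
    struct hook_ops **pos = head;

    assert(head != NULL && reg != NULL);
    while (*pos != NULL && *pos != reg)
        pos = &(*pos)->next;
    if (*pos == reg) {
        *pos = reg->next;
        reg->next = NULL;
    }
}
```

With this ordering, a module registered with a low priority value sees each packet before one registered just ahead of the last slot, regardless of the order in which the modules were loaded.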
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Developing netfilter/iptables extensions
|
||||
Developing an ip_tables match module
|
||||
|
||||
ip_tables modules are at a high layer
|
||||
Get called for every packet iterating a rule with this match
|
||||
Examples of iptables modules are: ipt_ttl, ipt_tos, ipt_tcpmss
|
||||
API for iptables matches <linux/netfilter_ipv4/ip_tables.h>:
|
||||
%font "typewriter"
|
||||
ipt_register_match(struct ipt_match *match)
|
||||
ipt_unregister_match(struct ipt_match *match)
|
||||
struct ipt_match:
|
||||
struct list_head list; /* list header {NULL,NULL} */
|
||||
const char name[]; /* name of the match */
|
||||
int (*match); /* called when pkt is matched */
|
||||
int (*checkentry); /* called when entry inserted */
|
||||
void (*destroy); /* called when entry deleted */
|
||||
struct module *me; /* set to THIS_MODULE */
|
||||
%font "standard"
|
||||
Example code see "ipt_workshop.c"
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Developing netfilter/iptables extensions
|
||||
Developing an iptables match module
|
||||
|
||||
Something has to parse the commandline options for ipt_workshop.c
|
||||
Solution: libipt_workshop.c as iptables plugin
|
||||
API for iptables-command plugins <iptables.h>:
|
||||
%font "typewriter"
|
||||
register_match(struct iptables_match *me)
|
||||
struct iptables_match:
|
||||
struct iptables_match *next; /* next one */
|
||||
ipt_chainlabel name; /* name */
|
||||
const char *version; /* version */
|
||||
size_t size; /* size of match data */
|
||||
size_t userspacesize; /* size for userspace */
|
||||
void (*help); /* print help message */
|
||||
void (*init); /* init the matchinfo */
|
||||
int (*parse); /* parse getopt chars */
|
||||
void (*final_check); /* consistency check */
|
||||
void (*print); /* print (iptables -L) */
|
||||
void (*save); /* iptables-save */
|
||||
struct option *extra_opts; /* getopt-style opts */
|
||||
%font "standard"
|
||||
Example code see "libipt_workshop.c"
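
The workshop plugin's parse function stores `atoi(optarg)` directly and leaves the 0-255 range check as a FIXME. One way to close that gap is sketched below; `parse_ttl()` is a hypothetical helper, not part of the iptables plugin API, and a real plugin would report the failure through its usual error path.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a decimal TTL in the range 0-255.
 * Returns 0 and stores the value on success, -1 on any malformed
 * or out-of-range input. */
int parse_ttl(const char *arg, uint8_t *ttl)
{
    char *end;
    unsigned long v;

    assert(ttl != NULL);
    /* Require a leading digit so that signs and whitespace are rejected. */
    if (arg == NULL || *arg < '0' || *arg > '9')
        return -1;

    errno = 0;
    v = strtoul(arg, &end, 10);
    if (errno != 0 || *end != '\0' || v > 255)
        return -1;

    *ttl = (uint8_t)v;
    return 0;
}
```

Unlike atoi(), strtoul() makes trailing garbage and overflow detectable, so values such as "256" or "12x" are rejected instead of being silently truncated into the 8-bit field.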
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Future of Linux packet filtering
|
||||
Thanks
|
||||
The slides and the accompanying paper for this presentation are available at http://www.gnumonks.org/
|
||||
|
||||
The netfilter homepage http://www.netfilter.org/
|
||||
|
||||
Thanks to
|
||||
the BBS people, Z-Netz, FIDO, ...
|
||||
for heavily increasing my computer usage in 1992
|
||||
KNF
|
||||
for bringing me in touch with the internet as early as 1994
|
||||
for providing a playground for technical people
|
||||
for telling me about the existence of Linux!
|
||||
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
|
||||
for implementing (one of?) the world's best TCP/IP stacks
|
||||
Paul 'Rusty' Russell
|
||||
for starting the netfilter/iptables project
|
||||
for trusting me to maintain it today
|
||||
Astaro AG
|
||||
for sponsoring parts of my netfilter work
|
||||
for sponsoring my travel cost to 5CLT
|
||||
|
|
@ -0,0 +1,57 @@
|
|||
#include <linux/module.h>
|
||||
#include <linux/config.h>
|
||||
#include <linux/skbuff.h>
|
||||
#include <linux/ip.h>
|
||||
|
||||
#include <linux/netfilter.h>
|
||||
#include <linux/netfilter_ipv4.h>
|
||||
|
||||
MODULE_LICENSE("GPL");
|
||||
MODULE_AUTHOR("Harald Welte <laforge@netfilter.org>");
|
||||
MODULE_DESCRIPTION("5CLT workshop module");
|
||||
|
||||
static unsigned int
|
||||
workshop_fn(unsigned int hooknum,
|
||||
struct sk_buff **pskb,
|
||||
const struct net_device *in,
|
||||
const struct net_device *out,
|
||||
int (*okfn)(struct sk_buff *))
|
||||
{
|
||||
struct iphdr *iph = (*pskb)->nh.iph;
|
||||
/* do whatever we want to do */
|
||||
|
||||
printk(KERN_NOTICE "packet from %u.%u.%u.%u received\n",
|
||||
NIPQUAD(iph->saddr));
|
||||
|
||||
return NF_ACCEPT;
|
||||
}
|
||||
|
||||
static struct nf_hook_ops workshop_ops = {
|
||||
.list = { .prev = NULL, .next = NULL },
|
||||
.hook = &workshop_fn,
|
||||
.pf = PF_INET,
|
||||
.hooknum = NF_IP_PRE_ROUTING,
|
||||
.priority = NF_IP_PRI_LAST-1
|
||||
};
|
||||
|
||||
static int __init init(void)
|
||||
{
|
||||
int ret = 0;
|
||||
|
||||
ret = nf_register_hook(&workshop_ops);
|
||||
if (ret < 0) {
|
||||
printk(KERN_ERR "something went wrong while registering\n");
|
||||
return ret;
|
||||
}
|
||||
|
||||
printk(KERN_DEBUG "workshop netfilter module successfully loaded\n");
|
||||
return ret;
|
||||
}
|
||||
|
||||
static void __exit fini(void)
|
||||
{
|
||||
nf_unregister_hook(&workshop_ops);
|
||||
}
|
||||
|
||||
module_init(init);
|
||||
module_exit(fini);
|
|
@ -0,0 +1,54 @@
|
|||
#include <linux/module.h>
|
||||
#include <linux/skbuff.h>
|
||||
#include <linux/netfilter_ipv4/ip_tables.h>
|
||||
#include <linux/netfilter_ipv4/ipt_workshop.h>
|
||||
|
||||
MODULE_LICENSE("GPL");
|
||||
MODULE_AUTHOR("Harald Welte <laforge@netfilter.org>");
|
||||
MODULE_DESCRIPTION("OLS2003 workshop iptables module");
|
||||
|
||||
static int ws_match(const struct sk_buff *skb, const struct net_device *in,
|
||||
const struct net_device *out, const void *matchinfo,
|
||||
int offset, const void *hdr, u_int16_t datalen,
|
||||
int *hotdrop)
|
||||
{
|
||||
const struct ipt_ws_info *info = matchinfo;
|
||||
const struct iphdr *iph = skb->nh.iph;
|
||||
|
||||
if (iph->ttl == info->ttl)
|
||||
return 1;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int ws_checkentry(const char *tablename, const struct ipt_ip *ip,
|
||||
void *matchinfo, unsigned int matchsize,
|
||||
unsigned int hook_mask)
|
||||
{
|
||||
if (matchsize != IPT_ALIGN(sizeof(struct ipt_ws_info)))
|
||||
return 0;
|
||||
|
||||
return 1;
|
||||
}
|
||||
|
||||
static struct ipt_match ws_match_ops = {	/* named apart from the ws_match() function */
|
||||
.list = { .prev = NULL, .next = NULL },
|
||||
.name = "workshop",
|
||||
.match = &ws_match,
|
||||
.checkentry = &ws_checkentry,
|
||||
.destroy = NULL,
|
||||
.me = THIS_MODULE
|
||||
};
|
||||
|
||||
static int __init init(void)
|
||||
{
|
||||
return ipt_register_match(&ws_match_ops);
|
||||
}
|
||||
|
||||
static void __exit fini(void)
|
||||
{
|
||||
ipt_unregister_match(&ws_match_ops);
|
||||
}
|
||||
|
||||
module_init(init);
|
||||
module_exit(fini);
|
|
@ -0,0 +1,6 @@
|
|||
#ifndef _IPT_WORKSHOP_H
|
||||
#define _IPT_WORKSHOP_H
|
||||
struct ipt_ws_info {
|
||||
u_int8_t ttl;
|
||||
};
|
||||
#endif
|
|
@ -0,0 +1,102 @@
|
|||
#include <stdio.h>
|
||||
#include <stdlib.h>
|
||||
#include <string.h>
|
||||
#include <getopt.h>
|
||||
#include <iptables.h>
|
||||
#include <linux/netfilter_ipv4/ip_tables.h>
|
||||
#include <linux/netfilter_ipv4/ipt_workshop.h>
|
||||
|
||||
static void help(void)
|
||||
{
|
||||
printf(
|
||||
"workshop match v%s options:\n"
|
||||
" --ttl TTL value\n"
|
||||
, IPTABLES_VERSION);
|
||||
}
|
||||
|
||||
static void init(struct ipt_entry_match *m, unsigned int *nfcache)
|
||||
{
|
||||
/* caching not implemented yet */
|
||||
*nfcache |= NFC_UNKNOWN;
|
||||
}
|
||||
|
||||
static int parse(int c, char **argv, int invert, unsigned int *flags,
|
||||
const struct ipt_entry *entry, unsigned int *nfcache,
|
||||
struct ipt_entry_match **match)
|
||||
{
|
||||
struct ipt_ws_info *info = (struct ipt_ws_info *) (*match)->data;
|
||||
|
||||
check_inverse(optarg, &invert, &optind, 0);
|
||||
|
||||
if (invert)
|
||||
exit_error(PARAMETER_PROBLEM, "invert not supported");
|
||||
|
||||
if (*flags)
|
||||
exit_error(PARAMETER_PROBLEM,
|
||||
"workshop: can't specify parameter twice");
|
||||
|
||||
if (!optarg)
|
||||
exit_error(PARAMETER_PROBLEM,
|
||||
"workshop: you must specify a value");
|
||||
|
||||
switch (c) {
|
||||
case 'z':
|
||||
info->ttl = atoi(optarg);
|
||||
/* FIXME: range 0-255 */
|
||||
*flags = 1;
|
||||
break;
|
||||
default:
|
||||
return 0;
|
||||
}
|
||||
|
||||
return 1;
|
||||
}
|
||||
|
||||
static void final_check(unsigned int flags)
|
||||
{
|
||||
if (!flags)
|
||||
exit_error(PARAMETER_PROBLEM,
|
||||
"workshop match: you must specify ttl");
|
||||
}
|
||||
|
||||
static void print(const struct ipt_ip *ip,
|
||||
const struct ipt_entry_match *match,
|
||||
int numeric)
|
||||
{
|
||||
const struct ipt_ws_info *info = (struct ipt_ws_info *) match->data;
|
||||
|
||||
printf("workshop match TTL=%u ", info->ttl);
|
||||
|
||||
}
|
||||
|
||||
static void save(const struct ipt_ip *ip,
|
||||
const struct ipt_entry_match *match)
|
||||
{
|
||||
const struct ipt_ws_info *info = (struct ipt_ws_info *) match->data;
|
||||
|
||||
printf("--ttl %u ", info->ttl);
|
||||
}
|
||||
|
||||
static struct option opts[] = {
|
||||
{ "ttl", 1, 0, 'z' },
|
||||
{ 0 }
|
||||
};
|
||||
|
||||
static struct iptables_match ws = {
|
||||
.next = NULL,
|
||||
.name = "workshop",
|
||||
.version = IPTABLES_VERSION,
|
||||
.size = IPT_ALIGN(sizeof(struct ipt_ws_info)),
|
||||
.userspacesize = IPT_ALIGN(sizeof(struct ipt_ws_info)),
|
||||
.help = &help,
|
||||
.init = &init,
|
||||
.parse = &parse,
|
||||
.final_check = &final_check,
|
||||
.print = &print,
|
||||
.save = &save,
|
||||
.extra_opts = opts
|
||||
};
|
||||
|
||||
void _init(void)
|
||||
{
|
||||
register_match(&ws);
|
||||
}
|
|
@ -0,0 +1,615 @@
|
|||
%include "default.mgp"
|
||||
%default 1 bgrad
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
#%deffont "typewriter" tfont "MONOTYPE.TTF"
|
||||
%page
|
||||
%nodefault
|
||||
%back "blue"
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
Developing netfilter/iptables
|
||||
extensions
|
||||
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <laforge@netfilter.org>
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Contents
|
||||
|
||||
|
||||
Introduction
|
||||
The netfilter/iptables architecture
|
||||
Netfilter hooks in protocol stacks
|
||||
Packet selection based on IP Tables
|
||||
The Connection Tracking Subsystem
|
||||
The NAT Subsystem based on netfilter + iptables
|
||||
Packet filtering using the 'filter' table
|
||||
Packet mangling using the 'mangle' table
|
||||
Advanced netfilter concepts
|
||||
Current development and Future
|
||||
Developing a netfilter module
|
||||
Developing a new iptables match
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Netfilter Hooks
|
||||
|
||||
What is netfilter?
|
||||
|
||||
System of callback functions within network stack
|
||||
Callback function is called for every packet traversing a certain point (hook) within the network stack
|
||||
Protocol independent framework
|
||||
Hooks in layer 3 stacks (IPv4, IPv6, DECnet, ARP)
|
||||
Multiple kernel modules can register with each of the hooks
|
||||
Asynchronous packet handling in userspace (ip_queue)
|
||||
|
||||
Traditional packet filtering, NAT, ... is implemented on top of this framework
|
||||
|
||||
Can be used for other stuff interfacing with the core network stack, like DECnet routing daemon.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Netfilter Hooks
|
||||
|
||||
Netfilter architecture in IPv4
|
||||
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
--->[1]--->[ROUTE]--->[3]--->[4]--->
|
||||
| ^
|
||||
| |
|
||||
| [ROUTE]
|
||||
v |
|
||||
[2] [5]
|
||||
| ^
|
||||
| |
|
||||
v |
|
||||
|
||||
%font "standard"
|
||||
1=NF_IP_PRE_ROUTING
|
||||
2=NF_IP_LOCAL_IN
|
||||
3=NF_IP_FORWARD
|
||||
4=NF_IP_POST_ROUTING
|
||||
5=NF_IP_LOCAL_OUT
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Netfilter Hooks
|
||||
|
||||
Netfilter Hooks
|
||||
|
||||
Any kernel module may register a callback function at any of the hooks
|
||||
|
||||
The module has to return one of the following constants
|
||||
|
||||
NF_ACCEPT continue traversal as normal
|
||||
NF_DROP drop the packet, do not continue
|
||||
NF_STOLEN I've taken over the packet, do not continue
|
||||
NF_QUEUE enqueue packet to userspace
|
||||
NF_REPEAT call this hook again
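The verdict handling listed above can be sketched in user space. This is a simplified model and not kernel code: the verdict names mirror the constants on the slide, while `struct packet`, `hook_fn`, `run_hooks` and the three sample hooks are hypothetical names made up for this illustration. NF_STOLEN and NF_QUEUE are simply returned to the caller here.

```c
#include <assert.h>
#include <stddef.h>

/* Verdict names as on the slide; values are local to this sketch. */
enum nf_verdict { NF_DROP, NF_ACCEPT, NF_STOLEN, NF_QUEUE, NF_REPEAT };

/* Hypothetical stand-in for a packet buffer. */
struct packet {
    int ttl;
    int seen;   /* how many times a counting hook saw the packet */
};

typedef enum nf_verdict (*hook_fn)(struct packet *pkt);

/* Call the hooks in order. NF_ACCEPT moves on to the next hook,
 * NF_REPEAT calls the same hook again, anything else ends traversal. */
enum nf_verdict run_hooks(hook_fn *hooks, size_t n, struct packet *pkt)
{
    size_t i = 0;

    assert(pkt != NULL);
    while (i < n) {
        enum nf_verdict v = hooks[i](pkt);

        if (v == NF_REPEAT)
            continue;
        if (v != NF_ACCEPT)
            return v;
        i++;
    }
    return NF_ACCEPT;
}

/* Sample hook: count the packet and let it pass. */
enum nf_verdict count_hook(struct packet *pkt)
{
    pkt->seen++;
    return NF_ACCEPT;
}

/* Sample hook: drop packets whose TTL is about to expire. */
enum nf_verdict drop_low_ttl(struct packet *pkt)
{
    return pkt->ttl < 2 ? NF_DROP : NF_ACCEPT;
}

/* Sample hook: ask to be called a second time, then accept. */
enum nf_verdict ask_repeat_once(struct packet *pkt)
{
    pkt->seen++;
    return pkt->seen < 2 ? NF_REPEAT : NF_ACCEPT;
}
```

A caller would build an array of `hook_fn` pointers and pass it to `run_hooks` together with the packet. A real hook receives more arguments than this model shows.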
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Developing netfilter/iptables extensions
|
||||
Developing a netfilter module
|
||||
|
||||
Netfilter modules are very low-layer
|
||||
Get called for every packet passing the hook in this l3prot
|
||||
Examples of netfilter modules are: ip_tables, ip_conntrack, iptable_nat
|
||||
%font "typewriter"
|
||||
%size 2
|
||||
#include <linux/netfilter.h>
|
||||
%size 2
|
||||
nf_register_hook(struct nf_hook_ops *reg)
|
||||
%size 2
|
||||
nf_unregister_hook(struct nf_hook_ops *reg)
|
||||
%size 2
|
||||
struct nf_hook_ops:
|
||||
%size 2
|
||||
struct list_head list; /* list header */
|
||||
%size 2
|
||||
nf_hookfn *hook; /* the callback function */
|
||||
%size 2
|
||||
int pf; /* protocol family */
|
||||
%size 2
|
||||
int hooknum; /* hook to register with */
|
||||
%size 2
|
||||
int priority; /* priority (ordering) */
|
||||
%font "standard"
|
||||
Example code see "nf_workshop.c"
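The role of the priority field can be illustrated with a small user-space model. It is a sketch under assumptions, not the kernel implementation: `struct hook_ops`, `register_hook` and `unregister_hook` are hypothetical names, the list is a plain singly linked list instead of `struct list_head`, and it assumes that lower priority values run first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced version of a hook registration record. */
struct hook_ops {
    struct hook_ops *next;
    const char *name;
    int priority;          /* assumption: lower value is called earlier */
};

/* Insert reg so that the list stays sorted by ascending priority.
 * Entries with equal priority keep their registration order. */
void register_hook(struct hook_ops **head, struct hook_ops *reg)
{
    struct hook_ops **pos = head;

    assert(head != NULL && reg != NULL);
    while (*pos != NULL && (*pos)->priority <= reg->priority)
        pos = &(*pos)->next;
    reg->next = *pos;
    *pos = reg;
}

/* Unlink reg from the list if it is registered. */
void unregister_hook(struct hook_ops **head, struct hook_ops *reg)
{
    struct hook_ops **pos;

    for (pos = head; *pos != NULL; pos = &(*pos)->next) {
        if (*pos == reg) {
            *pos = reg->next;
            reg->next = NULL;
            return;
        }
    }
}
```

With this ordering, a module that must see packets before the filter (connection tracking, for instance) registers with a smaller priority value than the filter does.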
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
IP tables
|
||||
|
||||
Packet selection using IP tables
|
||||
|
||||
The kernel provides generic IP tables support
|
||||
|
||||
Each kernel module may create its own IP table
|
||||
|
||||
The three major parts of 2.4 firewalling subsystem are implemented using IP tables
|
||||
Packet filtering table 'filter'
|
||||
NAT table 'nat'
|
||||
Packet mangling table 'mangle'
|
||||
|
||||
Could potentially be used for other stuff, e.g. IPsec SPDB
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
IP Tables
|
||||
|
||||
Managing chains and tables
|
||||
|
||||
An IP table consists of multiple chains
|
||||
A chain consists of a list of rules
|
||||
Every single rule in a chain consists of
|
||||
match[es] (rule executed if all matches true)
|
||||
target (what to do if the rule is matched)
|
||||
|
||||
%size 4
|
||||
matches and targets can either be builtin or implemented as kernel modules
|
||||
|
||||
%size 5
|
||||
The userspace tool iptables is used to control IP tables
|
||||
handles all different kinds of IP tables
|
||||
supports a plugin/shlib interface for target/match specific options
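The table/chain/rule structure described above can be modelled in a few lines of user-space C. This is a sketch only: all type and function names (`struct rule`, `run_chain`, `match_proto`, `match_dport`) are hypothetical, and the chain policy parameter is an assumption of this model for the case where no rule fires.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced packet description. */
struct packet {
    int proto;   /* IP protocol number, e.g. 6 for TCP */
    int dport;   /* destination port */
};

typedef bool (*match_fn)(const struct packet *pkt, int arg);

struct match {
    match_fn fn;
    int arg;
};

enum target { T_ACCEPT, T_DROP };

/* A rule is a list of matches plus one target. */
struct rule {
    const struct match *matches;
    size_t nmatches;
    enum target target;
};

bool match_proto(const struct packet *pkt, int arg) { return pkt->proto == arg; }
bool match_dport(const struct packet *pkt, int arg) { return pkt->dport == arg; }

/* Walk the chain; the first rule whose matches are ALL true decides.
 * If no rule fires, fall back to the given policy. */
enum target run_chain(const struct rule *rules, size_t nrules,
                      const struct packet *pkt, enum target policy)
{
    size_t r, m;

    assert(pkt != NULL);
    for (r = 0; r < nrules; r++) {
        bool all = true;

        for (m = 0; m < rules[r].nmatches; m++) {
            if (!rules[r].matches[m].fn(pkt, rules[r].matches[m].arg)) {
                all = false;
                break;
            }
        }
        if (all)
            return rules[r].target;
    }
    return policy;
}
```

A rule with the two matches "protocol is TCP" and "destination port is 25" and the target `T_ACCEPT` behaves like an accept rule for SMTP in this model.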
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
IP Tables
|
||||
|
||||
Basic iptables commands
|
||||
|
||||
To build a complete iptables command, we must specify
|
||||
which table to work with
|
||||
which chain in this table to use
|
||||
an operation (insert, add, delete, modify)
|
||||
one or more matches (optional)
|
||||
a target
|
||||
|
||||
The syntax is
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t table -Operation chain -j target match(es)
|
||||
%font "standard"
|
||||
%size 5
|
||||
|
||||
Example:
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t filter -A INPUT -j ACCEPT -p tcp --dport smtp
|
||||
%font "standard"
|
||||
%size 5
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
IP Tables
|
||||
|
||||
Matches
|
||||
Basic matches
|
||||
-p protocol (tcp/udp/icmp/...)
|
||||
-s source address (ip/mask)
|
||||
-d destination address (ip/mask)
|
||||
-i incoming interface
|
||||
-o outgoing interface
|
||||
|
||||
Match extensions (examples)
|
||||
tcp/udp TCP/UDP source/destination port
|
||||
icmp ICMP code/type
|
||||
ah/esp AH/ESP SPI match
|
||||
mac source MAC address
|
||||
mark nfmark
|
||||
length match on length of packet
|
||||
limit rate limiting (n packets per timeframe)
|
||||
owner owner uid of the socket sending the packet
|
||||
tos TOS field of IP header
|
||||
ttl TTL field of IP header
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
IP Tables
|
||||
|
||||
Targets
|
||||
very dependent on the particular table.
|
||||
|
||||
Table specific targets will be discussed later
|
||||
|
||||
Generic Targets, always available
|
||||
ACCEPT accept packet within chain
|
||||
DROP silently drop packet
|
||||
QUEUE enqueue packet to userspace
|
||||
LOG log packet via syslog
|
||||
ULOG log packet via ulogd
|
||||
RETURN return to previous (calling) chain
|
||||
foobar jump to user defined chain
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Packet Filtering
|
||||
|
||||
Overview
|
||||
|
||||
Implemented as 'filter' table
|
||||
Registers with three netfilter hooks
|
||||
|
||||
NF_IP_LOCAL_IN (packets destined for the local host)
|
||||
NF_IP_FORWARD (packets forwarded by local host)
|
||||
NF_IP_LOCAL_OUT (packets from the local host)
|
||||
|
||||
Each of the three hooks has one chain attached (INPUT, FORWARD, OUTPUT)
|
||||
|
||||
Every packet passes exactly one of the three chains. Note that this is very different compared to the old 2.2.x ipchains behaviour.
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Packet Filtering
|
||||
|
||||
Targets available within 'filter' table
|
||||
|
||||
Builtin Targets to be used in filter table
|
||||
ACCEPT accept the packet
|
||||
DROP silently drop the packet
|
||||
QUEUE enqueue packet to userspace
|
||||
RETURN return to previous (calling) chain
|
||||
foobar user defined chain
|
||||
|
||||
Targets implemented as loadable modules
|
||||
REJECT drop the packet but inform sender
|
||||
MIRROR change source/destination IP and resend
|
||||
LOG log via syslog
|
||||
ULOG log via userspace
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Developing netfilter/iptables extensions
|
||||
Developing an ip_tables match module
|
||||
|
||||
ip_tables modules are at a high layer
|
||||
Get called for every packet iterating a rule with this match
|
||||
Examples of iptables modules are: ipt_ttl, ipt_tos, ipt_tcpmss
|
||||
%font "typewriter"
|
||||
%size 2
|
||||
#include <linux/netfilter_ipv4/ip_tables.h>
|
||||
%size 2
|
||||
ipt_register_match(struct ipt_match *match)
|
||||
%size 2
|
||||
ipt_unregister_match(struct ipt_match *match)
|
||||
%size 2
|
||||
struct ipt_match:
|
||||
%size 2
|
||||
struct list_head list; /* list header {NULL,NULL} */
|
||||
%size 2
|
||||
const char name[]; /* name of the match */
|
||||
%size 2
|
||||
int (*match); /* called when pkt is matched */
|
||||
%size 2
|
||||
int (*checkentry); /* called when entry inserted */
|
||||
%size 2
|
||||
void (*destroy); /* called when entry deleted */
|
||||
%size 2
|
||||
struct module *me; /* set to THIS_MODULE */
|
||||
%font "standard"
|
||||
Example code see "ipt_workshop.c"
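The division of labour between the 'checkentry' and 'match' callbacks can be sketched in user space. This is not the kernel API: the real callbacks take more arguments than shown, and `struct ws_info`, `struct ip_header`, `ws_checkentry` and `ws_match` are hypothetical names for a TTL match in the spirit of the workshop example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-rule match data, filled in by the userspace tool. */
struct ws_info {
    unsigned int ttl;
};

/* Hypothetical, reduced IP header. */
struct ip_header {
    unsigned char ttl;
};

/* Model of 'checkentry': called once when a rule is inserted.
 * Returns nonzero if the rule data looks sane. */
int ws_checkentry(const void *matchinfo, size_t matchsize)
{
    return matchinfo != NULL && matchsize == sizeof(struct ws_info);
}

/* Model of 'match': called for every packet that reaches the rule.
 * Returns nonzero if the packet's TTL equals the configured value. */
int ws_match(const struct ip_header *iph, const void *matchinfo)
{
    const struct ws_info *info = matchinfo;

    assert(iph != NULL && info != NULL);
    return iph->ttl == info->ttl;
}
```

The point of the split is that validation happens once at rule insertion, so the per-packet path only has to compare values.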
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Developing netfilter/iptables extensions
|
||||
Developing an iptables match module
|
||||
|
||||
Something has to parse the commandline options for ipt_workshop.c
|
||||
Solution: libipt_workshop.c as iptables plugin
|
||||
%font "typewriter"
|
||||
%size 2
|
||||
#include <iptables.h>
|
||||
%size 2
|
||||
register_match(struct iptables_match *match)
|
||||
%size 2
|
||||
struct iptables_match:
|
||||
%size 2
|
||||
struct iptables_match *next; /* next one */
|
||||
%size 2
|
||||
ipt_chainlabel name; /* name */
|
||||
%size 2
|
||||
const char *version; /* version */
|
||||
%size 2
|
||||
size_t size; /* size of match data */
|
||||
%size 2
|
||||
size_t userspacesize; /* size for userspace */
|
||||
%size 2
|
||||
void (*help); /* print help message */
|
||||
%size 2
|
||||
void (*init); /* init the matchinfo */
|
||||
%size 2
|
||||
int (*parse); /* parse getopt chars */
|
||||
%size 2
|
||||
void (*final_check); /* consistency check */
|
||||
%size 2
|
||||
void (*print); /* print (iptables -L) */
|
||||
%size 2
|
||||
void (*save); /* iptables-save */
|
||||
%size 2
|
||||
struct option *extra_opts; /* getopt-style opts */
|
||||
%font "standard"
|
||||
Example code see "libipt_workshop.c"
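The option parsing that such a plugin contributes can be sketched as a standalone function. This is a minimal sketch, assuming glibc's `getopt_long`. The option table mirrors the `{ "ttl", 1, 0, 'z' }` entry of the workshop plugin, but `parse_ttl` is a hypothetical helper and not the plugin's actual `parse` callback, which iptables calls with one already-recognized option at a time.

```c
#include <assert.h>
#include <getopt.h>
#include <stdlib.h>

/* Same shape as the plugin's option table: --ttl takes one argument. */
static const struct option ws_opts[] = {
    { "ttl", required_argument, NULL, 'z' },
    { NULL, 0, NULL, 0 }
};

/* Parse "--ttl N" from argv. Returns 1 and stores N in *ttl on success,
 * 0 if the option is missing, unknown, or N is not in 0..255. */
int parse_ttl(int argc, char **argv, unsigned int *ttl)
{
    int c;
    int found = 0;

    assert(ttl != NULL);
    optind = 1;   /* restart scanning at the first argument */
    opterr = 0;   /* keep getopt quiet, we report errors ourselves */
    while ((c = getopt_long(argc, argv, "", ws_opts, NULL)) != -1) {
        char *end;
        unsigned long v;

        if (c != 'z')
            return 0;
        v = strtoul(optarg, &end, 10);
        if (end == optarg || *end != '\0' || v > 255)
            return 0;
        *ttl = (unsigned int)v;
        found = 1;
    }
    return found;
}
```

The range check is the kind of validation that the plugin would otherwise perform in its `parse` and `final_check` callbacks.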
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Connection tracking...
|
||||
implemented separately from NAT
|
||||
enables stateful filtering
|
||||
implementation
|
||||
hooks into NF_IP_PRE_ROUTING to track packets
|
||||
hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
|
||||
protocol modules (currently TCP/UDP/ICMP)
|
||||
application helpers (currently FTP, IRC, H.323, talk, SNMP)
|
||||
divides packets into the following four categories
|
||||
NEW - would establish new connection
|
||||
ESTABLISHED - part of already established connection
|
||||
RELATED - is related to established connection
|
||||
INVALID - (multicast, errors...)
|
||||
does _NOT_ filter packets itself
|
||||
can be utilized by iptables using the 'state' match
|
||||
is used by NAT Subsystem
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Common structures
|
||||
struct ip_conntrack_tuple, representing unidirectional flow
|
||||
layer 3 src + dst
|
||||
layer 4 protocol
|
||||
layer 4 src + dst
|
||||
connections represented as struct ip_conntrack
|
||||
original tuple
|
||||
reply tuple
|
||||
timeout
|
||||
l4 state private data
|
||||
app helper
|
||||
app helper private data
|
||||
expected connections
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Flow of events for new packet
|
||||
packet enters NF_IP_PRE_ROUTING
|
||||
tuple is derived from packet
|
||||
lookup conntrack hash table with hash(tuple) -> fails
|
||||
new ip_conntrack is allocated
|
||||
fill in original and reply == inverted(original) tuple
|
||||
initialize timer
|
||||
assign app helper if applicable
|
||||
see if we've been expected -> fails
|
||||
call layer 4 helper 'new' function
|
||||
...
|
||||
packet enters NF_IP_POST_ROUTING
|
||||
do hashtable lookup for packet -> fails
|
||||
place struct ip_conntrack in hashtable
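The two-stage flow above (create the entry at NF_IP_PRE_ROUTING, place it in the hash table only at NF_IP_POST_ROUTING) can be sketched in user space. This is a simplified model with hypothetical names (`struct tuple`, `struct conn`, `conn_init`, `conn_confirm`, `conn_lookup`). It hashes only the original tuple and uses a toy hash function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 16

/* Unidirectional flow: layer 3 src/dst, layer 4 protocol and src/dst. */
struct tuple {
    uint32_t src, dst;
    uint16_t sport, dport;
    uint8_t proto;
};

struct conn {
    struct tuple orig, reply;
    bool confirmed;       /* true once placed in the hash table */
    struct conn *next;    /* hash bucket chaining */
};

static struct conn *table[TABLE_SIZE];

static unsigned int hash_tuple(const struct tuple *t)
{
    return (t->src ^ t->dst ^ t->sport ^ t->dport ^ t->proto) % TABLE_SIZE;
}

static bool tuple_equal(const struct tuple *a, const struct tuple *b)
{
    return a->src == b->src && a->dst == b->dst &&
           a->sport == b->sport && a->dport == b->dport &&
           a->proto == b->proto;
}

/* Hash table lookup by original tuple; NULL means "lookup fails". */
struct conn *conn_lookup(const struct tuple *t)
{
    struct conn *c;

    for (c = table[hash_tuple(t)]; c != NULL; c = c->next)
        if (tuple_equal(&c->orig, t))
            return c;
    return NULL;
}

/* PRE_ROUTING step: fill in original and reply == inverted(original). */
void conn_init(struct conn *c, const struct tuple *t)
{
    assert(c != NULL && t != NULL);
    c->orig = *t;
    c->reply.src = t->dst;
    c->reply.dst = t->src;
    c->reply.sport = t->dport;
    c->reply.dport = t->sport;
    c->reply.proto = t->proto;
    c->confirmed = false;
    c->next = NULL;
}

/* POST_ROUTING step: insert only if the entry is not in the table yet. */
bool conn_confirm(struct conn *c)
{
    unsigned int h = hash_tuple(&c->orig);

    if (conn_lookup(&c->orig) != NULL)
        return false;
    c->next = table[h];
    table[h] = c;
    c->confirmed = true;
    return true;
}
```

Deferring the insertion means that a packet dropped between the two hooks never leaves an entry in the table.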
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Flow of events for packet part of existing connection
|
||||
packet enters NF_IP_PRE_ROUTING
|
||||
tuple is derived from packet
|
||||
lookup conntrack hash table with hash(tuple)
|
||||
associate conntrack entry with skb->nfct
|
||||
call l4 protocol helper 'packet' function
|
||||
do l4 state tracking
|
||||
update timeouts as needed [e.g. TCP TIME_WAIT,...]
|
||||
...
|
||||
packet enters NF_IP_POST_ROUTING
|
||||
do hashtable lookup for packet -> succeeds
|
||||
do nothing else
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Writing extensions for the conntrack subsystem
|
||||
|
||||
new l4 protocol modules are very rare
|
||||
more common: application helpers for ftp,irc,h.323,quake,mms,...
|
||||
API for conntrack helper modules:
|
||||
%font "typewriter"
|
||||
%size 2
|
||||
#include <linux/netfilter_ipv4/ip_conntrack_helper.h>
|
||||
%size 2
|
||||
struct ip_conntrack_helper
|
||||
%size 2
|
||||
struct list_head list;
|
||||
%size 2
|
||||
const char *name;
|
||||
%size 2
|
||||
unsigned char flags;
|
||||
%size 2
|
||||
struct module *me;
|
||||
%size 2
|
||||
unsigned int max_expected;
|
||||
%size 2
|
||||
unsigned int timeout;
|
||||
%size 2
|
||||
struct ip_conntrack_tuple tuple;
|
||||
%size 2
|
||||
struct ip_conntrack_tuple mask;
|
||||
%size 2
|
||||
int (*help)(const struct iphdr *iph, size_t, struct ip_conntrack, enum ip_conntrack_info);
|
||||
%size 2
|
||||
int ip_conntrack_helper_register(struct ip_conntrack_helper);
|
||||
%size 2
|
||||
void ip_conntrack_helper_unregister(struct ip_conntrack_helper);
|
||||
%size 2
|
||||
int ip_conntrack_expect_related(struct ip_conntrack, struct ip_conntrack_expect);
|
||||
%size 2
|
||||
int ip_conntrack_change_expect(struct ip_conntrack_expect, struct ip_conntrack_tuple);
|
||||
%size 2
|
||||
void ip_conntrack_unexpect_related(struct ip_conntrack_expect);
|
||||
%font "standard"
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Network Address Translation
|
||||
|
||||
Overview
|
||||
|
||||
Previous Linux Kernels only implemented one special case of NAT: Masquerading
|
||||
Linux 2.4.x can do any kind of NAT.
|
||||
NAT subsystem implemented on top of netfilter, iptables and conntrack
|
||||
NAT subsystem registers with all five netfilter hooks
|
||||
'nat' Table registers chains PREROUTING, POSTROUTING and OUTPUT
|
||||
Following targets available within 'nat' Table
|
||||
SNAT changes the packet's source while passing NF_IP_POST_ROUTING
|
||||
DNAT changes the packet's destination while passing NF_IP_PRE_ROUTING
|
||||
MASQUERADE is a special case of SNAT
|
||||
REDIRECT is a special case of DNAT
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Network Address Translation
|
||||
|
||||
flow of events for NEW packet:
|
||||
packet enters NF_IP_PRE_ROUTING after conntrack
|
||||
resolve conntrack entry for packet
|
||||
if (expectfn of helper) call it
|
||||
else iterate over rules in PREROUTING chain of nat table
|
||||
save respective NAT mappings in conntrack
|
||||
apply the NAT mappings to the packet
|
||||
call NAT helper function, if there is one for this proto
|
||||
...
|
||||
packet enters NF_IP_POST_ROUTING
|
||||
resolve conntrack entry for packet
|
||||
iterate over rules in POSTROUTING chain of nat table
|
||||
save respective NAT mappings in conntrack
|
||||
apply the NAT mappings to the packet
|
||||
call NAT helper function, if there is one for this proto
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Network Address Translation
|
||||
|
||||
flow of events for ESTABLISHED packets:
|
||||
packet enters NF_IP_PRE_ROUTING after conntrack
|
||||
resolve conntrack entry for packet
|
||||
apply the NAT mappings (read from conntrack entry) to the packet
|
||||
call NAT helper function, if there is one for this proto
|
||||
...
|
||||
packet enters NF_IP_POST_ROUTING
|
||||
resolve conntrack entry for packet
|
||||
apply the NAT mappings (read from conntrack entry) to the packet
|
||||
call NAT helper function, if there is one for this proto
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Developing a NAT helper module
|
||||
Network Address Translation
|
||||
|
||||
%font "typewriter"
|
||||
%size 2
|
||||
#include <linux/netfilter_ipv4/ip_nat_helper.h>
|
||||
%size 2
|
||||
struct ip_nat_helper
|
||||
%size 2
|
||||
struct list_head list;
|
||||
%size 2
|
||||
const char *name;
|
||||
%size 2
|
||||
unsigned char *flags;
|
||||
%size 2
|
||||
struct module *me;
|
||||
%size 2
|
||||
struct ip_conntrack_tuple tuple;
|
||||
%size 2
|
||||
struct ip_conntrack_tuple mask;
|
||||
%size 2
|
||||
unsigned int (*help)(struct ip_conntrack *, struct ip_conntrack_expect *, struct ip_nat_info *, enum ip_conntrack_info, unsigned int hooknum, struct sk_buff **)
|
||||
%size 2
|
||||
unsigned int (*expect)(struct sk_buff **, unsigned int hooknum, struct ip_conntrack, struct ip_nat_info *)
|
||||
%size 2
|
||||
int ip_nat_helper_register(struct ip_nat_helper *);
|
||||
%size 2
|
||||
void ip_nat_helper_unregister(struct ip_nat_helper *);
|
||||
%size 2
|
||||
int ip_nat_mangle_tcp_packet();
|
||||
%size 2
|
||||
int ip_nat_mangle_udp_packet();
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Advanced Netfilter concepts
|
||||
|
||||
%size 4
|
||||
Userspace logging
|
||||
flexible replacement for old syslog-based logging
|
||||
packets to userspace via multicast netlink sockets
|
||||
easy-to-use library (libipulog)
|
||||
plugin-extensible userspace logging daemon (ulogd)
|
||||
Can even be used to directly log into MySQL
|
||||
|
||||
Queuing
|
||||
reliable asynchronous packet handling
|
||||
packets to userspace via unicast netlink socket
|
||||
easy-to-use library (libipq)
|
||||
provides Perl bindings
|
||||
experimental queue multiplex daemon (ipqmpd)
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Developing netfilter/iptables extensions
|
||||
Thanks
|
||||
The slides and the accompanying paper for this presentation are available at http://www.gnumonks.org/
|
||||
The netfilter homepage: http://www.netfilter.org/
|
||||
Thanks to
|
||||
the BBS people, Z-Netz, FIDO, ...
|
||||
for heavily increasing my computer usage in 1992
|
||||
KNF
|
||||
for bringing me in touch with the internet as early as 1994
|
||||
for providing a playground for technical people
|
||||
for telling me about the existence of Linux!
|
||||
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
|
||||
for implementing (one of?) the world's best TCP/IP stacks
|
||||
Paul 'Rusty' Russell
|
||||
for starting the netfilter/iptables project
|
||||
for trusting me to maintain it today
|
||||
Astaro AG (http://www.astaro.com/)
|
||||
for sponsoring parts of my netfilter work
|
||||
for sponsoring my travel cost to OLS
|
||||
|
|
@ -0,0 +1,615 @@
|
|||
%include "default.mgp"
|
||||
%default 1 bgrad
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
#%deffont "typewriter" tfont "MONOTYPE.TTF"
|
||||
%page
|
||||
%nodefault
|
||||
%back "blue"
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
Developing netfilter/iptables
|
||||
extensions
|
||||
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <laforge@netfilter.org>
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Contents
|
||||
|
||||
|
||||
Introduction
|
||||
The netfilter/iptables architecture
|
||||
Netfilter hooks in protocol stacks
|
||||
Packet selection based on IP Tables
|
||||
The Connection Tracking Subsystem
|
||||
The NAT Subsystem based on netfilter + iptables
|
||||
Packet filtering using the 'filter' table
|
||||
Packet mangling using the 'mangle' table
|
||||
Advanced netfilter concepts
|
||||
Current development and Future
|
||||
Developing a netfilter module
|
||||
Developing a new iptables match
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Netfilter Hooks
|
||||
|
||||
What is netfilter?
|
||||
|
||||
System of callback functions within network stack
|
||||
Callback function is called for every packet traversing a certain point (hook) within the network stack
|
||||
Protocol independent framework
|
||||
Hooks in layer 3 stacks (IPv4, IPv6, DECnet, ARP)
|
||||
Multiple kernel modules can register with each of the hooks
|
||||
Asynchronous packet handling in userspace (ip_queue)
|
||||
|
||||
Traditional packet filtering, NAT, ... is implemented on top of this framework
|
||||
|
||||
Can be used for other stuff interfacing with the core network stack, like DECnet routing daemon.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Netfilter Hooks
|
||||
|
||||
Netfilter architecture in IPv4
|
||||
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
--->[1]--->[ROUTE]--->[3]--->[4]--->
|
||||
| ^
|
||||
| |
|
||||
| [ROUTE]
|
||||
v |
|
||||
[2] [5]
|
||||
| ^
|
||||
| |
|
||||
v |
|
||||
|
||||
%font "standard"
|
||||
1=NF_IP_PRE_ROUTING
|
||||
2=NF_IP_LOCAL_IN
|
||||
3=NF_IP_FORWARD
|
||||
4=NF_IP_POST_ROUTING
|
||||
5=NF_IP_LOCAL_OUT
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Netfilter Hooks
|
||||
|
||||
Netfilter Hooks
|
||||
|
||||
Any kernel module may register a callback function at any of the hooks
|
||||
|
||||
The module has to return one of the following constants
|
||||
|
||||
NF_ACCEPT continue traversal as normal
|
||||
NF_DROP drop the packet, do not continue
|
||||
NF_STOLEN I've taken over the packet, do not continue
|
||||
NF_QUEUE enqueue packet to userspace
|
||||
NF_REPEAT call this hook again
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Developing netfilter/iptables extensions
|
||||
Developing a netfilter module
|
||||
|
||||
Netfilter modules are very low-layer
|
||||
Get called for every packet passing the hook in this l3prot
|
||||
Examples of netfilter modules are: ip_tables, ip_conntrack, iptable_nat
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
#include <linux/netfilter.h>
|
||||
%size 3
|
||||
nf_register_hook(struct nf_hook_ops *reg)
|
||||
%size 3
|
||||
nf_unregister_hook(struct nf_hook_ops *reg)
|
||||
%size 3
|
||||
struct nf_hook_ops:
|
||||
%size 3
|
||||
struct list_head list; /* list header */
|
||||
%size 3
|
||||
nf_hookfn *hook; /* the callback function */
|
||||
%size 3
|
||||
int pf; /* protocol family */
|
||||
%size 3
|
||||
int hooknum; /* hook to register with */
|
||||
%size 3
|
||||
int priority; /* priority (ordering) */
|
||||
%font "standard"
|
||||
Example code see "nf_workshop.c"
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
IP tables
|
||||
|
||||
Packet selection using IP tables
|
||||
|
||||
The kernel provides generic IP tables support
|
||||
|
||||
Each kernel module may create its own IP table
|
||||
|
||||
The three major parts of 2.4 firewalling subsystem are implemented using IP tables
|
||||
Packet filtering table 'filter'
|
||||
NAT table 'nat'
|
||||
Packet mangling table 'mangle'
|
||||
|
||||
Could potentially be used for other stuff, e.g. IPsec SPDB
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
IP Tables
|
||||
|
||||
Managing chains and tables
|
||||
|
||||
An IP table consists of multiple chains
|
||||
A chain consists of a list of rules
|
||||
Every single rule in a chain consists of
|
||||
match[es] (rule executed if all matches true)
|
||||
target (what to do if the rule is matched)
|
||||
|
||||
%size 4
|
||||
matches and targets can either be builtin or implemented as kernel modules
|
||||
|
||||
%size 5
|
||||
The userspace tool iptables is used to control IP tables
|
||||
handles all different kinds of IP tables
|
||||
supports a plugin/shlib interface for target/match specific options
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
IP Tables
|
||||
|
||||
Basic iptables commands
|
||||
|
||||
To build a complete iptables command, we must specify
|
||||
which table to work with
|
||||
which chain in this table to use
|
||||
an operation (insert, add, delete, modify)
|
||||
one or more matches (optional)
|
||||
a target
|
||||
|
||||
The syntax is
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t table -Operation chain -j target match(es)
|
||||
%font "standard"
|
||||
%size 5
|
||||
|
||||
Example:
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
iptables -t filter -A INPUT -j ACCEPT -p tcp --dport smtp
|
||||
%font "standard"
|
||||
%size 5
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
IP Tables
|
||||
|
||||
Matches
|
||||
Basic matches
|
||||
-p protocol (tcp/udp/icmp/...)
|
||||
-s source address (ip/mask)
|
||||
-d destination address (ip/mask)
|
||||
-i incoming interface
|
||||
-o outgoing interface
|
||||
|
||||
Match extensions (examples)
|
||||
tcp/udp TCP/UDP source/destination port
|
||||
icmp ICMP code/type
|
||||
ah/esp AH/ESP SPI match
|
||||
mac source MAC address
|
||||
mark nfmark
|
||||
length match on length of packet
|
||||
limit rate limiting (n packets per timeframe)
|
||||
owner owner uid of the socket sending the packet
|
||||
tos TOS field of IP header
|
||||
ttl TTL field of IP header
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
IP Tables
|
||||
|
||||
Targets
|
||||
very dependent on the particular table.
|
||||
|
||||
Table specific targets will be discussed later
|
||||
|
||||
Generic Targets, always available
|
||||
ACCEPT accept packet within chain
|
||||
DROP silently drop packet
|
||||
QUEUE enqueue packet to userspace
|
||||
LOG log packet via syslog
|
||||
ULOG log packet via ulogd
|
||||
RETURN return to previous (calling) chain
|
||||
foobar jump to user defined chain
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Packet Filtering
|
||||
|
||||
Overview
|
||||
|
||||
Implemented as 'filter' table
|
||||
Registers with three netfilter hooks
|
||||
|
||||
NF_IP_LOCAL_IN (packets destined for the local host)
|
||||
NF_IP_FORWARD (packets forwarded by local host)
|
||||
NF_IP_LOCAL_OUT (packets from the local host)
|
||||
|
||||
Each of the three hooks has one chain attached (INPUT, FORWARD, OUTPUT)
|
||||
|
||||
Every packet passes exactly one of the three chains. Note that this is very different compared to the old 2.2.x ipchains behaviour.
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Packet Filtering
|
||||
|
||||
Targets available within 'filter' table
|
||||
|
||||
Builtin Targets to be used in filter table
|
||||
ACCEPT accept the packet
|
||||
DROP silently drop the packet
|
||||
QUEUE enqueue packet to userspace
|
||||
RETURN return to previous (calling) chain
|
||||
foobar user defined chain
|
||||
|
||||
Targets implemented as loadable modules
|
||||
REJECT drop the packet but inform sender
|
||||
MIRROR change source/destination IP and resend
|
||||
LOG log via syslog
|
||||
ULOG log via userspace
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Developing netfilter/iptables extensions
|
||||
Developing an ip_tables match module
|
||||
|
||||
ip_tables modules are at a high layer
|
||||
Get called for every packet iterating a rule with this match
|
||||
Examples of iptables modules are: ipt_ttl, ipt_tos, ipt_tcpmss
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
#include <linux/netfilter_ipv4/ip_tables.h>
|
||||
%size 3
|
||||
ipt_register_match(struct ipt_match *match)
|
||||
%size 3
|
||||
ipt_unregister_match(struct ipt_match *match)
|
||||
%size 3
|
||||
struct ipt_match:
|
||||
%size 3
|
||||
struct list_head list; /* list header {NULL,NULL} */
|
||||
%size 3
|
||||
const char name[]; /* name of the match */
|
||||
%size 3
|
||||
int (*match); /* called when pkt is matched */
|
||||
%size 3
|
||||
int (*checkentry); /* called when entry inserted */
|
||||
%size 3
|
||||
void (*destroy); /* called when entry deleted */
|
||||
%size 3
|
||||
struct module *me; /* set to THIS_MODULE */
|
||||
%font "standard"
|
||||
Example code see "ipt_workshop.c"
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Developing netfilter/iptables extensions
|
||||
Developing an iptables match module
|
||||
|
||||
Something has to parse the commandline options for ipt_workshop.c
|
||||
Solution: libipt_workshop.c as iptables plugin
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
#include <iptables.h>
|
||||
%size 3
|
||||
register_match(struct iptables_match *match)
|
||||
%size 3
|
||||
struct iptables_match:
|
||||
%size 3
|
||||
struct iptables_match *next; /* next one */
|
||||
%size 3
|
||||
ipt_chainlabel name; /* name */
|
||||
%size 3
|
||||
const char *version; /* version */
|
||||
%size 3
|
||||
size_t size; /* size of match data */
|
||||
%size 3
|
||||
size_t userspacesize; /* size for userspace */
|
||||
%size 3
|
||||
void (*help); /* print help message */
|
||||
%size 3
|
||||
void (*init); /* init the matchinfo */
|
||||
%size 3
|
||||
int (*parse); /* parse getopt chars */
|
||||
%size 3
|
||||
void (*final_check); /* consistency check */
|
||||
%size 3
|
||||
void (*print); /* print (iptables -L) */
|
||||
%size 3
|
||||
void (*save); /* iptables-save */
|
||||
%size 3
|
||||
struct option *extra_opts; /* getopt-style opts */
|
||||
%font "standard"
|
||||
Example code see "libipt_workshop.c"
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Connection tracking...
|
||||
implemented separately from NAT
|
||||
enables stateful filtering
|
||||
implementation
|
||||
hooks into NF_IP_PRE_ROUTING to track packets
|
||||
hooks into NF_IP_POST_ROUTING and NF_IP_LOCAL_IN to see if packet passed filtering rules
|
||||
protocol modules (currently TCP/UDP/ICMP)
|
||||
application helpers (currently FTP, IRC, H.323, talk, SNMP)
|
||||
divides packets into the following four categories
|
||||
NEW - would establish new connection
|
||||
ESTABLISHED - part of already established connection
|
||||
RELATED - is related to established connection
|
||||
INVALID - (multicast, errors...)
|
||||
does _NOT_ filter packets itself
|
||||
can be utilized by iptables using the 'state' match
|
||||
is used by NAT Subsystem
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Common structures
|
||||
struct ip_conntrack_tuple, representing unidirectional flow
|
||||
layer 3 src + dst
|
||||
layer 4 protocol
|
||||
layer 4 src + dst
|
||||
connections represented as struct ip_conntrack
|
||||
original tuple
|
||||
reply tuple
|
||||
timeout
|
||||
l4 state private data
|
||||
app helper
|
||||
app helper private data
|
||||
expected connections
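The tuple idea can be sketched in plain userspace C. This is only an illustration under assumptions: struct tuple, tuple_invert() and tuple_equal() are hypothetical names, not the kernel's struct ip_conntrack_tuple API, and the ICMP case (where inversion maps request types to reply types) is left out.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct ip_conntrack_tuple. */
struct tuple {
    uint32_t src_ip;    /* layer 3 source */
    uint32_t dst_ip;    /* layer 3 destination */
    uint8_t  l4proto;   /* layer 4 protocol number, e.g. 6 for TCP */
    uint16_t src_port;  /* layer 4 source */
    uint16_t dst_port;  /* layer 4 destination */
};

/* Build the reply-direction tuple by swapping source and destination. */
static struct tuple tuple_invert(const struct tuple *orig)
{
    struct tuple reply;

    memset(&reply, 0, sizeof(reply));
    reply.src_ip   = orig->dst_ip;
    reply.dst_ip   = orig->src_ip;
    reply.l4proto  = orig->l4proto;
    reply.src_port = orig->dst_port;
    reply.dst_port = orig->src_port;
    return reply;
}

/* Field-wise comparison, so the result does not depend on struct padding. */
static int tuple_equal(const struct tuple *a, const struct tuple *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->l4proto == b->l4proto &&
           a->src_port == b->src_port && a->dst_port == b->dst_port;
}
```

Inverting a tuple twice yields the original again, which is why a connection can be described completely by its original tuple and the derived reply tuple.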
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Flow of events for new packet
|
||||
packet enters NF_IP_PRE_ROUTING
|
||||
tuple is derived from packet
|
||||
lookup conntrack hash table with hash(tuple) -> fails
|
||||
new ip_conntrack is allocated
|
||||
fill in original and reply == inverted(original) tuple
|
||||
initialize timer
|
||||
assign app helper if applicable
|
||||
see if we've been expected -> fails
|
||||
call layer 4 helper 'new' function
|
||||
...
|
||||
packet enters NF_IP_POST_ROUTING
|
||||
do hashtable lookup for packet -> fails
|
||||
place struct ip_conntrack in hashtable
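The two-stage handling above (allocate at PRE_ROUTING, insert into the hash table only at POST_ROUTING) can be sketched in userspace C. All names here (struct flow_tuple, struct conn, ct_lookup, ct_new, ct_confirm) are hypothetical. The sketch also assumes a symmetric hash so that both directions land in the same bucket, which is a simplification and not necessarily how the kernel hashes the two tuples.

```c
#include <stdint.h>
#include <stdlib.h>

#define CT_BUCKETS 64

struct flow_tuple {
    uint32_t src, dst;
    uint16_t sport, dport;
    uint8_t  proto;
};

struct conn {
    struct flow_tuple orig;   /* original direction */
    struct flow_tuple reply;  /* inverted original */
    int confirmed;            /* set once the packet reached POST_ROUTING */
    struct conn *next;        /* hash chain */
};

static struct conn *ct_table[CT_BUCKETS];

/* Symmetric toy hash: a tuple and its inverse map to the same bucket. */
static unsigned ct_hash(const struct flow_tuple *t)
{
    return (t->src ^ t->dst ^ t->sport ^ t->dport ^ t->proto) % CT_BUCKETS;
}

static int same(const struct flow_tuple *a, const struct flow_tuple *b)
{
    return a->src == b->src && a->dst == b->dst &&
           a->sport == b->sport && a->dport == b->dport &&
           a->proto == b->proto;
}

/* Find a confirmed connection matching the tuple in either direction. */
static struct conn *ct_lookup(const struct flow_tuple *t)
{
    struct conn *c;

    for (c = ct_table[ct_hash(t)]; c; c = c->next)
        if (same(&c->orig, t) || same(&c->reply, t))
            return c;
    return NULL;
}

/* PRE_ROUTING, lookup failed: allocate an entry but do not insert it yet. */
static struct conn *ct_new(const struct flow_tuple *t)
{
    struct conn *c = calloc(1, sizeof(*c));

    if (!c)
        return NULL;
    c->orig = *t;
    c->reply.src   = t->dst;
    c->reply.dst   = t->src;
    c->reply.sport = t->dport;
    c->reply.dport = t->sport;
    c->reply.proto = t->proto;
    return c;
}

/* POST_ROUTING: the packet was not dropped, so make the entry visible. */
static void ct_confirm(struct conn *c)
{
    unsigned b;

    if (c->confirmed)         /* already in the table: do nothing else */
        return;
    b = ct_hash(&c->orig);
    c->next = ct_table[b];
    ct_table[b] = c;
    c->confirmed = 1;
}
```

Delaying the insert means that a packet dropped by a filter rule between the two hooks never leaves an entry behind in the table.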
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Flow of events for packet part of existing connection
|
||||
packet enters NF_IP_PRE_ROUTING
|
||||
tuple is derived from packet
|
||||
lookup conntrack hash table with hash(tuple)
|
||||
associate conntrack entry with skb->nfct
|
||||
call l4 protocol helper 'packet' function
|
||||
do l4 state tracking
|
||||
update timeouts as needed [i.e. TCP TIME_WAIT,...]
|
||||
...
|
||||
packet enters NF_IP_POST_ROUTING
|
||||
do hashtable lookup for packet -> succeeds
|
||||
do nothing else
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Writing extensions for the conntrack subsystem
|
||||
|
||||
new l4 protocol modules are very rare
|
||||
more common: application helpers for ftp,irc,h.323,quake,mms,...
|
||||
API for conntrack helper modules:
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
#include <linux/netfilter_ipv4/ip_conntrack_helper.h>
|
||||
%size 3
|
||||
struct ip_conntrack_helper
|
||||
%size 3
|
||||
struct list_head list;
|
||||
%size 3
|
||||
const char *name;
|
||||
%size 3
|
||||
unsigned char flags;
|
||||
%size 3
|
||||
struct module *me;
|
||||
%size 3
|
||||
unsigned int max_expected;
|
||||
%size 3
|
||||
unsigned int timeout;
|
||||
%size 3
|
||||
struct ip_conntrack_tuple tuple;
|
||||
%size 3
|
||||
struct ip_conntrack_tuple mask;
|
||||
%size 3
|
||||
int (*help)(const struct iphdr *iph, size_t len, struct ip_conntrack *, enum ip_conntrack_info);
|
||||
%size 3
|
||||
int ip_conntrack_helper_register(struct ip_conntrack_helper *);
|
||||
%size 3
|
||||
void ip_conntrack_helper_unregister(struct ip_conntrack_helper *);
|
||||
%size 3
|
||||
int ip_conntrack_expect_related(struct ip_conntrack *, struct ip_conntrack_expect *);
|
||||
%size 3
|
||||
int ip_conntrack_change_expect(struct ip_conntrack_expect *, struct ip_conntrack_tuple *);
|
||||
%size 3
|
||||
void ip_conntrack_unexpect_related(struct ip_conntrack_expect *);
|
||||
%font "standard"
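As an illustration of the kind of work a helper's help() callback does, the following userspace sketch parses an FTP PORT command to find the endpoint of the expected data connection. It is not kernel code: parse_ftp_port() and struct expected_endpoint are hypothetical names, and a real helper would go on to describe the expected connection and hand it to ip_conntrack_expect_related().

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical result type: where the data connection is expected. */
struct expected_endpoint {
    uint32_t ip;    /* host byte order */
    uint16_t port;  /* host byte order */
};

/*
 * Parse an FTP "PORT h1,h2,h3,h4,p1,p2" command line.
 * Returns 0 and fills *out on success, -1 if the line is not a
 * well-formed PORT command.
 */
static int parse_ftp_port(const char *line, struct expected_endpoint *out)
{
    unsigned int v[6];
    int i;

    if (sscanf(line, "PORT %u,%u,%u,%u,%u,%u",
               &v[0], &v[1], &v[2], &v[3], &v[4], &v[5]) != 6)
        return -1;

    /* every field is a single byte */
    for (i = 0; i < 6; i++)
        if (v[i] > 255)
            return -1;

    out->ip = ((uint32_t)v[0] << 24) | (v[1] << 16) | (v[2] << 8) | v[3];
    out->port = (uint16_t)((v[4] << 8) | v[5]);
    return 0;
}
```

For example, "PORT 192,168,0,2,4,1" announces a data connection to 192.168.0.2 on port 4 * 256 + 1 = 1025.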
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Network Address Translation
|
||||
|
||||
Overview
|
||||
|
||||
Previous Linux Kernels only implemented one special case of NAT: Masquerading
|
||||
Linux 2.4.x can do any kind of NAT.
|
||||
NAT subsystem implemented on top of netfilter, iptables and conntrack
|
||||
NAT subsystem registers with all five netfilter hooks
|
||||
'nat' Table registers chains PREROUTING, POSTROUTING and OUTPUT
|
||||
Following targets available within 'nat' Table
|
||||
SNAT changes the packet's source while passing NF_IP_POST_ROUTING
|
||||
DNAT changes the packet's destination while passing NF_IP_PRE_ROUTING
|
||||
MASQUERADE is a special case of SNAT
|
||||
REDIRECT is a special case of DNAT
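The difference between SNAT and DNAT, and the hook at which each takes effect, can be sketched as follows. This is a userspace illustration with hypothetical names (struct pkt, struct nat_rule, nat_apply); it only models the two hooks named on the slide and ignores locally generated traffic.

```c
#include <stdint.h>

/* Hypothetical stand-ins for NF_IP_PRE_ROUTING / NF_IP_POST_ROUTING. */
enum hook { PRE_ROUTING, POST_ROUTING };
enum nat_kind { NAT_SNAT, NAT_DNAT };

struct pkt {
    uint32_t src, dst;
    uint16_t sport, dport;
};

struct nat_rule {
    enum nat_kind kind;
    uint32_t new_ip;
    uint16_t new_port;   /* 0 means: keep the original port */
};

/*
 * Apply the rule if the packet is at the hook where this kind of NAT
 * takes place. Returns 1 if the packet was changed, 0 otherwise.
 */
static int nat_apply(struct pkt *p, const struct nat_rule *r, enum hook h)
{
    if (r->kind == NAT_SNAT && h == POST_ROUTING) {
        p->src = r->new_ip;           /* rewrite source after routing */
        if (r->new_port)
            p->sport = r->new_port;
        return 1;
    }
    if (r->kind == NAT_DNAT && h == PRE_ROUTING) {
        p->dst = r->new_ip;           /* rewrite destination before routing */
        if (r->new_port)
            p->dport = r->new_port;
        return 1;
    }
    return 0;
}
```

DNAT has to happen before the routing decision because the new destination determines where the packet is routed; SNAT can wait until the outgoing interface is known.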
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Network Address Translation
|
||||
|
||||
flow of events for NEW packet:
|
||||
packet enters NF_IP_PRE_ROUTING after conntrack
|
||||
resolve conntrack entry for packet
|
||||
if (expectfn of helper) call it
|
||||
else iterate over rules in PREROUTING chain of nat table
|
||||
save respective NAT mappings in conntrack
|
||||
apply the NAT mappings to the packet
|
||||
call NAT helper function, if there is one for this proto
|
||||
...
|
||||
packet enters NF_IP_POST_ROUTING
|
||||
resolve conntrack entry for packet
|
||||
iterate over rules in POSTROUTING chain of nat table
|
||||
save respective NAT mappings in conntrack
|
||||
apply the NAT mappings to the packet
|
||||
call NAT helper function, if there is one for this proto
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Network Address Translation
|
||||
|
||||
flow of events for ESTABLISHED packets:
|
||||
packet enters NF_IP_PRE_ROUTING after conntrack
|
||||
resolve conntrack entry for packet
|
||||
apply the NAT mappings (read from conntrack entry) to the packet
|
||||
call NAT helper function, if there is one for this proto
|
||||
...
|
||||
packet enters NF_IP_POST_ROUTING
|
||||
resolve conntrack entry for packet
|
||||
apply the NAT mappings (read from conntrack entry) to the packet
|
||||
call NAT helper function, if there is one for this proto
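The point of the NEW versus ESTABLISHED flows is that the nat table is consulted only once per connection; later packets reuse the mapping stored in the conntrack entry. A minimal userspace sketch, assuming hypothetical names (struct tracked_conn, lookup_dnat_rule, dnat_prerouting) and a single hard-coded rule:

```c
#include <stdint.h>

struct nat_mapping {
    int valid;          /* has a mapping been chosen for this connection? */
    uint32_t new_dst;   /* destination to rewrite to */
};

/* Hypothetical slice of a conntrack entry. */
struct tracked_conn {
    struct nat_mapping dnat;
};

static int rule_lookups;   /* counts how often the rule set was consulted */

/* Stand-in for iterating the PREROUTING chain of the nat table. */
static uint32_t lookup_dnat_rule(uint32_t dst)
{
    rule_lookups++;
    return dst == 0xC0000201 ? 0x0A000005 : dst;
}

/* Returns the (possibly rewritten) destination for a packet of 'ct'. */
static uint32_t dnat_prerouting(struct tracked_conn *ct, uint32_t dst)
{
    if (!ct->dnat.valid) {
        /* NEW: walk the rules once and save the result in conntrack */
        ct->dnat.new_dst = lookup_dnat_rule(dst);
        ct->dnat.valid = 1;
    }
    /* ESTABLISHED: the mapping is simply read from the entry */
    return ct->dnat.new_dst;
}
```

One consequence is that changing nat rules does not affect connections that already have a mapping stored.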
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Developing a NAT helper module
|
||||
Network Address Translation
|
||||
|
||||
%font "typewriter"
|
||||
%size 3
|
||||
#include <linux/netfilter_ipv4/ip_nat_helper.h>
|
||||
%size 3
|
||||
struct ip_nat_helper
|
||||
%size 3
|
||||
struct list_head list;
|
||||
%size 3
|
||||
const char *name;
|
||||
%size 3
|
||||
unsigned char flags;
|
||||
%size 3
|
||||
struct module *me;
|
||||
%size 3
|
||||
struct ip_conntrack_tuple tuple;
|
||||
%size 3
|
||||
struct ip_conntrack_tuple mask;
|
||||
%size 3
|
||||
unsigned int (*help)(struct ip_conntrack *, struct ip_conntrack_expect *, struct ip_nat_info *, enum ip_conntrack_info, unsigned int hooknum, struct sk_buff **)
|
||||
%size 3
|
||||
unsigned int (*expect)(struct sk_buff **, unsigned int hooknum, struct ip_conntrack *, struct ip_nat_info *)
|
||||
%size 3
|
||||
int ip_nat_helper_register(struct ip_nat_helper *);
|
||||
%size 3
|
||||
void ip_nat_helper_unregister(struct ip_nat_helper *);
|
||||
%size 3
|
||||
int ip_nat_mangle_tcp_packet();
|
||||
%size 3
|
||||
int ip_nat_mangle_udp_packet();
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The netfilter/iptables architecture
|
||||
Advanced Netfilter concepts
|
||||
|
||||
%size 4
|
||||
Userspace logging
|
||||
flexible replacement for old syslog-based logging
|
||||
packets to userspace via multicast netlink sockets
|
||||
easy-to-use library (libipulog)
|
||||
plugin-extensible userspace logging daemon (ulogd)
|
||||
Can even be used to directly log into MySQL
|
||||
|
||||
Queuing
|
||||
reliable asynchronous packet handling
|
||||
packets to userspace via unicast netlink socket
|
||||
easy-to-use library (libipq)
|
||||
provides Perl bindings
|
||||
experimental queue multiplex daemon (ipqmpd)
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Developing netfilter/iptables extensions
|
||||
Thanks
|
||||
The slides and the accompanying paper of this presentation are available at http://www.gnumonks.org/
|
||||
The netfilter homepage: http://www.netfilter.org/
|
||||
Thanks to
|
||||
the BBS people, Z-Netz, FIDO, ...
|
||||
for heavily increasing my computer usage in 1992
|
||||
KNF
|
||||
for bringing me in touch with the internet as early as 1994
|
||||
for providing a playground for technical people
|
||||
for telling me about the existence of Linux!
|
||||
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
|
||||
for implementing (one of?) the world's best TCP/IP stacks
|
||||
Paul 'Rusty' Russell
|
||||
for starting the netfilter/iptables project
|
||||
for trusting me to maintain it today
|
||||
Astaro AG (http://www.astaro.com/)
|
||||
for sponsoring parts of my netfilter work
|
||||
for sponsoring my travel cost to OLS
|
||||
|
|
@ -0,0 +1,57 @@
|
|||
#include <linux/module.h>
|
||||
#include <linux/config.h>
|
||||
#include <linux/skbuff.h>
|
||||
#include <linux/ip.h>
|
||||
|
||||
#include <linux/netfilter.h>
|
||||
#include <linux/netfilter_ipv4.h>
|
||||
|
||||
MODULE_LICENSE("GPL");
|
||||
MODULE_AUTHOR("Harald Welte <laforge@netfilter.org>");
|
||||
MODULE_DESCRIPTION("OLS2003 workshop module");
|
||||
|
||||
static unsigned int
|
||||
workshop_fn(unsigned int hooknum,
|
||||
struct sk_buff **pskb,
|
||||
const struct net_device *in,
|
||||
const struct net_device *out,
|
||||
int (*okfn)(struct sk_buff *))
|
||||
{
|
||||
struct iphdr *iph = (*pskb)->nh.iph;
|
||||
/* do whatever we want to do */
|
||||
|
||||
printk(KERN_NOTICE "packet from %u.%u.%u.%u received\n",
|
||||
NIPQUAD(iph->saddr));
|
||||
|
||||
return NF_ACCEPT;
|
||||
}
|
||||
|
||||
static struct nf_hook_ops workshop_ops = {
|
||||
.list = { .prev = NULL, .next = NULL },
|
||||
.hook = &workshop_fn,
|
||||
.pf = PF_INET,
|
||||
.hooknum = NF_IP_PRE_ROUTING,
|
||||
.priority = NF_IP_PRI_LAST-1
|
||||
};
|
||||
|
||||
static int __init init(void)
|
||||
{
|
||||
int ret = 0;
|
||||
|
||||
ret = nf_register_hook(&workshop_ops);
|
||||
if (ret < 0) {
|
||||
printk(KERN_ERR "something went wrong while registering\n");
|
||||
return ret;
|
||||
}
|
||||
|
||||
printk(KERN_DEBUG "workshop netfilter module successfully loaded\n");
|
||||
return ret;
|
||||
}
|
||||
|
||||
static void __exit fini(void)
|
||||
{
|
||||
nf_unregister_hook(&workshop_ops);
|
||||
}
|
||||
|
||||
module_init(init);
|
||||
module_exit(fini);
|
|
@ -0,0 +1,105 @@
|
|||
What is open source? How does it work?
|
||||
Who writes code for nothing and why?
|
||||
|
||||
- traditional software model
|
||||
- product-oriented
|
||||
- company finances development of software
|
||||
- same copy of software object code is sold under a very restrictive
|
||||
license
|
||||
- license fees refinance cost of development
|
||||
- enforcement of restrictive license guarantees revenue
|
||||
- advantages
|
||||
- proven business model
|
||||
- disadvantage
|
||||
- have to develop everything on your own or buy licenses of 3rd
|
||||
party software
|
||||
- less flexibility for the customer
|
||||
- does the customer trust the 'black box' you are selling?
|
||||
- if vendor goes out of business, no bugfixes/updates
|
||||
|
||||
- open source model
|
||||
- service based
|
||||
- individual parties contribute code parts
|
||||
- software is distributed for free
|
||||
- software is distributed under very permissive license
|
||||
- service / support / customization refinance development
|
||||
- advantages
|
||||
- vast amount of available FOSS can be used as foundation for
|
||||
own products
|
||||
- source code is available for peer review
|
||||
- bug fixes for free, people just send you patches
|
||||
- new features implemented by your users!
|
||||
- disadvantage
|
||||
- business model has yet to be proven to work
|
||||
|
||||
- important open source license
|
||||
- BSD style license
|
||||
- permits any use of the sourcecode as long as copyright notice
|
||||
remains
|
||||
- GPL (GNU General Public License)
|
||||
- source for resulting binary has to be provided
|
||||
- ensures that derivatives of free software are still free
|
||||
- LGPL (GNU Lesser General Public License)
|
||||
- permits linking with non-gpl code (mainly used for libraries)
|
||||
|
||||
- difference free software / open source
|
||||
- term 'free software' (free as in freedom, not beer) introduced by
|
||||
Stallman / FSF 1984.
|
||||
- focus on political/ethical/philosophical freedom
|
||||
- open source software (OSS) introduced by OSI in 1998
|
||||
- focus on technological advantage by means of source review
|
||||
- most FOSS licenses match both definitions, OSS less restrictive
|
||||
|
||||
- history of FOSS
|
||||
- initially software always for free in source (e.g. IBM S/360)
|
||||
- as hardware gets less expensive, companies start to license
|
||||
software for money
|
||||
- some people (Stallman et al.) didn't want to give up the freedom
|
||||
they're used to.
|
||||
- 1983: GNU project is founded, goal: Implementation of a free UNIX-like
|
||||
operating system
|
||||
- 1984: Free Software Foundation is established as not-for-profit legal
|
||||
entity behind the GNU project
|
||||
- 1991: Linus Torvalds releases the first version of the Linux Kernel
|
||||
under the GNU GPL license. Together with the other parts from the
|
||||
GNU project and others, a 100% free operating system is available
|
||||
- 1994-2000: Free Software is increasingly recognized as reliable,
|
||||
stable alternative to proprietary software
|
||||
|
||||
- Who is behind FOSS?
|
||||
- in the beginning mostly computer enthusiasts with academic background
|
||||
- motivation through
|
||||
- fight: david <-> goliath
|
||||
- to show how bad most proprietary software is
|
||||
- to make the internet a better place
|
||||
- to work together with _very_ good programmers
|
||||
- to gain more experience / better reputation
|
||||
- more and more commercial entities recognize the value of FOSS
|
||||
- contributions to existing projects
|
||||
- start of new projects
|
||||
- contracting consultants and FOSS companies for implementation
|
||||
of missing features
|
||||
- experienced end-users
|
||||
- independent consultants
|
||||
- academic institutions (e.g. exim, cyrus)
|
||||
- mixed FOSS / proprietary companies (like Astaro)
|
||||
- use FOSS as foundation for their proprietary solutions
|
||||
- have a vital need for a reliable and up-to-date foundation,
|
||||
thus contribute back to and/or fund FOSS
|
||||
|
||||
- development process, communication
|
||||
- everybody who agrees to the license can contribute code
|
||||
- project is usually started by a single developer or a small group
|
||||
- different actors:
|
||||
- maintainer: official person to maintain the code, responsible
|
||||
- core team: small group of leaders behind the project
|
||||
- developers: people who write code on a regular basis
|
||||
- contributors: people who contribute a single feature or a bug
|
||||
fix from time to time
|
||||
- users: people who use the software, often organized on
|
||||
mailinglists, newsgroups, user groups, ..
|
||||
- main communication medium is mailing lists
|
||||
- every developer can be contacted directly via email
|
||||
- leaders/managers are people with the best technical skills, unlike the 'commercial world' where you need certain diploma, connections, ...
|
||||
- communication is random. no manager <-> manager talk about technical
|
||||
stuff they don't understand
|
|
@ -0,0 +1,185 @@
|
|||
%include "default.mgp"
|
||||
%default 1 bgrad
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
%nodefault
|
||||
%back "blue"
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
What is Open Source / Free Software ?
|
||||
|
||||
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <hwelte@astaro.com>
|
||||
Harald Welte <laforge@gnumonks.org>
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
What is Free / Open Source Software (FOSS)
|
||||
Contents
|
||||
|
||||
|
||||
The traditional (proprietary) software model
|
||||
The Free / Open Source software model
|
||||
Important Free / Open Source software licenses
|
||||
Difference Free Software / Open Source
|
||||
History of Free / Open Source software
|
||||
Who is behind FOSS?
|
||||
Development Process
|
||||
Thanks
|
||||
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
What is Free / Open Source Software (FOSS)
|
||||
The traditional (proprietary) software model
|
||||
|
||||
traditional software model
|
||||
product-oriented
|
||||
vendor finances development of software
|
||||
business model of software based on secret source code
|
||||
same copy of software object code is sold under a very restrictive license
|
||||
license fees refinance cost of development
|
||||
enforcement of restrictive license guarantees revenue
|
||||
advantages
|
||||
proven business model
|
||||
disadvantage
|
||||
vendor has to develop everything on his own or buy licenses of 3rd party software
|
||||
less flexibility for the customer
|
||||
does the customer trust the 'black box' you are selling?
|
||||
if vendor goes out of business, no bugfixes/updates
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
What is Free / Open Source Software (FOSS)
|
||||
The free / open source software model
|
||||
|
||||
Open Source / Free Software model
|
||||
service based
|
||||
individual parties contribute code parts
|
||||
software is distributed for free
|
||||
software source code is distributed under very permissive license
|
||||
service / support / customization refinance development
|
||||
advantages
|
||||
vast amount of available FOSS can be used as foundation for own products
|
||||
source code is available for peer review
|
||||
bug fixes for free, people just send you patches
|
||||
new features implemented by your users!
|
||||
disadvantage
|
||||
business model has yet to prove scalability
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
What is Free / Open Source Software (FOSS)
|
||||
Difference Free Software / Open Source
|
||||
|
||||
difference free software / open source
|
||||
free software
|
||||
term 'free software' (free as in freedom, not beer) introduced by Richard Stallman / FSF 1984.
|
||||
focus on political/ethical/philosophical freedom
|
||||
open source
|
||||
term 'open source' software (OSS) introduced by OSI in 1998
|
||||
focus on technological advantage by means of source review
|
||||
most FOSS licenses match both definitions, OSS less restrictive
|
||||
FOSS is _not_ to be mistaken for freeware / shareware!
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
What is Free / Open Source Software (FOSS)
|
||||
Important FOSS licenses
|
||||
|
||||
|
||||
important free / open source license
|
||||
BSD (Berkeley Software Distribution) style license
|
||||
permits any use of the sourcecode as long as copyright notice remains
|
||||
GPL (GNU General Public License)
|
||||
source for resulting binary has to be provided
|
||||
ensures that derivatives of free software are still free
|
||||
LGPL (GNU Lesser General Public License)
|
||||
permits linking with non-gpl code (mainly used for libraries)
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
What is Free / Open Source Software (FOSS)
|
||||
History of Free / Open Source Software
|
||||
|
||||
history of free / open source software
|
||||
initially software always for free in source (e.g. IBM S/360)
|
||||
as hardware gets less expensive, companies start to license software for money
|
||||
some people (Stallman et al.) didn't want to give up the freedom they were used to.
|
||||
1983: GNU project is founded, goal: Implementation of a free UNIX-like operating system
|
||||
1984: Free Software Foundation is established as not-for-profit legal entity behind the GNU project
|
||||
1991: Linus Torvalds releases the first version of the Linux Kernel under the GNU GPL license. Together with the other parts from the GNU project and others, a 100% free operating system is available
|
||||
1994-1999: FOSS is increasingly recognized as reliable, stable alternative to proprietary software, esp. in the server + networking market
|
||||
2000-2003: FOSS is increasingly considered as an alternative on the desktop (see recent decision by Munich city administration, respective laws in latin america, ...)
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
What is Free / Open Source Software (FOSS)
|
||||
Who is behind FOSS?
|
||||
|
||||
individuals
|
||||
computer enthusiasts motivated by
|
||||
fight: david <-> goliath
|
||||
ability to show how poorly implemented most proprietary software is
|
||||
ability to gain more experience / better reputation
|
||||
experienced end-users
|
||||
independent consultants
|
||||
looking for a solution to a particular problem and already have 95% by using existing FOSS
|
||||
organizations
|
||||
commercial entities who recognize the value of FOSS
|
||||
contributions to existing projects
|
||||
start of new projects
|
||||
contracting consultants and FOSS companies for implementation of missing features
|
||||
mixed FOSS / proprietary companies (like Astaro)
|
||||
use FOSS as foundation for their proprietary solutions
|
||||
have a vital need for a reliable and up-to-date foundation, thus contribute back to and/or fund FOSS
|
||||
academic institutions (e.g. exim, cyrus)
|
||||
are traditionally involved in the exchange of research results. Why treat software differently?
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
What is Free / Open Source Software (FOSS)
|
||||
Development Process
|
||||
|
||||
development process, communication
|
||||
everybody who agrees to the license can contribute code
|
||||
project is usually started by a single developer or a small group
|
||||
different actors in development process
|
||||
maintainer: official person to maintain the code, responsible
|
||||
core team: small group of leaders behind the project
|
||||
developers: people who write code on a regular basis
|
||||
contributors: people who contribute a single feature or a bug fix from time to time
|
||||
users: people who use the software, often organized on mailinglists, newsgroups, user groups, ..
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
What is Free / Open Source Software (FOSS)
|
||||
Development Process
|
||||
|
||||
main communication medium is mailing lists
|
||||
every developer can be contacted directly via email
|
||||
leaders/managers are people with the best technical skills, unlike the 'commercial world' where you need certain diploma, connections, ...
|
||||
communication is random. no manager <-> manager talk about technical stuff they don't understand
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
What is Free / Open Source Software (FOSS)
|
||||
Thanks
|
||||
|
||||
in the name of the netfilter/iptables project, thanks to Astaro for funding
|
||||
particular tasks on my schedule
|
||||
equipment (dual Opteron below my desk)
|
||||
my travel expenses to many FOSS conferences
|
||||
the netfilter developer workshop in August 2003 (Budapest, HU)
|
||||
|
||||
|
|
@ -0,0 +1,281 @@
|
|||
%include "default.mgp"
|
||||
%default 1 bgrad
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
%nodefault
|
||||
%back "blue"
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
Firewalls, IPsec and Linux
|
||||
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <laforge@netfilter.org>
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalls, IPsec and Linux
|
||||
Contents
|
||||
|
||||
|
||||
Introduction
|
||||
Highly Scalable Linux Network Stack
|
||||
Netfilter Hooks
|
||||
Packet selection based on IP Tables
|
||||
The Connection Tracking Subsystem
|
||||
The NAT Subsystem
|
||||
IPsec with Free S/WAN
|
||||
IPsec with Kernel 2.6.x
|
||||
Cipe, vtun, openvpn and others
|
||||
Traffic Shaping, QoS, Policy Routing
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalls, IPsec and Linux
|
||||
Introduction
|
||||
|
||||
What this is:
|
||||
A broad overview of the advanced Linux networking features
|
||||
Intended for a network-savvy audience that has little Linux background
|
||||
|
||||
What this presentation is not:
|
||||
A tutorial on how to use iptables, tc, iproute2, brctl
|
||||
An introduction into the cool code we write every day ;)
|
||||
|
||||
It will try to show you what you can do with Linux networking, not how.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalls, IPsec and Linux
|
||||
Introduction
|
||||
|
||||
Linux and Networking
|
||||
Linux is a true child of the Internet
|
||||
Early adopters: ISPs, universities
|
||||
Lots of work went into a highly scalable network stack
|
||||
Not only for client/server, but also for routers
|
||||
Features unheard of in other OSes
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalls, IPsec and Linux
|
||||
Introduction
|
||||
|
||||
Did you know that a stock 2.6.5 Linux kernel can provide
|
||||
|
||||
a stateful packet filter ?
|
||||
fully symmetric NA(P)T ?
|
||||
policy routing ?
|
||||
QoS / traffic shaping ?
|
||||
IPv6 firewalling ?
|
||||
packet filtering, NA(P)T on a bridge ?
|
||||
layer 2 (mac) address translation ?
|
||||
|
||||
If not, chances are high that this presentation will tell you something new.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalls, IPsec and Linux
|
||||
Netfilter Hooks
|
||||
|
||||
What is netfilter?
|
||||
|
||||
System of callback functions within network stack
|
||||
Callback function to be called for every packet traversing certain point (hook) within network stack
|
||||
Protocol independent framework
|
||||
Hooks in layer 3 stacks (IPv4, IPv6, DECnet, ARP)
|
||||
Multiple kernel modules can register with each of the hooks
|
||||
|
||||
Traditional packet filtering, NAT, ... is implemented on top of this framework
|
||||
|
||||
Can be used for other stuff interfacing with the core network stack, like DECnet routing daemon.
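The callback system can be sketched in userspace C as a per-hook list of functions ordered by priority. Everything here is hypothetical naming (struct hook_point, hook_register, hook_run and the two sample callbacks); the sketch assumes that lower priority values run first and that the first DROP verdict ends the traversal, and it leaves out the other verdicts such as queueing to userspace.

```c
#include <stddef.h>

#define VERDICT_DROP   0
#define VERDICT_ACCEPT 1
#define MAX_HOOKS      8

typedef int (*hook_fn)(const void *pkt);

struct hook_entry {
    hook_fn fn;
    int priority;
};

/* One hook: a list of callbacks kept sorted by ascending priority. */
struct hook_point {
    struct hook_entry entries[MAX_HOOKS];
    size_t count;
};

/* Insert a callback at its priority position; returns -1 when full. */
static int hook_register(struct hook_point *hp, hook_fn fn, int priority)
{
    size_t i;

    if (hp->count == MAX_HOOKS)
        return -1;
    i = hp->count;
    while (i > 0 && hp->entries[i - 1].priority > priority) {
        hp->entries[i] = hp->entries[i - 1];   /* shift later entries up */
        i--;
    }
    hp->entries[i].fn = fn;
    hp->entries[i].priority = priority;
    hp->count++;
    return 0;
}

/* Call every registered callback in order; the first DROP wins. */
static int hook_run(const struct hook_point *hp, const void *pkt)
{
    size_t i;

    for (i = 0; i < hp->count; i++)
        if (hp->entries[i].fn(pkt) == VERDICT_DROP)
            return VERDICT_DROP;
    return VERDICT_ACCEPT;
}

/* Two sample callbacks for trying the sketch out. */
static int calls;

static int count_and_accept(const void *pkt)
{
    (void)pkt;
    calls++;
    return VERDICT_ACCEPT;
}

static int drop_all(const void *pkt)
{
    (void)pkt;
    return VERDICT_DROP;
}
```

Registering drop_all with a lower priority value than count_and_accept means the counting callback is never reached, which mirrors how an earlier module on a hook can hide packets from later ones.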
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalls, IPsec and Linux
|
||||
IP tables
|
||||
|
||||
Packet selection using IP tables
|
||||
|
||||
The kernel provides generic IP tables support
|
||||
|
||||
Each kernel module may create its own IP table
|
||||
|
||||
The three major parts of 2.4 firewalling subsystem are implemented using IP tables
|
||||
Packet filtering table 'filter'
|
||||
NAT table 'nat'
|
||||
Packet mangling table 'mangle'
|
||||
|
||||
Could potentially be used for other stuff, e.g. IPsec SPDB
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalls, IPsec and Linux
|
||||
IP Tables
|
||||
|
||||
Managing chains and tables
|
||||
|
||||
An IP table consists of multiple chains
|
||||
A chain consists of a list of rules
|
||||
Every single rule in a chain consists of
|
||||
match[es] (rule executed if all matches true)
|
||||
target (what to do if the rule is matched)
|
||||
|
||||
%size 4
|
||||
matches and targets can either be builtin or implemented as kernel modules
|
||||
|
||||
%size 5
|
||||
The userspace tool iptables is used to control IP tables
|
||||
handles all different kinds of IP tables
|
||||
supports a plugin/shlib interface for target/match specific options
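The table / chain / rule / match / target structure described above can be sketched as a small evaluator. This is a userspace illustration with hypothetical names (struct rule, struct match, chain_eval, match_proto, match_dport); it assumes only terminating ACCEPT/DROP targets and a chain policy, and leaves out user-defined chains and non-terminating targets.

```c
#include <stddef.h>
#include <stdint.h>

struct packet {
    uint8_t  proto;   /* layer 4 protocol number */
    uint16_t dport;   /* destination port */
};

typedef int (*match_fn)(const struct packet *p, const void *arg);

struct match {
    match_fn fn;       /* returns non-zero if the packet matches */
    const void *arg;   /* match-specific data */
};

enum target { TARGET_ACCEPT, TARGET_DROP };

struct rule {
    const struct match *matches;
    size_t nmatches;
    enum target target;
};

static int match_proto(const struct packet *p, const void *arg)
{
    return p->proto == *(const uint8_t *)arg;
}

static int match_dport(const struct packet *p, const void *arg)
{
    return p->dport == *(const uint16_t *)arg;
}

/*
 * Walk the chain in order. The first rule whose matches are all true
 * decides the packet's fate; if no rule applies, the policy does.
 */
static enum target chain_eval(const struct rule *rules, size_t nrules,
                              const struct packet *p, enum target policy)
{
    size_t r, m;

    for (r = 0; r < nrules; r++) {
        for (m = 0; m < rules[r].nmatches; m++)
            if (!rules[r].matches[m].fn(p, rules[r].matches[m].arg))
                break;
        if (m == rules[r].nmatches)   /* every match was true */
            return rules[r].target;
    }
    return policy;
}
```

A rule with zero matches applies to every packet, which corresponds to a rule given without any match options.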
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalls, IPsec and Linux
|
||||
Connection Tracking Subsystem
|
||||
|
||||
Connection tracking...
|
||||
implemented separately from NAT
|
||||
enables stateful filtering
|
||||
protocol modules (currently TCP/UDP/ICMP/GRE/SCTP)
|
||||
application helpers (currently FTP,IRC,H.323,talk,SNMP,RTSP)
|
||||
does _NOT_ filter packets itself
|
||||
can be utilized by iptables using the 'state' match
|
||||
is used by NAT Subsystem
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalls, IPsec and Linux
|
||||
Network Address Translation
|
||||
|
||||
Network Address Translation
|
||||
|
||||
Previous Linux Kernels only implemented one special case of NAT: Masquerading
|
||||
Linux 2.4.x / 2.6.x can do any kind of NAT.
|
||||
NAT subsystem implemented on top of netfilter, iptables and conntrack
|
||||
Following targets available within 'nat' Table
|
||||
SNAT changes the packet's source while passing NF_IP_POST_ROUTING
|
||||
DNAT changes the packet's destination while passing NF_IP_PRE_ROUTING
|
||||
MASQUERADE is a special case of SNAT
|
||||
REDIRECT is a special case of DNAT
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalls, IPsec and Linux
|
||||
Packet Mangling
|
||||
|
||||
Purpose of mangle table
|
||||
packet manipulation except address manipulation
|
||||
Targets specific to the 'mangle' table:
|
||||
DSCP - manipulate DSCP field
|
||||
IPV4OPTSSTRIP - strip IPv4 options
|
||||
MARK - change the nfmark field of the skb
|
||||
TCPMSS - set TCP MSS option
|
||||
TOS - manipulate the TOS bits
|
||||
TTL - set / increase / decrease TTL field
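What a mangle target such as TTL has to do to a packet can be sketched in userspace C: change the header field, then repair the IPv4 header checksum. The function names are hypothetical, the header is assumed to be 20 bytes without options, and the checksum is recomputed from scratch for clarity, whereas an in-kernel implementation may update it incrementally.

```c
#include <stddef.h>
#include <stdint.h>

/* Internet checksum: one's-complement sum over big-endian 16-bit words. */
static uint16_t ip_checksum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((hdr[i] << 8) | hdr[i + 1]);
    while (sum >> 16)                       /* fold carries back in */
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Set the TTL of a 20-byte IPv4 header and refresh its checksum. */
static void ipv4_set_ttl(uint8_t hdr[20], uint8_t ttl)
{
    uint16_t csum;

    hdr[8] = ttl;              /* TTL lives at byte offset 8 */
    hdr[10] = 0;               /* checksum field must be zero ... */
    hdr[11] = 0;               /* ... while the sum is computed */
    csum = ip_checksum(hdr, 20);
    hdr[10] = (uint8_t)(csum >> 8);
    hdr[11] = (uint8_t)(csum & 0xFF);
}
```

Running ip_checksum() over a header that already contains a correct checksum yields 0, which gives a simple way to verify the result.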
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalls, IPsec and Linux
|
||||
Linux Bridging
|
||||
|
||||
Bridging (brctl)
|
||||
Includes support for Spanning Tree
|
||||
Fully supports packet filtering and NAT (!) on a bridge
|
||||
Can also filter and translate layer 2 MAC addresses
|
||||
Can implement a 'brouter' (bridge certain traffic, route other)
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalls, IPsec and Linux
|
||||
Linux Policy Routing
|
||||
|
||||
Policy Routing (iproute2)
|
||||
Allows routing decisions on arbitrary information
|
||||
Provides up to 255 different routing tables within one system
|
||||
By combining via nfmark with iptables, any matches of the packet filter can be used for the routing decision
|
||||
Very useful in complex setups with multiple links (e.g. multiple DSL uplinks with dynamic addresses, asymmetric routing, ...)
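The combination of nfmark and policy routing boils down to a lookup from the packet's mark to a routing table. A minimal userspace sketch, assuming hypothetical names (struct mark_rule, select_table) and rules that match on the mark only; real policy rules also carry priorities and further selectors.

```c
#include <stddef.h>
#include <stdint.h>

#define TABLE_MAIN 254   /* conventional id of the 'main' routing table */

/* One policy rule: packets carrying this mark use this routing table. */
struct mark_rule {
    uint32_t fwmark;
    unsigned table;
};

/* Pick the routing table for a packet based on its nfmark. */
static unsigned select_table(const struct mark_rule *rules, size_t n,
                             uint32_t nfmark)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (rules[i].fwmark == nfmark)
            return rules[i].table;
    return TABLE_MAIN;    /* unmarked or unknown marks fall through */
}
```

With two uplinks, for instance, the packet filter could mark interactive traffic with 1 and bulk traffic with 2, and two rules would then send each class to a table whose default route points at a different uplink.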
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalls, IPsec and Linux
|
||||
Linux Traffic Shaping
|
||||
|
||||
Traffic Control (tc)
|
||||
Framework for lots of algorithms like RED,SFQ,TBF,CBQ,CSZ,GRED,HTB
|
||||
Very granular control, especially for very low bandwidth links
|
||||
Present since Linux 2.2.x but still not used widely
|
||||
Lack of documentation, but situation is improving (www.lartc.org)
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalls, IPsec and Linux
|
||||
FreeS/WAN
|
||||
|
||||
FreeS/WAN
|
||||
Was a politically motivated effort to provide IPsec for Linux 2.0+
|
||||
Goal was to encrypt as much Internet Traffic as possible
|
||||
Software architecture didn't fit very well with Linux 2.4/2.6 network stack
|
||||
Project has been shut down; however, Openswan continues support
|
||||
Is in widespread production use and has received a lot of testing
|
||||
Political motivation prevented any U.S. citizen from contributing code
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalls, IPsec and Linux
|
||||
Linux 2.6.x IPsec
|
||||
|
||||
Linux 2.6.x IPsec
|
||||
Linux networking gods disapproved of the FreeS/WAN political restrictions and software design
|
||||
Thus, they decided to write their own IPsec stack
|
||||
Result is in the stock 2.6.x kernel series
|
||||
Offers complete support for transport and tunnel mode
|
||||
Can be used with FreeSWAN (pluto) or KAME (isakmpd) userspace
|
||||
Remaining problems
|
||||
No integration with hardware crypto accelerators yet
|
||||
No implementation of NAT traversal yet
|
||||
Interaction with iptable_nat still has to be sorted out
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalls, IPsec and Linux
|
||||
cipe, vtun, openvpn and others
|
||||
|
||||
Other VPN protocols/programs
|
||||
Evolved as Linux-specific VPN implementations since the Linux kernel was lacking stock IPsec support for a long time
|
||||
Are totally incompatible with IPsec and only compatible with themselves
|
||||
Are of questionable security (at least in case of cipe, vtun)
|
||||
Are mostly userspace implementations
|
||||
Are way easier to configure
|
||||
Can provide layer 2 tunnels to route (or bridge!) all kinds of protocols
|
||||
openvpn with X.509 certificates is a very clean and easy solution for building strong VPN tunnels between two Linux gateways
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
Firewalls, IPsec and Linux
|
||||
Thanks
|
||||
|
||||
Thanks to
|
||||
the BBS scene, Z-Netz, FIDO, ...
|
||||
for heavily increasing my computer usage in 1992
|
||||
KNF (http://www.franken.de/)
|
||||
for bringing me in touch with the internet as early as 1994
|
||||
for providing a playground for technical people
|
||||
for telling me about the existence of Linux!
|
||||
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
|
||||
for implementing (one of?) the world's best TCP/IP stacks
|
||||
Paul 'Rusty' Russell
|
||||
for starting the netfilter/iptables project
|
||||
for trusting me to maintain it today
|
||||
Astaro AG
|
||||
for sponsoring parts of my netfilter work
|
||||
%size 3
|
||||
The slides and the accompanying paper for this presentation are available at http://www.gnumonks.org/
|
||||
%size 3
|
Binary file not shown.
Binary file not shown.
|
@ -0,0 +1,21 @@
|
|||
Harald Welte is the head of the Netfilter Core Team and plays a leading role in the development and maintenance of the netfilter/iptables packet filter.
|
||||
|
||||
His focus within the computing world has always been on
|
||||
networking. For example, his reason for getting involved with Linux in 1994
|
||||
arose from the task of setting up a UUCP<->ZConnect<->FIDO gateway.
|
||||
|
||||
In the little time left to him today besides netfilter/iptables, he writes peculiar documents such as the UUCP-over-SSL-HOWTO.
|
||||
|
||||
Since 1997 he has worked as an independent IT consultant and developer on
|
||||
numerous projects for a wide variety of companies (from banks to
|
||||
computer hardware manufacturers).
|
||||
|
||||
In 2001 he accepted an offer to work for the Brazilian
|
||||
Linux distributor in Curitiba (Brazil).
|
||||
|
||||
Since February 2002 his work on the netfilter/iptables project has been supported by
|
||||
sponsorship from Astaro AG. Besides this sponsorship, he continues
|
||||
to work as a freelance consultant and developer.
|
||||
|
||||
Harald has lived in Berlin since November 2002.
|
||||
|
|
@ -0,0 +1,30 @@
|
|||
Legal Enforcement of the GPL
|
||||
|
||||
More and more companies are using Linux and other GPL-licensed software in their
|
||||
products, particularly in the area of network appliances such as routers,
|
||||
NAT gateways and 802.11 access points.
|
||||
|
||||
On the one hand, this can be regarded as a great success for Free Software.
|
||||
On the other hand, there is unfortunately also a downside: quite a few of these
|
||||
companies pay no or insufficient attention to the GPL
|
||||
license terms.
|
||||
|
||||
The netfilter/iptables project has therefore made it its mission to demand
|
||||
full compliance with the GPL license terms from the companies concerned
|
||||
in all known cases, if necessary in court.
|
||||
|
||||
These efforts have been under way since December 2003, with success in every case. The
|
||||
result is 12 out-of-court settlements and one preliminary
|
||||
injunction, which also withstood the opposition proceedings.
|
||||
|
||||
The list of companies concerned consists almost exclusively of well-known
|
||||
names such as Siemens, Asus and Belkin.
|
||||
|
||||
The author will give an overview of this successful GPL enforcement
|
||||
within German jurisdiction. He will also
|
||||
talk about exactly which conditions must be met in order to
|
||||
make software distribution GPL-compliant.
|
||||
|
||||
In addition, he would like to give some recommendations to authors of Free Software
|
||||
on how they can help in advance of a possible later enforcement of their rights
|
||||
by taking concrete measures during development.
|
|
@ -0,0 +1,253 @@
|
|||
%include "default.mgp"
|
||||
%default 1 bgrad
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
%nodefault
|
||||
%back "blue"
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
Enforcing the GNU GPL
|
||||
Copyright helps Copyleft
|
||||
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <laforge@netfilter.org>
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
Contents
|
||||
|
||||
|
||||
Introduction
|
||||
|
||||
The GNU GPL Revisited
|
||||
Motivations for licensing under the GPL
|
||||
Enforcing the GNU GPL
|
||||
|
||||
Thanks
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
Introduction
|
||||
|
||||
Who is speaking to you?
|
||||
an independent Free Software developer
|
||||
who has earned his living from Free Software since 1997
|
||||
who is one of the authors of the Linux kernel firewall system called netfilter/iptables
|
||||
who IS NOT A LAWYER, although this presentation is the result of six months of dealing with lawyers on the GPL
|
||||
|
||||
Why is he speaking to you?
|
||||
because he became aware of copyright (copyleft?) infringement and took legal action within German jurisdiction
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
What is copyrightable?
|
||||
|
||||
The GNU GPL is a copyright license, and thus only covers copyrighted code
|
||||
Not everything is copyrightable (German: Schoepfungshoehe)
|
||||
Small bugfixes are not copyrightable (similar to typo-fixes in a book)
|
||||
As soon as the programmer has a choice in the implementation, there is significant indication of a copyrightable work
|
||||
Choice in algorithm, not in formal representation.
|
||||
Apparently, the level for copyrightable works is relatively low.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
The GNU GPL Revisited
|
||||
|
||||
Revisiting the GNU General Public License
|
||||
|
||||
Regulates distribution of copyrighted code, not usage
|
||||
Allows distribution of source code and modified source code
|
||||
Allows distribution of binaries or modified binaries, if
|
||||
The license itself is mentioned
|
||||
A copy of the license accompanies every copy
|
||||
The complete source code is either
|
||||
included with the copy
|
||||
made available to any 3rd party
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
Complete Source Code
|
||||
|
||||
%size 3
|
||||
"... complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable."
|
||||
|
||||
Our interpretation of this is:
|
||||
Source Code
|
||||
Makefiles
|
||||
Tools for generating the firmware binary from the source
|
||||
(even if they are technically not 'scripts')
|
||||
General Rule:
|
||||
Intent of License is to enable user to run modified versions of the program. They need to be enabled to do so.
|
||||
Result: Signing binaries and only accepting signed versions without providing a signature key is not acceptable!
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
Derivative Works
|
||||
|
||||
What is a derivative work?
|
||||
Not dependent on any particular kind of technology (static/dynamic linking, dlopen, whatever)
|
||||
Even though the modification can itself be a copyrightable work, the combination with GPL-licensed code is subject to GPL.
|
||||
No precedent in Germany so far
|
||||
As soon as code is written for a specific non-standard API (such as the iptables plugin API), there is significant indication for a derivative work
|
||||
This position has been successfully enforced out-of-court with two Vendors so far (iptables modules/plugins).
|
||||
Result
|
||||
Position of my lawyers (apparently also of IBM lawyers):
|
||||
In-kernel proprietary code (binary kernel modules) is not compliant
|
||||
Case-by-case analysis required, especially when drivers/filesystems are ported from other OS's.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
Confusion about the GPL
|
||||
|
||||
%size 4
|
||||
Unfortunately, widespread misconceptions about copyright, free software and public domain (even held by the RedHat CEO!) lead people to unknowingly, or even wilfully, benefit from the freedoms of the GPL without fulfilling its obligations.
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
Enforcing the GNU GPL
|
||||
|
||||
Enforcing the GPL
|
||||
GPL violations are nothing new, as GPL licensed software is nothing new.
|
||||
However, the recent Linux boom
|
||||
The FSF pursues GPL violations involving code on which it holds the copyright
|
||||
silently, without public notice
|
||||
in lengthy negotiations
|
||||
During 2003 the "Linksys" case drew a lot of attention
|
||||
Linksys was selling 802.11 WLAN Access Points / Routers
|
||||
Lots of GPL licensed software embedded in the device (including Linux, uClibc, busybox, iptables, ...)
|
||||
FSF-led alliance took the 'quiet' approach and it took about four months until the full source code was released
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
Enforcing the GNU GPL
|
||||
|
||||
The Linksys case
|
||||
Some developers didn't agree with this approach
|
||||
not enough publicity
|
||||
violators don't lose anything by not complying at first and waiting for the FSF
|
||||
a four-month delay is too long for the short product lifecycles in the WLAN world
|
||||
So the netfilter/iptables project started to do their own enforcement in more cases coming up
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
Enforcing the GNU GPL
|
||||
|
||||
Enforcing the GPL
|
||||
chronological order
|
||||
reverse engineering of firmware images
|
||||
sending the infringing organization a warning notice
|
||||
wait for them to sign a statement to cease and desist
|
||||
applying for a preliminary injunction if they don't (max 4 weeks after reverse engineering)
|
||||
|
||||
Success so far
|
||||
amicable agreement with Asus, Belkin, Allnet, Fujitsu-Siemens, Siemens, Securepoint, U.S. Robotics, ...
|
||||
some of which made significant donations to charitable organizations of the free software community
|
||||
preliminary injunction against Sitecom, Sitecom also lost appeals case
|
||||
more settled cases (not public yet)
|
||||
negotiating in more cases
|
||||
public awareness
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
Enforcing the GNU GPL
|
||||
|
||||
Enforcing the GPL
|
||||
remains an important issue for Free Software
|
||||
will start to happen within the court
|
||||
has to be made public in order to raise awareness
|
||||
|
||||
Problems
|
||||
only the copyright holder (in most cases the author) can do it
|
||||
users discovering GPL'd software need to communicate those issues to all copyright holders
|
||||
|
||||
The http://www.gpl-violations.org/ project was started
|
||||
as a platform where users can report alleged violations
|
||||
to verify those violations and inform all copyright holders
|
||||
to inform the public about ongoing enforcement efforts
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GPL enforcement report
|
||||
Cases so far
|
||||
|
||||
Cases so far
|
||||
Allnet GmbH
|
||||
Siemens AG
|
||||
Fujitsu-Siemens Computers GmbH
|
||||
Axis A.B.
|
||||
Securepoint GmbH
|
||||
U.S.Robotics Germany GmbH
|
||||
undisclosed large vendor
|
||||
Belkin Components GmbH
|
||||
Asus GmbH
|
||||
Gateprotect GmbH
|
||||
Sitecom GmbH
|
||||
TomTom B.V.
|
||||
Gigabyte Technologies GmbH
|
||||
D-Link GmbH
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
Make later enforcement easy
|
||||
|
||||
Practical rules for proof by reverse engineering
|
||||
Don't fix typos in error messages and symbol names
|
||||
Leave obscure error messages like 'Rusty needs more caffeine'
|
||||
Make binary contain string of copyright message, not only source
|
||||
Practical rules for potential damages claims
|
||||
Use revision control system
|
||||
Document source of each copyrightable contribution
|
||||
Name+Email address in CVS commit message
|
||||
Consider something like FSFE FLA (Fiduciary License Agreement)
|
||||
Make sure that employers are fine with contributions of their employees
|
||||
If you find out about violation
|
||||
Don't make it public (has to be new/urgent for injunctive relief)
|
||||
Contact lawyer immediately to send warning notice
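The rules for proof by reverse engineering above pay off because distinctive strings survive compilation and can be searched for in a vendor's firmware. The following Python sketch shows the idea under simplifying assumptions: the image is assumed to be already unpacked (real firmware is often compressed), and `find_markers`, `MARKERS`, `fake_image` and the copyright line are hypothetical names and data made up for illustration. Only the caffeine message is taken from the slide.

```python
from typing import Dict, Iterable, List


def find_markers(image: bytes, markers: Iterable[str]) -> Dict[str, List[int]]:
    """Return, for each marker found, every byte offset where it occurs."""
    hits: Dict[str, List[int]] = {}
    for marker in markers:
        needle = marker.encode("ascii")
        offsets: List[int] = []
        pos = image.find(needle)
        while pos != -1:
            offsets.append(pos)
            pos = image.find(needle, pos + 1)  # continue after this hit
        if offsets:
            hits[marker] = offsets
    return hits


# Distinctive strings an author would expect to survive in a binary.
MARKERS = [
    "Rusty needs more caffeine",            # obscure error message from the slide
    "Copyright (C) Example Author",         # hypothetical embedded copyright line
]

# Fabricated stand-in for an unpacked firmware image (not real data).
fake_image = (
    b"\x00\x01"
    + b"Rusty needs more caffeine"
    + b"\xff" * 4
    + b"Copyright (C) Example Author"
    + b"\x00"
)

hits = find_markers(fake_image, MARKERS)
```

A fixed typo or a reworded message would no longer match, which is why the slide advises leaving such strings untouched.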
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
Thanks
|
||||
|
||||
Thanks to
|
||||
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
|
||||
for implementing (one of?) the world's best TCP/IP stacks
|
||||
Paul 'Rusty' Russell
|
||||
for starting the netfilter/iptables project
|
||||
for trusting me to maintain it today
|
||||
Astaro AG
|
||||
for sponsoring parts of my netfilter work
|
||||
Free Software Foundation
|
||||
for the GNU Project
|
||||
for the GNU General Public License
|
||||
%size 3
|
||||
The slides of this presentation are available at http://www.gnumonks.org/
|
||||
%size 3
|
||||
The netfilter homepage http://www.netfilter.org/
|
||||
%size 3
|
||||
The http://www.gpl-violations.org/ project
|
||||
|
||||
|
|
@ -0,0 +1,21 @@
|
|||
Enforcing the GNU GPL - Copyright helps Copyleft
|
||||
|
||||
More and more vendors of various computing devices, especially network-related
|
||||
appliances such as Routers, NAT-Gateways and 802.11 Access Points are using
|
||||
Linux and other GPL licensed free software in their products.
|
||||
|
||||
While the Linux community can look at this as a big success, there is a back
|
||||
side of that coin: A large number of those vendors have no idea about the GPL
|
||||
license terms, and as a result do not fulfill their obligations under the GPL.
|
||||
|
||||
The netfilter/iptables project has started legal proceedings against a number of
|
||||
companies in violation of the GPL since December 2003. Those legal proceedings
|
||||
have been quite successful so far, resulting in a number of amicable agreements and
|
||||
one granted preliminary injunction.
|
||||
|
||||
The speaker will present an overview of his recent successful enforcement of
|
||||
the GNU GPL within German jurisdiction.
|
||||
|
||||
In the end, it seems like the idea of the founding fathers of the GNU GPL
|
||||
works: Guaranteeing Copyleft by using Copyright.
|
||||
|
|
@ -0,0 +1,25 @@
|
|||
Harald Welte is the chairman of the netfilter/iptables core team.
|
||||
|
||||
His main interest in computing has always been networking. In the little time
|
||||
left besides netfilter/iptables related work, he's writing obscure documents
|
||||
like the UUCP over SSL HOWTO. Other kernel-related projects he has been
|
||||
contributing to are user mode linux and the international (crypto) kernel patch.
|
||||
|
||||
He has been working as an independent IT Consultant working on projects for
|
||||
various companies ranging from banks to manufacturers of networking gear.
|
||||
During the year 2001 he was living in Curitiba (Brazil), where he got
|
||||
sponsored for his Linux related work by Conectiva Inc.
|
||||
|
||||
Starting with February 2002, Harald has been contracted part-time by
|
||||
<a href="http://www.astaro.com/">Astaro AG</a>, who are sponsoring him for his
|
||||
current netfilter/iptables work.
|
||||
|
||||
Aside from the Astaro sponsoring, he continues to work as a freelancing
|
||||
kernel developer and network security consultant.
|
||||
|
||||
He licenses his software under the terms of the GNU GPL. He is determined to bring all users, distributors, value added resellers and vendors of netfilter/iptables based products into full compliance with the GPL, even if that means taking legal action.
|
||||
|
||||
|
||||
Harald is living in Berlin, Germany.
|
||||
|
||||
|
|
@ -0,0 +1,228 @@
|
|||
%include "default.mgp"
|
||||
%default 1 bgrad
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
%nodefault
|
||||
%back "blue"
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
Enforcing the GNU GPL
|
||||
Copyright helps Copyleft
|
||||
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <laforge@netfilter.org>
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
Contents
|
||||
|
||||
|
||||
Introduction
|
||||
|
||||
The GNU GPL Revisited
|
||||
Motivations for licensing under the GPL
|
||||
Enforcing the GNU GPL
|
||||
|
||||
Thanks
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
Introduction
|
||||
|
||||
Who is speaking to you?
|
||||
|
||||
an independent Free Software developer
|
||||
who has earned his living from Free Software since 1997
|
||||
who is one of the authors of the Linux kernel firewall system called netfilter/iptables
|
||||
who IS NOT A LAWYER, although this presentation is the result of six months of dealing with lawyers on the GPL
|
||||
|
||||
Why is he speaking to you?
|
||||
|
||||
because he became aware of copyright (copyleft?) infringement and took legal action within German jurisdiction
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
What is copyrightable?
|
||||
|
||||
The GNU GPL is a copyright license, and thus only covers copyrighted code
|
||||
Not everything is copyrightable (German: Schoepfungshoehe)
|
||||
Small bugfixes are not copyrightable (similar to typo-fixes in a book)
|
||||
As soon as the programmer has a choice in the implementation, there is significant indication of a copyrightable work
|
||||
Choice in algorithm, not in formal representation.
|
||||
Apparently, the level for copyrightable works is relatively low.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
The GNU GPL Revisited
|
||||
|
||||
Revisiting the GNU General Public License
|
||||
|
||||
Regulates distribution of copyrighted code, not usage
|
||||
Allows distribution of source code and modified source code
|
||||
Allows distribution of binaries or modified binaries, if
|
||||
The license itself is mentioned
|
||||
A copy of the license accompanies every copy
|
||||
The complete source code is either
|
||||
included with the copy
|
||||
made available to any 3rd party
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
Complete Source Code
|
||||
|
||||
|
||||
%size 3
|
||||
"... complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable."
|
||||
|
||||
Our interpretation of this is:
|
||||
Source Code
|
||||
Makefiles
|
||||
Tools for generating the firmware binary from the source
|
||||
(even if they are technically not 'scripts')
|
||||
General Rule:
|
||||
Intent of License is to enable user to run modified versions of the program. They need to be enabled to do so.
|
||||
Result: Signing binaries and only accepting signed versions without providing a signature key is not acceptable!
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
Derivative Works
|
||||
|
||||
What is a derivative work?
|
||||
Not dependent on any particular kind of technology (static/dynamic linking, dlopen, whatever)
|
||||
Even though the modification can itself be a copyrightable work, the combination with GPL-licensed code is subject to GPL.
|
||||
No precedent in Germany so far
|
||||
As soon as code is written for a specific non-standard API (such as the iptables plugin API), there is significant indication for a derivative work
|
||||
This position has been successfully enforced out-of-court with two Vendors so far (iptables modules/plugins).
|
||||
Result
|
||||
Position of my lawyers and IBM lawyers:
|
||||
In-kernel proprietary code (binary kernel modules) is not compliant
|
||||
Case-by-case analysis required, especially when drivers/filesystems are ported from other OS's.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
Confusion about the GPL
|
||||
|
||||
Unfortunately, widespread misconceptions about copyright, free software and public
|
||||
domain (even held by the RedHat CEO!) lead people to unknowingly, or even wilfully,
|
||||
benefit from the freedoms of the GPL without fulfilling its obligations.
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
Enforcing the GNU GPL
|
||||
|
||||
Enforcing the GPL
|
||||
GPL violations are nothing new, as GPL licensed software is nothing new.
|
||||
However, the recent Linux boom
|
||||
The FSF pursues GPL violations involving code on which it holds the copyright
|
||||
silently, without public notice
|
||||
in lengthy negotiations
|
||||
During 2003 the "Linksys" case drew a lot of attention
|
||||
Linksys was selling 802.11 WLAN Access Points / Routers
|
||||
Lots of GPL licensed software embedded in the device (including Linux, uClibc, busybox, iptables, ...)
|
||||
FSF-led alliance took the 'quiet' approach and it took about four months until the full source code was released
|
||||
Some developers didn't agree with this approach
|
||||
not enough publicity
|
||||
violators don't lose anything by not complying at first and waiting for the FSF
|
||||
a four-month delay is too long for the short product lifecycles in the WLAN world
|
||||
So the netfilter/iptables project started to do their own enforcement in more cases coming up
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
Enforcing the GNU GPL
|
||||
|
||||
Enforcing the GPL
|
||||
chronological order
|
||||
reverse engineering of firmware images
|
||||
sending the infringing organization a warning notice
|
||||
wait for them to sign a statement to cease and desist
|
||||
applying for a preliminary injunction if they don't (max 4 weeks after reverse engineering)
|
||||
|
||||
Success so far
|
||||
amicable agreement with Asus, Belkin, Allnet, Fujitsu-Siemens, Siemens, Securepoint, U.S. Robotics, ...
|
||||
some of which made significant donations to charitable organizations of the free software community
|
||||
preliminary injunction against Sitecom, Sitecom also lost appeals case
|
||||
more settled cases (not public yet)
|
||||
negotiating in more cases
|
||||
public awareness
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
Enforcing the GNU GPL
|
||||
|
||||
Enforcing the GPL
|
||||
remains an important issue for Free Software
|
||||
will start to happen within the court
|
||||
has to be made public in order to raise awareness
|
||||
|
||||
Problems
|
||||
only the copyright holder (in most cases the author) can do it
|
||||
users discovering GPL'd software need to communicate those issues to all copyright holders
|
||||
|
||||
The http://www.gpl-violations.org/ project was started
|
||||
as a platform where users can report alleged violations
|
||||
to verify those violations and inform all copyright holders
|
||||
to inform the public about ongoing enforcement efforts
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
How to make later enforcement easy
|
||||
|
||||
Practical rules for proof by reverse engineering
|
||||
Don't fix typos in error messages and symbol names
|
||||
Leave obscure error messages like 'Rusty needs more caffeine'
|
||||
Make binary contain string of copyright message, not only source
|
||||
Practical rules for potential damages claims
|
||||
Use revision control system
|
||||
Document source of each copyrightable contribution
|
||||
Name+Email address in CVS commit message
|
||||
Consider something like FSFE FLA (Fiduciary License Agreement)
|
||||
Make sure that employers are fine with contributions of their employees
|
||||
If you find out about violation
|
||||
Don't make it public (has to be new/urgent for injunctive relief)
|
||||
Contact lawyer immediately to send warning notice
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
Thanks
|
||||
|
||||
Thanks to
|
||||
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
|
||||
for implementing (one of?) the world's best TCP/IP stacks
|
||||
Paul 'Rusty' Russell
|
||||
for starting the netfilter/iptables project
|
||||
for trusting me to maintain it today
|
||||
Astaro AG
|
||||
for sponsoring parts of my netfilter work
|
||||
Free Software Foundation
|
||||
for the GNU Project
|
||||
for the GNU General Public License
|
||||
%size 3
|
||||
The slides of this presentation are available at http://www.gnumonks.org/
|
||||
%size 3
|
||||
The netfilter homepage http://www.netfilter.org/
|
||||
%size 3
|
||||
The http://www.gpl-violations.org/ project
|
||||
|
||||
|
|
@ -0,0 +1,24 @@
|
|||
Harald Welte is the chairman of the netfilter/iptables core team.
|
||||
|
||||
His main interest in computing has always been networking. In the little time
|
||||
left besides netfilter/iptables related work, he's writing obscure documents
|
||||
like the UUCP over SSL HOWTO. Other kernel-related projects he has been
|
||||
contributing to are user mode linux, the international (crypto) kernel patch, device drivers and the neighbour cache.
|
||||
|
||||
He has been working as an independent IT Consultant working on projects for
|
||||
various companies ranging from banks to manufacturers of networking gear.
|
||||
During the year 2001 he was living in Curitiba (Brazil), where he got
|
||||
sponsored for his Linux related work by Conectiva Inc.
|
||||
|
||||
Starting with February 2002, Harald has been contracted part-time by
|
||||
<a href="http://www.astaro.com/">Astaro AG</a>, who are sponsoring him for his
|
||||
current netfilter/iptables work.
|
||||
|
||||
Aside from the Astaro sponsoring, he continues to work as a freelancing
|
||||
kernel developer and network security consultant.
|
||||
|
||||
He licenses his software under the terms of the GNU GPL. He is determined to bring all users, distributors, value added resellers and vendors of netfilter/iptables based products into full compliance with the GPL, even if that means taking legal action.
|
||||
|
||||
Harald is living in Berlin, Germany.
|
||||
|
||||
|
|
@ -0,0 +1,46 @@
|
|||
21c3-content@cccv.de
|
||||
|
||||
* Name: Full name of speaker
|
||||
|
||||
Harald Welte
|
||||
|
||||
* Bio: Short biography of speaker
|
||||
|
||||
See Attachment 1
|
||||
|
||||
* Contact: E-Mail, phone, instant messaging etc.
|
||||
|
||||
email: laforge@gnumonks.org
|
||||
Phone: +49-30-24033902
|
||||
Fax: +49-30-24033904
|
||||
|
||||
* Title: Name of event or lecture
|
||||
|
||||
Enforcing the GNU GPL
|
||||
|
||||
* Subtitle: Additional title description (a couple of words, optional)
|
||||
|
||||
Copyright helps Copyleft
|
||||
|
||||
* Abstract: An abstract of the event's content (max. 250 letters)
|
||||
|
||||
Linux is used more and more, especially in the embedded market. Unfortunately,
|
||||
a number of vendors do not comply with the GNU GPL. The author has enforced
|
||||
the GPL numerous times in and out of court, and will talk about his experience.
|
||||
|
||||
* Description: A detailed description of the event's content (250 to 500 words)
|
||||
|
||||
See Attachment 2
|
||||
|
||||
* Attachments: more information
|
||||
o Links to background information
|
||||
|
||||
http://www.gpl-violations.org/
|
||||
http://www.netfilter.org/licensing.html
|
||||
http://gnumonks.org/~laforge/weblog/linux/gpl-violations/
|
||||
|
||||
o Links to information on the lecture itself
|
||||
o Slides, Paper in PDF or other formats
|
||||
|
||||
Not yet available.
|
||||
|
|
@ -0,0 +1,29 @@
|
|||
Enforcing the GNU GPL - Copyright helps Copyleft
|
||||
|
||||
More and more vendors of various computing devices, especially network-related
|
||||
appliances such as Routers, NAT-Gateways and 802.11 Access Points are using
|
||||
Linux and other GPL licensed free software in their products.
|
||||
|
||||
While the Linux community can look at this as a big success, there is a flip
|
||||
side of that coin: A large number of those vendors have no idea about the GPL
|
||||
license terms, and as a result do not fulfill their obligations under the GPL.
|
||||
|
||||
The netfilter/iptables project has started legal proceedings against a number of
|
||||
companies in violation of the GPL since December 2003. Those legal proceedings
|
||||
have been quite successful so far, resulting in twelve amicable agreements and one
|
||||
granted preliminary injunction. The list of companies includes large
|
||||
corporations such as Siemens, Asus and Belkin.
|
||||
|
||||
The speaker will present an overview of his recent successful enforcement of
|
||||
the GNU GPL within German jurisdiction.
|
||||
|
||||
He will go on to speak about what exactly is necessary to fully comply with
|
||||
the GPL, including his legal position on corner cases such as cryptographic
|
||||
signing.
|
||||
|
||||
Drawing on his experience in dealing with the German legal system, he will
|
||||
give some hints to software authors about what they can do in order to make
|
||||
possible later license enforcement easier.
|
||||
|
||||
In the end, it seems like the idea of the founding fathers of the GNU GPL
|
||||
works: Guaranteeing Copyleft by using Copyright.
|
|
@ -0,0 +1,406 @@
|
|||
%include "default.mgp"
|
||||
%default 1 bgrad
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
%nodefault
|
||||
%back "blue"
|
||||
|
||||
%center
|
||||
%size 7
|
||||
|
||||
|
||||
The GPL is not Public Domain
|
||||
|
||||
|
||||
%center
|
||||
%size 4
|
||||
by
|
||||
|
||||
Harald Welte <laforge@gnumonks.org>
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The GNU GPL Revisited
|
||||
Contents 1/2
|
||||
|
||||
|
||||
Introduction
|
||||
What is Copyrightable?
|
||||
Terminology
|
||||
Common FOSS Licenses
|
||||
The GNU GPL Revisited
|
||||
Complete Source Code
|
||||
Derivative Works
|
||||
Non-Public Modifications
|
||||
GPL Violations
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
Contents 2/2
|
||||
|
||||
|
||||
Past GPL Enforcement
|
||||
The Linksys case
|
||||
Typical enforcement timeline
|
||||
Success so far
|
||||
Cases so far
|
||||
Future GPL Enforcement
|
||||
Thanks
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The GNU GPL Revisited
|
||||
Introduction
|
||||
|
||||
|
||||
Who is speaking to you?
|
||||
an independent Free Software developer
|
||||
who has earned his living from Free Software since 1997
|
||||
who is one of the authors of the Linux kernel firewall system called netfilter/iptables
|
||||
who IS NOT A LAWYER, although this presentation is the result of almost a year of dealing with lawyers on the subject of the GPL
|
||||
|
||||
Why is he speaking to you?
|
||||
because he thinks there is too much confusion about copyright and free software licenses. Even Red Hat CEO Matt Szulik stated in an interview that Red Hat puts investments into 'public domain' :(
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The GNU GPL Revisited
|
||||
Disclaimer
|
||||
|
||||
Legal Disclaimer
|
||||
|
||||
All information presented here is provided on an as-is basis
|
||||
There is no warranty for correctness of legal information
|
||||
The author is not a lawyer
|
||||
This does not constitute legal advice
|
||||
The author's experience is limited to German copyright law
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The GNU GPL Revisited
|
||||
What is copyrightable?
|
||||
|
||||
The GNU GPL is a copyright license, and thus only covers copyrighted works
|
||||
Not everything is copyrightable (German: Schoepfungshoehe)
|
||||
Small bugfixes are not copyrightable (similar to typo-fixes in a book)
|
||||
As soon as the programmer has a choice in the implementation, there is significant indication of a copyrightable work
|
||||
Choice in algorithm, not in formal representation
|
||||
Apparently, the level for copyrightable works is relatively low
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The GNU GPL Revisited
|
||||
Terminology
|
||||
|
||||
Public Domain
|
||||
concept where copyright holder abandons all rights
|
||||
same legal status as works whose author died more than 70 years ago (German: Gemeinfreie Werke)
|
||||
Freeware
|
||||
object code, free of cost. No source code
|
||||
Shareware
|
||||
proprietary "Try and Buy" model for object code.
|
||||
Cardware/Beerware/...
|
||||
Freeware that encourages users to send payment in kind
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The GNU GPL Revisited
|
||||
Terminology
|
||||
|
||||
Free Software
|
||||
source code freely distributed
|
||||
must allow redistribution, modification, non-discriminatory use
|
||||
mostly defined by Free Software Foundation
|
||||
Open Source
|
||||
source code freely distributed
|
||||
must allow redistribution, modification, non-discriminatory use
|
||||
defined in the "Open Source Definition" by OSI
|
||||
|
||||
The rest of this document will refer to Free and Open Source Software as FOSS.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The GNU GPL Revisited
|
||||
Common FOSS licenses
|
||||
|
||||
Original BSD License
|
||||
allows redistribution, modification
|
||||
even allows proprietary extensions with no source code offer
|
||||
all docs, advertisement materials have to mention copyright holder
|
||||
Modified BSD License
|
||||
same as "Original BSD License", but no copyright statements required in docs and advertisements
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The GNU GPL Revisited
|
||||
Common FOSS licenses
|
||||
|
||||
GPL (GNU General Public License)
|
||||
allows redistribution, including modified works
|
||||
obliges distributor to supply source code including all modifications
|
||||
usage rights are revoked if license conditions are not met
|
||||
LGPL (GNU Lesser General Public License, formerly GNU Library General Public License)
|
||||
explicitly allows linking of proprietary applications
|
||||
written as special case for libraries (such as glibc)
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The GNU GPL Revisited
|
||||
The GNU GPL Revisited
|
||||
|
||||
Revisiting the GNU General Public License
|
||||
|
||||
Regulates distribution of copyrighted code, not usage
|
||||
Allows distribution of source code and modified source code
|
||||
The license itself is mentioned
|
||||
A copy of the license accompanies every copy
|
||||
Allows distribution of binaries or modified binaries, if
|
||||
The license itself is mentioned
|
||||
A copy of the license accompanies every copy
|
||||
The complete source code is either included with the copy or made available to any third party on request (written offer)
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The GNU GPL Revisited
|
||||
Complete Source Code
|
||||
|
||||
%size 3
|
||||
"... complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable."
|
||||
Our interpretation of this is:
|
||||
Source Code
|
||||
Makefiles
|
||||
Tools for generating the firmware binary from the source
|
||||
(even if they are technically not 'scripts')
|
||||
General Rule:
|
||||
Intent of License is to enable user to run modified versions of the program. They need to be enabled to do so.
|
||||
Result: Signing binaries and only accepting signed versions without providing a signature key is not acceptable!
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The GNU GPL Revisited
|
||||
Derivative Works
|
||||
|
||||
What is a derivative work?
|
||||
Not dependent on any particular kind of technology (static/dynamic linking, dlopen, whatever)
|
||||
Even while the modification can itself be a copyrightable work, the combination with GPL-licensed code is subject to GPL.
|
||||
No precedent in Germany so far
|
||||
As soon as code is written for a specific non-standard API (such as the iptables plugin API), there is significant indication for a derivative work
|
||||
This position has been successfully enforced out of court with two vendors so far (iptables modules/plugins).
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The GNU GPL Revisited
|
||||
Derivative Works
|
||||
|
||||
Position of my lawyer:
|
||||
In-kernel proprietary code (binary kernel modules) is hard to claim as GPL compliant
|
||||
Case-by-case analysis required, especially when drivers/filesystems are ported from other operating systems.
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The GNU GPL Revisited
|
||||
Collected Works
|
||||
|
||||
%size 3
|
||||
"... it is not the intent .. to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works ..."
|
||||
%size 3
|
||||
"... mere aggregation of another work ... with the program on a volume of a storage or distribution medium does not bring the other work under the scope of this license"
|
||||
|
||||
GPL allows "mere aggregation"
|
||||
like a general-purpose Linux distribution (SuSE, Red Hat, ...)
|
||||
|
||||
GPL disallows "collective works"
|
||||
legal grey area
|
||||
tends to depend a lot on jurisdiction
|
||||
no precedent so far
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The GNU GPL Revisited
|
||||
Non-Public modifications
|
||||
|
||||
Non-Public modifications
|
||||
A common misconception is that if you develop code within a corporation, and the code never leaves this corporation, you don't have to ship the source code.
|
||||
However, at least German law would count every transfer beyond a small number of close colleagues as distribution.
|
||||
Therefore, if you don't go for '3a' and include the source code together with the binary, you have to distribute the source code to any third party.
|
||||
Also, as soon as you hand code between two companies, or between a company and a consultant, the code has been distributed.
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
The GNU GPL Revisited
|
||||
GPL Violations
|
||||
|
||||
When do I violate the license?
|
||||
when one or more of the obligations are not fulfilled
|
||||
|
||||
What risk do I take if I violate the license?
|
||||
the GPL automatically revokes any usage right
|
||||
any copyright holder can obtain a preliminary injunction banning distribution of the infringing product
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
Past GPL enforcement
|
||||
|
||||
Past GPL enforcement
|
||||
|
||||
GPL violations are nothing new, as GPL licensed software is nothing new.
|
||||
However, the recent Linux hype has led to GPL licensed software being used more often
|
||||
The FSF pursues GPL violations involving code on which it holds the copyright
|
||||
silently, without public notice
|
||||
in lengthy negotiations
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
The Linksys case
|
||||
|
||||
|
||||
During 2003 the "Linksys" case drew a lot of attention
|
||||
Linksys was selling 802.11 WLAN Access Points / Routers
|
||||
Lots of GPL licensed software embedded in the device (including Linux, uClibc, busybox, iptables, ...)
|
||||
FSF-led alliance took the usual "quiet" approach
|
||||
Linksys bought itself a lot of time
|
||||
Some source code was released two months later
|
||||
About four months later, full GPL compliance was achieved
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
The Linksys case
|
||||
|
||||
|
||||
Some developers didn't agree with this approach
|
||||
not enough publicity
|
||||
violators don't lose anything by not complying at first and waiting for the FSF
|
||||
a four-month delay is too long for the short product lifecycles in the WLAN world
|
||||
The netfilter/iptables project started to do its own enforcement in further cases that came up
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
Enforcement case timeline
|
||||
|
||||
|
||||
In chronological order
|
||||
a user sends us a note that he found our code somewhere
|
||||
reverse engineering of firmware images
|
||||
sending the infringing organization a warning notice
|
||||
wait for them to sign a statement to cease and desist
|
||||
if no statement is signed
|
||||
contract a technical expert to do a study
|
||||
apply for a preliminary injunction
|
||||
if statement was signed
|
||||
try to work out the details
|
||||
grace period for boxes in stock possible
|
||||
try to indicate that a donation would be good PR
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
Success so far
|
||||
|
||||
|
||||
Success so far
|
||||
amicable agreements with a number of companies
|
||||
some of which made significant donations to charitable organizations of the free software community
|
||||
preliminary injunction against Sitecom, Sitecom also lost the appeal
|
||||
more settled cases (not public yet)
|
||||
negotiating in more cases
|
||||
public awareness
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GPL enforcement report
|
||||
Cases so far
|
||||
|
||||
Allnet GmbH
|
||||
Siemens AG
|
||||
Fujitsu-Siemens Computers GmbH
|
||||
Axis A.B.
|
||||
Securepoint GmbH
|
||||
U.S.Robotics Germany GmbH
|
||||
undisclosed large vendor
|
||||
Belkin Components GmbH
|
||||
Asus GmbH
|
||||
Gateprotect GmbH
|
||||
Sitecom GmbH
|
||||
TomTom B.V.
|
||||
Gigabyte Technologies GmbH
|
||||
D-Link GmbH
|
||||
Sun Deutschland GmbH
|
||||
Open-E GmbH
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
Future GPL Enforcement
|
||||
|
||||
GPL Enforcement
|
||||
remains an important issue for Free Software
|
||||
will start to happen in court
|
||||
has to be made public in order to raise awareness
|
||||
|
||||
Problems
|
||||
only the copyright holder (in most cases the author) can do it
|
||||
users discovering GPL'd software need to communicate those issues to all copyright holders
|
||||
|
||||
The http://www.gpl-violations.org/ project was started
|
||||
as a platform where users can report alleged violations
|
||||
to verify those violations and inform all copyright holders
|
||||
to inform the public about ongoing enforcement efforts
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
Make later enforcement easy
|
||||
|
||||
Practical rules for proof by reverse engineering
|
||||
Don't fix typos in error messages and symbol names
|
||||
Leave obscure error messages like 'Rusty needs more caffeine'
|
||||
Make binary contain string of copyright message, not only source
|
||||
Practical rules for potential damages claims
|
||||
Use revision control system
|
||||
Document source of each copyrightable contribution
|
||||
Name+Email address in CVS commit message
|
||||
Consider something like FSFE FLA (Fiduciary License Agreement)
|
||||
Make sure that employers are fine with contributions of their employees
|
||||
If you find out about a violation
|
||||
Don't make it public (has to be new/urgent for injunctive relief)
|
||||
Contact a lawyer immediately to send a warning notice
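
The rules for proof by reverse engineering above rely on distinctive strings surviving in the shipped binary. A minimal sketch of how such strings could be looked for is given below. It assumes the firmware image has already been unpacked or decompressed into raw bytes. The function name find_markers and the sample data are hypothetical and only serve as illustration.

```python
"""Sketch: look for distinctive marker strings in a firmware image."""
from typing import Dict, Iterable, List


def find_markers(image: bytes, markers: Iterable[bytes]) -> Dict[bytes, List[int]]:
    """Return every byte offset at which each marker occurs in the image."""
    hits: Dict[bytes, List[int]] = {}
    for marker in markers:
        offsets = []
        start = image.find(marker)
        while start != -1:
            offsets.append(start)
            # Continue searching one byte after the previous hit.
            start = image.find(marker, start + 1)
        if offsets:
            hits[marker] = offsets
    return hits


if __name__ == "__main__":
    # Hypothetical stand-in for an already decompressed firmware image.
    sample = b"\x00\x01Rusty needs more caffeine\x00(C) 2004 Example Author\x00"
    wanted = [
        b"Rusty needs more caffeine",  # obscure error message left in on purpose
        b"(C) 2004 Example Author",    # copyright string compiled into the binary
        b"not present",
    ]
    for marker, offsets in find_markers(sample, wanted).items():
        print(marker.decode(), offsets)
```

A hit only shows that the string is present in the image. Whether that is sufficient as evidence remains a question for the lawyer and, if needed, the technical expert.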
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%page
|
||||
GNU GPL - Copyright helps Copyleft
|
||||
Thanks
|
||||
|
||||
Thanks to
|
||||
Alan Cox, Alexey Kuznetsov, David Miller, Andi Kleen
|
||||
for implementing (one of?) the world's best TCP/IP stacks
|
||||
Paul 'Rusty' Russell
|
||||
for starting the netfilter/iptables project
|
||||
for trusting me to maintain it today
|
||||
Astaro AG
|
||||
for sponsoring parts of my netfilter work
|
||||
Free Software Foundation
|
||||
for the GNU Project
|
||||
for the GNU General Public License
|
||||
%size 3
|
||||
The slides of this presentation are available at http://www.gnumonks.org/
|
||||
|
||||
Further Reading
|
||||
%size 3
|
||||
The netfilter homepage http://www.netfilter.org/
|
||||
%size 3
|
||||
The http://www.gpl-violations.org/ project
|
||||
|
||||
|
|
@ -0,0 +1,280 @@
|
|||
<?xml version='1.0' encoding='ISO-8859-1'?>
|
||||
|
||||
<!DOCTYPE article PUBLIC '-//OASIS//DTD DocBook XML V4.3//EN' 'http://www.docbook.org/xml/4.3/docbookx.dtd'>
|
||||
|
||||
<article id="gpl-enforcement-ccc2004">
|
||||
|
||||
<articleinfo>
|
||||
<title>Enforcing the GNU GPL - Copyright helps Copyleft</title>
|
||||
<authorgroup>
|
||||
<author>
|
||||
<personname>
|
||||
<firstname>Harald</firstname>
|
||||
<surname>Welte</surname>
|
||||
</personname>
|
||||
<!--
|
||||
<personblurb>Harald Welte</personblurb>
|
||||
<affiliation>
|
||||
<orgname>netfilter core team</orgname>
|
||||
<address>
|
||||
<email>laforge@netfilter.org</email>
|
||||
</address>
|
||||
</affiliation>
|
||||
|
||||
-->
|
||||
<email>laforge@gpl-violations.org</email>
|
||||
</author>
|
||||
</authorgroup>
|
||||
<copyright>
|
||||
<year>2004</year>
|
||||
<holder>Harald Welte &lt;laforge@gpl-violations.org&gt; </holder>
|
||||
</copyright>
|
||||
<date>Dec 01, 2004</date>
|
||||
<edition>1</edition>
|
||||
<orgname>netfilter core team</orgname>
|
||||
<releaseinfo>
|
||||
$Revision: 1.4 $
|
||||
</releaseinfo>
|
||||
|
||||
<abstract>
|
||||
<para>
|
||||
More and more vendors of various computing devices, especially network-related
|
||||
appliances such as Routers, NAT-Gateways and 802.11 Access Points are using
|
||||
Linux and other GPL licensed free software in their products.
|
||||
</para>
|
||||
<para>
|
||||
While the Linux community can look at this as a big success, there is a flip
|
||||
side of that coin: A large number of those vendors have no idea about the GPL
|
||||
license terms, and as a result do not fulfill their obligations under the GPL.
|
||||
</para>
|
||||
<para>
|
||||
The netfilter/iptables project has started legal proceedings against a number of
|
||||
companies in violation of the GPL since December 2003. Those legal proceedings
|
||||
have been quite successful so far, resulting in twelve amicable agreements and one
|
||||
granted preliminary injunction. The list of companies includes large
|
||||
corporations such as Siemens, Asus and Belkin.
|
||||
</para>
|
||||
<para>
|
||||
This paper and the corresponding presentation will give an overview of the
|
||||
author's recent successful enforcement of the GNU GPL within German
|
||||
jurisdiction.
|
||||
</para>
|
||||
<para>
|
||||
The paper will go on to describe what exactly is necessary to fully comply
|
||||
with the GPL, including the author's legal position on corner cases such as
|
||||
cryptographic signing.
|
||||
</para>
|
||||
<para>
|
||||
In the end, it seems like the idea of the founding fathers of the GNU GPL
|
||||
works: Guaranteeing Copyleft by using Copyright.
|
||||
</para>
|
||||
</abstract>
|
||||
|
||||
</articleinfo>
|
||||
|
||||
|
||||
<section>
|
||||
<title>Legal Disclaimer</title>
|
||||
<para>
|
||||
The author of this paper is a software developer, not a lawyer. The content of
|
||||
this paper represents his knowledge after dealing with the legal issues of
|
||||
about 20 GPL violation cases.
|
||||
</para>
|
||||
<para>
|
||||
All information in this paper is presented on an as-is basis. There is no
|
||||
warranty for correctness.
|
||||
</para>
|
||||
<para>
|
||||
The paper does not constitute legal advice, and any details may be specific to German copyright law (UrhG).
|
||||
</para>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>What is copyrightable</title>
|
||||
<para>
|
||||
Since the GNU GPL is a copyright license, it can only cover copyrightable
|
||||
works. The exact definition of what is copyrightable and what is not might vary
|
||||
from jurisdiction to jurisdiction.
|
||||
</para>
|
||||
<para>
|
||||
Software is considered the immaterial result of a creative act, and is treated
|
||||
very much like literary works. It might therefore be helpful to look at the
|
||||
analogy of a printed book.
|
||||
</para>
|
||||
<para>
|
||||
In order for a work to be copyrightable, it has to be non-trivial (German:
|
||||
Schöpfungshöhe). Much like the work of a book's copy editor, changes that just
|
||||
correct spelling mistakes or compiler warnings, or even functional fixes such as
|
||||
fixing a signedness bug or a typecast, are unlikely to be seen as a
|
||||
copyrightable contribution to an existing work.
|
||||
</para>
|
||||
<para>
|
||||
An indication for copyrightability can be the question: Did the author have a
|
||||
choice (e.g. between different algorithms)? As soon as there are multiple ways
|
||||
of getting a particular job done, and the author has to make decisions on which
|
||||
way to go, this is an indication for copyrightability.
|
||||
</para>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>The GNU GPL revisited</title>
|
||||
<para>
|
||||
As a copyright license, the GNU GPL mainly regulates distribution of a
|
||||
copyrighted work, not usage. On the contrary, the GNU GPL does not allow an
|
||||
author to impose any additional restrictions like <quote>must not be used for
|
||||
military purpose</quote>.
|
||||
</para>
|
||||
<para>
|
||||
As a summary, the license allows distribution of the source code (including
|
||||
modifications, if any) if
|
||||
<itemizedlist>
|
||||
<listitem>The GPL license itself is mentioned</listitem>
|
||||
<listitem>A copy of the full license text accompanies every copy</listitem>
|
||||
</itemizedlist>
|
||||
</para>
|
||||
<para>
|
||||
The GPL allows distribution of the object code (including modifications) if
|
||||
<itemizedlist>
|
||||
<listitem>The GPL license itself is mentioned</listitem>
|
||||
<listitem>A copy of the full license text accompanies every copy</listitem>
|
||||
<listitem>The <quote>complete corresponding source code</quote> or a written offer to ship it to any third party is included with every copy</listitem>
|
||||
</itemizedlist>
|
||||
</para>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Complete Source Code</title>
|
||||
<para>
|
||||
The GPL contains a very specific definition of what the term <quote>complete source
|
||||
code</quote> actually means in practice:
|
||||
</para>
|
||||
<quote><para>
|
||||
... complete source code means all the source code for all modules it contains,
|
||||
plus any associated interface definition files, plus the scripts used to
|
||||
control compilation and installation of the executable.
|
||||
</para></quote>
|
||||
<para>
|
||||
The author's interpretation of this (for C programs) is:
|
||||
<itemizedlist>
|
||||
<listitem>source code</listitem>
|
||||
<listitem>Header Files</listitem>
|
||||
<listitem>Makefiles</listitem>
|
||||
<listitem>Tools for installation of a modified binary, even if they are not technically implemented as scripts</listitem>
|
||||
</itemizedlist>
|
||||
</para>
|
||||
<para>
|
||||
The general rule in case of any question is the intent of the license: To
|
||||
enable the user to modify the source code and run modified versions.
|
||||
</para>
|
||||
<para>
|
||||
This brings us to the conclusion that in case of a bundle of hardware and
|
||||
software, the hardware must not be implemented in such a way that it only accepts
|
||||
cryptographically signed software, without providing either the original key,
|
||||
or the option of setting a new key in the hardware.
|
||||
</para>
|
||||
</section>
|
||||
|
||||
|
||||
<section>
|
||||
<title>Derivative Work</title>
|
||||
<para>
|
||||
The question of derivative works is probably the hardest question with regard
|
||||
to the GPL. According to the license text, any derivative work can only be
|
||||
distributed under the GPL, too. However, the definition of a derivative work
|
||||
is left to the legal framework of copyright.
|
||||
</para>
|
||||
<para>
|
||||
The paper's author is convinced that any court decision would not look at the
|
||||
particular technology used to integrate multiple software parts. It is much
|
||||
more a question of how much dependency there is between the two pieces.
|
||||
</para>
|
||||
<para>
|
||||
If a program is written against a specific non-standard API, this can be
|
||||
considered as an indication for a derivative work. If a program is written
|
||||
against standard APIs, and the GPL licensed parts that provide those APIs can
|
||||
be easily exchanged with other [existing] implementations, then this can be considered an indication that it is not a derivative work.
|
||||
</para>
|
||||
<para>
|
||||
Unfortunately there is no precedent on this issue, so it's up to the first
|
||||
court decisions on the issue of derivative works to settle it.
|
||||
</para>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Collective Works</title>
|
||||
<para>
|
||||
<quote>... it is not the intent ... to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works ...</quote>
|
||||
</para>
|
||||
<para>
|
||||
<quote>... mere aggregation of another work ... with the program on a volume of a storage or distribution medium does not bring the other work under the scope of this license</quote>
|
||||
</para>
|
||||
<para>
|
||||
So the GPL allows <quote>mere aggregation</quote>, which is what e.g. the
|
||||
GNU/Linux distributors like Red Hat or SuSE do, when they ship GPL-licensed
|
||||
programs together with a proprietary Macromedia Flash player on one CD or
|
||||
DVD medium.
|
||||
</para>
|
||||
<para>
|
||||
Further research is required to determine what exactly would be a collective
|
||||
work, and how far this is backed by copyright law.
|
||||
</para>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Non-Public Modifications</title>
|
||||
<para>
|
||||
Since the GPL regulates distribution and not use, any modifications that are
|
||||
not distributed in any form do not require offering the source code.
|
||||
</para>
|
||||
<para>
|
||||
Special attention has to be paid to when distribution happens within the legal
|
||||
context.
|
||||
</para>
|
||||
<para>
|
||||
Undoubtedly, as soon as you distribute modifications to a third party, such as
|
||||
a contractor or another company, you are bound by the GPL to either include the
|
||||
full source code, or a written offer. Please note that if you don't include
|
||||
the source code itself, the written offer must be valid for any third party!
|
||||
</para>
|
||||
<para>
|
||||
Interestingly, at least in German copyright law, distribution can also happen
|
||||
within an organization. Apparently, as soon as a copy is distributed to a
|
||||
group larger than a small number of close colleagues whom you know personally,
|
||||
distribution happens - and thus the obligations of the GPL apply.
|
||||
</para>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>GPL Violations</title>
|
||||
<para>
|
||||
The GPL is violated as soon as one or more of the obligations are not fulfilled.</para>
|
||||
<para>
|
||||
In this case, the GPL automatically revokes any right, even the usage right on
|
||||
the original unmodified code. So not only is the distribution infringing, but the mere use is no longer permitted either.
|
||||
</para>
|
||||
<para>
|
||||
This very strong provision is quite common in copyright licenses, especially in
|
||||
the world of proprietary software.
|
||||
</para>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Past GPL Enforcement</title>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>The Linksys Case</title>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Enforcement Case Timeline</title>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Success so far</title>
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<title>Future GPL Enforcement</title>
|
||||
</section>
|
||||
|
||||
</article>
|
||||
|
|
@ -0,0 +1,4 @@
|
|||
Linux is used more and more, especially in the embedded market. Unfortunately,
|
||||
a number of vendors do not comply with the GNU GPL. The author has enforced
|
||||
the GPL numerous times in and out of court, and will talk about his experience.
|
||||
|