strongswan/src/pluto/routing.txt

Routing and Erouting in Pluto
=============================

RCSID $Id: routing.txt,v 1.1 2004/03/15 20:35:29 as Exp $

This is meant as internal documentation for Pluto.  As such, it
presumes some understanding of Pluto's code.

It also describes KLIPS 1 erouting, including details not otherwise
documented.  KLIPS 1 documentation would be better included in KLIPS.

Routing and erouting are complicated enough that the Pluto code needs
a guide.  This document is meant to be that guide.


Mechanisms available to Pluto
-----------------------------

All outbound packets that are to be processed by KLIPS 1 must be
routed to an ipsecN network interface.  Pluto only uses normal routing
(as opposed to "Advanced Routing"), so the selection of packets is
made solely on the basis of the destination address.  (Since the
actual routing commands are in the updown script, they could be
changed by the administrator, but Pluto needs to understand what is
going on, and it currently assumes normal routing is used.)

When an outbound packet hits an ipsecN interface, KLIPS figures out
how to process it by finding an eroute that applies to the source and
destination addresses.  Eroutes are global: they are not specific to a
particular ipsecN interface (routing needs to get the packets to any
ipsecN interface; erouting takes it from there, ignoring issues of
source IP address and nexthop (because nobody knows!)).  If multiple
eroutes apply to the packet, among the ones with the most specific
source subnet, the one with the most specific destination subset is
chosen (RGB thinks).  If no eroute is discovered, KLIPS acts as if it
was covered by a DROP eroute (this is the default behaviour; it can be
changed).  At most one eroute can exist for a particular pair of
client subnets.

There are fundamentally two kinds of eroutes: "shunt" eroutes and ones
that specify that a packet is to be processed by a group of IPSEC SAs.
Shunt eroutes specify what is to be done with the packet.  Remember
that these only apply to outbound packets.

- TRAP: notify Pluto of the packet (presumably to attempt to negotiate
  an appropriate group of IPSEC SAs).  At the same time, KLIPS
  installs a HOLD shunt (see below) for the specific source and
  destination addresses from the packet and retains the packet
  for later reprocessing (KLIPS does not yet implement retention).
  Beware: if the TRAP's subnets both contained a single IP address
  then installing the HOLD would actually delete the TRAP.

- PASS: let the packet through in the clear

- DROP: discard the packet

- REJECT: discard the packet and notify the sender

- HOLD: (automatically created by KLIPS when a TRAP fires) block
  the packet, but retain it.  If there is already a retained
  packet, drop the old one and retain the new.  When the HOLD
  shunt is deleted or replaced, the retained packet is reinjected --
  there might now be a tunnel.  Note that KLIPS doesn't yet
  implement the retention part, so HOLD is really like a DROP.

One consequence of there being only one eroute for a pair of clients
is that KLIPS will only use one SA group for output for this pair,
even though there could be several SA groups that are authorised and
live.  Pluto chooses to make this the youngest such group.


KLIPS lets through in the clear outbound UDP/500 packets that would
otherwise be processed if they originate on this host and meet certain
other conditions.  The actual test is
	source == me
	&& (no_eroute || dest == eroute.dest || isanyaddr(eroute.dest))
	&& port == UDP/500
The idea is that IKE packets between us and a peer should not be
sent through an IPSEC tunnel negotiated between us.  Furthermore,
our shunt eroutes should not apply to our IKE packets (shunt eroutes
will generally have an eroute.dest of 0.0.0.0 or its IPv6 equivalent).

Inbound behaviour is controlled in a quite different way.  KLIPS
processes only those inbound packets of ESP or AH protocol, with a
destination address for this machine's ipsecN interfaces. The
processing is as dictated by the SAs involved.  Unfortunately, the
decapsulated packet's source and destination address are not checked
(part of "inbound policy checking").

To prevent clear packets being accepted, firewall rules must be put in
place.  This has nothing to do with KLIPS, but is nonetheless in
important part of security.  It isn't clear what firewalling makes
sense when Opportunism is allowed.


For routing and firewalling, Pluto invokes the updown script.  Pluto
installs eroutes via extended PF_KEY messages.


Current Pluto Behaviour
-----------------------

Data Structures:

Routes and most eroutes are associated with connections (struct
connection, a potential connection description).  The enum routing_t
field "routing" in struct connection records the state of routing and
erouting for that connection.  The values are:
    RT_UNROUTED,	/* unrouted */
    RT_UNROUTED_HOLD,	/* unrouted, but HOLD shunt installed */
    RT_ROUTED_PROSPECTIVE,	/* routed, and TRAP shunt installed */
    RT_ROUTED_HOLD,	/* routed, and HOLD shunt installed */
    RT_ROUTED_FAILURE,	/* routed, and failure-context shunt installed */
    RT_ROUTED_TUNNEL	/* routed, and erouted to an IPSEC SA group */
Notice that the routing and erouting are not independent: erouting
(except for HOLD) implies that the connection is routed.

Several struct connections may have the same destination subnet.  If
they agree on what the route should be, they can share it -- any of
them may have routing >= RT_ROUTED_PROSPECTIVE.  If they disagree,
they cannot simultaneously be routed.

invariant: for all struct connections c, d:
    (c.that.client == d.that.client
	    && c.routing >= RT_ROUTED_PROSPECTIVE
	    && d.routing >= RT_ROUTED_PROSPECTIVE)
    => c.interface == d.interface && c.this.nexthop == d.this.nexthop

There are two kinds of eroutes: shunt eroutes and ones for an IPSEC SA
Group.  Most eroutes are associated with and are represeented in a
connection.  The exception is that some HOLD and PASS shunts do not
correspond to connections; those are represented in the bare_shunt
table.

An eroute for an IPSEC SA Group is associated with the state object
for that Group.  The existence of such an eroute is also represented
by the "so_serial_t eroute_owner" field in the struct connection.  The
value is the serial number of the state object for the Group.  The
special value SOS_NOBODY means that there is no owner associated with
this connection for the eroute and hence no normal eroute.  At most
one eroute owner may exist for a particular (source subnet,
destination subnet) pair.  A Pluto-managed eroute cannot be associated
with an RT_UNROUTED connection.

invariant: for all struct connection c:
    c.routing == RT_EROUTED_TUNNEL || c.eroute_owner == SOS_NOBODY

invariant: for all struct connections c, d:
    c.this.client == d.this.client && c.that.client == d.that.client
        && &c != &d
    => c.routing == RT_UNROUTED || d.routing == RT_UNROUTED

If no normal eroute is set for a particular (source subnet,
destination subnet) pair for which a connection is routed, then a
shunt eroute would have been installed.  This specifies what should
happen to packets snared by the route.

When Pluto is notified by KLIPS of a packet that has been TRAPped,
there is no connection with which to associate the HOLD.  It is
temporarily held in the "bare_shunt table".  If Opportunism is
attempted but DNS doesn't provide Security Gateway information, Pluto
will replace the HOLD with a PASS shunt.  Since this PASS isn't
associated with a connection, it too will reside in the bare_shunt
table.  If the HOLD can be associated with a connection, it will be
removed from the bare_shunt table and represented in the connection.

There are two contexts for which shunt eroutes are installed by Pluto
for a particular connection.  The first context is with the prospect
of dealing with packets before any negotiation has been attempted.  I
call this context "prospective".  Currently is a TRAP shunt, used to
catch packets for initiate opportunistic negotiation.  In the future,
it might also be used to implement preordained PASS, DROP, or REJECT
rules.

The second context is after a failed negotiation.  I call this context
"failure".  At this point a different kind of shunt eroute is
appropriate.  Depending on policy, it could be PASS, DROP, or REJECT,
but it is unlikely to be TRAP.  The shunt eroute should have a
lifetime (this isn't yet implemented).  When the lifetime expires, the
failure shunt eroute should be replaced by the prospective shunt
eroute.

The kind and duration of a failure shunt eroute should perhaps depend
on the nature of the failure, at least as imperfectly detected by
Pluto.  We haven't looked at this.  In particular, the mapping from
observations to robust respose isn't obvious.

The shunt eroute policies should be a function of the potential
connection.  The failure shunt eroute can be specified for a
particular connection with the flags --pass and --drop in a connection
definition.  There are four combinations, and each has a distinct
meaning.  The failure shunt eroute is incompletely implemented and
cannot be represented in /etc/ipsec.conf.

There is as yet no control over the prospective shunt eroute: it is
always TRAP as far as Pluto is concerned.  This is probably
reasonable: any other fate suggests that no negotiation will be done,
and so a connection definition is inappropriate.  These should be
implemented as manual conns.  There remains the issue of whether Pluto
should be aware of them -- currently it is not.


Routines:

[in kernel.c]

bool do_command(struct connection *c, const char *verb)
    Run the updown script to perform such tasks as installing a route
    and adjust the firewall.

bool could_route(struct connection *c)
    Check to see whether we could route and eroute the connection.
    <- shunt_eroute_connection (to check if --route can be performed)
    <- install_inbound_ipsec_sa (to see if it will be possible
       to (later) install route and eroute the corresponding outbound SA)
    <- install_ipsec_sa (to see if the outbound SA can be routed and erouted)

bool trap_connection(struct connection *c)
    Install a TRAP shunt eroute for this connection.  This implements
    "whack --route", the way an admin can specify that packets for a
    connection should be caught without first bringing it up.

void unroute_connection(struct connection *c)
    Delete any eroute for a connection and unroute it if route isn't shared.
    <- release_connection
    <- whack_handle (for "whack --unroute)

bool eroute_connection(struct connection *c
, ipsec_spi_t spi, unsigned int proto, unsigned int satype
, unsigned int op, const char *opname UNUSED)
    Issue PF_KEY commands to KLIPS to add, replace, or delete an eroute.
    The verb is specified by op and described (for logging) by opname.
    <- assign_hold
    <- sag_eroute
    <- shunt_eroute

bool assign_hold(struct connection *c
, const ip_address *src, const ip_address *dst)
    Take a HOLD from the bare_shunt table and assign it to a connection.
    If the HOLD is broadened (i.e. the connection's source or destination
    subnets contain more than one IP address), this will involve replacing
    the HOLD with a different one.

bool sag_eroute(struct state *st, unsigned op, const char *opname)
    SA Group eroute manipulation.  The SA Group concerned is
    identified with a state object.
    <- route_and_eroute several times

bool shunt_eroute(struct connection *c, unsigned int op, const char *opname)
    shunt eroute manipulation.  Shunt eroutes are associated with
    connections.
    <- unroute_connection
    <- route_and_eroute
    <- delete_ipsec_sa

bool route_and_eroute(struct connection *c, struct state *st)
    Install a route and then a prospective shunt eroute or an SA group
    eroute.  The code assumes that could_route had previously
    given the go-ahead.  Any SA group to be erouted must already
    exist.
    <- shunt_eroute_connection
    <- install_ipsec_sa

void scan_proc_shunts(void)
    Every SHUNT_SCAN_INTERVAL scan /proc/net/ipsec_eroute.
    Delete any PASS eroute in the bare_shunt table that hasn't been used
    within the last SHUNT_PATIENCE seconds.
    For any HOLD for which Pluto hasn't received an ACQUIRE (possibly
    lost due to congestion), act as if an ACQUIRE were received.

[in connection.c]

struct connection *route_owner(struct connection *c, struct connection **erop)
    Find the connection to connection c's peer's client with the
    largest value of .routing.  All other things being equal,
    preference is given to c.  Return NULL if no connection is routed
    at all.  If erop is non-null, sets it to a connection sharing both
    our client subnet and peer's client subnet with the largest value
    of .routing.
    The return value is used to find other connections sharing
    a route.  The value of *erop is used to find other connections
    sharing an eroute.
    <- could_route (to find any conflicting routes or eroutes)
    <- unroute_connection (to find out if our route is still in use
       after this connection is finished with it)
    <- install_inbound_ipsec_sa (to find other IPSEC SAs for the
       same peer clients; when we find them WE KILL THEM; a
       kludge to deal with road warriors reconnecting)
    <- route_and_eroute (to find all the connections from which the
       route or eroute is being stolen)

Uses:

- setting up route & shunt eroute to TRAP packets for opportunism
  (whack --route).  Perhaps also manually designating DROP, REJECT, or
  PASS for certain packets.

  whack_handle() responds to --route; calls route_connection()


- removing same (whack --unroute)

  whack_handle() responds to --unroute; calls unroute_connection()

- installing route & normal eroute for a newly negotiated group of
  outbound IPSEC SAs

  + perhaps an (additional) route is not needed: if the negotiation
    was initiated by a TRAPped outgoing packet, then there must
    already have been a route that got the packet to ipsecN.  Mind
    you, it could have been the wrong N!

  install_ipsec_sa()

- updating a normal eroute when a new group of IPSEC SAs replaces
  an old one due to rekeying.

  install_ipsec_sa()

- replacing an old eroute when a negotiation fails.  But this is
  tricky.  If this was a rekeying, we should just leave the old
  normal eroute be -- it might still work.  Otherwise, this was
  an initial negotiation: we should replace the shunt eroute
  with one appropriate for the failure context.

- when a group of IPSEC SAs dies or is killed, and it had the eroute,
  its normal eroute should be replaced by a shunt eroute.  If there
  was an attempt to replace the group, the replacement is in the
  failure context; otherwise the replacement is in the prospective
  context.