Previously, the client had to propose no wider selectors than the certificate
permits, otherwise the complete CHILD_SA was rejected. However, with IKEv2
we can dynamically narrow the selectors to what the certificate allows. This
makes client and gateway configurations very simple by just proposing 0.0.0.0/0,
narrowed to selectors the client is permitted to route into the network.
This allows a gateway to enforce the addrblock policy on certificates that
actually have the extension only. For (legacy) certificates not having the
extension, traffic selectors are validated/narrowed by other means, most
likely by the configuration.
This way updates to the mediation config are respected and the order in
which configs are configured/loaded does not matter.
The SQL plugin currently maintains the strong relationship between
mediated and mediation connection (we could theoretically change that to a
string too).
The original name is returned in the new "name" attribute.
This fixes an issue with bindings that map VICI messages to
dictionaries. For instance, in roadwarrior scenarios where every
CHILD_SA has the same name only the information of the last CHILD_SA
would end up in the dictionary for that name.
PINs are stored in a "hidden" credential set, so that its shared
secrets are not exposed via VICI. Since they are not explicitly loaded as
shared secrets via VICI a client might consider them as removed secrets and
remove them.
After an interface disappeared we can't remove the policies correctly as
the name doesn't resolve to the previous index anymore.
And making the policies so specific might not provide that much benefit.
To handle the interfaces on the policies correctly would require some
changes to the child-cfg, kernel-interface etc. so they'd take interface
indices directly so we could target the policies correctly even if an
interface disappeared (or reappeared and got a new index).
For table dumps the kernel accepts RTA_PREFSRC to filter the routes, which is
what we do when doing userspace route calculations. For kernel-based route
lookups, however, the RTA_PREFSRC attribute is ignored and we must specify
RTA_SRC for policy based route lookups.
For gateways with many connections, installing routes is often disabled,
as we can use a static route configuration to achieve proper routing with
a single rule. If this is the case, there is no need to dump all routes and
do userspace route lookups, as there is no need to exclude routes we installed
ourself.
Doing kernel-based route lookups is not only faster with may routes, but also
can use the full power of Linux policy based routing; something we can hardly
rebuild in userspace when calculating routes.
When using vici over RPyC and its (awesome) splitbrain, encoding and decoding
strings fails in vici, most likely because of the Monkey-Patch magic splitbrain
uses.
When specifying the implicit UTF-8 as encoding scheme explicitly, Python uses
the correct method to encode/decode the string, making vici useable in
splitbrain contexts.
While trap and regular policies now often look the same (mainly because
reqids are kept constant) trap policies still need to have a lower priority
than regular policies to handle unroute/route correctly if e.g. IPComp
is used or the mode changes. But if we use a completely different
priority range that's lower than that of regular policies it is not possible
to install overlapping trap policies. By differentiating trap from
regular policies via the priority's LSB this issue is avoided while
still maintaining the proper ordering of trap and regular policies.
Fixes#1243.
The Optimistic Duplicate Address Detection (DAD) seems to fail in some
cases (`dadfailed` in `ip addr`) rendering the virtual IP address unusable.
Fixes#2183.
This implements rule 6 of RFC 6724 using the default priority table,
so that e.g. global addresses are preferred over ULAs (which also have
global scope) when the destination is a global address.
Fixes#2138.
By default, the kernel incorrectly uses an 8 byte alignment, which is
mandatory for IPv6 but prohibited for IPv4. For many algorithms this
doesn't matter but that's not the case for HMAC_SHA2_256_128.
Since 2.6.39 the kernel can be explicitly configured to use a 4 byte
alignment.
Otherwise, we'd end up with an empty TS list, which is not valid.
Because end->tohost is set to !end->subnets in starter the removed branch was
never used.
The Python VICI library does not check if the socket is closed.
If the daemon closes the connection, _recvall() spins forever.
Closesstrongswan/strongswan#56.
Fix for "Permission denied (you must be root)" error when calling
iptc_init(), which opens a RAW socket to communicate with the kernel,
when built with "--with-capabilities=libcap".
Closes strongswan/strongswan#53.
Fixes#2157.
A wrong variable is used (route instead of best), so much that the
returned interface belongs to the last seen route instead of the best
choice route.
get_route() may therefore return mismatching interface and gateway.
Fixes: 66e9165bc6 ("kernel-netlink: Return outbound interface in get_nexthop()")
Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com>
The kernel will apply the mask to the mark on the packet and then
compare it to the configured mark. So to match only unmarked packets we
have to be able to set 0/0xffffffff.
If the number of flows over a gateway exceeds the flow cache size of the Linux
kernel, policy lookup gets very expensive. Policies covering more than a single
address don't get hash-indexed by default, which results in wasting most of
the cycles in xfrm_policy_lookup_bytype() and its xfrm_policy_match() use.
Starting with several hundred policies the overhead gets inacceptable.
Starting with Linux 3.18, Linux can hash the first n-bit of a policy subnet
to perform indexed lookup. With correctly chosen netbits, this can completely
eliminate the performance impact of policy lookups, freeing the resources
for ESP crypto.
WARNING: Due to a bug in kernels 3.19 through 4.7, the kernel crashes with a
NULL pointer dereference if a socket policy is installed while hash thresholds
are changed. And because the hashtable rebuild triggered by the threshold
change that causes this is scheduled it might also happen if the socket
policies are seemingly installed after setting the thresholds.
The fix for this bug - 6916fb3b10b3 ("xfrm: Ignore socket policies when
rebuilding hash tables") - is included since 4.8 (and might get backported).
As a workaround `charon.plugins.kernel-netlink.port_bypass` may be enabled
to replace the socket policies that allow IKE traffic with port specific
bypass policies.
When fresh CRLs are released with a high update frequency (e.g.
every 24 hours) or OCSP is used then the certificate cache gets
quickly filled with stale CRLs or OCSP responses. The new VICI
flush-certs command allows to flush e.g. cached CRLs or OCSP
responses only. Without the type argument all kind of certificates
(e.g. also received end entity and intermediate CA certificates)
are purged.
fgetc() returns an int and EOF is usually -1 so when this gets casted to
a char the result depends on whether `char` means `signed char` or
`unsigned char` (the C standard does not specify it). If it is unsigned
then its value is 0xff so the comparison with EOF will fail as that is an
implicit signed int.
This fixes DNS server installation if make-before-break reauthentication
is used as there the new SA and DNS server is installed before it then
is removed again when the old IKE_SA is torn down.
If running resolvconf fails handle() fails release() is not called, which
might leave an interface file on the system (or depending on which script
called by resolvconf actually failed even the installed DNS server).
We don't need them for drop policies and they might even mess with other
routes we install. Routes for policies with protocol/ports in the
selector will always be too broad and might conflict with other routes
we install.
Using the source address to determine the interface is not correct for
net-to-net shunts between two interfaces on which the host has IP addresses
for each subnet.
Other threads are free to add/update/delete other policies.
This tries to prevent race conditions caused by releasing the mutex while
sending messages to the kernel. For instance, if break-before-make
reauthentication is used and one thread on the responder is delayed in
deleting the policies that another thread is concurrently adding for the
new SA. This could have resulted in no policies being installed
eventually.
Fixes#1400.
If a pseudonym changed a new entry was added to the table storing
permanent identity objects (that are used as keys in the other table).
However, the old mapping was not removed while replacing the mapping in
the pseudonym table caused the old pseudonym to get destroyed. This
eventually caused crashes when a new pseudonym had the same hash value as
such a defunct entry and keys had to be compared.
Fixesstrongswan/strongswan#46.
The versioning scheme used by Python (PEP 440) supports the rcN suffix
but development releases have to be named devN, not drN, which are
not supported and considered legacy versions.
After adding the read callback the state is WATCHER_QUEUED and it is
switched to WATCHER_RUNNING only later by an asynchronous job. This means
that a thread that sent a Netlink message shortly after registration
might see the state as WATCHER_QUEUED. If it then tries to read the
response and the watcher thread is quicker to actually read the message
from the socket, it could block on recv() while still holding the lock.
And the asynchronous job that actually read the message and tries to queue
it will block while trying to acquire the lock, so we'd end up in a deadlock.
This is probably mostly a problem in the unit tests.
Metrics are basically defined to order routes with equal prefix, so ordering
routes by metric first makes not much sense as that could prefer totally
unspecific routes over very specific ones.
For instance, the previous code did break installation of routes for
passthrough policies with two routes like these in the main routing table:
default via 192.168.2.1 dev eth0 proto static
192.168.2.0/24 dev eth0 proto kernel scope link src 192.168.2.10 metric 1
Because the default route has no metric set (0) it was used, instead of the
more specific other one, to determine src and next hop when installing a route
for a passthrough policy for 192.168.2.0/24. Therefore, the installed route
in table 220 did then incorrectly redirect all local traffic to "next hop"
192.168.2.1.
The same issue occurred when determining the source address while
installing trap policies.
Fixes 6b57790270 ("kernel-netlink: Respect kernel routing priorities for IKE routes").
Fixes#1416.
This allows using manual priorities for traps, which have a lower
base priority than the resulting IPsec policies. This could otherwise
be problematic if, for example, swanctl --install/uninstall is used while
an SA is established combined with e.g. IPComp, where the trap policy does
not look the same as the IPsec policy (which is now otherwise often the case
as the reqids stay the same).
It also orders policies by selector size if manual priorities are configured
and narrowing occurs.
The kernel policy now considers src and dst port masks as well as
restictions to a given network interface. The base priority is
100'000 for passthrough shunts, 200'000 for IPsec policies,
300'000 for IPsec policy traps and 400'000 for fallback drop shunts.
The values 1..30'000 can be used for manually set priorities.
This allows two CHILD_SAs with reversed subnets to install two FWD
policies each. Since the outbound policy won't have a reqid set we will
end up with the two inbound FWD policies installed in the kernel, with
the correct templates to allow decrypted traffic.
We can't set a template on the outbound FWD policy (or we'd have to make
it optional). Because if the traffic does not come from another (matching)
IPsec tunnel it would get dropped due to the template mismatch.
This allows us to install more than one FWD policy. We already do this
in the kernel-pfkey plugin (there the original reason was that not all
kernels support FWD policies).
Running or undoing start actions might require enumerating IKE_SAs,
which in turn might have to enumerate peer configs concurrently, which
requires acquiring a read lock. So if we keep holding the write lock while
enumerating the SAs we provoke a deadlock.
By preventing other threads from acquiring the write lock while handling
actions, and thus preventing the modification of the configs, we largely
maintain the current synchronous behavior. This way we also don't need to
acquire additional refs for config objects as they won't get modified/removed.
Fixes#1185.
This allows e.g. modified versions of xl2tpd to set the mark in
situations where two clients are using the same source port behind the
same NAT, which CONNMARK can't restore properly as only one conntrack entry
will exist with the mark set to that of the client that sent the last packet.
Fixes#1230.
By settings a matchmask that covers the complete rule we ensure that the
correct rule is deleted (i.e. matches and targets with potentially different
marks are also compared).
Since data after the passed pointer is actually dereferenced when
comparing we definitely have to pass an array that is at least as long as
the ipt_entry.
Fixes#1229.