For inbound processing, it can be rather useful to apply the mark to the
packet in the SA, so the associated policy with that mark implicitly matches.
When using %unique as match mark, we don't know the mark beforehand, so
we most likely want to set the mark we match against.
%unique (and the upcoming %same key) are usable in specific contexts only.
To restrict the user from using it in other places where it does not get the
expected results, reject such keywords unless explicitly allowed.
We don't retransmit DPD requests like we do requests for proper exchanges,
so increasing the number with each sent DPD could result in the peer's state
getting out of sync if DPDs are lost. Because according to RFC 3706, DPDs
with an unexpected sequence number SHOULD be rejected (it does mention the
possibility of maintaining a window of acceptable numbers, but we currently
don't implement that). We partially ignore such messages (i.e. we don't
update the expected sequence number and the inbound message stats, so we
might send a DPD when none is required). However, we always send a response,
so a peer won't really notice this (it also ensures a reply for "retransmits"
caused by this change, i.e. multiple DPDs with the same number - hopefully,
other implementations behave similarly when receiving such messages).
Fixes#2714.
This is mainly for HA where a passive SA was already created when the
IKE keys were derived. If e.g. an authentication error occurs later that
SA wouldn't get cleaned up.
The reload of the configuration of the loggers so far only included
the log levels. In order to support the reload of all other options,
a reload function may be implemented.
Signed-off-by: Thomas Egerer <thomas.egerer@secunet.com>
The options control whether the DF and ECN header bits/fields are copied
from the unencrypted packets to the encrypted packets in tunnel mode (DF only
for IPv4), and for ECN whether the same is done for inbound packets.
Note: This implementation only works with Linux/Netlink/XFRM.
Based on a patch by Markus Sattler.
During a test with ~12000 established SAs it was noted that vici
related operations hung.
The operations took over 16 minutes to finish. The time was spent in
the vici message parser, which was assigning the message over and over
again, to get rid of the already parsed portions.
First fixed by cutting the consumed parts off without copying the message.
Runtime for ~12000 SAs is now around 20 seconds.
Further optimization brought the runtime down to roughly 1-2 seconds
by using an fd to read through the message variable.
Closesstrongswan/strongswan#103.
The code to support parallel Netlink queries (commit 3c7193f) made use
of nlmsg_len member from struct nlmsghdr to allocate and copy the
responses. Since NLMSG_NEXT is later used to parse these responses, they
must be aligned, or the results are undefined.
Signed-off-by: Thomas Egerer <thomas.egerer@secunet.com>
macOS supports AES_GCM_ICV16 natively using PF_KEYv2.
This change enables AES_GCM if the corresponding definition is detected
in the headers.
With this change it is no longer necessary to use the libipsec module to
use AES_GCM on macOS.
Closesstrongswan/strongswan#107.
Removing and readding the entry to a potentially different row/segment,
while driving out waiting and new threads, could prevent threads from
acquiring the SA even if they were waiting to check it out by unique
ID (which doesn't change), or if they were just trying to enumerate it.
With this change the row and segment doesn't change anymore and waiting
threads may acquire the SA. However, those looking for an IKE_SA by SPIs
might get one back that has a different SPI (but that's probably not
something that happens very often this early).
This was noticed because we check out SAs by unique ID in the Android
app to terminate them after failed retransmits if we are not reestablishing
the SA (otherwise we continue), and this sometimes failed.
Fixes: eaedcf8c00 ("ike-sa-manager: Add method to change the initiator SPI of an IKE_SA")
Instead of logging the search parameters for IKE configs (which were already
before starting the lookup) we log the configured settings.
The peer config lookup is also changed slightly by doing the IKE config
match first and skipping some checks if that or the local peer identity
doesn't match.
Although being already logged on level 2, these messages are usually just
confusing if they pop up randomly in the log when e.g. querying the configs
or installing traps. So after this the log messages will only be logged when
actually proposing or selecting traffic selectors during IKE.
This way we don't rely on the order of equally matching configs as
heavily anymore (which is actually tricky in vici) and this also doesn't
require repeating weak algorithms in all configs that might potentially be
selected if there are some clients that require them.
There is currently no ordering, so an explicitly configured exactly matching
proposal isn't a better match than e.g. the default proposal that also
contains the proposed algorithms.
In some scenarios we might find multiple usable peer configs with different
IKE proposals. This is a problem if we use a config with non-matching
proposals that later causes IKE rekeying to fail. It might even be a problem
already when creating the CHILD_SA if the proposals of IKE and CHILD_SA
are consistent.
This allows switching to probing mode if the client is on a public IP
and this is the active task and connectivity gets restored. We only add
NAT-D payloads if we are currently behind a NAT (to detect changed NAT
mappings), a MOBIKE update that might follow will add them in case we
move behind a NAT.
In case the PRF's set_key() or allocate_bytes() method failed, skeyseed
was not initialized and the chunk_clear() call later caused a crash.
This could have happened with OpenSSL in FIPS mode when MD5 was
negotiated (and test vectors were not checked, in which case the PRF
couldn't be instantiated as the test vectors would have failed).
MD5 is not included in the default proposal anymore since 5.6.1, so
with recent versions this could only happen with configs that are not
valid in FIPS mode anyway.
Fixes: CVE-2018-10811
We now check if there are other routes tracked for the same destination
and replace the installed route instead of just removing it. Same during
installation, where we previously didn't replace existing routes due to
NLM_F_EXCL. Routes with virtual IPs as source address are preferred over
routes without.
This should allow using trap policies with virtual IPs on Linux.
Fixes#85, #2162.
The client identifier serves as unique identifier just like a unique MAC
address would, so even with identity_leases disabled some DHCP servers
might assign unique leases per identity.
This increases the chances that subject DNs that might have been cut
off with the arbitrary previous limit of 64 bytes might now be sent
successfully.
The REQUEST message has the most static overhead in terms of other
options (17 bytes) as compared to DISCOVER (5) and RELEASE (7).
Added to that are 3 bytes for the DHCP message type, which means we have
288 bytes left for the two options based on the client identity (host
name and client identification). Since both contain the same value, a
FQDN identity, which causes a host name option to get added, may be
142 bytes long, other identities like subject DNs may be 255 bytes
long (the maximum for a DHCP option).
According to RFC 2131, the minimum size of the 'options' field is 312
bytes, including the 4 byte magic cookie. There also does not seem to
be any restriction regarding the message length, previously the length
was rounded to a multiple of 64 bytes. The latter might have been
because in BOOTP the options field (or rather vendor-specific area as it
was called back then) had a fixed length of 64 bytes (so max(optlen+4, 64)
might actually have been what was intended), but for DHCP the field is
explicitly variable length, so I don't think it's necessary to pad it.
Since we won't read from the socket reducing the receive buffer saves
some memory and it should also minimize the impact on other processes that
bind the same port (Linux distributes packets to the sockets round-robin).
DHCP servers will respond to port 67 if giaddr is non-zero, which we set
if we are not broadcasting. While such messages are received fine via
RAW socket the kernel will respond with an ICMP port unreachable if no
socket is bound to that port. Instead of opening a dummy socket on port
67 just to avoid the ICMPs we can also just operate with a single
socket, bind it to port 67 and send our requests from that port.
Since SO_REUSEADDR behaves on Linux like SO_REUSEPORT does on other
systems we can bind that port even if a DHCP server is running on the
same host as the daemon (this might have to be adapted to make this work
on other systems, but due to the raw socket the plugin is not that portable
anyway).
The previous code compared the port in the packet to the client port and, if
successful, checked it also against the server port, which, therefore, never
matched, but due to incorrect offsets did skip the BPF_JA. If the client port
didn't match the code also skipped to the instruction after the BPF_JA.
However, the latter was incorrect also and processing would have continued at
the next instruction anyway. Basically, DHCP packets to any port were accepted.
What's not fixed with this is that the kernel returns an ICMP Port
unreachable for packets sent to the server port (67) because we don't
have a socket bound to it.
Fixes: f0212e8837 ("Accept DHCP replies on bootps port, as we act as a relay agent if server address configured")
We don't have MOBIKE and the fallback to reauthentication does also not
make much sense as that doesn't affect the CHILD_SAs for IKEv1. So
instead of complicating the code we just ignore roam events for IKEv1
for now.
Closesstrongswan/strongswan#100.
In very early versions routed CHILD_SAs were attached to IKE_SAs, since
that's not the case anymore (they are handled via trap manager), we can
remove this special handling.
This algorithm uses a fixed-length key and we MUST NOT send a key length
attribute when proposing such algorithms.
While we could accept transforms with key length this would only work as
responder, as original initiator it wouldn't because we won't know if a
peer requires the key length. And as exchange initiator (e.g. for
rekeyings), while being original responder, we'd have to go to great
lengths to store the condition and modify the sent proposal to patch in
the key length. This doesn't seem worth it for only a partial fix.
This means, however, that ChaCha20/Poly1305 can't be used with previous
releases (5.3.3 an newer) that don't contain this fix.
Fixes#2614.
Fixes: 3232c0e64e ("Merge branch 'chapoly'")
Since these are installed overlapping (like during a rekeying) we have to use
the same (unique) marks (and possibly reqid) that were used previously,
otherwise, the policy installation will fail.
Fixes#2610.
If the responder is behind a NAT that remaps the response from the
statically forwarded port 500 to a new external port (as Azure seems to be
doing) we should still switch to port 4500 if we used port 500 so far as
it would not have been possible to send any messages to it if it wasn't
really port 500 (we only add a non-ESP marker if neither port is 500).
If a Quick mode is initiated for a CHILD_SA that is already installed
we can identify this situation and rekey the already installed CHILD_SA.
Otherwise we end up with several CHILD_SAs in state INSTALLED which
means multiple calls of child_updown are done. Unfortunately,
the deduplication code later does not call child_updown() (so up and down
were not even).
Closesstrongswan/strongswan#95.
Until now the configuration available to user for HW offload were:
hw_offload = no
hw_offload = yes
With this commit users will be able to configure auto mode using:
hw_offload = auto
Signed-off-by: Adi Nissim <adin@mellanox.com>
Reviewed-by: Aviv Heller <avivh@mellanox.com>
Until now there were 2 hw_offload modes: no/yes
* hw_offload = no : Configure the SA without HW offload.
* hw_offload = yes : Configure the SA with HW offload.
In this case, if the device does not support
offloading, SA creation will fail.
This commit introduces a new mode: hw_offload = auto
----------------------------------------------------
If the device and kernel support HW offload, configure
the SA with HW offload, but do not fail SA creation otherwise.
Signed-off-by: Adi Nissim <adin@mellanox.com>
Reviewed-by: Aviv Heller <avivh@mellanox.com>
The previous code was obviously incorrect and caused strange side effects
depending on the compiler and its optimization flags (infinite looping seen
with GCC 4.8.4, segfault when destroying the private key in build() seen
with clang 4.0.0 on FreeBSD).
Fixes#2579.
If we receive an INVALID_KE_PAYLOAD notify we should not just retry
with the requested DH group without checking first if we actually propose
the group (or any at all).
After a rekeying we keep the inbound SA and policies installed for a
while, but the outbound SA and policies are already removed. Attempting
to update them could get the refcount in the kernel interface out of sync
as the additional policy won't be removed when the CHILD_SA object is
eventually destroyed.
When initiating a trap policy we explicitly pass the reqid along. I guess
the lookup was useful to get the same reqid if a trapped CHILD_SA is manually
initiated. However, we now get the same reqid anyway if there is no
narrowing. And if the traffic selectors do get narrowed the reqid will be
different but that shouldn't be a problem as that doesn't cause an issue with
any temporary SAs in the kernel (this is why we pass the reqid to the
triggered CHILD_SA, otherwise, no new acquire would get triggered for
traffic that doesn't match the wider trap policy).
Reqids for the same traffic selectors are now stable so we don't have to
pass reqids of previously installed CHILD_SAs. Likewise, we don't need
to know the reqid of the newly installed trap policy as we now uninstall
by name.
With IKEv1 we transmit both public DH factors (used to derive the initial
IV) besides the shared secret. So these messages could get significantly
larger than 1024 bytes, depending on the DH group (modp2048 just about
fits into it). The new default of 2048 bytes should be fine up to modp4096
and for larger groups the buffer size may be increased (an error is
logged should this happen).
This is really only needed for other exchanges like DPDs not when we
just updated the addresses. The NAT-D payloads are only used here to
detect whether UDP encapsulation has to be enabled/disabled.
If we have to remove and reinstall SAs for address updates (as with the
Linux kernel) there is a short time where there is no SA installed. If
we keep the policies installed they (or any traps) might cause acquires
and temporary kernel states that could prevent the updated SA from
getting installed again.
This replaces the previous workaround to avoid plaintext traffic leaks
during policy updates, which used low-priority drop policies.
This can be useful if routing rules (instead of e.g. route metrics) are used
to switch from one to another interface (i.e. from one to another
routing table). Since we currently don't evaluate routing rules when
doing the route lookup this is only useful if the kernel-based route
lookup is used.
Resolvesstrongswan/strongswan#88.
The counter does not tell us what task is actually queued, so we might
ignore the response to an update (with NAT-D payloads) if only an address
update is queued.
Instead of destroying the new task and keeping the existing one we
update any already queued task, so we don't loose any work (e.g. if a
DPD task is active and address update is queued and we'd actually like
to queue a roam task).
This is currently not an issue for CHILD_SA rekeying tests as these only
check rekeyings of the CHILD_SA created with the IKE_SA, i.e. there is
no previous DH group to reuse.
For the CHILD_SA created with the IKE_SA the group won't be set in the
proposal, so we will use the first one configure just as if the SA was
created new with a CREATE_CHILD_SA exchange. I guess we could
theoretically try to use the DH group negotiated for IKE but then this
would get a lot more complicated as we'd have to check if that group is
actually contained in any of the CHILD_SA's configured proposals.
This way we get proper error handling if the DH group the peer requested
is not actually supported for some reason (otherwise we'd just retry to
initiate with the configured group and get back another notify).
This fixes two issues, one is a bug if a DH group is configured for the
local ESP proposals and charon.prefer_configured_proposals is disabled.
This would cause the DH groups to get stripped not from the configured but
from the supplied proposal, which usually already has them stripped. So
the proposals wouldn't match. We'd have to always strip them from the local
proposal. Since there are apparently implementations that, incorrectly, don't
remove the DH groups in the IKE_AUTH exchange (e.g. WatchGuard XTM25
appliances) we just strip them from both proposals. It's a bit more lenient
that way and we don't have to complicate the code to only clone and strip the
local proposal, which would depend on a flag.
References #2503.
IKE_SAs newly created via HA_IKE_ADD message don't have any IKE or peer
config assigned yet (this happens later with an HA_IKE_UPDATE message).
And because the state is initially set to IKE_CONNECTING the roam() method
does not immediately return, as it later would for passive HA SAs. This
might cause the check for explicitly configured local addresses to crash
the daemon with a segmentation fault.
Fixes#2500.
Otherwise, the remote identity is ignored when matching owner identities
of PSKs and this way matching PSKs that explicitly have %any assigned is
improved.
Fixes#2497.
If users want to associate secrets with any identity, let 'em. This is
also possible with vici and might help if e.g. the remote identity is
actually %any as that would match a PSK with local IP and %any better
than one with local and different remote IP.
Fixes#2497.
This allows us to use it without having to initialize libcharon, which
was required for the logging (we probably could have included debug.h
instead of daemon.h to workaround that but this seems more correct).
libstrongswan and kernel-netlink are the only two components which do
not adhere to the naming scheme used for all other tests. If the tests
are run by an external application this imposes problems due to clashing
names.
Signed-off-by: Thomas Egerer <thomas.egerer@secunet.com>
RFC 8247 demoted it to SHOULD NOT. This might break connections with
Windows clients unless they are configured to use a stronger group or
matching weak proposals are configured explicitly on the server.
References #2427.
If enabled, add the RADIUS Class attributes received in Access-Accept messages
to RADIUS accounting messages as suggested by RFC 2865 section 5.25.
Fixes#2451.
We do something similar in reestablish() for break-before-make reauth.
If we don't abort we'd be sending an IKE_AUTH without any TS payloads.
References #2430.
It seems that there is a race, at least in 10.13, that lets
if_indextoname() fail for the new TUN device. So we delay the call a bit,
which seems to "fix" the issue. It's strange anyway that the previous
delay was only applied when an iface entry was already found.
The value of DHCP_OPTEND is 255. When it is assigned this result in a
sign change as the positive int constant is cast to a signed char and -1
results. Clang 4.0 complains about this.
This causes problems e.g. on Android where we handle the alert (and
reestablish the IKE_SA) even though it usually is no problem if the
peer retries with the requested group. We don't consider it as a
failure on the initiator either.
In case we send retransmits for an IKE_SA_INIT where we propose a DH
group the responder will reject we might later receive delayed responses
that either contain INVALID_KE_PAYLOAD notifies with the group we already
use or, if we retransmitted an IKE_SA_INIT with the requested group but
then had to restart again, a KE payload with a group different from the
one we proposed. So far we didn't change the initiator SPI when
restarting the connection, i.e. these delayed responses were processed
and might have caused fatal errors due to a failed DH negotiation or
because of the internal retry counter in the ike-init task. Changing
the initiator SPI avoids that as we won't process the delayed responses
anymore that caused this confusion.
If an interface is renamed we already have an entry (based on the
ifindex) allocated but previously only set the usable state once
based on the original name.
Fixes#2403.
When querying SAs the keys will end up in this buffer (the allocated
messages that are returned are already wiped). The kernel also returns
XFRM_MSG_NEWSA as response to XFRM_MSG_ALLOCSPI but we can't distinguish
this here as we only see the response.
References #2388.
When requiring unique flags for CHILD_SAs, allow the configuration to
request different marks for each direction by using the %unique-dir keyword.
This is useful when different marks are desired for each direction but the
number of peers is not predefined.
An example use case is when implementing a site-to-site route-based VPN
without VTI devices.
A use of 0.0.0.0/0 - 0.0.0.0/0 traffic selectors with identical in/out marks
results in outbound traffic being wrongfully matched against the 'fwd'
policy - for which the underlay 'template' does not match - and dropped.
Using different marks for each direction avoids this issue as the 'fwd' policy
uses the 'in' mark will not match outbound traffic.
Closesstrongswan/strongswan#78.
Initiation might later fail, of course, but we don't really
require an IP address when installing, that is, unless the remote
traffic selector is dynamic. As that would result in installing a
0.0.0.0/0 remote TS which is not ideal when a single IP is expected as
remote.
This splits the SA installation also on the initiator, so we can avoid
installing the outbound SA if we lost a rekey collision, which might
have caused traffic loss depending on the timing of the DELETEs that are
sent in both directions.
If multiple threads want to enumerate child-cfgs and potentially lock
other locks (e.g. check out IKE_SAs) while doing so a deadlock could
be caused (as was the case with VICI configs with start_action=start).
It should also improve performance for roadwarrior connections and lots
of clients connecting concurrently.
Fixes#2374.
This prevented new listeners from receiving notifies if they joined
after another listener disconnected previously, and if they themselves
disconnected their old connection would prevent them again from getting
notifies.
Multiple CHILD_SAs sharing the same traffic selectors (e.g. during
make-before-break reauthentication) also have the same reqid assigned.
If all matching entries are removed we could end up without entry even
though an SA exists that still uses these traffic selectors.
Fixes#2373.
This way we get the log message in stroke and swanctl as last message
when establishing a connection. It's already like this for the IKE_SA
where IKE_ESTABLISHED is set after the corresponding log message.
Fixes#2364.
Due to the lookup based on the mapped algorithm ID the resulting AH
proposals were invalid.
Fixes#2347.
Fixes: 8456d6f5a8 ("ikev1: Don't require AH mapping for integrity algorithm when generating proposal")
This is similar to the eap-aka-3gpp2 plugin. K (optionally concatenated
with OPc) may be configured as binary EAP secret in ipsec.secrets or
swanctl.conf.
Based on a patch by Thomas Strangert.
Fixes#2326.
If we find a redundant CHILD_SA (the peer probably rekeyed the SA before
us) we might not want to delete the old SA because the peer might still
use it (same applies to old CHILD_SAs after rekeyings). So only delete
them if configured to do so.
Fixes#2358.
traffic_selector_t::to_subnet() always sets the net/host (unless the
address family was invalid).
Fixes: 3070697f9f ("ike: support multiple addresses, ranges and subnets in IKE address config")
Interestingly, this doesn't show up in the regression tests because the
compiler removes the first assignment (and thus the allocation) due to
-O2 that's included in our default CFLAGS.
The correct truncation is 128-bit but some implementations insist on
using 96-bit truncation. With strongSwan this can be negotiated using
an algorithm identifier from a private range. But this doesn't work
with third-party implementations. This adds an option to use 96-bit
truncation even if the official identifier is used.
After deleting a rekeyed CHILD_SA we uninstall the outbound SA but don't
destroy the CHILD_SA (and the inbound SA) immediately. We delay it
a few seconds or until the SA expires to allow delayed packets to get
processed. The CHILD_SA remains in state CHILD_DELETING until it finally
gets destroyed.
The responder has all the information needed to install both SAs before
the initiator does. So if the responder immediately installs the outbound
SA it might send packets using the new SA which the initiator is not yet
able to process. This can be avoided by delaying the installation of the
outbound SA until the replaced SA is deleted.
Using install() for the inbound SA and register_outbound() for the
outbound SA followed by install_policies(), will delay the installation of
the outbound SA as well as the installation of the outbound policies
in the kernel until install_outbound() is called later.
Unfortunately, we can't just add the generated C file to the sources in
Makefile.am as the linker would remove that object file when it notices
that no symbol in it is ever referenced. So we include it in the file
that contains the library initialization, which will definitely be
referenced by the executable.
This allows building an almost stand-alone static version of e.g. charon
when building with `--enable-monolithic --enable-static --disable-shared`
(without `--disable-shared` libtool will only build a version that links
the libraries dynamically). External libraries (e.g. gmp or openssl) are
not linked statically this way, though.
By using the total retransmit timeout, modifications of timeout settings
automatically reflect on the value of xfrm_acq_expires. If set, the
value of xfrm_acq_expires configured by the user takes precedence over
the calculated value.
When establishing a traffic-triggered CHILD_SA involves the setup of an
IKE_SA more than one exchange is required. As a result the temporary
acquire state may have expired -- even if the acquire expiration
(xfrm_acq_expires) time is set properly (165 by default). The expire
message sent by the kernel is not processed in charon since no trap can
be found by the trap manager.
A possible solution could be to track allocated SPIs. But since this is
a corner case and the tracking introduces quite a bit of overhead, it
seems much more sensible to add a new state if the update of a state
fails with NOT_FOUND.
Signed-off-by: Thomas Egerer <thomas.egerer@secunet.com>
Upcoming FreeBSD kernels will support updating the addresses of existing
SAs with new SADB_X_EXT_NEW_ADDRESS_SRC|DST extensions for the SADB_UPDATE
message.