osmo-hnbgw

Commit Graph

Author	SHA1	Message	Date
Neels Hofmeyr	311bfb5983	log osmo_fsm timeouts set osmo_fsm_log_timeouts(true); Change-Id: Ic1ca03f06fbdef5a3fbe503e4414a780eb3e0fcc	2023-02-24 15:19:24 +01:00
Neels Hofmeyr	ed424d5be4	context map: introduce RUA and SCCP FSMs to fix leaks Refactor the entire RUA <-> SCCP connection-oriented message forwarding: - conquer confusion about hnbgw_context_map release behavior, and - eradicate SCCP connection leaks. Finer points: == Context map state == So far, we had a single context map state and some flags to keep track of both the RUA and the SCCP connections. It was easy to miss connection cleanup steps, especially on the SCCP side. Instead, the two FSMs clearly define the RUA and SCCP conn states separately, and each side takes care of its own release needs for all possible scenarios. - When both RUA and SCCP are released, the context map is discarded. - A context map can stay around to wait for proper SCCP release, even if the RUA side has lost the HNB connection. - Completely drop the async "context mapper garbage collection", because the FSMs clarify the release and free steps, synchronously. - We still keep a (simplified) enum for global context map state, but this is only used so that VTY reporting remains mostly unchanged. == Context map cleanup confusion == The function context_map_hnb_released() was the general cleanup function for a context map. Instead, add separate context_map_free(). == Free context maps separately from HNB == When a HNB releases, talloc_steal() the context maps out of the HNB specific hnb_ctx, so that they are not freed along with the HNB state, possibly leaving SCCP connections afloat. (It is still nice to normally keep context maps as talloc children of their respective hnb_ctx, so talloc reports show which belongs to which.) So far, context map handling found the global hnb_gw pointer via map->hnb_ctx->gw. But in fact, a HNB may disappear at any point in time. Instead, use a separate hnb_gw pointer in map->gw. == RUA procedure codes vs. SCCP prims == So far, the RUA rx side composed SCCP prims to pass on: RUA rx ---SCCP-prim--> RANAP handling ---SCCP-prim--> SCCP tx That is a source of confusion: a RUA procedure code should not translate 1:1 to SCCP prims, especially for RUA id-Disconnect (see release charts below). Instead, move SCCP prim composition over to the SCCP side, using FSM events to forward: RUA rx --event--> RUA FSM --event--> SCCP FSM --SCCP-prim--> SCCP tx +RANAP +RANAP +RANAP RUA tx <--RUA---- RUA FSM <--event-- SCCP FSM <--event-- SCCP rx +RANAP +RANAP +RANAP Hence choose the correct prim according to the SCCP FSM state. - in hnbgw_rua.c, use RUA procedure codes, not prim types. - via the new FSM events' data args, pass msgb containing RANAP PDUs. == Fix SCCP Release behavior == So far, the normal conn release behavior was HNB HNBGW CN | --id-Disconnect--> | ---SCCP-Released--> | Iu-ReleaseComplete | | <--SCCP-RLC-------- | (no data) Instead, the SCCP release is now in accordance with 3GPP TS 48.006 9.2 'Connection release': The MSC sends a SCCP released message. This message shall not contain any user data field. i.e.: HNB HNBGW CN | --id-Disconnect--> | ---Data-Form-1(!)--> | Iu-ReleaseComplete | | <--SCCP-Released---- | (no data) | | ---SCCP-RLC--------> | (no data) (Side note, the final SCCP Release Confirm step is taken care of implicitly by libosmo-sigtran's sccp_scoc.c FSM.) If the CN fails to respond with SCCP-Released, on new X31 timeout, osmo-hnbgw will send an SCCP Released to the CN as fallback. == Memory model for message dispatch == So far, an osmo_scu_prim aka "oph" was passed between RUA and SCCP handling code, and the final dispatch freed it. Every error path had to take care not to leak any oph. Instead, use a much easier and much more leakage proof memory model, inspired by fixeria: - on rx, dispatch RANAP msgb that live in OTC_SELECT. - no code path needs to msgb_free() -- the msgb is discarded via OTC_SELECT when handling is done, error or no error. - any code path may also choose to store the msgb for async dispatch, using talloc_steal(). The user plane mapping via MGW and UPF do that. - if any code path does msgb_free(), that would be no problem either (but none do so now, for simplicity). == Layer separation == Dispatch all connection-oriented RUA tx via the RUA FSM and SCCP tx via the SCCP FSM, do not call rua_tx_dt() or osmo_sccp_user_sap_down() directly. == Memory model for decoded ranap_message IEs == Use a talloc destructor to make sure that the ranap_message IEs are always implicitly freed upon talloc_free(), so that no code path can possibly forget to do so. == Implicit cleanup by talloc == Use talloc scoping to remove a bunch of explicit cleanup code. For example, make a cached message a talloc child of its handler: talloc_steal(mgw_fsm_priv, message); mgw_fsm_priv->ranap_rab_ass_req_message = message; and later implicitly free 'message' by only freeing the handler: talloc_free(mgw_fsm_priv) Related: SYS#6297 Change-Id: I6ff7e36532ff57c6f2d3e7e419dd22ef27dafd19	2023-02-24 15:19:24 +01:00
Neels Hofmeyr	60a52e4121	cosmetic: rename context_map_deactivate Rename context_map_deactivate() to context_map_hnb_released(). This function is called only when the HNB is released / lost. Change-Id: I6dcb26c94558fff28faf8f823e490967a9baf2ec	2023-02-23 01:15:54 +01:00
Neels Hofmeyr	3f4d645890	Deprecate 'sccp cr max-payload-len', remove SCCP CR limit code The SCCP CR payload length limit is now implemented in libosmo-sigtran v1.7.0. The limit no longer needs to be enforced in osmo-hnbgw. This reverts commit `2c91bd66a1`, except for keeping the cfg option, marked deprecated, and not doing anything. Fixes: OS#5906 Related: SYS#5968 Related: OS#5579 Depends: I174b2ce06a31daa5a129c8a39099fe8962092df8 (osmo-ttcn3-hacks) Change-Id: I18dece84b33bbefce8617fbb0b2d79a7e5adb263	2023-02-15 03:34:43 +01:00
Neels Hofmeyr	8eefcbee92	UE state leak: when HNB re-registers, discard previous UE state User reports show leaked UE contexts over time, in a scenario where HNBs regularly disconnect and reconnect. So far, when we receive a HNB REGISTER REQ, we log as "duplicated" and continue to use the hnb_context unchanged. But it seems obvious that a HNB that registers does not expect any UE state to remain. It will obviously not tear down UE contexts (HNBAP or RUA) that have been established before the HNB REGISTER REQUEST. These hence remain in osmo-hnbgw indefinitely. When receiving a HNB REGISTER REQUEST, release all its previous UE state. Allow the HNB to register with a clean slate. The aim is to alleviate the observed build-up of apparently orphaned UE contexts, in order to avoid gradual memory exhaustion. Related: SYS#6297 Change-Id: I7fa8a04cc3b2dfba263bda5b410961c77fbed332	2023-02-11 03:32:34 +01:00
arehbein	76c4203552	osmo-hnbgw: Transition to use of 'telnet_init_default' Related: OS#5809 Change-Id: Id3256d09f62e802cc62fa9ba8aaafd403ccbb53e	2022-12-23 11:13:46 +00:00
Max	a7fcbe100c	ctrl: take both address and port from vty config Change-Id: If5b80364c28fb1ca2bc00f4ece851de64c8ce6b1	2022-12-17 21:24:13 +03:00
Pau Espin	e62af4d46a	Introduce support for libosmo-mgcp-client MGW pooling Large RAN installations may benefit from distributing the RTP voice stream load over multiple media gateways. libosmo-mgcp-client supports MGW pooling since version 1.8.0 (more than one year ago). OsmoBSC has already been making use of it since then (see osmo-bsc.git 8d22e6870637ed6d392a8a77aeaebc51b23a8a50); let's use this feature in osmo-hnbgw too. This commit is also part of a series of patches cleaning up libosmo-mgcp-client and slowly getting rid of the old non-mgw-pooled VTY configuration, in order to keep only 1 way to configure libosmo-mgcp-client through VTY. Related: SYS#5091 Related: SYS#5987 Change-Id: I371dc773b58788ee21037dc25d77f556c89c6b61	2022-10-20 17:03:06 +02:00
Pau Espin	b9be0ea93e	Clear SCTP tx queue upon SCTP RESTART notification Depends: libosmo-netif.git Change-Id Iecb0a4bc281647673d2930d1f1586a2df231af52 Related: SYS#6113 Change-Id: I60adf35e5b5713d38c4584615e059875dcb74bd7	2022-10-17 13:57:17 +02:00
Pau Espin	bbad8dec36	hnb_read_cb(): -EBADF must be returned if conn is freed to avoid use-after-free Otherwise the libosmo-netif stream API may continue accessing the conn after returning if the socket has the WRITE flag active in the same main loop iteration. Change-Id: I628c59a88d94d299f432f405b37fbe602381d47e	2022-10-01 21:21:24 +02:00
Pau Espin	c923d19b7b	hnb_read_cb: use local var to reduce get_ofd() calls Change-Id: Ic7058b5a05b0d34b80617006d4e929a523212221	2022-10-01 21:21:24 +02:00
Pau Espin	5f19597b02	Close conn when receiving SCTP_ASSOC_CHANGE notification It was seen on a real pcap trace (sctp & gsmtap_log) that the kernel stack may decide to kill the connection (sending an ABORT) if it fails to transmit some data after a while: ABORT Cause code: "Protocol violation (0x000d)", Cause Information: "Association exceeded its max_retrans count". When this occurs, the kernel sends the MSG_NOTIFICATION,SCTP_ASSOC_CHANGE,SCTP_COMM_LOST notification when reading from the socket with sctp_recvmsg(). This basically signals that the socket conn is dead, and subsequent writes to it will result in send() failures (and receive SCTP_SEND_FAILED notification upon follow up reads). It's important to notice that after those events, there's no other sort of different event like SHUTDOWN coming in, so that's the time at which we must tell the user to close the socket. Hence, let's signal the caller that the socket is dead by returning 0, to comply with usual recv() API. Related: SYS#6113 Change-Id: If35efd404405f926a4a6cc45862eeadd1b04e08c	2022-10-01 21:21:06 +02:00
Pau Espin	1906a30ca9	Fix handling of sctp SCTP_SHUTDOWN_EVENT notification SCTP_SHUTDOWN_EVENT is a first class event, and not a subtype of SCTP_ASSOC_CHANGE (such as SCTP_SHUTDOWN_COMP). Related: SYS#6113 Change-Id: I7fa648142c07f63c55091d2a15b9d7312bcd4cec	2022-09-30 14:43:06 +02:00
Pau Espin	3bf5395102	hnbgw: Fix recent regression not closing conn upon rx of SCTP_SHUTDOWN_EVENT Before handling of OSMO_STREAM_SCTP_MSG_FLAGS_NOTIFICATION in recent commit (see Fixes: below), osmo_stream_srv_recv() and internal _sctp_recvmsg_wrapper() in libosmo-netif would return either -EAGAIN or 0 when an sctp notification was received from the kernel. After adding handling of OSMO_STREAM_SCTP_MSG_FLAGS_NOTIFICATION, the code paths for "rc == -EAGAIN" and "rc == 0" would not be executed anymore since the first branch takes precedence in the if-else tree. For "rc == -EAGAIN" it's fine because the new branch supersedes what's done on the "rc == -EAGAIN" branch. However, for the "rc == 0", we forgot to actually destroy the connection. The "rc == 0" branch was basically reached when SCTP_SHUTDOWN_EVENT was received because osmo_stream_srv_recv() tried to resemble the interface of regular recv(); let's hence check for that explicitly and destroy the conn object (and the related hnb context in the process) when we receive that event. Fixes: `1de2091515` Related: SYS#6113 Change-Id: I11b6af51a58dc250975a696b98d0c0c9ff3df9e0	2022-09-22 16:39:51 +02:00
Pau Espin	1de2091515	hnbgw: Unregister HNB if SCTP link is restarted Sometimes an hNodeB may reconnect (SCTP INIT) using same SCTP tuple without closing the previous conn. This is handled by the SCTP stack by means of pushing a RESET notification up the stack to the sctp_recvmsg() user. Let's handle this by marking the HNB as unregistered, since most probably a HNB Register Req comes next as the upper layer state is considered lost. Depends: libosmo-netif.git Change-Id I0ee94846a15a23950b9d70eaaef1251267296bdd Related: SYS#6113 Change-Id: Ib22881b1a34b1c3dd350912b3de8904917cf34ef	2022-09-19 14:58:11 +02:00
Pau Espin	d046306b63	Change log level about conn becoming closed to NOTICE Change-Id: I8973990e2cc435422e62dd2a38192e7a6da4a716	2022-09-16 10:37:00 +00:00
Pau Espin	f9825cbd4a	Improve logging around hnb_context and sctp conn lifecycle Change-Id: I44c79d86924ead84246b3d4937a6becae5d29185	2022-09-14 12:16:38 +02:00
Pau Espin	930ed702b6	hnb_context_release(): Make sure assigned conn is freed Otherwise, some paths calling hnb_context_release() (like failing to transmit HNB-REGISTER-REJECT) would end up with a conn object alive with no assigned hnb_context, which is something not wanted. This way an alive conn object always has an associated hnb_context, and they are only disassociated during synchronous release path. Related: OS#5676 Change-Id: I44fea7ec74f14e0458861c92da4acf685ff695c1	2022-09-14 12:16:18 +02:00
Harald Welte	fe7c34737d	Don't process RUA messages if HNB is not registered Related: OS#5676 Change-Id: I85442e8adfefadc3bf3ed795eaef7677eb0b36e9	2022-09-13 13:00:01 +02:00
Harald Welte	c971c657c5	Abort if processing SCTP connection without HNB context It was observed that under some circumstances (after HNBAP HNB-De-Register) we end up crashing because a connection has no HNB assigned to it. Let's explicitly assert if that happens, in order to clarify and avoid the same sort of thing happening without a clear view on what's going on. The issue will be fixed in a follow-up patch. Closes: OS#5676 Change-Id: I1eedab6f3ac974e942b02eaae41556f87dd8b6ba	2022-09-13 11:31:46 +02:00
Pau Espin	eadf523393	hnbgw: Log new SCTP HNB connections Change-Id: I07b98ff4c3199eeab11a8c1cfd9ce44ab99bca85	2022-09-13 11:31:46 +02:00
Pau Espin	419e832473	cosmetic: Fix typo in log and whitespace Change-Id: Ie2be6937bb0f44ea66397c905c5d380caa2d4cef	2022-09-13 11:31:40 +02:00
Daniel Willmann	d129e0c86e	hnbgw_hnbap: Fix memory leaks in HNBAP handling * Use osmo_stream closed_cb to call hnb_context_release() in all cases * Also call hnbap_free_hnbregisterrequesties() when sending hnb register reject Related: OS#5656 Change-Id: I3ba02b0939413c67bc8088ea1a8f2252fc2bda31	2022-08-23 18:15:02 +02:00
Daniel Willmann	2dfeb1e218	Install show talloc-context VTY commands Related: OS#5656 Change-Id: Ia4b0023028405ce065f618f536c92ea2bcd0ce15	2022-08-23 17:51:51 +02:00
Neels Hofmeyr	e6201765cf	build: add --enable-pfcp, make PFCP dep optional Related: SYS#5895 Change-Id: I6d50c60bccda767910217243bdfb4a6fad1e39c1	2022-08-09 17:57:43 +02:00
Neels Hofmeyr	1496498713	add ps_rab_ass FSM to map GTP via UPF Related: SYS#5895 Depends: If80c35c6a942bf9593781b5a6bc28ba37323ce5e (libosmo-pfcp) Change-Id: Ic9bc30f322c4c6c6e82462d1da50cb15b336c63a	2022-08-08 20:20:34 +00:00
Neels Hofmeyr	2c91bd66a1	add option to send SCCP CR without payload It is reported that a third-party SGSN is rejecting SCCP CR when the SCCP message part exceeds a certain length. The solution is to first send an SCCP CR without payload, and send the payload in a DT later. Add config option hnbgw sccp cr max-payload-len <0-999999> If the RANAP payload surpasses the given length, osmo-hnbgw will first send an SCCP CR without payload, cache the RANAP payload, and put that in an SCCP DT once the SCCP CC is received. The original idea was to limit the size of the entire SCCP part of the message, but I'm currently not sure how to determine that without copying much of the osmo_sccp code. I figured using a limit on the RANAP payload is sufficient. To avoid the error with above third-party SGSN, the easy solution is to set max-payload-len to 0, so that we always get a separate SCCP CR without payload. Related: SYS#5968 Related: I827e081eaacfb8e76684ed1560603e6c8f896c38 (osmo-ttcn3-hacks) Change-Id: If0c5c0a76e5230bf22871f527dcb2dbdf34d7328	2022-06-07 22:51:26 +02:00
Neels Hofmeyr	0ca9567fb2	use osmo_select_main_ctx(), tweak log in handle_cn_conn_conf() Upcoming patch adds to this function. Let me first combine those four LOGP() to a single one, use proper osmo_sccp_addr_to_str_c(OTC_SELECT). To be able to use OTC_SELECT, switch hnbgw.c to osmo_select_main_ctx(). Related: SYS#5968 Change-Id: I1e0ea0a883e8cf65e6cfb45ed9b6f3d8fb7c59eb	2022-06-07 18:09:19 +02:00
Philipp Maier	81f1751896	mgw_fsm: add MGW support to osmo-hnbgw osmo-hnbgw lacks support for a co-located media gateway. This makes it virtually impossible to isolate the HNB from the core network properly. Let's add MGCP support to osmo-hnbgw so that it can control a co-located media gateway to relay the RTP streams between HNB and core network. Change-Id: Ib9b62e0145184b91c56ce5d8870760bfa49cc5a4 Related: OS#5152	2022-02-24 10:51:30 +01:00
Pau Espin	dce3870429	Initial structure + import code from osmo-iuh.git Imported from osmo-iuh.git 9b4de3f401c890fc2c0dfae9e827daaaadd80db0. Change-Id: I569d221aeb83d352c1621c44c013a0e4c82fc8a8	2022-01-04 19:48:52 +01:00

30 Commits