Commit Graph

61 Commits

Author SHA1 Message Date
Harald Welte cd58308915 counters: Distinguish between normal and abnormal release cause
It is interesting to know if a release was normal (as expected/requested
by the NAS layer, typically indicating a user-requested call end) or
abnormal (radio failure, pre-emption or whatever other event that the
user did not expect).

Related: SYS#6773
Change-Id: Idd2f845b7db448064b693ac1efdc8db006a47a11
2024-04-09 08:12:03 +00:00
Harald Welte 6917c9b2cd RAB activation/modification/release statistics
Add rate_ctr based statistics for RAB activation, deactivation and
failures.  This requires us to parse RANAP in both uplink and downlink
and to iterate deep into the setup/modify/release/failure lists.

Given the way how the protocol works, the only way to distinguish an
activation from a modification is because sender and recipient know
whether a given RAB is already active at the time of the message.  So we
also need to track the activation state of each RAB.

Depends: osmo-iuh.git Change-Id I328819c650fc6cefe735093a846277b4f03e6b29
Change-Id: I198fa37699e22380909764de6a0522ac79aa1d39
2024-03-18 13:50:36 +01:00
Harald Welte ea15086584 stats: Add per-hnb paging:{cs,ps}:attempted counter
Adding counters for number of paging succeeded is much harder,
as we currently don't parse connection-oriented DL RANAP and/or any
L3 NAS in it.

Change-Id: I7648caa410dba8474d57121a8128579ba200b18f
2024-03-13 12:10:36 +01:00
Harald Welte 14c36591fc Introduce counter for per-hnb cumulative active CS RAB duration
This counter can be used to determine the traffic in Erlangs per HNB.

Change-Id: Iffb6a3f38239094551a12c872cd8474d02a5ad56
2024-03-13 12:04:20 +01:00
Harald Welte bceef626b5 vty: Print the uptime during 'show hnb' output
Change-Id: Ic414fe301b3fa5c357bd1cd8a7842b7d6333c709
2024-03-12 23:30:19 +00:00
Harald Welte cae5bf33e8 rename hnbgw_peek_l3 to hnbgw_peek_l3_ul as it's uplink only
Let's not make the functions appear more generic than they are: They
all explicitly only support uplink so far.

Change-Id: I7db0d933a8f17f8c410141f43dab12b8c19fc8ae
2024-03-12 23:28:50 +00:00
Harald Welte 51fb6b5578 cosmetic: Rename hnbgw_rx_ranap and friends to *_rx_ranap_udt_ul
Those functions have always been handling only unit-data in uplink
direction, so let's reflect that in the function name to prevent
anyone assuming they process connection-oriented RANAP and/or
the downlink direction.

Change-Id: I29e8176ac19b2e7390e5950b8d0944c8961e491f
2024-03-12 23:28:50 +00:00
Harald Welte 2b771919dc cosmetic: talk about cs_rab_ass_* when working in CS domain
There's RAB assignment for CS and PS, let's not name functions
too generic to avoid confusion.

Change-Id: Id49ef931e11958315d612d764bbc488ef43f4290
2024-03-12 20:47:59 +00:00
Harald Welte 31178cd2cb stats: Introduce basic counters for RANAP unit-data from CN links
Let's keep track of the various unit-data messages we receive from
CN-link peers: RESET, RESET-ACK, PAGING, UNKNOWN, UNSUPPORTED,
OVERLOAD_IND, ERROR_IND.

Change-Id: Ibe4c73b0288ea20ca3d54519b42bc7cb0e9e61b2
2024-03-12 20:47:50 +00:00
Harald Welte f43fbc9c40 stats: Introduce CNPOOL counter macro
Ever since the ancient days of OpenBSC, we've always had macros to
abbreviate the cumbersome 'rate_ctr_inc(rate_ctr_group_get_ctr(...))'
construct.  Somehow this was missed here.

Also, since 2018 (libosmocore.git 175a4ae93aaf1068b61041dca12962059d65ed55)
we have rate_ctr_inc2() to make it even simpler...

Change-Id: I5b9e6b2069eed533c472cea8483a370f7fcc8e74
2024-03-12 20:47:40 +00:00
Harald Welte e874946bfb stats: Introduce CNLINK counter macros
Ever since the ancient days of OpenBSC, we've always had macros to
abbreviate the cumbersome 'rate_ctr_inc(rate_ctr_group_get_ctr(...))'
construct.  Somehow this was missed here.

Also, since 2018 (libosmocore.git 175a4ae93aaf1068b61041dca12962059d65ed55)
we have rate_ctr_inc2() to make it even simpler...

Change-Id: I2754d7aa9d6c3e333bd729bc6e924c502b40cdad
2024-03-12 20:47:35 +00:00
Harald Welte 30691e7b90 Various new per-hnb RANAP and RUA counters
example output while running HNBGW_Tests.ttcn:

OsmoHNBGW# show rate-counters skip-zero
hNodeB 0 (001-01-L2-R3-S4-C1):
          iuh:established:         17 (0/s 4/m 0/h 0/d) Number of times Iuh link was established
        rua:ps:connect:ul:          3 (0/s 0/m 0/h 0/d) Received RUA Connect requests (PS Domain)
        rua:cs:connect:ul:          9 (0/s 4/m 0/h 0/d) Received RUA Connect requests (CS Domain)
     rua:cs:disconnect:ul:          1 (0/s 0/m 0/h 0/d) Received RUA Disconnect requests in uplink (CS Domain)
     rua:ps:disconnect:dl:          2 (0/s 0/m 0/h 0/d) Transmitted RUA Disconnect requests in downlink (PS Domain)
     rua:cs:disconnect:dl:          6 (0/s 3/m 0/h 0/d) Transmitted RUA Disconnect requests in downlink (CS Domain)
rua:ps:direct_transfer:ul:          1 (0/s 0/m 0/h 0/d) Received RUA DirectTransfer in uplink (PS Domain)
rua:cs:direct_transfer:ul:         10 (0/s 5/m 0/h 0/d) Received RUA DirectTransfer in uplink (CS Domain)
rua:ps:direct_transfer:dl:          2 (0/s 0/m 0/h 0/d) Transmitted RUA DirectTransfer in downlink (PS Domain)
rua:cs:direct_transfer:dl:         12 (0/s 6/m 0/h 0/d) Transmitted RUA DirectTransfer in downlink (CS Domain)

Related: SYS#6773
Change-Id: I61bd4f51ec88fd93d8640d39228ac85f5ac5b69b
2024-03-12 21:35:40 +01:00
Harald Welte e4d0951a86 New per-hnb rate_ctr and stat_item groups; track uptime
Let's add a per-hnb rate_ctr and stat_item group.  Only one initial
counter (number of Iuh establishments) and one initial stat_item
(uptime/downtime) is implemented.

Related: SYS#6773
Change-Id: I26d7c3657cdaf7c6ba5aa10a0677381ab099f8dd
2024-03-12 21:35:40 +01:00
Harald Welte 7ec5cb7280 Introduce concept of per-HNB persistent data structure
This allows us to add a new "hnb-policy closed", which means we do not
accept any random HNB connection anymore, but only those whose identity
is pre-configured explicitly using administrative means.

Furthermore, this new data structure will allow us (in future patches)
to keep persistent data like uptime / downtime of each HNB.

Related: SYS#6773
Change-Id: Ife89a7a206836bd89334d19d3cf8c92969dd74de
2024-03-12 08:18:12 +00:00
Harald Welte fa4da8f980 Introduce umts_cell_id_from_str() as inverse of umts_cell_id_name()
We are about to introduce the stringified UMTS cell identifier to the
VTY, and for that we need to not only print but also parse the related
string.

Related: SYS#6773
Change-Id: I6da947d1f2316241e44e53bb6aaec4221cfaa2c0
2024-03-07 18:24:09 +01:00
Neels Hofmeyr eba9cbf9d1 pfcp: implement sending Network Instance IEs
Allow configuring specific Network Instance names for the Core and
Access sides, to send to the UPF, to allow the UPF to pick the proper
network interface to create GTP tunnels on.

Add VTY cfg 'hnbgw' / 'pfcp' / 'netinst (access|core) NAME' to allow
configuring Network Interface values to send in PFCP. These are "dotted"
domain name strings, as in APN.

Add these Network Interface names to the PFCP messages' detection as
well as forwarding rules, each one indicating the side that it is
detecting on or forwarding to.

This helps lift osmo-hnbgw's PFCP support out of lab situations to a
proper production scenario, where the core and access networks are in
separate subnets, with osmo-hnbgw + UPF as the gateway.

For example, in osmo-hnbgw, configure

  hnbgw
   pfcp
    netinst access my-ran
    netinst core my-core

and in osmo-upf, configure

  netinst
   add my-ran 10.9.8.7
   add my-core 1.2.3.4

In effect, all GTP tunnel endpoints towards the Access side will be
bound on 10.9.8.7, and all GTP tunnel endpoints towards the Core side
will be bound on 1.2.3.4.

Related: SYS#5895
Change-Id: Ief53dbfacf1645c32a07847d590c4884d4c8ca56
2024-01-21 02:48:38 +01:00
Neels Hofmeyr d482f8666c cfg: add 'hnbgw' / 'plmn MCC MNC'
According to 3GPP TS 25.413 8.26.2.2, "The RNC shall include the Global
RNC-ID IE in the RESET message." To be able to do that, osmo-hnbgw needs
to know the local PLMN.

Introduce explicit knowledge of the local PLMN by config, and use this
configured local PLMN in places where the local PLMN was so far derived
otherwise.

Subsequent patches will separately add the RNC-ID to the RANAP RESET
messages.

Since the PLMN is required to send correct RESET and RESET ACK, include
the new 'plmn' config item in all example configurations.

Related: SYS#6441
Change-Id: If404c3859acdfba274c719c7cb7d16f97d831a2a
2023-06-21 04:16:13 +02:00
Neels Hofmeyr 1b45792cbd cnpool: return Paging Resp to the exact CN link that Paged
Implement Paging record, remembering which CN link paged for which
mobile identity, for a set time period roundabout 15 seconds.

When receiving a Paging Response from a UE, connect it to the CN link
that sent the Paging.

Related: SYS#6412
Change-Id: I907dbcaeb442ca5630146f8cad40601c448fc40e
2023-06-21 04:16:13 +02:00
Neels Hofmeyr e14d3bd6fe detect in/active CN links by RANAP RESET
Implement cnlink FSM. This is a copy of osmo-bsc/bssmap_reset.c, except
for the connection success/failure counting.

The reason to exclude the CONN_CFM_FAILURE/_SUCCESS: it is a rather
experimental feature and should probably rather be covered by PCSTATE
notifications and protocol layer timeout tuning (e.g. SCCP ia timers).

Related: SYS#6412
Change-Id: Id3eefdea889a736fd5957b80280fa45b9547b792
2023-06-21 04:16:13 +02:00
Neels Hofmeyr fb627de4bb add ranap_domain_name()
As simple as it may seem, turning the RANAP CS/PS domain indicator to a
string so far is cumbersome. It is usually only CS or PS, but to be
accurate we should also expect invalid values. After the umpteenth
switch statement breaking up otherwise trivial code, I decided that yes,
after all, a value_string[] is a good idea even for just two enum values
that have only two-letter names.

Callers will follow in upcoming patches.

Related: SYS#6412
Change-Id: Ib3c5d772ccea8c854eec007de5c996d1b6640bc0
2023-06-21 04:16:13 +02:00
Neels Hofmeyr b1c0bb19e2 cnpool: add context_map_cnlink_lost() handling
When proper RANAP RESET handling is in place [1], a specific CN link may
be detected to be disconnected or reconnected at any point.
[1] Id3eefdea889a736fd5957b80280fa45b9547b792

When that happens, all related context maps should disconnect RUA and
SCCP links immediately. Add the context map side of this, so that
context_map_cnlink_lost() is ready to be dispatched in [1].

Related: SYS#6412
Change-Id: Ic0a6fcfb747dc093528ca2bd12a269ad390d465c
2023-06-21 04:16:10 +02:00
Neels Hofmeyr 11fa15e4d9 tweak lots of logging
Make lots of small tweaks in logging around RUA to SCCP context maps, CN
link selection and HNBAP:

- remove duplicate log context (e.g. LOGHNB() already shows the HNB Id)
- bring more sense into logging categories and levels / reduce noise
- add and clarify the details being logged at what points in time

Related: SYS#6412
Change-Id: I41275d8c3e272177976a9302795884666c35cd06
2023-06-16 04:25:31 +02:00
Neels Hofmeyr 15e79a2b7a make public: umts_cell_id_name()
Upcoming logging tweaks will call this from hnbgw_hnbap.c as well.
See I41275d8c3e272177976a9302795884666c35cd06

Related: SYS#6412
Change-Id: I499d18876685062d066a9a66d2def827efb77640
2023-06-16 04:07:24 +02:00
Neels Hofmeyr e513ffb983 add rate_ctr infra; add rate_ctrs for cnpool
Introduce rate counter stats to osmo-hnbgw.

Add the first rate counters -- they will be fed in upcoming commit
"cnpool: select CN link from pool by NRI or round robin"
I66fba27cfbe6e2b27ee3443718846ecfbbd8a974

Related: SYS#6412
Change-Id: I0132d053223a38e5756cede74106019c47ddcd94
2023-06-16 04:07:24 +02:00
Neels Hofmeyr 56323378db cnpool: extract Mobile Identity from RANAP payload
For InitialUE messages, decode the RANAP and NAS PDU to obtain the
Mobile Identity. Store it in map->l3.

A following patch will use this mobile identity to decide on which CN
links to pick from the CN pools for this subscriber.

Add the new information to LOG_MAP() as ubiquitous logging context.

Depends: libosmocore Ic5e0406d9e20b0d4e1372fa30ba11a1e69f5cc94
Related: SYS#6412
Change-Id: I373d665c9684b607207f68094188eab63209db51
2023-06-16 04:07:24 +02:00
Neels Hofmeyr 83dce9f7da add hnbgw_decode_ranap_co()
An upcoming patch will re-use this pattern of decoding RANAP and
attaching a talloc destructor to the result for the third time. The
caller will be in hnbgw_rua.c, so make this a "public" function.

Change-Id: I3695babfb083ed1f5fff700dcb7646fa95496a52
2023-06-02 17:11:25 +02:00
Neels Hofmeyr c957f4efd5 tdefs; combine timer groups 'ps' and 'cmap' to 'hnbgw'
Makes no sense to have groups with one member each.
Soon I will add a new timer that matches neither 'cmap' nor 'ps', and I
would have to open a third group with a single member.

Add a shim that redirects 'ps' to 'hnbgw'.
The group 'cmap' has not yet been released, so just drop that one
without a shim.

Change-Id: Ica6063fae4588b51897e5574d975b3185fa9ae80
2023-06-02 17:11:25 +02:00
Neels Hofmeyr 9ee468f5b2 cnpool: make NRI mappings VTY configurable
Implement only the VTY configuration part, applying NRI to select CN
links follows in I66fba27cfbe6e2b27ee3443718846ecfbbd8a974.

Use the osmo_nri_ranges API to manage each cnlink's NRI ranges by VTY
configuration.

Analogous to osmo-bsc
 MSC pooling: make NRI mappings VTY configurable
 4099f1db504c401e3d7211d9761b034d62d15f7c
 I6c251f2744d7be26fc4ad74adefc96a6a3fe08b0
plus
 vty: fix doc: default value for 'nri bitlen'
 f15d236e3eee8e7cbdc184d401f33b062d1b1aac
 I8af840a4589f47eaca6bf10e37e0859f9ae53dfa

Related: SYS#6412
Change-Id: Ifb87e01e5971962e5cfe5e127871af4a67806de1
2023-06-02 17:11:25 +02:00
Neels Hofmeyr 9901316371 cnpool: add multiple 'msc' and 'sgsn' cfg (use only the first)
Allow configuring multiple MSCs and SGSNs. Show this in new transcript
test cnpool.vty and in sccp.dot chart.

Still use only the first CN link; actual CN link selection from the pool
follows in I66fba27cfbe6e2b27ee3443718846ecfbbd8a974.

Config examples and VTY tests cfg files will be adjusted to the new cfg
syntax in patch If999b71a8a8237699f6ccfcaa31d1885e66c0518. Only the VTY
write() changes are visible here, to pass all transcript tests.

Related: SYS#6412
Change-Id: I5479eded786ec26062d49403a8be12967f113cdb
2023-06-02 17:11:22 +02:00
Neels Hofmeyr bd5901d3f9 cnpool prep: add SCCP_EV_USER_ABORT
To ease patch review, I decided to submit this separately from the
caller, which follows in subsequent patch
I5479eded786ec26062d49403a8be12967f113cdb

The new event will be dispatched when there are SCCP config changes
pending, and the human user types 'apply sccp' on the telnet VTY. The
aim is to disconnect all connections so we can create a new SCCP User.

Related: SYS#6412
Change-Id: Idff8e09b5328c904b175e4d4df7d4044a34f4a20
2023-06-02 16:46:12 +02:00
Neels Hofmeyr c6393a1f80 cnpool: split up context_map_find_or_create_by_rua_ctx_id()
Have separate find, alloc, set_cnlink functions instead of all in one.

For CN pooling, I need these operations to be separate:
- we need to decode the RANAP and L3 PDU only if no existing context_map
  was found (see I373d665c9684b607207f68094188eab63209db51 ).
- while decoding RANAP and L3 PDU, and while deciding which cnlink to
  use, I already want to use the logging context of LOG_MAP(). So I want
  to allocate a map first, and connect it to the selected cnlink later.

Related: SYS#6412
Change-Id: Ifc5242547260154eb5aecd3a9d9c2aac8419e8db
2023-06-02 16:46:10 +02:00
Neels Hofmeyr f3caea850b cnpool: allow separate cs7 for IuPS and IuCS
Prepare for CN pooling; allow using separate SS7 instances for IuCS and
IuPS.

New struct hnbgw_sccp_user describes a RANAP SCCP user, one per cs7.
Limit struct hnbgw_cnlink to describing a CN peer, using whichever SCCP
instance that is indicated by hnbgw_sccp_user.

Chart sccp.dot shows the changes made.

Related: SYS#6412
Change-Id: Iea1824f1c586723d989c80a909bae16bd2866e08
2023-06-01 01:41:35 +02:00
Neels Hofmeyr 548d3d2f66 use new osmo_sccp_instance_next_conn_id()
Change-Id: I0fc6e486ce5d68b06d0bc9869292f082ec7a67f0
2023-05-26 23:23:21 +02:00
Neels Hofmeyr aa99546d23 immediately SCCP RLSD on HNB re-register
The RUA side may disconnect
a) gracefully (RUA Disconnect with RANAP IU Release Complete) or
b) by detecting that the HNB is gone / has restarted

For a), the SCCP side waits for a RLSD from the CN that follows the IU
Release Complete.
For b), we so far also wait for a RLSD from CN, which will never come,
and only after a timeout send an SCCP RLSD to the CN.

Instead, for b), immediately send an SCCP RLSD.

To distinguish between the graceful release (a) and the disruptive
release (b), add new state MAP_RUA_ST_DISRUPTED on the RUA side,
and add new event MAP_SCCP_EV_RAN_LINK_LOST for the SCCP side.

Any non-graceful disconnect of RUA enters ST_DISRUPTED.
disrupted_onenter() dispatches EV_RAN_LINK_LOST to SCCP,
and SCCP directly sends RLSD to the CN without timeout.

These changes are also shown in doc/charts/hnbgw_context_map.msc.

Change-Id: I4e78617ad39bb73fe92097c8a1a8069da6a7f6a1
2023-05-26 23:23:21 +02:00
Neels Hofmeyr 164f3a67f1 unbloat: drop context_map_check_released()
When I implemented it, I thought context_map_check_released() would help
clarify context map deallocation, but instead it just bloats. Simplify
and tweak related comment.

Change-Id: I535780de0d3b09893ba89d66804e5e36b26db049
2023-05-26 23:23:21 +02:00
Neels Hofmeyr ade986280e drop dead code: cnlink.T_RafC
This looks like working RANAP RESET handling, but is in fact dead code.
Instead of testing and completing this implementation, I will copy the
RESET handling FSM from osmo-bsc, which is proven to work well.

See Id3eefdea889a736fd5957b80280fa45b9547b792
"detect in/active CN links by RANAP RESET"

The future patch will start to use transmit_rst(). So instead of
dropping now and resurrecting later, let's keep the code -- but it has
to be '#if 0'-ed to avoid compiler complaints about the unused function.

Related: SYS#6412
Change-Id: I7cacaec631051cf5420202f2f0dd9665a5565b17
2023-05-26 23:23:21 +02:00
Neels Hofmeyr 0484441530 move main() to separate file
Conform to recent osmocom practice by having an osmo_hnbgw_main.c file
for main() and its immediate infrastructure.

Prepares for adding a non-installed libhnbgw.la -- which must not
contain a main() function -- so that linking regression tests is easy.

Separating to a new file seems appropriate because hnbgw.c has gathered
a bunch of stuff that is not strictly main() related / might be useful
to access in a C test.

Change-Id: I5f26a6b68f0d380617d73371f40f3b9055cac362
2023-05-08 15:20:26 +02:00
Neels Hofmeyr f1d4e3b7dd simplify: one g_hnbgw as global state and root ctx
So far we jump through hoops everywhere to pass around osmo-hnbgw's
global singleton state to all code paths. Some files choose to spawn a
static "global" pointer. And we also have the talloc root tall_hnb_ctx.

Simplify:
- Have a single global g_hnbgw pointer. Drop all function args and
  backpointers to the global state.
- Use that global g_hnbgw as talloc root context, instead of passing a
  separate tall_hnb_ctx around.

(Cosmetic preparation for separate osmo_hnbgw_main.c file)

Change-Id: I3d54a5bb30c16a990c6def2a0ed207f60a95ef5d
2023-05-07 00:18:34 +02:00
Neels Hofmeyr 0b2983d1af eliminate function hnb_contexts()
It's really just a bloated llist_count() wrapper.
(Cosmetic preparation for separate osmo_hnbgw_main.c file)

Change-Id: Icd4100ddeb98f4dc2e2ec6de2d44a4b861a66078
2023-05-06 03:53:20 +00:00
Neels Hofmeyr 52f9e327bb drop empty hnbgw_rua_init()
Change-Id: I8e3ff452242d8baa527c2a61d182bbe9786341e3
2023-05-06 03:53:20 +00:00
Neels Hofmeyr 7992ce0163 tweak LOGHNB()
Add braces around context part.

In the HNBGW_Tests.ttcn output, I see this:

 DRUA DEBUG TTCN3 HNodeB transmitting RUA DirectTransfer

which reads like the hNodeB would transmit a RUA to us. Instead, this is
us sending RUA to the hNodeB, which is much clearer like this:

 DRUA DEBUG (TTCN3 HNodeB) transmitting RUA DirectTransfer

This matches the way we typically show context info in osmo logging.

Change-Id: If6f0c3ae81c737b7488fa93c435179dcf27a5c94
2023-03-09 03:24:30 +01:00
Neels Hofmeyr ed424d5be4 context map: introduce RUA and SCCP FSMs to fix leaks
Refactor the entire RUA <-> SCCP connection-oriented message forwarding:
- conquer confusion about hnbgw_context_map release behavior, and
- eradicate SCCP connection leaks.

Finer points:

== Context map state ==
So far, we had a single context map state and some flags to keep track
of both the RUA and the SCCP connections. It was easy to miss connection
cleanup steps, especially on the SCCP side.
Instead, the two FSMs clearly define the RUA and SCCP conn states
separately, and each side takes care of its own release needs for all
possible scenarios.
- When both RUA and SCCP are released, the context map is discarded.
- A context map can stay around to wait for proper SCCP release, even if
  the RUA side has lost the HNB connection.
- Completely drop the async "context mapper garbage collection", because
  the FSMs clarify the release and free steps, synchronously.
- We still keep a (simplified) enum for global context map state, but
  this is only used so that VTY reporting remains mostly unchanged.

== Context map cleanup confusion ==
The function context_map_hnb_released() was the general cleanup function
for a context map. Instead, add separate context_map_free().

== Free context maps separately from HNB ==
When a HNB releases, talloc_steal() the context maps out of the HNB
specific hnb_ctx, so that they are not freed along with the HNB state,
possibly leaving SCCP connections afloat.
(It is still nice to normally keep context maps as talloc children of
their respective hnb_ctx, so talloc reports show which belongs to
which.)

So far, context map handling found the global hnb_gw pointer via
map->hnb_ctx->gw. But in fact, a HNB may disappear at any point in time.
Instead, use a separate hnb_gw pointer in map->gw.

== RUA procedure codes vs. SCCP prims ==
So far, the RUA rx side composed SCCP prims to pass on:

 RUA rx ---SCCP-prim--> RANAP handling ---SCCP-prim--> SCCP tx

That is a source of confusion: a RUA procedure code should not translate
1:1 to SCCP prims, especially for RUA id-Disconnect (see release charts
below).
Instead, move SCCP prim composition over to the SCCP side, using FSM
events to forward:

 RUA rx --event--> RUA FSM --event--> SCCP FSM --SCCP-prim--> SCCP tx
         +RANAP             +RANAP              +RANAP

 RUA tx <--RUA---- RUA FSM <--event-- SCCP FSM <--event-- SCCP rx
          +RANAP             +RANAP              +RANAP

Hence choose the correct prim according to the SCCP FSM state.
- in hnbgw_rua.c, use RUA procedure codes, not prim types.
- via the new FSM events' data args, pass msgb containing RANAP PDUs.

== Fix SCCP Release behavior ==
So far, the normal conn release behavior was

 HNB                 HNBGW                   CN
  | --id-Disconnect--> | ---SCCP-Released--> |  Iu-ReleaseComplete
  |                    | <--SCCP-RLC-------- |  (no data)

Instead, the SCCP release is now in accordance with 3GPP TS 48.006 9.2
'Connection release':

 The MSC sends a SCCP released message. This message shall not contain
 any user data field.

i.e.:

 HNB                 HNBGW                    CN
  | --id-Disconnect--> | ---Data-Form-1(!)--> |  Iu-ReleaseComplete
  |                    | <--SCCP-Released---- |  (no data)
  |                    | ---SCCP-RLC--------> |  (no data)

(Side note, the final SCCP Release Confirm step is taken care of
implicitly by libosmo-sigtran's sccp_scoc.c FSM.)

If the CN fails to respond with SCCP-Released, on new X31 timeout,
osmo-hnbgw will send an SCCP Released to the CN as fallback.

== Memory model for message dispatch ==
So far, an osmo_scu_prim aka "oph" was passed between RUA and SCCP
handling code, and the final dispatch freed it. Every error path had to
take care not to leak any oph.
Instead, use a much easier and much more leakage proof memory model,
inspired by fixeria:
- on rx, dispatch RANAP msgb that live in OTC_SELECT.
- no code path needs to msgb_free() -- the msgb is discarded via
  OTC_SELECT when handling is done, error or no error.
- any code path may also choose to store the msgb for async dispatch,
  using talloc_steal(). The user plane mapping via MGW and UPF do that.
- if any code path does msgb_free(), that would be no problem either
  (but none do so now, for simplicity).

== Layer separation ==
Dispatch *all* connection-oriented RUA tx via the RUA FSM and SCCP tx
via the SCCP FSM, do not call rua_tx_dt() or osmo_sccp_user_sap_down()
directly.

== Memory model for decoded ranap_message IEs ==
Use a talloc destructor to make sure that the ranap_message IEs are
always implicitly freed upon talloc_free(), so that no code path can
possibly forget to do so.

== Implicit cleanup by talloc ==
Use talloc scoping to remove a bunch of explicit cleanup code. For
example, make a chached message a talloc child of its handler:
  talloc_steal(mgw_fsm_priv, message);
  mgw_fsm_priv->ranap_rab_ass_req_message = message;
and later implicitly free 'message' by only freeing the handler:
  talloc_free(mgw_fsm_priv)

Related: SYS#6297
Change-Id: I6ff7e36532ff57c6f2d3e7e419dd22ef27dafd19
2023-02-24 15:19:24 +01:00
Neels Hofmeyr 29f82a7b48 cosmetic: regroup members of hnbgw_context_map
Change the order in the struct definition so that the items for RUA and
for SCCP are grouped together. Tweak comments.

Change-Id: I84424617996d8e1a6814aa122e63454c0666b724
2023-02-23 02:03:07 +01:00
Neels Hofmeyr 60a52e4121 cosmetic: rename context_map_deactivate
Rename context_map_deactivate() to context_map_hnb_released().

This function is called only when the HNB is released / lost.

Change-Id: I6dcb26c94558fff28faf8f823e490967a9baf2ec
2023-02-23 01:15:54 +01:00
Neels Hofmeyr 43cc12bac3 various comment tweaks
Change-Id: Ie40aa672948062282c566c90300f6e96963e05ec
2023-02-21 00:57:03 +01:00
Neels Hofmeyr 9ffebe784d cosmetic: drop stray backslash
Change-Id: I311b62ff393c6520d72fc6aee21d86723487133f
2023-02-21 00:57:03 +01:00
Neels Hofmeyr 3f4d645890 Deprecate 'sccp cr max-payload-len', remove SCCP CR limit code
The SCCP CR payload length limit is now implemented in libosmo-sigtran
v1.7.0. The limit no longer needs to be enforced in osmo-hnbgw.

This reverts commit 2c91bd66a1,
except for keeping the cfg option, marked deprecated, and not doing
anything.

Fixes: OS#5906
Related: SYS#5968
Related: OS#5579
Depends: I174b2ce06a31daa5a129c8a39099fe8962092df8 (osmo-ttcn3-hacks)
Change-Id: I18dece84b33bbefce8617fbb0b2d79a7e5adb263
2023-02-15 03:34:43 +01:00
Neels Hofmeyr 8eefcbee92 UE state leak: when HNB re-registers, discard previous UE state
User reports show leaked UE contexts over time, in a scenario where HNB
regularly disconnect and reconnect.

So far, when we receive a HNB REGISTER REQ, we log as "duplicated" and
continue to use the hnb_context unchanged. But it seems obvious that a
HNB that registers does not expect any UE state to remain. It will
obviously not tear down UE contexts (HNBAP or RUA) that have been
established before the HNB REGISTER REQUEST. These hence remain in
osmo-hnbgw indefinitely.

When receiving a HNB REGISTER REQUEST, release all its previous UE
state. Allow the HNB to register with a clean slate.

The aim is to alleviate the observed build-up of apparently orphaned UE
contexts, in order to avoid gradual memory exhaustion.

Related: SYS#6297
Change-Id: I7fa8a04cc3b2dfba263bda5b410961c77fbed332
2023-02-11 03:32:34 +01:00
Neels Hofmeyr 87ecf69b55 fix SCCP conn leak on non-graceful HNB shutdown
Clean up SCCP connections when a HNB disconnects.

When a HNB disconnects, we clean up all RUA <-> SCCP connection state
for that HNB. In that cleanup, discarding the SCCP connection is so far
missing.

Add a flag indicating true between SCCP CC and DISCONNECT. Hence we can
tell during context_map_deactivate() whether the cleanup is graceful
(DISCONNECT already sent) or non-graceful (need to DISCONNECT).

Change-Id: Icc2db9f6c0b2d0a814ff1110ffbe5e8f7f629222
2023-01-20 20:30:03 +01:00
Pau Espin e62af4d46a Introduce support for libosmo-mgcp-client MGW pooling
Large RAN installations may benefit from distributing the RTP voice
stream load over multiple media gateways.

libosmo-mgcp-client supports MGW pooling since version 1.8.0 (more than
one year ago). OsmoBSC has already been making use of it since then (see
osmo-bsc.git 8d22e6870637ed6d392a8a77aeaebc51b23a8a50); lets use this
feature in osmo-hngw too.

This commit is also part of a series of patches cleaning up
libosmo-mgcp-client and slowly getting rid of the old non-mgw-pooled VTY
configuration, in order to keep only 1 way to configure
libosmo-mgcp-client through VTY.

Related: SYS#5091
Related: SYS#5987
Change-Id: I371dc773b58788ee21037dc25d77f556c89c6b61
2022-10-20 17:03:06 +02:00