In the downlink path, we cannot assume map->hnb_ctx is always
non-NULL. If the HNB has just disconnected, it might be NULL,
while we're still processing downlink messages that the CN sent
before it realized that the HNB was gone.
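A minimal sketch of the kind of guard meant here. The struct layouts and
the dl_can_forward() helper are simplified, hypothetical stand-ins, not
the actual osmo-hnbgw code:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for the osmo-hnbgw structures. */
struct hnb_context {
	int id;
};

struct context_map {
	struct hnb_context *hnb_ctx; /* NULL once the HNB has disconnected */
};

/* Return true if a downlink message may be forwarded to the HNB, false if
 * it has to be dropped because the HNB is already gone. */
bool dl_can_forward(const struct context_map *map)
{
	if (!map || !map->hnb_ctx)
		return false;
	return true;
}
```

A downlink handler would check this before dereferencing map->hnb_ctx,
and log and drop the message otherwise.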
Closes: SYS#7010
Change-Id: I9a304b9e0cbc18dbf7b699f4aae6b91ca0c16173
Fix a recently introduced problem with MGCP to osmo-mgw:
Send the first CRCX in recvonly mode, not sendrecv. osmo-hnbgw always
sends an additional MDCX including sendrecv mode anyway.
osmo-mgw currently forbids sending an initial CRCX in connection mode
'sendrecv', with this error message:
DLMGCP ERROR endpoint:rtpbridge/2@mgw CI:7F4C8EDD CRCX: selected connection mode type requires an opposite end! (mgcp_protocol.c:1090)
I am submitting an osmo-mgw patch to not fail there, but we want to and
can easily be compatible with current and earlier osmo-mgw:
Sending the initial CRCX in sendrecv was introduced in commit:
"drop legacy hack: do not start MGW endp in loopback mode"
da7d33e284
I0eca75d7abf66f8b9fde9c68ec10d4265f64a189
This patch has not been part of a release yet.
The intention of that commit was to get away from loopback mode. The
logical mode to pick instead indeed is sendrecv, but by that osmo-hnbgw
triggers the above osmo-mgw error.
Related: SYS#6974 SYS#6907
Related: osmo-mgw Ic089485543c5c97a35c7ae24fe0f622bf57d1976
Change-Id: I004f96ae36774ceb33f177c9f58f820fefa3ca14
Last year, I fixed some more ue_context leak situations.
But since we don't really use ue_context for anything, we could also
just drop this completely.
On HNBAP UE Register, we collect the ue_contexts in a ue_list. But we
never do anything with this list, at all. Furthermore, users are
reporting the list of ue_context growing indefinitely.
Simply drop the ue_context listing. Simply acknowledge all HNBAP UE
Register and DeRegister requests without storing any context IDs.
Change-Id: Ida7eadf36abcf465ae40003725c49e8e321a28c9
Make sure errors of getting counters from nft are logged.
Some context: we'll try again each X34 period, hence this is only a
problem when the error persists.
For example, when the get-counters thread is faster than the maintenance
thread can even create the nft table initially, this error will show.
We could add a definite check whether the maintenance thread has created
the tables yet, and wait for that event. But this complexity is not
really needed: it is fine to just fail getting counters once or twice.
Related: SYS#6773
Change-Id: I84340482e4a5bfcac158a21c9378a9511fa5ea10
We have a situation where HNBAP is not answered by osmo-hnbgw, and the
log is all silent. Add logging to a lot more of the possible HNBAP
failure paths to find out what is wrong.
Related: SYS#6810
Change-Id: I17d2809f59087d32e7c11a3ada1d3fadf6f0b660
- Set osmo_fsm_set_dealloc_ctx(OTC_SELECT) in osmo-hnbgw's main().
- Only dispatch RANAP when FSM instances aren't terminated.
This way we possibly pre-empt use-after-free crashes when deallocating
FSM "nests" in obscure corner cases.
Use-after-free is a general problem for FSM design. For this, we created
osmo_fsm_set_dealloc_ctx(): When an FSM is terminated, move it to a
separate talloc context, instead of being deallocated.
An actual use-after-free was observed as described in OS#6484, but that
needs a separate, orthogonal fix:
When the Iuh link is lost while the CN link is waiting for SCCP CC or
CREF -- the better solution is described in OS#6085: don't wait for CC
at all, just dispatch DISCONN to SCCP-SCOC.
So even though the code where a crash was observed will be removed, this
patch is a general safeguard against corner case crashes, improving
general stability.
Related: OS#6484
Change-Id: Ib41e1a996aaa03221e73643636140947ac8f99e2
Add timer hnbgw X35: Clean up all hNodeB persistent state after the
hNodeB has been disconnected for this long. Set to zero to never clear
hNodeB persistent state (default is 60*60*24*7 seconds = one week).
The idea is to bound memory usage at sites where hNodeB do not stay on
one fixed cell id, but use a different cell id after each Iuh
reconnection.
Related: SYS#6773
Related: osmo-ttcn3-hacks Ibec009203d38f65714561b7c28edbdbd8b34e704
Change-Id: Ic819d7cbc03fb39e98c204b70d016c5170dc6307
Properly represent the mnc_3_digits flag in umts_cell_id, and preserve
the three digit indicator as received on the wire.
Before this patch, the indicator for a three digit MNC received on the
wire was discarded, and instead g_hnbgw->config.plmn.mnc_3_digits was
used to convert any PLMN to string, whether it had 3 digits or not.
== hnb_persistent_list:
The cell id is used as primary key in the list of hnb_persistent
instances. This patch prevents any collisions between 2-digit and
3-digit MNCs (however unlikely in practice this may be).
== nft_kpi.c:
Just like the cell ids in hnb_persistent, the ids' strings are used as
primary key in nftables rulesets in nft_kpi.c -- also prevent MNC
collisions there:
Properly transport the 3-digit property in conversions:
struct umts_cell_id <-> string
Uncouple to_str conversion from the PLMN set in the hnbgw VTY cfg.
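A minimal sketch of a string conversion that keeps the 3-digit flag in
both directions. The struct, the function names and the string format
are hypothetical simplifications for illustration, not the actual
umts_cell_id API:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced variant of a cell id carrying the MNC digit flag. */
struct cell_id_sketch {
	uint16_t mcc;
	uint16_t mnc;
	bool mnc_3_digits; /* as received on the wire, not taken from config */
	uint16_t lac;
	uint8_t rac;
	uint16_t sac;
	uint32_t cid;
};

/* Print the MNC with 2 or 3 digits, depending on the stored flag. */
int cell_id_sketch_to_str(char *buf, size_t len, const struct cell_id_sketch *c)
{
	return snprintf(buf, len, "%03u-%0*u-L%u-R%u-S%u-C%u",
			(unsigned)c->mcc, c->mnc_3_digits ? 3 : 2, (unsigned)c->mnc,
			(unsigned)c->lac, (unsigned)c->rac, (unsigned)c->sac,
			(unsigned)c->cid);
}

/* Parse the string back; the number of MNC digits sets the flag.
 * Returns 0 on success, -1 on malformed input. */
int cell_id_sketch_from_str(struct cell_id_sketch *c, const char *s)
{
	char mcc[4], mnc[4];
	unsigned lac, rac, sac, cid;
	int end = 0;

	if (sscanf(s, "%3[0-9]-%3[0-9]-L%u-R%u-S%u-C%u%n",
		   mcc, mnc, &lac, &rac, &sac, &cid, &end) != 6 || s[end] != '\0')
		return -1;
	if (strlen(mnc) < 2 || lac > 0xffff || rac > 0xff || sac > 0xffff)
		return -1;

	c->mcc = (uint16_t)strtoul(mcc, NULL, 10);
	c->mnc = (uint16_t)strtoul(mnc, NULL, 10);
	c->mnc_3_digits = (strlen(mnc) == 3);
	c->lac = (uint16_t)lac;
	c->rac = (uint8_t)rac;
	c->sac = (uint16_t)sac;
	c->cid = cid;
	return 0;
}
```

With this, MNC "01" and MNC "001" produce different strings and hence
different keys, independent of any PLMN configured elsewhere.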
Related: OS#6457
Change-Id: Id9a91c80cd2745424a916aef4736993bb7cd8ba0
Prepare for adding proper mnc_3_digits support to struct umts_cell_id.
Show current behavior of the umts_cell_id <-> string conversions.
Show two expected errors in umts_cell_id_test.ok: the three-digit MNC
with leading zeros is lost (because the g_hnbgw->config.plmn has
mnc_3_digits == false).
The expected errors will be fixed in the upcoming patch
Id9a91c80cd2745424a916aef4736993bb7cd8ba0
Related: SYS#6773
Change-Id: Ibbb61a2c53a11dea794f451d3074bc9ba50862fe
A subsequent patch adds umts_cell_id_to_str_buf(), and before the old
foo_name() pattern spreads further, I'd rather rename it.
Rationale:
- There is a umts_cell_id_from_str() function.
- The foo_name() is an old pattern, we prefer foo_to_str() now.
- Contained within osmo-hnbgw, no API problems.
Change-Id: I3124d1f5e634bc895ec347cb1a9816789fd9ab69
Use hashing like the linux kernel.
Related: SYS#6773
Depends: I0c9652bbc9e2a18b1200e7d63bb6f64ded7d75fa
Change-Id: I5441db4293dc6b57a1c606ef830656fa9fa01943
Add missing stat_item free in hnb_persistent_free().
We recently fixed a rate_ctr leak in
I14e050bfb91b993f194e3800eacdc0d10f2b1a4e, but missed the also leaking
stat_item.
Particularly relevant with upcoming patch
Ic819d7cbc03fb39e98c204b70d016c5170dc6307 -- testing that patch revealed
the leak.
Related: osmo-ttcn3-hacks Ibec009203d38f65714561b7c28edbdbd8b34e704
Change-Id: I7326c53d595dce7b442eced89ff8f4b972bd2a82
Mainly the new nft counters do a lot of hnb_persistent lookups by cell
id.
Add a hashtable to optimize looking up hnb_persistent instances. So far
we iterate the linear list of hnb_persistent for each and every counter
returned from nft_kpi.c.
Also improves lookups for HNBAP HNB* operations (rare).
A follow-up patch uses a better hash function from libosmocore
(jhash.h).
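A minimal sketch of the idea: a fixed number of buckets with chained
entries, so that a lookup only walks one short chain instead of the
whole list. All names, the bucket count and the FNV-1a hash are
illustrative assumptions; the actual patch uses libosmocore's hashtable
helpers, not this code:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HNBP_BUCKETS 64 /* hypothetical size; must be a power of two */

/* Hypothetical, reduced stand-in for hnb_persistent. */
struct hnbp_sketch {
	char cell_id[48];
	struct hnbp_sketch *next; /* chain within one bucket */
};

struct hnbp_table {
	struct hnbp_sketch *bucket[HNBP_BUCKETS];
};

/* FNV-1a, used here only as a simple stand-in hash function. */
static uint32_t hash_str(const char *s)
{
	uint32_t h = 2166136261u;
	for (; *s; s++) {
		h ^= (uint8_t)*s;
		h *= 16777619u;
	}
	return h;
}

/* Insert at the head of the bucket selected by the hash of the cell id. */
void hnbp_table_add(struct hnbp_table *t, struct hnbp_sketch *e)
{
	uint32_t i = hash_str(e->cell_id) & (HNBP_BUCKETS - 1);
	e->next = t->bucket[i];
	t->bucket[i] = e;
}

/* Walk only the one bucket the cell id hashes to. */
struct hnbp_sketch *hnbp_table_find(const struct hnbp_table *t, const char *cell_id)
{
	uint32_t i = hash_str(cell_id) & (HNBP_BUCKETS - 1);
	struct hnbp_sketch *e;

	for (e = t->bucket[i]; e; e = e->next) {
		if (!strcmp(e->cell_id, cell_id))
			return e;
	}
	return NULL;
}
```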
Related: SYS#6773
Change-Id: Iecb81eba28263ecf90a09c108995f6fb6f5f81f2
Add optional feature: retrieve GTP-U traffic counters per hNodeB (not
per individual subscriber!) using nftables, to provide new rate_ctr
stats.
This is a "workaround" to get performance indicators per hNodeB, without
needing a UPF that supports URR.
When an hNodeB registers, set up nftables rules to count GTP-U packets
(UDP port 2152) to and from that hNodeB's address -- we are assuming
that it is the same address that Iuh is connecting from.
From the per-hNodeB packet and byte counters from nftables, also derive
a "UE bytes" counter, which is counting only the GTP-U payload. Assume
IP header of 20 bytes; UDP and GTP-U headers are 8 bytes each:
ue_bytes = total_bytes - packets * (20 + 8 + 8)
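The formula above can be sketched as follows. The function and macro
names are made up for illustration, and the clamp to zero on
inconsistent counters is an added assumption, not something this patch
describes:

```c
#include <stdint.h>

#define IPV4_HDR_LEN 20 /* assumed: IPv4 header without options */
#define UDP_HDR_LEN   8
#define GTPU_HDR_LEN  8 /* mandatory GTP-U header, no extensions */

/* Derive the GTP-U payload byte count from nft packet and byte counters. */
uint64_t ue_bytes(uint64_t total_bytes, uint64_t packets)
{
	uint64_t overhead = packets * (IPV4_HDR_LEN + UDP_HDR_LEN + GTPU_HDR_LEN);

	/* Assumption: clamp instead of wrapping around if the counters are
	 * inconsistent (e.g. read at slightly different times). */
	if (overhead > total_bytes)
		return 0;
	return total_bytes - overhead;
}
```

For example, 10 packets with 1360 bytes in total carry
1360 - 10 * 36 = 1000 bytes of UE payload.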
Query these periodically, as configurable by new timer X34. Default is
one second of wait time between querying counters (excluding the time it
takes to retrieve and update the counters).
Add compile-time switch --enable-nftables, to build with/without
external dependency libnftables. Default is without, as before.
Add jenkins axis NFTABLES to switch --enable-nftables.
Add cfg file option 'hnbgw' / 'nft-kpi' to enable use of nftables.
This requires osmo-hnbgw to be run with cap_net_admin.
The VTY config commands are always visible -- simplifies VTY testing.
Refuse to start osmo-hnbgw when the config requests nft-kpi but
osmo-hnbgw was built without --enable-nftables.
Do nft commands in 2 separate threads. Run the same request queue
implementation twice, with two thread workers to handle them:
- one thread receives all requests to init the nft table, add and remove
hNodeB counters, and start and stop counting for a specific hNodeB.
- Another thread handles all retrieval and parsing of counters from nft.
The main() thread hence never blocks for nftables commands, and services
the responses from nft when they are ready, via an osmo_it_q registered
in the main() select loop.
Persistently keep an nftables named counter for each seen hNodeB cell id
in the nftables ruleset, for the lifetime of a hnb_persistent instance
that holds the target rate_ctrs.
Add the rules to feed into these persistent counters to the ruleset when
the particular cell attaches and detaches via HNBAP HNB (De-)Register.
On hnb_persistent_free(), remove all items relating to this cell id from
nftables, including the persistent named counters.
Loosely related: upcoming patches will implement
- a hashtable for faster cell id lookup (important for updating
counters)
Iecb81eba28263ecf90a09c108995f6fb6f5f81f2
- proper MNC-3-digit support in cell ids (better have a 100% correct
primary key).
Id9a91c80cd2745424a916aef4736993bb7cd8ba0
- idle timeout for disconnected hnbp, so we are sure stale state does
not build up for eternity.
Ic819d7cbc03fb39e98c204b70d016c5170dc6307
Related: SYS#6773
Related: OS#6425
Change-Id: Ib2f0a9252715ea4b2fe9c367aa65f771357768ca
The variable was left unused by recent patch
I3e1ad7a2aa71674a22a27c31512600f2de139032 aka
a5974d7906
Change-Id: I871bc43f6f47d4b78fbf88826615f2dbb8e1f807
From an operational perspective, it may be interesting to know how many
LU/RAU/Attach attempts, rejects and accepts are happening in a given
hNodeB. Let's add some common infrastructure for DTAP related
statistics as well as some initial counters.
Related: SYS#6885
Depends: osmo-iuh.git Change-Id I7dea74102da8b610ff2a310c5814f5c89f08e7a6
Change-Id: I3e1ad7a2aa71674a22a27c31512600f2de139032
Do not attempt to change permissions/ownership if the package gets
upgraded from a version higher than the next release.
Do not fail if the user deleted the config file.
Be verbose when changing permissions.
Related: OS#4107
Change-Id: I1bcbe414fd18101e4d875a16539deab7baf9cb5f
* Explicitly chown /var/lib/osmocom to osmocom:osmocom, instead of
relying on systemd to do it when the service starts up. This does not
work with the systemd versions in debian 10 and almalinux 8.
* deb: Use "useradd" instead of the interactive "adduser" perl script
from Debian. This makes it consistent with how we do it in rpm, and
avoids the dependency on "adduser".
* deb: Consistently use tabs through the file, instead of mixing tabs
and spaces.
* deb: Remove support for the "dpkg-statoverride --list" logic. This
seems to be a rather obscure feature to override permissions for
certain files or directories, for which it does not seem to be a good
idea to make the postinst script less maintainable. Something similar
can be achieved by using your own Osmocom config file in a different
path with different permissions.
Related: OS#4107
Change-Id: I6dd0205fb65d4ad5a79821c111865e67fb293a73
We used to tell osmo-mgw to create an IuUP endpoint in loopback mode, in
order to hack it into responding to an IuUP Initialization. The loopback
mode here in osmo-hnbgw is a leftover from that hack. Drop it.
Change-Id: I0eca75d7abf66f8b9fde9c68ec10d4265f64a189
Create osmocom user & group during package installation.
Fix the configuration dir/files permission to match.
Related: OS#4107
Tweaked-By: Oliver Smith <osmith@sysmocom.de>
Change-Id: Ife9433291ae03392ae114ebda418bce8cc93fe3b
It is interesting to know if a release was normal (as expected/requested
by the NAS layer, typically indicating a user-requested call end) or
abnormal (radio failure, pre-emption or whatever other event that the
user did not expect).
Related: SYS#6773
Change-Id: Idd2f845b7db448064b693ac1efdc8db006a47a11
Whenever we receive a message and cannot decode the most basic IEs,
or receive an unknown/unsupported procedure code, we should respond
with an ErrorIndication in order to inform the peer.
Change-Id: I7aaa66f83f62ee1b5ba5204248e9f4cc754263ed
If we receive a procedure (like UE-REGISTER) in a state
where it's not permitted (e.g. HNB not registered), we should
send a UE-REGISTER-REJ with proper cause value, rather than not
sending any response at all.
Change-Id: I300db368a3d1d2fb5967f69f2ed4ac90ecf85e75
We always should respond to a UE-REGISTER-REQ. Either it's an ACK
or we must send a REJ. There should not be any "quiet" error cases
where we don't respond at all:
* send general Error Indication at a point where we cannot decode the
UE-Identity-IE (which is mandatory in UE-REGISTER-REJECT)
* send UE-REGISTER-REJECT with matching cause value whenever we have
the decoded UE Identity around
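The rule above can be sketched as a small decision function. The enum
and function names are hypothetical and only illustrate that every
request maps to exactly one response:

```c
#include <stdbool.h>

/* Hypothetical response kinds for a received UE-REGISTER-REQ. */
enum ue_reg_response {
	UE_REG_ACCEPT,
	UE_REG_REJECT,           /* needs the decoded UE Identity */
	UE_REG_ERROR_INDICATION, /* fallback when UE Identity is unavailable */
};

/* Pick the response: there is no path that sends nothing at all. */
enum ue_reg_response ue_register_response(bool identity_decoded, bool request_ok)
{
	if (!identity_decoded)
		return UE_REG_ERROR_INDICATION;
	return request_ok ? UE_REG_ACCEPT : UE_REG_REJECT;
}
```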
Change-Id: I5338a1324545b2c6d31fb45f1e69fee45842e207
When we reject the HNB-REGISTER-REQ, let's use a cause value that is as
specific as possible, to let the peer know why we rejected registration.
Change-Id: Iadddd26b751a9fd80c829068792aa93cd538c43d
This way the caller can provide the cause value to be used during
reject.
Requires: osmo-iuh I7db92b51847c282d23d568970dfd2bedecdea486
Change-Id: Ic83674523c0326a7ae51fb176bddfd6641ed3ac4
Prior to this patch we always decoded CS RANAP, but only decoded PS
RANAP in case PFCP support was enabled. This meant that PS KPIs
were only counted when PFCP was enabled, too.
Let's move to a mode where we unconditionally decode RANAP and always
call the KPI module for updating the rate counters.
Change-Id: I6054b6efcc202ebd71cd6e135e49c279ba616a01
Usually, a hnb_context still has a hnb_persistent associated at release
time. But that is not guaranteed.
See also further below, where the function tests for ctx->persistent
correctly.
Change-Id: I77ddd627ebfe96c7674c6a197af8b2c4b1a4024c
Add rate_ctr based statistics for RAB activation, deactivation and
failures. This requires us to parse RANAP in both uplink and downlink
and to iterate deep into the setup/modify/release/failure lists.
Given the way the protocol works, the only way to distinguish an
activation from a modification is that sender and recipient know
whether a given RAB is already active at the time of the message. So we
also need to track the activation state of each RAB.
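A minimal sketch of such state tracking, assuming one flag per 8-bit
RAB-ID. All names are hypothetical and do not reflect the actual KPI
code:

```c
#include <stdbool.h>
#include <stdint.h>

#define RAB_ID_COUNT 256 /* RAB-ID is an 8-bit value */

/* Which counter a successful RAB operation should feed. */
enum rab_event {
	RAB_EV_NONE,
	RAB_EV_ACTIVATED,
	RAB_EV_MODIFIED,
	RAB_EV_RELEASED,
};

/* Hypothetical per-connection state: one activation flag per RAB-ID. */
struct rab_state_sketch {
	bool active[RAB_ID_COUNT];
};

/* A successful setup-or-modify item is an activation only if the RAB was
 * not active before; otherwise it is a modification. */
enum rab_event rab_setup_or_modify_ok(struct rab_state_sketch *st, uint8_t rab_id)
{
	if (st->active[rab_id])
		return RAB_EV_MODIFIED;
	st->active[rab_id] = true;
	return RAB_EV_ACTIVATED;
}

/* A release only counts if the RAB was actually active. */
enum rab_event rab_release_ok(struct rab_state_sketch *st, uint8_t rab_id)
{
	if (!st->active[rab_id])
		return RAB_EV_NONE;
	st->active[rab_id] = false;
	return RAB_EV_RELEASED;
}
```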
Depends: osmo-iuh.git Change-Id I328819c650fc6cefe735093a846277b4f03e6b29
Change-Id: I198fa37699e22380909764de6a0522ac79aa1d39
The mgw_fsm only supports CS RABs in the CS domain; let's add some
ASSERTs to make sure the impossible doesn't happen.
Change-Id: I264c4b3da17b6f59ebcdd02031318402a483041a
In uplink we use *ranap_rx_udt_ul*, so let's use the same naming
pattern for processing downlink unit-data messages to make things more
consistent. Also, make sure udt is always part of functions that only
handle unitdata - not to be confused with connection-oriented messages.
Change-Id: I1792e4c2cdce145ae906c181898163bcda36328d
Adding counters for the number of successful pagings is much harder,
as we currently don't parse connection-oriented DL RANAP or any
L3 NAS in it.
Change-Id: I7648caa410dba8474d57121a8128579ba200b18f
Let's not make the functions appear more generic than they are: They
all explicitly only support uplink so far.
Change-Id: I7db0d933a8f17f8c410141f43dab12b8c19fc8ae
Those functions have always been handling only unit-data in uplink
direction, so let's reflect that in the function name to prevent
anyone assuming they process connection-oriented RANAP and/or
the downlink direction.
Change-Id: I29e8176ac19b2e7390e5950b8d0944c8961e491f