Found by clang:
warning: implicit conversion from enumeration type
'enum lchan_activate_for' to different enumeration type
'enum assign_for' [-Wenum-conversion]
This is indeed a bug, because the two enum items have different values:
* ACTIVATE_FOR_VTY (from enum lchan_activate_for) is 4,
* ASSIGN_FOR_VTY (from enum assign_for) is 3.
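
The mismatch can be reproduced with a minimal sketch. The enums below are
hypothetical stand-ins: only the values of the two *_FOR_VTY items (4 and 3)
are taken from this message, every other enumerator and both helper
functions are placeholders made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two enums. Only the positions of the
 * *_FOR_VTY items (4 and 3) come from the commit message. */
enum example_activate_for {
	EX_ACTIVATE_FOR_NONE,	/* 0 */
	EX_ACTIVATE_FOR_A,	/* 1 */
	EX_ACTIVATE_FOR_B,	/* 2 */
	EX_ACTIVATE_FOR_C,	/* 3 */
	EX_ACTIVATE_FOR_VTY,	/* 4 */
};

enum example_assign_for {
	EX_ASSIGN_FOR_NONE,	/* 0 */
	EX_ASSIGN_FOR_A,	/* 1 */
	EX_ASSIGN_FOR_B,	/* 2 */
	EX_ASSIGN_FOR_VTY,	/* 3 */
};

/* Returns true if the assignment was requested from the VTY. */
bool ex_assign_is_for_vty(enum example_assign_for reason)
{
	return reason == EX_ASSIGN_FOR_VTY;
}

/* Buggy pattern: a constant of the wrong enum lands in an assign_for slot.
 * The explicit cast only silences the warning, the value (4) is still not
 * the one the callee compares against (3). */
bool ex_buggy_call(void)
{
	return ex_assign_is_for_vty((enum example_assign_for)EX_ACTIVATE_FOR_VTY);
}

/* Fixed pattern: use the constant of the matching enum. */
bool ex_fixed_call(void)
{
	return ex_assign_is_for_vty(EX_ASSIGN_FOR_VTY);
}
```

With these placeholder definitions, ex_buggy_call() returns false while
ex_fixed_call() returns true, which is the kind of silent misbehaviour the
warning points at.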
Change-Id: I44544d4577833e0aed62b07d0c7c1c2821b05dd4
Fault Reports are commonly observed with a TLV id 0xd2,
as are reports with up to 20 TLVs.
Let's not have these cause logging at level ERROR.
Closes: OS#5593
Change-Id: Ibe0b38835362c59d1576a206b2f64cea4427295f
Current regex 'handover' is way too restrictive because it completely
forbids the use of word 'handover'. Adding new VTY commands with this
word in the syntax makes this VTY test fail.
Use regex '^\s+handover', which only matches lines starting with
some whitespace and the word 'handover'. Lines simply containing
the word 'handover' will be ignored.
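
The difference between the two patterns can be sketched with the POSIX
regex API. This is an illustration only and independent of whatever regex
engine the VTY test itself uses. POSIX extended regular expressions have no
portable '\s', so the sketch assumes '[[:space:]]' as an equivalent; the
helper name and the sample lines are hypothetical.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if 'line' matches the POSIX extended regex 'pattern'.
 * A pattern that fails to compile is treated as "no match". */
bool ex_line_matches(const char *pattern, const char *line)
{
	regex_t re;
	bool matched;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return false;
	matched = regexec(&re, line, 0, NULL, 0) == 0;
	regfree(&re);
	return matched;
}
```

For a made-up line such as "  example-command handover foo", the pattern
"handover" matches, whereas "^[[:space:]]+handover" does not. A line such as
"  handover 1" is matched by both.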
Change-Id: I8a1550c6c97437832e05b6b4bebbcc33c2fa3d46
Related: SYS#5460
This brings our README in line with the various other osmo-*
projects such as osmo-bts:
* use markdown formatting
* links to mailing list, documentation, git repo, ...
* reference the method of contribution + code review
Change-Id: I201bf47550a8fea500925205e0de1060d58d6136
This way the CBSP peer knows the state of specific cells and can avoid
sending messages to inoperative ones, etc.
Related: SYS#5910
Change-Id: I94f0a1ac3c59cffe5af57f972d5d96fc92281d34
When calculating average lchan duration based on the new stats for
BTS_CTR_CHAN_{TCH,SDCCH}_ACTIVE_MILLISECONDS_TOTAL there are
discrepancies which emerge. Specifically in bandwidth-constrained
environments, there are still-unknown failure states which can
occur that cause the TCH or SDCCH activity count to increment but
zero milliseconds of activity on the lchan to accumulate. This
portrays a failure as a success.
These new fully-established stats are intended to provide a more
accurate denominator when calculating average lchan duration as
they are incremented in proximity to the duration timestamp
initialization.
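
The effect of the denominator can be shown with a small sketch. The helper
name and all figures are hypothetical and only illustrate the arithmetic;
they are not measurements.

```c
#include <stdint.h>

/* Average duration in milliseconds; returns 0 when nothing was counted,
 * to avoid a division by zero. */
uint64_t ex_avg_duration_ms(uint64_t total_ms, uint64_t count)
{
	if (count == 0)
		return 0;
	return total_ms / count;
}
```

Assume 60000 ms of accumulated activity, 10 counted activations and 6 fully
established lchans (4 activations accumulated zero milliseconds). Dividing
by the activation count yields 6000 ms, dividing by the fully-established
count yields 10000 ms, which reflects the lchans that actually carried
traffic.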
Change-Id: I417940ad9479719f5324fb12d45883cd3cb2c578
This allows running CBCH/ETWS related procedures only when the CBCH
towards MS under that cell is operative.
This also allows providing awareness of per-cell status to the CBSP peer
as required per specs.
Related: SYS#5910
Change-Id: Ia93919be94132fc010acb5bbfef0a6fd51c42981
For statistical clarity and site tuning, it is sometimes
desirable to completely disable the use of TCH for signaling.
In the existing version of this VTY command, there is no way to
accomplish this. We can only restrict the use of TCH for signaling
that is not voice related.
This patch deprecates 'allow-tch-for-signalling (0|1)' and
adds 'tch-signalling-policy (never|emergency|voice|always)' to
provide more options.
Change-Id: I4459941ddad4e4a3bec8409b180d9a23a735e640
This way we separate all the VTY boilerplate from the actual logic, as
we usually do in all other subsystems.
Change-Id: Ifc7d1693d745dd2a3c31e3ee9610d8c634b50812
Drop "to this MSC" from the NRI_STR, as it is not only used for MSC
specific configuration, but also in cfg_net_nri_* which affect all MSCs.
Drop "for this MSC" from the description of cfg_net_nri_null_del, it
affects all MSCs (unlike cfg_msc_nri_del).
Change-Id: Ic8888775a965b6d607af51b9359bd8ffc2834e16
The all_allocated_update_bsc() does inefficient iterating to count
active/inactive lchans, which scales badly for high numbers of TRX
managed by osmo-bsc.
We need to update the all_allocated flags immediately (periodic counting
alone would suffer from undersampling), so, until now, we are calling
this inefficient function every time a channel state changes.
Instead of iterating all channels for any chan state changes anywhere,
keep global state of the current channel counts, and on channel state
change only update those ts, trx, bts counts that actually change.
A desirable side effect: for connection stats and handover decision 2,
we can now also use the globally updated channel counts and save a bunch
of inefficient iterations.
To get accurate channel counts at all times, spread around some
chan_counts_ts_update() calls in pivotal places. It re-counts the given
timeslot and cascades counter changes, iff required.
Just in case I missed some channel accounting, still run an inefficient
iterating count regularly that detects errors, logs them and fixes them.
No real harm done if such an error appears. None show in ttcn3 BSC_Tests.
It is fine to do the inefficient iteration once per second; channel
state changes can realistically happen hundreds of times per second.
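
The cascading idea can be sketched as follows. This is a simplified
illustration, not the code of this patch: all struct and function names are
hypothetical, and only a single "used" counter per level is tracked.

```c
#include <stdbool.h>

#define EX_MAX_LCHANS 8

/* Hypothetical, heavily reduced object tree: bts <- trx <- ts. */
struct ex_bts { int used; };
struct ex_trx { int used; struct ex_bts *bts; };
struct ex_ts {
	bool lchan_in_use[EX_MAX_LCHANS];
	int used;		/* count cached at the last update */
	struct ex_trx *trx;
};

/* Re-count one timeslot and cascade the difference upwards.
 * Returns true iff any counter changed. */
bool ex_chan_counts_ts_update(struct ex_ts *ts)
{
	int now_used = 0;
	int delta;
	int i;

	for (i = 0; i < EX_MAX_LCHANS; i++) {
		if (ts->lchan_in_use[i])
			now_used++;
	}

	delta = now_used - ts->used;
	if (delta == 0)
		return false;	/* nothing changed, nothing to cascade */

	ts->used = now_used;
	ts->trx->used += delta;
	ts->trx->bts->used += delta;
	return true;
}
```

Only the timeslot that changed is iterated; the trx and bts counters are
adjusted by the difference instead of being re-counted from scratch.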
Related: SYS#5976
Change-Id: I580bfae329aac8d4552723164741536af6512011
Reduce some code dup in all_allocated accounting and cosmetically
prepare for upcoming performance fix.
Have a struct all_allocated, allow easy re-use of function
all_allocated_update().
Rename function to all_allocated_update_bsc(). Upcoming patch will also
add all_allocated_update_bts().
Related: SYS#5976
Change-Id: Id7a82c65d56a87818fc35bbeedf67e2af2f89f11
ts_is_usable() returns the current state; logging is the job of calling
functions. An upcoming patch adds some calls to ts_is_usable(); this
avoids the log flaring up with useless messages.
Related: SYS#5976
Change-Id: I0635c47609fd7c7d0195b6658b7da231d6527b4b
It was described in [1] that the NM FSM failed to trigger the
S_NM_RUNNING_CHG signal when locking/unlocking the TRX.
That's because current osmo-bts doesn't fully conform to TS 52.021 and
it doesn't go back to Op=Disabled Avail=Dependency when becoming
Admin=Locked. It's true though that TS 52.021 sec 5.3.1 is not really
helpful since it doesn't explicitly state that specific object should go
into Disabled Dependency, despite saying it for most of the other ones.
Hence, let's account for both possibilities at the BSC side.
[1] https://gerrit.osmocom.org/c/osmo-bsc/+/28205
Related: OS#5576
Change-Id: Ifbdc066fd88bdbf826800d14524e74416815b625
Since b7ef6884f9, the state is updated
before triggering the signal S_NM_STATECHG, so the warning no
longer holds true.
Change-Id: I7b7dd30b4fcdc92febca42e3e6a75e6f98e184ff
Add missing conn->assignment.created_ci_for_msc to
gscon_forget_mgw_endpoint_ci().
Before this patch, when assignment.created_ci_for_msc lingers after a
DLCX, it can cause a use-after-free on assignment_reset(). Possible
scenario is rx BSSMAP Clear Cmd during ongoing Assignment.
In assignment_reset(), locally cache the ci pointer, because
gscon_forget_mgw_endpoint_ci() now NULLs created_ci_for_msc.
Related: OS#5572
Change-Id: If89610020f47fd6517081dd11b83911b043bd0f1
VAMOS lchans are behind the primary ones in the ts->lchan[] array.
For example, for TCH/F, there is a primary ts->lchan[0] and a VAMOS
ts->lchan[1]. We should print 'ss 0' for both of them.
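
One way to picture the mapping is a small helper that folds the array index
back onto the primary range. This is a sketch under assumptions, not the
code of this patch: the helper name is hypothetical, and the case with two
primary lchans is an assumed extension of the TCH/F example above.

```c
/* Map an index in ts->lchan[] to the sub-slot number to print.
 * 'n_primary' is the number of primary lchans on the timeslot; the VAMOS
 * lchans are assumed to follow them in the same order.
 * Returns -1 on invalid input. */
int ex_lchan_subslot(int idx, int n_primary)
{
	if (idx < 0 || n_primary <= 0)
		return -1;
	return idx % n_primary;
}
```

With one primary lchan (TCH/F), indexes 0 and 1 both map to 'ss 0'. With
two primary lchans, indexes 2 and 3 would map to 0 and 1.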
Change-Id: I8e7a5a2ecc9b9a33e3ddb76cb1bc04d7802fd320
This patch adds two stats which track cumulative lchan lifetime by
type TCH and SDCCH. These new counters will accomplish two things:
1) Provide a glanceable way to see if lchan durations look healthy. When
examining a site, short-lived (<5s) and long-lived (>30s) TCH lchans
are difficult to tell apart. If we only see short-lived TCH lchans,
there is most likely an RF or signaling problem to investigate. This
new counter will expose channel ages in the VTY output.
2) Provide a more accurate count for Erlangs per site. Currently, we
are basing Erlangs on active TCH channel counts per stats period. This
method skews high very quickly. Each active TCH in that period
translates into the full 10s of activity. This counter should improve
accuracy by two orders of magnitude.
Change-Id: Ie3771233ecbd4bc24a24fb22c1064a18e7b8b2b0
As per TS 48.049 Table 8.1.3.1.1 the WRITE-REPLACE message always
has a Warning Security Information IE if it relates to ETWS. This
is also implemented in the libosmocore CBSP parser.
As the previous Change Id369bb3676ba279bafc234378fbe21dbc7b0614b has
pointed out, the CBSP parser structure doesn't even permit any way
of handing a decoded message to us without the warning_sec_info
static struct member.
So as a result, there's also no need to dynamically allocate
bts_etws_state.input.sec_info via talloc. We can have it in-line
as a static struct member and reduce code complexity and runtime
memory allocations.
Change-Id: Ib1b8e4af37b1f9f9398b81dad29942e82218c70b
The proper way to fix this is having a use count on the SCCP conn, one
each for a busy lchan and a busy Location Request. That would require a
bunch more work and testing.
This patch is the least-effort way to avoid the following scenario:
- Emergency call is started;
- Location Request is started to locate the emergency;
- lchan releases early for any reason;
- Perfectly fine Location Request gets canceled by Clear Request;
- The information was there, but we did not forward the location;
- No help at emergency because of my code.
Allow Location Request to complete for these cases:
- rx RLL REL IND (or any other reason for gscon_lchan_releasing())
- rx RSL CONN FAIL
Related: SYS#5912
Related: Idea690a4aa4aecbe4642a16e96d086cc0538564a (osmo-ttcn3-hacks)
Change-Id: Ib44dd05b0adee84234f671313b156ff6625357cc
ACC used to be started/stopped based on operational/administrative state
changes. The new S_NM_RUNNING_CHG triggers a single boolean based on the
same logic, so we can now simplify the mechanism.
Change-Id: I2e09bcb18a6c3bb2e88bba98579fb4854a6b0699
This way we avoid triggering timers and doing extra poll loops for each
BTS which is configured but not up. It also has the effect of removing
logging about estimating paging buffers for BTS which are down, which
can be confusing.
Furthermore, since work is delayed until the TRX and cell in general is
configured, the first estimation is properly done now since the correct
configuration is in place at that time.
Related: SYS#5922
Change-Id: I1b5b1a98115b4e9d821eb3330fc5b970a0e78a44
This allows different parts of the code to hook to some signals which
allow starting/stopping processes based, for instance, on whether C0 is
available or not.
This can be later used by paging or CBSP code. Also ACC code can be
ported to this new system (acc_ramp_nm_sig_cb()).
Same signal can be used for other NM objects, but is left unimplemented
until there's use for them.
Change-Id: I206d4c7863a77fbab6a600126742a6a6b8fc3614
This way code triggered through signal has an updated view of the object
tree when running generic code which queries the current state of
objects.
This way, for instance, one can use APIs like trx_is_usable() or the like.
Change-Id: Ib46234e3f3e446e866d27b0dfee65edf4af4d2ba
Found by GCC 12.1.0:
smscb.c: In function 'etws_primary_to_bts':
smscb.c:537:13: warning: the comparison will always evaluate as 'true'
for the address of 'warning_sec_info'
will never be NULL [-Waddress]
537 | if (wrepl->u.emergency.warning_sec_info) {
| ^~~~~
In file included from smscb.c:31:
/usr/local/include/osmocom/gsm/cbsp.h:99:33: note: 'warning_sec_info' declared here
99 | uint8_t warning_sec_info[50];
| ^~~~~~~~~~~~~~~~
Indeed, the address &warning_sec_info[0] is never NULL.
Change-Id: Id369bb3676ba279bafc234378fbe21dbc7b0614b
Don't wait until RSL link goes up to check the reported features against
the config. Do it in the OML bring up right after the features are
reported.
Related: SYS#5922, OS#5538
Change-Id: I6b1b4ef3e163528ed186050d848ec089a4315a7c
Especially during emergencies / natural disasters, it is particularly
likely that networks become unreliable and BTSs disconnect and
reconnect. If upon reconnect there still is an active ETWS/PWS
emergency message for this BTS, send it to the BTS to ensure
it re-starts broadcasting that message until disabled.
Change-Id: I175c33297c08e65bdbf38447e697e37f8a64d527
This reverts commit 5e2ac29703.
This patch was found to be a troublemaker regarding osmo-bsc
performance, since it's scheduling one timer every 100ms for each
channel. On a BSC with dozens of BTS, each with several TRX, this results
in a huge number of timers scheduled in a tight timeframe, so
osmo-bsc spends its CPU time getting in and out of the poll() main
loop.
Related: SYS#5922
Change-Id: Ibd5123e7f04ae8f4eb8f08b63525527f526f0b2c
This allows external monitoring to see where the T3113 timer has been
adjusted to, in case it is set dynamically.
Change-Id: I533f2ca3c8e66c143154cbf03b827c9cbbacccdf
Reaching this point will only make system load (CPU, mem) grow, making
it hard for the process to keep up with work to do, with no benefit
since the requests will anyway be scheduled too late.
Related: SYS#5922
Change-Id: I6523c6816a4d16b71084d004e979be40cf0aeeb0
In lchan_fsm_cleanup(), ensure that the time_cc timer is actually inactive
before deallocating. Do so via lchan_reset(), to also make sure the
timer is stopped in all other situations where the lchan is deactivated.
This fixes an infinite-loop deadlock as described in OS#5554:
- run BSC_Tests.TC_chan_act_ack_est_ind_noreply
- restart the BTS process after the test is done
- osmo-bsc enters infinite loop in osmo_timer_del()
The reason is that lchan_fsm_cleanup() fails to stop a running active_cc
timer upon lchan deallocation. TC_chan_act_ack_est_ind_noreply
incidentally terminates OML while the timer is still active.
Related: OS#5554
Change-Id: I901bb86a78d7d021c8efe751fd9d93e5956ac0e0
We have seen an increased CPU load in osmo-bsc recently since the paging
improvements were merged, centered around poll() calls.
It is expected most of them will be fixed with the previous patch. In any
case, let's avoid unnecessary poll() calls.
Related: OS#5922
Change-Id: Ie767bdc8d4353aafe375a424e02d698ef7fd3dea
We want to recalculate the timer based on the last time the work_timer was
triggered (that is, the time when the worker re-armed the pag req to
retransmit). We don't want to recalculate based on the last time the pag
req to retransmit was scheduled.
In a loaded paging queue, there are lots of retransmissions (let's say 200)
and it may take more than 500ms to actually retransmit them. That means in
most cases we could end up in a situation where only pag reqs to retransmit
were in the queue, hitting this recalculate path. Since the 500ms had for
sure elapsed, that would most probably schedule the work_timer at {0,0}
for each new paging request that arrived. As a result, the worker would
be scheduled lots of times per second (once for each new req arriving),
each time submitting only 1 pag req (the new one) plus potentially 1 or
several pag reqs to retransmit.
In summary, no throttling was applied in the scenario where only
pag reqs to retransmit were in the queue and new pag reqs kept arriving.
This resulted in increased paging throughput and also an increased
frequency of poll() calls.
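
The difference between the two reference points can be sketched with plain
millisecond arithmetic. This is an illustration under assumptions and not
the code of this patch: the helper name is hypothetical, and the interval
and timestamps are made-up figures.

```c
#include <stdint.h>

/* Time left until 'reference_ms + interval_ms', clamped at 0.
 * A result of 0 means "fire immediately", i.e. no throttling at all. */
int64_t ex_timer_delay_ms(int64_t now_ms, int64_t reference_ms, int64_t interval_ms)
{
	int64_t due_ms = reference_ms + interval_ms;

	return due_ms > now_ms ? due_ms - now_ms : 0;
}
```

Assume an interval of 500 ms and the current time at 10000 ms. Taking a
retransmission that was scheduled at 9000 ms as reference yields a delay of
0, so every newly arriving request would run the worker immediately. Taking
the last worker run at 9900 ms as reference yields 400 ms, which keeps the
throttling in place.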
Related: OS#5922
Fixes: 4821c9f4df
Change-Id: I7ce6f436286b50dc31331d218ff256cf7be3f619
There's no need to use pointers there; it is only asking for errors from
code handling the data structure from the signal by attempting to change
them. Even from a memory size point of view it doesn't make sense, since
it's 3 bytes vs. a 4-byte pointer.
Furthermore, this is a preparation for a new commit, where the NM object's
current state will be updated before emitting the signal. This patch
makes that follow-up patch a lot easier.
Change-Id: I9b648dfd8392b7b40bfe2b38f3345017481f5129
One callback function was being registered for each BTS.
That means, when a C0 RCARRIER of one specific BTS changed NM state,
the outcome on whether to trigger/abort ramping would end up being
applied to all BTS.
Change-Id: I56c4dd1809fdcf8441a69bf77ad173e1ccc8eea7
This makes sure code accessing those fields is not changing their values,
since it would make no sense to change those. A follow-up commit will
convert those pointers to full structs instead, as there's no need to
have pointers there.
Change-Id: I9979e62eac861e25bbe2161ab187ddb2b40fd097