This fixes TBF objects leaking and ending up alive when the MS object is
explicitly freed through talloc_free (and sporadically
crashing TbfTest once a timeout for them occur).
This mostly affects unit tests, where most of the explicit free()
happens.
In osmo-pcu, in general, the GprsMs object only gets _free() called when
its resource count reaches 0, aka no more TBFs are attached to it. Hence
in general GprsMs object is freed() only when no TBFs (to be leaked) are
present.
However, in the unit tests it's usual that we want to wipe the entire
context by eg. feeing the PCU, the BTS or MS object, which should also
free the related TBFs.
When running osmo-pcu this may only be an issue when the MS object is
freed explicitly, which could happen for instance when a BTS is torn down,
ie. PCUIF going down, moment at which all GprsMs of that BTS are freed.
But in there actually it iterates over PDCHs to free all TBFs, so it's
fine.
If we iterated over MS, this could have ended up in a crash, like
it happened in TbfTest sporadically, but it's not a bit problem if we
crash + restart at that time since anyway the BTS is gone ore just
getting up around that time.
Related: OS#6359
Change-Id: Ibbdec94acb8132be20508d3178d88da44bfaf91d
Change format to print the state at the end, to resemble more the same
format used by FSMs.
Furthermore, by moving it at the end, print it only when "enclousure" is
requested, aka when not requested by FSM to update its internal name.
The consequence of this logc is that log lines printed from FSM don't
end up with the same state string printed twice in different places.
While at it, shorten the EGPRS/GPRS indicator to one character, which
should be understandable enough since it matches what's usually seen in
mobile phones to signal one or another.
Change-Id: I86b5f042fae77721b22fc026228677bd56768ba9
The functions l1if_open_pdch and l1if_close_pdch have a misleading
naming since what they actually do is opening and closing the TRX since
they return and accept a context (obj) that is valid for a whole TRX.
This also explains why the other functions accept a timeslot as
parameter in addition to the context. Let's rename those functions so
that it is clear what they do.
Related: OS#6022
Change-Id: I395a60b2fba39bac4facec78989bac20f0cef0d3
This patch finally decouples TBF allocation from resource allocation.
This will allow in the future reserving resources without having to
require a TBF object to exist.
Change-Id: I2856c946cb62d6e5372a1099b60e5f3456eb8fd4
Simply apply the content of the configured timer when the MS goes idle.
Having that field is convenient to do tricky stuff in unit tests, but
makes the main osmo-pcu app more complex for no good enough reason.
Change-Id: I8d44318b37b6605afd84db8ccec0d75e6db293b9
Make the caller hold a reference to the MS object just allocated, so
that it hs to explicitly unref it and, in turn, if no new references
were added during its use, trigger release of the MS object.
This is useful to avoid leaking MS object if it was allocated and then
no TBF is attached to it because allocation of TBF failed.
Related: OS#6002
Change-Id: I2088a7ddd76fe9157b6626ef96ae4315e88779ea
This commit changes lots of stuff in the MS release lifecycle, but
there's no really good way to split this into patches which make sense,
since all the chaos is intensively entangled.
Get rid of the ms_callback complex mess, it is not needed at all.
Previous MS release was strange due to the existance of previous
ms_callback.idle concept and MS storage: the MS signalled when it went
idle (no TBFs attached) and waited for somebody outside to free it,
while then arming itself the release timer to release itself if it was
not released by whoever.
The new lifecycle follows an easier (expected) approach: Whenever all
TBFs become detached from the MS and it becomes idle (use_count becomes
0), then it frees its reserved resources (TFI, etc.) and either:
* frees itself immediatelly under certain conditions (release timeout
configured = 0 or MS garbage with TLLI=GSM_RESERVED_TMSI)
* Arms release_timer and frees itself when it triggers.
If during release_timer the MS is required again (for instance because a
new TBF with TLLI/IMSI of the MS is observed), then a TBF is attached to
the MS and it is considered to become active again, hence the release_timer
is stopped.
OS#6002
Change-Id: Ibe5115bc15bb4d76026918adc1be79469c2f4839
gprs_default_cb_ms_idle() is changed to have the same implementation as
previous bts_ms_idle_cb(), since that's the only one being used in
osmo-pcu code. It makes no sense to use different callback logic in unit
tests.
This is another step towards simplifying the code and getting rid of the
idle/active_cb().
Change-Id: I2a06d17588572a21dc5a14ddbde83766076b446d
That information is not required during allocation of the object, and
most times it is not known.
Defer setting it only to meaningul values in paths obtaining the
information from peers.
Change-Id: I36f07dc389f7abe205fc4bcddbde93735f5d5cfc
There is a function l1if_connect_pdch, but no complementary function
like we have it with l1if_open_pdch and l1if_close_pdch. The reason for
this is that the PHY implementations that rely on a femtocell DSP do not
need to disconnect the pdch explcitly. However, the planned support for
the E1 based Ercisson RBS CCU will require an explicit disconnect. So
lets add a function call for this.
Change-Id: Ied88f3289bda87c48f5f9255c4591470633cc805
Related: OS#5198
This allows having full information on the control TS easily reachable
(like TRX ofthe PDCH), and makes it easy to compare TS by simply
matching the pointer address.
Change-Id: I6a97b6528b2f9d78dfbca8fb97ab7c621f777fc7
First commit towards trying to have alloc algorithm as isolated as
possible from others parts of the code trying to avoid state changes on
data structures.
Change name also because the alloc_algo not only allocated TS, but TFIs
and USFs.
Change-Id: I33a6c178c64a769f05d3880a69c38acb154afa62
The triggering of the assignment for the new TBF allocation is part of
the "upgrade to multislot" process, hence move that inside the function.
Change-Id: Ic2b959f2476b900cb263ccd8f6698ed843bafd29
* Make clear the code relates to DL-TBF and not UL-TBF.
* Change wording to "upgrade" to match the existing field and API
"tbf_can_upgrade_to_multislot()".
* Free the TBF if we cannot allocate new resources.
Change-Id: I0e4f8d7e46235a471b2124b280c81ff07b6967a4
The field contains a common value between the 2 active TBFs of the MS,
so it makes no sense to have them duplicated on each TBF. It can be
sanely stored in the MS object.
Change-Id: I8df01a99ccbfaf7a442ade5000ee282bd638fbba
The 2 types of TBF share some parts of the implementation, but actually
half of the code is specific to one direction or another.
Since FSM are becoming (and will become even) more complex, split the
FSM implementation into 2 separate FSMs, one for each direction.
The FSM interface is kept compatible (events and states), hence code can
still operate on the tbfs on events and states which are shared.
Test output changes are mainly due to:
* FSM instance now created in subclass constructor, so order of log
lines during constructor of parent class and subclass slightly
changes.
* osmo_fsm doesn't allow having 2 FSM types with same name registered,
hence the "TBF" name has to be changed to "DL_TBF" and "UL_TBF" for
each FSM class.
* FSM classes now use DTBFUL and DTBFDL instead of DTBF for logging. Some
tests don't have those categories enabled, hence some log lines
dissappear (it's actually fine we don't care about those in the test
cases).
Change-Id: I879af3f4b3e330687d498cd59f5f4933d6ca56f0
Use same formatting similar to what's now used in TBF, which is far more
easy to grep and follow. This way one can easily follow what happens to
a given IMSI, a give TFI, a given TLLI, etc.
Change-Id: If9b325764c8fd540d60b6419f32223fd7f5a5898
use a format in tbf::name() which is sanitized (osmo_sanitize) and hence
can be used both in regular log as well as for its internal FSM ids.
Until now, the FSMs contained a small amount of information with
different formatting than the regular LOGPTFB(), which made it difficult
to grep or follow a TBF through its lifetime looking at logs. The new
unified format makes that a lot easier.
Extra information is now printed if available, such as IMSI.
Furthermore, the TFI is updated to include BTS and TRX, since the TFI is
unique within the scope of a TRX.
Change-Id: I3ce1f53942a2f881d0adadd6e5643f5cdf6e31da
That function was pretty confusing since it used a "enum
gprs_rlcmac_tbf_direction dir" param (whose type is expected to describe
the data direction of a TBF) to describe the direction of the the packet
which triggered its call.
The parameter was actually always called with "GPRS_RLCMAC_UL_TBF" which
in this case meant "uplink direction" which meant "TLLI was updated from
the MS, not the SGSN".
The DL direction was only used in unit tests, which can hence be simply
replaced by ms_confirm_tlli(), which this commit does.
So this update_ms() function was actually used in practice in osmo-pcu
to trigger update of TLLI and find duplicates only when an RLCMAC block
(control or data) was received from the MS. Therefore, this function is
renamed in this patch and moved to the gprs_ms class, since it really
does nothing with the TBF.
Related: OS#5700
Change-Id: I1b7c0fde15b9bb8a973068994dbe972285ad0aff
Remove the paragraph about writing to the Free Software Foundation's
mailing address. The FSF has changed addresses in the past, and may do
so again. In 2021 this is not useful, let's rather have a bit less
boilerplate at the start of source files.
Change-Id: I4a49dbeeec89b22624c968152118aecf8886dac6
We basically want to probe whether it's possible to allocate TBFs, or
whether we know it will fail due to all main resources being already in
use (TFI, USF).
Having bts_all_pdch_allocated() return false doesn't mean though that an
MS will be able to allocate a TBF for sure. That's because further
restrictions are applied based on MS: whether it was already attached to
a specific TRX, whether the ms_class allows for a certain multislot
combination, etc. However, it should provide a general idea on whether
for sure the PCU is unable to provide more allocations. More fine
grained state about failures can still be followed by looking at
tbf:alloc:failed:* rate counters.
Related: SYS#4878
Depends: Iabb17a08e6e1a86f168cdb008fba05ecd4776bdd (libosmocore)
Change-Id: Ie0f0c451558817bddc3fe1a0f0df531f14c9f1d3
This commit actually addresses 2 errors:
1- gprs_bssgp_pcu_rx_paging_ps() called gprs_rlcmac_paging_request()
with MI which can be either TMSI or IMSI, and the later always called
bts_pch_timer_start() passing mi->imsi regardless of the MI type. Hence,
trash was being accessed & stored into bts_pch_timer structures if MI
type used for paging was TMSI.
2- When the MS received the PS paging on CCCH and requests an UL TBF, it
will send some data. If one phase access is used for whatever reason,
the IMSI may not be yet available in the GprsMs object since we never
received it (and we'd only have it by means of PktResourceReq). Hence,
let's better first try to match the paging by TLLI/TMSI if set in both
places, and otherwise use the IMSI.
Related: OS#5297
Change-Id: Iedffb7c6978a3faf0fc26ce2181dde9791a8b6f4
This allows distinguishing when a TBF didn't set the TFI. Useful to
identify dummy reject TBFs, etc, and make sure a non-dummy TBF set its
TFI properly.
Change-Id: Iecf54a24041bd14f4ef5b86e57c3732e1b69d463
This helps distinguishing the case where a TBF is in the initial state
and the unexpected case where osmo_fsm_inst_state_name reports "NULL"
due to fi pointer being NULL.
Change-Id: Ieaabfc9fa0dedb299bcf4541783cf80e366a88c3
We are freeing the object immediately afterwards anyway, so no need to
pretend it went through the normal state release.
Leaving current state as it is actually provides more information on
what was the status/state at the time the TBF had to be freed.
Change-Id: I3016caaccc2c43e1e300f3c6042d69f8adcd9d69
At some point later in time the state_flags will most probably be split
into different variables, one ending up in a different FSM. It is moved
so far to the exsiting FSM from the C++ class since it's easier to
access it from C and C++ code, and anyway that kind of information
belongs to the FSM.
Related: OS#2709
Change-Id: I3c62e9e83965cb28065338733f182863e54d7474
Run bts_pch_timer_remove() on each entry of the BTS specific pch_timer
list, so we don't have a memory leak and so the timer doesn't
potentially fire for a deallocated BTS.
Fixes: d3c7591 ("Add counters: pcu.bts.N.pch.requests.timeout")
Change-Id: Ia5e33d1894408e93a51c452002ef2f5758808269
This is only an initial implementation, where all state changes are
still done outside the FSM itself.
The idea is to do the move in several commits so that they can be
digested better in logical steps and avoid major break up.
Related: OS#2709
Change-Id: I6bb4baea2dee191ba5bbcbec2ea9dcf681aa1237
Since a while ago, the data architecture was changed so that TBF is
guaranteed to always have a MS object associated. Hence, it makes no
sense to pass the MS object as a separate param as we can take it from
tbf object and makes code less confusing.
Change-Id: Idc0c76cf6f007afa4236480cdad0d8e99dabec5f
Before this patch, allocate_usf() was implemented to only allocate 1 USF
per TBF, regardless of the available ul_slot mask.
As a result, only 1 slot at max was allocated to any TBF. That's a pity
because usual multislot classes like 12 support up to 2 UL slots per TBF
(in common TS with DL).
This patch reworks allocate_usf() to allocate as many UL multislots as
possible (given mslot class, current USF availability, TFI availability,
related DL TBF slots for the same MS, etc.).
As a result, it can be seen that AllocTest results change substantially
and maximum concurrent TBF allocation drops under some conditions.
That happens due to more USFs being reserved (because each TBF has now
more UL slots reserved). Hence now USF exhaustion becomes the usual
limitation factor as per the number of concurrent TBFs than can be
handled per TRX (as opposed to TFIs previously).
Some of the biggest limitations in test appear though because really
high end multislot classes are used, which can consume high volumes of
UL slots (USFs), and which are probably not the most extended devices in
the field.
Moreover, in general the curren timeslot allocator for a given
multislot class will in general try to optimize the DL side gathering
most of the possible timeslots there. That means, for instance on ms
class 12 (4 Tx, 4Rx, 5 Sum), 4 DL slots and 1 UL slot will still be
selected. But in the case where only 3 PDCHs are available, then with
this new multi-slot UL support a TBF will reserve 3 DL slots and 2 UL
slots, while before this patch it would only taken 1 UL slot instead of
2.
This USF exhaustion situation can be improved in the future by
parametrizing (VTY command?) the maximum amount of UL slots that a TBF
can reserve, making for instance a default value of 2, meaning usual
classes can gather up 2 UL timelosts at a time while forbidding high-end
hungry classes to gather up to 8 UL timeslots.
Another approach would be to dynamically limit the amount of allowed
reservable UL timeslots based on current USF reservation load.
Related: OS#2282
Change-Id: Id97cc6e3b769511b591b1694549e0dac55227c43
Let's disable category here since we don't care about its formatting here.
In any case, every test relying on logging output validation should
always explicitly state the config to avoid issues in the future if
default values change.
Change-Id: I7f9c56313cfaa74ebe666f44763a83d8102f5484
Related: OS#5034
This patch doesn't really tests whether osmo-pcu can work on a multi-bts
environment, but it prepares the data structures to be able to do so at
any later point in time.
Change-Id: I6b10913f46c19d438c4e250a436a7446694b725a
Previous work on BTS class started to get stuff out of the C++ struct
into a C struct (BTS -> struct gprs_glcmac_bts) so that some parts of
it were accessible from C code. Doing so, however, ended up being messy
too, since all code needs to be switching from one object to another,
which actually refer to the same logical component.
Let's instead rejoin the structures and make sure the struct is
accessible and usable from both C and C++ code by rewriting all methods
to be C compatible and converting 3 allocated suboject as pointers.
This way BTS can internally still use those C++ objects while providing
a clean APi to both C and C++ code.
Change-Id: I7d12c896c5ded659ca9d3bff4cf3a3fc857db9dd
Currently the BTS object (and gprs_rlcmac_bts struct) are used to hold
both PCU global fields and BTS specific fields, all mangled together.
The BTS is even accessed in lots of places by means of a singleton.
This patch introduces a new struct gprs_pcu object aimed at holding all
global state, and several fields are already moved from BTS to it. The
new object can be accessed as global variable "the_pcu", reusing and
including an already exisitng "the_pcu" global variable only used for
bssgp related purposes so far.
This is only a first step towards having a complete split global pcu and
BTS, some fields are still kept in BTS and will be moved over follow-up
smaller patches in the future (since this patch is already quite big).
So far, the code still only supports one BTS, which can be accessed
using the_pcu->bts. In the future that field will be replaced with a
list, and the BTS singletons will be removed.
The cur_fn output changes in TbfTest are actually a side effect fix,
since the singleton main_bts() now points internally to the_pcu->bts,
hence the same we allocate and assign in the test. Beforehand, "the_bts"
was allocated in the stack while main_bts() still returned an unrelated
singleton BTS object instance.
Related: OS#4935
Change-Id: I88e3c6471b80245ce3798223f1a61190f14aa840
When allocating multiple slots for a UE the following example
is not allowed 'UU----UU' for a UE class 12.
The time slot number can not roll over 7 and move to 0.
44.060 or 45.002 only specifies contigous however it was unclear
it this is an allowed pattern.
Only the example 45.002 B.3 in release 12 cleared this up.
It gives an example for a multi slot class 5 UE which has 7 possible
configuration this means the rolled over is not allowed.
Multislot class type 2 UE doesn't have this limitation.
Further if a UE supports 8 time slots this is not a limitation because
the window size (45.002 B.1) can include all time slots.
Releated: SYS#5073
Change-Id: I16019bdbe741b37b83b62749b840a3b7f4ddc6c7
When both TBFs (Dl, Ul), are detached, ms_detach_tbf() will call
ms_start_timer() which will hold a reference of the MS (ms_ref()) and
wait for X seconds (VTY config, T=-2030, 60 seconds by default) before
unrefing the MS, which will trigger ms_update_status() finally (ref==0)
and will in turn call cb.ms_idle(), which will tell the ms_storage to
free the MS.
This mechanism is used to keep MS objects around for a certain time so
that when new TBFs are established, we have cached interesting
information about the MS, ready to use.
However, in AllocTest, tons of MS are allocated in a loop calling a
function (such as test_alloc_b_ul_dl()). In that function, a BTS is
allocated in the stack and at the end of the function BTS::cleanup() is
called due to implicit destructor, which ends up calling
ms_storage::cleanup() which removes all MS from its list and frees them
*if they are not idle*. The problem here, is that due to T=-2030, an
extra reference is hold and hence the ms is not considered idle
(ms_is_idle() checks ms->ref==0). As a result, the MS is never freed,
because we don't use libosmocore mainloop here (and in any case, it
would take 60 seconds to free it).
By setting the timeout of T=-2030 to 0, ms_start_timer will avoid using
the timer and will also avoid holding the extra reference, hence
allowing ms_storage to free the object during cleanup().
This fix really helps in improving performance for AllocTest specially
after MS object contains a rate_ctr. As tons of MS objects were left
alive, they stood in the rate_ctr single per-process queue, making the
test last crazy amount of time and spending 50% of the time or more
iterating the list full of MS related rate counters.
Change-Id: I6b6ebe8903e4fe76da5e09b02b6ef28542007b6c
As we integrate osmo-pcu more and more with libosmocore features, it
becomes really hard to use them since libosmocore relies heavily on C
specific compilation features, which are not available in old C++
compilers (such as designated initializers for complex types in FSMs).
GprsMs is right now a quite simple object since initial design of
osmo-pcu made it optional and most of the logic was placed and stored
duplicated in TBF objects. However, that's changing as we introduce more
features, with the GprsMS class getting more weight. Hence, let's move
it now to be a C struct in order to be able to easily use libosmocore
features there, such as FSMs.
Some helper classes which GprsMs uses are also mostly move to C since
they are mostly structs with methods, so there's no point in having
duplicated APIs for C++ and C for such simple cases.
For some more complex classes, like (ul_,dl_)tbf, C API bindings are
added where needed so that GprsMs can use functionalitites from that
class. Most of those APIs can be kept afterwards and drop the C++ ones
since they provide no benefit in general.
Change-Id: I0b50e3367aaad9dcada76da97b438e452c8b230c