This fixes TBF objects leaking and ending up alive when the MS object is
explicitly freed through talloc_free (and sporadically
crashing TbfTest once a timeout for them occur).
This mostly affects unit tests, where most of the explicit free()
happens.
In osmo-pcu, in general, the GprsMs object only gets _free() called when
its resource count reaches 0, aka no more TBFs are attached to it. Hence
in general GprsMs object is freed() only when no TBFs (to be leaked) are
present.
However, in the unit tests it's usual that we want to wipe the entire
context by eg. feeing the PCU, the BTS or MS object, which should also
free the related TBFs.
When running osmo-pcu this may only be an issue when the MS object is
freed explicitly, which could happen for instance when a BTS is torn down,
ie. PCUIF going down, moment at which all GprsMs of that BTS are freed.
But in there actually it iterates over PDCHs to free all TBFs, so it's
fine.
If we iterated over MS, this could have ended up in a crash, like
it happened in TbfTest sporadically, but it's not a bit problem if we
crash + restart at that time since anyway the BTS is gone ore just
getting up around that time.
Related: OS#6359
Change-Id: Ibbdec94acb8132be20508d3178d88da44bfaf91d
We now use PCUIF v11 in the TTCN3 tests exclusively and also osmo-bts
and osmo-bsc only support PCUIF v11. There is no longer a need to
maintain a backward compatibility to PCUIF v10 in osmo-pcu.
Related: OS#5927
Change-Id: I68a3f59d5c960ae3a4fbd74f9d4a894295cb9ed8
Scenario: A DL TBF is assigned over PCH (CCCH) and we start transmitting
DL data blocks blindly after X2002, but at the same time the MS start
packet-access-procedure to request an UL TBF.
Right now osmo-pcu correctly detects the MS is available in PDCH and
re-assigns a DL TBF using PACCH, but the LLC frames it transmitted in
the old PCH-assigned DL TBF get lost when that older TBF is freed
(because the DL blocks were removed from the GprsMs llc_queue).
This issue is now more frequent since X2002 timer was added recently to
delay starting requesting USF for a UL TBF, hence the contention
resolution in general takes more time and hence the PACCH assignment of
the DL TBF takes more time too, so more DL data blocks are transmitted
to the DL TBF assigned over PCH during that time.
This patch improves the situation to at least recover the DL blocks
transmitted if the DL TBF is freed (due to MS merge trigger by scenario
mentioned above), where no DL ACK/NACK was ever received by the MS.
Ideal solution would be to have complete tracking of which LLC PDUs from
the llc_queues were completely ACKed at RLC/MAC level, but that really
requires a lot of work and major refactoring, which are left as a future
improvement.
Change-Id: I9be4035fb2cf2b3ee56e91dcc12cc8c24028b4aa
Otherwise the scheduler selects the UL TBF for USF during that time, and
of course no one has that USF assigned yet, so no answer and that ends
up triggering MAX_N3101 on the UL TBF.
This is specially important when PCU is connected to the BSC, since the
amount of time for the ImmAss to be scheduled on the BTS and reach the
MS can be bigger.
Change-Id: I48babd70ca44f11110240cbcfbab43d0e3a0fb59
T3192 on the MS side (boadcasted by the network on SI13) is used for the
MS to stay monitoring the PDCH of a DL TBF after sending PKT DL ACK/NACK
for the last DL data block (FinalAck=1).
T3193 is the counterpart in the network, defined as >T3192, which is
used to make sure the TFI is no longer used by the MS and hence can be
reused/re-assigned again.
Hence, we want to differentiate between those 2 timers, since the first
one gives us information on how late can we still tx PktDlAss on PACCH
on that DL TBF, while the later only provides info on when to free the
DL TBF in order to reuse the TFI.
Change-Id: I7421342461bf159d945651037e6fe690f88c6fae
When sending the last DL block, the FSM moves FLOW->FINISHED. Hence, it
is impossible to receive the FINAL_ACK in state flow, when we didn't yet
sent the last DL block (it's only possible if there's a bug in the MS).
TbfTest need to be adapted since they where acting wrong.
Change-Id: I3288743ff01ff84c3d345b6efb7c3fb6ca934791
Dispatching TBF_EV_CONTENTION_RESOLUTION_MS_SUCCESS means the UL TBF is
able to be used to assign DL TBFs over PACCH.
However, there's an extra implicit restriction: if the PCU already sent
the final UL ACK/NACK to that UL TBF, then whatever message sent
after it cannot be reliably answered, since the MS will go back to idle
state after issues the PKT CTRL ACK for that final UL ACK/NACK.
This condition is already being checked in
contention_resolution_success():
"""
if (ms_need_dl_tbf(ms()) && !tbf_ul_ack_waiting_cnf_final_ack(this))
ms_new_dl_tbf_assigned_on_pacch(ms(), this);
"""
Since we are considering the UL TBF to have done contention resolution
when we transmit the *first* UL ACK/NACK, it mans we are doing both things
on the same code path iif the *first* UL ACK/NACK is at the same time
the *final* UL ACK for that UL TBF. This can happen if the UL TBF is
only sending 1 block, which will have CV=0. This can usually happen
during GMM Attach Request.
In this scenario, with current code goes into a situation where first
the TBF_EV_CONTENTION_RESOLUTION_MS_SUCCESS is triggered and *afterwards*
the UL_ACK/NACK state is updated. As a result, the code snippet check
above (!tbf_ul_ack_waiting_cnf_final_ack()) will be true and the DL TBF
be assigned, but afterwards in the same code path it figures out the
final ack happens.
In order to fix this, first update the ACK/NACK state and only
afterwards trigger the Contention Resolution Success event.
Change-Id: I62ae91b494e4fd0ade3f4a3ba3817bcaedbdebf5
First print what is going to be attempted, later alarm about the
possible error, finally early return.
Change-Id: I417e9689f60e7f5d3c8ef67543e56fea87c8eebd
It makes no sense to continue trying to assign the UL TBF over PACCH
after T3168 * 4 retrans time out.
This helps in releasing the TBF after we got rid of incorrect use of
N3015 in UL TBFs.
While at it, update tbf_ul_fsm to use T3168 instead of X2001, since it
really needs to match T3168. Ideally it would not even have a timer
itself and receive an event from tbf_ul_ass_fsm, but that's left as a
TODO (it was already before) and simply the timer is updated.
Change-Id: I87dff68dedd06b60501e7586d20faf02bb1f0c93
T3195 is triggered as consequence of N3105 reaching MAX, which only
occurs in DL TBFs. Hence, T3195 also only applies to DL TBFs.
The references to T3195 were there as a remains from time
where we had a single tbf_fsm for both UL and DL TBFs. It can now be
further simplified.
Change-Id: I07a43c13289d50707115187a3f22df21443a7b4a
This counter is only used in DL TBFs as per TS 44.060.
section 13.4 "N3105":
"When the network after sending a RRBP field in the _downlink RLC data block_ ..."
(DL data blocks are only sent in DL TBFs).
section 8.1.2.1:
"If N3105 = N3105max, the network shall release the downlink TBF internally
and start timer T3195 for that TBF. When T3195 expires, the network may reuse
the TFI"
Change-Id: I4d4f4d4d3e6e0539ea8ec2395bed00d059b84e04
While at it, fix a typo in the test output.
This function is nowadays only used in ms_current_pacch_slots(), which
is used only to print the PACCH TS (the first common UL & DL TS).
Change-Id: Id1d0b681f6866618f9f3a8c64d6a6c809ca50ea7
The FN time counter is not really PDCH specific, but to the whole BTS,
and all other calls t the bts_set_current_frame_number() are already
laced in pcu_l1_if.cpp; move it there.
Change-Id: If36f22a1067c904fa7fda87bed5062b6738f0dd1
Those function don't really require the full FN, hence let's pass only
the required information.
This makes the implementation here less dependent on how/if we are able to calculate
full FNs based on RFN: We get an RFN, and we have to encode so that the RFN can be
derived again, so feels less cumbersome having to go through RFN->FN->RFN which may
only cause possible issues if there's some FN timing bug.
3GPP TS 44.018 10.5.2.30 Request Reference:
"The purpose of the Request Reference information element is to provide the random
access information used in the channel request and the frame number, FN modulo 42432"
3GPP TS 44.018 10.5.2.38 Starting Time:
"The purpose of the Starting Time information element is to provide the start TDMA
frame number, FN modulo 42432."
Change-Id: If9b758434c00f2a3868534d5be84946809c989a9
The previous code was really confusing, passing full FNs as RFNs under
certain external conditions, and then assuming the RFN input of
rfn_to_fn() function could actually be a FN.
As a result, we had a lot of code behaving slightly different depending
on whether the incomding FN from pcuif was filled in by a BSC or a BTS.
Avoid this b ehavior differentiation and always assume the most
restricted one, aka RFN.
Change-Id: Ib3b5702168195b595711cd0ff32c211b9aba429d
Change format to print the state at the end, to resemble more the same
format used by FSMs.
Furthermore, by moving it at the end, print it only when "enclousure" is
requested, aka when not requested by FSM to update its internal name.
The consequence of this logc is that log lines printed from FSM don't
end up with the same state string printed twice in different places.
While at it, shorten the EGPRS/GPRS indicator to one character, which
should be understandable enough since it matches what's usually seen in
mobile phones to signal one or another.
Change-Id: I86b5f042fae77721b22fc026228677bd56768ba9
The functions l1if_open_pdch and l1if_close_pdch have a misleading
naming since what they actually do is opening and closing the TRX since
they return and accept a context (obj) that is valid for a whole TRX.
This also explains why the other functions accept a timeslot as
parameter in addition to the context. Let's rename those functions so
that it is clear what they do.
Related: OS#6022
Change-Id: I395a60b2fba39bac4facec78989bac20f0cef0d3
This patch finally decouples TBF allocation from resource allocation.
This will allow in the future reserving resources without having to
require a TBF object to exist.
Change-Id: I2856c946cb62d6e5372a1099b60e5f3456eb8fd4
This way the alloc_algo() becomes idempotent, simplifying implementation
of new alloc_algos as well as rolling back if allocation fails (for
instance because some resource is exhausted at the time).
For now the code applying the results is moved to tbf::alloc_algo(), but
it will eventually get out of tbf code, so that the MS object is
responsible for running it. As a result, there's no even need to create
TBF object before trying to allocate resources, which will help furher in
rollback operations described above.
Change-Id: I5ffd00f5f80bde4b73b78db44896f65e70e12b20
This is a first step towards isolating the allocation algorithm from
applying changes on PCU state.
In next steps the tbf pointer will be dropped and the allocation
algorithm will only result a "result" struct which then the caller can
apply to whatever TBF object it requires.
Change-Id: Ie4d9ace526ad012d97738bc55bdb5cc1472c632d
Simply apply the content of the configured timer when the MS goes idle.
Having that field is convenient to do tricky stuff in unit tests, but
makes the main osmo-pcu app more complex for no good enough reason.
Change-Id: I8d44318b37b6605afd84db8ccec0d75e6db293b9
This is useful to track the lifecycle of MS objects. The RLCMAC cateogry
used so far in those log messages is too broad.
Change-Id: Ib4ce88d0f7309ac77c064a94bb0d667e8dbc33dd
Make the caller hold a reference to the MS object just allocated, so
that it hs to explicitly unref it and, in turn, if no new references
were added during its use, trigger release of the MS object.
This is useful to avoid leaking MS object if it was allocated and then
no TBF is attached to it because allocation of TBF failed.
Related: OS#6002
Change-Id: I2088a7ddd76fe9157b6626ef96ae4315e88779ea
Whether the TBF is GPRS or EGPRS is known at allocation time since it
comes from the information known in the MS object used to create it.
Hence, no need to delay calling it to later steps such as setup().
So far it was probably left in setup() due to the constrains about
requiring the subclass to be constructed (use of window() virtual API).
Change-Id: I2e9d2a98c666a930333d52fb6c0463d7593c2615
This commit changes lots of stuff in the MS release lifecycle, but
there's no really good way to split this into patches which make sense,
since all the chaos is intensively entangled.
Get rid of the ms_callback complex mess, it is not needed at all.
Previous MS release was strange due to the existance of previous
ms_callback.idle concept and MS storage: the MS signalled when it went
idle (no TBFs attached) and waited for somebody outside to free it,
while then arming itself the release timer to release itself if it was
not released by whoever.
The new lifecycle follows an easier (expected) approach: Whenever all
TBFs become detached from the MS and it becomes idle (use_count becomes
0), then it frees its reserved resources (TFI, etc.) and either:
* frees itself immediatelly under certain conditions (release timeout
configured = 0 or MS garbage with TLLI=GSM_RESERVED_TMSI)
* Arms release_timer and frees itself when it triggers.
If during release_timer the MS is required again (for instance because a
new TBF with TLLI/IMSI of the MS is observed), then a TBF is attached to
the MS and it is considered to become active again, hence the release_timer
is stopped.
OS#6002
Change-Id: Ibe5115bc15bb4d76026918adc1be79469c2f4839
It is interesting to log that a tbf is being detached *before* it
actually happens, so that the reader can see which TBF is being
detached.
Change-Id: I2008beb9ab8f97f7ea5ed7b45cfb3f23dfe7b27f
* Make sure that the tbf being attached has already the MS assigned.
* Check no re-attaching of alredy attached TBF ever happens.
* Document and early skip case where a non-attached TBF detach is
attempted.
* Avoid recursive call mess to tbf_set_ms(tbf, NULL); during detach.
* MsTest needs to be modified since it uses fake TBF objects which use
different set of calls than the regular TBFs in osmo-pcu. Since the
ul_tbf object is reused, it needs to be assigned ul_tbf->ms again before
re-assigning it, as per what happens usually in tbf_set_ms() in
osmo-pcu.
Change-Id: Ia18fe2de1fb3bf72f530157e5f30de64f2b11e12
gprs_default_cb_ms_idle() is changed to have the same implementation as
previous bts_ms_idle_cb(), since that's the only one being used in
osmo-pcu code. It makes no sense to use different callback logic in unit
tests.
This is another step towards simplifying the code and getting rid of the
idle/active_cb().
Change-Id: I2a06d17588572a21dc5a14ddbde83766076b446d
With this change the MS no longer is removed from the llist without
potentially skipping free() if not idle in bts_ms_idle_cb().
As a result, some unit tests now can free it during bts tear down
instead of having them leaked.
The tests int MsTest need changes because the tbfs created are fake and
cannot be freed using tbf_free(), and hence cannot be detached from MS
using regular code paths. Instead first call explicit talloc_free(ms)
like other unit tests in the file already do.
Change-Id: Id53f8dfb9963366dd4b19a324615bbc83c4f23e7
That class is mostly a C++ class holding a llist plus some callbacks.
Having that in a separate class makes code more complex for no good
reason. This patch moves the llist into bts and allocates stuff directly
from within bts.
This will allow further cleanup of MS lieficyle in future patches.
Change-Id: I627f5db5073189c23ddf2b7f09c90abb24846f62
That information is not required during allocation of the object, and
most times it is not known.
Defer setting it only to meaningul values in paths obtaining the
information from peers.
Change-Id: I36f07dc389f7abe205fc4bcddbde93735f5d5cfc
Inside osmo-pcu code, the code path is to always call ms_alloc with
tlli=GSM_RESERVED_TMSI; a different value is only passed during unit
tests.
It makes no sense to have unit tests using differnet code paths than
ther app itself, since in the app it always desired to go through the
ms_set_tlli() and ms_confirm_tlli() which does more stuff.
Hence, drop the tlli param in ms_alloc and change the unit tests to use
the available APIs used by the application.
Change-Id: I730ec911a43b0f4e78faee4eeffb3ca8601564f8
This check (tlli != 0) was added in 2015 in
6d86628e5b, with the rationale below:
"""
To avoid dangling entries without a TLLI there (which cannnot be
retrieved anyway), the timer in the MS objects is not started after
all TBF have been detached, so that they get deleted immediately in
that case.
"""
The rationale makes sense, but through time the MS class was fixed to
return GSM_RESERVED_TMSI (0xFFFFFFFF) when no TMSI was available.
Hence, the check was wrong, and as a result, free() of MS containing
GSM_RESERVED_TMSI would be delayed over time by release timer.
Related: OS#6002
Change-Id: I7a694a30f8709c00af774846d7c4925cef253a71
When debugging frame number offset problems betwen l1 and below and the
upper layers of the PCU, it may be helpful do know which blocks/frame
numbers got reserverd for uplink transmissions.
Change-Id: I4277c572a4cc6cbbf3ac4e67442c9036be687627
Related: OS#5198
The previous gprs/protocol/gsm_04_60.h header from libosmogsm was
misleading because it was placed in the subdirectory used by libosmogb,
and hence was recently deprecated in favour of gsm/protocol/gsm_04_60.h.
Let's follow the pragma message advising to move to the new header file.
Depends: libosmocore 0158b05337a825352d9fd7f074170b686e9fd1e5
Change-Id: I027abbf3ed4c71331000565af1ef4f08f10cfafc