The signal is already there but not being used.
Let's further split generic paging code from RSL specificites.
Change-Id: Iabc1c29908a5136501d6dc6e60f8777dab511b86
This patch caches the counts of initial paging requests for each paging
group. This count is needed to estimate T3113 when a new incoming paging
request is received and it has to be inserted into the queue.
With this there's no need to traverse the whole initial_req_list every
time a new incoming paging request is receiving, potentially saving lots
of iteration and hence lots of CPU when the queue is long.
Related: SYS#6200
Change-Id: I6994127827d120a0b4dd3de51e1ddde39f2fe531
If queue size (in transmit delay of requests) is too long (above
threshold) when a new initial incoming request arrives, instead of
directly discarding it, see if we can drop a pending retransmission and
insert the new one instead, in order to avoid losing initial requests.
This is done under the assumption that it is more important to transmit
intial requests than to retransmit already transmitted ones. The
rationale is that there's lower chances that an MS which didn't answer
lately will answer now (aka being reachable at the cell), so it's better
to allocate resources for new requests (new MS) which may be available
in the cell.
Change-Id: Idfd93254ae456b1ee08416e05479488299dd063d
Related: OS#5552
Initial requests: paging requests which haven't been yet transmitted
Retrans requests: paging requests which have already been transmitted at
least once.
Until now one queue was used (llist) to store both. The initial requests
were stored at the start of the queue in FIFO order. After the last
initial requests, the retrans requests followed also in FIFO.
This ordering was used in order to prioritze scheduling of initial
paging requests over retransmit paging requests.
In the end, having both types in the same list only make code handling
the list more complex.
Hence, this patch splits the shared llist into 2 llists.
Related: SYS#6200
Change-Id: Ib68f2169e3790aea4ac77ec20ad79f242b7c2747
When adding a new incoming request (aka "initial_req"), if the queue is
full only of retransmis (no not-yet-ever-transmitted initial requests
queued), then the new incoming request is inserted at the start of the
queue, in order to take scheduling priority over retransmits.
This was already fine before this patch, but the counting of requests
before it (used to calculate its T3113) was wrong, because it counted all
the retransmissions. This patch fixes it to end up with
reqs_before(_same_group)=0.
Change-Id: Ib2e810b0bcc51eac117713584310272462c58867
This allows configuring the maximum delay of paging requests to be
queued according to other parameters, such as MSC paging request
timeouts, etc.
Related: OS#5552
Change-Id: Ia556ef4e474e6a2d0d1618bab680a3330a3c062b
The BTS can be obtained easily from the backpointer in req->bts. Having
the extra param can only lead to confusion and introduction of bugs, as
happened in bug fixed by Change-Id
I8c6828f86b7ccbb2c4a09ca1aec859a2c597b679.
Related: SYS#6200
Change-Id: I545fd853fdffce7f23f0d99203c66c3b39144e4b
When rewriting the loop, the pointer passed all the time to
paging_remove_request() was the one of the BTS which answered the
request, not the one actually handling the related unanwared still
active paging request.
Fixes: 70a1d60a83
Related: SYS#6200
Change-Id: I8c6828f86b7ccbb2c4a09ca1aec859a2c597b679
This allows tracking each BTS active paging request queue length over
time, and evaluate CPU load based on the number of active paging
requests queued.
Related: SYS#6200
Change-Id: I6d296cdeba1392ef95fc31f6c04210c73f1b23e5
This way all paging related stats can be grouped (more will be added in
follow-up commits).
Related: SYS#6200
Change-Id: Iede1b6f68df468c7a3b3bf5fce7f68bb08b78832
Prior to this patch the whole paging queue of each BTS was iterated.
After the patch only the active paging_req for a given subscriber are
iterated.
Related: SYS#6200
Change-Id: I225d5e08427c6bb9d92ce6a1dccb6ce36053eab5
This is triggered by BSC_Tests.TC_lcs_loc_req_no_subscriber.
Before, the NULL ptr was not a problem because paging_request_cancel()
only used the pointer to compare it against other pointers, but never
accessing it. A follow-up patch is, however, changing the implementation
to optimize the lookup by using the subscriber pointer, which generates
a crash.
Related: SYS#6200
Change-Id: Id0de43ac5bde0f52f258de6c9bf58b173301c8db
Before this patch, the entire queue of paging_request had to be iterated
in order to find if the subscriber already had an active paging request
(discarding duplicates).
Now that bsc_subscriber holds a list of its active paging requests, it's
easier to look at its list to find out if it is already active on that
BTS.
As a result, there's no need to unconditionally iterate the whole list
and we can optimize the loop finding the spot to insert the new queue:
- First the potentially way smaller loop over
bsub->active_paging_requests is done
- Second, only if the first loop allowed, the bts->pending_requests is
iterated and potentially early terminated.
Related: SYS#6200
Change-Id: I7912275026c4d4983269c8870aa5565c93277c5a
This allows havily decreasing the algorithmic cost of removing all
pending active paging requests for a given subscriber once it answers
on a given BTS.
Beforehand, the whole paging queue of all BTS were iterated. Now, only
the active requests for that subscriber are iterated.
Related: SYS#6200
Change-Id: I831d0fe01d7812c34500362b90f47cd65645b666
This saves the BSC from iterating twice the whole paging list of the BTS
which received the paging response.
Related: SYS#6200
Change-Id: I5f9215f31428ce0249cd9ece6d2d4e93155f429f
This way we avoid triggering timers and doing extra poll loops for each
BTS which is configured but not up. It also has the effect of removing
logging about estimating paging buffers for BTS which are down, which
can be confusing.
Furthermore, since work is delayed until the TRX and cell in general is
configured, the first estimation is properly done now since the correct
configuration is in place at that time.
Related: SYS#5922
Change-Id: I1b5b1a98115b4e9d821eb3330fc5b970a0e78a44
This allows external monitoring to see where the T3113 timer has been
adjusted to, in case it is set dynamically.
Change-Id: I533f2ca3c8e66c143154cbf03b827c9cbbacccdf
Reaching this point will only make system load (CPU, mem) grow, making
it hard for the process to keep up with work to do, with no benefit
since the requests will anyway be scheduled too late.
Related: SYS#5922
Change-Id: I6523c6816a4d16b71084d004e979be40cf0aeeb0
We have seen an increased CPU load in osmo-bsc recently since the paging
improvements where merged, centered round poll() calls.
It is expected most of them will be fixed with previous patch. In any
case, let's avoid unnecessary poll() calls being called for no reason.
Related: OS#5922
Change-Id: Ie767bdc8d4353aafe375a424e02d698ef7fd3dea
We want to recalculate the timer based on last time the work_timer was
triggered (that is, the time when the worker re-armed the pag req to
retransmit). We don't want to recalculate based on the last time the pag
ret tro retransmit was scheduled.
In loaded paging queue, there's lots of retrans (let's say 200) and it
may take more than 500ms to actually retransmit them. That means in most
cases we could end up in a situation where only pag req to retrans where
in the queue, hitting this recalculate path. Since the 500ms were for
sure elapsed, that would most probably schedule the work_timer at {0,0}
for each new paging request that arrived. As a result, the worker would
be scheduled lots of times per second (once for each new req arriving)
and only submitting 1 pag req (the new one) plus potentially 1 or
serveral pag req to retransmit.
In summary, there was not throthling applied in the scenario where only
pag req to retransmit where in the queue and new pag reqs kept arriving.
This incurrs into augmented paging throughput and also augmented
frequency of polls().
Related: OS#5922
Fixes: 4821c9f4df
Change-Id: I7ce6f436286b50dc31331d218ff256cf7be3f619
So far we only calculated the expiration time based on the amount of
requests of the same paging group in the BSC queue.
However, also extra time has to be taken into account regarding requests
of other paging groups. This is important because requests of all paging
groups are mixed in the BSC queue, meaning having a lot of requests from
other groups before a certain req will anyway delay transmit of that
req to some extend.
Related: OS#5537
Change-Id: Ib55f947c5d3490b27e1d08d39392992919512f9a
If the queue is only holding requests in retransmition state, when we
add a new one, we have to re-calculate the work timer to a lower value
instead of letting it wait for the first retransmit to be ready.
Change-Id: Ibd4f8921c92f7481f0b9943041c141640ab812c8
There's no need to keep the timer running, since anyway upon next
trigger it will simply early exit in paging_handle_pending_requests()
becuase there's no more work to do.
Change-Id: I096ab7231f52c741c5fded37acd5b309e1de06e3
Before this patch, on each BTS a 500ms timer was used to schedule some
work, sending up to 20 paging requests verytime.
This means, however, for an initial paging request it may take up to
500ms delay to be scheduled to the BTS, which is huge.
While we still want to maintain this 500ms interval for retransmits, it
doesn't make sense to wait that much for other cases. It's far better
sending less requests (10 instead of 20) every half time (250ms instead
of 500ms), since it will spread the load and paging more over time,
allowing for other work to be done in the middle.
Change-Id: I7a1297452cc4734b6ee8c38fb94cf32f38d57c3d
PAging happens over C0 RSL link, not over OML, so let's actually
validate that the C0 RSL link is up before paging instead of the OML
one.
Change-Id: I11e5bb6f952534763935aa01470e514d4af247ed
Decouple credit_timer from event "available_slots became 0".
Let's actually relate credit_timer to the fact of not receiving CCCH
Load Indication: If no CCCH Load Indication is received (cch_load_ind_period * 2),
then assume we are below CCH Load Indication threshold (10% load) and
start estimating the available_slots in a cch_load_ind_period*2 time
frame.
Moreover, in paging_schedule_if_needed(), there's no use in delaying
start of processing work if the work_timer is not already doing work.
Related: OS#5537
Change-Id: I6a0da03c408270044079e81d431f6641527c00cd
Currently, the Tx paging_req queue at the BSC always has new paging
requests adding at the end (as long as the subscriber is not already in
the queue).
That means, if the queue is full of retransmitions, it will take a long
time until the first paging_req is sent towards that subscriber.
The rationale here is that it makes sense to attempt the first paging
ASAP, and give lower prio to paging_req retransmitions, since it may
well be that those other subsribers are not available/reachable and
won't answer.
Related: SYS#5922
Change-Id: I1ae6d97152c458247bc538233b97c2d245196359
Having one paging request being sent every PAGING_TIMER (500msec) is too
slow in case BSC is serving lots of subscribers on a BTS. Hence, we want
to send many paging requests at once while still trying not to fill the
BTS buffer.
Morever, we don't want to send tons of paging requests at once, hence we
limit the amount of paging requests sent in one timer iteration
(MAX_PAGE_REQ_PER_ITER) in order to avoid the BSC doing lots of work
there at once, keeping it busy from processing other tasks.
Related: SYS#5922
Change-Id: I609fa67834b426456f48f6fb2acb601c5905f178
Similar to paging:attempted, count paging:expired not only per BTS, but
also for the whole BSC. Add active_paging_requests to struct bsc_subscr,
to increase the counter only once if paging expires, and not once per
BTS where paging expired.
Related: SYS#4878
Change-Id: I9c118e7e3d61ed8c9f1951111255b196905eba4d
Instead of having static const structs in header files (which end up
duplicated in each and every compile unit!), have one .c file with the
rate_ctr and stat_item descriptions.
Related: SYS#5542
Change-Id: I8fd6380b5ae8ed2d3347e7cfbf674c30b6841ed9