libosmocore/tests/stats
Neels Hofmeyr 554f7b8a77 rate_ctr: fix osmo-sgsn DoS: don't return NULL on already used index
Recent patch I563764af1d28043e909234ebb048239125ce6ecd introduced returning
NULL from rate_ctr_group_alloc() when the index passed already exists.

Instead of returning NULL, find an unused group index and use that. Adjust the
error message accordingly.
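
The fallback described above can be sketched roughly as follows. This is a
minimal standalone illustration, not the libosmocore code: struct ctr_group,
ctr_group_alloc(), idx_in_use() and unused_idx() are hypothetical names, a
plain linked list stands in for the real group registry, and "one above the
highest index in use" is just one assumed way of picking an unused index.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a rate counter group. */
struct ctr_group {
	const char *name;       /* group name; must outlive the group */
	unsigned int idx;       /* instance index, unique per name */
	struct ctr_group *next; /* singly linked list of all groups */
};

static struct ctr_group *all_groups;

/* Return 1 if a group with this name and index is already registered. */
static int idx_in_use(const char *name, unsigned int idx)
{
	const struct ctr_group *g;

	for (g = all_groups; g; g = g->next) {
		if (g->idx == idx && strcmp(g->name, name) == 0)
			return 1;
	}
	return 0;
}

/* Pick an index that no group of this name uses yet:
 * one above the highest index currently in use (assumed policy). */
static unsigned int unused_idx(const char *name)
{
	const struct ctr_group *g;
	unsigned int idx = 0;

	for (g = all_groups; g; g = g->next) {
		if (strcmp(g->name, name) == 0 && g->idx >= idx)
			idx = g->idx + 1;
	}
	return idx;
}

/* Allocate a group. On an index collision, log an error and fall back
 * to an unused index instead of returning NULL. NULL is returned only
 * when memory allocation itself fails. */
struct ctr_group *ctr_group_alloc(const char *name, unsigned int idx)
{
	struct ctr_group *g;

	assert(name != NULL);

	if (idx_in_use(name, idx)) {
		unsigned int new_idx = unused_idx(name);

		fprintf(stderr,
			"counter group '%s' already has index %u, using %u instead\n",
			name, idx, new_idx);
		idx = new_idx;
	}

	g = calloc(1, sizeof(*g));
	if (!g)
		return NULL;

	g->name = name;
	g->idx = idx;
	g->next = all_groups;
	all_groups = g;
	return g;
}
```

In this sketch, calling ctr_group_alloc("mmctx", 0) twice yields groups with
indexes 0 and 1, and the second call prints an error instead of failing.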

Adjust stats_test.c accordingly, and also assert the allocated counter group
indexes everywhere.

Rationale:

The original patch causes osmo-sgsn to crash as soon as the second subscriber
attempts to establish an MM context. Of course osmo-sgsn is wrong a) to crash
because it fails to check for a NULL return value and b) to fail to allocate an
MM context just because the rate counter group could not be allocated (it still
rejects the MM context completely if rate_ctr_group_alloc() fails).
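
For illustration, a caller that tolerates a missing counter group might look
roughly like the sketch below. All names here (struct opt_ctrs, struct mm_ctx,
try_alloc_ctrs(), mm_ctx_alloc(), mm_ctx_count_event()) are hypothetical and
not taken from osmo-sgsn; the simulate_failure flag exists only to exercise
the failure path.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical counters attached to a context. */
struct opt_ctrs {
	unsigned long events;
};

/* Hypothetical context; ctrs may be NULL if allocation failed. */
struct mm_ctx {
	struct opt_ctrs *ctrs;
};

/* Stand-in for a counter group allocation that may fail. */
static struct opt_ctrs *try_alloc_ctrs(int simulate_failure)
{
	if (simulate_failure)
		return NULL;
	return calloc(1, sizeof(struct opt_ctrs));
}

/* Allocate a context. Missing counters degrade statistics only;
 * they do not make the context allocation fail. */
struct mm_ctx *mm_ctx_alloc(int simulate_ctr_failure)
{
	struct mm_ctx *ctx = calloc(1, sizeof(*ctx));

	if (!ctx)
		return NULL;
	ctx->ctrs = try_alloc_ctrs(simulate_ctr_failure);
	return ctx;
}

/* Count an event; a no-op when the context has no counters. */
void mm_ctx_count_event(struct mm_ctx *ctx)
{
	assert(ctx != NULL);

	if (!ctx->ctrs)
		return;
	ctx->ctrs->events++;
}
```

The design choice in this sketch is that every counter access checks for NULL,
so a failed counter allocation costs statistics but not service.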

Nevertheless, the price we pay for rate counter correctness is, at least in
this instance, way too high: osmo-sgsn becomes completely unusable for more
than one subscriber.

Numerous other places exist where rate_ctr_group_alloc() is called with a
constant index number; a quick grep turned up these possible breaking points:

osmo-sgsn/src/gprs/gb_proxy.c:1431:     cfg->ctrg = rate_ctr_group_alloc(tall_bsc_ctx, &global_ctrg_desc, 0);
osmo-sgsn/src/gprs/gprs_sgsn.c:139:     sgsn->rate_ctrs = rate_ctr_group_alloc(tall_bsc_ctx, &sgsn_ctrg_desc, 0);
osmo-sgsn/src/gprs/gprs_sgsn.c:270:     ctx->ctrg = rate_ctr_group_alloc(ctx, &mmctx_ctrg_desc, 0);
osmo-sgsn/src/gprs/gtphub.c:888:        b->counters_io = rate_ctr_group_alloc(osmo_gtphub_ctx, &gtphub_ctrg_io_desc, 0);
osmo-bsc/src/libfilter/bsc_msg_acc.c:87:        lst->stats = rate_ctr_group_alloc(lst, &bsc_cfg_acc_list_desc, 0);
osmo-pcu/src/bts.cpp:228:               m_ratectrs = rate_ctr_group_alloc(tall_pcu_ctx, &bts_ctrg_desc, 0);
osmo-pcu/src/tbf.cpp:793:       tbf->m_ctrs = rate_ctr_group_alloc(tbf, &tbf_ctrg_desc, 0);
osmo-pcu/src/tbf.cpp:879:       tbf->m_ul_egprs_ctrs = rate_ctr_group_alloc(tbf, &tbf_ul_egprs_ctrg_desc, 0);
osmo-pcu/src/tbf.cpp:880:       tbf->m_ul_gprs_ctrs = rate_ctr_group_alloc(tbf, &tbf_ul_gprs_ctrg_desc, 0);
osmo-pcu/src/tbf.cpp:970:               tbf->m_dl_egprs_ctrs = rate_ctr_group_alloc(tbf, &tbf_dl_egprs_ctrg_desc, 0);
osmo-pcu/src/tbf.cpp:977:               tbf->m_dl_gprs_ctrs = rate_ctr_group_alloc(tbf, &tbf_dl_gprs_ctrg_desc, 0);
osmo-pcu/src/tbf.cpp:1475:      ul_tbf->m_ctrs = rate_ctr_group_alloc(ul_tbf, &tbf_ctrg_desc, 0);
osmo-pcu/src/bts.cpp:226:               m_ratectrs = rate_ctr_group_alloc(tall_pcu_ctx, &bts_ctrg_desc, 1);

We can fix all of these callers and then reconsider returning NULL, but IMO,
even in the future, rate counter group indexes are not worth denying service
over. For future bugs we should keep the automatic index picking in case of
index collisions. We will get an error message logged and can fix the issue in
our own time, while the application remains completely usable, and even the
rate counters can still be queried (at wrong indexes, but life is tough).

Related: I49aa95b610f2faec52dede2e4816da47ca1dfb14 (osmo-sgsn's segfault)
Change-Id: Iba6e41b8eeaea5ff6ed862bab3f34a62ab976914
2017-12-20 01:29:59 +01:00
stats_test.c rate_ctr: fix osmo-sgsn DoS: don't return NULL on already used index 2017-12-20 01:29:59 +01:00
stats_test.ok stats_test: Extend check to include test for counter group name mangling 2017-10-24 16:00:45 +00:00