28 Commits

Author SHA1 Message Date
Harald Welte e61d459cef Support building with -Werror=strict-prototypes / -Werror=old-style-definition
Unfortunately "-std=c99" is not sufficient to make gcc ignore code that
uses constructs of earlier C standards, which were abandoned in C99.

See https://lwn.net/ml/fedora-devel/Y1kvF35WozzGBpc8@redhat.com/ for
some related discussion.
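
As a minimal illustration of what the two warnings flag (the function names below are hypothetical and not taken from libosmocore), the rejected old-style forms are shown in comments next to the prototype forms that build cleanly:

```c
/* Declaration without a prototype: before C23, empty parentheses mean
 * "unspecified arguments", not "no arguments". With
 * -Werror=strict-prototypes, gcc rejects a declaration like this:
 *
 *     int get_answer();
 *
 * The prototype form states explicitly that no arguments are taken. */
int get_answer(void);

int get_answer(void)
{
	return 42;
}

/* Old-style (K&R) definition, rejected by -Werror=old-style-definition:
 *
 *     int add(a, b)
 *         int a;
 *         int b;
 *     { return a + b; }
 *
 * The prototype-style definition is accepted: */
int add(int a, int b)
{
	return a + b;
}
```

Both functions behave the same as their old-style counterparts would; only the declaration syntax changes.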

Change-Id: I84fd99442d0cc400fa562fa33623c142649230e2
2022-11-03 12:44:28 +01:00
Oliver Smith 04bfb7165b treewide: remove FSF address
Remove the paragraph about writing to the Free Software Foundation's
mailing address. The FSF has changed addresses in the past, and may do
so again. In 2021 this is not useful; let's rather have a bit less
boilerplate at the start of source files.

Change-Id: I5050285e75cf120407a1d883e99b3c4bcae8ffd7
2021-12-14 12:44:03 +01:00
Neels Hofmeyr 6a5940740a refactor stat_item: report only changed values
Change the functionality of skipping unchanged values: instead of
looking up whether new values have been set on a stat item, rather
remember the last reported value and skip reporting identical values.

stats_test.c shows that previously, a stat item reported a value of 10
again, even though the previous report had already sent a value of 10.
That's just because the value 10 was explicitly set again, internally.

From a perspective of preserving all data points, it could make sense to
send consecutive identical values. But since we already collapse all
data points per reporting period into a max, that is pointless.
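
The "remember the last reported value" idea can be sketched roughly as follows. This is a simplified model, not the code from stats.c; the struct and function names are assumptions made for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reporter-side state for one stat item. */
struct item_report_state {
	bool have_reported;    /* false until the first report was sent */
	int32_t last_reported; /* value sent in the most recent report */
};

/* Return true if 'value' should be sent, and remember it as reported.
 * Identical consecutive values are skipped, no matter how often the
 * value was set internally in between. */
bool should_report(struct item_report_state *st, int32_t value)
{
	if (st->have_reported && st->last_reported == value)
		return false;
	st->have_reported = true;
	st->last_reported = value;
	return true;
}
```

With a zero-initialized state, reporting 10 twice in a row sends it only once, while a following 20 is sent again.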

Related: SYS#5542
Change-Id: I8f4cf34dfed17e0879716fa2cbeee137c158978b
2021-09-30 18:33:43 +00:00
Neels Hofmeyr e90c7176be refactor stat_item: get rid of FIFO and "skipped" error
Instead of attempting to store all distinct values of a reporting period,
just store min, max, last as well as a sum and N of each reporting
period.

This gets rid of error messages like

  DLSTATS ERROR stat_item.c:285 num_bts:oml_connected: 44 stats values skipped

while at the same time more accurately reporting the max value for each
reporting period. (So far stats_item only reports the max value; keep
that part unchanged, as shown in stats_test.c.)

With the other so far unused values (min, sum), we are ready to also
report the minimum value as well as an average value per reporting
period in the future, if/when our stats reporter allows for it.

Store the complete record of the previous reporting period. So far we
only compare the 'max' value, but like this we are ready to also see
changes in min, last and average value between reporting periods.
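
A rough sketch of such a per-period record is shown below. The struct layout and function names are assumptions for illustration only; the actual (now opaque) struct osmo_stats_item may look different:

```c
#include <stdint.h>

/* Hypothetical record of one reporting period. */
struct period_stats {
	uint32_t n;   /* number of values seen in this period */
	int32_t min;  /* smallest value seen, valid if n > 0 */
	int32_t max;  /* largest value seen, valid if n > 0 */
	int32_t last; /* most recently set value */
	int64_t sum;  /* wider type, so that summing int32_t values is safe */
};

/* Account for one new value in the current reporting period. */
void period_add(struct period_stats *p, int32_t value)
{
	if (p->n == 0) {
		p->min = value;
		p->max = value;
	} else {
		if (value < p->min)
			p->min = value;
		if (value > p->max)
			p->max = value;
	}
	p->last = value;
	p->sum += value;
	p->n++;
}
```

Storage is constant per item regardless of how many values arrive, so no value can be "skipped", and an average is available as sum / n whenever n is nonzero.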

This patch breaks API by removing:
- struct members osmo_stats_item.stats_next_id, .last_offs and .values[]
- struct osmo_stats_item_value
- osmo_stat_item_get_next()
- osmo_stat_item_discard()
- osmo_stat_item_discard_all()
and by making struct osmo_stats_item opaque.
In libosmocore, we do have a policy of never breaking API. But since the
above should never be accessed by users of the osmo_stats_item API -- or
if they are, would no longer yield useful results, we decided to make an
exception in this case. The alternative would be to introduce a new
osmo_stats_item2 API and maintaining an unused legacy osmo_stats_item
forever, but we decided that the effort is not worth it. There are no
known users of the removed items.

Related: SYS#5542
Change-Id: I137992a5479fc39bbceb6c6c2af9c227bd33b39b
2021-09-30 18:33:43 +00:00
Neels Hofmeyr 599601e12c stats_test: assert counter and stat item val counts separately
Instead of just a send_count, keep one such count for the counter
updates, and a separate one for the stat item updates.
Print those numbers in the test output.

An upcoming patch will tweak stat_item reporting so that only an
actually changed value results in sending a new stat value. This patch
allows illustrating that change clearly.

Related: SYS#5542
Change-Id: I2da003ee6ec15f1c3959efe69e01b4ee24af82bb
2021-09-20 13:05:32 +00:00
Oliver Smith 11da4a4abd stats: send real last value if no new values come
Background:
* Individual values can be added to osmo_stat_item.values at any time.
* Stats are reported at a fixed interval (see vty 'stats interval'),
  e.g. every 10 seconds.
* In order to report a new stat value, we use the maximum of all
  osmo_stat_item.values added since the last report.
* By default, we do not send new stat values if they did not change
  (see vty 'config-stats' -> 'flush-period' default of 0).

Fix the following bug:
* If 'flush-period' is 0, and no new osmo_stat_item.values are coming
  in, the last value that gets reported is not necessarily the last
  entry in osmo_stat_item.values.
* For attached reporters (statsd), it could then be that the given stat
  stays at the wrong value for a long stretch of time (think of several
  hours/days/forever).

Explanation of how the test shows that it is fixed:
* stats get reported (value is irrelevant)
* osmo_stat_item gets a new value: 20
* osmo_stat_item gets a new value: 10
* stats get reported (value: 20, the maximum of both new values)
* osmo_stat_item gets no new values
* stats get reported (value: 10, this is new because of the bug fix,
  the real last value in osmo_stat_item, different from the 20 sent
  earlier, without the fix it would not send anything here and the last
  sent value would be 20)
* osmo_stat_item gets no new values
* stats get reported (nothing gets sent, since the real last value was
  already sent and 'flush-period' is 0)
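
The sequence above can be modelled with a small simulation. This is a simplified sketch of the described behaviour with 'flush-period' 0, not the libosmocore implementation; all names are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one stat item plus its reporting state. */
struct item_sim {
	bool has_new;      /* values were added since the last report */
	int32_t max_new;   /* maximum of the values added since the last report */
	int32_t last;      /* most recently set value */
	bool sent_any;     /* at least one value was sent so far */
	int32_t last_sent; /* value sent most recently */
};

/* Set a new value on the item. */
void item_set(struct item_sim *it, int32_t value)
{
	if (!it->has_new || value > it->max_new)
		it->max_new = value;
	it->has_new = true;
	it->last = value;
}

/* One reporting interval. Returns true and stores the value in *out if
 * something gets sent, false if the report is suppressed. */
bool item_report(struct item_sim *it, int32_t *out)
{
	/* With new values, report their maximum; without new values,
	 * fall back to the real last value (the behaviour after the fix). */
	int32_t candidate = it->has_new ? it->max_new : it->last;

	it->has_new = false;
	if (it->sent_any && candidate == it->last_sent)
		return false; /* unchanged, nothing to send */
	it->sent_any = true;
	it->last_sent = candidate;
	*out = candidate;
	return true;
}
```

Running the steps from the list against this model (report, set 20, set 10, report, report, report) sends 20, then 10, then nothing.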

Fixes: OS#5215
Change-Id: Ibeefd0e3d1dbe4be454ff05a21df4848b2abfabe
2021-08-20 14:04:54 +00:00
Oliver Smith a79a549273 tests/stats: show how last item sent may be wrong
Extend the test to illustrate the bug described in the related issue,
which will be fixed with the next patch.

Related: OS#5215
Change-Id: I1d26867ac1b837bea6a9754a3203e53c147e7a5f
2021-08-20 14:04:54 +00:00
Pau Espin 7b894a7de0 Use new stat item/ctr getter APIs
Generated with spatch:
"""
@@
expression E1, E2;
@@
- &E2->ctr[E1]
+ rate_ctr_group_get_ctr(E2, E1)
"""

"""
@@
expression E1, E2, E3;
@@
- E2->items[E1]
+ osmo_stat_item_group_get_item(E2, E1)
"""

Change-Id: I41297a8df68e28dfc6016330ac82b0ed5dd0ebc1
2021-06-04 18:19:37 +02:00
Oliver Smith 2623fca8ad stats: log error when missing stats values (v2)
Related: SYS#4877
Change-Id: I5140d967c2f1d36dadf93b03e52b9bbd42e2a3a6
2021-04-07 18:38:54 +00:00
Oliver Smith c7930589fd stats_test: restore stat_item_get_next asserts
This is a partial revert of b27b352e ("stats: Use a global index for
stat item values"). Now that osmo_stat_item_get_next correctly returns
how many values have been skipped, we can use the accurate asserts on
its return value again.

Fix the initial values of next_id_a,b (1 instead of 0), so we don't get
a skipped value on the first read. This is needed, because b27b352e
refactored osmo_stat_item_get_next to have the next id as parameter
instead of the last read one, and the initial value was not adjusted in
the tests.

Related: OS#5088
Change-Id: I9d4cda2487a62f52361c24058363dfa90e502c63
2021-04-07 18:38:54 +00:00
Oliver Smith 6140194347 stat_item: make value ids item specific
Fix counting of values missed because of FIFO overflow in
osmo_stat_item_get_next(), by assigning a new item value id effectively
as item->value[n + 1].id = item->value[n].id + 1, instead of increasing
a global_value_id that is shared between all items and groups. With
global_value_id, the count of values missed was wrong for one item, as
soon as a new value was added to another item.

This partially reverts b27b352e ("stats: Use a global index for stat
item values") from 2015, right after stats was added to libosmocore. It
was supposed to make multiple readers (reporters) possible, which could
read independently from stat_item (and later added comments explain it
like that). But this remained unused: stats has implemented multiple
reporters by reading all stat_items once and sending the same data to
all enabled reporters. The patch caused last_value_index in struct
osmo_stat_item to always remain at -1.

Replace this unused last_value_index with stats_next_id, so stats can
store the item-specific next_id in the struct again. It appears that
stats is the only direct user of osmo_stat_item, but if there are
others, they can bring their own item-specific next_id: functions in
stat_item.c still accept a next_id argument.
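
To illustrate why item-specific ids make the skipped count reliable, here is a simplified ring buffer sketch. It is not the stat_item.c code: the capacity, the names and the 0-based ids are assumptions chosen to keep the example short:

```c
#include <stdint.h>

#define FIFO_LEN 4 /* hypothetical FIFO capacity */

/* Hypothetical item: the value with id i lives in slot i % FIFO_LEN,
 * and ids count up per item, independent of all other items. */
struct fifo_item {
	int32_t values[FIFO_LEN];
	uint32_t count; /* number of values ever set == id of the next value */
};

/* Store a new value, overwriting the oldest one when the FIFO is full. */
void fifo_set(struct fifo_item *it, int32_t value)
{
	it->values[it->count % FIFO_LEN] = value;
	it->count++;
}

/* Read the value with id *next_id and advance *next_id.
 * Returns -1 if there is no new value, otherwise the number of values
 * that were overwritten before this reader could fetch them. */
int fifo_get_next(const struct fifo_item *it, uint32_t *next_id, int32_t *value)
{
	uint32_t oldest;
	int skipped = 0;

	if (*next_id >= it->count)
		return -1;

	/* id of the oldest value that is still stored */
	oldest = it->count > FIFO_LEN ? it->count - FIFO_LEN : 0;
	if (*next_id < oldest) {
		skipped = (int)(oldest - *next_id);
		*next_id = oldest;
	}
	*value = it->values[*next_id % FIFO_LEN];
	(*next_id)++;
	return skipped;
}
```

Because the difference between ids is taken within a single item, setting values on another item cannot inflate the skipped count, which is the problem the shared global_value_id had.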

Related: OS#5088
Change-Id: Ie65dcdf52c8fc3d916e20d7f0455f6223be6b64f
2021-04-07 18:38:54 +00:00
Oliver Smith d3490bc442 stat_item: make next_id argument name consistent
Let osmo_stat_item_get_next, osmo_stat_item_discard,
osmo_stat_item_discard_all consistently refer to their next_id arg as
such (and not idx or next_idx). It refers to an ID (item->values[i].id),
not an index (item->values[i]), and it is always the next one, never the
current one.

Do the same change for _index/_idx variables in stats.c, which are used
as arguments to these functions. Replace rd_ with next_id_ in
stats_test.c, too.

Related: OS#5088
Change-Id: I5dd566b08dff7174d1790f49abd2d6ac020e120e
2021-04-06 11:27:34 +02:00
Oliver Smith a7eb735b8d Revert "stats: log error when missing stats values"
This reverts commit d290439b4a, which
caused "stats values skipped" messages to appear even if they were not
skipped. Revert for now, replace with a proper version in the future.

Related: SYS#4877
Change-Id: Ib43bd53188a4d31d771feb921ea14abe1a3ec877
2021-03-19 16:47:52 +01:00
Oliver Smith d290439b4a stats: log error when missing stats values
Let the user know when the stats were not consumed fast enough for the
given FIFO length.

Related: SYS#4877
Change-Id: If0e8ab55103007693101538fb6ea310075217774
2021-03-17 17:52:37 +01:00
Oliver Smith d89d35e933 tests/stats: enable logging in test output
Move test output from stdout to stderr and enable logging to stderr.
This is in preparation for the next patch, which will add a new log
message when osmo_stat_item_get_next() skips a value.

Related: SYS#4877
Change-Id: Ie0eaec2f93ac6859397a6bfca45039fdcc27cb9e
2021-03-17 16:39:35 +01:00
Daniel Willmann 2aa527bd99 stats: Ensure that each osmo_stat_item only reports once per interval
We should never report multiple values for a metric. It is confusing for
the log reporter and wrong for statsd. Statsd will record only one value,
but will it be the first, last, ...?
This can happen if an osmo_stat_item changes more than once within the
same reporting interval.

With this patch only one aggregate value is sent to the log reporters.
The value reported is the maximum during this interval. Other
aggregations could be possible (min, last), but reporting a (useful)
average is not because the values don't include a timestamp and most
osmo_stat_items change at irregular intervals.

Change-Id: I366ab1c66f4ae6363111ea4e41b66b7d5bcade9c
Related: SYS#4877
2021-03-09 14:08:15 +01:00
Alexander Couzens cc72cc45a4 add osmo_stat_item_inc/osmo_stat_item_dec to set it relative
Change-Id: Id2462c4866bd22bc2338c9c8f69b775f88ae7511
2019-05-07 13:20:57 +00:00
Neels Hofmeyr 554f7b8a77 rate_ctr: fix osmo-sgsn DoS: don't return NULL on already used index
Recent patch I563764af1d28043e909234ebb048239125ce6ecd introduced returning
NULL from rate_ctr_group_alloc() when the index passed already exists.

Instead of returning NULL, find an unused group index and use that, adjust the
error message.

In stats_test.c, adjust, and also assert allocated counter group indexes
everywhere.

Rationale:

The original patch causes osmo-sgsn to crash as soon as the second subscriber
attempts to establish an MM context. Of course osmo-sgsn is wrong to a) fail to
check a NULL return value and crash and b) to fail to allocate an MM context
just because the rate counter group could not be allocated (it still rejects
the MM context completely if rate_ctr_group_alloc() fails).

Nevertheless, the price we pay for rate counter correctness is, at least in
this instance, way too high: osmo-sgsn becomes completely unusable for more
than one subscriber.

Numerous other places exist where rate_ctr_group_alloc() is called with a
constant index number; from a quick grep magic I found these possible breaking
points:

osmo-sgsn/src/gprs/gb_proxy.c:1431:     cfg->ctrg = rate_ctr_group_alloc(tall_bsc_ctx, &global_ctrg_desc, 0);
osmo-sgsn/src/gprs/gprs_sgsn.c:139:     sgsn->rate_ctrs = rate_ctr_group_alloc(tall_bsc_ctx, &sgsn_ctrg_desc, 0);
osmo-sgsn/src/gprs/gprs_sgsn.c:270:     ctx->ctrg = rate_ctr_group_alloc(ctx, &mmctx_ctrg_desc, 0);
osmo-sgsn/src/gprs/gtphub.c:888:        b->counters_io = rate_ctr_group_alloc(osmo_gtphub_ctx,
                                                                              &gtphub_ctrg_io_desc, 0);
osmo-bsc/src/libfilter/bsc_msg_acc.c:87:        lst->stats = rate_ctr_group_alloc(lst, &bsc_cfg_acc_list_desc, 0);
osmo-pcu/src/bts.cpp:228:               m_ratectrs = rate_ctr_group_alloc(tall_pcu_ctx, &bts_ctrg_desc, 0);
osmo-pcu/src/tbf.cpp:793:       tbf->m_ctrs = rate_ctr_group_alloc(tbf, &tbf_ctrg_desc, 0);
osmo-pcu/src/tbf.cpp:879:       tbf->m_ul_egprs_ctrs = rate_ctr_group_alloc(tbf, &tbf_ul_egprs_ctrg_desc, 0);
osmo-pcu/src/tbf.cpp:880:       tbf->m_ul_gprs_ctrs = rate_ctr_group_alloc(tbf, &tbf_ul_gprs_ctrg_desc, 0);
osmo-pcu/src/tbf.cpp:970:               tbf->m_dl_egprs_ctrs = rate_ctr_group_alloc(tbf, &tbf_dl_egprs_ctrg_desc, 0);
osmo-pcu/src/tbf.cpp:977:               tbf->m_dl_gprs_ctrs = rate_ctr_group_alloc(tbf, &tbf_dl_gprs_ctrg_desc, 0);
osmo-pcu/src/tbf.cpp:1475:      ul_tbf->m_ctrs = rate_ctr_group_alloc(ul_tbf, &tbf_ctrg_desc, 0);
osmo-pcu/src/bts.cpp:226:               m_ratectrs = rate_ctr_group_alloc(tall_pcu_ctx, &bts_ctrg_desc, 1);

We can fix all of these callers and then reconsider returning NULL, but IMO
even into the future, rate counter group indexes are not something worth
failing to provide service for. For future bugs we should keep the automatic
index picking in case of index collisions. We will get an error message barfed
and can fix the issue in our own time, while the application remains completely
usable, and even the rate counters can still be queried (at wrong indexes, but
life is tough).
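
The fallback can be pictured with the sketch below. It only illustrates the idea of picking a replacement index on collision; the helper names are hypothetical, and choosing the lowest free index is an assumption, since this message does not say which unused index rate_ctr_group_alloc() actually picks:

```c
#include <stdbool.h>
#include <stddef.h>

/* Check whether 'idx' appears in the list of indexes already in use
 * (hypothetical stand-in for walking the existing counter groups). */
static bool index_in_use(const unsigned int *used, size_t n_used, unsigned int idx)
{
	size_t i;

	for (i = 0; i < n_used; i++) {
		if (used[i] == idx)
			return true;
	}
	return false;
}

/* Return 'wanted' if it is free, otherwise the lowest index not in use.
 * The loop terminates because at most n_used indexes are taken. */
unsigned int pick_group_index(const unsigned int *used, size_t n_used, unsigned int wanted)
{
	unsigned int idx = 0;

	if (!index_in_use(used, n_used, wanted))
		return wanted;
	while (index_in_use(used, n_used, idx))
		idx++;
	return idx;
}
```

A caller that always passes index 0 therefore still gets a usable group on the second allocation, just under a different index than requested.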

Related: I49aa95b610f2faec52dede2e4816da47ca1dfb14 (osmo-sgsn's segfault)
Change-Id: Iba6e41b8eeaea5ff6ed862bab3f34a62ab976914
2017-12-20 01:29:59 +01:00
Max 3ef14a241a Do not allocate already existing counter group
Check that no group with the given name and index already exists before
allocating it. Add corresponding test case.

Change-Id: I563764af1d28043e909234ebb048239125ce6ecd
Related: OS#2757
2017-12-17 20:12:34 +00:00
Harald Welte e08da97570 Fix/Update copyright notices; Add SPDX annotation
Let's fix some erroneous/accidental references to the wrong license,
update copyright information where applicable and introduce a
SPDX-License-Identifier to all files.

Change-Id: I39af26c6aaaf5c926966391f6565fc5936be21af
2017-11-13 01:35:12 +09:00
Harald Welte 04c881207f stats_test: Extend check to include test for counter group name mangling
In Change-Id Ifc6ac824f5dae9a848bb4a5d067c64a69eb40b56 we introduced
name mangling that converts any '.' in counter (group) names to ':'.
Let's test this functionality explicitly as part of the stats_test.
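
The mangling itself amounts to a character replacement. A minimal sketch, using a hypothetical helper name rather than the actual rate_ctr.c function:

```c
#include <string.h>

/* Replace every '.' in 'name' with ':' in place. */
void mangle_name(char *name)
{
	char *p = name;

	while ((p = strchr(p, '.')) != NULL)
		*p = ':';
}
```

Applied to a writable buffer holding "ctr-test.one_dot", this leaves "ctr-test:one_dot"; names without a '.' are unchanged.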

Change-Id: Ie35682aa79526e2ffeab6995cd640b7847d855bf
2017-10-24 16:00:45 +00:00
Harald Welte a7a5065385 Convert lib-internal rate_ctr from '.' separator to ':' separator
The rate_ctr.c code would do this mangling automatically, but let's
avoid using this from new versions of our code for
simplicity/explicitness.

Change-Id: I24a556f447cfac25efb6e83cac2d0c2972d98fe3
2017-10-24 16:00:45 +00:00
Neels Hofmeyr b41b48e76a stats_test: fix mismatching osmo_stats_reporter->send_item signature
The function pointer expects the last arg as int64_t, stats_test.c uses
an int instead. Fix the argument type as well as the printf format for it.

Fixes this compiler warning seen on our FreeBSD build slave:

    CC       stats/stats_test.o
  ../../tests/stats/stats_test.c:288:18: warning: incompatible pointer types assigning to 'int (*)(struct osmo_stats_reporter *, const struct osmo_stat_item_group *, const struct osmo_stat_item_desc *, int64_t)' from 'int (struct osmo_stats_reporter *, const struct osmo_stat_item_group *, const struct osmo_stat_item_desc *, int)' [-Wincompatible-pointer-types]
          srep->send_item = stats_reporter_test_send_item;
                          ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  1 warning generated.
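
A reduced example of the same kind of fix is sketched below. The callback type and function are hypothetical stand-ins, not the real osmo_stats_reporter signature; the point is that the last argument is int64_t in both the pointer type and the function, and that it is printed with PRId64:

```c
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical callback type whose last argument is int64_t. */
typedef int (*format_item_fn)(char *buf, size_t len, const char *name, int64_t value);

/* Matches format_item_fn exactly. Declaring 'value' as plain int would
 * trigger an incompatible-pointer-types warning on assignment. */
int format_item(char *buf, size_t len, const char *name, int64_t value)
{
	/* PRId64 expands to the correct printf conversion for int64_t. */
	return snprintf(buf, len, "%s=%" PRId64, name, value);
}
```

Assigning format_item to a format_item_fn variable then compiles without warnings, and values beyond the range of a 32-bit int are printed correctly.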

Change-Id: I91cbfd4dd25a881b803943430101dabf07dafc7c
2017-01-15 18:10:15 +00:00
Jacob Erlbeck f13de868be stats/test: Add memory leak check
Adds a rudimentary leak check for the counters and stat items.

Sponsored-by: On-Waves ehf
2015-11-26 12:52:24 +01:00
Jacob Erlbeck 46b703d083 stats/test: Add test for reporting
This test uses a dedicated test reporter to check several aspects of
the value reporting.
  - addition/removal of stats reporter
  - addition/removal of counters/items
  - setting of max_class
  - initial value flush
  - updating single counters/items
  - reporter retrieval
  - enable/disable

Sponsored-by: On-Waves ehf
2015-11-26 12:52:24 +01:00
Jacob Erlbeck fc9533d6c4 stats: Add osmo_ name prefix to identifiers
Since the stat_item and stats functions and data types are meant
to be exported, they get an osmo_ prefix.

Sponsored-by: On-Waves ehf

[hfreyther: Prepended the enum values too. This was requested by
Jacob]
2015-11-02 15:39:01 +01:00
Jacob Erlbeck b27b352e93 stats: Use a global index for stat item values
Currently each stat item has a separate index value which basically
counts each single value added to the item and which can be used by
a reporter to get all new values that have not been reported yet.
The drawback is that such an index must be stored for each stat
item.

This commit introduces a global index which is incremented for each
new stat item value. This index is then stored together with the item
value. So a single stored index per reporter is sufficient to make
sure that only new values are reported.

Sponsored-by: On-Waves ehf
2015-10-28 23:51:24 +01:00
Jacob Erlbeck 9732cb4a92 stats: Add stat_item for value monitoring
This commit adds instrumentation functions to gather measurement
and statistical values similar to counter groups.

Multiple values can be stored per item, which can be retrieved in
FIFO order. Getting values from the item does not modify its state to
allow for multiple independent backends (e.g. VTY and statsd).

When a new value is set, the oldest value gets silently overwritten.
Lost values are skipped when getting values from the item.

Sponsored-by: On-Waves ehf
2015-10-28 23:51:04 +01:00