Imagine the following scenario:
1- client connects to CTRL iface, a new conn is created with POLL_READ
enabled.
2- An unrelated event triggers a TRAP to be sent. As a result, the wqueue
for the conn now has POLL_WRITE enabled, and the message will be sent the
next time we go through osmo_select_main().
3- At the same time, we receive the GET cmd from the CTRL client, which
means a POLL_READ event will also be triggered the next time we call
osmo_select_main().
4- osmo_select_main() triggers osmo_wqueue_bfd_cb() with both READ/WRITE
flags set.
5- The read_cb of the wqueue is executed first. The handler closes the CTRL
conn for some reason, frees the osmo_fd struct and returns.
6- osmo_wqueue_bfd_cb() keeps using the already freed osmo_fd and calls
write_cb.
So in step 6 we get a heap-use-after-free caught by AddressSanitizer:
20180424135406115 DLCTRL <0018> control_if.c:506 accept()ed new CTRL connection from (r=10.42.42.1:53910<->l=10.42.42.7:4249)
20180424135406116 DLCTRL <0018> control_cmd.c:378 Command: GET bts.0.oml-connection-state
20180424135406117 DLINP <0013> bts_ipaccess_nanobts.c:417 Identified BTS 1/0/0
20180424135406118 DNM <0005> abis_nm.c:1628 Get Attr (bts=0)
20180424135406118 DNM <0005> abis_nm.c:1628 Get Attr (bts=0)
20180424135406118 DCTRL <000e> osmo_bsc_ctrl.c:158 BTS connection (re)established, sending TRAP.
20180424135406119 DLCTRL <0018> control_if.c:173 close()d CTRL connection (r=10.42.42.1:53910<->l=10.42.42.7:4249)
=================================================================
==12301==ERROR: AddressSanitizer: heap-use-after-free on address 0x611000003e04 at pc 0x7f23091c3a2f bp 0x7ffc0cb73ff0 sp 0x7ffc0cb73fe8
READ of size 4 at 0x611000003e04 thread T0
#0 0x7f23091c3a2e in osmo_wqueue_bfd_cb /home/osmocom-build/jenkins/workspace/osmo-gsm-tester_build-osmo-bsc/libosmocore/src/write_queue.c:65
#1 0x7f23091ad5d8 in osmo_fd_disp_fds /home/osmocom-build/jenkins/workspace/osmo-gsm-tester_build-osmo-bsc/libosmocore/src/select.c:216
#2 0x7f23091ad5d8 in osmo_select_main /home/osmocom-build/jenkins/workspace/osmo-gsm-tester_build-osmo-bsc/libosmocore/src/select.c:256
#3 0x56538bdb7a26 in main /home/osmocom-build/jenkins/workspace/osmo-gsm-tester_build-osmo-bsc/osmo-bsc/src/osmo-bsc/osmo_bsc_main.c:532
#4 0x7f23077532e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0)
#5 0x56538bdb8999 in _start (/home/jenkins/workspace/osmo-gsm-tester_run-prod/trial-896/inst/osmo-bsc/bin/osmo-bsc+0x259999)
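One way to avoid step 6 can be sketched as follows. This is a simplified
model, not the libosmocore code: all ex_* names are hypothetical, and using
-EBADF as the "fd is gone" signal is an assumption of this sketch. The read
callback reports that it freed the fd, and the dispatcher returns before
touching the fd again.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define EX_POLL_READ  0x1
#define EX_POLL_WRITE 0x2

/* Hypothetical stand-in for struct osmo_fd; not the libosmocore type. */
struct ex_fd {
	int (*read_cb)(struct ex_fd *fd);
	int (*write_cb)(struct ex_fd *fd);
};

int ex_write_calls; /* counts write_cb invocations, for demonstration */

/* Read handler that closes the connection: frees the fd, reports -EBADF. */
int ex_read_and_close(struct ex_fd *fd)
{
	free(fd);
	return -EBADF;
}

int ex_write(struct ex_fd *fd)
{
	(void)fd;
	ex_write_calls++;
	return 0;
}

/* Dispatch both events, but stop using fd once read_cb reports it is gone. */
int ex_dispatch(struct ex_fd *fd, unsigned int what)
{
	if (what & EX_POLL_READ) {
		int rc = fd->read_cb(fd);
		if (rc == -EBADF)
			return rc; /* fd was freed: do not dereference it again */
	}
	if (what & EX_POLL_WRITE)
		return fd->write_cb(fd);
	return 0;
}
```

With both flags set and a read handler that closes the conn, ex_dispatch()
returns -EBADF and write_cb is never called.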
Fixes: OS#3206
Change-Id: I84d10caaadcfa6bd46ba8756ca89aa0badcfd2e3
The CTRL interface has a ctrl_cmd_def_* API that allows deferring a CTRL
command reply until later. However, the command handling currently fails to
acknowledge this and deallocates the struct ctrl_cmd anyway.
Fix: in struct ctrl_cmd, add a defer pointer to be populated by
ctrl_cmd_def_make(). A cmd thus marked as deferred is not deallocated at the
end of command handling. This fix needs no change in calling code.
(Another idea was to return a different code than CTRL_CMD_HANDLED when the
command is to be deferred, but that would require adjusting each user of
ctrl_cmd_def_make(). The implicit marking is safer and easier.)
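The implicit marking can be illustrated with a small model. The ex_* structs
and functions below are hypothetical simplifications (plain malloc instead of
talloc, no reply handling); only the idea of a defer pointer set while
creating the deferred record is taken from the description above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the real CTRL structs. */
struct ex_cmd_def;

struct ex_cmd {
	struct ex_cmd_def *defer; /* non-NULL once the reply is deferred */
};

struct ex_cmd_def {
	struct ex_cmd *cmd;
};

/* Plays the role of ctrl_cmd_def_make(): creating the def marks the cmd. */
struct ex_cmd_def *ex_cmd_def_make(struct ex_cmd *cmd)
{
	struct ex_cmd_def *cd = calloc(1, sizeof(*cd));
	if (!cd)
		return NULL;
	cd->cmd = cmd;
	cmd->defer = cd;
	return cd;
}

/* End of command handling: free the cmd unless it was marked as deferred.
 * Returns 1 if the cmd was freed, 0 if it was kept for a later reply. */
int ex_finish_handling(struct ex_cmd *cmd)
{
	if (cmd->defer)
		return 0;
	free(cmd);
	return 1;
}
```

Callers of ex_cmd_def_make() do not need to signal anything themselves; the
check at the end of handling sees the marker.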
Show that handling deferred commands is fixed by adjusting the expectations of
ctrl_test.c's test_deferred_cmd() and removing the now obsolete exit_early
label.
One symptom of the breakage is that osmo-bts-sysmo crashes when asked to report
a trx's clock-info, which is aggravated by the fact that the sysmobts-mgr does
ask osmo-bts-sysmo for a clock-info.
The crash appears since Id583b413f8b8bd16e5cf92a8a9e8663903646381 -- it looked
like just fixing an obvious memory leak, which it did as shown by the unit
test, but deferred ctrl commands actually relied on that leak. Both fixed now.
Related: OS#3120
Change-Id: I24232be7dcf7be79f4def91ddc8b8f8005b56318
Now that we use osmo_sock_get_name() to print connection information
at disconnect, let's use the same also at accept() time.
Furthermore, let's call it CTRL connection everywhere for consistency.
Change-Id: I33ee7d0ed853c5b2a4ae4e8ef945f8f27753cdea
When read() or write() system calls return '0' on a stream socket,
it means that the connection has been closed ("EOF"). We must
accordingly close this socket and remove all related state.
Before this patch, every new CTRL connection would leak both some
memory/state and a file descriptor :(
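A minimal sketch of the EOF rule, assuming a POSIX stream socket. The helper
name ex_read_or_close() and its return convention are made up for this
example and are not part of the CTRL code.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: read once from a stream socket.
 * Returns the number of bytes read (> 0), 0 if the read should simply be
 * retried later, or -1 after closing the socket on EOF or a hard error. */
ssize_t ex_read_or_close(int *fd, char *buf, size_t len)
{
	ssize_t rc = read(*fd, buf, len);

	if (rc > 0)
		return rc;
	if (rc < 0 && (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR))
		return 0; /* nothing to read right now; keep the connection */

	/* rc == 0 means the peer closed the connection (EOF) */
	close(*fd);
	*fd = -1; /* the caller must now drop all per-connection state */
	return -1;
}
```

Resetting the descriptor to -1 is one way to make it obvious to the caller
that the remaining state for this connection has to be released as well.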
Change-Id: I4fb70e5f123b37dece29f156c5f430c875e7cbaf
The recently added ctrl_cmd_parse2() returns non-NULL cmd with error messages
upon parsing errors. In handle_control_read(), use ctrl_cmd_parse2() and send
those back to the CTRL command sender as reply.
Retain the previous "Command parser error" reply only in case ctrl_cmd_parse2()
should return NULL, which shouldn't actually happen at all.
Change-Id: Ie35a02555b76913bb12734a76fc40fde7ffb244d
In ctrl_handle_msg() (code recently propagated from handle_control_read()),
talloc_free() the parsed ctrl_cmd in all code paths. In particular, a free was
missing in case ctrl_cmd_handle() returns CTRL_CMD_HANDLED.
CTRL_CMD_HANDLED is triggered by GET_REPLY / SET_REPLY parsing, as shown by
ctrl_test.c. With the memleak fixed, adjust expected test output and make a
detected mem leak abort the test immediately.
Change-Id: Id583b413f8b8bd16e5cf92a8a9e8663903646381
In order to allow unit testing the ctrl iface msgb handling, have a separate
msgb entry point function from the actual fd read function.
An upcoming patch will prove a memory leak in CTRL msgb handling by a unit test
that needs this separation.
Change-Id: Ie09e39db668b866eeb80399b82e7b04b8f5ad7c3
Previously, a ctrl request for all counters in a
group (e.g. 'rate_ctr.abs.msc.0') resulted in a human-readable
description which is not regular enough and is hard to both parse and
generate. The ctrl interface is intended for m2m, not for human
interaction. Let's simplify things by making the response similar to the
counter group request ('rate_ctr.*').
Reply now looks as follows:
GET_REPLY 9084354783926137287 rate_ctr.abs.msc.0 loc_update_type:attach 0;loc_update_type:normal 0;
Previously it was:
GET_REPLY 9084354783926137287 rate_ctr.abs.msc.0 All counters in msc.0
loc_update_type:attach 0
loc_update_type:normal 0
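The new single-line value ("name value;" pairs) could be produced roughly
like this. The struct ex_ctr and ex_format_group() are hypothetical; the real
code works on rate counter groups, which are not modelled here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical counter record, for illustration only. */
struct ex_ctr {
	const char *name;
	unsigned long long value;
};

/* Write all counters as "name value;" pairs on a single line.
 * Returns 0 on success, -1 if the buffer is too small. */
int ex_format_group(char *buf, size_t len, const struct ex_ctr *ctrs, size_t n)
{
	size_t used = 0;

	if (len == 0)
		return -1;
	buf[0] = '\0';
	for (size_t i = 0; i < n; i++) {
		int rc = snprintf(buf + used, len - used, "%s %llu;",
				  ctrs[i].name, ctrs[i].value);
		if (rc < 0 || (size_t)rc >= len - used)
			return -1; /* output would have been truncated */
		used += (size_t)rc;
	}
	return 0;
}
```

A fixed separator and no free-form header line mean the client can split on
';' and then on the last space of each item.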
Change-Id: I7a24cc307450efdcd28168fffe477320c59fcd36
Related: OS#2550
This should never happen with the current code, but if it ever does, we
should log the error instead of silently returning 0.
Change-Id: I544001d3072e5f12a96a67e4178f9b945c5f6b6c
Related: OS#2550
Previously, the user had to know the group name and index in advance to
request a rate counter value. Introduce an introspection function which
allows the user to obtain all the groups and their indexes by requesting the
'rate_ctr.*' variable.
This simplifies KPI dumping over the ctrl interface.
Change-Id: Ifad8b4f0360c8bcd123a838676516476e84c246a
Related: OS#2550
Let's fix some erroneous/accidental references to the wrong license,
update copyright information where applicable and introduce an
SPDX-License-Identifier to all files.
Change-Id: I39af26c6aaaf5c926966391f6565fc5936be21af
Add ctrl_interface_setup_dynip2() to add a node_count parameter, which can be
used to define more ctrl nodes without having to merge a patch to libosmocore.
In consequence, also add ctrl_handle_alloc2(), since
ctrl_interface_setup_dynip() uses ctrl_handle_alloc() to allocate the node
slots, and add node_count param to static ctrl_init().
Passing zero as node_count indicates to use the default of _LAST_CTRL_NODE as
before, i.e. to not define more ctrl nodes. Assert that we never allocate less
than _LAST_CTRL_NODE slots.
The current ctrl_interface_setup_dynip() and ctrl_handle_alloc() become simple
wrappers that pass zero as node_count. Their use is still valid and they do not
need to be deprecated.
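The node_count rule (zero selects the default, never fewer than the built-in
slots) can be sketched like this. EX_LAST_CTRL_NODE is an assumed placeholder
value and the ex_* functions are hypothetical; the real _LAST_CTRL_NODE comes
from the libosmocore headers.

```c
#include <assert.h>

/* Assumed value for illustration only. */
#define EX_LAST_CTRL_NODE 5

/* Zero selects the default; anything else must cover the built-in nodes. */
unsigned int ex_effective_node_count(unsigned int node_count)
{
	if (node_count == 0)
		node_count = EX_LAST_CTRL_NODE;
	assert(node_count >= EX_LAST_CTRL_NODE);
	return node_count;
}

/* Wrapper in the style of the older API without a node_count argument. */
unsigned int ex_default_node_count(void)
{
	return ex_effective_node_count(0);
}
```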
The API comment to ctrl_interface_setup_dynip2() explains how to define more
node IDs.
This patch was verified to work by osmo-hlr.git change
I98ee6a06b3aa6a67adb868e0b63b0e04eb42eb50 which adds two node IDs for use by
osmo-hlr only.
Change-Id: I1bd62ae0d4eefde7e1517db15a2155640a1bab58
With stat_item, stats.c and stats_statsd.c, it is becoming a bit
difficult to understand file naming. Also, the 'statistics.c' file
actually only contained osmo_counter handling, so let's rename it to
counter.c altogether.
Change-Id: I2cfb2310543902b7da46cb15a76e2da317eaed7d
Considering the various styles and implications found in the sources, edit
scores of files to follow the same API doc guidelines around the doxygen
grouping and the \file tag.
Many files now show a short description in the generated API doc that was so
far only available as C comment.
The guidelines and the reasoning behind them are documented at
https://osmocom.org/projects/cellular-infrastructure/wiki/Guidelines_for_API_documentation
In some instances, remove file comments and add to the corresponding group
instead, to be shared among several files (e.g. bitvec).
Change-Id: Ifa70e77e90462b5eb2b0457c70fd25275910c72b
Especially for short descriptions, it is annoying to have to type \brief for
every single API doc.
Drop all \brief and enable the AUTOBRIEF feature of doxygen, which always takes
the first sentence of an API doc as the brief description.
Change-Id: I11a8a821b065a128108641a2a63fb5a2b1916e87
Replace if-else ladder & gotos with single switch statement & explicit
return to make reading code easier.
Change-Id: Ida1b389b571c60c26813cd29e61b3e4423c5df0f
This allows programmatic access to introspection of FSM instances, which
is quite handy from e.g. external test cases: Send a message to the
code, then use the CTRL interface to check if that message has triggered
the right kind of state transition.
Change-Id: I0f80340ee9c61c88962fdd6764a6098a844d0d1e
Sometimes (particularly when testing), we may want to parse+execute an
arbitrary control command simply from a string buffer, rather than from
a msgb. Let's add a helper for that.
Change-Id: Iaca748e0d942bb2a1ee7c2776b37485e1439eb0c
When executing test cases, we don't want to bind to a local TCP port, as
we cannot make assumptions as to which ports are actually free.
Change-Id: I5717f9dd92d1f143f069cecd4b4c8ba3d03b25f8
The existing code assumes that the main application knows about all
control command nodes and can thus present one lookup function.
As libraries are getting their own control interface handling, this
is too restrictive, and we need a way for library code to dynamically
register more node lookup helpers. We can now do this by means of a
ctrl_lookup_register() function.
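The idea of several registered lookup helpers can be modelled as below. This
is only a sketch: the ex_* names, the callback signature, the fixed-size
table and the node names "bts" and "fsm" are all assumptions and do not
reflect the actual ctrl_lookup_register() interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup helper: return 1 and set *node on a match, else 0. */
typedef int (*ex_lookup_fn)(const char *name, int *node);

#define EX_MAX_LOOKUPS 8
static ex_lookup_fn ex_lookups[EX_MAX_LOOKUPS];
static size_t ex_num_lookups;

/* Register one more helper. Returns 0 on success (a repeated registration
 * is accepted and ignored), -1 when the table is full. */
int ex_lookup_register(ex_lookup_fn fn)
{
	for (size_t i = 0; i < ex_num_lookups; i++)
		if (ex_lookups[i] == fn)
			return 0;
	if (ex_num_lookups == EX_MAX_LOOKUPS)
		return -1;
	ex_lookups[ex_num_lookups++] = fn;
	return 0;
}

/* Ask each registered helper in turn until one recognizes the name. */
int ex_lookup(const char *name, int *node)
{
	for (size_t i = 0; i < ex_num_lookups; i++)
		if (ex_lookups[i](name, node))
			return 1;
	return 0;
}

/* Example helpers, as an application and a library might each provide. */
int ex_app_lookup(const char *name, int *node)
{
	if (strcmp(name, "bts") != 0)
		return 0;
	*node = 1;
	return 1;
}

int ex_lib_lookup(const char *name, int *node)
{
	if (strcmp(name, "fsm") != 0)
		return 0;
	*node = 2;
	return 1;
}
```

The application no longer needs to know the library's nodes: each side
registers its own helper and the dispatcher tries them in order.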
Change-Id: Ib69908d1c57f5bb721d5496e3b4a5258fca450e3
Previously, *_REPLY and ERROR messages were not explicitly handled. We would
send an error in response to them, which in turn would prompt the other
party to send an error as well, resulting in an infinite cycle.
Handle them explicitly by logging the message id and other relevant data.
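The rule that breaks the cycle can be expressed as a small predicate. The
enum and function below are hypothetical and only cover the message kinds
named in this description.

```c
#include <assert.h>

/* Hypothetical message kinds, limited to those discussed above. */
enum ex_msg_type {
	EX_GET,
	EX_SET,
	EX_GET_REPLY,
	EX_SET_REPLY,
	EX_ERROR
};

/* Decide whether a received message may be answered at all.
 * Replies and errors are only logged by the caller: answering them with an
 * error would make the peer answer again, and so on without end. */
int ex_may_respond(enum ex_msg_type type)
{
	switch (type) {
	case EX_GET_REPLY:
	case EX_SET_REPLY:
	case EX_ERROR:
		return 0;
	case EX_GET:
	case EX_SET:
		return 1;
	}
	return 0; /* unknown kinds are not answered in this sketch */
}
```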
Change-Id: Id96f3a2fc81fa4549f49556d83f062c6b2f59e28
Related: OS#1615
* remove unused ctrl_interface_connect() which is not part of public API
* add default read callback to osmo_ctrl_conn_alloc()
Change-Id: Iaa209e34a849ce0dfe2e29b482c3208ade1a32a4
Related: OS#1615
Add function for allocating CTRL connection to public headers and
replace call to previous static function with it. Add doxygen docs for
this function.
This is useful if we need to allocate a CTRL connection but don't need to
bind to any interface, e.g. when we act as a CTRL client.
Related: OS#1615
Change-Id: I522ed809cbebfd3d7dd08b4ed9137b39ff192e32
FreeBSD 11.0 uses clang version 3.8.0 which spits various warnings
during libosmocore compilation. Let's clean this up a bit.
Change-Id: Ic14572e6970bd0b8916604fabf807f1608fa07e5
Log 'CTRL at 1.2.3.4 5678' from ctrl_interface_setup*. All callers can now drop
any extra 'CTRL at 1.2.3.4 5678' logging.
Change-Id: If449d0514e3d0cc1b346d7452194d931aa090166
Allow getting either a particular
counter (e.g. rate_ctr.per_hour.e1inp.0.hdlc.abort) or an entire rate
counter group for a given index (e.g. rate_ctr.per_hour.e1inp.0).
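Judging from the two examples above, the variable name appears to follow the
layout rate_ctr.<interval>.<group>.<index>[.<counter>]. That layout, the
field sizes and the ex_* names are assumptions of this sketch, which splits
such a name with sscanf() and is not the parser used in libosmocore.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result of splitting a rate counter variable name. */
struct ex_ctr_path {
	char interval[16];
	char group[32];
	unsigned int idx;
	char counter[64]; /* empty when the whole group is requested */
};

/* Returns 0 on success, -1 if the variable does not match the layout. */
int ex_parse_ctr_path(const char *var, struct ex_ctr_path *out)
{
	int n;

	memset(out, 0, sizeof(*out));
	n = sscanf(var, "rate_ctr.%15[^.].%31[^.].%u.%63s",
		   out->interval, out->group, &out->idx, out->counter);
	/* 3 conversions: group request; 4 conversions: single counter */
	return n >= 3 ? 0 : -1;
}
```

Note that the counter name itself may contain dots ("hdlc.abort"), so
everything after the index is taken as the counter name.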
Change-Id: I2b0109536170f7b5388d3236df30b98f457aa98d
Fixes: OS#1730
Reviewed-on: https://gerrit.osmocom.org/274
Tested-by: Jenkins Builder
Reviewed-by: Harald Welte <laforge@gnumonks.org>
Make the ctrl interface bind address configurable, so that it may be made
available on other addresses than 127.0.0.1. The specific aim is to allow
running multiple osmo-nitbs alongside each other (commits in openbsc follow).
In e15ac060e7 we tried to fix the NuttX build, but we never included
"netinet/tcp.h" afterwards. The compiler warned about the unused "on"
parameter, which we didn't notice because of the other warnings...
Include config.h so we can see if there is a tcp.h and then
include it.
This fixes some compilation issues with libosmocore under NuttX,
particularly as some #defines are missing or some header files are
slightly different.
Sometimes a control interface command cannot be processed
and responded to immediately; instead, we need to process it asynchronously.
In order to support this, we introduce the 'ctrl_cmd_def', which
represents such a deferred command. It is created by the service
implementing the command using ctrl_cmd_def_make(), and a response is
later sent using ctrl_cmd_def_send().
ctrl_cmd_def_is_zombie() must be called to handle the case where
the control connection has disconnected/died between receiving the
command and sending the response.
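The zombie case can be pictured with a small model. The ex_* types and
functions are hypothetical simplifications (plain malloc, no actual reply);
they only illustrate why the check has to happen before a deferred response
is sent.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the received command. */
struct ex_request {
	int id;
};

/* Hypothetical deferred-command record. */
struct ex_deferred {
	struct ex_request *req; /* set to NULL when the connection goes away */
};

struct ex_deferred *ex_deferred_make(struct ex_request *req)
{
	struct ex_deferred *d = calloc(1, sizeof(*d));
	if (d)
		d->req = req;
	return d;
}

/* Called when the connection closes while a reply is still pending. */
void ex_conn_closed(struct ex_deferred *d)
{
	d->req = NULL;
}

/* Returns 1 (and frees d) if there is nobody left to reply to, else 0. */
int ex_deferred_is_zombie(struct ex_deferred *d)
{
	if (d->req)
		return 0;
	free(d);
	return 1;
}
```

A service would call the zombie check first and only build and send its
response when the check returns 0.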
The control interface user now only has to register a very short
node lookup function callback. This function is optional, and only
required if hierarchical command lookup should be supported.
Instead of using one flat talloc context (and one that is specific to
openbsc), we should attach the objects to whatever parent context they
are being used in.