Commit Graph

37 Commits

Author SHA1 Message Date
Harald Welte 4eb0f16e25 fsm: Add osmo_fsm_inst_broadcast_children()
This is a helper function to broadcast an event to all of the
children of a specified FSM instance.

Change-Id: I2ce398741a8672d7b7c4058d056f46e2fe7353c1
2020-12-21 15:45:45 +01:00
Neels Hofmeyr 344776d251 fsm.h: add missing include of logging.h
Change-Id: I783bf0eb40b674fb6a77f7673563fdf156975f5a
2019-11-24 19:59:35 +01:00
Neels Hofmeyr 988f6d72c5 add osmo_fsm_set_dealloc_ctx(), to help with use-after-free
This is a simpler and more general solution to the problem so far solved by
osmo_fsm_term_safely(true). This extends use-after-free fixes to arbitrary
functions, not only FSM instances during termination.

The aim is to defer talloc_free() until back in the main loop.

Rationale: I discovered an osmo-msc use-after-free crash from an invalid
message, caused by this pattern:

void event_action()
{
       osmo_fsm_inst_dispatch(foo, FOO_EVENT, NULL);
       osmo_fsm_inst_dispatch(bar, BAR_EVENT, NULL);
}

Usually, FOO_EVENT takes successful action, and afterwards we also notify bar.
However, in this particular case, FOO_EVENT caused failure, and the immediate
error handling directly terminated and deallocated bar. In such a case,
dispatching BAR_EVENT causes a use-after-free; this constituted a DoS vector
just from sending messages that cause *any* failure during the first event
dispatch.

Instead, when this is enabled, we do not deallocate 'bar' until event_action()
has returned back to the main loop.
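
The deferral idea can be sketched in plain C as follows. This is only an illustration under stated assumptions: it uses malloc()/free() rather than talloc, and struct deferred_pool, defer_free() and flush_deferred() are hypothetical names invented here, not libosmocore API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_DEFERRED 16 /* arbitrary capacity for this sketch */

/* Hypothetical holding area for objects whose release is postponed. */
struct deferred_pool {
    void *items[MAX_DEFERRED];
    size_t count;
};

/* Park obj instead of freeing it right away.
 * Returns 0 on success, -1 if the pool is full. */
int defer_free(struct deferred_pool *pool, void *obj)
{
    assert(pool != NULL);
    if (pool->count >= MAX_DEFERRED)
        return -1;
    pool->items[pool->count++] = obj;
    return 0;
}

/* Meant to be called from the main loop: release everything parked so far.
 * Returns the number of objects released. */
size_t flush_deferred(struct deferred_pool *pool)
{
    size_t i, n;

    assert(pool != NULL);
    n = pool->count;
    for (i = 0; i < n; i++) {
        free(pool->items[i]);
        pool->items[i] = NULL;
    }
    pool->count = 0;
    return n;
}
```

An object handed to defer_free() stays valid for any later code in the same event handler; it only goes away once the main loop calls flush_deferred().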

Test: duplicate fsm_dealloc_test.c using this, and print the number of items
deallocated in each test loop, to ensure the feature works. We also verify that
the deallocation safety works simply by fsm_dealloc_test.c not crashing.

We should probably follow up by refusing event dispatch and state transitions
for FSM instances that are terminating or already terminated:
see I0adc13a1a998e953b6c850efa2761350dd07e03a.

Change-Id: Ief4dba9ea587c9b4aea69993e965fbb20fb80e78
2019-10-29 16:46:04 +01:00
Harald Welte 7b74551b93 fsm: Allow millisecond granularity in osmo_fsm built-in timer
So far, the public API of osmo_fsm only allowed integral seconds as
timeout.  Let's change that to milli-seconds in order to cover more
use cases.

This introduces
* osmo_fsm_inst_state_chg_ms()
* osmo_fsm_inst_state_chg_keep_or_start_timer_ms()

Which both work exactly like their previous counterparts without the _ms
suffix - the only difference being that the timeout parameter is
specified in milli-seconds, not in seconds.

The value range for an unsigned long in milli-seconds even on a 32bit
platform extends to about 49.7 days.
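
As a quick check of that range, the arithmetic can be written out in C. The function name is made up for this illustration, and a 32-bit unsigned value is assumed (UINT32_MAX).

```c
#include <stdint.h>

/* Largest timeout, in days, that fits into a 32-bit unsigned
 * millisecond count. */
double max_timeout_days_u32_ms(void)
{
    const double ms_per_day = 1000.0 * 60.0 * 60.0 * 24.0; /* 86,400,000 */

    return (double)UINT32_MAX / ms_per_day;
}
```

With UINT32_MAX = 4294967295, this works out to roughly 49.7 days.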

This patch also removes the documentation notice about limiting the
maximum value to 0x7fffffff due to time_t signed-ness.  We don't use
time_t but unsigned long.

Change-Id: I35b330e460e80bb67376c77e997e464439ac5397
2019-06-05 10:01:20 +00:00
Neels Hofmeyr b480b74192 add identifier sanitation for setting FSM instance ids
We often compose FSM instance IDs from context information, for example placing
an MSISDN string or IP:port information in the FSM instance id, using
osmo_fsm_inst_update_id_f(). This fails if any characters are contained that
don't pass osmo_identifier_valid(). Hence it is the task of the caller to make
sure only characters allowed in an FSM id are applied.

Provide API to trivially allow this by replacing illegal chars:
- osmo_identifier_sanitize_buf(), with access to the same set of illegal
  characters defined in utils.c,
- osmo_fsm_inst_update_id_f_sanitize() implicitly replaces non-identifier
  chars.

This makes it easy to add strings like '192.168.0.1:2342' or '+4987654321' to
an FSM instance id, without adding string mangling to each place that sets an
id; e.g. replacing with '-' to yield '192-168-0-1:2342' or '-4987654321'.
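
The replacement step can be sketched as below. sanitize_id() is a hypothetical stand-in written for this illustration, not the libosmocore function, and the set of illegal characters is passed in by the caller; the real set is the one defined in utils.c.

```c
#include <stddef.h>
#include <string.h>

/* Replace every character of str that occurs in illegal with replace_with.
 * Returns the number of characters replaced. */
size_t sanitize_id(char *str, const char *illegal, char replace_with)
{
    size_t replaced = 0;

    if (!str || !illegal)
        return 0;
    for (; *str; str++) {
        if (strchr(illegal, *str)) {
            *str = replace_with;
            replaced++;
        }
    }
    return replaced;
}
```

Called with the illustrative set ".+" and '-' as the replacement, "192.168.0.1:2342" becomes "192-168-0-1:2342" and "+4987654321" becomes "-4987654321", matching the examples above.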

Change-Id: Ia40a6f3b2243c95fe428a080b938e11d8ab771a7
2019-04-12 01:00:16 +02:00
Neels Hofmeyr 1f9cc01861 fsm: support graceful osmo_fsm_inst_term() cascades
Add global flag osmo_fsm_term_safely() -- if set to true, enable the following
behavior:

Detect osmo_fsm_inst_term() occurring within osmo_fsm_inst_term():
- collect deallocations until the outermost osmo_fsm_inst_term() is done.
- call osmo_fsm_inst_free() *after* dispatching the parent event.

If a struct osmo_fsm_inst enters osmo_fsm_inst_term() while another is already
within osmo_fsm_inst_term(), do not directly deallocate it, but talloc-reparent
it to a separate talloc context, to be deallocated with the outermost FSM inst.

The effect is that all osmo_fsm_inst freed within an osmo_fsm_inst_term()
cascade will stay allocated until all osmo_fsm_inst_term() are complete and all
of them will be deallocated at the same time.
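
A rough sketch of the "free only when the outermost termination is done" idea, using a nesting counter. All names here (struct term_ctx, term_enter(), term_collect(), term_leave()) are hypothetical, and malloc()/free() stand in for the talloc reparenting described above.

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_PENDING 16 /* arbitrary capacity for this sketch */

struct term_ctx {
    unsigned int depth;         /* terminations currently in progress */
    void *pending[MAX_PENDING]; /* objects awaiting the outermost exit */
    size_t pending_count;
    size_t freed_total;         /* bookkeeping, for illustration only */
};

/* Mark the start of one (possibly nested) termination. */
void term_enter(struct term_ctx *ctx)
{
    ctx->depth++;
}

/* Queue obj for release; returns 0 on success, -1 if the queue is full. */
int term_collect(struct term_ctx *ctx, void *obj)
{
    if (ctx->pending_count >= MAX_PENDING)
        return -1;
    ctx->pending[ctx->pending_count++] = obj;
    return 0;
}

/* Mark the end of one termination; only the outermost one frees. */
void term_leave(struct term_ctx *ctx)
{
    size_t i;

    if (ctx->depth == 0 || --ctx->depth > 0)
        return;
    for (i = 0; i < ctx->pending_count; i++)
        free(ctx->pending[i]);
    ctx->freed_total += ctx->pending_count;
    ctx->pending_count = 0;
}
```

Objects collected at any nesting level remain allocated until the depth counter drops back to zero, at which point they are all released together.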

Mark the deferred deallocation state as __thread in an attempt to make cascaded
deallocation handling threadsafe.  Keep the enable/disable flag separate, so
that it is global and not per-thread.

The feature is showcased by fsm_dealloc_test.c: with this feature, all of those
wild deallocation scenarios succeed.

Make fsm_dealloc_test a normal regression test in testsuite.at.

Rationale:

It is difficult to gracefully handle deallocations of groups of FSM instances
that reference each other. As soon as one child dispatching a cleanup event
causes its parent to deallocate before fsm.c was ready for it, deallocation
will hit a use-after-free. Before this patch, by using parent_term events and
distinct "terminating" FSM states, parent/child FSMs can be taught to wait for
all children to deallocate before deallocating the parent. But as soon as a
non-child / non-parent FSM instance is involved, or actually any other
cleanup() action that triggers parent FSMs or parent talloc contexts to become
unused, it is near impossible to think of all possible deallocation events
ricocheting, and to avoid running into freeing FSM instances that were still in
the middle of osmo_fsm_inst_term(), or FSM instances to enter
osmo_fsm_inst_term() more than once. This patch makes deallocation of "all
possible" setups of complex cross referencing FSM instances easy to handle
correctly, without running into use-after-free or double free situations, and,
notably, without changing calling code.

Change-Id: I8eda67540a1cd444491beb7856b9fcd0a3143b18
2019-04-11 05:36:36 +00:00
Neels Hofmeyr 3b414a4adc fsm: add flag to ensure osmo_fsm_inst_term() happens only once
To prevent re-entering osmo_fsm_inst_term() twice for the same osmo_fsm_inst,
add flag osmo_fsm_inst.proc.terminating. osmo_fsm_inst_term() sets this to
true, or exits if it already is true.
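
The guard amounts to a simple re-entrancy flag. A minimal sketch, with a hypothetical struct demo_inst and demo_term() standing in for the real structures:

```c
#include <stdbool.h>

struct demo_inst {
    bool terminating;           /* set once termination has begun */
    unsigned int cleanup_calls; /* counts how often cleanup actually ran */
};

/* Returns true if this call performed the termination,
 * false if a termination was already under way. */
bool demo_term(struct demo_inst *inst)
{
    if (inst->terminating)
        return false;
    inst->terminating = true;
    inst->cleanup_calls++; /* cleanup work would happen here */
    return true;
}
```

A second call for the same instance returns early, so the cleanup work runs at most once.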

Update fsm_dealloc_test.err for illustration. It is not relevant for unit
testing yet, just showing the difference.

Change-Id: I0c02d76a86f90c49e0eae2f85db64704c96a7674
2019-04-11 05:36:36 +00:00
Neels Hofmeyr d4b79c8772 fsm: add osmo_fsm_inst_state_chg_keep_or_start_timer()
During FSM design for osmo-msc, I noticed that the current behavior that
keep_timer=true doesn't guarantee a running timer can make FSM design a bit
complex, especially when using osmo_tdef for timeout definitions.

A desirable keep_timer=true behavior is one that keeps the previous timer
running, but starts a timer if no timer is running yet.

The simplest example is: a given state repeatedly transitions back to itself,
but wants to set a timeout only on first entering, without restarting the
timeout on re-entering.

Another example is a repeated transition between two or more states, where the
first time we enter this group a timeout should start, but it should not
restart from scratch on every transition.
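
The desired semantics can be sketched as follows. struct demo_timer and keep_or_start_timer() are hypothetical and only model the decision; they do not reflect how osmo_timer is actually driven.

```c
#include <stdbool.h>

struct demo_timer {
    bool running;
    unsigned long remaining_secs;
};

/* Leave a running timer untouched; otherwise start one with timeout_secs.
 * Returns true if a new timer was started. */
bool keep_or_start_timer(struct demo_timer *t, unsigned long timeout_secs)
{
    if (t->running)
        return false; /* keep the previous timer and its remaining time */
    t->running = true;
    t->remaining_secs = timeout_secs;
    return true;
}
```

On the first transition the timer is started; on every later transition within the same group of states the call is a no-op, so the timeout keeps counting down from where it was.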

When using osmo_tdef timeout definitions for this, so far separate meaningless
states have to be introduced that merely set a fixed timeout.

To simplify, add osmo_fsm_inst_state_chg_keep_or_start_timer(), and use this in
osmo_tdef_fsm_inst_state_chg() when both keep_timer == true *and* T != 0.

In tdef_test.ok, the changes show that on first entering state L, the previous
T=1 is now kept with a large remaining timeout. When entering state L from O,
where no timer was running, this time L's T123 is started.

Change-Id: Id647511a4b18e0c4de0e66fb1f35dc9adb9177db
2019-03-07 23:10:21 +01:00
Neels Hofmeyr 5734bff3b0 represent negative T-timers as Osmocom-specific X-timers
fi->T values are int, i.e. can be negative. Do not log them as unsigned, but
define a distinct timer class "Xnnnn" for negative T values: i.e. for T == -1,
print "Timeout of X1" instead of "Timeout of T4294967295".
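
A sketch of the naming rule. format_timer_name() is a hypothetical helper written for this illustration; it is not the OSMO_T_FMT macro pair introduced by this commit.

```c
#include <stddef.h>
#include <stdio.h>

/* Write "T<n>" for T >= 0 and "X<n>" for T < 0 into buf.
 * Returns what snprintf() returns. */
int format_timer_name(char *buf, size_t len, int T)
{
    /* Widen before negating so that INT_MIN cannot overflow. */
    long long n = T;

    if (n >= 0)
        return snprintf(buf, len, "T%lld", n);
    return snprintf(buf, len, "X%lld", -n);
}
```

With this rule, T == -1 is rendered as "X1" and T == 3210 as "T3210".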

The negative T timer number space is useful to distinguish freely invented
timers from proper 3GPP defined T numbers. So far I was using numbers like
T993210 or T9999 for invented T, but X1, X2 etc. is a better solution. This way
we can make sure to not accidentally define an invented timer number that
actually collides with a proper 3GPP specified timer number that the author was
not aware of at the time of writing.

Add OSMO_T_FMT and OSMO_T_FMT_ARGS() macros as standardized timer number print
format. Use that in fsm.c, tdef_vty.c, and adjust vty tests accordingly.

Mention the two timer classes in various API docs and VTY online-docs.

Change-Id: I3a59457623da9309fbbda235fe18fadd1636bff6
2019-03-06 00:51:15 +01:00
Neels Hofmeyr 050f2d3259 log: fsm: allow logging the timeout on state change
Add a flag that adds timeout info to osmo_fsm_inst state change logging.

To not affect unit testing, make this an opt-in feature that is disabled by
default -- mostly because osmo_fsm_inst_state_chg_keep_timer() will produce
non-deterministic logging depending on timing (logs remaining time).

Unit tests that don't verify log output and those that use fake time may also
enable this feature. Do so in fsm_test.c.

The idea is that in due course we will add osmo_fsm_log_timeouts(true) calls to
all of our production applications' main() initialization.

Change-Id: I089b81021a1a4ada1205261470da032b82d57872
2019-02-26 20:57:58 +00:00
Neels Hofmeyr 56632b69a7 osmo_fsm_state_name: make robust against NULL fi
Change-Id: I61d4f7dfada2763948f330745ac886405d889a12
2019-01-29 00:59:32 +01:00
Neels Hofmeyr 691ba52194 add LOGPFSMSL(), LOGPFSMSLSRC()
Using an FSM instance's logging context is very useful. Sometimes it makes sense
to log something on a different logging category than the FSM definition's
default category.

For example, an MSC conn has aspects concerning MM, CC, RR, MGCP, ..., and
currently all of those log on DMM.

This came up in I358cfbaf0f44f25148e8b9bafcb9257b1952b35a, where I want to log
an MGCP event using a ran_conn context, and used the conn->fi->id. That of
course omits context like the current conn FSM state...

I remember at least one other place where I recently added logging using some
fi->id as context, so it might turn out useful in various places.

Change-Id: I11b182a03f5ecb6df7cd8f260757d3626c8e945d
2019-01-03 00:11:49 +01:00
Neels Hofmeyr b0b39af1bc LOGPFSM*: guard against fi == NULL
The LOGPFSM macros are in such wide use that they should guard against a NULL
fi pointer. In case of NULL, default to subsys = DLGLOBAL, loglevel =
LOGL_ERROR and state = "fi=NULL".
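
The NULL guard for the state string can be sketched like this. struct demo_fi and demo_state_name() are hypothetical simplifications; the real macros also choose the subsystem and log level in the same way.

```c
#include <stddef.h>

struct demo_fi {
    const char *state_name;
};

/* Return the instance's state name, or a fixed marker if fi is NULL. */
const char *demo_state_name(const struct demo_fi *fi)
{
    return fi ? fi->state_name : "fi=NULL";
}
```

A logging macro built on such a helper can be invoked with a NULL instance without dereferencing it.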

Change-Id: I9eaf8b7e2cf1e450ae626cb2fc928862008f6233
2019-01-03 00:11:49 +01:00
Neels Hofmeyr b11ff7c537 add API doc for LOGPFSM* macros
Change-Id: I3bf6500889aa58195f50a726dec0876c0c2baec3
2019-01-03 00:11:49 +01:00
Neels Hofmeyr e5bde90532 define LOGPFSM fmt only once, in LOGPFSMLSRC
Instead of duplicating the fmt and args in LOGPFSML and LOGPFSMLSRC, rather
make LOGPFSML invoke LOGPFSMLSRC with __FILE__ and __LINE__.

This is a cosmetic preparation for more tweaks coming up.

Change-Id: I2f23c57ebfdb5355919c06ac5ded7732e3b17a97
2018-12-23 10:08:55 +00:00
Neels Hofmeyr 983dcb9af3 use __FILE__, not __BASE_FILE__
The intention was to use the file's basename, but __BASE_FILE__ means "the root
file that is being parsed and contains #include statements".

If we had a function using __BASE_FILE__ and that was defined in an #included
file, __BASE_FILE__ would indicate the first file where the #include is, and
not the file where the function is defined. __BASE_FILE__ works for us because
we don't ever include function definitions that log something, so __BASE_FILE__
always coincides with __FILE__ for our logging; but still __BASE_FILE__ is
semantically the wrong constant.

Related: OS#2740
Change-Id: Ibc1d3746f1876ac42d6b1faf0e5f83bd2283cdcc
2018-08-20 20:23:18 +00:00
Harald Welte 3385e4b0c4 fsm: Change semantics of LOGPFSML() log-level
The general idea about each osmo_fsm_instance having a separate
log-level was to be able to selectively increase/show/enable logging
for some FSM instances (e.g. of a particular subscriber) while
maintaining normal logging verbosity for all other instances of the
same FSM.

The introduction of LOGPFSML() in Change-Id
If295fdabb3f31a0fd9490d1e0df57794c75ae547 broke that idea, as it would
use a compile-time log level, irrespective of the
osmo_fsm_inst.log_level setting of the given instance.

Let's combine the two:
Use the explicit level stated at LOGPFSML(), _unless_ this instance
has a higher log_level configured.

This way, all FSMs should normally be created with
osmo_fsm_inst.log_level == LOGL_DEBUG.  At that point LOGPFSM()
statements would be rendered at debug level, typically below the
threshold of most logging configurations.

Code that has explicit higher log levels like LOGPFSML(fi, LOGL_ERROR)
would always be printed, as it is an error message.

And if we now increase the osmo_fsm_inst.log_level, then even the normal
LOGPFSM() statements would suddenly be logged at that higher level,
selectively increasing log verbosity - like originally intended.
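
The combined rule boils down to taking the higher of the two levels. A sketch with hypothetical names; the numeric values are local to this example and merely ordered from least to most severe.

```c
/* Illustrative levels, ordered by increasing severity. */
enum demo_loglevel {
    DEMO_LOGL_DEBUG = 1,
    DEMO_LOGL_INFO = 3,
    DEMO_LOGL_NOTICE = 5,
    DEMO_LOGL_ERROR = 7
};

/* Use the explicitly requested level unless the instance is configured
 * with a higher one. */
int effective_log_level(int inst_level, int explicit_level)
{
    return inst_level > explicit_level ? inst_level : explicit_level;
}
```

An instance left at the debug level logs each statement at the level the statement asks for, while raising the instance level to, say, notice lifts its debug statements to notice as well.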

Change-Id: I1820f04d0c6f5d5ff08eb95b8c0e88764534491a
2018-06-08 10:22:41 +00:00
Neels Hofmeyr 407df02e7c add osmo_fsm_inst_state_chg_keep_timer()
Change-Id: I3c0e53b846b2208bd201ace99777f2286ea39ae8
2018-05-31 21:01:33 +00:00
Neels Hofmeyr a64c45a03e add osmo_fsm_inst_update_id_f()
In the osmo-msc, I would like to set the subscr conn FSM identifier by a string
format, to include the type of Complete Layer 3 that is taking place. I could
each time talloc a string and free it again. This API is more convenient.

From osmo_fsm_inst_update_id(), call osmo_fsm_inst_update_id_f() with "%s" (or
pass NULL).

Put the name updating into separate static update_name() function to clarify.

Adjust the error message for erratic ID: don't say "allocate", it might be from
an update. Adjust test expectation.

Change-Id: I76743a7642f2449fd33350691ac8ebbf4400371d
2018-04-09 17:57:15 +02:00
Philipp Maier d1f5793b5e fsm: allow graceful exit on FSM termination
The function _osmo_fsm_inst_term() terminates all child FSMs before
it calls fi->fsm_cleanup(). This prevents the cleanup callback from
performing last actions on the child FSMs (e.g.
osmo_fsm_inst_unlink_parent()).

- Since moving the cleanup callback to the beginning of the function
  would alter the termination behavior and possibly cause malfunction
  in already existing implementations that use osmo_fsm, a new
  optional callback that is called immediately at the beginning of
  the termination process is added.

Change-Id: I0fdda9fe994753f975a658c0f3fb3615949cc8bb
Closes: OS#2915
2018-02-19 08:30:21 +00:00
Daniel Willmann b0c43a6063 fsm: Add a function to change the FSM instance ID later
Sometimes we want to create an FSM instance before we know its name. In
that case we should be able to update the id later.

Change-Id: Ic216e5b11d4440f8e106a297714f4f06c1152945
2018-02-15 18:03:58 +00:00
Philipp Maier 2a06a491b1 fsm: add functions for unlinking and changing parents
At the moment it is not possible to unlink a child from
its parent, nor is it possible to assign a new parent to a
child FSM.

- osmo_fsm_inst_unlink_parent():
  Make it possible to unlink children from a parent.

- osmo_fsm_inst_change_parent():
  Make it possible to change the parent of a child.

Change-Id: I6d18cbd4ada903cf3720b3ad2a89fc643085beef
2018-01-24 12:03:28 +01:00
Neels Hofmeyr 17518fe393 doxygen: unify use of \file across the board
Considering the various styles and implications found in the sources, edit
scores of files to follow the same API doc guidelines around the doxygen
grouping and the \file tag.

Many files now show a short description in the generated API doc that was so
far only available as C comment.

The guidelines and the reasoning behind them are documented at
https://osmocom.org/projects/cellular-infrastructure/wiki/Guidelines_for_API_documentation

In some instances, remove file comments and add to the corresponding group
instead, to be shared among several files (e.g. bitvec).

Change-Id: Ifa70e77e90462b5eb2b0457c70fd25275910c72b
2017-06-23 00:18:23 +00:00
Neels Hofmeyr 87e4550585 doxygen: enable AUTOBRIEF, drop \brief
Especially for short descriptions, it is annoying to have to type \brief for
every single API doc.

Drop all \brief and enable the AUTOBRIEF feature of doxygen, which always takes
the first sentence of an API doc as the brief description.

Change-Id: I11a8a821b065a128108641a2a63fb5a2b1916e87
2017-06-23 00:18:22 +00:00
Harald Welte 4585e6755d osmo_fsm: Lookup functions to find FSM Instance by name or ID
Introduce two lookup helper functions to resolve a fsm_instance based on
the FSM and name or ID.  Also, add related test cases.

Change-Id: I707f3ed2795c28a924e64adc612d378c21baa815
2017-04-16 17:28:23 +02:00
Neels Hofmeyr fca04bba5a fsm: convenience: add inline osmo_fsm_inst_state_name()
Change-Id: If9a6ecc4d6e2beaf716569e9a6053d73488e860b
2017-03-01 23:52:55 +00:00
Harald Welte 8808bb49a2 Add osmo_fsm_find_by_name() and avoid registering FSM with same name
This addresses a FIXME in the fsm.c code: osmo_fsm_register() should
fail in case an FSM with the given name already exists.
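
A sketch of "look up by name, refuse duplicates". The registry structure and the demo_ functions are hypothetical, and returning -EEXIST for a duplicate is a choice made for this example.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_FSMS 8 /* arbitrary capacity for this sketch */

struct demo_registry {
    const char *names[MAX_FSMS];
    size_t count;
};

/* Return the index of name in the registry, or -1 if it is not present. */
int demo_find_by_name(const struct demo_registry *reg, const char *name)
{
    size_t i;

    for (i = 0; i < reg->count; i++) {
        if (strcmp(reg->names[i], name) == 0)
            return (int)i;
    }
    return -1;
}

/* Register name; fail if an entry with the same name already exists. */
int demo_register(struct demo_registry *reg, const char *name)
{
    if (demo_find_by_name(reg, name) >= 0)
        return -EEXIST;
    if (reg->count >= MAX_FSMS)
        return -ENOMEM;
    reg->names[reg->count++] = name;
    return 0;
}
```

Registering the same name twice leaves the registry unchanged and reports an error on the second attempt.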

Change-Id: I5fd882939859c79581eba70c14cbafd64560b583
2017-01-07 12:20:59 +00:00
Neels Hofmeyr c014f606d0 fsm: factor out osmo_fsm_inst_term_children() from osmo_fsm_inst_term()
osmo_fsm_inst_term() has code for safe child removal, publish that part as
osmo_fsm_inst_term_children(); also use from osmo_fsm_inst_term().

As with osmo_fsm_inst_term(), add osmo_fsm_inst_term_children() macro to pass
the caller's source file and line to new _osmo_fsm_inst_term_children().

Rationale: in openbsc's VLR, I want to discard child FSMs when certain events
are handled. I could keep a pointer to each one, or simply iterate all
children, making the code a lot simpler in some places.

(Unfortunately, the patch may be displayed suboptimally. This really only moves
the children-loop to a new function, replaces it with a call to
_osmo_fsm_inst_term_children(fi, OSMO_FSM_TERM_PARENT, NULL, file, line) and
drops two local iterator variables. No other code changes are made, even though
the diff may show large removal + addition chunks)

Change-Id: I8dac1206259cbd251660f793ad023aaa1dc705a2
2016-12-24 17:11:52 +00:00
Neels Hofmeyr eeacf906dd fsm: move LOGPFSMSRC and LOGPFSMLSRC to .h
LOGPFSM and LOGPFSML are in the header file, put the *SRC variants also there
so users of the osmo_fsm_inst API may conveniently create own functions that
log the caller's source file and line.

Very useful if many action functions call the same event dispatching function,
like foo_fsm_done(), and one needs to know which of the callers to debug.

Change-Id: I39447b1d15237b28f88d8c5f08d82c764679dc80
2016-12-24 17:11:52 +00:00
Neels Hofmeyr 42b59c1bf0 fsm api doc: fix typo in doxygen marker '\breif'
Change-Id: I5c57e35b29d50cb409becada6b9b120ce5210ae0
2016-12-20 12:39:32 +01:00
Neels Hofmeyr 5c5c78aacd add value strings for enum osmo_fsm_term_cause and use for logging
Change-Id: Iaf63d3cadb0d46bf454e3314ebb439240cafd834
2016-12-15 11:35:56 +01:00
Neels Hofmeyr 725698a4f1 fsm: log caller's source for events and state changes, not fsm.c lines
When looking at log output, it is not interesting to see that a state
transition's petty details are implemented in fsm.c. Rather log the *caller's*
source file and line that caused an event, state change and cascading events.

To that end, introduce LOGPSRC() absorbing the guts of LOGP(), to be able to
explicitly pass the source file and line information.

Prepend an underscore to the function names of osmo_fsm_inst_state_chg(),
osmo_fsm_inst_dispatch() and osmo_fsm_inst_term(), and add file and line
arguments to them. Provide the previous names as macros that insert the
caller's __BASE_FILE__ and __LINE__ constants for the new arguments. Hence no
calling code needs to be changed.
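
The wrapper-macro pattern can be sketched as below. demo_dispatch_src(), demo_dispatch() and demo_last_call are hypothetical names, and this sketch uses __FILE__ where the commit text mentions __BASE_FILE__.

```c
#include <stddef.h>

/* Records where the most recent dispatch came from. */
struct demo_call_site {
    const char *file;
    int line;
    int event;
};

struct demo_call_site demo_last_call;

/* The "real" function takes the caller's source position explicitly. */
int demo_dispatch_src(int event, const char *file, int line)
{
    demo_last_call.file = file;
    demo_last_call.line = line;
    demo_last_call.event = event;
    return 0;
}

/* Callers keep using the short name; the macro supplies file and line
 * of the call site. */
#define demo_dispatch(event) \
    demo_dispatch_src((event), __FILE__, __LINE__)
```

Because the macro expands at the call site, the recorded file and line are those of the caller rather than of the file implementing the dispatch.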

In fsm.c, add LOGPFSMSRC to call LOGPSRC, and add LOGPFSMLSRC, and use them in
above _osmo_fsm_inst_* functions.

In addition, in _osmo_fsm_inst_term(), pass the caller's source file and line
on to nested event dispatches, so showing where a cascade originated from.

Change-Id: Iae72aba7bbf99e19dd584ccabea5867210650dcd
2016-12-14 18:00:52 +01:00
Neels Hofmeyr 6a13e7f563 fsm: add LOGPFSML to pass explicit logging level
Provide one central LOGPFSML to print FSM information, take the FSM logging
subsystem from the FSM instance but use an explicitly provided log level
instead of the FSM's default level.

Use to replace some, essentially, duplications of the LOGPFSM macro.

In effect, the fsm_test's expected error changes, since the previous code dup
for logging events used round braces to indicate the fi's state, while the
central macro uses curly braces.

Change-Id: If295fdabb3f31a0fd9490d1e0df57794c75ae547
2016-12-14 17:56:48 +01:00
Neels Hofmeyr dda5e7991c doc: fsm timer_cb: explain return value
Change-Id: Ic6fbe95737862ed8b8de78058989c8b2ae330006
2016-12-09 16:10:46 +01:00
Max 8b25a3f5c3 Add osmo_fsm_unregister() to header
Previously the function was defined but not exposed, so there was a way to
register an FSM but no way to unregister it.

Change-Id: I2e749d896009784b77d6d5952fcc38e1c131db2b
2016-11-02 08:56:29 +00:00
Harald Welte f627c0f0af fsm: Introduce default time-out handling
If an FSM doesn't specify any timer_cb, simply terminate the FSM by
default on time-out.  This is a reasonable default for most cases, and
avoids copy+pasting a one-line timer_cb function in every FSM.

Also, even if there is a timer_cb, let it have a return value to decide
if the core should terminate after return from timer_cb or not.
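
The resulting decision logic might look like the sketch below. struct demo_fsm and the helper names are hypothetical, and treating a non-zero return value as "terminate" is an assumption of this example.

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_fsm {
    int (*timer_cb)(void *priv); /* optional; may be NULL */
};

/* Decide whether the core should terminate the instance after a timeout. */
bool should_terminate_on_timeout(const struct demo_fsm *fsm, void *priv)
{
    if (!fsm->timer_cb)
        return true;                 /* default: terminate */
    return fsm->timer_cb(priv) != 0; /* callback decides (assumed: non-zero = terminate) */
}

/* Example callbacks for the two outcomes. */
int keep_running_cb(void *priv)
{
    (void)priv;
    return 0;
}

int give_up_cb(void *priv)
{
    (void)priv;
    return 1;
}
```

An FSM without any timer_cb gets the default behavior, while one that supplies a callback can keep the instance alive by returning zero.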

Change-Id: I0461a9593bfb729c82b7d1d1cf9f30b1079d0212
2016-06-27 15:17:53 +02:00
Harald Welte 136e73764e Add Finite State Machine abstraction code
This code is supposed to formalize some of the state machine handling in
Osmocom code.

Change-Id: I0b0965a912598c1f6b84042a99fea9d522642466
Reviewed-on: https://gerrit.osmocom.org/163
Tested-by: Jenkins Builder
Reviewed-by: Harald Welte <laforge@gnumonks.org>
2016-06-16 21:43:45 +00:00