Commit Graph

32 Commits

Author SHA1 Message Date
Vadim Yanitskiy 477d99f7c9 tests/fsm: also test .onenter and .onleave callbacks
Extend the existing testing coverage to check per-state enter/leave
callbacks.  An interesting behavior can be seen from the test output:
when allocating an FSM instance, the .onenter callback is not being
called for the initial FSM state (ST_NULL).  Likewise, the .onleave
callback is not being called when free()ing an FSM instance.

Change-Id: I22edcf91375a09854f0dab1e2e02e034629310f7
2023-12-27 05:10:11 +07:00
Harald Welte e61d459cef Support building with -Werror=strict-prototypes / -Werror=old-style-definition
Unfortunately "-std=c99" is not sufficient to make gcc ignore code that
uses constructs of earlier C standards, which were abandoned in C99.

See https://lwn.net/ml/fedora-devel/Y1kvF35WozzGBpc8@redhat.com/ for
some related discussion.

Change-Id: I84fd99442d0cc400fa562fa33623c142649230e2
2022-11-03 12:44:28 +01:00
Vadim Yanitskiy 76bdbd4b2b fsm: fix state_chg(): pass microseconds to osmo_timer_schedule()
As was demonstrated in [1], osmo_fsm_inst_state_chg_ms() is broken.
The problem is in state_chg(): this function is passing *milli*seconds
to osmo_timer_schedule(), while it expects *micro*seconds.

Change-Id: Ib0b6c3bdb56e4279df9e5ba7db16841645c452aa
Related: [1] I5a35730a8448292b075aefafed897353678250f9
Fixes: I35b330e460e80bb67376c77e997e464439ac5397
Fixes: OS#5622
2022-07-19 15:32:36 +07:00
Vadim Yanitskiy d2964a94c2 fsm: add unit tests verifying state timeout s/ms accuracy
This change adds test_state_chg_Ts() and test_state_chg_Tms() checking
state timeout s/ms accuracy when using osmo_fsm_inst_state_chg() and
osmo_fsm_inst_state_chg_ms() calls, respectively.

test_state_chg_Tms() demonstrates that osmo_fsm_inst_state_chg_ms()
is not working as expected: the timeout fires earlier than expected.

Change-Id: I5a35730a8448292b075aefafed897353678250f9
Related: OS#5622
2022-07-19 06:09:05 +07:00
Pau Espin 01e0d3e7fd Drop use of log_set_print_filename() API inside libosmocore
Let's use log_set_print_filename2() API instead, which has less ackward
behavior implications like changing the print status of category-hex.

Related: OS#5034
Change-Id: Ifc78e1dcba5baf0b41f6ccbbbd1e3f06552d73da
2021-02-20 17:13:58 +00:00
Pau Espin 690b661fbc tests: Set print_category values explicitly
This will alow easily changing default values for print_category vs
print_category_hex later.

In any case, every test relying on logging output validation should
always explicitly state the config to avoid issues in the future if
default values change.

Related: OS#5034
Change-Id: If29b40557d5c2bcda04b964f344070bad58d8f28
2021-02-20 17:13:58 +00:00
Pau Espin eb028fa1df tests/fsm_test.c: Disable use color in logging output
This should avoid other problems in the future.

Change-Id: I81368578c0830477d381566a54671fdde6067b23
2020-07-30 20:33:24 +00:00
Harald Welte ae5016f193 Check for osmo_fsm_register() error return value
Change-Id: Idbc1557739b2a253b73914e6f1f18a6d169d882e
2019-12-01 13:45:03 +01:00
Neels Hofmeyr b1bbe24c09 fsm: refuse state chg and events after term
Refuse state changes and event dispatch for FSM instances that are already
terminating.

It is assumed that refusing state changes and events after FSM termination is
seen as the sane expected behavior, hence this change in behavior is merged
without being configurable.

There is no fallout in current Osmocom code trees. fsm_dealloc_test needs a
changed expected output, since it is explicitly creating complex FSM structures
that terminate. Currently no other C test in Osmocom code needs adjusting.

Rationale:

Where multiple FSM instances are collaborating (like in osmo-bsc or osmo-msc),
a terminating FSM instance often causes events to be dispatched back to itself,
or causes state changes in FSM instances that are already terminating. That is
hard to avoid, since each FSM instance could be a cause of failure, and wants
to notify all the others of that, which in turn often choose to terminate.

Another use case: any function that dispatches events or state changes to more
than one FSM instance must be sure that after the first event dispatch, the
second FSM instance is in fact still allocated. Furthermore, if the second FSM
instance *has* terminated from the first dispatch, this often means that no
more actions should be taken. That could be done by an explicit check for
fsm->proc.terminating, but a more general solution is to do this check
internally in fsm.c.

In practice, I need this to avoid a crash in libosmo-mgcp-client, when an
on_success() event dispatch causes the MGCP endpoint FSM to deallocate. The
earlier dealloc-in-main-loop patch fixed part of it, but not all.

Change-Id: Ia81a0892f710db86bd977462730b69f0dcc78f8c
2019-10-29 17:28:30 +01:00
Neels Hofmeyr 988f6d72c5 add osmo_fsm_set_dealloc_ctx(), to help with use-after-free
This is a simpler and more general solution to the problem so far solved by
osmo_fsm_term_safely(true). This extends use-after-free fixes to arbitrary
functions, not only FSM instances during termination.

The aim is to defer talloc_free() until back in the main loop.

Rationale: I discovered an osmo-msc use-after-free crash from an invalid
message, caused by this pattern:

void event_action()
{
       osmo_fsm_inst_dispatch(foo, FOO_EVENT, NULL);
       osmo_fsm_inst_dispatch(bar, BAR_EVENT, NULL);
}

Usually, FOO_EVENT takes successful action, and afterwards we also notify bar.
However, in this particular case, FOO_EVENT caused failure, and the immediate
error handling directly terminated and deallocated bar. In such a case,
dispatching BAR_EVENT causes a use-after-free; this constituted a DoS vector
just from sending messages that cause *any* failure during the first event
dispatch.

Instead, when this is enabled, we do not deallocate 'foo' until event_action()
has returned back to the main loop.

Test: duplicate fsm_dealloc_test.c using this, and print the number of items
deallocated in each test loop, to ensure the feature works. We also verify that
the deallocation safety works simply by fsm_dealloc_test.c not crashing.

We should probably follow up by refusing event dispatch and state transitions
for FSM instances that are terminating or already terminated:
see I0adc13a1a998e953b6c850efa2761350dd07e03a.

Change-Id: Ief4dba9ea587c9b4aea69993e965fbb20fb80e78
2019-10-29 16:46:04 +01:00
Neels Hofmeyr d28aa0c2f1 fsm_dealloc_test: no need for ST_DESTROYING
A separate ST_DESTROYING state originally helped with certain deallocation
scenarios. But now that fsm.c avoids re-entering osmo_fsm_inst_term() twice and
gracefully handles FSM instance deallocations for termination cascades, it is
actually just as safe without a separate ST_DESTROYING state. ST_DESTROYING was
used to flag deallocation and prevent entering osmo_fsm_inst_term() twice,
which works only in a very limited range of scenarios.

Remove ST_DESTROYING from fsm_dealloc_test.c to show that all tested scenarios
still clean up gracefully.

Change-Id: I05354e6cad9b82ba474fa50ffd41d481b3c697b4
2019-04-11 05:36:36 +00:00
Neels Hofmeyr 1f9cc01861 fsm: support graceful osmo_fsm_inst_term() cascades
Add global flag osmo_fsm_term_safely() -- if set to true, enable the following
behavior:

Detect osmo_fsm_inst_term() occuring within osmo_fsm_inst_term():
- collect deallocations until the outermost osmo_fsm_inst_term() is done.
- call osmo_fsm_inst_free() *after* dispatching the parent event.

If a struct osmo_fsm_inst enters osmo_fsm_inst_term() while another is already
within osmo_fsm_inst_term(), do not directly deallocate it, but talloc-reparent
it to a separate talloc context, to be deallocated with the outermost FSM inst.

The effect is that all osmo_fsm_inst freed within an osmo_fsm_inst_term()
cascade will stay allocated until all osmo_fsm_inst_term() are complete and all
of them will be deallocated at the same time.

Mark the deferred deallocation state as __thread in an attempt to make cascaded
deallocation handling threadsafe.  Keep the enable/disable flag separate, so
that it is global and not per-thread.

The feature is showcased by fsm_dealloc_test.c: with this feature, all of those
wild deallocation scenarios succeed.

Make fsm_dealloc_test a normal regression test in testsuite.at.

Rationale:

It is difficult to gracefully handle deallocations of groups of FSM instances
that reference each other. As soon as one child dispatching a cleanup event
causes its parent to deallocate before fsm.c was ready for it, deallocation
will hit a use-after-free. Before this patch, by using parent_term events and
distinct "terminating" FSM states, parent/child FSMs can be taught to wait for
all children to deallocate before deallocating the parent. But as soon as a
non-child / non-parent FSM instance is involved, or actually any other
cleanup() action that triggers parent FSMs or parent talloc contexts to become
unused, it is near impossible to think of all possible deallocation events
ricocheting, and to avoid running into freeing FSM instances that were still in
the middle of osmo_fsm_inst_term(), or FSM instances to enter
osmo_fsm_inst_term() more than once. This patch makes deallocation of "all
possible" setups of complex cross referencing FSM instances easy to handle
correctly, without running into use-after-free or double free situations, and,
notably, without changing calling code.

Change-Id: I8eda67540a1cd444491beb7856b9fcd0a3143b18
2019-04-11 05:36:36 +00:00
Neels Hofmeyr 3b414a4adc fsm: add flag to ensure osmo_fsm_inst_term() happens only once
To prevent re-entering osmo_fsm_inst_term() twice for the same osmo_fsm_inst,
add flag osmo_fsm_inst.proc.terminating. osmo_fsm_inst_term() sets this to
true, or exits if it already is true.

Update fsm_dealloc_test.err for illustration. It is not relevant for unit
testing yet, just showing the difference.

Change-Id: I0c02d76a86f90c49e0eae2f85db64704c96a7674
2019-04-11 05:36:36 +00:00
Neels Hofmeyr 223d66a414 add fsm_dealloc_test.c
Despite efforts to properly handle "GONE" events and entering a ST_DESTROYING
only once, so far this test runs straight into a heap use-after-free. With
current fsm.c, it is hard to resolve the situation with the objects named
"other" also causing deallocations besides the FSM instance parent/child
relations.

For illustration, add an "expected" test output file fsm_dealloc_test.err,
making this pass will follow in a subsequent patch.

Change-Id: If801907c541bca9f524c9e5fd22ac280ca16979a
2019-04-11 05:36:36 +00:00
Neels Hofmeyr 050f2d3259 log: fsm: allow logging the timeout on state change
Add a flag that adds timeout info to osmo_fsm_inst state change logging.

To not affect unit testing, make this an opt-in feature that is disabled by
default -- mostly because osmo_fsm_inst_state_chg_keep_timer() will produce
non-deterministic logging depending on timing (logs remaining time).

Unit tests that don't verify log output and those that use fake time may also
enable this feature. Do so in fsm_test.c.

The idea is that in due course we will add osmo_fsm_log_timeouts(true) calls to
all of our production applications' main() initialization.

Change-Id: I089b81021a1a4ada1205261470da032b82d57872
2019-02-26 20:57:58 +00:00
Neels Hofmeyr bd5a1dc84f osmo_fsm_inst_state_chg(): set T also for zero timeout
Before this patch, if timeout_secs == 0 was passed to
osmo_fsm_inst_state_chg(), the previous T value remained set in the
osmo_fsm_inst->T.

For example:

  osmo_fsm_inst_state_chg(fi, ST_X, 23, 42);
  // timer == 23 seconds; fi->T == 42
  osmo_fsm_inst_state_chg(fi, ST_Y, 0, 0);
  // no timer; fi->T == 42!

Instead, always set to the T value passed to osmo_fsm_inst_state_chg().

Adjust osmo_fsm_inst_state_chg() API doc; need to rephrase to accurately
describe the otherwise unchanged behaviour independently from T.

Verify in fsm_test.c.

Rationale: it is confusing to have a T number remaining from some past state,
especially since the user explicitly passed a T number to
osmo_fsm_inst_state_chg(). (Usually we are passing timeout_secs=0, T=0).

I first thought this behavior was introduced with
osmo_fsm_inst_state_chg_keep_timer(), but in fact osmo_fsm_inst_state_chg()
behaved this way from the start.

This shows up in the C test for the upcoming tdef API, where the test result
printout was showing some past T value sticking around after FSM state
transitions. After this patch, there will be no such confusion.

Change-Id: I65c7c262674a1bc5f37faeca6aa0320ab0174f3c
2019-01-29 10:25:26 +00:00
Neels Hofmeyr 407df02e7c add osmo_fsm_inst_state_chg_keep_timer()
Change-Id: I3c0e53b846b2208bd201ace99777f2286ea39ae8
2018-05-31 21:01:33 +00:00
Neels Hofmeyr a64c45a03e add osmo_fsm_inst_update_id_f()
In the osmo-msc, I would like to set the subscr conn FSM identifier by a string
format, to include the type of Complete Layer 3 that is taking place. I could
each time talloc a string and free it again. This API is more convenient.

From osmo_fsm_inst_update_id(), call osmo_fsm_inst_update_id_f() with "%s" (or
pass NULL).

Put the name updating into separate static update_name() function to clarify.

Adjust the error message for erratic ID: don't say "allocate", it might be from
an update. Adjust test expectation.

Change-Id: I76743a7642f2449fd33350691ac8ebbf4400371d
2018-04-09 17:57:15 +02:00
Neels Hofmeyr 6e8c088472 cosmetic: osmo_fsm_inst_update_id(): don't log "allocate"
On erratic id in osmo_fsm_inst_update_id(), don't say "Attempting to allocate
FSM instance".

Escape the invalid id using osmo_quote_str().

Change-Id: I770fc460de21faa42b403f694e853e8da01c4bef
2018-04-09 17:57:15 +02:00
Neels Hofmeyr 71f76a1f42 fsm: id: properly set name in case of NULL id
Since alloc relies on osmo_fsm_inst_update_id() to set the name, never skip
that.

In osmo_fsm_inst_alloc(), we allow passing a NULL id, and in
osmo_fsm_inst_update_id(), we set the name without id if id is NULL.

Change-Id: I6d6b09a811b82770818f19b189a57d9fc4a8133b
2018-04-09 17:57:15 +02:00
Neels Hofmeyr 975ee6bd44 fsm_test: more thoroughly test FSM inst ids and names
Place id and name testing in its separate section, test_id_api().

Add a test that actually allocates an FSM instance with a NULL id, which is
allowed, but uncovers a bug of an unset FSM instance name. osmo_fsm_inst_name()
falls back to the fsm struct's name on NULL, but osmo_fsm_inst_find_by_name()
fails to match if the instance's name is NULL (and until recently even
crashed). Show this in fsm_test.c with loud comments.

Add test to clear the id by passing NULL.

Add test for setting an empty id.

Add test for setting an invalid identifier (osmo_identifier_valid() == false).

Change-Id: I646ed918576ce196c395dc5f42a1507c52ace2c5
2018-04-09 17:57:15 +02:00
Neels Hofmeyr d8f175cd2a fsm_test: terminate the main loop instead of exit on timeout
In fsm_test.c, we have FSM instance cleanup after the select main loop, but we
exit(0) in the timer cb; hence the final code is never called.

Rather clean up the instance and hence also test that, by using a global flag
to exit the main loop upon timeout.

Adjust expected stderr output.

BTW, in a subsequent commit, I want to move the fsm instance id testing to
below the main loop, to more clearly group the tested bits.

Change-Id: Ia47811ffcc1bd68d2630c86be7ab98fc1f338773
2018-04-09 17:57:15 +02:00
Daniel Willmann 04a2a3231f fsm: Update the name as well if the id is updated and accept NULL
If the name stays the same the log messages will still log with the old
id. Since we can now change the id we need to update the name as well.

NULL as id was allowed before so we should allow that as well.

Change-Id: I6b01eb10b8a05fee3e4a5cdefdcf3ce9f79545b4
2018-03-19 20:28:11 +00:00
Stefan Sperling 888dc7d31a print BIG FAT ERROR message if osmo_fsm lacks event names
Event names are displayed in VTY commands so all FSM should have them.
Print an error message if an FSM is registered without event names.
We could also return an error code, however at present no caller checks
the return value of osmo_fsm_register() so this would be pointless.

Add event names to the test FSM and update expected output accordingly.

Change-Id: I08b100d62b5c50bf025ef87d31ea39072539cf37
Related: OS#3008
2018-02-26 19:00:23 +00:00
Vadim Yanitskiy 35b54d12bb fsm_test.c: fix unreachable check
Change-Id: Ic3d5da00f7ece6dbcd4c999187a5748c9331e60f
2017-05-15 12:51:15 +00:00
Harald Welte 31c0fef2fd control_if: Add control interface commands for FSMs
This allows programmatic access to introspection of FSM instances, which
is quite handy from e.g. external test cases: Send a message to the
code, then use the CTRL interface to check if that message has triggered
the right kind of state transition.

Change-Id: I0f80340ee9c61c88962fdd6764a6098a844d0d1e
2017-04-27 09:50:47 +02:00
Harald Welte 4585e6755d osmo_fsm: Lookup functions to find FSM Instance by name or ID
Introduce two lookup helper functions to resolve a fsm_instance based on
the FSM and name or ID.  Also, add related test cases.

Change-Id: I707f3ed2795c28a924e64adc612d378c21baa815
2017-04-16 17:28:23 +02:00
Neels Hofmeyr cba8eb9b21 fsm_test.c: fix compiler warning: timer cb return type
Change-Id: Ifd7e85cd69b5e7e473000abc1ef7a56748aafc0e
2016-12-24 17:11:52 +00:00
Neels Hofmeyr 6a13e7f563 fsm: add LOGPFSML to pass explicit logging level
Provide one central LOGPFSML to print FSM information, take the FSM logging
subsystem from the FSM instance but use an explicitly provided log level
instead of the FSM's default level.

Use to replace some, essentially, duplications of the LOGPFSM macro.

In effect, the fsm_test's expected error changes, since the previous code dup
for logging events used round braces to indicate the fi's state, while the
central macro uses curly braces.

Change-Id: If295fdabb3f31a0fd9490d1e0df57794c75ae547
2016-12-14 17:56:48 +01:00
Max 3de97e1926 Add logging and testing for FSM deallocation
osmo_fsm_inst_alloc() logs allocation but osmo_fsm_inst_free() is
silent. Fix this by adding log message for deallocation to make FSM
lifecycle tracking easier. Also make sure it's covered by test suite.

Change-Id: I7e5b55a1fff8e36cf61c7fb61d3e79c1f00e29d2
2016-11-08 19:35:19 +00:00
Max 8b25a3f5c3 Add osmo_fsm_unregister() to header
Previously function was defined but not exposed so there were a way to
register FSM but no way to unregister it.

Change-Id: I2e749d896009784b77d6d5952fcc38e1c131db2b
2016-11-02 08:56:29 +00:00
Harald Welte 136e73764e Add Finite State Machine abstraction code
This code is supposed to formalize some of the state machine handling in
Osmocom code.

Change-Id: I0b0965a912598c1f6b84042a99fea9d522642466
Reviewed-on: https://gerrit.osmocom.org/163
Tested-by: Jenkins Builder
Reviewed-by: Harald Welte <laforge@gnumonks.org>
2016-06-16 21:43:45 +00:00