== Troubleshooting

=== Format: YAML, and its Drawbacks

The general configuration format used is YAML. The stock Python YAML parser
has several drawbacks: it allows too many complex constructs and alternative
ways of formatting a configuration. At the time of writing, however, YAML seems
to be the only widely used configuration format that offers simple,
human-readable formatting as well as nested structuring. It is recommended to
use only the exact YAML subset seen in this manual, in case osmo-gsm-tester
moves to a less bloated parser in the future.

Careful: if a configuration item consists of digits and starts with a zero, you
need to quote it, or it may be interpreted as an octal integer! Please do not
rely on octal notation on purpose; it is a side effect of the parser, not an
intended feature.
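
To illustrate the pitfall: a YAML 1.1 parser reads an unquoted '0123' as the
octal integer 83, while a quoted value stays a string. The following is a
minimal sketch of a pre-check that flags such values before a configuration is
used. It relies only on the Python standard library and a simple line-based
scan, not on a real YAML parser. The function name 'find_octal_like_values'
and the keys in the sample are hypothetical, and comment stripping is
deliberately naive.

```python
import re

# YAML 1.1 octal integer form: optional sign, a leading zero, then octal digits.
# Assumption: this approximates how a YAML 1.1 parser resolves plain scalars.
OCTAL_LIKE = re.compile(r"^[-+]?0[0-7_]+$")

def find_octal_like_values(text):
    """Return (line_number, key, value) for unquoted values that look octal."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.split("#", 1)[0].strip()  # naive comment removal
        if stripped.startswith("- "):
            stripped = stripped[2:].strip()       # drop the list item marker
        if ":" not in stripped:
            continue
        key, value = stripped.split(":", 1)
        value = value.strip()
        if OCTAL_LIKE.match(value):
            hits.append((number, key.strip(), value))
    return hits

# Hypothetical configuration snippet, for illustration only.
sample = """\
bts:
- label: demo
  mcc: 001
  mnc: '01'
  arfcn: 868
  pin: 0123
"""

for number, key, value in find_octal_like_values(sample):
    as_int = int(value.replace("_", ""), 8)
    print(f"line {number}: '{key}: {value}' may be read as integer {as_int}; quote it")
```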

=== {app-name} not running but resources still allocated

The <<state_dir,reserved_resources.state>> file is used to keep the shared
state of the resources allocated by any {app-name} instance. Each running
{app-name} instance is responsible for de-allocating the resources it used
before exiting. In general, upon receiving a shutdown action (e.g. 'CTRL+C',
'SIGINT' or a Python exception), {app-name} handles the situation properly and
de-allocates the resources before the process exits. Similarly, {app-name} also
takes care of terminating all child processes it manages before exiting itself.

However, under some circumstances, {app-name} will be unable to de-allocate the
resources, and they will appear as still allocated to subsequent {app-name}
instances that try to use them. This situation is usually reached when someone
terminates {app-name} the hard way. The main causes are the {app-name} process
receiving a 'SIGKILL' signal ('kill -9 $pid'), which cannot be caught, or the
entire host being shut down improperly.
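
Why 'SIGKILL' leaves resources behind can be seen in a small, generic Python
sketch. It is not {app-name}'s actual code, and it assumes a POSIX host. Cleanup
placed in a 'finally' block runs on normal return, on exceptions and on
'SIGINT' (which Python raises as 'KeyboardInterrupt'), but the operating system
refuses to install any handler for 'SIGKILL', so no cleanup code gets a chance
to run. The names 'release_resources', 'run' and 'can_install_handler' are
hypothetical.

```python
import signal

def can_install_handler(signum):
    """Return True if this process may install its own handler for signum."""
    try:
        previous = signal.signal(signum, lambda received, frame: None)
    except (OSError, ValueError):
        # The OS rejects handlers for uncatchable signals such as SIGKILL.
        return False
    if previous is not None:
        signal.signal(signum, previous)  # restore the earlier handler
    return True

released = []

def release_resources():
    # Hypothetical stand-in for removing entries from the shared state file.
    released.append("resources")

def run(job):
    try:
        job()
    finally:
        # Reached on normal return, on exceptions and on KeyboardInterrupt,
        # but never when the process is killed with SIGKILL.
        release_resources()

def failing_job():
    raise RuntimeError("simulated test failure")

try:
    run(failing_job)
except RuntimeError:
    pass

print("released after exception:", released)
print("SIGTERM handler allowed:", can_install_handler(signal.SIGTERM))
print("SIGKILL handler allowed:", can_install_handler(signal.SIGKILL))
```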

As a notable example, 'SIGKILL' is known to be sent to {app-name} when it runs
under a Jenkins shell script and either of the following two things happens:

- The user presses the red cross icon in the Jenkins UI to terminate the
  running job.
- The connection between the Jenkins master (UI) and the Jenkins slave running
  the job is lost.

Once this situation is reached, two steps are needed:

- Gain console access to the <<install_main_unit,Main Unit>> and manually clean
  or completely remove the 'reserved_resources.state' file in the
  <<state_dir,state_dir>>. In general, it is a good idea to make sure no
  {app-name} instance is running at all and then remove all files in
  <<state_dir,state_dir>>, since {app-name} could theoretically have been killed
  while writing a file, leaving it with corrupt content.
- Gain console access to the <<install_main_unit,Main Unit>> and each of the
  <<install_slave_unit,Slave Units>> and kill any hanging long-running processes
  there which may have been started by {app-name}. Common processes in
  this list include 'tcpdump', 'osmo-\*', 'srs*', etc.
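
The two recovery steps above can be sketched as a small helper script. This is
an illustration under stated assumptions, not a tool shipped with {app-name}:
the function names are hypothetical, the process listing reads Linux '/proc'
directly, and the state directory path must be supplied by the operator. The
script only lists candidate processes and leaves the decision to kill them to
the operator. 'clear_state_dir' removes only regular files directly inside the
given directory and should be called only once no {app-name} instance is
running.

```python
import fnmatch
from pathlib import Path

# Name patterns from the list above; extend them to match the local setup.
LEFTOVER_PATTERNS = ("tcpdump", "osmo-*", "srs*")

def leftover_candidates(process_names, patterns=LEFTOVER_PATTERNS):
    """Return the process names that match any of the shell-style patterns."""
    return [name for name in process_names
            if any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns)]

def running_process_names(proc_root="/proc"):
    """Collect command names of running processes (Linux procfs only)."""
    names = []
    for comm in Path(proc_root).glob("[0-9]*/comm"):
        try:
            names.append(comm.read_text().strip())
        except (OSError, ValueError):
            continue  # the process exited or its name could not be decoded
    return names

def clear_state_dir(state_dir):
    """Delete the regular files directly inside state_dir; return their names."""
    removed = []
    for entry in sorted(Path(state_dir).iterdir()):
        if entry.is_file():
            entry.unlink()
            removed.append(entry.name)
    return removed

if __name__ == "__main__":
    # Only lists candidates; nothing is killed or deleted here.
    for name in sorted(set(leftover_candidates(running_process_names()))):
        print("possible leftover process:", name)
```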