osmo-ci

Commit Graph

Author	SHA1	Message	Date
Oliver Smith	a9c93850c3	jobs/osmo-gsm-tester-virtual: kill old instances Make sure osmo-gsm-tester gets killed eventually, even if a bug causes it to run forever or if aborted manually. * add a name to the docker container * kill the docker container if it runs longer than 24h with docker-cleanup.sh * rename fix_permissions_trap to clean_up_trap and kill it there, when it is still running before the job starts and after it is done (in my testing this did not kill it after pressing abort, but it would be killed either at the start of the next job running on the same jenkins node, or after 24h by docker-cleanup.sh) Related: OS#6304 Change-Id: I6fc874d319d74aabdc33c10910cbcca2978d5bbb	2023-12-14 11:11:27 +01:00
Oliver Smith	a13ce691d1	scripts/docker-cleanup: buildkit cache too In newer docker versions, a buildkit cache was introduced. It gets used while building images. Clean it up as well. Related: https://osmocom.org/projects/osmocom-servers/wiki/Docker_cache_clean_up Change-Id: Icf5237def75d4bcef6b0065f3f1f1da2ff362322	2023-11-21 13:01:03 +00:00
Oliver Smith	b206b2f1d2	scripts/docker-cleanup: remove containers > 24h Remove containers starting with jenkins- or having ttcn3 in the name, if they have been running for more than 24 hours. This can happen with the ttcn3 testsuites, as they typically start multiple docker containers in the background (one per Osmocom program) before they start the testsuite docker container in the foreground. Usually the clean up trap makes sure that all containers get killed, but we have seen that a few containers have been running for a few months. One reason for this could be temporary loss of connection between the jenkins server and the node running the job. Extend the clean script to remove the containers that were not properly removed by the clean up trap. Historically we used to kill docker containers of the same name before starting a testsuite, but this had the downside that we could not start the same testsuite multiple times in parallel. This was refactored in docker-playground Ifcd384272c56d585e220e2588f2186dc110902ed. Change-Id: I58c17b57c998eaba411658e83b7295d7cfcf9a23	2023-10-04 17:53:51 +02:00
Oliver Smith	21a641d6c2	scripts/docker-cleanup: remove fallback code Remove the fallback clean up code, as it also may lead to images getting removed right before we need to use them. Besides that, it should be dead code by now since docuum should be running on all our jenkins nodes to clean up old images based on last use date. Change-Id: I9ca0c2ba245bdd75d9fb8eaf341055e8c2ab1b55	2022-12-09 10:40:58 +01:00
Oliver Smith	a7df704d4d	scripts/docker-cleanup: fix timing problems Don't delete images while they are being used, to fix these errors we see from time to time in the middle of "docker build" on jenkins: unknown parent image ID sha256:1b072e35048cd8b680eddabdc641ac678edb1184d222d5e7b3fbe0b3c333129a This happens because "docker build" creates so-called dangling images for each processed step of a Dockerfile. The "docker system prune" call deletes these dangling images (among other things). Remove the "docker system prune" call. We already have the docuum daemon to deal with unused images (dangling and not dangling); it removes them based on last use date so that the used space is always below a configured limit. As it deletes images that haven't been used the longest when it reaches the limit, it will not result in the problem explained above. Besides images, "docker system prune" also removes unused containers (instances of images created with 'docker run' without --rm) and networks. Add "docker container prune" and "docker network prune" commands to remove them from now on. Also remove the redundant container removal logic (previously it was redundant with "docker system prune", now redundant with "docker container prune"). Related: https://docs.docker.com/config/pruning/ Change-Id: Ia1b466eea43dd135373949e8e3e6b005c169ea0c	2022-12-09 10:40:33 +01:00
Oliver Smith	88521fbc14	scripts/docker-cleanup.sh: conditional img clean Only run the simple image clean code if docuum is not running. It works well enough in most cases, but has the drawbacks that it never deletes "latest" images or images not matching "^osmocom-build", and may delete images that are still being used (OS#5447). With the other tool, all images are considered for removal, and the ones that have not been used the longest time are removed first. Related: OS#5477, OS#5066, SYS#5827 Change-Id: I1cef0833c096de0fa5acf77156bb5dd362e2ef9c	2022-02-11 15:44:44 +01:00
Oliver Smith	b5ebf6ea6b	scripts/docker-cleanup.sh: use "docker system prune" Do not only clean up dangling images, but also containers, volumes and networks. Related: SYS#5827 Change-Id: If441b251de50063f0229d36fb1bc260a4cb1dd87	2022-02-11 15:44:16 +01:00
Oliver Smith	2b77e64c48	scripts/docker-cleanup.sh: delete containers too Related: SYS#5827 Change-Id: I73b2f13875286c1aaa5424809edab2202f41768b	2022-02-11 15:44:16 +01:00
Oliver Smith	e94b29e837	scripts/docker-cleanup.sh: use set -x Change-Id: Iba170128e55a9778467c3d3bcf33a91321a8c29f	2022-02-11 15:44:16 +01:00
Alexander Couzens	bd20389e49	scripts/docker-cleanup.sh: set permissions to 755 Otherwise it will not be executed by cron, because cron checks for the executable bit. Change-Id: Ie9d67b157d62b38b62f5e74406d14344f90d07b8	2018-04-16 16:33:08 +02:00
Harald Welte	acdde1617b	add docker-cleanup.sh script This script should be executed regularly on all build slaves that have docker in order to discard unused images/layers. It would be a good idea to call "fstrim /" afterwards in order to get more SSD performance. However, the latter requires root access, and hence cannot be called by the 'osmocom-build' user and thus jenkins. Maybe we should install it as a cron job or systemd periodic timer job? Related: OS#3144 Change-Id: I688b952578507a9cc28fe682221b5c7e3a245519	2018-04-11 06:07:12 +00:00

11 Commits
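
The cleanup behaviour these commits describe (kill containers named jenkins-* or containing ttcn3 once they have run for more than 24 hours, then prune stopped containers, unused networks and the buildkit cache) could be sketched roughly as follows. This is not the actual scripts/docker-cleanup.sh: the function names, the MAX_AGE_SECONDS variable and the "name start_epoch" line format are hypothetical, and the start-time lookup assumes GNU date can parse the timestamp docker reports.

```shell
#!/bin/sh
# Hypothetical sketch, not the real scripts/docker-cleanup.sh.

MAX_AGE_SECONDS=$((24 * 60 * 60))

# Read "name start_epoch" lines on stdin and print the names of containers
# that match jenkins-* or *ttcn3* and started more than MAX_AGE_SECONDS
# before $1 (the current time in epoch seconds).
select_stale_containers() {
	now="$1"
	while read -r name started; do
		case "$name" in
			jenkins-*|*ttcn3*) ;;
			*) continue ;;
		esac
		if [ $((now - started)) -gt "$MAX_AGE_SECONDS" ]; then
			echo "$name"
		fi
	done
}

# Print "name start_epoch" for every running container.
# Assumption: GNU date accepts the timestamp format docker prints.
list_running_containers() {
	for name in $(docker ps --format '{{.Names}}'); do
		started="$(docker inspect --format '{{.State.StartedAt}}' "$name")"
		echo "$name $(date -d "$started" +%s)"
	done
}

# Kill stale containers, then prune what "docker system prune" used to
# cover apart from images, plus the buildkit cache. Images are left to
# docuum, as the commit messages explain.
cleanup() {
	list_running_containers | select_stale_containers "$(date +%s)" |
		while read -r name; do
			docker kill "$name"
		done
	docker container prune -f
	docker network prune -f
	docker builder prune -f
}
```

Calling `cleanup` from a cron job would run the whole sequence. Keeping the name and age filter in a separate function that reads plain text makes it possible to check the selection logic without a docker daemon.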