I have a very simple Python program called test.py that I’m using to test Monit. Monit seems unable to detect that the program is running. Here’s the current context:
- Monit version 5.31.0 running on Ubuntu 22.04.3 LTS in a Proxmox LXC container; I’ve also tried 5.26.0 and 5.33.0
- I’ve been using Monit for many years but never on this system.
- Monit successfully starts the program but attempts to restart it after the specified timeout period
- Relevant section of Monit config:
check process test MATCHING "/home/pi/monit/test.py"
start program = "/usr/bin/python3 /home/pi/monit/test.py >> /home/pi/monit/test.log 2>&1" timeout 30 seconds
stop program = "/usr/bin/pkill -f '/home/pi/monit/test.py'"
if does not exist then restart
The PID file (in case the PID method is used):
-rw-r--r-- 1 root root 6 Aug 10 16:37 test.pid
From the Monit log file:
[2023-08-18T13:40:15+0200] error : 'test' process is not running
[2023-08-18T13:40:15+0200] info : 'test' trying to restart
[2023-08-18T13:40:15+0200] info : 'test' start: '/usr/bin/python3 /home/pi/monit/test.py >> /home/pi/monit/test.log 2>&1'
[2023-08-18T13:40:45+0200] error : 'test' failed to start (exit status -1) -- Program '/usr/bin/python3 /home/pi/monit/test.py >> /home/pi/monit/test.log 2>&1' timed out after 30 s
[2023-08-18T13:41:46+0200] error : 'test' process is not running
[2023-08-18T13:41:46+0200] info : 'test' trying to restart
[2023-08-18T13:41:46+0200] info : 'test' start: '/usr/bin/python3 /home/pi/monit/test.py >> /home/pi/monit/test.log 2>&1'
[2023-08-18T13:42:16+0200] error : 'test' failed to start (exit status -1) -- Program '/usr/bin/python3 /home/pi/monit/test.py >> /home/pi/monit/test.log 2>&1' timed out after 30 s
[2023-08-18T13:43:17+0200] error : 'test' process is not running
Note that whenever Monit reports that the process is not running, it is in fact running.
test.py needs only a few milliseconds to start, and starting it manually works without issues.
Things I’ve tried:
- multiple fresh installs of different versions of Monit
- running on a bare-metal Ubuntu 22.04.3 LTS
- using a PID file as well as the MATCHING method
- numerous variants of the ‘start program’ syntax
How can I solve this problem?
2 Answers
@lutzmad Thanks for responding. I visited the link you mentioned and used it to arrive at a working start script. I had tried this method previously but the syntax from the link is different. I'm showing - for the benefit of others - the non-working and the working start line.
NON-working
WORKING
Notice the absence of the "nohup" and the "&" in the NON-working commands. I understand the role of "&" but am less certain about the "nohup" in this case.
Your Python script does not spawn a new process.
Monit starts your script, but the start command never returns within the given timeout because the script keeps running in the foreground. Monit therefore treats the start as failed and tries to start the script again and again.
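To illustrate what "spawning a new process" means here, the following is a minimal Python sketch of a launcher that starts a command in the background and returns immediately, which is what the "nohup ... &" form achieves in a shell. The function name `start_detached` and its parameters are hypothetical and not part of Monit; this is one possible approach under those assumptions, not the method from the linked answer.

```python
import subprocess


def start_detached(cmd, log_path, pid_path):
    """Launch cmd in the background and return at once (hypothetical helper)."""
    # Append stdout and stderr to the log file, like ">> log 2>&1" in a shell.
    with open(log_path, "ab") as log:
        proc = subprocess.Popen(
            cmd,
            stdin=subprocess.DEVNULL,
            stdout=log,
            stderr=subprocess.STDOUT,
            start_new_session=True,  # run the child in its own session
        )
    # Record the PID so a PID-file based check could find the process.
    with open(pid_path, "w") as pid_file:
        pid_file.write(f"{proc.pid}\n")
    # Popen does not wait for the child, so we return while it keeps running.
    return proc.pid
```

A launcher script could call, for example, `start_detached(["/usr/bin/python3", "/home/pi/monit/test.py"], "/home/pi/monit/test.log", "/home/pi/monit/test.pid")` and then exit, so the start command finishes well within the timeout while test.py keeps running.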
You should be aware that Monit uses only a limited environment to execute scripts, much like init.d or systemd. To get a more suitable environment, use a shell script to start your Python script and initialize the environment in that script, setting things like PYTHONPATH.
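The answer suggests a shell script for this. As a rough sketch of the same idea in Python, a launcher could build the child's environment explicitly instead of relying on whatever it inherits. The helper name `build_child_env` and the default PATH value are assumptions chosen for illustration.

```python
import os


def build_child_env(extra_paths, base_env=None):
    """Return an environment dict with PATH and PYTHONPATH set explicitly."""
    # Start from a copy so the caller's environment is never modified.
    env = dict(os.environ if base_env is None else base_env)
    # Assumed fallback PATH, used only when the inherited environment has none.
    env.setdefault("PATH", "/usr/local/bin:/usr/bin:/bin")
    # Put the extra module directories ahead of any existing PYTHONPATH.
    existing = env.get("PYTHONPATH", "")
    parts = list(extra_paths) + ([existing] if existing else [])
    env["PYTHONPATH"] = os.pathsep.join(parts)
    return env
```

The returned dictionary can be passed as the `env=` argument of `subprocess.Popen` when starting the monitored script.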
See this answer for an explanation of how to create a start/stop script and how Monit handles these scripts.