I’ve never created, nor used a cron job before, but what I’ve gathered from numerous questions and answers on SO is that the process is fairly simple and involves something like the following:
- Create bash file with shell commands
- Edit crontab
I’ve found lots of questions and answers on SO regarding cron jobs, but not a single one of them actually explains the syntax. I’ve tried looking online for a reliable explanation too, but to no avail. I did find this page, however, which explains the time and date portion of crontab statements very clearly.
Here’s my understanding so far:
1. Create bash script, which can be placed anywhere.
#!/bin/bash
cd /home/user/public_html/scrapy/projects/myproject/spiders
scrapy crawl mycrawler
- What is the significance of the #!/usr/bin/bash statement?
- Why is it commented out?
- Is using a shell script as a proxy even necessary to run Python scripts?
2. Edit crontab via the crontab -e command
I’ve seen so many different recommendations for this part, so I’m going to list a few examples from a few different answers.
Example #1
PATH=/usr/bin
* 5 * * * cd project_folder/project_name/ && scrapy crawl spider_name
- Is embedding commands directly in crontab -e considered good practice?
Example #2
*/5 * * * * /usr/local/bin/python /home/Documents/SCRAPE_PYTHON/SCRAPE.py &>> /home/Desktop/log.txt
- What is the significance of the first path, /usr/local/bin/python, in this context?
He states in his answer that &>> /home/Desktop/log.txt is the file to which errors and other output will be appended.
- Is that what the &>> does?
- Is that universal for every single Linux environment?
Example #3
*/2 * * * * /home/user/shell_scripts/cj-scrapy.sh
- How come the above code does not include two paths?
- Is it a potential security vulnerability to place shell scripts in the /home/user/scripts directory?
- Is there a specific directory where shell scripts like this are commonly stored?
Example #4
The cPanel Cron Job Wizard recommends the following syntax:
/usr/local/bin/php /home/user/public_html/path/to/cron/script
Why all of the discrepancies between crontab recommendations?
I understand the syntax of the time and date portion of crontab, but can somebody please explain the proper syntax for the rest of it?
3 Answers
Many questions here, but:
A cron job (or cron schedule) is a set of execution instructions specifying the day, the time, and the command to execute. A crontab can hold multiple entries, one per line, and each entry can run several commands.
What is the significance of the #!/usr/bin/bash statement?
It is a shebang. If a script is named with the path path/to/script, and it starts with the shebang line, #!/usr/bin/bash, then the program loader is instructed to run the program /usr/bin/bash and pass it the path/to/script as the first arg.
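Why is it commented out?
It is not really commented out. In computing, a shebang is the character sequence consisting of the number sign and exclamation mark (#!) at the beginning of a script. The shell treats the line as a comment because it starts with #, but the program loader reads it to pick the interpreter.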
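A minimal sketch of that behaviour, assuming a throwaway script in a temporary directory (the file name hello.sh and its message are made up for illustration). Run directly, the loader uses the shebang to pick /bin/sh; passed to sh explicitly, the same line is just a comment:

```shell
#!/bin/sh
# Hypothetical demo script; the file name and message are illustrative only.
workdir=$(mktemp -d)
script="$workdir/hello.sh"

# Write a two-line script whose first line is a shebang.
printf '%s\n' '#!/bin/sh' 'echo "hello from a script"' > "$script"
chmod +x "$script"

# Executed directly: the loader reads the shebang and starts /bin/sh with the script path.
direct_output=$("$script")

# Passed to an interpreter explicitly: the shebang line is only a comment to that shell.
explicit_output=$(sh "$script")

echo "$direct_output"
echo "$explicit_output"
```

Both invocations produce the same output, which is why the line can look commented out and still matter.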
Is using a shell script as a proxy even necessary to run Python scripts?
In relation to the crontab? No. You can pass many commands in a single entry.
Editing crontab by crontab -e: simple answer, yes. Here is a very quick reference:
Example 2
You are telling cron to execute a python script. Cron needs to know where the python binary is (at /usr/local/bin/python), which is required to execute the python script sitting at /home/Documents/SCRAPE_PYTHON/SCRAPE.py (the &>> is for directing output to a log file).
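To see what &>> does, here is a small sketch that uses a temporary file in place of the log path from the question. It assumes bash is installed, and it also shows the portable spelling, since cron runs commands with /bin/sh unless told otherwise:

```shell
#!/bin/sh
# Temporary file standing in for a log such as /home/Desktop/log.txt.
log_file=$(mktemp)

# bash-only form: &>> appends both stdout and stderr to the log.
# bash is invoked explicitly, much as cron would with SHELL=/bin/bash set.
bash -c '{ echo "normal output"; echo "error output" >&2; } &>> "$1"' _ "$log_file"

# Portable form: >> file 2>&1 has the same effect in any POSIX sh,
# which is the shell cron uses by default.
sh -c '{ echo "normal output"; echo "error output" >&2; } >> "$1" 2>&1' _ "$log_file"

cat "$log_file"
```

After both commands the log holds four lines, each message appearing twice, because neither form overwrites what is already in the file.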
I try to provide context for all of your questions and examples below, but ultimately the question is: what is the proper syntax for the rest of a crontab line?
Generally a crontab entry is a time directive, followed by a shell command:
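A minimal sketch of that layout, reusing the script path from Example #3 of the question purely as an illustration. The five time fields come first, and everything after them is handed to the shell as the command:

```crontab
# minute  hour  day-of-month  month  day-of-week  command
# 0-59    0-23  1-31          1-12   0-7 (0 and 7 are Sunday)
  */2     *     *             *      *            /home/user/shell_scripts/cj-scrapy.sh
```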
In UNIX, #! (or shebang) indicates which program should be used to interpret the script that follows. So #!/usr/local/bin/python means, execute the following script with python (aka /usr/local/bin/python), just as /bin/bash indicates that the following script should be executed with the bash shell. This looks like a comment because it is a comment: it is designed to be a comment to python so it’s not interpreted, but it has meaning to UNIX when executing (similar to a preprocessor directive).
This shebang answers your question about whether a shell script is needed as a proxy to run Python scripts. The answer is no. The shebang makes this wrapper completely unnecessary.
Now to get to your examples:
Example 1:
Breaking this down: * 5 * * * indicates that this command should be run on every minute (first *) of the 5th hour (5) of every day of the month (next *) of every month (next *) on every day of the week (last *). This is almost certainly not what you want from a time perspective. The rest of the line is executed as a command string, so you are changing directory to project_folder/project_name and then executing scrapy. Overall the crontab bits here are not what you want, and the relative path on the cd indicates that this command is also probably not correct.
As for embedding commands directly: that is what crontab is, time directives followed by a command.
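One way the entry might be repaired, as a sketch only. It assumes the intent was a single run at 05:00 each day, and it uses a made-up absolute project path, since the original gives only a relative one. The PATH line is also an assumption: it guesses that the scrapy executable sits in one of these directories, which may not hold on a given system.

```crontab
PATH=/usr/local/bin:/usr/bin:/bin
# Run once a day at 05:00 (minute 0, hour 5) instead of every minute of the 5th hour.
# /home/user/project_folder/project_name is a hypothetical absolute path.
0 5 * * * cd /home/user/project_folder/project_name && scrapy crawl spider_name
```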
Example #2
This command will run every 5th minute of every hour of every day of every month on every day of the week. The /usr/local/bin/python here is redundant with the #!/usr/local/bin/python line at the top of the script, so it is quite unnecessary.
The &>> will append the output (>>) of the command on both stdout and stderr (&) to the file /home/Desktop/log.txt. This logging is good, the every 5 minutes might be good, but the python bits are not necessary. To answer your second question: yes, this is bash syntax, so it will work with every command as long as cron runs the line with bash rather than a plain sh.
Example #3
This executes the program /home/user/shell_scripts/cj-scrapy.sh (presumably a shell script) every 2nd minute of every hour of every day of the month of every month on every day of the week. This script presumably runs your python script.
Example 4
This is neither python nor a cron job.
Let me answer all of your questions one by one!
1) What is the significance of the #!/usr/local/bin/python statement in this context?
You can leave this out, but if you do, you have to specify the interpreter when running your script. Similarly, #!/bin/bash points to the bash interpreter. The shebang has to be the first line of the script.
2) Is creating a shell file necessary to run Python/Scrapy scripts, et cetera?
3) Can you actually export PATH and execute other commands (i.e. cd ... && scrapy crawl mycrawler) directly in crontab -e?
If you do not set PATH, then you have to specify the full path of every command to be executed, like /usr/bin/find just to run the find command.
4) What is the significance of the first path, /usr/local/bin/python?
5) In his answer, he states that &>> /home/Desktop/log.txt is the file to which errors and other output will be appended. Is that what the &>> does? Is that universal for every single Linux distro?
&>> will append the output, that is stdout and stderr, to the file. Yes, it’s common to all Linux distros.
6) Is there a specific location on servers where shell scripts like this are commonly stored?
/opt/<user>/scripts folder.
Optional: https://crontab.guru/ is where you can understand more about crontab syntax; all the other things are basic Linux.
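To make the PATH point from item 3 concrete, here is a small sketch. The /nonexistent directory is a deliberately bogus value chosen for illustration. command -v reports the absolute path that could be written into a crontab entry, and the same lookup fails once PATH no longer contains the command's directory:

```shell
#!/bin/sh
# Look up the absolute path of find so it can be spelled out in a crontab entry.
find_path=$(command -v find)
echo "find lives at: $find_path"

# With a PATH that does not contain find (a bogus directory, for illustration),
# the bare name can no longer be resolved, much like under a sparse cron PATH.
if ( PATH=/nonexistent; command -v find >/dev/null 2>&1 ); then
    bare_name_status="found"
else
    bare_name_status="not found"
fi
echo "bare name with bogus PATH: $bare_name_status"
```

Writing the absolute path in the crontab entry, or setting PATH at the top of the crontab, avoids depending on whatever default the cron daemon provides.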