I’ve never created, nor used a cron job before, but what I’ve gathered from numerous questions and answers on SO is that the process is fairly simple and involves something like the following:

  1. Create bash file with shell commands
  2. Edit crontab

I’ve found lots of questions and answers on SO regarding cron jobs, but not a single one of them actually explains the syntax. I’ve tried looking online for a reliable explanation too, but to no avail. I did find this page, however, which explains the time and date portion of crontab statements very clearly.

Here’s my understanding so far:

1. Create bash script, which can be placed anywhere.

#!/bin/bash
cd /home/user/public_html/scrapy/projects/myproject/spiders
scrapy crawl mycrawler
  • What is the significance of the #!/usr/bin/bash statement?

  • Why is it commented out?

  • Is using a shell script as a proxy even necessary to run Python scripts?

2. Edit crontab via the crontab -e command

I’ve seen so many different recommendations for this part, so I’m going to list a few examples from a few different answers.


Example #1

PATH=/usr/bin
* 5 * * * cd project_folder/project_name/ && scrapy crawl spider_name
  • Is embedding commands directly in crontab -e considered good practice?

Example #2

*/5 * * * * /usr/local/bin/python /home/Documents/SCRAPE_PYTHON/SCRAPE.py &>> /home/Desktop/log.txt

  • What is the significance of the first path, /usr/local/bin/python, in this context?

He states in his answer that &>> /home/Desktop/log.txt is the file to which errors and other output will be appended.

  • Is that what the &>> does?

  • Is that universal for every single Linux environment?


Example #3

*/2 * * * * /home/user/shell_scripts/cj-scrapy.sh

  • How come the above code does not include two paths?

  • Is it a potential security vulnerability to place shell scripts in the /home/user/scripts directory?

  • Is there a specific directory where shell scripts like this are commonly stored?


Example #4

The cPanel Cron Job Wizard recommends the following syntax:

/usr/local/bin/php /home/user/public_html/path/to/cron/script


Why all of the discrepancies between crontab recommendations?

I understand the syntax of the time and date portion of crontab, but can somebody please explain the proper syntax for the rest of it?

3 Answers


  1. There are many questions here, but:

    A cron job is a scheduled instruction that specifies when to run (minute, hour, day of month, month, day of week) and which command to execute. A crontab can contain many such entries, one per line, and the command portion of each entry can chain several shell commands (for example with && or ;).

    What is the significance of the #!/usr/bin/bash statement?

    It is a shebang. If a script is named with the path path/to/script, and it starts with the shebang line, #!/usr/bin/bash, then the program loader is instructed to run the program /usr/bin/bash and pass it the path/to/script as the first arg.
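
As a rough illustration of what the loader does, the sketch below writes a tiny script to a temporary directory, marks it executable, and runs it by path. The file name hello.sh and the use of /bin/sh (chosen so the example runs on most systems, instead of the /usr/bin/bash from the question) are assumptions made for this example only.

```shell
# Create a throwaway script whose first line is a shebang.
tmpdir=$(mktemp -d)
cat > "$tmpdir/hello.sh" <<'EOF'
#!/bin/sh
echo "interpreted script: $0"
EOF

# The executable bit is what allows the file to be run by path at all.
chmod +x "$tmpdir/hello.sh"

# Running the file by path is equivalent to: /bin/sh "$tmpdir/hello.sh"
output=$("$tmpdir/hello.sh")
echo "$output"
```

The script path shows up as $0, which matches the description above: the interpreter named in the shebang receives the script path as its first argument.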

    Why is it commented out?

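    It is not really commented out. To the shell, a line starting with # is a comment, so bash skips it when it reads the script. The program loader, however, looks at the first two characters of the file: a shebang is the character sequence consisting of the number sign and exclamation mark (#!) at the very beginning of a script, and the loader uses what follows to pick the interpreter.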

    Is using a shell script as a proxy even necessary to run Python scripts?

    In relation to the crontab? No. You can call the interpreter directly in the command portion of the entry:

    * * * * * /usr/bin/python script.py
    

    Should you edit the crontab with crontab -e? Simple answer: yes. Here is a very quick reference:

    crontab -e    Edit crontab file, or create one if it doesn’t already exist.
    crontab -l    List your cron jobs (display the crontab file contents).
    crontab -r    Remove your crontab file.
    crontab -v    Display the last time you edited your crontab file. (This option is only available on a few systems.)
    

    Example 2
    You are telling cron to execute a python script. Cron needs to know where the python binary is (at /usr/local/bin/python), which is required to execute the python script sitting at /home/Documents/SCRAPE_PYTHON/SCRAPE.py (the &>> is for directing output to a log file).

  2. I try to provide context for all of your questions and examples below, but ultimately the question is:

    1. How frequently do you want to execute your command?
    2. What command do you need to execute?

    Generally a crontab entry is a time directive, followed by a shell command:

    * * * * * shell command
    ^ ^ ^ ^ ^|^^^^^^^^^^^^^
    | | | | |||||||||||||||
       time  |shell command
    

    In UNIX, #! (or shebang) indicates which program should be used to interpret the script that follows. So #!/usr/local/bin/python means: execute the following script with Python (the binary at /usr/local/bin/python), just as #!/bin/bash indicates that the script should be executed with the bash shell. It looks like a comment because it is a comment: # starts a comment in both Python and bash, so the interpreter itself ignores the line, but it has meaning to UNIX when the file is executed (similar to a preprocessor directive).

    This shebang answers your question:

    Is using a shell script as a proxy even necessary to run Python scripts?

    The answer is no. With a shebang on the first line and the executable bit set (chmod +x), the script can be run directly, so the wrapper is unnecessary.

    Now to get to your examples:

    Example 1:

    PATH=/usr/bin
    * 5 * * * cd project_folder/project_name/ && scrapy crawl spider_name
    

    Breaking this down: * 5 * * * indicates that this command should be run on every minute (first *) of the 5th hour (5) of every day of the month (next *) of every month (next *) on every day of the week (last *), i.e. 60 times between 05:00 and 05:59. This is almost certainly not what you want from a time perspective. The rest of the line is executed as a command string, so you are changing directory to project_folder/project_name and then executing scrapy. Overall the crontab bits here are not what you want, and the relative path on the cd indicates that this command is also probably not correct.
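
To make the difference between these time fields concrete, here is a small sketch that counts how many minutes of one day match a given minute and hour field. The helper names field_matches and runs_per_day are made up for this example, and the matcher is a deliberate simplification: it only understands *, */N and a plain number, not the lists and ranges that real cron accepts.

```shell
# field_matches FIELD VALUE
# Simplified matcher: supports only "*", "*/N" and a single number.
field_matches() {
  case "$1" in
    '*')   return 0 ;;
    '*/'*) [ $(( $2 % ${1#*/} )) -eq 0 ] ;;
    *)     [ "$1" -eq "$2" ] ;;
  esac
}

# runs_per_day MINUTE_FIELD HOUR_FIELD
# Walks through all 1440 minutes of a day and counts the matches.
runs_per_day() {
  count=0
  hour=0
  while [ "$hour" -lt 24 ]; do
    minute=0
    while [ "$minute" -lt 60 ]; do
      if field_matches "$1" "$minute" && field_matches "$2" "$hour"; then
        count=$((count + 1))
      fi
      minute=$((minute + 1))
    done
    hour=$((hour + 1))
  done
  echo "$count"
}

runs_per_day '*' 5       # Example 1 as written: prints 60
runs_per_day 0 5         # once a day at 05:00: prints 1
runs_per_day '*/5' '*'   # every 5 minutes: prints 288
```

If the intent of Example 1 was one crawl per day at 5 a.m., the minute field would need to be a fixed number such as 0 rather than *.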

    Is embedding commands directly in crontab -e considered good practice?

    Yes. That is what a crontab is: time directives followed by a command.

    Example #2

    */5 * * * * /usr/local/bin/python /home/Documents/SCRAPE_PYTHON/SCRAPE.py &>> /home/Desktop/log.txt
    

    This command will run every 5th minute of every hour of every day of every month on every day of the week. The /usr/local/bin/python here is redundant if the script starts with #!/usr/local/bin/python and is executable, in which case it is quite unnecessary.

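    The &>> will append (>>) the output of the command on both stdout and stderr (&) to the file /home/Desktop/log.txt. This logging is good, the every 5 minutes might be good, but the python bits are not necessary. To answer your second question: not quite. This is bash syntax, and cron normally runs commands with /bin/sh, so it only works where /bin/sh is bash or where the crontab sets SHELL=/bin/bash. The portable form is >> /home/Desktop/log.txt 2>&1.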
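
For shells that do not understand &>>, the same effect can be spelled in a portable way. The sketch below uses a made-up function named noisy as a stand-in for the real command and a temporary file in place of /home/Desktop/log.txt; both are assumptions for illustration.

```shell
# Temporary log file standing in for the real log path.
log=$(mktemp)

# Stand-in command: writes one line to stdout and one to stderr.
noisy() {
  echo "to stdout"
  echo "to stderr" >&2
}

# ">> file" appends stdout; "2>&1" then sends stderr to the same place.
noisy >> "$log" 2>&1
noisy >> "$log" 2>&1

# Both streams from both runs are now in the log, in order.
cat "$log"
```

The order matters: 2>&1 has to come after the >> redirection, otherwise stderr is pointed at the original stdout rather than at the file.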

    Example #3

    */2 * * * * /home/user/shell_scripts/cj-scrapy.sh
    

    This executes the program /home/user/shell_scripts/cj-scrapy.sh (presumably a shell script) every 2nd minute of every hour of every day of the month of every month on every day of the week. This script presumably runs your python script.

    Example #4

    This runs a PHP script, not Python, and as written it is not a complete crontab entry because it has no time fields.

  3. Let me answer all of your questions one by one.

    1.) What is the significance of the #!/usr/local/bin/python statement in this context?

    • This is called a shebang. It specifies the interpreter; here it points to the Python interpreter.
      You can omit it, but then you have to specify the interpreter yourself when running the script.

    Similarly, #!/bin/bash points to the bash interpreter. The shebang must be the first line of the script.

    2) Is creating a shell file necessary to run Python/Scrapy scripts, et cetera?

    • No, you can enter commands directly in the crontab.

    3) Can you actually export PATH and execute other commands (i.e. cd ... && scrapy crawl mycrawler) directly in crontab -e?

    • Yes, and it is good practice (in a crontab it is a plain PATH=... line, without export). It also shortens the entries: if you do not set PATH, you have to give the full path of every command you execute, e.g. /usr/bin/find just to run find.
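
A quick way to see why the full path matters is to look a command up with command -v and then run it under a PATH that does not contain it. In this sketch ls stands in for find or scrapy, /nonexistent-dir is a deliberately bogus directory, and the helper name try is made up for the example.

```shell
# Find the full path of a command, so it can be written into the crontab.
ls_path=$(command -v ls)
echo "ls lives at: $ls_path"

# try COMMAND [ARGS...]: report whether the command could be run.
try() {
  if "$@" >/dev/null 2>&1; then echo "ok"; else echo "failed"; fi
}

# With a PATH that lacks the command, the bare name cannot be found...
bare=$(PATH=/nonexistent-dir; try ls /)
# ...but the full path still works, because no PATH lookup is needed.
full=$(PATH=/nonexistent-dir; try "$ls_path" /)

echo "bare name: $bare"   # prints: bare name: failed
echo "full path: $full"   # prints: full path: ok
```

Running command -v scrapy in your normal shell gives you the path to either put in the crontab entry or add to its PATH line.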

    4) What is the significance of the first path, /usr/local/bin/python?

    • As I said earlier, it points to your interpreter.

    5) In his answer, he states that &>> /home/Desktop/log.txt is the file to which errors and other output will be appended. Is that what the &>> does? Is that universal for every single Linux distro?

    • &>> appends both stdout and stderr to the file. It is bash syntax, so it works on any Linux distro where the command is run by bash; under plain sh, use >> file 2>&1 instead.

    6) Is there a specific location on servers where shell scripts like this are commonly stored?

    • It’s up to you where you store your scripts, but it’s recommended that you keep them in a safe place, such as an /opt/<user>/scripts folder.
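
Whatever directory you pick, the permissions matter at least as much as the location: anyone who can modify the script can run commands as the user whose crontab calls it. A minimal sketch of restricting a script to its owner, using a temporary directory and the file name cj-scrapy.sh purely as stand-ins for the real path:

```shell
# Throwaway directory and script standing in for the real ones.
script_dir=$(mktemp -d)
script="$script_dir/cj-scrapy.sh"
printf '#!/bin/sh\necho "crawl would start here"\n' > "$script"

# Owner may read, write and execute; group and others get nothing.
chmod 700 "$script"

# Show the resulting mode bits (first column of ls -l).
mode=$(ls -l "$script" | cut -c1-10)
echo "$mode"   # prints: -rwx------
```

The same chmod 700 applied to the directory that holds the script keeps other users from replacing the file as well.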

    Optional

    https://crontab.guru/ – here you can understand more about crontab syntax and all other things are basic Linux things.
