
I have a bash script in my Apache directory that downloads some pictures and optimizes them.

The script's path is /var/www/site/storage/optimazer/photo_optimazer.sh

The script reads some arguments from a text file and passes them to wget:

#!/usr/bin/env bash
..
THREAD="$(cat ${THREAD_FILE})";
$(command -v wget) $THREAD
...

Contents of ${THREAD_FILE}:

$ cat "${THREAD_FILE}"
--user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:90.0) Gecko/20100101 Firefox/90.0" -np -r -l 1 -A "jpg" --ignore-case -P /var/www/optimazer/public/optimazed -x http://example.com

I am trying to run this script from another one that I created at /usr/local/bin/optimaze.sh

I had to do it this way because it will be run by a system service.

Here is the content of /usr/local/bin/optimaze.sh:

#!/usr/bin/env bash

cd /var/www/site/storage/optimazer/
$(command -v bash) photo_optimazer.sh

Now, when I execute optimaze.sh, it seems to add extra quotes to the contents of ${THREAD}, which breaks the script, and I get errors like this:

--2021-07-30 12:56:59--  http://(windows/
Resolving (windows ((windows)... failed: Name or service not known.
wget: unable to resolve host address ‘(windows’
--2021-07-30 12:56:59--  http://nt/
Resolving nt (nt)... failed: Name or service not known.
wget: unable to resolve host address ‘nt’
--2021-07-30 12:56:59--  http://10.0;/
Resolving 10.0; (10.0;)... failed: Name or service not known.
wget: unable to resolve host address ‘10.0;’
--2021-07-30 12:56:59--  http://win64;/
Resolving win64; (win64;)... failed: Name or service not known.
wget: unable to resolve host address ‘win64;’
--2021-07-30 12:56:59--  http://x64;/
Resolving x64; (x64;)... failed: Name or service not known.
wget: unable to resolve host address ‘x64;’
--2021-07-30 12:56:59--  ftp://rv/90.0)
           => ‘/var/www/scraper/public/***/3/rv/.listing’
Resolving rv (rv)... failed: Name or service not known.
wget: unable to resolve host address ‘rv’
--2021-07-30 12:56:59--  http://gecko/20100101
Resolving gecko (gecko)... failed: Name or service not known.
wget: unable to resolve host address ‘gecko’
--2021-07-30 12:56:59--  http://firefox/90.0%22
Resolving firefox (firefox)... failed: Name or service not known.
wget: unable to resolve host address ‘firefox’

I tried set -ex in photo_optimazer.sh to see what happened:

 wget '--user-agent="Mozilla/5.0' '(Windows' NT '10.0;' 'Win64;' 'x64;' 'rv:90.0)' Gecko/20100101 'Firefox/90.0"' -np -A '"jpg,png"' --ignore-case --ignore-length -P /example/path -x http://example.com

It adds single quotes to my ${THREAD} output and I don’t know why!

I use GNU bash, version 4.2.46(2)-release (x86_64-redhat-linux-gnu)

2 Answers


  1. If your arguments can’t contain newlines, consider changing THREAD_FILE (better named all-lowercase, as thread_file, to stay out of the reserved all-caps namespace) to contain one argument per line, with no shell syntax whatsoever:

    --user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:90.0) Gecko/20100101 Firefox/90.0
    -np
    -r
    -l
    1
    -A
    jpg
    --ignore-case
    -P
    /var/www/optimazer/public/optimazed
    -x
    http://example.com
    

    Once you’ve done that, you can use (in bash 4.0 or later) readarray or mapfile to read each line of that file into a new array entry:

    readarray -t wget_args <"$thread_file"
    

    …and then expand that array onto your wget command line:

    wget "${wget_args[@]}"
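
As a minimal sketch of the steps above, the following script writes a one-argument-per-line file and reads it back with readarray. The temporary file created with mktemp and the use of printf in place of wget are assumptions made for illustration, so nothing is downloaded. Each array element is printed in angle brackets to make the argument boundaries visible.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for the real thread file: one argument per line,
# with no quotes or other shell syntax.
thread_file=$(mktemp)
cat >"$thread_file" <<'EOF'
--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:90.0) Gecko/20100101 Firefox/90.0
-np
-r
-l
1
-A
jpg
--ignore-case
-P
/var/www/optimazer/public/optimazed
-x
http://example.com
EOF

# Read each line into its own array element; -t strips the trailing newline.
readarray -t wget_args <"$thread_file"
rm -f -- "$thread_file"

# Print instead of calling wget, so each argument boundary can be inspected.
printf '<%s>\n' "${wget_args[@]}"
```

The whole user-agent string, spaces included, should come out as a single element, which is what wget "${wget_args[@]}" would then receive as one argument.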
    

    A note about that "reserved all-caps namespace" claim made above: strictly speaking, the POSIX standard only requires POSIX-specified tools to use all-caps names for the environment variables that modify those tools’ behavior. However:

    • When an environment variable and a shell variable have the same name, any changes to the shell variable will also implicitly modify the environment variable.
    • The purpose of POSIX tools being defined to use only all-caps variables is to make variable names with at least one lower-case character safe for application use.

    When all-caps variables are used for application-defined purposes, doing so discards the benefits of the restrictions POSIX places on built-in tools.
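
The first bullet can be checked with a short sketch. The variable name DEMO_SETTING is hypothetical and chosen only for illustration; the point is that a plain assignment to a name that is already in the environment is visible to child processes without a second export.

```bash
#!/usr/bin/env bash
# DEMO_SETTING is a hypothetical name standing in for any variable
# that is already present in the environment.
export DEMO_SETTING=original

# Plain shell assignment, with no export keyword this time.
DEMO_SETTING=changed

# A child process reads the value from its environment.
child_sees=$(sh -c 'printf %s "$DEMO_SETTING"')
printf 'child sees: %s\n' "$child_sees"
```

If a script reuses an all-caps name that some tool reads from its environment, an assignment like the second one would change that tool's behavior in the same way.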

  2. For this particular case one idea would be to feed each line to xargs.

    For sample data I doubled OP’s $THREAD_FILE:

    $ cat tfile
    --user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:90.0) Gecko/20100101 Firefox/90.0" -np -r -l 1 -A "jpg" --ignore-case -P /var/www/optimazer/public/optimazed -x http://example.com
    --user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:90.0) Gecko/20100101 Firefox/90.0" -np -r -l 1 -A "jpg" --ignore-case -P /var/www/optimazer/public/optimazed -x http://example.com
    

    A first pass at xargs:

    cat tfile | xargs -r wget
    

    Or we can eliminate the unnecessary cat by feeding the file directly to xargs:

    xargs -r -a tfile wget
    

    A few variations on KamilCuk’s comment/suggestion:

    xargs -r < tfile wget
    xargs -r wget < tfile
    < tfile xargs -r wget
    

    If we’re dealing with a variable (as in OP’s example):

    thread=$(head -1 tfile)
    xargs -r wget <<< "${thread}"
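
To see how xargs tokenizes such a line without touching the network, here is a small sketch in which printf stands in for wget. The shortened sample line is an assumption that only mirrors the quoting style of the original file. Unlike an unquoted shell expansion, xargs applies its own quote handling to its input, so the double quotes group words and are then removed.

```bash
#!/usr/bin/env bash
# Shortened, hypothetical sample line using the same quoting style as tfile.
thread='--user-agent="Mozilla/5.0 (Windows NT 10.0)" -A "jpg" http://example.com'

# printf stands in for wget; it repeats its format once per argument,
# so every argument xargs produced appears on its own line.
xargs_out=$(xargs -r printf '<%s>\n' <<< "${thread}")
printf '%s\n' "$xargs_out"
```

The user-agent value, spaces included, should appear inside a single pair of angle brackets with the double quotes gone.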
    

    Expanding on the <<< "${thread}" example, this can be used in a loop (e.g., when additional processing is needed for each line of a multi-line input file):

    while read -r thread
    do
        xargs -r wget <<< "${thread}"
    done < tfile
    

    All of these generate the following for each line processed:

    --2021-07-31 13:50:41--  http://example.com/
    Resolving example.com (example.com)... 93.184.216.34, 2606:2800:220:1:248:1893:25c8:1946
    Connecting to example.com (example.com)|93.184.216.34|:80... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: 1256 (1.2K) [text/html]
    Saving to: ‘/var/www/optimazer/public/optimazed/example.com/index.html.tmp’
    
    example.com/index.html.tmp               100%[================================================================================>]   1.23K  --.-KB/s    in 0.001s
    
    2021-07-31 13:50:41 (1.25 MB/s) - ‘/var/www/optimazer/public/optimazed/example.com/index.html.tmp’ saved [1256/1256]
    
    Removing /var/www/optimazer/public/optimazed/example.com/index.html.tmp since it should be rejected.
    