I want to process a POST request that sends a file. However, the CGI script, written in bash on an Apache server, fails to upload the file to the server. I was able to pin the error down to /dev/stdin not working as it's supposed to: instead of writing the binary stream out, it throws an error. I'm working with the lightweight open-source content management system Lichen, which is where I originally encountered this problem.
In the following, I will first simplify the problem and show all the things I did wrong while trying to upload a file from an HTML front-end with a CGI back-end written in shell script.
Simplifying the problem
The html post form to upload files looks like this:
<form action="http://guestserver/cgi-bin/upload.cgi" method="post" enctype="multipart/form-data">
<p><input type="file" name="filename" id="file"></p>
<p><input type="submit" value="Upload"></p>
</form>
It sends its request to upload.cgi:
#!/bin/sh
# Exit immediately if a command exits with a non-zero status.
set -e
# replace spaces with underscores
# which is done by tr "thisGetsReplaced" "byThis" ;
# -s means squeeze repeated occurrences
# echo $PATH_INFO |
# gets filename and passes output to next (by '|')
sanitized=$(echo $PATH_INFO | tr -s ' ' '_')
# move one dir up and look if file exists there
if [ -f ..$sanitized ]; then
cat /dev/stdin > /dev/null
echo 'Status: 409 Conflict'
echo 'Content-Type: text/plain'
echo ''
echo 'File already exists.'
exit 0
fi
# Actual file write
mkdir -p ..$(dirname $sanitized)
cat /dev/stdin > ..$sanitized # line that throws error
# I guess if file write at this point was successful
# it exits with something non-zero
# so the script is STOPPED
echo 'Status: 204 No Content'
echo "X-File-Name: $(basename $sanitized)"
echo "X-File-Path: $(dirname $sanitized)"
echo ''
However, $PATH_INFO won't hold the name of the file being uploaded, as we will see later, so the upload fails?
In the production environment, on the other hand, a new file with the correct filename is created in the correct directory, but the file is empty. 😮
I wonder how that happened.
Here is a test.cgi script to check whether all data in the test environment is sent correctly and to prove that PATH_INFO is empty:
#!/bin/sh
echo "Content-Type: text/html"
echo "<html><head></head><body>"
echo SERVER_<?> = $SERVER_<?> # just so I don't have to write so much
echo PATH_INFO = $PATH_INFO
dd count=1 bs=$CONTENT_LENGTH # prints file content
echo "</body></html>"
It produces the following output when the client runs it through the HTML file POST form:
> SERVER_SOFTWARE = Apache/2.4.55 (Unix)
> GATEWAY_INTERFACE = CGI/1.1
> SERVER_PROTOCOL = HTTP/1.1
> SERVER_PORT = 80
> SERVER_PROTOCOL = HTTP/1.1
> SERVER_PORT = 80
> REQUEST_METHOD = POST
> HTTP_ACCEPT =
> PATH_INFO =
> ------WebKitFormBoundaryMNBsYvUe3DbH9tpE Content-Disposition: form-data; name="filename"; filename="uploaded file.jpg" Content-Type: image/jpeg ÿØÿàJFIFÿÛC %# , #&')*)-0-(0%()(ÿÀ,àÿÄÿÄ; !"#$312%4CB“5ADQRcdƒabe„”•ÿÚ?•ñ£Ö˜þFßE%ò}õ[/ì³è1Æ'¬YÇçªÙæÞõÑTÚuJn4îÝ)ÎV“¦9îª ©“1í”»ge¢R…Z¿MÑŽ¼ÜÃÛ—d´¯±¦#ø4¦‚ðœDÐŽæ…c4û°e¥4ê×1žOO qu»Ö:ûïAB¬?ÙܶbZÎf³ª‹¹yçDÖÒáSªµù¦
The last bit is intentionally cut off so as not to show 80 kB of file; it is what the command dd count=1 bs=$CONTENT_LENGTH prints. It does a great job of printing out the content of the uploaded file (just not correctly encoded), proving that it somehow works. However, the file content is never saved on the server.
With that we were able to confirm that the upload.cgi script receives the file in our test environment, although not the PATH_INFO, and thus the CGI script fails?
Also, this is Apache's error message: Premature end of script headers: upload.cgi, and the Apache error log at /var/log/httpd/error_log shows the following errors:
> dirname: missing operand
> Try 'dirname --help' for more information.
> /srv/http/cgi-bin/upload.cgi: line 20: ..: Is a directory
> [Mon Mar 06 12:29:04.166828 2023] [cgid:error] [pid 297:tid 140340891719360] [client 192.168.56.1:63168] Premature end of script headers: upload.cgi, referer: http://localhost:5500/
This points towards line 20 in the upload.cgi script (for full context see the script above), although I guess it actually means line 19:
mkdir -p ..$(dirname $sanitized)
where dirname fails on its argument, since $sanitized:
sanitized=$(echo $PATH_INFO | tr -s ' ' '_')
is actually an empty string, because $PATH_INFO holds no value, as we have seen.
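Both log messages can be reproduced outside of Apache. The following is a minimal sketch, assuming only that $sanitized ends up empty as described above. It suggests that the "missing operand" message comes from the unquoted dirname call, while the "Is a directory" message comes from a redirect whose target collapses to "..".

```shell
#!/bin/sh
# Reproduce both log messages with an empty value, as when PATH_INFO is unset.
sanitized=''

# Unquoted, the empty variable vanishes and dirname receives no operand.
if ! dir=$(dirname $sanitized 2>&1); then
    echo "dirname failed: $dir"
fi

# The redirect target collapses to "..", which is a directory.
# The subshell keeps a failed redirect from aborting this script.
if ! ( : > ..$sanitized ) 2>/dev/null; then
    echo "redirect to '..$sanitized' failed"
fi
```

Because upload.cgi runs with set -e, the failed redirect ends the script before any header is printed, which would match the "Premature end of script headers" entry.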
I greatly appreciate any help and will be very happy if there is a solution to this problem. 🙂 The goal is to have the file correctly uploaded on the server.
2 Answers
If you're here to fix your Lichen upload.cgi script, jump to the last part, where we fix /dev/stdin. The first two parts solve errors I introduced myself while searching for the real error (last part).
PATH_INFO
Stay calm, everything is working as it's supposed to.
Your first mistake is that PATH_INFO takes its value from the URL, so your POST request has to include the filename in the URL, for example http://server/cgi-bin/test.cgi/thisWillBePassedTo-PATH_INFO.file.
So your simplified problem is wrong. That explains why you could still upload files in your production environment, but without any content (we'll look at that in the third part): the name was transferred correctly in PATH_INFO, which caused a lot of confusion on your side.
Next time have a look into the docs.
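To make the role of PATH_INFO concrete, here is a small sketch that sets the variable by hand, the way the server would for a request to upload.cgi/uploads/my file.jpg, and applies the same sanitizing step as upload.cgi. The function name sanitize_path and the example path are assumptions for illustration only.

```shell
#!/bin/sh
# Hypothetical helper: mirrors the sanitizing step from upload.cgi.
# tr replaces spaces with underscores and squeezes repeated underscores.
sanitize_path() {
    printf '%s\n' "$1" | tr -s ' ' '_'
}

# For a request to /cgi-bin/upload.cgi/uploads/my file.jpg the server
# sets PATH_INFO to the part of the URL after the script name.
PATH_INFO='/uploads/my file.jpg'
sanitized=$(sanitize_path "$PATH_INFO")

echo "sanitized = $sanitized"
echo "dirname   = $(dirname "$sanitized")"
echo "basename  = $(basename "$sanitized")"
```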
Setting the header correctly
On the error:
Premature end of script headers
Although the internet offers dozens of ways of setting the header of a CGI script, a long trial-and-error process showed that the correct header for just sending text looks like this:
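A minimal sketch of such a response, assuming a plain-text body (the helper name send_text is hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: print a minimal CGI response with a plain-text body.
send_text() {
    echo 'Content-Type: text/plain'
    echo ''               # empty line: marks the end of the headers
    echo "$1"
}

send_text 'Hello from CGI'
```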
Whether you use upper or lower case letters is not important; however, the empty echo '' (which outputs a line break) is very important!
Additionally, you can also send status headers, which look different (you can see them in the upload.cgi script).
cat: /dev/stdin: No such device or address
I was trying to run the static site content management system (CMS) Lichen when this error occurred: cat: /dev/stdin: No such device or address.
I assume the server environment causes this error (I'm running Arch Linux).
To fix this, I had to use a different command to read from stdin, namely dd. Replacing cat with dd in two lines of the upload.cgi script did it for me.
Be aware that the POST request coming from your front-end should send only the body of the file; otherwise the CGI script processing the data stream will not save the file correctly. See the Improvement section of this post for more.
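The two-line change described above might look like the sketch below, where dd of=... takes the place of cat /dev/stdin > .... The helper name save_body is hypothetical. One plausible explanation, not confirmed in this thread: under mod_cgid the script's standard input is a socket, and on Linux a socket cannot be reopened by path through /dev/stdin, whereas dd reads the already open file descriptor.

```shell
#!/bin/sh
# Hypothetical helper: write everything arriving on standard input to a file.
# dd reads file descriptor 0 directly instead of opening /dev/stdin by path.
# Its transfer statistics go to stderr, which is discarded here.
save_body() {
    dd of="$1" 2>/dev/null
}

# Demo with a pipe standing in for the request body.
tmp=$(mktemp)
printf 'example body' | save_body "$tmp"
wc -c < "$tmp"
rm -f "$tmp"
```

In upload.cgi the same idea would apply to both places that read the request: dd of=/dev/null to discard the body in the conflict branch, and dd of=..$sanitized for the actual write.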
A very simple showcase of using a CGI shell script to upload files
What's CGI? CGI (Common Gateway Interface) defines a way for a web server to interact with external content-generating programs, which are often referred to as CGI programs or CGI scripts. It is a simple way to put dynamic content on your web site, using whatever programming language you're most familiar with. (Copied from the Apache tutorials.)
CGI scripts usually live in their own cgi-bin directory. Both the scripts and the folder they are in need execute permission (which can be set with chmod o=rwx /path/to/folder-and-or-cgiFile, and is often a source of errors). Apache also has to be configured extensively; read the Apache tutorial mentioned above to learn more, and don't forget to restart Apache, otherwise your changes won't have any effect.
Server-Side
In our case we will be using shell script, which is preinstalled on GNU/Linux machines. I used GNU bash, version 5.1.16, which can be invoked on the command prompt by typing sh.
Creating a CGI script: at the beginning we define what programming language we use to execute the script.
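As a rough sketch of how the server-side pieces could fit together, the function below starts from the #!/bin/sh line and then sanitizes the name, refuses to overwrite existing files, and writes the body with dd. The function name handle_upload, its arguments, and the 400 response for empty or ".." paths are assumptions added for illustration; they are not taken from the Lichen source.

```shell
#!/bin/sh
# Hypothetical upload handler: $1 is the directory to write below,
# $2 is the value of PATH_INFO. The request body arrives on stdin.
handle_upload() {
    root=$1
    # Replace spaces with underscores, as the original script does.
    target=$(printf '%s\n' "$2" | tr -s ' ' '_')

    case $target in
        ''|*..*)
            # Assumption: reject empty names and paths containing "..".
            dd of=/dev/null 2>/dev/null
            printf 'Status: 400 Bad Request\nContent-Type: text/plain\n\n'
            echo 'Missing or invalid file name.'
            return 0
            ;;
    esac

    if [ -f "$root$target" ]; then
        dd of=/dev/null 2>/dev/null      # discard the request body
        printf 'Status: 409 Conflict\nContent-Type: text/plain\n\n'
        echo 'File already exists.'
        return 0
    fi

    # Actual file write.
    mkdir -p "$root$(dirname "$target")"
    dd of="$root$target" 2>/dev/null

    echo 'Status: 204 No Content'
    echo "X-File-Name: $(basename "$target")"
    echo "X-File-Path: $(dirname "$target")"
    echo ''
}

# In a CGI script the call would be, for example:
#   handle_upload .. "$PATH_INFO"
```

Quoting every expansion keeps an empty PATH_INFO from silently turning into a missing argument.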
Tip: analyse Apache's error log if things aren't working:
cat /var/log/httpd/error_log
Client-Side
Next, we'll need a front-end for trying things out. This is what our HTML part looks like:
Explanation
The additional JavaScript in our HTML file is needed to provide the filename, which gets appended to the URL and is then accessed in the back-end through $PATH_INFO. There it is sanitized to remove any blank spaces, and a directory just outside the cgi-bin is created, where either dd or cat writes out the data stream coming from /dev/stdin.
Our front-end also sends some headers with the file, which have to be removed. The CGI script removes the first four lines, which is a very hacky solution. I advise doing some more reading on that, and I'm not sure how stable the example is in that respect. The original source code avoids the problem by sending only the body, which is done with some JavaScript (as seen in this post).
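The line-stripping step described above can be sketched as follows. It assumes a multipart body with exactly one file field, so that the part headers occupy the first four lines and the closing boundary is the last line. The function name strip_multipart is hypothetical. Note that the CRLF in front of the closing boundary stays at the end of the output, which is one reason this approach is fragile for binary files.

```shell
#!/bin/sh
# Hypothetical helper: drop the four header lines of the first part
# (boundary, Content-Disposition, Content-Type, empty line) and the
# closing boundary line at the end.
strip_multipart() {
    tail -n +5 | sed '$d'
}

# Demo with a hand-built multipart body; od shows the leftover bytes.
{
    printf '%s\r\n' '------boundary'
    printf '%s\r\n' 'Content-Disposition: form-data; name="filename"; filename="a.txt"'
    printf '%s\r\n' 'Content-Type: text/plain'
    printf '\r\n'
    printf 'hello\r\n'
    printf '%s\r\n' '------boundary--'
} | strip_multipart | od -c
```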
The script is based on the CMS Lichen source code; however, I had to change it to use dd, otherwise it failed. You may have to try out which one works for you.
Improvement
This is how you can send only the file body using JavaScript, making the part of our CGI script that removes the first four lines and the last line of the file redundant:
Don’t forget to change the URL.
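To exercise the endpoint without a browser, the same kind of request (raw file body, file name in the URL) can be sketched with curl. This is only a test aid under assumptions: the helper names upload_url and upload_raw are hypothetical, the server name is the one from the form in the question, and only spaces in the file name are percent-encoded.

```shell
#!/bin/sh
# Hypothetical helper: build the upload URL for a local file.
# Spaces are percent-encoded; other special characters are not handled.
upload_url() {
    name=$(basename "$2" | sed 's/ /%20/g')
    printf '%s/%s\n' "$1" "$name"
}

# Hypothetical helper: POST the raw file body, without a multipart wrapper.
upload_raw() {
    curl --silent --show-error --include \
         --header 'Content-Type: application/octet-stream' \
         --data-binary "@$2" \
         "$(upload_url "$1" "$2")"
}

# Example call:
#   upload_raw 'http://guestserver/cgi-bin/upload.cgi' './uploaded file.jpg'
```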
The uploadFile function is invoked by an event listener, which has to be defined after the HTML element
<input type="file" id="file">
I'm sure one can further improve this script by getting rid of the complexity that the URL editing and the removal of file headers cause. Please let me know how you have improved the script.
And be aware that the upload.cgi script should at least be protected through .htaccess. I know little about cyber security!