I am working on a Bash scripting project in which I need to delete one of two files if they have identical content. I should delete the one that comes last in an alphabetical sort. In the example output my professor has provided, apple.dat is deleted when the choices are apple.dat and Apple.dat.
if [[ "apple" > "Apple" ]]; then
echo apple
else
echo Apple
fi
prints Apple
echo $(echo -e "Apple\napple" | sort | tail -n1)
prints Apple
The ASCII value of a is 97 and that of A is 65, so why is the test saying A is greater?
The weird thing is that I get the opposite result with the older syntax:
if [ "apple" \> "Apple" ]; then
echo apple
else
echo Apple
fi
prints apple
and if we try to use the \> in the [[ ]] syntax, it is a syntax error.
How can we correct this for the double bracket syntax? I have tested this on the school Debian server, my local machine, and my Digital Ocean droplet server. On my local Ubuntu 20.04 and on the school server I get the output described above. Interestingly, on my Digital Ocean droplet which is an Ubuntu 20.04 server, I get "apple" with both double and single bracket syntax. We are allowed to use either syntax, double bracket or the single bracket actual test call, however I prefer using the newer double bracket syntax and would rather learn how to make this work than to convert my mostly finished script to the older more POSIX compliant syntax.
3 Answers
I have come up with my own solution to the problem, however I must first thank @GordonDavisson and @LéaGris for their help and for what I have learned from them as that is invaluable to me.
Whether a computer or a human collation is used, if apple comes after Apple in an alphabetical sort, then it also comes after Banana; and if Banana comes after apple, then Apple comes after apple. So I have come up with the following:
prints:
This works by using process substitution to redirect the output into the while loop, which reads one character at a time, and then using printf to get the decimal ASCII value of each character. It is like creating a temporary file from the string, which is destroyed automatically, and then reading it one character at a time. The -n option stops echo from appending a trailing \n (newline) character, so the loop does not read an extra character.
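The mechanics described here can be sketched as follows. This is only a minimal illustration, not the original script: the function name char_codes is hypothetical, and the sketch assumes single-byte ASCII input.

```bash
#!/usr/bin/env bash

# char_codes (hypothetical name): print the decimal code of each character
# of its first argument, separated by single spaces.
char_codes() {
    local ch codes=()
    # Process substitution feeds the string to the loop like a temporary file.
    # IFS= and -r keep spaces and backslashes intact; -n1 reads one character.
    while IFS= read -r -n1 ch; do
        # A leading single quote makes printf output the character's code.
        codes+=("$(printf '%d' "'$ch")")
    done < <(echo -n "$1")
    echo "${codes[*]}"
}

char_codes "Apple"   # prints: 65 112 112 108 101
char_codes "apple"   # prints: 97 112 112 108 101
```

The first code differs (65 versus 97), which is the byte-value difference between A and a mentioned in the question.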
From bash man pages:
From a Stack Overflow post about printf:
Note: process substitution is not POSIX compliant, but it is supported by Bash in the way stated in the bash man page.
UPDATE: The above does not work in all cases!
The above solution works in many cases; however, we get some anomalies.
correct
correct
incorrect
correct
incorrect
The following solution gets the results that are needed:
prints:
Hints:
but:
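The difference is that the Bash-specific test [[ ]] uses the locale's collation rules to compare strings, whereas the POSIX test [ ] compares by ASCII value. The bash man page states this as well: when used with [[, the < and > operators sort lexicographically using the current locale, while the test command sorts using ASCII ordering.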
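One way to get byte-value ordering while keeping the [[ ]] syntax is to run the comparison under the C locale. The sketch below assumes that byte-value ordering is what the assignment wants; the function name sorts_after is hypothetical.

```bash
#!/usr/bin/env bash

# sorts_after (hypothetical name): succeed if $1 sorts after $2 by byte value.
# The function body is a subshell, so the locale change does not leak out.
sorts_after() (
    LC_ALL=C
    [[ "$1" > "$2" ]]
)

if sorts_after "apple" "Apple"; then
    echo apple
else
    echo Apple
fi
# prints: apple

# The same setting makes sort order its input by byte value.
printf 'Apple\napple\n' | LC_ALL=C sort | tail -n1
# prints: apple
```

LC_ALL is used here because it overrides both LC_COLLATE and LANG, so the result should not depend on how the machine is configured.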
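Changing the operator to -gt does not fix this.
if [[ "Apple" -gt "apple" ]]
runs without a syntax error, but -gt is an integer comparison, not a string comparison. Inside [[ ]] both operands are evaluated as arithmetic expressions, so the unset names Apple and apple both evaluate to 0 and the test is always false, whatever the strings are.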