If relevant, I have GNU awk 3.1.6, downloaded directly from the source that GNU points to on SourceForge.
I am getting a page of URLs using wget for Windows. After processing the incoming file, I reduce it to a single line, from which I have to extract a key value, which is quite a long string. The final line looks something like this:
<ENUM_TAG>content"href:e@5nUtw3Fc^b=tZjqpszvja$sb=Lp4YGH=+J_XuupctY9zE9=&KNWbphdFnM3=x4*A@a=W4YXZKV3TMSseQx66AHz9MBwdxY@B#&57t3%s6ZyQz3!aktRNzcWeUm*8^$B6L&rs5X%H3C3UT&BhnhXgAXnKZ7f2Luy*jYjRLLwn$P29WzuVzKVnd3nVc2AKRFRPb79gQ$w$Nea6cA!A5dGRQ6q+L7QxzCM%XcVaap-ezduw?W@YSz!^7SwwkKc"</ENUM_TAG>
I need the long string between the two " signs.
So I use this construct with awk
type processedFile | awk -F """ "{print $2}"
and I get the output as expected
href:e@5nUtw3Fc^b=tZjqpszvja$sb=Lp4YGH=+J_XuupctY9zE9=&KNWbphdFnM3=x4*A@a=W4YXZKV3TMSseQx66AHz9MBwdxY@B#&57t3%s6ZyQz3!aktRNzcWeUm*8^$B6L&rs5X%H3C3UT&BhnhXgAXnKZ7f2Luy*jYjRLLwn$P29WzuVzKVnd3nVc2AKRFRPb79gQ$w$Nea6cA!A5dGRQ6q+L7QxzCM%XcVaap-ezduw?W@YSz!^7SwwkKc
but when I run the same command with output redirected to a file, such as
type processedFile | awk -F """ "{print $2}" > tempDummy
I get this error message:
awk: cmd. line:1: fatal: cannot open file `>' for reading (Invalid argument)
I think the " field separator is causing me some grief, with the last " character being treated as the start of an unclosed string, but I am not sure how to fix this. The same construct runs perfectly well on my CentOS box, by the way.
Any pointers are greatly appreciated. I tried reading all the readme files I could find, but none of them covers output redirection.
2 Answers
You were close. The issue here is that you are mixing awk redirection with the cmd one.
For completeness' sake, I'm using the MSYS2 awk version (the version should not matter for this issue). The Windows version is irrelevant in this case; this will work on both Win7 and Win10.
Your command uses >, which you expect to be a cmd.exe redirection, but awk receives it as an argument and treats it as the name of an input file, thus you get the error:
awk: cmd. line:1: fatal: cannot open file `>' for reading (Invalid argument)
1) Fixing the redirection
You can fix that by doing the redirection directly in awk.
2) Using awk to read the file
The type command is superfluous here, as you can use awk directly to read the file.
Note: GNU utilities are case-sensitive, but the default filesystem setting on Windows is case-insensitive.
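As a minimal sketch of both fixes combined, the following is written for a POSIX shell (such as the CentOS box mentioned in the question), where single quotes sidestep the cmd quoting problem. The sample line is a shortened, hypothetical stand-in for the contents of processedFile. awk reads the file itself, and the > inside the braces is awk's own output redirection rather than the shell's. Under cmd.exe the quoting would need to be adapted.

```shell
# Hypothetical, shortened stand-in for the real processedFile contents
printf '%s\n' '<ENUM_TAG>content"href:e@5nUtw3Fc^b=tZjqpszvja$sb"</ENUM_TAG>' > processedFile

# awk opens processedFile itself (no "type" or pipe needed) and writes
# field 2 to tempDummy using awk's own print > "file" redirection
awk -F '"' '{ print $2 > "tempDummy" }' processedFile

# Show what was written
cat tempDummy
```

Because the file name tempDummy is a string inside the awk program, the command-line parser never has to decide whether > is a redirection operator.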
Yes, you have problems with how the cmd parser decides where quoted areas start and end. What cmd sees is three quoted areas. As the > falls inside a quoted area, it is not handled as a redirection operator; it is an argument to the command on the right side of the pipe.
This can be solved by just escaping a quote (^ is cmd's general escape character) to ensure that cmd properly generates the final command after parsing the line and that the redirection is not part of the awk command.
Or you can reorder the command to place the redirection operation where it could not interfere, but while this works, this approach may later fail in other cases because the awk code ({print $2}) is placed in an unquoted area.
There is a simpler, standard, portable way of doing it without having to deal with quote escaping: instead of passing the quote as an argument, it is better to use awk's string handling and just include the escape sequence of the quote character.
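One way to read this suggestion, sketched here under the assumption that the octal escape \042 (the double-quote character) is passed as the field separator. awk processes escape sequences in the -F value, so no literal quote has to appear on the command line. The block uses POSIX shell quoting and a shortened, hypothetical sample line; under cmd.exe the same idea would presumably be written with double quotes around \042 and around the program, which leaves the quotes balanced for cmd.

```shell
# Hypothetical, shortened stand-in for the real processedFile contents
printf '%s\n' '<ENUM_TAG>content"href:e@5nUtw3Fc^b=tZjqpszvja$sb"</ENUM_TAG>' > processedFile

# \042 is the octal escape for the double quote; awk expands it when it
# sets the field separator, so the command line contains no bare quote.
# Presumed cmd.exe form: awk -F "\042" "{print $2}" processedFile > tempDummy
awk -F '\042' '{ print $2 }' processedFile > tempDummy

# Show what was written
cat tempDummy
```

GNU awk also accepts the hexadecimal form \x22, but the octal form is the more widely supported of the two.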