
In case it is relevant: I have GNU awk 3.1.6, downloaded directly from the SourceForge source that GNU points to.

I am fetching a page of URLs using wget for Windows. After processing the incoming file, I reduce it to a single line, from which I have to extract a key value, which is quite a long string. The final line looks something like this:

<ENUM_TAG>content"href:e@5nUtw3Fc^b=tZjqpszvja$sb=Lp4YGH=+J_XuupctY9zE9=&KNWbphdFnM3=x4*A@a=W4YXZKV3TMSseQx66AHz9MBwdxY@B#&57t3%s6ZyQz3!aktRNzcWeUm*8^$B6L&rs5X%H3C3UT&BhnhXgAXnKZ7f2Luy*jYjRLLwn$P29WzuVzKVnd3nVc2AKRFRPb79gQ$w$Nea6cA!A5dGRQ6q+L7QxzCM%XcVaap-ezduw?W@YSz!^7SwwkKc"</ENUM_TAG>

I need the long string between the two " signs.

So I use this construct with awk:

type processedFile | awk -F """ "{print $2}"

and I get the output as expected

href:e@5nUtw3Fc^b=tZjqpszvja$sb=Lp4YGH=+J_XuupctY9zE9=&KNWbphdFnM3=x4*A@a=W4YXZKV3TMSseQx66AHz9MBwdxY@B#&57t3%s6ZyQz3!aktRNzcWeUm*8^$B6L&rs5X%H3C3UT&BhnhXgAXnKZ7f2Luy*jYjRLLwn$P29WzuVzKVnd3nVc2AKRFRPb79gQ$w$Nea6cA!A5dGRQ6q+L7QxzCM%XcVaap-ezduw?W@YSz!^7SwwkKc

but when I run the same command with output redirected to a file, such as

type processedFile | awk -F """ "{print $2}" > tempDummy

I get this error message:

awk: cmd. line:1: fatal: cannot open file `>' for reading (Invalid argument)

I suspect the " field separator is causing me some grief, with the last " character being treated as the start of an unclosed string, but I am not sure how to fix it. The same construct runs perfectly well on my CentOS box, by the way.

Any pointers are greatly appreciated. I tried reading all the readme files I could find, but none of them covers output redirection.

2 Answers


  1. You were close. The issue here is that you are mixing up awk's redirection with cmd's.

    For completeness' sake, I'm using the MSYS2 awk version (the version should not matter for this issue):

    awk --version
    GNU Awk 4.2.1, API: 2.0 (GNU MPFR 4.0.1, GNU MP 6.1.2)
    

    The Windows version is irrelevant in this case; this works on both Win7 and Win10.

    Your command:

    type processedFile | awk -F """ "{print $2}" > tempDummy
    

    uses >, which you expect to be a cmd.exe redirection, but awk receives it as an argument and tries to open it as an input file, so you get the error: awk: cmd. line:1: fatal: cannot open file `>' for reading (Invalid argument)

    1) Fixing the redirection

    You can fix that by doing the redirection directly at awk:

    type processedFile | awk -F """ "{ print $2 > "tempDummy"; }"
    

    2) Using awk to read the file

    The type command is superfluous here, as awk can read the file directly:

    awk -F """ "{ print $2 > "tempDummy"; }" processedFile
    

    Note: GNU utilities are case-sensitive, whereas the default Windows filesystem settings are case-insensitive.

  2. Yes, the problem lies in how the cmd parser decides where quoted areas start and end. What cmd sees is

    awk -F """ "{print $2}" > tempDummy
           ^-^^-^          ^-------------
           1  2            3
    

    that is, three quoted areas. As the > falls inside a quoted area, it is not handled as a
    redirection operator; it is an argument to the command on the right side of the pipe.

    This can be solved by escaping one quote (^ is cmd's general escape character), so that cmd generates the final command properly after parsing the line and the redirection is not part of the awk command

    type processedFile | awk -F ^""" "{print $2}" > tempDummy
                                   ^^ ^..........^
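
To make the quoted areas easier to check, here is a minimal Python sketch of the rule described above. It is only a simplified model and not cmd's real parser: it assumes that every unescaped double quote toggles the quoted state and that an unquoted ^ makes the next character literal. The function names quoted_flags and redirection_is_live are made up for this illustration.

```python
def quoted_flags(line):
    """Return one boolean per character: True if it lies in a quoted area.

    Simplified model (an assumption, not cmd's full parser): each unescaped
    double quote toggles the quoted state, and outside quotes a caret (^)
    makes the following character literal.
    """
    flags = []
    in_quotes = False
    escaped = False
    for ch in line:
        if escaped:
            # Character right after an unquoted caret: literal, no toggle.
            flags.append(in_quotes)
            escaped = False
        elif ch == '"':
            in_quotes = not in_quotes
            flags.append(True)  # the quote itself is part of the quoted area
        elif ch == "^" and not in_quotes:
            escaped = True
            flags.append(False)
        else:
            flags.append(in_quotes)
    return flags


def redirection_is_live(line):
    """True if the first '>' sits outside every quoted area."""
    return not quoted_flags(line)[line.index(">")]


original = 'awk -F """ "{print $2}" > tempDummy'
escaped_quote = 'awk -F ^""" "{print $2}" > tempDummy'

print(redirection_is_live(original))       # False: '>' is inside a quoted area
print(redirection_is_live(escaped_quote))  # True: '>' is seen as a redirection
```

Under this model the original command leaves the > inside the third quoted area, while the variant with the escaped quote leaves it outside.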
    

    Or you can reorder the command to place the redirection where it cannot interfere

    type processedFile | > tempDummy awk -F """ "{print $2}"
    

    This works, but the approach may fail later in other cases because the awk code ({print $2}) ends up in an unquoted area.

    There is a simpler, standard, portable way of doing it without having to deal with quote escaping: instead of passing the quote character as an argument, use awk's string handling and write the quote as an escape sequence

    type processedFile | awk -F "\x22" "{print $2}" > tempDummy
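
As a quick sanity check of what that escape sequence stands for, here is a small Python sketch. The sample line is a shortened, made-up stand-in for the real one, and Python's split is used only as a rough analogue of awk's single-character field splitting.

```python
# Shortened stand-in for the processed line; the real key is much longer.
line = '<ENUM_TAG>content"href:abc123"</ENUM_TAG>'

# "\x22" is the character with code 0x22, which is the double quote.
separator = "\x22"

# Split on the quote character, as awk does with a single-character -F.
fields = line.split(separator)
key = fields[1]  # awk's $2 is the second field, which is index 1 here

print(key)
```

Hex 22 is the code of the double quote, so no literal quote character has to appear on the command line.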
    