
I have this awk command:

echo www.host.com |awk -F. '{$1="";OFS="." ; print $0}' | sed 's/^.//'

which extracts the domain from the hostname:

host.com

That command works on CentOS 7 (awk 4.0.2), but it does not work on Ubuntu 19.04 (awk 4.2.1) or Alpine (gawk 5.0.1), where the output is:

host com

How can I fix that awk expression so it works in recent awk versions?

4 Answers


  1. For the sample you provided, you could try the following. It matches a regex from the very first . to the end of the line, then prints everything after that first dot.

    echo www.host.com | awk 'match($0,/\..*/){print substr($0,RSTART+1,RLENGTH-1)}'
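
    To make the roles of RSTART and RLENGTH concrete, here is a small sketch. The helper name show_match is made up for illustration, and the dot in the regex is escaped as \. so that it matches a literal dot rather than any character. It prints the start of the match, its length, and the text after the first dot.

```shell
# Hypothetical helper: report where match() finds the first literal dot.
show_match() {
    printf '%s\n' "$1" | awk 'match($0, /\..*/) {
        # RSTART is the position of the first dot, RLENGTH the length
        # of the match from that dot to the end of the line.
        print RSTART, RLENGTH, substr($0, RSTART + 1, RLENGTH - 1)
    }'
}

show_match www.host.com    # 4 9 host.com
```

    A line without any dot does not match, so nothing is printed for it.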
    


    OP’s code fix: If you want to keep your own approach, the following may help. There are two points here. First, no other command is needed alongside awk. Second, FS and OFS should be set once in the BEGIN section rather than on every line, as your code does.

    echo www.host.com | awk 'BEGIN{FS=OFS="."} {$1="";sub(/\./,"");print}'
    
  2. To get the domain, use:

    $ echo www.host.com | awk 'BEGIN{FS=OFS="."}{print $(NF-1),$NF}'
    host.com
    

    Explained:

    awk '
    BEGIN {                 # before processing the data
        FS=OFS="."          # set input and output delimiters to .
    }
    {
        print $(NF-1),$NF   # then print the next-to-last and last fields
    }'
    

    It also works if you have arbitrarily long fqdns:

    $ echo if.you.have.arbitrarily.long.fqdns.example.com |
    awk 'BEGIN{FS=OFS="."}{print $(NF-1),$NF}'
    example.com
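
    One edge case may be worth guarding against: for a name without any dot, NF is 1, so $(NF-1) is $0 and the one-liner prints the name twice (localhost.localhost). A minimal sketch of a guarded variant follows; the function name last_two_labels is an assumption for illustration.

```shell
# Hypothetical helper: print the last two labels, or the whole name
# when it has fewer than two labels.
last_two_labels() {
    printf '%s\n' "$1" | awk 'BEGIN { FS = OFS = "." }
        NF >= 2 { print $(NF-1), $NF; next }   # normal case
        { print }                              # single label: print as is
    '
}

last_two_labels www.host.com    # host.com
last_two_labels localhost       # localhost
```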
    

    And yeah, funny, your version really works with 4.0.2. And awk version 20121220.

    Update:

    Updated with some content checking features, see comments. Are there domains that go higher than three levels?:

    $ echo and.with.peculiar.fqdns.like.co.uk | 
    awk '
    BEGIN {
        FS=OFS="."
        pecs["co\034uk"]    # "\034" is the default SUBSEP, so this is the key for ("co","uk")
    }
    {
        print (($(NF-1),$NF) in pecs?$(NF-2) OFS:"")$(NF-1),$NF
    }'
    like.co.uk
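
    The lookup above relies on awk joining the parts of a multi-part subscript with SUBSEP. A sketch of the same idea, written with the subscript form pecs["co", "uk"] and a guard for short names, might look like this. The function name registrable_domain is made up, and the list of special suffixes contains only co.uk as in the example; a real list would be much longer.

```shell
# Hypothetical helper: keep three labels when the last two form a
# known "peculiar" suffix, otherwise keep two.
registrable_domain() {
    printf '%s\n' "$1" | awk '
    BEGIN {
        FS = OFS = "."
        pecs["co", "uk"]            # referencing the element creates it
    }
    NF {
        n = (($(NF-1), $NF) in pecs) ? 3 : 2
        if (NF < n) n = NF          # do not reach before the first field
        out = $(NF - n + 1)
        for (i = NF - n + 2; i <= NF; i++) out = out OFS $i
        print out
    }'
}

registrable_domain www.host.com                          # host.com
registrable_domain and.with.peculiar.fqdns.like.co.uk    # like.co.uk
```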
    
  3. You got two very good answers using awk, but I believe this is better handled with cut, because of how simply it returns all fields starting from a known position:

    echo 'www.host.com' | cut -d. -f2-
    

    host.com
    

    Options used are:

    • -d.: Set delimiter as .
    • -f2-: Extract all the fields starting from position 2
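
    One behaviour to be aware of: cut passes a line through unchanged when it contains no delimiter at all, unless -s is given. The sketch below shows both variants; the function names are made up for illustration.

```shell
# Hypothetical helpers wrapping the cut call from this answer.
strip_first_label() {
    printf '%s\n' "$1" | cut -d. -f2-
}

strip_first_label_strict() {
    # -s suppresses lines that contain no delimiter at all
    printf '%s\n' "$1" | cut -s -d. -f2-
}

strip_first_label www.host.com        # host.com
strip_first_label localhost           # localhost (no dot, passed through)
strip_first_label_strict localhost    # prints nothing
```
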
  4. What you are observing was a bug in GNU awk which was fixed in release 4.2.1. The changelog states:

    2014-08-12 Arnold D. Robbins

    OFS being set should rebuild $0 using previous OFS if $0 needs to be
    rebuilt. Thanks to Mike Brennan for pointing this out.

    • awk.h (rebuild_record): Declare.
    • eval.c (set_OFS): If not being called from var_init(), check if $0 needs rebuilding. If so, parse the record fully and rebuild it. Make OFS point to a separate copy of the new OFS for next time, since OFS_node->var_value->stptr was
      already updated at this point.

    • field.c (rebuild_record): Is now extern instead of static. Use OFS and OFSlen instead of the value of OFS_node.

    When reading the code in the OP, it states:

    awk -F. '{$1="";OFS="." ; print $0}'
    

    which, according to POSIX does the following:

    1. -F.: set the field separator FS to represent the <dot>-character
    2. read a record
    3. Perform field splitting with FS="."
    4. $1="": redefine field 1 and rebuild record $0 using OFS. At this time, OFS is still a single space. If the record $0 was www.foo.com, it now reads _foo_com (underscores represent spaces). The fields are not re-split, so NF stays at 3.
    5. OFS=".": redefine the output field separator OFS to be the <dot>-character. This is where the bug happens. GNU awk knew that a rebuild needed to happen, but performed it with the new OFS instead of the old one.
    6. print $0: print the record $0, which is now _foo_com.
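
    The sequence in the list can be traced with a small sketch that prints $0 after each statement. The function name trace_rebuild and the labels are made up. Because $0 is read right after the assignment to $1, this sketch does not reproduce the old gawk behaviour; it only illustrates the order of events described above.

```shell
# Hypothetical helper: show $0 after each statement of the OP's program.
trace_rebuild() {
    printf '%s\n' "$1" | awk -F. '{
        print "split  [" $0 "]"   # record as read
        $1 = ""                   # $0 is rebuilt with the current OFS (a space)
        print "assign [" $0 "]"
        OFS = "."                 # changing OFS does not touch the rebuilt $0
        print "ofs    [" $0 "]"
    }'
}

trace_rebuild www.foo.com
# split  [www.foo.com]
# assign [ foo com]
# ofs    [ foo com]
```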

    The minimal change to your program would be:

    awk -F. '{OFS="."; $1=""; print $0}'
    

    The clean change would be:

    awk 'BEGIN{FS=OFS="."}{$1="";print $0}'
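
    Note that both variants still print a leading dot (.host.com), because the emptied first field is followed by OFS. The sed step from the question, or an equivalent inside awk, is still needed. A sketch of the latter, with the made-up function name strip_host:

```shell
# Hypothetical helper: empty the first field, then drop the leading dot.
strip_host() {
    printf '%s\n' "$1" | awk 'BEGIN { FS = OFS = "." } {
        $1 = ""               # $0 becomes ".host.com"
        print substr($0, 2)   # skip the leading separator
    }'
}

strip_host www.host.com    # host.com
```

    For a name without any dot this prints an empty line, since the only field has been emptied.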
    

    The perfect change would be to replace the awk and sed by the cut solution of Anubahuva

    If the hostname is already stored in a shell variable, you could use parameter expansion instead:

    var=www.foo.com
    echo ${var#*.}
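
    The difference between the # and ## forms matters here: # removes the shortest prefix matching the pattern *., while ## removes the longest. A short sketch, with the variable names domain and tld chosen for illustration:

```shell
var=www.foo.com
domain=${var#*.}     # shortest match of "*." removed: foo.com
tld=${var##*.}       # longest match of "*." removed:  com
echo "$domain" "$tld"

plain=localhost
echo "${plain#*.}"   # no dot, so the value is left unchanged: localhost
```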
    