
I'm trying to print the happiest countries in the world for 2022 by fetching the data from https://en.wikipedia.org/wiki/World_Happiness_Report?action=raw and then extracting and displaying the first 5 countries. Here is my code:

#!/bin/bash
content=$(curl -s "https://en.wikipedia.org/wiki/World_Happiness_Report?action=raw")
lines=$(echo "$content" | grep '^|' | sed -n '/2022/{n;p;}')
top_5=$(echo "$lines" | awk '{print $3}' | sort | head -n 5)
echo "$top_5"

However, when I run this code in Ubuntu, nothing shows up; it's just blank, like this:

....(My computer server).....:~$ bash happy_countriesnew.sh
#(I'm expecting there to be a list here)
....(My computer server).....:~$

I’m expecting something like this instead of the blank space my terminal is displaying:

Finland
Norway
Denmark
Iceland
Switzerland
Netherlands
Canada
New Zealand
Sweden
Australia

What am I doing wrong and what should I change?

4 Answers


  1. I guess you see this error (but you are ignoring it)

    grep: empty (sub)expression
    

    The problem is with your grep expression. Remove the escape (the backslash) from the pattern:

    lines=$(echo "$content" | grep '^|' | sed -n '/2022/{n;p;}')
    

    and check for errors.
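
    A quick way to check the unescaped pattern, as a minimal sketch on a made-up two-line sample rather than the real page (the row shape is an assumption based on the patterns used elsewhere in this thread). In a basic regular expression a bare `|` is a literal pipe, whereas GNU grep reads `\|` as alternation, so the escaped form does not mean "literal pipe":

    ```shell
    # Two sample lines (assumed shapes): one table row, one ordinary line.
    sample=$(printf '%s\n' '|1||{{flag|Finland}}' 'Some prose line')

    # In a basic regular expression a bare | is a literal pipe,
    # so this keeps only the line that starts with one.
    printf '%s\n' "$sample" | grep '^|'

    # Bracketing the pipe is an unambiguous alternative that also works with grep -E.
    printf '%s\n' "$sample" | grep -E '^[|]'
    ```

    Both commands should print only the table row.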

  2. echo | grep | sed | awk is a bit of an anti-pattern. Typically, you want to refactor such pipelines to just be a call to awk. In your case, it looks like your code that is attempting to extract the 2022 data is flawed. The data is already sorted, so you can drop the sort and get the data you want with:

     sed -n '/^=== 2022 report/,/^=/{ s/}}//; /^|[12345]|/s/.*|//p; }' 
    

    The first portion (the /^=== 2022 report/,/^=/) tells sed to only work on lines between those that match the two given patterns, which is the data you are interested in. The rest is just cleaning up and extracting just the country name, printing only those lines in which the 2nd field is exactly one of the single digits 1, 2, 3, 4, or 5.
    Note that this is not terribly flexible, and it is difficult to modify it to print the top 7 or the top 12, so you might want something like:

    sed -n '/^=== 2022 report/,/^=/{ s/}}//; /^|[[:digit:]]/s/.*|//p; }' | head -n 5
    

    Note that it could be argued that sed | head is also a bit of an anti-pattern, but keeping track of lines of output in sed is tedious and the pipe to head is less egregious than attempting to write such code.
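
    If the count needs to be adjustable without a pipe to head, the same range idea can be sketched as a single awk call. This is only an illustration: `top_countries` is a hypothetical helper name, and the sample rows assume the `|rank||{{flag|Country}}` shape implied by the sed patterns above, not verified against the live page:

    ```shell
    # top_countries N: read raw wikitext on stdin and print the first N country
    # names from the 2022 section. Hypothetical helper, not from the answer.
    top_countries() {
      awk -v n="$1" '
        /^=== 2022 report/ { in_section = 1; next }  # start of the section
        in_section && /^=/ { exit }                   # the next heading ends it
        in_section && /^[|][0-9]+[|]/ {               # a ranked table row
          sub(/[}][}].*$/, "")                        # drop the closing braces
          sub(/.*[|]/, "")                            # keep the text after the last pipe
          print
          if (++printed >= n) exit                    # stop once N names are out
        }
      '
    }

    # Made-up sample in the assumed row format.
    printf '%s\n' \
      '=== 2022 report ===' \
      '|1||{{flag|Finland}}' \
      '|2||{{flag|Denmark}}' \
      '|3||{{flag|Iceland}}' \
      '=== 2021 report ===' \
      '|1||{{flag|Finland}}' | top_countries 2
    ```

    With the real page, the input would come from the curl command in the question, for example `curl -s ... | top_countries 7`.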

  3. Using awk:

    awk -F"{{|}}|[|]" '/^=== 2022 rep/ {f=1} /^=== 2021 rep/ {f=0} {if(f==1 && /flag/) {print $6}}' <<<"$content" | head -n 5
    Finland
    Denmark
    Iceland
    Switzerland
    Netherlands
    

    -F"{{|}}|[|]" # set the field separator to '{{' or '}}' or '|'

    /^=== 2022 rep/ {f=1} # set the flag when a line starts with '=== 2022 rep'

    /^=== 2021 rep/ {f=0} # clear the flag when a line starts with '=== 2021 rep'

    {if(f==1 && /flag/) {print $6}} # if f is set and the line contains 'flag', print the 6th field

    Note: Assumes "$content" variable is populated via content=$(curl -s "https://en.wikipedia.org/wiki/World_Happiness_Report?action=raw")

    Alternatively, you could use bash process substitution and avoid the intermediate content variable altogether:

    awk -F"{{|}}|[|]"  '/^=== 2022 rep/ {f=1} /^=== 2021 rep/ {f=0} {if(f==1 && /flag/) {print $6}}' < <(curl -s "https://en.wikipedia.org/wiki/World_Happiness_Report?action=raw") | head -n 5
    

    Output:

    Finland
    Denmark
    Iceland
    Switzerland
    Netherlands
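
    To see why the country name ends up in the 6th field, it can help to print every field of one row with its index. A minimal sketch, assuming a row shaped like `|1||{{flag|Finland}}` (made-up sample, not fetched from the page); `show_fields` is a hypothetical helper, and the separator is written with bracket expressions so the braces are literal in any awk:

    ```shell
    # Print each field of every input line as index=[value].
    # Same three separators as above: {{ or }} or |
    show_fields() {
      awk -F'[{][{]|[}][}]|[|]' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }'
    }

    printf '%s\n' '|1||{{flag|Finland}}' | show_fields
    ```

    For this sample the empty strings around the leading pipes take up fields 1, 3 and 4, the rank is field 2, `flag` is field 5 and the name is field 6.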
    
  4. curl …………… | 
    
    gawk 'NF *= 2<NF' FS='^[|][1-5][|][|][{][{]flag[|]|[}][}]$' OFS=
    
    Finland
    Denmark
    Iceland
    Switzerland
    Netherlands
    

    If you want to shrink it even further:

    mawk 'NF *= 2<NF' FS='^[|][1-5][|].+[|]|[}]+$' OFS=
    

    This approach makes it easy to expand the list to, say, the top 17:

    nawk 'NF *= 2<NF' FS='^[|]([1-9]|1[0-7])[|].+[|]|[}]+$' OFS=
    
         1  Finland
         2  Denmark
         3  Iceland
         4  Switzerland
         5  Netherlands
         6  Luxembourg
         7  Sweden
         8  Norway
         9  Israel
        10  New Zealand
        11  Austria
        12  Australia
        13  Ireland
        14  Germany
        15  Canada
        16  United States
        17  United Kingdom
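
    A more readable variation on the same idea, sketched here as an alternative rather than as the one-liners above: compare the rank numerically instead of encoding the range in a regex, so the cutoff becomes a plain argument. `top_by_rank` is a hypothetical helper, the rows are a made-up sample in the assumed `|rank||{{flag|Country}}` shape, and like the one-liners it does not restrict itself to the 2022 section:

    ```shell
    # top_by_rank MAX: print the country from every row whose rank is <= MAX.
    top_by_rank() {
      awk -F'|' -v max="$1" '
        # Rows start with a pipe, so field 1 is empty and field 2 is the rank.
        $1 == "" && $2 ~ /^[0-9]+$/ && $2 + 0 <= max {
          name = $NF               # last field, e.g. Finland}}
          sub(/[}]+$/, "", name)   # strip the trailing braces
          print name
        }
      '
    }

    printf '%s\n' \
      '|1||{{flag|Finland}}' \
      '|12||{{flag|Australia}}' \
      '|17||{{flag|United Kingdom}}' | top_by_rank 12
    ```

    Changing the cutoff is then just `top_by_rank 17` instead of rewriting the alternation in the field separator.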
    