I’m trying to print the happiest countries in the world for 2022 by fetching the data from https://en.wikipedia.org/wiki/World_Happiness_Report?action=raw and then displaying the first 5 countries. Here is my code:
#!/bin/bash
content=$(curl -s "https://en.wikipedia.org/wiki/World_Happiness_Report?action=raw")
lines=$(echo "$content" | grep '^|' | sed -n '/2022/{n;p;}')
top_5=$(echo "$lines" | awk '{print $3}' | sort | head -n 5)
echo "$top_5"
However, when I run this code in Ubuntu, nothing shows up; the output is just blank, like this:
....(My computer server).....:~$ bash happy_countriesnew.sh
#(I'm expecting there to be a list here)
....(My computer server).....:~$
I’m expecting something like this instead of the blank space my terminal is displaying:
Finland
Norway
Denmark
Iceland
Switzerland
Netherlands
Canada
New Zealand
Sweden
Australia
What am I doing wrong and what should I change?
4 Answers
I guess you are seeing an error but ignoring it. The problem is with your grep expression: remove the escape and check for errors.
echo | grep | sed | awk is a bit of an anti-pattern. Typically, you want to refactor such pipelines into a single call to awk. In your case, it looks like the code that attempts to extract the 2022 data is flawed. The data is already sorted, so you can drop the sort and extract what you want with a single sed command.
The first portion of that command (the /^=== 2022 report/,/^=/) tells sed to only work on lines between those that match the two given patterns, which is the data you are interested in. The rest is just cleaning up and extracting the country name, printing only those lines in which the 2nd field is exactly one of the single digits 1, 2, 3, 4, or 5.
Note that this is not terribly flexible, and it is difficult to modify it to print the top 7 or the top 12, so you might prefer to print every country in the range and pipe the result to head.
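The answer's original commands did not survive here, so the following is only a sketch of what a range-based sed call could look like. It assumes table rows shaped like `|1||{{flag|Finland}}||...`, which is inferred from the field positions described in the answers and not checked against the live page. `sample_wikitext`, `top5_by_rank` and `top_n` are hypothetical names, and the sample rows are placeholders taken from the question's expected output, not real 2022 rankings.

```shell
#!/bin/bash
# Placeholder input in the ASSUMED row shape; not real ranking data.
sample_wikitext() {
  cat <<'EOF'
=== 2022 report ===
{| class="wikitable"
|-
|1||{{flag|Finland}}||x
|-
|2||{{flag|Norway}}||x
|-
|3||{{flag|Denmark}}||x
|-
|4||{{flag|Iceland}}||x
|-
|5||{{flag|Switzerland}}||x
|-
|6||{{flag|Netherlands}}||x
|}
=== 2021 report ===
|-
|1||{{flag|Canada}}||x
EOF
}

# Variant 1: inside the 2022 range, keep only rows whose rank is a single digit 1-5.
top5_by_rank() {
  sed -n '/^=== 2022 report/,/^=/{ s/^|[1-5]||{{flag|\([^}]*\)}}.*/\1/p; }'
}

# Variant 2: print every country in the range and let head choose how many to keep.
top_n() {
  sed -n '/^=== 2022 report/,/^=/{ s/.*{{flag|\([^}]*\)}}.*/\1/p; }' | head -n "${1:-5}"
}

# Demo on the placeholder data; against the real page you would pipe in
# the output of curl -s "<url>" instead (not done here).
sample_wikitext | top_n 3
```

With the second variant, changing the count is just a matter of passing a different number, for example `top_n 7` or `top_n 12`.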
Note that it could be argued that sed | head is also a bit of an anti-pattern, but keeping track of lines of output in sed is tedious, and the pipe to head is less egregious than attempting to write such code.
Using awk:
-F"{{|}}|[|]"  # set the field separator to '{{' or '}}' or '|'
/^=== 2022 rep/ {f=1}  # set the flag if the line starts with '=== 2022 rep'
/^=== 2021 rep/ {f=0}  # unset the flag if the line starts with '=== 2021 rep'
{if(f==1 && /flag/) {print $6}}  # if f is set and the line contains the text 'flag', print the 6th field
Note: Assumes "$content" variable is populated via
content=$(curl -s "https://en.wikipedia.org/wiki/World_Happiness_Report?action=raw")
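The full awk command that these fragments belong to is not shown above, so here is one way they might fit together, as a sketch only. Two things are assumptions on my part: the row shape `|1||{{flag|Finland}}||...` (inferred from "print the 6th field"), and the field separator being written with bracket expressions, which I swapped in so that the braces are treated literally by awk implementations other than gawk. `countries_2022` is a hypothetical name, and `content` is filled from placeholder text here so the example runs offline.

```shell
#!/bin/bash
# Stand-in for the curl call so the sketch runs offline.
# Rows use the ASSUMED shape and are placeholders, not real data.
content=$(cat <<'EOF'
=== 2022 report ===
|-
|1||{{flag|Finland}}||x
|-
|2||{{flag|Norway}}||x
=== 2021 report ===
|-
|1||{{flag|Canada}}||x
EOF
)

countries_2022() {
  # Fields are split on '{{', '}}' or '|'.
  awk -F'[{][{]|[}][}]|[|]' '
    /^=== 2022 rep/  { f = 1 }       # entering the 2022 section: set the flag
    /^=== 2021 rep/  { f = 0 }       # reaching the 2021 section: unset it
    f == 1 && /flag/ { print $6 }    # flagged rows: the 6th field is the country
  '
}

printf '%s\n' "$content" | countries_2022
```

Appending `| head -n 5` to the last line would limit the output to the first five countries.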
— or —
You could use bash command substitution and avoid the intermediate content variable altogether:
Output:
If you want to shrink it even further:
This approach makes it easy to expand the list to, say, the top 17:
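The commands for this answer are missing, so I can't say what the original one-liner looked like. As a rough illustration of the idea (a compact command whose count is easy to change), here is a hypothetical awk variant that takes the count as a parameter and stops reading once it has printed enough. The row shape `|1||{{flag|Finland}}||...`, the `top_n_awk` name and the placeholder rows are all assumptions, and the sketch also assumes the 2022 section has no nested headings before the table.

```shell
#!/bin/bash
# Placeholder input in the ASSUMED row shape; not real ranking data.
sample_wikitext() {
  cat <<'EOF'
=== 2022 report ===
|-
|1||{{flag|Finland}}||x
|-
|2||{{flag|Norway}}||x
|-
|3||{{flag|Denmark}}||x
=== 2021 report ===
|-
|1||{{flag|Canada}}||x
EOF
}

# Print the first N countries of the 2022 section (N defaults to 5).
top_n_awk() {
  awk -v n="${1:-5}" -F'[{][{]|[}][}]|[|]' '
    /^=== 2022 rep/ { f = 1; next }                        # start of the section
    f && /^=/       { exit }                               # next heading: stop
    f && /flag/     { print $6; if (++shown >= n) exit }   # stop after n rows
  '
}

sample_wikitext | top_n_awk 2
```

Asking for more rows than the section contains, for example `top_n_awk 17` on this sample, simply prints every country in the section.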