This question is now answered; scroll to the end of this post for the solution.

Apologies if the answer is already here, but all the answers I have found so far suggest either the -h flag or the -n flag, and neither of those is working for me.

I have some output from a curl command that is giving me several columns of data. One of those columns is a human-readable file size (“1.6mb”, “4.3gb” etc).

I am using the Unix sort command to sort by the relevant column, but it appears to be sorting alphabetically instead of numerically. I have tried both the -n and the -h flags, and although they do change the order, in neither case is the order numerically correct.

I am on a CentOS Linux box, version 7.2.1511. The version of sort I have is “sort (GNU coreutils) 8.22”.

I have tried using the -h flag in these different formats:

curl localhost:9200/_cat/indices | sort -k9,9h | head -n5
curl localhost:9200/_cat/indices | sort -k9 -h | head -n5
curl localhost:9200/_cat/indices | sort -k 9 -h | head -n5
curl localhost:9200/_cat/indices | sort -k9h | head -n5

I always get these results:

green open indexA            5 1        0       0   1.5kb    800b
green open indexB            5 1  9823178 2268791 152.9gb  76.4gb
green open indexC            5 1    35998    7106 364.9mb 182.4mb
green open indexD            5 1      108      11 387.1kb 193.5kb
green open indexE            5 1        0       0   1.5kb    800b

I have tried using the -n flag in the same formats as above:

curl localhost:9200/_cat/indices | sort -k9,9n | head -n5
curl localhost:9200/_cat/indices | sort -k9 -n | head -n5
curl localhost:9200/_cat/indices | sort -k 9 -n | head -n5
curl localhost:9200/_cat/indices | sort -k9n | head -n5

I always get these results:

green open index1      5 1     1021       0   3.2mb   1.6mb
green open index2      5 1     8833       0   4.1mb     2mb
green open index3      5 1     4500       0     5mb   2.5mb
green open index4      1 0        3       0   3.9kb   3.9kb
green open index5      3 1  2516794       0   8.6gb   4.3gb

Edit: It turned out there were two problems:

1) sort -h expects single capital letters as unit suffixes: K, M and G instead of kb, mb and gb (a lowercase k is also accepted, and plain byte counts need no suffix).

2) sort will include leading spaces unless you explicitly exclude them, which messes with the ordering.

The solution is to replace lower case with upper case and use the -b flag to make sort ignore leading spaces (I’ve based this answer on @Vinicius’ solution below, because it’s easier to read if you don’t know regex):

curl localhost:9200/_cat/indices | tr '[kmg]b' '[KMG] ' | sort -k9hb
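One caveat: tr translates single characters, so the command above also rewrites every k, m, g and b elsewhere on the line (for example "green" becomes "Green", and a b inside an index name becomes a space). A minimal sketch of a narrower alternative follows, assuming the sizes always look like a digit followed by kb, mb or gb. The function name normalize_sizes is made up for illustration, and LC_ALL=C is only there to pin the decimal point to ".".

```shell
# Hypothetical helper (the name is an assumption, not from the post):
# rewrite only size tokens such as "3.2mb" or "76.4gb", i.e. a digit
# followed by kb/mb/gb and then a space or end of line.
normalize_sizes() {
  sed -E \
    -e 's/([0-9])kb( |$)/\1K\2/g' \
    -e 's/([0-9])mb( |$)/\1M\2/g' \
    -e 's/([0-9])gb( |$)/\1G\2/g'
}

# Sample rows taken from the output in the question (spacing collapsed).
printf '%s\n' \
  'green open index5 3 1 2516794 0 8.6gb 4.3gb' \
  'green open index4 1 0 3 0 3.9kb 3.9kb' \
  'green open index1 5 1 1021 0 3.2mb 1.6mb' \
  | normalize_sizes | LC_ALL=C sort -b -k9,9h
# green open index4 1 0 3 0 3.9K 3.9K
# green open index1 5 1 1021 0 3.2M 1.6M
# green open index5 3 1 2516794 0 8.6G 4.3G
```

Against the real endpoint this would be used as curl localhost:9200/_cat/indices | normalize_sizes | sort -b -k9,9h. A trailing "b" as in "800b" is left alone, since sort -h treats it as "no unit".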

2 Answers


  1. Your ‘m’ and ‘g’ units should be uppercase. The GNU sort manual reads:

    -h --human-numeric-sort --sort=human-numeric

    Sort numerically, first by numeric sign (negative, zero, or positive); then by SI suffix (either empty, or ‘k’ or ‘K’, or one of ‘MGTPEZY’, in that order; see Block size); and finally by numeric value.

    You can change the output of curl with GNU sed like this:

    curl localhost:9200/_cat/indices \
    | sed 's/[0-9][mgtpezy]/\U&/g' \
    | sort -k9,9h \
    | head -n5
    

    Yields:

    green open index4      1 0        3       0   3.9kb   3.9kb
    green open index1      5 1     1021       0   3.2Mb   1.6Mb
    green open index2      5 1     8833       0   4.1Mb     2Mb
    green open index3      5 1     4500       0     5Mb   2.5Mb
    green open index5      3 1  2516794       0   8.6Gb   4.3Gb
    

    Other letters like "b" will be treated as "no unit":

    green open indexA            5 1        0       0   1.5kb    800b
    green open indexE            5 1        0       0   1.5kb    800b
    green open indexD            5 1      108      11 387.1kb 193.5kb
    green open indexC            5 1    35998    7106 364.9Mb 182.4Mb
    green open indexB            5 1  9823178 2268791 152.9Gb  76.4Gb
    

    If so desired, you can change the units in the sorted output back to lowercase by piping to sed 's/[0-9][MGTPEZY]/\L&/g'.

  2. sort does not understand kb, mb and gb. You have to use K, M and G. You can use tr to convert the suffixes:

    curl localhost:9200/_cat/indices | tr 'kmgb' 'KMG ' | sort -b -k 9 -h
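The suffix rule quoted from the GNU sort manual in the first answer can be checked without Elasticsearch at all. The following is a small sketch, assuming GNU sort; the values are made-up sample sizes, and LC_ALL=C is only there so that "." is treated as the decimal point.

```shell
# Recognised suffixes: ordered by suffix first (none, K, M, G), then by value.
printf '%s\n' 4.3G 800 1.6M 3.9K | LC_ALL=C sort -h
# 800
# 3.9K
# 1.6M
# 4.3G

# Lowercase "mb" and "gb" are not recognised as suffixes, so those values
# compare as plain numbers and sort before anything carrying a k/K suffix.
printf '%s\n' 4.3gb 1.6mb 3.9kb | LC_ALL=C sort -h
# 1.6mb
# 4.3gb
# 3.9kb
```

The second run reproduces the kind of "wrong" ordering described in the question: 4.3gb lands before 3.9kb because only the k is read as a unit.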
    