Suppose I have the following data:

# the numbers are part of the data (the first field), not line numbers.  I want to reshape exactly as below
0 a 
1 b
2 c
0 d
1 e
2 f
0 g
1 h
2 i
...

And I would like to reshape the data such that it is:

0 a d g ...
1 b e h ... 
2 c f i ...

Is this possible using the unix/bash toolkit, without writing a complex composition of commands?

Yes, I could trivially do this inside a programming language. The idea is NOT to “just” do that. What I am looking for is a solution along the lines of cat X.csv | rs [magic options], if one exists. (rs, or the bash reshape command, would be great, except it isn’t working here on Debian stretch.)

An equivalent answer that involves a composition of commands or a script is out of scope: I already have that, but would rather not use it.

2 Answers


  1. Using GNU datamash:

    $ datamash -s -W -g 1 collapse 2 < file
    0       a,d,g
    1       b,e,h
    2       c,f,i
    

    Options:

    • -s sort the input before grouping
    • -W use whitespace (spaces or tabs) as delimiters
    • -g 1 group on the first field
    • collapse 2 print comma-separated list of values of the second field

    To convert the tabs and commas to space characters, pipe the output to tr:

    $ datamash -s -W -g 1 collapse 2 < file | tr '\t,' ' '
    0 a d g
    1 b e h
    2 c f i
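
The tr step can be tried on its own, without datamash installed. This is a minimal sketch on hand-written sample lines in the same shape as the datamash output above (key, tab, comma-joined values). It spells out two spaces in the second set, because padding a shorter second set with its last character is GNU tr behavior and may not hold for other implementations.

```shell
# Sample lines shaped like the datamash output: key, a tab, comma-joined values.
# The first set lists the two characters to replace (tab and comma);
# the second set gives one replacement character (a space) for each.
printf '0\ta,d,g\n1\tb,e,h\n' | tr '\t,' '  '
```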
    
  2. bash version:

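    function reshape {
        local index value key
        declare -A result    # maps each index to its space-joined values
        # -r keeps backslashes literal while reading "index value" pairs
        while read -r index value; do
            result[$index]+=" $value"
        done
        # associative arrays have no defined key order, so sort the output
        for key in "${!result[@]}"; do
            echo "$key${result[$key]}"
        done | sort -n
    }
    reshape < input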
    

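    We just need to make sure the input is in unix format, that is, with LF line endings rather than CRLF; otherwise each value read keeps a trailing carriage return.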
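
One way to meet the unix-format requirement is to strip carriage returns before the data reaches the read loop. The sketch below assumes the only problem is CRLF line endings; to_unix is a hypothetical helper name, not part of the answer above.

```shell
# Hypothetical helper (not from the original answer): delete every
# carriage return so CRLF line endings become plain LF.
to_unix() {
    tr -d '\r'
}

# Sample CRLF input; without to_unix, each value read would end in '\r'.
printf '0 a\r\n1 b\r\n0 d\r\n' | to_unix | while read -r index value; do
    printf '%s=%s\n' "$index" "$value"
done
```

With the function from the answer, the same filter would sit in front of it, for example to_unix < input | reshape.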
