skip to Main Content

I have a CSV that looks like this:

created,id,value
2022-12-16 11:55,58,10
2022-12-16 11:55,59,2
2022-12-16 11:50,58,11
2022-12-16 11:50,59,3
2022-12-16 11:50,60,7

I want to parse it so I have the following result, setting ids as columns and grouping by date:

created,58,59,60
2022-12-16 11:55,10,2,nan
2022-12-16 11:50,11,3,7

missing values are set to nan, each id appears at most once per date

How can I do it? I also have the first CSV in a JSON equivalent if this is easier to do with jq

The JSON is composed of similar elements:

{ 
  "created": "2022-12-16 09:15", 
  "value": "10.4", 
  "id": "60" 
} 

2

Answers


  1. Using the great Miller (version >= 6), running

    mlr --csv reshape -s id,value then unsparsify then fill-empty -v "nan" input.csv
    

    you get

    created,58,59,60
    2022-12-1611:55,10,2,nan
    2022-12-1611:50,11,3,7
    

    The core command here is reshape -s id,value, to transform your input from long to wide structure.

    Login or Signup to reply.
  2. This is how I would do it in jq, based on the JSON input stream:

    reduce inputs as {$created, $value, $id} ({head: [], body: {}};
      .head |= (.[index($id) // length] = $id) | .body[$created][$id] = $value
    )
    | (.head | sort_by(tonumber)) as $head | ["created", $head[]], (
      .body | to_entries[] | [.key, .value[$head[]]]
    )
    

    Then, either use the @csv builtin which wraps the values in quotes, and produces empty values for missing combinations:

    jq -nr '
      ⋮
      | @csv
    '
    
    "created","2","3","10","11","50","55","58","59"
    "2022-12-16 11:55","6",,"3",,,"4","2","5"
    "2022-12-16 11:50",,"12",,"9","10",,"8","11"
    

    Demo

    Or generate nan and , manually by mapping and joining accordingly:

    jq -nr '
      ⋮
      | map(. // "nan") | join(",")
    '
    
    created,2,3,10,11,50,55,58,59
    2022-12-16 11:55,6,nan,3,nan,nan,4,2,5
    2022-12-16 11:50,nan,12,nan,9,10,nan,8,11
    

    Demo

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search