I have a CSV with hundreds of thousands of rows, but it basically looks like this:
personal_id | location_type | location_number |
---|---|---|
1 | 'company' | 123 |
2 | 'branch' | 321 |
1 | 'branch' | 456 |
1 | 'branch' | 567 |
The goal is to group everything by personal_id
and, beneath each one, have a list of location_number values
for each location_type (here, two lists: company and branch):
[
  {
    "personal_id": 1,
    "company": [123],
    "branch": [456, 567]
  },
  {
    "personal_id": 2,
    "branch": [321]
  }
]
I used Python pandas because I've done something similar successfully before, but only at one grouping level, and pandas' to_dict('records')
worked perfectly at the time.
I've been trying to do something along those lines, such as this:
merge_df = (data_df.groupby(['personal_id'])
            .apply(lambda x: x[['regulator', 'employee_number', 'sex', 'status']]
                   .to_dict('records'))
            .reset_index()
            .rename(columns={0: 'employee'}))
but I'm not able to figure out how to add an additional filter inside the apply().
This method also creates a column, which I renamed to 'employee', that I don't need in the above scenario.
My only other option is to start over in C# with, say, CsvHelper and maybe AutoMapper, if pandas was the wrong choice.
2 Answers
Try:
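A minimal sketch of one possible approach, assuming the column names from the question: group by `personal_id`, then group each person's rows by `location_type` and collect the `location_number` values into lists. The inline `csv_text` sample and the names `df` and `records` are assumptions for illustration; with a real file you would pass its path to `pd.read_csv` instead.

```python
import io
import json

import pandas as pd

# Sample rows from the question (assumed comma-separated, without quotes).
# For a real file, replace io.StringIO(csv_text) with the file path.
csv_text = """personal_id,location_type,location_number
1,company,123
2,branch,321
1,branch,456
1,branch,567
"""
df = pd.read_csv(io.StringIO(csv_text))

records = []
# Outer level: one record per personal_id.
for personal_id, person_rows in df.groupby("personal_id"):
    record = {"personal_id": int(personal_id)}
    # Inner level: one list of location_number values per location_type.
    # sort=False keeps the types in order of first appearance.
    by_type = person_rows.groupby("location_type", sort=False)["location_number"]
    for location_type, numbers in by_type:
        record[location_type] = [int(n) for n in numbers]
    records.append(record)

print(json.dumps(records))
```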
Prints:
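For the four sample rows in the question, the result should be equivalent to the target structure given there, for example on a single line:

```
[{"personal_id": 1, "company": [123], "branch": [456, 567]}, {"personal_id": 2, "branch": [321]}]
```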
You can do this:
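A minimal sketch of an alternative, assuming the same column names as in the question: do a single `groupby` on both `personal_id` and `location_type`, aggregate `location_number` into lists, and then fold the result into one dictionary per person. The inline `csv_text` sample and the names `lists`, `by_person` and `result` are assumptions for illustration.

```python
import io

import pandas as pd

# Sample rows from the question (assumed comma-separated, without quotes).
csv_text = """personal_id,location_type,location_number
1,company,123
2,branch,321
1,branch,456
1,branch,567
"""
df = pd.read_csv(io.StringIO(csv_text))

# One list of location_number values per (personal_id, location_type) pair.
# sort=False keeps the pairs in order of first appearance.
lists = (
    df.groupby(["personal_id", "location_type"], sort=False)["location_number"]
    .agg(list)
)

# Fold the two-level index into one dict per personal_id.
by_person = {}
for (personal_id, location_type), numbers in lists.items():
    key = int(personal_id)
    record = by_person.setdefault(key, {"personal_id": key})
    record[location_type] = [int(n) for n in numbers]

result = list(by_person.values())
print(result)
```

Because the grouping happens once over the whole frame rather than once per person, this variant may scale better to hundreds of thousands of rows, though that is worth measuring on the real data.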