skip to Main Content

I’m working with Shopify’s Order API and I receive the following responses, which currently sit in a dataframe (very simplified).

+----------+-----------+-----------------------------------------------------------------------------+
| order_id | name      | item_order                                                                  |
+----------+-----------+-----------------------------------------------------------------------------+
| 123      |  Jon Doe  | [{'id':789,'price':40.0,'sku':'HAT'},{'id':790,'price':41.0,'sku':'SHIRT'}] |
+----------+-----------+-----------------------------------------------------------------------------+
| 122      | Jonah Doe |                    [{'id':553,'price':20.0,'sku':'CAP'}]                    |
+----------+-----------+-----------------------------------------------------------------------------+
|          |           |                                                                             |
+----------+-----------+-----------------------------------------------------------------------------+

I’d like to explode item_order, so we get something closer to the below:

+----------+-----------+---------------+------------------+----------------+
| order_id | name      | item_order.id | item_order.price | item_order.sku |
+----------+-----------+---------------+------------------+----------------+
| 123      |  Jon Doe  | 789           | 40.0             | HAT            |
+----------+-----------+---------------+------------------+----------------+
| 123      | Jon Doe   | 790           | 41.0             | SHIRT          |
+----------+-----------+---------------+------------------+----------------+
| 122      | Jonah Doe | 553           | 20               | CAP            |
+----------+-----------+---------------+------------------+----------------+

2

Answers


  1. Use DataFrame.explode with json_normalize:

    df = df.explode('item_order').reset_index(drop=True)
    df = df.join(pd.json_normalize(df.pop('item_order')))
    
    Login or Signup to reply.
  2. First explode and then agg(pd.Series):

    new_df = df.explode('item_order')
    new_df[['item_order.id', 'item_order.price', 'item_order.sku']] = ( new_df['item_order'].
                                                                  agg(pd.Series) )
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search