
The input is given as:
rec = [b'1674278797,14.33681', b'1674278798,6.03617', b'1674278799,12.78418']
I want to get a DataFrame like:

df
    timestamp       val
0  1674278797  14.33681
1  1674278798   6.03617
2  1674278799  12.78418

What is the most efficient way? Thanks!

If I could convert rec to a list of lists like
[[1674278797,14.33681], [1674278798,6.03617], [1674278799,12.78418]]
it would be easy, since I could just call
df = pd.DataFrame(rec, columns=['timestamp','val'])
But I don’t know how to do that conversion quickly.

By the way, I got rec from a Redis list. I can modify the format of each element (for example, b'1674278797,14.33681' is one element) if necessary.

2 Answers


  1. If you can’t directly handle the original input, you can use:

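    (pd.Series([x.decode('utf-8') for x in rec])
       .str.split(',', expand=True)
       .set_axis(['timestamp', 'val'], axis=1)
       # str.split yields strings; cast explicitly to get numeric columns
       .astype({'timestamp': 'int64', 'val': 'float64'})
    )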
    

    Or:

    import io
    
    # Join the decoded elements with newlines so read_csv sees one row per element
    pd.read_csv(io.StringIO('\n'.join([x.decode('utf-8') for x in rec])),
                header=None, names=['timestamp', 'val'])
    

    Output:

        timestamp       val
    0  1674278797  14.33681
    1  1674278798   6.03617
    2  1674278799  12.78418
    
  2. You can do this in one line:

    pd.DataFrame([x.decode().split(",") for x in rec], columns=["timestamp","val"])
    

    Returns

        timestamp       val
    0  1674278797  14.33681
    1  1674278798   6.03617
    2  1674278799  12.78418
    

    Both columns come back as strings. To convert them to numeric dtypes, append .astype({"timestamp": "int64", "val": "float64"}) to the end of the line.
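
A minimal end-to-end sketch of the one-liner with the dtype conversion applied. It assumes rec holds the three sample elements from the question and that pandas is imported as pd; nothing here is specific to Redis.

```python
import pandas as pd

# Sample input from the question: bytes elements, as read from a Redis list.
rec = [b'1674278797,14.33681', b'1674278798,6.03617', b'1674278799,12.78418']

# Decode each element, split on the comma, then cast both columns explicitly.
df = (
    pd.DataFrame([x.decode().split(",") for x in rec],
                 columns=["timestamp", "val"])
      .astype({"timestamp": "int64", "val": "float64"})
)

# Inspect the resulting column dtypes.
print(df.dtypes)
```

Without the .astype call the frame prints the same, but both columns hold strings, so arithmetic or numeric sorting on them would not behave as expected.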
