Question posted in Json
Our archive of expertly curated questions and answers provides insights and solutions to common problems related to this popular data interchange format. From parsing and manipulating JSON data to integrating it with various programming languages and web services, our archive has got you covered. Start exploring today and take your JSON skills to the next level

Parsing text and JSON from a log file with three lines (request, response, record)

mikhailtugushev
September 8, 2024
233 views
1 vote
2 Answers

I have a .csv log file containing three json lines for each client. For example:

{"name":"John","phone":"08847","politic":"on","ville":"LA","isTest":"false","source":"t3_1"}
{"data":{"name":"John","phone":"+8847","city":"LA","source":"t3_1","cameF":"a1"},"token":"bd67a","isTest":false}
{"data":{"responseId":"R_2hs","city":"LA","cameF":"cpl_agency2","source":"t3_1"},"success":true,"ts":1721394844,"message":null}

I don’t understand how you can combine three separate lines into one and then read the json.

Starting reading the file:

import csv
import datetime
import pandas as pd

df = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/in/post.csv',
                          delimiter=',',
                          header=None,
                          skiprows=range(0, 42), # skip test row
                          names=['t1', 't2', 't3', 't4', 't5', 't6', 't7', 't8', 't9']
                 )

display(df)

	t1	t2	t3	t4	t5	t6	t7	t8	t9
0	"name":"John"	"phone":"08847"	"politic":"on"	"ville":"LA"	"isTest":"false"	"source":"t3_1"	Nan	Nan	Nan
1	{"data":{"name":"John"	phone:"+08847"	city:"LA"	source:"t3_1"	cameF:"a1"}	token:"bd67a"	isTest:false}	NaN	NaN
2	{"data:{"responseId":"R_2hs"	city:"LA"	cameF:"a1"	source:"t3_1"}	success:true	ts:1721394844	message:null	NaN	NaN

…

…	…	…	…	…	…	…	…	…
5001	"name":"Michael"	"phone":"09947"	"politic":"on"	"ville":"FE"	"isTest":"false"	"source":"t3_1"	Nan	Nan
5002	{"data":{"name":"Michael"	phone:"+09947"	city:"FE"	source:"t3_1"	cameF:"a1"}	token:"bd90d"	isTest:false}	NaN
5003	{"data:{"responseId":"R_3gn"	city:"FE"	cameF:"a1"	source:"t3_1"}	success:true	ts:1822404844	message:null}	NaN

I want to get a dataframe like this. Combine three lines into one:

name	phone	politic	ville	isTest	source	name	phone	city	source	cameF	token	rId	city	cameF	source	success	ts	message
John	08847	on	LA	false	t3_1	John	+08847	LA	t3_1	a1	bd67a	R_2hs	LA	a1	t3_1	true	1721394844	null

Answers

- md2perpe
- September 8, 2024 at 3:11 pm
- 0 votes
0
```
import json

with open('/content/drive/MyDrive/Colab Notebooks/in/post.csv') as f:
    data = dict()
    data.update(json.loads(next(f)))
    data.update(json.loads(next(f)))
    data.update(json.loads(next(f)))
    print(data)
```
Login or Signup to reply.

I wasn’t sure about this, but assuming your csv file has one column that is effectively the three lines you showed at the beginning of your post repeating (with a new client every three rows), this code should work:

import json

import pandas as pd
from IPython.display import display

df = pd.DataFrame(
    columns=[
        "name",
        "phone",
        "politic",
        "ville",
        "isTest",
        "source",
        "name",
        "phone",
        "city",
        "source",
        "cameF",
        "token",
        "isTest",
        "rId",
        "city",
        "cameF",
        "source",
        "success",
        "ts",
        "message",
    ]
)

with open("test.csv") as f:
    l = []
    count = 0
    for row in f.readlines():
        d = json.loads(row)
        for k, v in d.items():
            if k == "data":
                for _, data_v in v.items():
                    l.append(data_v)
            else:
                l.append(v)
        count += 1
        if count == 3:
            df = df._append(pd.Series(l, index=df.columns), ignore_index=True)
            l, count = [], 0

display(df)

   name  phone politic ville isTest source  name  phone city source cameF  token isTest    rId city        cameF source success          ts message
0  John  08847      on    LA  false   t3_1  John  +8847   LA   t3_1    a1  bd67a  False  R_2hs   LA  cpl_agency2   t3_1    True  1721394844    None

Please signup or login to give your own answer.

Click here to cancel reply.