skip to Main Content

I have a JSON data that I want to represent in a tabular form and later write it to a different format (parquet)

Schema

root
|-- : string (nullable = true)

sample data

+----------------------------------------------+

+----------------------------------------------+
|{"deviceTypeId":"A2A","deviceId":"123","geo...|
|{"deviceTypeId":"A2B","deviceId":"456","geo...|
+----------------------------------------------+

Expected Output

+--------------+------------+
|  deviceTypeId|deviceId|...| 
+--------------+--------+---+
|           A2A|     123|   |
|           A2B|     456|   |
+--------------+--------+---+

I tried splitting the string, but this doesn’t seem like an efficient approach

split_col = split(df_explode[''], ',')

And then extract the columns, but it appends the initial string as well.

df_1 = df_explode.withColumn('deviceId',split_col.getItem(1))
# df_1 = df_explode.withColumn('deviceTypeId',split_col.getItem(0))
printOutput(df_1)

I’m looking for a better way to solve this problem

2

Answers


  1. Chosen as BEST ANSWER

    I was able to do it using the from_json function.

    #Convert json column to multiple columns
    schema = getSchema()
    dfJSON = df_explode.withColumn("jsonData",from_json(col(''),schema)) 
                       .select("jsonData.*")
    dfJSON.printSchema()
    dfJSON.limit(100).toPandas()
    

    We need to create Json Schema that will parse the Json data.

    def getSchema():
        schema = StructType([ 
                    StructField('deviceTypeId', StringType()),
                    StructField('deviceId', StringType()),
                    ...
                    ])
        return schema
    

    The value string is empty in this Json data so the col consists of empty string


  2. Explode function is only to work on Array.

    In your case which is a json, you should use from_json function.

    Please refer from_json from pyspark.sql.functions

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search