
Azure – DataBricks Pyspark DATEADD

I'm trying to filter a PySpark DataFrame to keep only the data from the last 3 years up to the current date. The SQL query below needs to be converted into PySpark DataFrame form: date >= dateadd(month,-4,current_date) How to write the above SQL…


Pyspark – JSON string column explode into multiple without mentioning schema

I have the JSON string below as a column in a PySpark DataFrame: { "result":{ "version":"1.2", "timeStamp":"2023-08-14 14:00:12", "description":"", "data":{ "DateTime_Received":"2023-08-14T14:01:10.4516457+01:00", "DateTime_Actual":"2023-08-14T14:00:12", "OtherInfo":null, "main":[ { "Status":0, "ID":111, "details":null } ] }, "tn":"aaa" } } I want to explode the above one…


Converting pyspark.sql.Rowtype data to Json string eliminating values in Azure Databricks NB

I have the PySpark Row data below: indv_msg = [Row(cbm_json_output=Row(country_code='USA', date='06-10-2023', date_epoch='1696550400', id='USA-001535-1696550400', interfaceVersion='1.0.0', opmode_car_door=Row(health_category='GREEN', msg_id='1', num_yellow_preds_in_last_14_days=0, reason=None, reasonDetail=None), opmode_landing_door=Row(health_category='GREEN', msg_id='1', reason=None, reasonDetail=None), sensor=Row(component_type=None, health_category=None, landing_priority=None, msg_id='1', num_yellow_preds_in_last_14_days=None, reason=None, reasonDetail=None), unit_id='001535'))] While trying to convert it to a JSON string, it is…
