I’m trying to filter a PySpark DataFrame so that it keeps only rows from the last 3 years up to the current date.
The SQL condition below needs to be converted to the PySpark DataFrame API:
date >= dateadd(month,-4,current_date)
How can I write the SQL above using the PySpark DataFrame API?
2 Answers
Let's create a DataFrame to test on:
This creates a DataFrame with 12 rows, one for the first day of each month in 2023.
Now let's filter the DataFrame into a new DataFrame:
And the results:
You need to use a combination of the filter and add_months functions, like this: