My dataset looks like this:

[Image of the dataset, not available]

I am using this filter:

df = df.filter(trim(col("AGE"))!="" & trim(col("PHONE"))!="") 

I am getting an empty DataFrame. I want the data without the record whose name is G3.

Any help is appreciated.

2 Answers


  1. Please try filtering with one of the following:

    df.filter(~((col('AGE').isNotNull()) & (col('PHONE').isNotNull()))).show()
    

    OR

    df.filter(~((col('AGE') != lit(None)) & (col('PHONE') != lit(None)))).show()
    

    Please note the additional parentheses around each of the conditions joined by the ‘&’ operator, plus one outer set of parentheses for ‘~’ (the NOT operator).

  2. It is advisable to include not-null checks when filtering on columns, because a value may be null rather than an empty string (""). For the AGE and PHONE columns, adding not-null checks guards against problems caused by null values.

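    The filter condition below should help. Each comparison is wrapped in its own parentheses because Python's "&" operator binds more tightly than "!=".

    from pyspark.sql.functions import col, trim

    df.filter(
        col("AGE").isNotNull() & col("PHONE").isNotNull() &
        (trim(col("AGE")) != "") & (trim(col("PHONE")) != "")
    )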
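
To see why the parentheses around each comparison matter, here is a small sketch that uses only Python's standard `ast` module (no Spark needed) to show how the interpreter parses the two forms of the condition. The helper name `describe` is made up for this illustration. Without parentheses, `&` is applied first to `""` and `trim(col("PHONE"))`, and the whole expression becomes a chained `!=` comparison. Chained comparisons are evaluated with an implicit `and`, which PySpark column expressions generally do not support.

```python
import ast


def describe(expr: str) -> str:
    """Describe the outermost operation Python sees in an expression string."""
    node = ast.parse(expr, mode="eval").body
    if isinstance(node, ast.Compare):
        return f"chained comparison with {len(node.ops)} operators"
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.BitAnd):
        return "bitwise AND of two sub-expressions"
    return type(node).__name__


# The expressions are only parsed, never evaluated, so col/trim need not exist.
unparenthesized = 'trim(col("AGE")) != "" & trim(col("PHONE")) != ""'
parenthesized = '(trim(col("AGE")) != "") & (trim(col("PHONE")) != "")'

print(describe(unparenthesized))  # chained comparison with 2 operators
print(describe(parenthesized))    # bitwise AND of two sub-expressions
```

Only the parenthesized form is the AND of two separate conditions, which is what the filter is meant to express.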
    