
I am facing an issue when I try to write a file to S3 as CSV.
I am basically trying to overwrite an existing single CSV file in an S3 folder. Below is the piece of code I'm running.
[screenshot of the code]

I am getting the error below. My wild guess is that this is due to the single file present in the S3 folder: while overwriting, Spark first deletes the existing file, which also removes the S3 folder (a folder with no objects in it ceases to exist in S3), and the write then fails because no folder with the given name exists anymore. Hence the whole overwrite fails.

[screenshot of the error]

Any help to resolve this issue will be appreciated.

2 Answers


  1. Chosen as BEST ANSWER

    So this issue wasn't resolved directly; I had to use a workaround. It seems the issue is not with S3 but with Spark: once you read a CSV using Spark, you cannot write over that same CSV, because Spark reads lazily and the overwrite would delete its own input before it has been fully read.

    The workaround looked like this:

    1. Read from root/myfolder
    2. Make your data transformations
    3. Write the transformed data into root/mytempfolder
    4. Read from root/mytempfolder
    5. Write into root/myfolder

  2. Caching the dataset solves the problem, and you don't need to save the same data to multiple paths:

    dataframe.cache()
