
I’m using a notebook on Azure Databricks; the notebook lives in my user repo. I want to write a CSV file created by this notebook into that repo.
When I use the code below:

df_pandas.to_csv('test.csv', index=False, header=False)

there is no error, but the file is not written in the notebook’s repo.

Does anyone have a clue?

I’ve tried writing the complete path as well as just the CSV file name, but I still get the same error:

Cannot save file into a non-existent directory: '/Users/*********/repo_one/repo_two'
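(For reference, this message is the standard pandas error raised when the parent directory of the target path does not exist; on an ordinary filesystem it can be avoided by creating the directory first. A minimal local sketch, not Databricks-specific, using a temporary directory rather than the real path:)

```python
import os
import tempfile

import pandas as pd

df_pandas = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

# A target path whose parent directories do not exist yet
base = tempfile.mkdtemp()
target = os.path.join(base, 'repo_one', 'repo_two', 'test.csv')

# Create the parent directory first, then write the CSV
os.makedirs(os.path.dirname(target), exist_ok=True)
df_pandas.to_csv(target, index=False, header=False)

print(os.path.exists(target))  # the file now exists
```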

2 Answers


  1. Chosen as BEST ANSWER

    Hi, thanks for the explanation. But do you know how to write the CSV file not to a DBFS path, but somewhere I can retrieve it here, in the Workspace folder where my notebook is:

    [screenshot: the Workspace folder containing the notebook]

    Thanks for the help again !


    • I created a sample Pandas dataframe df_pandas with two columns, "name" and "age", and three rows of data. Then I write the dataframe to a CSV file named "test.csv" in the Databricks File System (DBFS).
    • The toPandas() method converts the Spark dataframe back to a Pandas dataframe, and to_csv() renders the Pandas dataframe as a CSV string. The dbutils.fs.put() method writes that CSV string to the specified file path in DBFS.

    Here is the code:

    import pandas as pd

    # Sample Pandas dataframe: two columns, three rows
    data = {'name': ['John', 'Jane', 'Bob'], 'age': [25, 30, 35]}
    df_pandas = pd.DataFrame(data)
    df_spark = spark.createDataFrame(df_pandas)

    # Render the data as a CSV string and write it to DBFS (True = overwrite)
    dbutils.fs.put("/Users/Dilip/repo_one/repo_two/test.csv",
                   df_spark.toPandas().to_csv(index=False, header=False),
                   True)
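    A side note on why the dbutils.fs.put() pattern works: when to_csv() is called with no path argument, pandas returns the CSV content as a string instead of writing a file, and that string is what gets handed to dbutils.fs.put(). A minimal local sketch with the same sample data:

```python
import pandas as pd

data = {'name': ['John', 'Jane', 'Bob'], 'age': [25, 30, 35]}
df_pandas = pd.DataFrame(data)

# With no path argument, to_csv() returns the CSV text as a string
csv_text = df_pandas.to_csv(index=False, header=False)
print(csv_text)
```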
    

