I have a Spark notebook that I run through a pipeline. The notebook runs fine when I run it manually, but in the pipeline it fails with an error about the file location. In the code I load the file into a data frame. The file location in the code is abfss://storage_name/folder_name/*, but in the pipeline it is taking abfss://storage_name/filename.parquet
This is the error:
{
"errorCode": "6002",
"message": "org.apache.spark.sql.AnalysisException: Path does not exist: abfss://storage_name/filename.parquet\n at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$checkAndGlobPathIfNecessary$4(DataSource.scala:806)\n\n at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$checkAndGlobPathIfNecessary$4$adapted(DataSource.scala:803)\n\n at org.apache.spark.util.ThreadUtils$.$anonfun$parmap$2(ThreadUtils.scala:372)\n\n at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)\n\n at scala.util.Success.$anonfun$map$1(Try.scala:255)\n\n at scala.util.Success.map(Try.scala:213)\n\n at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)\n\n at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)\n\n at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)\n\n at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)\n\n at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)\n\n at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)\n\n at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)\n\n at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)\n\n at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)\n",
"failureType": "UserError",
"target": "notebook_name",
"details": []
}
2 Answers
I granted my Synapse workspace the required access, and after that it worked.
The above error mainly happens because of a permissions issue: the Synapse workspace lacks the permissions it needs to access the storage account, so you need to grant it the Storage Blob Data Contributor role. To add the role to your workspace, refer to this Microsoft documentation.
Also, make sure that your path follows the proper ADLS Gen2 syntax.
Sample code:
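A minimal sketch of the ADLS Gen2 path syntax, which has the form abfss://<container>@<storage_account>.dfs.core.windows.net/<path>. The helper names (build_abfss_path, is_valid_abfss_path, load_parquet_folder) and the values mycontainer and mystorageaccount are hypothetical placeholders, and the spark argument is assumed to be the SparkSession that the notebook provides.

```python
from urllib.parse import urlparse


def build_abfss_path(container: str, account: str, path: str = "") -> str:
    """Build abfss://<container>@<account>.dfs.core.windows.net/<path>."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


def is_valid_abfss_path(uri: str) -> bool:
    """Check that a URI has the container@account.dfs.core.windows.net shape."""
    parsed = urlparse(uri)
    return (
        parsed.scheme == "abfss"
        and bool(parsed.username)  # the part before '@' is the container
        and (parsed.hostname or "").endswith(".dfs.core.windows.net")
    )


def load_parquet_folder(spark, container: str, account: str, folder: str):
    """Read every Parquet file in a folder (assumes `spark` is a SparkSession)."""
    path = build_abfss_path(container, account, f"{folder}/*")
    return spark.read.parquet(path)


# Hypothetical names; replace them with your own container and account.
good = build_abfss_path("mycontainer", "mystorageaccount", "folder_name/*")
print(good, is_valid_abfss_path(good))

# The path from the error message has no container@account part.
bad = "abfss://storage_name/filename.parquet"
print(bad, is_valid_abfss_path(bad))
```

In the notebook, the data frame would then be loaded with something like df = load_parquet_folder(spark, "mycontainer", "mystorageaccount", "folder_name"). The read can still fail with a well-formed path if the workspace identity has not been granted access to the storage account.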
For more detailed information, refer to this link.