I am using the Databricks VSCode extension for development in an IDE. The basic functionalities are all working well. I connected to an Azure Databricks workspace with Unity Catalog enabled, selected an active cluster (DBR 13.2) and configured the sync destination. I am able to execute code.
Now I want to use Databricks Connect "V2" to run my code locally.
I have the following code:
from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()
However, when I run this, I get the following error:
RuntimeError: Only remote Spark sessions using Databricks Connect are supported. Could not find connection parameters to start a Spark remote session.
Am I missing something? I tried authenticating both with the Azure CLI and with a PAT. I also tried it on DBR 13.2 and 13.3, but every option failed.
Thanks!
2 Answers
Ok, that issue was fixed in extension version 1.1.1, which exports the SPARK_REMOTE environment variable needed for spark = SparkSession.builder.getOrCreate() to work. But please note that this only works if you configure profile-based authentication, not azure-cli or OAuth authentication – for those you need to instantiate a DatabricksSession instance, which can be imported with from databricks.connect import DatabricksSession.
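For reference, a minimal sketch of that approach is below. The profile name and cluster ID are placeholders you would replace with your own workspace values:

from databricks.connect import DatabricksSession
from databricks.sdk.core import Config

# Build the connection from a named profile in ~/.databrickscfg.
# "my-azure-profile" and the cluster ID are placeholders; the profile
# can be configured for azure-cli or OAuth authentication.
config = Config(profile="my-azure-profile", cluster_id="0123-456789-abcdefgh")
spark = DatabricksSession.builder.sdkConfig(config).getOrCreate()

# From here on the session behaves like a regular SparkSession.
spark.range(5).show()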
For anyone else who gets this error and does not wish to execute code using databricks.connect: you have to completely uninstall the extension and the package. I also recommend purging the pip cache. Then you can reinstall the package and it should stop trying to use databricks.connect.