
In a Databricks Python notebook I can easily use the dbutils module.
Now I would also like to use it within a plain Python file which I import into a Databricks notebook.

Here is an example.

Here is the content of some_python_module.py:

secret_value = dbutils.secrets.get("some_location", "some_secret")

Later on I import it in a Databricks notebook:

import some_python_module

But I get the error message: NameError: name 'dbutils' is not defined

I tried to add an import statement to my some_python_module.py:

import dbutils

but it returns: ModuleNotFoundError: No module named 'dbutils'

Also, dbutils.secrets.get("some_location", "some_secret") works fine in a Databricks notebook.

2 Answers


  1. As you want to import the Python file into an Azure Databricks workspace:

    I have created the below Python file and uploaded it to my FileStore DBFS path:

    try:
        import IPython
        # Pull dbutils out of the notebook's IPython user namespace
        dbutils = IPython.get_ipython().user_ns["dbutils"]
    except (ImportError, KeyError):
        # Not running inside a Databricks notebook
        dbutils = None

    if dbutils:
        secret_value = dbutils.secrets.get("Key-vault-secret-dbx", "secretkv")
    else:
        secret_value = "dbutils not available"
    

    In the above code I have used:

    import IPython
    dbutils = IPython.get_ipython().user_ns["dbutils"]
    

    This accesses the dbutils object from the Databricks notebook’s IPython environment when it is not directly available in the imported module.
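
    If you want to reuse this pattern across several modules, you could wrap it in a small helper; a minimal sketch (the name get_dbutils is my own, not a Databricks API):

    def get_dbutils():
        """Fetch dbutils from the notebook's IPython user namespace, else None."""
        try:
            import IPython
            return IPython.get_ipython().user_ns["dbutils"]
        except (ImportError, AttributeError, KeyError):
            # ImportError: IPython missing; AttributeError: get_ipython() returned
            # None outside a notebook; KeyError: dbutils not in the namespace
            return None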

    I have tried to retrieve a secret using the below:

    import sys
    # Make the DBFS FileStore folder importable as a module location
    sys.path.append("/dbfs/FileStore/tables/")
    import dilip_module
    print(f"Retrieved secret value: {dilip_module.secret_value}")
    

    Results:

    
    Retrieved secret value: [REDACTED]
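
    Note that Python caches imported modules, so if you edit the file on DBFS you may need to reload it in the notebook; a minimal sketch:

    import importlib
    import dilip_module
    # Re-execute the module after the file on DBFS has changed
    dilip_module = importlib.reload(dilip_module)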
    
    
  2. dbutils is a variable holding an instance of the DBUtils class.

    You can create your own instance of DBUtils if you like.

    from pyspark.sql import SparkSession
    from pyspark.dbutils import DBUtils

    dbutils = DBUtils(SparkSession.builder.getOrCreate())
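
    Note that pyspark.dbutils is only importable when running on Databricks (or with databricks-connect), so a module that must also run locally can guard the construction; a minimal sketch (the helper name make_dbutils is my own):

    from pyspark.sql import SparkSession

    def make_dbutils(spark: SparkSession):
        """Return a DBUtils instance on Databricks, or None elsewhere."""
        try:
            from pyspark.dbutils import DBUtils  # only present on Databricks
            return DBUtils(spark)
        except ImportError:
            return None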
    

    From my conftest.py for pytest

    import pytest
    from pyspark.sql import SparkSession
    from unittest.mock import MagicMock


    @pytest.fixture(scope='session')
    def spark_session():
      yield SparkSession.builder.getOrCreate()

    @pytest.fixture(scope='session')
    def is_running_on_databricks_cluster(spark_session: SparkSession):
      # Databricks clusters report the Spark app name 'Databricks Shell'
      return spark_session.conf.get('spark.app.name') == 'Databricks Shell'

    @pytest.fixture(scope='session')
    def dbutils(spark_session: SparkSession, is_running_on_databricks_cluster):
      if is_running_on_databricks_cluster:
        from pyspark.dbutils import DBUtils

        yield DBUtils(spark_session)
      else:
        # Local test runs get a mock so no cluster is required
        yield MagicMock()
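
    A test can then simply declare the dbutils fixture as an argument; on a cluster it resolves real secrets, while locally the MagicMock absorbs the calls. A sketch with hypothetical scope and key names:

    def test_reads_secret(dbutils):
        # Real secret value on Databricks; a MagicMock return value locally
        value = dbutils.secrets.get("some_scope", "some_key")
        assert value is not None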
    
    