
I’m quite new to Azure ML and Python. I created some datasets using both the Azure ML GUI and the Python SDK:

Now I want to load these datasets into a pandas DataFrame. But when I run

Dataset.get_all(workspace=workspace)

I get an empty list.

Am I missing something? I’m using version 0.2.7 of azureml and version 1.46.0 of azureml-core.

I also tried

workspace.datasets

but that also returned an empty result.

3 Answers


  1. The datasets need to be registered in the workspace as Tabular datasets. If a dataset is in another format, it can’t be retrieved from the workspace this way.

    I selected "Local files".

    Code:

    from azureml.core import Workspace, Dataset
    # Connect to an existing workspace by name
    ws = Workspace.get(name="your ws name",
                       subscription_id='your subscription ID',
                       resource_group='your resource group')
    

    Then using the workspace details, we can connect and get dataset details.

    Dataset.get_all(ws)
    

    Output:

    {'churn3': DatasetRegistration(id='b9fb5e3d-4893-4847-9f2e-497d85edd3b3', name='churn3', version=1, description='', tags={}), 'churn2': DatasetRegistration(id='007a6e1a-268c-409e-adbb-d5437946c4ef', name='churn2', version=1, description='', tags={})}
    

    This gave the expected result.

  2. There’s a way to do it.

    You have to upload the files manually.

    In Azure ML studio:

    • Data/Data Assets -> Create
    • Type -> File (Dataset types from Azure ML v1 APIs)

    Then you call:

    Dataset.get_all(ws)
    

    It returns all datasets, both File and Tabular.

  3. Assuming the dataset is already saved under Data in Azure ML, here is Python code you can run from anywhere to access it, authenticating as a service principal:

    from azureml.core.authentication import ServicePrincipalAuthentication
    from azureml.core import Workspace
    from azureml.core import Dataset
    
    svc_pr = ServicePrincipalAuthentication(
        tenant_id="your-tenant-id-here",
        service_principal_id="your-service-principal-id",
        service_principal_password='your-service-principal-password')
    
    ws = Workspace.from_config(path='config/file/path/config.json', auth=svc_pr)
    
    
    # Get one dataset
    ds = Dataset.get_by_name(ws, 'Your dataset name from AzureML Data')
    df = ds.to_pandas_dataframe()
    
    # Get all datasets (to_pandas_dataframe() is only available on tabular datasets)
    ds_all = Dataset.get_all(ws)
    for registration_name, dataset in ds_all.items():
        df_instance = dataset.to_pandas_dataframe()
        # .... do something with the data
    

    The config.json file could look like this:

    {
        "subscription_id": "your-subscription-id",
        "resource_group": "your-resource-group",
        "workspace_name": "your-workspace-name"
    }
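
Before passing the file to Workspace.from_config, it may help to confirm that it contains the three keys shown above. The following is a minimal sketch using only the standard library; the helper names load_config and missing_config_keys are hypothetical and not part of the Azure ML SDK.

```python
import json
from pathlib import Path

# The three keys that appear in the config.json example above.
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def load_config(path):
    """Read a JSON config file and return its contents as a dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def missing_config_keys(config):
    """Return the required keys that are absent or empty in `config` (hypothetical helper)."""
    return [key for key in REQUIRED_KEYS if not config.get(key)]
```

For example, missing_config_keys(load_config('config/file/path/config.json')) returns an empty list when all three values are present, and otherwise lists the keys that still need to be filled in.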
    
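
Since Dataset.get_all(ws) can return File datasets as well as Tabular ones, looping over every entry and calling to_pandas_dataframe() may fail on the File entries. One possible way around this is to separate the two kinds first. The sketch below is an illustration, not SDK behaviour: split_by_kind is a hypothetical helper, and treating "has a to_pandas_dataframe method" as the test for a tabular dataset is an assumption of this example.

```python
def split_by_kind(datasets):
    """Split a name -> dataset mapping into (tabular, other).

    Assumption of this sketch: a dataset counts as tabular if it exposes
    a callable to_pandas_dataframe attribute (a duck-typing check).
    """
    tabular, other = {}, {}
    for name, dataset in datasets.items():
        if callable(getattr(dataset, "to_pandas_dataframe", None)):
            tabular[name] = dataset
        else:
            other[name] = dataset
    return tabular, other
```

With a real workspace, this could be used as tabular, other = split_by_kind(Dataset.get_all(ws)), after which only the entries in tabular are converted with to_pandas_dataframe().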