Azure - How can I write an mltable artifact from python to a local folder?

9879ypxkj
November 20, 2022
366 views
0 votes
2 Answers

I am using the mltable library on an AzureML notebook.

I can successfully load a local CSV file as an mltable:

from mltable import from_delimited_files
paths = [{'file': "dati_estra_test.csv"}]
dati = from_delimited_files(paths)

And I can view it as a pandas dataframe:

Is there a way to write this artifact as an MLTable artifact?
Or to register it as an mltable AzureML dataset?

Answers

Use the code block below to download the file.

from azureml.core import Workspace, Dataset

subscription_id = 'subscription'
resource_group = 'your RG'
workspace_name = 'nov21'

workspace = Workspace(subscription_id, resource_group, workspace_name)

dataset = Dataset.get_by_name(workspace, name='churn')
dataset.to_pandas_dataframe()

dataset.to_pandas_dataframe(on_error='null', out_of_range_datetime='null')

dataset.download('Churn', target_path='df.csv', overwrite=False, ignore_not_found=True)

This will download the file to the specified folder.

- SamKemp
- December 20, 2022 at 2:19 pm
- 0 votes
In mltable version 1.0.0, a save method was introduced that will write out the MLTable file:

https://learn.microsoft.com/python/api/mltable/mltable.mltable.mltable?view=azure-ml-py#mltable-mltable-mltable-save

Artifacts should be stored in a folder, so first create a folder that holds dati_estra_test.csv:
```
# create directory
mkdir dati_estra_test

# move csv to directory
mv dati_estra_test.csv dati_estra_test
```
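Since the question asks how to do this from Python, the same two shell steps can also be run from a notebook cell. This is a minimal sketch using only the standard library; `stage_csv` is a hypothetical helper name, not part of mltable:

```python
from pathlib import Path
import shutil


def stage_csv(csv_path, folder):
    """Move csv_path into folder (created if missing) and return the new path."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)  # equivalent of `mkdir -p`
    target = folder / Path(csv_path).name
    shutil.move(str(csv_path), str(target))  # equivalent of `mv`
    return target
```

For the file in the question, the call would be `stage_csv("dati_estra_test.csv", "dati_estra_test")`.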
Next, create/save the MLTable file using the SDK:
```
import mltable
import os

# change the working directory to the data directory
os.chdir("./dati_estra_test")

# define the path relative to the MLTable file
path = {
    'file': './dati_estra_test.csv'
}

# load the delimited (CSV) file
tbl = mltable.from_delimited_files(paths=[path])

# show the first few records
tbl.show()

# save the MLTable file in the data directory
tbl.save(".")
```
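After the save, the data directory should contain a file named `MLTable` next to the CSV. A quick sanity check before registering the folder could look like the sketch below. It uses only the standard library, `is_mltable_folder` is a hypothetical helper name, and the check for a second file assumes the data sits in the same folder, as it does in this answer:

```python
from pathlib import Path


def is_mltable_folder(folder):
    """Return True if folder holds an MLTable file plus at least one other file."""
    folder = Path(folder)
    if not (folder / "MLTable").is_file():
        return False
    # Assumption: any other file in the folder is data the MLTable file points to.
    return any(p.is_file() and p.name != "MLTable" for p in folder.iterdir())
```

Because the working directory was changed to the data directory, `is_mltable_folder(".")` is the call to make here. The same folder can then be read back with `mltable.load`.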
You can create a data asset using either the CLI (note the path should be pointing to the artifact folder):
```
az ml data create --name dati_estra_test --version 1 --type mltable --path ./dati_estra_test
```
Or the Python SDK:
```
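from azure.ai.ml import MLClient
from azure.ai.ml.entities import Data
from azure.ai.ml.constants import AssetTypes
from azure.identity import DefaultAzureCredential

# assumption: a workspace config.json is discoverable, as on an Azure ML compute instance
ml_client = MLClient.from_config(credential=DefaultAzureCredential())

# path to the artifact folder that holds the MLTable file and the CSV
my_path = './dati_estra_test'

my_data = Data(
    path=my_path,
    type=AssetTypes.MLTABLE,
    name="dati_estra_test",
    version='1'
)

ml_client.data.create_or_update(my_data)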
```
When the asset is created, your artifact is automatically uploaded to cloud storage (the default Azure ML datastore).

It should be noted that it isn’t a requirement to use Azure ML Tables (mltable) when your data is tabular in nature. You can use Azure ML File (uri_file) and Folder (uri_folder) types, and provide your own parsing logic to materialize the data into a Pandas or Spark data frame. In cases where you have a simple CSV file or Parquet folder, you’ll probably find it easier to use Azure ML Files/Folders rather than Tables.
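As an illustration of what "your own parsing logic" might look like, here is a sketch that accepts either a single CSV file (what a `uri_file` input would resolve to) or a folder of CSV files (`uri_folder`) and returns the rows as dictionaries. `read_csv_input` is a hypothetical helper, and it assumes every file has a header row:

```python
import csv
from pathlib import Path


def read_csv_input(path):
    """Read one CSV file, or every *.csv file in a folder, into a list of dicts."""
    path = Path(path)
    files = sorted(path.glob("*.csv")) if path.is_dir() else [path]
    rows = []
    for file in files:
        with file.open(newline="") as handle:
            rows.extend(csv.DictReader(handle))
    return rows
```

The returned list can be passed straight to `pandas.DataFrame` if a data frame is needed.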

You’ll find Azure ML Tables (mltable) to be much more useful when you’re faced with the following scenarios:
- The schema of your data is complex and/or changes frequently.
- You only need a subset of data (for example: a sample of rows or files, specific columns, etc.).
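
For the subset scenario, a sketch with the mltable SDK could look like the following. It assumes an artifact folder like the one created earlier; the column names you pass in and the 10% sampling rate are placeholders, and `load_subset` is a hypothetical helper name:

```python
def load_subset(folder, columns, probability=0.1, seed=0):
    """Load an MLTable artifact, keep some columns, and sample a fraction of rows."""
    # Imported here so the function can be defined even where the SDK is not installed.
    import mltable

    tbl = mltable.load(folder)
    tbl = tbl.keep_columns(columns)
    tbl = tbl.take_random_sample(probability=probability, seed=seed)
    return tbl.to_pandas_dataframe()
```

Each step returns a new table object, so the calls can be chained or reordered to suit the data.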