I am using the mltable library on an AzureML notebook.
I can successfully load a local CSV file as an mltable:
from mltable import from_delimited_files
paths = [{'file': "dati_estra_test.csv"}]
dati = from_delimited_files(paths)
And I can view it as a pandas dataframe:
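The snippet for this step did not survive in the post. A minimal sketch of what it most likely looked like, assuming the `to_pandas_dataframe()` method of the loaded table is what was used; the helper names `build_paths` and `load_as_dataframe` are hypothetical and only there for illustration:

```python
def build_paths(csv_files):
    """Turn plain file names into the list-of-dicts form passed to from_delimited_files."""
    return [{"file": str(name)} for name in csv_files]


def load_as_dataframe(csv_files):
    """Load the CSV files as an MLTable and materialize them as a pandas DataFrame."""
    # Imported lazily so the helpers above can be used without mltable installed.
    from mltable import from_delimited_files

    table = from_delimited_files(build_paths(csv_files))
    # to_pandas_dataframe() reads the data described by the table into memory.
    return table.to_pandas_dataframe()
```

Calling `load_as_dataframe(["dati_estra_test.csv"])` in the notebook should then return the data frame, provided the file is in the working directory.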
Is there a way to write this artifact as an MLTable artifact?
Or to register it as an mltable AzureML dataset?
2 Answers
Use the code block below to download the file.
It downloads the file to the specified folder.
In mltable version 1.0.0, a save method was introduced that writes out the MLTable file: https://learn.microsoft.com/python/api/mltable/mltable.mltable.mltable?view=azure-ml-py#mltable-mltable-mltable-save
Artifacts should be stored in a folder, so you first need to create a folder that contains dati_estra_test.csv.
Next, create and save the MLTable file using the SDK:
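The original commands for these two steps are missing from the post, so the following is only a sketch of how they could look. The folder name `dati_estra_test_mltable` and both helper names are assumptions; `from_delimited_files` and `save` are the mltable calls referred to above, and the exact paths written into the MLTable file may vary with the mltable version.

```python
import shutil
from pathlib import Path


def prepare_artifact_folder(csv_path, folder):
    """Create the artifact folder and copy the CSV into it; return the copy's path."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / Path(csv_path).name
    shutil.copy(csv_path, target)
    return target


def save_mltable(csv_in_folder):
    """Load the CSV that sits inside the artifact folder and write the MLTable file next to it."""
    # Imported lazily so the folder helper can be used without mltable installed.
    from mltable import from_delimited_files

    table = from_delimited_files([{"file": str(csv_in_folder)}])
    # save() writes a file named MLTable into the given folder (mltable >= 1.0.0).
    table.save(str(Path(csv_in_folder).parent))
    return table
```

In the notebook this would be used as `save_mltable(prepare_artifact_folder("dati_estra_test.csv", "dati_estra_test_mltable"))`, after which the folder should hold both the CSV and the MLTable file.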
You can create a data asset using either the CLI (note the path should be pointing to the artifact folder):
Or the Python SDK:
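The SDK snippet is missing here as well. A hedged sketch using the Azure ML Python SDK v2 (`azure-ai-ml`) follows; the subscription, resource group, workspace, asset name and version are placeholders you must supply, and the helper names `has_mltable_file` and `register_mltable` are hypothetical.

```python
import os


def has_mltable_file(folder):
    """Return True if the folder contains the MLTable file that an mltable asset needs."""
    return os.path.isfile(os.path.join(folder, "MLTable"))


def register_mltable(folder, name, version, subscription_id, resource_group, workspace):
    """Create (or update) an mltable data asset that points at the artifact folder."""
    # Imported lazily so the check above can be used without the Azure SDK installed.
    from azure.ai.ml import MLClient
    from azure.ai.ml.constants import AssetTypes
    from azure.ai.ml.entities import Data
    from azure.identity import DefaultAzureCredential

    if not has_mltable_file(folder):
        raise FileNotFoundError(f"No MLTable file found in {folder!r}")

    ml_client = MLClient(
        DefaultAzureCredential(), subscription_id, resource_group, workspace
    )
    # The path must point at the artifact folder, not at the CSV file itself.
    asset = Data(path=folder, type=AssetTypes.MLTABLE, name=name, version=version)
    return ml_client.data.create_or_update(asset)
```

The explicit check for the MLTable file is a design choice of this sketch, not something the SDK requires you to do yourself; it simply fails early if the save step was skipped.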
When the asset is created your artifact will automatically be uploaded to cloud storage (the default Azure ML Datastore).
It should be noted that it isn’t a requirement to use Azure ML Tables (mltable) when your data is tabular in nature. You can use Azure ML File (uri_file) and Folder (uri_folder) types, and provide your own parsing logic to materialize the data into a Pandas or Spark data frame. In cases where you have a simple CSV file or Parquet folder, you’ll probably find it easier to use Azure ML Files/Folders rather than Tables.
You’ll find Azure ML Tables (mltable) to be much more useful when you’re faced with the following scenarios: