I am trying to create a data asset on ADLS Gen2 and read a Delta table stored in an ADLS Gen2 folder that looks something like this:
/
└── my-data
    ├── _delta_log
    ├── part-0000-xxx.parquet
    └── part-0001-xxx.parquet
Currently, I create the data asset as a file dataset with the ML v1 APIs, but when I read the table it returns all the rows (even the deleted ones) rather than the most recent version.
I have also tried creating it with all the other data asset types for Azure ML v1/v2.
Ideally, I want to read the most recent version of the Delta table and also have the option to select a different version.
No success. How can I resolve this?
2 Answers
For the code below to work, you need to create an MLTable data asset with the correct folder path.
You can follow the procedure below to read the current version of the Delta table:
Assign the Storage Blob Data Contributor role on the ADLS account to the Entra ID identity you used to create the ML workspace. Then run the code below to read the current version of the Delta Lake table:
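The original snippet did not survive here, so the following is only a minimal sketch of how such a read could look with the `mltable` package (`pip install mltable`). It assumes the table is reachable through an `abfss://` URI; the helper names `build_adls_uri`, `delta_read_kwargs`, and `read_delta`, and the account and container placeholders, are hypothetical. To get the latest state it passes the current UTC time as `timestamp_as_of`; to pin a version it passes `version_as_of` instead.

```python
from datetime import datetime, timezone


def build_adls_uri(account, container, folder):
    """Build an abfss:// URI for a folder in an ADLS Gen2 container."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{folder.strip('/')}/"


def delta_read_kwargs(version=None, now=None):
    """Choose exactly one selector: a pinned version, or the current time."""
    if version is not None:
        return {"version_as_of": int(version)}
    now = now or datetime.now(timezone.utc)
    # RFC 3339 timestamp; "now" resolves to the most recent table version.
    return {"timestamp_as_of": now.strftime("%Y-%m-%dT%H:%M:%SZ")}


def read_delta(uri, version=None):
    """Load the Delta table at `uri` into a pandas DataFrame."""
    import mltable  # imported lazily so the helpers above work without it

    tbl = mltable.from_delta_lake(delta_table_uri=uri, **delta_read_kwargs(version))
    return tbl.to_pandas_dataframe()


# Example usage (placeholders are assumptions, replace with your own values):
# uri = build_adls_uri("<storage-account>", "<container>", "my-data")
# latest_df = read_delta(uri)             # most recent version
# older_df = read_delta(uri, version=0)   # a specific earlier version
```

Because the Delta log is honored, rows deleted in later versions should not appear when reading the latest version, unlike reading the raw parquet files as a file dataset.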
The output is the current version of the Delta table.
For more information, you can refer to this.