I’m trying to ingest data from the open-source, public Yahoo Finance API using Azure Data Factory. The endpoint I’m testing is https://query2.finance.yahoo.com/v8/finance/chart/GOLD.

I am able to ingest the data, but I’m running into an issue when transforming it in a data flow. I am trying to flatten the JSON response, which is a series of nested arrays with this structure:

JSON structure

To produce a table in the below format:

timestamp volume open low high close
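
For reference, here is a minimal Python sketch of the reshaping I’m after, done outside ADF. It assumes the response nests the arrays as `chart.result[0].timestamp` and `chart.result[0].indicators.quote[0].<field>`; the function name `flatten_chart` and the sample values are made up for illustration and are not real market data.

```python
def flatten_chart(payload):
    """Return one row per timestamp from a chart-style payload.

    Assumed layout (not confirmed by the post):
    payload["chart"]["result"][0]["timestamp"] is a list of epoch seconds, and
    payload["chart"]["result"][0]["indicators"]["quote"][0] maps each field
    name to a list of the same length.
    """
    result = payload["chart"]["result"][0]
    timestamps = result["timestamp"]
    quote = result["indicators"]["quote"][0]
    columns = ["volume", "open", "low", "high", "close"]

    rows = []
    for i, ts in enumerate(timestamps):
        # Pair the i-th timestamp with the i-th element of every parallel array.
        row = {"timestamp": ts}
        for col in columns:
            row[col] = quote[col][i]
        rows.append(row)
    return rows


# Hypothetical two-point sample, for illustration only.
sample_payload = {
    "chart": {
        "result": [
            {
                "timestamp": [1700000000, 1700086400],
                "indicators": {
                    "quote": [
                        {
                            "volume": [100, 200],
                            "open": [10.0, 11.0],
                            "low": [9.5, 10.5],
                            "high": [10.5, 11.5],
                            "close": [10.2, 11.2],
                        }
                    ]
                },
            }
        ]
    }
}

rows = flatten_chart(sample_payload)
for row in rows:
    print(row)
```

The point is that the arrays are parallel: row `i` of the table comes from element `i` of each array, rather than from unrolling each array independently.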

The setup of my flatten activity is as follows:

Flatten settings

The Partition option I’m using is Use current partitioning. This is what it looks like under the Inspect tab:

Inspect

However, when I try to preview the data, nothing comes up and the notifications in ADF show this error:

Could not fetch statistics due to operation timeout.

In the source, I’ve tried sampling the data down to only 10 rows and I get the same error, so I don’t think the row count is the issue. I have also tried a different API endpoint (MSFT) and get the same error there as well.

Any ideas appreciated!

Thanks,

Carolina

2 Answers


  1. Chosen as BEST ANSWER

    Figured it out! It was because the amount of data it was trying to ingest was too large. I set the query parameters as below and I'm now getting data through:

    enter image description here
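
Since the screenshot is not reproduced here, the following is only a sketch of what limiting the request with query parameters might look like. The parameter names `range` and `interval` and the values `5d` and `1d` are assumptions, not necessarily the settings used above, and `build_chart_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Endpoint taken from the question.
BASE_URL = "https://query2.finance.yahoo.com/v8/finance/chart/GOLD"


def build_chart_url(base_url, range_="5d", interval="1d"):
    """Append query parameters that narrow how much data is requested.

    "range" and "interval" are assumed parameter names; adjust them to
    whatever the API actually accepts.
    """
    query = urlencode({"range": range_, "interval": interval})
    return f"{base_url}?{query}"


url = build_chart_url(BASE_URL)
print(url)
```

In ADF, the same key/value pairs would be supplied on the source's request settings instead of being built in code.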


  2. The error is probably caused by the size of your data set. The default integration runtime (IR) used for debug mode in data flows is small: a 4-core single worker node with a 4-core single driver node. To work with a large data set, you need to increase the compute size of the debug cluster, as below:

    • Select large as compute size:
      enter image description here
    • You can also create an integration runtime with a large compute size directly and use it, as below:
      enter image description here

    You can also change how a data flow previews data. Click "Debug Settings" on the Data Flow canvas toolbar to open the debug settings, where you can reduce the number of rows shown in the preview:
    enter image description here

    You can also enable sampling on the source so that only a sample of the data is read for testing purposes:
    enter image description here
