
I need to do a one-time batch load from Azure to BigQuery, and I am new to the Google Cloud environment. I noticed there are numerous ways to do this, but it still isn't clear which option is the most efficient.

Any thoughts on this? Thank you

(EDIT)

I didn't have much information on the process when I first asked this question, and unfortunately it got a -1 due to the lack of detail. Feel free to comment and ask me to add more information to the question.
I now have more details and will add them in a comment in the reply space.

2 Answers


  1. Chosen as BEST ANSWER

    If you are new to GCP, there are two good options for a batch load:

    1. Storage Transfer Service (Google Data Transfer). You can send the data either to Cloud Storage or to BigQuery, and there are several options for the source (a minimal sketch of the follow-up load step appears after this list). Documentation: https://cloud.google.com/storage-transfer-service
    2. Cloud Data Fusion. You'll have to set up an instance, and then you can create your pipeline. If you are familiar with ETL tools, you will feel at ease setting up the connectors. You can extract data from various sources and, as the question states, use BigQuery as the destination. I highly recommend it as a straightforward solution. Documentation: https://cloud.google.com/data-fusion
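
    If you go with option 1 and land the files in Cloud Storage first, the last step is a plain BigQuery load job. Below is a minimal sketch of that step using the BigQuery Python client; the project, bucket, and table names are placeholders, and CSV with schema autodetection is just an assumed file format.

    ```python
    from google.cloud import bigquery

    client = bigquery.Client(project="my-project")  # hypothetical project ID

    # Assumes the Azure files were first copied to a Cloud Storage bucket
    # (e.g. by a Storage Transfer Service job) as CSV files.
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,
        autodetect=True,  # let BigQuery infer the schema
    )

    load_job = client.load_table_from_uri(
        "gs://my-landing-bucket/azure-export/*.csv",  # hypothetical landing path
        "my-project.my_dataset.my_table",             # hypothetical destination
        job_config=job_config,
    )
    load_job.result()  # block until the load job finishes

    table = client.get_table("my-project.my_dataset.my_table")
    print(f"Loaded {table.num_rows} rows.")
    ```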

  2. If you want to transfer large numbers of files from Azure Storage into BigQuery tables on a scheduled basis, use the BigQuery Data Transfer Service; if you want to read and process the data before loading it into BigQuery tables, use a CREATE TABLE AS SELECT statement.
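
    For reference, a transfer config for the Azure Blob Storage connector can be created with the Data Transfer Service Python client roughly as sketched below. This is only a sketch: the project, dataset, storage account, and container names are placeholders, and the exact parameter keys for the Azure connector should be verified against the Data Transfer Service documentation. A CREATE TABLE AS SELECT step that reshapes the landed data is sketched at the end.

    ```python
    from google.cloud import bigquery, bigquery_datatransfer

    project_id = "my-project"  # hypothetical project ID
    dataset_id = "my_dataset"  # hypothetical destination dataset

    transfer_client = bigquery_datatransfer.DataTransferServiceClient()

    # Transfer config for the Azure Blob Storage connector. The parameter keys
    # below are assumptions -- check them against the connector's documentation.
    transfer_config = bigquery_datatransfer.TransferConfig(
        destination_dataset_id=dataset_id,
        display_name="azure-blob-to-bq",
        data_source_id="azure_blob_storage",
        params={
            "storage_account": "mystorageaccount",       # hypothetical account
            "container": "exports",                      # hypothetical container
            "data_path": "sales/*.parquet",              # hypothetical path pattern
            "sas_token": "<sas-token>",                  # placeholder credential
            "file_format": "PARQUET",                    # assumed file format
            "destination_table_name_template": "sales",  # hypothetical table
        },
    )

    created = transfer_client.create_transfer_config(
        parent=transfer_client.common_project_path(project_id),
        transfer_config=transfer_config,
    )
    print(f"Created transfer config: {created.name}")

    # Once the raw data has landed, a CREATE TABLE AS SELECT statement can
    # read and reshape it into the final table (column names are hypothetical).
    bigquery.Client(project=project_id).query(
        f"""
        CREATE TABLE `{project_id}.{dataset_id}.sales_clean` AS
        SELECT order_id, CAST(amount AS NUMERIC) AS amount, order_date
        FROM `{project_id}.{dataset_id}.sales`
        WHERE amount IS NOT NULL
        """
    ).result()
    ```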

    These articles might also help you:

    https://cloud.google.com/bigquery/docs/loading-data

    https://cloud.google.com/migrate/compute-engine/docs/4.8/how-to/migrate-azure-to-gcp/overview
