I want to load data from MySQL into BigQuery using Cloud Dataflow. Can anyone share an article or hands-on experience with loading data from MySQL into BigQuery using Cloud Dataflow with Python?
Thank you.
2 Answers
You can use apache_beam.io.jdbc to read from your MySQL database, and the BigQuery I/O connector to write to BigQuery.
Some Beam knowledge is expected, so I recommend reading the Apache Beam Programming Guide first.
If you are looking for something pre-built, there is the Google-provided JDBC to BigQuery template, which is open source, but it is written in Java.
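As a rough illustration of the apache_beam.io.jdbc plus BigQuery I/O approach, here is a minimal sketch. The table name, JDBC URL, credentials, BigQuery table, schema string, and the Maven coordinate passed to classpath are all placeholder assumptions, not values from the question. ReadFromJdbc is a cross-language transform, so it should need a Java runtime available where the pipeline is launched.

```python
"""Sketch: copy a MySQL table to BigQuery with the Apache Beam Python SDK."""


def row_to_dict(row):
    """Turn a schema'd Beam row (a NamedTuple) into a plain dict for BigQuery."""
    return dict(row._asdict())


def run(argv=None):
    # Imported here so row_to_dict stays usable without Beam installed.
    import apache_beam as beam
    from apache_beam.io.jdbc import ReadFromJdbc
    from apache_beam.options.pipeline_options import PipelineOptions

    # Runner settings (--runner=DataflowRunner, --project, --region,
    # --temp_location, ...) are expected on the command line.
    options = PipelineOptions(argv)

    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadFromMySQL" >> ReadFromJdbc(
                table_name="customers",  # hypothetical source table
                driver_class_name="com.mysql.cj.jdbc.Driver",
                jdbc_url="jdbc:mysql://10.0.0.5:3306/shop",  # hypothetical host/db
                username="beam_reader",  # hypothetical credentials
                password="change-me",
                # Assumed driver coordinate; adjust to the driver you use.
                classpath=["mysql:mysql-connector-java:8.0.28"],
            )
            | "RowToDict" >> beam.Map(row_to_dict)
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                table="my-project:my_dataset.customers",  # hypothetical target
                schema="id:INTEGER,name:STRING,email:STRING",  # assumed columns
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
                write_disposition=beam.io.BigQueryDisposition.WRITE_TRUNCATE,
            )
        )
```

Calling run() with the usual Dataflow pipeline options would launch the job; WRITE_TRUNCATE is chosen here only so that reruns replace the table instead of appending to it.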
If you only want to copy data from MySQL to BigQuery, you can first export your MySQL data to Cloud Storage, then load the resulting file into a BigQuery table.
I don't think Dataflow is needed in this case, because there are no complex transformations or business logic involved. It is just a copy.
Export the MySQL data to Cloud Storage with a SQL query and the gcloud CLI, then load the CSV file into a BigQuery table with the bq CLI. In the load step, ./myschema.json is the BigQuery table schema.
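Since the question asks about Python, the export-then-load steps above could be scripted roughly as follows. This is a sketch that assumes the database is a Cloud SQL for MySQL instance (which is what gcloud sql export csv operates on) and that the gcloud and bq CLIs are installed and authenticated; the function names and all argument values are hypothetical.

```python
import subprocess


def export_cmd(instance, bucket_uri, database, query):
    """Build the `gcloud sql export csv` command as an argument list."""
    return [
        "gcloud", "sql", "export", "csv", instance, bucket_uri,
        f"--database={database}",
        f"--query={query}",
    ]


def load_cmd(table, bucket_uri, schema_path="./myschema.json"):
    """Build the `bq load` command: destination table, source URI, schema file."""
    return ["bq", "load", "--source_format=CSV", table, bucket_uri, schema_path]


def copy_table(instance, database, query, bucket_uri, table,
               schema_path="./myschema.json"):
    """Run the export, then the load; raises CalledProcessError on failure."""
    subprocess.run(export_cmd(instance, bucket_uri, database, query), check=True)
    subprocess.run(load_cmd(table, bucket_uri, schema_path), check=True)
```

For example, copy_table("my-instance", "shop", "SELECT id, name FROM customers", "gs://my-bucket/customers.csv", "my_dataset.customers") would run both steps in order. The instance's service account would also need write access to the bucket for the export to succeed.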