So forgive my ignorance, but I can’t seem to work this out.
I want to create a “table” in BigQuery, from an API call.
I am thinking of the Companies House company search API: https://developer.companieshouse.gov.uk/api/docs/search/companies/companysearch.html#here
I want to easily query the Companies House API, without writing oodles of code.
And then cross-reference that with other datasets, like the Facebook API and the LinkedIn API.
E.g. I want to input a company ID or name from Companies House and get a fuzzy list of the people and their likely social connections (Facebook, LinkedIn and Twitter).
Maybe BigQuery is the wrong tool for this? Should I just code it?
Or it is the right tool, and it is just not obvious to me how to add a dataset from an API, in which case please enlighten me.
2 Answers
You will not be able to use BigQuery directly to perform the task at hand. BigQuery is a web service for analyzing massive datasets, working in conjunction with Google Cloud Storage (or any other storage system).
The correct approach would be to make a curl request to collect all the data you require from Companies House and save it as a CSV file. You can then upload the CSV to Google Cloud Storage and load the data into BigQuery.
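As a rough illustration of the "fetch, then save as CSV" step, here is a minimal Python sketch using only the standard library. The endpoint URL, the Basic-auth scheme (API key as username, empty password) and the field names in `FIELDS` are assumptions about the Companies House search API rather than anything stated in this thread, so check them against the official documentation before relying on them.

```python
import base64
import csv
import io
import json
import urllib.parse
import urllib.request

# Assumed endpoint and field names; verify against the Companies House docs.
SEARCH_URL = "https://api.company-information.service.gov.uk/search/companies"
FIELDS = ["company_number", "title", "company_status", "address_snippet"]


def fetch_companies(query, api_key):
    """Call the company search endpoint and return the list under "items"."""
    url = SEARCH_URL + "?" + urllib.parse.urlencode({"q": query})
    # Assumption: HTTP Basic auth with the API key as username, empty password.
    token = base64.b64encode((api_key + ":").encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": "Basic " + token})
    with urllib.request.urlopen(request) as response:
        return json.load(response).get("items", [])


def items_to_csv(items, fields=FIELDS):
    """Flatten search results into CSV text, keeping only the chosen fields."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    for item in items:
        # Missing fields become empty cells instead of raising KeyError.
        writer.writerow({name: item.get(name, "") for name in fields})
    return buffer.getvalue()
```

Something like `items_to_csv(fetch_companies("example", api_key))` would produce text you can write to a file, upload to Cloud Storage and load into BigQuery.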
If you simply wish to link clients from Companies House with social media applications such as Facebook or LinkedIn, then you may not even need BigQuery. You could build a structured table using Google Cloud SQL. The fields would hold the necessary client information, and you could later compare them with the Facebook or LinkedIn API responses.
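The comparison step could be sketched roughly as below, using Python's standard `difflib` for fuzzy name matching. The function names, the 0.6 threshold and the sample names are all hypothetical. In practice the candidate names would come from whatever the Facebook or LinkedIn APIs return, which is not covered here.

```python
from difflib import SequenceMatcher


def normalise(name):
    """Lower-case, drop commas and sort tokens so 'SMITH, John' ~ 'john smith'."""
    tokens = name.lower().replace(",", " ").split()
    return " ".join(sorted(tokens))


def rank_candidates(officer_name, profile_names, threshold=0.6):
    """Return (score, profile_name) pairs at or above the threshold, best first."""
    target = normalise(officer_name)
    scored = []
    for candidate in profile_names:
        score = SequenceMatcher(None, target, normalise(candidate)).ratio()
        if score >= threshold:
            scored.append((round(score, 2), candidate))
    return sorted(scored, reverse=True)
```

Character-level similarity is only a crude first filter. A real match would also need other signals (employer, location) before treating two records as the same person.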
So, if you are looking to load data from various sources and run BigQuery operations through the API: yes, there is a way. Adding to the previous answer, BigQuery is meant only for analytical queries (on big data). Otherwise it is going to cost you a lot, and it will be slower than a regular search API if you intend to run thousands of search queries on big datasets, joining various tables and so on.
Let’s try querying one of the BigQuery public datasets through the API.
To authenticate, you will need to generate an authentication token using your application default credentials.
Now, using the token generated by the gcloud command, you can make REST API calls.
Response:
Query – The most popular tags on Stack Overflow questions linked from Hacker News since 2014:
Result:
So, we run some of our analytical queries through the API to build periodic reports. I will let you explore the other options and the BigQuery API for creating and loading data.
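For reference, the REST call described in this answer could look roughly like the Python sketch below. It targets the BigQuery `jobs.query` endpoint and expects a bearer token, for example one printed by `gcloud auth application-default print-access-token`, which is presumably the gcloud command meant above. The project ID, the token value and the SQL text are placeholders, not the original query from this answer.

```python
import json
import urllib.request

# BigQuery REST endpoint for synchronous queries (jobs.query).
QUERY_URL = "https://bigquery.googleapis.com/bigquery/v2/projects/{project}/queries"


def build_query_request(project_id, access_token, sql):
    """Build (but do not send) a POST request that runs `sql` as standard SQL."""
    body = json.dumps({"query": sql, "useLegacySql": False}).encode("utf-8")
    return urllib.request.Request(
        QUERY_URL.format(project=project_id),
        data=body,
        method="POST",
        headers={
            "Authorization": "Bearer " + access_token,
            "Content-Type": "application/json",
        },
    )


def run_query(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `run_query(build_query_request("my-project", token, "SELECT 1 AS x"))` with a valid token should return a JSON document describing the result schema and rows. Keeping the request builder separate from the network call makes it easy to inspect what will be sent.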