I am working on a data mining system, and one of the requirements is that it can perform the analysis without using the API. Is there a way to download the Twitter database (or at least a big part of it) and work with it locally?
Question posted in Twitter API
The official Twitter API documentation can be found here.
2 Answers
The APIs are the official way of getting Twitter data, and they work well, so it is not clear why you want to avoid them. Web scraping is a workaround, but it is not recommended, and since you want a big part of the data, I do not think it would satisfy you. You can also buy the data from Gnip.
There is a paper about creating corpora from Twitter called “TWORPUS – An Easy-to-Use Tool for the Creation of Tailored Twitter Corpora”. I recommend reading it because it also covers licensing issues. The authors also provide their code on GitHub.
In fact, you cannot download Twitter data dumps directly. You can download single tweets and store them in a corpus, but you are not allowed to share that data. Therefore, the authors built the Tworpus client so that users can create their own private Twitter corpora.
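To give an idea of what storing single tweets in a private corpus could look like, here is a minimal sketch. It is not how Tworpus itself works; the function names, the JSON Lines file layout, and the `id`/`text` fields are assumptions made for illustration. It only covers the local storage side and assumes you have already obtained the tweets through a channel you are permitted to use.

```python
import json
from pathlib import Path


def append_to_corpus(corpus_path, tweets):
    """Append tweets to a JSON Lines corpus file, skipping IDs already stored.

    Each tweet is assumed to be a dict with at least an 'id' key
    (hypothetical record layout). Returns the number of tweets written.
    """
    path = Path(corpus_path)

    # Collect the IDs that are already in the corpus so reruns do not duplicate them.
    seen = set()
    if path.exists():
        with path.open(encoding="utf-8") as f:
            seen = {json.loads(line)["id"] for line in f if line.strip()}

    added = 0
    with path.open("a", encoding="utf-8") as f:
        for tweet in tweets:
            if tweet["id"] in seen:
                continue
            # One JSON object per line; keep non-ASCII characters readable.
            f.write(json.dumps(tweet, ensure_ascii=False) + "\n")
            seen.add(tweet["id"])
            added += 1
    return added


def load_corpus(corpus_path):
    """Read the whole corpus back into a list of dicts for local analysis."""
    with Path(corpus_path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

You would call `append_to_corpus("corpus.jsonl", batch)` each time you have a new batch of tweets and `load_corpus("corpus.jsonl")` when you want to analyse them offline. Keeping the file private is what matters for the licensing point above.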