I am trying to extract Twitter data using the REST API in Zeppelin. I tried both registerAsTable and registerTempTable, but neither works. Please help me resolve the error. I get the error below when executing the Zeppelin tutorial code:
error: value registerAsTable is not a member of org.apache.spark.rdd.RDD[Tweet]
).foreachRDD(rdd=> rdd.registerAsTable("tweets")
3 Answers
An RDD cannot be registered as a table, but a DataFrame can. Convert your RDD into a DataFrame, then register the resulting DataFrame as a temporary table or save it as a table.
You can convert an RDD into a DataFrame as below:
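A minimal sketch of the conversion, assuming Spark 2.x. The Tweet case class here is a hypothetical stand-in for the tutorial's class, and the sample rows are made up for illustration. In Zeppelin the spark session already exists, so only the body of textsAfter would be needed there. On Spark 1.x the equivalent calls would be import sqlContext.implicits._ and df.registerTempTable("tweets").

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical stand-in for the tutorial's Tweet class.
case class Tweet(createdAt: Long, text: String)

object RddToDataFrame {
  // Registers a small RDD[Tweet] as the temp view "tweets"
  // and returns the text of every tweet newer than minCreatedAt.
  def textsAfter(spark: SparkSession, minCreatedAt: Long): Array[String] = {
    import spark.implicits._ // brings rdd.toDF() and the String encoder into scope

    // Sample data, invented for this example.
    val rdd = spark.sparkContext.parallelize(Seq(Tweet(1L, "hello"), Tweet(2L, "spark")))

    // An RDD has no register* method; a DataFrame does.
    val df = rdd.toDF()
    df.createOrReplaceTempView("tweets")

    spark.sql(s"SELECT text FROM tweets WHERE createdAt > $minCreatedAt")
      .as[String]
      .collect()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("rdd-to-df").master("local[*]").getOrCreate()
    textsAfter(spark, 1L).foreach(println)
    spark.stop()
  }
}
```

The case class is declared at the top level rather than inside a method so that Spark can derive its schema when toDF() is called.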
Refer to "How to convert rdd object to dataframe in spark" and http://spark.apache.org/docs/latest/sql-programming-guide.html
In the Zeppelin interpreter settings, add the external dependency org.apache.bahir:spark-streaming-twitter_2.11:2.0.0 from the GUI, then run the following using Spark 2.0.1:
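As a rough illustration of the pattern (not the exact tutorial code), the sketch below converts each batch RDD to a DataFrame inside foreachRDD before registering it. To stay runnable without Twitter credentials it swaps the Twitter source for queueStream with made-up data; the Tweet case class and the object name are assumptions for this example.

```scala
import scala.collection.mutable
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical stand-in for the tutorial's Tweet class.
case class Tweet(createdAt: Long, text: String)

object StreamToTempView {
  // Streams one batch of sample tweets, registers it as the temp view
  // "tweets", and returns the number of rows visible through that view.
  def countAfterOneBatch(spark: SparkSession): Long = {
    import spark.implicits._ // brings rdd.toDF() into scope

    val ssc = new StreamingContext(spark.sparkContext, Seconds(1))

    // queueStream stands in for the Twitter stream; the data is invented.
    val queue = mutable.Queue[RDD[Tweet]](
      spark.sparkContext.parallelize(Seq(Tweet(1L, "hello"), Tweet(2L, "zeppelin"))))

    ssc.queueStream(queue).foreachRDD { rdd =>
      // Convert to a DataFrame first; skip empty batches so they do not
      // replace the view with an empty table.
      if (!rdd.isEmpty()) rdd.toDF().createOrReplaceTempView("tweets")
    }

    ssc.start()
    ssc.awaitTerminationOrTimeout(5000)
    // Keep the SparkContext alive so the view can still be queried.
    ssc.stop(stopSparkContext = false, stopGracefully = true)

    spark.table("tweets").count()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("stream-to-view").master("local[2]").getOrCreate()
    println(countAfterOneBatch(spark))
    spark.stop()
  }
}
```

Because createOrReplaceTempView replaces the view on every batch, the table only ever holds the most recent non-empty batch in this sketch.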
After that, run some queries against the table in another Zeppelin cell:
//convert RDD to DF