I have been working on a sentiment analysis project on emojis. I only want tweets that contain emojis, and I don't want to filter them manually. Is there a way to change the code below so that it returns only tweets that have emojis in them? For example, if I scrape 100 tweets, all 100 should contain some emoji along with some text. Any help will be highly appreciated.
For example, I only want tweets like this:
when is @McDonalds_SA gonna let us add spicy sauce on our veg burgers when we order on MrD or Uber eats 😭😭😭😭
Code:
library(rtweet)
get_token() # Connects to the Twitter API
Uber <- search_tweets("uber", n = 100, lang = "en")
2 Answers
Here is a simple solution based on a regex that matches emoji. Let me know if this works.
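A minimal sketch of such a regex filter in base R. The code-point ranges and the has_emoji() helper are illustrative assumptions rather than a complete definition of "emoji"; the wide upper range also sweeps in a few non-emoji symbol blocks.

```r
# Character class covering a few emoji-heavy Unicode ranges (an assumption,
# not an exhaustive list): Miscellaneous Symbols and Dingbats
# (U+2600-U+27BF) plus everything from U+1F300 to U+1FAFF.
emoji_pattern <- "[\u2600-\u27BF\U0001F300-\U0001FAFF]"

# TRUE for every string that contains at least one character in those ranges.
# perl = TRUE makes the pattern run through PCRE in UTF-8 mode.
has_emoji <- function(text) {
  grepl(emoji_pattern, enc2utf8(text), perl = TRUE)
}

tweets <- c("plain text only",
            "spicy sauce please \U0001F62D",
            "order #42 * now")
has_emoji(tweets)
```

Only the second string is flagged; the digits, # and * in the third string are left alone because they fall outside the chosen ranges.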
Note: I assume you're not looking for every character that Unicode classifies as Emoji, since that set includes quite common characters such as #, * and the digits 0-9
(from https://unicode.org/Public/UNIDATA/emoji/emoji-data.txt)
Unicode library
To get the Unicode block for one or more characters, we can use the Unicode library:
A few examples:
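As a package-free sketch of the block lookup, the snippet below uses a small hand-written table in base R instead of the Unicode package's own API. The get_block() helper and the selection of blocks are hypothetical and only for illustration; the ranges themselves follow the Unicode block list.

```r
# Hand-written table of a few Unicode blocks (illustrative selection only).
blocks <- data.frame(
  start = c(0x0000, 0x2600, 0x2700, 0x1F300, 0x1F600, 0x1F680, 0x1F900),
  end   = c(0x007F, 0x26FF, 0x27BF, 0x1F5FF, 0x1F64F, 0x1F6FF, 0x1F9FF),
  name  = c("Basic Latin", "Miscellaneous Symbols", "Dingbats",
            "Miscellaneous Symbols and Pictographs", "Emoticons",
            "Transport and Map Symbols",
            "Supplemental Symbols and Pictographs"),
  stringsAsFactors = FALSE
)

# Return the block name for each character of a single string
# (NA when the character falls outside the table above).
get_block <- function(s) {
  cps <- utf8ToInt(enc2utf8(s))
  vapply(cps, function(cp) {
    hit <- which(cp >= blocks$start & cp <= blocks$end)
    if (length(hit) == 1) blocks$name[hit] else NA_character_
  }, character(1))
}

get_block("a")                 # Basic Latin
get_block("\U0001F62D")        # Emoticons
get_block("\u2708\U0001F680")  # Dingbats, Transport and Map Symbols
```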
Matching all emoji-like characters could be done like this:
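One way to sketch this in base R is a character class built from the block ranges and used to pull the matching characters out of each string. The choice of blocks and the extract_emoji() helper are assumptions for illustration; adjust the ranges to whatever you count as "emoji-like".

```r
# Blocks treated as emoji-like here (an assumption): Miscellaneous Symbols
# and Dingbats, Miscellaneous Symbols and Pictographs through Emoticons,
# Transport and Map Symbols, Supplemental Symbols and Pictographs.
emoji_like <- paste0(
  "[\u2600-\u27BF",
  "\U0001F300-\U0001F64F",
  "\U0001F680-\U0001F6FF",
  "\U0001F900-\U0001F9FF]"
)

# Return, for each input string, a character vector of the matched characters.
extract_emoji <- function(text) {
  text <- enc2utf8(text)
  regmatches(text, gregexpr(emoji_like, text, perl = TRUE))
}

found <- extract_emoji(c("Uber eats \U0001F62D\U0001F62D",
                         "no symbols",
                         "late \u2708 again"))
lengths(found)
```

The result is a list with one element per input string, so lengths(found) > 0 doubles as an "has at least one emoji-like character" test.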
Integrating into your code
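A rough sketch of the integration step, assuming the result of search_tweets() is a data frame with a text column. A small stand-in data frame is used here so the snippet runs without Twitter credentials; the pattern and the Uber_emoji name are illustrative.

```r
# Stand-in for: Uber <- search_tweets("uber", n = 100, lang = "en")
# Only the text column matters for the filter.
Uber <- data.frame(
  text = c("Uber eats \U0001F62D\U0001F62D",
           "uber was late again",
           "love my driver \U0001F600"),
  stringsAsFactors = FALSE
)

# Emoji-like code-point ranges (an assumption, not an exhaustive list).
emoji_pattern <- "[\u2600-\u27BF\U0001F300-\U0001FAFF]"

# Keep only the rows whose text contains at least one emoji-like character.
keep <- grepl(emoji_pattern, enc2utf8(Uber$text), perl = TRUE)
Uber_emoji <- Uber[keep, , drop = FALSE]
nrow(Uber_emoji)
```

Because the filter runs after the search, requesting n = 100 yields at most 100 emoji tweets and usually fewer. To end up with 100, request a larger n and then take head(Uber_emoji, 100).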