I have two tables of Twitter API data bound together and I want a function that determines whether the text contains the word f150. If it does, it should return ford; if not, it should search the text for the word Silverado and, if Silverado is found, return chevy. All others should be null.
I saw this online but it isn’t working for me. Also, are there wildcards in R like in SQL?
`tweet_sentiments <-
  tweet_sentiments %>%
  mutate(vehicle = if(text = "f150") {ford}
  else_if(text= "silverado"){Chevy})`
2 Answers
1) We can use `case_when`.
2) If it is a substring match, then use `str_detect`.
3) Another option is `%like%`.
4) Another option is `rowwise` with `if`/`else`.
5) Or we can use `fuzzy_join`.
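As a rough sketch of the first two options combined: this assumes `dplyr` and `stringr` are installed and that the bound table has a character column named `text`. The sample rows are invented for illustration.

```r
library(dplyr)
library(stringr)

# Hypothetical stand-in for the real tweet table
tweet_sentiments <- data.frame(
  text = c("Love my new F150!", "Silverado towing test", "Just had coffee"),
  stringsAsFactors = FALSE
)

tweet_sentiments <- tweet_sentiments %>%
  mutate(
    vehicle = case_when(
      # regex(..., ignore_case = TRUE) matches "F150" as well as "f150"
      str_detect(text, regex("f150", ignore_case = TRUE))      ~ "ford",
      str_detect(text, regex("silverado", ignore_case = TRUE)) ~ "chevy",
      # everything else gets a missing value of character type
      TRUE                                                     ~ NA_character_
    )
  )
```

Conditions are checked in order, so a tweet mentioning both words would be labelled `"ford"` here.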
It is legal to use `if` in a `mutate` call, but the way you use it here is wrong. Since you want to condition on a vector, you should consider `ifelse` (base R) or `if_else` (in `dplyr`). The first change to your code is to switch to one of those vectorized functions.
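A minimal runnable sketch of that vectorized form, assuming `dplyr` is installed and using made-up rows in which `text` holds only the bare keyword. It already uses `==` and a character-typed `NA` default, for the reasons discussed next.

```r
library(dplyr)

# Hypothetical sample data; each `text` value is just the keyword
tweet_sentiments <- data.frame(
  text = c("f150", "silverado", "camry"),
  stringsAsFactors = FALSE
)

tweet_sentiments <- tweet_sentiments %>%
  mutate(
    vehicle = if_else(
      text == "f150", "ford",
      # the nested if_else supplies the value for rows that are not "f150"
      if_else(text == "silverado", "chevy", NA_character_)
    )
  )
```

This only matches when the whole string equals the keyword. Real tweets need substring matching, which is covered further down.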
`text = 'f150'` is an assignment; you need a comparison, which is `==` for equality, as in `text == "f150"`.

You need a default value, one that is assigned if `text` is neither `"f150"` nor `"silverado"`. Options include a literal string like `"unknown"`, or the R-idiomatic `NA` (which means effectively "not-applicable" or "could be anything").

(R has at least six kinds of
`NA`, and `if_else` is rather particular about keeping its `true=` and `false=` arguments the same class, so with character results the safe default is `NA_character_` rather than a bare `NA`. If you used `ifelse` instead, you could have kept it at `NA`, at the risk of several of the other problems that `base::ifelse` presents. It has baggage.)

You mentioned wildcards, which suggests that you may want to find `"f150"` as a substring in the `text`, in which case we will want `grepl`, as in `grepl("f150", text)`.

`grepl`
supports `ignore.case=` as well, in case you want to consider case-insensitive comparisons.

Lastly, working this back around to a
`dplyr`-idiomatic way of doing things … whenever I see more than one nested `ifelse(…)`, I immediately recommend `dplyr::case_when`. For instance, if you add another car type or two, the nested calls get unwieldy, but they can be cleaned up (indents and parens) with `case_when`.
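To make the comparison concrete, here is a sketch of both forms using `grepl` for substring matching. It assumes `dplyr` is installed. The sample rows and the third vehicle (`"ram"` mapped to `"dodge"`) are invented for illustration.

```r
library(dplyr)

# Hypothetical sample rows; "ram" -> "dodge" is an invented third case
tweet_sentiments <- data.frame(
  text = c("Love my F150", "silverado towing", "RAM 1500 review", "coffee time"),
  stringsAsFactors = FALSE
)

# Nested ifelse(): works, but every new vehicle adds another level
nested <- tweet_sentiments %>%
  mutate(vehicle = ifelse(grepl("f150", text, ignore.case = TRUE), "ford",
                   ifelse(grepl("silverado", text, ignore.case = TRUE), "chevy",
                   ifelse(grepl("ram", text, ignore.case = TRUE), "dodge",
                          NA_character_))))

# case_when(): one condition per line, the first matching condition wins
flat <- tweet_sentiments %>%
  mutate(vehicle = case_when(
    grepl("f150", text, ignore.case = TRUE)      ~ "ford",
    grepl("silverado", text, ignore.case = TRUE) ~ "chevy",
    grepl("ram", text, ignore.case = TRUE)       ~ "dodge",
    TRUE                                         ~ NA_character_
  ))

# Both approaches label the rows identically
same_result <- identical(nested$vehicle, flat$vehicle)
```

Plain substring patterns are crude: `"ram"` would also match words such as "program", so a real pattern would probably need word boundaries.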
Since you asked about "wildcards", if you don’t know about regular expressions, or don’t know the difference between regex and glob-style patterns, then I suggest you look at https://stackoverflow.com/a/22944075/3358272 (and perhaps
`?glob2rx`, for converting glob-style to regex, since the `grep*` functions only deal with regex or fixed strings).
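As a small illustration of that last point, `glob2rx` (from the `utils` package that ships with R) converts a glob-style pattern, which uses `*` and `?` where SQL `LIKE` uses `%` and `_`, into a regex that `grepl` accepts. The sample strings are invented.

```r
# glob2rx() comes with R (package utils); no extra packages are needed
pattern <- glob2rx("*f150*")   # in a glob, "*" means "any run of characters"

texts <- c("my f150 truck", "new silverado", "F150 for sale")

# ignore.case = TRUE lets "F150" match as well as "f150"
matches <- grepl(pattern, texts, ignore.case = TRUE)
```

Here `matches` is `TRUE` for the first and third strings and `FALSE` for the second.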