I would like to create a new column that records which keyword was matched by grep. I have a data frame and a list of keywords, and I want to check whether each row of the data frame contains any of those keywords. When a row does contain a keyword, I would like a newly created column to show which word matched.
This is what my data looks like:
id // skills
1 // this is a skill for xoobst
2 // artificial intelligence
3 // logistic regression
I used the code below to grep for the words:
keyword <- "xoobst|logistic|intelligence"
result <- df[grep(keyword, df$skills, ignore.case = T),]
This is the outcome I want:
id // skills // words
1 // this is a skill for xoobst // xoobst
2 // artificial intelligence // intelligence
3 // logistic regression // logistic
I tried the code below, but it gave me the full sentence rather than the keyword that matched:
keys <- sprintf(".*(%s).*", keyword)
df$words <- sub(keys, "\\1", df$skills)
What alternative approach should I use? Thank you in advance!
3 Answers
You can use stringr:
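A minimal sketch of what the stringr approach might look like, assuming the `df` and `keyword` from the question (the data frame is rebuilt here only so the snippet runs on its own). `str_extract()` returns the first keyword found in each row, or `NA` when nothing matches:

```r
library(stringr)

# Sample data rebuilt from the question (assumed layout)
df <- data.frame(
  id = 1:3,
  skills = c("this is a skill for xoobst",
             "artificial intelligence",
             "logistic regression"),
  stringsAsFactors = FALSE
)
keyword <- "xoobst|logistic|intelligence"

# Extract the first matching keyword per row, ignoring case
df$words <- str_extract(df$skills, regex(keyword, ignore_case = TRUE))
df
```

If a row can contain several keywords, `str_extract_all()` would be the variant to look at, since `str_extract()` only keeps the first hit.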
Using R base functions:
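One possible base R version, sketched with `regexpr()` and `regmatches()` and assuming the same `df` and `keyword` as in the question. Rows without a match are left as `NA`:

```r
# Sample data rebuilt from the question (assumed layout)
df <- data.frame(
  id = 1:3,
  skills = c("this is a skill for xoobst",
             "artificial intelligence",
             "logistic regression"),
  stringsAsFactors = FALSE
)
keyword <- "xoobst|logistic|intelligence"

# Position of the first keyword match in each row (-1 means no match)
m <- regexpr(keyword, df$skills, ignore.case = TRUE)
matched <- as.vector(m) != -1

# regmatches() returns only the matched substrings, so fill matched rows only
df$words <- NA_character_
df$words[matched] <- regmatches(df$skills, m)
df
```

This also explains the result in the question: `sub()` leaves a string unchanged when the pattern does not match, so non-matching rows come back as the full sentence.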
Using grep with sapply and strsplit. This assumes that single keywords don’t contain spaces.
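A rough sketch of how that could be written, assuming the `df` and `keyword` from the question. Each sentence is split on spaces, `grep(..., value = TRUE)` keeps the tokens that match a keyword, and multiple hits are joined with a comma (the separator is an arbitrary choice here):

```r
# Sample data rebuilt from the question (assumed layout)
df <- data.frame(
  id = 1:3,
  skills = c("this is a skill for xoobst",
             "artificial intelligence",
             "logistic regression"),
  stringsAsFactors = FALSE
)
keyword <- "xoobst|logistic|intelligence"

# Split each sentence into words, then keep the words that match a keyword
df$words <- sapply(strsplit(df$skills, " ", fixed = TRUE), function(tokens) {
  hits <- grep(keyword, tokens, ignore.case = TRUE, value = TRUE)
  paste(hits, collapse = ", ")  # empty string when nothing matches
})
df
```

Note that this returns the whole token containing the keyword, so a word like "logistics" would be reported as "logistics" rather than "logistic".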