I'm trying to convert a Python project (which uses Selenium to scrape tweets without going through the limited Twitter API) to R. It works fine in Python, but I want to recreate it in R. I'm new to R, but I have some MATLAB experience if that helps.
install.packages("RSelenium") # install RSelenium 1.7.1
As far as I’m aware, the package has been updated, so instead of startServer() I need to use other functions. But from all my research I get slightly conflicting answers, none of which work:
require(RSelenium) # tried both require() and library()
remDr <- remoteDriver(browserName = "chrome")
remDr$open()
I get this error:
[1] "Connecting to remote server"
Error in checkError(res) :
Undefined error in httr call. httr output: Failed to connect to localhost port 4444: Connection refused
I also tried:
require(RSelenium)
remDr <- rsDriver(browser = c("chrome"))
and I get:
checking Selenium Server versions:
BEGIN: PREDOWNLOAD
BEGIN: DOWNLOAD
BEGIN: POSTDOWNLOAD
checking chromedriver versions:
BEGIN: PREDOWNLOAD
BEGIN: DOWNLOAD
BEGIN: POSTDOWNLOAD
checking geckodriver versions:
BEGIN: PREDOWNLOAD
BEGIN: DOWNLOAD
BEGIN: POSTDOWNLOAD
checking phantomjs versions:
BEGIN: PREDOWNLOAD
BEGIN: DOWNLOAD
BEGIN: POSTDOWNLOAD
[1] "Connecting to remote server"
The Chrome browser (61.0.3163.100) launches, but execution hangs on that last line, so I cannot run the next line of my code. The browser stays open for about half a minute before closing itself, and I get this error:
Selenium message:unknown error: unable to discover open pages
(Driver info: chromedriver=2.33.506120 (e3e53437346286c0bc2d2dc9aa4915ba81d9023f),platform=Windows NT 6.1.7601 SP1 x86_64) (WARNING: The server did not provide any stacktrace information)
Command duration or timeout: 60.44 seconds
Build info: version: '3.6.0', revision: '6fbf3ec767', time: '2017-09-27T16:15:40.131Z'
System info: host: 'RENTEC-THINK', ip: '192.168.56.1', os.name: 'Windows 7', os.arch: 'amd64', os.version: '6.1', java.version: '1.8.0_144'
Driver info: driver.version: unknown
Error: Summary: UnknownError
Detail: An unknown server-side error occurred while processing the command.
Further Details: run errorDetails method
I’ve tried several different things, including downloading a ChromeDriver binary (v2.33 should support Chrome v60-62: https://sites.google.com/a/chromium.org/chromedriver/downloads) and either passing its path to remoteDriver or adding the path as a system environment variable.
It’s as if nothing I do works, as if the RSelenium update broke everything. Am I doing something stupid?
With all the inconsistent answers I’ve seen online, I’ve reached the point where I’m trying different combinations of different lines of code, mixing and matching everything in a desperate attempt to get this working through trial and error alone.
My next step is to find out where R installed RSelenium and then look at what is in the code 🙁
I was also thinking about Docker, but I’d rather not install separate applications just to get my code to work.
2 Answers
Try:
Sometimes the driver tries to open too quickly and you get the “Failed to connect to localhost port 4444: Connection refused” error.
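If the problem really is timing, one way to work around it is to retry the connection a few times with a short pause in between. The following is only a minimal sketch of that idea, not the code from this answer: `retry_connect` is a hypothetical helper name, and the attempt count and wait time are arbitrary assumptions.

```r
# Hypothetical helper: retry a connection attempt a few times,
# pausing between tries. `connect` is any zero-argument function
# that raises an error when the connection fails.
retry_connect <- function(connect, attempts = 5, wait_seconds = 2) {
  last_error <- NULL
  for (i in seq_len(attempts)) {
    result <- tryCatch(connect(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    last_error <- result
    message("Attempt ", i, " failed: ", conditionMessage(result))
    if (i < attempts) Sys.sleep(wait_seconds)
  }
  stop("Could not connect after ", attempts, " attempts: ",
       conditionMessage(last_error))
}

# Possible usage with RSelenium (assumes a Selenium server is
# starting up on localhost:4444):
# remDr <- RSelenium::remoteDriver(browserName = "chrome")
# retry_connect(function() remDr$open())
```

This only helps when the server is slow to come up; it will not fix a browser/driver version mismatch.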
The following worked for me. Note the browser, Selenium and driver versions…
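Since the versions are what matter here, one thing worth trying is pinning the chromedriver version explicitly rather than letting rsDriver pick the latest one. This is a sketch under assumptions, not the code this answer originally showed: `start_pinned_chrome` is a hypothetical helper name, "2.33" is the driver version mentioned in the question, and the version string has to match one that is actually available on your machine.

```r
# Hypothetical helper: start a Selenium server and Chrome client
# with a pinned chromedriver version via rsDriver's chromever argument.
start_pinned_chrome <- function(chromever = "2.33", port = 4567L) {
  if (!requireNamespace("RSelenium", quietly = TRUE)) {
    stop("RSelenium is not installed")
  }
  # rsDriver returns a list holding both the server and the client.
  RSelenium::rsDriver(browser = "chrome", chromever = chromever, port = port)
}

# Possible usage:
# rD <- start_pinned_chrome("2.33")
# remDr <- rD$client
# remDr$navigate("https://twitter.com")
# rD$server$stop()
```

Keeping the whole returned list around (rather than just the client) makes it possible to stop the server cleanly afterwards.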