
Scraping a PDF or its text when the file is embedded in HTML

Using R, I am trying to get the text (ideally, with some formatting) of a PDF embedded in HTML. The URL, as an example, is "https://www.nycourts.gov/courts/ad2/Handdowns/2024/10-October/10-02-2024_FINAL_HANDDOWN_LIST.pdf". Using pdf_text doesn't work: > pdf_text <- pdf_text("https://www.nycourts.gov/courts/ad2/Handdowns/2024/10-October/10-02-2024_FINAL_HANDDOWN_LIST.pdf") Error in open.connection(con, "rb") : cannot…
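One thing that may be worth trying is to download the file to disk first and then parse the local copy, so that any connection problem is separated from the PDF parsing step. The sketch below assumes the failure happens while opening the remote connection, which the truncated error message does not confirm. The helper name fetch_pdf_text is hypothetical and not part of the question; it relies only on download.file from base R and pdf_text from the pdftools package.

```r
# Hypothetical helper (not from the original question): download the PDF to a
# temporary file, then extract its text from the local copy.
fetch_pdf_text <- function(url) {
  dest <- tempfile(fileext = ".pdf")
  on.exit(unlink(dest), add = TRUE)  # remove the temporary file when done

  # mode = "wb" writes the file in binary mode so the PDF is not corrupted
  status <- utils::download.file(url, destfile = dest, mode = "wb", quiet = TRUE)
  if (status != 0) {
    stop("download failed with status ", status)
  }

  # pdf_text() returns a character vector with one element per page
  pdftools::pdf_text(dest)
}
```

Calling fetch_pdf_text() with the URL from the question and passing the first element of the result to cat() would print the first page, if the download succeeds. Note also that the original call assigns the result to a variable named pdf_text, which shares its name with the function; a different name such as pages avoids confusion. If positional information is needed to approximate formatting, pdftools also provides pdf_data(), which returns word-level coordinates per page.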
