I’m trying to download zip codes that are spread across several pages. I started with a list of link nodes, one for each municipality in Mexico City:
url<-"https://www.codigopostal.lat/mexico/Ciudad-de-Mexico/"
resource<-GET(url)
parse<-htmlParse(resource)
links<-as.character(xpathSApply(parse,path="//a",xmlGetAttr,"href"))
print(links)
Then I’m trying to write a loop that visits each URL and pulls the table of zip codes, so that I can later combine the per-municipality matrices into one big data set:
library(rvest)
library(janitor)

scraper <- function(url){
  html <- read_html(url)
  # grab the text of every table cell and header on the page
  tabla <- html %>%
    html_elements("td, th") %>%
    html_text2()
  # reshape the flat vector into a three-column data frame,
  # then promote the first row to column names
  data <- data.frame(matrix(tabla, nrow = length(tabla), ncol = 3, byrow = TRUE)) %>%
    row_to_names(row_number = 1)
  data
}
The columns will be "municipality", "locality", and "zp", which is why the number of columns is 3, but I get:

Error: x must be a string of length 1

and I also cannot bind all the matrices together. Any ideas are greatly appreciated!
2 Answers
Here is a way to scrape the zip codes of Ciudad-de-Mexico, then rbind them all together.
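A minimal sketch of that approach might look like the following, assuming each municipality page holds a single zip-code table and that the links on the index page are absolute URLs; the "Ciudad-de-Mexico/" filter and the selectors are illustrative assumptions, not the answer's exact code:

library(rvest)
library(magrittr)

base_url <- "https://www.codigopostal.lat/mexico/Ciudad-de-Mexico/"

# Collect the links to the individual municipality pages
index <- read_html(base_url)
links <- index %>%
  html_elements("a") %>%
  html_attr("href")
# Assumption: the municipality pages share the same path prefix
muni_links <- unique(grep("Ciudad-de-Mexico/", links, value = TRUE))

# Read the first <table> on a municipality page as a data frame
scrape_table <- function(url){
  read_html(url) %>%
    html_element("table") %>%
    html_table()
}

# Scrape every municipality and rbind the tables into one data frame
tablas <- lapply(muni_links, scrape_table)
todos <- do.call(rbind, tablas)
head(todos)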
Edit: In the original post I had loaded the package dplyr. On second thought I realized it was only loaded to make the magrittr pipe operator available, so I have changed the code to load only the relevant package, magrittr.