I want to be able to pull all URLs from the following webpage using Python: https://yeezysupply.com/pages/all. I tried some suggestions I found elsewhere, but they didn't seem to work with this particular website; I would end up not finding any URLs at all.
import urllib.request
import lxml.html

# Fetch the page and parse it, then print every href found in an <a> tag.
connection = urllib.request.urlopen('https://yeezysupply.com/pages/all')
dom = lxml.html.fromstring(connection.read())
for link in dom.xpath('//a/@href'):
    print(link)
2 Answers
There are no links in the page source; they are inserted using JavaScript after the page is loaded in the browser.
Perhaps it would be useful for you to make use of modules specifically designed for this. Here's a quick and dirty script that gets the relative links on the page.
It generates output like this:
Is this what you are looking for? requests and Beautiful Soup are amazing tools for scraping.
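The script and its sample output didn't survive in the post above. As a rough sketch of the idea, assuming the intent was simply "fetch the HTML and collect every href", the following uses only the standard library's html.parser (standing in for requests and Beautiful Soup so it runs without third-party packages):

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return a list of href values found in the given HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    # As noted in the first answer, this will likely print an empty list
    # for yeezysupply.com, because the links are inserted by JavaScript
    # after page load and never appear in the raw HTML.
    html = urlopen("https://yeezysupply.com/pages/all").read().decode("utf-8")
    print(extract_links(html))
```

For a page whose links are present in the raw HTML this works as expected; for a JavaScript-rendered page like this one, you would need a tool that executes scripts rather than a plain HTTP fetch.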