I am trying to scrape a page that contains specific info.
The URL: https://www.artisans-du-batiment.com/trouver-un-artisan-qualifie/?job=Charpentier&place=35000%2F35900
I want to select the element for each carpenter, so I tried response.css('div.a-artisanTease to-animate'), but it returns no matches. What might be the problem?
Thanks.
I’ve tried several different paths.
I need Scrapy to select every carpenter listed on the page, so that I can later collect the info for all search results.
2 Answers
The reason you can’t retrieve data with Scrapy is that this page is rendered with JavaScript, which Scrapy does not execute on its own. You need a tool that can run JavaScript, such as Selenium or Splash, to retrieve data from this page.
I also recommend using XPath selectors instead of CSS selectors, as XPath offers many useful options for searching text in the DOM. The XPath equivalent of your selector would be:
//div[@class='a-artisanTease to-animate']
The actual reason is that the server only serves the full HTML for the page when the request carries a specific cookie. Your CSS selector expression is also wrong.
The cookie needed is
"tarteaucitron=!googletagmanager=wait"
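and the correct CSS expression would be div.a-artisanTease.to-animate (the two class names are chained with a dot; with a space, the selector instead looks for a descendant element named to-animate).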
For example, using scrapy shell: