I’m trying to get the prices displayed on the page requested in the code below.
I tried with requests and requests-html, but neither works…
Here is my code:
    from bs4 import BeautifulSoup
    from requests_html import HTMLSession

    session = HTMLSession()
    response = session.get('https://campervans.jeanlain.com/locations/?city-input=annecy&city-name=ANNECY&departure_date=06%2F01%2F2025&departure_time=11%3A00&return_date=10%2F01%2F2025&return_time=10%3A00')
    response.html.render()
    soup = BeautifulSoup(response.html.html, 'html.parser')
    products = soup.find_all('section', class_='product')
    for product in products:
        title = product.find('h2', class_='woocommerce-loop-product__title')
        if title:
            print(title.text)
        price_info = product.find('div', class_='content-right')
        if price_info:
            price = price_info.find('p', class_='price')
            print(price)
        else:
            print("content-right not found")
The problem is that the "content-right" div is displayed on the page but is not in the response… It seems to be loaded with JavaScript…
How can I get the prices that are displayed once the JavaScript has loaded, using only Python requests? I don’t want to use Selenium…
Thanks 🙂
2 Answers
I don’t know of a method to check whether JavaScript has already finished its work, but render() accepts a sleep parameter, which gives JavaScript time to run, and with it you get results.
render() also lets you send JavaScript code (the script parameter), which could check whether the element exists, but I don’t have any code for this. It would probably also need sleep.
Last idea: Selenium has Waits functions, which run a loop and periodically check whether an element already exists. requests_html has the keep_page option, which lets you interact with the browser page through html.page, so you could use it to check periodically whether the element already exists.
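The ideas above can be sketched as a small retry loop. This is not the answerer's code: `render_until_found` and `print_first_price` are hypothetical helpers, and the selector and sleep values are illustrative guesses. The sketch assumes the object passed in behaves like a requests_html response, and it simply re-renders with a longer `sleep` each time rather than polling a kept page through `html.page`.

```python
def render_until_found(response, selector, sleeps=(1, 3, 6)):
    """Re-render the page with longer pauses until `selector` matches.

    Hypothetical helper: `response` is assumed to expose
    `.html.render(sleep=...)` and `.html.find(selector, first=True)`,
    as a requests_html response does.
    Returns the first matching element, or None if nothing matched.
    """
    for seconds in sleeps:
        # `sleep` pauses after the initial render so scripts can finish.
        response.html.render(sleep=seconds)
        element = response.html.find(selector, first=True)
        if element is not None:
            return element
    return None


def print_first_price(url):
    """Usage sketch: fetch `url`, wait for the price block, print its text."""
    # Imported here so the helper above has no hard dependency on the library.
    from requests_html import HTMLSession

    session = HTMLSession()
    response = session.get(url)
    # Assumed selector, based on the class names in the question.
    price = render_until_found(response, "div.content-right p.price")
    print(price.text if price is not None else "price not found")
```

Each attempt reloads the page in the headless browser, so this trades speed for simplicity. A fixed `render(sleep=...)` call is enough when one pause length works reliably.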
BTW: I get the result without sleep when I don’t use BeautifulSoup and instead use the built-in functions to search the data. Maybe these functions already take somewhat longer while checking whether the object exists, or maybe it is a coincidence. In my example I used xpath for this.
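A minimal sketch of searching with the built-in xpath() instead of BeautifulSoup. It is not the code from the answer: `prices_via_xpath` is a hypothetical name, and the XPath expression assumes the class names from the question (content-right, price).

```python
# Assumed expression: a price paragraph inside the "content-right" block.
PRICE_XPATH = (
    '//div[contains(@class, "content-right")]'
    '//p[contains(@class, "price")]'
)


def prices_via_xpath(html):
    """Return the text of every price paragraph found through XPath.

    `html` is assumed to be `response.html` from requests_html after
    `render()` has been called; no BeautifulSoup is involved.
    """
    return [element.text for element in html.xpath(PRICE_XPATH)]
```

With the response from the question, this would be called as `prices_via_xpath(response.html)` right after `response.html.render()`.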
You don’t need to use something that mimics a browser. You just need to make one extra request. If you open dev-tools you can see the endpoint, and that it returns exactly what you need.
Use the requests library to send a POST request to this endpoint, passing the additional parameters, and you get the data back directly in Python.
You can pass all keys, except for action and nonce, dynamically. It’s then a matter of technique to match this when parsing the HTML.
UPDATED
To fix what you wrote in the comment, you should have just retrieved the data (screenshot1) from the HTML and passed it in a POST request. Or store the GET request data up front and pass it on (screenshot2; I’m talking about what’s passed through the URL).
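A rough sketch of that approach, under stated assumptions. `ENDPOINT` is a placeholder for the endpoint seen in dev-tools, `build_payload` and `fetch_results` are hypothetical names, and the sketch assumes the POST keys match the query-string keys of the search URL. The action and nonce values are passed in as arguments because they have to be read from the page HTML.

```python
from urllib.parse import parse_qs, urlsplit

# Placeholder only: replace with the endpoint visible in dev-tools.
ENDPOINT = "https://example.invalid/ajax-endpoint"


def build_payload(search_url, action, nonce):
    """Copy the search parameters from the page URL and add the two fixed keys.

    Assumption: the endpoint expects the same key names as the query string.
    """
    query = parse_qs(urlsplit(search_url).query)
    # parse_qs returns a list per key; keep the first value of each.
    payload = {key: values[0] for key, values in query.items()}
    payload["action"] = action
    payload["nonce"] = nonce
    return payload


def fetch_results(search_url, action, nonce):
    """Send the POST request and return the raw response body."""
    import requests  # third-party; imported lazily so build_payload stays stdlib-only

    response = requests.post(
        ENDPOINT,
        data=build_payload(search_url, action, nonce),
        timeout=30,
    )
    response.raise_for_status()
    return response.text
```

Building the payload from the URL keeps the city and dates dynamic, so only action and nonce need to be looked up on the page.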