
I'm trying to get the prices displayed on:

https://campervans.jeanlain.com/locations/?city-input=annecy&city-name=ANNECY&departure_date=06%2F01%2F2025&departure_time=11%3A00&return_date=10%2F01%2F2025&return_time=10%3A00

I tried with requests and with requests-html, but neither works…

Here is my code:

from bs4 import BeautifulSoup
from requests_html import HTMLSession

session = HTMLSession()

response = session.get('https://campervans.jeanlain.com/locations/?city-input=annecy&city-name=ANNECY&departure_date=06%2F01%2F2025&departure_time=11%3A00&return_date=10%2F01%2F2025&return_time=10%3A00')
response.html.render()

soup = BeautifulSoup(response.html.html, 'html.parser')
products = soup.find_all('section', class_='product')

for product in products:
    title = product.find('h2', class_='woocommerce-loop-product__title')
    if title:
        print(title.text)
    
    price_info = product.find('div', class_='content-right')
    if price_info:
        price = price_info.find('p', class_='price')
        print(price)
    else:
        print("content-right not found")

The problem is that the "content-right" div is displayed on the page but is missing from the response… It seems to be loaded with JavaScript…

How can I get the prices with Python requests once the JavaScript has finished loading? I don't want to use Selenium…

Thanks 🙂

2 Answers


  1. I don't know of a way to check whether the JavaScript has already finished its work, but requests_html lets you pass a sleep parameter to render(), which gives the JavaScript time to run.

    response.html.render(sleep=3)
    

    and it gives you results.


    render() also accepts JavaScript code to run on the page, which could check whether the element exists

    response.html.render(script="...")
    

    but I don't have any code for this, and it would probably still need sleep as well.
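For the script idea, something like the following might work. It is an untested sketch, not code verified against this site: the JavaScript returns a Promise that resolves once an element matching `.content-right .price` exists, or after a deadline. It assumes the browser layer waits for the returned Promise before the rendered HTML is captured. The selector, the 10-second limit, and the names `WAIT_FOR_PRICE_JS` and `render_until_price` are my own choices.

```python
# JavaScript passed to render(script=...). It polls the DOM every 250 ms and
# resolves with true once a price element exists, or false after 10 seconds.
# Assumption: the browser layer awaits the returned Promise before the
# rendered HTML is captured.
WAIT_FOR_PRICE_JS = """
() => new Promise(resolve => {
    const deadline = Date.now() + 10000;
    const timer = setInterval(() => {
        const found = document.querySelector('.content-right .price') !== null;
        if (found || Date.now() > deadline) {
            clearInterval(timer);
            resolve(found);
        }
    }, 250);
})
"""


def render_until_price(response):
    """Render the page and return whatever the script resolved with."""
    return response.html.render(script=WAIT_FOR_PRICE_JS)
```

Here `response` would be the object returned by `session.get(...)` in the question. If the function returns a falsy value, the price element did not appear in time.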


    Last idea: Selenium has Waits, which run a loop and periodically check whether an element already exists.

    requests_html has a keep_page option

    response.html.render(keep_page=True)
    

    which lets you interact with the browser page through response.html.page.
    You could use it to check periodically whether the element already exists.
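The periodic check could be built on a small polling helper like the one below, similar in spirit to Selenium's Waits. This is only a sketch: `wait_until` and its parameters are hypothetical names, and `check` stands for whatever test you run against the kept page (for example, looking for the `content-right` element), which is not shown here.

```python
import time


def wait_until(check, timeout=10.0, interval=0.5):
    """Call check() repeatedly until it returns a truthy value or time runs out.

    Returns the truthy value, or None if the timeout expires first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = check()
        if result:
            return result
        if time.monotonic() >= deadline:
            return None
        # Pause between attempts so the loop does not spin at full speed
        time.sleep(interval)
```

The helper always calls `check` at least once, so a zero timeout still performs a single check.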


    BTW:

    I get results without sleep when I use the built-in search functions instead of BeautifulSoup. Maybe these functions take a little longer while checking whether the object exists, or maybe it is just a coincidence.

    In this example I use XPath:

    from requests_html import HTMLSession
    
    session = HTMLSession()
    
    response = session.get('https://campervans.jeanlain.com/locations/?city-input=annecy&city-name=ANNECY&departure_date=06%2F01%2F2025&departure_time=11%3A00&return_date=10%2F01%2F2025&return_time=10%3A00')
    response.html.render()
    
    products = response.html.xpath('//section[contains(@class, "product")]')
    
    for product in products:
        title = product.xpath('.//h2[contains(@class, "woocommerce-loop-product__title")]', first=True)
        if title:
            print(title.text)
        
        price_info = product.xpath('.//div[contains(@class, "content-right")]', first=True)
        if price_info:
            price = price_info.xpath('.//p[contains(@class, "price")]', first=True)
            if price:
                print(price.text.strip())
            else:
                print(price)
        else:
            print("content-right not found")        
    

    Result:

    CITROEN AMENAGE
    339,98 €
    CITROEN AMENAGE
    None
    CITROEN AMENAGE
    339,98 €
    CITROEN AMENAGE
    339,98 €
    CITROEN AMENAGE
    None
    VW CALIFORNIA
    400,00 €
    VW CALIFORNIA
    None
    VW CALIFORNIA
    None
    VW CALIFORNIA
    None
    VW CALIFORNIA AT
    None
    VW CALIFORNIA 4M AT
    None
    VW CALIFORNIA 4M AT
    400,00 €
    VW CALIFORNIA AT
    400,00 €
    VW CALIFORNIA BEACH
    360,00 €
    
  2. You don't need anything that mimics a browser; you just need to make one extra request. If you open the browser's dev tools, you can see the endpoint that returns exactly the data you need.

    Use the requests library to send a POST request to this endpoint, passing the additional parameters.

    In Python, that looks something like this:

    import requests
    
    response = requests.post(
        'https://campervans.jeanlain.com/wp/wp-admin/admin-ajax.php',
        headers={
            'Content-Type': 'application/x-www-form-urlencoded',
            'Cache-Control': 'no-cache',
        },
        data={
            'action': 'products_list_prices',
            'agency': 'annecy',
            'nonce': '839a81d954',  # hard-coded in the page's HTML
            'agency_city': 'ANNECY',
            'begin_date': '06/01/2025',
            'begin_hour': '11:00',
            'end_date': '10/01/2025',
            'end_hour': '10:00',
        },
    )
    data = response.json()
    

    All keys except action and nonce can be filled in dynamically. After that, it is just a matter of matching the returned data to the products when parsing the HTML.

    # `products` is the list from the question:
    # soup.find_all('section', class_='product')
    for product in products:
        data_immatriculation = product['data-immatriculation']
        title = product.find('h2', class_='woocommerce-loop-product__title')
        price = data.get(data_immatriculation, {}).get('total_price')
        # price is None when the product is not available
        print(f'{price=}')
    

    UPDATED

    To fix the problem you mentioned in the comment: retrieve the data from the HTML (screenshot1) and pass it in the POST request. Alternatively, store the parameters of the initial GET request, i.e. the ones passed in the URL (screenshot2), and pass those on.

    screenshot1
    screenshot2
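Reusing the URL parameters could look roughly like this. It is a sketch under one assumption: the mapping from query parameters to POST keys is inferred by comparing the URL in the question with the payload above, and is not confirmed by the site. `QUERY_TO_POST` and `build_payload` are hypothetical names. The nonce is passed in as an argument because it still has to be read from the page's HTML.

```python
from urllib.parse import parse_qs, urlsplit

# Assumed mapping from the search page's query parameters to the POST keys,
# inferred by comparing the question's URL with the payload shown earlier.
QUERY_TO_POST = {
    'city-input': 'agency',
    'city-name': 'agency_city',
    'departure_date': 'begin_date',
    'departure_time': 'begin_hour',
    'return_date': 'end_date',
    'return_time': 'end_hour',
}


def build_payload(search_url, nonce):
    """Build the admin-ajax.php form data from a search page URL."""
    query = parse_qs(urlsplit(search_url).query)
    payload = {'action': 'products_list_prices', 'nonce': nonce}
    for query_key, post_key in QUERY_TO_POST.items():
        # parse_qs returns a list per key and percent-decodes the values
        payload[post_key] = query[query_key][0]
    return payload


url = ('https://campervans.jeanlain.com/locations/?city-input=annecy'
       '&city-name=ANNECY&departure_date=06%2F01%2F2025&departure_time=11%3A00'
       '&return_date=10%2F01%2F2025&return_time=10%3A00')
payload = build_payload(url, nonce='839a81d954')
```

The resulting `payload` can then be passed as `data=payload` to `requests.post`, in place of the hand-written dictionary.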
