Html - How to download and check a sub-div with BeautifulSoup?

flyingMate
May 2, 2023
170 views
0 votes
2 Answers

I’m trying to get my hands on some tickets for the Colosseum in Rome in a few weeks. But the Full Experience tickets are always sold out!

So I thought about monitoring the website for ticket availability and sending myself a message when new tickets become available.

As my tools of choice I want to use Python with the BeautifulSoup and Requests libraries to download the source code of the website and check whether the background of the desired day has changed color to green. (I’m trying it this way because I can’t find where in the source code the website gets its ticket availability from.)
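The tracking idea can be sketched as a small polling loop. This is only a minimal sketch: `watch`, `check`, and `notify` are hypothetical names, and the real availability check and messaging method are left as callables you would supply yourself.

```python
import time
from typing import Callable


def watch(check: Callable[[], bool],
          notify: Callable[[str], None],
          interval_seconds: float = 600.0,
          max_checks: int = 100) -> bool:
    """Call check() until it returns True or max_checks is reached.

    Returns True if availability was detected, False otherwise.
    """
    for attempt in range(max_checks):
        if check():
            notify("Tickets seem to be available!")
            return True
        # Wait between checks, but not after the final attempt.
        if attempt < max_checks - 1:
            time.sleep(interval_seconds)
    return False


if __name__ == "__main__":
    # Stand-in check that reports availability on the third call.
    answers = iter([False, False, True])
    found = watch(lambda: next(answers), print, interval_seconds=0.0)
    print(found)
```

Keeping the check and the notification as parameters means the loop can be tried out with stand-ins before it is pointed at the real page. A generous interval also avoids hammering the site.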

So this is the website: https://ecm.coopculture.it/index.php?option=com_snapp&view=event&id=D7E12B2E-46C4-074B-5FC5-016ED579426D&catalogid=DDDA3AB3-47BC-0A49-7752-0174490F632A&lang=en

and here’s my Python code so far:

import requests
from bs4 import BeautifulSoup 

URL = 'https://ecm.coopculture.it/index.php?option=com_snapp&view=event&id=D7E12B2E-46C4-074B-5FC5-016ED579426D&catalogid=DDDA3AB3-47BC-0A49-7752-0174490F632A&lang=en'

#set the headers as a browser
headers = {
    'User-Agent':'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36'
}
#download the homepage
response = requests.get(URL, headers=headers) 

#parse the downloaded homepage and grab all text
soup = BeautifulSoup(response.text, "html")

msg = soup

print(msg)

It should print the whole source code, but it doesn’t (no wonder, that’s why I’m here 🙂).

**The information I’m looking for is hidden in a div, but the printed source code doesn’t show the contents of that div.**

Where did I make a mistake?
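One way to narrow this down is to check whether the div is absent from the downloaded HTML, present but empty, or actually filled. Requests does not run JavaScript, so a div that a browser fills in with a script would show up as empty here. That is only a possible explanation and is not confirmed for this site. In this rough sketch, `div_state` and the id `calendar` are made up for illustration.

```python
from bs4 import BeautifulSoup


def div_state(html: str, div_id: str) -> str:
    """Classify a div in the raw HTML as 'missing', 'empty', or 'filled'."""
    soup = BeautifulSoup(html, "html.parser")
    div = soup.find("div", id=div_id)
    if div is None:
        return "missing"
    # A div counts as filled if it has any child tag or any non-blank text.
    has_children = div.find(True) is not None
    has_text = bool(div.get_text(strip=True))
    return "filled" if (has_children or has_text) else "empty"


if __name__ == "__main__":
    # Hypothetical markup: one empty placeholder div and one filled div.
    sample = '<div id="calendar"></div><div id="intro"><p>Hello</p></div>'
    print(div_state(sample, "calendar"))  # empty
    print(div_state(sample, "intro"))     # filled
    print(div_state(sample, "nope"))      # missing
```

If the real div comes back as empty, the data is probably fetched separately, and the browser's network tab may show the request that delivers it.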

Answers

The code passes "html" instead of naming a specific parser, so BeautifulSoup picks whichever HTML parser happens to be installed, and that parser may not handle the page correctly. Name the lxml parser explicitly instead. Here’s the corrected code:

import requests
from bs4 import BeautifulSoup 

URL = 'https://ecm.coopculture.it/index.php?option=com_snapp&view=event&id=D7E12B2E-46C4-074B-5FC5-016ED579426D&catalogid=DDDA3AB3-47BC-0A49-7752-0174490F632A&lang=en'

#set the headers as a browser
headers = {
    'User-Agent':'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36'
}
#download the homepage
response = requests.get(URL, headers=headers) 

#parse the downloaded homepage and grab all text
soup = BeautifulSoup(response.text, 'lxml')

#find the div with the information you are looking for
div = soup.find('div', {'class': 'time-slot-color status-2'})

if div is not None:
    print('Tickets available!')
else:
    print('Tickets not available yet')

In this example, I’m searching for the div with the class time-slot-color status-2, which represents the time slot where tickets are available. You can adjust the code to look for the specific element that you are interested in.
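A variation on the lookup above, as a sketch only: passing the string 'time-slot-color status-2' to find() matches that exact class attribute value, so it would miss a div whose classes appear in another order. A CSS selector matches both classes regardless of order. The sample markup and the name `available_slots` are invented for illustration and are not taken from the real page.

```python
from typing import List

from bs4 import BeautifulSoup

# Hypothetical markup, not copied from the real site.
SAMPLE_HTML = """
<div class="time-slot-color status-1">09:00</div>
<div class="status-2 time-slot-color">10:00</div>
"""


def available_slots(html: str) -> List[str]:
    """Return the text of every div carrying both classes, in any order."""
    soup = BeautifulSoup(html, "html.parser")
    matches = soup.select("div.time-slot-color.status-2")
    return [div.get_text(strip=True) for div in matches]


if __name__ == "__main__":
    print(available_slots(SAMPLE_HTML))  # ['10:00']
```

Returning a list rather than a single element also tells you which slots matched, which may be handy for the notification text.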

- Guido
- May 2, 2023 at 3:52 pm
- 0 votes
0
The line soup = BeautifulSoup(response.text, "html") does not name a specific parser. "html" is a markup type rather than a parser name, so BeautifulSoup uses whichever installed HTML parser it ranks best, and the result can differ from one machine to another.

To make the behavior predictable, name the parser explicitly. Common parsers include ‘html.parser’, ‘lxml’, and ‘html5lib’.
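A minimal sketch of naming the parser explicitly, using the built-in 'html.parser' so that no extra package is needed. The helper name `parse_html` and the sample markup are made up for illustration.

```python
from bs4 import BeautifulSoup


def parse_html(html: str, parser: str = "html.parser") -> BeautifulSoup:
    """Parse html with an explicitly named parser."""
    return BeautifulSoup(html, parser)


if __name__ == "__main__":
    # Hypothetical markup for demonstration only.
    soup = parse_html('<div class="status-2"><p>10:00</p></div>')
    print(soup.builder.NAME)           # name of the parser actually in use
    print(soup.find("p").get_text())   # 10:00
```

Swapping in 'lxml' or 'html5lib' works the same way, provided the corresponding package is installed.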
