I’m trying to scrape all the team statistics at this page:
https://www.unitedrugby.com/clubs/glasgow-warriors/stats
As you can see, there are several drop-down menus. The ones I’m interested in are the six describing the 2022/23 statistics (Attack, Defence, Kicking, Discipline, Lineouts, Scrums).
I have inspected the page, and the item to click to open each of the six menus should have the following XPath: //div[@class='bg-white px-6 py-2 absolute left-1/2 -translate-x-1/2 -top-5 text-slate-deep uppercase text-2xl leading-5 font-step-1 font-urc-sans tracking-[2px] hover:cursor-pointer select-none']
In the Firefox inspector it also says "event" next to this particular line so (since I’m not that skilled in Selenium yet) I thought it was the element to click.
I have used the following piece of code to retrieve all elements with that class:
Elements = WebDriverWait(driver, 60).until(
EC.element_to_be_clickable((By.XPATH, "//div[@class='bg-white px-6 py-2 absolute left-1/2 -translate-x-1/2 -top-5 text-slate-deep uppercase text-2xl leading-5 font-step-1 font-urc-sans tracking-[2px] hover:cursor-pointer select-none']"))
)
My idea was to find all these elements, wait for them to be clickable, then click them to open the dropdown menus, and scrape all the statistics contained inside.
Regardless of how long I allow it to wait, it always ends in a TimeoutException.
Could anyone help me sort out this issue?
EDIT #1:
Thanks to the answers I have achieved the first step. However, my ultimate goal is to retrieve the actual statistics (e.g. "Points scored", inside "Attack").
These are all under the class "flex justify-between items-center border-t border-mono-300 py-4 md:py-6".
After clicking on the cookies button and waiting for the presence of all elements (which works now) I am not able to retrieve elements with this class.
What I’m missing is how to open all six menus before scraping the statistics, because they don’t show up unless I click on the dropdown.
I’m doing this:
Elements = [el.click() for el in Elements]
because I’m trying to click on each of the 6 WebElement instances returned by the previous wait.
I don’t think this is the way I’m supposed to do it, but I can’t work out the right way. Any hints would be appreciated.
3 Answers
I just changed EC.element_to_be_clickable to EC.presence_of_all_elements_located, and it seems to be working. Check the working code below:
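A minimal sketch of that change, assuming Selenium 4 and a driver that has already loaded the stats page (an illustration, not necessarily the exact code from this answer). `element_to_be_clickable` waits for one element that is visible and enabled and returns a single WebElement, whereas `presence_of_all_elements_located` only requires the matches to exist in the DOM and returns a list. The helper names `exact_class_xpath` and `find_toggles` are hypothetical.

```python
# Class string copied from the question; @class='...' must match it exactly.
TOGGLE_CLASS = (
    "bg-white px-6 py-2 absolute left-1/2 -translate-x-1/2 -top-5 "
    "text-slate-deep uppercase text-2xl leading-5 font-step-1 font-urc-sans "
    "tracking-[2px] hover:cursor-pointer select-none"
)


def exact_class_xpath(tag, class_attr):
    """Build an XPath for elements whose class attribute equals class_attr."""
    return f"//{tag}[@class='{class_attr}']"


def find_toggles(driver, timeout=20):
    """Return a list of the menu toggles once they exist in the DOM."""
    # Imported here so the XPath helper above works without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    locator = (By.XPATH, exact_class_xpath("div", TOGGLE_CLASS))
    # presence_of_all_elements_located returns a list of WebElements,
    # unlike element_to_be_clickable, which returns a single element.
    return WebDriverWait(driver, timeout).until(
        EC.presence_of_all_elements_located(locator)
    )
```

With a live driver, `len(find_toggles(driver))` reports how many toggles matched.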
Console output:
I tried to open https://www.unitedrugby.com/clubs/glasgow-warriors/stats and I see that it shows a modal cookie window that covers all the content of the site.
So the elements you are trying to reach are not actually clickable.
You need to dismiss this dialog first, like this:
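One way to do this is to click the banner's accept button before locating anything else. The sketch below assumes the button can be found by its visible text; the label "Accept All Cookies" and both helper names are assumptions for illustration, so inspect the page for the real locator.

```python
def button_by_text_xpath(label):
    """Build an XPath for a <button> whose trimmed text equals label."""
    return f"//button[normalize-space()='{label}']"


def dismiss_cookie_banner(driver, label="Accept All Cookies", timeout=10):
    """Click the cookie button if it appears; return True if it was clicked."""
    # Imported here so the XPath helper above works without Selenium installed.
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # The default label is a guess, not taken from the real page.
    locator = (By.XPATH, button_by_text_xpath(label))
    try:
        button = WebDriverWait(driver, timeout).until(
            EC.element_to_be_clickable(locator)
        )
    except TimeoutException:
        return False  # no matching button became clickable within the timeout
    button.click()
    return True
```

Call `dismiss_cookie_banner(driver)` right after `driver.get(...)` and before waiting for the menu toggles.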
After that, your code will work as expected.
To click and open each of the six menus, you need to use WebDriverWait with visibility_of_all_elements_located(), and you can use the following locator strategies:
Note: You have to add the following imports:
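A hedged sketch that ties these steps together: wait for the toggles to be visible, click each one, then read the statistic rows. It relies on the usual Selenium imports (`By`, `WebDriverWait`, and `expected_conditions as EC`). Matching a few class tokens instead of the full class string is a choice made here, not something the answers prescribe; the token lists and helper names are assumptions, and the sketch assumes every menu starts closed and the cookie banner is already gone. Since `click()` returns `None`, the list of elements is kept as is rather than rebuilt from the click results.

```python
def class_tokens_xpath(tag, tokens):
    """Build an XPath for elements whose class list contains every token."""
    conditions = " and ".join(
        f"contains(concat(' ', normalize-space(@class), ' '), ' {token} ')"
        for token in tokens
    )
    return f"//{tag}[{conditions}]"


# Tokens picked from the class strings in the question; assumed distinctive.
TOGGLE_TOKENS = ["hover:cursor-pointer", "select-none", "font-urc-sans"]
ROW_TOKENS = ["justify-between", "border-t", "border-mono-300"]


def open_menus_and_read_rows(driver, timeout=20):
    """Click every menu toggle, then return the text of each statistic row."""
    # Imported here so the XPath helper above works without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(driver, timeout)
    toggles = wait.until(
        EC.visibility_of_all_elements_located(
            (By.XPATH, class_tokens_xpath("div", TOGGLE_TOKENS))
        )
    )
    for toggle in toggles:
        # Centre the toggle so fixed page elements are less likely to
        # intercept the click.
        driver.execute_script(
            "arguments[0].scrollIntoView({block: 'center'});", toggle
        )
        toggle.click()

    rows = wait.until(
        EC.presence_of_all_elements_located(
            (By.XPATH, class_tokens_xpath("*", ROW_TOKENS))
        )
    )
    # .text only returns rendered text, so rows that are not displayed
    # come back as empty strings.
    return [row.text for row in rows]
```

Each returned string holds one row's visible text (for example the "Points scored" row); how to split it into a label and a value depends on the page markup.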