I want to extract the legal form of the entity matching a search term, in this example "Fondazione Amici di AMCA", on https://www.zefix.ch/en/search/entity/welcome
The code always prints "No search results found.", even though the entry exists. Where is the mistake?
The expected output is "Foundation", from https://www.zefix.ch/en/search/entity/list/firm/1151591 (this URL appears when entering the search term "Fondazione Amici di AMCA" on https://www.zefix.ch/en/search/entity/welcome).

import requests
from bs4 import BeautifulSoup

search_url = "https://www.zefix.ch/de/search/entity/welcome"
search_params = {"name": "Fondazione Amici di AMCA"}

response = requests.get(search_url, params=search_params)
soup = BeautifulSoup(response.content, "html.parser")

# Find the link to the details page for the first search result
search_results = soup.find_all("div", {"class": "result-entry"})
if search_results:
    first_result = search_results[0]
    details_link = first_result.find("a", {"class": "list-group-item-1"})
    if details_link:
        details_url = f"https://www.zefix.ch{details_link['href']}"
        # Visit the details page and extract the legal form
        details_response = requests.get(details_url)
        details_soup = BeautifulSoup(details_response.content, "html.parser")
        legal_form = details_soup.find("td", text="Rechtsform:").find_next_sibling("td").get_text(strip=True)
        print(f"The legal form of Fondazione Amici di AMCA is: {legal_form}")
    else:
        print("No link to details page found.")
else:
    error_message = soup.find("div", {"class": "no-result-message"})
    if error_message:
        print(error_message.text.strip())
    else:
        print("No search results found.")

2 Answers
You can use their Ajax API to get the legal form of the entity from the search query:
Prints:
The data you're looking for is loaded dynamically using JavaScript, so you can't get it directly using requests. You need to find the API call (it can be found in the developer console of your browser under Network/XHR; you may have to do some more research if you're not familiar with those tools). Once you do that, you get back JSON, which you can easily parse. Note that I did it in English, so you may want to change that:
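A minimal sketch of such a call might look like the following. The endpoint URL and the payload keys (name, languageKey, maxEntries, offset) are assumptions about what the search page sends; confirm them against the request shown in the Network/XHR tab before relying on them. The helper names build_payload and search_entities are made up for this illustration.

```python
# Assumed endpoint: the XHR request the search page appears to send.
# Check the exact URL and body in your browser's Network tab.
SEARCH_URL = "https://www.zefix.ch/ZefixREST/api/v1/firm/search.json"


def build_payload(name, language="en", max_entries=30):
    """Build the JSON body for the search request (key names are assumptions)."""
    return {
        "name": name,
        "languageKey": language,  # e.g. "de" for German labels
        "maxEntries": max_entries,
        "offset": 0,
    }


def search_entities(name, language="en", timeout=10):
    """POST the search query and return the decoded JSON response."""
    # Imported here so build_payload stays usable without the dependency.
    import requests

    response = requests.post(
        SEARCH_URL, json=build_payload(name, language), timeout=timeout
    )
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()
```

Calling zf = search_entities("Fondazione Amici di AMCA") would then give the parsed response, provided the assumed endpoint and payload match what the site actually uses.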
The data is hiding in zf['list'], which is a one-element list of dictionaries. For example:
will return:
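As a rough illustration of reading that structure, the sketch below pulls the first dictionary out of zf['list'] and guards against an empty result. The sample response is hypothetical, and the key "legalForm" is a placeholder: inspect the keys of the real entry to see how the legal form is actually represented.

```python
def first_entry(zf):
    """Return the first dict in zf['list'], or None if there are no hits."""
    entries = zf.get("list") or []
    return entries[0] if entries else None


# Hypothetical response, shaped like the one-element list described above.
# The keys inside the entry are placeholders, not confirmed API fields.
zf = {
    "list": [
        {"name": "Fondazione Amici di AMCA", "legalForm": "Foundation"},
    ]
}

entry = first_entry(zf)
if entry is None:
    print("No search results found.")
else:
    # .get() avoids a KeyError if the real field is named differently.
    print(entry.get("legalForm", "legal form field not present"))
```

Printing sorted(entry.keys()) on a real response is a quick way to find the actual field names.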