
I am trying to extract the data for every school listed on the following site:

https://schulfinder.kultus-bw.de/

My code is this:

import requests
from selenium import webdriver
from bs4 import BeautifulSoup
from requests import get
from selenium.webdriver.common.by import By
import json

url = "https://schulfinder.kultus-bw.de/api/school?uuid=81af189c-7bc0-44a3-8c9f-73e6d6e50fdb&_=1675072758525"

payload = {}
headers = {}

response = requests.request("GET", url, headers=headers, data=payload)

print(response.text)

Output is this:

{
  "outpost_number": "0",
  "name": "Gartenschule Grundschule Ebnat",
  "street": "Abt-Angehrn-Str.",
  "house_number": "5",
  "postcode": "73432",
  "city": "Aalen",
  "phone": "+49736796700",
  "fax": "+497367967016",
  "email": "[email protected]",
  "website": null,
  "tablet_tranche": null,
  "tablet_platform": null,
  "tablet_branches": null,
  "tablet_trades": null,
  "lat": 48.80094,
  "lng": 10.18761,
  "official": 0,
  "branches": [
    {
      "branch_id": 12110,
      "acronym": "GS",
      "description_long": "Grundschule"
    }
  ],
  "trades": []
}

I found the request in the Network tab of Chrome's inspector and reproduced it in Postman. My problem is that I only get the information for one school, and I can't figure out how to request all the schools.

2 Answers


  1. Simply use the correct endpoint:

    https://schulfinder.kultus-bw.de/api/schools?distance=1&outposts=1&owner=&school_kind=&term=&types=&work_schedule=&_=1675079497084
    

    That returns a list of schools. Each entry's uuid can then be used to request the full record from the endpoint in your question (https://schulfinder.kultus-bw.de/api/school?…).

    [{"uuid":"50de01a4-503d-44d1-af4b-a6031a022b85","outpost_number":"0","name":"Grundschule Aach","city":"Aach","lat":47.84399,"lng":8.85067,"official":0,"marker_class":"marker green","marker_label":"G","website":null},{"uuid":"8818037f-9aed-4860-b42e-8a49b1403c02","outpost_number":"0","name":"Braunenbergschule Grundschule Wasseralfingen","city":"Aalen","lat":48.8612,"lng":10.11191,"official":0,"marker_class":"marker green","marker_label":"G","website":null},...]
    

    Be aware that the result is limited to 500 entries, so you have to apply filters and combine the results to get all of them. The site reports (translated from German):

    The search limit has been reached. No more than 500 results are displayed. Please refine your search, e.g. by specifying a place.

    Example

    import requests
    
    # List endpoint: returns summary records (including the uuid), capped at 500 schools
    list_url = 'https://schulfinder.kultus-bw.de/api/schools?distance=1&outposts=1&owner=&school_kind=&term=&types=&work_schedule=&_=1675079497084'
    
    data = []
    
    # Request the full record for each school from the detail endpoint
    for item in requests.get(list_url).json():
        detail_url = f"https://schulfinder.kultus-bw.de/api/school?uuid={item['uuid']}&_=1675072758525"
        data.append(requests.get(detail_url).json())
    
    print(data)
    

    Output

    [{'outpost_number': '0', 'name': 'Grundschule Aach', 'street': 'Schulstr.', 'house_number': '5', 'postcode': '78267', 'city': 'Aach', 'phone': '+4977741442', 'fax': None, 'email': '[email protected]', 'website': None, 'tablet_tranche': None, 'tablet_platform': None, 'tablet_branches': None, 'tablet_trades': None, 'lat': 47.84399, 'lng': 8.85067, 'official': 0, 'branches': [{'branch_id': 12110, 'acronym': 'GS', 'description_long': 'Grundschule'}], 'trades': []}, {'outpost_number': '0', 'name': 'Braunenbergschule Grundschule Wasseralfingen', 'street': 'Steinstr.', 'house_number': '38', 'postcode': '73433', 'city': 'Aalen', 'phone': '+49736197700', 'fax': '+497361977019', 'email': '[email protected]', 'website': 'http://www.braunenbergschule.de', 'tablet_tranche': None, 'tablet_platform': None, 'tablet_branches': None, 'tablet_trades': None, 'lat': 48.8612, 'lng': 10.11191, 'official': 0, 'branches': [{'branch_id': 12110, 'acronym': 'GS', 'description_long': 'Grundschule'}], 'trades': []},...]
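
Combining filtered queries, as suggested above, could look roughly like the following sketch. It rests on assumptions that are not confirmed here: that `term` accepts a place name (the site's message only hints at this) and that the `_` query parameter can be omitted. The names `build_url`, `collect_schools`, and `fetch` are made up for illustration.

```python
from urllib.parse import urlencode

BASE_URL = "https://schulfinder.kultus-bw.de/api/schools"
LIMIT = 500  # result cap reported by the site


def build_url(term):
    """Build a list-endpoint URL for one search term.

    Assumption: `term` takes a place name; the other parameters are
    copied from the URL in the answer, without the `_` parameter.
    """
    params = {
        "distance": 1,
        "outposts": 1,
        "owner": "",
        "school_kind": "",
        "term": term,
        "types": "",
        "work_schedule": "",
    }
    return f"{BASE_URL}?{urlencode(params)}"


def collect_schools(terms, fetch):
    """Query once per term and merge the results, deduplicated by uuid.

    `fetch` is any callable that takes a URL and returns the decoded
    JSON list. Returns (schools, capped), where `capped` lists the terms
    whose result hit the limit and therefore need a narrower filter.
    """
    schools = {}
    capped = []
    for term in terms:
        results = fetch(build_url(term))
        if len(results) >= LIMIT:
            capped.append(term)
        for item in results:
            # Keep the first record seen for each uuid
            schools.setdefault(item["uuid"], item)
    return list(schools.values()), capped
```

With requests, `fetch` could be `lambda url: requests.get(url, timeout=30).json()`. Any term that ends up in `capped` probably returned an incomplete list and should be split into narrower searches.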
    
  2. In addition to the answer already given.

    To get all the search criteria for the GET request to the API, you can parse the main page with BeautifulSoup, which you have already imported:

    from bs4 import BeautifulSoup
    import requests
    
    search_page_url = "https://schulfinder.kultus-bw.de"
    page_contents = requests.get(search_page_url).text
    
    # Collect every <input> element of the search form
    parsed_html = BeautifulSoup(page_contents, features="html.parser")
    input_elements = parsed_html.body.find_all('input')
    # One (name, type, value) tuple per input element
    search_params = [(x.get('name'), x.get('type'), x.get('value')) for x in input_elements]
    

    search_params is a list of (name, type, value) tuples. It should give you insight into the available parameters and their possible values.
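
To read the possible values off more easily, the tuples can be grouped by parameter name. This is only a sketch: `group_param_values` is a hypothetical helper, and the sample tuples in the comment are invented, not taken from the real page.

```python
def group_param_values(search_params):
    """Group (name, type, value) tuples into {name: [distinct values]}.

    Inputs without a name are skipped; inputs without a value (for
    example free-text fields) yield an empty list for their name.
    """
    options = {}
    for name, _input_type, value in search_params:
        if not name:
            continue
        values = options.setdefault(name, [])
        if value is not None and value not in values:
            values.append(value)
    return options


# Invented sample in the same shape as search_params:
# group_param_values([("types", "checkbox", "a"), ("term", "text", None)])
```

Parameters with several distinct values, typically checkboxes or radio buttons, are the natural candidates for splitting a search into smaller queries.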
