I’m trying to scrape this site. I used the following code:

import requests
import json
from bs4 import BeautifulSoup

api_url ='https://seniorcarefinder.com/Providers/List'

headers= {
    "Content-Type":"application/json; charset=utf-8",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:108.0) Gecko/20100101 Firefox/108.0"}

body_first_page={"Services":["Independent Living","Assisted Living","Long-Term Care / Skilled Nursing","Home Care (Non-Medical)","Home Health Care (Medicare-Certified)","Hospice","Adult Day Services","Active Adult Living"],"StarRatings":[],"PageNumber":1,"Location":"Colorado Springs, CO","Geography":{"Latitude":38.833882,"Longitude":-104.821363},"ProximityInMiles":30,"SortBy":"Verified"}
res = requests.post(api_url,data=json.dumps(body_first_page),headers=headers)
soup = BeautifulSoup(res.text,'html.parser')

However, the response is JSON rather than HTML, so I cannot parse it with BeautifulSoup’s .find methods. How can I get it as normal HTML, so that I can parse it using the bs4 .find() and .find_all() methods?

2 Answers


  1. I’d recommend just using the JSON directly and converting it to a dict. The endpoint returns structured data rather than HTML, so there is nothing for BS4 to parse.

    With the json library (json.loads(res.text)) or the response’s own res.json() method, you can convert the JSON to a dict and then use regular .get() calls to find the info you’re looking for.

    https://www.w3schools.com/python/python_json.asp

  2. Why not use this structured data directly? With pandas you can simply create a DataFrame:

    pd.DataFrame(
        requests.post(api_url,data=json.dumps(body_first_page),headers=headers)
        .json()['Results']
    )
    

    Example

    import pandas as pd
    import requests
    import json
    api_url ='https://seniorcarefinder.com/Providers/List'
    
    headers= {
        "Content-Type":"application/json; charset=utf-8",
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:108.0) Gecko/20100101 Firefox/108.0"}
    
    body_first_page={"Services":["Independent Living","Assisted Living","Long-Term Care / Skilled Nursing","Home Care (Non-Medical)","Home Health Care (Medicare-Certified)","Hospice","Adult Day Services","Active Adult Living"],"StarRatings":[],"PageNumber":1,"Location":"Colorado Springs, CO","Geography":{"Latitude":38.833882,"Longitude":-104.821363},"ProximityInMiles":30,"SortBy":"Verified"}
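    # Send the search request; the response body is JSON, not HTML
    res = requests.post(api_url,data=json.dumps(body_first_page),headers=headers)
    # 'Results' holds the list of records; each one becomes a DataFrame row
    df = pd.DataFrame(res.json()['Results'])
    print(df.head())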
    
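For the dict-based approach from the first answer, here is a minimal sketch of what the .get() lookups might look like. It runs on a small hard-coded payload instead of the live endpoint. Only the "Results" key is taken from the answers above; the "Name" and "City" fields and the extract_names_and_cities helper are hypothetical and would need to be replaced with whatever the real response contains.

```python
import json

# Hypothetical sample payload: only the "Results" key comes from the answers
# above; "Name" and "City" are made-up field names for illustration.
sample_text = '''
{"Results": [
    {"Name": "Provider A", "City": "Colorado Springs"},
    {"Name": "Provider B"}
]}
'''

def extract_names_and_cities(text):
    """Parse a JSON string and pull two fields out of every result."""
    payload = json.loads(text)            # JSON text -> Python dict
    results = payload.get("Results", [])  # missing key -> empty list, no KeyError
    # .get() returns None when a record lacks the field
    return [(item.get("Name"), item.get("City")) for item in results]

rows = extract_names_and_cities(sample_text)
print(rows)
```

With the real response you would pass res.text to the helper, or skip json.loads and start from res.json(), which gives you the same dict. Printing the keys of one record first is a quick way to see which field names are actually available.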