Scraping Medium's clap data without Selenium - Nginx

SefaKalkan
July 22, 2022
269 views
0 votes
2 Answers

I’m trying to scrape clap data from Medium. Let’s say this is the link. When I inspect the page, it looks like it does in this photo.

My code looks like this:

import requests
from bs4 import BeautifulSoup

URL = "https://medium.com/@xdxxxx4713/basic-settings-of-nginx-aeace532534f"
page = requests.get(URL)
soup = BeautifulSoup(page.content, 'html.parser')
print(soup.prettify())

The output contains only a dash where the clap count should be. Is it possible to scrape the clap count without using Selenium? Once I can get the value through a plain HTTP request such as "requests.get(URL)", I can do the rest. Right now the response is empty where the clap count should be.

I tried the urllib library, but my links contain non-ASCII characters.
I tried BeautifulSoup’s findChildren method.
I tried traversing BeautifulSoup’s descendants.

Answers

Chosen as BEST ANSWER
- SefaKalkan
- July 22, 2022 at 6:23 pm
- 0 votes
As @esqew mentioned in the comments, there's an API for that, but it didn't work for me. The API code did inspire my solution, though. Here's my code:
```
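import requests

# `pages` is assumed to hold the article URL; the count sits in the page's embedded JSON
aditionalPage = requests.get(pages).content.decode("utf-8")
claps = aditionalPage.split('"clapCount":')[1]  # text right after the key
endIndex = claps.index(",")  # the value ends at the next comma
claps = int(claps[0:endIndex])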
```
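
The string splitting above can be made a little more defensive with a regular expression. This is only a minimal sketch, assuming the page HTML still embeds a `"clapCount":<number>` pair as in the snippet above. The function name `extract_clap_count` and the sample string are made up for illustration and are not part of any library or of Medium's actual markup.

```python
import re
from typing import Optional


def extract_clap_count(html: str) -> Optional[int]:
    """Return the first clapCount value embedded in the HTML, or None if absent."""
    # Assumption: the page embeds JSON containing "clapCount":<integer>.
    match = re.search(r'"clapCount"\s*:\s*(\d+)', html)
    return int(match.group(1)) if match else None


# Offline example with a made-up fragment of embedded JSON.
sample = '{"id":"aeace532534f","clapCount":12,"__typename":"Post"}'
print(extract_clap_count(sample))
```

Returning `None` instead of raising makes it easier to skip pages where the markup has changed, and the pattern does not depend on the value being followed by a comma.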

(Edit)

It’s possible, try the code below:

import requests

data = [{"operationName":"ClapCountQuery","variables":{"postId":"aeace532534f"},"query":"query ClapCountQuery($postId: ID!) {\n  postResult(id: $postId) {\n    __typename\n    ... on Post {\n      id\n      clapCount\n      __typename\n    }\n  }\n}\n"}]
r = requests.post('https://medium.com/_/graphql', json=data)
print(r.json()[0]['data']['postResult']['clapCount'])

This prints the clap count of the post.
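
In the request above, the `postId` matches the text after the last hyphen of the article URL. Here is a rough sketch of deriving it and building the request body for any article, assuming that pattern holds for other Medium URLs and that the endpoint keeps accepting this query. `post_id_from_url` and `build_clap_query` are hypothetical helper names.

```python
from urllib.parse import urlparse

CLAP_QUERY = (
    "query ClapCountQuery($postId: ID!) {\n"
    "  postResult(id: $postId) {\n"
    "    __typename\n"
    "    ... on Post {\n"
    "      id\n"
    "      clapCount\n"
    "      __typename\n"
    "    }\n"
    "  }\n"
    "}\n"
)


def post_id_from_url(url: str) -> str:
    """Take the text after the last hyphen of the URL path as the post ID."""
    # Assumption: Medium article URLs end in "-<postId>".
    path = urlparse(url).path.rstrip("/")
    return path.rsplit("-", 1)[-1]


def build_clap_query(post_id: str) -> list:
    """Build the JSON body in the same shape as the request above."""
    return [{
        "operationName": "ClapCountQuery",
        "variables": {"postId": post_id},
        "query": CLAP_QUERY,
    }]


url = "https://medium.com/@xdxxxx4713/basic-settings-of-nginx-aeace532534f"
payload = build_clap_query(post_id_from_url(url))
print(payload[0]["variables"]["postId"])
```

The resulting `payload` can be passed as `json=payload` to `requests.post('https://medium.com/_/graphql', ...)`, the same way as in the code above.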
