I will use this code to explain my question.
Using the URL without the sold filter:
import requests
from bs4 import BeautifulSoup
import pandas as pd
import numpy as np
url = "https://www.ebay.es/sch/i.html?_from=R40&_trksid=p2334524.m570.l1313&_nkw=iphone+x&_sacat=0&LH_TitleDesc=0&_udlo=400&LH_Auction=1&_osacat=0&_odkw=Pok%C3%A9mon+card+Charizard+4%2F102&rt=nc"
r = requests.get(url)
soup = BeautifulSoup(r.text, "html.parser")
results = soup.find_all("div", {"class": "s-item__info clearfix"})
print(len(results))
Output: 12
Then I use the URL that shows only sold items. I checked the HTML and the class is the same.
import requests
from bs4 import BeautifulSoup
import pandas as pd
import numpy as np
url = "https://www.ebay.es/sch/i.html?_from=R40&_nkw=iphone+x&_sacat=0&LH_TitleDesc=0&_udlo=400&LH_Auction=1&rt=nc&LH_Sold=1&LH_Complete=1"
r = requests.get(url)
soup = BeautifulSoup(r.text, "html.parser")
results = soup.find_all("div", {"class": "s-item__info clearfix"})
print(len(results))
Output: 0
I tried different classes, but I never get any results.
Thanks.
2 Answers
It was a CAPTCHA problem. Thanks!
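One way to confirm this kind of block is to look at what the server actually returned before parsing it. The sketch below is only a heuristic: the helper name diagnose_response and the "captcha" marker string are assumptions made for illustration, not something eBay documents.

```python
def diagnose_response(status_code, html, item_class="s-item__info"):
    """Summarise a response to tell a block page apart from real results."""
    lowered = html.lower()
    return {
        "status_code": status_code,
        "length": len(html),
        # Heuristic (assumption): block pages usually mention "captcha" in the markup.
        "mentions_captcha": "captcha" in lowered,
        # Rough count of result containers, done without parsing the HTML.
        "item_markers": html.count(item_class),
    }
```

With the code from the question, this could be called as print(diagnose_response(r.status_code, r.text)) right after requests.get(url). A result with zero item markers and a "captcha" mention suggests the request was blocked rather than the selector being wrong.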
There are several reasons why the output can be empty.
This often happens because the site thinks it is being accessed by a bot: the default user-agent in the requests library is python-requests. This can be prevented by passing your actual User-Agent in the request headers. This seems to be the reason why you get a CAPTCHA.
If passing a User-Agent doesn't work, the next step is to rotate user-agents, for example to switch between PC, mobile, and tablet, as well as between browsers such as Chrome, Firefox, Safari, and Edge.
If passing request headers is still not enough, you can try using proxies (ideally residential) in combination with request headers.
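The header and rotation steps above can be sketched roughly as follows. This is a minimal illustration, not a guaranteed way around eBay's checks: the User-Agent strings are examples that should be replaced with current ones, the function names are hypothetical, and the "captcha" substring check is an assumed heuristic. The session argument is expected to behave like requests.Session.

```python
# Example browser User-Agent strings (illustrative; replace with current ones).
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0",
]


def build_headers(user_agent):
    """Return request headers that present the given browser User-Agent."""
    return {"User-Agent": user_agent, "Accept-Language": "es-ES,es;q=0.9"}


def fetch_with_rotation(session, url, user_agents=USER_AGENTS, proxies=None):
    """Try each User-Agent in turn until a response does not look blocked.

    `session` must offer a requests.Session-like .get(url, headers=...,
    proxies=..., timeout=...) method. Returns the first response that
    passes the check, or None if every attempt looks like a CAPTCHA page.
    """
    for user_agent in user_agents:
        response = session.get(
            url,
            headers=build_headers(user_agent),
            proxies=proxies,  # optional dict, e.g. {"https": "<proxy URL>"}
            timeout=30,
        )
        # Heuristic (assumption): a block page mentions "captcha" in its HTML.
        if response.status_code == 200 and "captcha" not in response.text.lower():
            return response
    return None
```

With the code from the question, this could be used as r = fetch_with_rotation(requests.Session(), url), followed by the same BeautifulSoup parsing whenever r is not None.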
An additional step is to use a CAPTCHA solver, for example 2captcha. It allows bypassing CAPTCHAs, depending on the target website.
Check the code using BeautifulSoup in the online IDE.
You can also use the official eBay Finding API, which has a limit of 5000 requests per day, or a third-party API like the Ebay Organic Results API from SerpApi. The latter is a paid API with a free plan that handles blocks and parsing on their backend.