
I’m trying to get all the album names and prices from this website: https://vinilosalvaro.cl/tienda/

But with the following script I’m just getting one of them.

import requests
from bs4 import BeautifulSoup


URL = 'https://vinilosalvaro.cl/tienda/'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}
page = requests.get(URL, headers=headers)

soup = BeautifulSoup(page.content, 'html.parser')
listado_productos = soup.find_all('ul', class_='products columns-3')

for listado_productos in listado_productos:
  titulos = listado_productos.find('h2', class_='woocommerce-loop-product__title').text.strip()
  precios = listado_productos.find('span', class_='woocommerce-Price-amount amount').text.strip()
  print(titulos)
  print(precios)

How can I get all the album names and prices?

2 Answers


  1. Instead of find_all you can use find when searching for one specific ul tag. Go ahead and change line 10 to:

    listado_productos = soup.find('ul', class_='products columns-3')

    Also, to get the li children, you should use find_all('li'), so change line 12 to:

    for listado_productos in listado_productos.find_all('li'):
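    Putting both changes together, a minimal sketch of the corrected script (reusing the URL, headers, and class names from the question) could look like this:

    import requests
    from bs4 import BeautifulSoup

    URL = 'https://vinilosalvaro.cl/tienda/'
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}
    page = requests.get(URL, headers=headers)

    soup = BeautifulSoup(page.content, 'html.parser')

    # find() returns the single <ul> element instead of a one-element ResultSet
    listado_productos = soup.find('ul', class_='products columns-3')

    # iterate over the <li> children, one per product
    for producto in listado_productos.find_all('li'):
        titulo = producto.find('h2', class_='woocommerce-loop-product__title').text.strip()
        precio = producto.find('span', class_='woocommerce-Price-amount amount').text.strip()
        print(titulo)
        print(precio)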

  2. The main issue is that your selection gives you a ResultSet containing one <ul>, not all the <li> elements, so your loop iterates only once.

    Select your elements more specifically, as also mentioned by @benyamin payandeh, or for example with CSS selectors:

    soup.select('ul.products li')
    

    The example below also shows some related concepts: a while loop for paging, .get_text(strip=True), and how to store your results across iterations in a more structured form, like a list of dicts that you can easily transform into a DataFrame or process as you need.

    Example

    Be aware that this starts from page 59 to show how the while loop works and breaks when there is no more page to scrape. Simply set URL back to your original value to iterate over all pages.

    import requests
    from bs4 import BeautifulSoup

    URL = 'https://vinilosalvaro.cl/tienda/page/59/'
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}

    # collect the results across pages as a list of dicts
    data = []

    while True:
        page = requests.get(URL, headers=headers)
        soup = BeautifulSoup(page.content, 'html.parser')

        # one <li> per product inside the <ul class="products">
        for listado_productos in soup.select('ul.products li'):
            data.append({
                'titulos': listado_productos.h2.get_text(strip=True),
                'precios': listado_productos.span.get_text(strip=True)
            })

        # follow the "next page" link until there is none left
        if soup.select_one('a.next'):
            URL = soup.select_one('a.next').get('href')
        else:
            break

    print(data)
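
    As mentioned above, a list of dicts converts directly into a DataFrame. A minimal sketch, assuming pandas is installed (the column names come from the dict keys used above, and the CSV file name is just an example):

    import pandas as pd

    # build a DataFrame from the scraped list of dicts
    df = pd.DataFrame(data)          # columns: titulos, precios
    print(df.head())

    # optionally persist the results, e.g. as CSV
    df.to_csv('vinilos.csv', index=False)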
    