I'm trying to extract information from a list called data, but I get an error when I try to access the phone storage (e.g. 4 GB) and the status (whether brand new or used): AttributeError: 'list' object has no attribute 'values'. After printing the final list below, I would like to export it to CSV. The structure of the data list is like this:

[{'admin_info': {}, 'as_top': False, 'attrs': [{'name': 'Condition', 'value': 'Used', 'unit': None}, {'name': 'Screen Size', 'value': '4-5 inches', 'unit': None}, {'name': 'RAM', 'value': '3 GB', 'unit': None}], 'badge_info': {}, 'can_view_contacts': True, 'category_name': 'Mobile Phones', 'category_slug': 'mobile-phones', 'fb_view_content_data': {'content_name': 'Apple iPhone XS Max 64 GB Gold', 'content_category': 'Mobile Phones', 'content_ids': ['4ZjsTfDGPKwV6yaBawTErVfl'], 'content_type': 'product', 'value': '0.09', 'currency': 'USD'}, 'guid': '4ZjsTfDGPKwV6yaBawTErVfl', 'id': 35482354, 'image_obj': {'url': 'https://pictures-nigeria.jijistatic.com/117119013_MzAwLTUzMy1mZjI1MjBmMmUz.jpg', 'center': None}, 'images_count': 2, 'is_boost': False, 'is_cv': False, 'is_job': False, 'is_owner': False, 'is_top': False, 'message_url': '/login.html?type=reply-form&url=%2Fakure-south%2Fmobile-phones%2Fapple-iphone-xs-max-64-gb-gold-4ZjsTfDGPKwV6yaBawTErVfl.html', 'price_obj': {'value': 215000, 'view': '₦ 215,000', 'period': None}, 'region_id': 62, 'region_item_text': 'Ondo, Akure', 'region_name': 'Akure', 'region_parent_name': 'Ondo State', 'region_slug': 'akure', 'short_description': 'IPhone Xsmax 64gb || 64gb || Color: Gold || Bh :92% || neatness : slighyly negotiable', 'slug': 'apple-iphone-xs-max-64-gb-gold', 'status': 'active', 'title': 'Apple iPhone XS Max 64 GB Gold', 'title_labels': [], 'tops_count': 0, 'url': '/akure-south/mobile-phones/apple-iphone-xs-max-64-gb-gold-4ZjsTfDGPKwV6yaBawTErVfl.html', 'user_id': 7468366, 'user_phone': '07033124785', 'images': [{'url': 'https://pictures-nigeria.jijistatic.com/117119013_MzAwLTUzMy1mZjI1MjBmMmUz.jpg', 'center': None, 'size': [300, 533]}], 'paid_info': {}}]

Any help is appreciated.

import requests
from bs4 import BeautifulSoup

# list to hold all results
data = []

# for multiple pages iterate in range from - to
for i in range(0, 40):
    # increase page by number of iteration
    url = f'https://jiji.ng/api_web/v1/listing?slug=mobile-phones&init_page=true&page={i}'
    # extend list with all items per page
    data.extend(requests.get(url).json()['adverts_list']['adverts'])

    for name in data:
        name = name.get('title')
    for location in data:
        location = location.get('region_name')
    for state in data:
        state = state.get('region_parent_name')
    for price in data:
        price = price.get('price_obj')['value']
    for storage in data:
        storage = storage.get('attrs').values()
    for status in data:
        status = status.get('attrs')[0][2]
        print([name, location, state, price, storage, status])

2 Answers


  1. Chosen as BEST ANSWER

    This is an improvement on the answer given by @Jack Fleeting. I was able to get exactly what I wanted with the following:

    import requests
    
    url = 'https://jiji.ng/api_web/v1/listing?slug=mobile-phones&init_page=true&page=1&webp=true'
    
    data = requests.get(url).json()['adverts_list']['adverts']
    
    for x in data:
        phone_name = x['title']
        for t in data:
            storage = ''
            st = [a['value'] for a in t['attrs']]
            if len(st) == 3:
                storage += st[2]
            else:
                storage += "NA"
    
        price = x['price_obj']['value']
        city = x['region_name']
        state = x['region_parent_name'] 
        condition = x['attrs'][0]['value']
        phone_info = [phone_name, storage, price, city, state, condition]
    
        print(phone_info)
    

    and some of the output is:

    ['New Google Pixel 7 Pro 256 GB White', '2 GB', 620000, 'Ikeja', 'Lagos State', 'Brand New']
    ['New Google Pixel 7 128 GB Black', '2 GB', 380000, 'Ikeja', 'Lagos State', 'Brand New']
    ['Samsung Galaxy Note 9 128 GB Blue', '2 GB', 135000, 'Alimosho', 'Lagos State', 'Used']
    ['Google Pixel 6a 128 GB White', '2 GB', 185000, 'Ikeja', 'Lagos State', 'Used']
    ['New Apple iPhone 14 Pro Max 256 GB Purple', '2 GB', 900000, 'Ikeja', 'Lagos State', 'Brand New']       
    ['Samsung Galaxy S9 Plus 64 GB Black', '2 GB', 110000, 'Benin City', 'Edo State', 'Used']
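
One thing to watch in the code above: the inner for t in data loop runs over every advert for each x, so storage always ends up holding the value from the last advert on the page, which is why every row in the output shows '2 GB'. A minimal sketch of a per-advert version follows. It looks attributes up by name rather than by position. The helper names attr_value and phone_info are hypothetical, and the attribute names 'Condition' and 'RAM' are taken from the sample in the question (the third attribute there is RAM, which is what st[2] picks up). The sample record is trimmed from the question so the snippet runs without a network call.

```python
def attr_value(advert, name, default="NA"):
    """Return the value of the attribute called `name`, or `default` if absent."""
    for attr in advert.get("attrs", []):
        if attr.get("name") == name:
            return attr.get("value", default)
    return default


def phone_info(advert):
    """Build one output row from a single advert dict."""
    return [
        advert["title"],
        attr_value(advert, "RAM"),        # assumed to be the "storage" field
        advert["price_obj"]["value"],
        advert["region_name"],
        advert["region_parent_name"],
        attr_value(advert, "Condition"),
    ]


# Trimmed copy of the sample advert from the question
sample = {
    "attrs": [
        {"name": "Condition", "value": "Used", "unit": None},
        {"name": "Screen Size", "value": "4-5 inches", "unit": None},
        {"name": "RAM", "value": "3 GB", "unit": None},
    ],
    "price_obj": {"value": 215000, "view": "₦ 215,000", "period": None},
    "region_name": "Akure",
    "region_parent_name": "Ondo State",
    "title": "Apple iPhone XS Max 64 GB Gold",
}

print(phone_info(sample))
```

With the real response, the same function could be applied to each element of data in a single loop, so every row gets its own attribute values.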
    

  2. I tried it only with the first page, but if I understand you correctly, you are looking for something like the code below.

    One thing to note: not all entries have a storage value, so in those cases I used "NA" instead:

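    import requests

    # First page only; the URL pattern is the one from the question
    url = 'https://jiji.ng/api_web/v1/listing?slug=mobile-phones&init_page=true&page=1'

    data = []
    req = requests.get(url).json()
    for t in req['adverts_list']['adverts']:
        # attrs is a list of dicts, so collect each dict's 'value'
        st = [a['value'] for a in t['attrs']]
        # use the third attribute as storage; fall back to "NA" when it is missing
        storage = st[2] if len(st) == 3 else "NA"
        data.append([t['title'], t['region_name'], t['region_parent_name'], storage, t['status']])

    for row in data:
        print(row)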
    

    EDIT:
    Start of output from the first page:

    ['New Samsung Galaxy S23 Ultra 256 GB', 'Lekki', 'Lagos State', 'NA', 'active']
    ['New Tecno Camon 19 128 GB', 'Lekki', 'Lagos State', '4 GB', 'active']
    ['New Oppo Find X5 256 GB White', 'Ikeja', 'Lagos State', '12 GB', 'active']
    ['New Tecno Camon 19 128 GB', 'Lekki', 'Lagos State', '4 GB', 'active']
    ['New Samsung Galaxy S23 Ultra 256 GB', 'Lekki', 'Lagos State', 'NA', 'active']
    ['Samsung Galaxy Note 8 128 GB Black', 'Ikeja', 'Lagos State', '6 GB', 'active']
    ['Samsung Galaxy S8 Plus 64 GB Black', 'Ikeja', 'Lagos State', '4 GB', 'active']
    

    etc.
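
For the CSV export mentioned in the question, Python's built-in csv module is enough once the rows are in a list of lists. A minimal sketch, assuming rows shaped like the output above; the file name phones.csv and the header names are placeholders chosen for illustration, not something from the original post.

```python
import csv

# Two rows copied from the output above; in practice this would be the full list
rows = [
    ['New Tecno Camon 19 128 GB', 'Lekki', 'Lagos State', '4 GB', 'active'],
    ['New Samsung Galaxy S23 Ultra 256 GB', 'Lekki', 'Lagos State', 'NA', 'active'],
]

# Header names are placeholders chosen for this sketch
header = ['title', 'city', 'state', 'storage', 'status']

# newline='' lets the csv module control line endings;
# utf-8 keeps any non-ASCII characters in titles intact
with open('phones.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    writer.writerow(header)
    writer.writerows(rows)
```

Writing the header first and then calling writerows once keeps the export to a single pass over the list.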
