I'm trying to extract information from the list called data, but I get an error when trying to access the phone storage (e.g. 4GB) and the status (whether brand new or used): AttributeError: 'list' object has no attribute 'values'. After printing the final list below, I would like to export it to CSV. The structure of the data list is like this:
[{'admin_info': {}, 'as_top': False, 'attrs': [{'name': 'Condition', 'value': 'Used', 'unit': None}, {'name': 'Screen Size', 'value': '4-5 inches', 'unit': None}, {'name': 'RAM', 'value': '3 GB', 'unit': None}], 'badge_info': {}, 'can_view_contacts': True, 'category_name': 'Mobile Phones', 'category_slug': 'mobile-phones', 'fb_view_content_data': {'content_name': 'Apple iPhone XS Max 64 GB Gold', 'content_category': 'Mobile Phones', 'content_ids': ['4ZjsTfDGPKwV6yaBawTErVfl'], 'content_type': 'product', 'value': '0.09', 'currency': 'USD'}, 'guid': '4ZjsTfDGPKwV6yaBawTErVfl', 'id': 35482354, 'image_obj': {'url': 'https://pictures-nigeria.jijistatic.com/117119013_MzAwLTUzMy1mZjI1MjBmMmUz.jpg', 'center': None}, 'images_count': 2, 'is_boost': False, 'is_cv': False, 'is_job': False, 'is_owner': False, 'is_top': False, 'message_url': '/login.html?type=reply-form&url=%2Fakure-south%2Fmobile-phones%2Fapple-iphone-xs-max-64-gb-gold-4ZjsTfDGPKwV6yaBawTErVfl.html', 'price_obj': {'value': 215000, 'view': '₦ 215,000', 'period': None}, 'region_id': 62, 'region_item_text': 'Ondo, Akure', 'region_name': 'Akure', 'region_parent_name': 'Ondo State', 'region_slug': 'akure', 'short_description': 'IPhone Xsmax 64gb || 64gb || Color: Gold || Bh :92% || neatness : slighyly negotiable', 'slug': 'apple-iphone-xs-max-64-gb-gold', 'status': 'active', 'title': 'Apple iPhone XS Max 64 GB Gold', 'title_labels': [], 'tops_count': 0, 'url': '/akure-south/mobile-phones/apple-iphone-xs-max-64-gb-gold-4ZjsTfDGPKwV6yaBawTErVfl.html', 'user_id': 7468366, 'user_phone': '07033124785', 'images': [{'url': 'https://pictures-nigeria.jijistatic.com/117119013_MzAwLTUzMy1mZjI1MjBmMmUz.jpg', 'center': None, 'size': [300, 533]}], 'paid_info': {}}]
Any help is appreciated.
import requests
from bs4 import BeautifulSoup

# list to hold all results
data = []
# for multiple pages iterate in range from - to
for i in range(0, 40):
    # increase page by number of iteration
    url = f'https://jiji.ng/api_web/v1/listing?slug=mobile-phones&init_page=true&page={i}'
    # extend list with all items per page
    data.extend(requests.get(url).json()['adverts_list']['adverts'])
for name in data:
    name = name.get('title')
for location in data:
    location = location.get('region_name')
for state in data:
    state = state.get('region_parent_name')
for price in data:
    price = price.get('price_obj')['value']
for storage in data:
    storage = storage.get('attrs').values()
for status in data:
    status = status.get('attrs')[0][2]
print([name, location, state, price, storage, status])
2 Answers
This is an improvement on the answer given by @Jack Fleeting. I was able to get exactly what I wanted thus:
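The code originally posted here is not available, so what follows is only a minimal sketch of one way to do the CSV export asked about in the question, using the standard `csv` module. The names `FIELDS`, `to_record` and `write_csv` are made up for illustration, and `'Internal Storage'` is an assumed attribute name that does not appear in the sample record, so check the real `attrs` entries and adjust it. The sketch runs on a trimmed copy of the sample record rather than on a live request.

```python
import csv
import io

# Column names for the CSV (illustrative choice, not from the original answer).
FIELDS = ['name', 'location', 'state', 'price', 'storage', 'status']


def to_record(advert, storage_key='Internal Storage'):
    """Flatten one advert dict into a row dict.

    storage_key is an ASSUMED attribute name; the sample record only shows
    'Condition', 'Screen Size' and 'RAM', so adjust it to the real data.
    """
    # 'attrs' is a list of {'name', 'value', 'unit'} dicts, not a dict,
    # which is why calling .values() on it raises AttributeError.
    attrs = {a['name']: a['value'] for a in advert.get('attrs', [])}
    return {
        'name': advert.get('title'),
        'location': advert.get('region_name'),
        'state': advert.get('region_parent_name'),
        'price': advert.get('price_obj', {}).get('value'),
        'storage': attrs.get(storage_key, 'NA'),
        'status': attrs.get('Condition', 'NA'),
    }


def write_csv(adverts, fileobj):
    """Write one header line plus one line per advert to an open text file."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDS)
    writer.writeheader()
    for advert in adverts:
        writer.writerow(to_record(advert))


# Trimmed copy of the sample record from the question.
sample = [{
    'attrs': [
        {'name': 'Condition', 'value': 'Used', 'unit': None},
        {'name': 'Screen Size', 'value': '4-5 inches', 'unit': None},
        {'name': 'RAM', 'value': '3 GB', 'unit': None},
    ],
    'price_obj': {'value': 215000, 'view': '₦ 215,000', 'period': None},
    'region_name': 'Akure',
    'region_parent_name': 'Ondo State',
    'title': 'Apple iPhone XS Max 64 GB Gold',
}]

buffer = io.StringIO()
write_csv(sample, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file instead of an in-memory buffer, pass a file opened with `open('phones.csv', 'w', newline='', encoding='utf-8')` (the file name is just an example) and the full `data` list from the question.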
and some of the output is:
I tried it only with the first page, but if I understand you correctly, you are looking for something like the below.
One thing to notice, not all entries have a storage value, so in those cases I used "NA" instead:
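The original code for this answer did not survive, so here is a hedged sketch of the idea described above rather than the answer's actual code. Each advert's `attrs` value is a list of `{'name', 'value', 'unit'}` dicts, so it has no `.values()` method. Turning it into a name-to-value dict makes lookups straightforward, with "NA" as the fallback. The helper names and the `'Internal Storage'` key are assumptions for illustration; the sample record does not show what the storage attribute is actually called.

```python
def attrs_to_dict(advert):
    """Map each attribute name to its value, e.g. {'Condition': 'Used', ...}."""
    return {a['name']: a['value'] for a in advert.get('attrs', [])}


def extract_row(advert, storage_key='Internal Storage'):
    """Return [name, location, state, price, storage, status] for one advert.

    storage_key is an ASSUMED attribute name; replace it with whatever
    name the real 'attrs' entries use for storage.
    """
    attrs = attrs_to_dict(advert)
    return [
        advert.get('title'),
        advert.get('region_name'),
        advert.get('region_parent_name'),
        advert.get('price_obj', {}).get('value'),
        attrs.get(storage_key, 'NA'),   # "NA" when the entry has no storage value
        attrs.get('Condition', 'NA'),
    ]


# Trimmed copy of the sample record from the question.
sample = {
    'attrs': [
        {'name': 'Condition', 'value': 'Used', 'unit': None},
        {'name': 'Screen Size', 'value': '4-5 inches', 'unit': None},
        {'name': 'RAM', 'value': '3 GB', 'unit': None},
    ],
    'price_obj': {'value': 215000, 'view': '₦ 215,000', 'period': None},
    'region_name': 'Akure',
    'region_parent_name': 'Ondo State',
    'title': 'Apple iPhone XS Max 64 GB Gold',
}

# One row per advert, built in a single loop instead of one loop per field.
rows = [extract_row(advert) for advert in [sample]]
print(rows)
```

Building each row inside a single loop matters here: the question's code runs a separate loop per field, so every variable ends up holding only the value from the last advert.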
EDIT:
Start of output from the first page:
etc.