I wrote this code in Python to get all the links and put them into a JSON file, but for some reason I am only getting the last link (the website and class are in the code). Any ideas why it is not working properly?
```python
import requests
from bs4 import BeautifulSoup
import json

headers = {
    "Accept": "*/*",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36"
}

number = 0

for page_number in range(1, 2):
    url = f"https://www.sothebysrealty.com/eng/associates/int/{page_number}-pg"
    req = requests.get(url, headers=headers)
    src = req.text
    soup = BeautifulSoup(src, "lxml")
    name_link = soup.find_all("a", class_="Entities-card__cta btn u-text-uppercase u-color-sir-blue palm--hide")
    all_links_dict = {}
    for item in name_link:
        value_links = "https://www.sothebysrealty.com" + item.get("href")
    all_links_dict[number + 1] = value_links
    with open("all_links_dict.json", "w", encoding="utf-8-sig") as file:
        json.dump(all_links_dict, file, indent=4, ensure_ascii=False)
```
3 Answers
This is because `all_links_dict[number + 1] = value_links` is not inside your `for item in name_link` loop, so you only add to the dict once. You must also increment `number` inside the loop.
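As a minimal sketch of that fix, with a plain list of hypothetical hrefs standing in for the scraped tags:

```python
# Hypothetical hrefs standing in for the scraped <a> tags
hrefs = ["/agent/a", "/agent/b", "/agent/c"]

all_links_dict = {}
number = 0
for href in hrefs:
    value_links = "https://www.sothebysrealty.com" + href
    number += 1                           # increment inside the loop
    all_links_dict[number] = value_links  # assign inside the loop as well

print(all_links_dict)  # one entry per href, keyed 1..3, not just the last link
```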
There are a few things I notice here.

Firstly, your page range: `range(1, 2)`. In Python the stop value is not included in the range, so the for loop will only run once, with a page number of 1.

Secondly, your `all_links_dict = {}` line resets the dictionary to an empty dict on each pass through the outer loop.

Lastly, you are opening the file in `'w'` mode on each iteration of the loop and then JSON-dumping, which overwrites any previous contents.

I would advise adjusting your range, moving the dictionary initialisation out of the for loop, and dumping the dictionary to your file once at the end, outside of the for loop.
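Putting those three changes together, here is a sketch of the corrected structure; the actual scraping is replaced by hypothetical per-page data so the flow of the dictionary and the single dump is clear:

```python
import json

# Hypothetical per-page results, standing in for the requests/BeautifulSoup calls
pages = {
    1: ["/agent/a", "/agent/b"],
    2: ["/agent/c"],
}

all_links_dict = {}                  # initialise ONCE, before the loop
number = 0
for page_number in range(1, 3):      # stop is exclusive, so this covers pages 1 and 2
    for href in pages[page_number]:
        number += 1
        all_links_dict[number] = "https://www.sothebysrealty.com" + href

# dump ONCE, after the loop, so earlier pages are not overwritten
with open("all_links_dict.json", "w", encoding="utf-8-sig") as file:
    json.dump(all_links_dict, file, indent=4, ensure_ascii=False)
```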
There are several issues:

You're not updating `number` at any point, so only a value keyed to `1` gets saved on every loop. Either use `all_links_dict[page_number] = value_links`, since `page_number` updates itself on each iteration, or add a line to increment `number`.

You could use `mode="a"` instead of `"w"` to append instead of overwriting on each iteration. However, you should be aware that the file will no longer be valid JSON (i.e. you can't decode it any more) after a second iteration. It might be better to build a list that you append to every time and then write the list to JSON after (or at the end of) the loop.

There's also the fact that `for page_number in range(1, 2):` will only lead to one iteration (where `page_number` is 1), so even with all this, only one page's info will be saved unless the range is expanded to include more pages.
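The list-based alternative suggested above could look like this (again with hypothetical hrefs in place of the scraped tags):

```python
import json

# Hypothetical hrefs in place of the scraped tags
scraped = ["/agent/a", "/agent/b", "/agent/c"]

all_links = []                       # a list needs no manual counter at all
for href in scraped:
    all_links.append("https://www.sothebysrealty.com" + href)

# one write at the end keeps the file valid JSON
with open("all_links.json", "w", encoding="utf-8") as file:
    json.dump(all_links, file, indent=4, ensure_ascii=False)
```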