
I am a newer programmer trying to call the GitHub API to get a list of the most-starred Python projects, ordered by number of stars. Every time I rerun the request I get a different result: after loading everything into a DataFrame and sorting by number of stars, each run gives me different numbers. I'm not sure whether something is wrong with my code or I'm misunderstanding the GitHub API.

```
import requests

query_url = "https://api.github.com/search/repositories?q=language:python&sort=stars&order=desc"

# API_KEY is assumed to be defined elsewhere (a GitHub personal access token)
headers = {'Authorization': f'token {API_KEY}'}
r = requests.get(query_url, headers=headers)
result = r.json()  # pprint(r.json())
```
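
One thing that may be worth checking on the response above, sketched under assumptions: besides `items`, the search response also carries `total_count` and `incomplete_results`, and GitHub documents that a query which times out returns only the matches found so far, with `incomplete_results` set to true. If that flag varies between runs, it could explain differing results. The helper name `summarize_search_result` is made up for illustration, and the sample dictionary is hand-written rather than real API data; the function only inspects an already-parsed response and makes no network call.

```python
def summarize_search_result(result):
    """Summarize one parsed search response (hypothetical helper, not part of any library)."""
    items = result.get("items", [])
    return {
        "total_count": result.get("total_count"),
        # True means the search timed out and the list may be partial.
        "incomplete": bool(result.get("incomplete_results", False)),
        "returned": len(items),
        # First few repositories with their star counts, in the order received.
        "top": [(repo["full_name"], repo["stargazers_count"]) for repo in items[:3]],
    }


# Invented sample in the shape of a search response (not real data).
sample = {
    "total_count": 2,
    "incomplete_results": True,
    "items": [
        {"full_name": "example/alpha", "stargazers_count": 300},
        {"full_name": "example/beta", "stargazers_count": 250},
    ],
}

summary = summarize_search_result(sample)
print(summary)
```

Calling it on `result` after each run and comparing the `incomplete` flag and the `top` entries would show whether the API itself is returning different pages or only the star counts are moving.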

2 Answers


  1. Chosen as BEST ANSWER

    Problem: pagination. Each request returns only one page of results (30 items by default).

    Solution: request every page and combine the results. Code:

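    import requests

    def fetch_all_pages(url, headers, max_results=1000):
        # url should be the bare endpoint (no query string); the query goes in params.
        # The search API serves at most 1,000 results per query, so stop there
        # instead of requesting pages until the API returns an error.
        all_items = []
        page = 1
        while len(all_items) < max_results:
            params = {'q': 'language:python', 'sort': 'stars', 'order': 'desc',
                      'per_page': 100, 'page': page}
            response = requests.get(url, headers=headers, params=params)
            if response.status_code != 200:
                # Rate limiting or a rejected page: keep what was already fetched
                # rather than discarding it by returning None.
                print(f"Error: {response.status_code}")
                break

            result = response.json()
            items = result.get('items', [])

            if not items:
                break

            all_items.extend(items)
            page += 1

        return all_items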
    

  2. How about using pagination and splitting the results across several pages? I think the GitHub API has parameters for this, per_page and page, for example:

    import requests
    
    
    def get_most_starred_python(page=None, per_page=None):
        base_url = "https://api.github.com/search/repositories"
        params = {
            "q": "language:python",
            "sort": "stars",
            "order": "desc",
            "page": page,
            "per_page": per_page,
        }
    
        resp = requests.get(base_url, params=params)
        resp.raise_for_status()  # fail loudly on HTTP errors such as rate limiting
        data = resp.json()
        return data.get("items", [])

    # If page or per_page is None, requests leaves it out of the query and the API
    # uses its defaults (page 1, 30 results per page); adjust to your needs.
    all_repos = get_most_starred_python(page=1, per_page=20)
    for repo in all_repos:
        print(f"Name of repo: {repo['name']}")
        print(f"Number of stars: {repo['stargazers_count']}")
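
Because star counts change between requests, a repository can shift position while the pages are being fetched, so a combined list may contain the same repository twice or arrive slightly out of order. Below is a minimal sketch of one way to tidy a merged list, assuming each item carries the `id` and `stargazers_count` fields found in search results. `dedupe_and_sort` is a hypothetical helper name and the sample data is invented.

```python
def dedupe_and_sort(items):
    """Keep one entry per repository id, then sort by stars, highest first."""
    latest = {}
    for repo in items:
        # A later page may repeat a repository; keep the most recently seen copy.
        latest[repo["id"]] = repo
    return sorted(latest.values(), key=lambda r: r["stargazers_count"], reverse=True)


# Invented items standing in for the concatenated pages (not real data).
pages = [
    {"id": 1, "name": "alpha", "stargazers_count": 300},
    {"id": 2, "name": "beta", "stargazers_count": 250},
    {"id": 2, "name": "beta", "stargazers_count": 251},  # repeated on the next page
    {"id": 3, "name": "gamma", "stargazers_count": 400},
]

cleaned = dedupe_and_sort(pages)
for repo in cleaned:
    print(repo["name"], repo["stargazers_count"])
```

Even after this cleanup, two runs made at different times can still disagree on exact star counts, since the numbers are live.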
    