
This code worked for many years. A few years ago I hit a similar issue when the API changed; I no longer remember how I debugged it, but the fix turned out to be an extra parameter for the page number. Now there seems to be another slight change, and my program can no longer fetch data. Any help would be appreciated.


import requests
import pandas as pd

pdate = "20230721"  # starting date
date = "20230724"  # till this date
url = 'https://api.bseindia.com/BseIndiaAPI/api/AnnGetData/w'

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36'}

payload = {
'Pageno': 1,
'strCat': '-1',
'strPrevDate': pdate,
'strScrip': '',
'strSearch': 'P',
'strToDate':   date,
'strType': 'C'}

data = []
should_fetch_next_page = True
while should_fetch_next_page:
    print(f"Fetching page {payload['Pageno']} ...")
    jsonData = requests.get(url, headers=headers, params=payload).json()
    if jsonData["Table"]:
        data.extend(jsonData["Table"])
        payload['Pageno'] += 1
        # every thing we want to do

    else:
        should_fetch_next_page = False

df = pd.DataFrame(data)
print(df)

2 Answers


  1. The API URL has changed, and the server now also requires a Referer HTTP header:

    import requests
    import pandas as pd
    
    pdate = "20230721"  # starting date
    date = "20230724"  # till this date
    url = "https://api.bseindia.com/BseIndiaAPI/api/AnnSubCategoryGetData/w"
    
    headers = {
        "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
        "Referer": "https://www.bseindia.com/",
    }
    
    payload = {
        "pageno": 1,
        "strCat": "-1",
        "strPrevDate": pdate,
        "strScrip": "",
        "strSearch": "P",
        "strToDate": date,
        "strType": "C",
        "subcategory": "",
    }
    
    data = []
    should_fetch_next_page = True
    while should_fetch_next_page:
        print(f"Fetching page {payload['pageno']} ...")
        jsonData = requests.get(url, headers=headers, params=payload).json()
        if jsonData["Table"]:
            data.extend(jsonData["Table"])
            payload["pageno"] += 1
            # every thing we want to do
    
        else:
            should_fetch_next_page = False
    
    df = pd.DataFrame(data)
    print(df)
    

    Prints:

    Fetching page 1 ...
    Fetching page 2 ...
    Fetching page 3 ...
    
    ...
    

    NOTE: To debug future problems, start from the page that issues this API request: https://www.bseindia.com/corporates/ann.html. Open that URL in your browser, open Web Developer Tools -> Network tab, and reload the page.

    The API URL should show up there, along with the required parameters, HTTP headers, cookies, etc.
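
The same kind of check can be done from the script before calling `.json()`. Below is a minimal sketch, not part of the original answer: `diagnose_response` and its messages are hypothetical, and it assumes the API replies with a JSON content type when the request is accepted. It takes plain values, so it works with any HTTP client.

```python
from urllib.parse import urlsplit


def diagnose_response(requested_url, final_url, status_code, content_type):
    """Return a short hint about why an API call may have failed.

    Hypothetical helper. All four arguments are plain values, so the
    check can be fed from any HTTP client's response object.
    """
    requested = urlsplit(requested_url)
    final = urlsplit(final_url)

    # A different host or path usually means the server redirected us,
    # for example because a required header such as Referer was missing.
    if (requested.netloc, requested.path) != (final.netloc, final.path):
        return f"redirected to {final_url}: check required headers/cookies"

    if status_code != 200:
        return f"HTTP {status_code}: endpoint or parameters may have changed"

    # Parsing an HTML page as JSON raises a decode error, so check first.
    # Assumption: a successful reply carries a JSON content type.
    if "json" not in content_type.lower():
        return f"unexpected content type {content_type!r}: not a JSON reply"

    return "ok"
```

With `requests`, the values would come from `resp = requests.get(url, headers=headers, params=payload)` as `diagnose_response(url, resp.url, resp.status_code, resp.headers.get("Content-Type", ""))`. Anything other than `"ok"` suggests going back to the Network tab and comparing the browser's request with the one the script sends.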

  2. Hey @Andrej Kesely, your code works fine, thanks. How did you figure out the issue, given that BSE has blocked direct access to the API? Also, how can I access the API link directly? When I open the URL (https://api.bseindia.com/BseIndiaAPI/api/AnnSubCategoryGetData/w?pageno=1&strCat=-1&strPrevDate=20230731&strScrip=&strSearch=P&strToDate=20230731&strType=C&subcategory=) it redirects me to another page (https://www.bseindia.com/members/showinterest.aspx).

    Could you please share your thoughts on this? Thank you.
