I’m trying to scrape some data from a website that tracks live football odds drops. If there is a specific change in the HTML of the page, it should send a notification to a Telegram bot that I’ve made. Here is my code:

from distutils.command.clean import clean
import time
import requests
from bs4 import BeautifulSoup as bs
import urllib3
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

ids_list=[]
game_urls=[] 
game_name=[]
gfix=[]
livecapper_url ="https://livecapper.ru/bet365/" #the website link

while(True):
    page=requests.get(livecapper_url,verify=False).text
    soup = bs(page , "html.parser")
    game_ids = soup.find_all(game_id=True) #getting the IDs of every football game
    for g in game_ids:
            x=g.get('game_id')
            ids_list.append(x)   #putting the IDs on a list

    for id in ids_list:
            game_url = f"https://livecapper.ru/bet365/event.php?id={id}" #the URL of every single football game
            game_urls.append(game_url)

    for g in game_urls:
            response=requests.get(g).text
            soup = bs(response, "html.parser")
            for t in soup.find_all("td",class_=['red1','red2','red3'], limit=1): #detecting the change in HTML
                for g in soup.find_all("h1"):
                    game_name.append(g.get_text()) if g.get_text() not in game_name else game_name

    for f in game_name:
            game_url= 'https://api.telegram.org/botTOKEN/sendMessage?chat_id=-609XXXXXX&text=Fixed Alert : {}'.format(f) #sending notification to telegram bot
            if game_url not in gfix:
                gfix.append(game_url)
                requests.get(game_url)
            else:
                pass       

    ids_list.clear
    game_name.clear
    game_urls.clear
    time.sleep(1)

As you can see, I’m using while(True) to run the code 24/7, but the problem is that each iteration takes approximately twice as long as the previous one.

e.g.
1st iteration=10s | 2nd iteration=20s | 3rd iteration=40s | 4th iteration=80s

What can I do to make all the iterations work as fast as possible?

2 Answers


  1. Change these:

        ids_list.clear
        game_name.clear
        game_urls.clear
    

    to:

        ids_list.clear()
        game_name.clear()
        game_urls.clear()
    

    Without the parentheses, you aren’t calling the methods; you are merely looking them up and then discarding them (i.e., it does nothing).
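    A quick standalone illustration of the difference (plain Python, nothing from the question’s code assumed):

```python
items = [1, 2, 3]

items.clear    # merely looks up the bound method and discards it; the list is unchanged
print(items)   # [1, 2, 3]

items.clear()  # actually calls the method, emptying the list in place
print(items)   # []
```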

  2. There are quite a few issues with the code, but ultimately the reason each iteration takes longer is that you keep appending to your lists, so after every iteration the lists grow bigger and bigger (duplicates included). There are a few things you could do:

    1. Initialize those empty lists inside your loop
    2. Remove duplicates from the lists so you’re not requesting the same thing multiple times in each iteration
    3. Call .clear() correctly, with parentheses

    I simply did 1, since it looks like what you want is to start each iteration with empty lists.

    from distutils.command.clean import clean
    import time
    import requests
    from bs4 import BeautifulSoup as bs
    import urllib3
    urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
    
    
    gfix=[]
    livecapper_url ="https://livecapper.ru/bet365/" #the website link
    
    while(True):
        ids_list=[]
        game_urls=[] 
        game_name=[]
        page=requests.get(livecapper_url,verify=False).text
        soup = bs(page , "html.parser")
        game_ids = soup.find_all(game_id=True) #getting the IDs of every football game
        for g in game_ids:
                x=g.get('game_id')
                ids_list.append(x)   #putting the IDs on a list
    
        for id in ids_list:
                game_url = f"https://livecapper.ru/bet365/event.php?id={id}" #the URL of every single football game
                game_urls.append(game_url)
    
        for g in game_urls:
                response=requests.get(g).text
                soup = bs(response, "html.parser")
                for t in soup.find_all("td",class_=['red1','red2','red3'], limit=1): #detecting the change in HTML
                    for g in soup.find_all("h1"):
                        game_name.append(g.get_text()) if g.get_text() not in game_name else game_name
    
        for f in game_name:
                game_url= 'https://api.telegram.org/botTOKEN/sendMessage?chat_id=-609XXXXXX&text=Fixed Alert : {}'.format(f) #sending notification to telegram bot
                if game_url not in gfix:
                    gfix.append(game_url)
                    requests.get(game_url)
                else:
                    pass       
    
        time.sleep(1)
    
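    For point 2, here is a minimal sketch of de-duplicating the scraped IDs before building the event URLs. The sample IDs are made up; dict.fromkeys keeps first-seen order on Python 3.7+:

```python
ids_list = ["101", "102", "101", "103", "102"]  # hypothetical scraped game IDs, with repeats

# dict.fromkeys drops duplicates while preserving first-seen order
unique_ids = list(dict.fromkeys(ids_list))
print(unique_ids)  # ['101', '102', '103']

# build one event URL per unique game, so nothing is fetched twice in an iteration
game_urls = [f"https://livecapper.ru/bet365/event.php?id={game_id}" for game_id in unique_ids]
```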