I made a Python script that crawls the web page at 'http://spys.one/en/socks-proxy-list/', fetches all the IP addresses listed there, checks whether they are up, and finally returns a list of all live IP addresses. A second script connects to Telegram's Bot API and uses the first script to show the user a list of recently working SOCKS5 servers.
I'm an amateur programmer and new to the Python programming language. I made these scripts as an exercise. Feel free to point out my mistakes and show me ways to improve my code. Thanks in advance!
import requests as req
import re
import socket


def is_open(ip, port):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.connect((ip, int(port)))
        s.shutdown(2)
        return True
    except:
        return False


# Initial settings:
url = 'http://spys.one/en/socks-proxy-list/'
regex = '\d{1,4}\.\d{1,4}\.\d{1,4}\.\d{1,4}'

# Request URL
response = req.get(url).text

# Extract IP and port from source
p = re.compile(regex)
results = p.findall(response)

# Fetch and check the first 20 IPs
alive = []
for i in range(0, 20):
    if is_open(results[i], '1080'):
        alive.append(results[i])


def gimmeprox():
    links = []
    for x in range(0,len(alive)):
        links.append('https://t.me/proxy?server=' + alive[int(x)] + '&port=1080')
    payload = '\n\n'.join(links)
    return payload
When I run this code and the other (bot) script, everything works fine, but as soon as I put it on the web (Heroku, etc.) it crashes on line 30:
line 30, in <module>
if is_open(results[i], '1080'):
with the error "".
2 Answers
Short answer: results does not always have 20 items, so you are asking for elements that don't exist.
You should always check the length before indexing into a list; or, in cases like this where you don't need the index, simply iterate over the items themselves rather than over indices.
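As a minimal sketch of that advice, slicing caps the number of candidates without ever indexing past the end of the list. The helper name first_alive, its parameters, and the stand-in checker fake_check are assumptions made for illustration; in the question's script, the check argument would be is_open.

```python
def first_alive(results, check, limit=20, port='1080'):
    """Return the entries among the first `limit` results that pass `check`."""
    # Slicing never raises IndexError: a short list simply yields fewer items.
    return [ip for ip in results[:limit] if check(ip, port)]


# Hypothetical stand-in for is_open so the example runs without network access.
def fake_check(ip, port):
    return not ip.endswith('.0')


print(first_alive(['10.0.0.1', '10.0.0.0', '10.0.0.2'], fake_check))
```

Because the loop works on the items directly, the same code behaves correctly whether the page yields 3 matches or 300.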
When you loop over range(0, 20) and len(results) is less than 20, you will eventually try to access results[len(results)], resulting in an IndexError. To prevent this, use the lower of len(results) and 20 as the argument to range, like so: min(len(results), 20). An alternative is to loop through all values of results and break when you have 20.
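Both suggestions might look roughly like the sketch below. The function names check_with_min and collect_until, and the injectable check parameter (which stands in for is_open), are assumptions for illustration. The second variant reads "break when you have 20" as "stop once 20 live entries are collected", which is one possible interpretation.

```python
def check_with_min(results, check, limit=20, port='1080'):
    """Index-based loop bounded by the shorter of len(results) and limit."""
    alive = []
    for i in range(min(len(results), limit)):
        if check(results[i], port):
            alive.append(results[i])
    return alive


def collect_until(results, check, wanted=20, port='1080'):
    """Walk every result and stop once `wanted` live entries are collected."""
    alive = []
    for ip in results:
        if len(alive) >= wanted:
            break
        if check(ip, port):
            alive.append(ip)
    return alive
```

The first variant probes at most 20 candidates; the second may probe many more, but it keeps going past dead proxies until it has enough live ones.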