I'm currently trying to build a bot to grab details from my eBay listings. Normally I would use the API, but part of the data I need is in the description. So I am grabbing the page's HTML, looking for a string inside it, and then trying to extract the data that follows it using while loops, but I think I have a stuck loop. I believe it should be working.
It gets as far as the print itemData line but hangs after that. Also, does anyone know of a better way to do this?
import os
import urllib

itemURL = urllib.urlopen('http://www.ebay.co.uk/itm/161333231002')
itemDetails = itemURL.read()
toFind = 'Our Number'
ourPos = itemDetails.find(toFind) + 10
itemData = itemDetails[ourPos:ourPos+15]
print itemData

def Pos(Pos1, PageData):
    while PageData[Pos1:Pos1+1] == ' ' or PageData[Pos1:Pos1+1]:
        Pos1 = Pos1 + 1
    PosS = Pos1
    PosE = PosS
    while PageData[PosE:PosE+1] != '<':
        PosE = PosE + 1
    print PageData[PosS:PosE]

if ourPos == -1:
    print 'Not found'
else:
    Pos(ourPos, itemData)
print Done
2 Answers
I'd say there are several ways to do this. Either you loop through the page line by line, or you join it into a single string and run a regex over it.
For instance, in Python 2.7 it's something like this:
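A minimal sketch of the regex route, run against a made-up HTML fragment rather than the live listing. The sample markup, the optional colon after 'Our Number', and the find_our_number helper are all assumptions for illustration, so the pattern would need adjusting to the real page. It is written to run unchanged on Python 2.7 and Python 3.

```python
import re

# Hypothetical stand-in for the downloaded page; the real markup may differ.
sample_html = '<p>Our Number: AB1234</p><p>Other text</p>'


def find_our_number(html):
    """Return the text after 'Our Number' up to the next tag, or None."""
    # [:\s]* skips an optional colon and whitespace;
    # ([^<]+) captures everything up to the next '<'.
    match = re.search(r'Our Number[:\s]*([^<]+)', html)
    return match.group(1).strip() if match else None


print(find_our_number(sample_html))            # AB1234
print(find_our_number('<p>nothing here</p>'))  # None
```

Because re.search returns None when nothing matches, the "not found" case is handled without any index arithmetic.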
The second while loop is infinite. The first while loop's condition, PageData[Pos1:Pos1+1] == ' ' or PageData[Pos1:Pos1+1], is true for every non-empty slice, so that loop only stops once Pos1 is at or past the end of PageData. PosS and PosE then start from that out-of-range value, so PageData[PosE:PosE+1] is '' (an empty string), which is never equal to '<', and the condition PageData[PosE:PosE+1] != '<' is always true. Print the value of Pos1 as the first while loop runs and you will see the problem.
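One possible way to repair the loops, as a sketch rather than a definitive fix: bound both loops by the length of the data so an out-of-range position ends the loop instead of yielding '' forever. The extract_value name and the sample string are hypothetical. The sketch also tests the result of find for -1 before adding any offset, since ourPos in the question has 10 added first and so can never equal -1. It runs on Python 2.7 and Python 3.

```python
def extract_value(page_data, start):
    """Return the text from `start` up to the next '<', skipping leading spaces."""
    end_of_data = len(page_data)
    pos = start
    # Skip spaces only, and never walk past the end of the string.
    while pos < end_of_data and page_data[pos] == ' ':
        pos += 1
    value_start = pos
    # Stop at '<' or at the end of the data, whichever comes first.
    while pos < end_of_data and page_data[pos] != '<':
        pos += 1
    return page_data[value_start:pos]


# Made-up sample standing in for the page HTML.
sample = 'Our Number:   AB1234</span>'
marker = 'Our Number'
found = sample.find(marker)
if found == -1:
    print('Not found')
else:
    # Skip the marker plus the colon that follows it in this sample.
    print(extract_value(sample, found + len(marker) + 1))  # AB1234
```

Note that the position is computed against the same string that is passed in. In the question, ourPos is an index into itemDetails but the function receives the 15-character itemData, so the index is almost certainly out of range from the start.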