I’m trying to access a protected page on twitter (for example my own like list) via urllib2 in Python, but this code always sends me back to the login page. Any idea why that is?
(I know I can use the twitter API and stuff, but want to learn in general how this is done)
Thanks,
Roy
The code:
import urllib
import urllib2
import cookielib
from bs4 import BeautifulSoup

url = "https://twitter.com/login"
protectedUrl = "https://twitter.com/username/likes"
USER = "myTwitterUser"
PASS = "myTwitterPassword"
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
opener.addheaders = [('User-Agent', 'Mozilla/5.0'), ("Referer", "https://twitter.com")]
hdr = {'User-Agent': 'Mozilla/5.0', "Referer":"https://twitter.com"}
req = urllib2.Request(url, headers=hdr)
page = urllib2.urlopen(req)
html = page.read()
s = BeautifulSoup(html, "lxml")
AUTH_TOKEN = s.find(attrs={"name": "authenticity_token"})["value"]
login_details = {"session[username_or_email]": USER,
                 "session[password]": PASS,
                 "remember_me": 1,
                 "return_to_ssl": "true",
                 "scribe_log": "",
                 "redirect_after_login": "/",
                 "authenticity_token": AUTH_TOKEN}
login_data = urllib.urlencode(login_details)
opener.open(url, login_data)
resp = opener.open(protectedUrl)
print resp.read()
2 Answers
You need to post to the correct URL, which is `https://twitter.com/sessions`. It is also essential to use the `opener` when you make the initial request that fetches the `authenticity_token`, i.e. `page = opener.open(req)` in place of `page = urllib2.urlopen(req)`, so we get the cookies needed. If we run the code using one of my twitter accounts with no likes:
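A corrected version of the full script might look like the sketch below. It is written with Python 3's `urllib.request` and `http.cookiejar` (the successors of `urllib2` and `cookielib`); the `/sessions` endpoint and the form-field names are taken from the question and may no longer match the live site:

```python
# Sketch of the corrected flow, assuming Twitter's old /sessions login
# endpoint and the form fields from the question still apply.
import urllib.parse
import urllib.request
import http.cookiejar

from bs4 import BeautifulSoup

LOGIN_URL = "https://twitter.com/login"
SESSIONS_URL = "https://twitter.com/sessions"   # POST target, not /login
PROTECTED_URL = "https://twitter.com/username/likes"


def extract_auth_token(html):
    """Pull the hidden authenticity_token out of the login page."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.find(attrs={"name": "authenticity_token"})["value"]


def build_login_data(user, password, auth_token):
    """URL-encode the same form fields the question posts."""
    return urllib.parse.urlencode({
        "session[username_or_email]": user,
        "session[password]": password,
        "remember_me": 1,
        "return_to_ssl": "true",
        "scribe_log": "",
        "redirect_after_login": "/",
        "authenticity_token": auth_token,
    }).encode("utf-8")


def fetch_likes(user, password):
    cj = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(cj))
    opener.addheaders = [("User-Agent", "Mozilla/5.0"),
                         ("Referer", "https://twitter.com")]
    # Crucial fix: fetch the login page with the *opener* so its cookies
    # land in the jar; urllib2.urlopen() used a cookie-less default opener.
    html = opener.open(LOGIN_URL).read()
    token = extract_auth_token(html)
    # POST the credentials to /sessions through the same opener.
    opener.open(SESSIONS_URL, build_login_data(user, password, token))
    return opener.open(PROTECTED_URL).read()


if __name__ == "__main__":
    print(fetch_likes("myTwitterUser", "myTwitterPassword"))
```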
You can see that we successfully reach the page we want.
Doing the same with a `requests.Session()` object, the code has a lot less going on:
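A sketch of the same flow with `requests` — the `Session` carries cookies and default headers across every call automatically. As before, the endpoint and field names are the historical ones from the question:

```python
# Same login flow via requests.Session; cookies are handled for us.
import requests
from bs4 import BeautifulSoup


def extract_auth_token(html):
    """Pull the hidden authenticity_token out of the login page."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.find(attrs={"name": "authenticity_token"})["value"]


def fetch_likes(user, password):
    s = requests.Session()
    s.headers.update({"User-Agent": "Mozilla/5.0",
                      "Referer": "https://twitter.com"})
    # GET the login page; the Session stores any cookies it sets.
    token = extract_auth_token(s.get("https://twitter.com/login").text)
    # POST credentials to /sessions through the same Session.
    s.post("https://twitter.com/sessions",
           data={"session[username_or_email]": user,
                 "session[password]": password,
                 "remember_me": 1,
                 "return_to_ssl": "true",
                 "scribe_log": "",
                 "redirect_after_login": "/",
                 "authenticity_token": token})
    return s.get("https://twitter.com/username/likes").text


if __name__ == "__main__":
    print(fetch_likes("myTwitterUser", "myTwitterPassword"))
```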
From my experience with websites like this, you need to send a complete set of browser-like HTTP headers (User-Agent, Accept, Referer, and so on), deleting only the Cookie header, since cookies should be managed by your session. You also need to create a session and handle cookies, as Twitter presumably works like Facebook in this respect. I personally prefer requests, as it makes creating a session and handling cookies easy.
You can do something like this:
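A minimal sketch of the "complete headers" idea: copy everything a real browser sends except Cookie, which the `Session`'s cookie jar owns. The header values below are illustrative, not a verbatim browser capture:

```python
# Browser-like default headers for a requests.Session.
# Note there is deliberately no "Cookie" entry: the session manages it.
import requests

BROWSER_HEADERS = {
    "User-Agent": ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                   "AppleWebKit/537.36 (KHTML, like Gecko) "
                   "Chrome/51.0 Safari/537.36"),
    "Accept": ("text/html,application/xhtml+xml,application/xml;"
               "q=0.9,*/*;q=0.8"),
    "Accept-Language": "en-US,en;q=0.5",
    "Referer": "https://twitter.com",
    "Connection": "keep-alive",
}


def make_session():
    """Return a Session that sends the browser-like headers on every request."""
    s = requests.Session()
    s.headers.update(BROWSER_HEADERS)
    return s
```

Every `s.get(...)` or `s.post(...)` made through this session then looks like ordinary browser traffic, while the cookie jar fills in the Cookie header on its own.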
Hope this helps.