
I wanted to write a program that fetches tweets from Twitter and then runs sentiment analysis on them. I wrote the following code and got an error even though I had imported all the necessary libraries. I'm relatively new to data science, so please help me.
I could not figure out the reason for this error:

class TwitterClient(object):


def __init__(self):

    # keys and tokens from the Twitter Dev Console
    consumer_key = 'XXXXXXXXX'
    consumer_secret = 'XXXXXXXXX'
    access_token = 'XXXXXXXXX'
    access_token_secret = 'XXXXXXXXX'
    api = Api(consumer_key, consumer_secret, access_token, access_token_secret)

    def preprocess(tweet, ascii=True, ignore_rt_char=True, ignore_url=True, ignore_mention=True, ignore_hashtag=True,letter_only=True, remove_stopwords=True, min_tweet_len=3):
        sword = stopwords.words('english')

        if ascii:  # maybe remove lines with ANY non-ascii character
            for c in tweet:
                if not (0 < ord(c) < 127):
                    return ''

        tokens = tweet.lower().split()  # to lower, split
        res = []

        for token in tokens:
            if remove_stopwords and token in sword: # ignore stopword
                continue
            if ignore_rt_char and token == 'rt': # ignore 'retweet' symbol
                continue
            if ignore_url and token.startswith('https:'): # ignore url
                continue
            if ignore_mention and token.startswith('@'): # ignore mentions
                continue
            if ignore_hashtag and token.startswith('#'): # ignore hashtags
                continue
            if letter_only: # ignore digits
                if not token.isalpha():
                    continue
            elif token.isdigit(): # otherwise unify digits
                token = '<num>'

            res += token, # append token

        if min_tweet_len and len(res) < min_tweet_len: # ignore tweets with fewer than n tokens
            return ''
        else:
            return ' '.join(res)

    for line in api.GetStreamSample():            
        if 'text' in line and line['lang'] == u'en': # step 1
            text = line['text'].encode('utf-8').replace('\n', ' ') # step 2
            p_t = preprocess(text)

    # attempt authentication
    try:
        # create OAuthHandler object
        self.auth = OAuthHandler(consumer_key, consumer_secret)
        # set access token and secret
        self.auth.set_access_token(access_token, access_token_secret)
        # create tweepy API object to fetch tweets
        self.api = tweepy.API(self.auth)
    except:
        print("Error: Authentication Failed")

Assume all the necessary libraries are imported. The error occurs at line 69 of my script, which is this loop:

for line in api.GetStreamSample():            
    if 'text' in line and line['lang'] == u'en': # step 1
        text = line['text'].encode('utf-8').replace('\n', ' ') # step 2
        p_t = preprocess(text)

I searched the internet for the cause of this error but could not find a solution.

The error was:

requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(0 bytes read, 512 more expected)', IncompleteRead(0 bytes read, 512 more expected))

I’m using Python 2.7 and requests 2.14, which is the latest version.

3 Answers


  1. If you set stream to True when making a request, Requests cannot release the connection back to the pool unless you consume all the data or call Response.close. This can lead to inefficiency with connections. If you find yourself partially reading request bodies (or not reading them at all) while using stream=True, you should make the request within a with statement to ensure it’s always closed:

    with requests.get('http://httpbin.org/get', stream=True) as r:
        # Do things with the response here.
    
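    Applied to the streaming case in the question, here is a minimal sketch of the same idea. It is only an illustration: it uses a plain requests call against a placeholder URL rather than python-twitter's GetStreamSample, the read_stream helper is hypothetical, and it assumes a requests version whose Response objects support the with statement. The retry on ChunkedEncodingError is just one way to reconnect after the server drops the stream mid-chunk.

        import requests

        def read_stream(url, retries=3):
            # Stream the response inside `with` so the connection is always
            # released, even if we stop reading early or an error is raised.
            for _ in range(retries):
                try:
                    with requests.get(url, stream=True, timeout=90) as r:
                        r.raise_for_status()
                        for line in r.iter_lines():
                            if line:  # skip keep-alive newlines
                                yield line
                    return  # the stream ended cleanly
                except requests.exceptions.ChunkedEncodingError:
                    # the chunked stream was cut off mid-read (the error
                    # from the question); reconnect and try again
                    continue

        # usage: iterate over the helper much like api.GetStreamSample()
        # for raw_line in read_stream('https://example.com/stream'):
        #     print(raw_line)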
  2. I had the same problem, but without stream=True. As stone mini said, just apply a “with” clause so that your request is closed before you make a new one.

        with requests.request("POST", url_base, json=task, headers=headers) as report:
            print('report: ', report)
    
  3. Actually, the problem is with your Django 2.7 or earlier based application. Those Django versions by default allow only 2.5 MB of in-memory upload size for the request body.

    I was facing the same issue with a Django 2.7 based application, and I just updated the settings.py file of the Django application that serves my URLs (endpoints).

    DATA_UPLOAD_MAX_MEMORY_SIZE = None
    

    I just added the above variable to my application’s settings.py file.
    You can also read more about it here.

    I’m pretty sure this will work for you.
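
    If you would rather not disable the check completely, a sketch of a more conservative settings.py change (the 10 MB figure is only an example value, not something from the question) could be:

        # settings.py
        # Django's default limit for an in-memory request body is 2.5 MB
        # (2621440 bytes); raise it instead of switching the check off.
        DATA_UPLOAD_MAX_MEMORY_SIZE = 10 * 1024 * 1024  # 10 MB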
