Python how to get the tweet data using specific word in csv file and put it in new csv file - Twitter API

Ranee
May 27, 2018
216 views
3 votes
3 Answers

I have Twitter data in a CSV file (which I mined with a Python API), about 1000 lines in total. Now I want to filter the tweets for the specific Indonesian words “macet” or “kecelakaan” (in English, “traffic” or “accident”) and put the matching rows into a new, separate CSV file, just like using Find All in Excel.

The sample Twitter data is in example1.csv, and the new file that should be created after searching for the word “macet” or “kecelakaan” is example2.csv. But I get no result.

```
import re
import csv

with open('example1.csv', 'r') as csvFile:
    reader = csv.reader(csvFile)

if re.search(r'macet', reader):
    for row in reader:
        myData = list(row)
        print(row)

newFile = open('example2.csv', 'w')
with newFile:
    writer = csv.writer(newFile)
    writer.writerows(myData)

print("Writing complete")
```

I use Spyder as my environment, with Python 3.6.

The CSV file is already in the same folder as Spyder. Here is a screen capture of my CSV Twitter data:

myCSVtwitterData

Update: added a sample of the CSV file. OS: Windows.

Answers

- PM2Ring
- May 27, 2018 at 10:12 am
- 0 votes
There are a couple of problems with your code.

In your reading loop you are passing a csv.reader object to re.search, but it doesn’t know how to search that object. You need to pass it text or byte strings.
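To illustrate that first point, here is a minimal sketch. The in-memory sample rows are made up and only stand in for example1.csv; the column layout (text in index 2) is an assumption based on the screenshot.

```python
import csv
import io
import re

# Hypothetical sample data standing in for example1.csv
sample = io.StringIO("1,user_a,Jalan macet parah\n2,user_b,Cuaca cerah\n")
reader = csv.reader(sample)

error_message = None
try:
    # A csv.reader is not a string, so re.search refuses it
    re.search(r'macet', reader)
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)

# Searching a single cell (a str) of each row works as expected
matches = [row for row in reader if re.search(r'macet', row[2])]
print(matches)
```

So the search has to happen inside the loop, on a string taken from each row, not on the reader itself.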

The line
```
myData = list(row)
```
converts row into a new list and saves it to myData, but it’s already a list, so no conversion is necessary. And that line replaces the previous contents of myData, but you actually want to save all the matching rows. However, there’s no need to save the rows, you can just write them to the new file as you go.

Anyway, here’s a repaired version of your code. From the screenshot it looks like you only want to search the text in column index 2 of the input data (which corresponds to column C in your spreadsheet). I’ve created a regex that searches for the whole words “macet” and “kecelakaan”. The `\b` matches at word boundaries, so we don’t get a match if “macet” or “kecelakaan” is part of a larger word.
```
import re
import csv

# Make a case-insensitive regex to match the words "macet" or "kecelakaan"
pattern = re.compile(r'\bmacet\b|\bkecelakaan\b', re.I)

with open('example1.csv', 'r', newline='') as csvFile, open('example2.csv', 'w', newline='') as newFile:
    reader = csv.reader(csvFile)
    writer = csv.writer(newFile)

    for row in reader:
        # Skip empty rows
        if not row:
            continue
        if pattern.search(row[2]):
            print(row)
            writer.writerow(row)

print("Writing complete")
```
I’ve just made a couple of improvements to that code. It now uses the newline='' arg to open the CSV files, and it skips any empty lines in the input CSV. And the regex now ignores the case when looking for matching words.
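If the effect of the word boundaries is not obvious, this small sketch compares a plain substring pattern with the whole-word pattern. The sample tweets are invented for illustration only.

```python
import re

# Plain substring match versus whole-word match, both case-insensitive
plain = re.compile(r'macet', re.I)
whole_word = re.compile(r'\bmacet\b|\bkecelakaan\b', re.I)

# Made-up sample tweets
samples = [
    "Jalan MACET total",
    "kemacetan panjang",
    "Ada kecelakaan, hati-hati",
]

results = []
for text in samples:
    found = (bool(plain.search(text)), bool(whole_word.search(text)))
    results.append(found)
    print(text, found)
```

The second sample contains “macet” only inside the longer word “kemacetan”, so the plain pattern matches it and the whole-word pattern does not. Whether you want that depends on your data; drop the `\b` if derived words should count as matches too.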

- JeffProd
- May 27, 2018 at 10:33 am
- 0 votes
This is not a Python answer, but if you are on Linux you can do it with a single command:
```
grep -i "macet" example1.csv > example2.csv
```
-i means ignore case, so it will also match “Macet”.
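Since the question mentions Windows, here is a rough Python sketch of the same line-based idea, extended to both words. The function name `grep_ignore_case` and the sample lines are hypothetical, chosen just for this illustration.

```python
def grep_ignore_case(lines, words):
    """Yield the lines that contain any of `words`, ignoring case."""
    lowered = [word.lower() for word in words]
    for line in lines:
        line_lower = line.lower()
        if any(word in line_lower for word in lowered):
            yield line

# Made-up lines standing in for the contents of example1.csv
sample_lines = [
    "1,user_a,Jalan Macet parah\n",
    "2,user_b,Cuaca cerah\n",
    "3,user_c,Ada KECELAKAAN di tol\n",
]

kept = list(grep_ignore_case(sample_lines, ["macet", "kecelakaan"]))
print(kept)
```

With a real file you would pass the open file object as `lines` and write the kept lines to the new file. Like grep, this works on raw lines, so it can break a quoted CSV field that contains a line break; the csv module avoids that.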

How about this? The code visits the rows one by one, finds the cells that contain a word from word_list, and writes those cells as a row of the new file.

```
import re
import csv

word_list = ['macet', 'kecelakaan']

with open('example1.csv', 'r') as csvFile, open('example2.csv', 'w') as newFile:

    reader = csv.reader(csvFile)
    writer = csv.writer(newFile, lineterminator='\n')

    for row in reader:
        # Keep only the cells that contain at least one of the words
        new_row = [content for content in row if any(map(lambda word: word in content, word_list))]
        if new_row:
            print(new_row)
            writer.writerow(new_row)

print("Writing complete")
```
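Note that the code above writes only the matching cells. If the whole row should be kept, as the question seems to ask, a variant could look like the sketch below. It uses in-memory buffers and invented sample rows in place of example1.csv and example2.csv, and the helper name `row_matches` is hypothetical.

```python
import csv
import io

word_list = ['macet', 'kecelakaan']

def row_matches(row, words):
    """Return True if any cell contains any of the words, ignoring case."""
    return any(word in cell.lower() for cell in row for word in words)

# Made-up sample data; with real files, use open(...) instead of StringIO
source = io.StringIO("1,user_a,Jalan Macet parah\n2,user_b,Cuaca cerah\n")
target = io.StringIO()

writer = csv.writer(target, lineterminator='\n')
for row in csv.reader(source):
    if row_matches(row, word_list):
        # Write the complete row, not just the matching cells
        writer.writerow(row)

print(target.getvalue())
```

The lowercase comparison is an addition here so that “Macet” is found as well; remove `.lower()` if the match should stay case-sensitive.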
