
How to polish return and save to individual JSON/csv file?

ArkLade
February 27, 2023
248 views
0 votes
2 Answers

The code below is from "Using word2vec to classify words in categories", and I need some help with handling the input and saving the return. Any help would be greatly appreciated.

import numpy as np

# Category -> words
data = {
  'Names': ['john','jay','dan','nathan','bob'],
  'Colors': ['yellow', 'red','green', 'oragne', 'purple'],
  'Places': ['tokyo','bejing','washington','mumbai'],
}
# Words -> category
categories = {word: key for key, words in data.items() for word in words}

# Load the whole embedding matrix
embeddings_index = {}
with open('glove.6B.100d.txt', encoding='utf-8') as f:
  for line in f:
    values = line.split()
    word = values[0]
    embed = np.array(values[1:], dtype=np.float32)
    embeddings_index[word] = embed
print('Loaded %s word vectors.' % len(embeddings_index))
# Embeddings for available words
data_embeddings = {key: value for key, value in embeddings_index.items() if key in categories.keys()}

# Processing the query
def process(query):
  query_embed = embeddings_index[query]
  scores = {}
  for word, embed in data_embeddings.items():
    category = categories[word]
    dist = query_embed.dot(embed)
    dist /= len(data[category])
    scores[category] = scores.get(category, 0) + dist
  return scores

# Testing
print(process('jonny'))
print(process('green'))
print(process('park'))

And the return looks like:

Loaded 400000 word vectors.
{'Names': 7.965438079833984, 'Places': -0.3282392770051956, 'Colors': 1.803783965110779}
{'Names': 11.360316085815429, 'Places': 3.536876901984215, 'Colors': 21.82199630737305}
{'Names': 10.234728145599364, 'Places': 8.739515662193298, 'Colors': 10.761297225952148}

Below are the changes I want to make to this script, but I keep failing 🙁 Please help.

Question 1: The order of the categories in data is Names, Colors, and Places. Why does the return have the order Names, Places, Colors instead? This is not important, but I was wondering why.

Question 2: Instead of using print(process('jonny')), how can I input a list of text from a text file?

Question 3: Let's suppose the name of the input text file is TEST.txt. How can I save the return in a TEST.JSON or TEST.csv file? Basically, the input and output should have the same name.

Thank you so much!

Answers

Chosen as BEST ANSWER
- ArkLade
- February 27, 2023 at 9:15 pm
- 0 votes
Thanks a lot, @Driftr95

The code below takes multiple text files as input and saves the return for each one in its own JSON file.
```
import json

# process() is the function defined in the question
inpFiles = ['text1.txt', 'text2.txt', 'text3.txt']
# ifLen = len(inpFiles)
for inpf in inpFiles: # for ifi, inpf in enumerate(inpFiles, 1):
    # print('', end=f'\r[{ifi} of {ifLen}] processing "{inpf}"...')
    with open(inpf) as f: inputList = f.read().splitlines()
    with open(f'{inpf[:-4]}.json', 'w') as f:
        json.dump({inp: process(inp) for inp in inputList}, f, indent=4)
```


- Driftr95
- February 27, 2023 at 5:45 pm
- 0 votes
Question 1: The order of the categories in data is Names, Colors, and Places. Why does the return have the order Names, Places, Colors instead? This is not important, but I was wondering why.

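It's because of how the contents of 'glove.6B.100d.txt' are ordered. process adds a category to scores the first time it meets one of that category's words in data_embeddings, and data_embeddings keeps the line order of the GloVe file (dicts preserve insertion order in Python 3.7+). So the categories come out in the order their first words occur in that file, not in the order of data.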
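
If you want the return to follow the order of data anyway, one option is to rebuild the dict at the end of the function. The sketch below is only an illustration: the vectors are made up, plain lists replace the NumPy arrays, and process_ordered is a hypothetical name, not part of the original script.

```python
# Toy stand-ins for the question's variables (the vectors are made up).
data = {
    'Names': ['john', 'bob'],
    'Colors': ['red', 'green'],
    'Places': ['tokyo'],
}
categories = {word: key for key, words in data.items() for word in words}

# Insertion order here plays the role of the line order in the GloVe file.
data_embeddings = {
    'john': [1.0, 0.0],
    'tokyo': [0.0, 1.0],
    'red': [1.0, 1.0],
    'bob': [0.5, 0.5],
    'green': [0.0, 2.0],
}

def process_ordered(query_embed):
    """Same scoring idea as process(), but returned in the key order of data."""
    scores = {}
    for word, embed in data_embeddings.items():
        category = categories[word]
        dist = sum(q * e for q, e in zip(query_embed, embed))
        dist /= len(data[category])
        scores[category] = scores.get(category, 0) + dist
    # Rebuild the dict so its keys follow data, not the embeddings file.
    return {category: scores.get(category, 0) for category in data}

result = process_ordered([1.0, 1.0])
print(list(result))  # ['Names', 'Colors', 'Places']
```

Without the final rebuild, the keys in this toy example would come out as Names, Places, Colors, which mirrors what the question shows.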

Question 2: Instead of using print(process('jonny')), how can I input a list of text from a text file? [Let's suppose the name of the input text file is TEST.txt.]

Assuming 'TEST.txt' has one input per line, like:
```
jonny
green
park
[input#4]
[input#5]
```
Then you could read them into a list of strings to loop through and apply process to:
```
with open('TEST.txt') as f: 
    inputList = f.read().splitlines()

# for inp in inputList: print(process(inp)) ## OR
outputList = [process(inp) for inp in inputList] 
for op in outputList: print(op) 
```
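One thing to watch for: process looks the query up with embeddings_index[query], so a blank line or a word that is not in the GloVe file raises a KeyError. A minimal sketch of a more forgiving reader follows. The helper names read_inputs and process_known are hypothetical, and the demo uses a throwaway file plus stand-ins for embeddings_index and process.

```python
import os
import tempfile

def read_inputs(path):
    """Return the non-empty, stripped lines of a text file."""
    with open(path, encoding='utf-8') as f:
        return [line.strip() for line in f if line.strip()]

def process_known(inputs, process_fn, vocabulary):
    """Apply process_fn to inputs found in vocabulary; collect the rest."""
    results, skipped = {}, []
    for inp in inputs:
        if inp in vocabulary:
            results[inp] = process_fn(inp)
        else:
            skipped.append(inp)
    return results, skipped

# Demo with a throwaway file and stand-ins for embeddings_index / process.
demo_vocab = {'jonny': None, 'green': None}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'TEST.txt')
    with open(path, 'w', encoding='utf-8') as f:
        f.write('jonny\n\ngreen\n  notaword  \n')
    inputs = read_inputs(path)

results, skipped = process_known(inputs, lambda w: {'length': len(w)}, demo_vocab)
print(results)  # {'jonny': {'length': 5}, 'green': {'length': 5}}
print(skipped)  # ['notaword']
```

With the real script you would call it as process_known(read_inputs('TEST.txt'), process, embeddings_index).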
Question 3: […] How can I save the return in TEST.JSON or TEST.csv file? Basically input and output as same name.

To save as CSV, you could use pandas .to_csv:
```
import pandas as pd
# pd.DataFrame(outputList, index=inputList).to_csv('TEST.csv') ## same as:
# pd.DataFrame([process(i) for i in inputList], index=inputList).to_csv('TEST.csv')
 
pd.DataFrame(
    [{'input': inp, **process(inp)} for inp in inputList]
).set_index('input').to_csv('TEST.csv')
```
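If pandas is not available, the standard library's csv.DictWriter can produce the same layout. This is only a sketch: write_scores_csv is a hypothetical helper, the scores are made up, and it assumes every result is a dict of category scores like the ones process returns.

```python
import csv
import io

def write_scores_csv(rows, fileobj):
    """Write {input: {category: score}} as CSV with an 'input' column first."""
    # Collect category names in first-seen order for the header.
    fieldnames = ['input']
    for scores in rows.values():
        for category in scores:
            if category not in fieldnames:
                fieldnames.append(category)
    writer = csv.DictWriter(fileobj, fieldnames=fieldnames, restval='')
    writer.writeheader()
    for inp, scores in rows.items():
        writer.writerow({'input': inp, **scores})

# Made-up scores standing in for process() results.
sample = {
    'jonny': {'Names': 7.9, 'Places': -0.3, 'Colors': 1.8},
    'green': {'Names': 11.3, 'Places': 3.5, 'Colors': 21.8},
}
buffer = io.StringIO(newline='')
write_scores_csv(sample, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file instead of the in-memory buffer, open it with open('TEST.csv', 'w', newline='') and pass that file object in.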
and to save as JSON, you can use json.dump (two possible layouts are marked op1 and op2 in the comments):
```
import json

with open('TEST.json', 'w') as f: 
    # json.dump([{'input':inp, 'output': process(inp)} for inp in inputList], f) ## op1
    json.dump({inp: process(inp) for inp in inputList}, f) #, indent=4) ## op2
```
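Depending on your NumPy version, process may return NumPy scalars such as np.float32 rather than plain Python floats, and json.dump raises a TypeError for those. If you run into that, passing default=float is one way around it. In the sketch below, Decimal is only a stand-in for such a value so that the example runs without NumPy; the scores are made up.

```python
import json
from decimal import Decimal

# Decimal is only a stand-in here: like np.float32, json cannot serialize it
# directly, but float() can convert it.
scores = {'jonny': {'Names': Decimal('7.5'), 'Colors': Decimal('1.25')}}

# default= is called for any object json does not know how to encode.
text = json.dumps(scores, default=float, indent=4)
print(text)

# Reading it back gives ordinary Python floats.
roundtrip = json.loads(text)
```

The same default=float argument works with json.dump when writing to a file.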
Added EDIT:

Let's suppose I have a list of text files for this. How would I process all the text files at once and save the return under the same file name? For example, if I use text1.txt, text2.txt, and text3.txt, the return will be text1.json, text2.json, and text3.json.
```
inpFiles = ['text1.txt', 'text2.txt', 'text3.txt']
# ifLen = len(inpFiles)
for inpf in inpFiles: # for ifi, inpf in enumerate(inpFiles, 1):
    # print('', end=f'\r[{ifi} of {ifLen}] processing "{inpf}"...')
    with open(inpf) as f: inputList = f.read().splitlines()
    with open(f'{inpf[:-4]}.json', 'w') as f:
        json.dump({inp: process(inp) for inp in inputList}, f, indent=4)
```
[Using f'{inpf[:-4]}.json' assumes all file names in inpFiles end with ‘.txt’]
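
If the file names might not all end with '.txt', pathlib's with_suffix swaps whatever extension is there. A minimal sketch, where process_files is a hypothetical helper and the lambda in the demo stands in for process:

```python
import json
import tempfile
from pathlib import Path

def process_files(paths, process_fn):
    """Write one <name>.json next to each input file and return the outputs."""
    written = []
    for path in map(Path, paths):
        lines = path.read_text(encoding='utf-8').splitlines()
        inputs = [ln.strip() for ln in lines if ln.strip()]
        out_path = path.with_suffix('.json')  # text1.txt -> text1.json
        with out_path.open('w', encoding='utf-8') as f:
            json.dump({inp: process_fn(inp) for inp in inputs}, f, indent=4)
        written.append(out_path)
    return written

# Demo in a temporary folder; the lambda stands in for process().
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / 'text1.txt').write_text('jonny\ngreen\n', encoding='utf-8')
    (folder / 'notes.text').write_text('park\n', encoding='utf-8')
    outputs = process_files(sorted(folder.iterdir()), lambda w: {'length': len(w)})
    names = [p.name for p in outputs]
    loaded = json.loads((folder / 'text1.json').read_text(encoding='utf-8'))
print(names)  # ['notes.json', 'text1.json']
```

With the real script, process_files(Path('.').glob('*.txt'), process) would pick up every .txt file in the current folder.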