I’m newer to Python. I’m using openpyxl for a SEO project for my brother and I’m trying to get a number of rows that contain a specific value in them.
I have a spreadsheet that looks something like this:
I want to write a program that will get the keywords and parse them to a string by state, so like:
Missouri = "search item 1, search item 2, search item 5, search item 6"
Illinois = "search item 3, search item 4"
I have thus far created a program like this:
#first, import openpyxl
import openpyxl
#next, give location of file
path = "testExcel.xlsx"
#Open workbook by creating object
wb_object = openpyxl.load_workbook(path)
#Get workbook active sheet object
sheet_object = wb_object.active
#Getting the value of maximum rows
#and column
row = sheet_object.max_row
column = sheet_object.max_column
print("Total Rows:", row)
print("Total Columns:", column)
#printing the value of forth column, state
#Loop will print all values
#of first column
print("nValue of fourth column")
for i in range(4, row + 1):
cell_object = sheet_object.cell(row=i, column=4)
split_item_test = cell_object.value.split(",")
split_item_test_result = split_item_test[0]
state = split_item_test_result
print(state)
if (state == 'Missouri'):
print(state.count('Missouri'))
print("All good")
The problem is after doing this, I see that it prints 1 repeatedly, but not a total number for Missouri. I would like a total number of mentions of the state, and then eventually get it to a string with each search criteria.
Is this possible with openpyxl? Or will I need a different library?
3
Answers
ranemirusG is right, there are several ways to obtain the same result. Here’s another option…I attempted to preserve your thought process, good luck.
Yes, it’s possible with
openpyxl
.To achieve your real goal try something like this:
Ok another option
This will create a dictionary ‘state_dict’ in the format per your question
Will create the dictionary ‘state_dict’ as so;
Print output
###————–Additional Information ———————–###
Updated the state_dict to include the rank details for each item.
The output display now shows each items for each state in rank order. You have two options on how the data may be restricted;
Maximum rank to show, the variable rank_max = 100 determines the highest rank the output will display so if it’s set to 5 then only ranks 1, 2, 3, 4 and 5 will be displayed if the State has items with those ranks.
The total_ranks_to_display determines number of ranks to display. So regardless of the rank_max value this will restrict the number of ranks shows with rank 1 as top. Example; if you set rank_max to 10 and this would show 8 rows for Missouri then setting the total_ranks_to_display to 4 will mean only the top 4 ranks will show.
You can use either or both to achieve what you need I think.
Example Output
Max rank is 10 and the total ranks to display is 4