Hi, I’m fairly new to Python and need help extracting strings from a list. I am using Python in Visual Studio.
I have hundreds of similar lists of strings and I need to extract specific information from them so I can add it to a table in columns. The aim is to automate this task using Python. I would like to extract the data between the headers ‘Names’, ‘Ages’ and ‘Jobs’. The issue I am facing is that the number of names, ages and jobs varies a lot from list to list, so I would like to write one piece of code that works for all the lists.
list_x = ['Names', 'Ashley', 'Lee', 'Poonam', 'Ages', '25', '35', '42', 'Jobs', 'Doctor', 'Teacher', 'Nurse']
I am struggling to extract
['Ashley', 'Lee', 'Poonam']
I have tried the following:
for x in list_x:
    if x == 'Names':
        for y in list_x:
            if y == 'Ages':
                print(list_x[x:y])
This, however, comes up with the following error:
"Exception has occurred: TypeError
slice indices must be integers or None or have an __index__ method"
Is there a way of doing this without specifying exact indices?
2 Answers
As the comment suggested, editing the data is the easiest way to go, but if you have to…
The approach just finds the indices of "Names" and "Ages" in the list and extracts the bit between. Lots can (and will) go wrong with this method, though: if there’s a name which is "Names", or if the headers are misspelt, etc.
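The answer's original code is not shown here, so the following is only a minimal sketch of the index-based idea described above, using the built-in `list.index` method. The variable names `start`, `end` and `names` are illustrative, not taken from the answer.

```python
list_x = ['Names', 'Ashley', 'Lee', 'Poonam',
          'Ages', '25', '35', '42',
          'Jobs', 'Doctor', 'Teacher', 'Nurse']

# list.index returns the integer position of the first matching element
start = list_x.index('Names')
end = list_x.index('Ages')

# Skip the 'Names' header itself and stop just before 'Ages'
names = list_x[start + 1:end]
print(names)  # ['Ashley', 'Lee', 'Poonam']
```

Note that `list.index` raises `ValueError` when a header is missing, which is one of the ways this can go wrong on messy data.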
For completeness' sake, it might not be a bad idea to use an approach similar to the one below.
First, build a list of indices of each of the desired headers:
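One possible way to write this step is sketched below. It assumes the three header names from the question and assumes that 'Jobs' is its own element in the list (i.e. there is a comma between '42' and 'Jobs'). The names `headers` and `header_indices` are illustrative.

```python
list_x = ['Names', 'Ashley', 'Lee', 'Poonam',
          'Ages', '25', '35', '42',
          'Jobs', 'Doctor', 'Teacher', 'Nurse']

# Assumed header names, taken from the question
headers = {'Names', 'Ages', 'Jobs'}

# Position of every element that is a header, in list order
header_indices = [i for i, item in enumerate(list_x) if item in headers]
print(header_indices)  # [0, 4, 8]
```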
Then, create a list of values for each header, which we can infer from the positions where each header shows up in the list:
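A sketch of this step, under the same assumptions about the header names: each section is taken to run from just after its header up to the next header, or to the end of the list for the last one. The names `boundaries` and `values` are illustrative.

```python
list_x = ['Names', 'Ashley', 'Lee', 'Poonam',
          'Ages', '25', '35', '42',
          'Jobs', 'Doctor', 'Teacher', 'Nurse']
headers = {'Names', 'Ages', 'Jobs'}  # assumed header names
header_indices = [i for i, item in enumerate(list_x) if item in headers]

# End of each section: the next header's index, or len(list_x) for the last
boundaries = header_indices[1:] + [len(list_x)]

# Map each header to the slice between it and the next boundary
values = {
    list_x[start]: list_x[start + 1:end]
    for start, end in zip(header_indices, boundaries)
}
```

Because the slices are computed from the header positions, this works regardless of how many entries each section has.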
And finally, we can display it for debugging purposes:
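For illustration, assuming `values` is a header-to-list mapping like the one the previous step describes (written out literally here so the snippet runs on its own), the display could look something like this:

```python
# Stand-in for the mapping built in the previous step (assumed shape)
values = {
    'Names': ['Ashley', 'Lee', 'Poonam'],
    'Ages': ['25', '35', '42'],
    'Jobs': ['Doctor', 'Teacher', 'Nurse'],
}

# One line per header, followed by its extracted entries
lines = [f'{header}: {items}' for header, items in values.items()]
for line in lines:
    print(line)
```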
For better time complexity, O(N), we can alternatively use an approach like the one below, so that we only have one for loop over the list to build a dict object with the values:
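A minimal sketch of such a single-pass version, assuming the same three header names and that every header appears as its own list element. The names `result` and `current` are illustrative.

```python
list_x = ['Names', 'Ashley', 'Lee', 'Poonam',
          'Ages', '25', '35', '42',
          'Jobs', 'Doctor', 'Teacher', 'Nurse']
headers = {'Names', 'Ages', 'Jobs'}  # assumed header names; set lookup is O(1)

result = {}
current = None  # header whose section we are currently inside
for item in list_x:
    if item in headers:
        # A header starts a new section
        current = item
        result[current] = []
    elif current is not None:
        # Any other item belongs to the most recent header
        result[current].append(item)

print(result)
```

Items that appear before the first header are skipped here; whether that is the right behaviour depends on the real data.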