
Assuming a list as follows:

list_of_strings = ['foo', 'bar', 'soap', 'seo', 'paseo', 'oes']

and a string of letters to look for:

to_find = 'eos'

I would like to find the strings in list_of_strings that contain every letter of to_find, in any order. The expected output is ['seo', 'paseo', 'oes'], since each of those strings has all the letters in to_find.

I tried a couple of things:

a = next((string for string in list_of_strings if to_find in string), None) # gives NoneType object as output

and

result = [string for string in list_of_strings if to_find in string] # gives [] as output

but neither of them works.

Can someone please tell me what I am doing wrong?

Thanks

2 Answers


  1. Logically, your problem is to compare the set of characters in the word to find against the set of characters in each word in the list. If a word in the list contains every character of the word to find, it is a match. Here is one approach using a list comprehension along with set intersection:

    list_of_strings = ['foo', 'bar', 'soap', 'seo', 'paseo', 'oes']
    to_find = 'eos'
    to_find_set = set(to_find)  # a string is already iterable, so list() is not needed
    output = [x for x in list_of_strings if len(to_find_set.intersection(x)) == len(to_find_set)]
    print(output)  # ['seo', 'paseo', 'oes']
    

    If you want to retain an empty string placeholder for any input string which does not match, then use this version:

    output = [x if len(to_find_set.intersection(x)) == len(to_find_set) else '' for x in list_of_strings]
    print(output)  # ['', '', '', 'seo', 'paseo', 'oes']
    
  2. Do the letters of to_find need to be next to each other, or is it enough for all of them to appear somewhere in the word? In other words: does seabco match or not?

    [Your question does not include this detail and you use "substring" a lot but also "since it has all the letters in the to_find", so I’m not sure how to interpret it.]

    If seabco matches, then @Tim Biegeleisen’s answer is the correct one. If the letters need to be next to each other (but in any order, of course), then see below:


    If to_find is relatively short, you can just generate all permutations of its letters (n! of them, so here 3! = 6: eos, eso, oes, ose, seo, soe) and use in to check whether any of them is a substring of each word.

    import itertools
    list_of_strings = ['foo', 'bar', 'soap', 'seo', 'paseo', 'oes']
    to_find = 'eos'
    
    result = [string for string in list_of_strings if any("".join(perm) in string for perm in itertools.permutations(to_find))]
    

    https://docs.python.org/3/library/itertools.html#itertools.permutations

    We do "".join(perm) because perm is a tuple and we need a string.

    >>> result = [string for string in list_of_strings if any("".join(perm) in string for perm in itertools.permutations(to_find))]
    >>> result
    ['seo', 'paseo', 'oes']
    

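    A less obvious approach with better complexity is to take every run of 3 adjacent characters in each string (so the letters stay next to each other) and compare its set of characters with the set of characters in to_find. This works here because to_find consists of three distinct letters.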

    list_of_strings = ['foo', 'bar', 'soap', 'seo', 'paseo', 'oes']
    to_find = 'eos'
    
    result = [string for string in list_of_strings if any(set(three_substring)==set(to_find) for three_substring in zip(string, string[1:], string[2:]))]
    
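A note on the set-based check in answer 1: sets drop duplicates, so a to_find with a repeated letter (say 'foo') would also match a word such as 'of' that has only one 'o'. If repeated letters should count, one option is to compare letter counts instead. This is only a minimal sketch under that assumption; the helper name contains_all_letters is made up for illustration.

```python
from collections import Counter

def contains_all_letters(word, letters):
    """Return True if `word` has every character of `letters`, counting repeats."""
    # Counter subtraction keeps only positive counts, so an empty result
    # means `word` supplies each required character often enough.
    return not (Counter(letters) - Counter(word))

list_of_strings = ['foo', 'bar', 'soap', 'seo', 'paseo', 'oes']
to_find = 'eos'

output = [word for word in list_of_strings if contains_all_letters(word, to_find)]
print(output)  # ['seo', 'paseo', 'oes']
```

For the original input the result is the same as with sets; the difference only shows up when to_find repeats a letter.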