
I have a text string and I want to replace certain two-word phrases with a single token. For example, if the phrase is "artificial intelligence", I want to replace it with "artificial_intelligence". This needs to be done for a list of 200 phrases on a text file of about 5 MB.
I tried str.replace(), but I can only get it to work for a single phrase, not for the whole list.

Example

text = 'Artificial intelligence is useful for us in every situation of deep learning.'

list_a : list_b
Artificial intelligence : artificial_intelligence
Deep learning : deep_learning
...

text.replace('Artificial intelligence', 'Artificial_intelligence') works.
But

for i in range(len(list_a)):
    text = text.replace(list_a[i], list_b[i])

doesn't work.

2 Answers


  1. I would suggest using a dict for your replacements:

    text = "Artificial intelligence is useful for us in every situation of deep learning."
    replacements = {"Artificial intelligence" : "Artificial_intelligence",
                    "deep learning" : "deep_learning"}
    

    Then your approach works (although it is case-sensitive):

    >>> for rep in replacements:
    ...     text = text.replace(rep, replacements[rep])
    ...
    >>> print(text)
    Artificial_intelligence is useful for us in every situation of deep_learning.
    

    For other approaches (such as the suggested regex approach), have a look at the Stack Overflow question "Python replace multiple strings".
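
As a rough illustration of what a regex approach could look like for a list of about 200 phrases, here is a minimal sketch that joins all phrases into one compiled pattern and replaces them in a single pass. The function name `replace_phrases` and the lowercase-keyed table are my own assumptions for the example, not something from the question, and it has not been tuned for a 5 MB input.

```python
import re

# Hypothetical replacement table. Keys are stored in lowercase so that the
# lookup can ignore the case of the text that was actually matched.
replacements = {
    "artificial intelligence": "artificial_intelligence",
    "deep learning": "deep_learning",
}

# Sort longest phrases first so a longer phrase wins over a shorter one that
# starts the same way. re.escape() protects any regex metacharacters.
pattern = re.compile(
    "|".join(re.escape(key) for key in sorted(replacements, key=len, reverse=True)),
    flags=re.IGNORECASE,
)


def replace_phrases(text):
    """Replace every known phrase in a single pass over the text."""
    return pattern.sub(lambda match: replacements[match.group(0).lower()], text)


text = "Artificial intelligence is useful for us in every situation of deep learning."
result = replace_phrases(text)
print(result)
# artificial_intelligence is useful for us in every situation of deep_learning.
```

Because the text is scanned once instead of once per phrase, a replacement that was already inserted is never matched again by a later phrase. If partial-word matches are a concern, wrapping the alternation in `\b(?:...)\b` is one option.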

  2. Since the case differs between your list entries and your string, you could use the re.sub() function with the re.IGNORECASE flag to obtain what you want:

    import re
    
    list_a = ['Artificial intelligence', 'Deep learning']
    list_b = ['artificial_intelligence', 'deep_learning']
    text = 'Artificial intelligence is useful for us in every situation of deep learning.'
    
    for from_, to in zip(list_a, list_b):
        text = re.sub(from_, to, text, flags=re.IGNORECASE)
    
    print(text)
    # artificial_intelligence is useful for us in every situation of deep_learning.
    

    Note the use of the zip() function, which lets you iterate over the two lists at the same time.


    Also note that Christian is right: a dict would be more suitable for your substitution data. The previous code would then look like this, with exactly the same result:

    import re
    
    subs = {'Artificial intelligence': 'artificial_intelligence',
            'Deep learning': 'deep_learning'}
    text = 'Artificial intelligence is useful for us in every situation of deep learning.'
    
    for from_, to in subs.items():
        text = re.sub(from_, to, text, flags=re.IGNORECASE)
    
    print(text)
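
Since the question mentions a text file of about 5 MB, here is a minimal sketch of how the same loop might be applied to a file. The helper name `replace_in_file` and the file names are hypothetical, and UTF-8 encoding is an assumption. It reads the whole file into memory, which should be fine at that size, and adds re.escape() in case a phrase contains regex metacharacters.

```python
import re
from pathlib import Path

subs = {'Artificial intelligence': 'artificial_intelligence',
        'Deep learning': 'deep_learning'}


def replace_in_file(src, dst, subs):
    """Read src, apply every case-insensitive substitution, write to dst."""
    # Assumption: the file is UTF-8 encoded and small enough to load at once.
    text = Path(src).read_text(encoding='utf-8')
    for from_, to in subs.items():
        # re.escape() makes the phrase match literally. The replacement is
        # used as-is here, which is safe as long as it has no backslashes.
        text = re.sub(re.escape(from_), to, text, flags=re.IGNORECASE)
    Path(dst).write_text(text, encoding='utf-8')


# Hypothetical file names:
# replace_in_file('input.txt', 'output.txt', subs)
```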
    