
Imagine that I have some chat log protocol. It could look like this:

MSG sender|reciever2: Hello its meCRLF
MSG bob|anna: Hello annaCRLF
MSG bob|anna: How are youCRLF
MSG anna|bob: Im fine, you?CRLF
MSG bob|anna: Same, wanna hang out some time?CRLF
MSG anna|bob: YesCRLF
MSG bob|peter: hey im asking anna to hang out lolCRLF
MSG anna|bob: for sureCRLF
MSG anna|bob: maybe in a few weeks?CRLF

I only want the chat between Anna and Bob, and I want each sender's name to appear only once, until the other chat partner starts talking.

What I've achieved so far is this sed script:

s/^MSG\s+(anna|bob)\|(anna|bob):\s{1}(.+)CRLF$/\1: "\3"/g
t end

/^.*/d

:end

This creates:

bob: "Hello anna"
bob: "How are you"
anna: "Im fine, you?"
bob: "Same, wanna hang out some time?"
anna: "Yes"
anna: "for sure"
anna: "maybe in a few weeks?"

But I want something similar to:

bob: 
  Hello anna
  How are you
anna:
  Im fine, you?
bob: 
  Same, wanna hang out some time?
anna: 
  Yes
  for sure
  maybe in a few weeks?

So, how can I delete all the bobs after the first one, until the next anna comes?
Note: this is something I have to use sed for. It has to run on Ubuntu Linux systems with sed (GNU sed) 4.7, packaged by Debian.

3 Answers


  1. The following script:

    cat <<EOF |
    MSG sender|reciever2: Hello its meCRLF
    MSG bob|anna: Hello annaCRLF
    MSG bob|anna: How are youCRLF
    MSG anna|bob: Im fine, you?CRLF
    MSG bob|anna: Same, wanna hang out some time?CRLF
    MSG anna|bob: YesCRLF
    MSG bob|peter: hey im asking anna to hang out lolCRLF
    MSG anna|bob: for sureCRLF
    MSG anna|bob: maybe in a few weeks?CRLF
    EOF
    sed '
      # preprocess - drop the lines we are not interested in
      /MSG \(\(anna\)|bob\|\(bob\)|anna\): \(.*\)CRLF/!d
      s//\2\3:\4/
    
      # Check whether this line has the same sender as the previous one.
      G   # Append the previous name from the hold space.
      /^\([^:]*\):\(.*\)\n\1$/{   # The names match?
        s//  \2/p                 # Print only the message.
        d
      }
    
      h    # Put the whole line into hold space. For later.
      s/^\([^:]*\):\([^\n]*\).*/\1/   # Extract only name from the line.
      x    # Put the name in hold space, and grab the full line from hold space.
      s//\1:\n  \2/     # Print the name with the message.
    '
    

    outputs:

    bob:
      Hello anna
      How are you
    anna:
      Im fine, you?
    bob:
      Same, wanna hang out some time?
    anna:
      Yes
      for sure
      maybe in a few weeks?
    
  2. This might work for you (GNU sed):

    sed -E '/^MSG ((anna)\|bob|(bob)\|anna): (.*)CRLF/{s//\2\3:\4/;H};$!d
           x;s/(\n.*:).*(\1.*)*/\1\n&/mg;s/\n+.*:(\S)/\n  \1/mg;s/.//' file
    

    Turn on extended regexp -E.

    Gather up the anna and bob conversations in the hold space.

    At the end of the file, swap to the hold space, prepend the speaker's name to each run of consecutive lines from that speaker, remove the now-redundant names, and indent each line of conversation under the prepended name.

    Finally remove the first newline artefact.
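
    The gather-then-rewrite pattern can be seen in isolation in the following minimal sketch. It uses made-up input lines (a, b, c) rather than the chat log, and the `joined` variable exists only for illustration. It assumes GNU sed, where `.` also matches the leading newline.

    ```shell
    # H appends every line to the hold space; $!d discards all but the last cycle.
    # On the last line, x brings the collected text ("\na\nb\nc") into the pattern space,
    # s/.// drops the leading newline artefact, and the final s joins the lines.
    joined=$(printf '%s\n' a b c | sed 'H;$!d;x;s/.//;s/\n/,/g')
    echo "$joined"   # prints: a,b,c
    ```

    The real script does the same collecting and trimming, but replaces the simple join with the two substitutions that group the lines by name.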


    An alternative solution (similar to KamilCuk):

    sed -E '/^MSG ((anna)\|bob|(bob)\|anna): (.*)CRLF/!d;s//\2\3:\4/;G
            /^([^:]*:)(.*)\n\1$/{s//  \2/p;d};h;s/:.*/:/p;x;s/[^:]*:/  /;P;d' file
    
  3. This uses POSIX sed syntax.

    sed '
    /^MSG \(anna\)|bob:/!{
      /^MSG \(bob\)|anna:/!d
    }
    s//\1:\
     /;s/CRLF$//;t t
    :t
    H;x;s/^\([^:]*:\n\).*\1//;t
    g' file
    

    It appends the current record to the previous one in the hold space, swaps them, removes duplicate names (along with the previous record), or else reverts the pattern space back to the original current record.
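
    To make the pairing visible, here is a small sketch that runs only the `H;x` step on three made-up records and dumps the pattern space with `l`, which shows embedded newlines as `\n` and marks the end of the pattern space with `$`. The sample records and the `pairs` variable are assumptions for illustration, not part of the script above.

    ```shell
    # H appends the current record to the hold space, which still holds the previous one.
    # x swaps the two spaces, so the pattern space becomes "previous\ncurrent"
    # and the hold space keeps only the current record for the next cycle.
    pairs=$(printf '%s\n' 'bob:x' 'bob:y' 'anna:z' | sed -n 'H;x;l')
    printf '%s\n' "$pairs"
    ```

    With GNU sed the second line of output is `bob:x\nbob:y$`, the previous record followed by the current one. That is the shape the duplicate-removing substitution relies on.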

    Here’s a more efficient version:

    sed '
    t
    /^MSG \(anna\)|bob:/!{
      /^MSG \(bob\)|anna:/!d
    }
    s//\1:\
     /;s/CRLF$//
    H;s/:.*/:/
    x;s/^\([^:]*:\n\)\1//p;D' file
    

    This avoids .* in the duplicate-detecting regexp by keeping only the previous name in the hold space, rather than the entire previous record.
