I am on Ubuntu, in a Bash shell, trying to use grep to find all occurrences of the substring engineBreakdown() inside a log file with a .tra extension, let's say my_log_16.tra, and save the results to a file, let's say results_16.txt.

So I run

cat /path/to/my_log_16.tra | grep "engineBreakdown()" > results_16.txt

When I run less results_16.txt, I see that some lines containing the substring were saved, but not all the lines I expected.

In fact, when I manually search for engineBreakdown() in my_log_16.tra, I find other lines containing the substring that were not saved into results_16.txt. So it seems that my command only saves the first occurrences of the substring.

I think grep may be failing because my_log_16.tra is a very large file (about 100 MB).

If this is the cause, is there a more reliable way to find all occurrences of a substring in a very big file?

version and alias of grep

$ grep --version
grep (GNU grep) 2.25
Copyright (C) 2016 Free Software Foundation, Inc.     
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.     
This is free software: you are free to change and redistribute it.     
There is NO WARRANTY, to the extent permitted by law.         

Written by Mike Haertel and others, see <http://git.sv.gnu.org/cgit/grep.git/tree/AUTHORS>.
$ type -a grep
grep is aliased to `grep --color=auto'
grep is /bin/grep

Example of lines from my_log_16.tra

lines correctly detected and saved into results_16.txt

[I 2022-10-16 07:26:35.449 Rservice:75] engineBreakdown()
[I 2022-10-16 07:26:35.846 Rservice:75] engineBreakdown()
[I 2022-10-16 07:26:35.848 Rservice:75] engineBreakdown()

a piece of the file where the substring appears, but it is not saved into results_16.txt

[I 2022-10-16 11:32:48.039 web:2064] 200 GET /static/ui-src/default/img/Customer.png?v=0.9702853857687699 (127.0.0.1) 10.49ms
[I 2022-10-16 11:32:49.778 Rservice:75] engineBreakdown()
[I 2022-10-16 11:32:50.122 websocketclient:62] Connection : url::ws://localhost:3333/ws
[I 2022-10-16 11:32:50.125 Rservice:75] engineBreakdown()
[I 2022-10-16 11:32:50.128 Rservice:75] engineBreakdown()
[I 2022-10-16 11:32:55.123 websocketclient:62] Connection : url::ws://localhost:3333/ws
[I 2022-10-16 11:32:55.128 Rservice:75] engineBreakdown()
[I 2022-10-16 11:32:55.134 Rservice:75] engineBreakdown()

another piece of the file where the substring appears, but it is not saved into results_16.txt

[I 2022-10-17 04:00:35.127 Rservice:75] engineBreakdown()
[I 2022-10-17 04:00:35.138 Rservice:75] engineBreakdown()
[I 2022-10-17 04:00:39.206 websocketclient:62] Connection : url::ws://127.0.0.1:9999/request
[I 2022-10-17 04:00:39.220 websocketclient:62] Connection : url::ws://127.0.0.1:9999/auxiliary
[I 2022-10-17 04:00:39.228 channels:75] _on_connection_error, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:39.233 channels:82] _on_connection_close, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:39.237 channels:75] _on_connection_error, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:39.243 channels:82] _on_connection_close, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:40.122 websocketclient:62] Connection : url::ws://localhost:3333/ws
[I 2022-10-17 04:00:40.128 Rservice:75] engineBreakdown()
[I 2022-10-17 04:00:40.133 Rservice:75] engineBreakdown()
[I 2022-10-17 04:00:44.206 websocketclient:62] Connection : url::ws://127.0.0.1:9999/request
[I 2022-10-17 04:00:44.221 websocketclient:62] Connection : url::ws://127.0.0.1:9999/auxiliary
[I 2022-10-17 04:00:44.227 channels:75] _on_connection_error, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:44.232 channels:82] _on_connection_close, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:44.234 channels:75] _on_connection_error, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:44.237 channels:82] _on_connection_close, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:45.122 websocketclient:62] Connection : url::ws://localhost:3333/ws
[I 2022-10-17 04:00:45.126 Rservice:75] engineBreakdown()
[I 2022-10-17 04:00:45.128 Rservice:75] engineBreakdown()

update 1

I also tried

grep "engineBreakdown()" /path/to/my_log_16.tra > results_16.txt

but the result is the same.

update 2

As suggested, double quotes might not be enough to handle the parentheses properly, so I tried removing the parentheses from the search string, and also switching from double quotes to single quotes:

grep "engineBreakdown" /path/to/my_log_16.tra > results_16.txt

grep 'engineBreakdown' /path/to/my_log_16.tra > results_16.txt

but the result is the same.

2 Answers

  1. You can check whether this awk approach helps.

    Data

    $ cat file
    engineBreakdown()
    engineBreakdown() engineBreakdown() engineBreakdown() engineBreakdown()
    engineBreakdown()
    
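    $ awk -v var="engineBreakdown()" '
        # var is used as a regex; the empty group "()" adds nothing,
        # so this effectively matches engineBreakdown
        $0~var{
          printf "%d", NR
          for(i=1;i<=NF;i++){
            if($i~var){x++}    # count whitespace-separated fields that match
          }
          print " # matches: " x
          x=0
      }' file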
    1 # matches: 1
    2 # matches: 4
    3 # matches: 1
    

    To just print the matching lines (like grep), without counting the occurrences, simply do

    $ awk -v var="engineBreakdown()" '$0~var{ print }' file
    engineBreakdown()
    engineBreakdown() engineBreakdown() engineBreakdown() engineBreakdown()
    engineBreakdown()
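
    Note that the commands above treat var as a regular expression and count matching fields rather than occurrences. If a literal substring count is wanted instead, a minimal sketch using awk's index() and substr() could look like the following. The sample file and the variable names (file, counts, s, rest) are made up for illustration.

    ```shell
    # Hypothetical sample data, similar to the file shown above
    file=$(mktemp)
    printf 'engineBreakdown()\n' > "$file"
    printf 'engineBreakdown() engineBreakdown() engineBreakdown() engineBreakdown()\n' >> "$file"
    printf 'no match here\n' >> "$file"
    printf 'engineBreakdown()\n' >> "$file"

    # Count literal occurrences of s on every line, without any regex matching
    counts=$(awk -v s='engineBreakdown()' '
        {
            n = 0
            rest = $0
            # index() returns the 1-based position of s in rest, or 0 if absent
            while ((p = index(rest, s)) > 0) {
                n++
                rest = substr(rest, p + length(s))
            }
            if (n > 0) print NR " # matches: " n
        }' "$file")

    # One output line per input line that contains the substring
    echo "$counts"
    ```

    Because index() compares plain strings, the parentheses need no escaping. Keep in mind that awk -v still processes backslash escapes in the assigned value, which does not matter for this particular search string.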
    
  2. Your grep command seems to be behaving oddly (perhaps because you are using an old version with a bug that was fixed later).

    Here’s an alternative with sed:

    sed -n '/engineBreakdown()/p' /path/to/my_log_16.tra > op.txt
    

    I’d recommend updating your grep installation. ripgrep is another alternative.
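
    Another possible explanation, which is only an assumption because the question does not confirm it: the log may contain a few non-text bytes (for example a NUL byte). GNU grep then treats the file as binary and stops printing matching lines once it reaches that data. With a grep as old as 2.25, the last line of results_16.txt would then typically be a "Binary file ... matches" message. A minimal sketch of a workaround, using a made-up sample log and the hypothetical variable names log, results and matches:

    ```shell
    # Hypothetical sample log: three matching lines, with a NUL byte before the last two
    log=$(mktemp)
    results=$(mktemp)
    printf '[I 2022-10-16 07:26:35.449 Rservice:75] engineBreakdown()\n' > "$log"
    printf 'corrupt\000record\n' >> "$log"
    printf '[I 2022-10-16 11:32:49.778 Rservice:75] engineBreakdown()\n' >> "$log"
    printf '[I 2022-10-17 04:00:35.127 Rservice:75] engineBreakdown()\n' >> "$log"

    # -a (--text) makes grep process the file as text even if it contains binary bytes;
    # -F matches the pattern as a fixed string, so "()" is taken literally
    grep -a -F 'engineBreakdown()' "$log" > "$results"

    # -c prints the number of matching lines instead of the lines themselves
    matches=$(grep -a -c -F 'engineBreakdown()' "$log")
    echo "$matches"
    ```

    On the real file, the same options would be used as grep -a -F 'engineBreakdown()' /path/to/my_log_16.tra > results_16.txt. If the binary-data assumption does not hold, -a changes nothing and the cause lies elsewhere.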
