I am on an Ubuntu OS, in a Bash shell, trying to use grep to find all occurrences of the substring engineBreakdown() inside a log file with a .tra extension, let’s say my_log_16.tra, and save the results to a file, let’s say results_16.txt.
So I run
cat /path/to/my_log_16.tra | grep "engineBreakdown()" > results_16.txt
and when I run less results_16.txt I see that some lines containing the substring have been saved, but not all the lines I expected.
In fact, when I manually search for occurrences of engineBreakdown() in my_log_16.tra, I see other lines containing the substring that are not saved into results_16.txt. So it seems that my command only saves the first occurrences of the substring.
I think grep may fail because my_log_16.tra is a very large file (about 100 MB).
If this is the cause, is there a more reliable way to find all occurrences of a substring in a very big file?
version and alias of grep
$ grep --version
grep (GNU grep) 2.25
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Mike Haertel and others, see <http://git.sv.gnu.org/cgit/grep.git/tree/AUTHORS>.
$ type -a grep
grep is aliased to `grep --color=auto'
grep is /bin/grep
Example of lines from my_log_16.tra
lines correctly detected and saved into results_16.txt
[I 2022-10-16 07:26:35.449 Rservice:75] engineBreakdown()
[I 2022-10-16 07:26:35.846 Rservice:75] engineBreakdown()
[I 2022-10-16 07:26:35.848 Rservice:75] engineBreakdown()
a piece of the file where the substring appears, but it is not saved into results_16.txt
[I 2022-10-16 11:32:48.039 web:2064] 200 GET /static/ui-src/default/img/Customer.png?v=0.9702853857687699 (127.0.0.1) 10.49ms
[I 2022-10-16 11:32:49.778 Rservice:75] engineBreakdown()
[I 2022-10-16 11:32:50.122 websocketclient:62] Connection : url::ws://localhost:3333/ws
[I 2022-10-16 11:32:50.125 Rservice:75] engineBreakdown()
[I 2022-10-16 11:32:50.128 Rservice:75] engineBreakdown()
[I 2022-10-16 11:32:55.123 websocketclient:62] Connection : url::ws://localhost:3333/ws
[I 2022-10-16 11:32:55.128 Rservice:75] engineBreakdown()
[I 2022-10-16 11:32:55.134 Rservice:75] engineBreakdown()
another piece of the file where the substring appears, but it is not saved into results_16.txt
[I 2022-10-17 04:00:35.127 Rservice:75] engineBreakdown()
[I 2022-10-17 04:00:35.138 Rservice:75] engineBreakdown()
[I 2022-10-17 04:00:39.206 websocketclient:62] Connection : url::ws://127.0.0.1:9999/request
[I 2022-10-17 04:00:39.220 websocketclient:62] Connection : url::ws://127.0.0.1:9999/auxiliary
[I 2022-10-17 04:00:39.228 channels:75] _on_connection_error, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:39.233 channels:82] _on_connection_close, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:39.237 channels:75] _on_connection_error, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:39.243 channels:82] _on_connection_close, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:40.122 websocketclient:62] Connection : url::ws://localhost:3333/ws
[I 2022-10-17 04:00:40.128 Rservice:75] engineBreakdown()
[I 2022-10-17 04:00:40.133 Rservice:75] engineBreakdown()
[I 2022-10-17 04:00:44.206 websocketclient:62] Connection : url::ws://127.0.0.1:9999/request
[I 2022-10-17 04:00:44.221 websocketclient:62] Connection : url::ws://127.0.0.1:9999/auxiliary
[I 2022-10-17 04:00:44.227 channels:75] _on_connection_error, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:44.232 channels:82] _on_connection_close, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:44.234 channels:75] _on_connection_error, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:44.237 channels:82] _on_connection_close, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:45.122 websocketclient:62] Connection : url::ws://localhost:3333/ws
[I 2022-10-17 04:00:45.126 Rservice:75] engineBreakdown()
[I 2022-10-17 04:00:45.128 Rservice:75] engineBreakdown()
update 1
I also tried with
grep "engineBreakdown()" /path/to/my_log_16.tra > results_16.txt
but the result is the same.
update 2
As suggested, double quotes might not be enough to handle the parentheses properly, so I removed the parentheses from the input substring and changed the double quotes to single ones
grep "engineBreakdown" /path/to/my_log_16.tra > results_16.txt
grep 'engineBreakdown' /path/to/my_log_16.tra > results_16.txt
but the result is the same.
2 Answers
You can check whether awk helps.
Data
To just print the lines (like grep), without substring detection, simply do:
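The awk command from this answer is not preserved here. As a minimal sketch of printing the matching lines with awk, assuming a literal match on engineBreakdown() is what is wanted, something like the following could be tried. The temporary sample file is hypothetical and only stands in for /path/to/my_log_16.tra so that the example runs on its own.

```shell
#!/usr/bin/env bash
# Sketch only: not the answer's original command, which is missing above.
# Hypothetical sample log; in practice point "$log" at /path/to/my_log_16.tra.
log=$(mktemp)
printf '%s\n' \
  '[I 2022-10-16 07:26:35.449 Rservice:75] engineBreakdown()' \
  '[I 2022-10-16 11:32:50.122 websocketclient:62] Connection : url::ws://localhost:3333/ws' \
  '[I 2022-10-16 11:32:50.125 Rservice:75] engineBreakdown()' > "$log"

# index() performs a literal substring search, so the parentheses
# are not interpreted as regular-expression syntax.
awk -v s='engineBreakdown()' 'index($0, s)' "$log" > results_16.txt

# Print the number of matching lines as a quick sanity check.
awk -v s='engineBreakdown()' 'index($0, s) { n++ } END { print n + 0 }' "$log"

rm -f "$log"
```

Comparing the printed count with the number of lines in results_16.txt is a cheap way to see whether awk and grep agree on the same file.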
It seems your grep command is behaving oddly (perhaps because you are using an old version that has some bug that was fixed later). Here’s an alternative with sed:
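The sed command from this answer is not preserved here either. A minimal sketch of the usual way to emulate grep with sed, assuming the goal is simply to print every line containing engineBreakdown(), might look like this. The sample file is hypothetical and stands in for /path/to/my_log_16.tra.

```shell
#!/usr/bin/env bash
# Sketch only: not the answer's original command, which is missing above.
# Hypothetical sample log; in practice point "$log" at /path/to/my_log_16.tra.
log=$(mktemp)
printf '%s\n' \
  '[I 2022-10-17 04:00:35.127 Rservice:75] engineBreakdown()' \
  '[I 2022-10-17 04:00:39.228 channels:75] _on_connection_error, host=127.0.0.1, port=9999' \
  '[I 2022-10-17 04:00:40.128 Rservice:75] engineBreakdown()' > "$log"

# -n suppresses sed's default output; the p command prints only matching lines.
# sed uses basic regular expressions by default, where unescaped ( and )
# are literal characters, so the pattern matches the text "engineBreakdown()".
sed -n '/engineBreakdown()/p' "$log" > results_16.txt

rm -f "$log"
```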
I’d recommend updating your grep installation. ripgrep is another alternative.
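Before switching tools, it may be worth ruling out one other possibility, which is an assumption and not something confirmed in this thread: if the log contains NUL bytes or other non-text data, GNU grep by default treats the file as binary and suppresses output from that point on, which would look like "only the first occurrences are saved". In that case results_16.txt may also end with a "Binary file ... matches" line. The sketch below counts NUL bytes and then searches with -a (--text) and -F (fixed string), both standard GNU grep options. The sample file with an embedded NUL byte is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch under the assumption that the log contains binary data (unconfirmed).
# Hypothetical sample log with one NUL byte; in practice use /path/to/my_log_16.tra.
log=$(mktemp)
{
  printf '%s\n' '[I 2022-10-16 07:26:35.449 Rservice:75] engineBreakdown()'
  printf 'corrupt\0record\n'
  printf '%s\n' '[I 2022-10-16 11:32:49.778 Rservice:75] engineBreakdown()'
} > "$log"

# Delete every byte except NUL and count what is left.
# A non-zero count means grep may classify the file as binary.
nul_count=$(tr -cd '\000' < "$log" | wc -c)
echo "NUL bytes: $nul_count"

# -a processes the file as text even if it looks binary;
# -F treats the pattern as a fixed string rather than a regular expression.
grep -a -F 'engineBreakdown()' "$log" > results_16.txt

rm -f "$log"
```

If the NUL count on the real file is zero and the results do not change with -a, this explanation does not apply and the alternatives above remain the next thing to try.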