
The log file is:

Oct 01 [time] a
Oct 02 [time] b
Oct 03 [time] c
.
.
.
Oct 04 [time] d
Oct 05 [time] e
Oct 06 [time] f
.
.
.
Oct 28 [time] g
Oct 29 [time] h
Oct 30 [time] i

and it is really big (millions of lines).

I want to get the logs between Oct 01 and Oct 30.

I can do it with gawk

gawk 'some conditions' filter.log

and it works correctly.

But it returns millions of log lines, which is not good,

because I want to get it part by part,

something like this:

gawk 'some conditions' -limit 100 -offset 200 filter.log

so that every time I change the limit and offset,

I get another part of the output.

How can I do that?

2 Answers


  1. awk solution
    I would harness GNU AWK for this task in the following way. Let the file.txt content be

    1
    2
    3
    4
    5
    6
    7
    8
    9
    

    and say I want to print the lines whose 1st field is odd, in the part starting at the 3rd line and ending at the 7th line (inclusive); then I can use GNU AWK the following way:

    awk 'NR<3{next}$1%2{print}NR>=7{exit}' file.txt
    

    which will give

    3
    5
    7
    

    Explanation: NR is a built-in variable which holds the number of the current row. When processing rows before the 3rd, just go to the next row without doing anything; when the remainder of dividing the 1st field by 2 is non-zero, print the line; when processing the 7th or a later row, just exit. Using exit might give a noticeable boost in performance if you are processing a relatively small part of the file. Observe the order of the 3 pattern-action pairs in the code above: next is first, then whatever you want to do, and exit is last. If you want to know more about NR, read 8 Powerful Awk Built-in Variables – FS, OFS, RS, ORS, NR, NF, FILENAME, FNR

    (tested in GNU Awk 5.0.1)
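
    The same idea can be parameterized so the window bounds are not hard-coded. As a sketch (offset and limit here are just illustrative variable names, not awk options), skip the first offset rows and stop after limit further rows:

    awk -v offset=2 -v limit=5 '
        NR <= offset       { next }    # skip rows before the window
        $1 % 2             { print }   # the actual filter: odd 1st field
        NR >= offset+limit { exit }    # stop once the window is exhausted
    ' file.txt

    With these values it processes rows 3 through 7 and again prints 3, 5 and 7.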

    linux solution
    If you prefer working with offset and limit, then you might exploit a tail-head combination, e.g. for the above file.txt:

    tail -n +5 file.txt | head -3
    

    which gives the output

    5
    6
    7
    

    Observe that the offset goes first, with a + before its value (tail -n +5 starts output at line 5), and then the limit, with a - before its value (head -3 keeps the first 3 lines of that).
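
    Applied to the question, the same pipeline could be combined with the gawk filter; 'some conditions' below still stands in for the actual date test and the numbers are only examples:

    gawk 'some conditions' filter.log | tail -n +201 | head -n 100

    This skips the first 200 matching lines and keeps the next 100. Note that, unlike the exit trick above, gawk still reads and filters the whole file on every call.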

  2. Using OP’s pseudo code mixed with some actual awk code:

    gawk -v limit=100 -v offset=200 '
    some conditions { matches++                                # track number of matches
                      if (matches >= offset && limit > 0) {
                         print                                 # print current line
                         limit--                               # decrement limit
                      }
                      if (limit == 0) exit                     # optional: abort processing if we found "limit" number of matches
                    }
    ' filter.log
    
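    As a concrete sketch, if the condition were simply "the line starts with Oct" (a hypothetical stand-in for the real date test), the same structure would read:

    gawk -v limit=100 -v offset=200 '
    /^Oct / { matches++                                # hypothetical condition
              if (matches >= offset && limit > 0) {
                 print
                 limit--
              }
              if (limit == 0) exit
            }
    ' filter.log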