The output I have is the following:
T 2020/03/05 16:06:41.565817 193.126.13.199:80 -> 10.8.0.4:55639 [AP] HTTP/1.1 200 OK..Date: Thu, 05 Mar 2020 16:06:41 GMT..Server: Apache/2.2.3 (CentOS)..Expires: Thu, 19 Nov 1981 08:52:00 GMT..Cache-Control: no-store, no-cache,
T 2020/03/05 16:06:46.727199 10.8.0.4:55642 -> 193.126.13.199:80 [AP] GET / HTTP/1.1..Host: www.radionova.fm..User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:72.0) Gecko/xml;q=0.9,image/webp,*/*;q=0.8..Accept-Langu
T 2020/03/05 16:06:47.174078 193.126.13.199:80 -> 10.8.0.4:55642 [A] HTTP/1.1 200 OK..Date: Thu, 05 Mar 2020 16:06:46 GMT..Server: Apache/2.2.3 (CentOS)..Expires: Thu, 19 Nov 1981 08:52:00 GMT..Cache-Control: no-store, no-cache
How can I write a regex pattern that matches only the [AP] rows?
Something like:
T 2020/03/05 16:06:46.727199 10.8.0.4:55642 -> 193.126.13.199:80 [AP] GET / HTTP/1.1..Host: www.radionova.fm..User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:72.0)
So the first group would be: 2020/03/05
Second group: 16:06:46.727199
Third group: 10.8.0.4:55642
Fourth group: GET / HTTP/1.1..Host: www.radionova.fm..User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:72.0)
I have the following Python regex:
pattern = r'''T\s([^ ]+)\s([^ ]+)\s([^ ]+).*?[.]{2,}(.*?)[.]{2,}'''
It is not working the way I want.
2 Answers
You could add two extra parts to the pattern, each matching a whitespace character followed by non-whitespace characters (to consume the -> and the destination address), and then match the [AP] part literally.
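A minimal sketch of that idea follows. It is an illustration, not necessarily the exact pattern from the original answer. The sample lines are shortened from the capture in the question, the names LOG, AP_ROW and parse_ap_rows are hypothetical, and the fourth group is assumed to run to the end of the line because the question does not say precisely where it should stop.

```python
import re

# Sample lines shortened from the capture shown in the question.
LOG = """\
T 2020/03/05 16:06:41.565817 193.126.13.199:80 -> 10.8.0.4:55639 [AP] HTTP/1.1 200 OK..Date: Thu, 05 Mar 2020 16:06:41 GMT
T 2020/03/05 16:06:46.727199 10.8.0.4:55642 -> 193.126.13.199:80 [AP] GET / HTTP/1.1..Host: www.radionova.fm
T 2020/03/05 16:06:47.174078 193.126.13.199:80 -> 10.8.0.4:55642 [A] HTTP/1.1 200 OK..Date: Thu, 05 Mar 2020 16:06:46 GMT
"""

# Groups: 1 = date, 2 = time, 3 = source address, 4 = payload.
# "\s->\s\S+" skips the arrow and the destination address, and
# "\[AP\]" matches the flag literally (the brackets must be escaped).
# Assumption: group 4 takes everything up to the end of the line.
AP_ROW = re.compile(
    r"^T\s(\S+)\s(\S+)\s(\S+)\s->\s\S+\s\[AP\]\s(.*)$",
    re.MULTILINE,
)


def parse_ap_rows(text):
    """Return one (date, time, source, payload) tuple per [AP] row."""
    return [m.groups() for m in AP_ROW.finditer(text)]


for row in parse_ap_rows(LOG):
    print(row)
```

The [A] row is skipped because the escaped \[AP\] has to match exactly. If the payload should stop earlier, for example at the first closing parenthesis, the last group would need to be tightened accordingly.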
Why not the obvious in operator?
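A rough sketch of what that could look like, assuming the fields are separated by single spaces as in the sample above. The names LINES, ap_rows and split_row are hypothetical, and the sample lines are shortened from the question. Note that the in test only selects the rows. Splitting them into the four requested pieces needs a second step, done here with str.split.

```python
# Sample lines shortened from the capture shown in the question.
LINES = [
    "T 2020/03/05 16:06:46.727199 10.8.0.4:55642 -> 193.126.13.199:80 [AP] GET / HTTP/1.1..Host: www.radionova.fm",
    "T 2020/03/05 16:06:47.174078 193.126.13.199:80 -> 10.8.0.4:55642 [A] HTTP/1.1 200 OK..Date: Thu, 05 Mar 2020 16:06:46 GMT",
]


def ap_rows(lines):
    """Keep only the lines that carry the [AP] marker."""
    return [line for line in lines if " [AP] " in line]


def split_row(line):
    """Split one row into (date, time, source, payload).

    Assumes single-space separators; maxsplit=7 keeps the payload,
    which itself contains spaces, in one piece.
    """
    _, date, time, src, _, _, _, payload = line.split(" ", 7)
    return date, time, src, payload


for line in ap_rows(LINES):
    print(split_row(line))
```

This avoids a regex entirely, at the cost of relying on the exact field layout of the capture.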