I try to count the access on a specific URL which begins every time with "shop/product?traffic=ads" with AWK, but I failed.
The following code gives me a counter how often an IP address has accessed these URL:
awk -F'[ "]+' '$7 == "/shop/product?traffic=ads" { ipcount[$1]++ }
END { for (i in ipcount) {
printf "%15s - %dn", i, ipcount[i] } }' /var/www/vhosts/domain.com/logs/access_ssl_log
An example for the access_log (input-file) is here:
66.249.68.xx- - [19/Dec/2022:09:14:15 +0100] "GET /shop/other-product/1.0" 404 16996 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.5304.xxx Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
109.42.242.xxx - - [19/Dec/2022:09:14:55 +0100] "GET /shop/product?traffic=ads&gclid=Cj0KCQiAtICdBhCLARIsALUBFcFMmvFbA_1EyTTMRDp9IWhDXFA_HCeuEsIBXl886PoaAapen2KdussaAniSEALw_wcB HTTP/1.0" 200 30589 "https://www.google.com/" "Mozilla/5.0 (Linux; Android 11; SM-A515F) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Mobile Safari/537.36"
80.187.75.xx - - [20/Dec/2022:06:40:12 +0100] "GET /shop/product HTTP/1.0" 200 10821 "https://www.example.com/shop/product?traffic=ads&gclid=EAIaIQobChMIg_Ks5vWF_AIVAgGLCh3k_gBKEAAYASAAEgKBOfD_BwE&dt=1671461107791" "Mozilla/5.0 (iPhone; CPU iPhone OS 16_0_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.0 Mobile/15E148 Safari/604.1"
The "gclid" and and the "dt"(session cookie) are dynamic.
I try to play with ^ after ads, before /shop, but there will be no results.
I want for example the following output:
6 Clicks from 109.42.242.xxx to /shop/product?traffic=ads&gclid=Cj0KCQiAtICdBhCLARIsALUBFcFMmvFbA_1EyTTMRDp9IWhDXFA_HCeuEsIBXl886PoaAapen2KdussaAniSEALw_wcB
1 Clicks from 80.187.75.xx to https://www.example.com/shop/product?traffic=ads&gclid=EAIaIQobChMIg_Ks5vWF_AIVAgGLCh3k_gBKEAAYASAAEgKBOfD_BwE&dt=1671461107791"
2
Answers
You can check if the string occurs in field 7 using index(), and then store the values of field 1 and field 7 with a space in between as the key, to retrieve both values in the END block by splitting on a space again.
Test data
Output
With your shown samples please try following
awk
code. Usingmatch
function to match regex/shop/product?traffic=adsS+
(where escaped/
to match literal/
) and if match is found then creating an array value with index of $1 FS and matched value. In theEND
block of this program printing the values as per requirement.