I am using a cPanel account and have an Apache 2.4 access log that stores its logs like:

66.249.93.30 - - [04/May/2018:21:26:39 +0200] "GET / HTTP/1.1" 302 207 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko; Google Page Speed Insights) Chrome/41.0.2272.118 Safari/537.36"
66.249.93.30 - - [05/May/2018:10:26:39 +0200] "GET / HTTP/1.1" 302 207 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko; Google Page Speed Insights) Chrome/41.0.2272.118 Safari/537.36"

The timestamp is in the format produced by date "+%d/%b/%Y:%H:%M:%S" (abbreviated month name, 24-hour clock), followed by the UTC offset.

Using a bash script I would like to extract just the lines that were logged in the last hour, for example:

Full Log file:

66.249.93.30 - - [04/May/2018:21:26:39 +0200] First Line
66.249.93.30 - - [05/May/2018:11:00:21 +0200] Second Line
66.249.93.30 - - [05/May/2018:11:15:39 +0200] Third Line
66.249.93.30 - - [05/May/2018:12:00:11 +0200] Fourth Line

Current Time: 05/May/2018:12:01:06

Logs from: 5th of May between the time interval of 11:01 – 12:01

Filtered Output:

66.249.93.30 - - [05/May/2018:11:15:39 +0200] Third Line
66.249.93.30 - - [05/May/2018:12:00:11 +0200] Fourth Line

I have tried awk and several other suggestions, but I can’t get any of them to work. Any help will be appreciated!

2 Answers


  1. Chosen as BEST ANSWER

    I was able to figure it out!

    I had to convert the log timestamp (e.g. 04/May/2018:21:26:39) to a UNIX timestamp. This can be done with date; note the lowercase %s (seconds since the epoch), not %S (the seconds field):

    date -d "YEAR-MONTH-DAY HR:M:S" "+%s"
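
    As a minimal sketch of that conversion, assuming bash and GNU date (the -d option): the helper apache_to_epoch below is hypothetical and not part of my script. It splits the Apache timestamp into its fields and also hands the logged UTC offset to date, so the result should not depend on the server's local time zone.

    ```shell
    #!/bin/bash
    # Hypothetical helper (illustration only): Apache timestamp -> UNIX epoch.
    # Assumes GNU date, which accepts "DAY MONTH YEAR HH:MM:SS +ZONE" for -d.
    apache_to_epoch() {
        local stamp=$1              # e.g. 05/May/2018:11:15:39
        local zone=${2:-}           # e.g. +0200 (optional)
        local day=${stamp%%/*}      # 05
        local rest=${stamp#*/}      # May/2018:11:15:39
        local mon=${rest%%/*}       # May
        rest=${rest#*/}             # 2018:11:15:39
        local year=${rest%%:*}      # 2018
        local time=${rest#*:}       # 11:15:39
        date -d "$day $mon $year $time $zone" "+%s"
    }

    apache_to_epoch "05/May/2018:11:15:39" "+0200"   # prints 1525511739
    ```

    If the offset is left out, date interprets the time in the local time zone, which is what the script further down does.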
    

    Then make another UNIX Timestamp that's 60 minutes behind

    date -d "60 min ago" "+%s"
    

    Then, in an if conditional, keep only the log entries whose UNIX timestamp is greater than ( -gt ) the timestamp from 60 minutes ago.
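
    Put together, the three steps could look roughly like this. It is only a sketch, assuming bash and GNU date; the function name filter_since and the fixed sample cutoff are made up for illustration. Unlike my script below, it prints the whole matching line rather than just the IP, and it starts one date process per log line, so it will be slow on large logs.

    ```shell
    #!/bin/bash
    # Hypothetical helper (illustration only): print the lines from stdin whose
    # timestamp is newer than the cutoff epoch given as $1. Assumes GNU date.
    filter_since() {
        local cutoff=$1 line stamp epoch
        local re='\[([^] ]+) ([+-][0-9]{4})\]'   # [05/May/2018:11:15:39 +0200]
        while IFS= read -r line; do
            [[ $line =~ $re ]] || continue       # no timestamp: skip the line
            stamp=${BASH_REMATCH[1]//\// }       # 05 May 2018:11:15:39
            stamp=${stamp/:/ }                   # 05 May 2018 11:15:39
            epoch=$(date -d "$stamp ${BASH_REMATCH[2]}" "+%s") || continue
            if [ "$epoch" -gt "$cutoff" ]; then
                printf '%s\n' "$line"
            fi
        done
    }

    # Cutoff taken from the question: one hour before 05/May/2018:12:01:06 +0200
    cutoff=$(date -d "2018-05-05 11:01:06 +0200" "+%s")
    printf '%s\n' \
        '66.249.93.30 - - [05/May/2018:11:00:21 +0200] Second Line' \
        '66.249.93.30 - - [05/May/2018:11:15:39 +0200] Third Line' |
        filter_since "$cutoff"
    # prints only the "Third Line" entry
    ```

    Against a live log, the cutoff would be $(date -d '60 min ago' "+%s") and the input would be the access log file.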

    With my current setup:

    cPanel + Apache 2.4

    Logging Format: /home/$USER/public_html_cron_logs/$DAY/$HOUR-$MINUTE-[GET|POST].log

    Like /home/$USER/public_html_cron_logs/05-05-2018/14-53-GET.log

    #!/bin/bash
    
    LOG_DIR="public_html_cron_logs"
    
    DAY=$(date +"%d-%m-%Y")
    HOUR=$(date "+%H-%M")
    
    # Create today's output directory if it does not exist yet
    CREATE_DIR="/home/$USER/$LOG_DIR/$DAY"
    mkdir -p "$CREATE_DIR"
    
    GET_LOG="$CREATE_DIR/$HOUR-GET.log"
    POST_LOG="$CREATE_DIR/$HOUR-POST.log"
    
    # Cutoff timestamp, computed once instead of on every line
    UNIX_TIMESTAMP_LAST_HOUR=$(date -d '60 min ago' "+%s")
    
    while read -r line; do
    
        # Field 4 looks like "[05/May/2018:11:15:39"; strip the leading "["
        DATE_LOG=$(echo "$line" | awk '{print $4}'); DATE_LOG=${DATE_LOG:1}
        MONTH_VERB=$(echo "$DATE_LOG" | awk -F '[/:]' '{print $2}')
    
        # Apache logs three-letter month abbreviations (Jan, Feb, ...)
        case "$MONTH_VERB" in
            Jan) MONTH=01 ;;
            Feb) MONTH=02 ;;
            Mar) MONTH=03 ;;
            Apr) MONTH=04 ;;
            May) MONTH=05 ;;
            Jun) MONTH=06 ;;
            Jul) MONTH=07 ;;
            Aug) MONTH=08 ;;
            Sep) MONTH=09 ;;
            Oct) MONTH=10 ;;
            Nov) MONTH=11 ;;
            Dec) MONTH=12 ;;
            *)   continue ;;   # skip lines without a recognizable date
        esac
    
        # Build "YYYY-MM-DD HH:MM:SS"; date -d reads it as server-local time
        # (the +0200 offset in the log is ignored here)
        UNIX_DATE=$(echo "$DATE_LOG" | awk -v AWK_MONTH="$MONTH" -F '[/:]' '{print $3"-"AWK_MONTH"-"$1" "$4":"$5":"$6}')
        UNIX_TIMESTAMP_LOG=$(date -d "$UNIX_DATE" "+%s")
    
        if [ "$UNIX_TIMESTAMP_LOG" -gt "$UNIX_TIMESTAMP_LAST_HOUR" ]; then
            # Record only the client IP (field 1), split by request method
            if [[ $line = *"GET"* ]]; then
                echo "$line" | awk '{print $1}' >> "$GET_LOG"
            else
                echo "$line" | awk '{print $1}' >> "$POST_LOG"
            fi
        fi
    
    done < ~/access-logs/ENTER_YOUR_DOMAIN_LOG_FILE_HERE
    

  2. $ date
    Sat, May 05, 2018 10:49:13 AM
    
    $ cat tst.awk
    {
        split($4,t,/[[ :/]/)
        mthNr = sprintf("%02d",(index("JanFebMarAprMayJunJulAugSepOctNovDec",t[3])+2)/3)
        curTime = t[4] mthNr t[2] t[5] t[6] t[7]
    }
    curTime >= minTime
    
    $ awk -v minTime=$(date -d '60 min ago' '+%Y%m%d%H%M%S') -f tst.awk file
    66.249.93.30 - - [05/May/2018:11:00:21 +0200] Second Line
    66.249.93.30 - - [05/May/2018:11:15:39 +0200] Third Line
    66.249.93.30 - - [05/May/2018:12:00:11 +0200] Fourth Line
    

    Using the time from your question to get the expected output in your question:

    $ awk -v minTime=$(date -d '2018/05/05 11:01:06' '+%Y%m%d%H%M%S') -f tst.awk file
    66.249.93.30 - - [05/May/2018:11:15:39 +0200] Third Line
    66.249.93.30 - - [05/May/2018:12:00:11 +0200] Fourth Line
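
    The month lookup in tst.awk works because every abbreviation is exactly three characters long: index() returns the 1-based position of the abbreviation inside the string of concatenated month names (1 for Jan, 4 for Feb, and so on), and (position + 2) / 3 turns that into the month number. A quick check, assuming any POSIX awk (the function name month_numbers is made up for this illustration):

    ```shell
    #!/bin/bash
    # Hypothetical demo (illustration only) of the month lookup used in tst.awk.
    month_numbers() {
        awk 'BEGIN {
            n = split("Jan May Dec", m, " ")
            for (i = 1; i <= n; i++)
                printf "%s -> %02d\n", m[i], (index("JanFebMarAprMayJunJulAugSepOctNovDec", m[i]) + 2) / 3
        }'
    }

    month_numbers
    # Jan -> 01
    # May -> 05
    # Dec -> 12
    ```

    Since curTime and minTime are both 14-digit strings in YYYYmmddHHMMSS order, comparing them with >= gives chronological order without converting anything to epoch seconds.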
    