Echo message (Request/Response) Monitor

Zahid_Haseeb
Moderator
Partner    VIP    Accredited

Environment

OS = Linux 6.x/7.x

SFHA = assume the latest, i.e. 6.x

Query:

We have a two-node local HA cluster, and a customized application runs on it without problems. The application sends a TCP echo response (800) in reply to a TCP echo request (810) from a machine on the LAN. I want to monitor this echo cycle under HA.

9 REPLIES

mikebounds
Level 6
Partner Accredited

Create a script, say called tcp-echo.sh, with something like:

#!/bin/bash
# $1 = action (start|stop|monitor), $2 = resource name
ARG=$2
LOCKFILE=/tmp/.${2}.DO_NOT_REMOVE
case "$1" in
start)
    # START APP
    touch "$LOCKFILE"
    ;;
stop)
    # Stop App
    rm -f "$LOCKFILE"
    ;;
monitor)
    if [ -e "$LOCKFILE" ]
    then
        # DO TCP echo test here, so that $? below holds its exit status
        if [ $? -eq 0 ]
        then
            exit 110    # online
        else
            rm -f "$LOCKFILE"
            exit 100    # offline
        fi
    else
        exit 100
    fi ;;
*)
    echo "Usage: $0 {start|stop|monitor} arg"
    exit 1
    ;;
esac

 

and then use an Application resource:

Application tcp-echo1 (
  StartProgram  = "/opt/VRTSvcs/bin/tcp-echo.sh start tcp-echo1"
  StopProgram  = "/opt/VRTSvcs/bin/tcp-echo.sh stop tcp-echo1"
  MonitorProgram = "/opt/VRTSvcs/bin/tcp-echo.sh monitor tcp-echo1"
)

 

The lock file ensures you don't get a concurrency violation: if you monitored only the TCP echo, every node in your cluster would report the resource as online.
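
For the "# DO TCP echo test" placeholder, one possible probe is sketched below using bash's built-in /dev/tcp redirection. It is only an illustration: the function name tcp_echo_ok, the 5-second read timeout, and the assumption that the peer echoes back one line identical to the payload are all hypothetical, since the real 800/810 message format is not given in this thread.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if HOST:PORT echoes PAYLOAD back as one line.
# Usage: tcp_echo_ok HOST PORT PAYLOAD
tcp_echo_ok() {
    # Run in a subshell so that fd 3 and any connection error stay contained.
    (
        # Open a read/write TCP connection on fd 3 (bash /dev/tcp feature).
        exec 3<>"/dev/tcp/$1/$2" || exit 1
        printf '%s\n' "$3" >&3
        # Wait at most 5 seconds for a single reply line.
        IFS= read -r -t 5 reply <&3 || exit 1
        [ "$reply" = "$3" ]
    ) 2>/dev/null
}
```

Called in the monitor branch in place of the placeholder comment, the function leaves its result in $? for the check that follows. Note that the connect itself has no timeout here, so an unreachable host could delay the monitor.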

Mike

Zahid_Haseeb
Moderator
Partner    VIP    Accredited

Thanks, Mike. A small change of plan.

The application will write the completion time of the last echo cycle (800/810) to a log file. I want to write a cluster application script that reads this time from the log and compares it to the current time. If the difference exceeds 5 minutes, I want the cluster to fault the application script resource.

mikebounds
Level 6
Partner Accredited

This is more of a scripting question than a VCS one, but if you post an example of the last few lines of the log file, I will take a look when I get time. If anything other than the time is written to the log, please give examples. If the time entry is not always the last line of the log, also say how many "non-time" lines there may be at the end of the file.

Mike

 

Zahid_Haseeb
Moderator
Partner    VIP    Accredited

Below is an example of the only kind of line printed in the log:

23/06/2015 15:25:00

mikebounds
Level 6
Partner Accredited

Try the following:

#!/bin/bash
# $1 = action (start|stop|monitor), $2 = resource name
MAX_TIME=300 # 5 mins
LOGFILE=/tmp/testlog
ARG=$2
LOCKFILE=/tmp/.$2.DO_NOT_REMOVE
case "$1" in
start)
    # START APP
    touch "$LOCKFILE"
    ;;
stop)
    # Stop App
    rm -f "$LOCKFILE"
    ;;
monitor)
    if [ -e "$LOCKFILE" ]
    then
        # Check date in LOGFILE is less than MAX_TIME ago.
        # awk reorders dd/mm/yyyy to yyyy/mm/dd so that date -d can parse it.
        logdate_str=$(/usr/bin/tail -1 "$LOGFILE" | /bin/awk '
          {split($1,slash,"/");print slash[3]"/"slash[2]"/"slash[1]" "$2}')
        logdate_epoch=$(/bin/date -d "$logdate_str" +%s)
        today_epoch=$(/bin/date +%s)

        (( sec_since_log=$today_epoch - $logdate_epoch ))

        if [[ $sec_since_log -gt $MAX_TIME ]]
        then
            # More than max_time so return offline
            rm -f "$LOCKFILE"
            exit 100    # offline
        else
            exit 110    # online
        fi
    else
        exit 100
    fi ;;
*)
    echo "Usage: $0 {start|stop|monitor} resource_name"
    exit 1
    ;;
esac

 

Mike
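
To see what the awk step in the script above does with the sample line 23/06/2015 15:25:00, the conversion can be tried on its own. This is a minimal sketch assuming GNU date (as found on Linux); the function names to_ymd and log_age are only illustrative and do not appear in Mike's script.

```shell
#!/bin/bash
# Reorder "dd/mm/yyyy HH:MM:SS" into "yyyy/mm/dd HH:MM:SS",
# a form that GNU date -d accepts.
to_ymd() {
    echo "$1" | awk '{split($1,d,"/"); print d[3]"/"d[2]"/"d[1]" "$2}'
}

# Print the number of seconds between a log timestamp ($1) and a
# reference time given in epoch seconds ($2).
log_age() {
    local log_epoch
    log_epoch=$(date -d "$(to_ymd "$1")" +%s) || return 1
    echo $(( $2 - log_epoch ))
}

to_ymd "23/06/2015 15:25:00"    # prints 2015/06/23 15:25:00
```

Passing the reference time as an argument, for example log_age "23/06/2015 15:25:00" "$(date +%s)", makes the age calculation easy to check by hand before wiring it into the monitor branch.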

Zahid_Haseeb
Moderator
Partner    VIP    Accredited

Thanks Mike. I will update you on this :)

Zahid_Haseeb
Moderator
Partner    VIP    Accredited

Thanks, Mike. I tested it and it runs great, but I need a bit more of your help. The log file in which the timestamp is updated every 5 minutes would have to be created, which would require a change in the application.

 

I discussed this with the application dev team, and they told me that the timestamp is already being written to an existing log file, so there is no need to create a new one.

Details of existing log file name:

A new log file is created daily with a name such as "Z35xlo.o0712". The "Z35xlo.o" part is fixed and "0712" is mmdd (month and day).

Details of existing log file data:

Z35172.16.24.6.o0620:15/06/20 16:36:11:688 |DBG | TCPCommunication::initializeHeartBeat    Heart Beat Msg Const [ARE_YOU_ALIVE]
Z35172.16.24.6.o0620:15/06/20 16:42:06:213 |DBG | TCPCommunication::initializeHeartBeat    Heart Beat Msg Const [ARE_YOU_ALIVE]

----x---------------x-----------------------x-----------------x------------------x---------------x--------------x------------x

Query:

So we need to take the timestamp from the last line of that log file (Z35xlo.ommdd) and compare it with the current time.

Z35xlo.ommdd file size = 50MB

mikebounds
Level 6
Partner Accredited

Try this:

#!/bin/bash
# $1 = action (start|stop|monitor), $2 = resource name
MAX_TIME=300 # 5 mins
ARG=$2
AWK=/bin/awk
TAIL=/bin/tail
DATE=/bin/date
LOCKFILE=/tmp/.$2.DO_NOT_REMOVE
case "$1" in
start)
    # START APP
    touch "$LOCKFILE"
    ;;
stop)
    # Stop App
    rm -f "$LOCKFILE"
    ;;
monitor)
    if [ -e "$LOCKFILE" ]
    then
        # Today's log file: fixed prefix plus mmdd
        LOGFILE=/tmp/Z35xlo.o$($DATE +%m%d)
        # Check date in LOGFILE is less than MAX_TIME ago.
        # Splitting on ":" gives "yy/mm/dd HH", MM and SS in fields 2 to 4.
        logdate_str=$($TAIL -1 "$LOGFILE" | $AWK -F: '{print "20"$2":"$3":"$4}')
        logdate_epoch=$($DATE -d "$logdate_str" +%s)
        today_epoch=$($DATE +%s)

        (( sec_since_log=$today_epoch - $logdate_epoch ))
        # echo "DEBUG: $today_epoch, $logdate_str, $logdate_epoch"

        if [[ $sec_since_log -gt $MAX_TIME ]]
        then
            # More than max_time so return offline
            rm -f "$LOCKFILE"
            exit 100    # offline
        else
            exit 110    # online
        fi
    else
        exit 100
    fi ;;
*)
    echo "Usage: $0 {start|stop|monitor} resource_name"
    exit 1
    ;;
esac

 

Mike
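
The awk field split used above can be checked against one of the sample heartbeat lines. The sketch below assumes every line keeps the "name:yy/mm/dd HH:MM:SS:mmm" layout of the posted example and that the century is always 20; the function name hb_timestamp is hypothetical.

```shell
#!/bin/bash
# Extract "20yy/mm/dd HH:MM:SS" from a heartbeat line by splitting on ":".
# Field 1 is the file-name prefix, field 2 is "yy/mm/dd HH", fields 3 and 4
# are the minutes and seconds; the milliseconds in field 5 are dropped.
hb_timestamp() {
    echo "$1" | awk -F: '{print "20"$2":"$3":"$4}'
}

line='Z35172.16.24.6.o0620:15/06/20 16:36:11:688 |DBG | TCPCommunication::initializeHeartBeat    Heart Beat Msg Const [ARE_YOU_ALIVE]'
hb_timestamp "$line"    # prints 2015/06/20 16:36:11
```

On the 50MB file size: for a regular file, GNU tail -1 reads backwards from the end rather than scanning the whole file, so the size of the daily log should have little effect on the monitor run time.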

Zahid_Haseeb
Moderator
Partner    VIP    Accredited

Nice effort, Mike. I am really thankful to you.

 

In my lab, /usr/bin/tail had to be used instead of /bin/tail.
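
Because the location of tail (and possibly awk or date) differs between systems, one option is to resolve the tools at run time instead of hard-coding their paths. This is a minimal sketch, not part of Mike's script; the explicit PATH line is an assumption about where the tools live.

```shell
#!/bin/bash
# Resolve each tool from a known PATH rather than hard-coding /bin or /usr/bin.
PATH=/bin:/usr/bin:$PATH
TAIL=$(command -v tail) || { echo "tail not found" >&2; exit 1; }
AWK=$(command -v awk)   || { echo "awk not found" >&2; exit 1; }
DATE=$(command -v date) || { echo "date not found" >&2; exit 1; }
```

These three assignments could replace the fixed AWK, TAIL and DATE lines at the top of the script, leaving the rest unchanged.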