06-15-2015 09:58 PM
Environment
OS = Linux 6.x/7.x
SFHA = assume the latest, i.e. 6.x
Query:
We have a two-node local HA cluster, and a customized application runs on it fine. The application sends a TCP echo response (800) in reply to a TCP echo request (810) from a machine on the LAN. I want to monitor this echo cycle under HA.
06-16-2015 04:51 AM
Create a script, say called tcp-echo.sh, with something like:
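ARG=$2
LOCKFILE=/tmp/.${2}.DO_NOT_REMOVE

case "$1" in
start)
    # START APP
    touch $LOCKFILE
    ;;
stop)
    # Stop App
    rm $LOCKFILE
    ;;
monitor)
    if [ -a $LOCKFILE ]
    then
        # DO TCP echo test here; its exit status is what the next line checks
        if [ $? -eq 0 ]
        then
            # Started on this node and echo test passed: online
            exit 110
        else
            # Echo test failed: report offline
            rm $LOCKFILE
            exit 100
        fi
    else
        # Not started on this node: offline
        exit 100
    fi
    ;;
*)
    echo "Usage: $0 {start|stop|monitor} arg"
    exit 1
    ;;
esac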
and then use Application resource:
Application tcp-echo1 (
    StartProgram = "/opt/VRTSvcs/bin/tcp-echo.sh start tcp-echo1"
    StopProgram = "/opt/VRTSvcs/bin/tcp-echo.sh stop tcp-echo1"
    MonitorProgram = "/opt/VRTSvcs/bin/tcp-echo.sh monitor tcp-echo1"
    )
The lock file ensures you don't get a concurrency violation: if you only monitored the TCP echo, every node in the cluster would report the resource as online.
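To make the lock file idea concrete, here is a minimal sketch of the monitor decision on its own. The function name monitor_status and the way the probe command is passed in are assumptions for illustration only; the exit codes 110 (online) and 100 (offline) are the ones used in the script above. It prints the code instead of exiting so it can be tried from a shell.

```shell
# Hypothetical helper (not part of VCS): print the monitor result for one node.
# $1 = lock file path, remaining args = probe command whose exit status is used.
monitor_status() {
    local lockfile=$1
    shift
    if [ ! -e "$lockfile" ]; then
        # Resource was never started on this node, so report offline
        # even if the probe itself would succeed from here.
        echo 100
        return
    fi
    if "$@" >/dev/null 2>&1; then
        echo 110            # started here and probe passed: online
    else
        rm -f "$lockfile"   # probe failed: drop the lock and report offline
        echo 100
    fi
}
```

For example, monitor_status /tmp/.tcp-echo1.DO_NOT_REMOVE true prints 110 only on the node where the start entry point created the lock file, which is why only one node reports online.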
Mike
06-22-2015 10:46 PM
Thanks Mike. A slight change of plan.
The application will write the completion time of the last echo cycle (800/810) to a log file. I want to write a cluster application script that reads that time from the log and compares it with the current time. If the difference exceeds 5 minutes, I want the cluster to fault the application script resource.
06-23-2015 02:08 AM
This is more of a scripting question than a VCS one, but if you post an example of the last few lines of the log file, I will take a look when I get time. If anything other than the time is written to the log, please give examples, and if the time entry is not always the last line in the log, tell me how many "non-time log" lines there may be at the end of the file.
Mike
06-23-2015 04:09 AM
The line below is the only thing printed in the log (example):
23/06/2015 15:25:00
06-23-2015 05:20 AM
Try the following:
MAX_TIME=300    # 5 mins
LOGFILE=/tmp/testlog
ARG=$2
LOCKFILE=/tmp/.$2.DO_NOT_REMOVE

case "$1" in
start)
    # START APP
    touch $LOCKFILE
    ;;
stop)
    # Stop App
    rm $LOCKFILE
    ;;
monitor)
    if [ -a $LOCKFILE ]
    then
        # Check date in LOGFILE is less than MAX_TIME ago
        logdate_str=`/usr/bin/tail -1 $LOGFILE | /bin/awk ' {split($1,slash,"/");print slash[3]"/"slash[2]"/"slash[1]" "$2}'`
        logdate_epoch=`/bin/date -d "$logdate_str" +%s`
        today_epoch=`/bin/date +%s`
        (( sec_since_log=$today_epoch - $logdate_epoch ))
        if [[ $sec_since_log -gt $MAX_TIME ]]
        then
            # More than max_time so return offline
            rm $LOCKFILE
            exit 100
        else
            exit 110
        fi
    else
        exit 100
    fi
    ;;
*)
    echo "Usage: $0 {start|stop|monitor} resource_name"
    exit 1
    ;;
esac
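The awk step is there because date -d does not reliably read dd/mm/yyyy, so the fields are reordered to yyyy/mm/dd first. As a rough sketch of just that step, assuming GNU date (as on Linux) and using the hypothetical function names to_iso_order and log_age, which are not part of the script above:

```shell
# Hypothetical helper: reorder "dd/mm/yyyy HH:MM:SS" to "yyyy/mm/dd HH:MM:SS".
to_iso_order() {
    echo "$1" | awk '{ split($1, d, "/"); print d[3] "/" d[2] "/" d[1] " " $2 }'
}

# Hypothetical helper: print the age in seconds of a log stamp.
# $1 = current time as epoch seconds, $2 = stamp in dd/mm/yyyy HH:MM:SS.
# Returns non-zero if GNU date cannot parse the reordered stamp.
log_age() {
    local now=$1 stamp=$2 epoch
    epoch=$(date -d "$(to_iso_order "$stamp")" +%s 2>/dev/null) || return 1
    echo $(( now - epoch ))
}
```

With the example log line, to_iso_order "23/06/2015 15:25:00" prints 2015/06/23 15:25:00, and log_age "$(date +%s)" "23/06/2015 15:25:00" prints how many seconds ago that was.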
Mike
06-26-2015 12:35 PM
Thanks Mike. I will update you on this :)
07-15-2015 02:12 AM
Thanks Mike. I tested it and it runs great, but I need a bit more of your help. The script expects a log file in which the timestamp is updated every 5 minutes, and creating that log file would require a change to the application.
I discussed this with the application dev team, and they told me the timestamp is already being written to an existing log file, so there is no need to create a new one.
Details of existing log file name:
A new log file is created daily with a name like "Z35xlo.o0712", where "Z35xlo.o" is fixed and "0712" is mmdd (month and day).
Details of existing log file data:
Z35172.16.24.6.o0620:15/06/20 16:36:11:688 |DBG | TCPCommunication::initializeHeartBeat Heart Beat Msg Const [ARE_YOU_ALIVE]
Z35172.16.24.6.o0620:15/06/20 16:42:06:213 |DBG | TCPCommunication::initializeHeartBeat Heart Beat Msg Const [ARE_YOU_ALIVE]
----x---------------x-----------------------x-----------------x------------------x---------------x--------------x------------x
Query:
So we need to take the timestamp from the last line of that log file (Z35xlo.ommdd) and compare it with the current time.
Z35xlo.ommdd file size = 50MB
07-15-2015 04:12 AM
Try this:
MAX_TIME=300    # 5 mins
ARG=$2
AWK=/bin/awk
TAIL=/bin/tail
DATE=/bin/date
LOCKFILE=/tmp/.$2.DO_NOT_REMOVE

case "$1" in
start)
    # START APP
    touch $LOCKFILE
    ;;
stop)
    # Stop App
    rm $LOCKFILE
    ;;
monitor)
    if [ -a $LOCKFILE ]
    then
        LOGFILE=/tmp/Z35xlo.o`$DATE +%m%d`
        # Check date in LOGFILE is less than MAX_TIME ago
        logdate_str=`$TAIL -1 $LOGFILE | $AWK -F: '{print "20"$2":"$3":"$4}'`
        logdate_epoch=`$DATE -d "$logdate_str" +%s`
        today_epoch=`$DATE +%s`
        (( sec_since_log=$today_epoch - $logdate_epoch ))
        # echo "DEBUG: $today_epoch, $logdate_str, $logdate_epoch"
        if [[ $sec_since_log -gt $MAX_TIME ]]
        then
            # More than max_time so return offline
            rm $LOCKFILE
            exit 100
        else
            exit 110
        fi
    else
        exit 100
    fi
    ;;
*)
    echo "Usage: $0 {start|stop|monitor} resource_name"
    exit 1
    ;;
esac
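One thing worth checking in the lab: if the last line cannot be parsed (for example, just after midnight when the new day's file may not exist yet, or if the last line is not a heartbeat line), logdate_epoch ends up empty. A possible guard is sketched below. The function names extract_stamp and is_stale are made up for this example, and treating an unreadable stamp as stale is an assumption you may prefer to reverse. It assumes GNU date and the log line layout posted above.

```shell
# Hypothetical helper: pull "20yy/mm/dd HH:MM:SS" out of a log line shaped like
# Z35172.16.24.6.o0620:15/06/20 16:36:11:688 |DBG | ...
extract_stamp() {
    echo "$1" | awk -F: '{ print "20" $2 ":" $3 ":" $4 }'
}

# Hypothetical helper: succeed (return 0) when the line is older than max seconds.
# $1 = current epoch seconds, $2 = last log line, $3 = max age in seconds.
# Assumption: a line with no usable time stamp counts as stale.
is_stale() {
    local now=$1 line=$2 max=$3 stamp epoch
    local re='^[0-9]{4}/[0-9]{2}/[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}$'
    stamp=$(extract_stamp "$line")
    echo "$stamp" | grep -Eq "$re" || return 0
    epoch=$(date -d "$stamp" +%s 2>/dev/null) || return 0
    [ $(( now - epoch )) -gt "$max" ]
}
```

In the monitor branch this would be called as is_stale "$today_epoch" "$($TAIL -1 $LOGFILE)" "$MAX_TIME". tail -1 only reads the end of the file, so the 50MB size should not matter.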
Mike
07-16-2015 04:17 AM
Nice effort Mike. I am really thankful to you.
In my lab, /usr/bin/tail had to be used instead of /bin/tail.
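Since the location of tail differs between systems, one option is to look the path up instead of hard-coding it. This is only a sketch: resolve_bin is a made-up name, and it assumes the PATH seen by the script when the cluster runs it includes the directory holding the binary, which should be verified.

```shell
# Hypothetical helper: print the absolute path of a command, or fail if
# it is not found as a file on PATH (aliases and functions are rejected).
resolve_bin() {
    local p
    p=$(command -v "$1" 2>/dev/null) || return 1
    case $p in
        /*) echo "$p" ;;
        *)  return 1 ;;
    esac
}
```

It could then be used as TAIL=$(resolve_bin tail) in place of TAIL=/bin/tail, with the same pattern for awk and date.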