bonny6
15 years ago, Level 3
problem with VRTS ONG resource
I've got a problem. I have two servers in a VRTS cluster.
Although the VRTS ONG process is online, I'm getting this message very often:
MIG_MPM_1a mpm1a (Veritas_Cluster_Server): ONG (ONG): Resource state is unknown
I also get this message:
VCS INFO V-16-2-13001 (mpm1a) Resource(ONG): Output of the completed operation (monitor)
/opt/VRTSvcs/bin/ONG/monitor: test: unknown operator 2300
I have compared both configuration files and nothing is missing.
What could be the problem here?
Thanks,
- bonny,
The error you are getting suggests that the ps command in the monitor script is picking up more than one instance of ong_agent and/or ong_monitor, so the test command fails.
Example:
Here is a process with many instances:
# ps -ef |grep httpd
juser 9111 9094 0 12:27:48 ? 0:00 /usr/local/apache2/bin/httpd -k start
juser 10190 9094 0 12:31:56 ? 0:00 /usr/local/apache2/bin/httpd -k start
juser 9112 9094 0 12:27:48 ? 0:00 /usr/local/apache2/bin/httpd -k start
juser 9109 9094 0 12:27:48 ? 0:00 /usr/local/apache2/bin/httpd -k start
root 9094 6589 0 12:27:48 ? 0:00 /usr/local/apache2/bin/httpd -k start
juser 9110 9094 0 12:27:48 ? 0:00 /usr/local/apache2/bin/httpd -k start
juser 11247 9094 0 12:39:14 ? 0:00 /usr/local/apache2/bin/httpd -k start
juser 12314 9094 0 12:42:03 ? 0:00 /usr/local/apache2/bin/httpd -k start
juser 9113 9094 0 12:27:48 ? 0:00 /usr/local/apache2/bin/httpd -k start
juser 10195 9094 0 12:32:29 ? 0:00 /usr/local/apache2/bin/httpd -k start
root 16818 16541 0 13:06:11 pts/4 0:00 grep httpd
Substitute this process into the monitor script logic:
# process1=`ps -ef | grep httpd | grep -v grep | awk '{ print $2 }'`
# if [ X$process1 = "X" ]; then
> echo 100
> else
> echo 110
> fi
test: unknown operator 10190
^^^^^^^ This is the error you are getting (with a different process number, obviously).
This happens because test expects $process1 to expand to a single argument, but it does not:
# echo $process1
9111 10190 9112 9109 9094 9110 11247 12314 9113 10195
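The point above can be illustrated with a minimal sketch, assuming a POSIX shell. `status_for` is a hypothetical helper name, and 100/110 are simply the values echoed in the example. Quoting the expansion keeps the whole PID list as one argument to test, so the comparison no longer errors out when several PIDs come back:

```shell
# Hypothetical helper: echo 100 if the PID list is empty, 110 otherwise.
status_for() {
    pids=$1
    # Quoted expansion: test always sees exactly three arguments,
    # no matter how many PIDs (or none) the variable holds.
    if [ "X$pids" = "X" ]; then
        echo 100
    else
        echo 110
    fi
}

status_for "9111 10190 9112"   # several PIDs
status_for "6858"              # one PID
status_for ""                  # no PIDs
```

Quoting only removes the syntax error. It does not tell you whether more than one instance is acceptable for the application.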
Compare to a "good" process with only one instance:
# ps -ef |grep lpsched
root 6858 6589 0 Jan 21 ? 0:00 /usr/lib/lp/local/lpsched
root 17617 16541 0 13:07:45 pts/4 0:00 grep lpsched
# process2=`ps -ef | grep lpsched | grep -v grep | awk '{print $2 }'`
# if [ X$process2 = "X" ]; then
> echo 100
> else
> echo 110
> fi
110
^^^^^^^ correct output
# echo $process2
6858
As for why it's picking up multiple instances, that can't be determined from here. Some possibilities: someone or something else might be running the program manually at the same time, or a process or file with the same name might be running.
If the application can have multiple instances running (i.e. it is fine as long as at least one instance is found), the monitor script can be modified as follows as a workaround (similar to Gaurav's suggestion, but it accounts for multiple lines found in ps):
process1=`ps -ef | grep '/'ong_agent | grep -vc grep`
process2=`ps -ef | grep '/'ong_alerter | grep -vc grep`
# Check the process of the ODM
if [ $process1 -le 0 ] ; then
    retcode=100
elif [ $process2 -le 0 ] ; then
    retcode=100
else
    retcode=110
fi
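The same counting logic can be sketched as two small functions, which makes it easy to try against sample ps output before touching the real monitor script. This is only a sketch under assumptions: `count_instances` and `monitor_retcode` are hypothetical names, not part of the ONG agent, and the process names are taken from the workaround above.

```shell
# Hypothetical helper: read ps-style lines on stdin and print how many
# mention /<name>, ignoring the grep command's own entry.
count_instances() {
    grep "/$1" | grep -vc grep
}

# Hypothetical helper: map the two instance counts to a monitor return
# code (in VCS script agents, 100 reports offline and 110 online).
monitor_retcode() {
    if [ "$1" -le 0 ] || [ "$2" -le 0 ]; then
        echo 100
    else
        echo 110
    fi
}

# Intended use inside the monitor script (not run here):
#   agents=`ps -ef | count_instances ong_agent`
#   alerters=`ps -ef | count_instances ong_alerter`
#   exit `monitor_retcode "$agents" "$alerters"`
```

Because the counts are always single integers, the numeric comparison works the same whether ps finds zero, one, or many matching lines.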
If the application cannot handle multiple processes running (i.e. there should only be one process at a time, and any more is a problem), you will need to investigate on your system or follow up with the ONG vendor to see where and how the extra processes are being started.