Forum Discussion

bonny6
14 years ago

problem with VRTS ONG resource


I've got a problem. I have 2 servers in a VRTS cluster.

Although the VRTS ONG process is online, I'm getting this message very often:

 MIG_MPM_1a mpm1a (Veritas_Cluster_Server): ONG (ONG): Resource state is unknown

and also this message:

 VCS INFO V-16-2-13001 (mpm1a) Resource(ONG): Output of the completed operation (monitor)
/opt/VRTSvcs/bin/ONG/monitor: test: unknown operator 2300

I have compared both configuration files and nothing is missing.

What could be the problem here?

Thanks,

  • bonny,

    The error you are getting suggests the ps command in the monitor script is picking up more than one instance of ong_agent and/or ong_monitor, so the variable holds several PIDs and the test fails.

    example:
    Here is a process with many instances:
    # ps -ef |grep httpd
       juser  9111  9094   0 12:27:48 ?           0:00 /usr/local/apache2/bin/httpd -k start
       juser 10190  9094   0 12:31:56 ?           0:00 /usr/local/apache2/bin/httpd -k start
       juser  9112  9094   0 12:27:48 ?           0:00 /usr/local/apache2/bin/httpd -k start
       juser  9109  9094   0 12:27:48 ?           0:00 /usr/local/apache2/bin/httpd -k start
        root  9094  6589   0 12:27:48 ?           0:00 /usr/local/apache2/bin/httpd -k start
       juser  9110  9094   0 12:27:48 ?           0:00 /usr/local/apache2/bin/httpd -k start
       juser 11247  9094   0 12:39:14 ?           0:00 /usr/local/apache2/bin/httpd -k start
       juser 12314  9094   0 12:42:03 ?           0:00 /usr/local/apache2/bin/httpd -k start
       juser  9113  9094   0 12:27:48 ?           0:00 /usr/local/apache2/bin/httpd -k start
       juser 10195  9094   0 12:32:29 ?           0:00 /usr/local/apache2/bin/httpd -k start
        root 16818 16541   0 13:06:11 pts/4       0:00 grep httpd

    Substitute this process into the monitor script logic:
    # process1=`ps -ef | grep httpd | grep -v grep | awk '{ print $2 }'`
    # if [ X$process1 = "X" ]; then
    > echo 100
    > else
    > echo 110
    > fi
    test: unknown operator 10190
    ^^^^^^^ this is the error you are getting (with a different PID, obviously)
    This happens because test expects $process1 to be a single argument, but it expands to several:
    # echo $process1
    9111 10190 9112 9109 9094 9110 11247 12314 9113 10195

    Compare to a "good" process with only one instance:
    # ps -ef |grep lpsched
        root  6858  6589   0   Jan 21 ?           0:00 /usr/lib/lp/local/lpsched
        root 17617 16541   0 13:07:45 pts/4       0:00 grep lpsched
    # process2=`ps -ef | grep lpsched | grep -v grep | awk '{print $2 }'`
    # if [ X$process2 = "X" ]; then
    > echo 100
    > else
    > echo 110
    > fi
    110
    ^^^^^^^ correct output
    # echo $process2
    6858
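
    Both cases above come down to whether the variable expands to one word or several inside [ ]. As a minimal sketch (check_pids is a hypothetical helper for illustration, not part of the actual ONG monitor script), quoting the variable keeps the whole PID list as a single argument, so the same test works for zero, one, or many PIDs:

```shell
# check_pids is a hypothetical helper, not taken from the ONG monitor script.
# It prints 100 when given an empty PID list and 110 otherwise,
# mirroring the echo 100 / echo 110 branches in the examples above.
check_pids() {
    # Quoting "X$1" means test always receives exactly three arguments,
    # even when $1 contains several space-separated PIDs.
    if [ "X$1" = "X" ]; then
        echo 100
    else
        echo 110
    fi
}

check_pids ""                   # no process found      -> 100
check_pids "6858"               # one process           -> 110
check_pids "9111 10190 9112"    # several PIDs, no error -> 110
```

    Note that this only removes the test error. It does not tell you whether having several instances is acceptable for your application.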

    Re: why it's picking up multiple instances, it's not possible to determine that from here. Some possibilities: someone or something else might be running the program manually at the same time, or another process or file with the same name is matching the grep.

    If the application can have multiple instances running (i.e. it is fine as long as at least one instance is found), then the monitor script can be modified as follows as a workaround (similar to Gaurav's suggestion, but this accounts for multiple lines found in ps):

    process1=`ps -ef | grep '/'ong_agent | grep -vc grep`
    process2=`ps -ef | grep '/'ong_alerter | grep -vc grep`

    # Check the process of the ODM
    if [ $process1 -le 0 ] ; then
        retcode=100
    elif [ $process2 -le 0 ] ; then
        retcode=100
    else
        retcode=110
    fi
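
    If you want to dry-run this counting logic before touching the live monitor script, one option is to wrap it in functions that read captured ps output instead of calling ps directly. This is only a sketch: count_procs and monitor_status are hypothetical names, and the /opt/ong/bin paths and PIDs in the sample data are made up for illustration.

```shell
# count_procs: read ps-style lines on stdin and print how many contain
# "/<name>", ignoring any line that mentions grep (same idea as grep -vc grep).
count_procs() {
    grep "/$1" | grep -vc grep
}

# monitor_status: print 110 if at least one ong_agent and at least one
# ong_alerter appear in the given ps output, otherwise print 100.
monitor_status() {
    ps_output=$1
    agents=$(printf '%s\n' "$ps_output" | count_procs ong_agent)
    alerters=$(printf '%s\n' "$ps_output" | count_procs ong_alerter)
    if [ "$agents" -le 0 ] || [ "$alerters" -le 0 ]; then
        echo 100
    else
        echo 110
    fi
}

# Sample data only: the paths and PIDs below are assumptions, not real output.
both_running='    root  2300     1   0 12:27:48 ?           0:00 /opt/ong/bin/ong_agent
    root  2301     1   0 12:27:49 ?           0:00 /opt/ong/bin/ong_alerter'
agent_only='    root  2300     1   0 12:27:48 ?           0:00 /opt/ong/bin/ong_agent'

monitor_status "$both_running"   # prints 110
monitor_status "$agent_only"     # prints 100
```

    On the real system you would feed it live data, e.g. monitor_status "$(ps -ef)".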

    If the application cannot handle multiple processes running (i.e. there should only be one process running at a time, and any more is a problem), then you will need to investigate on your system or follow up with the ONG vendor to see where and how the extra processes are being started.
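
    As a starting point for that investigation, it can help to list the PID and parent PID of every matching process, since the parent often shows who started the extra copy. A rough sketch, assuming the usual ps -ef column order (UID, PID, PPID, ...); list_instances is a hypothetical helper and the sample lines are invented:

```shell
# list_instances: read ps -ef style lines on stdin and print "PID PPID"
# for each line containing "/<name>", skipping lines that mention grep.
list_instances() {
    awk -v pat="/$1" 'index($0, pat) > 0 && index($0, "grep") == 0 { print $2, $3 }'
}

# Sample data only: paths, PIDs, and parent PIDs are assumptions.
sample='    root  2300     1   0 12:27:48 ?           0:00 /opt/ong/bin/ong_agent
    root  2411  2290   0 12:31:56 pts/2       0:00 /opt/ong/bin/ong_agent
    root  2500  2290   0 12:32:10 pts/2       0:00 grep ong_agent'

printf '%s\n' "$sample" | list_instances ong_agent
```

    In this made-up data the second copy has a different parent and a pts terminal, which would point at someone starting it by hand from a login shell. On the real system, pipe ps -ef into list_instances and then look up each parent PID.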
