Forum Discussion

hytham_fekry
10 years ago

MonitorProcesses parameter in Veritas cluster

Hi,

We are running Veritas cluster version 6.1 on Red Hat Enterprise Linux Server release 6.5 (Santiago).

We configured a resource to monitor an application, "pricemaker". When we set the MonitorProcesses parameter for the process below, hacf shows no errors; however, after running haconf -dump -makero, the parameter is reset as shown below:

- main.cf before dump : 

================

MonitorProcesses={ "java -server -d64 -verbose:gc -XX:+UseCompressedOops -XX:+PrintGCTimeStamps -XX:+UseSerialGC -ms12g -mx12g -XX:NewSize=1g -XX:MaxNewSize=1g -DlocalMode=true -Djava.library.path=/home/rlx/rlmapp2/RLX/instance1/Product/lib:/home/rlx/rlmapp2/RLX/instance1/Config/lib -Dpricemaker.directory=/home/rlx/rlmapp2/RLX/instance1 -Dpricemaker.stopIfDisabledLookup=true -Ddb.Product= -Ddb.DriverClass= -Ddb.ConnectionString= -Ddb.User= -Ddb.Password= -Ddb.Schema= -classpath /home/rlx/rlmapp2/RLX/instance1/Product/properties:/home/rlx/rlmapp2/RLX/instance1/Product/lib/3rdParty.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/pmutil.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/pricemaker.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/sun/jmxremote_optional.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/sun/jmxremote.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/sun/jmxri.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmexecution-3rdParty.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmexecution.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmpoints.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmpointsinterfaces.jar:/home/rlx/rlmapp2/RLX/data:/home/rlx/rlmapp2/RLX/lib/Utilities/riiutil.jar:/home/rlx/rlmapp2/RLX/lib/Utilities/3rdParty.jar:/home/rlx/rlmapp2/RLX/lib/Repository/embedded.jar:/home/rlx/rlmapp2/RLX/lib/Repository/3rdParty.jar:/home/rlx/rlmapp2/RLX/lib/Catalog/deployment.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmexecution.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib:/home/rlx/rlmapp2/RLX/instance1/Config/lib/support.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/plugins.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rp:/home/rlx/rlmapp2/RLX/master/Product/lib/oracle/10.2.0.5/ojdbc14.jar com.rii.pricemaker.engine.RatingEngine" }

 

** The process name is entered as a single line in the main.cf file; that is not the problem.

- main.cf after dump

==============

MonitorProcesses={ "" } 

[root@rlm2 ~]# ps -aef |grep rlx

rlx        527     1  1 16:04 pts/1    00:00:41 java -server -d64 -verbose:gc -XX:+UseCompressedOops -XX:+PrintGCTimeStamps -XX:+UseSerialGC -ms12g -mx12g -XX:NewSize=1g -XX:MaxNewSize=1g -DlocalMode=true -Djava.library.path=/home/rlx/rlmapp2/RLX/instance1/Product/lib:/home/rlx/rlmapp2/RLX/instance1/Config/lib -Dpricemaker.directory=/home/rlx/rlmapp2/RLX/instance1 -Dpricemaker.stopIfDisabledLookup=true -Ddb.Product= -Ddb.DriverClass= -Ddb.ConnectionString= -Ddb.User= -Ddb.Password= -Ddb.Schema= -classpath /home/rlx/rlmapp2/RLX/instance1/Product/properties:/home/rlx/rlmapp2/RLX/instance1/Product/lib/3rdParty.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/pmutil.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/pricemaker.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/sun/jmxremote_optional.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/sun/jmxremote.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/sun/jmxri.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmexecution-3rdParty.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmexecution.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmpoints.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmpointsinterfaces.jar:/home/rlx/rlmapp2/RLX/data:/home/rlx/rlmapp2/RLX/lib/Utilities/riiutil.jar:/home/rlx/rlmapp2/RLX/lib/Utilities/3rdParty.jar:/home/rlx/rlmapp2/RLX/lib/Repository/embedded.jar:/home/rlx/rlmapp2/RLX/lib/Repository/3rdParty.jar:/home/rlx/rlmapp2/RLX/lib/Catalog/deployment.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmexecution.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib:/home/rlx/rlmapp2/RLX/instance1/Config/lib/support.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/plugins.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rp:/home/rlx/rlmapp2/RLX/master/Product/lib/oracle/10.2.0.5/ojdbc14.jar com.rii.pricemaker.engine.RatingEngine

root      4228 29319  0 16:42 pts/0    00:00:00 grep rlx

root     32596 31636  0 16:04 pts/1    00:00:00 su - rlx

rlx      32597 32596  0 16:04 pts/1    00:00:00 -bash

 

** The probe script is enough for our needs. The problem is that if we comment out (hash) the MonitorProcesses parameter, V-16-1-10283 is reported.

3 Replies

  • Firstly, you describe VCS somehow mangling the setting of the MonitorProcesses attribute to:

    MonitorProcesses={ "" } 

    after you dump the configuration:

    haconf -dump -makero

    Well, that is a big, ugly bug that you should report and escalate to Tech Support.

     

    Secondly, as Mike suggests, use MonitorProgram instead of MonitorProcesses to avoid the bug. Create a simple script that reliably matches the unique process with an appropriate egrep regular expression, such as:

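    # Look up the user the resource runs as; fall back to root if none is set
    procUSER="$( hares -value pricemakerResourceName User )"
    [[ -z $procUSER ]] && procUSER="root"
    
    # Match the full command line of the monitored process (dots escaped to match literally)
    /bin/ps -u "$procUSER" -wwo args | egrep '^java .*jar com\.rii\.pricemaker\.engine\.RatingEngine$'
    if [[ $? == 0 ]] ; then
      # indicate to VCS ONLINE
      exit 0
    else
      # indicate to VCS OFFLINE
      exit 1
    fi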

    Modify the pattern-matching portion until it reliably AND uniquely matches ONLY the process you are monitoring.

    Double-check the appropriate exit codes against the Application agent documentation for your version of VCS...

    This method both avoids the bug you are seeing and gives you a much more reasonable main.cf file, since you avoid such a large MonitorProcesses definition.

    2015-01-08: Updated and fixed above code snippet to work properly on Linux....
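
A quick way to tune the pattern before handing the script to VCS is to run it against a few sample command lines. This is only a sketch: matches_pattern is a hypothetical helper, the target line is a shortened stand-in for the real java command line, and the last lookalike entry is made up for illustration. The dots in the class name are escaped here so that they match literally.

```shell
#!/bin/bash
# Candidate pattern for the monitor script (dots escaped to match literally).
PATTERN='^java .*jar com\.rii\.pricemaker\.engine\.RatingEngine$'

# Hypothetical helper: returns 0 if the given command line matches PATTERN.
matches_pattern() {
  printf '%s\n' "$1" | grep -Eq -- "$PATTERN"
}

# Shortened stand-in for the real command line (assumption for illustration).
target='java -server -d64 -classpath /home/rlx/rlmapp2/RLX/master/Product/lib/oracle/10.2.0.5/ojdbc14.jar com.rii.pricemaker.engine.RatingEngine'

# Lines the pattern must NOT match: other entries from the ps listing,
# plus one made-up lookalike with a different class name.
for cmd in "$target" "grep rlx" "su - rlx" "-bash" \
           "java -cp other.jar com.rii.pricemaker.engine.RatingEngineTest"; do
  if matches_pattern "$cmd"; then
    echo "MATCH:    $cmd"
  else
    echo "no match: $cmd"
  fi
done
```

Only the first line should be reported as a match; if anything else matches, tighten the pattern before using it in the monitor script.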

  • I thought it might be a problem with the attribute being too long, but I tried it on my 6.1 Linux cluster running on RHEL 5.5 (in Vbox) and it works fine.

    My main.cf before starting VCS is the same as yours; after starting VCS, MonitorProcesses in main.cf gets reformatted onto 2 lines:

                    MonitorProcesses = {
                             "java -server -d64 -verbose:gc -XX:+UseCompressedOops -XX:+PrintGCTimeStamps -XX:+UseSerialGC -ms12g -mx12g -XX:NewSize=1g -XX:MaxNewSize=1g -DlocalMode=true -Djava.library.path=/home/rlx/rlmapp2/RLX/instance1/Product/lib:/home/rlx/rlmapp2/RLX/instance1/Config/lib -Dpricemaker.directory=/home/rlx/rlmapp2/RLX/instance1 -Dpricemaker.stopIfDisabledLookup=true -Ddb.Product= -Ddb.DriverClass= -Ddb.ConnectionString= -Ddb.User= -Ddb.Password= -Ddb.Schema= -classpath /home/rlx/rlmapp2/RLX/instance1/Product/properties:/home/rlx/rlmapp2/RLX/instance1/Product/lib/3rdParty.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/pmutil.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/pricemaker.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/sun/jmxremote_optional.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/sun/jmxremote.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/sun/jmxri.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmexecution-3rdParty.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmexecution.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmpoints.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmpointsinterfaces.jar:/home/rlx/rlmapp2/RLX/data:/home/rlx/rlmapp2/RLX/lib/Utilities/riiutil.jar:/home/rlx/rlmapp2/RLX/lib/Utilities/3rdParty.jar:/home/rlx/rlmapp2/RLX/lib/Repository/embedded.jar:/home/rlx/rlmapp2/RLX/lib/Repository/3rdParty.jar:/home/rlx/rlmapp2/RLX/lib/Catalog/deployment.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmexecution.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib:/home/rlx/rlmapp2/RLX/instance1/Config/lib/support.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/plugins.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rp:/home/rlx/rlmapp2/RLX/master/Product/lib/oracle/10.2.0.5/ojdbc14.jar com.rii.pricemaker.engine.RatingEngine" }

    I then tried opening and closing the config, and main.cf does not change. I also made a change to the config and dumped it, and MonitorProcesses still looks fine. I also tried setting MonitorProcesses using hares, and this works fine too.

    What are the results if you run:

    hares -display app_res_name -attribute MonitorProcesses

    before and after seeing the issue?

    Also, are there any errors in the engine log?

    If you can't get it to work, then using MonitorProgram or PidFiles instead of MonitorProcesses is a workaround.

    Mike
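
The MonitorProgram workaround suggested in both replies could be applied from the command line roughly as follows. This is a sketch under assumptions, not a tested procedure: the resource name app_res_name is the placeholder from the hares -display example above, the script path is hypothetical, and run is a small made-up wrapper that only prints the commands unless DRY_RUN=0 is set, so the sequence can be reviewed before it touches a cluster.

```shell
#!/bin/bash
# RES and SCRIPT are placeholders (assumptions); set them for your cluster.
RES="${RES:-app_res_name}"
SCRIPT="${SCRIPT:-/usr/local/bin/pricemaker_monitor.sh}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands only, 0 = execute them

# Hypothetical wrapper: echo the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

switch_to_monitor_program() {
  run haconf -makerw                                  # open the configuration read-write
  run hares -modify "$RES" MonitorProgram "$SCRIPT"   # point the resource at the monitor script
  run haconf -dump -makero                            # write main.cf and make it read-only again
  run hares -display "$RES" -attribute MonitorProgram # confirm the value survived the dump
}

switch_to_monitor_program
```

Checking the attribute with hares -display after the dump is the same test Mike asks for with MonitorProcesses, so it should show quickly whether the new value is also being reset.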