
MonitorProcesses parameter in Veritas cluster

hytham_fekry
Level 4
Partner Accredited Certified

Hi,

We are running Veritas Cluster Server (VCS) 6.1 on Red Hat Enterprise Linux Server release 6.5 (Santiago).

We configured a resource to monitor an application, "pricemaker". When we set the MonitorProcesses parameter to the process below, hacf shows no errors. However, after running haconf -dump -makero, the parameter is reset as shown below:

- main.cf before dump : 

================

MonitorProcesses={ "java -server -d64 -verbose:gc -XX:+UseCompressedOops -XX:+PrintGCTimeStamps -XX:+UseSerialGC -ms12g -mx12g -XX:NewSize=1g -XX:MaxNewSize=1g -DlocalMode=true -Djava.library.path=/home/rlx/rlmapp2/RLX/instance1/Product/lib:/home/rlx/rlmapp2/RLX/instance1/Config/lib -Dpricemaker.directory=/home/rlx/rlmapp2/RLX/instance1 -Dpricemaker.stopIfDisabledLookup=true -Ddb.Product= -Ddb.DriverClass= -Ddb.ConnectionString= -Ddb.User= -Ddb.Password= -Ddb.Schema= -classpath /home/rlx/rlmapp2/RLX/instance1/Product/properties:/home/rlx/rlmapp2/RLX/instance1/Product/lib/3rdParty.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/pmutil.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/pricemaker.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/sun/jmxremote_optional.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/sun/jmxremote.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/sun/jmxri.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmexecution-3rdParty.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmexecution.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmpoints.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmpointsinterfaces.jar:/home/rlx/rlmapp2/RLX/data:/home/rlx/rlmapp2/RLX/lib/Utilities/riiutil.jar:/home/rlx/rlmapp2/RLX/lib/Utilities/3rdParty.jar:/home/rlx/rlmapp2/RLX/lib/Repository/embedded.jar:/home/rlx/rlmapp2/RLX/lib/Repository/3rdParty.jar:/home/rlx/rlmapp2/RLX/lib/Catalog/deployment.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmexecution.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib:/home/rlx/rlmapp2/RLX/instance1/Config/lib/support.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/plugins.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rp:/home/rlx/rlmapp2/RLX/master/Product/lib/oracle/10.2.0.5/ojdbc14.jar com.rii.pricemaker.engine.RatingEngine" }

 

** The process name is entered as a single line in the main.cf file; that is not the problem.

- main.cf after dump

==============

MonitorProcesses={ "" } 

[root@rlm2 ~]# ps -aef |grep rlx

rlx        527     1  1 16:04 pts/1    00:00:41 java -server -d64 -verbose:gc -XX:+UseCompressedOops -XX:+PrintGCTimeStamps -XX:+UseSerialGC -ms12g -mx12g -XX:NewSize=1g -XX:MaxNewSize=1g -DlocalMode=true -Djava.library.path=/home/rlx/rlmapp2/RLX/instance1/Product/lib:/home/rlx/rlmapp2/RLX/instance1/Config/lib -Dpricemaker.directory=/home/rlx/rlmapp2/RLX/instance1 -Dpricemaker.stopIfDisabledLookup=true -Ddb.Product= -Ddb.DriverClass= -Ddb.ConnectionString= -Ddb.User= -Ddb.Password= -Ddb.Schema= -classpath /home/rlx/rlmapp2/RLX/instance1/Product/properties:/home/rlx/rlmapp2/RLX/instance1/Product/lib/3rdParty.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/pmutil.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/pricemaker.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/sun/jmxremote_optional.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/sun/jmxremote.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/sun/jmxri.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmexecution-3rdParty.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmexecution.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmpoints.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmpointsinterfaces.jar:/home/rlx/rlmapp2/RLX/data:/home/rlx/rlmapp2/RLX/lib/Utilities/riiutil.jar:/home/rlx/rlmapp2/RLX/lib/Utilities/3rdParty.jar:/home/rlx/rlmapp2/RLX/lib/Repository/embedded.jar:/home/rlx/rlmapp2/RLX/lib/Repository/3rdParty.jar:/home/rlx/rlmapp2/RLX/lib/Catalog/deployment.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmexecution.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib:/home/rlx/rlmapp2/RLX/instance1/Config/lib/support.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/plugins.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rp:/home/rlx/rlmapp2/RLX/master/Product/lib/oracle/10.2.0.5/ojdbc14.jar com.rii.pricemaker.engine.RatingEngine

root      4228 29319  0 16:42 pts/0    00:00:00 grep rlx

root     32596 31636  0 16:04 pts/1    00:00:00 su - rlx

rlx      32597 32596  0 16:04 pts/1    00:00:00 -bash

 

** The probe script is enough for our needs. The problem is that if we comment out the MonitorProcesses parameter with a hash (#), V-16-1-10283 is reported.

3 REPLIES

mikebounds
Level 6
Partner Accredited

I thought it might be a problem with the attribute being too long, but I tried it on my VCS 6.1 Linux cluster running on RHEL 5.5 (in Vbox) and it works fine.

My main.cf before starting VCS is the same as yours, and after starting VCS the MonitorProcesses entry in main.cf gets reformatted onto 2 lines:

                MonitorProcesses = {
                         "java -server -d64 -verbose:gc -XX:+UseCompressedOops -XX:+PrintGCTimeStamps -XX:+UseSerialGC -ms12g -mx12g -XX:NewSize=1g -XX:MaxNewSize=1g -DlocalMode=true -Djava.library.path=/home/rlx/rlmapp2/RLX/instance1/Product/lib:/home/rlx/rlmapp2/RLX/instance1/Config/lib -Dpricemaker.directory=/home/rlx/rlmapp2/RLX/instance1 -Dpricemaker.stopIfDisabledLookup=true -Ddb.Product= -Ddb.DriverClass= -Ddb.ConnectionString= -Ddb.User= -Ddb.Password= -Ddb.Schema= -classpath /home/rlx/rlmapp2/RLX/instance1/Product/properties:/home/rlx/rlmapp2/RLX/instance1/Product/lib/3rdParty.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/pmutil.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/pricemaker.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/sun/jmxremote_optional.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/sun/jmxremote.jar:/home/rlx/rlmapp2/RLX/instance1/Product/lib/sun/jmxri.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmexecution-3rdParty.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmexecution.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmpoints.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmpointsinterfaces.jar:/home/rlx/rlmapp2/RLX/data:/home/rlx/rlmapp2/RLX/lib/Utilities/riiutil.jar:/home/rlx/rlmapp2/RLX/lib/Utilities/3rdParty.jar:/home/rlx/rlmapp2/RLX/lib/Repository/embedded.jar:/home/rlx/rlmapp2/RLX/lib/Repository/3rdParty.jar:/home/rlx/rlmapp2/RLX/lib/Catalog/deployment.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rlmexecution.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib:/home/rlx/rlmapp2/RLX/instance1/Config/lib/support.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/plugins.jar:/home/rlx/rlmapp2/RLX/instance1/Config/lib/rp:/home/rlx/rlmapp2/RLX/master/Product/lib/oracle/10.2.0.5/ojdbc14.jar com.rii.pricemaker.engine.RatingEngine" }

I then tried opening and closing the config, and main.cf does not change.  I also made a change to the config and dumped it, and MonitorProcesses still looks fine.  I also tried setting MonitorProcesses using hares, and this works fine too.

What are the results if you run:

hares -display app_res_name -attribute MonitorProcesses

before and after you see the issue?

Also, are there any errors in the engine log?
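A quick way to check the engine log is to search it for the message ID from the original post and for the attribute name. The following is only a sketch: it assumes the default Linux engine log location /var/VRTSvcs/log/engine_A.log, and find_log_entries is a hypothetical helper name, not a VCS command.

```shell
#!/bin/bash
# Assumed default VCS engine log location on Linux; override ENGINE_LOG if yours differs.
ENGINE_LOG="${ENGINE_LOG:-/var/VRTSvcs/log/engine_A.log}"

# Hypothetical helper: print log lines (prefixed with their line numbers)
# that contain any of the given fixed strings.
find_log_entries() {
    local logfile="$1"; shift
    local term
    for term in "$@"; do
        # -F treats the term as a literal string; "|| true" keeps going on no match.
        grep -nF -- "$term" "$logfile" || true
    done
}

if [[ -r "$ENGINE_LOG" ]]; then
    find_log_entries "$ENGINE_LOG" "V-16-1-10283" "MonitorProcesses"
else
    echo "engine log not readable: $ENGINE_LOG" >&2
fi
```

Running it once before and once after the dump makes it easier to see whether anything was logged at the moment the attribute was reset.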

If you can't get it to work, then using MonitorProgram or PidFiles is a workaround instead of using MonitorProcesses.
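As a rough sketch of what switching to MonitorProgram could look like from the command line: the resource name app_res_name is the placeholder used above, the script path is hypothetical, and the script prints the commands by default (DRY_RUN=1) rather than running them. Check the hares options against the documentation for your VCS version before running this for real.

```shell
#!/bin/bash
# Sketch only: resource name and script path are placeholders, not real values.
RES="app_res_name"
MONITOR_SCRIPT="/usr/local/bin/monitor_pricemaker.sh"   # hypothetical path
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, anything else = execute them

# Print the command in dry-run mode, otherwise run it.
run() {
    if [[ "$DRY_RUN" == "1" ]]; then
        echo "$*"
    else
        "$@"
    fi
}

switch_to_monitor_program() {
    run haconf -makerw                                         # open config read-write
    run hares -modify "$RES" MonitorProgram "$MONITOR_SCRIPT"  # point agent at the script
    run hares -modify "$RES" MonitorProcesses -delete -keys    # clear the long process entry
    run haconf -dump -makero                                   # save and close the config
}

switch_to_monitor_program
```

Setting DRY_RUN=0 runs the commands on the cluster node; keeping the dry run as the default makes it harder to change a live configuration by accident.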

Mike

starflyfly
Level 6
Employee Accredited Certified

Try shortening the MonitorProcesses value and test again.

kjbss
Level 5
Partner Accredited

Firstly, as you describe the behaviour of VCS somehow mangling the setting of the MonitorProcesses attribute to:

MonitorProcesses={ "" } 

after you do a dump of the configuration:

haconf -dump -makero

Well, that is a big bad ugly bug that you should report and escalate to Tech Support. 

 

Secondly, as Mike suggests, use MonitorProgram instead of MonitorProcesses to avoid the bug.  Create a simple script that will allow you to reliably match the unique process via appropriate egrep regular expression-match, such as:

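#!/bin/bash
# Look up the user the resource runs as; fall back to root if the attribute is empty.
procUSER="$( hares -value pricemakerResourceName User )"
[[ -z $procUSER ]] && procUSER="root"

# -ww prints the full, untruncated command line; grep -E is equivalent to egrep.
if /bin/ps -u "$procUSER" -wwo args | grep -Eq '^java .*jar com\.rii\.pricemaker\.engine\.RatingEngine$' ; then
  # indicate to VCS ONLINE
  exit 0
else
  # indicate to VCS OFFLINE
  exit 1
fi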

Modify the pattern-matching portion until it reliably AND uniquely matches ONLY the process you are monitoring.
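One way to check uniqueness is to count how many command lines the pattern matches; you want exactly 1 while the application is running. The sketch below uses made-up sample lines in place of real ps output, and count_matches and sample_ps_output are hypothetical helper names. On a real node you would pipe /bin/ps -u <user> -wwo args into count_matches instead.

```shell
#!/bin/bash
# Pattern under test (dots escaped so they match literally).
PATTERN='^java .*jar com\.rii\.pricemaker\.engine\.RatingEngine$'

# Hypothetical helper: count the lines on stdin that match the pattern.
count_matches() {
    local pattern="$1"
    # grep -c prints the count; it exits non-zero on 0 matches, hence "|| true".
    grep -Ec -- "$pattern" || true
}

# Made-up sample lines standing in for the output of: /bin/ps -u <user> -wwo args
sample_ps_output() {
    printf '%s\n' \
        'java -server -classpath /opt/app/lib/ojdbc14.jar com.rii.pricemaker.engine.RatingEngine' \
        'java -server -classpath /opt/app/lib/other.jar com.example.OtherEngine' \
        '-bash'
}

matches="$(sample_ps_output | count_matches "$PATTERN")"
echo "matching processes: $matches"
```

A count of 0 means the pattern is too strict; a count above 1 means it also catches other processes and needs tightening.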

Double-check the appropriate exit codes against the Application agent documentation for your version of VCS...

This method will both avoid the bug you are witnessing and give you a much more reasonable main.cf file, as you will avoid using such a large MonitorProcesses definition.

2015-01-08: Updated and fixed above code snippet to work properly on Linux....