Application resource is in w_offline state

pawan_n
Level 3

Hi All,

I have created an Application resource in a two-node Veritas cluster.

It does not depend on any other resource.

On the primary node its state is correct (OFFLINE), but on the secondary node it is stuck in the w_offline state. The logs show that the cluster is unable to bring the resource to the OFFLINE state on the secondary node.

I tried hastop -local on the secondary node followed by hastart, but no luck.

I even tried deleting my Application resource and re-creating it under a different name, but the result is the same.

Any help would be appreciated!!

 

Thanks,

Pawan

2 ACCEPTED SOLUTIONS: the replies by RiaanBadenhorst and mikebounds near the end of this thread.

11 REPLIES

RiaanBadenhorst
Level 6
Partner    VIP    Accredited Certified

Review the application agent log in /var/VRTSvcs/log/
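
For example, something along these lines pulls the latest entries for one resource out of the logs. This is only a sketch: it assumes the usual file names engine_A.log and Application_A.log in that directory, and recent_resource_lines and LOG_DIR are names made up for illustration.

```sh
#!/bin/sh
# Print the most recent log lines that mention a given resource.
# Assumption: default VCS log directory and the file names
# engine_A.log (engine log) and Application_A.log (agent log).
LOG_DIR=${LOG_DIR:-/var/VRTSvcs/log}

recent_resource_lines() {
    resource=$1
    for log in "$LOG_DIR/engine_A.log" "$LOG_DIR/Application_A.log"; do
        # Skip files that do not exist or cannot be read.
        [ -r "$log" ] || continue
        echo "== $log"
        # -F: treat the resource name as a fixed string, not a regex.
        grep -F -- "$resource" "$log" | tail -n 20
    done
}
```

Called as recent_resource_lines BPPMAPP_RES (substitute your own resource name), it prints up to 20 matching lines per log file.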

Sacheen_Birhade
Level 2

Hi Pawan,

 

Can you please post logs?

Marianne
Level 6
Partner    VIP    Accredited Certified

Please post Application config in main.cf along with VCS logs.

pawan_n
Level 3

Hi All,

 

The issue is resolved now.

I was directly bringing my application up without bringing its resource ONLINE.

I first brought the resource ONLINE & then started my Application & the issue is resolved.

 

Thanks,

Pawan.

RiaanBadenhorst
Level 6
Partner    VIP    Accredited Certified

"I first brought the resource ONLINE & then started my Application & the issue is resolved." <<< That doesn't make sense.

 

Onlining the resource should start your application, unless there is some strange two-tiered process to start this application.

pawan_n
Level 3

Hi Riaan,

Sorry for the confusion from my end.

Actually, bringing the Application resource ONLINE does indeed start my application automatically, because of the StartProgram attribute.

However, I had to write a monitor program to detect the OFFLINE state of my application and use the MonitorProgram attribute, because the MonitorProcesses attribute didn't work for me.

My application's process string is 28 lines long (as per ps -ef output), and specifying a substring of it in MonitorProcesses doesn't help because the attribute requires the exact string.

My question is: can we specify such a huge string (28 lines in my case) in MonitorProcesses?

 

Thanks,

Pawan

RiaanBadenhorst
Level 6
Partner    VIP    Accredited Certified

Can you post your main.cf and details about the application (ps -ef output etc.)?

pawan_n
Level 3

Hi Riaan,

Here you go. The group we are concerned with is CLSGBPPM:

-----------------------------------------------------------------------------------------------------

main.cf

 

cluster CLMBPPM (
    UserNames = { admin = JijBidIfjEjjHrjDig }
    Administrators = { admin }
    HacliUserLevel = COMMANDROOT
    )

system ithvvm606 (
    )

system ithvvm607 (
    )

group CLSGBPPM (
    SystemList = { ithvvm606 = 0, ithvvm607 = 1 }
    AutoStartList = { ithvvm606, ithvvm607 }
    )

    Application BPPMAPP_RES (
        StartProgram = "/opt/bmc/ProactiveNet/pw/pronto/bin/pw system start"
        StopProgram = "/opt/bmc/ProactiveNet/pw/pronto/bin/pw system stop"
        MonitorProgram = "/opt/bmc/ProactiveNet/bppmprog.sh"
        )

    DiskGroup CLSG_appdg (
        Critical = 0
        DiskGroup = appdg
        StartVolumes = 0
        )

    IP CLSG_IP (
        Device = eth0
        Address = "10.119.155.208"
        NetMask = "255.255.255.0"
        )

    Mount CLSG_MNT (
        Critical = 0
        MountPoint = "/opt/bmc"
        BlockDevice = "/dev/vx/dsk/appdg/opt_bmc_vol"
        FSType = vxfs
        FsckOpt = "-y"
        )

    NIC CLSG_NIC (
        Device = "eth0:0"
        )

    NIC appNIC (
        Device = eth0
        )

    Volume CLSG_VOL (
        Critical = 0
        DiskGroup = appdg
        Volume = opt_bmc_vol
        )

    BPPMAPP_RES requires CLSG_MNT
    CLSG_MNT requires CLSG_VOL
    CLSG_VOL requires CLSG_appdg


    // resource dependency tree
    //
    //    group CLSGBPPM
    //    {
    //    Application BPPMAPP_RES
    //        {
    //        Mount CLSG_MNT
    //            {
    //            Volume CLSG_VOL
    //                {
    //                DiskGroup CLSG_appdg
    //                }
    //            }
    //        }
    //    IP CLSG_IP
    //    NIC CLSG_NIC
    //    NIC appNIC
    //    }


group cvm (
    SystemList = { ithvvm606 = 0, ithvvm607 = 1 }
    AutoFailOver = 0
    Parallel = 1
    AutoStartList = { ithvvm606, ithvvm607 }
    )

    CFSfsckd vxfsckd (
        )

    CVMCluster cvm_clus (
        CVMClustName = CLMBPPM
        CVMNodeId = { ithvvm606 = 0, ithvvm607 = 1 }
        CVMTransport = gab
        CVMTimeout = 200
        )

    CVMVxconfigd cvm_vxconfigd (
        Critical = 0
        CVMVxconfigdArgs = { syslog }
        )

    ProcessOnOnly vxattachd (
        Critical = 0
        PathName = "/bin/sh"
        Arguments = "- /usr/lib/vxvm/bin/vxattachd root"
        RestartLimit = 3
        )

    cvm_clus requires cvm_vxconfigd
    vxfsckd requires cvm_clus


    // resource dependency tree
    //
    //    group cvm
    //    {
    //    ProcessOnOnly vxattachd
    //    CFSfsckd vxfsckd
    //        {
    //        CVMCluster cvm_clus
    //            {
    //            CVMVxconfigd cvm_vxconfigd
    //            }
    //        }
    //    }


------------------------------------------------------------------------------------------------

We have to monitor 7-8 processes with huge command strings. PID files are created for only 2 of them, so the PidFiles attribute is of almost no use.

ps -ef output of one of the processes to be monitored:

root     17692     1  0 04:37 ?        00:03:27 /usr/pw/jre/bin/java -server -Djserver=DUMMY -Djava.net.namelookup.cache=0 -Dsun.net.inetaddr.ttl=0 -Djava2d.font.usePlatformFont=false -Djava.awt.headless=true -Djava.awt.fonts=/usr/pw/jre/lib/fonts -Duser.home=/usr/pw/jre -Djava.security.policy=/usr/pw/pronto/conf/java.security.policy -Djava.system.class.loader=com.proactivenet.reports.util.ReportClassLoader -Dpronet.demonname=ProactiveNet_JServer -Dpronet.properties=/usr/pw/pronto/conf/pronet.conf -Dpronet.rate.cell.cellName=pncell_CLMBPPM -Dias.home=/usr/pw/pronto -Dcloud.adapter.home=/usr/pw/pronto -Dmarimba.cas.tool.dir=/usr/pw/pronto/conf -Djava.security.auth.login.config=/usr/pw/pronto/conf/jaas.config -Djava.util.logging.config.file=/usr/pw/pronto/conf/ias_logging.properties -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Dlog4j.configuration=file:/usr/pw/tomcat/conf/log4j.properties -Dcatalina.base=/usr/pw/tomcat -Dcatalina.home=/usr/pw/tomcat -Djava.io.tmpdir=/usr/pw/tomcat/temp -Daxis2.repo=/usr/pw/tomcat/webapps/bppmws/WEB-INF -Daxis2.xml=/usr/pw/tomcat/webapps/bppmws/WEB-INF/conf/axis2.xml -Dexpression.lib.home=/usr/pw/pronto -Djavax.net.ssl.trustStore=/usr/pw/pronto/conf/pnserver.ks -Djavax.net.ssl.trustStorePassword=get2net -Dcellservice.impactmanager.connectionpoolsize=10 -Dcellservice.CacheEnabled=true -Dhttp.maxConnections=500 -Xms64m -Xmx2048m -XX:MaxPermSize=256m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/usr/pw/pronto/logs -XX:+UseCompressedOops -classpath 
/usr/pw/apps3rdparty/jdbc/ojdbc6.jar:/usr/pw/apps3rdparty/antlr-2.7.6.jar:/usr/pw/apps3rdparty/aopalliance.jar:/usr/pw/apps3rdparty/aspectjweaver.jar:/usr/pw/apps3rdparty/c3p0-0.9.1.2.jar:/usr/pw/apps3rdparty/commons-collections-3.2.1.jar:/usr/pw/apps3rdparty/commons-logging-1.1.1.jar:/usr/pw/apps3rdparty/dom4j-1.6.1.jar:/usr/pw/apps3rdparty/hibernate-annotations.jar:/usr/pw/apps3rdparty/hibernate-commons-annotations.jar:/usr/pw/apps3rdparty/hibernate-entitymanager.jar:/usr/pw/apps3rdparty/hibernate3.jar:/usr/pw/apps3rdparty/javassist-3.4.GA.jar:/usr/pw/apps3rdparty/jdbc/jconn4.jar:/usr/pw/apps3rdparty/jta.jar:/usr/pw/apps3rdparty/log4j/log4j.jar:/usr/pw/apps3rdparty/slf4j-api-1.5.0.jar:/usr/pw/apps3rdparty/slf4j-log4j12-1.5.0.jar:/usr/pw/apps3rdparty/persistence.jar:/usr/pw/apps3rdparty/jndi/jndi.jar:/usr/pw/apps3rdparty/jndi/ldap.jar:/usr/pw/apps3rdparty/jndi/providerutil.jar:/usr/pw/apps3rdparty/jndi/jboss-common.jar:/usr/pw/apps3rdparty/jndi/jnp-client.jar:/usr/pw/apps3rdparty/graphing/elegant/common.jar:/usr/pw/apps3rdparty/graphing/elegant/dialgauge.jar:/usr/pw/apps3rdparty/imcomm/imcomm.jar:/usr/pw/jre/lib/rt.jar:/usr/pw/jre/lib/tools.jar:/usr/pw/jboss/client/jnp-client.jar:/usr/pw/jboss/client/log4j.jar:/usr/pw/jboss/lib/jboss-common.jar:/usr/pw/sybase/asa/java/jlogon.jar:/usr/pw/sybase/asa/java/jodbc.jar:/usr/pw/tomcat/lib/servlet-api.jar:/usr/pw/tomcat/bin/tomcat-juli.jar:/usr/pw/tomcat/lib/tomcat-util.jar:/usr/pw/tomcat/lib/tomcat-coyote.jar:/usr/pw/tomcat/lib/tomcat-api.jar:/usr/pw/tomcat/bin/bootstrap.jar:/usr/pw/tomcat/webapps/pronto/WEB-INF/lib/spring-web.jar:/usr/pw/pronto/lib/pw_server.jar:/usr/pw/pronto/lib/multiserver/multiserver.jar:/usr/pw/pronto/lib/multiserver/dbmigration.jar:/usr/pw/pronto/lib/pw_client.jar:/usr/pw/pronto/lib/pw_util.jar:/usr/pw/pronto/lib/pw_agent.jar:/usr/pw/pronto/lib/domain.jar:/usr/pw/pronto/lib/cellservice.jar:/usr/pw/pronto/lib/psservice.jar:/usr/pw/pronto/lib/ngp_service.jar:/usr/pw/pronto/lib/genericdao.jar:/usr/pw/p
ronto/lib/ngp-persistence.jar:/usr/pw/pronto/lib/atrium-ws-client-7.5.jar:/usr/pw/pronto/lib/atrium-ws-client-2.1.jar:/usr/pw/pronto/lib/blwsclient.jar:/usr/pw/pronto/lib/bppmRoutingService.jar:/usr/pw/pronto/lib/configutil.jar:/usr/pw/pronto/lib/ptools.jar:/usr/pw/pronto/lib/i18n_service.jar:/usr/pw/apps3rdparty/ice/ice.jar:/usr/pw/apps3rdparty/xerces/xercesImpl.jar:/usr/pw/apps3rdparty/xerces/xercesSamples.jar:/usr/pw/apps3rdparty/xerces/xmlParserAPIs.jar:/usr/pw/apps3rdparty/graphing/sitraka/jcschart.jar:/usr/pw/colt/colt.jar:/usr/pw/pronto/lib/pw_applet.jar:/usr/pw/pronto/lib/pproxy_ui.jar:/usr/pw/pronto/li


----------------------------------------------------------------------------------------------------------------------------

Thanks ,

Pawan

 

RiaanBadenhorst
Level 6
Partner    VIP    Accredited Certified

That is quite a list :)

 

I would say stick with the monitor program you wrote. It would need to exit with 100 (offline) or 110 (online).
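
A minimal sketch of such a monitor program, not Pawan's actual script (which was not posted): it assumes each required process can be recognised by a fixed substring of its command line. Only the pronet.demonname=ProactiveNet_JServer string comes from the ps output above; REQUIRED_PATTERNS and monitor_state are names invented for the example.

```sh
#!/bin/sh
# Sketch of a MonitorProgram for the Application agent.
# Exit codes expected by the agent: 110 = online, 100 = offline.

# Assumption: one fixed substring per required process, one per line.
# Only the JServer entry is taken from this thread; the other
# processes would need their own lines.
REQUIRED_PATTERNS='pronet.demonname=ProactiveNet_JServer'

# Reads a process listing (one command line per row) on stdin.
# Returns 110 if every pattern occurs somewhere in it, 100 otherwise.
monitor_state() {
    listing=$(cat)
    while IFS= read -r pattern; do
        [ -n "$pattern" ] || continue
        case $listing in
            *"$pattern"*) ;;        # found, check the next pattern
            *) return 100 ;;        # missing: report offline
        esac
    done <<EOF
$REQUIRED_PATTERNS
EOF
    return 110
}

# As the MonitorProgram, the script has to finish by exiting with that
# code, for example:
#   ps -eo args | monitor_state; exit $?
```

Matching is done inside the shell on a listing captured up front, so no grep process carrying the pattern in its own arguments can show up in the ps output and cause a false "online".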

mikebounds
Level 6
Partner Accredited

You can't use pattern matching with the Application agent, so your choices are:

  1. Use MonitorProgram
  2. Use PidFiles
  3. Use Agent Builder, where you can use pattern matching via the "MonitorProcessPatterns" attribute (see https://sort.symantec.com/agents/detail/5711)
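
If you go with option 1, the two processes that do write PID files could be checked by PID inside the monitor program rather than through the PidFiles attribute. A rough sketch, assuming each file contains only the numeric PID; pid_file_alive is a hypothetical helper name, and the real PID file paths are not given in this thread.

```sh
#!/bin/sh
# Returns 0 if the PID file is readable and names a live process,
# 1 otherwise.
# Assumption: the file holds just the numeric PID, and the caller runs
# as root (kill -0 fails on other users' processes otherwise).
pid_file_alive() {
    pid_file=$1
    [ -r "$pid_file" ] || return 1
    pid=$(cat "$pid_file")
    # Reject empty or non-numeric content.
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    # Signal 0 only tests whether the process exists.
    kill -0 "$pid" 2>/dev/null || return 1
    return 0
}
```

The monitor program would report offline (exit 100) as soon as one of these checks fails.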

Mike

pawan_n
Level 3

Hi All,

Thanks for your inputs.

I have written a monitor program that detects the ONLINE or OFFLINE state of my application, and it's working fine for me. It's a simple shell script that monitors the running state of my application.

 

Thanks,

Pawan