Forum Discussion

solom
Level 4
11 years ago

Disk groups stopping

Hi

I installed Veritas 6.1 and VOM 6.0 on Red Hat 6.4 (64-bit), created disk groups and volumes, then created failover service groups and mounted the volumes in those service groups.

When I restart node1, the service groups fail over to node2 correctly, and when I restart node2 they fail over correctly as well. But when I try a manual switch, some service groups fail over and others do not, and the disk groups and volumes on the host are left in a stopping state.

 

Please help

 

Regards

  • You seem to have 'nested' mounts:

    MountPoint = "/traksec/meam/live/tcanl"

    MountPoint = "/traksec/meam/live/tcanl/jrn/alt"

    MountPoint = "/traksec/meam/live/tcanl/jrn/pri"

    MountPoint = "/traksec/meam/live/tcanl/wij"

    MountPoint = "/traksec/meam/live/tcanl/app"

     

    This means that /traksec/meam/live/tcanl must be mounted before all the other filesystems can be mounted.

    Same with service group offline - all other filesystems must be unmounted before /traksec/meam/live/tcanl can be unmounted/offlined.

    You need more dependencies: 

    TRAKSECVOL-TCANL-JRNALT requires TRAKSECVOL-TCANL
    TRAKSECVOL-TCANL-JRNPRI requires TRAKSECVOL-TCANL
    TRAKSECVOL-TCANLAPP requires TRAKSECVOL-TCANL

    Your main.cf only shows TRAKSECVOL-TCANL-WIJ requires TRAKSECVOL-TCANL, none of the rest.

    You are also missing TRAKSECVOL-TCANL dependency on the diskgroup:

    TRAKSECVOL-TCANL requires TRAKSEC-INT

     

    Please fix these dependencies, then use the dependency tree view in the Java GUI to check that they are correct.

    When you offline and online the SG, you will be able to see resources going up and down in the correct order.

    Another great utility to test your config is the VCS Simulator.
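
The dependency list above can also be derived mechanically from the mount points. The following is a minimal sketch, not part of VCS: the pairing of resource names to mount points is an assumption inferred from the names in this thread (the real main.cf was not posted here), and `nested_mount_dependencies` is a hypothetical helper.

```python
from pathlib import PurePosixPath

# ASSUMPTION: resource-name -> mount-point pairing inferred from the names in
# this thread; check it against the real main.cf before relying on it.
MOUNTS = {
    "TRAKSECVOL-TCANL": "/traksec/meam/live/tcanl",
    "TRAKSECVOL-TCANL-JRNALT": "/traksec/meam/live/tcanl/jrn/alt",
    "TRAKSECVOL-TCANL-JRNPRI": "/traksec/meam/live/tcanl/jrn/pri",
    "TRAKSECVOL-TCANL-WIJ": "/traksec/meam/live/tcanl/wij",
    "TRAKSECVOL-TCANLAPP": "/traksec/meam/live/tcanl/app",
}


def nested_mount_dependencies(mounts):
    """Return (child, parent) pairs where child must 'require' parent.

    The parent is the closest enclosing mount point among the given mounts.
    """
    by_path = {PurePosixPath(path): name for name, path in mounts.items()}
    pairs = []
    for name, path in sorted(mounts.items()):
        # Walk up the directory tree until another mount point is found.
        for ancestor in PurePosixPath(path).parents:
            if ancestor in by_path:
                pairs.append((name, by_path[ancestor]))
                break
    return pairs


if __name__ == "__main__":
    for child, parent in nested_mount_dependencies(MOUNTS):
        print(f"{child} requires {parent}")
```

This only covers Mount-to-Mount dependencies. The dependency of the top-level mount on the DiskGroup resource still has to be added separately.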

  • Spot on, Marianne. I completely agree: you are facing these issues because of the nested mounts.

    You need to set the dependencies as suggested above so that the nested mounts go online and offline in the correct order.

    Download the simulator from the link below, which also explains how to use it:

    https://www-secure.symantec.com/connect/forums/sfha-solutions-601-using-veritas-cluster-server-simulator

    Modify the configuration in the simulator and test the behavior. Once it has been tested successfully, you can apply it in production.

     

    G
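
To see what "correct order" means for the dependencies suggested in this thread, here is a rough sketch that turns `requires` lines into one valid online order and its reverse for offline. It is an illustration only: VCS brings independent resources online and offline in parallel, so this is just one valid linearization. `online_order` is a hypothetical helper, and `graphlib` needs Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# "parent requires child" lines as suggested in this thread.
REQUIRES = """
TRAKSECVOL-TCANL-JRNALT requires TRAKSECVOL-TCANL
TRAKSECVOL-TCANL-JRNPRI requires TRAKSECVOL-TCANL
TRAKSECVOL-TCANLAPP requires TRAKSECVOL-TCANL
TRAKSECVOL-TCANL-WIJ requires TRAKSECVOL-TCANL
TRAKSECVOL-TCANL requires TRAKSEC-INT
"""


def online_order(requires_text):
    """Return resources in an order where each child precedes its parents."""
    graph = {}
    for line in requires_text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[1] == "requires":
            parent, _, child = parts
            # The child must be online before the parent can come online.
            graph.setdefault(parent, set()).add(child)
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    order = online_order(REQUIRES)
    print("online :", " -> ".join(order))
    print("offline:", " -> ".join(reversed(order)))
```

With these dependencies the disk group comes online first and the top-level mount second, and on offline the nested mounts are unmounted before their parent.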

  • Hi,

    Please paste the relevant snippet of engine_A.log so we can see what is happening.

    Also, when the volumes are stopping, do you see any errors in the messages file?

    Are you able to run normal vx commands such as "vxdisk list" or "vxtask list" when this issue happens?

     

    G

  • These errors appear in the messages log:

     

    code:

    Mar 30 12:05:45 TCSEC-CLU1 Had[23496]: VCS ERROR V-16-10031-1522 (TCSEC-CLU2) DiskGroup:TRAKSEC-LIVEANL:clean:Could not deport the disk group TRAKSEC-LIVEANL.
    Mar 30 12:05:46 TCSEC-CLU1 Had[23496]: VCS ERROR V-16-2-13069 (TCSEC-CLU2) Resource(TRAKSEC-LIVEANL) - clean failed.
    Mar 30 12:06:46 TCSEC-CLU1 Had[23496]: VCS ERROR V-16-2-13077 (TCSEC-CLU2) Agent is unable to offline resource(TRAKSEC-LIVEANL). Administrative intervention may be required.
    Mar 30 12:06:47 TCSEC-CLU1 Had[23496]: VCS ERROR V-16-10031-1522 (TCSEC-CLU2) DiskGroup:TRAKSEC-LIVEANL:clean:Could not


  • Is this a production or development configuration? Is it possible for you to stop VCS and move the filesystems manually, without using VCS?

     

  • Hi,

    I am a little confused by the timestamps: the messages you pasted above are from Mar 30, but the engine log you attached only has entries up to Mar 27.

    However, let's see what happened on Mar 27.

    A manually initiated switch:

    2014/03/27 13:06:08 VCS INFO V-16-1-50135 User admin fired command: hagrp -switch TRAKPRI-LIVEINT  TCPRI-CLU1  localclus  from ::ffff:10.100.208.76


    2014/03/27 13:06:08 VCS NOTICE V-16-1-10208 Initiating switch of group TRAKPRI-LIVEINT from system TCPRI-CLU2 to system TCPRI-CLU1
    2014/03/27 13:06:08 VCS NOTICE V-16-1-10300 Initiating Offline of Resource ip3 (Owner: Unspecified, Group: TRAKPRI-LIVEINT) on System TCPRI-CLU2
    2014/03/27 13:06:09 VCS INFO V-16-1-10305 Resource ip3 (Owner: Unspecified, Group: TRAKPRI-LIVEINT) is offline on TCPRI-CLU2 (VCS initiated)
    2014/03/27 13:06:09 VCS NOTICE V-16-1-10300 Initiating Offline of Resource TRAKPRIVOL-INTJRNPRI (Owner: Unspecified, Group: TRAKPRI-LIVEINT) on System TCPRI-CLU2
    2014/03/27 13:06:09 VCS NOTICE V-16-1-10300 Initiating Offline of Resource TRAKPRIVOL-INT (Owner: Unspecified, Group: TRAKPRI-LIVEINT) on System TCPRI-CLU2
    2014/03/27 13:06:09 VCS NOTICE V-16-1-10300 Initiating Offline of Resource TRAKPRIVOL-INTJRNALT (Owner: Unspecified, Group: TRAKPRI-LIVEINT) on System TCPRI-CLU2
    2014/03/27 13:06:10 VCS INFO V-16-2-13716 (TCPRI-CLU2) Resource(TRAKPRIVOL-INTJRNPRI): Output of the completed operation (offline)

    Two resources reported a "not mounted" error while being taken offline:

    2014/03/27 13:06:10 VCS INFO V-16-2-13716 (TCPRI-CLU2) Resource(TRAKPRIVOL-INTJRNPRI): Output of the completed operation (offline)
    ==============================================
    umount: /trakpri/meam/live/int/jrn/pri: not mounted
    ==============================================

    2014/03/27 13:06:10 VCS INFO V-16-1-10305 Resource TRAKPRIVOL-INTJRNPRI (Owner: Unspecified, Group: TRAKPRI-LIVEINT) is offline on TCPRI-CLU2 (VCS initiated)
    2014/03/27 13:06:11 VCS INFO V-16-2-13716 (TCPRI-CLU2) Resource(TRAKPRIVOL-INTJRNALT): Output of the completed operation (offline)
    ==============================================
    umount: /trakpri/meam/live/int/jrn/alt: not mounted
    ==============================================

    vxvol then reported that multiple volumes could not be stopped:

    2014/03/27 13:06:11 VCS INFO V-16-1-10305 Resource TRAKPRIVOL-INTJRNALT (Owner: Unspecified, Group: TRAKPRI-LIVEINT) is offline on TCPRI-CLU2 (VCS initiated)
    2014/03/27 13:06:11 VCS NOTICE V-16-1-10300 Initiating Offline of Resource TRAKPRI-LIVEINT (Owner: Unspecified, Group: TRAKPRI-LIVEINT) on System TCPRI-CLU2
    2014/03/27 13:06:11 VCS WARNING V-16-10031-1521 (TCPRI-CLU2) DiskGroup:TRAKPRI-LIVEINT:offline:The command *vxvol -g TRAKPRI-LIVEINT stopall* failed. Doing a forced stop.
    2014/03/27 13:06:12 VCS ERROR V-16-10031-1522 (TCPRI-CLU2) DiskGroup:TRAKPRI-LIVEINT:offline:Could not deport the disk group TRAKPRI-LIVEINT.
    2014/03/27 13:06:12 VCS INFO V-16-2-13716 (TCPRI-CLU2) Resource(TRAKPRI-LIVEINT): Output of the completed operation (offline)
    ==============================================
    VxVM vxvol ERROR V-5-1-1220 Volume TRAKPRIVOL-INTJRNPRI is currently open or mounted
    VxVM vxvol ERROR V-5-1-1220 Volume TRAKPRIVOL-INTJRNALT is currently open or mounted
    VxVM vxvol WARNING V-5-1-1220 Volume TRAKPRIVOL-INTJRNPRI is currently open or mounted
    VxVM vxvol WARNING V-5-1-1220 Volume TRAKPRIVOL-INTJRNALT is currently open or mounted
    VxVM vxdg ERROR V-5-1-584 Disk group TRAKPRI-LIVEINT: Some volumes in the disk group are in use
    ==============================================

    The disk group also went into a disabled state:

    2014/03/27 15:11:27 VCS INFO V-16-2-13717 (TCPRI-CLU2) Output of the completed operation (imf_getnotification)
    ==============================================
    Cannot continue monitoring event
    Got notification for group: TRAKPRI-LIVETC

    ==============================================

    2014/03/27 15:16:24 VCS CRITICAL V-16-10031-1533 (TCPRI-CLU2) DiskGroup:TRAKPRI-LIVEINT:monitor:**ADMINISTRATIVE HELP** required, disk group (TRAKPRI-LIVEINT) is *DISABLED* on the system .
    2014/03/27 15:16:24 VCS WARNING V-16-10031-1521 (TCPRI-CLU2) DiskGroup:TRAKPRI-LIVEINT:clean:The command *vxvol -g TRAKPRI-LIVEINT stopall* failed. Doing a forced stop.
    2014/03/27 15:16:24 VCS ERROR V-16-10031-1522 (TCPRI-CLU2) DiskGroup:TRAKPRI-LIVEINT:clean:Could not deport the disk group TRAKPRI-LIVEINT.
    2014/03/27 15:16:25 VCS INFO V-16-2-13716 (TCPRI-CLU2) Resource(TRAKPRI-LIVEINT): Output of the completed operation (clean)
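
The excerpts above were picked out of the engine log by hand. The same filtering can be sketched in a few lines, assuming every entry follows the `date time VCS SEVERITY V-x-x-x text` layout seen in this thread. The regular expression and the `problems` helper are illustrative assumptions, not a Veritas tool, and the sample lines are copied from the log quoted above.

```python
import re

# ASSUMPTION: entry layout inferred from the engine log lines in this thread.
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) VCS "
    r"(?P<severity>[A-Z]+) (?P<msg_id>V-\d+-\d+-\d+) (?P<text>.*)$"
)

SAMPLE = """\
2014/03/27 13:06:11 VCS NOTICE V-16-1-10300 Initiating Offline of Resource TRAKPRI-LIVEINT (Owner: Unspecified, Group: TRAKPRI-LIVEINT) on System TCPRI-CLU2
2014/03/27 13:06:11 VCS WARNING V-16-10031-1521 (TCPRI-CLU2) DiskGroup:TRAKPRI-LIVEINT:offline:The command *vxvol -g TRAKPRI-LIVEINT stopall* failed. Doing a forced stop.
2014/03/27 13:06:12 VCS ERROR V-16-10031-1522 (TCPRI-CLU2) DiskGroup:TRAKPRI-LIVEINT:offline:Could not deport the disk group TRAKPRI-LIVEINT.
2014/03/27 15:16:24 VCS CRITICAL V-16-10031-1533 (TCPRI-CLU2) DiskGroup:TRAKPRI-LIVEINT:monitor:**ADMINISTRATIVE HELP** required, disk group (TRAKPRI-LIVEINT) is *DISABLED* on the system .
"""


def problems(log_text, severities=("WARNING", "ERROR", "CRITICAL")):
    """Return (timestamp, severity, message id) for entries worth a look."""
    found = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        # Command output and separator lines do not match and are skipped.
        if match and match.group("severity") in severities:
            found.append(
                (match.group("ts"), match.group("severity"), match.group("msg_id"))
            )
    return found


if __name__ == "__main__":
    for ts, severity, msg_id in problems(SAMPLE):
        print(ts, severity, msg_id)
```

Lines that are not log entries, such as the `umount:` output and the `====` separators, are ignored, so they still need to be read in context next to the entry that precedes them.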

     

    So, based on the above, my understanding is:

    1. Check the system messages for the same time period. Were there any storage-related issues at that time? A disk group going into the disabled state indicates that Volume Manager was unable to perform I/O to the private region of the disks, so the configuration was marked disabled, which may be preventing further operations.

    2. Verify the configuration, starting with the first volume that reports an error. I noticed the same error in previous attempts as well, and it always starts from this volume:

    2014/03/25 12:25:02 VCS INFO V-16-2-13716 (TCPRI-CLU1) Resource(TRAKPRIVOL-INTJRNPRI): Output of the completed operation (offline)
    ==============================================
    umount: /trakpri/meam/live/int/jrn/pri: not mounted
    ==============================================


    Please attach main.cf here for review.

     

    G

  • TRAKSECVOL-TCANL-JRNALT requires TRAKSECVOL-TCANL
    TRAKSECVOL-TCANL-JRNPRI requires TRAKSECVOL-TCANL
    TRAKSECVOL-TCANLAPP requires TRAKSECVOL-TCANL

     

    These missing dependencies were the problem.

    Thank you very much, everyone, and thanks to Mr. Gaurav Sangamnerkar.