03-27-2014 02:11 AM
Hi
I installed Veritas 6.1 on Red Hat 6.4 64-bit, together with VOM 6.0. I created disk groups and volumes, then created failover service groups and mounted the volumes in those service groups.
When I restarted node1, the service groups failed over to node2 correctly, and when I restarted node2 the failover also worked. But when I try a manual switch, some service groups fail over and others do not, and the disk groups and volumes on the host stay in a stopping state.
Please help
Regards
04-01-2014 07:24 AM
You seem to have 'nested' mounts:
MountPoint = "/traksec/meam/live/tcanl"
MountPoint = "/traksec/meam/live/tcanl/jrn/alt"
MountPoint = "/traksec/meam/live/tcanl/jrn/pri"
MountPoint = "/traksec/meam/live/tcanl/wij"
MountPoint = "/traksec/meam/live/tcanl/app"
This means that /traksec/meam/live/tcanl must be mounted before all the other filesystems can be mounted.
Same with service group offline - all other filesystems must be unmounted before /traksec/meam/live/tcanl can be unmounted/offlined.
You need more dependencies:
TRAKSECVOL-TCANL-JRNALT requires TRAKSECVOL-TCANL
TRAKSECVOL-TCANL-JRNPRI requires TRAKSECVOL-TCANL
TRAKSECVOL-TCANLAPP requires TRAKSECVOL-TCANL
Your main.cf only shows TRAKSECVOL-TCANL-WIJ requires TRAKSECVOL-TCANL, none of the rest.
You are also missing the dependency of TRAKSECVOL-TCANL on the diskgroup:
TRAKSECVOL-TCANL requires TRAKSEC-INT
Please fix these dependencies, then use the dependency tree view in the Java GUI to check that they are correct.
When you offline and online the SG, you will be able to see resources going up and down in the correct order.
Another great utility to test your config is the VCS Simulator.
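To double-check the list of missing links by hand, here is a minimal Python sketch that derives the "requires" lines from nested mount points. The pairing of resource names to mount points in MOUNTS is an assumption (the names and the paths are listed separately above), and required_links is a hypothetical helper, not part of VCS.

```python
from pathlib import PurePosixPath

# Assumed pairing of Mount resource names to mount points. The thread lists
# the names and the paths separately, so verify this against your main.cf.
MOUNTS = {
    "TRAKSECVOL-TCANL": "/traksec/meam/live/tcanl",
    "TRAKSECVOL-TCANL-JRNALT": "/traksec/meam/live/tcanl/jrn/alt",
    "TRAKSECVOL-TCANL-JRNPRI": "/traksec/meam/live/tcanl/jrn/pri",
    "TRAKSECVOL-TCANL-WIJ": "/traksec/meam/live/tcanl/wij",
    "TRAKSECVOL-TCANLAPP": "/traksec/meam/live/tcanl/app",
}


def required_links(mounts):
    """Return (parent, child) pairs meaning 'parent requires child'.

    Each nested mount requires the closest mount point above it.
    """
    by_path = {PurePosixPath(path): name for name, path in mounts.items()}
    links = []
    for name, path in mounts.items():
        # .parents walks upward from the nearest directory to "/".
        for ancestor in PurePosixPath(path).parents:
            if ancestor in by_path:
                links.append((name, by_path[ancestor]))
                break
    return sorted(links)


if __name__ == "__main__":
    for parent, child in required_links(MOUNTS):
        print(f"{parent} requires {child}")
```

This only covers Mount-to-Mount links. The dependency of the top-level Mount on the DiskGroup resource still has to be added separately.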
04-01-2014 07:32 AM
Spot on, Marianne. I completely agree: you are facing these issues because of the nested mounts.
You need to set the dependencies as suggested above so that the nested mounts go online and offline in the correct order.
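To see what order that gives, here is a rough Python sketch that turns the "requires" lines from this thread into an online order and an offline order using a topological sort. It only illustrates the ordering rule (children online first, parents offline first). It is not how the VCS engine is implemented, and online_order and offline_order are hypothetical helper names.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# "X requires Y" pairs taken from the dependency list in this thread.
REQUIRES = [
    ("TRAKSECVOL-TCANL-JRNALT", "TRAKSECVOL-TCANL"),
    ("TRAKSECVOL-TCANL-JRNPRI", "TRAKSECVOL-TCANL"),
    ("TRAKSECVOL-TCANL-WIJ", "TRAKSECVOL-TCANL"),
    ("TRAKSECVOL-TCANLAPP", "TRAKSECVOL-TCANL"),
    ("TRAKSECVOL-TCANL", "TRAKSEC-INT"),
]


def online_order(requires):
    """Resources in an order where every child comes before its parents."""
    graph = {}
    for parent, child in requires:
        # TopologicalSorter expects node -> set of nodes that must come first.
        graph.setdefault(parent, set()).add(child)
    return list(TopologicalSorter(graph).static_order())


def offline_order(requires):
    """Offline is the reverse: parents go down before their children."""
    return list(reversed(online_order(requires)))


if __name__ == "__main__":
    print("online: ", " -> ".join(online_order(REQUIRES)))
    print("offline:", " -> ".join(offline_order(REQUIRES)))
```

With all five links in place, the diskgroup comes online first and goes offline last. If a link is missing, nothing forces that nested mount to be unmounted before the mount above it or before the diskgroup.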
Download the simulator from the link below, which also explains how to use it:
https://www-secure.symantec.com/connect/forums/sfha-solutions-601-using-veritas-cluster-server-simulator
Modify the configuration in the simulator and test the behavior. Once it has been tested successfully, you can apply it in production.
G
03-27-2014 09:22 AM
Hi,
Please paste the relevant snippet of engine_A.log so we can see what is happening.
Also, when the volumes are stopping, do you see any errors in the messages file?
Are you able to run normal vx commands like "vxdisk list" or "vxtask list" when this issue happens?
G
03-27-2014 10:05 AM
I attached engine_A.log.
I only see the error messages in VOM. I have not run any commands; I am only working in VOM.
Regards
03-30-2014 02:20 AM
These errors appear in the messages log:
Mar 30 12:05:45 TCSEC-CLU1 Had[23496]: VCS ERROR V-16-10031-1522 (TCSEC-CLU2) DiskGroup:TRAKSEC-LIVEANL:clean:Could not deport the disk group TRAKSEC-LIVEANL.
Mar 30 12:05:46 TCSEC-CLU1 Had[23496]: VCS ERROR V-16-2-13069 (TCSEC-CLU2) Resource(TRAKSEC-LIVEANL) - clean failed.
Mar 30 12:06:46 TCSEC-CLU1 Had[23496]: VCS ERROR V-16-2-13077 (TCSEC-CLU2) Agent is unable to offline resource(TRAKSEC-LIVEANL). Administrative intervention may be required.
Mar 30 12:06:47 TCSEC-CLU1 Had[23496]: VCS ERROR V-16-10031-1522 (TCSEC-CLU2) DiskGroup:TRAKSEC-LIVEANL:clean:Could not
03-30-2014 05:54 AM
Is this a production or a development configuration? Is it possible for you to stop VCS and move the filesystem manually without using VCS?
03-31-2014 09:59 PM
Hi,
I am a little confused by the timestamps: the messages you pasted above are from Mar 30, but the engine log you attached only goes up to Mar 27.
However, let's see what happened on Mar 27.
A switch was initiated manually:
2014/03/27 13:06:08 VCS INFO V-16-1-50135 User admin fired command: hagrp -switch TRAKPRI-LIVEINT TCPRI-CLU1 localclus from ::ffff:10.100.208.76
2014/03/27 13:06:08 VCS NOTICE V-16-1-10208 Initiating switch of group TRAKPRI-LIVEINT from system TCPRI-CLU2 to system TCPRI-CLU1
2014/03/27 13:06:08 VCS NOTICE V-16-1-10300 Initiating Offline of Resource ip3 (Owner: Unspecified, Group: TRAKPRI-LIVEINT) on System TCPRI-CLU2
2014/03/27 13:06:09 VCS INFO V-16-1-10305 Resource ip3 (Owner: Unspecified, Group: TRAKPRI-LIVEINT) is offline on TCPRI-CLU2 (VCS initiated)
2014/03/27 13:06:09 VCS NOTICE V-16-1-10300 Initiating Offline of Resource TRAKPRIVOL-INTJRNPRI (Owner: Unspecified, Group: TRAKPRI-LIVEINT) on System TCPRI-CLU2
2014/03/27 13:06:09 VCS NOTICE V-16-1-10300 Initiating Offline of Resource TRAKPRIVOL-INT (Owner: Unspecified, Group: TRAKPRI-LIVEINT) on System TCPRI-CLU2
2014/03/27 13:06:09 VCS NOTICE V-16-1-10300 Initiating Offline of Resource TRAKPRIVOL-INTJRNALT (Owner: Unspecified, Group: TRAKPRI-LIVEINT) on System TCPRI-CLU2
2014/03/27 13:06:10 VCS INFO V-16-2-13716 (TCPRI-CLU2) Resource(TRAKPRIVOL-INTJRNPRI): Output of the completed operation (offline)
Two resources reported an error saying the filesystem was not mounted:
2014/03/27 13:06:10 VCS INFO V-16-2-13716 (TCPRI-CLU2) Resource(TRAKPRIVOL-INTJRNPRI): Output of the completed operation (offline)
==============================================
umount: /trakpri/meam/live/int/jrn/pri: not mounted
==============================================
2014/03/27 13:06:10 VCS INFO V-16-1-10305 Resource TRAKPRIVOL-INTJRNPRI (Owner: Unspecified, Group: TRAKPRI-LIVEINT) is offline on TCPRI-CLU2 (VCS initiated)
2014/03/27 13:06:11 VCS INFO V-16-2-13716 (TCPRI-CLU2) Resource(TRAKPRIVOL-INTJRNALT): Output of the completed operation (offline)
==============================================
umount: /trakpri/meam/live/int/jrn/alt: not mounted
==============================================
vxvol then reported that several volumes could not be stopped:
2014/03/27 13:06:11 VCS INFO V-16-1-10305 Resource TRAKPRIVOL-INTJRNALT (Owner: Unspecified, Group: TRAKPRI-LIVEINT) is offline on TCPRI-CLU2 (VCS initiated)
2014/03/27 13:06:11 VCS NOTICE V-16-1-10300 Initiating Offline of Resource TRAKPRI-LIVEINT (Owner: Unspecified, Group: TRAKPRI-LIVEINT) on System TCPRI-CLU2
2014/03/27 13:06:11 VCS WARNING V-16-10031-1521 (TCPRI-CLU2) DiskGroup:TRAKPRI-LIVEINT:offline:The command *vxvol -g TRAKPRI-LIVEINT stopall* failed. Doing a forced stop.
2014/03/27 13:06:12 VCS ERROR V-16-10031-1522 (TCPRI-CLU2) DiskGroup:TRAKPRI-LIVEINT:offline:Could not deport the disk group TRAKPRI-LIVEINT.
2014/03/27 13:06:12 VCS INFO V-16-2-13716 (TCPRI-CLU2) Resource(TRAKPRI-LIVEINT): Output of the completed operation (offline)
==============================================
VxVM vxvol ERROR V-5-1-1220 Volume TRAKPRIVOL-INTJRNPRI is currently open or mounted
VxVM vxvol ERROR V-5-1-1220 Volume TRAKPRIVOL-INTJRNALT is currently open or mounted
VxVM vxvol WARNING V-5-1-1220 Volume TRAKPRIVOL-INTJRNPRI is currently open or mounted
VxVM vxvol WARNING V-5-1-1220 Volume TRAKPRIVOL-INTJRNALT is currently open or mounted
VxVM vxdg ERROR V-5-1-584 Disk group TRAKPRI-LIVEINT: Some volumes in the disk group are in use
==============================================
The diskgroup also went into a disabled state:
2014/03/27 15:11:27 VCS INFO V-16-2-13717 (TCPRI-CLU2) Output of the completed operation (imf_getnotification)
==============================================
Cannot continue monitoring event
Got notification for group: TRAKPRI-LIVETC
==============================================
2014/03/27 15:16:24 VCS CRITICAL V-16-10031-1533 (TCPRI-CLU2) DiskGroup:TRAKPRI-LIVEINT:monitor:**ADMINISTRATIVE HELP** required, disk group (TRAKPRI-LIVEINT) is *DISABLED* on the system .
2014/03/27 15:16:24 VCS WARNING V-16-10031-1521 (TCPRI-CLU2) DiskGroup:TRAKPRI-LIVEINT:clean:The command *vxvol -g TRAKPRI-LIVEINT stopall* failed. Doing a forced stop.
2014/03/27 15:16:24 VCS ERROR V-16-10031-1522 (TCPRI-CLU2) DiskGroup:TRAKPRI-LIVEINT:clean:Could not deport the disk group TRAKPRI-LIVEINT.
2014/03/27 15:16:25 VCS INFO V-16-2-13716 (TCPRI-CLU2) Resource(TRAKPRI-LIVEINT): Output of the completed operation (clean)
So, based on the above, my understanding is:
1. Check the system messages for the same time. Were there any storage-related issues then? A diskgroup going into the disabled state indicates that Volume Manager was unable to do I/O to the private region of the disks, so the configuration was marked disabled, which may be preventing further operations.
2. Verify the configuration, starting with the first volume that gives an error. I noticed the same error in previous attempts as well, and it always starts with this volume:
2014/03/25 12:25:02 VCS INFO V-16-2-13716 (TCPRI-CLU1) Resource(TRAKPRIVOL-INTJRNPRI): Output of the completed operation (offline)
==============================================
umount: /trakpri/meam/live/int/jrn/pri: not mounted
==============================================
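To pull only the problem lines out of a long engine log, a small Python sketch like the one below may help. The regular expression is an assumption based on the line format seen in the excerpts above, SAMPLE reuses three lines from this thread, and problems is a hypothetical helper name.

```python
import re

# Assumed line layout, based on the excerpts in this thread:
# "YYYY/MM/DD HH:MM:SS VCS <SEVERITY> <V-message-id> <text>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) VCS "
    r"(?P<severity>[A-Z]+) (?P<msgid>V-\d+-\d+-\d+) (?P<text>.*)$"
)


def problems(lines, severities=("WARNING", "ERROR", "CRITICAL")):
    """Return (timestamp, severity, message id, text) for matching lines."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("severity") in severities:
            hits.append(match.group("ts", "severity", "msgid", "text"))
    return hits


# Three lines copied from the log excerpts in this thread.
SAMPLE = [
    "2014/03/27 13:06:09 VCS INFO V-16-1-10305 Resource ip3 (Owner: Unspecified, Group: TRAKPRI-LIVEINT) is offline on TCPRI-CLU2 (VCS initiated)",
    "2014/03/27 13:06:11 VCS WARNING V-16-10031-1521 (TCPRI-CLU2) DiskGroup:TRAKPRI-LIVEINT:offline:The command *vxvol -g TRAKPRI-LIVEINT stopall* failed. Doing a forced stop.",
    "2014/03/27 13:06:12 VCS ERROR V-16-10031-1522 (TCPRI-CLU2) DiskGroup:TRAKPRI-LIVEINT:offline:Could not deport the disk group TRAKPRI-LIVEINT.",
]

if __name__ == "__main__":
    for ts, severity, msgid, text in problems(SAMPLE):
        print(ts, severity, msgid, text)
```

To use it on a real log, pass an open file object instead of SAMPLE. Command output between the ===== separators (such as the umount lines) does not match the pattern and is skipped, so read the surrounding lines in the log as well.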
Please attach main.cf here for review.
G
04-01-2014 02:19 AM
I attached main.cf
Thanks
04-03-2014 05:51 AM
TRAKSECVOL-TCANL-JRNALT requires TRAKSECVOL-TCANL
TRAKSECVOL-TCANL-JRNPRI requires TRAKSECVOL-TCANL
TRAKSECVOL-TCANLAPP requires TRAKSECVOL-TCANL
This was the problem.
Thank you all very much, and thanks to Mr. Gaurav Sangamnerkar.
04-04-2014 01:47 AM
You also need this dependency:
TRAKSECVOL-TCANL requires TRAKSEC-INT
Please verify the dependencies in the rest of the Service Groups as well, as you seem to have nested mounts in all of them.
PS:
I see that you had the same issue in July last year:
https://www-secure.symantec.com/connect/forums/faulted-node2