VCS Switch Over problem

one_who_waits
Level 3

Hi,

I have some problems when testing a cluster switch-over (2-node cluster, VCS 6.2, Solaris 10). We test with init 6. The active node always hangs with:

2014/02/09 08:07:36 VCS ERROR V-16-10001-11511 (node1) Volume:vol_v1:offline:Cannot find Volume v1, either the Volume is a Volume Set or Veritas Volume Manager configuration daemon (vxconfigd) is not in enabled mode
2014/02/09 08:07:36 VCS INFO V-16-6-15002 (node1) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resstatechange node1 mnt_1 ONLINE OFFLINE successfully
2014/02/09 08:07:37 VCS INFO V-16-2-13716 (node1) Resource(vol_v1): Output of the completed operation (offline)
==============================================
VxVM vxprint ERROR V-5-1-684 IPC failure: Configuration daemon is not accessible
==============================================

2014/02/09 08:07:37 VCS ERROR V-16-2-13064 (node1) Agent is calling clean for resource(vol_v1) because the resource is up even after offline completed.
2014/02/09 08:07:37 VCS ERROR V-16-10001-11511 (node1) Volume:vol_v1:clean:Cannot find Volume v1, either the Volume is a Volume Set or Veritas Volume Manager configuration daemon (vxconfigd) is not in enabled mode
2014/02/09 08:07:38 VCS INFO V-16-2-13716 (node1) Resource(vol_v1): Output of the completed operation (clean)
==============================================
VxVM vxprint ERROR V-5-1-684 IPC failure: Configuration daemon is not accessible
==============================================

 

I noticed that the init script /etc/rc0.d/K99vxvm-shutdown does stop the vxconfigd and also does "/usr/sbin/vxdctl -k stop".

My question is: do I still need any vxvm init scripts now that the upgrade from 5 to 6.1 is done and the SMF services are in place, or should I increase the timeouts of the VCS stop procedures?

Thanks a lot in advance!

Cheers

1 ACCEPTED SOLUTION

mikebounds
Level 6
Partner Accredited

I only have access to a 5.1 system.  5.0 has limited support so there are not many customers still running this - see https://sort.symantec.com/eosl/show_matrix:

Product                            Version  Platform        Country    End of Support Life Date  End of Limited Support Date  Release Date
Storage Foundation for UNIX/Linux  5.0 MP3  Solaris         Worldwide  2014-08-31                2012-06-07                   2008-10-06
                                            Solaris x86-64  Worldwide  2014-08-31                2012-06-07                   2008-10-06
                                            Linux           Worldwide  2014-08-31                2012-06-07                   2008-08-20
                                            AIX             Worldwide  2014-08-31                2012-06-07                   2008-10-06

So it is quite likely that VCS 6.0 was not tested with vxvm 5.0, which could be why vxvm is shutting down before VCS. Alternatively, you may have some obsolete rc scripts.

As before, I would shut VCS down manually before rebooting the system. This is good practice in any case, even after you have upgraded vxvm to 6.0: if you are shutting down node A, manually switch the service groups to node B first (to ensure they come online successfully on node B before node A goes down), or offline the groups if you are shutting both nodes down.
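
The manual sequence described above could be sketched roughly as follows. This is only an illustration: the function name print_evacuation_plan and the group and node names are made up, and the function just prints the commands (hagrp -switch, hagrp -state, hastop -local, init 6) so the plan can be reviewed before anything is run by hand.

```shell
#!/bin/sh
# Print the commands for evacuating a node before rebooting it.
# Nothing is executed; group and node names are placeholders from the caller.
print_evacuation_plan() {
    target="$1"; shift               # node that should take over, e.g. node2
    for group in "$@"; do
        # Move the group, then check that it reports ONLINE on the target.
        echo "hagrp -switch $group -to $target"
        echo "hagrp -state $group -sys $target"
    done
    # Stop VCS on the local node only once the groups are online elsewhere.
    echo "hastop -local"
    echo "init 6"
}
```

For example, print_evacuation_plan node2 sg1 prints the four commands for a single (hypothetical) service group sg1. Each hagrp -state check should report ONLINE before the next step is taken.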

Mike

10 REPLIES

mikebounds
Level 6
Partner Accredited

Normally VCS is shut down first and then vxvm. My guess is that in 5.0 both VCS and vxvm were shut down by rc scripts, and in 6.1 both are shut down by SMF services. Perhaps the upgrade from 5 to 6 did not remove the vxvm rc script, so vxvm is being shut down by the rc script before VCS is shut down by its SMF service.

Which services do you have? If you have an SMF service for vxvm, then I would disable the vxvm rc script.

Mike

one_who_waits
Level 3

Hi Mike,

thanks a lot for your quick answer. We have the following:

 

SMF

# svcs -a | egrep "vcs|gab|vxfen"
disabled       Feb_09   svc:/system/vcs-onenode:default
online         Feb_09   svc:/system/gab:default
online         Feb_09   svc:/system/vxfen:default
online         Feb_09   svc:/system/vcs:default
#
 

init scripts

 # ls -al /etc/rc?.d/{S,K}*{vcs,llt,vxfen,gab,vx}* 2>/dev/null
-r-x------   5 root     sys         3539 Nov 11  2009 /etc/rc0.d/K00vxrsyncd
-r-xr-xr-x   5 root     sys         2508 Nov 11  2009 /etc/rc0.d/K02vxnm-vxnetd
lrwxrwxrwx   1 root     root          27 Aug  4  2010 /etc/rc0.d/K750vxpal.gridnode -> //etc/init.d/vxpal.gridnode
lrwxrwxrwx   1 root     root          30 Aug  4  2010 /etc/rc0.d/K760vxpal.actionagent -> //etc/init.d/vxpal.actionagent
-r-xr-xr-x   2 root     sys         3068 Nov 11  2009 /etc/rc0.d/K99vxvm-shutdown
-r-x------   5 root     sys         3539 Nov 11  2009 /etc/rc1.d/K00vxrsyncd
-r-xr-xr-x   5 root     sys         2508 Nov 11  2009 /etc/rc1.d/K02vxnm-vxnetd
lrwxrwxrwx   1 root     root          27 Aug  4  2010 /etc/rc1.d/K750vxpal.gridnode -> //etc/init.d/vxpal.gridnode
lrwxrwxrwx   1 root     root          30 Aug  4  2010 /etc/rc1.d/K760vxpal.actionagent -> //etc/init.d/vxpal.actionagent
lrwxrwxrwx   1 root     other         18 Aug  4  2010 /etc/rc2.d/K50vxvail -> /etc/init.d/vxvail
lrwxrwxrwx   1 root     root          27 Aug  4  2010 /etc/rc2.d/K750vxpal.gridnode -> //etc/init.d/vxpal.gridnode
lrwxrwxrwx   1 root     other         30 Aug  4  2010 /etc/rc2.d/K75vxpal.StorageAgent -> /etc/init.d/vxpal.StorageAgent
lrwxrwxrwx   1 root     root          30 Aug  4  2010 /etc/rc2.d/K760vxpal.actionagent -> //etc/init.d/vxpal.actionagent
-rwxr-xr-x   1 root     bin         2370 Nov 28  2007 /etc/rc2.d/K99vxatd
lrwxrwxrwx   1 root     other         18 Aug  4  2010 /etc/rc2.d/S50vxvail -> /etc/init.d/vxvail
-rwxr-xr-x   1 root     bin         2370 Nov 28  2007 /etc/rc2.d/S70vxatd
lrwxrwxrwx   1 root     root          27 Aug  4  2010 /etc/rc2.d/S750vxpal.gridnode -> //etc/init.d/vxpal.gridnode
lrwxrwxrwx   1 root     other         30 Aug  4  2010 /etc/rc2.d/S75vxpal.StorageAgent -> /etc/init.d/vxpal.StorageAgent
lrwxrwxrwx   1 root     root          30 Aug  4  2010 /etc/rc2.d/S760vxpal.actionagent -> //etc/init.d/vxpal.actionagent
-r-x------   2 root     root        2441 Jul 19  2008 /etc/rc2.d/S92vxdcli
-r-xr-xr-x   5 root     sys         2508 Nov 11  2009 /etc/rc2.d/S94vxnm-vxnetd
-r-x------   5 root     sys         3539 Nov 11  2009 /etc/rc2.d/S96vxrsyncd
-rwxr-xr-x   1 root     sys         1310 May  6  2006 /etc/rc3.d/S99vxtf_chklog
-r-x------   5 root     sys         3539 Nov 11  2009 /etc/rcS.d/K00vxrsyncd
-r-xr-xr-x   5 root     sys         2508 Nov 11  2009 /etc/rcS.d/K02vxnm-vxnetd
lrwxrwxrwx   1 root     root          27 Aug  4  2010 /etc/rcS.d/K750vxpal.gridnode -> //etc/init.d/vxpal.gridnode
lrwxrwxrwx   1 root     root          30 Aug  4  2010 /etc/rcS.d/K760vxpal.actionagent -> //etc/init.d/vxpal.actionagent
 

Thanks.

Cheers

 

Daniel_Matheus
Level 4
Employee Accredited Certified

Hi,

 

your description is quite unclear.

If you do a cluster switch (i.e. hagrp -switch command) no init script will be called at all.

As for the error, it means that the vxconfigd is either not running or not in enabled mode.
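
One way to check this during a test is to look at what vxdctl mode reports. The helper below is a small sketch, not an official tool: vxconfigd_state is a made-up name, and it only classifies text it is given, on the assumption that vxdctl mode prints a line such as "mode: enabled", "mode: disabled" or "mode: not-running".

```shell
#!/bin/sh
# Classify the text printed by `vxdctl mode` (hypothetical helper).
vxconfigd_state() {
    # $1: output captured from `vxdctl mode`
    case "$1" in
        *"mode: enabled"*)     echo "enabled" ;;
        *"mode: disabled"*)    echo "disabled" ;;
        *"mode: not-running"*) echo "not-running" ;;
        *)                     echo "unknown" ;;
    esac
}
```

On the node itself this could be called as vxconfigd_state "$(vxdctl mode 2>&1)". Anything other than "enabled" at the moment VCS takes the Volume resource offline would be consistent with the V-16-10001-11511 error in the log.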

 

What is the current OS and SF version?

What was previous OS and SF version?

Was there a vxconfigd coredump generated?

Was vxconfigd killed or restarted?

Does vxconfigd start successfully after a reboot?

There should be errors in the syslog if vxconfigd was not able to start or core dumped.

regards,
Dan

 

mikebounds
Level 6
Partner Accredited

It looks to me as if you have vxvm as a service AND an rc script.

I have looked on a Solaris 10 server running 5.1, and "ls -al /etc/rc?.d/{S,K}*{vcs,llt,vxfen,gab,vx}*" does not match anything, so I think you should disable all these rc scripts and re-run your test.

You should also review your upgrade procedure. If you believe you followed the product guides and you have not manually changed any rc scripts, then you should raise a support call to say that the upgrade has not worked correctly.

I would have thought that removing the 5.0 packages would also have removed the rc scripts. It seems this happened for VCS, but not for vxvm.

Mike

one_who_waits
Level 3

Hi Mike,

vxvm is still 5.0, but VCS is 6.0.1:

# pkginfo -l VRTSvxvm
   PKGINST:  VRTSvxvm
      NAME:  Binaries for VERITAS Volume Manager by Symantec
  CATEGORY:  system
      ARCH:  sparc
   VERSION:  5.0,REV=05.11.2006.17.55
   BASEDIR:  /
    VENDOR:  Symantec Corporation
      DESC:  Virtual Disk Subsystem
    PSTAMP:  Veritas-5.0_MP3_RP3.3:2009-11-12
  INSTDATE:  Aug 04 2010 14:40
   HOTLINE:  http://support.veritas.com/phonesup/phonesup_ddProduct_.htm
     EMAIL:  support@veritas.com
    STATUS:  completely installed
     FILES:      995 installed pathnames
                  30 shared pathnames
                  13 linked files
                 112 directories
                 430 executables
              400303 blocks used (approx)

#

# pkginfo -l VRTSvcs 
   PKGINST:  VRTSvcs
      NAME:  Veritas Cluster Server by Symantec
  CATEGORY:  system
      ARCH:  sparc
   VERSION:  6.0.100.000
   BASEDIR:  /
    VENDOR:  Symantec Corporation
      DESC:  Veritas Cluster Server by Symantec
    PSTAMP:  6.0.300.000-GA-2013-01-17-16.00.00
  INSTDATE:  Apr 21 2013 10:15
    STATUS:  completely installed
     FILES:      281 installed pathnames
                  26 shared pathnames
                  57 directories
                 116 executables
              450978 blocks used (approx)

#
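
To make the mismatch between the two packages easy to spot, the VERSION line can be reduced to its major number. This is a sketch only: pkg_major_version is a made-up helper, and it assumes the pkginfo -l layout shown above, where the second field of the VERSION line starts with the version.

```shell
#!/bin/sh
# Read `pkginfo -l <pkg>` output on stdin and print the major version number.
pkg_major_version() {
    awk '$1 == "VERSION:" {
        # "5.0,REV=05.11.2006.17.55" and "6.0.100.000" both split on "." or ","
        split($2, parts, /[.,]/)
        print parts[1]
        exit
    }'
}
```

Used as pkginfo -l VRTSvxvm | pkg_major_version and pkginfo -l VRTSvcs | pkg_major_version, this gives 5 and 6 for the output above.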

 

Thanks for your help.

Cheers

one_who_waits
Level 3

Hi Mike,

I believe that you are right: if there are no S/K scripts for vx on Solaris 10 with 5.0 of VERITAS Volume Manager, which is exactly what I have, then I should mv the scripts away and try the test.
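
Moving a script away for a test could look roughly like the sketch below. The function name park_rc_script and the holding directory are made up for illustration, and the function only prints what it would do unless DRY_RUN=0 is set, so the script can be moved back if it turns out to be needed for vxvm 5.0.

```shell
#!/bin/sh
# Move one rc script out of its run-level directory so it is no longer run,
# keeping it in a holding directory from which it can be restored.
# DRY_RUN=1 (the default) only prints the planned move.
park_rc_script() {
    script="$1"          # e.g. /etc/rc0.d/K99vxvm-shutdown
    backup_dir="$2"      # hypothetical holding directory chosen by the caller
    if [ ! -e "$script" ] && [ ! -L "$script" ]; then
        echo "skip: $script not found" >&2
        return 1
    fi
    # Keep the run-level directory name (rc0.d, rc1.d, ...) in the destination.
    dest="$backup_dir/$(basename "$(dirname "$script")")"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would move $script to $dest/"
        return 0
    fi
    mkdir -p "$dest" && mv "$script" "$dest/"
}
```

Restoring is then a plain mv in the other direction.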

As mentioned, we always do the "init 6". To answer Dan: no, there are no core dumps of vxconfigd, and it always comes up without problems after the reboot (reset from XSCF). Previously we had VCS 5.1 and upgraded to 6.0.1 with the standard scripts from Veritas.

Thanks for your help.

Mike, did you run the command in bash (as I did)? Otherwise it does not work:

# ksh
#  ls -al /etc/rc?.d/{S,K}*{vcs,llt,vxfen,gab,vx}*
/etc/rc?.d/{S,K}*{vcs,llt,vxfen,gab,vx}*: No such file or directory
#
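
The ksh error above comes from the {S,K} and {vcs,...} brace expansion, which that shell passes through literally. A version that avoids brace expansion, and so should behave the same in sh, ksh and bash, might look like this. It is a sketch: list_vx_rc_scripts is a made-up name, and the optional argument replacing /etc exists only so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# List Veritas-related rc scripts using plain loops instead of brace expansion.
list_vx_rc_scripts() {
    rc_root="${1:-/etc}"             # optional override, defaults to /etc
    for prefix in S K; do
        for name in vcs llt vxfen gab vx; do
            for f in "$rc_root"/rc?.d/"$prefix"*"$name"*; do
                # An unmatched pattern is left as literal text; skip it.
                [ -e "$f" ] || [ -L "$f" ] || continue
                echo "$f"
            done
        done
    done | sort -u                   # "vx" also matches "vxfen", so de-duplicate
}
```

Running list_vx_rc_scripts with no argument prints one path per line for /etc/rc?.d. Pipe it through xargs ls -al to get the long listing.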
 

Thanks a lot guys.

Cheers

p.s.

# pkginfo -l VRTSvxvm | grep VERSION
   VERSION:  5.0,REV=05.11.2006.17.55
#

# pkginfo -l VRTSvcs | grep VERSION
   VERSION:  6.0.100.000
#
 

mikebounds
Level 6
Partner Accredited

Yes, the shell was bash, and I also did an ls in the rc directories to check what was there, but this was on a 5.1 system, not 5.0.

I thought I had seen that you had vxvm services, but looking now you only have vxvm rc scripts, which may be normal for 5.0.

Not sure if VCS 6.0 supports vxvm 5.0. I still think the basic problem is that vxvm is shutting down before VCS, but I don't know how Solaris orders rc scripts relative to SMF services, so this is what you should look at. If you can't get Solaris to shut down VCS first, then you will have to shut VCS down manually before rebooting systems. I presume this is just a temporary configuration until you upgrade vxvm.

Mike

 

one_who_waits
Level 3

Hi Mike,

I also have SMF vxvm services :

# svcs -a | egrep 'vx|vcs|gab|llt'
legacy_run     Jan_26   lrc:/etc/rc2_d/S70vxatd
legacy_run     Jan_26   lrc:/etc/rc2_d/S750vxpal_gridnode
legacy_run     Jan_26   lrc:/etc/rc2_d/S75vxpal_StorageAgent
legacy_run     Jan_26   lrc:/etc/rc2_d/S760vxpal_actionagent
legacy_run     Jan_26   lrc:/etc/rc2_d/S92vxdcli
legacy_run     Jan_26   lrc:/etc/rc2_d/S94vxnm-vxnetd
legacy_run     Jan_26   lrc:/etc/rc2_d/S96vxrsyncd
disabled       Jan_26   svc:/system/vcs-onenode:default
online         Jan_26   svc:/system/llt:default
online         Jan_26   svc:/system/gab:default
online         Jan_26   svc:/system/vxvm/vxvm-sysboot:default
online         Jan_26   svc:/system/vxvm/vxvm-startup1:default
online         Jan_26   svc:/system/vxvm/vxvm-startup2:default
online         Jan_26   svc:/system/vxvm/vxvm-startvc:default
online         Jan_26   svc:/system/vxvm/vxvm-reconfig:default
online         Jan_26   svc:/system/vxpbx:default
online         Jan_26   svc:/system/vxvm/vxvm-recover:default
online         Jan_26   svc:/system/vxfen:default
online         Jan_28   svc:/system/vcs:default
#
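
To separate what runs as a legacy rc script from what is a real SMF service in a listing like the one above, the svcs output can be filtered on its first column. This is a sketch with a made-up function name, legacy_vx_services. As far as I know, the legacy_run entries only cover start (S) scripts, so a kill script such as /etc/rc0.d/K99vxvm-shutdown would not show up here and still has to be checked with ls.

```shell
#!/bin/sh
# Read `svcs -a` output on stdin and print the Veritas-related entries that
# are legacy rc scripts (state legacy_run) rather than SMF services.
legacy_vx_services() {
    awk '$1 == "legacy_run" && $3 ~ /vx|vcs|gab|llt/ { print $3 }'
}
```

Typical use would be svcs -a | legacy_vx_services.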

Do you have such a combination somewhere? I would really like to know whether I need the rc scripts or not. To me, as you already confirmed, it looks like they are obsolete.

I have :

# pkginfo -l VRTSvxvm VRTSvcs | egrep 'NAME|VERS'
      NAME:  Veritas Cluster Server by Symantec
   VERSION:  6.0.100.000
      NAME:  Binaries for VERITAS Volume Manager by Symantec
   VERSION:  5.0,REV=05.11.2006.17.55
#

 

THANKS A LOT !

Cheers

one_who_waits
Level 3

Hi Mike,

thanks, you confirmed what seemed to me to be the likely problem, and I am going to take your advice.

Thanks a lot.

Cheers