Forum Discussion

sanderfiers
11 years ago

SG goes into Partial state if Native LVMVG is imported and activated outside VCS control

Environment:

  • Cluster: Veritas Cluster Server 6.1.0.000
  • OS: Red Hat Enterprise Linux Server release 6.4 (Santiago)
  • Volume Manager: Red Hat LVM
  • Multipathing: Veritas Dynamic Multi Pathing
     

Introduction:

I have a two-node Veritas Cluster on Red Hat. My volumes are under LVM (customer standard).
Since Symantec does not support Device Mapper multipathing for the LVM+VCS combination on Red Hat,
I am using Veritas DMP for multipathing. Now that I finally have it working properly, I still have one issue.
The cluster has one failover Service Group on the first node and a second failover Service Group on the second node.
 

Description:

When I reboot both nodes and the cluster starts, my Service Groups are in Partial state,
even though the AutoStart attribute is set to false for both SGs. Every resource in my SGs is offline, except for my LVM Volumes.
VCS is not trying to start any resource, which is correct since AutoStart is disabled.
But the cluster sees all LVM Volumes as online, while the LVM Volume Groups are not.
The funny thing is that all LVM volumes (for both SGs) are online on the first node,
even when LastOnline was the second node, and even when the priority for the second SG is set to the second node.
It appears that LVM activates all volumes at boot time, just before the cluster kicks in.
I proved this by stopping the cluster on all nodes, deactivating all LVM volumes manually via the vgchange command,
and starting the cluster again (so no host reboot). That results in the expected behavior: no resources online and the SGs offline.
This operational issue is known to Symantec and described in various recent release notes.
The workaround is rather vague, and I have already tried several options.
 

Resources:

Apparently this is known to Symantec.
See page 38 in "Veritas Cluster Server 6.0.1 Release Notes - Linux",
page 30 in "Veritas Cluster Server 6.0.4 Release Notes - Linux",
and page 47 in "Veritas Cluster Server 6.1 Release Notes - Linux":

SG goes into Partial state if Native LVMVG is imported and activated outside VCS control
If you import and activate LVM volume group before starting VCS, the
LVMVolumeGroup remains offline though the LVMLogicalVolume resource comes
online. This causes the service group to be in a partial state.
Workaround: You must bring the VCS LVMVolumeGroup resource offline manually,
or deactivate it and export the volume group before starting VCS.
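
For clarity, a minimal sketch of what that documented workaround could look like on the command line (the volume group name appvg and resource name appvg_res are hypothetical examples; bl19-13 is one of the cluster nodes):

    # Option 1: bring the relevant LVMVolumeGroup resource offline through VCS
    [root@bl19-13 ~]# hares -offline appvg_res -sys bl19-13

    # Option 2: deactivate and export the volume group natively before starting VCS
    [root@bl19-13 ~]# vgchange -a n appvg    # deactivate all LVs in the VG
    [root@bl19-13 ~]# vgexport appvg         # export the VG so VCS can import it itself
    [root@bl19-13 ~]# hastart                # start VCS on this node
    [root@bl19-13 ~]# hastatus -sum          # verify group and resource states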

 

Question:

Does Symantec have more information, a procedure, ... about this workaround?
 

Thanks in advance,
Sander Fiers


  • These are the options that I already tried, without success:

    • Adjusting the startup scripts
      I tried editing the /etc/rc.d/rc.sysinit file to deactivate my Service Group LVM volumes and to export all LVM volume groups with inactive volumes.
    if [ -x /sbin/lvm ]; then
    #       action $"Setting up Logical Volume Management:" /sbin/lvm vgchange -a ay --sysinit
            action $"Setting up Logical Volume Management:" /sbin/lvm vgchange -a y v1 --sysinit
            action $"Setting up Logical Volume Management:" /sbin/lvm vgchange -a n v11 --sysinit
            action $"Setting up Logical Volume Management:" /sbin/lvm vgchange -a n v2 --sysinit
            action $"Setting up Logical Volume Management:" /sbin/lvm vgchange -a n v3 --sysinit
            action $"Setting up Logical Volume Management:" /sbin/lvm vgchange -a n v4 --sysinit
            action $"Setting up Logical Volume Management:" /sbin/lvm vgchange -a n v5 --sysinit
            action $"Setting up Logical Volume Management:" /sbin/lvm vgchange -a n v6 --sysinit
            action $"Setting up Logical Volume Management:" /sbin/lvm vgchange -a n v8 --sysinit
            action $"Setting up Logical Volume Management:" /sbin/lvm vgexport -a --sysinit
    fi
    • Adjusting the LVM configuration
      I also tried editing the /etc/lvm/lvm.conf file to auto-activate only the LVM volumes used for the OS, thus not auto-activating the LVM volumes used in the Service Groups:
    # If auto_activation_volume_list is defined, each LV that is to be
    # activated is checked against the list while using the autoactivation
    # option (--activate ay/-a ay), and if it matches, it is activated.
    #   "vgname" and "vgname/lvname" are matched exactly.
    #   "@tag" matches any tag set in the LV or VG.
    #   "@*" matches if any tag defined on the host is also set in the LV or VG
    #
    # auto_activation_volume_list = [ "vg1", "vg2/lvol1", "@tag1", "@*" ]
    auto_activation_volume_list = [ "v1", "v1/home", "v1/root", "v1/swap", "v1/tmp", "v1/usr", "v1/var" ]
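
    One way to check whether the auto_activation_volume_list change behaves as intended, without a reboot, might be to deactivate one of the Service Group volume groups and then request auto-activation of everything again; only the VGs/LVs on the list should come back. A rough sketch (v2 is one of the Service Group VGs from the snippet above):

    [root@bl19-13 ~]# vgchange -a n v2                 # deactivate a Service Group VG
    [root@bl19-13 ~]# vgchange -a ay                   # auto-activate: only VGs on the list should activate
    [root@bl19-13 ~]# lvs -o lv_name,vg_name,lv_attr   # LVs with an 'a' in lv_attr are active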


     

  • I found out that auto_activation_volume_list in /etc/lvm/lvm.conf depends on the lvmetad service.
    Reference used: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Logical_Volume_Manager_Administration/metadatadaemon.html

    • So first I enabled lvmetad on both hosts in the /etc/lvm/lvm.conf file and made the service start persistently.
    [root@bl20-13~]# vi /etc/lvm/lvm.conf
       # If lvmetad has been running while use_lvmetad was 0, it MUST be stopped
       # before changing use_lvmetad to 1 and started again afterwards.
          use_lvmetad = 0
    [root@bl20-13 ~]# grep "use_lvmetad = " /etc/lvm/lvm.conf
        use_lvmetad = 1
    [root@bl20-13 ~]# service lvm2-lvmetad start
    Starting LVM metadata daemon:                              [  OK  ]
    [root@bl20-13 ~]# chkconfig lvm2-lvmetad on
    [root@bl20-13 ~]# service lvm2-lvmetad status
    lvmetad (pid  7619) is running...

     

    • Then I made sure that all LVM volumes (especially the root disks in volume group v1) are activated at boot, on both hosts of course, via the /etc/rc.d/rc.sysinit file.
    [root@bl19-13 ~]# vi /etc/rc.d/rc.sysinit
       if [ -x /sbin/lvm ]; then
               action $"Setting up Logical Volume Management:" /sbin/lvm vgchange -a ay --sysinit
               action $"Setting up Logical Volume Management:" /sbin/lvm vgchange -a y v1 --sysinit
       #       action $"Setting up Logical Volume Management:" /sbin/lvm vgchange -a n v11 --sysinit
       #       action $"Setting up Logical Volume Management:" /sbin/lvm vgchange -a n v2 --sysinit
       #       action $"Setting up Logical Volume Management:" /sbin/lvm vgchange -a n v3 --sysinit
       #       action $"Setting up Logical Volume Management:" /sbin/lvm vgchange -a n v4 --sysinit
       #       action $"Setting up Logical Volume Management:" /sbin/lvm vgchange -a n v5 --sysinit
       #       action $"Setting up Logical Volume Management:" /sbin/lvm vgchange -a n v6 --sysinit
       #       action $"Setting up Logical Volume Management:" /sbin/lvm vgchange -a n v8 --sysinit
       fi

     

    • Afterwards I only allow the LVM volumes for my root disks to be activated automatically via the /etc/lvm/lvm.conf file, on both hosts.
    [root@bl19-13 ~]# vi /etc/lvm/lvm.conf
      # If auto_activation_volume_list is defined, each LV that is to be
      # activated is checked against the list while using the autoactivation
      # option (--activate ay/-a ay), and if it matches, it is activated.
      #   "vgname" and "vgname/lvname" are matched exactly.
      #   "@tag" matches any tag set in the LV or VG.
      #   "@*" matches if any tag defined on the host is also set in the LV or VG
      #
      # auto_activation_volume_list = [ "vg1", "vg2/lvol1", "@tag1", "@*" ]
      auto_activation_volume_list = [ "v1", "v1/home", "v1/root", "v1/swap", "v1/tmp", "v1/usr", "v1/var" ]

     

    • I made sure my cluster was stopped properly on both hosts and rebooted them both.
       
    • After my two hosts are up and running again, I get strange warnings about my devices.
      No physical volumes are found on the hosts; only the first node still sees its root volumes.
      Funnily enough, the second node booted correctly but doesn't show any LVM physical volumes at all?!
    [root@bl19-13 ~]# pvs
      No device found for PV pRCzeA-SvFh-B0hz-XdcH-75Nh-HqH5-3BhusO.
      No device found for PV NaRuH8-7t2J-5vtd-NJ40-11MC-uaEB-OAcfXH.
      No device found for PV qwqLhF-5zJo-4lhA-bu9W-BqBp-SOt2-fbLc7n.
      No device found for PV HIGyN7-yT68-Mls6-UhJk-1JAp-cpyJ-tDL6F4.
      No device found for PV O8tgMH-l1jR-e6ao-Md0U-N4zI-0ucm-oUEHcc.
      No device found for PV 50BqNZ-SRTP-DNZ3-Gfkk-DxJt-PsGM-dSVBoF.
      No device found for PV 3ivHzC-dHO2-OvSN-qP9h-GPbU-uoa8-r3LEPi.
      PV                   VG   Fmt  Attr PSize   PFree
      /dev/vx/dmp/disk_0s2 v1   lvm2 a--  135.93g 66.93g
    
    [root@bl20-13 ~]# pvs
      No device found for PV pRCzeA-SvFh-B0hz-XdcH-75Nh-HqH5-3BhusO.
      No device found for PV NaRuH8-7t2J-5vtd-NJ40-11MC-uaEB-OAcfXH.
      No device found for PV C7cIfI-hmNt-L6uF-h7rr-1knf-jA92-fojE5j.
      No device found for PV qwqLhF-5zJo-4lhA-bu9W-BqBp-SOt2-fbLc7n.
      No device found for PV O8tgMH-l1jR-e6ao-Md0U-N4zI-0ucm-oUEHcc.
      No device found for PV HIGyN7-yT68-Mls6-UhJk-1JAp-cpyJ-tDL6F4.
      No device found for PV 50BqNZ-SRTP-DNZ3-Gfkk-DxJt-PsGM-dSVBoF.
      No device found for PV 3ivHzC-dHO2-OvSN-qP9h-GPbU-uoa8-r3LEPi.

     

    • Of course I get the same result for the LVM volume groups and the LVM logical volumes:
    [root@bl19-13 ~]# vgs
      No device found for PV NaRuH8-7t2J-5vtd-NJ40-11MC-uaEB-OAcfXH.
      No device found for PV 50BqNZ-SRTP-DNZ3-Gfkk-DxJt-PsGM-dSVBoF.
      No device found for PV 3ivHzC-dHO2-OvSN-qP9h-GPbU-uoa8-r3LEPi.
      No device found for PV qwqLhF-5zJo-4lhA-bu9W-BqBp-SOt2-fbLc7n.
      No device found for PV pRCzeA-SvFh-B0hz-XdcH-75Nh-HqH5-3BhusO.
      No device found for PV HIGyN7-yT68-Mls6-UhJk-1JAp-cpyJ-tDL6F4.
      No device found for PV O8tgMH-l1jR-e6ao-Md0U-N4zI-0ucm-oUEHcc.
      VG   #PV #LV #SN Attr   VSize   VFree
      v1     1   6   0 wz--n- 135.93g 66.93g
    
    [root@bl20-13 ~]# vgs
      No device found for PV NaRuH8-7t2J-5vtd-NJ40-11MC-uaEB-OAcfXH.
      No device found for PV 50BqNZ-SRTP-DNZ3-Gfkk-DxJt-PsGM-dSVBoF.
      No device found for PV C7cIfI-hmNt-L6uF-h7rr-1knf-jA92-fojE5j.
      No device found for PV 3ivHzC-dHO2-OvSN-qP9h-GPbU-uoa8-r3LEPi.
      No device found for PV qwqLhF-5zJo-4lhA-bu9W-BqBp-SOt2-fbLc7n.
      No device found for PV pRCzeA-SvFh-B0hz-XdcH-75Nh-HqH5-3BhusO.
      No device found for PV O8tgMH-l1jR-e6ao-Md0U-N4zI-0ucm-oUEHcc.
      No device found for PV HIGyN7-yT68-Mls6-UhJk-1JAp-cpyJ-tDL6F4.
      No volume groups found
    
    
    [root@bl19-13 ~]# lvs
      No device found for PV NaRuH8-7t2J-5vtd-NJ40-11MC-uaEB-OAcfXH.
      No device found for PV 50BqNZ-SRTP-DNZ3-Gfkk-DxJt-PsGM-dSVBoF.
      No device found for PV 3ivHzC-dHO2-OvSN-qP9h-GPbU-uoa8-r3LEPi.
      No device found for PV qwqLhF-5zJo-4lhA-bu9W-BqBp-SOt2-fbLc7n.
      No device found for PV pRCzeA-SvFh-B0hz-XdcH-75Nh-HqH5-3BhusO.
      No device found for PV HIGyN7-yT68-Mls6-UhJk-1JAp-cpyJ-tDL6F4.
      No device found for PV O8tgMH-l1jR-e6ao-Md0U-N4zI-0ucm-oUEHcc.
      LV   VG   Attr      LSize  Pool Origin Data%  Move Log Cpy%Sync Convert
      home v1   -wi-ao---  2.00g
      root v1   -wi-ao---  5.00g
      swap v1   -wi-ao--- 48.00g
      tmp  v1   -wi-ao---  2.00g
      usr  v1   -wi-ao---  8.00g
      var  v1   -wi-ao---  4.00g
    
    [root@bl20-13 ~]# lvs
      No device found for PV NaRuH8-7t2J-5vtd-NJ40-11MC-uaEB-OAcfXH.
      No device found for PV 50BqNZ-SRTP-DNZ3-Gfkk-DxJt-PsGM-dSVBoF.
      No device found for PV C7cIfI-hmNt-L6uF-h7rr-1knf-jA92-fojE5j.
      No device found for PV 3ivHzC-dHO2-OvSN-qP9h-GPbU-uoa8-r3LEPi.
      No device found for PV qwqLhF-5zJo-4lhA-bu9W-BqBp-SOt2-fbLc7n.
      No device found for PV pRCzeA-SvFh-B0hz-XdcH-75Nh-HqH5-3BhusO.
      No device found for PV O8tgMH-l1jR-e6ao-Md0U-N4zI-0ucm-oUEHcc.
      No device found for PV HIGyN7-yT68-Mls6-UhJk-1JAp-cpyJ-tDL6F4.
      No volume groups found

    Cluster results:

    • My two Service Groups are offline as expected, since AutoStart is set to false on both.
    • I can online my 2nd Service Group on the second node, and the LVM commands then see the physical volumes, volume groups, and logical volumes used by the 2nd Service Group. Strangely enough, the LVM volumes for my root disks are now listed as well.
    • I cannot fail over my 2nd Service Group to the first node. The cluster fails to online the LVM Volume Group resources on the 1st node.
    • I cannot online my 1st Service Group on the first node. The cluster fails to online the LVM Volume Group resources on the 1st node.
    • I cannot online my 1st Service Group on the second node. The cluster fails to online the LVM Volume Group resources on the 2nd node.

    Any ideas?
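
    One thing that might be worth checking (just a guess on my part): the "No device found for PV ..." messages could simply mean that lvmetad has no record of the PVs on the /dev/vx/dmp/ devices, for example because udev does not trigger a pvscan for the DMP paths. A sketch of that check, not a confirmed fix:

    [root@bl19-13 ~]# pvscan --cache    # rescan all devices and repopulate the lvmetad cache
    [root@bl19-13 ~]# pvs               # the DMP-backed PVs should reappear if stale caching was the problem

    If the PVs only come back after a manual pvscan --cache, that would point at the udev/lvmetad integration for the DMP devices rather than at the cluster configuration.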


     

  • I found the cause of my issue. It seems that the Veritas Volume Manager boot script deactivates all LVM volumes and exports all LVM volume groups, then imports and activates them again afterwards.

    I commented out the export, import, and re-activation commands, keeping only the deactivation.

     

    Original snippet of the file:

    [root@bl19-13]# vi /etc/rc3.d/S14vxvm-boot
                    vgchange -a n $vg
                    vgexport $vg
                    vgimport $vg
                    vgchange -a y $vg

    New snippet of the file:

    [root@bl19-13]# vi /etc/rc3.d/S14vxvm-boot
                    vgchange -a n $vg
                    #vgexport $vg
                    #vgimport $vg
                    #vgchange -a y $vg
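
    After making the same change on both nodes, a quick sanity check after the next reboot might be (just a sketch to confirm the expected state, nothing more):

    [root@bl19-13 ~]# lvs -o lv_name,vg_name,lv_attr   # the Service Group LVs should no longer show the 'a' (active) flag
    [root@bl19-13 ~]# hastatus -sum                    # the Service Groups should report OFFLINE instead of PARTIAL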