VCS does not notice SAN LUN disappearance

thstettler
Level 3
Partner Accredited Certified

We have a cluster running on two Oracle M3000 servers with Solaris 10 Update 9 and VCS 5.1 SP1RP2.

Some more information on the cluster setup:

- Seven containers (Solaris zones) running as cluster resources

- One Oracle 11.2 database runs in each of these containers

- The nodes have their storage on two SAN LUNs mirrored with ZFS

- All containers and the database storage also live on SAN LUNs

- The LUNs are provided by two EMC VNX arrays

- All seven container service groups are set up the same way.

Here is the problem, which may not even be one: all the cluster failover tests worked as expected except one. We unplugged both FC cables from one node and the node kept running; VCS did not notice this either. The node could no longer be reached via SSH but still responded to pings. GAB and LLT did not complain. The XSCF console logged errors about the SAN devices not being available, but that was it. After about 15 minutes we halted the node and rebooted it.

Now I wonder whether VCS should have noticed that the LUNs had disappeared, or that the node was no longer responding properly. Or is this expected behaviour?

I know the scenario tested this way is very unlikely indeed.

Thank you for enlightening me.

Excerpt from main.cf:

 

group abc (
        SystemList = { node1 = 0, node2 = 0 }
        ContainerInfo @node1 = { Name = abc, Type = Zone, Enabled = 1 }
        ContainerInfo @node2 = { Name = abc, Type = Zone, Enabled = 1 }
        AutoStartList = { node1, node2 }
        Administrators = { z_abc_zone_node1, z_abc_zone_node2 }
        )
 
        IP abc_nfs_IP (
                Device = aggr10
                Address = "192.168.10.11"
                NetMask = "255.255.255.192"
                )
 
        IPMultiNICB abcip (
                BaseResName = multinicbels
                Address = "192.168.20.22"
                NetMask = "255.255.255.192"
                )
 
        Mount abcmntarch (
                FsckOpt = "-y"
                BlockDevice = abcarch
                MountPoint = "/oradata/abc/u02"
                FSType = zfs
                ContainerOpts = { RunInContainer = 1, PassCInfo = 1 }
                )
 
        Mount abcmntdata (
                FsckOpt = "-y"
                BlockDevice = abcdata
                MountPoint = "/oradata/abc/u01"
                FSType = zfs
                ContainerOpts = { RunInContainer = 1, PassCInfo = 1 }
                )
 
        Mount abcmntlog (
                FsckOpt = "-y"
                BlockDevice = abclog
                MountPoint = "/oradata/abc/u03"
                FSType = zfs
                ContainerOpts = { RunInContainer = 1, PassCInfo = 1 }
                )
 
        Mount abcmnttempfiles (
                FsckOpt = "-y"
                BlockDevice = abctempfiles
                MountPoint = "/oradata/abc/u04"
                FSType = zfs
                ContainerOpts = { RunInContainer = 1, PassCInfo = 1 }
                )
 
        NIC abc_nfs_NIC (
                Device = aggr10
                )
 
        Netlsnr ABC_LISTENER (
                Owner = oracle
                Home = "/opt/oracle/app/product/11.2.0/dbhome_1"
                )
        Oracle ABC (
                Pfile = "/oradata/abc/u04/admin/abc/pfile/init.ora"
                Owner = oracle
                Home = "/opt/oracle/app/product/11.2.0/dbhome_1"
                StartUpOpt = STARTUP
                Sid = abc
                )
 
        Proxy abcnic (
                TargetResName = multinicbels
                )
 
        Zone abc_zone (
                BootState = multi-user-server
                )
 
        Zpool abc_arch_pool (
                AltRootPath = "/oradata/abc/u02"
                PoolName = abcarch
                ZoneResName = abc_zone
                )
 
        Zpool abc_data_pool (
                AltRootPath = "/oradata/abc/u01"
                PoolName = abcdata
                ZoneResName = abc_zone
                )
 
        Zpool abc_log_pool (
                AltRootPath = "/oradata/abc/u03"
                PoolName = abclog
                ZoneResName = abc_zone
                )
 
        Zpool abc_root_pool (
                PoolName = abc
                ZoneResName = abc_zone
                )
 
        Zpool abc_tempfiles_pool (
                AltRootPath = "/oradata/abc/u04"
                PoolName = abctempfiles
                ZoneResName = abc_zone
                )
 
        ABC requires abcip
        ABC requires abcmntarch
        ABC requires abcmntdata
        ABC requires abcmntlog
        ABC requires abcmnttempfiles
        ABC_LISTENER requires abcip
        abc_nfs_IP requires abc_nfs_NIC
        abc_nfs_IP requires abc_zone
        abc_zone requires abc_arch_pool
        abc_zone requires abc_data_pool
        abc_zone requires abc_log_pool
        abc_zone requires abc_root_pool
        abc_zone requires abc_tempfiles_pool
        abc_zone requires abcnic
        abcip requires abc_zone
        abcmntarch requires abc_zone
        abcmntdata requires abc_zone
        abcmntlog requires abc_zone
        abcmnttempfiles requires abc_zone
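 
For reference, the group and dependency tree defined above can be checked against the running cluster with the standard VCS commands; a minimal sketch, assuming /opt/VRTSvcs/bin is in the PATH:

        # Overall cluster, system and group status
        hastatus -sum

        # Resources configured in the abc service group
        hagrp -resources abc

        # Dependency tree (parents/children) around the zone resource
        hares -dep abc_zone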
      

5 REPLIES

Eric_Hennessey1
Level 5
Employee Certified

If a node's root file system is on SAN LUNs and you pulled both FC cables from that node, then the root file system is lost. We've seen very strange and unpredictable OS behavior in these cases.

GAB and LLT probably continued to run because they live in kernel memory and wouldn't necessarily have noticed the loss of the root file system.
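
GAB membership and LLT link state can be checked directly on the affected node while such a test is running; a quick sketch, assuming the standard install puts the utilities in the PATH:

        # GAB port membership: port a = GAB, port b = I/O fencing, port h = HAD (VCS engine)
        gabconfig -a

        # LLT heartbeat link status per node, verbose
        lltstat -nvv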

thstettler
Level 3
Partner Accredited Certified

Yes, the root file systems are on SAN as well. 

Strange behaviour seems about right. Oddly enough, we had a similar scenario earlier, when the SAN admin removed the nodes from the correct zoning; in that case the zones and nodes failed almost immediately.

I would have thought that one of the cluster resources, such as the Mount resource, would realize something was wrong.

Eric_Hennessey1
Level 5
Employee Certified

I think the main reason you don't see the cluster reacting is that while the agents themselves are in memory, the agent entry points are not. So the monitor interval for a Mount resource expires and the Mount agent calls the monitor entry point. Since the entry point script resides on disk on the root file system, the OS has to read it in. It now becomes an OS issue, in that the OS can't access something it needs, but the agent framework doesn't know what to do.

Personally, I'd love to see the OS throw a panic when the root file system becomes unavailable. That would clear up a lot of problems. :)
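
As a concrete illustration of where those entry points live: on a default install the bundled agents and their scripts sit under /opt/VRTSvcs/bin/<AgentType>/ on the root file system, so once the root LUNs are gone the agent can no longer read them. For example (the exact mix of binaries and scripts varies by agent and VCS version):

        # Mount agent directory on the root file system: agent binary plus
        # online/offline/monitor/clean entry points
        ls -l /opt/VRTSvcs/bin/Mount/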

sajith_cr
Level 4
Employee

By default, zpools do not react to storage loss; with the default failmode of wait, I/O simply blocks until the devices return. One way to force a system response is to set failmode=panic on each zpool. If VxVM DiskGroup resources are used instead of zpools, the PanicOnDGLoss attribute of the DiskGroup agent can be set instead.
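
A minimal sketch of both approaches, reusing pool names from the main.cf above; the VxVM resource name abc_dg is hypothetical, and the valid PanicOnDGLoss values should be checked against the DiskGroup agent reference for your VCS release:

        # Make ZFS panic the host when a pool loses its devices (default failmode
        # is "wait", which just blocks I/O); repeat for each pool in the group
        zpool set failmode=panic abcdata
        zpool get failmode abcdata

        # Alternative for VxVM disk groups: let the DiskGroup agent panic the node
        # on disk group loss (hypothetical resource name abc_dg)
        haconf -makerw
        hares -modify abc_dg PanicOnDGLoss 1
        haconf -dump -makero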

thstettler
Level 3
Partner Accredited Certified

Thank you for your insight.