
CFS access is blocked after both heartbeat links are down

kongzzzz
Level 3

Hi

I have a cluster environment running VCS 6.0.2: six servers make up the cluster, and I/O fencing is configured with three coordinator disks. When I cold-boot one of the servers, I find that CFS access on the other running servers is blocked for a period of time.

I found that CFS access starts being blocked when the following messages appear in /var/log/messages:

  LLT INFO V-14-1-10205 link 0 (eth6.109) node 0 in trouble

  LLT INFO V-14-1-10205 link 1 (eth7.110) node 0 in trouble

Access is allowed again when the following messages appear in /var/log/messages:

vxfs: msgcnt 8 Phase 2 - /dev/vx/dsk/filedg/filevol - Buffer reads allowed.

vxfs: msgcnt 9 Phase 9 - /dev/vx/dsk/filedg/filevol - Set Primary nodeid to 2

vxglm INFO V-42-106 GLM recovery complete, gen f59d30, mbr 2c/0/0/0

vxglm INFO V-42-107 times: skew 2673 ms, remaster 78 ms, completion 40 ms
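
For reference, here is roughly how the state of the surviving nodes can be checked while this is happening (standard VCS/CFS commands only, output omitted):

  # LLT link state for each node (the links show "in trouble" / "expired" for the rebooted node)
  lltstat -nvv

  # GAB port memberships (the rebooted node drops out of the memberships)
  gabconfig -a

  # I/O fencing mode and current cluster membership
  vxfenadm -d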

I think the CFS access blocking is for data protection, but from my observation the blocking can last 10+ seconds on the running servers, so my questions are:

1. Is it correct behaviour for VCS to block CFS access for 10+ seconds?

2. Why doesn't the CFS access blocking start only after the heartbeat link has expired, and before the race for the coordinator disks?

 

Thanks in advance!


3 REPLIES

Daniel_Matheus
Level 4
Employee Accredited Certified
(Accepted solution)

Hi Kongzz,

 

This is expected behaviour.

The CVM and CFS master roles need to fail over if the master node is brought down or faulted.

This also includes replaying any queued I/O (the intent log).

 

Please see this excerpt from the SFCFS admin guide:

 

If the server on which the Cluster File System (CFS) primary node is running fails, the remaining cluster nodes elect a new primary node. The new primary node reads the file system intent log and completes any metadata updates that were in process at the time of the failure. Application I/O from other nodes may block during this process and cause a delay. When the file system is again consistent, application processing resumes.

Because nodes using a cluster file system in secondary mode do not update file system metadata directly, failure of a secondary node does not require metadata repair. CFS recovery from secondary node failure is therefore faster than from a primary node failure.
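
If the node you are going to cold boot happens to be the CFS primary (or the CVM master), you can check that and, for a planned reboot, move the role off it first, so the surviving nodes only see a secondary leave, which recovers faster as described above. A rough sketch, where /file is just an assumed mount point for /dev/vx/dsk/filedg/filevol:

  # show which node is currently the CFS primary for this cluster mount
  fsclustadm -v showprimary /file

  # run on the node that should take over, to make it the CFS primary
  fsclustadm setprimary /file

  # show whether the local node is the CVM master or a slave
  vxdctl -c mode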

mikebounds
Level 6
Partner Accredited

The default timeout for LLT heartbeats is 15 seconds.
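
You can confirm the actual timer values on your own cluster; LLT expresses them in hundredths of a second, so a peerinact value of 1500 would correspond to 15 seconds. A quick sketch:

  # list the current LLT timer values, including peerinact (the heartbeat expiry timer)
  lltconfig -T query

  # the timer can also be set persistently in /etc/llttab, for example:
  # set-timer peerinact:1500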

With a CFS mount, one node is designated the primary (for each mount), and this node coordinates writes.

So I believe this is how it works:

If you lose the node where the CFS primary resides, then until another node is elected primary the secondary nodes cannot write, and can only read from the buffer: if they read from disk, the data could be changing as they read it, because the primary might not actually be down - you could have split-brain. No action can be taken until the heartbeat times out, as the heartbeat could return, and a new node cannot be elected primary until the fencing race decides which nodes stay up.
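
If you want to see the outcome of that race, you can list the registration keys on the coordinator disks before and after the node leaves; a rough sketch, assuming the default /etc/vxfentab disk list:

  # list the SCSI-3 registration keys currently held on the coordinator disks
  vxfenadm -s all -f /etc/vxfentab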

Additionally, see this extract from the CFS admin guide:

 

If the server on which the SFCFS primary is running fails, the remaining cluster nodes elect a new primary. The new primary reads the file system intent log and completes any metadata updates that were in process at the time of the failure. Application I/O from other nodes may block during this process and cause a delay. When the file system is again consistent, application processing resumes.

So I would think that CFS access could be blocked for 15+ seconds.

Mike

 

kongzzzz
Level 3
Thanks, Daniel and mikebounds!