IdaWong
11 years ago, Level 4
vxfen in replaying state after reboot
Hi,
I have rebooted a cluster that uses CPS-based I/O fencing (vxfen).
There are no SAN disks mapped to the cluster. However, "vxfenadm -d" showed 2 of the 4 nodes in the "replaying" state for a long time. I had to run
hastop -all (running it only on the nodes showing the replaying state would probably do)
and then hastart again.
It seems to be a timing issue. Has anyone out there seen this before?
Thanks in advance.
- Hi Wong,

You can ignore that; fencing will function normally. Here is the explanation from a technote:

RFSM (Replicated Finite State Machine) is an abstraction layer over GAB. VxFen uses it to handle node join processing.

The 'replaying' state

When a node joins the cluster, the RFSM builds its state table by requesting a snapshot from a node that is already running in the cluster. During this time, all broadcast messages from peer nodes are queued. Once the local node receives the broadcast echo of its own SNAP_REQ message, it starts replaying these queued messages to the RFSM client; hence the name 'replaying'.

The fact that a node has gone into the 'replaying' state means that RFSM has acquired all the data needed to configure itself properly. A CONFIG_DONE message is broadcast to indicate this. As soon as RFSM receives its own CONFIG_DONE message (while in the replaying state), it transitions to the 'running' state.

Solution

For all practical purposes, the 'replaying' state can be assumed to be equivalent to the 'running' state. Fencing continues to function in the same way (for example, during a race or a node join/leave) as in the 'running' state. Nodes remaining in the 'replaying' state is a known issue in RFSM and is being tracked by engineering through an etrack, to be fixed in a future release.
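To make the join sequence from the technote easier to follow, here is a minimal Python sketch of it as a toy state machine. This is only an illustration under stated assumptions, not the actual VxFen/RFSM code: the class `ToyRfsmNode`, the `joining` state name, the `outbox` list and the `PEER_MSG` message are all hypothetical. Only the `SNAP_REQ` and `CONFIG_DONE` messages and the `replaying` and `running` states come from the technote.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ToyRfsmNode:
    """Toy model of one joining node (hypothetical, not real RFSM code)."""
    node_id: int
    state: str = "joining"  # assumed name for the phase before 'replaying'
    queued: deque = field(default_factory=deque)   # peer messages held back
    delivered: list = field(default_factory=list)  # messages given to the client
    outbox: list = field(default_factory=list)     # messages this node broadcasts

    def start_join(self):
        # Ask an already-running node for a snapshot.
        self.outbox.append((self.node_id, "SNAP_REQ"))

    def on_broadcast(self, sender, msg):
        own = sender == self.node_id
        if self.state == "joining":
            if own and msg == "SNAP_REQ":
                # Echo of our own SNAP_REQ: replay everything queued so far.
                self.state = "replaying"
                while self.queued:
                    self.delivered.append(self.queued.popleft())
                # Announce that configuration data has been acquired.
                self.outbox.append((self.node_id, "CONFIG_DONE"))
            else:
                # Still waiting for the echo: queue peer broadcasts.
                self.queued.append((sender, msg))
        elif self.state == "replaying":
            if own and msg == "CONFIG_DONE":
                # Echo of our own CONFIG_DONE: join is complete.
                self.state = "running"
            else:
                self.delivered.append((sender, msg))
        else:
            self.delivered.append((sender, msg))


node = ToyRfsmNode(node_id=2)
node.start_join()
node.on_broadcast(0, "PEER_MSG")     # queued while joining
node.on_broadcast(2, "SNAP_REQ")     # own echo: joining -> replaying
node.on_broadcast(2, "CONFIG_DONE")  # own echo: replaying -> running
print(node.state)
```

In this toy model, peer messages are delivered to the client in both `replaying` and `running`, which mirrors the technote's statement that the two states can be treated as equivalent for practical purposes.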