Primary server hangs with replication operations...

Ryan_H_
Level 4
Partner Accredited Certified
Hello,

I have SF 5.1 with SP1 installed on Windows 2008 with the VVR option (2 nodes, primary and secondary). Replication had been working fine for the last few days, but since last weekend I have been facing an issue with my primary server.

Whenever I try to pause, stop, or start the replication, the server hangs and I have to do a hard reset by powering off the system. There is no issue with the replication itself, but we are unable to perform replication operations from the primary server.

If anyone has an idea, please share it with me. Thanks.

Regards,
Syed.
1 ACCEPTED SOLUTION

Accepted Solutions

jlockley
Level 3
Employee Accredited Certified

If you are experiencing this issue, you will see events in the system event log. The command eventually succeeds, but it is waiting for the RLINK to disconnect (hence the "workaround" of disconnecting the network).

"Some time" (the incident shows 30 minutes) after trying the command, you will see vxio events in the system event log saying that the RLINK was disconnected,

such as:

WARNING      Event 99 vxio <server> RLINK <your_rlink_name> disconnected from remote

and, at the same time, there will be a retry event,

such as:

WARNING Event 134 vxio <server> Disconnecting RLINK <your_rlink_name> as retry count exceeded 200
 

If you can wait and these messages appear, then you are hitting this known issue. If they do not appear, or you get a failure message after 1-2 minutes, then you have a different issue.
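
As a rough illustration of that check, here is a minimal Python sketch that scans exported event log text for the two vxio events. The line layout is an assumption modelled on the two samples quoted above, not a documented export format, and the function names are made up for this example.

```python
import re

# Assumed line layout, modelled on the two sample events quoted above:
#   "WARNING Event <id> vxio <server> <message>"
# A real event log export may look different; adjust the pattern to match.
EVENT_RE = re.compile(r"^\s*WARNING\s+Event\s+(\d+)\s+vxio\s+(\S+)\s+(.*)$")

# Event IDs taken from the samples above.
RLINK_DISCONNECTED = 99
RETRY_COUNT_EXCEEDED = 134


def find_rlink_events(lines):
    """Return (event_id, server, message) for each vxio event 99 or 134."""
    hits = []
    for line in lines:
        match = EVENT_RE.match(line)
        if not match:
            continue  # not a vxio warning in the assumed layout
        event_id = int(match.group(1))
        if event_id in (RLINK_DISCONNECTED, RETRY_COUNT_EXCEEDED):
            hits.append((event_id, match.group(2), match.group(3).strip()))
    return hits


def looks_like_known_issue(lines):
    """True when both the disconnect and the retry-count events are present."""
    seen = {event_id for event_id, _, _ in find_rlink_events(lines)}
    return {RLINK_DISCONNECTED, RETRY_COUNT_EXCEEDED} <= seen
```

If `looks_like_known_issue` returns False after you have waited well past the 30 minutes mentioned above, that points towards a different problem rather than the documented one.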

I can't see how this issue would hang the server as described above, so I suspect that is not this issue.

This issue would apply to any VVR operation (e.g., expand volume, pause, resume), so if you are executing from the secondary site you may hit this issue.

James.

10 REPLIES

Marianne
Moderator
Partner    VIP    Accredited Certified
Have you had a look at Event Viewer Application Log as well as System log yet?

Ryan_H_
Level 4
Partner Accredited Certified
Yes, I checked on both servers, but nothing is there.

Whenever I try to pause the replication, my primary server hangs.

Regards,
Ryan

Marianne
Moderator
Partner    VIP    Accredited Certified

Seems this is a 'Known Issue'. Extract from Release Notes:
Known Issues -> Veritas Volume Replicator
p.96:
Pause and Resume commands take a long time to complete (495192)
At times, the pause and resume operation can take a long time to complete, which makes it appear to be hung.

Workaround: Wait for some time till the operation completes, or manually disconnect and reconnect the network that is used for communication to enable the operation to complete.

Ryan_H_
Level 4
Partner Accredited Certified
Hello,

Thanks for the reply.

Manually disconnecting means removing the network cable from the server. This is a live server, and doing that would stop users from working.

Ryan_H_
Level 4
Partner Accredited Certified
It didn't help. Anyway, this is not a solution at all. How can you disconnect the cable on a live server?

Marianne
Moderator
Partner    VIP    Accredited Certified
It is good practice (not required) to have dedicated NICs for replication.
The alternative in the suggested workaround is to "Wait for some time till the operation completes".

Maybe log a call with Symantec Support and tell them that you're not happy with their documented workaround?

Ryan_H_
Level 4
Partner Accredited Certified
Hi,

Once the server hangs, you can't do anything. I even waited for 30 minutes, but nothing could be done with it.

BTW, you can still ping the server, but the screen hangs and all desktop icons and open windows disappear.

BR,
R. H.

Marianne
Moderator
Partner    VIP    Accredited Certified
Please log a support call. Symantec will look at the explorer output and might find something more serious than the 'known issue' that's documented in the Release Notes.

Pellzy
Not applicable

Does this known issue also apply when you choose to "pause secondary from primary" when logged onto the secondary site?

 

Thanks - Mark
