VCS Switch is taking up to 10 min.

mhab11
Level 5

I have 2 servers running SFW HA 5.1 SP2 on a 2008 R2 platform. I created the diskgroups through VEA, then created my cluster and Service Group in VCS. When the servers were first set up, the switch took about 30 seconds. Currently it takes about 10 mins. I have the same setup in a testbed that is having no problems; the only difference is that production has users and data running through it. Is there a way to see where the real hangup is coming from? The logs just give me time gaps but no real details about what happened.

 

 

Jun 7, 2012 4:02:23 AM  V-16-1-50135 User admin fired command: hagrp -switch SG_AvMail MHSWS001ANP-2 localclus from 192.168.249.203
Jun 7, 2012 4:02:23 AM  V-16-1-10208 Initiating switch of group SG_AvMail from system MHSWS001ANP-1 to system MHSWS001ANP-2
Jun 7, 2012 4:02:23 AM  V-16-1-10300 Initiating Offline of Resource AviNetConnector (Owner: unknown, Group: SG_AvMail) on System MHSWS001ANP-1
Jun 7, 2012 4:02:23 AM  V-16-1-10300 Initiating Offline of Resource AvMailInterchange (Owner: unknown, Group: SG_AvMail) on System MHSWS001ANP-1
Jun 7, 2012 4:02:29 AM  V-16-1-10305 Resource AviNetConnector (Owner: unknown, Group: SG_AvMail) is offline on MHSWS001ANP-1 (VCS initiated)
Jun 7, 2012 4:02:34 AM  V-16-2-13064 (MHSWS001ANP-1) Agent is calling clean for resource(AvMailInterchange) because the resource is up even after offline completed.
Jun 7, 2012 4:02:35 AM  V-16-2-13069 (MHSWS001ANP-1) Resource(AvMailInterchange) - clean failed.
Jun 7, 2012 4:02:35 AM  V-16-2-13069 (MHSWS001ANP-1) Resource(AvMailInterchange) - clean failed.
Jun 7, 2012 4:03:34 AM  V-16-1-10305 Resource AvMailInterchange (Owner: unknown, Group: SG_AvMail) is offline on MHSWS001ANP-1 (VCS initiated)
Jun 7, 2012 4:03:34 AM  V-16-1-10300 Initiating Offline of Resource AGN_IP (Owner: unknown, Group: SG_AvMail) on System MHSWS001ANP-1
Jun 7, 2012 4:03:34 AM  V-16-1-10300 Initiating Offline of Resource Volume (Owner: unknown, Group: SG_AvMail) on System MHSWS001ANP-1
Jun 7, 2012 4:03:35 AM  V-16-1-10305 Resource AGN_IP (Owner: unknown, Group: SG_AvMail) is offline on MHSWS001ANP-1 (VCS initiated)
Jun 7, 2012 4:04:23 AM  V-16-1-10305 Resource Volume (Owner: unknown, Group: SG_AvMail) is offline on MHSWS001ANP-1 (VCS initiated)
Jun 7, 2012 4:04:23 AM  V-16-1-10300 Initiating Offline of Resource DiskGroup (Owner: unknown, Group: SG_AvMail) on System MHSWS001ANP-1
Jun 7, 2012 4:09:37 AM  V-16-1-50135 User admin fired command: MSG_RES_PROBE DiskGroup MHSWS001ANP-1 from 192.168.249.203
Jun 7, 2012 4:09:50 AM  V-16-1-10305 Resource DiskGroup (Owner: unknown, Group: SG_AvMail) is offline on MHSWS001ANP-1 (VCS initiated)
Jun 7, 2012 4:09:50 AM  V-16-1-10446 Group SG_AvMail is offline on system MHSWS001ANP-1
Jun 7, 2012 4:09:50 AM  V-16-1-10301 Initiating Online of Resource AGN_IP (Owner: unknown, Group: SG_AvMail) on System MHSWS001ANP-2
Jun 7, 2012 4:09:50 AM  V-16-1-10301 Initiating Online of Resource DiskGroup (Owner: unknown, Group: SG_AvMail) on System MHSWS001ANP-2
Jun 7, 2012 4:09:51 AM  V-16-1-10306 Resource DiskGroup (Owner: unknown, Group: SG_AvMail) is offline on MHSWS001ANP-1 (Previous State = OFFLINE)
Jun 7, 2012 4:09:51 AM  V-16-1-10298 Resource AGN_IP (Owner: unknown, Group: SG_AvMail) is online on MHSWS001ANP-2 (VCS initiated)
Jun 7, 2012 4:09:52 AM  V-16-6-15004 (MHSWS001ANP-1) hatrigger:Failed to send trigger for postoffline; script doesn't exist
Jun 7, 2012 4:11:50 AM  V-16-1-10298 Resource DiskGroup (Owner: unknown, Group: SG_AvMail) is online on MHSWS001ANP-2 (VCS initiated)
Jun 7, 2012 4:11:50 AM  V-16-1-10301 Initiating Online of Resource Volume (Owner: unknown, Group: SG_AvMail) on System MHSWS001ANP-2
Jun 7, 2012 4:12:01 AM  V-16-1-10298 Resource Volume (Owner: unknown, Group: SG_AvMail) is online on MHSWS001ANP-2 (VCS initiated)
Jun 7, 2012 4:12:01 AM  V-16-1-10301 Initiating Online of Resource AvMailInterchange (Owner: unknown, Group: SG_AvMail) on System MHSWS001ANP-2
Jun 7, 2012 4:12:01 AM  V-16-1-10301 Initiating Online of Resource AviNetConnector (Owner: unknown, Group: SG_AvMail) on System MHSWS001ANP-2
Jun 7, 2012 4:12:03 AM  V-16-1-10298 Resource AvMailInterchange (Owner: unknown, Group: SG_AvMail) is online on MHSWS001ANP-2 (VCS initiated)
Jun 7, 2012 4:12:05 AM  V-16-1-10298 Resource AviNetConnector (Owner: unknown, Group: SG_AvMail) is online on MHSWS001ANP-2 (VCS initiated)
Jun 7, 2012 4:12:05 AM  V-16-1-10447 Group SG_AvMail is online on system MHSWS001ANP-2

1 ACCEPTED SOLUTION

Wally_Heim
Level 6
Employee

Hi mhab11,

You mentioned that you are running 5.1 SP2, but you did not mention whether you have any CPs installed on top of that. Several of the early CPs had DG import/deport performance fixes that might be helpful in your situation. I would recommend going to one of the latest CPs, CP10 or higher, and seeing if your issue still happens.

There have also been major improvements in the 6.0 product with the FastFailover option in SFW-HA clusters. If you can upgrade to 6.0 and your environment meets all the requirements for FastFailover (SCSI-3 is the big one), then you should see major improvements for Diskgroup imports and deports.

 

Thank you,

Wally

6 REPLIES

mikebounds
Level 6
Partner Accredited

There seem to be a few problems:

  1. Delay of just over a minute (4:02:23 to 4:03:34) as AvMailInterchange does not offline correctly and clean fails too
  2. Delay of just under a minute (4:03:34 to 4:04:23) as it takes a long time for Volume to offline
  3. Delay of over 5 mins (4:04:23 to 4:09:50) as it takes a long time for Diskgroup to offline
  4. Delay of about 2 mins (4:09:51 to 4:11:50) as it takes a long time for Diskgroup to online

These delays total about 9 mins, so they account for nearly all of the switch time, and the diskgroup deport and import are the biggest contributors.
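
The gaps listed above can also be worked out mechanically rather than by eye. The following is a minimal sketch, assuming the engine log entries have been copied out as (timestamp, message) pairs; the entries are taken from the log in the original post, while the regular expressions and the names `LOG`, `parse_time` and `durations` are illustrative and not part of VCS. It pairs each "Initiating Offline/Online" message with the matching "is offline/online ... (VCS initiated)" message and reports the elapsed seconds per resource.

```python
import re
from datetime import datetime

# (timestamp, message) pairs copied from the engine log in the original post.
OWN = "(Owner: unknown, Group: SG_AvMail)"
N1, N2 = "MHSWS001ANP-1", "MHSWS001ANP-2"
LOG = [
    ("Jun 7, 2012 4:02:23 AM", f"V-16-1-10300 Initiating Offline of Resource AviNetConnector {OWN} on System {N1}"),
    ("Jun 7, 2012 4:02:23 AM", f"V-16-1-10300 Initiating Offline of Resource AvMailInterchange {OWN} on System {N1}"),
    ("Jun 7, 2012 4:02:29 AM", f"V-16-1-10305 Resource AviNetConnector {OWN} is offline on {N1} (VCS initiated)"),
    ("Jun 7, 2012 4:03:34 AM", f"V-16-1-10305 Resource AvMailInterchange {OWN} is offline on {N1} (VCS initiated)"),
    ("Jun 7, 2012 4:03:34 AM", f"V-16-1-10300 Initiating Offline of Resource AGN_IP {OWN} on System {N1}"),
    ("Jun 7, 2012 4:03:34 AM", f"V-16-1-10300 Initiating Offline of Resource Volume {OWN} on System {N1}"),
    ("Jun 7, 2012 4:03:35 AM", f"V-16-1-10305 Resource AGN_IP {OWN} is offline on {N1} (VCS initiated)"),
    ("Jun 7, 2012 4:04:23 AM", f"V-16-1-10305 Resource Volume {OWN} is offline on {N1} (VCS initiated)"),
    ("Jun 7, 2012 4:04:23 AM", f"V-16-1-10300 Initiating Offline of Resource DiskGroup {OWN} on System {N1}"),
    ("Jun 7, 2012 4:09:50 AM", f"V-16-1-10305 Resource DiskGroup {OWN} is offline on {N1} (VCS initiated)"),
    ("Jun 7, 2012 4:09:50 AM", f"V-16-1-10301 Initiating Online of Resource AGN_IP {OWN} on System {N2}"),
    ("Jun 7, 2012 4:09:50 AM", f"V-16-1-10301 Initiating Online of Resource DiskGroup {OWN} on System {N2}"),
    ("Jun 7, 2012 4:09:51 AM", f"V-16-1-10298 Resource AGN_IP {OWN} is online on {N2} (VCS initiated)"),
    ("Jun 7, 2012 4:11:50 AM", f"V-16-1-10298 Resource DiskGroup {OWN} is online on {N2} (VCS initiated)"),
    ("Jun 7, 2012 4:11:50 AM", f"V-16-1-10301 Initiating Online of Resource Volume {OWN} on System {N2}"),
    ("Jun 7, 2012 4:12:01 AM", f"V-16-1-10298 Resource Volume {OWN} is online on {N2} (VCS initiated)"),
    ("Jun 7, 2012 4:12:01 AM", f"V-16-1-10301 Initiating Online of Resource AvMailInterchange {OWN} on System {N2}"),
    ("Jun 7, 2012 4:12:01 AM", f"V-16-1-10301 Initiating Online of Resource AviNetConnector {OWN} on System {N2}"),
    ("Jun 7, 2012 4:12:03 AM", f"V-16-1-10298 Resource AvMailInterchange {OWN} is online on {N2} (VCS initiated)"),
    ("Jun 7, 2012 4:12:05 AM", f"V-16-1-10298 Resource AviNetConnector {OWN} is online on {N2} (VCS initiated)"),
]

# Illustrative patterns for the two message shapes seen in the log above.
START = re.compile(r"Initiating (Offline|Online) of Resource (\S+) .* on System (\S+)")
DONE = re.compile(r"Resource (\S+) .* is (offline|online) on (\S+) \(VCS initiated\)")


def parse_time(text):
    """Parse a timestamp such as 'Jun 7, 2012 4:02:23 AM'."""
    return datetime.strptime(text, "%b %d, %Y %I:%M:%S %p")


def durations(log):
    """Return (resource, state, system, seconds) for each completed operation."""
    started = {}
    results = []
    for stamp, message in log:
        when = parse_time(stamp)
        match = START.search(message)
        if match:
            action, resource, system = match.groups()
            started[(resource, action.lower(), system)] = when
            continue
        match = DONE.search(message)
        if match:
            key = match.groups()  # (resource, state, system)
            if key in started:
                elapsed = int((when - started.pop(key)).total_seconds())
                results.append((*key, elapsed))
    return results


if __name__ == "__main__":
    # Slowest operations first.
    for resource, state, system, seconds in sorted(durations(LOG), key=lambda r: -r[3]):
        print(f"{seconds:4d}s  {resource} {state} on {system}")
```

Sorting by elapsed time puts the DiskGroup offline and online at the top, which matches the breakdown above. It only measures the gaps; it cannot say why the deport or import is slow.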

Diskgroup imports and deports can take a long time if you have a lot of volumes, but I guess you just have 1 volume, and data running through the volumes should make no difference to the time - this would only affect the Volume resource, not the diskgroup resource. I would look to see if there are any errors in VEA when the diskgroup deports and imports. If there are none, you could try stopping the cluster and importing and deporting manually in VEA, if you can arrange an outage for your application.

Mike

mhab11
Level 5

Is there a chance that DNS can cause issues with VEA or VCS?

mikebounds
Level 6
Partner Accredited

I can only see DNS affecting a Lanman resource or resources for applications like AvMailInterchange - I don't think DNS would even cause issues for IP resources. It certainly should not affect diskgroup and volume resources, unless, possibly, you are using CPS. VCS does not require VEA; VCS uses an API call (or possibly command line tools) for controlling diskgroup and volume resources.

Mike

mikebounds
Level 6
Partner Accredited

Sorry, just realised CPS is UNIX only, so you can't be using CPS, so DNS issues should not affect diskgroups and volumes.

Mike

mhab11
Level 5

Yes, CP10 and CP12 look to address some of the other problems we have been having besides this one. We are going to start testing this in the testbed to make sure we are good. Thanks for the help.