Does anybody know how to set up PureDisk network resiliency?
I can see entries like Reconnecttimout in agent.cfg and spa.cfg, but I have not been able to find any documentation on these parameters.
I am only aware of network resilience built into NBU 7.5 and later. (Seems PureDisk 6.x went EOSL in Sept 2014.)
For NBU MSDP, the number of work threads can be increased.
About WAN Resiliency: http://www.symantec.com/docs/TECH183552
According to this doc, the feature is 'client-only'.
Resilient connections are not supported for:
a. Master Server <-> Media Server communications
b. Media Server <-> Media Server communications
Item b. is the reason I started looking at the PureDisk config files on our MSDP media servers, as we have some replications running over high-latency, unstable connections.
I posted it here hoping somebody had used network resiliency back in the pure PureDisk days.
Please check RAM utilisation.
Perhaps also try increasing the memory allowed for the MBE Engine.
How to do that second step depends on the PureDisk version, though. Version 6.6.5 handles it automatically.
Sorry, but this makes no sense in relation to the question I asked at the start of this thread.
That is what it looks like. Most often PureDisk drops connections because a particular process is crashing or running out of memory, not really because of network issues.
I assume you have a properly resilient network configuration on the server side, yet constant failover between the LAN links will still cause most secure connections to drop.
There is no explicit "resiliency ON" switch in PureDisk (meaning PDRoE) as there is in NetBackup.
For PureDisk, the following options can alleviate WAN issues:
Log on to the GUI as root. Go to Settings --> Configuration
Configuration - Configuration File Templates - PureDisk Client Agent
[Copy the default template for the affected agent, rename it, and assign it to the agent once the options below are set]
In this template:
backup - MaxRetryCount (set to something higher... I found 50 for one agent)
contentrouter - tcpkeepalive (set to 1, if not already)
webservices - connect_timeout (30-60 seconds is a reasonable limit: if you can't establish a connection in that time, waiting longer will not help)
webservices - timeout (I suggest something like 180 seconds, which, combined with the retry count, will cover a longer WAN failure)
Save the changes, assign the template, and push it to the assigned PureDisk agent.
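As a rough sanity check on those numbers, here is a small Python sketch of how long an outage the retry count and timeout could ride out together. It is only back-of-the-envelope arithmetic: it assumes every retry waits the full timeout before failing and that retries follow each other immediately with no back-off. How PureDisk actually schedules retries is not documented in this thread, and the function name is made up for illustration.

```python
def outage_budget_seconds(max_retry_count: int, timeout_s: int) -> int:
    """Upper bound on the outage covered, ASSUMING each retry blocks for
    the full timeout and retries run back to back (not confirmed
    PureDisk behaviour)."""
    return max_retry_count * timeout_s


# Values suggested above: MaxRetryCount = 50, webservices timeout = 180 s.
budget = outage_budget_seconds(max_retry_count=50, timeout_s=180)
print(f"{budget} s = {budget / 60:.0f} min")  # 9000 s = 150 min
```

So under those assumptions the suggested values would cover a WAN dropout of up to roughly two and a half hours. If your dropouts are shorter, a lower retry count may be enough.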
Hope that explains a bit
In this case we know it is a network issue, as the connection is high latency with some dropouts.
I have created a support case about these parameters. As expected, I am getting all the WAN resiliency documents for NetBackup, which do not answer my question. I will update the thread when I hopefully get some more relevant answers.