We have two mailbox servers hosted on VMware, with 30 databases configured in a DAG between the two servers.
Currently I have a policy with a 15-hour window for a full GRT backup overnight (17:00 to 08:00) and a limit of 7 jobs per policy.
Lately the backups fail to complete within that window, and this is after we already lengthened it.
Other than increasing the job limit, how can I speed up the backups? The servers seem to struggle as it is (I think the backups are causing them to lose connection with the witness server, but I have never had the time to look into it).
I considered ticking 'Disable client-side deduplication' on the policy, since we are using a NetBackup appliance with deduplication enabled, but I was unsure where that moves the workload.
Thanks in advance for the help.
Seems you know where the issue is:
... the servers seem to struggle as it is ....
Server/Exchange Admins need to request additional resources.
Disabling client-side dedupe will only help to free up local resources if client-side dedupe is currently enabled under Host Properties -> Master Server -> Client Attributes -> (Exchange nodes) -> Deduplication location.
The default is Media server only.
If you have not explicitly enabled client-side dedupe in this section, then your Exchange nodes are not performing local dedupe and your appliance is already doing the dedupe.
A little bit hard to work out your exact environment, as you list your OS as AIX, and NetBackup 7.1 or earlier, but then in the tags you have Windows Server 2003 and NetBackup 7.7. The last NetBackup version to support Win2K3 was 7.5 if memory serves me correctly.
Having said that, if you are on 7.7, are you using VMware policies to do the Exchange backup? I would highly recommend this, but there is not enough info about your environment - are the Exchange nodes physical or VM?
I can give you some information as to how we improved our performance with VMware Exchange GRT backups. My environment is NetBackup 220.127.116.11 Masters (running on Windows 2008 R2 64-bit) , with NetBackup 5230 Appliances as Media Servers. Exchange is 2013 running on Windows 2012 R2 64-bit.
We have two active DAG nodes at our production site, and one passive node at our DR site. Like you, we found that backing up either of the active nodes caused performance issues, including loss of connectivity to the witness server, and active users would lose connectivity in Outlook.
Due to this, we changed to backing up the passive node at the DR site instead. This eliminated those issues. Note that we have a witness server at each site. I'm not sure if you have the facility to create a passive node (or nodes), and do the same as above, but for us it was the solution. Note that my environment is much smaller, with only 4 DAG databases.
Hope this gives you some ideas,
I agree with the suggestion to add a server to hold passive copies of your databases, and back up from it. Choose passive-only as the database source and list the new passive node as the only preferred server.
Otherwise, improving performance enough to fit your time window depends on where the time is being spent. If it's in the main data transfer of the .edb and transaction log files, then look to your network performance. If it's in the GRT phase, there are two parts most likely to take time: first, linking the data files together in nbfsd to stage the backed-up database, and then bringing up the database in the JET engine. The bpbkar log will be your guide.
Note that even though you are running multiple jobs (streams), the GRT phase runs serially. (The data transfer can run in parallel, and one job can be doing the GRT phase while others transfer data.)
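To make the point about the serial GRT phase concrete, here is a rough back-of-the-envelope sketch in Python. It is only an illustration, not a NetBackup tool: the function name and all the timings are made up, and it assumes simply that transfers can be spread across the job limit while the GRT phases run one after another. Plug in the per-database times you read from your own bpbkar logs.

```python
def estimate_lower_bound_hours(transfer_hours, grt_hours, streams):
    """Rough lower bound on wall-clock time for one backup run.

    transfer_hours: per-database data transfer time (hypothetical values)
    grt_hours:      per-database GRT phase time (hypothetical values)
    streams:        job limit per policy

    Assumption: transfers run in parallel across the streams, while
    the GRT phases run serially, one database at a time.
    """
    if streams < 1:
        raise ValueError("streams must be at least 1")
    if not transfer_hours or len(transfer_hours) != len(grt_hours):
        raise ValueError("need one transfer and one GRT time per database")

    # Parallel part: cannot beat an even split across the streams,
    # and cannot beat the single longest transfer.
    transfer_bound = max(sum(transfer_hours) / streams, max(transfer_hours))

    # Serial part: GRT phases queue behind each other, so they add up.
    grt_bound = sum(grt_hours)

    # The run cannot finish sooner than the larger of the two bounds.
    return max(transfer_bound, grt_bound)


if __name__ == "__main__":
    databases = 30
    transfer = [1.5] * databases   # made-up: 1.5 h transfer per database
    grt = [0.75] * databases       # made-up: 45 min GRT phase per database
    for job_limit in (7, 14):
        hours = estimate_lower_bound_hours(transfer, grt, job_limit)
        print(f"job limit {job_limit}: at least {hours:.1f} h")
```

With these made-up numbers the serial GRT time alone exceeds a 15-hour window, and doubling the job limit does not change the estimate. If your logs show the GRT phase dominating, raising the job limit will not help much; if the transfer dominates, it may.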
As to the impact on your Exchange servers, is that happening during the snapshot job before the backup job starts, or during the backup job?