
Exchange 2010 DAG as a SAN media server

Mahmoud_Moussa
Level 5
Partner

I have a problem backing up Exchange 2010 DAG on Windows Server 2008 R2: the backup is very slow, and it often hangs and fails with errors 42 and 24.

I know these errors are network-related, but I can't tell what specifically I should do to resolve them. Is there any tool to see the packets being sent? Is there any way of looking deeper into the problem?

To isolate the problem, I thought of running the backup over SAN. I know there are two ways to do a backup over SAN. One of them is to make this client a SAN media server: configure the robot on the client, add the client to the master server as a media server, and then choose the robot configured on the client in the attributes. So my question is:
Is that possible given that this Exchange 2010 DAG has three nodes? Would it mean configuring the robot on all three nodes?

NetBackup version: 7.5
Client: Windows 2008 R2
SAN license is added
Agent: NetBackup for Exchange DB

If you need any more information, please don't hesitate to ask. I look forward to all your replies.

Thanks in advance,

5 REPLIES

V4
Level 6
Partner Accredited

Why not look at the FT-SAN client option?

The DB copies would be spread across the nodes, so jobs would fail if the backup has to switch to another node to pick up a copy of a database.

Either way, you can also go for Snapshot Client (off-host backup).

CraigL
Level 4

Hi -

The fact that your LAN-based backups are slow may be cause for concern and is probably worth investigating.

A few questions/comments:

- What kind of throughput ARE you seeing?

- Do you have firewall ports 1556, 13720, 13724, 13782 and 13783 open on all of your DAG nodes?

- Are you on NBU 7.5.0.5?  This version is required for Exchange 2010 SP3 support.

- Do you have the 'Perform snapshot backup' checkbox checked in Attributes?

- Are all of your DAG members at the same site?

- Have you tried changing the 'Database backup source' to see if performance improves?

- Have you monitored local performance on your DAG nodes when the job is running?

- For your Client, you should have the DAG name under 'Client Name'.

- Have you checked for updated NIC driver/firmware?  We just had an issue here with some HP DL580 G7s that had horrid network throughput and required an updated driver and firmware to bring the performance back.
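The port question above can be checked quickly from the master or media server. This is a minimal sketch, assuming plain TCP reachability is what you want to test; the node names in the usage note are hypothetical, and the port list is simply the one from this post. A successful connect only shows that something is listening and that no firewall blocks that path; it does not prove the NetBackup daemons are healthy.

```python
import socket

# Port list taken from the post above; adjust it if your environment differs.
NBU_PORTS = (1556, 13720, 13724, 13782, 13783)


def check_ports(host, ports=NBU_PORTS, timeout=3.0):
    """Return {port: True/False} depending on whether a TCP connect succeeds."""
    results = {}
    for port in ports:
        try:
            # create_connection resolves the host and performs the TCP handshake.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts, and name resolution failures.
            results[port] = False
    return results


def report(hosts, ports=NBU_PORTS):
    """Print one line per host and port, for example for each DAG node."""
    for host in hosts:
        for port, is_open in check_ports(host, ports).items():
            print(f"{host}:{port} {'open' if is_open else 'CLOSED or filtered'}")
```

For example, `report(["dagnode1", "dagnode2", "dagnode3"])` (hypothetical node names) run from the media server prints the state of each port per node. Running it in the other direction, from a node toward the master, is worth doing too, since firewalls are often asymmetric.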

Let me know!

-Craig

Mahmoud_Moussa
Level 5
Partner

Hey all,

Thanks for your quick and informative replies.
@CraigL 
Regarding your questions, please find my answers below.
 

- What kind of throughput ARE you seeing?
You mean the write speed? If so, it varies. I'm using snapshots for each node of the DAG, so the backup speed differs from node to node: some are fast, and there is always one node that is very slow, around 640 KB/s and sometimes even 21 KB/s, but it is not always the same node.

- Do you have firewall ports 1556, 13720, 13724, 13782 and 13783 open on all of your DAG nodes?
Let me check these ports. I believe they are open, but I'll check again on Sunday, because here in Egypt Friday and Saturday are the weekend :)

- Are you on NBU 7.5.0.5?  This version is required for Exchange 2010 SP3 support.
No, I have only version 7.5; I didn't upgrade it to 7.5.0.5. Do you recommend that I upgrade?

- Do you have the 'Perform snapshot backup' checkbox checked in Attributes?
Yes I do 

- Are all of your DAG members at the same site?
Yes, they are. Actually, they are all virtual servers on VMware ESX.

- Have you tried changing the 'Database backup source' to see if performance improves?
Yes, I have, and I'm now using "only the passive copy and if not available the active copy".
I tried all the other options and still get the same result :)

 

- Have you monitored local performance on your DAG nodes when the job is running?
Yes, I have. Most of the time the error is related to the VSS writers: when I run "vssadmin list writers", it shows the Exchange Replica writer as unstable. I reboot the server and the writer is stable again, but as soon as I start the backup and check again while it is running, the writer has gone back to unstable.

- For your Client, you should have the DAG name under 'Client Name'.
Yes I do 

- Have you checked for updated NIC driver/firmware?  We just had an issue here with some HP DL580 G7s that had horrid network throughput and required an updated driver and firmware to bring the performance back.
I'm not sure, but these servers are very new, so I think they have the latest firmware. I'll check with the Windows and Exchange administrator, but also on Sunday :)
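Since the VSS writer state keeps changing during the backup, it may help to check it repeatedly rather than by eye. The following is a rough sketch, assuming the usual `vssadmin list writers` text layout ("Writer name: '...'" followed by an indented "State: [n] ..." line); the function names are made up for illustration, and the command needs an elevated prompt on the Exchange node.

```python
import re
import subprocess

# Patterns for the two lines of interest in `vssadmin list writers` output.
WRITER_RE = re.compile(r"^Writer name:\s*'(?P<name>.+)'$")
STATE_RE = re.compile(r"^State:\s*\[(?P<code>\d+)\]\s*(?P<state>.+)$")


def parse_writers(text):
    """Return a list of (writer_name, state) pairs found in the output text."""
    writers = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        name_match = WRITER_RE.match(line)
        if name_match:
            current = name_match.group("name")
            continue
        state_match = STATE_RE.match(line)
        if state_match and current is not None:
            writers.append((current, state_match.group("state").strip()))
            current = None
    return writers


def unstable_writers(text):
    """Return only the writers whose state is anything other than Stable."""
    return [(n, s) for n, s in parse_writers(text) if s.lower() != "stable"]


def unstable_writers_on_this_host():
    """Run vssadmin locally (Windows, elevated) and report non-stable writers."""
    result = subprocess.run(
        ["vssadmin", "list", "writers"],
        capture_output=True, text=True, check=True,
    )
    return unstable_writers(result.stdout)
```

Calling `unstable_writers_on_this_host()` every minute or so while the backup job runs, and logging the result with a timestamp, would show at what point the Exchange Replica writer leaves the stable state.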

 

@Captain Jack Sparow 
Yes, I'm doing the off-host snapshot backup. Isn't that done by checking the box for snapshots in the Attributes tab, or am I missing something here?
I will also think about the FT media server.
So you mean that using the SAN media server option is sure to fail?
If so, I will think about using the FT media server :)

 

Thanks to both of you

V4
Level 6
Partner Accredited

Mahmoud

A DAG has its copies spread across nodes. If you choose to pick up the active copies, imagine the robot being active on node1 to pick up db1 and then getting redirected to node2 to pick up db2. To be frank, that does not happen; instead it streams db2 over the LAN, despite using the storage group feature.

For a DAG or any clustered application, I would strongly recommend going for FT-SAN client. The license requirement would be the same, but you would need an additional server: an FT media server running RHEL/SLES with an HBA from a specific QLogic family.

Refer to the FT-SAN client documentation for more info.

With FT-SAN client, not only can you write to tape over SAN, but you can also use AdvancedDisk features to write data over SAN.

 

Let me know if the above is helpful.

CraigL
Level 4

OK..that throughput you are seeing.. is not good.  Some other comments about your answers above:

- If you're running Exchange 2010 SP3, get to NBU 7.5.0.5 just to cross that off your list.  It is the minimum required for Exchange 2010 SP3.

- When you say you are using snapshots.. what do you mean exactly?  For VM-level backups, that makes sense, but for your mail database backups, VSS should be the only 'snapshot' that is taking place.

- From your DAG nodes, have you tried copying large files to machines OUTSIDE of your ESXi hosts?  If not, can you do that and check the throughput?   If the file copies (try using a large ISO file, for example) are fast, then you know your problem lies somewhere within the backup process and not your VM or ESXi.  If slow, then you can start to look at your ESXi and VM setup.

- It's possible that VSSadmin message you are seeing could be the culprit.  I found this link that may be of assistance in troubleshooting this one:  http://blogs.technet.com/b/exchange/archive/2013/04/29/troubleshoot-your-exchange-2010-database-backup-functionality-with-vsstester-script.aspx .  Ultimately, if this doesn't expose the issue, you may want to open a case with Microsoft.

- My question regarding drivers/firmware is not relevant in this case; I didn't realize your Exchange 2010 servers were VMs.  The issue we ran into was on Windows-based hosts.

- I see some comments about doing an off-host backup.  Are you attempting this backup via a LAN-free method?  (ie: your NBU server is zoned into your VMFS volumes?)
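To put a number on the file copy test suggested above, something like the following could be used. It is only a sketch: the paths in the usage note are hypothetical, and the figure it reports includes local disk read time and any OS caching, so use a file that is large (a multi-GB ISO) and a destination that really is on another machine.

```python
import os
import shutil
import time


def copy_throughput_mb_s(src, dst):
    """Copy src to dst and return the observed rate in MB/s (1 MB = 10**6 bytes)."""
    size_bytes = os.path.getsize(src)
    start = time.perf_counter()
    shutil.copyfile(src, dst)
    # Guard against a zero interval on very small files.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return size_bytes / elapsed / 1e6
```

For example, `copy_throughput_mb_s(r"D:\iso\large.iso", r"\\fileserver\share\large.iso")` (both paths are placeholders) run on each DAG node gives a figure to compare against the 640 KB/s and 21 KB/s seen in the backup jobs. If the copy runs at tens of MB/s or more, the network path itself is probably not the bottleneck.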

-Craig