Netbackup 7.1.0.2-High disk I/O on client backing up flat files

dontribi
Level 3

So I've been beating my head on my keyboard for quite some time and was hoping someone here may be able to offer suggestions. We currently have 1 master and 1 media server (both Server 2008 R2 SP1, 96GB of RAM, 6 quad-core processors) connecting to a tape library over Fibre Channel.  We back up our Exchange environment using snapshots and then have to get those snapshots off to tape.

The snapshot is mounted to the media server as a local drive and the media server then backs up that content to tape.  The issue is that once the backup job starts there is high disk utilization (high I/O) on the SAN for the duration of the backup.  So the bottleneck in our case is the I/O available on the SAN to process the backup.  The data being backed up consists of 1 large database file and several hundred log files (multiplied by 26 storage groups).

Through troubleshooting we have found that a single policy that backs up one mount point after another seems to be the sweet spot.  However, our I/O will still spike to 100% and sit there for 5-10 minutes, not giving the servers on the same disks any I/O.  I've seen bandwidth throttling, but I'm not sure it applies in our scenario since the network isn't the issue.  It may still be necessary to slow the data stream down before it gets to tape.  I can provide additional data if needed.
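
To make "slowing the data stream down" concrete, here is a minimal sketch in Python of reading a file at a capped average rate. This only illustrates the general idea of throttling at the reader; it is not a NetBackup setting or feature, and `throttled_read` with all of its parameters is a hypothetical name made up for this example.

```python
import time


def throttled_read(stream, max_bytes_per_sec, chunk_size=64 * 1024,
                   clock=time.monotonic, sleep=time.sleep):
    """Yield chunks from `stream`, pausing so the average read rate
    stays at or below `max_bytes_per_sec` (hypothetical helper)."""
    start = clock()
    sent = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        sent += len(chunk)
        yield chunk
        # Earliest moment at which `sent` bytes are allowed under the cap.
        earliest = start + sent / max_bytes_per_sec
        delay = earliest - clock()
        if delay > 0:
            sleep(delay)
```

A caller would loop over `throttled_read(open(path, "rb"), max_bytes_per_sec=...)` and pass each chunk on. The cap trades a longer backup window for leaving some disk I/O free for other consumers.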

10 REPLIES

Gautier_Leblanc
Level 5
Partner Accredited

Hello,

What is your backup target? Tapes (how many)? Disk?

I don't think NetBackup has a way to slow down I/O, but maybe we can find some tips to do that :)

Nicolai
Moderator
Partner    VIP   

What type of snapshot are you using: Full Mirror or Space Efficient snapshots?

A space efficient snapshot (the kind that saves space) uses the same physical disks as the source. Sharing the same disks reduces the I/O available to each server.

Marianne
Level 6
Partner    VIP    Accredited Certified

Is the same happening when you do the 'bpbkar test'?

See  http://www.symantec.com/docs/TECH17541

dontribi
Level 3

The target for the backups is tape.  It is a Scalar i500 with 10 LTO5 drives, and the total amount of data to be backed up is roughly 3TB.  The backup has the same retention as other backups, so the tapes are shared.  The job that backs it up runs by itself during the day; towards the end of the business day other jobs start to kick off, possibly using the same tape.
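
For a rough sense of scale, the arithmetic below estimates how long roughly 3TB takes to move at a few sustained rates. It is only a sketch: the rates are illustrative assumptions rather than measurements from this environment, decimal units (1 TB = 1,000,000 MB) are assumed, and `backup_hours` is a hypothetical helper name.

```python
def backup_hours(total_tb, mb_per_sec):
    """Hours needed to move `total_tb` terabytes at a sustained
    `mb_per_sec` rate, assuming 1 TB = 1,000,000 MB."""
    total_mb = total_tb * 1_000_000
    return total_mb / mb_per_sec / 3600


if __name__ == "__main__":
    # Illustrative sustained rates in MB/sec (assumptions, not measurements).
    for rate in (50, 100, 120):
        print(f"{rate} MB/sec: {backup_hours(3, rate):.1f} hours")
```

At a sustained 120 MB/sec on a single stream, 3TB works out to just under 7 hours; halving the rate doubles the window.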

dontribi
Level 3

The snapshots are NetApp native snaps, which are space efficient.  So it is more of a delta snap, leaving pointers to the original data and sharing I/O with the live data on the same disk array.

dontribi
Level 3

It appears the disk I/O is the same.  It spikes to 100% during the backup.  I attached a screenshot of our filer head taken while the backup was running.  The Disk Util column shows the utilization of the disks.

Also I have attached the results from the tests on both the Data lun and the Log lun for the backup.

Marianne
Level 6
Partner    VIP    Accredited Certified

In the 12 years (almost 13) that I've been supporting NBU, I have NEVER seen a support call where anybody complained that backups were too fast... Most users can only DREAM of 120 MB/sec!

I do understand why you say this is a problem:

'... sit there for 5-10 minutes not giving the servers on the same disks any I/O'.

Maybe re-visit lun layout and ensure dedicated disks for these luns, then ENJOY your fast backups!

You need throughput in excess of 100 MB/sec to stream LTO3 tape drives (not to mention LTO4 and 5...).

dontribi
Level 3

Unfortunately we have already separated out the disks.  The storage is split between two filer heads, with separate disks for the log and the data luns.  Data is on one filer and logs are on the other.  Where did you get the value of 120mb/sec?  I'm assuming it's in the log file, but I'm new to reading those.

 

Also, while working on resolving this issue, I found this article: http://www.symantec.com/business/support/index?page=content&id=HOWTO41830&key=15143&actp=LIST

 

Would that possibly help us with our issue?

Marianne
Level 6
Partner    VIP    Accredited Certified

"Where did you get the value of 120mb/sec?"

 

Other servers are still sharing the same physical disks?

dontribi
Level 3

Ah...that's the disk operation on the SAN.  That isn't the backup; that is the normal read load of our Exchange environment.  The only server using this SAN is Exchange (a single mailbox server that is part of a CCR cluster).  There are two separate SANs, and the log and data luns are split equally over the two so as not to overload one box.