
Duplication to tape using SLP is running very slow

Born2rise
Level 4

Hi All,

We use an SLP for backups, with a duplication-to-tape operation that runs once the backup has completed. The problem is that duplication to tape is running very slowly, which is causing the SLP backlog to grow. I have increased the buffer settings on the media servers, but there is not much improvement in duplication to tape. Backup to tape does run at a good speed (50-70 mbps), but duplication is not getting the same performance.

Below are the environment details.

Media/Master servers: NetBackup 7.7.3, all on SUSE Linux

We are using Data Domain as the backup destination, then an SLP to perform duplication to tape.

An IBM tape library with 21 LTO-4 tape drives; all are running fine and are used for duplication. I have calculated that the total written to tape is 15 TB per day, which is not enough to handle the backlog, since backup to disk is approx. 100 TB per day.
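For reference, the daily totals above can be turned into an average per-drive rate. This is only a rough sketch: it assumes decimal units (1 TB = 10^12 bytes), all 21 drives writing around the clock, and an even spread across the drives, none of which is confirmed in the post.

```python
SECONDS_PER_DAY = 24 * 60 * 60
BYTES_PER_TB = 10**12  # decimal terabyte (assumption)
BYTES_PER_MB = 10**6   # decimal megabyte (assumption)


def aggregate_mb_per_s(tb_per_day):
    """Average aggregate throughput in MB/s for a given daily volume."""
    return tb_per_day * BYTES_PER_TB / BYTES_PER_MB / SECONDS_PER_DAY


def per_drive_mb_per_s(tb_per_day, drives):
    """Average per-drive throughput if the load is spread evenly."""
    return aggregate_mb_per_s(tb_per_day) / drives


DRIVES = 21
current = per_drive_mb_per_s(15, DRIVES)    # what is written to tape today
required = per_drive_mb_per_s(100, DRIVES)  # what is needed to keep up with disk

print(f"current:  {current:.2f} MB/s per drive")
print(f"required: {required:.2f} MB/s per drive")
```

Under those assumptions the drives average roughly 8 MB/s each today, while keeping pace with 100 TB per day would need roughly 55 MB/s per drive sustained.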

Please suggest if anything can be done.

8 REPLIES 8

Amol_Nair
Level 6
Employee
The major difference here is that for normal tape backups you may have increased the MPX setting, which may have improved backup performance. Unfortunately this cannot be done when duplicating images from a disk STU, since the MPX value on a disk STU is always set to 1 and can't be increased.

What is the MPX setting on the tape storage unit? (I am referring to the value only with respect to backups.)

You mentioned you have tried increasing the buffer settings. Could you please elaborate? Directly increasing them would apply to backups. If you plan on increasing the values only for duplication, then you need to use a combination of restore buffer and backup buffer settings (as duplication is a kind of restore + backup).
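Since larger or more numerous buffers consume shared memory on the media server, it can help to estimate the footprint before changing anything. The sketch below uses the commonly quoted rule of thumb (buffer size x number of buffers x drives x MPX). The values are hypothetical, not recommendations, and the parameter names are illustrative rather than actual NetBackup setting names.

```python
KIB = 1024
MIB = 1024 * KIB


def buffer_shared_memory_bytes(buffer_size, buffer_count, drives, mpx=1):
    """Estimate shared memory used by data buffers on one media server.

    Rule of thumb only: size * count * drives * multiplexing.
    For duplication from a disk STU, mpx is 1.
    """
    return buffer_size * buffer_count * drives * mpx


# Hypothetical example: 256 KiB buffers, 256 of them, 4 drives on this server.
estimate = buffer_shared_memory_bytes(256 * KIB, 256, drives=4, mpx=1)
print(f"{estimate / MIB:.0f} MiB of shared memory")
```

Comparing the estimate against the memory actually available on each media server is a quick sanity check before raising either value.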

Do you see waits or delays in the job details? Since the image goes to the DD and then moves to tape, do verify whether the DD is taking more time to rehydrate the images (on the back end it has its own dedup technology and needs to rebuild the complete image before sending it to tape).

One good test you could try: if there is AdvancedDisk in the environment, send a small test backup to it and then duplicate it to tape.

And lastly, I forgot to ask: is the issue happening with only a specific media server, by any chance? Try switching the media server.


Marianne
Moderator
Partner    VIP    Accredited Certified

How many media servers do you have? 
And what is the network speed between DD and media servers?
Dedicated network between media servers and DD?
How many simultaneous duplications? And what is the spread between the media servers?
(I guess DD is not SAN attached to media servers, right?)

Have you considered the data path during duplication?
i.e. DD -> rehydrate -> network -> Media server -> tape drive.

Compare this with backup : 
Client -> network -> Media server -> tape drive.
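One way to reason about the two paths is that the end-to-end rate can be no faster than the slowest stage. The sketch below illustrates this with placeholder per-stream rates. Every number is hypothetical and would need to be replaced with measured values; only the stage names come from the paths above.

```python
def bottleneck(stage_rates_mb_s):
    """Return (stage name, rate) of the slowest stage in a data path."""
    name = min(stage_rates_mb_s, key=stage_rates_mb_s.get)
    return name, stage_rates_mb_s[name]


# Placeholder per-stream rates in MB/s (hypothetical, not measured).
duplication_path = {
    "DD rehydrate": 40.0,
    "network": 110.0,
    "media server": 200.0,
    "tape drive": 120.0,
}

stage, rate = bottleneck(duplication_path)
print(f"slowest stage: {stage} at {rate} MB/s")
```

With real measurements for each stage, the same comparison shows which hop is worth tuning first.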

Have you enabled bptm and bpdm logs (level 3) on media servers to evaluate results of buffer sizes before and after making changes?

Nicolai
Moderator
Partner    VIP   

80 MB/sec per stream is what you can expect from a Data Domain running DDOS 5.x (first-hand experience). I do not know whether this has improved in DDOS 6.x.

The rehydration speed of a Data Domain depends on the age of the data. Newer data rehydrates faster than older data, due to internal block optimization within the Data Domain.

So, if you know the data has to be stored on tape for an extended time, consider duplicating it to tape at the beginning of the SLP instead of when the data is about to expire (the active/postponed option in the SLP configuration).

Best Regards

Nicolai

Thanks all for the suggestions.
@Marianne
As I checked, before adjusting the buffers, backup to tape was running at 20 to 30 mbps, and it has improved since making the changes.

Our media servers are connected to the DD over 10 Gig Ethernet. The DD is not SAN-connected as of now. About 18 to 20 duplication jobs run at the same time.

@amol: I adjusted number_data_buffers_tape and size_data_buffer_tape. We do have an appliance and are getting the same or slightly better duplication performance on it. Can you please let me know about the restore and backup buffer settings you mentioned?

And I can see "waited for full buffer 55902 times, delayed 255521 times" messages on the duplication jobs.

@Nicolai: Yes, the data is old now due to the backlog, and I did not quite understand the active and postponed option. Can you please help me understand it? How should I proceed?

Thanks again

Marianne
Moderator
Partner    VIP    Accredited Certified

"waited for full buffer 55902 times, delayed 255521 times"
indicates that the media server is not receiving data fast enough.
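To compare these counters before and after a tuning change, the message can be pulled out of the job details or logs with a small parser. This is a minimal sketch: the regular expression matches only the wording quoted above, and the reading of the "empty" variant (the writing side is the slow one) is a commonly cited interpretation that is assumed here, not stated in this thread.

```python
import re

# Matches the message format quoted in this thread.
WAIT_PATTERN = re.compile(
    r"waited for (full|empty) buffer (\d+) times, delayed (\d+) times"
)


def parse_buffer_waits(line):
    """Return (kind, waits, delays) or None if the line has no such message."""
    match = WAIT_PATTERN.search(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2)), int(match.group(3))


def likely_slow_side(kind):
    """Rough interpretation: 'full' waits mean the writer was starved of data."""
    if kind == "full":
        return "data is not arriving fast enough (source or network side)"
    # Assumption: 'empty' waits mean the writing side could not drain buffers.
    return "data is not being written fast enough (destination side)"


sample = "waited for full buffer 55902 times, delayed 255521 times"
kind, waits, delays = parse_buffer_waits(sample)
print(kind, waits, delays, f"{delays / waits:.2f} delays per wait")
print(likely_slow_side(kind))
```

Tracking the wait and delay counts per job across buffer changes gives a more direct signal than elapsed time alone.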

Nicolai
Moderator
Partner    VIP   

In the SLP configuration there is a section called "State of secondary operation processing". You can choose to process the duplication either at the birth of the backup (the "active" option) or at the end of the copy's (backup's) lifetime (the "postponed" option).

Active is the default and is likely what you have configured, in which case there is nothing to change. However, if the SLP is configured with the postponed option, you should consider switching to "active" once the backlog issue is resolved.

Hope this explains it.


If I understood correctly, when the postponed option is selected, the backup is retained according to the retention of the backup operation, and the SLP backlog will not grow until the SLP becomes active again.

If that is the case, then we need to find another solution, as I have to make sure we have a copy of the backup on tape for a longer duration.

We have seen good throughput on backup.
What could be causing this delay?