
Duplication throughput

sk_martin2001
Level 3

I need some help.

My environment.

WS 2008

NBU 7.6.1

1> I am trying to duplicate my old LTO1/LTO2 tapes to new LTO4 tapes, but I am getting a very low throughput of 20 MB/sec. I have LTO3 and LTO4 drives reading the LTO1 and LTO2 tapes respectively.

Is this the maximum we can get, or can we improve the performance with any tweaks?

2> Also, how can I find out whether I am using my LTO4 drives to the full? How do I check the throughput on my LTO4 drives and make best use of them in duplication?

 

10 REPLIES 10

sdo
Moderator
Partner    VIP    Certified

1) How many old LTO1/2 tapes do you have to process?

2) What bus/connectivity (SCSI-1, SCSI-2, Wide, Ultra? - or FC 1Gb, 2Gb) are the old LTO1/2 tape drives connected via?

3) Were the original backup images multiplexed when they were originally written to the old LTO1/2 tapes?

Marianne
Moderator
Partner    VIP    Accredited Certified
More questions: Do you know what SIZE_DATA_BUFFERS and NUMBER_DATA_BUFFERS were used at the time of the original backups? What are these values set to at the moment? Can we assume that read and write drives are connected to the same media server? How many drives are in use simultaneously? Do you have a bptm log in place to monitor read and write throughput? (Level 0 will suffice.)

sk_martin2001
Level 3

Sdo:-

1) How many old LTO1/2 tapes do you have to process?

I have to process 22000 tapes written in 2008 to 2009

2) What bus/connectivity (SCSI-1, SCSI-2, Wide, Ultra? - or FC 1Gb, 2Gb) are the old LTO1/2 tape drives connected via?

These are connected via FC, using 3 media agents and have 10 drives shared across 3 media agents.

3) Were the original backup images multiplexed when they were originally written to the old LTO1/2 tapes?

Not sure about that. Is there a way we can check?

 

Marianne :-

 

Do you know what SIZE and NUMBER_DATA_BUFFERS were used at the time of original backups?

Sorry, I am not sure what they were at the time of backup. Is there a way to check now?

What are these values set to at the moment?

SIZE_DATA_BUFFERS - 262144

NUMBER_DATA_BUFFERS - 32

Can we assume that read and write drives are connected to the same media server?

Yes we have 10 drives shared across 3 media servers.

How many drives simultaneously in use?

10

Do you have bptm log in place to monitor read and write throughput? (Level 0 will suffice.)

I just set it up. Please let me know how to read it and I can provide the information.
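
For reference, the buffer settings quoted above can be turned into a shared-memory figure per media server. This is only a minimal sketch: it assumes the usual rule of thumb (buffer size x number of buffers x drives x multiplexing factor), and the function name and the drive counts passed in are illustrative, not something NetBackup provides.

```python
# Sketch: shared memory consumed by NetBackup data buffers on one media server.
# Assumption: memory = SIZE_DATA_BUFFERS * NUMBER_DATA_BUFFERS * drives * mpx.

SIZE_DATA_BUFFERS = 262144   # bytes, value quoted in this post
NUMBER_DATA_BUFFERS = 32     # value quoted in this post


def buffer_memory_bytes(size, number, drives, mpx=1):
    """Return the shared memory (bytes) used by data buffers.

    `drives` is the number of drives active at once on the media server;
    `mpx` is the multiplexing factor (1 means no multiplexing).
    """
    return size * number * drives * mpx


MIB = 2 ** 20
per_drive = buffer_memory_bytes(SIZE_DATA_BUFFERS, NUMBER_DATA_BUFFERS, drives=1)
# Upper bound if a single server ever drove all 10 shared drives.
all_drives = buffer_memory_bytes(SIZE_DATA_BUFFERS, NUMBER_DATA_BUFFERS, drives=10)

print(f"per drive : {per_drive / MIB:.0f} MiB")
print(f"10 drives : {all_drives / MIB:.0f} MiB")
```

Even the 10-drive upper bound is small next to the RAM of a typical media server, so under these assumptions there is room to experiment with larger values.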

Marianne
Moderator
Partner    VIP    Accredited Certified

I don't know if there is a way to check now for the original buffer settings.

With 10 drives shared across 3 media servers, you need to ensure that each duplication job will definitely use a read drive and a write drive on the same media server.

You also need to confirm that you are not overloading any particular media server - 
If one media server is using 4 drives simultaneously (2 read and 2 write), you need a seriously 'beefy' media server with drives preferably connected (zoned) via different HBAs.
In my experience, I have hardly ever seen a media server that can stream (read and/or write) more than 2 or 3 drives at documented drive speed.
And if all drives are zoned to the same HBA, this will probably be the bottleneck.

To read the bptm log, you may want to go through the Performance Tuning Guide - link in Featured Post: Updated NetBackup Backup Planning and Performance Tuning Guide for Release 7.5 and Release 7.6

At the beginning of a duplication job, separate bptm processes will be started for the read and write sides (they can be seen as different PIDs in square brackets [ ] that will correspond with the PIDs in Job details).
The number and size of data buffers will be logged for each of the read and write processes.
Intermediate throughput will be logged at the end of each read or write fragment.

At the end of the job there will be 'waited for full/empty buffers' entries for each of the read and write processes. 

See this section in Performance Tuning Guide for detailed description :

Finding wait and delay counter values

 

sk_martin2001
Level 3

Thanks for the valuable inputs.

Yes, each duplication job uses a read and a write drive on the same media server.

I would like to understand more about the overloading. I have 8 Gb HBAs on all 3 media servers, zoned to all the drives. So every media server sees all the drives, but at any point in time each server is using 4 drives (2 read and 2 write).

I was assuming that the throughput you get is 20 MB/sec on LTO1 and 40 MB/sec on LTO2. This is the uncompressed throughput you can get on LTO1 and LTO2 tapes.

So I am reading at, let's assume, 40 MB/sec * 2 drives (LTO1 and LTO2 drives) = 80 MB/sec from the 2 source drives.

Writing at, let's assume, 120 MB/sec * 2 (LTO4 drives) = 240 MB/sec to the 2 LTO4 drives.

Total = 240 + 80 = 320 MB/sec

So if I am reading and writing at a combined 320 MB/sec, wouldn't an 8 Gb HBA be enough for the media server?

Please clarify this part as this might help me. Thanks again for your time.
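
The arithmetic above can be checked with a short sketch. It assumes the HBA in question is a single 8 Gb/s Fibre Channel port, which is nominally good for roughly 800 MB/sec in each direction; the per-drive rates are the ones assumed in this post, and the variable names are illustrative.

```python
# Sketch: compare assumed drive throughput against nominal 8 Gb FC capacity.
# Assumption: one 8 Gb/s FC port carries about 800 MB/s per direction.

READ_MB_S = 40        # assumed per-drive read rate (LTO2 native)
WRITE_MB_S = 120      # assumed per-drive write rate (LTO4 native)
DRIVE_PAIRS = 2       # 2 read drives and 2 write drives per media server
FC8_MB_S_PER_DIRECTION = 800

inbound = READ_MB_S * DRIVE_PAIRS     # data arriving from the source drives
outbound = WRITE_MB_S * DRIVE_PAIRS   # data sent to the LTO4 drives
total = inbound + outbound


def utilization(load_mb_s, capacity_mb_s):
    """Fraction of link capacity used by the given load."""
    return load_mb_s / capacity_mb_s


print(f"inbound  {inbound} MB/s = {utilization(inbound, FC8_MB_S_PER_DIRECTION):.0%} of rx")
print(f"outbound {outbound} MB/s = {utilization(outbound, FC8_MB_S_PER_DIRECTION):.0%} of tx")
print(f"total    {total} MB/s")
```

Because Fibre Channel is full duplex, reads and writes load opposite directions of the link, so under these assumptions neither direction comes close to saturating one 8 Gb port.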

 

sk_martin2001
Level 3

Just to add one more thing: memory and CPU are not a problem at all, judging by the performance I see. I have 96 GB RAM on these servers.

sk_martin2001
Level 3

Could someone please clarify this?

sdo
Moderator
Partner    VIP    Certified

You won't be able to write to LTO4 any faster than you can read from LTO1/LTO2.

I suspect that your LTO1/2 read speeds may be causing your LTO4 drives to stutter/shoeshine because the LTO4 drives might not be achieving their minimum streaming speeds.

If the original images were multiplexed then this makes the above even more likely.
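
The reasoning above can be put as a rough sketch. Two things here are assumptions, not measured facts: the minimum streaming speed of the LTO4 drive (it varies by vendor, and 40 MB/sec is only a placeholder) and the simplification that pulling one image out of an N-way multiplexed tape yields about 1/N of the drive's read rate.

```python
# Sketch: duplication runs at the speed of the slower side, and the target
# drive shoe-shines if that speed is below its minimum streaming rate.

def duplication_rate(read_mb_s, write_max_mb_s, mpx=1):
    """Effective duplication rate in MB/s.

    Assumption: with an mpx-way multiplexed source, one image is read at
    roughly read_mb_s / mpx because other images' blocks are skipped.
    """
    effective_read = read_mb_s / mpx
    return min(effective_read, write_max_mb_s)


def is_streaming(rate_mb_s, min_streaming_mb_s):
    """True if the target drive is fed fast enough to keep streaming."""
    return rate_mb_s >= min_streaming_mb_s


LTO4_MAX = 120            # native LTO4 write rate, MB/s
LTO4_MIN_STREAMING = 40   # hypothetical placeholder, check the drive's spec

for source, read_rate, mpx in [("LTO1", 20, 1), ("LTO2", 40, 1), ("LTO2", 40, 4)]:
    rate = duplication_rate(read_rate, LTO4_MAX, mpx)
    state = "streaming" if is_streaming(rate, LTO4_MIN_STREAMING) else "shoe-shining risk"
    print(f"{source} mpx={mpx}: {rate:.0f} MB/s -> {state}")
```

Under these assumptions the 20 MB/sec reported in the original question matches what an unmultiplexed LTO1 source can deliver at best.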

To see what speeds you are actually achieving from/to the tape drives, log on to the SAN switch and run "portperfshow" (if it's a Brocade). You should then be able to see and work out the actual data volumes flowing through each of the SAN switch ports that face the tape drives and the server HBAs.

Choose a time that is relatively quiet on the media server and then try a duplication again, and monitor four SAN switch ports:

1) LTO1/2 tape drive -> SAN switch port (rx)

2) SAN switch port (tx) -> server HBA

3) server HBA -> SAN switch port (rx)

4) SAN switch port (tx) -> LTO4 tape drive

If you see low speeds and the media server is not stressed, then the root cause is probably the nature of the LTO1/2 media (probably multiplexed) and/or the speed limitation of the old LTO1/2 drives.

.

The media report should show whether the source LTO1/2 media are multiplexed or not.

sdo
Moderator
Partner    VIP    Certified

With 22,000 old tapes to process, are you doing this by hand?

Genericus
Moderator
   VIP   

I did this when I moved from LTO2 to LTO5, but I got 20 TB of disk, duplicated images to disk and then wrote them to tape. This enabled me to queue up bunches of images via scripting, write them to disk, and then send them out at destination tape speeds. I was also able to clean up a lot of wasted space: I had all my application groups review their retention requirements, and was able to purge a TON of backups kept "just in case we need to fail back our database to Oracle 8" from years ago, which would never happen.

 

NetBackup 9.1.0.1 on Solaris 11, writing to Data Domain 9800 7.7.4.0
duplicating via SLP to LTO5 & LTO8 in SL8500 via ACSLS