05-15-2011 11:05 PM
Our configuration is:
= NetBackup 6.5.6 on a Solaris 10 server based on T2+ processors
= A Data Domain OST device with DD Boost
= Network is 10 Gb/s Ethernet
Maximum performance of a single stream is about 30 MB/s... we can run a lot of streams, but the performance of each individual stream remains low!
On Linux processors, or on Solaris processors but without DD Boost, performance is good (between 100 MB/s and 200 MB/s for each stream).
Has anyone experienced this?
05-16-2011 12:22 AM
Turn boost off and have a look at the performance..........
05-16-2011 01:36 AM
It is better without boost...
"On Solaris processors without DD Boost, performance is good (between 100 MB/s and 200 MB/s for each stream)."
However, as some of my Solaris media servers have only 1 Gb/s Ethernet attachments, I need DD Boost to reach the target throughput!
05-16-2011 01:46 AM
Hi,
I saw exactly the same thing (on Windows). The problem is that the performance figures you may have been promised relate more to an environment with multiple SAN media servers, not to a few media servers with lots of clients sending data to them.
What you must remember is that DD Boost is NOT client-side dedupe: what you're doing is turning on a deduplication engine on the media server, which talks to the DD device and then sends the deduped data from the media server to the DD. The problem is that when you turn it on, the time taken to process the data on the media server (by Boost) causes a delay. Your clients still send their FULL data set to the media server, whether you use Boost or not. The only difference is that with Boost it takes longer to process the same piece of data before it can be sent (deduped) to the DD device. This might alleviate the load on the DD, but it does absolutely nothing for your backup window; in fact, with Boost on, it increases the window.
That's my understanding of it from what I've seen; I look forward to hearing your views.
05-16-2011 04:14 AM
Yes, I agree, but it works much faster on Linux (x86) and Solaris (SPARC T2+).
I am trying to:
1/ Check that I am not alone in having problems on Solaris (I may have forgotten a prerequisite?)
2/ Check whether there is a trick to go faster anyway?
Data Domain documents say that Solaris T2+ is almost as fast as x86, but they don't say whether that holds for a small or a large number of streams!
05-16-2011 06:41 AM
I'd check couple of things.
First, the version of the OST plug-in. Recent versions (at least 2.2.3 and 2.3.1) are able to leverage the built-in cryptographic accelerator of T2+ processors (if libpkcs11.so is available). This lowers the load on the media server compared to traditional Boost, and compared to transferring the entire stream to the DD box, thus speeding up the backup when the link between the DD and the media server is the bottleneck.
Second, network buffering settings. The OST plug-in admin guide explains in detail how the system should be tuned for best performance.
05-16-2011 06:48 AM
Client-side dedupe doesn't shrink the backup window with NBU's built-in dedupe client either. It still reads the entire data set from disk without caching, so the only tangible result you can expect is a reduction in traffic between the client and the media server.
05-16-2011 06:57 AM
Not on a single-client level, true, but a reduction in traffic could allow you to send more data at the same time. The logical (call it that if you will) bandwidth increases, which means clients that might otherwise have been scheduled later can run earlier, as the media server is able to accept their data (the network is not congested).
Much like an incremental, really (just on another level).
What I actually wanted to get across is that the DD does not offer this, even though people are led to believe the quoted 5 TB/hour is possible in a LAN client environment...
05-16-2011 07:00 AM
DD does offer this, but in SAN media server environments (where you have the option to install the OST plug-in and use a local storage unit). There are also other backup apps out there whose clients can use Boost without a full-blown media server, but those are off-topic for this forum :)
05-16-2011 07:07 AM
By the way, there is no technical limitation preventing an OST plug-in in the NBU client, given that the PureDisk client plug-in already works this way. But SYMC are too greedy to allow OST-on-client and combinations like Boost-on-client with DD, simply because they would no longer be able to charge for SAN media server licenses.
05-16-2011 07:16 AM
Yeah, sorry, I should have phrased it differently. It's possible, but they should tell the client that he needs to redesign his NetBackup setup to use SAN media servers (or, as you said, DD Boost on the client :P), which these days isn't really an issue with per-TB / platform licensing. It's just that they don't say so when they're selling it (in my experience), and that causes unhappiness later when a redesign is called for.
It would all work well if we all talked to each other and weren't so competitive and cryptic, LOL
Thanks for the chat :)
05-16-2011 11:45 AM
In short, if you get low dedupe rates, you will not see much advantage using Boost.
We also tried Boost, but because we didn't yet have much duplicate data, the media server took longer to process the same amount of data and still had to send most of it anyway, negating the benefit of Boost.
The OST plug-in first sends a request to the Data Domain appliance to check the hash of a data segment. If the DD has already seen it, the full data packet is discarded and the block's expiration is updated on the Data Domain; this is where Boost pays off. BUT when the Data Domain does NOT find a duplicate block already stored, the media server must then send a second packet with all of the data. So: twice the number of packets if your data does not dedupe well, which leads to poor performance and high CPU utilization all around.
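To put rough numbers on that, here is a back-of-the-envelope sketch. The 0.5 GB hash-exchange overhead is an assumed figure for illustration only, not a measured DD Boost number:

```shell
# Illustrative arithmetic (assumed overhead figure, not DD internals):
# wire traffic for a 100 GB backup, assuming every segment costs a small
# hash round-trip and unique segments must then be sent in full.
WIRE=$(awk 'BEGIN {
  backup_gb = 100        # size of the backup
  hash_gb   = 0.5        # assumed total cost of the hash/lookup exchange
  printf "90%% dedupe: ~%.1f GB on the wire; ", hash_gb + backup_gb * 0.10
  printf "0%% dedupe: ~%.1f GB on the wire\n",  hash_gb + backup_gb * 1.00
}')
echo "$WIRE"
```

At high dedupe rates almost nothing crosses the wire; at 0% you pay the full transfer plus the hash exchange, which is why poorly deduplicating data performs worse with Boost than without it.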
05-16-2011 11:40 PM
I use the latest version of DD Boost, release 2.3.1.0.
I tested with GEN_DATA, and with NetBackup 6.5.6 the generated data has a very high deduplication ratio, so very little data is sent over the network.
The optimization parameters (system and network) have been set, but since very little data is sent, they are probably not the reason for the poor performance! In addition, the NetBackup data buffer size and count have been increased.
However, as both the network throughput and the CPU load are low, I wonder about the cryptographic accelerator: how can I check that it is being used, and that libpkcs11.so is available?
05-17-2011 06:00 AM
Ok, let's do the following:
Check how many SHA-1 operations your box has done so far with the help of the crypto accelerator:
kstat -n n2cp OR kstat -n n2cp0
This will give you an idea of whether your DD Boost uses this module or not. As far as libpkcs11.so is concerned, you can search for the library on your system using find and check whether it is present.
Certain mechanisms can be disabled in the libpkcs11.so library; run
cryptoadm list -p
to list the enabled ones.
The following command displays all the counters available for real-time monitoring:
cputrack -h
Use cputrack with the required counter to track SHA-1 calculations in real time.
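As a small sketch of the arithmetic, you can derive the SHA-1 rate from two counter samples taken a known interval apart. The counts below are placeholders; read the real ones with `kstat -p -n n2cp0 | grep sha1` on the media server:

```shell
# Derive the crypto-accelerator SHA-1 rate from two counter samples.
S1=13329276              # first sha1 counter sample (placeholder value)
S2=13383773              # second sample (placeholder value)
INTERVAL=30              # seconds between the two samples
RATE=$(awk -v s1=$S1 -v s2=$S2 -v t=$INTERVAL \
  'BEGIN { printf "%.0f", (s2 - s1) / t }')
echo "sha1 ops/sec over the interval: $RATE"
```

If the rate stays near zero during a backup, the plug-in is not using the accelerator.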
05-17-2011 09:42 AM
I did the following test:
Several "kstat -n n2cp0" runs during a backup:
The sha1 counter keeps increasing, so I think the crypto accelerator is being used?
Line 16883: sha1 13329276
Line 38492: sha1 13383773
Line 38644: sha1 13891339
Line 38779: sha1 13999579
According to the "cryptoadm" result, all the options seem to be enabled:
root# cryptoadm list -p
User-level providers:
=====================
/usr/lib/security/$ISA/pkcs11_kernel.so: all mechanisms are enabled. random is enabled.
/usr/lib/security/$ISA/pkcs11_softtoken_extra.so: all mechanisms are enabled. random is enabled.
Kernel software providers:
==========================
des: all mechanisms are enabled.
aes256: all mechanisms are enabled.
arcfour2048: all mechanisms are enabled.
blowfish448: all mechanisms are enabled.
sha1: all mechanisms are enabled.
sha2: all mechanisms are enabled.
md5: all mechanisms are enabled.
rsa: all mechanisms are enabled.
swrand: random is enabled.
I have not yet tried "cputrack" but I found the following command:
root# /usr/sfw/bin/openssl speed -engine pkcs11 sha1
engine "pkcs11" set.
Doing sha1 for 3s on 16 size blocks: 68954 sha1's in 2.70s
Doing sha1 for 3s on 64 size blocks: 65603 sha1's in 2.72s
Doing sha1 for 3s on 256 size blocks: 58471 sha1's in 2.76s
Doing sha1 for 3s on 1024 size blocks: 37216 sha1's in 1.53s
Doing sha1 for 3s on 8192 size blocks: 27082 sha1's in 1.32s
OpenSSL 0.9.7d 17 Mar 2004 (+ security fixes for: CVE-2005-2969 CVE-2006-2937 CVE-2006-2940 CVE-2006-3738 CVE-2006-4339 CVE-2006-4343 CVE-2007-5135 CVE-2007-3108 CVE-2008-5077 CVE-2009-0590 CVE-2009-3555)
built on: date not available
options:bn(64,32) md2(int) rc4(ptr,char) des(ptr,risc1,16,long) aes(partial) blowfish(ptr)
compiler: information not available
available timing options: TIMES TIMEB HZ=100 [sysconf value]
timing function used: times
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
sha1 408.62k 1543.60k 5423.40k 24907.96k 168072.53k
So with larger blocks it is possible to go faster... but I don't know which block size DD Boost uses?
05-17-2011 11:32 AM
I guess this means the crypto accelerator is OK and doing its job.
DD uses a variable segment size, not a fixed block size, anyway.
What about the ingest rate? Can you please verify the values of NET_BUFFER_SZ and the other buffer settings, so we can be sure you're allocating enough memory buffers to feed the box with data?
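For reference, a quick way to display those settings. The paths are the standard NetBackup touch-file locations; whether a given value is appropriate should be checked against the OST plug-in admin guide:

```shell
# List the NetBackup buffer tuning touch files and their current values.
# Missing files simply mean the NetBackup default is in effect.
OUT=$(for f in /usr/openv/netbackup/db/config/SIZE_DATA_BUFFERS \
               /usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS \
               /usr/openv/netbackup/NET_BUFFER_SZ
do
  if [ -f "$f" ]; then
    echo "$f = $(cat "$f")"
  else
    echo "$f not set (NetBackup default applies)"
  fi
done)
echo "$OUT"
```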
06-05-2011 11:21 PM
I only back up local data on the Sun media server, and all the buffer sizes (SIZE_DATA_BUFFERS, ...) have been increased according to the "Boost" recommendations... and EMC support has no idea how to solve this problem!
So I am afraid it will never go faster: is it because NetBackup 6.5 uses a 32-bit plug-in?
06-06-2011 10:37 AM
Let's assume your environment is a common one, like:
Clients -> ETH -> Media/Master Server -> SAN -> DD
Possible bottlenecks:
Client -> processor load, how big the files are, how many files
ETH -> switches (iSCSI, jumbo frames)
Master/Media Server -> where dedup starts (processor, simultaneous jobs); your dedup gain is from the media server to the DD. Maybe your bottleneck is ETH / the client.
DD -> depending on the connection (FC 8 Gb/s, CIFS, NFS), throughput may vary.
Are you using the DD as a VTL?
(Try mapping a CIFS/NFS share on the master server and make a backup to disk, or copy something from this server directly to your DD.)
I hope it helps.
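That last suggestion can be scripted as a rough raw-throughput check. A minimal sketch, assuming a POSIX shell; $TARGET is an assumption and should point at the DD's NFS/CIFS mount (it defaults to /tmp here so the commands are harmless to try; on Solaris 10, replace `date +%s` with `perl -e 'print time'`):

```shell
# Rough write-throughput check against a mount point.
# TARGET is an assumption: point it at the DD NFS/CIFS mount.
TARGET=${TARGET:-/tmp}
MB=64
START=$(date +%s)
dd if=/dev/zero of="$TARGET/dd_write_test" bs=1048576 count=$MB 2>/dev/null
END=$(date +%s)
ELAPSED=$((END - START))
[ "$ELAPSED" -eq 0 ] && ELAPSED=1        # avoid divide-by-zero on fast writes
echo "wrote $MB MB in ~$ELAPSED s (~$((MB / ELAPSED)) MB/s)"
rm -f "$TARGET/dd_write_test"
```

If this raw copy is also slow, the bottleneck is below NetBackup (network or DD ingest); if it is fast, the problem is in the Boost path on the media server.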