Duplication from DeDup server very poor performance
Hi,
NBU version 7.0 on windows 2008 R2.
Duplicating using Vault from DeDup server to FC LTO4 drives.
We are seeing speeds of less than 30MB/s from disk to tape.
From client to DeDup server we see much greater speeds, up to 70MB/s (I don't know if this is a limit from the client, which is a file server).
NUMBER_DATA_BUFFERS and SIZE_DATA_BUFFERS are set for the LTO4 drives.
Is there any way to enhance duplication performance from the DeDup server to tape?
NUMBER_DATA_BUFFERS_DISK?
Thank you :)
Obviously disk I/O and network bandwidth are major factors in the overall performance, but I seldom see re-hydration to tape perform better than 30-40MB/s regardless of CPU/memory/disk/net. White papers tell you a different story, but reality is different with real load on the systems.
There isn't much tuning to do in PDDE/PDDO as such, apart from optimizing the multipathing I/O policies, using correct block sizes for the file system, and so on. We could also look at thread pools, block sizes, and the number of buffers in the OST plugin interface.
When you re-hydrate to tape, have a look in the bptm log. If you find that the bptm process writing to tape has a huge number of delayed waits (for full buffers), then you know that the sending bpdm/bptm is not able to retrieve data fast enough from PDDE/PDDO. Look at the I/O performance of the file system on the PDDE/PDDO together with CPU performance. If the spoold process is not doing a lot, then the disk system can't deliver.
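To make the bptm log check above a bit more concrete, here is a rough Python sketch that adds up the wait counters. It assumes the log lines look like "waited for full buffer N times, delayed M times", so check that against your own log before trusting the numbers. The function name and the sample lines are made up for illustration.

```python
import re

# Assumed message shape; verify it against your own bptm log.
WAIT_RE = re.compile(
    r"waited for (full|empty) buffer (\d+) times, delayed (\d+) times"
)


def summarize_buffer_waits(lines):
    """Sum the waited/delayed counters per buffer kind ('full' or 'empty')."""
    totals = {
        "full": {"waited": 0, "delayed": 0},
        "empty": {"waited": 0, "delayed": 0},
    }
    for line in lines:
        match = WAIT_RE.search(line)
        if match is None:
            continue  # not a buffer-wait summary line
        kind = match.group(1)
        totals[kind]["waited"] += int(match.group(2))
        totals[kind]["delayed"] += int(match.group(3))
    return totals


# Hypothetical sample lines, not taken from a real system.
sample = [
    "12:00:01.000 [4242] <2> write_data: waited for full buffer 15230 times, delayed 48211 times",
    "12:00:01.000 [4243] <2> fill_buffer: waited for empty buffer 3 times, delayed 5 times",
    "12:00:02.000 [4242] <2> some unrelated message",
]

totals = summarize_buffer_waits(sample)
print("full buffer waits:", totals["full"])
print("empty buffer waits:", totals["empty"])
```

For a real run, pass an open file handle for the day's bptm log instead of `sample`. A large "full" delayed count on the tape-writing side points at the disk/PDDE side not delivering data fast enough.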
Also, since we are talking about delays, it is important to understand the underlying tape technology. Use tape drives that can better handle slow stream speeds. By not having to wait for the tape drive to re-position, you can gain a substantial increase in write performance.
Buffer size and number tuning: Contrary to popular belief, don't use huge sizes and many buffers for disk. Rather use, say, 24-32 buffers with sizes of 64k to 256k. I find that with NUMBER_DATA_BUFFERS = 32 and SIZE_DATA_BUFFERS = 262144 I average 50-60MB/s re-hydration, with ~50% of the dup images performing in the ~70-120MB/s range (using LTO4 or TS1130 drives).
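As a quick sanity check on the values above, here is a small Python sketch that works out how much buffer memory a setting implies and writes the values out as one-line files named after the setting. The helper names are made up, and the sketch writes into a temporary directory only; where the real files live on your media server is something to confirm in the NetBackup documentation for your version.

```python
import tempfile
from pathlib import Path


def buffer_memory_bytes(number_of_buffers, buffer_size, streams=1):
    """Buffer memory implied by a setting: buffers x size x concurrent streams."""
    return number_of_buffers * buffer_size * streams


def write_buffer_settings(config_dir, settings):
    """Write each setting as a one-line text file named after the setting."""
    config_dir = Path(config_dir)
    for name, value in settings.items():
        (config_dir / name).write_text(f"{value}\n")


# Values taken from the post above.
settings = {"NUMBER_DATA_BUFFERS": 32, "SIZE_DATA_BUFFERS": 262144}

per_stream = buffer_memory_bytes(
    settings["NUMBER_DATA_BUFFERS"], settings["SIZE_DATA_BUFFERS"]
)
print(per_stream // (1024 * 1024), "MiB per stream")  # prints "8 MiB per stream"

# Demo only: write into a throwaway directory, not a real NetBackup install.
with tempfile.TemporaryDirectory() as demo_dir:
    write_buffer_settings(demo_dir, settings)
    for path in sorted(Path(demo_dir).iterdir()):
        print(path.name, "=", path.read_text().strip())
```

So 32 buffers of 256k is only 8 MiB per stream, which is modest compared with the very large values people often set for tape.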
If we tune the buffers to lower values, then we can hit bad performance for B2T jobs. So it is important to find a balance that suits your needs... In some environments (Windows) you may actually see the best performance without any type of tuning...
But, in the end, in my opinion, re-hydration performance is slow due to a design not geared for anything but backup and optimized duplication to another PDDE/PDDO destination. I guess the solution would be to stop duplicating to tape, as the product can't do it properly...
/A