05-28-2012 07:09 AM
05-29-2012 10:51 PM
I dream of 40-50MB/s! I have 250MB/s-capable storage but I'm lucky to get 20-25GB/hour in rehydration speeds, which makes rehydration almost unusable. I have to rehydrate to disk, then roll that image off to tape to avoid shoe-shining the tapes. It doesn't seem to matter whether I'm running 1 or 4 rehydrates at the same time; they're all too slow.
Yes, I'm running v7.1.0.4, Windows 2008 R2.
Do you mean changing the "Fragment Size" value of the underlying DiskPool storage unit? Mine are set to default values of 51,200MB. How did yours start so low?
Certainly, I can see your logic in reducing this number. It's not something I've seen before but maybe you've hit on something here.
There's been an expanding appendix in recent release notes, "Rehydration Improvements", which deals with various tweaks, but most of it says don't touch these without approval from Symantec Technical Support staff. The two that might be of interest are PreFetchThreadNum and ReadBufferSize, but I have yet to work through these given that, in my case, every time I stop and restart NetBackup it takes 1.5 hours to come back online (my MSDP units are approx 12TB).
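To put the 20-25GB/hour figure on the same scale as the MB/s numbers quoted elsewhere in this thread, here is a minimal conversion sketch. It assumes 1 GB = 1024 MB; the function name is made up for illustration.

```python
def gb_per_hour_to_mb_per_sec(gb_per_hour, mb_per_gb=1024):
    """Convert a GB/hour rate to MB/s (assumes 1 GB = 1024 MB by default)."""
    return gb_per_hour * mb_per_gb / 3600.0


# The two ends of the range quoted in the post above.
for rate in (20, 25):
    print(f"{rate} GB/hour = {gb_per_hour_to_mb_per_sec(rate):.1f} MB/s")
# prints:
# 20 GB/hour = 5.7 MB/s
# 25 GB/hour = 7.1 MB/s
```

So 20-25GB/hour works out to roughly 6-7 MB/s, far below the 40-50MB/s being compared against.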
05-29-2012 11:55 PM
Thank you for the reply.
I had similar problems with some old media/MSDP servers.
What we found out is that the original hardware proposal/implementation was wrong.
The requirements stated that the CPU has to be 54xx series or above, with a clock speed of 2.5GHz or higher, but our servers were way out of spec (older, slower CPUs).
As soon as we upgraded to IBM BladeCenter HS21 blades (2 x E5430 CPUs), we started to see some 40-47 MByte/sec, using storage that both the camel tool and Iometer show can deliver 130 MByte/sec throughput.
Before the upgrade we also had to wait some 30-40 minutes for spoold to come up.
I also ran quite a few tests with every parameter that the 7.1.0.4 release notes proposed,
especially PreFetchThreadNum and ReadBufferSize.
There was not much change in rehydration speed.
I also played with the disk buffers, with no big gain.
I need to understand better how the external storage operates. I also need to check some settings of the QLogic card, especially maximumSGlist, which I think may be related to my bottleneck.
About the "reduce the fragment size" setting:
Initially the setting was made by a certified service provider. It was the Symantec-recommended size of 15000. After that I read about fragment size and its effect on backups and restores.
Because of our environment, which has 385 clients over 2 Mbit lines, we have no need for fast backups. So I could give up some backup speed for some increase in rehydration speed.
Still, the results are not clear.
I think that I have to keep rebasing the data of my clients.
05-31-2012 07:18 PM
Been doing some testing in my QA NBU environment. Using the same v7.1.0.4 Linux client over GigE to a v7.1.0.4 Windows 2008 R2 MSDP server, I changed the Max Fragment size on the Disk Pool storage unit down from 51,200MB to 25,600, then 12,800 and finally 6,400. I ran each test twice over a period of 24 hours. I did not stop and restart NetBackup between tests.
Backup times were essentially unchanged, as were rehydrate times to a local SAN-based staging volume.
Obviously in your environment you saw better performance; in mine I didn't.
The media server in this case is a little under-spec'd in terms of CPU clock speed, but monitoring the server I didn't see a CPU bottleneck, nor did I see an I/O one.
My understanding is that your MSDP server should rebase data in the background; yes, you can force it, but I was told not to bother and to let the background processing sort it out. Over time the clients will get rebased, although my reading of the release notes seems to indicate it will not start until the next day.
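For anyone repeating the fragment-size test, a rough sketch of how many fragments a single backup image would be split into at each of the maximum fragment sizes tried above. The 100GB image size is a made-up example, not a figure from this thread, and the helper name is hypothetical.

```python
def fragment_count(image_mb, max_fragment_mb):
    """Number of fragments needed to hold image_mb (integer ceiling division)."""
    return -(-image_mb // max_fragment_mb)


IMAGE_MB = 100 * 1024  # hypothetical 100GB backup image, for illustration only

for max_fragment_mb in (51200, 25600, 12800, 6400):
    print(f"{max_fragment_mb:>6} MB max fragment -> "
          f"{fragment_count(IMAGE_MB, max_fragment_mb)} fragments")
```

Halving the maximum fragment size doubles the fragment count for the same image, so images much smaller than the smallest setting would stay in a single fragment whatever value is used.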
11-02-2012 12:30 AM
Update that may help:
I did a couple of things (or more) which may have worked.
I had a look at the MP 7.1.0.4 technote. It contains the recommended minimum hardware specs. Stay above those specs.
I upgraded the media/MSDP servers to much higher specs but saw minimal difference in rehydration speed.
Still, I could see around 45 MByte/sec on one LTO3 drive (up from 27 MByte/sec before). Still not good enough.
Yet garbage collection was running in the blink of an eye, and that alone was a good sign.
I optimized the tape drive settings (buffer size, number of buffers, etc.). In an ordinary local file backup I could see optimal speed. I also did a lot of restores to be sure.
Checked the speed of my storage. I used the camel tool (it has a newer name now), copied to another storage (a separate storage, not a volume on the same hardware) and timed it. Then I did the math to find the storage speed. Before the upgrade I had lower throughput, but now I was above 120 MByte/sec (well, Symantec says 150, but I can now see 65 MByte/sec throughput with 117 MByte/sec to be exact, and that is on my lower-spec hardware). Either way you have to be higher than this; it won't work at lower speeds.
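The "copy, keep time, do the math" check can be sketched roughly like this. It is not the camel tool, just a plain timed file copy; the function names are made up, 1 MB is taken as 1024*1024 bytes, and OS caching can inflate the result unless the test file is much larger than RAM.

```python
import os
import shutil
import time


def throughput_mb_per_sec(bytes_copied, seconds):
    """Throughput in MB/s, taking 1 MB = 1024 * 1024 bytes."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return bytes_copied / (1024 * 1024) / seconds


def timed_copy(src, dst):
    """Copy src to dst and return the observed throughput in MB/s."""
    start = time.perf_counter()
    shutil.copyfile(src, dst)
    elapsed = time.perf_counter() - start
    return throughput_mb_per_sec(os.path.getsize(src), elapsed)


# Made-up example: a 30 GiB copy that took 256 seconds.
print(f"{throughput_mb_per_sec(30 * 1024**3, 256):.1f} MB/s")  # prints: 120.0 MB/s
```

Pointing `timed_copy` at a large file on the MSDP storage, with the destination on the other storage, gives a number to compare against the figures above.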
Some time later I changed the maximum fragment size of the DD pool from 15360 to 10240. I think it was better, but I was still not satisfied.
We happened to have an IBM storage expert on the premises. He noticed that the Fibre Channel LTO drives (and the tape library) were in L-port mode (some kind of loop) and not N-port mode (point-to-point with the switch). That setting alone added an extra 10-15 MByte/sec to throughput on traditional backups.
I also experimented with the settings in contentrouter.cfg (msdp pool path\etc\puredisk). The settings you can change are listed in the 7.1.0.4 release notes. Two significant settings which can fix or destroy your throughput are "ReadBufferSize" and "PrefetchThreadNum". Change one at a time and, while no rehydration jobs are running, do a 30-50GB duplication from the MSDP pool to tape. That is enough to show you whether it works.
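To keep the "change one at a time" discipline honest, a small sketch that lists the test runs so that each one differs from the baseline in exactly one setting. The numeric values below are placeholders only, not recommended or default values; take the real candidates from the release notes. The helper name is made up.

```python
def one_at_a_time(baseline, candidates):
    """Return (label, settings) runs that each change exactly one setting."""
    runs = [("baseline", dict(baseline))]
    for key, values in candidates.items():
        for value in values:
            if value == baseline[key]:
                continue  # already covered by the baseline run
            settings = dict(baseline)
            settings[key] = value
            runs.append((f"{key}={value}", settings))
    return runs


# Placeholder values for illustration only (NOT recommendations).
baseline = {"ReadBufferSize": 1, "PrefetchThreadNum": 1}
candidates = {"ReadBufferSize": [1, 2, 4], "PrefetchThreadNum": [1, 2]}

for label, settings in one_at_a_time(baseline, candidates):
    print(label, settings)
```

Each listed run would then be one 30-50GB duplication to tape, timed while no other rehydration job is running.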
I also do a couple of things daily.
I start garbage collection on the pools and compaction by hand.
I disable rehydration to the drives while backing up stuff. After the backup ends I start it again (I have a cmd script in the Windows scheduler), but I have a tape storage unit (you know, a copy of the single one I have) with one drive, just to be sure that I won't have two or more jobs reading from the pool at the same time.
11-02-2012 01:08 AM
Thanks for the feedback!
Suggestion:
Publish your findings in a blog or an article.
11-02-2012 01:50 AM
My English is not that good. Will try...
11-02-2012 02:06 AM
Perfectly understandable!
11-02-2012 04:19 AM
OK,
I did gather the story and blogged it:
https://www-secure.symantec.com/connect/blogs/my-deduplication-story-nbu-7
Hope it will help someone.