We have 5 backups that take place each night, staggered to start at 3 different times (we need to wait for RMAN to complete, so if there were a way to do this in one policy I would be even happier!). These jobs write to a robot with 2 LTO3 drives in it. After upgrading to 6.0, the jobs that kicked off later in the night queued and waited for a whole drive to be free, causing our backups to run well into the day. I was told by the person who handed the NBU responsibilities to me that 5.1 was able to write multiple policies to the same drive. We didn't change the multiplexing setting, or anything else in the policy, so is this something that changed in 6.0? We have currently worked around this limitation by changing the multiplexing to 3 (there are 3 clients that kick off at 11pm), which leaves the other drive free for the next 2 that kick off at 1:30am. Is there a way around this besides the obvious one of adding more drives?
Sounds like you lost some storage unit groups in the upgrade. It's all pretty easy to configure any way you want it, especially with only five clients, and five backups. I wouldn't get too hung up on the number of policies - usually it's via multiple policies that you achieve the perfect session profile.
a) Clients A, B, C start at 23:00, and use one drive.
b) Clients D and E start at 01:30, and use the other drive.
a) Are the jobs wildly different in size GB/TB?
b) Are the clients very different in power/capability?
c) Do you use SSO, i.e. are one or more of the clients SAN media servers? If so, how many?
d) Are the clients of different O/S type, and/or different agent type? If so, please specify type vs client A,B,C,D,E?
e) Are some clients more important than others? If so, which ones?
f) Are the retentions different between clients? i.e. are some backups for some clients retained for longer than other clients? If so, please specify.
g) Are some incremental and/or full?
h) When must backups be completed by?
P.S. You'll only need to add more drives if you haven't got the tape head bandwidth to support your data volume.
I'm sure that if you answer some of the above questions, without giving away server names, then I and/or others will be able to help you generate your ideal profile of policies and schedules.
Message Edited by David Rawson on 07-19-2007 09:41 PM
We hadn't typically set up storage unit groups, as each media server (with the exception of 1) has 1 robot. I had singled out the one that was giving us problems, hopefully to keep the solution easier. We back up 3.2tb to this robot each night, so it is a busy drive. This drive is attached via fiber, so we typically use it for our largest backup jobs. From what I was told, prior to the upgrade the first job (3 clients, multiplexing of 2) wrote to both drives. At 1:30 the next 3 were started, and began writing immediately. I guess it is possible that the 2 big jobs (1.2tb and 381gb) had previously both gone to the same drive, and the smaller 60gb job went to the other, thus completing prior to the 1:30am job. It is too bad you can't assign priority to clients at the policy level without creating a new policy.
a) Are the jobs wildly different in size GB/TB? One job is 48gb, 1 is 60gb, the others are 400-600gb and 1 is 1.2tb.
b) Are the clients very different in power/capability? They are similar, and all run jumbo packets on the backup network, giving us around 38000 kb/sec on our 1.2tb client, and 20000-30000 kb/sec on the other 5.
c) Do you use SSO, i.e. are one or more of the clients SAN media servers? If so, how many? There are no SSO's configured, and no client SAN media servers at this time
d) Are the clients of different O/S type, and/or different agent type? If so, please specify type vs client A,B,C,D,E? The systems are running Solaris 9 and 10 and are all still running the 4.5 client, as we are slowly migrating to the new client to ensure there are no issues with the 6.0 client.
e) Are some clients more important than others? If so, which ones? They are all considered mission critical systems, with 3 being backups of oracle databases. The db's are backed up disk to disk outside of NBU, and then we backup the RMAN
f) Are the retentions different between clients? i.e. are some backups for some clients retained for longer than other clients? If so, please specify. We have a 4 week daily pool that we do full backups on each night.
g) Are some incremental and/or full? all full
h) When must backups be completed by? With the exception of the 1 backup that doesn't start until 4:45, we would like to have them all done by 8am. We were able to accomplish this under 5.1.
We have 1 robot controlling 2 drives (Sun SL500). The media server only has 1 storage unit, with the max concurrent write drives set to 2, multiplexing enabled, and max streams per drive of 4. The policy storage unit is not overridden in the schedules and is set to bkpsrv3-hcart3-robot-tld-3.
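As a rough sanity check on the tape head bandwidth point raised earlier in the thread, the figures quoted so far can be plugged into a few lines of Python. This is only a back-of-envelope sketch. It assumes decimal units, a 23:00 to 08:00 window, and the LTO3 native (uncompressed) rate of 80 MB/s per drive; the function names are made up for illustration and nothing here talks to NetBackup.

```python
# Back-of-envelope check: can two LTO3 drives carry the nightly volume?
# Assumptions (not from NetBackup): decimal units, no compression,
# LTO3 native rate of 80 MB/s per drive, and a 23:00-08:00 window.

NIGHTLY_TB = 3.2          # nightly volume quoted in the thread
DRIVES = 2                # drives in the robot
LTO3_NATIVE_MB_S = 80.0   # native (uncompressed) LTO3 rate per drive
WINDOW_HOURS = 9.0        # 23:00 to 08:00


def required_mb_per_s(volume_tb: float, window_hours: float) -> float:
    """Average aggregate rate needed to move volume_tb within window_hours."""
    return volume_tb * 1_000_000 / (window_hours * 3600)


def job_hours(size_gb: float, rate_kb_s: float) -> float:
    """Hours needed to move size_gb at a sustained client rate in KB/s."""
    return size_gb * 1_000_000 / rate_kb_s / 3600


needed = required_mb_per_s(NIGHTLY_TB, WINDOW_HOURS)
available = DRIVES * LTO3_NATIVE_MB_S
print(f"needed {needed:.1f} MB/s, available {available:.0f} MB/s native")
print(f"1.2 TB client at 38000 KB/s: {job_hours(1200, 38000):.1f} h")
```

On these assumptions the two drives have headroom on paper, while the single 1.2tb stream needs most of the window by itself, so how jobs are packed onto drives matters more than the raw drive count.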
I guess the biggest problem we are having is that if the smaller 60gb job writes to one drive, and the 2 larger jobs to the other drive, we are wasting write capacity, as the small job completes in 1 hour (12am). With the next set of jobs not kicking off until 1:30am, that drive sits idle in between. On the other hand, currently if the 2 big jobs get split across different drives, the jobs that start at 1:30 are queued up until a drive frees up (6am). Unfortunately, due to the timing of scripts and DB backups that are out of our control, it is not really possible to juggle the jobs and start times.
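To put numbers on the two outcomes described above, here is a minimal sketch that only does clock arithmetic on the times quoted in this post. The helper name is hypothetical and the code makes no claim about how the NetBackup scheduler actually assigns drives.

```python
def hours_between(start: str, end: str) -> float:
    """Hours from start to end (both HH:MM), rolling past midnight if needed."""
    start_h, start_m = map(int, start.split(":"))
    end_h, end_m = map(int, end.split(":"))
    minutes = (end_h * 60 + end_m) - (start_h * 60 + start_m)
    return (minutes % (24 * 60)) / 60


# Outcome 1: small job shares no drive with the big ones and ends at 00:00,
# so that drive is idle until the 01:30 jobs start.
idle_drive_hours = hours_between("00:00", "01:30")

# Outcome 2: the big jobs land on separate drives, so the 01:30 jobs
# queue until a drive frees up at about 06:00.
queued_hours = hours_between("01:30", "06:00")

print(f"idle drive time: {idle_drive_hours} h, queue wait: {queued_hours} h")
```

Either way a noticeable slice of the 23:00 to 08:00 window is lost, which is why the placement of jobs on drives is the thing to control.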
My suggestion is to keep the storage unit max drives at 2, and to keep the multiplexing at 4, or possibly increase it to five - but the key is to split all your clients into separate policies, and to use two volume pools: one pool effectively for one drive, and the other pool effectively for the other drive. This will allow you to decide which pool to use per policy - and thus per client. All policies will still use the one storage unit - BUT to ensure that backups on one pool do not spill over (i.e. consume all drive multiplex slots and thus demand another drive) you'll need to ensure that your drive multiplex level (on the storage unit) is as high as the total number of jobs that could run (imagine if you had delays for some business/app reason). Also, you'll need to ensure that all your schedules in all the new policies have the same multiplex figure. I think this will give you what you want.
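The layout suggested above can be written down as plain data and checked before touching the real configuration. The sketch below is an illustration only: the policy names, pool names, and helper functions are hypothetical, the five clients A to E are taken from earlier in the thread, and nothing here calls a NetBackup API.

```python
# Hypothetical layout for illustration; names are made up and this does
# not read or write any NetBackup configuration.
POLICIES = {
    "policy_A": {"pool": "Pool_Drive1", "start": "23:00", "schedule_mpx": 5},
    "policy_B": {"pool": "Pool_Drive1", "start": "23:00", "schedule_mpx": 5},
    "policy_C": {"pool": "Pool_Drive1", "start": "23:00", "schedule_mpx": 5},
    "policy_D": {"pool": "Pool_Drive2", "start": "01:30", "schedule_mpx": 5},
    "policy_E": {"pool": "Pool_Drive2", "start": "01:30", "schedule_mpx": 5},
}
STORAGE_UNIT_MPX = 5  # assumed storage unit multiplex level


def min_mpx_needed(policies, conservative=True):
    """Multiplex level needed so one pool's jobs never overflow a drive.

    conservative=True assumes delays could make every job overlap, so the
    total job count is used; otherwise only the busiest pool is counted.
    """
    per_pool = {}
    for policy in policies.values():
        per_pool[policy["pool"]] = per_pool.get(policy["pool"], 0) + 1
    return len(policies) if conservative else max(per_pool.values())


def mismatched_schedules(policies, storage_unit_mpx):
    """Names of policies whose schedule multiplex differs from the storage unit."""
    return sorted(name for name, p in policies.items()
                  if p["schedule_mpx"] != storage_unit_mpx)


print(min_mpx_needed(POLICIES))                      # worst case, all jobs overlap
print(min_mpx_needed(POLICIES, conservative=False))  # busiest pool only
print(mismatched_schedules(POLICIES, STORAGE_UNIT_MPX))
```

With five single-client policies, the worst-case reading matches the "possibly increase it to five" figure, while the per-pool reading shows why 4 may already be enough if the start times hold.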
One more question... What triggers your backups? From the client side (i.e. user mode backups), or from the NetBackup job scheduler?