Duplication Jobs running wild
After rebooting my NBU servers, I have over a thousand Duplication jobs trying to run.
A little info: We have two DD670s. Half of the backups go to one and half to the other; each backup is then duplicated to the other DD670 after it completes.
All that shows up for the job is this:
On the Job Overview Tab:
Job Type: Duplication
Master Server: <server name>
Job Policy: SLP_LCP_DD02_Weekly
Job Schedule: Dup
Priority: 0
On the Detailed Status Tab:
Nothing at top - all fields blank
in Status:
2/2/2013 1:46:29 AM - requesting resource LCM_dd01-su
2/2/2013 1:36:35 AM - Info nbrb(pid=3248) Limit has been reached for the logical resource LCM_dd01-su
I have over 1,500 jobs sitting in the queue like this. How do I keep them from launching? If I cancel them and clean them all up, 5-10 minutes later they all kick in again. It seems that even though they run and complete, they just queue up again and run again.
Baffled......
Thanks,
John
It looks like Max I/O streams is the problem.
You have Max concurrent jobs set to 90 on each storage unit,
and Max I/O streams set to 90 on each disk pool.
So at any point in time only 90 streams can be active on each disk pool. But because each storage unit also allows 90 jobs, each SLP allocates all 90 streams at its source, which leaves 0 for the destination, so all the jobs sit in the queue.
It's like this:
For DD01_SU:
Max I/O streams in disk pool = 90
Max jobs in disk STU = 90
For DD02_SU:
Max I/O streams in disk pool = 90
Max jobs in disk STU = 90
So when duplication starts for the DD01 SLP, all of its streams get allocated at the source end: the DD01 SLP takes 90 at the source, and nothing is left for the DD02 SLP's jobs, so those jobs queue.
It works the same way in the other direction for the DD02 SLP.
It's a deadlock situation.
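To make the deadlock easier to see, here is a rough Python sketch of the situation as I described it. It is only a toy model, not how nbrb actually schedules resources: it assumes every duplication job grabs a read stream on its source pool first, holds it, and then asks for a write stream on the destination pool. The names DiskPool and simulate are made up for the illustration.

```python
from dataclasses import dataclass


@dataclass
class DiskPool:
    name: str
    max_io_streams: int
    in_use: int = 0

    def try_acquire(self) -> bool:
        """Take one I/O stream if the pool still has a free one."""
        if self.in_use < self.max_io_streams:
            self.in_use += 1
            return True
        return False


def simulate(pool_limit: int, stu_max_jobs: int) -> dict:
    """Toy model (assumption): each job holds a source stream while
    waiting for a destination stream on the other pool."""
    dd01 = DiskPool("dd01", pool_limit)
    dd02 = DiskPool("dd02", pool_limit)
    # (source, destination) for each of the two SLP directions
    directions = [(dd01, dd02), (dd02, dd01)]

    # Phase 1: each SLP starts up to stu_max_jobs jobs,
    # and each started job takes a stream on its source pool.
    holding = []
    for src, dst in directions:
        for _ in range(stu_max_jobs):
            if src.try_acquire():
                holding.append(dst)

    # Phase 2: every job that holds a source stream now asks
    # for a stream on its destination pool.
    running = sum(1 for dst in holding if dst.try_acquire())
    return {
        "holding_source": len(holding),
        "running": running,
        "queued": len(holding) - running,
    }


# The settings from this thread: 90 streams per pool, 90 jobs per STU.
print(simulate(pool_limit=90, stu_max_jobs=90))
# {'holding_source': 180, 'running': 0, 'queued': 180}
```

With both limits at 90, each pool is completely filled by its own SLP's reads, so no job ever gets a destination stream and everything stays queued.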
There are 3 ways out. First cancel all the Duplication jobs, then implement one of the following:
1) Reduce the Max jobs count on each STU.
or
2) Increase Max I/O streams on the disk pools (not recommended, since a DD670 might not be able to handle more).
or
3) Deactivate one SLP until the other SLP completes.
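As a rough comparison of the three options, here is the same simplified picture worked out as arithmetic in Python. The function running_jobs and the example values 45 and 180 are my own assumptions for illustration, not recommended settings; the model assumes each active SLP takes its read streams on its own pool first, and writes coming from the other SLP get whatever is left.

```python
def running_jobs(pool_limit: int, stu_max_jobs: int, active_slps: int = 2) -> int:
    """Jobs that get both a source and a destination stream
    in the simplified model (assumption, not real nbrb logic)."""
    # Read streams taken on each source pool by its own SLP.
    reads_per_pool = min(stu_max_jobs, pool_limit)
    if active_slps == 1:
        # Only one direction is running, so the destination pool
        # has no reads of its own competing for streams.
        return min(reads_per_pool, pool_limit)
    # Both directions running: writes from the other SLP only get
    # the streams that the local reads did not take.
    free_for_writes = pool_limit - reads_per_pool
    return 2 * min(reads_per_pool, free_for_writes)


print("current settings:", running_jobs(90, 90))                 # 0
print("1) STU max jobs 45:", running_jobs(90, 45))               # 90
print("2) pool streams 180:", running_jobs(180, 90))             # 180
print("3) one SLP inactive:", running_jobs(90, 90, active_slps=1))  # 90
```

In this model, option 1 only helps if the STU job limit is no more than half of the pool's Max I/O streams, so that reads and writes on the same pool can both fit.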