I have a Quantum DXi disk storage unit, and I have a Quantum Scalar i500 tape library. Starting just a few weeks ago, and it seemed sudden, our jobs started running late into Monday morning. I opened a case with support, but I haven't made any real progress in over a week. What is going on is that backups are running, but then duplications are kicking off during them--so the rehydration of data and writing it to tape, and actual first-write jobs, are competing with each other. Obviously this is going to cause contention; but for some reason just starting a couple weeks ago this has come to a head, and now I'm trying to correct this.
The answer so far has been to turn SLP jobs off until Monday morning, which I don't agree with, because tape duplications then run well into Wednesday. (I usually use Tuesday daytime for my maintenance window, since there aren't normally a lot of backups running.)
There are several settings I have been looking at:
Storage -> Storage Unit Group -> Storage Unit -> Max Concurrent Jobs (30 each, there are 2)
Devices -> Disk Pools -> Disk Pool -> Limit I/O Streams (60)
I'm not looking for someone to tell me a best practice--I know I need to do tuning, and I have the white paper on my disk unit.
What I am trying to wrap my head around is what exactly each of these does. From what I have read, the max concurrent jobs is only for *jobs writing to the storage unit*, not for other things like duplications, etc.
If someone could point out where that is mentioned to me in one of the user guides, I would be very happy.
I've already done some research...
The other settings I have been looking at are in the master server's SLP parameters; namely, "Force interval for small jobs" and "Job submission interval". Currently these are set to 9 hours. To me this means that after a backup completes, the SLP copy of it won't kick off for at least 9 hours. That apparently was working in the past, up until this point, and I'm wondering if the solution might be as simple as making it 10 or 11 (or is the problem the other way around, that I have it set too long?)
Hope this is what you are looking for
“Maximum I/O streams per volume” with Disk Pools
Disk storage units allow you to limit the number of concurrent write jobs that use the storage unit, however there are no limits on the number of read jobs (restores and duplications) that may be accessing the same disk pool at the same time and it is also possible to configure multiple storage units to access the same disk pool simultaneously. This can give rise to unexpected I/O contention on the disk pool. By setting the Maximum I/O streams per volume option on the Disk Pool, you can limit the total number of jobs that access the disk pool concurrently, regardless of the job type.
Streams on the pool is the overall figure that it can handle, i.e. from all the storage units. I suppose the DXi is like a DD, so it's got multiple servers talking to it, so your 30 + 30 would seem to add up to 60 on the pool. All good there.
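To make that arithmetic concrete, here is a rough sketch of how the two limits relate, assuming (per the quoted text) that Max Concurrent Jobs only caps writes per storage unit while the pool's I/O stream limit counts every job type. The class and function names are made up for illustration; this is a toy model, not how NetBackup actually schedules jobs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PoolConfig:
    # Max Concurrent Jobs for each storage unit that writes to the pool
    max_concurrent_jobs: List[int]
    # Limit I/O Streams on the disk pool (assumed to count reads and writes)
    pool_io_stream_limit: int


def write_ceiling(cfg: PoolConfig) -> int:
    """Most write jobs the storage units could start between them."""
    return sum(cfg.max_concurrent_jobs)


def streams_left_for_reads(cfg: PoolConfig, active_writes: int) -> int:
    """Streams still free for duplications/restores under the pool limit."""
    return max(0, cfg.pool_io_stream_limit - active_writes)


cfg = PoolConfig(max_concurrent_jobs=[30, 30], pool_io_stream_limit=60)
print(write_ceiling(cfg))               # 60
print(streams_left_for_reads(cfg, 60))  # 0
print(streams_left_for_reads(cfg, 45))  # 15
```

Under these assumptions, with both storage units at their write limit there is no headroom left on the pool for duplications, and any time the backups run below 60 streams the freed streams can be picked up by duplication reads.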
What you need to look at now is how busy the system is doing backups, and whether you need to defer the duplications until later. Rehydration does put quite a load on a system, so it's best practice not to run the "IN" and the "OUT" at the same time. You can achieve this with an SLP window; no need to play with the SLP parameters.
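The idea behind a window, sketched very roughly: duplications are only eligible to start inside a defined time range, so they stay out of the way of the backups. The window itself is configured in NetBackup, not in a script; the days and hours below are hypothetical and only illustrate the logic.

```python
from datetime import datetime, time

# Hypothetical window: duplications may start 08:00-18:00, Monday to Friday.
WINDOW_DAYS = {0, 1, 2, 3, 4}   # datetime.weekday(): Monday is 0
WINDOW_START = time(8, 0)
WINDOW_END = time(18, 0)


def duplication_allowed(now: datetime) -> bool:
    """True if a duplication would be allowed to start at this moment."""
    in_day = now.weekday() in WINDOW_DAYS
    in_hours = WINDOW_START <= now.time() < WINDOW_END
    return in_day and in_hours


print(duplication_allowed(datetime(2024, 1, 6, 23, 0)))  # Saturday night
print(duplication_allowed(datetime(2024, 1, 8, 10, 0)))  # Monday morning
```

With a window like this, weekend backups get the disk pool to themselves and the rehydration to tape starts once the window opens.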
Share more info on the load if possible.