MSDP Storage unit is down during queue processing 2074

D_Thomas
Level 4

I am getting 2074 - MSDP storage unit down - during RMAN backups when scheduled queue processing kicks off.

It has happened twice now, and both times it corresponded with scheduled queue processing. Queue processing kicks off at 6:30 PM on Friday, and all of the jobs that started at the same time also died with a 2074 - disk volume is down.
The media server is Windows 2008 R2 SP1 running NetBackup 7.5.0.4 with MSDP.

I don't have a ticket open on it yet, too busy fighting tickets for RMAN and deduplication.

There is plenty of free disk space and memory in the box

The server is a DL360 G7 with 12 cores (24 with hyperthreading), 64GB of storage and 48GB of memory. Memory usage has not gone over 33GB.

It has only happened during RMAN backups, both times. There was not a lot of other disk activity, and few other backup jobs were running, when it happened.

DT

5 REPLIES

Mark_Solutions
Level 6
Partner Accredited Certified

Thanks for opening the new thread ....

So if you have 64TB of storage you should have a minimum of 64GB of RAM (the usual MSDP guideline is about 1GB of RAM per 1TB of storage).

If it is a busy disk system timeout then add the following file to the Master and Media Servers:

/usr/openv/netbackup/db/config/DPS_PROXYDEFAULTRECVTMO

with a value of 800 in it.
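
As a rough illustration of that step, here is a minimal Python sketch that creates the touch file with 800 as its only content. The function name `write_timeout_file` is hypothetical, and the config directory is passed in rather than hard-coded because the path above is the Unix one; on a Windows media server the equivalent `db\config` directory under the NetBackup install path would need to be supplied. Whether the file needs a trailing newline is an assumption.

```python
from pathlib import Path

# Touch-file name taken from the post above; its content is a timeout in seconds.
TOUCH_FILE = "DPS_PROXYDEFAULTRECVTMO"


def write_timeout_file(config_dir, seconds=800):
    """Create the touch file in config_dir with the timeout as its only content."""
    if seconds <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    target = Path(config_dir) / TOUCH_FILE
    # Assumption: a single number followed by a newline is acceptable content.
    target.write_text(f"{seconds}\n")
    return target
```

For example, `write_timeout_file("/usr/openv/netbackup/db/config")` on a Unix master or media server. The NetBackup services will most likely need a restart before the new value is picked up.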
 
You could change the queue processing to run more often through the day so that it does not have such an impact when it does run (crontab)
 
It is most likely memory / paging / virtual memory related, to be honest.
 
Do you do any VMware or Accelerator backups? (There are known issues with these and queue processing / memory.)

D_Thomas
Level 4

It's Windows 2008 R2 SP1. There is plenty of free memory left for spool to use, the storage is only half full at this time, and there are no out-of-memory errors on the box. The in-use memory never goes over 35GB, leaving at least 13GB free.

I don't think it's a memory issue. It seems to be something else, and it is oddly tied to RMAN jobs running at the exact same time as queue processing.

Lots of VMware backups (400+), and they are not failing at all. No Accelerator jobs at present.

The disks aren't busy either, but I will try changing the timeout. As far as I know I don't have an option to change the queue processing schedule on Windows; I believe it runs every 21 or 23 hours or something odd like that.

Mark_Solutions
Level 6
Partner Accredited Certified

There is a semaphore memory leak with VMware backups: they work, but they leak memory. I am not sure whether this affects Windows servers or how to see it there, as I have only experienced it on Unix.

When queue processing runs it uses the same memory, and as a result things start to go wrong with queue processing and the disk goes down.

It may however just be a very busy disk, so try the DPS_PROXYDEFAULTRECVTMO setting from my first reply to see if it resolves it.

D_Thomas
Level 4

Thanks, I will give it a go and see how it goes. We have also dropped the number of streams per RAC environment to see if that helps.

The only issue I seem to have with VMware backups is where the media server keeps all of these xml files in \Program Files\Veritas\NetBackup\online_util\fi_cntl

If I don't go in and delete them every few days or so, they build up and make the snapshot/delete snapshot process in vmware take forever, sometimes up to an hour.

There is no A/V running on this box, so I can't really understand what is preventing these from being deleted. It's not as if they are all left there, just a big chunk of them. In the last 5 days almost 15000 of them have built up in the directory.
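
For anyone wanting to script the manual cleanup described above, here is a minimal Python sketch under stated assumptions. The function names, the `*.xml` pattern and the 2-day age threshold are all hypothetical choices, not anything NetBackup documents, and it is an assumption that files older than the threshold are no longer needed by a running snapshot job. It defaults to a dry run that only lists candidates.

```python
import time
from pathlib import Path


def find_stale_files(directory, max_age_days=2, pattern="*.xml", now=None):
    """Return files matching pattern whose mtime is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    return sorted(
        p for p in Path(directory).glob(pattern)
        if p.is_file() and p.stat().st_mtime < cutoff
    )


def prune(directory, max_age_days=2, dry_run=True):
    """List stale files; delete them only when dry_run is False."""
    stale = find_stale_files(directory, max_age_days)
    if not dry_run:
        for path in stale:
            path.unlink()
    return stale
```

Calling `prune(r"C:\Program Files\Veritas\NetBackup\online_util\fi_cntl")` would report what would be removed (adjust the path to the actual install location); passing `dry_run=False` performs the deletion. It is safest to run it while no VMware backups are active.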

Mark_Solutions
Level 6
Partner Accredited Certified

There is a VMware EEB for 7.5.0.4 which relates, at least on Unix, to files being left in /tmp.

Maybe it also relates to Windows. It is available on request only and is not listed on the Support site as far as I know, so it is worth logging a call with Support to ask about it in case it relates to your issue.