Forum Discussion

Jonathan_D_
12 years ago

EMM status: Storage Server is down or unavailable

Hi,

 

Since we upgraded from 7.1 to 7.5.0.4, we have had an intermittent problem. NetBackup detects that the disk volumes are down, so the majority of our backups fail with one of these status codes: "Disk storage server is down(2106)" or "Disk volume is down(2074)".

 

Here is my environment:

Product:  NetBackup Enterprise Server
Product Version:  7.5.0.4
Platform / OS Info:  Windows 2008 R2

Disk Pool: OpenStorage disk pool on Data Domain.

 

I checked the bpstsinfo log file on the master server, and this is the error I see when it happens:

 

21:38:54.707 [836780.826408] <16> emmlib_initializeEx: (-) Exception! CORBA::TRANSIENT
21:38:54.707 [836780.826408] <16> emmlib_initializeEx: (-) system exception, ID 'IDL:omg.org/CORBA/TRANSIENT:1.0'
OMG minor code (2), described as '*unknown description*', completed = NO
21:39:59.729 [836780.826408] <16> emmlib_initializeEx: (-) Exception! CORBA::TRANSIENT
21:39:59.729 [836780.826408] <16> emmlib_initializeEx: (-) system exception, ID 'IDL:omg.org/CORBA/TRANSIENT:1.0'
OMG minor code (2), described as '*unknown description*', completed = NO
21:41:04.594 [836780.826408] <16> emmlib_initializeEx: (-) Exception! CORBA::TRANSIENT
21:41:04.594 [836780.826408] <16> emmlib_initializeEx: (-) system exception, ID 'IDL:omg.org/CORBA/TRANSIENT:1.0'
OMG minor code (2), described as '*unknown description*', completed = NO
21:42:09.460 [836780.826408] <16> emmlib_initializeEx: (-) Exception! CORBA::TRANSIENT
21:42:09.460 [836780.826408] <16> emmlib_initializeEx: (-) system exception, ID 'IDL:omg.org/CORBA/TRANSIENT:1.0'
OMG minor code (2), described as '*unknown description*', completed = NO
21:42:54.794 [817380.831240] <8> vnet_same_host_and_update: [vnet_addrinfo.c:2820] name2 is empty 0 0x0
21:42:54.794 [817380.831240] <4> db_getSTUNITlist: emmserver_name = masterserver
21:42:54.794 [817380.831240] <4> db_getSTUNITlist: emmserver_port = 1556
21:43:14.341 [836780.826408] <16> emmlib_initializeEx: (-) Exception! CORBA::TRANSIENT
21:43:14.341 [836780.826408] <16> emmlib_initializeEx: (-) system exception, ID 'IDL:omg.org/CORBA/TRANSIENT:1.0'
OMG minor code (2), described as '*unknown description*', completed = NO
21:43:19.364 [836780.826408] <16> db_getSTUNITlist: (-) Translating EMM_ERROR_CorbaTransient(3000001) to 25 in the NetBackup context
21:43:19.364 [836780.826408] <16> db_getSTUNITlist: error (25) initializing EMM interface.
21:43:19.411 [836780.826408] <16> bpstsinfo/update ERROR: unable to retrieve stu list from server (25)
21:43:54.699 [817380.831240] <16> emmlib_initializeEx: (-) Exception! CORBA::TRANSIENT
21:43:54.699 [817380.831240] <16> emmlib_initializeEx: (-) system exception, ID 'IDL:omg.org/CORBA/TRANSIENT:1.0'
OMG minor code (2), described as '*unknown description*', completed = NO
21:44:59.533 [817380.831240] <16> emmlib_initializeEx: (-) Exception! CORBA::TRANSIENT
21:44:59.533 [817380.831240] <16> emmlib_initializeEx: (-) system exception, ID 'IDL:omg.org/CORBA/TRANSIENT:1.0'
OMG minor code (2), described as '*unknown description*', completed = NO
21:47:54.817 [817204.838996] <8> vnet_same_host_and_update: [vnet_addrinfo.c:2820] name2 is empty 0 0x0
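For anyone wanting to check how regular these failures are, here is a minimal sketch that measures the gap between consecutive CORBA::TRANSIENT exceptions in an excerpt like the one above. The function name `transient_gaps` is made up for illustration, and treating the first number in the square brackets as a process identifier is an assumption based on how the lines group together. In the excerpt above, the exceptions logged under 836780 arrive roughly 65 seconds apart.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches lines such as:
# 21:38:54.707 [836780.826408] <16> emmlib_initializeEx: (-) Exception! CORBA::TRANSIENT
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3}) \[(\d+)\.\d+\] <\d+> (.*)$")

def transient_gaps(log_text):
    """Map each bracketed id to the seconds between its consecutive
    'Exception! CORBA::TRANSIENT' lines."""
    times = defaultdict(list)
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m and m.group(3).endswith("Exception! CORBA::TRANSIENT"):
            stamp = datetime.strptime(m.group(1), "%H:%M:%S.%f")
            times[m.group(2)].append(stamp)
    return {
        proc: [round((b - a).total_seconds(), 3) for a, b in zip(ts, ts[1:])]
        for proc, ts in times.items()
    }

# Three lines copied from the excerpt above.
SAMPLE = """\
21:38:54.707 [836780.826408] <16> emmlib_initializeEx: (-) Exception! CORBA::TRANSIENT
21:38:54.707 [836780.826408] <16> emmlib_initializeEx: (-) system exception, ID 'IDL:omg.org/CORBA/TRANSIENT:1.0'
21:39:59.729 [836780.826408] <16> emmlib_initializeEx: (-) Exception! CORBA::TRANSIENT
"""

gaps = transient_gaps(SAMPLE)
print(gaps)
```

The sketch only compares timestamps within one day, so an excerpt that crosses midnight would need the date added.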
 

4 Replies

  • There could be a number of causes, but the extra ports used by NetBackup are often behind this issue. As you are on Windows 2008, try the following (the registry change needs a reboot; the netsh change takes effect immediately). These changes increase the number of ports available and reduce the time-wait delay so that ports are freed up more quickly:

    HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\
    Add a new DWORD: TcpTimedWaitDelay, decimal value of 30
    netsh int ipv4 set dynamicport tcp start=10000 num=50000

    Next, to keep communication with the storage server more stable, add the following flat file to the master and storage servers:

    <install path>\veritas\netbackup\db\config\DPS_PROXYDEFAULTRECVTMO

    Open it in Notepad and put a value of 800 in it.

    All of the above can really help with this issue

    Hope this helps
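To see why both port settings matter, here is a rough back-of-the-envelope sketch. It assumes a simplified model in which every closed connection holds its local port for the full time-wait delay, so the number of ports divided by the delay gives a ceiling on new outbound connections per second. The function name is made up for illustration, and the 120-second "before" delay is an assumed figure for comparison only, not a value confirmed anywhere in this thread.

```python
def max_new_connections_per_second(num_ports, time_wait_seconds):
    """Rough ceiling on sustained new outbound connections per second.

    Simplified model: each closed connection keeps its local port busy
    for time_wait_seconds, so at most num_ports ports can be consumed
    in any window of that length.
    """
    return num_ports / time_wait_seconds

# Values suggested in the reply above: 50000 ports, 30-second delay.
suggested = max_new_connections_per_second(50000, 30)

# ASSUMPTION for comparison only: 16384 ports and a 120-second delay.
# Check the real values on the server before drawing conclusions.
assumed_before = max_new_connections_per_second(16384, 120)

print(f"suggested settings: about {suggested:.0f} connections/s")
print(f"assumed baseline:   about {assumed_before:.0f} connections/s")
```

The absolute numbers matter less than the ratio: a wider range and a shorter delay both raise the ceiling, and they multiply together.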

  • Thanks Mark,

    I have some questions before applying it.

    NetBackup support already asked me to put these two files on my media servers, each containing the value 1800:

    <install path>\veritas\netbackup\db\config\DPS_PROXYDEFAULTRECVTMO

    <install path>\veritas\netbackup\db\config\DPS_PROXYDEFAULTSENDTMO

    It improved the situation, but I still got the error after a week. Maybe it happens when I have more jobs running on the weekend. Should I really set it to 800? If yes, can you explain why?

     

    Next, for the netsh command, here is my current config when I enter "netsh int ipv4 show dynamic tcp":

    Protocol tcp Dynamic Port Range
    ---------------------------------
    Start Port      : 49152
    Number of Ports : 16384

    Should I still enter what you told me? "netsh int ipv4 set dynamicport tcp start=10000 num=50000"

    For the registry key, I have nothing set at the moment, so I will try it.
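The flat files discussed in this thread are plain text files holding a single number, so creating one by script is straightforward. Below is a minimal sketch, assuming the config directory is passed in by the caller (the install path is not spelled out in the thread, so it is not guessed here). The helper name `write_flat_file` is hypothetical. Whether NetBackup tolerates a trailing newline is not stated in the thread, so the sketch writes none, which matches typing the number in Notepad.

```python
import tempfile
from pathlib import Path

def write_flat_file(config_dir, name, value):
    """Create a file called `name` in config_dir containing only `value`."""
    path = Path(config_dir) / name
    path.write_text(str(value))  # no trailing newline (assumption, see above)
    return path

# Demo against a temporary directory rather than a real NetBackup install.
with tempfile.TemporaryDirectory() as demo_dir:
    created = write_flat_file(demo_dir, "DPS_PROXYDEFAULTRECVTMO", 800)
    content = created.read_text()
    print(created.name, "contains", content)
```

On a real server, `config_dir` would be the `db\config` directory named in the replies, and the value would be whichever number you settle on.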

     

     

     

  • Try this workaround until you fix the problem.

    Add the following touch files to the master server:

    /usr/openv/netbackup/db/config/MEDIA_SERVER_ALWAYS_UP.(hostname)
    /usr/openv/netbackup/db/config/DISABLE_DSU_CAPACITY_POLL_SERVER.(hostname)

    These files are meant for a 6.0 master with 5.1 media servers, but I used them once when I had a problem with a 6.5 disk media server.

    Give it a try, you never know.

     
     
  • Jonathan

    I don't like using DPS_PROXYDEFAULTSENDTMO, as I have found it can make things worse.

    1800 is too high for the other one, so try 800.

    This is all from experience, especially with appliances, and from the amount of interaction I have had with support and engineering over the last couple of years. The 800 value is now standard on new appliances and the SENDTMO file is not used, so it looks like my figures may be good!

    You can adjust the netsh range to suit. If you want to start at a higher number you can, but the range you have now ends at 65536 (49152 + 16384, which is one of those numbers that rings a bell!), so I would not set the start higher than 15536 if the number is to be 50000 (hope that makes sense).

    Although Stephanos' suggestion may stop the disks going down, it also effectively stops the routine checks on the system, so I would be reluctant to use those files unless desperate.
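The port-range arithmetic in the reply above can be checked with a small sketch. The helper names are made up for illustration. The only fact relied on beyond the thread is that TCP port numbers stop at 65535, so start plus number of ports must not exceed 65536.

```python
MAX_PORT = 65535  # highest valid TCP port number

def range_end(start, num):
    """Last port of a dynamic range that begins at `start` and holds `num` ports."""
    return start + num - 1

def max_start(num):
    """Highest start port that still fits `num` ports under the TCP limit."""
    return MAX_PORT + 1 - num

def fits(start, num):
    """True if the whole range stays within valid port numbers."""
    return start >= 1 and num >= 1 and range_end(start, num) <= MAX_PORT

current_end = range_end(49152, 16384)   # range reported by "show dynamic tcp"
highest_start = max_start(50000)        # limit when keeping num=50000
suggested_ok = fits(10000, 50000)       # the command suggested earlier
too_high = fits(49152, 50000)           # keeping the current start with num=50000

print(current_end, highest_start, suggested_ok, too_high)
```

This only checks that the range fits. Any further limits that netsh itself enforces on the start or size are not covered here.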