Forum Discussion

TimWillingham
10 years ago

Getting "(2106) Disk storage server is down" on all MSDP clients after upgrade to 7.6.0.2 from 7.5.0.6

The storage pool shows UP in the GUI, but DOWN when queried from the command line on the Linux master server.

Here is an example of job details:

 

07/23/2014 18:40:00 - Info nbjm (pid=5732) starting backup job (jobid=1429726) for client ncrowap01, policy ar_row_win_nt, schedule frq-inc
07/23/2014 18:40:00 - Info nbjm (pid=5732) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=1429726, request id:{A93CD160-12C2-11E4-BD56-5364D264E16A})
07/23/2014 18:40:00 - requesting resource BHM_MSDP
07/23/2014 18:40:00 - requesting resource alxaptb07xs.NBU_CLIENT.MAXJOBS.ncrowap01
07/23/2014 18:40:00 - requesting resource alxaptb07xs.NBU_POLICY.MAXJOBS.ar_row_win_nt
07/23/2014 18:40:00 - Error nbjm (pid=5732) NBU status: 2106, EMM status: Storage Server is down or unavailable
Disk storage server is down  (2106)

None of the articles I found seem to apply.

4 Replies

  • What does nbdevquery -listdp -dp "diskpool name" -stype PureDisk produce?

     

    If it's down, check the spoold and spad processes, and check their logs after restarting. Did you complete the MSDP post-upgrade (metadata conversion) tasks?
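
A minimal sketch of the process check suggested above, assuming a Linux media server with pgrep available and that the daemons appear under the process names spoold and spad, as named in this reply. The function name check_msdp_procs is made up for illustration.

```shell
#!/bin/sh
# Report whether each named process is currently running.
# Assumption: the MSDP daemons show up as "spoold" and "spad" in the process table.
check_msdp_procs() {
    for proc in "$@"; do
        # pgrep -x matches the process name exactly
        if pgrep -x "$proc" >/dev/null 2>&1; then
            echo "$proc: running"
        else
            echo "$proc: not running"
        fi
    done
}

check_msdp_procs spoold spad
```

If either one reports "not running", that is the point to restart the service and then read the logs.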

  • Yes, I converted the MSDP storage pool afterwards.  Restarting the service brings the pool back up, but the restarted jobs then fail with status codes 14, 40, or 87.

  • Are you able to query the disk pool with crcontrol commands?

     

    Try:

    /opt/pdcr/bin/crcontrol --dsstat 1
    
    -or-
    
    C:\Program Files\Veritas\pdde>crcontrol --dsstat 1
    
    -and-
    
    crcontrol --getmode
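
The Linux variant of the two queries above could be wrapped roughly like this. It is only a sketch: the default path is the one quoted in this reply and may differ on your install, and the CRCONTROL variable and msdp_status function are hypothetical names, not part of NetBackup.

```shell
#!/bin/sh
# Path taken from the reply above; override CRCONTROL if crcontrol lives elsewhere.
CRCONTROL=${CRCONTROL:-/opt/pdcr/bin/crcontrol}

# Run both queries from the reply, or fail cleanly if the binary is missing.
msdp_status() {
    if [ ! -x "$CRCONTROL" ]; then
        echo "crcontrol not found at $CRCONTROL" >&2
        return 1
    fi
    "$CRCONTROL" --dsstat 1
    "$CRCONTROL" --getmode
}

# Usage on the storage server:
#   msdp_status
```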
    

     

     

  • Turned out to be a polling problem.  I learned that if more than one media server is credentialed for the storage pool (for failover) and the media server that hosts the storage pool goes offline (for example, during an upgrade), the client picks another credentialed media server to check whether the storage pool is up.  If that media server is across the WAN and does not respond quickly enough, the storage pool gets marked as offline, which is what was happening to us.

    So I should have deselected all but the storage server under Credentialed servers for the upgrade.
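
To make the failure mode concrete, here is a toy model of the behaviour described above. It is not NetBackup's actual polling logic: the poll_pool function, the timeout value, the server names, and the reply times are all invented for illustration. The first credentialed server that is not offline gets polled, and the pool counts as UP only if that server answers within the timeout.

```shell
#!/bin/sh
# Toy model only; every name and number here is hypothetical.
# usage: poll_pool TIMEOUT_MS NAME=REPLY_MS [NAME=REPLY_MS ...]
#   REPLY_MS is the server's reply time in ms, or the word "offline".
poll_pool() {
    timeout_ms=$1
    shift
    for entry in "$@"; do
        name=${entry%%=*}
        reply=${entry#*=}
        # An offline server cannot be polled, so move on to the next one
        [ "$reply" = "offline" ] && continue
        if [ "$reply" -le "$timeout_ms" ]; then
            echo "UP via $name"
        else
            echo "DOWN via $name"
        fi
        return 0
    done
    echo "DOWN no server reachable"
}

# Normal operation: the hosting server answers quickly.
poll_pool 1000 msdp-host=20 wan-media=5000       # prints "UP via msdp-host"
# During the upgrade: the host is offline, so the slow WAN server is polled.
poll_pool 1000 msdp-host=offline wan-media=5000  # prints "DOWN via wan-media"
```

With only the storage server credentialed, there is no slow WAN server left in the list to produce that DOWN result.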