iVaibhav
Level 4
4 years ago

MSDP Down

Hello Community, 

We upgraded the OS on a master server from Windows Server 2008 to Windows Server 2019. The server also hosted an MSDP pool on the F: drive.

After a clean 2019 install, NetBackup master server 8.2 setup, and catalog recovery, we followed the documented process to reconnect the MSDP pool to NetBackup:

https://www.veritas.com/content/support/en_US/doc/25074086-130388296-0/v33188054-130388296

However, we see the errors below in the spad logs. Which certificate should we look at?

We changed the backup policies to go to tape instead of disk. Those backups connect and complete fine, so the master server certificate seems OK. We are not sure which certificate the log is talking about.

Appreciate any help. 

========================================

March 02 02:06:50 INFO: set entire process max log size to 0
March 02 02:06:50 INFO: set entire process max log size to 0
March 02 02:06:50 INFO: CR mode in spa db is normal !
March 02 02:06:50 INFO: Startup: loading configuration from F:\MSDP\etc\puredisk\spa.cfg
March 02 02:06:50 INFO: get key from file F:\MSDP\var\keys\auth.key.
March 02 02:06:50 INFO: : extended attribute support disabled
March 02 02:06:50 INFO: get key from file F:\MSDP\var\keys\auth.key.
March 02 02:06:50 INFO: Manager thread stack size: 0 bytes
March 02 02:06:50 INFO: Task thread stack size: 0 bytes
March 02 02:06:50 INFO: Max DO size: 335544320 bytes
March 02 02:06:50 INFO: Fibre Channel configuration is empty.
March 02 02:06:50 INFO [000002DB23E29910]: set entire process max log size to 10000000
March 02 02:06:50 INFO [000002DB23E29910]: set entire process max log size to 10000000
March 02 02:06:50 INFO [000002DB23E29910]: Startup: occurred at Tue Mar 2 02:06:50 2021
March 02 02:06:50 INFO [000002DB23E29910]: Successfully loaded configuration from F:\MSDP\etc\puredisk\spa.cfg
March 02 02:06:50 INFO [000002DB23E29910]: Startup: 1854: MD5 passwd disabled
March 02 02:06:50 INFO [000002DB23E29910]: Encrypted passwords match encryption keys.
March 02 02:06:50 INFO [000002DB23E29910]: FIPS mode not ENABLED
March 02 02:06:50 INFO [000002DB23E29910]: Startup: Instant Data Access is not enabled
March 02 02:06:50 INFO [000002DB23E29910]: Startup: spad type is master
March 02 02:06:50 INFO [000002DB23E29910]: Startup: master spad url is
March 02 02:06:50 INFO [000002DB23E29910]: Startup: Veritas PureDisk Storage Pool Authority Version 12.0000.0019.0624.
March 02 02:06:50 INFO [000002DB23E29910]: Startup: using Veritas: libdct 6.0.0.0, July 7, 2004
March 02 02:06:50 INFO [000002DB23E29910]: Startup: using Veritas PureDisk: libcr 6.1.0.0, December 13, 2006
March 02 02:06:50 INFO [000002DB23E29910]: Startup: upgrade of spa DB skipped
March 02 02:06:50 INFO [000002DB23E29910]: Memory Manager: initializing
March 02 02:06:50 INFO [000002DB23E29910]: Memory Manager: initialization complete
March 02 02:06:50 INFO [000002DB23E29910]: Sched Manager: initializing
March 02 02:06:50 INFO [000002DB23E29910]: schedule string for QueueProcess: [20 */12 * * *]
March 02 02:06:50 INFO [000002DB23E29910]: alert time for QueueProcess is 21600
March 02 02:06:50 INFO [000002DB23E29910]: lifecycle for QueueProcess is 32400
March 02 02:06:50 INFO [000002DB23E29910]: schedule string for RepOrphan: [*/10 * * * *]
March 02 02:06:50 INFO [000002DB23E29910]: alert time for RepOrphan is 300
March 02 02:06:50 INFO [000002DB23E29910]: lifecycle for RepOrphan is 600
March 02 02:06:50 INFO [000002DB23E29910]: schedule string for CatalogBackup: [40 3 * * *]
March 02 02:06:50 INFO [000002DB23E29910]: alert time for CatalogBackup is 21600
March 02 02:06:50 INFO [000002DB23E29910]: lifecycle for CatalogBackup is 32400
March 02 02:06:50 INFO [000002DB23E29910]: schedule string for CRStats: [0 * * * *]
March 02 02:06:50 INFO [000002DB23E29910]: alert time for CRStats is 300
March 02 02:06:50 INFO [000002DB23E29910]: lifecycle for CRStats is 600
March 02 02:06:50 INFO [000002DB23E29910]: Sched Manager: initialization complete
March 02 02:06:50 INFO [000002DB23E29910]: PD_Replication: initializing
March 02 02:06:50 INFO [000002DB23E29910]: PD_Replication: initialization complete
March 02 02:06:50 INFO [000002DB23E29910]: Task Manager: initializing
March 02 02:06:50 INFO [000002DB23E29910]: Task Manager: seeding random number generator
March 02 02:06:50 INFO [000002DB23E29910]: Task Manager: initializing DCT Content Routing (TM)
March 02 02:06:50 INFO [000002DB23E29910]: _crConfigLoadA: Set the CR context use NBU CA.
March 02 02:06:50 INFO [000002DB23E29910]: Task Manager: initialization complete
March 02 02:06:50 INFO [000002DB23E29910]: Connection Manager: initializing
March 02 02:06:50 INFO [000002DB23E29910]: Connection Manager: initialization complete
March 02 02:06:50 INFO [000002DB23E29910]: Catalog auto recovery is enabled.
March 02 02:06:50 INFO [000002DB23E29910]: Catalog backup version count: 5, shadow pool size: 64
March 02 02:06:50 INFO [000002DB23E29910]: Catalog backup shadow path: F:\MSDP\databases\catalogshadow
March 02 02:06:50 INFO [000002DB23E29910]: CatalogContext::get_storage_path: storage_path=F:\MSDP\data, storage_path_spa=F:\MSDP
March 02 02:06:50 INFO [000002DB23E29910]: Startup: completed at Tue Mar 2 02:06:50 2021
March 02 02:06:50 INFO [000002DB23E29910]: set entire process max log size to 10000000
March 02 02:06:50 INFO [000002DB23E29910]: set entire process max log size to 10000000
March 02 02:06:50 INFO [000002DB23E29910]: Increasing maximum number of open files from 512 to 2048
March 02 02:06:50 INFO [000002DB23E29910]: History Manager: start [thread 000002DB24DCB320]
March 02 02:06:50 INFO [000002DB23E29910]: Sched Manager: started [thread 000002DB24E9AFF0]
March 02 02:06:50 INFO [000002DB23E29910]: DataCheck Manager: started [thread 000002DB24E9BA70]
March 02 02:06:50 INFO [000002DB23E29910]: Task Manager: started [thread 000002DB2542D290]
March 02 02:06:51 INFO [000002DB23E29910]: EventSystem::SpaDBEventStorage::checkConsistency Starting consistency check
March 02 02:06:51 INFO [000002DB23E29910]: EventSystem::SpaDBEventStorage::checkConsistency Ending consistency check
March 02 02:06:51 INFO [000002DB23E29910]: EventSystem::SpaDBDSEventStorage::checkConsistency Starting consistency check
March 02 02:06:51 INFO [000002DB23E29910]: EventSystem::SpaDBDSEventStorage::checkConsistency Ending consistency check
March 02 02:06:51 INFO [000002DB23E29910]: EventSystem::EventManager::EventManager m_orig_dsid = 2
March 02 02:06:51 INFO [000002DB23E29910]: EventSystem::EventManager::newEvent Adding new event. type:4, dsid:0
March 02 02:06:51 INFO [000002DB23E29910]: remove old record file F:\MSDP\databases\spa\database\eventproperties\1.
March 02 02:06:51 INFO [000002DB23E29910]: NetBindAndListen: bound myself to <localhost>:10102 using IPv6
March 02 02:06:51 INFO [000002DB23E29910]: NetBindAndListen: bound myself to <localhost>:10102 using IPv4
March 02 02:06:51 INFO [000002DB23E29910]: Connection Manager: started
March 02 02:06:51 INFO [000002DB23E29910]: Setting service status to running if service start
March 02 02:06:54 INFO [000002DB24E9A730]: NBUAPI::GetSecurityProperties: synced security properties with master server 1, 0
March 02 02:07:24 ERR [000002DB24E9A730]: -1: CURLClient::Execute: failed to perform CURL session, error: Operation timed out after 30000 milliseconds with 0 bytes received
March 02 02:07:24 ERR [000002DB24E9A730]: -1: NBUAPI::GetToken: curl error to access https://server.com:1556/netbackup/loginwithcert
March 02 02:07:24 ERR [000002DB24E9A730]: -1: NBUAPI::ValidateHost: failed to get token
March 02 02:07:24 ERR [000002DB24E9A730]: -1: validate_host_cert: validate the host server.com uuid 4393a0cd-e60f-4f37-bb3b-e3fc8bf42ce7 cert failed.
March 02 02:07:24 ERR [000002DB24E9A730]: 25056: Agent at server.com certificate is invalid. 

====================================================
Regards,
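
For anyone triaging a similar spad log, most of the excerpt above is routine INFO startup output, and only the last few ERR lines matter. A minimal Python sketch for pulling those out follows. The line layout (timestamp, level, optional thread ID in brackets, message) is inferred from this excerpt only, and the names `LINE_RE`, `LogEntry`, `parse_spad_lines`, and `errors_only` are hypothetical helpers, not part of any NetBackup tooling.

```python
import re
from typing import Iterable, Iterator, List, NamedTuple, Optional

# Layout assumed from the excerpt in this thread:
#   "March 02 02:07:24 ERR [000002DB24E9A730]: -1: message text"
# The bracketed thread ID is optional (early startup lines omit it).
LINE_RE = re.compile(
    r"^(?P<ts>\w+ \d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+)"
    r"(?: \[(?P<thread>[0-9A-Fa-f]+)\])?: "
    r"(?P<msg>.*)$"
)


class LogEntry(NamedTuple):
    timestamp: str
    level: str
    thread: Optional[str]
    message: str


def parse_spad_lines(lines: Iterable[str]) -> Iterator[LogEntry]:
    """Yield one LogEntry per line that matches the assumed layout."""
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            yield LogEntry(match["ts"], match["level"], match["thread"], match["msg"])


def errors_only(lines: Iterable[str]) -> List[LogEntry]:
    """Keep only the entries logged at level ERR."""
    return [entry for entry in parse_spad_lines(lines) if entry.level == "ERR"]


# Four lines copied from the log posted above.
SAMPLE = [
    "March 02 02:06:50 INFO: set entire process max log size to 0",
    "March 02 02:06:51 INFO [000002DB23E29910]: Connection Manager: started",
    "March 02 02:07:24 ERR [000002DB24E9A730]: -1: NBUAPI::ValidateHost: failed to get token",
    "March 02 02:07:24 ERR [000002DB24E9A730]: 25056: Agent at server.com certificate is invalid.",
]

if __name__ == "__main__":
    for entry in errors_only(SAMPLE):
        print(entry.timestamp, entry.message)
```

To run it against a real log, pass an open file object to `errors_only` in place of `SAMPLE`.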

  • OK, this was solved by Veritas.

    Rerunning the command

    nbcertcmd.exe -getcertificate -force -token

    allowed spad to come online.

    I am still not sure why other clients were connecting fine while spad was not.
     
    Thanks Marianne and davidmoline for taking a look at my post.
  • Hi iVaibhav 

    The one thing that stands out to me is the failure to connect on port 1556 to obtain the certificate. The certificate in this case would be the host/CA certificate on the master server.

    Does everything else on the master server work and come up? Are you able to communicate with clients or other media servers (have you tried the bptestbpcd command to check connectivity?).

    Have you verified that the local Windows firewall is not blocking this port? If the firewall is not enabled, can you please check that the PBX service (Veritas Private Branch Exchange) is running?

    Finally, run some certificate-checking commands to verify that everything appears to be okay.
    # nbcertcmd -ping
    # nbcertcmd -hostselfcheck

    Let us know how you go.
    David
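
A quick way to test the firewall question above without extra tools is a plain TCP connect to port 1556. Below is a minimal Python sketch; `port_is_open` is a hypothetical helper name, and the host to test is whatever name appears in the `loginwithcert` URL in the log. A successful connect only shows that the port is reachable. It says nothing about whether the service behind it answers in time or whether the certificate is valid.

```python
import socket


def port_is_open(host: str, port: int = 1556, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and tries each returned address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # "server.com" is the placeholder host name used in the posted log.
    print(port_is_open("server.com"))
```

If this returns False when run on the master server itself, the firewall and the PBX service are the first things to check.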

  • iVaibhav 

    I am curious to know whether the master server was installed in 'Disaster Recovery' mode and whether the security certificates were restored at the end of the installation.

    •
      iVaibhav
      Level 4

      Yes, it was in DR mode.

      All clients are connecting and backing up to tape fine.

      Just to be sure this was not the issue, we ran the command below again to fix anything that did not go right at that time.

      nbhostidentity -import

      It is only the pool on the local drive that fails to connect. I am not sure what is wrong with it.
