
CR Queue Processing stopped after restarting MSDP

fhaddad81
Level 3

We upgraded the NetBackup master server and the media server (the target replication server) to version 10. Backup and replication operations then failed, so the server was restarted.

It seems we restarted the server while CR queue processing was running, and the spoold service could not start. The errors are below:

crcontrol.exe --dataconvertstate

E:\Program Files\Veritas\pdde>crcontrol.exe --dataconvertstate

Error: -1: NetConnectByAddr: Failed to connect to host: No connection could be made because the target machine actively refused it.  (10061)

Error: -1:  NetConnectByAddr: Failed to connect to spoold on port 10082 using the following interface(s): [ ::1 127.0.0.1  ] (No connection could be made because the target machine actively refused it. ) Ensure storage server services are running and operational.  V-454-92

Error: 53: Could not establish a connection to localhost:10082: connect failed (No connection could be made because the target machine actively refused it. )

Error : Connection failed connection actively refused. Note that the content router needs to be running to get a connection.
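The errors above say that nothing is accepting connections on spoold's port 10082 on either loopback address. As a rough way to confirm that independently of crcontrol, here is a minimal sketch, assuming Python is available on the media server. The helper name `port_is_open` is hypothetical; the port and the two addresses are taken from the error text.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unavailable address families.
        return False


if __name__ == "__main__":
    # Addresses and port as reported in the crcontrol error output.
    for host in ("::1", "127.0.0.1"):
        state = "open" if port_is_open(host, 10082) else "refused or unreachable"
        print(f"{host} port 10082: {state}")
```

If both addresses report refused, the listener itself is not running, which points at spoold failing to start rather than at a network or firewall problem.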

 

spoold logs:

 

February 09 11:24:44 WARNING [0000005D1219B530]: 25042: Failed to get startup CR modes from SPA after 3851 attempts, retrying in 10 seconds

February 09 11:24:38 INFO [0000005D1219B530]: WSRequestExt: submitting &request=5&login=agent_3_9902&passwd=*&id=1&action=getModeFromSPA

February 09 11:24:38 INFO [0000005D1219B530]: sessionStartAgent: establishing session with ***-***.local:10102 on socket 8456124

February 09 11:24:38 INFO [0000005D1219B530]: sessionStartAgent: using local address ***-***.local:61493

February 09 11:24:41 ERR [0000005D1219B530]: 25042: Request id mismatch in binary message: 0, expected 2.

February 09 11:24:41 ERR [0000005D1219B530]: 25042: CRSessionStart: connect to server failed!

 

 

1 ACCEPTED SOLUTION

Accepted Solutions

fhaddad81
Level 3

 

The issue has been resolved. I went back through the logs from the day the issue first appeared, because all the later logs showed only "unknown error". There I found SSL initialization errors and errors saying the certificate could not be found, so I ran "nbcertcmd -getCertificate -server server_name -token token_generated -force" for all the names in use.
I also found certificate errors related to the OpsCenter server, so I removed it to isolate the issue.
All services then started, including the NetBackup Deduplication Engine.
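Running the same command once per name is easy to get wrong by hand, so the following is a minimal sketch that only builds and prints the command lines, without executing anything. It assumes "all used names" means each name by which the server is known (for example short name and FQDN), which is a reading of the post and not something it states. The helper `build_nbcertcmd_args` and the placeholder names and token are hypothetical; the flags are exactly those quoted in the post.

```python
def build_nbcertcmd_args(server_names, token):
    """Return one argument list per server name, mirroring the quoted command."""
    return [
        ["nbcertcmd", "-getCertificate", "-server", name, "-token", token, "-force"]
        for name in server_names
    ]


if __name__ == "__main__":
    # Placeholder values (assumptions): substitute the real names and token.
    names = ["server_name", "server_name.example.local"]
    for args in build_nbcertcmd_args(names, "token_generated"):
        # Printed for review only; nothing is executed here.
        print(" ".join(args))
```

Printing the commands first lets you review them before pasting them into an elevated prompt on the affected server.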


8 REPLIES 8

Nick_Morris
Level 6
Partner    VIP    Certified

It would be worth having a look at https://www.veritas.com/support/en_US/article.100006017 to see if this helps.

sanket_pathak1
Level 4

@fhaddad81 

@Nick_Morris's recommendation is valid. However, in my case the problem was resolved by disabling IPv6, since the error reports the IPv6 loopback address:

Error: -1:  NetConnectByAddr: Failed to connect to spoold on port 10082 using the following interface(s): [ ::1 127.0.0.1  ] (No connection could be made because the target machine actively refused it. ) Ensure storage server services are running and operational.  V-454-92

 

fhaddad81
Level 3

@Nick_Morris and @sanket_pathak1, I tried your suggestions, but the result is still the same.

Also, the errors in SPAD.log are below:

 

February 16 11:07:01 ERR [000000A9E98AEF80]: 25004: Session start request from ***.-***.local:55356 could not be honored (unknown error)

February 16 11:07:01 INFO [000000A9E98AEF80]: (unknown) Task 208 [thread 000000A9E98AEF80] for AE-ENG-BMD-HP2.cloud.ccg.local:55356 failed with unknown error.

@fhaddad81 I hope you rebooted the server after disabling IPv6.

Also, can you confirm that name resolution is working correctly?
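One quick way to check name resolution from the server itself is sketched below, assuming Python is available there. The helper name `resolve` is hypothetical; `socket.gethostname()` and `socket.getfqdn()` simply report what the local machine believes its own names are, which may differ from the names configured in NetBackup.

```python
import socket


def resolve(hostname):
    """Return the sorted, de-duplicated addresses that hostname resolves to."""
    infos = socket.getaddrinfo(hostname, None)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    for name in (socket.gethostname(), socket.getfqdn()):
        try:
            print(name, "->", resolve(name))
        except socket.gaierror as exc:
            print(name, "-> resolution failed:", exc)
```

Running it on both the master and the media server and comparing the output shows whether each side resolves the other's names to the addresses you expect.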

fhaddad81
Level 3

@sanket_pathak1 I rebooted the server again and confirmed that IPv6 is disabled and name resolution is OK.

I am also posting the SPAD logs and the onlinecheck output.

SPAD Logs

February 21 10:12:52 ERR [000000D13939D130]: 25004: Session start request from ****:65316 could not be honored (unknown error)
February 21 10:12:52 INFO [000000D13939D130]: (unknown) Task 333 [thread 000000D13939D130] for *****:65316 failed with unknown error.
February 21 10:13:02 ERR [000000D13939D130]: -1: NBUAPI::InitCURL: certpath is empty
February 21 10:13:02 ERR [000000D13939D130]: -1: CURLClient::setFSPKIArtifacts: Missing PKI artifact(s)
February 21 10:13:02 ERR [000000D13939D130]: -1: CURLClient::InitImp: setFSPKIArtifacts failed
February 21 10:13:03 ERR [000000D13939D130]: -1: CURLClient::setFSPKIArtifacts: Missing PKI artifact(s)
February 21 10:13:03 ERR [000000D13939D130]: -1: CURLClient::InitImp: setFSPKIArtifacts failed
February 21 10:13:04 ERR [000000D13939D130]: -1: CURLClient::setFSPKIArtifacts: Missing PKI artifact(s)
February 21 10:13:04 ERR [000000D13939D130]: -1: CURLClient::InitImp: setFSPKIArtifacts failed
February 21 10:13:05 ERR [000000D13939D130]: -1: nbuapi::GetCAUsage: failed to init CURL session with OPSCenter server
February 21 10:13:05 ERR [000000D13939D130]: -1: failed to get external CA usage

***********

February 21 10:10:28 WARNING [000000D13939CF60]: 53: couldn't connect contentrouter, it is not ready now.
February 21 10:10:28 INFO [000000D13939CF60]: couldn't connect contentrouter, it is not ready now, will sleep 60 seconds and try again.
February 21 10:11:29 ERR [000000D13939CF60]: -1: NetConnectByAddr: Failed to connect to host: No connection could be made because the target machine actively refused it. (10061)
February 21 10:11:29 ERR [000000D13939CF60]: 53: Could not establish a connection to ::1:10082: connect failed (No connection could be made because the target machine actively refused it. )
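In excerpts like these, the certificate-related lines are easy to miss among the repeated "unknown error" and connection-refused entries. The following is a minimal sketch that counts distinct ERR messages and flags the ones that look certificate-related. The line layout is inferred from the excerpts above, and the keyword list in `CERT_HINTS` is a guess, not an official list of NetBackup messages; the function name is hypothetical.

```python
import re
from collections import Counter

# Layout inferred from the excerpts: "Month DD HH:MM:SS LEVEL [thread]: code: message"
LINE_RE = re.compile(
    r"^(?P<stamp>\w+ \d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) \[(?P<thread>[0-9A-Fa-f]+)\]: "
    r"(?P<code>-?\d+): (?P<message>.*)$"
)

# Assumed keywords that suggest certificate or PKI trouble.
CERT_HINTS = ("pki", "cert", "ssl", "external ca")


def summarize_errors(lines):
    """Count distinct ERR messages and list those that look certificate-related."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("level") == "ERR":
            counts[match.group("message")] += 1
    cert_related = [
        msg for msg in counts if any(hint in msg.lower() for hint in CERT_HINTS)
    ]
    return counts, cert_related
```

Called with the lines of a log file, it returns the message counts plus the flagged subset, so messages such as "certpath is empty" and "Missing PKI artifact(s)" stand out from the generic session failures.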

@fhaddad81 

In the NetBackup Java Console, can you check whether any host mapping is pending approval for this media server?

If so, please approve it and clear the host cache on the master and media servers:

bpclntcmd -clear_host_cache

Hope it helps.

fhaddad81
Level 3

Thanks @sanket_pathak1, I tried the above but without success. The NetBackup Deduplication Engine service still does not start, and there was no host mapping pending approval.

 
