
Server Status: Communication with the server has not been initiated or the server status has not been retrieved from the serve

Breaky_08
Level 3
Partner Accredited

Hi, in our environment we have one Windows master server (NetBackup 7.7.2 on Windows Server 2008 Standard) and three NBU 5220 appliances (2.7.2) as media servers.

Yesterday we noticed that the Oracle policies (10 policies) failed with status code 6 in NBU. The log messages on the client side are all similar:

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ch01 channel at 06/14/2016 22:31:54
ORA-19506: failed to create sequential file, name="bk_PROD_137098_1_914536889", parms=""
ORA-27028: skgfqcre: sbtbackup returned error
ORA-19511: Error received from media manager layer, error text:
   VxBSACreateObject: Failed with error:
   Server Status:  Communication with the server has not been initiated or the server status has not been retrieved from the serve

All the errors occur in the time window from 22:00 to 23:00; for the rest of the day all the jobs run without problems.

Why do the problems happen at this time, while the backups are successful for the rest of the day?

1 ACCEPTED SOLUTION

Accepted Solutions

Marianne
Moderator
Partner    VIP    Accredited Certified
It is important to look at the logs when the error is seen. Ensure that the dbclient log folder exists on the clients (with 777 permissions) and the bprd log folder on the master. These logs will tell us what is wrong with comms between master and clients. Can you please also have a close look at the master server to see what kind of load there is, and how many jobs are initiated, during the time when the Oracle backup failures are seen?
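As a rough illustration of the folder check described above, here is a minimal sketch that reports whether a log directory exists and has mode 777. The function name `check_log_dir` is hypothetical, and the path in the usage note is an assumption based on the /usr/openv/netbackup/logs location that appears in the client logs in this thread.

```python
import os
import stat


def check_log_dir(path, wanted_mode=0o777):
    """Return (exists, mode_ok) for a log directory.

    exists  -- True if `path` is an existing directory
    mode_ok -- True if its permission bits equal `wanted_mode`
    """
    if not os.path.isdir(path):
        return (False, False)
    # S_IMODE strips the file-type bits and keeps only the permission bits.
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (True, mode == wanted_mode)
```

On a UNIX client this could be called as `check_log_dir("/usr/openv/netbackup/logs/dbclient")`, assuming the default install location; a result of `(False, False)` means the folder still has to be created.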


8 REPLIES 8

Will_Restore
Level 6

Check name resolution forward and back.  Often caused by missing reverse lookup (IP to name).
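A forward-and-back check like the one suggested here could be sketched as follows. This is only an illustration using Python's standard `socket` module; the function name `check_name_resolution` and the loose short-name comparison are assumptions, not something NetBackup itself does.

```python
import socket


def check_name_resolution(hostname,
                          forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Resolve hostname -> IP -> name and report whether the round trip matches.

    `forward` and `reverse` default to the system resolver and can be
    replaced with stand-ins for testing.
    """
    result = {"hostname": hostname, "ip": None, "names": [], "ok": False, "error": None}
    try:
        result["ip"] = forward(hostname)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        result["error"] = f"forward lookup failed: {exc}"
        return result
    try:
        name, aliases, _addresses = reverse(result["ip"])
    except OSError as exc:  # socket.herror is a subclass of OSError
        result["error"] = f"reverse lookup failed: {exc}"
        return result
    result["names"] = [name, *aliases]
    # Loose comparison (an assumption): the short host names must agree,
    # ignoring case and any domain suffix.
    wanted = hostname.lower().split(".")[0]
    found = {n.lower().split(".")[0] for n in result["names"]}
    result["ok"] = wanted in found
    return result
```

Running this for each failing client from the master and the media servers (and for the servers from each client) would show a missing reverse entry as a "reverse lookup failed" error.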

 

Breaky_08
Level 3
Partner Accredited

Hi Will, thank you for your answer.

There are around 10 different clients that fail with the same message, and all the other jobs scheduled during the day for the same clients work without problems. But I will check the name resolution.

Will_Restore
Level 6

Right, I missed this: "All the errors occurs in the time windows from 22:00 to 23:00, and all the day all the jobs run with out problems."

That does sound like a load issue rather than name resolution.

Breaky_08
Level 3
Partner Accredited

Hi Marianne, the number of jobs in the window (from 22:00 to 23:00) is 216.

I filtered them in the Activity Monitor:

  • 199 are backup jobs
  • 15 jobs are Image Clean up
  • 1 Duplication
  • 1 Snapshot (vmware)
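The same kind of tally can be sketched in a few lines, assuming the job list has been exported as (start time, job type) pairs. That input shape and the function name `jobs_in_window` are hypothetical; this does not describe any real Activity Monitor export format.

```python
from collections import Counter
from datetime import datetime, time


def jobs_in_window(jobs, start, end):
    """Count jobs per type whose start time of day falls in [start, end).

    `jobs` is an iterable of (datetime, job_type) pairs (hypothetical shape).
    """
    counts = Counter()
    for started, job_type in jobs:
        if start <= started.time() < end:
            counts[job_type] += 1
    return counts
```

Calling it with `time(22, 0)` and `time(23, 0)` gives the per-type counts for the failing window, and the same call with other hours makes it easy to compare the load against the rest of the day.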

I will check your advice "Ensure dbclient log folder exist on clients (with 777 permission) and bprd on the master."

Thank you for your response.

Breaky_08
Level 3
Partner Accredited

Hello, this afternoon I checked on the client side whether the dbclient log folder exists. On some clients it had not been created, so I ran the mklogdir script to create the folders.

Some clients already had the dbclient folder, and looking inside I found similar messages on these clients:

DBCLIENT LOG
22:32:51.098 [22086302] <16> readCommFile: ERR - timed out after 900 seconds while reading from /usr/openv/netbackup/logs/user_ops/dbext/logs/22086302.0.1465963327
22:32:51.099 [22086302] <32> serverResponse: ERR - could not read from comm file </usr/openv/netbackup/logs/user_ops/dbext/logs/22086302.0.1465963327>
22:32:51.099 [22086302] <16> CreateNewImage: ERR - serverResponse() failed
22:32:51.151 [22086302] <16> VxBSACreateObject: ERR - Could not create new image with file /arc_RMSPRD_39345_1_914536926.
22:32:51.151 [22086302] <16> xbsa_CreateObject: ERR - VxBSACreateObject: Failed with error: Server Status:  Communication with the server has not been initiated or the server status has not been retrieved from the serve
22:32:51.352 [22479420] <16> readCommFile: ERR - timed out after 900 seconds while reading from /usr/openv/netbackup/logs/user_ops/dbext/logs/22479420.0.1465963328
22:32:51.353 [22479420] <32> serverResponse: ERR - could not read from comm file </usr/openv/netbackup/logs/user_ops/dbext/logs/22479420.0.1465963328>
22:32:51.353 [22479420] <16> CreateNewImage: ERR - serverResponse() failed
22:32:51.405 [22479420] <16> VxBSACreateObject: ERR - Could not create new image with file /arc_RMSPRD_39346_1_914536927.
22:32:51.405 [22479420] <16> xbsa_CreateObject: ERR - VxBSACreateObject: Failed with error: Server Status:  Communication with the server has not been initiated or the server status has not been retrieved from the serve
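To see when each timed-out read actually began, the log timestamp can be moved back by the number of seconds in the message. The sketch below assumes the line layout shown in the excerpt above (time, [pid], <severity>, function, message) with wrapped lines already joined; the function names are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Layout taken from the dbclient excerpt: HH:MM:SS.mmm [pid] <severity> function: message
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(?P<pid>\d+)\] "
    r"<(?P<severity>\d+)> "
    r"(?P<function>\w+): "
    r"(?P<message>.*)$"
)
TIMEOUT_RE = re.compile(r"timed out after (\d+) seconds")


def parse_line(line):
    """Split one log line into its fields, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def wait_started(line):
    """Return the time of day at which a timed-out read began, or None."""
    entry = parse_line(line)
    if entry is None:
        return None
    timeout = TIMEOUT_RE.search(entry["message"])
    if timeout is None:
        return None
    logged = datetime.strptime(entry["time"], "%H:%M:%S.%f")
    return (logged - timedelta(seconds=int(timeout.group(1)))).time()
```

For the first readCommFile line above (22:32:51.098, 900 seconds) this gives 22:17:51.098, so the client had been waiting for a server response since shortly after the start of the 22:00 window.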

Will_Restore
Level 6

.<16> readCommFile: ERR - timed out after 900 seconds ...

Change Client Read Timeout on the Media Server.  Try 1800, increase as needed. 
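To confirm what a given host currently has configured, the value can be read back from its configuration text. This sketch assumes UNIX-style bp.conf content with `KEY = value` lines and the entry name CLIENT_READ_TIMEOUT; the function name `read_timeout` is hypothetical, and a Windows master keeps this setting in Host Properties rather than in a bp.conf file.

```python
def read_timeout(conf_text, key="CLIENT_READ_TIMEOUT"):
    """Return the integer value of `key` from bp.conf-style text, or None if absent."""
    for line in conf_text.splitlines():
        name, sep, value = line.partition("=")
        # Only lines of the form "KEY = value" are considered.
        if sep and name.strip() == key:
            return int(value.strip())
    return None
```

A return value of None means the entry is not set in that text, so the host is running with its default.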

Breaky_08
Level 3
Partner Accredited

Hi Will, thank you for your response.

All the media servers (NBU appliances) already have CLIENT_READ_TIMEOUT = 3200 set.

Today the backups ran OK.

Our master is a VMware virtual machine. I think the workload on the master, or something else happening at this time, is affecting the backups.