
MSDP Pool Down

MilesVScott
Level 6
Certified

spoold just crashed and I am unable to get it back up. I was wondering if anyone else has seen anything like this. The environment is Win 2k3 64-bit on 7.1.0.3. Here are the seemingly relevant log entries from spoold. Any help is appreciated. Thanks!

 

July 13 15:04:06 TRACE [0000000000D3D980]: Single pass

July 13 15:04:07 TRACE [0000000000D3D980]: _pq_exec_v73: query is: SELECT value FROM counters WHERE name = '%s'

July 13 15:04:07 ERR [0000000000D3D980]: 25004: Database query failed.

Query      : SELECT value FROM counters WHERE name = '%s'

Called by  : pqCounterSelectCompatible

Reason     : could not receive data from server: No buffer space available (0x00002747/10055)

July 13 15:04:07 ERR [0000000000D3D980]: 25004: Could not retrieve the value of counter "tlogid": unknown error

July 13 15:04:07 ERR [0000000000D3D980]: 25004: Queue processing failed five times in a row. Queue processing will be disabled and the CR will no longer accept new backup data. Content router has been totally shut down. Please contact support immediately!

July 13 15:04:07 TRACE [0000000000D3D980]: WSRequestExt: submitting &request=4&login=agent_3_22528&passwd=5db187a208b6443429fd7b0faa8cbc42&action=process&agentId=3&severity=6&errornumber=2000&application=spoold.exe&source=Storage%20Manager&date=1342206247&description=Queue%20processing%20failed%20five%20times%20in%20a%20row.%20Queue%20processing%20will%20be%20disabled%20and%20the%20CR%20will%20no%20longer%20accept%20new%20backup%20data.%20Content%20router%20has%20been%20totally%20shut%20down.%20Please%20contact%20support%20immediately%21%20%0A

July 13 15:04:07 TRACE [0000000000D3D980]: Sending binary message to qflnbremote.us.firm.hklaw.com:10102: SPA 2191097177 102 1:

July 13 15:04:07 TRACE [0000000000D3D980]: _crDataSendSimple2: 482 bytes

July 13 15:04:07 TRACE [0000000000D3D980]: Received binary message from qflnbremote.us.firm.hklaw.com:10102: REPLY 2191097177 102 1: 0

July 13 15:04:07 TRACE [0000000000D3D980]: _crDataGetSimple: 40 bytes

July 13 15:04:07 CRIT [0000000000D3D980]: 25004: Content router has been totally shut down. Please contact support immediately!

Accepted Solutions

MilesVScott
Level 6
Certified

Sorry for the late update. This was the solution to the issue:

 

These files were created.

1. Create this file and leave it empty:
<install_path>\NetBackup\db\config\DPS_PROXYNOEXPIRE

2. Create this file with the value of 3600 inside:
<install_path>\NetBackup\db\config\DPS_PROXYDEFAULTSENDTMO

3. Create this file with the value of 3600 inside:
<install_path>\NetBackup\db\config\DPS_PROXYDEFAULTRECVTMO

4. Restart nbrmms (NetBackup Remote Manager and Monitor Service) on the media server.
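
For anyone who would rather script steps 1 to 3, here is a minimal Python sketch. The function name `create_dps_touch_files` and the `TOUCH_FILES` table are made up for illustration; only the three file names and the 3600 value come from the steps above. It assumes the files need nothing but the bare value (no trailing newline), and it does not restart nbrmms, so step 4 still has to be done by hand.

```python
from pathlib import Path

# File names and contents taken from steps 1-3 above.
# An empty string means the file is created empty.
# Assumption: the value is written with no trailing newline.
TOUCH_FILES = {
    "DPS_PROXYNOEXPIRE": "",
    "DPS_PROXYDEFAULTSENDTMO": "3600",
    "DPS_PROXYDEFAULTRECVTMO": "3600",
}


def create_dps_touch_files(config_dir):
    """Write the three files into config_dir and return their paths.

    config_dir should be the NetBackup db\\config directory under
    your own install path; it must already exist.
    """
    config_dir = Path(config_dir)
    if not config_dir.is_dir():
        raise FileNotFoundError(f"config directory not found: {config_dir}")

    created = []
    for name, value in TOUCH_FILES.items():
        path = config_dir / name
        path.write_text(value)  # overwrites any existing file of that name
        created.append(path)
    return created
```

Call it with the full path to your own `NetBackup\db\config` directory, then restart nbrmms as in step 4.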

 

Also, the logging level was turned down to help with performance of the media server.


5 REPLIES

thesanman
Level 6

I had this happen before; I had some underlying disk corruption. Call support; they have the magic recoverCR tool to work through the issues and fix it all up. How long the fix takes will depend on the size of your MSDP unit.

MilesVScott
Level 6
Certified

I have already run recoverCR on this disk. However, every time it comes back up, it crashes again about 2 days later. The output in the storaged logs from the most recent crash is the same as posted above. I have checked, and the drive that the MSDP is sitting on has plenty of space. Support had me move the po_list_2 file to a different directory, and that seemed to bring it back online for a day or two, but now I'm getting the same log entries with a crash.

thesanman
Level 6

Sounds like you need to push Symantec harder. Obviously there is some fundamental issue.

Have you run an underlying storage check or RAID array verification?

Looking at your original post, the log reports "No buffer space available", which may be the root cause. Support will have to advise, I'm afraid.

 

MilesVScott
Level 6
Certified

Spoold is actually back up again. Symantec believes it may have had to do with not running recovercr -fix -clean in a timely manner. I'm hoping that was it, but since it was crashing every 2-3 days I should know soon.
