Problem with a Solaris Cluster

Joe_Despres · 01-15-2013

I'm getting random Status 13 errors on my SAP backups on a two-node Solaris cluster.

Solaris 10, NBU 6.5.6

Node B is Active

Node A is reported as active by the command "nbemmcmd -listhosts -verbose"

The backups started failing after the cluster was patched.

Any ideas?

Joe Despres

mph999 · 01-15-2013

What patches did you add? I presume OS patches.

Status 13 is 'file read failed'. The "file" can also be a socket, so this is very often network/comms related.

What do the patches do? E.g. are they network related, file system related, etc.?
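Since status 13 is so often comms related, a quick TCP connect test from the media server or master to each cluster node can help narrow things down. Below is a minimal Python sketch, not a NetBackup tool. The host names in the usage note are hypothetical placeholders, and the ports (13782 for bpcd, 13724 for vnetd) are assumed to be the defaults in use at this site.

```python
import socket

def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False

def check_nodes(hosts, ports, timeout=5.0):
    """Map each (host, port) pair to the result of a TCP connect attempt."""
    return {(h, p): can_connect(h, p, timeout) for h in hosts for p in ports}
```

For example, check_nodes(["nodeA", "nodeB", "virtual-name"], [13782, 13724]) would return a dictionary showing which node answers on which port. A successful connect only proves the port is reachable. It does not prove that NBU is talking to the active node.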

My first thoughts ...

If, and it is only a possibility, the patches did cause this issue, then removing them would probably be the quickest way to get things running.

If possible, remove the patches so we can see for certain whether they cause the issue.

To fix the issue with the patches installed, I suspect you'll have to log a call.

Martin

Marianne · 01-15-2013

Node B is Active

Node A is reported as active by the command "nbemmcmd -listhosts -verbose"

Can you see that NBU is trying to connect to the wrong node?

Please upgrade to 7.x ASAP. It has a host cache refresh that you can use. (Also, 6.x reached EOSL in October last year.)

Have you been able to determine if the status 13 is a file read error or a socket read error?

We often see that status 13 turns out to be a 'Client Read Timeout'.
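To illustrate why a timeout and a genuine read error can both surface as a "read failed", here is a small Python sketch that classifies the outcome of a single socket read. It is only an illustration of the general idea, not how NetBackup implements status 13. The function name and return labels are made up for this example.

```python
import socket

def classify_read(sock, nbytes=4096, timeout=5.0):
    """Attempt one read and return (label, data).

    label is 'data', 'closed' (peer closed the connection),
    'timeout' (nothing arrived in time) or 'error' (other socket failure).
    """
    sock.settimeout(timeout)
    try:
        data = sock.recv(nbytes)
    except socket.timeout:
        # Peer is connected but silent: the analogue of a client read timeout.
        return ("timeout", b"")
    except OSError:
        # Reset connection, bad descriptor, and similar failures.
        return ("error", b"")
    return ("data", data) if data else ("closed", b"")
```

A silent peer gives 'timeout', while a dropped connection gives 'closed' or 'error'. From the caller's side all three are just a read that produced nothing, which is why the job details are needed to tell them apart.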

Please post all text in the Details tab of the failed job.
