06-28-2012 05:24 PM
Hello
Please advise on an issue we are seeing.
On a Linux client we saw a drive in DOWN status.
We used the "vmoprcmd -up 0" command and it did bring up the path to drive /dev/nst0.
Our backups still fail from the tc01a Linux server.
When a job fails it says: "Error nbjm (pid=5592) NBU status: 800, EMM status: No drives are available
request failed (800)"
The drive path /dev/nst0 appears as up.
Thank you
06-28-2012 09:12 PM
Apologies, I don't quite understand...
You mention:
On LINUX client we saw drive down status
Is this a SAN media server? Sharing tape drives with master and/or other media servers? Linux version? NBU version? Has this worked fine in the past?
our backup still fail from tc01a LINUX server
Is this the same 'Linux Client' mentioned above or is this the master?
The error in the discussion title is also different from the error in the job details:
Error connecting to oprd on tc01a: network protocol error(39)
When do you see this error?
This looks like a comms error between Master and Media server.
Are we looking at more than one issue here?
Comms error as well as drive down?
Or maybe the drive is reported as down because of comms error?
To test comms, please run this command on the master and post output:
/usr/openv/netbackup/bin/admincmd/nbemmcmd -listhosts -verbose
06-29-2012 12:07 PM
Thank you for your help!
Please advise.
==
The drive is not down; we are having communication errors which are causing the system to think it is down.
All mentions of the Linux server are referring to the same Master Server.
The communication error is between the Master Server and the Linux server.
Previously the path to the Linux connection would show as down, and we would get the oprd error when we tried to reset the path.
One of our engineers was able to reset the drive to an UP state at the Linux box, and the NBU Control Panel on the Windows server shows the drive path as being up, but all jobs that use that path still fail.
If we try to reset the path from the NBU Control Panel, we still get the oprd error.
Below is the output from the command:
[root@tc01a admincmd]# nbemmcmd -listhosts -verbose
NBEMMCMD, Version:6.0MP4(20060530)
The following hosts were found:
netv
MachineName = "netv"
MachineDescription = ""
MachineNbuType = server (6)
NetV
MachineName = "NetV"
MachineDescription = ""
MachineNbuType = cluster (5)
Active Node Name = "DAS01"
DAS01
ClusterName = "NetV"
MachineName = "DAS01"
GlobalDriveSeed = "VEND:#.:PROD:#.:IDX"
LocalDriveSeed = ""
MachineDescription = ""
MachineFlags = 5
MachineNbuType = master (3)
MachineState = active for tape and disk jobs (14)
NetBackupVersion = 6.00 (600000)
OperatingSystem = windows (11)
ScanAbility = 5
DAS02
ClusterName = "NetV"
MachineName = "DAS02"
GlobalDriveSeed = "VEND:#.:PROD:#.:IDX"
LocalDriveSeed = ""
MachineDescription = ""
MachineFlags = 5
MachineNbuType = master (3)
MachineState = active for disk jobs (12)
NetBackupVersion = 6.00 (600000)
OperatingSystem = windows (11)
ScanAbility = 5
tc01a
ClusterName = ""
MachineName = "tc01a"
LocalDriveSeed = ""
MachineDescription = ""
MachineFlags = 1
MachineNbuType = media (1)
MachineState = active for tape and disk jobs (14)
MasterServerName = "NetV"
NetBackupVersion = 6.00 (600000)
OperatingSystem = linux (16)
ScanAbility = 5
Command completed successfully.
06-29-2012 12:53 PM
The 'down' path is probably because of the oprd error.
From the nbemmcmd output you seem to have a clustered Windows master server:
node1: DAS01
node2: DAS02
virtual name NetV / netv
media server on Linux - tc01a
The fact that nbemmcmd managed to get output from the media server suggests that your connection problems are intermittent. The clustered master server seems to be using both the NetV and netv server names; NBU is case sensitive and does not work well with a mix of uppercase and lowercase.
How is tc01a resolving master's IP address?
What SERVER and EMMSERVER entries have been added to bp.conf on tc01a?
/etc/hosts entries for master server node names and virtual name?
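For comparison, bp.conf and /etc/hosts on tc01a would typically look something like the fragment below. The IP addresses are placeholders for illustration only; use your actual addresses, and make sure forward and reverse lookup agree on all hosts:

```
# /usr/openv/netbackup/bp.conf on tc01a
SERVER = NetV
SERVER = DAS01
SERVER = DAS02
EMMSERVER = NetV
CLIENT_NAME = tc01a

# /etc/hosts on tc01a (10.0.0.x addresses are placeholders)
10.0.0.10   NetV
10.0.0.11   DAS01
10.0.0.12   DAS02
```

Note the consistent case of the virtual name: pick one form (NetV or netv) and use it everywhere.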
To troubleshoot MM error 39, you will need to create all of the following folders on tc01a:
Under /usr/openv/volmgr, create a debug folder.
In the debug folder, create daemon and reqlib folders.
Add the following entries to /usr/openv/volmgr/vm.conf (create the file if it does not exist):
VERBOSE
DAYS_TO_KEEP_LOGS = 3
Restart NBU on media server.
Do the same on both Windows cluster nodes. Path:
<install_path>\Volmgr\
Use cluster software to restart NBU.
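The Linux-side steps above can be sketched as a few shell commands. The VOLMGR variable and its scratch-directory default are purely illustrative; on a real media server the path is /usr/openv/volmgr and you would run this as root:

```shell
# Sketch of the Media Manager debug-logging setup described above.
# VOLMGR defaults to a scratch directory for illustration; on a real
# media server it would be /usr/openv/volmgr.
VOLMGR="${VOLMGR:-/tmp/volmgr-demo}"

# Create the debug folders that the volmgr daemons log into.
mkdir -p "$VOLMGR/debug/daemon" "$VOLMGR/debug/reqlib"

# Enable verbose logging and keep three days of logs.
cat >> "$VOLMGR/vm.conf" <<'EOF'
VERBOSE
DAYS_TO_KEEP_LOGS = 3
EOF
```

The logging only takes effect after NetBackup is restarted on the media server, as noted above.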
Next time error 39 is seen, check logs on active cluster node as well as logs on media server for clues.
PS: NBU 6.0 was probably Symantec's worst version ever...
ALL NBU 6.x versions will also reach EOSL in October this year - PLEASE upgrade to 7.1, or preferably to 7.5, ASAP....