cannot connect to robotic software daemon (42)

technimdaxviii
Level 5

Hi,

We had a problem with our tape library after a power failure, and the robot had to be replaced.

After the replacement, both drives on one media server (pp-db01) are down and vmoprcmd returns an error:

root@pp-db01:~# /usr/openv/volmgr/bin/tpconfig -l
Device Robot Drive       Robot                    Drive                 Device         Second
Type     Num Index  Type DrNum Status  Comment    Name                  Path           Device Path
robot      0    -    TLD    -       -  -          -                     pp-master.domain.com
  drive    -    0 hcart3    1    DOWN  -          HP.ULTRIUM6-SCSI.000  /dev/rmt/0cbn
  drive    -    1 hcart3    2    DOWN  -          HP.ULTRIUM6-SCSI.001  /dev/rmt/1cbn
root@pp-db01:~#
root@pp-db01:~#
root@pp-db01:~# /usr/openv/volmgr/bin/vmoprcmd -d
cannot connect to robotic software daemon (42)

Any input on this error? On the second media server (pp-db02), both drives are up.

root@pp-db02:~# /usr/openv/volmgr/bin/tpconfig -l
Device Robot Drive       Robot                    Drive                 Device         Second
Type     Num Index  Type DrNum Status  Comment    Name                  Path           Device Path
robot      0    -    TLD    -       -  -          -                     pp-master.domain.com
  drive    -    0 hcart3    1      UP  -          HP.ULTRIUM6-SCSI.000  /dev/rmt/0cbn
  drive    -    1 hcart3    2      UP  -          HP.ULTRIUM6-SCSI.001  /dev/rmt/1cbn

 

 

Regards
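For anyone comparing the two media servers, the DOWN drives can be pulled out of the tpconfig -l listing with a small filter. This is only a sketch: list_down_drives is a made-up helper name, and it assumes the column layout shown in the listings above (status in the sixth column, drive name and device path in the eighth and ninth, and a comment column without spaces).

```shell
# list_down_drives: read `tpconfig -l` output on stdin and print the
# drive name and device path of every drive whose status is DOWN.
# Assumes the column layout shown in this thread.
list_down_drives() {
    awk '$1 == "drive" && $6 == "DOWN" { print $8, $9 }'
}
```

Usage would be something like: /usr/openv/volmgr/bin/tpconfig -l | list_down_drives, run on each media server in turn.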

1 ACCEPTED SOLUTION

technimdaxviii
Level 5

Hi @quebek @Nicolai @X2 

Thank you guys for your help. Luckily, I've managed to bring up the drives on db01.

root@master:~# vmoprcmd

                           HOST STATUS
Host Name                                  Version   Host Status
=========================================  =======   ===========
master.domain.com                    760200    ACTIVE
pp-db02.domain.com                   760200    ACTIVE
pp-db01.domain.com                   760200    ACTIVE

                                PENDING REQUESTS


                                    <NONE>

                                  DRIVE STATUS

Drive Name               Label   Ready  RecMID  ExtMID  Wr.Enbl.  Type
    Host                       DrivePath                            Status
=============================================================================
HP.ULTRIUM6-SCSI.000     No      No                     No        hcart3-Clean
    master.domain.com    /dev/rmt/0cbn                        TLD
    pp-db02.domain.com   /dev/rmt/0cbn                        SCAN-TLD
    pp-db01.domain.com   /dev/rmt/0cbn                        TLD

HP.ULTRIUM6-SCSI.001     Yes     Yes    0017L6  0017L6  Yes       hcart3-Clean
    master.domain.com    /dev/rmt/1cbn                        TLD
    pp-db02.domain.com   /dev/rmt/1cbn                        SCAN-TLD
    pp-db01.domain.com   /dev/rmt/1cbn                        ACTIVE

 

I'm not 100% sure what fixed it, though. What I did:

1. Rebooted the master server

2. Rebooted the tape library

After the reboots, the drives on db01 were still down. I then brought the path up manually in the GUI:

Media and Device Management > Device Monitor > select the drive. From the drive list (pp-db01), right-click and choose “Up Path”.
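A possible command-line counterpart to the GUI step, run from the master server, is sketched below. It assumes the -upbyname and -h options of vmoprcmd behave on this release as described in the NetBackup Commands Reference; whether this is exactly what “Up Path” does behind the scenes is not confirmed in this thread. up_drive_on_host is a hypothetical wrapper name, and DRY_RUN=1 only prints the command instead of running it.

```shell
VMOPRCMD="${VMOPRCMD:-/usr/openv/volmgr/bin/vmoprcmd}"

# up_drive_on_host HOST DRIVE_NAME
# Ask the given device host to bring the named drive up.
# With DRY_RUN=1 the command line is printed instead of executed.
up_drive_on_host() {
    host="$1"
    drive="$2"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$VMOPRCMD -h $host -upbyname $drive"
    else
        "$VMOPRCMD" -h "$host" -upbyname "$drive"
    fi
}
```

For example: DRY_RUN=1 up_drive_on_host pp-db01.domain.com HP.ULTRIUM6-SCSI.000 shows the command that would be sent for the first drive.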

14 REPLIES

quebek
Moderator

Hey

What is the output from vmoprcmd and tpautoconf -report_disc?

Did you try stopping and starting ltid on db01?
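Restarting ltid on its own (without bouncing all of NetBackup) could look roughly like this. stopltid and ltid -v are the standard Media Manager commands; the 10-second pause and the restart_ltid wrapper name are just assumptions for illustration.

```shell
VOLMGR_BIN="${VOLMGR_BIN:-/usr/openv/volmgr/bin}"

# restart_ltid: stop the Media Manager device daemon, wait a moment,
# then start it again in verbose mode.
# With DRY_RUN=1 the commands are only printed, not executed.
restart_ltid() {
    for cmd in "$VOLMGR_BIN/stopltid" "sleep 10" "$VOLMGR_BIN/ltid -v"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "$cmd"
        else
            # Unquoted on purpose so "ltid -v" splits into command + option.
            $cmd || return 1
        fi
    done
}
```

Running DRY_RUN=1 restart_ltid first shows the three steps without touching the daemons.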

Nicolai
Moderator

Hi @technimdaxviii, I have deleted the duplicate thread and removed the references to it, since it's now fixed.

Best Regards
Nicolai

Hi @quebek,

This is the output from db01:

root@pp-db01:~# vmoprcmd

                           HOST STATUS
Host Name                                  Version   Host Status
=========================================  =======   ===========
pp-master.domain.com                    760200    ACTIVE
pp-db02.domain.com                   760200    ACTIVE
pp-db01.domain.com                   760200    ACTIVE

                                PENDING REQUESTS


                                    <NONE>

                                  DRIVE STATUS

Drive Name               Label   Ready  RecMID  ExtMID  Wr.Enbl.  Type
    Host                       DrivePath                            Status
=============================================================================
HP.ULTRIUM6-SCSI.000     No      No                     No        hcart3-Clean
    pp-master.domain.com    /dev/rmt/0cbn                        SCAN-TLD
    pp-db02.domain.com   /dev/rmt/0cbn                        TLD
    pp-db01.domain.com   /dev/rmt/0cbn                        DOWN-TLD

HP.ULTRIUM6-SCSI.001     No      No                     No        hcart3-Clean
    pp-master.domain.com    /dev/rmt/1cbn                        SCAN-TLD
    pp-db02.domain.com   /dev/rmt/1cbn                        TLD
    pp-db01.domain.com   /dev/rmt/1cbn                        DOWN-TLD
root@pp-db01:~#
root@pp-db01:~#
root@pp-db01:~# tpautoconf -report_disc
root@pp-db01:~#

 

What I tried was restarting the NetBackup services on db01:

Stop:
/usr/openv/netbackup/bin/goodies/netbackup stop

Start:
/usr/openv/netbackup/bin/goodies/netbackup start

 

Regards,

Hi @Nicolai,

 

Thank you, much appreciated.

 

Regards

quebek
Moderator

Hey

So, a few more commands to run on db01:

/usr/openv/volmgr/bin/scan

bpps -x

Do you see the drives there? I assume you already tried to UP these two? I would UP them again and create logs under /usr/openv/volmgr/debug.

Refer to https://www.veritas.com/support/en_US/doc/86063237-127664549-0/v28993834-127664549
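Setting up those logs could be sketched as below. The directory names (daemon, ltid, reqlib, tpcommand, robots) follow the legacy Media Manager logging layout as I understand it; whether all of them are needed on this release is an assumption, and enable_volmgr_debug is a made-up helper name. New log files normally only show up after ltid has been restarted.

```shell
# enable_volmgr_debug [VOLMGR_DIR]
# Create the legacy Media Manager debug directories and make sure
# vm.conf contains a VERBOSE entry. VOLMGR_DIR defaults to
# /usr/openv/volmgr; pass another directory to try it out safely.
enable_volmgr_debug() {
    volmgr_dir="${1:-/usr/openv/volmgr}"
    for d in daemon ltid reqlib tpcommand robots; do
        mkdir -p "$volmgr_dir/debug/$d" || return 1
    done
    # Append VERBOSE only if the exact line is not already present.
    grep -qx 'VERBOSE' "$volmgr_dir/vm.conf" 2>/dev/null ||
        echo 'VERBOSE' >> "$volmgr_dir/vm.conf"
}
```

The function is safe to run twice; the second run leaves vm.conf unchanged.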

Below is the output of scan and bpps -x:

root@pp-db01:~# /usr/openv/volmgr/bin/scan
************************************************************
*********************** SDT_TAPE    ************************
*********************** SDT_CHANGER ************************
************************************************************
------------------------------------------------------------
Device Name  : "/dev/rmt/0cbn"
Passthru Name: "/dev/sg/c0tw500104f000d23aa7l0"
Volume Header: ""
Port: -1; Bus: -1; Target: -1; LUN: -1
Inquiry    : "HP      Ultrium 6-SCSI  239S"
Vendor ID  : "HP      "
Product ID : "Ultrium 6-SCSI  "
Product Rev: "239S"
Serial Number: "HU140918T6"
WWN          : ""
WWN Id Type  : 0
Device Identifier: ""
Device Type    : SDT_TAPE
NetBackup Drive Type: 16
Removable      : Yes
Device Supports: SCSI-6
Flags : 0x0
Reason: 0x0
------------------------------------------------------------
Device Name  : "/dev/sg/c0tw500104f000d23aa7l1"
Passthru Name: "/dev/sg/c0tw500104f000d23aa7l1"
Volume Header: ""
Port: -1; Bus: -1; Target: -1; LUN: -1
Inquiry    : "STK     SL150           0355"
Vendor ID  : "STK     "
Product ID : "SL150           "
Product Rev: "0355"
Serial Number: "464970G+1412SY2317"
WWN          : ""
WWN Id Type  : 0
Device Identifier: "STK     SL150           464970G+1412SY2317"
Device Type    : SDT_CHANGER
NetBackup Robot Type: 8
Removable      : Yes
Device Supports: SCSI-5
Number of Drives : 2
Number of Slots  : 30
Number of Media Access Ports: 4
Drive 1 Serial Number      : "HU140918T6"
Drive 2 Serial Number      : "HU140918U4"
Flags : 0x0
Reason: 0x0
------------------------------------------------------------
Device Name  : "/dev/rmt/1cbn"
Passthru Name: "/dev/sg/c0tw500104f000d23aaal0"
Volume Header: ""
Port: -1; Bus: -1; Target: -1; LUN: -1
Inquiry    : "HP      Ultrium 6-SCSI  239S"
Vendor ID  : "HP      "
Product ID : "Ultrium 6-SCSI  "
Product Rev: "239S"
Serial Number: "HU140918U4"
WWN          : ""
WWN Id Type  : 0
Device Identifier: ""
Device Type    : SDT_TAPE
NetBackup Drive Type: 16
Removable      : Yes
Device Supports: SCSI-6
Flags : 0x0
Reason: 0x0
root@pp-db01:~#
root@pp-db01:~#
root@pp-db01:~# /usr/openv/netbackup/bin/bpps -x
NB Processes
------------
    root  8177     1   0 12:43:23 ?           0:01 /usr/openv/netbackup/bin/nbcssc -a NetBackup
    root  8183     1   0 12:43:23 ?           0:00 /usr/openv/netbackup/bin/nbsvcmon
    root  8150     1   0 12:43:20 ?           0:16 /usr/openv/netbackup/bin/nbsl
    root  8090     1   0 12:43:15 ?           0:00 /usr/openv/netbackup/bin/bpcompatd
    root  7933     1   0 12:43:07 ?           0:00 /usr/openv/netbackup/bin/bpcd -standalone
    root  8058     1   0 12:43:12 ?           0:00 /usr/openv/pdde/pdag/bin/mtstrmd
    root  8137     1   0 12:43:19 ?           0:02 /usr/openv/netbackup/bin/nbrmms
    root  7943     1   0 12:43:08 ?           0:01 /usr/openv/netbackup/bin/nbdisco
    root  7929     1   0 12:43:07 ?           0:00 /usr/openv/netbackup/bin/vnetd -standalone


MM Processes
------------
    root 20259     1   0 13:07:40 ?           0:01 vmd
    root  3593     1   0 13:39:06 ?           0:00 vmd -v
    root  3643  3589   0 13:39:08 ?           0:00 tldd -v
    root 20346     1   0 13:07:50 ?           0:00 vmd
    root  3589     1   0 13:39:05 ?           0:00 ltid -v
    root  8077     1   0 12:43:14 ?           0:00 vmd
    root  3667  3589   0 13:39:10 ?           0:00 avrd -v


Shared Symantec Processes
-------------------------
    root  1066     1   0   Mar 31 ?           0:34 /opt/VRTSpbx/bin/pbx_exchange

 

 

Yes, the drives are there, and I tried to bring them up. This is a sample log from the debug directory:

root@pp-db01:/usr/openv/volmgr/debug/dqts# ls -lrt
total 10
-rw-rw-rw-   1 root     root         211 Apr 13 19:05 log.Robot01618330002
-rw-rw-rw-   1 root     root         211 Apr 13 19:31 log.Robot01618331539
-rw-rw-rw-   1 root     root         211 Apr 13 19:31 log.Robot01618331550
-rw-rw-rw-   1 root     root         208 Apr 14 11:37 log.Robot01618389510
-rw-rw-rw-   1 root     root         208 Apr 14 11:39 log.Robot01618389641
root@pp-db01:/usr/openv/volmgr/debug/dqts# cat log.Robot01618389641
11:39:52.182 [8344] <4> dtest: Opening up the log
11:39:52.322 [8344] <16> DqGetRobotDatabaseInfo: NumRobots is 1

11:39:52.322 [8344] <16> DqGetRobotDatabaseInfo: For robot number 0, found type 8 and path

root@pp-db01:/usr/openv/volmgr/debug/dqts#

 

quebek
Moderator

Hey

I think there should be only one vmd process on that media server. Why are there three? No clue.

bpps -x 

MM Processes
------------
root 24864 1 0 2020 ? 00:10:38 /usr/openv/volmgr/bin/ltid
root 24875 1 0 2020 ? 00:09:19 vmd
root 25517 24864 0 2020 ? 00:00:03 tldd
root 25520 24864 0 2020 ? 00:16:53 avrd
root 25524 1 0 2020 ? 00:00:02 tldcd

Stop NBU there again. Check with bpps -x that everything is gone from this output.

Kill any remaining processes.

Start it back up. Check the vmd process.
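To spot leftover vmd processes quickly, the bpps -x listing can be filtered for their PIDs. A rough sketch, assuming the ps-style layout shown in this thread (PID in the second column, command name as a separate field); vmd_pids is a hypothetical helper name.

```shell
# vmd_pids: read `bpps -x` output on stdin and print the PID of every
# line whose command is vmd (bare name or full path), one per line.
vmd_pids() {
    awk '{
        for (i = 3; i <= NF; i++)
            if ($i == "vmd" || $i ~ /\/vmd$/) { print $2; break }
    }'
}
```

For example, /usr/openv/netbackup/bin/bpps -x | vmd_pids after a stop should print nothing; more than one PID after a start suggests stale daemons are still around.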

 

Same error after restarting the NBU services and killing the vmd processes manually:

 

root@pp-db01:~# /usr/openv/netbackup/bin/goodies/netbackup stop
stopping the NetBackup Service Monitor
stopping the NetBackup CloudStore Service Container
stopping the NetBackup Service Layer
stopping the NetBackup Remote Monitoring Management System
stopping the NetBackup compatibility daemon
stopping the Media Manager device daemon
stopping the Media Manager volume daemon
stopping the NetBackup Deduplication Multi-Threaded Agent
stopping the NetBackup Discovery Framework
stopping the NetBackup client daemon
stopping the NetBackup network daemon
root@pp-db01:~#
root@pp-db01:~#
root@pp-db01:~# /usr/openv/netbackup/bin/bpps -x
NB Processes
------------


MM Processes
------------
    root 20259     1   0 13:07:40 ?           0:01 vmd
    root  3593     1   0 13:39:06 ?           0:00 vmd -v
    root 20346     1   0 13:07:50 ?           0:01 vmd


Shared Symantec Processes
-------------------------
    root  1066     1   0   Mar 31 ?           0:35 /opt/VRTSpbx/bin/pbx_exchange
root@pp-db01:~#
root@pp-db01:~#
root@pp-db01:~# /usr/openv/netbackup/bin/bpps -x
NB Processes
------------


MM Processes
------------
    root 20259     1   0 13:07:40 ?           0:01 vmd
    root  3593     1   0 13:39:06 ?           0:00 vmd -v
    root 20346     1   0 13:07:50 ?           0:01 vmd


Shared Symantec Processes
-------------------------
    root  1066     1   0   Mar 31 ?           0:35 /opt/VRTSpbx/bin/pbx_exchange
root@pp-db01:~#
root@pp-db01:~# kill -9 20259
root@pp-db01:~# kill -9 3593
root@pp-db01:~# kill -9 20346
root@pp-db01:~#
root@pp-db01:~# /usr/openv/netbackup/bin/bpps -x
NB Processes
------------


MM Processes
------------


Shared Symantec Processes
-------------------------
    root  1066     1   0   Mar 31 ?           0:35 /opt/VRTSpbx/bin/pbx_exchange
root@pp-db01:~# /usr/openv/netbackup/bin/goodies/netbackup start
NetBackup Authentication daemon started.
NetBackup network daemon started.
NetBackup client daemon started.
NetBackup SAN Client Fibre Transport daemon started.
NetBackup Discovery Framework started.
NetBackup Database Server started.
NetBackup Authorization daemon started.
NetBackup Event Manager started.
NetBackup Audit Manager started.
NetBackup Deduplication Manager started.
NetBackup Deduplication Engine started.
NetBackup Deduplication Multi-Threaded Agent started.
NetBackup Enterprise Media Manager started.
NetBackup Resource Broker started.
Media Manager daemons started.
NetBackup request daemon started.
NetBackup compatibility daemon started.
NetBackup Job Manager started.
NetBackup Policy Execution Manager started.
NetBackup Storage Lifecycle Manager started.
NetBackup Remote Monitoring Management System started.
NetBackup Key Management daemon started.
NetBackup Service Layer started.
NetBackup Indexing Manager started.
NetBackup Agent Request Server started.
NetBackup Bare Metal Restore daemon not started.
NetBackup Web Management Console started.
NetBackup Vault daemon started.
NetBackup CloudStore Service Container started.
NetBackup Service Monitor started.
NetBackup Bare Metal Restore Boot Server daemon started.
root@pp-db01:~#
root@pp-db01:~#
root@pp-db01:~#
root@pp-db01:~# /usr/openv/netbackup/bin/bpps -x
NB Processes
------------
    root 20080     1   0 14:10:47 ?           0:01 /usr/openv/netbackup/bin/nbdisco
    root 20247     1   0 14:10:54 ?           0:00 /usr/openv/netbackup/bin/bprd
    root 20396     1   0 14:11:04 ?           0:00 /usr/openv/netbackup/bin/bmrbd
    root 20222     1   0 14:10:51 ?           0:00 /usr/openv/pdde/pdag/bin/mtstrmd
    root 20323     1   0 14:10:59 ?           0:00 /usr/openv/netbackup/bin/nbrmms
    root 20063     1   0 14:10:46 ?           0:00 /usr/openv/netbackup/bin/vnetd -standalone
    root 20338     1   0 14:11:00 ?           0:00 /usr/openv/netbackup/bin/nbsl
    root 20380     1   0 14:11:03 ?           0:00 /usr/openv/netbackup/bin/nbcssc -a NetBackup
    root 20386     1   0 14:11:03 ?           0:00 /usr/openv/netbackup/bin/nbsvcmon
    root 20066     1   0 14:10:46 ?           0:00 /usr/openv/netbackup/bin/bpcd -standalone
    root 20252     1   0 14:10:54 ?           0:00 /usr/openv/netbackup/bin/bpcompatd


MM Processes
------------
    root 20238     1   0 14:10:53 ?           0:00 /usr/openv/volmgr/bin/ltid
    root 20307 20238   0 14:10:57 ?           0:00 avrd
    root 20243     1   0 14:10:53 ?           0:00 vmd
    root 20273 20238   0 14:10:55 ?           0:00 tldd


Shared Symantec Processes
-------------------------
    root  1066     1   0   Mar 31 ?           0:35 /opt/VRTSpbx/bin/pbx_exchange
root@pp-db01:~#
root@pp-db01:~#
root@pp-db01:~#
root@pp-db01:~# vmoprcmd

                           HOST STATUS
Host Name                                  Version   Host Status
=========================================  =======   ===========
master.domain.com                    760200    ACTIVE
pp-db02.domain.com                   760200    ACTIVE
pp-db01.domain.com                   760200    ACTIVE

                                PENDING REQUESTS


                                    <NONE>

                                  DRIVE STATUS

Drive Name               Label   Ready  RecMID  ExtMID  Wr.Enbl.  Type
    Host                       DrivePath                            Status
=============================================================================
HP.ULTRIUM6-SCSI.000     No      No                     No        hcart3-Clean
    master.domain.com    /dev/rmt/0cbn                        SCAN-TLD
    pp-db02.domain.com   /dev/rmt/0cbn                        TLD
    pp-db01.domain.com   /dev/rmt/0cbn                        DOWN-TLD

HP.ULTRIUM6-SCSI.001     No      No                     No        hcart3-Clean
    master.domain.com    /dev/rmt/1cbn                        SCAN-TLD
    pp-db02.domain.com   /dev/rmt/1cbn                        TLD
    pp-db01.domain.com   /dev/rmt/1cbn                        DOWN-TLD
root@pp-db01:~#
root@pp-db01:~#
root@pp-db01:~# /usr/openv/netbackup/bin/bpps -x
NB Processes
------------
    root 20080     1   0 14:10:47 ?           0:01 /usr/openv/netbackup/bin/nbdisco
    root 20247     1   0 14:10:54 ?           0:00 /usr/openv/netbackup/bin/bprd
    root 20222     1   0 14:10:51 ?           0:00 /usr/openv/pdde/pdag/bin/mtstrmd
    root 20323     1   0 14:10:59 ?           0:00 /usr/openv/netbackup/bin/nbrmms
    root 20063     1   0 14:10:46 ?           0:00 /usr/openv/netbackup/bin/vnetd -standalone
    root 20338     1   0 14:11:00 ?           0:00 /usr/openv/netbackup/bin/nbsl
    root 20380     1   0 14:11:03 ?           0:00 /usr/openv/netbackup/bin/nbcssc -a NetBackup
    root 20386     1   0 14:11:03 ?           0:00 /usr/openv/netbackup/bin/nbsvcmon
    root 20066     1   0 14:10:46 ?           0:00 /usr/openv/netbackup/bin/bpcd -standalone
    root 20252     1   0 14:10:54 ?           0:00 /usr/openv/netbackup/bin/bpcompatd


MM Processes
------------
    root 20238     1   0 14:10:53 ?           0:00 /usr/openv/volmgr/bin/ltid
    root 20307 20238   0 14:10:57 ?           0:00 avrd
    root 20243     1   0 14:10:53 ?           0:00 vmd
    root 20273 20238   0 14:10:55 ?           0:00 tldd


Shared Symantec Processes
-------------------------
    root  1066     1   0   Mar 31 ?           0:35 /opt/VRTSpbx/bin/pbx_exchange
root@pp-db01:~#
root@pp-db01:~#
root@pp-db01:~#
root@pp-db01:~# vmoprcmd -up 0
cannot connect to robotic software daemon (42)
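Since vmoprcmd lists one path per host under each drive, the paths that are still down can be picked out with a small filter. This is a sketch under the assumption that the output keeps the layout shown above (drive line starting in column one, indented host lines with host, path and status); list_down_paths is a made-up name.

```shell
# list_down_paths: read `vmoprcmd` output on stdin and print
# "drive host path" for every host path whose status starts with DOWN.
list_down_paths() {
    awk '
        /^[^ \t]/ { drive = $1 }                 # unindented line: remember drive name
        /^[ \t]/ && NF == 3 && $3 ~ /^DOWN/ {    # indented host line with DOWN status
            print drive, $1, $2
        }'
}
```

Run as vmoprcmd | list_down_paths; on the output above it would report the two pp-db01 paths.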

quebek
Moderator

Hi

Can you please share the output of cat /usr/openv/volmgr/vm.conf from both media servers?

Any difference?

From the master, can you run bpgetconfig -s pp-db01?

 

Hi. Please see below:

At master server

root@master:/usr/openv/volmgr/bin# cat /usr/openv/volmgr/vm.conf
MM_SERVER_NAME = master.domain.com

 

At media server (each)

root@pp-db01:/usr/openv/volmgr/bin# cat /usr/openv/volmgr/vm.conf
MM_SERVER_NAME = pp-db01.domain.com

root@pp-db02:/usr/openv/volmgr/bin# cat /usr/openv/volmgr/vm.conf
MM_SERVER_NAME = pp-db02.domain.com

 

 

At master (bpgetconfig -s pp-db01 and bpgetconfig -s pp-db02)

root@master:/usr/openv/volmgr/bin# bpgetconfig -s pp-db01.domain.com
Media Host
Solaris, Solaris10
7.6.0.0.0.2
NetBackup
7.6
760000
/usr/openv/netbackup/bin
SunOS 5.11
root@master:/usr/openv/volmgr/bin# bpgetconfig -s pp-db02.domain.com
Media Host
Solaris, Solaris10
7.6.0.0.0.2
NetBackup
7.6
760000
/usr/openv/netbackup/bin
SunOS 5.11

 

 

quebek
Moderator

Hi

I am running out of ideas... Restart NBU and PBX.

Make sure no firewall etc. is in the way... Check that bp.conf is OK...

It's a pity you can't open a case with Veritas, as 7.6 went EOL a long time ago...

Nicolai
Moderator

I think you need to check the drive path from each media server.

Mount an empty tape in the first tape drive and then, from each media server, run the following (requires the magnetic tape tools):

mt -f /dev/rmt/0cbn status

You should get a message saying "BOT online compression immediate-report-mode". If not, then the tape drive is not connected to 0cbn but maybe to 1cbn or 2cbn, or not at all. Update the NetBackup device configuration accordingly.
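Checking several candidate paths in one go could look like the sketch below. It assumes the mt utility is installed and only looks at the exit status of mt status, not at the exact wording of its message (which differs between platforms). check_tape_paths is a hypothetical helper, and the MT variable exists only so a different mt binary can be substituted.

```shell
# check_tape_paths PATH...
# Run `mt -f PATH status` for each device path and report whether the
# command succeeded. A success only shows that something answers on
# that path, not which physical drive it is.
check_tape_paths() {
    mt_cmd="${MT:-mt}"
    for dev in "$@"; do
        if "$mt_cmd" -f "$dev" status >/dev/null 2>&1; then
            echo "$dev: responds"
        else
            echo "$dev: no response"
        fi
    done
}
```

For example: check_tape_paths /dev/rmt/0cbn /dev/rmt/1cbn on each media server, with a tape loaded in one known drive.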

Or maybe better, delete the tape drives and robots and re-run the NetBackup device configuration wizard. Do a reboot first.

X2
Moderator

@Nicolai wrote:


 Or maybe better, delete the tape drives and robots and re-run the NetBackup device configuration wizard. Do a reboot first.


Try @Nicolai's last suggestion.

 
