04-14-2021 02:31 AM - last edited on 04-14-2021 03:27 AM by Nicolai
Hi,
We've got a problem with our tape library: after a power failure, it required a robot replacement.
After the change, both drives in one media server are down and returning an error:
root@pp-db01:~# /usr/openv/volmgr/bin/tpconfig -l
Device Robot Drive        Robot                Drive                Device                 Second
Type   Num   Index Type   DrNum Status Comment Name                 Path                   Device Path
robot  0     -     TLD    -     -      -       -                    pp-master.domain.com
drive  -     0     hcart3 1     DOWN   -       HP.ULTRIUM6-SCSI.000 /dev/rmt/0cbn
drive  -     1     hcart3 2     DOWN   -       HP.ULTRIUM6-SCSI.001 /dev/rmt/1cbn
root@pp-db01:~#
root@pp-db01:~#
root@pp-db01:~# /usr/openv/volmgr/bin/vmoprcmd -d
cannot connect to robotic software daemon (42)
Any input on this error? On the second media server (pp-db02), both drives are up.
root@pp-db02:~# /usr/openv/volmgr/bin/tpconfig -l
Device Robot Drive        Robot                Drive                Device                 Second
Type   Num   Index Type   DrNum Status Comment Name                 Path                   Device Path
robot  0     -     TLD    -     -      -       -                    pp-master.domain.com
drive  -     0     hcart3 1     UP     -       HP.ULTRIUM6-SCSI.000 /dev/rmt/0cbn
drive  -     1     hcart3 2     UP     -       HP.ULTRIUM6-SCSI.001 /dev/rmt/1cbn
Regards
04-14-2021 03:23 AM
Hey
What is the output of vmoprcmd and tpautoconf -report_disc?
Did you try stopping and starting ltid on db01?
04-14-2021 03:25 AM - edited 04-14-2021 03:28 AM
Hi @technimdaxviii, I have deleted the duplicate thread and removed the references to it, since it's now fixed.
Best Regards
Nicolai
04-14-2021 03:33 AM
Hi @quebek,
This is the output from the db01
root@pp-db01:~# vmoprcmd
HOST STATUS
Host Name Version Host Status
========================================= ======= ===========
pp-master.domain.com 760200 ACTIVE
pp-db02.domain.com 760200 ACTIVE
pp-db01.domain.com 760200 ACTIVE
PENDING REQUESTS
<NONE>
DRIVE STATUS
Drive Name Label Ready RecMID ExtMID Wr.Enbl. Type
Host DrivePath Status
=============================================================================
HP.ULTRIUM6-SCSI.000 No No No hcart3-Clean
pp-master.domain.com /dev/rmt/0cbn SCAN-TLD
pp-db02.domain.com /dev/rmt/0cbn TLD
pp-db01.domain.com /dev/rmt/0cbn DOWN-TLD
HP.ULTRIUM6-SCSI.001 No No No hcart3-Clean
pp-master.domain.com /dev/rmt/1cbn SCAN-TLD
pp-db02.domain.com /dev/rmt/1cbn TLD
pp-db01.domain.com /dev/rmt/1cbn DOWN-TLD
root@pp-db01:~#
root@pp-db01:~#
root@pp-db01:~# tpautoconf -report_disc
root@pp-db01:~#
What I tried was restarting the NetBackup services on db01:
Stop:
/usr/openv/netbackup/bin/goodies/netbackup stop
Start:
/usr/openv/netbackup/bin/goodies/netbackup start
Regards,
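To pick the down paths out of this kind of output on a busier server, a small filter can help. This is only a sketch, assuming the vmoprcmd layout shown above (a DRIVE STATUS header, one line per drive, then one host / path / status line per host). down_paths is a made-up helper name, not a NetBackup command.

```shell
# Hypothetical helper: reads `vmoprcmd` output on stdin and prints
# "drive host path status" for every path whose status starts with DOWN.
# Assumes the layout pasted in this thread; other versions may differ.
down_paths() {
    awk '
        /DRIVE STATUS/ { in_drives = 1; next }
        !in_drives     { next }
        # Host line: exactly three fields, the second one a device path.
        NF == 3 && substr($2, 1, 1) == "/" {
            if ($3 ~ /^DOWN/) print drive, $1, $2, $3
            next
        }
        # Skip column headers, the separator row and blank lines.
        ($1 == "Drive" && $2 == "Name") || $1 == "Host" || $1 ~ /^==/ || NF == 0 { next }
        # Anything else in this section is treated as a drive line.
        { drive = $1 }
    '
}

# Usage (on a server where vmoprcmd is in PATH):
#   vmoprcmd | down_paths
```

Run against the output above, this should list the two pp-db01 paths and nothing else.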
04-14-2021 03:43 AM
Hey
A few more commands to run on db01:
/usr/openv/volmgr/bin/scan
bpps -x
Do you see the drives there? I assume you already attempted to UP these two? I would UP them again and create logs under /usr/openv/volmgr/debug.
Refer to https://www.veritas.com/support/en_US/doc/86063237-127664549-0/v28993834-127664549
04-14-2021 04:00 AM
Below is the output for the scan and bpps -x
root@pp-db01:~# /usr/openv/volmgr/bin/scan
************************************************************
*********************** SDT_TAPE ************************
*********************** SDT_CHANGER ************************
************************************************************
------------------------------------------------------------
Device Name : "/dev/rmt/0cbn"
Passthru Name: "/dev/sg/c0tw500104f000d23aa7l0"
Volume Header: ""
Port: -1; Bus: -1; Target: -1; LUN: -1
Inquiry : "HP Ultrium 6-SCSI 239S"
Vendor ID : "HP "
Product ID : "Ultrium 6-SCSI "
Product Rev: "239S"
Serial Number: "HU140918T6"
WWN : ""
WWN Id Type : 0
Device Identifier: ""
Device Type : SDT_TAPE
NetBackup Drive Type: 16
Removable : Yes
Device Supports: SCSI-6
Flags : 0x0
Reason: 0x0
------------------------------------------------------------
Device Name : "/dev/sg/c0tw500104f000d23aa7l1"
Passthru Name: "/dev/sg/c0tw500104f000d23aa7l1"
Volume Header: ""
Port: -1; Bus: -1; Target: -1; LUN: -1
Inquiry : "STK SL150 0355"
Vendor ID : "STK "
Product ID : "SL150 "
Product Rev: "0355"
Serial Number: "464970G+1412SY2317"
WWN : ""
WWN Id Type : 0
Device Identifier: "STK SL150 464970G+1412SY2317"
Device Type : SDT_CHANGER
NetBackup Robot Type: 8
Removable : Yes
Device Supports: SCSI-5
Number of Drives : 2
Number of Slots : 30
Number of Media Access Ports: 4
Drive 1 Serial Number : "HU140918T6"
Drive 2 Serial Number : "HU140918U4"
Flags : 0x0
Reason: 0x0
------------------------------------------------------------
Device Name : "/dev/rmt/1cbn"
Passthru Name: "/dev/sg/c0tw500104f000d23aaal0"
Volume Header: ""
Port: -1; Bus: -1; Target: -1; LUN: -1
Inquiry : "HP Ultrium 6-SCSI 239S"
Vendor ID : "HP "
Product ID : "Ultrium 6-SCSI "
Product Rev: "239S"
Serial Number: "HU140918U4"
WWN : ""
WWN Id Type : 0
Device Identifier: ""
Device Type : SDT_TAPE
NetBackup Drive Type: 16
Removable : Yes
Device Supports: SCSI-6
Flags : 0x0
Reason: 0x0
root@pp-db01:~#
root@pp-db01:~#
root@pp-db01:~# /usr/openv/netbackup/bin/bpps -x
NB Processes
------------
root 8177 1 0 12:43:23 ? 0:01 /usr/openv/netbackup/bin/nbcssc -a NetBackup
root 8183 1 0 12:43:23 ? 0:00 /usr/openv/netbackup/bin/nbsvcmon
root 8150 1 0 12:43:20 ? 0:16 /usr/openv/netbackup/bin/nbsl
root 8090 1 0 12:43:15 ? 0:00 /usr/openv/netbackup/bin/bpcompatd
root 7933 1 0 12:43:07 ? 0:00 /usr/openv/netbackup/bin/bpcd -standalone
root 8058 1 0 12:43:12 ? 0:00 /usr/openv/pdde/pdag/bin/mtstrmd
root 8137 1 0 12:43:19 ? 0:02 /usr/openv/netbackup/bin/nbrmms
root 7943 1 0 12:43:08 ? 0:01 /usr/openv/netbackup/bin/nbdisco
root 7929 1 0 12:43:07 ? 0:00 /usr/openv/netbackup/bin/vnetd -standalone
MM Processes
------------
root 20259 1 0 13:07:40 ? 0:01 vmd
root 3593 1 0 13:39:06 ? 0:00 vmd -v
root 3643 3589 0 13:39:08 ? 0:00 tldd -v
root 20346 1 0 13:07:50 ? 0:00 vmd
root 3589 1 0 13:39:05 ? 0:00 ltid -v
root 8077 1 0 12:43:14 ? 0:00 vmd
root 3667 3589 0 13:39:10 ? 0:00 avrd -v
Shared Symantec Processes
-------------------------
root 1066 1 0 Mar 31 ? 0:34 /opt/VRTSpbx/bin/pbx_exchange
Yes, the drives are there, and I tried to bring them up. This is a sample log from the directory:
root@pp-db01:/usr/openv/volmgr/debug/dqts# ls -lrt
total 10
-rw-rw-rw- 1 root root 211 Apr 13 19:05 log.Robot01618330002
-rw-rw-rw- 1 root root 211 Apr 13 19:31 log.Robot01618331539
-rw-rw-rw- 1 root root 211 Apr 13 19:31 log.Robot01618331550
-rw-rw-rw- 1 root root 208 Apr 14 11:37 log.Robot01618389510
-rw-rw-rw- 1 root root 208 Apr 14 11:39 log.Robot01618389641
root@pp-db01:/usr/openv/volmgr/debug/dqts# cat log.Robot01618389641
11:39:52.182 [8344] <4> dtest: Opening up the log
11:39:52.322 [8344] <16> DqGetRobotDatabaseInfo: NumRobots is 1
11:39:52.322 [8344] <16> DqGetRobotDatabaseInfo: For robot number 0, found type 8 and path
root@pp-db01:/usr/openv/volmgr/debug/dqts#
04-14-2021 04:07 AM
Hey
I think there should be only one vmd process on that media server... why are there three? No clue.
bpps -x
MM Processes
------------
root 24864 1 0 2020 ? 00:10:38 /usr/openv/volmgr/bin/ltid
root 24875 1 0 2020 ? 00:09:19 vmd
root 25517 24864 0 2020 ? 00:00:03 tldd
root 25520 24864 0 2020 ? 00:16:53 avrd
root 25524 1 0 2020 ? 00:00:02 tldcd
Stop NBU there again, then check with bpps -x that everything is gone from the output.
Kill any remaining processes.
Start it back up and check the vmd process.
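The "check what is left" step can be sketched as a filter over the bpps -x output. This assumes the section headings look like the ones pasted in this thread ("NB Processes", "MM Processes", "Shared Symantec Processes"). leftover_mm_pids is a hypothetical helper name.

```shell
# Hypothetical helper: reads `bpps -x` output on stdin and prints the PID
# of every process still listed under "MM Processes".
# PBX sits under "Shared Symantec Processes" and is deliberately ignored.
leftover_mm_pids() {
    awk '
        /^ *MM Processes/ { in_mm = 1; next }
        /^ *NB Processes/ || /^ *Shared Symantec Processes/ { in_mm = 0; next }
        # Process lines look like: user PID PPID ... command
        in_mm && $2 ~ /^[0-9]+$/ { print $2 }
    '
}

# Usage after stopping NetBackup (review the list before killing anything):
#   /usr/openv/netbackup/bin/bpps -x | leftover_mm_pids
```

If the list is empty after the stop, nothing from Media Manager is left behind and it should be safe to start the services again.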
04-14-2021 04:16 AM
Same error after restarting the NBU services and killing the vmd processes manually:
root@pp-db01:~# /usr/openv/netbackup/bin/goodies/netbackup stop
stopping the NetBackup Service Monitor
stopping the NetBackup CloudStore Service Container
stopping the NetBackup Service Layer
stopping the NetBackup Remote Monitoring Management System
stopping the NetBackup compatibility daemon
stopping the Media Manager device daemon
stopping the Media Manager volume daemon
stopping the NetBackup Deduplication Multi-Threaded Agent
stopping the NetBackup Discovery Framework
stopping the NetBackup client daemon
stopping the NetBackup network daemon
root@pp-db01:~#
root@pp-db01:~#
root@pp-db01:~# /usr/openv/netbackup/bin/bpps -x
NB Processes
------------
MM Processes
------------
root 20259 1 0 13:07:40 ? 0:01 vmd
root 3593 1 0 13:39:06 ? 0:00 vmd -v
root 20346 1 0 13:07:50 ? 0:01 vmd
Shared Symantec Processes
-------------------------
root 1066 1 0 Mar 31 ? 0:35 /opt/VRTSpbx/bin/pbx_exchange
root@pp-db01:~#
root@pp-db01:~#
root@pp-db01:~# /usr/openv/netbackup/bin/bpps -x
NB Processes
------------
MM Processes
------------
root 20259 1 0 13:07:40 ? 0:01 vmd
root 3593 1 0 13:39:06 ? 0:00 vmd -v
root 20346 1 0 13:07:50 ? 0:01 vmd
Shared Symantec Processes
-------------------------
root 1066 1 0 Mar 31 ? 0:35 /opt/VRTSpbx/bin/pbx_exchange
root@pp-db01:~#
root@pp-db01:~# kill -9 20259
root@pp-db01:~# kill -9 3593
root@pp-db01:~# kill -9 20346
root@pp-db01:~#
root@pp-db01:~# /usr/openv/netbackup/bin/bpps -x
NB Processes
------------
MM Processes
------------
Shared Symantec Processes
-------------------------
root 1066 1 0 Mar 31 ? 0:35 /opt/VRTSpbx/bin/pbx_exchange
root@pp-db01:~# /usr/openv/netbackup/bin/goodies/netbackup start
NetBackup Authentication daemon started.
NetBackup network daemon started.
NetBackup client daemon started.
NetBackup SAN Client Fibre Transport daemon started.
NetBackup Discovery Framework started.
NetBackup Database Server started.
NetBackup Authorization daemon started.
NetBackup Event Manager started.
NetBackup Audit Manager started.
NetBackup Deduplication Manager started.
NetBackup Deduplication Engine started.
NetBackup Deduplication Multi-Threaded Agent started.
NetBackup Enterprise Media Manager started.
NetBackup Resource Broker started.
Media Manager daemons started.
NetBackup request daemon started.
NetBackup compatibility daemon started.
NetBackup Job Manager started.
NetBackup Policy Execution Manager started.
NetBackup Storage Lifecycle Manager started.
NetBackup Remote Monitoring Management System started.
NetBackup Key Management daemon started.
NetBackup Service Layer started.
NetBackup Indexing Manager started.
NetBackup Agent Request Server started.
NetBackup Bare Metal Restore daemon not started.
NetBackup Web Management Console started.
NetBackup Vault daemon started.
NetBackup CloudStore Service Container started.
NetBackup Service Monitor started.
NetBackup Bare Metal Restore Boot Server daemon started.
root@pp-db01:~#
root@pp-db01:~#
root@pp-db01:~#
root@pp-db01:~# /usr/openv/netbackup/bin/bpps -x
NB Processes
------------
root 20080 1 0 14:10:47 ? 0:01 /usr/openv/netbackup/bin/nbdisco
root 20247 1 0 14:10:54 ? 0:00 /usr/openv/netbackup/bin/bprd
root 20396 1 0 14:11:04 ? 0:00 /usr/openv/netbackup/bin/bmrbd
root 20222 1 0 14:10:51 ? 0:00 /usr/openv/pdde/pdag/bin/mtstrmd
root 20323 1 0 14:10:59 ? 0:00 /usr/openv/netbackup/bin/nbrmms
root 20063 1 0 14:10:46 ? 0:00 /usr/openv/netbackup/bin/vnetd -standalone
root 20338 1 0 14:11:00 ? 0:00 /usr/openv/netbackup/bin/nbsl
root 20380 1 0 14:11:03 ? 0:00 /usr/openv/netbackup/bin/nbcssc -a NetBackup
root 20386 1 0 14:11:03 ? 0:00 /usr/openv/netbackup/bin/nbsvcmon
root 20066 1 0 14:10:46 ? 0:00 /usr/openv/netbackup/bin/bpcd -standalone
root 20252 1 0 14:10:54 ? 0:00 /usr/openv/netbackup/bin/bpcompatd
MM Processes
------------
root 20238 1 0 14:10:53 ? 0:00 /usr/openv/volmgr/bin/ltid
root 20307 20238 0 14:10:57 ? 0:00 avrd
root 20243 1 0 14:10:53 ? 0:00 vmd
root 20273 20238 0 14:10:55 ? 0:00 tldd
Shared Symantec Processes
-------------------------
root 1066 1 0 Mar 31 ? 0:35 /opt/VRTSpbx/bin/pbx_exchange
root@pp-db01:~#
root@pp-db01:~#
root@pp-db01:~#
root@pp-db01:~# vmoprcmd
HOST STATUS
Host Name Version Host Status
========================================= ======= ===========
master.domain.com 760200 ACTIVE
pp-db02.domain.com 760200 ACTIVE
pp-db01.domain.com 760200 ACTIVE
PENDING REQUESTS
<NONE>
DRIVE STATUS
Drive Name Label Ready RecMID ExtMID Wr.Enbl. Type
Host DrivePath Status
=============================================================================
HP.ULTRIUM6-SCSI.000 No No No hcart3-Clean
master.domain.com /dev/rmt/0cbn SCAN-TLD
pp-db02.domain.com /dev/rmt/0cbn TLD
pp-db01.domain.com /dev/rmt/0cbn DOWN-TLD
HP.ULTRIUM6-SCSI.001 No No No hcart3-Clean
master.domain.com /dev/rmt/1cbn SCAN-TLD
pp-db02.domain.com /dev/rmt/1cbn TLD
pp-db01.domain.com /dev/rmt/1cbn DOWN-TLD
root@pp-db01:~#
root@pp-db01:~#
root@pp-db01:~# /usr/openv/netbackup/bin/bpps -x
NB Processes
------------
root 20080 1 0 14:10:47 ? 0:01 /usr/openv/netbackup/bin/nbdisco
root 20247 1 0 14:10:54 ? 0:00 /usr/openv/netbackup/bin/bprd
root 20222 1 0 14:10:51 ? 0:00 /usr/openv/pdde/pdag/bin/mtstrmd
root 20323 1 0 14:10:59 ? 0:00 /usr/openv/netbackup/bin/nbrmms
root 20063 1 0 14:10:46 ? 0:00 /usr/openv/netbackup/bin/vnetd -standalone
root 20338 1 0 14:11:00 ? 0:00 /usr/openv/netbackup/bin/nbsl
root 20380 1 0 14:11:03 ? 0:00 /usr/openv/netbackup/bin/nbcssc -a NetBackup
root 20386 1 0 14:11:03 ? 0:00 /usr/openv/netbackup/bin/nbsvcmon
root 20066 1 0 14:10:46 ? 0:00 /usr/openv/netbackup/bin/bpcd -standalone
root 20252 1 0 14:10:54 ? 0:00 /usr/openv/netbackup/bin/bpcompatd
MM Processes
------------
root 20238 1 0 14:10:53 ? 0:00 /usr/openv/volmgr/bin/ltid
root 20307 20238 0 14:10:57 ? 0:00 avrd
root 20243 1 0 14:10:53 ? 0:00 vmd
root 20273 20238 0 14:10:55 ? 0:00 tldd
Shared Symantec Processes
-------------------------
root 1066 1 0 Mar 31 ? 0:35 /opt/VRTSpbx/bin/pbx_exchange
root@pp-db01:~#
root@pp-db01:~#
root@pp-db01:~#
root@pp-db01:~# vmoprcmd -up 0
cannot connect to robotic software daemon (42)
04-14-2021 04:50 AM
Hi
Can you please share the output of cat /usr/openv/volmgr/vm.conf from both media servers?
Any difference?
From the master, can you run bpgetconfig -s pp-db01?
04-14-2021 05:34 AM
Hi. Please see below:
At master server
root@master:/usr/openv/volmgr/bin# cat /usr/openv/volmgr/vm.conf
MM_SERVER_NAME = master.domain.com
At media server (each)
root@pp-db01:/usr/openv/volmgr/bin# cat /usr/openv/volmgr/vm.conf
MM_SERVER_NAME = pp-db01.domain.com
root@pp-db02:/usr/openv/volmgr/bin# cat /usr/openv/volmgr/vm.conf
MM_SERVER_NAME = pp-db02.domain.com
At master ( bpgetconfig -s pp-db01 and bpgetconfig -s pp-db02)
root@master:/usr/openv/volmgr/bin# bpgetconfig -s pp-db01.domain.com
Media Host
Solaris, Solaris10
7.6.0.0.0.2
NetBackup
7.6
760000
/usr/openv/netbackup/bin
SunOS 5.11
root@master:/usr/openv/volmgr/bin# bpgetconfig -s pp-db02.domain.com
Media Host
Solaris, Solaris10
7.6.0.0.0.2
NetBackup
7.6
760000
/usr/openv/netbackup/bin
SunOS 5.11
04-14-2021 06:29 AM
Hi
I am running out of ideas... Restart NBU and PBX.
Make sure no firewall etc. is in the way, and check that bp.conf is OK.
It's a pity you can't open a case with Veritas, as 7.6 went EOL a long time ago.
04-14-2021 06:36 AM - edited 04-14-2021 06:42 AM
I think you need to check the drive path from each media server.
Mount an empty tape in the first tape drive, then run the following from each media server (requires the magnetic tape tools):
mt -f /dev/rmt/0cbn status
You should get a message saying, "BOT online compression immediate-report-mode". If not, then the tape drive is not connected to 0cbn but maybe to 1cbn or 2cbn, or not at all. Update the NetBackup device configuration accordingly.
Or, maybe better, delete the tape drives and robots and re-run the NetBackup device configuration wizard. Do a reboot first.
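A possible cross-check that needs no tape mounted: the scan output earlier in the thread lists a serial number for each tape device and, under the changer, the serial number the robot reports for each drive position. Matching the two shows which /dev/rmt path belongs to which robot drive number. This is a sketch that assumes the scan layout pasted above. map_robot_drives is a made-up name.

```shell
# Hypothetical helper: reads `scan` output on stdin and prints
# "robot_drive_number serial device_path" for each drive the robot reports.
# Prints NOT-FOUND as the path when no tape device has that serial.
map_robot_drives() {
    awk -F'"' '
        /^ *Device Name/            { dev = $2 }
        /^ *Serial Number/          { serial = $2 }
        # Only tape devices get a path entry; the changer is skipped.
        /^ *Device Type/ && /SDT_TAPE/ { path[serial] = dev }
        /^ *Drive [0-9]+ Serial Number/ {
            split($1, w, " ")       # w[2] is the robot drive number
            n = w[2] + 0
            robot[n] = $2
            if (n > max) max = n
        }
        END {
            for (i = 1; i <= max; i++) {
                if (!(i in robot)) continue
                p = (robot[i] in path) ? path[robot[i]] : "NOT-FOUND"
                printf "%d %s %s\n", i, robot[i], p
            }
        }
    '
}

# Usage on each media server:
#   /usr/openv/volmgr/bin/scan | map_robot_drives
```

The result can then be compared with the DrNum and Path columns of tpconfig -l on the same server.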
04-15-2021 05:30 AM
Thank you guys for your help. Luckily, I've managed to bring up the drive in db01.
root@master:~# vmoprcmd
HOST STATUS
Host Name Version Host Status
========================================= ======= ===========
master.domain.com 760200 ACTIVE
pp-db02.domain.com 760200 ACTIVE
pp-db01.domain.com 760200 ACTIVE
PENDING REQUESTS
<NONE>
DRIVE STATUS
Drive Name Label Ready RecMID ExtMID Wr.Enbl. Type
Host DrivePath Status
=============================================================================
HP.ULTRIUM6-SCSI.000 No No No hcart3-Clean
master.domain.com /dev/rmt/0cbn TLD
pp-db02.domain.com /dev/rmt/0cbn SCAN-TLD
pp-db01.domain.com /dev/rmt/0cbn TLD
HP.ULTRIUM6-SCSI.001 Yes Yes 0017L6 0017L6 Yes hcart3-Clean
master.domain.com /dev/rmt/1cbn TLD
pp-db02.domain.com /dev/rmt/1cbn SCAN-TLD
pp-db01.domain.com /dev/rmt/1cbn ACTIVE
Though I'm not 100% sure what fixed it. What I did:
1. Reboot the master server
2. Reboot the tape library
After the reboot, the drive in db01 was still down. What I then tried was to manually bring up the path in the GUI:
Media and Device Management > Device Monitor > select the drive. In the drive's path list, right-click the pp-db01 entry and click “Up Path”.