
tpautoconf -report_disc shows the same output after a successful tpautoconf -replace_drive

liuyl
Level 6

After a faulty tape drive (TD) was replaced successfully via tpautoconf on the media server, why does its -report_disc still output the previous mismatch record?
Note: On the corresponding master server, tpautoconf -report_disc returns empty output, and the drive also works fine on the media server!

#
#
# tpautoconf -report_disc
======================= Missing Device (Drive) =======================
Drive Name = HP.ULTRIUM5-SCSI.002
Drive Path = /dev/rmt6.1
Inquiry = "HP Ultrium 5-SCSI I6RZ"
Serial Number = HU1248TF0U
TLD(3) definition Drive = 10
Hosts configured for this device:
Host = JCERPDB1
======================= New Device (Drive) =======================
Inquiry = "HP Ultrium 5-SCSI I5GZ"
Serial Number = HU1213MTPL
Drive Path = /dev/rmt6.1
#
#
#
# tpconfig -l|tail -28
drive - 11 hcart3 6 UP - IBM.ULT3580-TD3.001 /dev/rmt19.1
drive - 12 hcart3 3 UP - IBM.ULT3580-TD3.006 /dev/rmt2.1
drive - 13 hcart3 8 UP - IBM.ULT3580-TD3.000 /dev/rmt20.1
drive - 14 hcart3 5 UP - IBM.ULT3580-TD3.005 /dev/rmt3.1
drive - 15 hcart3 7 UP - IBM.ULT3580-TD3.004 /dev/rmt4.1
robot 2 - TLD - - - - jchxbak
drive - 21 hcart3 1 UP - HP.ULTRIUM6-SCSI.001 /dev/rmt21.1
drive - 22 hcart3 2 UP - HP.ULTRIUM6-SCSI.007 /dev/rmt22.1
drive - 23 hcart3 3 UP - HP.ULTRIUM6-SCSI.003 /dev/rmt23.1
drive - 24 hcart3 4 UP - HP.ULTRIUM6-SCSI.005 /dev/rmt24.1
drive - 25 hcart3 5 UP - HP.ULTRIUM6-SCSI.002 /dev/rmt25.1
drive - 26 hcart3 6 UP - HP.ULTRIUM6-SCSI.004 /dev/rmt26.1
drive - 27 hcart3 7 UP - HP.ULTRIUM6-SCSI.000 /dev/rmt27.1
drive - 28 hcart3 8 UP - HP.ULTRIUM6-SCSI.006 /dev/rmt28.1
robot 3 - TLD - - - - jchxbak
drive - 2 hcart2 2 UP - HP.ULTRIUM5-SCSI.009 /dev/rmt10.1
drive - 3 hcart2 3 UP - HP.ULTRIUM5-SCSI.010 /dev/rmt11.1
drive - 4 hcart2 4 UP - HP.ULTRIUM5-SCSI.011 /dev/rmt12.1
drive - 5 hcart2 5 UP - HP.ULTRIUM5-SCSI.006 /dev/rmt13.1
drive - 6 hcart2 6 UP - HP.ULTRIUM5-SCSI.007 /dev/rmt14.1
drive - 7 hcart2 7 UP - HP.ULTRIUM5-SCSI.005 /dev/rmt15.1
drive - 8 hcart2 8 UP - HP.ULTRIUM5-SCSI.004 /dev/rmt16.1
drive - 16 hcart2 9 UP - HP.ULTRIUM5-SCSI.003 /dev/rmt5.1
drive - 17 hcart2 10 UP - HP.ULTRIUM5-SCSI.002 /dev/rmt6.1
drive - 18 hcart2 11 UP - HP.ULTRIUM5-SCSI.001 /dev/rmt7.1
drive - 19 hcart2 12 UP - HP.ULTRIUM5-SCSI.000 /dev/rmt8.1
drive - 20 hcart2 1 UP - HP.ULTRIUM5-SCSI.008 /dev/rmt9.1
drive - 0 pcd - DISABL - IBM.DDSGEN6.000 /dev/rmt0.1
#
#
#
# tpautoconf -replace_drive HP.ULTRIUM5-SCSI.002 -path /dev/rmt6.1
Found a matching device in global DB, HP.ULTRIUM5-SCSI.002 on host JCERPDB1
#
#
#
# tpautoconf -report_disc|grep -Ei "124|1213"
Serial Number = HU1248TF0U
Serial Number = HU1213MTPL
#
#
#
# stopltid
#
#
#
# ltid -v
#
#
#
# tpautoconf -report_disc|grep -Ei "124|1213"
Serial Number = HU1248TF0U
Serial Number = HU1213MTPL
#
#
#
# netbackup stop
stopping the NetBackup Service Monitor
stopping the NetBackup Service Layer
stopping the NetBackup Remote Monitoring Management System
stopping the NetBackup compatibility daemon
stopping the Media Manager device daemon
stopping the Media Manager volume daemon
stopping the NetBackup client daemon
stopping the NetBackup network daemon
#
#
#
# bpps -a
NB Processes
------------


MM Processes
------------
#
#
#
# netbackup start
NetBackup network daemon started.
NetBackup client daemon started.
NetBackup SAN Client Fibre Transport daemon started.
NetBackup Database Server started.
NetBackup Event Manager started.
NetBackup Audit Manager started.
NetBackup Enterprise Media Manager started.
NetBackup Resource Broker started.
Media Manager daemons started.
NetBackup request daemon started.
NetBackup compatibility daemon started.
NetBackup Job Manager started.
NetBackup Policy Execution Manager started.
NetBackup Storage Lifecycle Manager started.
NetBackup Remote Monitoring Management System started.
NetBackup Key Management daemon started.
NetBackup Service Layer started.
NetBackup Agent Request Server started.
NetBackup Bare Metal Restore daemon not started.
NetBackup Vault daemon started.
NetBackup Service Monitor started.
NetBackup Bare Metal Restore Boot Server daemon started.
#
#
#
# bpps -a
NB Processes
------------
root 17301572 1 0 15:22:04 - 0:00 /usr/openv/netbackup/bin/vnetd -standalone
root 4194620 1 0 15:22:08 - 0:00 /usr/openv/netbackup/bin/nbsvcmon
root 4784508 1 0 15:22:06 - 0:00 /usr/openv/netbackup/bin/nbrmms
root 7537248 1 0 15:22:04 - 0:00 /usr/openv/netbackup/bin/bpcd -standalone
root 4522834 1 0 15:22:08 - 0:00 /usr/openv/netbackup/bin/bmrbd
root 5178226 1 0 15:22:06 - 0:00 /usr/openv/netbackup/bin/bpcompatd
root 7930840 1 0 15:22:07 - 0:00 /usr/openv/netbackup/bin/nbsl


MM Processes
------------
root 21692480 3212062 0 15:22:09 - 0:00 tldd -v
root 6226440 1 0 15:22:06 - 0:00 vmd -v
root 9568958 3212062 0 15:22:11 - 0:00 avrd -v
root 3212062 1 0 15:22:06 - 0:00 /usr/openv/volmgr/bin/ltid
#
#
#
# tpautoconf -report_disc|grep -Ei "124|1213"
Serial Number = HU1248TF0U
Serial Number = HU1213MTPL
#
#
#
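
For readers comparing the two records, here is a minimal sketch that condenses the -report_disc text into one line per record and flags when the missing and new device share the same path. The sample text is copied from the output above; `summarize_disc` is a hypothetical helper name, and the sketch assumes a single Missing/New pair in the report.

```shell
#!/bin/sh
# Sample text copied from the tpautoconf -report_disc output in the post above.
report='======================= Missing Device (Drive) =======================
Drive Name = HP.ULTRIUM5-SCSI.002
Drive Path = /dev/rmt6.1
Inquiry = "HP Ultrium 5-SCSI I6RZ"
Serial Number = HU1248TF0U
TLD(3) definition Drive = 10
Hosts configured for this device:
Host = JCERPDB1
======================= New Device (Drive) =======================
Inquiry = "HP Ultrium 5-SCSI I5GZ"
Serial Number = HU1213MTPL
Drive Path = /dev/rmt6.1'

# Hypothetical helper: reads report text on stdin, assumes one Missing/New pair.
summarize_disc() {
  awk '
    /Missing Device/    { kind = "missing"; next }
    /New Device/        { kind = "new"; next }
    /^Serial Number = / { serial[kind] = $4 }
    /^Drive Path = /    { path[kind] = $4 }
    END {
      printf "missing %s %s\n", serial["missing"], path["missing"]
      printf "new %s %s\n", serial["new"], path["new"]
      # Same path on both records suggests the old entry was never updated.
      if (path["missing"] == path["new"]) print "same-path"
    }'
}

summary=$(printf '%s\n' "$report" | summarize_disc)
printf '%s\n' "$summary"
```

On a live media server the same function could presumably be fed directly, as in `tpautoconf -report_disc | summarize_disc`.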

31 REPLIES

1. More precisely, replace_drive should just refresh/update the mismatch records of all the corresponding SSO media servers with the new TD S/N, even though the record itself already exists!
2. Normally, tpautoconf run on any one of the SSO media servers should output the host entries of all the corresponding SSO media servers, but my tpautoconf only outputs the one host entry for its own media server...
Note: I think this has nothing to do with the new TD S/N itself, so it would be another confusing problem!

mph999
Level 6
Employee Accredited

The SN is only held in one entry in one table in the DB; this is all it would normally change.  The hosts that see that tape drive are referenced in another table and are just kinda 'linked'.

From memory, tpautoconf should only show the devices attached to the one server it is run on.  Are you perhaps thinking of vmoprcmd?

Please see my previous post in this question:

That is also my doubt!
It is indeed an SSO TD, so why is there only its own host entry in my tpautoconf output (-replace_drive/-report_disc) on any one of the corresponding SSO media servers?

https://www.veritas.com/support/en_US/article.000027601
http://symcnbu.blogspot.com/2010/04/updating-replaced-tape-drive-in-nbu.html

mph999
Level 6
Employee Accredited

I do not know the answer to that - I suspect if you dug through NBDB you would perhaps find something amiss with regard to which machines are mapped to the drive.

I have never seen this not work before, as I mentioned.  It's a simple concept that has been working for years, and from what I saw, it appears that the new drive may have been added before the replace_drive was run and it has got itself all upset.

There are only two ways to fix this:

Delete the missing drive, and possibly the new drive, from the config and re-add it - this should clear the missing device from the output.

Run manual SQL commands to remove it from NBDB - but this is a very last resort and would only be used if the method above fails.
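
As a rough illustration of the first option, the sketch below looks up the drive index for a drive name in `tpconfig -l` style output and prints a delete/re-add plan without running anything. The sample lines are copied from the listing earlier in the thread; `drive_index` is a hypothetical helper, the column positions are assumed from that listing, and the `tpconfig -delete -drive` syntax should be checked against the commands reference for your release before use.

```shell
#!/bin/sh
# Sample lines copied from the tpconfig -l output earlier in this thread.
tpconfig_sample='drive - 16 hcart2 9 UP - HP.ULTRIUM5-SCSI.003 /dev/rmt5.1
drive - 17 hcart2 10 UP - HP.ULTRIUM5-SCSI.002 /dev/rmt6.1
drive - 18 hcart2 11 UP - HP.ULTRIUM5-SCSI.001 /dev/rmt7.1'

# Hypothetical helper: print the drive index (column 3) for an exact
# drive name (column 8). Column positions are assumed from the sample.
drive_index() {
  awk -v name="$1" '$1 == "drive" && $8 == name { print $3 }'
}

idx=$(printf '%s\n' "$tpconfig_sample" | drive_index "HP.ULTRIUM5-SCSI.002")

# Dry run only: the plan is printed, not executed.
plan="tpconfig -delete -drive $idx
tpautoconf -a"
printf '%s\n' "$plan"
```

Printing the plan first leaves room to confirm the index against the live `tpconfig -l` output, which matters because deleting the wrong index would remove a healthy drive.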

I don't think there is much else I can add to this, because ultimately the fix will be as I mention, and I am confident that in the future if a drive is swapped - running only tpautoconf -replace_drive and not adding the drive, will be successful.

In all probability, I think this problem comes from the difference in the device DB mechanism between NBU 5.x and NBU 6.x and above!
a) distributed (5.x): GlobDB, LocalDB
b) centralized (6.x and above): only EMM

So I strongly suspect that the KB documents above were actually written for the older NBU 5.x environments!

mph999
Level 6
Employee Accredited

You can't really compare 5.0 to >6.0 - as you correctly point out, everything ended up in EMM, which does actually make things easier and simpler, at least most of the time.

The issue should resolve with a delete/reconfig of the drive - which, although a little inconvenient, should be reasonably quick to do.

Genericus
Moderator
VIP

I am forced to re-run the drive wizard, since the replace_drive command does not stick once the drive is accessed.

This does fix it, but requires recycle of the processes. I am not sure if it is a feature of the shared drives, but it has not worked for several years.

Alternatively, I placed the replace drive command in a crontab running every five minutes, that works too...
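
A cron entry along those lines might look like the sketch below. The drive name and path are placeholders borrowed from the original post, not the values used on this system, and the binary location is assumed to be the default /usr/openv/volmgr/bin. The minute field is written out in full because some cron implementations (Solaris and AIX among them) do not accept the `*/5` step syntax.

```shell
#!/bin/sh
# Placeholder values: substitute the drive name and path from your own tpconfig -l output.
DRIVE_NAME="HP.ULTRIUM5-SCSI.002"
DRIVE_PATH="/dev/rmt6.1"

# Explicit minute list instead of */5 for portability across cron implementations.
MINUTES="0,5,10,15,20,25,30,35,40,45,50,55"

# Assumed install location of tpautoconf; adjust if NetBackup lives elsewhere.
CRON_LINE="$MINUTES * * * * /usr/openv/volmgr/bin/tpautoconf -replace_drive $DRIVE_NAME -path $DRIVE_PATH >/dev/null 2>&1"

# Print the entry so it can be reviewed before being added with crontab -e.
printf '%s\n' "$CRON_LINE"
```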

 

NetBackup 9.1.0.1 on Solaris 11, writing to Data Domain 9800 7.7.4.0
duplicating via SLP to LTO5 & LTO8 in SL8500 via ACSLS

I think that replace_drive should be run immediately after the old faulty TD has been physically replaced with the new good TD!
So everyone should pay closer attention to the logical replacement procedure once the physical replacement is complete!

Note: anyhow, as far as I remember, I have almost never succeeded with replace_drive in all these years!

mph999
Level 6
Employee Accredited

From what I could see in the nbdb, the new drive had already been added, which I suspect is why the replace_drive didn't work - I think.

How about this: if you replace another drive, don't run anything. Log a call with Veritas, explain that you have agreed to speak with me (Martin Holt), and I will personally go through it with you - that is the best I can do.

Thanks to mph999! I shall do that if possible!

But this mismatch does not affect the normal operation of the corresponding TD, only its management!

mph999
Level 6
Employee Accredited

No - if you get a missing drive in tpautoconf but have replaced the drive and configured it via the wizard, it is simply a cosmetic issue ...  Apart from the fact the old drive is still floating around in NBDB somewhere ...

Yes, the new TD was simply configured via tpautoconf -a after the faulty TD was replaced!