NetBackup seems to use only 1 drive when there are 3 drives in total

Hi All,
not sure if this question has been posted before, but my NetBackup server only seems to
stick to one drive, i.e. drive index 2 (/dev/rmt/3cbn), for all its backup jobs. When that drive is in use,
the other scheduled jobs sit in the "queued" state.

Is there any way to tell for sure how many tape drives are "available" to the media server,
e.g. a command line to execute?

root@SUBCTUXS7:openv[2046]# sgscan all
/dev/sg/c0t0l0: Disk (/dev/rdsk/c1t0d0): "SEAGATE ST373207LSUN72G"
/dev/sg/c0t1l0: Disk (/dev/rdsk/c1t1d0): "SEAGATE ST373207LSUN72G"
/dev/sg/c0tw2100001086103eecl0: Changer: "ATL     M2500"
/dev/sg/c0tw2100001086103eecl1: Tape (/dev/rmt/0): "HP      Ultrium 2-SCSI"
/dev/sg/c0tw2100001086140864l0: Tape (/dev/rmt/4): "HP      Ultrium 2-SCSI"
/dev/sg/c0tw2100001086140864l1: Tape (/dev/rmt/3): "HP      Ultrium 2-SCSI"


root@SUBCTUXS7:/[2039]# tpconfig -d
Index DriveName              DrivePath                Type    Shared   Status
***** *********              **********               ****    ******   ******
  1   HPUltrium2-SCSI1       /dev/rmt/4cbn            hcart2   Yes      UP
        TLD(0) Definition       DRIVE=2
  2   HPUltrium2-SCSI2       /dev/rmt/3cbn            hcart2   Yes      UP
        TLD(0) Definition       DRIVE=3

Currently defined robotics are:
  TLD(0)     robotic path = /dev/sg/c0tw2100001086103eecl0,
             volume database host = foobar7

Any help or comments will be much appreciated : ) 

Background Info

Veritas NetBackup 5.1 MP7
Solaris 10 media server
StorageTek L100 tape library, LTO-2 drives

17 Replies

There are probably various places you could look.

But as a first step, check your Storage Unit configuration, especially 'Maximum concurrent write drives'.


***EDIT***
Also notice that your O/S sees three drives but NetBackup only has two configured. Is /dev/rmt/0 not in the library? If so, you may need to reconfigure or re-add your tape drives.

Any output from tpautoconf -report_disc  ?
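
The mismatch between what the O/S sees and what NetBackup has configured can also be checked mechanically. Below is a minimal Python sketch, assuming the sgscan and tpconfig -d output is pasted in as text exactly as in the first post; the function names are made up for illustration and nothing here calls NetBackup itself.

```python
import re

# Sample text copied from the sgscan and tpconfig -d output in the first post.
SGSCAN = """\
/dev/sg/c0t0l0: Disk (/dev/rdsk/c1t0d0): "SEAGATE ST373207LSUN72G"
/dev/sg/c0t1l0: Disk (/dev/rdsk/c1t1d0): "SEAGATE ST373207LSUN72G"
/dev/sg/c0tw2100001086103eecl0: Changer: "ATL     M2500"
/dev/sg/c0tw2100001086103eecl1: Tape (/dev/rmt/0): "HP      Ultrium 2-SCSI"
/dev/sg/c0tw2100001086140864l0: Tape (/dev/rmt/4): "HP      Ultrium 2-SCSI"
/dev/sg/c0tw2100001086140864l1: Tape (/dev/rmt/3): "HP      Ultrium 2-SCSI"
"""

TPCONFIG = """\
  1   HPUltrium2-SCSI1       /dev/rmt/4cbn            hcart2   Yes      UP
        TLD(0) Definition       DRIVE=2
  2   HPUltrium2-SCSI2       /dev/rmt/3cbn            hcart2   Yes      UP
        TLD(0) Definition       DRIVE=3
"""


def os_tape_devices(sgscan_text):
    """Return the /dev/rmt/N base paths that sgscan lists as tape devices."""
    return set(re.findall(r"Tape \((/dev/rmt/\d+)\)", sgscan_text))


def configured_tape_devices(tpconfig_text):
    """Return the /dev/rmt/N base paths in tpconfig -d (suffix such as 'cbn' dropped)."""
    return set(re.findall(r"(/dev/rmt/\d+)[a-z]*", tpconfig_text))


def unconfigured(sgscan_text, tpconfig_text):
    """Tape devices the O/S sees that have no drive entry in tpconfig -d."""
    return sorted(os_tape_devices(sgscan_text) - configured_tape_devices(tpconfig_text))


print(unconfigured(SGSCAN, TPCONFIG))
```

With the output from the first post, the only device left over is /dev/rmt/0.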


Hi Andy,
I am quite the newbie. How do I check the maximum concurrent write drives from the command line?

And you're right about /dev/rmt/0 being missing from the tpconfig -d report.
We replaced that drive earlier with a similar model that has newer firmware,
but we did not down the drive before replacing it.
Ever since then, /dev/rmt/0 was reported as down by tpconfig.

Then I removed that drive, for some reason.

Now I have added it back : )


root@SUBCTUXS7:bptm[2052]# tpconfig -add -drive -type hcart2 -shared yes
-index 0 -path /dev/rmt/0cbn -asciiname HPUltrium2-SCSI0
added drive index 0 of type hcart2 to configuration
root@SUBCTUXS7:bptm[2053]# tpconfig -d
Index DriveName              DrivePath                Type    Shared   Status
***** *********              **********               ****    ******   ******
  0   HPUltrium2-SCSI0       /dev/rmt/0cbn            hcart2   Yes      UP
  1   HPUltrium2-SCSI1       /dev/rmt/4cbn            hcart2   Yes      UP
        TLD(0) Definition       DRIVE=2
  2   HPUltrium2-SCSI2       /dev/rmt/3cbn            hcart2   Yes      UP
        TLD(0) Definition       DRIVE=3

Currently defined robotics are:
  TLD(0)     robotic path = /dev/sg/c0tw2100001086103eecl0,
             volume database host = SUBCTUXS7


But the media / master server is still using only 1 tape drive at any one time.







Have you no access to the Admin Console GUI?

Some things are so much easier to view or change that way! (inc. adding 'new' drives ;) )

bpstulist -U
should show you the number of drives 'allowed' for each storage unit you have configured.

It sounds like a configuration issue.

I recommend you remove all the drives and the robot, then add them back through the storage configuration wizard.

I don't know how to spell that out in UNIX.

Hi Andy,
here's the output from bpstulist.

I have 5 media servers.
Note: the master NetBackup server is SUBCTUXS7.

For clarification:

All 5 media servers are connected to the
same StorageTek L100 library via FC through
a SAN switch.

Does that mean that I can use only 1
drive at any one time on SUBCTUXS7, since
"Max MPX/drive:" is "1"?

I forgot to mention that the queued jobs
all belong to the same volume pool. Is this pertinent?

Hi Srikanth,
I will reconfigure all my drives as a last resort. Never done that before ....


root@SUBCTUXS7:bptm[2066]# bpstulist -U

Label:             SUBCTUXS7-hcart2-robot-tld-0
Storage Unit Type: Media Manager
Host Connection:   SUBCTUXS7
Number of Drives:  3
On Demand Only:    yes
Max MPX/drive:     1
Density:           hcart2 - 1/2 Inch Cartridge 2
Robot Type/Number: TLD / 0
Max Fragment Size: 1048576 MB

Label:             SUBCTUXS1-hcart2-robot-tld-0
Storage Unit Type: Media Manager
Host Connection:   SUBCTUXS1
Number of Drives:  3
On Demand Only:    yes
Max MPX/drive:     1
Density:           hcart2 - 1/2 Inch Cartridge 2
Robot Type/Number: TLD / 0
Max Fragment Size: 1048576 MB

Label:             SUBCTUXS2-hcart2-robot-tld-0
Storage Unit Type: Media Manager
Host Connection:   SUBCTUXS2
Number of Drives:  3
On Demand Only:    yes
Max MPX/drive:     1
Density:           hcart2 - 1/2 Inch Cartridge 2
Robot Type/Number: TLD / 0
Max Fragment Size: 1048576 MB



Label:             vapps98a-hcart2-robot-tld-0
Storage Unit Type: Media Manager
Host Connection:   vapps98a
Number of Drives:  3
On Demand Only:    yes
Max MPX/drive:     1
Density:           hcart2 - 1/2 Inch Cartridge 2
Robot Type/Number: TLD / 0
Max Fragment Size: 1048576 MB









The key here is

Number of Drives: 3

This means that it should use all three drives. The Max MPX/drive set to 1 means that it will only send one job to that drive at any one time.

The way you have it set up I would expect 3 jobs to run at any one time (one to each drive) with the remainder Q'ing. Unless you have a max jobs per policy set to one which is another area you could check.

Also, you mention same volume pool - could it be that you have a max number of partially full media set? (altho' having said that I'm not sure at which release of NetBackup that this setting became available)

((Aside: For our 'main' STU we have mpx set to 32 - we then restrict the number of jobs that can go to each drive at the Policy level))

Did you try the tpautoconf -report_disc command? This may show some config issues that can be resolved without resorting to reconfiguring all your drives (although that may be the best way to sort them in the end).
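
To put numbers on the reasoning above, here is a rough Python sketch that reads bpstulist -U style text and multiplies "Number of Drives" by "Max MPX/drive". It assumes the "Key: value" layout shown in the earlier post; the helper names are made up, and the result is only the ceiling the storage unit allows. Policy limits, or drives NetBackup cannot actually use, can push the real figure lower.

```python
# One storage unit copied from the bpstulist -U output posted earlier.
SAMPLE = """\
Label:             SUBCTUXS7-hcart2-robot-tld-0
Storage Unit Type: Media Manager
Host Connection:   SUBCTUXS7
Number of Drives:  3
On Demand Only:    yes
Max MPX/drive:     1
Density:           hcart2 - 1/2 Inch Cartridge 2
Robot Type/Number: TLD / 0
Max Fragment Size: 1048576 MB
"""


def parse_storage_units(text):
    """Split 'Key: value' lines into one dict per storage unit (a new 'Label' starts a unit)."""
    units, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Label" and current:
            units.append(current)
            current = {}
        current[key] = value
    if current:
        units.append(current)
    return units


def job_ceiling(unit):
    """Upper bound on concurrent jobs from the storage unit alone: drives x MPX per drive."""
    return int(unit["Number of Drives"]) * int(unit["Max MPX/drive"])


for unit in parse_storage_units(SAMPLE):
    print(unit["Label"], job_ceiling(unit))
```

For the SUBCTUXS7 storage unit this gives 3, which matches the expectation of one job per drive.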


Try this

Try executing nbrbutil -resetall or nbrbutil -resetmediaserver <media_server> on the master server.

This will remove all the stale settings... In order to execute this you need to ensure that no job is active in the Activity Monitor.

I hope this will resolve your issue.

NBU 5.1 user

<quote>

BackGround Info

veritas netbackup 5.1 MP7

</quote>

That's an NBU 6.x command you're suggesting, but the user is on 5.1. Time for the "D'oh" expression ;-)

Hi Andy,

> The way you have it set up I would expect 3 jobs to run at any one time (one to each drive) with the remainder Q'ing. Unless you have a max jobs per policy set to one which is another area you could check.

Any idea where I can check that setting?
Command line again would be good. Then again, I don't remember explicitly setting that.
Is this setting tied to each policy, or is it more like a 'global' setting?

> Also, you mention same volume pool - could it be that you have a max number of partially full media set? (altho' having said that I'm not sure at which release of NetBackup that this setting became available)

Where can I check that as well?
Actually, I can do a positive control
by queuing some jobs from different policies when I'm back at work tomorrow.


Will keep you guys posted :)


 


Maximum jobs/policy set at Policy level

Try:

bppllist  policy_name  -U


There's an entry towards the start, for the policy itself, called "Max Jobs/Policy".

There's also a Maximum MPX value at the Schedule level.

Seem to think 'maximum partially full media' setting came out after 5.1 (not sure when exactly & can't find the 'proof' at the mo!) so probably not valid

Any luck trying that tpautoconf -report_disc command?


> Is this setting tied to per policy or it like a 'global' setting ?

With "bppllist POLICY -L" you'll see the policy limit under "Max Jobs/Policy".

Hi Andy,

just checked the 'Max Jobs/Policy' setting. It's set to
'Unlimited' for ALL my policies.
So we have ruled that out for now.

I have also manually run 2 jobs at the same time, which use different volume pools.
The 2nd job ends up being queued as well.
So it's not a matter of the SAME volume pool.

Here's the output from tpautoconf -report_disc:

======================= Missing Device (Drive) =======================
 Drive Name = HPUltrium2-SCSI2
 Drive Path = /dev/rmt/3cbn
 Inquiry = "HP      Ultrium 2-SCSI  F63Z"
 Serial Number =
 TLD(0) definition Drive = 3
 Hosts configured for this device:
  Host = SUBCTUXS1
  Host = SUBCTUXS7
  Host = subctux03

How can this device be 'missing'?
Is there something wrong with my global device database?
It seems to me that the current hardware is not in sync
with the global device database.
(We replaced a drive recently, but
it was /dev/rmt/0cbn. Then again, the device path might have
changed because we reconfigured the devices on the OS.)

root@SUBCTUXS7:bptm[2032]# tpconfig -d
Index DriveName              DrivePath                Type    Shared   Status
***** *********              **********               ****    ******   ******
  0   HPUltrium2-SCSI0       /dev/rmt/0cbn            hcart2   Yes      UP
  1   HPUltrium2-SCSI1       /dev/rmt/4cbn            hcart2   Yes      UP
        TLD(0) Definition       DRIVE=2
  2   HPUltrium2-SCSI2       /dev/rmt/3cbn            hcart2   Yes      UP
        TLD(0) Definition       DRIVE=3

Currently defined robotics are:
  TLD(0)     robotic path = /dev/sg/c0tw2100001086103eecl0,
             volume database host = SUBCTUXS7

Standalone drive volume database host = SUBCTUXS7




go ahead and reconfigure

Did you re-probe the drives after you changed it? It seems to be a stale global database issue. Removing and reconfiguring the drives will definitely solve the issue.


Srikanth is correct

A reconfigure will certainly resolve the issue & may be something you want to consider.

However, the tpautoconf -report_disc output confirms that at some point you have changed drive HPUltrium2-SCSI2. NetBackup uses drive serialisation for identification & this can lead to issues if you don't 'reconfigure' a drive after it has been changed.

Have a read of this document:

DOCUMENTATION: How to update Netbackup for a replaced tape drive without deleting and re-adding the...


It takes the tpautoconf command one step further.

Using your results above: tpautoconf -replace_drive HPUltrium2-SCSI2 -path /dev/rmt/3cbn should sort out that one.
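
If several drives show up as missing, the same command line can be built from the report text rather than by hand. This is only a sketch in Python, assuming the "Missing Device (Drive)" block layout pasted earlier in the thread; the function names are invented for illustration, and the script only prints the command, it does not run it.

```python
# The "Missing Device (Drive)" block copied from the tpautoconf -report_disc output above.
REPORT = """\
======================= Missing Device (Drive) =======================
 Drive Name = HPUltrium2-SCSI2
 Drive Path = /dev/rmt/3cbn
 Inquiry = "HP      Ultrium 2-SCSI  F63Z"
 Serial Number =
 TLD(0) definition Drive = 3
 Hosts configured for this device:
  Host = SUBCTUXS1
  Host = SUBCTUXS7
  Host = subctux03
"""


def missing_drives(report):
    """Collect one dict per 'Missing Device (Drive)' block, with its hosts in a list."""
    drives = []
    for line in report.splitlines():
        if "Missing Device (Drive)" in line:
            drives.append({"hosts": []})
            continue
        if not drives or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if key == "Host":
            drives[-1]["hosts"].append(value)
        else:
            drives[-1][key] = value
    return drives


def replace_drive_command(drive):
    """Build the command suggested in this post from a parsed block (printed, never executed)."""
    return "tpautoconf -replace_drive {} -path {}".format(
        drive["Drive Name"], drive["Drive Path"]
    )


for drive in missing_drives(REPORT):
    print(replace_drive_command(drive))
```

Note that the "Serial Number" field comes back empty for this drive, which fits the point about serialisation above.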

I'm not happy about drive 0 either - from the tpconfig command it looks different to the others - not attached to the library as far as NetBackup is concerned - so you may need to look at re-configuring (deleting/recreating) that one also.

***EDIT***
From your output you can see that drive 0 has no 'TLD(0) Definition  DRIVE=??' line beneath it, which means it is NOT attached to library TLD(0), so you missed an entry in your configuration:
  0   HPUltrium2-SCSI0       /dev/rmt/0cbn            hcart2   Yes      UP
  1   HPUltrium2-SCSI1       /dev/rmt/4cbn            hcart2   Yes      UP
        TLD(0) Definition       DRIVE=2
  2   HPUltrium2-SCSI2       /dev/rmt/3cbn            hcart2   Yes      UP
        TLD(0) Definition       DRIVE=3

Hopefully, if you reconfigure drive 0 to attach it to the library (tpconfig -update -drive 0 -robot 0 -robtype TLD -robdrnum 1 - not 100% sure about this command as it can be done with a couple of clicks in the GUI! - and that from a UNIX man!!) & use tpautoconf -replace_drive etc. for TLD drive 3 (HPUltrium2-SCSI2), you should start using all the drives in your library again!
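
The "missing line" check can be automated as well. A small Python sketch, assuming the tpconfig -d column layout shown above; the regular expressions and function names are my own guesses at that layout, not anything NetBackup provides.

```python
import re

# Drive rows copied from the tpconfig -d output above.
TPCONFIG = """\
  0   HPUltrium2-SCSI0       /dev/rmt/0cbn            hcart2   Yes      UP
  1   HPUltrium2-SCSI1       /dev/rmt/4cbn            hcart2   Yes      UP
        TLD(0) Definition       DRIVE=2
  2   HPUltrium2-SCSI2       /dev/rmt/3cbn            hcart2   Yes      UP
        TLD(0) Definition       DRIVE=3
"""

# Index, name, path, type, shared, status.
DRIVE_ROW = re.compile(r"^\s*(\d+)\s+(\S+)\s+(/dev/\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s*$")
# Indented "TLD(0) Definition DRIVE=2" line that follows a robotic drive.
ROBOT_ROW = re.compile(r"^\s*(\w+)\((\d+)\)\s+Definition\s+DRIVE=(\d+)\s*$")


def parse_drives(text):
    """Return one dict per drive; 'robot' stays None when no Definition line follows it."""
    drives = []
    for line in text.splitlines():
        m = DRIVE_ROW.match(line)
        if m:
            drives.append({"index": int(m.group(1)), "name": m.group(2),
                           "path": m.group(3), "robot": None})
            continue
        m = ROBOT_ROW.match(line)
        if m and drives:
            # (robot type, robot number, robot drive number)
            drives[-1]["robot"] = (m.group(1), int(m.group(2)), int(m.group(3)))
    return drives


def standalone(drives):
    """Names of drives that are not attached to any robot."""
    return [d["name"] for d in drives if d["robot"] is None]


print(standalone(parse_drives(TPCONFIG)))
```

For the output above this flags only HPUltrium2-SCSI0, the drive that was added back without a robot definition.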

Hi Guys,
I am fairly convinced that I should do a device reconfigure now : )

Before I do that, here are some stumbling blocks I foresee:

1) I think I am using the SSO option in NetBackup, because my L100 tape library is shared by
5 media servers.

2) One of the media servers has already been decommissioned (i.e. became scrap metal), but the previous SysAdmin did not remove it from the current configuration, i.e. I can still see that media server in the Java Admin GUI on the master server. (Yep... it's almost the prev SA's fault. hehe ...)
As a result, operations like 'Device Monitor' and 'Devices -> Hosts' in jnbSA hang for some time.
Presumably it's trying to poll the decommissioned media server.

Will 1) and 2), when put together, create some kind of complication during device reconfiguration?

Thanks in advance.

here we go on how to decommission media server


did you reconfigure the drives?