
Duplication Job Fails

SHEKAIB1
Level 1

Hello All,

NBU Master:7.6.0.2

OS:WIN 2K8 R2

I am facing a few issues with duplication jobs. Every time they are triggered, they go into a queued state and the job state says "Drives are in Use", although I have drives that are active and ready for use. Secondly, they fail with status 96 most of the time and sometimes with EC 191.

I would like advice from the experts on what steps to follow to troubleshoot this issue.

Thanks.

19 REPLIES

RamNagalla
Moderator
Partner    VIP    Certified

Hello Shekaib1,

Could you provide more details?

1) Detailed status of the duplication job.
2) How many tape libraries are configured on this master server?
3) If there are multiple libraries, do the available drives belong to the same library that the duplication job is looking for?
4) Could you show us the output of tpconfig -d and vmoprcmd -d from the media server that has the storage unit allocated for these duplication jobs?
5) Are backup jobs working fine on these drives?
6) What is the location of the source copy: tape or disk? If tape, is the tape available in the library?

Marianne
Moderator
Partner    VIP    Accredited Certified
Seems you have 3 separate issues that you need to troubleshoot separately:
1. Queued jobs with 'drives in use'
2. Status 96
3. Status 191

For the queued jobs, you need to tell us more about your config - how duplications are done (SLP, vault, manual), devices, STU config, vmoprcmd output and the other info requested by Ram.

Status 96: add more tapes that are in the robot, correct pool, correct density. Use available_media script to check.

Status 191 - copy the text in Job details and post here.

Please find the answers

1)10/10/2016 10:23:49 - requesting resource LCM_SUG_Tape_Duplicate
10/10/2016 10:23:50 - granted resource LCM_SUG_Tape_Duplicate
10/10/2016 10:23:50 - started process RUNCMD (2576)
10/10/2016 10:23:50 - ended process 0 (2576)
10/10/2016 10:23:50 - begin Duplicate
10/10/2016 10:23:51 - requesting resource SUG_Tape_Duplicate
10/10/2016 10:23:51 - requesting resource @aaaaj
10/10/2016 10:23:51 - reserving resource @aaaaj
10/10/2016 10:23:52 - Error nbjm(pid=7212) NBU status: 96, EMM status: No media is available
10/10/2016 10:23:52 - Error nbjm(pid=7212) NBU status: 96, EMM status: No media is available
10/10/2016 10:23:52 - Error nbjm(pid=7212) NBU status: 96, EMM status: No media is available
10/10/2016 10:23:53 - end Duplicate; elapsed time: 0:00:03
unable to allocate new media for backup, storage unit has none available(96)

2) 1
4) tpconfig -d
C:\Program Files\Veritas\Volmgr\bin>tpconfig -d
Id  DriveName             Type    Residence
      SCSI coordinates/Path                            Status
****************************************************************************
0   HP.ULTRIUM5-SCSI.000  hcart2  TLD(0)  DRIVE=1
      {1,0,5,0}                                        UP
1   HP.ULTRIUM5-SCSI.001  hcart2  TLD(0)  DRIVE=2
      {1,0,6,0}                                        UP
2   HP.ULTRIUM5-SCSI.002  hcart2  TLD(0)  DRIVE=3
      MISSING_PATH:{2,0,7,0}:HU1241RN9H                DOWN
3   HP.ULTRIUM5-SCSI.003  hcart2  TLD(0)  DRIVE=4
      {1,0,4,0}                                        UP
4   HP.ULTRIUM5-SCSI.004  hcart2  TLD(0)  DRIVE=3
      {1,0,3,0}                                        UP

Currently defined robotics are:
TLD(0) SCSI coordinates = {1,0,5,1}

EMM Server = hg1-nbmas01.hug.hardygroup.co.uk

vmoprcmd -d

C:\Program Files\Veritas\Volmgr\bin>vmoprcmd -d

PENDING REQUESTS

<NONE>

DRIVE STATUS

Drv Type    Control   User  Label  RecMID  ExtMID  Ready  Wr.Enbl.  ReqId
  0 hcart2  TLD       -                            No     -         0
  1 hcart2  TLD       -                            No     -         0
  2 hcart2  DOWN-TLD  -                            No     -         0
  3 hcart2  TLD       -                            No     -         0
  4 hcart2  TLD       -                            No     -         0

ADDITIONAL DRIVE STATUS

Drv DriveName             Shared  Assigned  Comment
  0 HP.ULTRIUM5-SCSI.000  Yes     -
  1 HP.ULTRIUM5-SCSI.001  Yes     -
  2 HP.ULTRIUM5-SCSI.002  Yes     -
  3 HP.ULTRIUM5-SCSI.003  Yes     -
  4 HP.ULTRIUM5-SCSI.004  Yes     -

C:\Program Files\Veritas\Volmgr\bin>

5) Yes, backups are completing.

6) Source is tape and duplication goes onto tapes only. Currently 4 tapes are showing active in the media database.

We have SLPs defined for daily, weekly and monthly retentions. It seems only the daily SLP completes; the rest fail with EC 96.
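
When comparing the daily, weekly and monthly SLP failures, it can help to tally the status lines from each job's details rather than reading them by eye. Below is a minimal Python sketch, assuming the Job Details text has been pasted into a string and that error lines follow the "NBU status: N, EMM status: text" pattern seen in the log above. The function name tally_statuses is only illustrative and is not part of NetBackup.

```python
import re
from collections import Counter

# Matches lines such as:
# "... - Error nbjm(pid=7212) NBU status: 96, EMM status: No media is available"
STATUS_RE = re.compile(r"NBU status: (\d+), EMM status: (.+)$")


def tally_statuses(job_details: str) -> Counter:
    """Count (NBU status, EMM status text) pairs found in pasted Job Details."""
    counts = Counter()
    for line in job_details.splitlines():
        match = STATUS_RE.search(line.strip())
        if match:
            counts[(int(match.group(1)), match.group(2).strip())] += 1
    return counts


# Excerpt taken from the job details posted in this thread.
sample = """\
10/10/2016 10:23:51 - reserving resource @aaaaj
10/10/2016 10:23:52 - Error nbjm(pid=7212) NBU status: 96, EMM status: No media is available
10/10/2016 10:23:52 - Error nbjm(pid=7212) NBU status: 96, EMM status: No media is available
10/10/2016 10:23:53 - end Duplicate; elapsed time: 0:00:03
"""

if __name__ == "__main__":
    for (code, text), n in tally_statuses(sample).items():
        print(f"status {code} ({text}): {n} occurrence(s)")
```

Running the same function over the details of a status 191 job would show whether the underlying EMM message differs from the status 96 case.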

SUG_Tape_Duplicate  <== Is this your Storage Unit Group for tape storage?

Can you check if it includes storage units for robot 0 & hcart2 density type? 

If it is, make sure the robot has available media (do an inventory first), make sure no other jobs are using the other 4 UP drives, run a small duplication job to see if it picks up the media. 

Marianne
Moderator
Partner    VIP    Accredited Certified
You simply do not have enough tapes in the robot.
You need enough tapes for each retention level as NBU does not mix retentions on tape.

Please show us output of available_media as per my previous post.
Command is in C:\Program Files\Veritas\Netbackup\bin\goodies
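
As a rough illustration of the point about retention levels: because each tape holds only one retention level, every distinct combination of volume pool and retention needs at least one writable tape of its own. The Python sketch below counts those combinations. The pool name "TapePool" and the list of planned copies are hypothetical, and the result is only a lower bound, since it ignores tape capacity and any partially full tapes that could be appended to.

```python
from collections import Counter


def tape_groups(copies):
    """Group planned tape copies by (volume pool, retention level)."""
    return Counter((pool, retention) for pool, retention in copies)


def min_tapes_needed(copies):
    """Lower bound: one writable tape per distinct (pool, retention) pair."""
    return len(tape_groups(copies))


# Hypothetical workload: the pool name and job mix are made up for illustration.
planned = [
    ("TapePool", "daily"),
    ("TapePool", "daily"),
    ("TapePool", "weekly"),
    ("TapePool", "monthly"),
]

if __name__ == "__main__":
    for (pool, retention), jobs in tape_groups(planned).items():
        print(f"{pool} / {retention}: {jobs} planned copy job(s)")
    print("minimum writable tapes needed:", min_tapes_needed(planned))
```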

Attached is the available media output and an SLP screenshot for reference.

Marianne
Moderator
Partner    VIP    Accredited Certified

You forgot about available_media output....

Please copy the output to media.txt and upload.

PS: 
You show a screenshot of Monthly SLP with Infinity retention.
Do you realize that tapes in this pool with Infinity retention will never expire? i.e. can never be overwritten?
Only tapes in this pool with same retention level can be appended to.

 

Please find the list of media.

Marianne
Moderator
Partner    VIP    Accredited Certified
Oh dear!!! What is the reason for all those DBBACKUP tapes??? I have not seen this status since pre-NBU 6.x days. This status was a sign of serious catalog inconsistencies.
Best to download and run NBCC and submit the output to Veritas Support for analysis.

You simply do not have enough AVAILABLE tapes in the robot. And no Scratch pool?

Marianne,

I do not know the reason for the DBBACKUP tapes. I was given this environment a few days back and I do not know where to start. It seems the root of the problem is the tapes. I have also seen in the past that we had scratch tapes in the library but they were not being mounted on the drives. I ran robtest to move media and the moves worked fine.

I will log a case with Support to get this checked and also get more tapes loaded to see if it makes a difference. I would also appreciate advice from the experts on how to proceed in this scenario.

Yes, we have a scratch pool, but with zero tapes.

Also, we have around 30 tapes showing in the media database.

C:\Program Files\Veritas\NetBackup\bin\admincmd>bpmedialist -m K08274
requested media id is not assigned to this host in the EMM database

C:\Program Files\Veritas\NetBackup\bin\admincmd>nbemmcmd -listmedia -mediaid K08274
NBEMMCMD, Version: 7.6.0.2
The function returned the following failure status:
volume does not exist in database (35)
Command did not complete successfully.

 

Can I put this media in the library?

Marianne
Moderator
Partner    VIP    Accredited Certified

You made a typo error - media id is KO8274 not K08274.

Please try nbemmcmd command again with correct media id.

Run the nbemmcmd command as well for a tape with DBBACKUP status - e.g. KK7204

I am curious to know what is going on here.
The DBBACKUP status happened in pre-6.x days when each media server had its own mediaDB database and volDB was on the master. When mediaDB on a media server was lost, there was this inconsistency in the DBs, with tapes showing Assigned in volDB without a matching mediaDB entry.

This should not happen in versions from 6.0 onwards as all these dbs were migrated into the EMM database on the master server.

Can you run these commands and upload output of each?

bpmedialist

vmquery -a -bx

available_media is a combination of above 2 commands, so I am curious to see what they produce separately.

Best to get that NBCC running and submit to Support as a matter of urgency.

In the meantime, eject all the DBBACKUP status tapes that are currently in the robot and replace with AVAILABLE tapes. 
You need to get duplications going ASAP to prevent huge SLP backlog.

And please start with a daily process to look for unassigned tapes and put them into Scratch pool.
If unassigned tapes are moved to Scratch pool, they will from there onwards go back to scratch when images on tapes expire. 

 

Please find the list of media and the vmquery output as requested.

 

C:\Program Files\Veritas\NetBackup\bin\admincmd>nbemmcmd -listmedia -mediaid KK7204
NBEMMCMD, Version: 7.6.0.2
====================================================================
Media GUID: 5c31dc57-ae72-4597-8f36-221592b035f1
Media ID: KK7204
Partner: -
Media Type: HCART2
Volume Group: 000_00000_TLD
Application: Netbackup
Media Flags: 1
Description: Added by Media Manager
Barcode: KK7204L4
Partner Barcode: --------
Last Write Host: hg1-nbmed01.hug.hardygroup.co.uk
Created: 05/09/2016 15:24
Time Assigned: 29/09/2016 05:44
First Mount: 29/09/2016 08:22
Last Mount: 29/09/2016 08:22
Volume Expiration: -
Data Expiration: 13/10/2016 02:04
Last Written: 29/09/2016 13:32
Last Read: -
Robot Type: TLD
Robot Control Host: hg1-nbmed01.hug.hardygroup.co.uk
Robot Number: 0
Slot: 60
Side/Face: -
Cleanings Remaining: -
Number of Mounts: 1
Maximum Mounts Allowed: 0
Media Status: FULL
Kilobytes: 1026132906
Images: 7
Valid Images: 7
Retention Period: 1
Number of Restores: 0
Optical Header Size Bytes: 1024
Optical Sector Size Bytes: 0
Optical Partition Size Bytes: 0
Last Header Offset: 1099087
Adamm Guid: 00000000-0000-0000-0000-000000000000
Rsm Guid: 00000000-0000-0000-0000-000000000000
Origin Host: NONE
Master Host: hg1-nbmas01.hug.hardygroup.co.uk
Server Group: UNRESTRICTED_SHARING_GROUP
Upgrade Conflicts Flag:
Pool Number: 4
Volume Pool: ENCR_NetBackup
Previous Pool Name: -
Vault Flags: -
Vault Container: -
Vault Name: -
Vault Slot: -
Session ID: -
Date Vaulted: -
Return Date: -
Media on Hold: 0
====================================================================
Command completed successfully.

Marianne
Moderator
Partner    VIP    Accredited Certified
Seems the ENCRYPT status is confusing available_media command.
We can only rely on bpmedialist and vmquery commands.
vmquery shows that there are hardly any unassigned tapes.

With Infinity retention, your company will need to purchase tapes as 'consumables'.

You really do not have enough Available tapes. And certainly nothing in the robot.

Add barcode rules to add all new tapes to Scratch and move all unassigned tapes to Scratch.

Discuss retention requirements with your management to see if there is a real need for Infinity retention.

Hello Marianne,

I have checked the media DB and found a few tapes that were active, with no time assigned, and not in the robot. I have put those tapes in the library; they show as active now and are getting assigned.

Currently we have 14 duplication jobs running and 3 are active. One drive is being used for backup. Is there any way to speed up these duplication jobs?

Also, one of the drives is showing a missing path. Do I have to delete the drive and then re-add it?

Marianne
Moderator
Partner    VIP    Accredited Certified
"Currently we have 14 duplication jobs running and 3 are active."

Does this mean that the rest (11 jobs) are queued?
If so, what is listed in Job Details as the reason?

You may want to look for the SLP Best Practice Guide to read up about tuning of SLP Parameters and see if it helps with speeding up duplications.

About the MISSING path drive - you will need to check at OS-level first to ensure drive is seen.
Delete and re-add will work if drive is visible at OS-level, but Device Manager services need to be restarted. You cannot do this with running jobs.
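
To spot drives like this one quickly, the tpconfig -d output can be scanned for any drive whose status is not UP or whose path is flagged MISSING_PATH. Below is a minimal Python sketch, assuming the two-lines-per-drive layout shown earlier in this thread (a drive line followed by a single path/status line). Drives with several paths would need extra handling, and the function names are illustrative only.

```python
import re

# Drive line: "<id> <drive name> <type> <residence...>"
DRIVE_RE = re.compile(r"^(\d+)\s+(\S+)\s+(\S+)\s+(.+)$")


def parse_drives(tpconfig_output):
    """Parse drive entries; assumes one path/status line follows each drive line."""
    drives = []
    pending = None
    for raw in tpconfig_output.splitlines():
        line = raw.strip()
        if not line:
            continue
        if pending is None:
            match = DRIVE_RE.match(line)
            if match:
                pending = {
                    "id": int(match.group(1)),
                    "name": match.group(2),
                    "type": match.group(3),
                    "residence": match.group(4).strip(),
                }
        else:
            # The last whitespace-separated token is the status (UP/DOWN).
            path, _, status = line.rpartition(" ")
            pending["path"] = path.strip()
            pending["status"] = status
            drives.append(pending)
            pending = None
    return drives


def problem_drives(drives):
    """Drives that are not UP or whose path is reported as missing."""
    return [
        d for d in drives
        if d["status"] != "UP" or d["path"].startswith("MISSING_PATH")
    ]


# Excerpt of the tpconfig -d output posted in this thread.
SAMPLE = """\
Id DriveName Type Residence
SCSI coordinates/Path Status
****************************************************************************
1 HP.ULTRIUM5-SCSI.001 hcart2 TLD(0) DRIVE=2
{1,0,6,0} UP
2 HP.ULTRIUM5-SCSI.002 hcart2 TLD(0) DRIVE=3
MISSING_PATH:{2,0,7,0}:HU1241RN9H DOWN
3 HP.ULTRIUM5-SCSI.003 hcart2 TLD(0) DRIVE=4
{1,0,4,0} UP
"""

if __name__ == "__main__":
    for drive in problem_drives(parse_drives(SAMPLE)):
        print(drive["id"], drive["name"], drive["status"], drive["path"])
```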

Yes, the queued jobs had status "Drives are in Use".

But now, due to the shortage of tapes, management has decided to inactivate a few SLPs, since we have SLPs for daily, weekly and monthly backups.

So I am now trying to disable the monthly one until we get new tapes. But when I try to disable it, I get:

 

C:\Program Files\Veritas\NetBackup\bin\admincmd>nbstlutil inactive -lifecycle SUG_Dedupe_SUG_Tape_Monthly
nbstlutil: unknown operation '-lifecycle'
nbstlutil: unknown operation 'SUG_Dedupe_SUG_Tape_Monthly'

 

Is there something else I need to add or edit in the above command?

Marianne
Moderator
Partner    VIP    Accredited Certified

The command looks fine - just be very sure that you use a  hyphen (-) and not a dash (–) 

When we copy from a doc, it is sometimes a dash (–lifecycle) instead of hyphen (-lifecycle)

Also remember to open cmd with 'Run as Administrator'.
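
If it is hard to tell by eye whether a pasted command contains a real hyphen, a few lines of Python can check. This is only a sketch: the set of look-alike characters below is my own selection of common ones, not an exhaustive list, and whether a stray dash is the actual cause of the nbstlutil error above is an assumption until tested.

```python
import unicodedata

# Characters that look like "-" but are not the ASCII hyphen-minus (U+002D).
LOOKALIKE_DASHES = {
    "\u2010",  # hyphen
    "\u2011",  # non-breaking hyphen
    "\u2012",  # figure dash
    "\u2013",  # en dash
    "\u2014",  # em dash
    "\u2212",  # minus sign
}


def find_lookalike_dashes(command):
    """Return (index, character, Unicode name) for each suspicious dash."""
    return [
        (i, ch, unicodedata.name(ch))
        for i, ch in enumerate(command)
        if ch in LOOKALIKE_DASHES
    ]


def normalize_dashes(command):
    """Replace look-alike dashes with the ASCII hyphen-minus."""
    return "".join("-" if ch in LOOKALIKE_DASHES else ch for ch in command)


# The command from this thread, as it might look after copying from a document.
pasted = "nbstlutil inactive \u2013lifecycle SUG_Dedupe_SUG_Tape_Monthly"

if __name__ == "__main__":
    for index, ch, name in find_lookalike_dashes(pasted):
        print(f"position {index}: U+{ord(ch):04X} {name}")
    print(normalize_dashes(pasted))
```

If nothing is reported, the dash is not the problem and the command should be retyped by hand in an elevated prompt to rule out other hidden characters.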