
Catalog Backup Issue & Vault Job Failing Nightly

sparkyssb
Level 3

I just took over my company's tape backup system, so I'm new to NetBackup. Last week a catalog backup hung for days. I think this kept my vault job from running, so tapes weren't being ejected nightly like I'm used to. The catalog job wouldn't cancel, so I had to stop and restart all NetBackup services a few times before it cleared. Since then the catalog backup is running better, but the vault job is still failing, although a tape or two does get ejected nightly now. I don't know whether these two issues are separate or related (remember, I'm new). Here is where things stand now:

The catalog backup child job succeeds with "the requested operation was successfully completed(0)", but the parent job reports "the requested operation was partially successful(1).  The job was successfully completed, but some files may have been busy or unaccessible. See the problems report or the client's logs for more details."

Looking at the problems log, this is what I see for the latest Catalog backup:

3/18/2015 11:15:08 AM wwr-nbu   Error 0 Retrieve <16> vltrun@DoCatalogBu^4807 Catalog Backup step failed!
3/18/2015 11:15:08 AM wwr-nbu   Error 0 Retrieve <16> vltrun@DoCatalogBu^4807 FAILed NB_EC=1 NB_MSG=the requested operation was partially successful
3/18/2015 11:15:08 AM wwr-nbu   Error 0 Retrieve <16> vltrun@DoCatalogBu^4807: Leaving with DMN=1 SC=1
3/18/2015 11:15:08 AM wwr-nbu   Error 0 Retrieve <16> vltrun@VltSession::lock_and_operate^4807 OP_STEP=catalog_backup FAILED
3/18/2015 11:15:08 AM wwr-nbu   Error 0 Retrieve <16> vltrun@VltSession::lock_and_operate^4807 FAILed NB_EC=294 NB_MSG=vault catalog backup failed
3/18/2015 11:15:08 AM wwr-nbu   Error 0 Retrieve <16> vltrun@VltSession::lock_and_operate^4807: Leaving with DMN=1 SC=294
3/18/2015 11:15:07 AM wwr-nbu wwr-nbu Error 136809 Backup catalog backup exited with status 1 (the requested operation was partially successful)

I'm guessing I need to clear some file locks, but I'm not sure where.

The Vault job fails with "vault catalog backup failed(294)". However, it does seem to eject some tapes with valid images on them from time to time. Here's the latest Vault failure:

3/18/2015 11:25:56 AM wwr-nbu   Error 0 Retrieve <16> vltrun@main Vault Session FAILED [PRFL=Vault_981 SID=4807 JID=136772 EC=294]
3/18/2015 11:25:56 AM wwr-nbu   Error 0 Retrieve <16> vltrun@main FAILed NB_EC=294 NB_MSG=vault catalog backup failed
3/18/2015 11:25:56 AM wwr-nbu wwr-nbu Error 136772 Backup vault exited with status 294 (vault catalog backup failed)
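
Sorted by time, the excerpts above show the catalog backup exiting with status 1 first and the vault step then reporting 294, which suggests the vault failure follows from the catalog backup result rather than being a separate fault. Here is a minimal Python sketch that pulls the status codes out of pasted Problems report text to make that order visible. It assumes the report is plain text in the same layout as the lines above; the function and pattern names are hypothetical and not part of NetBackup.

```python
import re
from datetime import datetime

# A few lines copied from the Problems report excerpts above.
LOG = """\
3/18/2015 11:15:08 AM wwr-nbu   Error 0 Retrieve <16> vltrun@DoCatalogBu^4807 FAILed NB_EC=1 NB_MSG=the requested operation was partially successful
3/18/2015 11:15:08 AM wwr-nbu   Error 0 Retrieve <16> vltrun@VltSession::lock_and_operate^4807 FAILed NB_EC=294 NB_MSG=vault catalog backup failed
3/18/2015 11:15:07 AM wwr-nbu wwr-nbu Error 136809 Backup catalog backup exited with status 1 (the requested operation was partially successful)
3/18/2015 11:25:56 AM wwr-nbu wwr-nbu Error 136772 Backup vault exited with status 294 (vault catalog backup failed)
"""

# Assumed layout: timestamp at the start of the line, status code in one of
# the two forms seen in the excerpts.
TIMESTAMP = re.compile(r"^(\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} [AP]M)")
NB_EC = re.compile(r"NB_EC=(\d+) NB_MSG=(.+)$")
EXIT = re.compile(r"exited with status (\d+) \((.+)\)$")


def status_events(text):
    """Return (time, status code, message) tuples, oldest first."""
    events = []
    for line in text.splitlines():
        ts = TIMESTAMP.match(line)
        hit = NB_EC.search(line) or EXIT.search(line)
        if not (ts and hit):
            continue  # skip lines that carry no status code
        when = datetime.strptime(ts.group(1), "%m/%d/%Y %I:%M:%S %p")
        events.append((when, int(hit.group(1)), hit.group(2).strip()))
    return sorted(events)


for when, code, message in status_events(LOG):
    print(when.strftime("%H:%M:%S"), code, message)
```

Pasting the full Problems report into `LOG` in place of the sample lines lists every status code in time order, which makes it easier to see which step fails first.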

Attached are some screenshots of the jobs. Our support contract just ran out and my company doesn't want to renew, so any direction or help you can provide would be appreciated. Thank you.

2 REPLIES

SymTerry
Level 6
Employee Accredited

This error was associated with a known bug in 7.1 (TECH170367). What version are you running?

If you are on a version other than 7.1, the troubleshooting in the tech note still applies: check the logs mentioned in the tech note for the same kind of errors and try the workaround.

sparkyssb
Level 3

Thanks for the reply. I'm running version 7.5.0.7. I checked my bpdbm log and could not find any reference to a bad image header. I also ran bpdbm -consistency and searched the output, but found no images with a bad header or corruption. Nothing is exiting with "file read failed, status 13" either, so I don't think this is the problem.

I attached the full bpdbm and vault log from yesterday to this post.  Perhaps this could help?  If any other logs would help, please let me know.