DSSU full (129), but relocation jobs find nothing to be done

Thorsten_Jens
Level 4

Hi,

we have several Basic Disk DSSUs connected to a media server (all systems Windows 2008 R2, NBU 7.5.0.1). Everything worked fine for years, but now we are suddenly getting error 129 on 3/4 of the DSSUs. The disks are indeed full of NBU images. The relocation jobs run periodically without error, but find no images to relocate. Manual relocation also relocates nothing. All DSSUs are on separate physical disks.

bpimmedia shows that NBU knows that there are images on the disks.

Any ideas what to check?

11 REPLIES

Marianne
Level 6
Partner    VIP    Accredited Certified

NBU 7.5 up to 7.5.0.2 had various bugs, and Symantec strongly recommends being on 7.5.0.3 or later.
NBU 7.5.0.6 is the latest. There is no guarantee that it will fix your issue, but it is worth a try.

What is the HWM and LWM on the DSUs?
Do you have admin, bpdm and bptm log folders on your media server?

bpdm log should tell us if disk cleanup is attempted.
admin log should give us info about duplication/relocation job.

The assumption is that all existing images on disk have been duplicated and therefore nothing left to do?

These TechNotes explain how it should work:

Disk Staging Relocation Behavior: http://www.symantec.com/docs/TECH44719

Disk Staging Cleanup Behavior: http://www.symantec.com/docs/TECH66149

Thorsten_Jens
Level 4
Hi Marianne,

well, an ad-hoc upgrade is not really an option right now. Also, this setup has been running smoothly since 2012. HWM and LWM are at their defaults. I shut down all services and created the required log folders. I will get back to you with the results.

> The assumption is that all existing images on disk have been duplicated and therefore nothing left to do?

I don't really know what to assume right now: the parent relocation jobs run every couple of hours and exit with status 0, but never start any sub-jobs for the images. I guess some part of NBU thinks there are no images on the disks to duplicate, while another part thinks the existing images cannot be deleted from disk.

Mark_Solutions
Level 6
Partner Accredited Certified

How long is it since anything was duplicated? Run a verify query in the Catalog section against copy 2 to see what the last duplicated images were.

Have you had a change in antivirus or similar software that could be locking things?

Having said that, when I have seen such issues they were due to a locked process. So if you get the chance, reboot the Master and Media Servers and then see if it starts working again.

Thorsten_Jens
Level 4

> How long since anything was duplicated? - run a verify query on the catalog section against copy2 to see what the last ones were.

Hi,

I am not really sure what to make of this information, since only one media server, and only 3/4 of the DSSUs on that server, have the problem. I don't know how to filter for that in the Catalog Query.

In the GUI, I ran a Catalog Query looking for primary copies on disk and noted one image whose primary copy (copy 1) was on one of the affected disks. Result of bpimagelist for that backup ID:

c:\Program Files\Veritas\NetBackup\bin\admincmd>bpimagelist -backupid BWS0119_1397841569
IMAGE BWS0119 0 0 9 BWS0119_1397841569 HH_VMware 40 *NULL* root Cumulative-Inc 4 3 1397841569 156 1400519969 0 0 1553178 1187 1 2 0 HH_VMware_1397841569_INCR.f *NULL* *NULL* 0 2 0
0 0 *NULL* 0 0 1 0 0 1397286560 1397286560 *NULL* 0 0 0 *NULL* 589438 1 0 153140 0 0 *NULL* *NULL* 0 1397840687 0 0 *NULL* *NULL* 0 0 0 0
HISTO 0 0 0 0 0 0 0 0 0 0
FRAG 2 -1 150 0 2 6 51 0624L5 bws0112.naval.dom 65536 1347867 1397669233 2 0 *NULL* 1400519969 0 65539 0 0 0 1 0 1397843389 1 1 *NULL* *NULL* 0 0
FRAG 2 1 1553029 0 2 6 50 0624L5 bws0112.naval.dom 65536 1323598 1397669233 2 0 *NULL* 1400519969 0 65539 0 0 0 1 0 1397843389 1 1 *NULL* *NULL* 0 0

Doesn't that tell me that the only copy is copy2, on medium 0624L5 (a tape)?
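
A rough way to check that reading without counting columns by hand is to pull the copy number and media ID out of each FRAG line. The sketch below is only an illustration: the function name is made up, and the field positions (copy number as the first value after FRAG, media ID as the eighth) are an assumption taken from the interpretation in this post, not from a documented format. It also assumes the media ID contains no spaces, which would not hold for a disk path with spaces in it.

```python
from collections import defaultdict

def copies_from_bpimagelist(output: str) -> dict[int, set[str]]:
    """Map copy number -> media IDs found in the FRAG lines of bpimagelist output.

    Assumed layout (not verified against documentation):
    fields[1] is the copy number, fields[8] is the media ID or disk path.
    """
    copies = defaultdict(set)
    for line in output.splitlines():
        fields = line.split()
        if len(fields) > 8 and fields[0] == "FRAG":
            copies[int(fields[1])].add(fields[8])
    return dict(copies)

# The two FRAG lines from the output above.
sample = """\
FRAG 2 -1 150 0 2 6 51 0624L5 bws0112.naval.dom 65536 1347867 1397669233 2 0 *NULL* 1400519969 0 65539 0 0 0 1 0 1397843389 1 1 *NULL* *NULL* 0 0
FRAG 2 1 1553029 0 2 6 50 0624L5 bws0112.naval.dom 65536 1323598 1397669233 2 0 *NULL* 1400519969 0 65539 0 0 0 1 0 1397843389 1 1 *NULL* *NULL* 0 0
"""

print(copies_from_bpimagelist(sample))
```

Under those assumptions the sample yields only copy 2 on 0624L5 and no entry for copy 1, which matches the reading that the disk copy is no longer in the catalog for this image.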

I rebooted both master and media server, still the same behaviour.

I am not aware of any changes in our environment - and if so, I think they would affect all DSSUs and not just a couple.

Mark_Solutions
Level 6
Partner Accredited Certified

These are for the 12th and 18th of April. When was the last time a backup to this storage unit worked?

Run the verify for a recent period against copy 1, then change to copy 2 and see whether there is the same number of tape copies. If so, everything has been duplicated to tape. If not, there is an issue.
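
For anyone wondering where those dates come from: the long numbers in the bpimagelist output look like Unix epoch seconds. A minimal sketch for converting them follows. The helper name is made up, and the labels are assumptions (the backup time is the suffix of the backup ID; the expiration label is a guess based on where the value sits in the record).

```python
from datetime import datetime, timezone

def nbu_time(epoch_seconds: int) -> str:
    """Render an epoch-seconds value from bpimagelist output as UTC text."""
    stamp = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return stamp.strftime("%Y-%m-%d %H:%M:%S UTC")

values = [
    ("other timestamp in the IMAGE record", 1397286560),
    ("backup time (suffix of the backup ID)", 1397841569),
    ("expiration (assumed meaning)", 1400519969),
]

for label, value in values:
    print(f"{value}  {nbu_time(value)}  {label}")
```

The first two values fall on 12 April and 18 April 2014 in UTC, which is where the dates above come from.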

Thorsten_Jens
Level 4

OK, I don't yet know why, but suddenly the affected DSSUs have been cleaned out (no more images on them at all). I once again stopped all processes on the media server in that timeframe, so maybe it was a hung process after all. The Catalog Query for primary copies on the affected disks no longer shows any results, either.

I'll update you on Monday, after the big backup jobs this weekend.

Thank you both so far.

Marianne
Level 6
Partner    VIP    Accredited Certified

Maybe when you restarted NBU after creating log folders it kicked in outstanding disk cleanups.

But I would not expect the disk to be 'cleaned out'! Just down to the LWM.

What is logged in bpdm log?

Please copy to bpdm.txt and upload as File attachment.
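
To illustrate the point about cleanup stopping at the LWM, here is a toy model of water-mark cleanup. It is not NetBackup's actual algorithm. The function name, the tuple layout, the oldest-first ordering and the percentages are all assumptions chosen for the example.

```python
def cleanup_to_lwm(capacity_kb, images, hwm_pct, lwm_pct):
    """Return the backup IDs a water-mark cleanup would delete.

    images: list of (backup_id, size_kb, relocated) tuples, oldest first.
    Cleanup starts only once usage reaches the high water mark, removes
    only images that were already relocated, and stops as soon as usage
    is at or below the low water mark.
    """
    used = sum(size for _, size, _ in images)
    if used < capacity_kb * hwm_pct / 100:
        return []                      # HWM not reached, nothing to do
    target = capacity_kb * lwm_pct / 100
    deleted = []
    for backup_id, size, relocated in images:
        if used <= target:
            break                      # reached the LWM, stop deleting
        if relocated:
            deleted.append(backup_id)
            used -= size
    return deleted

# Hypothetical staging disk: 1000 KB capacity, 990 KB in use.
images = [("a", 300, True), ("b", 300, True), ("c", 200, False), ("d", 190, True)]
print(cleanup_to_lwm(1000, images, hwm_pct=98, lwm_pct=80))
```

In this model only the oldest relocated image is removed, because that is already enough to drop below the low water mark. A disk that ends up completely empty suggests something other than ordinary water-mark cleanup was at work.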

Mark_Solutions
Level 6
Partner Accredited Certified

I have seen that many times. Once they start to go, only a reboot will stop them. It is a known bug, so you do need to upgrade as soon as possible. I will see if I can dig out the tech note.

Mark_Solutions
Level 6
Partner Accredited Certified

I cannot find the note, but I found my support thread. It goes back a while and was for an earlier version of NBU.

It is still worth upgrading anyway, but I have the feeling that the problem was caused by a lack of performance on the Master server: either memory or pagepool memory, just overloaded.

An NBU upgrade (to a version earlier than yours!) and a relocation of the Master to a new server prevented the issue from occurring again. We also set up all of the antivirus exclusions for NBU at the time.

Too many changes in one go, I know, but based on your version maybe look at performance and antivirus first.

Hope this helps

ontherocks
Level 6
Partner Accredited Certified

Try this:

https://www-secure.symantec.com/connect/forums/old-backups-not-being-deletedexpired-policy#comment-1...

The main function of nbdelete is to remove expired fragments from disk units. It can also be used to purge those image fragments from the database when the cleanup fails for one reason or another.

To force a cleanup of all disk pools and remove the references in the database, run
<install path>\veritas\netbackup\bin\admincmd\nbdelete -allvolumes -force

The command sounds brutal, but it is considered safe because it works only on expired images.

There are also switches to make it run only on designated storage units.

Run nbdelete -h to see all other switches.
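
If the cleanup is to be scripted, a small wrapper that prints the exact command before running it can help avoid surprises. This is only a sketch: the function names are made up, the install path in the example is an assumption based on the admincmd path shown earlier in this thread, and only the two switches quoted above are used. Nothing is executed unless `execute=True` is passed.

```python
import subprocess
from pathlib import PureWindowsPath

def nbdelete_command(install_path: str) -> list[str]:
    """Build the forced-cleanup command line from this post as an argument list."""
    exe = PureWindowsPath(install_path) / "NetBackup" / "bin" / "admincmd" / "nbdelete"
    return [str(exe), "-allvolumes", "-force"]

def run_nbdelete(install_path: str, execute: bool = False) -> list[str]:
    """Print the command; run it only when execute=True (dry run by default)."""
    command = nbdelete_command(install_path)
    print("Command:", " ".join(command))
    if execute:
        subprocess.run(command, check=True)  # raises if nbdelete exits non-zero
    return command

# Dry run with an assumed install location; adjust to the real <install path>.
run_nbdelete(r"C:\Program Files\Veritas")
```

Keeping the dry run as the default means the command can be reviewed on the master server before anything is deleted.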

 

Thorsten_Jens
Level 4

Everything seems back to normal now. Backups are stored on the DSSUs, relocation jobs run without problems.

I did issue the "nbdelete -allvolumes -force" command while poking around, so maybe that was the reason the disks were cleaned out completely.

An upgrade to 7.6 is in the planning stages.

Thank you all.