08-23-2012 04:04 AM
Netbackup 6.5.6 with a DSU staging area has developed a problem. Instead of leaving the image on the DSU once it has been copied to tape, Image Cleanup is running and removing the DSU image straight away. The DSU is now empty of what should be the primary copy for restore. The High water mark is at 95% low at 25%. I cannot see any way of determining why the clean up is being triggered any help on returning this system to the way it should be/and was working, would be appreciated.
Solved! Go to Solution.
08-24-2012 02:22 AM
It seems that an nbdelete ran to remove all the images on the DSU, once cleared the images have started to be retained as expected once the image was copied to tape. The multiple image cleanup instances must all have been initiated from the single nbdelete. The problem is resolved therefore.
08-23-2012 04:16 AM
If the image has been copied, and new backups run that cause the high water mark to be reached, then NBU will delete any image that has been copies, to free up disk space.
As you have not mentioned how full the disk is, I cannot say that this isn't the cause.
EMM / nbstserv log would be the place to look from memory
Martin
08-23-2012 04:36 AM
bpdm and admin logs as well...
PLEASE schedule upgrade ASAP - NBU 6.x will be EOSL on 3 Oct 2012...
08-23-2012 04:51 AM
The DSU is 6.5TB the nightly backup is 400Gb normaly a few weeks of data is stored, currently zero.
08-23-2012 05:09 AM
Hi Marianne,
The change request to upgrade has been in for some time, the approval is all that is required!
The bpdm log extract below shows everything working, the disk volumes are correct and the space left after the copy completes is correct, but the cleanup should not be deleteing the primary image if sufficent space exists on the DSU.
08:31:45.819 [9588.4268] <2> volume_cleanup_init: Initiating volume cleanup subtype = 1, pathname = F:\, media_server = yyyyyyyy, storage_server = yyyyyyyy
08:31:45.819 [9588.4268] <2> volume_cleanup_init: using wait values: wait_sleep = 0, wait_initial = 0, wait_limit = 0
08:31:45.819 [9588.4268] <2> bp_sts_get_lsu: entry F:\
08:31:45.835 [9588.4268] <2> volume_cleanup: getting initial capacity and free space of the volume...
08:31:45.850 [9588.4268] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2048: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
08:31:45.850 [9588.4268] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2062: service: bpdbm
08:31:46.022 [9588.4268] <2> logconnections: BPDBM CONNECT FROM 172.16.80.59.56253 TO 172.16.80.115.13724
08:31:46.552 [9588.4268] <4> volume_cleanup: initial volume F:\: Kbytes total capacity: 8716915648, used space: 126144, free space: 8716789504
08:31:46.693 [9588.4268] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2048: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
08:31:46.693 [9588.4268] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2062: service: bpdbm
08:31:46.895 [9588.4268] <2> logconnections: BPDBM CONNECT FROM 172.16.80.59.56318 TO 172.16.80.115.13724
08:31:47.441 [9588.4268] <4> volume_cleanup: ending volume F:\: Kbytes total capacity: 8716915648, used space: 126144, free space: 8716789504
08:31:47.441 [9588.4268] <2> dsm_get_storageserver_type: .\dsmlib_diskgroup.cpp: 2218 sev 2 disk_subtype = 1
08:31:47.441 [9588.4268] <2> dsm_get_storageserver_type: .\dsmlib_diskgroup.cpp: 2219 sev 2 sts_type_len (caller allocated mem-size) 65
08:31:47.441 [9588.4268] <2> dsm_get_storageserver_type: .\dsmlib_diskgroup.cpp: 2227 sev 2 DiskServiceHelperidentifyStuServerType() = BasicDisk
08:31:47.441 [9588.4268] <2> dsmlibp_emmserver: .\dsmlib_common.cpp: 128 sev 4 EMM server set to zzzzzzz
08:31:47.441 [9588.4268] <2> dsmlibp_connect_dsm: .\dsmlib_common.cpp: 177 sev 4 Connecting to DiskServiceMgr on zzzzzzz
08:31:47.488 [9588.4268] <2> dsml_extract_diskvolumeinfo: .\dsmlib_popext.cpp: 574 sev 2 property.name=DSM::diskGroupName, property.value=yyyyyyyy_DSU
08:31:47.488 [9588.4268] <2> dsm_get_diskvolumeinfo_by_volumeid: .\dsmlib_diskgroup.cpp: 1391 sev 2 diskvolumeinfo.diskgroup.name = yyyyyyyy_DSU
08:31:47.488 [9588.4268] <2> dsm_get_diskvolumeinfo_by_volumeid: .\dsmlib_diskgroup.cpp: 1403 sev 2 diskvolumeinfo.diskgroup.master_server = zzzzzzz
08:31:47.488 [9588.4268] <2> dsm_get_diskvolumeinfo_by_volumeid: .\dsmlib_diskgroup.cpp: 1414 sev 2 diskvolumeinfo.diskgroup.server_type = BasicDisk
08:31:47.488 [9588.4268] <2> dsml_populate_diskvolume: .\dsmlib_popext.cpp: 781 sev 2 diskvolume.name = Internal_16
08:31:47.488 [9588.4268] <2> dsml_populate_diskgroup: .\dsmlib_popext.cpp: 743 sev 2 diskgroup.name = yyyyyyyy_DSU
08:31:47.488 [9588.4268] <2> dsml_populate_diskgroup: .\dsmlib_popext.cpp: 751 sev 2 diskgroup.master_server = zzzzzzz
08:31:47.488 [9588.4268] <2> dsml_populate_diskgroup: .\dsmlib_popext.cpp: 754 sev 2 diskgroup.server_type = BasicDisk
08:31:47.488 [9588.4268] <2> dsml_populate_diskgroup: .\dsmlib_popext.cpp: 756 sev 2 diskgroup.access_media_server =
08:31:47.488 [9588.4268] <2> dsm_update_diskvolume: .\dsmlib_diskgroup.cpp: 1491 sev 2 Calling dsm->updateDiskVolume()
08:31:47.535 [9588.4268] <2> dsml_extract_diskvolume: .\dsmlib_popext.cpp: 332 sev 2 diskvolume.name = Internal_16
08:31:47.535 [9588.4268] <2> dsml_extract_diskvolume: .\dsmlib_popext.cpp: 338 sev 2 diskvolume.id = F:\
08:31:47.535 [9588.4268] <2> dsml_extract_diskvolume: .\dsmlib_popext.cpp: 342 sev 2 diskvolumeinfo.total_capacity = 8926121623552
08:31:47.535 [9588.4268] <2> dsml_extract_diskvolume: .\dsmlib_popext.cpp: 434 sev 2 property.name=DSM::master, property.value=zzzzzzz
08:31:47.535 [9588.4268] <2> dsml_extract_diskvolume: .\dsmlib_popext.cpp: 434 sev 2 property.name=DSM::flags, property.value=14
08:31:47.535 [9588.4268] <2> dsml_extract_diskvolume: .\dsmlib_popext.cpp: 409 sev 2 diskvolumeinfo.nbu_state = conv_fn(1)
08:31:47.535 [9588.4268] <2> dsml_extract_diskvolume: .\dsmlib_popext.cpp: 434 sev 2 property.name=DSM::fileSystemType, property.value=0
08:31:47.535 [9588.4268] <2> dsml_extract_diskvolume: .\dsmlib_popext.cpp: 434 sev 2 property.name=DSM::fileSystemOrdinal, property.value=16
08:31:47.535 [9588.4268] <2> dsml_extract_diskvolume: .\dsmlib_popext.cpp: 424 sev 2 diskvolumeinfo.max_write_mounts = 1
08:31:47.535 [9588.4268] <2> dsml_extract_diskvolume: .\dsmlib_popext.cpp: 429 sev 2 diskvolumeinfo.num_write_mounts = 1
08:31:47.535 [9588.4268] <2> dsml_extract_diskvolume: .\dsmlib_popext.cpp: 434 sev 2 property.name=DSM::maxReferenceWriters, property.value=2147483647
08:31:47.535 [9588.4268] <2> dsml_extract_diskvolume: .\dsmlib_popext.cpp: 414 sev 2 diskvolumeinfo.max_read_mounts = 0
08:31:47.535 [9588.4268] <2> dsml_extract_diskvolume: .\dsmlib_popext.cpp: 419 sev 2 diskvolumeinfo.num_read_mounts = 0
08:31:47.535 [9588.4268] <2> dsml_extract_diskvolume: .\dsmlib_popext.cpp: 434 sev 2 property.name=DSM::maxReferenceReaders, property.value=0
08:31:47.535 [9588.4268] <2> dsml_extract_diskvolume: .\dsmlib_popext.cpp: 366 sev 2 diskvolumeinfo.free_space = 8925992452096
08:31:47.535 [9588.4268] <2> dsml_extract_diskvolume: .\dsmlib_popext.cpp: 434 sev 2 property.name=DSM::freePhysicalSpace, property.value=404719992832
08:31:47.535 [9588.4268] <2> dsml_extract_diskvolume: .\dsmlib_popext.cpp: 371 sev 2 diskvolumeinfo.potential_free_space = 0
08:31:47.535 [9588.4268] <2> dsml_extract_diskvolume: .\dsmlib_popext.cpp: 376 sev 2 diskvolumeinfo.committed_space = 0
08:31:47.535 [9588.4268] <2> dsml_extract_diskvolume: .\dsmlib_popext.cpp: 381 sev 2 diskvolumeinfo.precommitted_space = 0
08:31:47.535 [9588.4268] <2> dsml_extract_diskvolume: .\dsmlib_popext.cpp: 391 sev 2 diskvolumeinfo.system_tag = Generic disk volume
08:31:47.535 [9588.4268] <2> dsml_extract_diskvolume: .\dsmlib_popext.cpp: 386 sev 2 diskvolumeinfo.disk_media_id = @aaaa0
08-23-2012 05:10 AM
Check the logs that we've mentioned and PLEASE upgrade NBU!
08-23-2012 06:31 AM
The job ID running the image cleanup :
21/08/2012 07:11:12 - begin
21/08/2012 07:11:12 - Info nbdelete(pid=12836) deleting expired images. Media Server: xxx Media: F:\
21/08/2012 07:11:14 - started process bpdm (6232)
From Admincmd log
nbdelete is doing what it is told:
07:11:12.824 [12836.12844] <4> nbdelete: RemoveAllVolumes: There are images to be deleted, asking for a Job ID.
I cannot find the nbstserv.log. Can you provide a path for it please?
I found a bug report that seems to be a similar problem although not applicable in this case.
BUG REPORT: When using Storage Lifecycle Policies (SLP) dup manager in nbstserv may create zombie nbdelete processes when calling 'nbdelete -allvolumes'.
08-23-2012 07:10 AM
I have seen this several times and there is only one way to stop it ---- reboot the master and media servers to remove the rougue / orphaned process
All will then be fine for a while - if you do not reboot every image will be duplicated and then removed
The upgrade should help you but it can also be memory related - how much memory do your master and media servers have and have you done any memory tuning (pagedpool / desktop heap)?
Hope this helps
08-23-2012 08:14 AM
Hi Mark
I will give it a try and comeback once completed.
08-24-2012 02:22 AM
It seems that an nbdelete ran to remove all the images on the DSU, once cleared the images have started to be retained as expected once the image was copied to tape. The multiple image cleanup instances must all have been initiated from the single nbdelete. The problem is resolved therefore.