DFSR backup jobs that finish with status code 1 show an NBU error in BAR
Hi,
I'm running a few DFSR share backups with a backup selection such as Shadow Copy Components:\User Data\Distributed File System Replication\DfsrReplicatedFolders\HomeShare\Home2. These always go to a disk target (OpenStorage) and use Accelerator.
Backups of the DFSR shares usually run clean and finish with status 0, but sometimes one completes with status 1. I can usually rerun the backup the next morning and it's happy.
The peculiar thing is that if it's a status 1 and the logs don't show any actual files skipped, BAR shows an error when I try to browse the image for recovery:
"Unable to obtain list of files using the specified search criteria"
The "skipped" files seem to be the Shadow Copy Components themselves, resulting in a non-recoverable image. The logs for this scenario are as follows:
11/30/2016 5:00:00 PM - Info nbjm(pid=4396) starting backup job (jobid=2147876) for client cadc3dfsp004, policy DFSR_HomeShare_Home2, schedule Daily_DiffInc_Disk
11/30/2016 5:00:00 PM - Info nbjm(pid=4396) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=2147876, request id:{E5CDA389-208A-444D-820C-23AB30B45037})
11/30/2016 5:00:00 PM - requesting resource CACALDD002_PrimarySTU
11/30/2016 5:00:00 PM - requesting resource bkpnbmaster1.nal.local.NBU_CLIENT.MAXJOBS.cadc3dfsp004
11/30/2016 5:00:00 PM - requesting resource bkpnbmaster1.nal.local.NBU_POLICY.MAXJOBS.DFSR_HomeShare_Home2
11/30/2016 5:00:00 PM - granted resource bkpnbmaster1.nal.local.NBU_CLIENT.MAXJOBS.cadc3dfsp004
11/30/2016 5:00:00 PM - granted resource bkpnbmaster1.nal.local.NBU_POLICY.MAXJOBS.DFSR_HomeShare_Home2
11/30/2016 5:00:00 PM - granted resource MediaID=@aaaaX;DiskVolume=CACALDD002_STU;DiskPool=CACALDD002_Production;Path=CACALDD002_STU;StorageServer=CACALDD002.nal.local;MediaServer=cacalnbu005.nal.local
11/30/2016 5:00:00 PM - granted resource CACALDD002_PrimarySTU
11/30/2016 5:00:02 PM - estimated 79986824 Kbytes needed
11/30/2016 5:00:02 PM - Info nbjm(pid=4396) started backup (backupid=cadc3dfsp004_1480550402) job for client cadc3dfsp004, policy DFSR_HomeShare_Home2, schedule Daily_DiffInc_Disk on storage unit CACALDD002_PrimarySTU
11/30/2016 5:00:04 PM - started process bpbrm (1768)
11/30/2016 5:00:10 PM - Info bpbrm(pid=1768) cadc3dfsp004 is the host to backup data from
11/30/2016 5:00:10 PM - Info bpbrm(pid=1768) reading file list for client
11/30/2016 5:00:11 PM - Info bpbrm(pid=1768) accelerator enabled
11/30/2016 5:00:17 PM - connecting
11/30/2016 5:00:20 PM - Info bpbrm(pid=1768) starting bpbkar32 on client
11/30/2016 5:00:20 PM - connected; connect time: 0:00:03
11/30/2016 5:00:22 PM - Info bpbkar32(pid=3000) Backup started
11/30/2016 5:00:22 PM - Info bpbkar32(pid=3000) change time comparison:<enabled>
11/30/2016 5:00:22 PM - Info bpbkar32(pid=3000) accelerator enabled backup, archive bit processing:<disabled>
11/30/2016 5:00:22 PM - Info bptm(pid=2280) start
11/30/2016 5:00:22 PM - Info bptm(pid=2280) using 262144 data buffer size
11/30/2016 5:00:22 PM - Info bptm(pid=2280) setting receive network buffer to 1049600 bytes
11/30/2016 5:00:22 PM - Info bptm(pid=2280) using 30 data buffers
11/30/2016 5:00:25 PM - Info bptm(pid=2280) start backup
11/30/2016 5:00:25 PM - Info bptm(pid=2280) backup child process is pid 6572.2260
11/30/2016 5:00:25 PM - Info bptm(pid=6572) start
11/30/2016 5:00:25 PM - begin writing
11/30/2016 5:00:27 PM - Info bpbkar32(pid=3000) not using change journal data for <Shadow Copy Components:\User Data\Distributed File System Replication\DfsrReplicatedFolders\HomeShare\Home2>: not supported for non-local volumes / file systems
11/30/2016 5:41:51 PM - Warning bpbrm(pid=1768) from client cadc3dfsp004: WRN - can't open object: Shadow Copy Components: (BEDS 0xE000FECB: A failure occurred accessing the backup component document.)
11/30/2016 5:41:51 PM - Warning bpbrm(pid=1768) from client cadc3dfsp004: WRN - can't open object: Shadow Copy Components:\User Data\Distributed File System Replication\DfsrReplicatedFolders\HomeShare (BEDS 0xE000FEDD: A failure occurred accessing the object list.)
11/30/2016 5:47:33 PM - Info bptm(pid=2280) waited for full buffer 1226 times, delayed 178286 times
11/30/2016 5:47:33 PM - Info bpbkar32(pid=3000) accelerator sent 1808810496 bytes out of 8746298368 bytes to server, optimization 79.3%
11/30/2016 5:47:34 PM - Info bptm(pid=2280) EXITING with status 0 <----------
11/30/2016 5:47:35 PM - Info bpbrm(pid=1768) validating image for client cadc3dfsp004
11/30/2016 5:47:37 PM - Info bpbkar32(pid=3000) done. status: 1: the requested operation was partially successful
11/30/2016 5:47:37 PM - end writing; write time: 0:47:12
the requested operation was partially successful(1)
The job was successfully completed, but some files may have been
busy or inaccessible. See the problems report or the client's logs for more details.
12/1/2016 8:23:46 AM - job 2147876 was restarted as job 2148876
Any clue as to why this might be happening, and why it isn't reported as a full "backup bad, not usable" failure rather than a status 1?
Thank you,
- I have learned never to ignore a status 1. Under certain circumstances it actually means a failure, as in this case: nothing was backed up.
Be sure you have the bpbkar log enabled on the client so you can investigate.
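If it helps to catch these automatically, here is a rough sketch of a check you could run against saved job details text. It only looks for the pattern seen in this thread: a final status 1 together with "can't open object" warnings on Shadow Copy Components. The function names are made up for illustration, the regexes are based solely on the log lines quoted above, and treating the combination as "unrecoverable" is a heuristic drawn from this one case, not documented NetBackup behavior.

```python
import re

# Patterns modeled on the job detail lines quoted in this thread (assumption:
# other NetBackup versions may word these lines differently).
WARNING_RE = re.compile(
    r"WRN - can't open object: (?P<object>Shadow Copy Components:.*?) "
    r"\(BEDS (?P<code>0x[0-9A-Fa-f]+): (?P<message>.*?)\)\s*$"
)
STATUS_RE = re.compile(r"done\. status: (?P<status>\d+)")


def find_scc_warnings(job_details):
    """Return (object, BEDS code, message) for each Shadow Copy Components warning."""
    hits = []
    for line in job_details.splitlines():
        match = WARNING_RE.search(line)
        if match:
            hits.append((match.group("object"), match.group("code"), match.group("message")))
    return hits


def final_status(job_details):
    """Return the last 'done. status: N' value in the text, or None if absent."""
    status = None
    for line in job_details.splitlines():
        match = STATUS_RE.search(line)
        if match:
            status = int(match.group("status"))
    return status


def looks_unrecoverable(job_details):
    """Heuristic: status 1 plus warnings on the Shadow Copy Components themselves."""
    return final_status(job_details) == 1 and bool(find_scc_warnings(job_details))


# Three lines copied from the job details above, used as sample input.
SAMPLE = r"""
11/30/2016 5:41:51 PM - Warning bpbrm(pid=1768) from client cadc3dfsp004: WRN - can't open object: Shadow Copy Components: (BEDS 0xE000FECB: A failure occurred accessing the backup component document.)
11/30/2016 5:41:51 PM - Warning bpbrm(pid=1768) from client cadc3dfsp004: WRN - can't open object: Shadow Copy Components:\User Data\Distributed File System Replication\DfsrReplicatedFolders\HomeShare (BEDS 0xE000FEDD: A failure occurred accessing the object list.)
11/30/2016 5:47:37 PM - Info bpbkar32(pid=3000) done. status: 1: the requested operation was partially successful
"""

if __name__ == "__main__":
    for obj, code, message in find_scc_warnings(SAMPLE):
        print(code, obj, "-", message)
    print("treat as failed:", looks_unrecoverable(SAMPLE))
```

A job flagged this way is a candidate for an immediate rerun rather than waiting until someone tries to browse the image in BAR.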