- Article URL:http://www.veritas.com/docs/000023024
02-19-2016 03:16 PM
When NetBackup starts on the file server, the DFS Replication backlog soars from 0 to 600,000 items
- NetBackup Server version of Master/Media Server(s): 7.6.0.1 and 7.6.0.1
- NetBackup Client version on File Server: 7.6.0.1
- OS and patch level (SP/CU) of File Server: Windows Server 2012 R2 Datacenter
Backup set is about 12 TB.
Has anyone experienced this issue?
02-19-2016 04:24 PM
How many items in total in the DFSR store?
02-19-2016 04:55 PM
1.1million items and roughly 5 TB of files in one of the policies
There are a total of 4 policies that run on this server for a total of 12 TB.
02-19-2016 06:14 PM
Check the technotes below and decide whether you want to upgrade to 7.6.0.3 or 7.6.0.4:
When using NetBackup to protect Windows 2012 NTFS Data Deduplication volumes, restored DFSR (Distributed File System Replication) data may be corrupt.
Article URL:http://www.veritas.com/docs/000021355
DFSR backups leave a file in temp at the end of every job
02-19-2016 08:39 PM
Hi Bbot,
You might want to patch the NBU client to 7.6.0.4 as it is the most stable version for DFSR backup (for 7.6.0.x family).
It's nice if you can patch the master and media server too, but you don't have to - they can stay at 7.6.0.1. But the NetBackup client on the DFSR server really needs to be at version 7.6.0.4.
Here is another reason why you need to patch: https://www.veritas.com/support/en_US/article.TECH210794.
02-20-2016 12:38 AM
Hi bbot, you have highlighted one symptom, but so far we only know this as a cosmetic symptom.
1) When a backup runs, the DFSR backlog soars to 600k items.
.
Some questions:
2) When it soars, does it climb in a linear fashion, or does it simply jump from 0 to 600k?
3) When the backup finishes, does it drop immediately from 600k back to zero (or some other small figure)? Or does it remain at 600k?
4) Are you suggesting that the action of a backup having occurred causes DFSR to re-evaluate the entire DFSR file set and re-process/re-check all DFSR items?
.
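A minimal sketch for answering questions 2 and 3, assuming the backlog count is sampled at intervals while the backup runs (for instance with `dfsrdiag backlog` or `Get-DfsrBacklog`). The function name, the 0.8 threshold and the sample figures are all hypothetical and only illustrate the distinction between a jump and a steady climb:

```python
from typing import List, Tuple

def classify_backlog_growth(samples: List[Tuple[float, int]],
                            jump_share: float = 0.8) -> str:
    """Classify how a backlog grew over a series of samples.

    samples: (minutes since backup start, backlog count), ordered by time.
    jump_share: assumed threshold; if one interval accounts for at least
    this share of the total rise, the growth is called a "jump".
    """
    if len(samples) < 3:
        return "insufficient data"
    counts = [count for _, count in samples]
    total_rise = counts[-1] - counts[0]
    if total_rise <= 0:
        return "no growth"
    # Rise between each pair of consecutive samples.
    steps = [later - earlier for earlier, later in zip(counts, counts[1:])]
    if max(steps) / total_rise >= jump_share:
        return "jump"
    return "gradual"

# Hypothetical sample series, not taken from the thread.
steady = [(0, 0), (60, 150000), (120, 300000), (180, 450000), (240, 600000)]
sudden = [(0, 0), (60, 0), (120, 590000), (180, 600000)]
print(classify_backlog_growth(steady))
print(classify_backlog_growth(sudden))
```

A jump would point more towards something marking many files as changed at once, whereas a gradual climb could simply be replication falling behind while the backup runs.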
Whenever I am faced with what appears to be strange behaviour in a backup job's interaction with other products, I tend to look in the EEB guide as one of my first steps. Can I suggest that you also peruse the NetBackup v7.6.x.x EEB guide to search for any known issues with NetBackup's interaction with DFSR which have been resolved since your version of v7.6.0.1. The EEB guide is here:
Symantec NetBackup 7.6 Emergency Engineering Binary Guide
02-20-2016 08:15 AM
02-21-2016 01:11 PM
@nbutech Thanks for the links. I am definitely going to look into upgrading the file server to 7.6.0.4.
@sdo
2) When it soars, does it climb in a linear fashion, or does it simply jump from 0 to 600k?
I've only seen this once, when I turned off the backup for about 4 days to allow the DFS replication backlog to go down to 0, then turned it back on. The next morning, when the backup completed, it was at 600,000.
3) When the backup finishes, does it drop immediately from 600k back to zero (or some other small figure)? Or does it remain at 600k?
It does not drop to 0, but it does go to a lower figure. It went down to around 400,000 by the end of the day.
4) Are you suggesting that the action of a backup having occurred causes DFSR to re-evaluate the entire DFSR file set and re-process/re-check all DFSR items?
I think it is possible that the backup is causing DFSR to re-process/re-check. 600,000 changed items seems high for one day. We have about 1800 employees that use this file server, so it also may be normal to have that many changes. I may stop the backup and let the DFS replication finish, then rerun the backup over the weekend to see what the backlog is. (Our offices are closed on weekends.)
@marianne
Our differential backups take about 8 hours to complete (Monday-Thursday), and the full (run on Friday) takes about 35 hours to complete. If replication is momentarily paused, it could be possible that this server doesn't have enough time to process both the backups and DFS.
Nothing indicative in event viewer for VSS related problems.
02-22-2016 09:53 AM
To have 400,000 items change, or indeed to see 600,000 items change... makes me wonder if NetBackup and/or some other process/tool/script/task/job is 'touching' or updating the 'dates' of files in some way so as to cause DFSR to believe that the files have been modified.
1) Could you show us the output of:
Windows: > bpgetconfig -M mydfsrclient.company.com | findstr /i "file_access ctime"
Unix: # bpgetconfig -M mydfsrclient.company.com | egrep -i "file_access|ctime"
...and:
2) Are you aware whether there is a site specific process which forcibly resets permissions/inheritance, on a regular basis?
3) Do you have Enterprise Vault scheduled to archive files?
4) Is there another large set of files which is being robocopy-synchronized into the DFSR shares?
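For question 1, the `findstr` / `egrep` filter can be mimicked on saved `bpgetconfig` output as in the sketch below. The sample text is made up for illustration, and the entry names and `KEY = VALUE` layout in it are assumptions about what the real output looks like on the client:

```python
from typing import Iterable, List

def filter_config_lines(output: str,
                        needles: Iterable[str] = ("file_access", "ctime")) -> List[str]:
    """Return the lines containing any needle, ignoring case.

    Same idea as: findstr /i "file_access ctime"
    """
    wanted = [needle.lower() for needle in needles]
    return [line.strip() for line in output.splitlines()
            if any(needle in line.lower() for needle in wanted)]

# Made-up sample; entry names and values are assumptions, not real output.
sample_output = """\
CLIENT_NAME = mydfsrclient.company.com
DO_NOT_RESET_FILE_ACCESS_TIME = NO
USE_CTIME_FOR_INCREMENTALS = NO
"""

for line in filter_config_lines(sample_output):
    print(line)
```

Only the two lines that relate to file access time and ctime handling are kept, which are the ones of interest here.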
02-22-2016 11:02 AM
Upgrade to at least 7.6.0.2 (might as well go to 7.6.0.4)
https://www.veritas.com/support/en_US/article.TECH212807
= = = = = = = =
Also verify this setting for each DFSR Client:
Host Properties > Clients > DFSR_CLIENT > Windows Client > Incrementals > Based on timestamp
02-22-2016 04:05 PM
Now I wish that I had also personally checked the EEB guide, and taken my own advice. ;)
02-23-2016 04:45 AM
Unless I have missed it, I haven't seen confirmation that you are backing up via the Shadow Copy Components and not via drive letters / paths.
Could you confirm that please? I just wondered, as you said you had more than one policy to back it all up, but maybe they just use different SCC paths.
If using drive paths then I assume you would have to use bpstart_notify to shut down DFSR before the backup starts, which would account for the growing backlog, then bpend_notify when the backup finishes to start it up again, leaving it to work through that huge queue.
02-23-2016 05:10 AM
Good point Mark. And, if the bpstart_notify and bpend_notify scripts have not been written very carefully then the config/setup could be taking DFSR offline and online by different streams at different times and so causing all manner of up/down havoc whilst backups are running.
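To illustrate the kind of care meant here, below is a toy model (not a real notify script) in which DFSR is only stopped when the first stream starts and only restarted when the last stream ends. The class and method names are hypothetical, and a real bpstart_notify / bpend_notify pair would need a shared, locked counter on disk, because each stream runs the scripts as a separate process:

```python
class DfsrStreamGate:
    """Toy model: count active backup streams and record service actions."""

    def __init__(self) -> None:
        self.active_streams = 0
        self.actions = []  # what a real script would do, in order

    def stream_started(self) -> None:
        self.active_streams += 1
        # Only the first stream to start takes DFSR offline.
        if self.active_streams == 1:
            self.actions.append("stop DFSR")

    def stream_finished(self) -> None:
        if self.active_streams == 0:
            return  # unmatched finish; nothing to do
        self.active_streams -= 1
        # Only the last stream to finish brings DFSR back online.
        if self.active_streams == 0:
            self.actions.append("start DFSR")

# Two overlapping streams: DFSR is stopped once and started once.
gate = DfsrStreamGate()
gate.stream_started()
gate.stream_started()
gate.stream_finished()
gate.stream_finished()
print(gate.actions)
```

Without the counter, the first stream to finish would restart DFSR while the second stream was still backing up.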
.
Personally I think the OP just needs to change to not update file atime, and use file ctime.
.
@bbot - can you show us each of the four policies (as text file attachments please! please don't paste into the thread ;)
bppllist -L <policy-name>
02-23-2016 05:50 AM
Maybe have another look at the best practice TN for DFSR backups : http://www.veritas.com/docs/000095710
and let us know which option is being used in your environment?
02-25-2016 03:41 PM
@bbot - what did you do in the end? resolved now?
02-25-2016 03:54 PM
@sdo - I just updated to 7.6.0.4 last night. I'm waiting for our dfs backlog to go down to 0, then will re-run the job to see what happens. I'll update this in 1-2 days. Thanks!
07-08-2016 11:18 AM
(four months later) So how is it going?