cancel
Showing results for 
Search instead for 
Did you mean: 

Netbackup DFSR backup very slow

rshehab
Level 3

Hello, we recently deployed netback 7.7.3 on windows 2012. we have a Data domain storage. Everything is configured by following the veritas and emc guides.

eveything is superb except when it comes to DFSR folders. After some reading we understood "Use Journal" is not supported with dfsr. 

we understood we should add the backup select like 

Shadow Copy Components:\User Data\Distributed File System Replication\DfsrReplicatedFolders\LosAngeles
Shadow Copy Components:\User Data\Distributed File System Replication\DfsrReplicatedFolders\NewYork
Shadow Copy Components:\User Data\Distributed File System Replication\DfsrReplicatedFolders\Denver

we have a 2 TB of data and the backup durations of full and incrimental backup are not ecceptable. I read a lot of posts from people complaining from the same issue when backing up the shadow copy components bot none was helpful.

i think i missed something. your help will be much appreciated. 

let me know if i should provide more info. 

38 REPLIES 38

no good news.. 

i followed this guide to disable the tracker but didn't find anything in the registry under "run" which means tracker is not running. 

https://www.veritas.com/support/en_US/article.TECH31484

i contacted support and they asked me to update the NIC driver.. i did do that and tested again with the same results. 

:(

Marianne
Moderator
Moderator
Partner    VIP    Accredited Certified

Have you enabled bpbkar logging?

Have you had a look at the log to see which phase is slow? 

Can you confirm what buffer sizes are in place?
(Client communication and buffer settings on media server)

Hello, 

yes i enabled bpbkar logging and attached a fresh log. unfortunately i don't know what to look for.

can you please help? 

buffer sizes are set as follows

NUMBER_DATA_BUFFERS 16

SIZE_DATA_BUFFERS 262144

hello again, another issue we faced today is when backing up the active directory (System State:\) for the first time using netbackup. it took 30 minutes while it should be around 6 or 7 minutes. size is around 10 gb only. 

i then backed up a test folder (c:\test) (10 gb) from the same active directory server and the time was 5:30 minutes and speed exellent. 

i am saying this becuase i think it is the same issue. backup very slow when using VSS.

i will conduct more tests and let u know the results. 

thanks 

Marianne
Moderator
Moderator
Partner    VIP    Accredited Certified

I give up. You do not seem to be interested in accepting any advice from me.

Logging level is 0:

09:37:23.591 [8608.9688] <4> ov_log::OVInit: GENERAL Log Level (Effective): 0

 

sorry to hear that. I always insist on mentioning that we are new to netbackup and this is a new environment.

there are a lot of things we are still learning. 

anyway, i changed the logging level and started the backup again. I will attach the new log once the backup is done.

thank you for your patience. 

Marianne
Moderator
Moderator
Partner    VIP    Accredited Certified

I actually had a look at the level 0 log in the meantime.

Basically everything that http://www.veritas.com/docs/000019459  tells us to look for can be seen in this log.

Phase 1:
09:41:40.769 [8608.9688] <4> dos_backup::tfs_scanstart: INF - in 'pre-backup' mode
10:05:20.239 [8608.9688] <4> dos_backup::tfs_scannext: INF - no more path list entries
24 min

The 1st phase that enumerates the file list takes almost the same time as the actual writing of the files.

Phase 2:
10:05:20.255 [8608.9688] <4> VssSnapshot::V_VssVolumeSnapshot(): INF - Drive 'F:' added to VSS Volume Set
10:05:24.146 [8608.9688] <4> dos_backup::V_InitializeShadowCopy: INF - ShadowCopy PrePost: End
4 sec

Phase 3:
10:06:15.865 [8608.9688] <4> dos_backup::tfs_scanstart: INF - in 'backup' mode
11:31:28.453 [8608.9688] <4> dos_backup::tfs_scannext: INF - no more path list entries
25 min

So - I see the exact same behaviour as described in the TN.

Please send your bpbkar log and the TN to your Veritas engineer.
It seems the issue is back in your NBU version.

PS:
Your engineer will most certainly request level 5 log. 

sdo
Moderator
Moderator
Partner    VIP    Certified

Looking back, I feel that we're not getting the full story here, and as Marianne has rightly pointed out, clearly a lack of willingness to follow the threads and help us help the OP... and I still see at least two unanswered questions, and I want to ask another two:

1) Is the target Data Domain backup storage device on the same local network, in the same room, or building, or site?

2) Have you tried the bpbkar to null disk/volume/file-system speed read test?

3) Is the backup client (which is running DFSR) also functioning as an "Active Directory Domain Controller" ?

4) Is the backup client also functioning in any other 'server' roles ?  DNS Server?  DHCP Server?  MS SQL Server?  WSUS Server?  SCOM Server?  Hyper-V Host?  A/V Server?  IIS/Apache Web Server?  Anything else?  NetBackup Server?

sorry if i missed any questions before. 

sdo,

1) yes same server room. same everything.

2) yes, and i attached it in an earlier

3) and 4) it is only a dfsr file server. no other roles.

Marianne, 

i will recheck the link and revert back.

thank you all

sdo
Moderator
Moderator
Partner    VIP    Certified

You were right, I'm very sorry, you did include part of a bpbkar to null speed read test, but it's incomplete :(
Looking at the spreadsheet that you attached, named "backup log.xlsx":

10:16:44.323 [15736.548] <2> WinMain: DAT - lpCmdLine = '-nocont Shadow Copy Components:\User Data\Distributed File System Replication\DfsrReplicatedFolders\test32\ '
...
10:16:46.355 [15736.548] <2> tar_base::V_vTarMsgW: INF - BACKUP START 15736
...
10:19:14.558 [15736.548] <4> dos_backup::tfs_scanstart: INF - starting scan using name:<Shadow Copy Components:\User Data\Distributed File System 
10:19:14.558 [15736.548] <4> dos_backup::tfs_scanstart: INF - in 'pre-backup' mode
...
10:48:20.669 [15736.548] <4> dos_backup::V_InitializeShadowCopy: INF - ShadowCopy PrePost: End
...
10:48:20.669 [15736.548] <4> tar_backup_tfi::determineEstimate: INF - job tracking estimate: start
...
11:14:28.217 [15736.548] <4> tar_backup_tfi::determineEstimate: INF - job tracking estimate: stop
...
11:14:28.217 [15736.548] <4> tar_backup_tfi::determineEstimate: INF - Job Tracking Estimates 11:14:28.217 [15736.548] <4> tar_backup_tfi::determineEstimate: INF - Files: 1079795 11:14:28.217 [15736.548] <4> tar_backup_tfi::determineEstimate: INF - Folders: 6216 11:14:28.217 [15736.548] <4> tar_backup_tfi::determineEstimate: INF - Bytes Data: 416473664 11:14:28.217 [15736.548] <4> tar_backup_tfi::determineEstimate: INF - Gigabytes Data: 58
...
11:15:26.795 [15736.548] <4> dos_backup::tfs_scanstart: INF - starting scan using name:<Shadow Copy Components:\User Data\Distributed File System Replication\DfsrReplicatedFolders\test32\> 11:15:26.795 [15736.548] <4> dos_backup::tfs_scanstart: INF - in 'backup' mode
...
[incomplete]

So, from the above, we can determine:
1) the test was a speed read test, i.e. "lpCmdLine = '-nocont".
2) the test did not finish (or an incomplete log was attached)
3) elapsed time of the pre-backup walk was 11:14:28 - 10:48:20 = 00:26:08 = 1568 seconds
4) total names enumerated = 1,079,795 files + 6,216 folders = 1,086,011 names
5) therefore the folder tree walk was 692 names/second, which seems ok.
6) but we don't see the end of teh log, so we are unable to confirm the underlying disk volume read speed.
7) therefore test is incomplete/inconclusive.

.
Looking at the text log that you attached, named "ALL_ADMINS.012517_00001.log":

09:37:23.591 [8608.9688] <2> WinMain: DAT - lpCmdLine = '-r 5356800 -ru root...
...
09:40:49.862 [8608.9688] <4> dos_backup::V_InitializeShadowCopy: INF - ShadowCopy PrePost: Begin
09:41:40.769 [8608.9688] <4> dos_backup::tfs_scanstart: INF - starting scan using name:<Shadow Copy Components:\User Data\Distributed File System Replication\DfsrReplicatedFolders\CSCProjects32>
09:41:40.769 [8608.9688] <4> dos_backup::tfs_scanstart: INF - in 'pre-backup' mode
10:05:20.239 [8608.9688] <4> dos_backup::tfs_scannext: INF - no more path list entries
10:05:24.146 [8608.9688] <4> VssSnapshot::V_VssVolumeSnapshot(): INF - About to commence post-snap unbounded association checks
10:05:24.146 [8608.9688] <4> dos_backup::V_InitializeShadowCopy: INF - ShadowCopy PrePost: End
...
10:06:15.865 [8608.9688] <4> dos_backup::tfs_scanstart: INF - in 'backup' mode
11:31:28.469 [8608.9688] <2> tar_base::backup_finish: TAR - backup: 1089490 files
11:31:28.469 [8608.9688] <2> tar_base::backup_finish: TAR - backup: file data: 904154538 bytes 58 gigabytes
11:31:28.469 [8608.9688] <2> tar_base::backup_finish: TAR - backup: image data: 851491840 bytes 62 gigabytes
11:31:28.469 [8608.9688] <2> tar_base::backup_finish: TAR - backup: elapsed time: 6845 secs 9850034 bps
11:31:28.594 [8608.9688] <2> ov_log::V_GlobalLog: _bedsTermFsys(): ERR - FS_DetachDLE() for 'Shadow?Copy?Components' Failed! (0xE000FED3:A failure occurred setting the Component backup status. )

1) was a genuine backup job, from: "lpCmdLine = '-r 5356800 -ru root..."
2) you have check point enabled on the backup policy: -ckpt_time 600
3) the backup time was 11:31:28 - 10:06:15 = 01:25:13 = 5113 seconds.
4) files per second for backup was: 1089490 / 6845 = 159 files per second
5) I don't understand the file data and image data lefthand "bytes" numbers, they make no sense to me.
6) from the right hand numbers we seem to have:
9,850,034 bps 9,619 KB/s 9 MB/s 6,845 seconds 67,423,482,730 bytes 65,843,245 KB 64,300 MB 63 GB
7) If 63 GB was actually backed-up then I make the actual backup speed for the backup element to be 64300 MB / 5113 seconds = 12.6 MB/s, which still isn't very fast.
8) We see error code 0xE000FED3 at the end of the backup which looks suspiciously like a VSS error.

.
I think Michael is right and that your backup client is struggling with VSS.

It would be nice to see a complete bpbkar log for the null speed read test for two completely separate tests, at different times:

-nocont "Shadow Copy Components:\User Data\Distributed File System Replication\DfsrReplicatedFolders\CSCProjects32"

...and:

-nocont F:\Groups\CSCProjects32

...i.e. the actual folder tree that the DFSR namespace points to.

.

Three possible actions for you:

1) Can you show us the output from these commands on the client:

vssadmin list writers | find /i "last"
vssadmin list writers | find /i "state"
vssadmin list writers
vssadmin list shadows
vssadmin list shadowstorage
vssadmin list volumes
vssadmin list providers

.

2) And can you check your system and application event logs for the underlying cause of this event in the backup log:

11:31:28.594 Shadow Copy Components Failed! 0xE000FED3

I've googled for "0xE000FED3", but I didn't find very much.

.

3) Try de-selecting the check point of 10 minutes on the backup policy.  Any difference in backup speed?

sdo
Moderator
Moderator
Partner    VIP    Certified

And we seem to have moving goal posts.  The first log for the incomplete speed read test was for DFSR namespace of:

Shadow Copy Components:\User Data\Distributed File System Replication\DfsrReplicatedFolders\test32

...whereas the second ALL_ADMINS log was for a DFSR namespace of:

Shadow Copy Components:\User Data\Distributed File System Replication\DfsrReplicatedFolders\CSCProjects32

...so we're still not comparing apples with apples.

hello all,

sdo, these are the results 

C:\Program Files\Veritas\NetBackup\bin>vssadmin list writers | find /i "last"
Last error: No error
Last error: No error
Last error: No error
Last error: No error
Last error: No error
Last error: No error
Last error: No error
Last error: No error
Last error: No error
Last error: No error
Last error: No error
Last error: No error
Last error: No error

C:\Program Files\Veritas\NetBackup\bin>vssadmin list writers | find /i "state"
State: [1] Stable
State: [1] Stable
State: [1] Stable
State: [1] Stable
State: [1] Stable
State: [1] Stable
State: [1] Stable
State: [1] Stable
State: [1] Stable
State: [1] Stable
State: [1] Stable
State: [1] Stable
State: [1] Stable

C:\Program Files\Veritas\NetBackup\bin>vssadmin list writers
vssadmin 1.1 - Volume Shadow Copy Service administrative command-line tool
(C) Copyright 2001-2013 Microsoft Corp.

Writer name: 'Task Scheduler Writer'
Writer Id: {d61d61c8-d73a-4eee-8cdd-f6f9786b7124}
Writer Instance Id: {1bddd48e-5052-49db-9b07-b96f96727e6b}
State: [1] Stable
Last error: No error

Writer name: 'VSS Metadata Store Writer'
Writer Id: {75dfb225-e2e4-4d39-9ac9-ffaff65ddf06}
Writer Instance Id: {088e7a7d-09a8-4cc6-a609-ad90e75ddc93}
State: [1] Stable
Last error: No error

Writer name: 'Performance Counters Writer'
Writer Id: {0bada1de-01a9-4625-8278-69e735f39dd2}
Writer Instance Id: {f0086dda-9efc-47c5-8eb6-a944c3d09381}
State: [1] Stable
Last error: No error

Writer name: 'System Writer'
Writer Id: {e8132975-6f93-4464-a53e-1050253ae220}
Writer Instance Id: {2339ae8a-7e86-472a-85e0-dc8b48d87607}
State: [1] Stable
Last error: No error

Writer name: 'ASR Writer'
Writer Id: {be000cbe-11fe-4426-9c58-531aa6355fc4}
Writer Instance Id: {2a28f4f6-48ed-47be-8799-4dd8a67b6862}
State: [1] Stable
Last error: No error

Writer name: 'DFS Replication service writer'
Writer Id: {2707761b-2324-473d-88eb-eb007a359533}
Writer Instance Id: {a0917651-d6bf-4ec1-ac50-c1d2528b44d0}
State: [1] Stable
Last error: No error

Writer name: 'FSRM Writer'
Writer Id: {12ce4370-5bb7-4c58-a76a-e5d5097e3674}
Writer Instance Id: {ce0cdee5-36a1-4600-9516-d22b7d8dc7c9}
State: [1] Stable
Last error: No error

Writer name: 'Dedup Writer'
Writer Id: {41db4dbf-6046-470e-8ad5-d5081dfb1b70}
Writer Instance Id: {b876ae21-a903-4468-b757-262a10f6b057}
State: [1] Stable
Last error: No error

Writer name: 'Shadow Copy Optimization Writer'
Writer Id: {4dc3bdd4-ab48-4d07-adb0-3bee2926fd7f}
Writer Instance Id: {23b5b92d-8aa5-422c-93f1-fecd37e18592}
State: [1] Stable
Last error: No error

Writer name: 'BITS Writer'
Writer Id: {4969d978-be47-48b0-b100-f328f07ac1e0}
Writer Instance Id: {69c1a03b-794a-4824-bd98-aeb5d8950902}
State: [1] Stable
Last error: No error

Writer name: 'WMI Writer'
Writer Id: {a6ad56c2-b509-4e6c-bb19-49d8f43532f0}
Writer Instance Id: {a3624309-512d-4f1d-beeb-261272ff6a09}
State: [1] Stable
Last error: No error

Writer name: 'Registry Writer'
Writer Id: {afbab4a2-367d-4d15-a586-71dbb18f8485}
Writer Instance Id: {5f32f22d-301b-43c4-8146-8e40206f9725}
State: [1] Stable
Last error: No error

Writer name: 'COM+ REGDB Writer'
Writer Id: {542da469-d3e1-473c-9f4f-7847f01fc64f}
Writer Instance Id: {01b2f272-65d8-4415-bb51-8cfa870645ef}
State: [1] Stable
Last error: No error


C:\Program Files\Veritas\NetBackup\bin>vssadmin list shadows
vssadmin 1.1 - Volume Shadow Copy Service administrative command-line tool
(C) Copyright 2001-2013 Microsoft Corp.

No items found that satisfy the query.


C:\Program Files\Veritas\NetBackup\bin>vssadmin list shadowstorage
vssadmin 1.1 - Volume Shadow Copy Service administrative command-line tool
(C) Copyright 2001-2013 Microsoft Corp.

No items found that satisfy the query.


C:\Program Files\Veritas\NetBackup\bin>vssadmin list volumes
vssadmin 1.1 - Volume Shadow Copy Service administrative command-line tool
(C) Copyright 2001-2013 Microsoft Corp.

Volume path: C:\
Volume name: \\?\Volume{bb279cf0-c260-4712-ae80-08515e7d7bde}\
Volume path: G:\
Volume name: \\?\Volume{072b7f09-3bc4-48cc-a84a-9f5337e6dfdd}\
Volume path: E:\
Volume name: \\?\Volume{af8225c4-f92a-42cd-b8cc-a7bbd0a8502b}\
Volume path: F:\
Volume name: \\?\Volume{19915042-304c-438a-b165-515dc9a69d47}\
Volume path: \\?\Volume{078e04e0-24ce-4885-8edf-36ec67d3d041}\
Volume name: \\?\Volume{078e04e0-24ce-4885-8edf-36ec67d3d041}\

C:\Program Files\Veritas\NetBackup\bin>vssadmin list providers
vssadmin 1.1 - Volume Shadow Copy Service administrative command-line tool
(C) Copyright 2001-2013 Microsoft Corp.

Provider name: 'Microsoft File Share Shadow Copy provider'
Provider type: Fileshare
Provider Id: {89300202-3cec-4981-9171-19f59559e0f2}
Version: 1.0.0.1

Provider name: 'Microsoft Software Shadow Copy provider 1.0'
Provider type: System
Provider Id: {b5946137-7b9f-4925-af80-51abd60b20d5}
Version: 1.0.0.7

 

i also found something in the event viewer

Log Name: Application
Source: VSS
Date: 1/26/2017 11:06:20 AM
Event ID: 8193
Task Category: None
Level: Error
Keywords: Classic
User: N/A
Computer: Serv-DFS02.csclebanon.com
Description:
Volume Shadow Copy Service error: Unexpected error calling routine GetVolumeNameForVolumeMountPoint is now failing on the volume, winerror 0x00000001. . hr = 0x80070001, Incorrect function.

i will try to fix this event and revert back to you.

sdo
Moderator
Moderator
Partner    VIP    Certified

In the meantime, what are your disk space levels like on the problem backup client:

C:\> wmic logicaldisk list brief

sdo
Moderator
Moderator
Partner    VIP    Certified

BTW - we are asking all these different questions so that we can (hopefully) offer some solid advice at some point later in this thread (which may well probably be quite a long thread by the time we get there).

.

What does this show:

C:\> fsutil fsinfo ntfsinfo F:

.

Some more detail about the VM (i.e. the backup client) hosting the problem DFSR:

Number of vCPUs?

Amount of vRAM?

Are all of the VMDKs thin provisioned?  If you have a mix of thick and thin provisioning, please specify which of these:  C: E: F: G:   is thick or thin?

How many different DFSR namespace roots do you have on each of:  C: E: F: G: ?

hello again, 

this is a physical server. 

here are the results of the commands you asked for

 

C:\>wmic logicaldisk list brief
DeviceID DriveType FreeSpace ProviderName Size VolumeName
C: 3 503715999744 599486566400
D: 5
E: 3 1438150057984 4397909143552 
F: 3 407094427648 536734593024 
G: 3 3259587207168 4500620308480 


C:\>fsutil fsinfo ntfsinfo F:
NTFS Volume Serial Number : 0x4ea4d2bca4d2a5ad
NTFS Version : 3.1
LFS Version : 2.0
Number Sectors : 0x000000003e7befff
Total Clusters : 0x0000000007cf7dff
Free Clusters : 0x0000000005ec9737
Total Reserved : 0x00000000000007b0
Bytes Per Sector : 512
Bytes Per Physical Sector : 512
Bytes Per Cluster : 4096
Bytes Per FileRecord Segment : 1024
Clusters Per FileRecord Segment : 0
Mft Valid Data Length : 0x0000000046080000
Mft Start Lcn : 0x00000000000c0000
Mft2 Start Lcn : 0x0000000000000002
Mft Zone Start : 0x0000000001c8e960
Mft Zone End : 0x0000000001c93f00
Resource Manager Identifier : 980F5FC8-D4CE-11E6-80CB-3863BB4074A3

 

i am still working on the VSS events i mentioned earlier:

Volume Shadow Copy Service error: Unexpected error calling routine OpenService (shSCManager, 'VSS', SERVICE_QUERY_STATUS). hr = 0x80070005, Access is denied.
.

Operation:
Initialize For Backup

sdo
Moderator
Moderator
Partner    VIP    Certified

We seem to have had two different VSS events:

Log Name: Application
Source: VSS
Date: 1/26/2017 11:06:20 AM
Event ID: 8193
Level: Error
Computer: Serv-DFS02.csclebanon.com
Description:
Volume Shadow Copy Service error: Unexpected error calling routine GetVolumeNameForVolumeMountPoint is now failing on the volume, winerror 0x00000001. . hr = 0x80070001, Incorrect function.

...and another, but you don't show the event ID or when this occured:
Volume Shadow Copy Service error: Unexpected error calling routine OpenService (shSCManager, 'VSS', SERVICE_QUERY_STATUS). hr = 0x80070005, Access is denied.

.

Can we check the client's buffer size:

# bpgetconfig -h serv-dfs02 buffer_size

.

This physical server seems to have a fair amount of storage.

Untitled.png

.

Are any of E: F: G: from SAN/NAS storage arrays, or are all of the volume drive letters actually local formatted from physical disk within the server chassis?

.

If the server has local disk only, and no SAN/NAS storage, then please can you explain how the 9 TB of disk space is carved?  How many spindles?  Any RAID parity groups?  What model of RAID card HBA?    (I ask because we need to check a few things).

Hello,

E & F are EMC storages. G is local disk of 900GBs RAID 5 (Total 4 tb)

client buffer size

C:\Program Files\Veritas\NetBackup\bin>bpgetconfig -h serv-dfs02 buffer_size
Could not print the serv-dfs02 item.
buffer_size = 128

both events have the same ID different description. 

Log Name: Application
Source: VSS
Date: 1/29/2017 9:42:45 AM
Event ID: 8193
Task Category: None
Level: Error
Keywords: Classic
User: N/A
Computer: Serv-DFS02.csclebanon.com
Description:
Volume Shadow Copy Service error: Unexpected error calling routine OpenService (shSCManager, 'VSS', SERVICE_QUERY_STATUS). hr = 0x80070005, Access is denied.

Log Name: Application
Source: VSS
Date: 1/26/2017 11:06:20 AM
Event ID: 8193
Task Category: None
Level: Error
Keywords: Classic
User: N/A
Computer: Serv-DFS02.csclebanon.com
Description:
Volume Shadow Copy Service error: Unexpected error calling routine GetVolumeNameForVolumeMountPoint is now failing on the volume, winerror 0x00000001. . hr = 0x80070001, Incorrect function.

i thought the events are shown during backups but the first one is dated yesterday when no backup was running. 

Hi Marianne,

I am getting the same slowness 1.3tb of data taking 120 hours to backup and its not completeing and then gets aborted manually. We have followed this web link to the letter.

https://www.veritas.com/support/en_US/article.HOWTO65638

Marianne
Moderator
Moderator
Partner    VIP    Accredited Certified

 @mad_jock

Following that TN is good, but seeing you found this discussion, can I ask if you have read through the entire discussion from 'page 1'?

There have been lots of suggestions but @rshehab eventually stopped responding.