
DFS Backup too slow

Tanveer_Ahmad
Level 4

I have two systems with DFS-enabled data; both run Windows Server 2012 R2. When I start a backup on one system, the transfer rate is about 25 MB/s, while on the second system it is just 400 KB/s. Both clients run version 7.7.3, and the NetBackup master server is on the same version. Can someone guide me on how to check why one system is so slow?


sdo
Moderator
Partner    VIP    Certified

1) Have you proven that a network based file copy from the same source volumes runs at expected speeds ?

2) Have you proven whether a backup of, say, a single large flat 5 GB file from each server, from a path/area on the DFSR volume but not in the DFSR path, runs as expected from each node? And remember: if your backup target storage unit is MSDP, or indeed any other dedupe platform, then to be fair you cannot use the same file for both backups, so both files need to contain different random data. In fact, every time you run such a test against any dedupe storage you need brand-new random test data. Do you have a tool/script to create random data files? If not, I can supply one in VBScript, but it's not that fast and will take a few minutes to create a 5 GB file.
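
For anyone who needs such a generator, here is a minimal sketch in Python (not the VBScript mentioned above). It assumes Python 3 is available on the client; the function name `write_random_file` and the chunk size are illustrative choices, not part of any NetBackup tooling. It fills a file with data from the operating system's random source, so every run produces content a dedupe target has not seen before.

```python
import os


def write_random_file(path, size_bytes, chunk_bytes=4 * 1024 * 1024):
    """Write size_bytes of OS-provided random data to path, one chunk at a time.

    Writing in chunks keeps memory use bounded even for multi-GB files.
    """
    remaining = size_bytes
    with open(path, "wb") as handle:
        while remaining > 0:
            n = min(chunk_bytes, remaining)
            handle.write(os.urandom(n))  # fresh random bytes on every call
            remaining -= n
    return path
```

For the test described above, one would call it once per node with a different output path each time, for example `write_random_file(r"E:\testdata\random_node1.bin", 5 * 1024**3)`, where the path is a hypothetical location on the DFSR volume but outside the replicated folder.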

Marianne
Moderator
Partner    VIP    Accredited Certified

@Tanveer_Ahmad

Do you have a bpbkar log on the file server?
You will need logging level between 1 and 3 to see where the delay is.

This TN describes the different phases of DFS backup: 
https://www.veritas.com/support/en_US/article.000019459

So, if we can see timestamps of the different phases, it might give a clue.

Another possibility is fragmentation on the filesystem.
When was this last checked?

There is no issue when I copy a simple file or folder from the source to the destination system. CPU and network utilization on the source and destination systems is normal during backup jobs. I am sharing the bpbkar log file from the system with the slow backup issue, along with an image of the backup jobs. Hopefully I can get some help. My backup server is a physical machine, while the client machines are VMs.

sdo
Moderator
Partner    VIP    Certified

When one's DFSR servers are VMs, one has at least one, but usually three, more layer(s) of abstraction to deal with. Like layers of an onion.

Try my suggestion 2.  This will start to indicate whether the whole virtual disk LUN is a problem or whether it really is more closely related to DFSR.

What is the backup policy's initial target storage unit? Is it Accelerator capable? Do you have Accelerator enabled on the backup policy? FYI, the first ever backup can sometimes be slower than might ordinarily be expected whilst the track log is populated.

How many files are involved? Tens of millions?

Hi Sdo

I am using LTO6 tapes for full backups. They are not Accelerator capable, so the Accelerator option is not enabled in the backup policy. I need to back up 2 to 3 TB of data. I copied a single 4.23 GB file to the disk partition that contains the DFS data; backing this file up to LTO tape took 2 hours to complete, with a transfer rate of 582 KB/s. So it looks like the issue is with the file server VM. Which log file will help me find where the problem starts? According to my file server administrator, users have not complained about accessing their data on the file server.
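
As a sanity check on these figures, the arithmetic can be sketched in Python. This assumes binary units (1 GB = 1024 * 1024 KB) and that the reported rates are in KB/s and MB/s; the function names are illustrative. A 4.23 GB file in 2 hours works out to roughly 616 KB/s, which is in line with the reported 582 KB/s, and at that rate 2 TB would take on the order of weeks rather than about a day at 25 MB/s.

```python
def rate_kb_per_s(size_gb, hours):
    """Average transfer rate in KB/s for size_gb moved in the given hours."""
    return size_gb * 1024 * 1024 / (hours * 3600)


def hours_to_backup(size_tb, rate_kb_s):
    """Hours needed to move size_tb at a steady rate_kb_s."""
    return size_tb * 1024 ** 3 / rate_kb_s / 3600


# Test file from the post: 4.23 GB in 2 hours (roughly 616 KB/s).
print(f"implied rate: {rate_kb_per_s(4.23, 2):.0f} KB/s")

# Projected time for 2 TB on the slow node versus the good node.
print(f"2 TB at 582 KB/s: {hours_to_backup(2, 582) / 24:.1f} days")
print(f"2 TB at 25 MB/s:  {hours_to_backup(2, 25 * 1024):.1f} hours")
```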

Anshu_Pathak
Level 5

Hi Tanveer,

For NetBackup, a DFSR file list is a logical entity, and to back it up NetBackup needs to convert each logical filename into a physical file location on disk. Microsoft VSS is used to find these logical-to-physical mappings. The mapping must complete before NetBackup sends even 1 KB of backup data. From the shared logs, it took about 15 minutes to perform this mapping task. So the first thing to check is how much time this task takes on the good node (25 MB/s).

Another thing I noticed in the logs is that checkpoints are enabled in the policies. A checkpoint is taken every 15 minutes, and in some cases it took 5 minutes to take the checkpoint and resume sending data. So this would be the second thing to compare with the good node.

My suggestions:

#1 Check/compare Windows and VSS patches on both nodes.

#2 Disable checkpoints, if you can, for issue isolation purposes.
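
For suggestion #1, one possible way to compare patch levels is sketched below in Python. It assumes the list of installed updates from each node has already been exported to text (for example from the output of the PowerShell `Get-HotFix` cmdlet) and simply diffs the KB identifiers found in each; the function names are illustrative.

```python
import re

# Matches update identifiers such as KB2919355, in upper or lower case.
KB_RE = re.compile(r"\bKB\d+\b", re.IGNORECASE)


def kb_ids(text):
    """Return the set of KB identifiers found in text, normalised to upper case."""
    return {match.upper() for match in KB_RE.findall(text)}


def compare_hotfixes(text_a, text_b):
    """Return (only_on_a, only_on_b) as sorted lists of KB identifiers."""
    a, b = kb_ids(text_a), kb_ids(text_b)
    return sorted(a - b), sorted(b - a)
```

Feeding it the exported text from the good node and the slow node returns the updates present on only one of them, which narrows down what to review by hand.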

Log snippets:

15 minutes of DFSR pre-processing.

15:24:23.479 [3160.4740] <4> ov_log::OVInit: GENERAL Log Level (Effective): 1
15:24:29.838 [3160.4740] <2> tar_backup::V_SetupProcessContinue: TAR - CONTINUE BACKUP received
.....pre processing DFSR data for backup............
15:39:54.083 [3160.4740] <4> tar_backup_tfi::backup_finishfile_state: INF - catalog message: dir - 3 128 51386 1 0 -1 24 16832 root root 0 1549363193 1549363193 1549363193 1 /Shadow Copy Components/

 

Checkpoint taking 5 minutes.

15:52:13.582 [3160.4740] <4> tar_backup_tfi::backup_finishfile_state: INF - catalog message: fil - 3 128 423448 171 438020 -1 248 33088 root;msaeed@DESCON.COM root;Domain:Users@DESCON.COM 422944 1474968578 1446084940 1474968578 1 /Shadow Copy Components/User Data/Distributed File System Replication/DfsrReplicatedFolders/DEST/7374/Document Control/Incoming/Client/Inst & Control/Inst Approved Documents/134004-SEP-INC-BLD-0001-03-B1_INSTRUMENT CABLE BLOCK DIAGRAM (TYPICAL).pdf
15:54:29.873 [3160.3948] <2> tar_backup_cpr::wakeupThread: INF - setting checkpoint wakeup event
15:55:03.312 [3160.4740] <4> tar_backup_tfi::backup_finishfile_state: INF - catalog message: fil - 3 128 59190530 172 438855 -1 212 33088 root;msaeed@DESCON.COM root;Domain:Users@DESCON.COM 59190110 1474968578 1449484822 1474968578 1 /Shadow Copy Components/User Data/Distributed File System Replication/DfsrReplicatedFolders/DEST/7374/Document Control/Incoming/Client/Inst & Control/Inst Approved Documents/134004-SEP-INC-CRL-0001-01-B2(4.0).dwg

16:08:55.192 [3160.4740] <4> tar_backup_tfi::backup_finishfile_state: INF - catalog message: fil - 3 128 419365 193 1064971 -1 243 33088 root;msaeed@DESCON.COM root;Domain:Users@DESCON.COM 418861 1474968650 1446349460 1474968650 1 /Shadow Copy Components/User Data/Distributed File System Replication/DfsrReplicatedFolders/DEST/7374/Document Control/Incoming/Client/Inst & Control/Inst Approved Documents/134004-SEP-INC-LAY-0001-04-B1_INSTRUMENT LOCATION LAYOUT AREA 102.pdf
16:09:29.880 [3160.3948] <2> tar_backup_cpr::wakeupThread: INF - setting checkpoint wakeup event
16:12:17.871 [3160.4740] <4> tar_backup_tfi::backup_finishfile_state: INF - catalog message: fil - 3 128 62578233 194 1065798 -1 212 33088 root;msaeed@DESCON.COM root;Domain:Users@DESCON.COM 62577813 1474968650 1449484932 1474968650 1 /Shadow Copy Components/User Data/Distributed File System Replication/DfsrReplicatedFolders/DEST/7374/Document Control/Incoming/Client/Inst & Control/Inst Approved Documents/134004-SEP-INC-LAY-0002-01-B1(4.0).dwg
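
To run the same comparison on the good node, the gaps between consecutive log entries can be measured with a small script. The sketch below is in Python and assumes the line layout seen in the snippets above (`HH:MM:SS.mmm [pid.tid] <level> message`); the function names are illustrative, and it does not handle a log that crosses midnight.

```python
import re
from datetime import datetime

# Layout inferred from the quoted snippets: time, [pid.tid], <level>, message.
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3}) \[\d+\.\d+\] <\d+> (.*)$")


def parse_lines(lines):
    """Return (timestamp, message) pairs for lines that match the layout."""
    entries = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%H:%M:%S.%f")
            entries.append((stamp, match.group(2)))
    return entries


def largest_gaps(entries, min_seconds=60.0):
    """Return gaps of at least min_seconds between consecutive entries.

    Each result is (seconds, message_before, message_after), largest first.
    """
    gaps = []
    for (t0, msg0), (t1, msg1) in zip(entries, entries[1:]):
        delta = (t1 - t0).total_seconds()
        if delta >= min_seconds:
            gaps.append((delta, msg0[:60], msg1[:60]))
    return sorted(gaps, reverse=True)
```

Running `largest_gaps(parse_lines(open(path)))` against the full bpbkar log of each node, where `path` is the log file location, lists the longest pauses, such as the pre-processing phase between "CONTINUE BACKUP received" and the first catalog message.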