Forum Discussion

Verneti_Berny
5 years ago
Solved

File server Windows 2019 backup runs slowly

Hi all, I have a problem with our file server backup: since we upgraded the file server's OS to Windows Server 2019, the backup sometimes runs very slowly.

The first time it occurred, I fixed it by restarting the file server; after one month, the problem returned.

Only incremental backups are affected; the full backups are OK.

I have a policy with many clients and the others work fine; only the file server has the problem.

The backup of this client normally takes 3 hours, but now it takes 10 hours, and the speed has dropped from 52 MB/s to 20 MB/s.

I don't know what is happening; nothing was changed on the file server or on the master and media servers.

Can anybody help me?

 

My configuration is:

Master Server:

Windows Server 2012 R2 (Virtual Machine)

Two MSDP media servers (physical machines); we have two sites.

My File server client is Windows Server 2019 (Virtual Machine)

Our virtual environment is VMware

I have attached two job logs: one from normal conditions and the other showing the problem.

 

  • davidmoline
    5 years ago

    Yes a red herring - the bpbkar process in 8.x is called bpbkar32.exe - and there is only a 64 bit binary available for Windows now.

    What I'd be looking at is enabling Windows change journalling to avoid the costly directory walk in the first place. For the client in "Host properties -> Clients -> Windows Client -> Client Settings":  enable "Use Change Journal". 
    I'd also enable Accelerator in the policy (it doesn't appear to be set).

    This should remove the directory walk time which is where your backup is showing slowness. 

    Another option would be to use a Flashbackup-Windows policy for the E: drive (and exclude E: from the original policy), as that drive seems to hold a very large number of files, which is the case that policy type is designed for.

  • To me this looks like a problem in the client OS.  In this summary we see very similar delay and wait counts for NetBackup buffers, which makes me think that the problem is not with the NetBackup Server:

    was good
    24/02/2020 02:30:16 - Info bptm (pid=952) using 262144 data buffer size
    24/02/2020 02:30:16 - Info bptm (pid=952) setting receive network buffer to 1049600 bytes
    24/02/2020 02:30:16 - Info bptm (pid=952) using 30 data buffers
    24/02/2020 02:30:16 - Info scsnb3 (pid=952) Using OpenStorage client direct to backup from client uscfs1 to scsnb3
    24/02/2020 05:48:55 - Info bpbkar32 (pid=10524) bpbkar waited 13852 times for empty buffer, delayed 23719 times.
    24/02/2020 05:49:00 - Info scsnb3 (pid=952) StorageServer=PureDisk:scsnb3; Report=PDDO Stats (multi-threaded stream used) for (scsnb3): scanned: 745085458 KB, CR sent: 1954275 KB, CR sent over FC: 0 KB, dedup: 99.7%, cache disabled
    24/02/2020 05:49:05 - end writing; write time: 3:18:47
    
    
    now slow
    02/03/2020 02:30:17 - Info bptm (pid=244) using 262144 data buffer size
    02/03/2020 02:30:17 - Info bptm (pid=244) setting receive network buffer to 1049600 bytes
    02/03/2020 02:30:17 - Info bptm (pid=244) using 30 data buffers
    02/03/2020 02:30:18 - Info scsnb3 (pid=244) Using OpenStorage client direct to backup from client uscfs1 to scsnb3
    02/03/2020 12:19:23 - Info bpbkar32 (pid=8148) bpbkar waited 12443 times for empty buffer, delayed 29060 times.
    02/03/2020 12:19:37 - Info scsnb3 (pid=244) StorageServer=PureDisk:scsnb3; Report=PDDO Stats (multi-threaded stream used) for (scsnb3): scanned: 748291295 KB, CR sent: 1533437 KB, CR sent over FC: 0 KB, dedup: 99.8%, cache disabled
    02/03/2020 12:19:43 - end writing; write time: 9:49:24
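To put rough numbers on the two job logs above, the scanned size and write time can be turned into an average throughput. This is only a back-of-the-envelope sketch: it assumes that "KB" in the PDDO stats line means 1024 bytes and that dividing by the reported write time is a fair average. The function names are made up for illustration, and the result will not necessarily match the rate shown in the NetBackup job details.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an 'H:MM:SS' write time from the job log into seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def throughput_mb_s(scanned_kb: int, write_time: str) -> float:
    """Average MB/s, assuming 1 MB = 1024 KB (an assumption about the log units)."""
    return scanned_kb / 1024 / hms_to_seconds(write_time)


# Values copied from the "scanned" and "write time" fields of the two job logs.
jobs = {
    "was good": (745085458, "3:18:47"),
    "now slow": (748291295, "9:49:24"),
}

for label, (scanned_kb, write_time) in jobs.items():
    rate = throughput_mb_s(scanned_kb, write_time)
    print(f"{label}: {rate:.1f} MB/s over {hms_to_seconds(write_time)} s")
```

The scanned volume is almost identical in both runs, so the longer write time translates directly into a proportionally lower average rate.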

    .

    My guess is that it is the file-system walk of the client OS that is slow.  Something is slugging the client.  But determining the root cause for this is going to be quite difficult.

    One idea that comes to mind would be to collect level 3 (I think it is level 3) bpbkar logging, which reveals the name of each file as the file system is walked.  But visually spotting a small, repeated increase in delay across millions of files will be near impossible.  So, one answer might be to write a script / tool to process the bpbkar log and analyse the time gap between each file detected (i.e. count, sum, max, min, average, standard deviation, etc...) and see if this clearly points to a delay in walking the file-system.  If so, then open a case with Microsoft.
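A minimal sketch of such a log-processing script follows. It rests on an assumption that should be checked against a real log: that every bpbkar log line starts with a time-of-day stamp in the form HH:MM:SS.mmm. The names `STAMP`, `line_times` and `gap_stats` are made up for illustration. It measures the gap between consecutive timestamped lines, which is only a proxy for per-file time; filtering down to the lines that actually name a file is left out because that depends on the exact log format.

```python
import re
import statistics

# Assumption: each log line begins with a time-of-day stamp such as "02:30:16.123".
STAMP = re.compile(r"^(\d{2}):(\d{2}):(\d{2})\.(\d{3})")


def line_times(lines):
    """Yield seconds since midnight for every line that starts with a timestamp."""
    for line in lines:
        match = STAMP.match(line)
        if match:
            h, m, s, ms = map(int, match.groups())
            yield h * 3600 + m * 60 + s + ms / 1000


def gap_stats(lines):
    """Return count, sum, min, max, mean and stdev of gaps between consecutive lines."""
    times = list(line_times(lines))
    gaps = []
    for prev, cur in zip(times, times[1:]):
        delta = cur - prev
        if delta < 0:  # the log crossed midnight
            delta += 86400
        gaps.append(delta)
    if not gaps:
        return None
    return {
        "count": len(gaps),
        "sum": sum(gaps),
        "min": min(gaps),
        "max": max(gaps),
        "mean": statistics.mean(gaps),
        "stdev": statistics.pstdev(gaps),
    }
```

To use it, open the log file and pass the file object (or any iterable of lines) to `gap_stats`, then compare the mean and max between a good night and a slow night.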

    Another idea might be to run a bpbkar to null every day and thus definitely prove that the delay has nothing at all to do with LAN (to NetBackup Server) nor with NetBackup storage units.  And again when you see clear slowness from one day to the next, open a case with Microsoft.

    Another idea might be to check whether the problem is at least in part related purely to folder walking.  Write a script (which could be a PowerShell one-liner) which only walks the folder structure, i.e. does not open any files, and see if that demonstrates sudden slowness from one day to the next.
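The post suggests PowerShell; for illustration, here is a comparable sketch in Python, assuming Python is available on the file server. The function name `walk_folders` is made up. It only enumerates directory entries and never opens a file, and it returns the elapsed time so that daily runs can be compared.

```python
import os
import time


def walk_folders(root):
    """Enumerate the tree under root without opening any file.

    Returns (folders_visited, entries_seen, elapsed_seconds).
    """
    start = time.perf_counter()
    folders = entries = 0
    stack = [root]
    while stack:
        path = stack.pop()
        folders += 1
        try:
            with os.scandir(path) as it:
                for entry in it:
                    entries += 1
                    # Do not follow links, to avoid loops and double counting.
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
        except OSError:
            continue  # skip folders that cannot be read
    return folders, entries, time.perf_counter() - start
```

Running something like `walk_folders("E:\\")` at the same time each day and recording the elapsed seconds would show whether the walk itself slows down. Note that file-system caching can make a second run shortly after the first look faster than it really is.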

    Just curious as to the nature and config of the storage that underlies the file-share.  Could you describe?  Is it a DAS, or NAS, or SAN (iSCSI or FC?) ?

    • sdo
      Moderator

      Another thing that you can do is to log on to the file-server in an RDP session and leave it logged on, re-logging on once a day to keep the RDP session alive.  Inside this session use the Windows "perfmon" tool to monitor IO to the logical disk(s) hosting the file-server services/shares.  Set the sample interval to 120 seconds over 86400 seconds (i.e. every two minutes for 24 hours), and monitor counters like:

      - operations : reads/sec, writes/sec

      - latency : sec/read, sec/write

      - payload : avg read size, avg write size

      - throughput : read MB/s, write MB/s
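Once a day's worth of samples has been collected, the log can be summarised per counter to compare a good night with a slow one. The sketch below assumes the perfmon log has been saved or converted to CSV, with the timestamp in the first column and one counter per remaining column; the function name `summarize_counters` is made up. The list above most likely maps to the LogicalDisk counters Disk Reads/sec, Disk Writes/sec, Avg. Disk sec/Read, Avg. Disk sec/Write, Avg. Disk Bytes/Read, Avg. Disk Bytes/Write, Disk Read Bytes/sec and Disk Write Bytes/sec, but check the names in perfmon itself.

```python
import csv
import io


def summarize_counters(csv_text):
    """Return {counter_name: (samples, mean, maximum)} for each numeric column.

    Assumes column 0 holds the timestamp and the header row names the counters.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    values = {name: [] for name in header[1:]}
    for row in reader:
        for name, cell in zip(header[1:], row[1:]):
            try:
                values[name].append(float(cell))
            except ValueError:
                pass  # blank or non-numeric sample, skip it
    return {
        name: (len(v), sum(v) / len(v), max(v))
        for name, v in values.items()
        if v
    }
```

Read the CSV file into a string and pass it in; a jump in the mean or maximum of the latency counters on slow nights would point at the storage rather than at NetBackup.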

      • Thanks sdo. Restarting did not fix it this time; I will try your suggestion and let you know here.

        Thanks again.

    • You asked:

      Just curious as to the nature and config of the storage that underlies the file-share.  Could you describe?  Is it a DAS, or NAS, or SAN (iSCSI or FC?) ?

      We are using Cisco HyperFlex (hyperconverged infrastructure) as the server cluster, with VMware.

      Thanks

      • davidmoline
        Level 6

        Also what is the file system type used for the storage that is exhibiting the slow backup speed?