Fileserver backup: poor backup performance | Initial VirtApi DLL load check failed.
Hi,
I recently migrated our clustered MS file server to Windows Server 2019. 2 nodes (basically the same installation), VMware, iSCSI SAN, multiple policies, multible streams, time-splited schedules, no accelerator or change journal in use (currently I have no knowledge with this), NB is v8 on server and client.
Especially the daily incremental backup of the user share folders takes way too long on node 1, it exceeds the time window. Node 2 perform better - not as good as on Server 2k8, but well...
Trouble shooting turned out a different behaviour: on node 1 the server is waiting for a timeout while failing to load 2 dll's (VirtApi.dll and bedstrace.dll). Node 2 doesn't have this issue. Both nodes should be identically installed and configured.
part of the bpbkar log:
15:56:17.549 [2524.6724] <2> ov_log::V_GlobalLog: INF - FS_InitFileSys
15:56:17.549 [2524.6724] <2> ov_log::V_GlobalLog: INF - VirtApi DLL WAS LOADED FROM VirtApi.dll.!
15:57:48.925 [2524.6724] <2> ov_log::V_GlobalLog: INF - Initial VirtApi DLL load check failed. Will try again later.
15:57:48.941 [2524.6724] <2> ov_log::V_GlobalLog: INF - VirtApi DLL WAS LOADED FROM VirtApi.dll.!
15:59:20.089 [2524.6724] <2> ov_log::V_GlobalLog: INF - Initial VirtApi DLL load check failed. Will try again later.
15:59:20.089 [2524.6724] <2> ov_log::V_GlobalLog: INF - loaded bedsxese.dll
15:59:20.089 [2524.6724] <2> ov_log::V_GlobalLog: INF - [AgentID:13] ... failed to load bedstrace.dll.
15:59:20.089 [2524.6724] <2> ov_log::V_GlobalLog: INF - FS_DleGen: Failure to create debug trace object at line number 76
15:59:20.089 [2524.6724] <2> ov_log::V_GlobalLog: INF - [AgentID:23] ... failed to load bedstrace.dll.
15:59:20.089 [2524.6724] <2> ov_log::V_GlobalLog: INF - FS_DleGen: Failure to create debug trace object at line number 76
15:59:20.105 [2524.6724] <2> ov_log::V_GlobalLog: INF - [AgentID:30] ... failed to load bedstrace.dll.
15:59:20.105 [2524.6724] <2> ov_log::V_GlobalLog: INF - FS_DleGen: Failure to create debug trace object at line number 76
15:59:20.105 [2524.6724] <2> ov_log::V_GlobalLog: INF - loaded bedsshadow.dll
15:59:20.105 [2524.6724] <2> ov_log::V_GlobalLog: INF - loaded bedsnt5.dll
15:59:20.105 [2524.6724] <2> ov_log::V_GlobalLog: INF - [AgentID:5] ... failed to load bedstrace.dll.
15:59:20.105 [2524.6724] <2> ov_log::V_GlobalLog: INF - FS_DleGen: Failure to create debug trace object at line number 76
15:59:20.105 [2524.6724] <2> ov_log::V_GlobalLog: INF - loaded bedsss.dll
15:59:20.105 [2524.6724] <2> ov_log::V_GlobalLog: INF - [AgentID:1e] ... failed to load bedstrace.dll.
15:59:20.105 [2524.6724] <2> ov_log::V_GlobalLog: INF - FS_DleGen: Failure to create debug trace object at line number 76
15:59:20.121 [2524.6724] <2> ov_log::V_GlobalLog: INF - VirtApi DLL WAS LOADED FROM VirtApi.dll.!
16:00:51.334 [2524.6724] <2> ov_log::V_GlobalLog: INF - Initial VirtApi DLL load check failed. Will try again later.
16:00:51.334 [2524.6724] <2> ov_log::V_GlobalLog: INF - loaded bedsadgran.dll
16:00:51.334 [2524.6724] <2> ov_log::V_GlobalLog: INF - loaded bedssql2.dll
16:00:51.334 [2524.6724] <2> ov_log::V_GlobalLog: INF - [AgentID:10] ... failed to load bedstrace.dll.
16:00:51.334 [2524.6724] <2> ov_log::V_GlobalLog: INF - FS_DleGen: Failure to create debug trace object at line number 76
16:00:51.334 [2524.6724] <2> ov_log::V_GlobalLog: INF - Initializing FSs
This is constantly repeated after every stream written (parallel ofcourse due to multistream) and adds up plenty of time at all. The issue is reproducable and follow the VM in case of VM migration to another ESXi host - as well as the _non-issue_ node 2 runs without this issue on the same ESX.
Unfortunately I found not much about the function of the VirtApi.dll.
Access control is prohibited at master/ media server and client. Nevertheless I tried intentionelly to decrease the default value of HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\VirtFile "Sleep Before Unload" (31000) to 10 to force bpbkar to go ahead in this circumstance and not waiting, which ended in not running the backup even if the processes where running.
Services are running with local system permissions.
I also tried to manually reinstall the VirtFile.inf and to implement some points from the Performance Tuning Guide, but ended up every time at this dll load issue.
Any help or hint would be greatly appreciated.
Thanks,
Holger
I figured out that there was a difference between the 2 nodes: on the non-issue node Secure Boot was disabled in VMware Boot options (I don't know why and can't trace back). After disabling secure boot on the issued node and reinstalled NB Client the minifilter is loaded correctly by the system. I'm really interessted why just the secure boot option makes a difference while installing the client.
So the first point should be solved now
(has to be confirmed this evening).confirmed: it's solvedJumping to the next point: as mentioned the (incremental) backup takes about 50% more time on Windows Server 2019 than before on Server 2008 R2. Previously about 2hrs for DINCR backup of the approx 1.5M files, now with srv 2019 almost 3hrs. It's currently not a problem, but I want to understand why. Same SAN, same infrastructure, same ESXi, still NTFS, SMB v3 now. Are there NB tuning guides for server 2019? Does it make sense to enable journaling? What consequenses are to be expected? I'm really careful cause I'd read that once enabled it's not reversable.