Forum Discussion
So this is rather strange. While I was looking at logs and processes I was retrying the duplication jobs few more times and surprisingly on the 4th time they went throuhg to a different storage device. At the same time though the "nblbc.exe" process kept doing the same behavior but this time it had only the following (supposedly error) output:
0,51216,158,351,49,1596147990576,8400,8524,0:,53:Device Name : NFS Network (../SubContextBEDS.cpp:182),14:_dumpDLEInfo(),1
0,51216,158,351,50,1596147990576,8400,8524,0:,54:Device Name : System?State (../SubContextBEDS.cpp:182),14:_dumpDLEInfo(),1
0,51216,158,351,51,1596147990576,8400,8524,0:,75:Device Name : Active Directory Application Mode (../SubContextBEDS.cpp:182),14:_dumpDLEInfo(),1
0,51216,158,351,52,1596147991154,8400,8524,0:,53:SpsRestoreCurrentVer:False (../ResourceChild.cpp:719),28:ResourceChildBEDS::_attach(),1
1,51216,309,351,128,1596149160389,8400,8524,0:,0:,38:ConfigDataIterator::metadataIdCallback,3,(35|A21:3:Error_Messages_Max;|A2:10|)
1,51216,309,351,129,1596149160404,8400,8524,0:,0:,38:ConfigDataIterator::metadataIdCallback,3,(35|A25:3:CLIENT_CONNECT_TIMEOUT;|A3:300|)
0,51216,309,351,130,1596149160404,8400,8524,0:,93:Processing ProcessDebug:nblbc; using the Window's registry (../DefaultConfigProvider.cpp:345),40:DefaultConfigProvider::getValueNonBpConf,1
1,51216,309,351,131,1596149160404,8400,8524,0:,0:,40:DefaultConfigProvider::getValueNonBpConf,3,(36|S20:\HKEY_LOCAL_MACHINE\|S61:SOFTWARE\Veritas\NetBackup\CurrentVersion\Config\ProcessDebug|S5:nblbc|D32:2|)
0,51216,309,351,132,1596149160404,8400,8524,0:,93:Processing Granular:CopyLocal; using the Window's registry (../DefaultConfigProvider.cpp:345),40:DefaultConfigProvider::getValueNonBpConf,1
1,51216,309,351,133,1596149160404,8400,8524,0:,0:,40:DefaultConfigProvider::getValueNonBpConf,3,(36|S20:\HKEY_LOCAL_MACHINE\|S57:SOFTWARE\Veritas\NetBackup\CurrentVersion\Config\Granular|S9:CopyLocal|D32:2|)
1,51216,309,351,134,1596149161857,8400,8524,0:,0:,14:nbfs_unmount(),0,(28|A33:INF - Lets terminate the event...|)
1,51216,309,351,135,1596149161857,8400,8524,0:,0:,14:nbfs_unmount(),0,(28|A25:INF - Event terminated...|)
Also in the meantime I changed the client on one of the DC's to v8.2 however it behave literally the same way as the v8.3 clients. Right now I'm completely puzzled on what's going on.
Does anyone know what is the exact purpose of the "nblbc.exe" process and how does it relate to the GRT operations ? I assume that it is the main responsible binary to do the mapping between AD/Exchange and the backup images. What I really don't understand is the constant CPU load it does for about 20 - 30 minutes then it times out. It would be nice to know what operations are involved in the background to and from each of the involved servers. This issue now looks more like a timeout between the NBU client and the NBU servers.
I would appreciate if someone could provide some enlightenmenet on the GRT related procedures.
Cheers
- asg2ki4 years agoLevel 4
Ok so I looked a bit further at the background situation on the NBU master server and it seems that the issue might be somewhere within the "bpdbm" processes which are engaged shortly after the "nblbc" process is running on the client.
I noticed that the "bpdbm" instances kept using a lot of CPU so I looked at the logs and this is what got generated in huge amounts:
03:20:12.541 [12040.1020] <4> create_user_group_id_marker_WIN: connecting to named pipe:\\.\pipe\7100_{01E0C052-3E0E-4048-B588-1D69F857E066}
03:20:12.541 [12040.1020] <4> create_user_group_id_marker_WIN: successfully connected to server named pipe
03:20:12.541 [12040.1020] <2> logconnections: BPJOBD CONNECT FROM 192.168.12.84.59428 TO 192.168.12.84.13723 fd = 2036
03:20:12.541 [12040.1020] <2> job_authenticate_connection: VxSS authentication check for now...
03:20:12.541 [12040.1020] <2> get_long_base: (1) cannot read (byte 1) from network: An established connection was aborted by the software in your host machine.
03:20:12.541 [12040.1020] <2> job_connect: Can't get server time
03:20:12.541 [12040.1020] <2> check_connect_time: It took 0 seconds to connect.
03:20:12.541 [12040.1020] <2> check_connect_time: Try again with 83 seconds left.
03:20:12.541 [12040.1020] <4> create_user_group_id_marker_WIN: connecting to named pipe:\\.\pipe\7100_{C269A78A-B5D9-4CD6-9F7F-D2F51BB417FA}
03:20:12.541 [12040.1020] <4> create_user_group_id_marker_WIN: successfully connected to server named pipe
03:20:12.541 [12040.1020] <2> logconnections: BPJOBD CONNECT FROM 192.168.12.84.59429 TO 192.168.12.84.13723 fd = 2036
03:20:12.541 [12040.1020] <2> job_authenticate_connection: VxSS authentication check for now...
03:20:12.541 [12040.1020] <2> get_long_base: (1) cannot read (byte 1) from network: An established connection was aborted by the software in your host machine.
03:20:12.541 [12040.1020] <2> job_connect: Can't get server time
03:20:12.541 [12040.1020] <2> check_connect_time: It took 0 seconds to connect.
03:20:12.541 [12040.1020] <2> check_connect_time: Try again with 83 seconds left.
03:20:12.557 [12040.1020] <4> create_user_group_id_marker_WIN: connecting to named pipe:\\.\pipe\7100_{5850DF88-5291-4527-B1E6-ABDF27BD2C61}
03:20:12.557 [12040.1020] <4> create_user_group_id_marker_WIN: successfully connected to server named pipe
03:20:12.557 [12040.1020] <2> logconnections: BPJOBD CONNECT FROM 192.168.12.84.59430 TO 192.168.12.84.13723 fd = 2028
03:20:12.557 [12040.1020] <2> job_authenticate_connection: VxSS authentication check for now...
03:20:12.557 [12040.1020] <2> get_long_base: (1) cannot read (byte 1) from network: An established connection was aborted by the software in your host machine.
03:20:12.557 [12040.1020] <2> job_connect: Can't get server time
03:20:12.557 [12040.1020] <2> check_connect_time: It took 0 seconds to connect.
03:20:12.557 [12040.1020] <2> check_connect_time: Try again with 83 seconds left.
03:20:12.557 [12040.1020] <4> create_user_group_id_marker_WIN: connecting to named pipe:\\.\pipe\7100_{C0421D7F-53D9-4C97-B858-F6634C4D30C9}
03:20:12.557 [12040.1020] <4> create_user_group_id_marker_WIN: successfully connected to server named pipe
03:20:12.557 [12040.1020] <2> logconnections: BPJOBD CONNECT FROM 192.168.12.84.59431 TO 192.168.12.84.13723 fd = 2028
03:20:12.557 [12040.1020] <2> job_authenticate_connection: VxSS authentication check for now...
03:20:12.557 [12040.1020] <2> get_long_base: (1) cannot read (byte 1) from network: An established connection was aborted by the software in your host machine.
03:20:12.557 [12040.1020] <2> job_connect: Can't get server time
03:20:12.557 [12040.1020] <2> check_connect_time: It took 0 seconds to connect.
03:20:12.557 [12040.1020] <2> check_connect_time: Try again with 83 seconds left.
03:20:12.572 [12040.1020] <4> create_user_group_id_marker_WIN: connecting to named pipe:\\.\pipe\7100_{79A485BB-EDB5-4B73-9549-E548D74DD4CE}
03:20:12.572 [12040.1020] <4> create_user_group_id_marker_WIN: successfully connected to server named pipe
03:20:12.572 [12040.1020] <2> logconnections: BPJOBD CONNECT FROM 192.168.12.84.59432 TO 192.168.12.84.13723 fd = 2028
03:20:12.572 [12040.1020] <2> job_authenticate_connection: VxSS authentication check for now...
03:20:12.572 [12040.1020] <2> get_long_base: (1) cannot read (byte 1) from network: An established connection was aborted by the software in your host machine.
03:20:12.572 [12040.1020] <2> job_connect: Can't get server time
03:20:12.572 [12040.1020] <2> check_connect_time: It took 0 seconds to connect.
03:20:12.572 [12040.1020] <2> check_connect_time: Try again with 83 seconds left.
03:20:12.572 [12040.1020] <4> create_user_group_id_marker_WIN: connecting to named pipe:\\.\pipe\7100_{3DEBB42D-DC38-4586-BC0F-E1846B61998C}
03:20:12.572 [12040.1020] <4> create_user_group_id_marker_WIN: successfully connected to server named pipe
03:20:12.572 [12040.1020] <2> logconnections: BPJOBD CONNECT FROM 192.168.12.84.59435 TO 192.168.12.84.13723 fd = 2036
03:20:12.572 [12040.1020] <2> job_authenticate_connection: VxSS authentication check for now...Any ideas from anyone ?
- asg2ki4 years agoLevel 4
Problem is now resolved with a workaround.
Apparently NBU 8.3 seems to have issues when the master server's services are running under domain based accounts instead of LocalSystem.
This behavior was encountered during an entirely different issue with Exchange policies about which I have posted the details here:
NetBackup 8.3 - Exchange Policy Bugs
Once I changed the master server's service accounts to run under LocalSystem (except for Messaging Broker and Web Management services), I was able to duplicate the previously taken and failing images on the first go without seeing any of the "bpdbm" and "nblbc" issues which I previously described. Also I took fresh images of the DC's and those were also replicated without any problems.
Please note that this is just a workaround so the situation is far from ideal. More investigation shall be done by the backend engineers on this rather odd behavior of the product.
Cheers
Related Content
- 11 years ago
- 8 years ago