backup failing with error code 24 and 50
Hi ,
My backup policy is taking a backup of 10 different clients and it fails randomly for 3 or 4 with error code 24 and 50as shown below
08/10/2019 03:01:53 - Error bpbrm (pid=28472) from client prl-ercenmsvc2bkp: ERR - Cannot write to STDOUT. Errno = 110: Connection timed out
08/10/2019 03:01:55 - Error bptm (pid=28513) media manager terminated by parent process
08/10/2019 03:02:08 - Info bpbkar (pid=25895) done. status: 24: socket write failed
08/10/2019 03:02:08 - end writing; write time: 1:44:00
socket write failed (24)
bpbkar logs :
05:27:08.827 [11187] <8> bpbkar process_file: WRN - [/.ENM_BACKUP/root_lvm/var/spool/postfix/public/cleanup] is a socket special file. Skipping
05:27:14.968 [11187] <8> bpbkar process_file: WRN - [/.ENM_BACKUP/root_lvm/var/VRTSvcs/uds/.hadsocket] is a socket special file. Skipping
05:27:22.650 [11187] <8> bpbkar process_file: WRN - [/.ENM_BACKUP/root_lvm/var/run/multipathd.sock] is a socket special file. Skipping
05:27:22.777 [11187] <8> bpbkar process_file: WRN - [/.ENM_BACKUP/root_lvm/var/run/rpcbind.sock] is a socket special file. Skipping
05:27:22.880 [11187] <8> bpbkar process_file: WRN - [/.ENM_BACKUP/root_lvm/var/run/libvirt/libvirt-sock-ro] is a socket special file. Skipping
05:27:22.882 [11187] <8> bpbkar process_file: WRN - [/.ENM_BACKUP/root_lvm/var/run/libvirt/libvirt-sock] is a socket special file. Skipping
05:27:24.587 [11187] <8> bpbkar process_file: WRN - [/.ENM_BACKUP/root_lvm/var/run/dbus/system_bus_socket] is a socket special file. Skipping
05:27:24.626 [11187] <4> bpbkar expand_wildcards: end backup for filelist /.ENM_BACKUP/root_lvm/var
05:27:24.629 [11187] <4> bpbkar main: INF - Client completed sending data for backup
05:27:24.629 [11187] <2> bpbkar main: INF - Total Size:0
05:27:24.629 [11187] <2> accl_post_nbfeds_backup_list: Posting Accurate Licensing data
05:27:24.629 [11187] <4> accl_post_nbfeds_backup_list: Master server name: prl-ercenmombsbkp
05:27:24.629 [11187] <4> post_nbfeds_backup_list: Json String value: [{"virtualName":"","jobid":"","backupId":"prl-ercenmsvc3bkp_1566872326","backupSelection":"/.ENM_BA
CKUP/root_lvm/var","frontEndDataSizeMb":4651}]
05:27:24.629 [11187] <2> post_nbfeds_backup_list: Post data URL: netbackup/catalog/frontenddata
05:27:24.637 [11187] <2> NBClientCURL::NBClientCURL: Curl initialized successfully.
05:27:24.637 [11187] <2> NBClientCURL::NBClientCURL: Performing curl_easy_init()
05:27:24.637 [11187] <2> NBClientCURL::NBClientCURL: Disabling any proxy option set in the environment.
05:27:24.637 [11187] <2> NBClientCURL::NBClientCURL: Using paths as caCertFilePath [/usr/openv/var/webtruststore/cacert.pem] privateKeyPath [/usr/openv/var/vxss/credent
ials/keystore/PrivKeyFile.pem]
05:27:24.637 [11187] <2> NBClientCURL::NBClientCURL: Setting default CURL timeout = [300] seconds
05:27:24.637 [11187] <2> NBClientCURL::NBClientCURL: Disable signal during name resolution.
05:27:24.637 [11187] <2> NBClientCURL::nbcurl_bpconf_keyvalue_state: ENABLE_NBCURL_VERBOSE is not set
05:27:24.637 [11187] <2> NBClientCURL::nbcurl_bpconf_keyvalue_state: DISABLE_NBCURL_CONNECTIONS is not set
05:27:24.637 [11187] <2> NBClientCURL::nbcurl_bpconf_keyvalue_state: DISABLE_NBCURL_PREFERRED_NETWORKS is not set
05:27:24.655 [11187] <2> NBClientCURL::setupCurlOperationTimeout: Setting CURL timeout = [180] seconds
05:27:24.655 [11187] <2> NBClientCURL::Post: POSTing to URL https://prl-ercenmombsbkp:1556/netbackup/catalog/frontenddata
05:27:24.655 [11187] <2> NBClientCURL::Post: Sending payload ...
05:27:24.655 [11187] <2> NBClientCURL::setPostOptions: setting curl post options ...
05:27:24.656 [11187] <2> NBClientCURL::performOperation: Configuration suggests use of tunnel.
05:27:24.656 [11187] <2> NBClientCURL::performOperation: Looking up tunnels for master server [prl-ercenmombsbkp].
05:27:24.656 [11187] <2> NBWebsvcTunnelRouter::resetState: Tunnel Router's current state = [0].
05:27:24.656 [11187] <2> NBWebsvcTunnelRouter::resetState: Tunnel Router's state reset to [0].
05:27:24.656 [11187] <2> NBClientCURL::performOperation: Querying client router for a list of tunnels.
05:27:24.656 [11187] <2> NBWebsvcTunnelRouter::getWebsvcTunnelServers: Read NBU configuration settings.
05:27:24.656 [11187] <2> NBWebsvcTunnelRouter::getWebsvcTunnelServers: Returning [0] tunnel server from in-memory cache.
05:27:24.656 [11187] <2> NBWebsvcTunnelRouter::getWebsvcTunnelServers: Returning [0] tunnel servers from user-specified (override) settings in NBU Configuration.
05:27:24.656 [11187] <2> NBWebsvcTunnelRouter::getTunnelsFromCache: Cache File /usr/openv/netbackup/websvctunnels.cache does not exist.
05:27:24.656 [11187] <2> NBWebsvcTunnelRouter::getWebsvcTunnelServers: Returning [0] tunnel servers from Client router's tunnel servers' cache.
05:27:24.656 [11187] <2> NBWebsvcTunnelRouter::getWebsvcTunnelServers: Returning [0] tunnel servers. Will try to directly connect to the Web service.
05:27:24.656 [11187] <2> NBClientCURL::performOperation: Received a list of potential tunnels.
05:27:24.656 [11187] <2> NBClientCURL::performOperation: Empty tunnel entry encountered. Will try a direct connection to the Master server.
05:27:24.656 [11187] <2> NBClientCURL::performOperation: Disabling any --proxy option previously set.
05:27:24.656 [11187] <2> NBClientCURL::performOperation: Disabling any --proxytunnel option previously set.
05:27:24.657 [11187] <2> LoginWithCertManager::getLocalToken: tokenIssueTimeSec = 1566856007
05:27:24.657 [11187] <2> LoginWithCertManager::getLocalToken: tokenExpTimeSec = 1566942407
05:27:24.657 [11187] <2> LoginWithCertManager::getLocalToken: tokenLastFailTimeSec = 0
05:27:24.657 [11187] <2> LoginWithCertManager::isJWTRefreshRequired: tokenExpTimeSec = 1566942407, timeNow = 1566872844, (tokenExpTimeSec-timeNow) = 69563
05:27:24.657 [11187] <2> LoginWithCertManager::isJWTRefreshRequired: Token-refresh IS NOT required
05:27:24.657 [11187] <2> NBClientCURL::performCurlOperation: Obtained login token
05:27:24.685 [11187] <4> NBClientCURL::performCurlOperation: HTTP response header : HTTP/1.1 201 X-Request-ID: 280E34AC43382593 X-Request-ID: 280E34AC43382593 Content-
Length: 0 Date: Tue, 27 Aug 2019 02:27:24 GMT
05:27:24.685 [11187] <2> NBClientCURL::performCurlOperation: Fetched data = [xxxxx], httpcode = 201
05:27:24.685 [11187] <2> NBClientCURL::performOperation: Disabling any --proxy option previously set.
05:27:24.685 [11187] <2> NBClientCURL::performOperation: Disabling any --proxytunnel option previously set.
05:27:24.685 [11187] <2> NBClientCURL:~NBClientCURL: Performing curl_easy_cleanup()
05:27:24.685 [11187] <2> NBClientCURL:~NBClientCURL: Performing curl_global_cleanup()
05:27:24.686 [11187] <4> bpbkar Exit: INF - bpbkar exit normal
05:27:24.686 [11187] <4> bpbkar Exit: INF - EXIT STATUS 0: the requested operation was successfully completed
05:27:24.686 [11187] <2> bpbkar Exit: INF - Close of stdout complete
05:27:24.686 [11187] <4> bpbkar Exit: INF - setenv FINISHED=1
05:28:25.083 [10751] <8> bpbkar process_file: WRN - [/.ENM_BACKUP/root_lvm/root/usr/openv/var/proxy.d/proxy_helper-outbound_proxy-0-10861.uds] is a socket special file.
Skipping
05:28:25.440 [10751] <8> bpbkar process_file: WRN - [/.ENM_BACKUP/root_lvm/root/usr/openv/var/vnetd/bpcd.uds] is a socket special file. Skipping
05:28:25.447 [10751] <8> bpbkar process_file: WRN - [/.ENM_BACKUP/root_lvm/root/usr/openv/var/vnetd/terminate_vnetd.uds] is a socket special file. Skipping
05:28:25.498 [10751] <8> bpbkar process_file: WRN - [/.ENM_BACKUP/root_lvm/root/usr/openv/var/vnetd/terminate_bpcd.uds] is a socket special file. Skipping
05:37:12.549 [10751] <4> check_file_sparseness: File </.ENM_BACKUP/root_lvm/root/etc/openldap/certs/key3.db> is probably sparse. Size = 16384 st_blocks = 24 block_size
= 512, fstype = 37
05:37:12.549 [10751] <2> bpbkar process_file: INF - /.ENM_BACKUP/root_lvm/root/etc/openldap/certs/key3.db is sparse: stat.st_size = 16384, stat.st_blocks * 512 = 12288
05:37:12.549 [10751] <2> bpbkar process_file: INF - /.ENM_BACKUP/root_lvm/root/etc/openldap/certs/key3.db is now size 8448
05:37:12.552 [10751] <4> check_file_sparseness: File </.ENM_BACKUP/root_lvm/root/etc/openldap/certs/cert8.db> is probably sparse. Size = 65536 st_blocks = 72 block_size
= 512, fstype = 37
05:37:12.552 [10751] <2> bpbkar process_file: INF - /.ENM_BACKUP/root_lvm/root/etc/openldap/certs/cert8.db is sparse: stat.st_size = 65536, stat.st_blocks * 512 = 36864
05:37:12.553 [10751] <2> bpbkar process_file: INF - /.ENM_BACKUP/root_lvm/root/etc/openldap/certs/cert8.db is now size 33024
05:37:12.557 [10751] <4> check_file_sparseness: File </.ENM_BACKUP/root_lvm/root/etc/openldap/certs/secmod.db> is probably sparse. Size = 16384 st_blocks = 24 block_siz
e = 512, fstype = 37
05:37:12.557 [10751] <2> bpbkar process_file: INF - /.ENM_BACKUP/root_lvm/root/etc/openldap/certs/secmod.db is sparse: stat.st_size = 16384, stat.st_blocks * 512 = 1228
8
05:37:12.558 [10751] <2> bpbkar process_file: INF - /.ENM_BACKUP/root_lvm/root/etc/openldap/certs/secmod.db is now size 5120
05:37:17.417 [10751] <4> bpbkar expand_wildcards: end backup for filelist /.ENM_BACKUP/root_lvm/root
05:37:17.422 [10751] <4> bpbkar main: INF - Client completed sending data for backup
05:37:17.422 [10751] <2> bpbkar main: INF - Total Size:0
05:37:17.422 [10751] <2> accl_post_nbfeds_backup_list: Posting Accurate Licensing data
05:37:17.422 [10751] <4> accl_post_nbfeds_backup_list: Master server name: prl-ercenmombsbkp
05:37:17.422 [10751] <4> post_nbfeds_backup_list: Json String value: [{"virtualName":"","jobid":"","backupId":"prl-ercenmsvc3bkp_1566872317","backupSelection":"/.ENM_BA
CKUP/root_lvm/root","frontEndDataSizeMb":3903}]
05:37:17.422 [10751] <2> post_nbfeds_backup_list: Post data URL: netbackup/catalog/frontenddata
05:37:17.430 [10751] <2> NBClientCURL::NBClientCURL: Curl initialized successfully.
05:37:17.430 [10751] <2> NBClientCURL::NBClientCURL: Performing curl_easy_init()
05:37:17.430 [10751] <2> NBClientCURL::NBClientCURL: Disabling any proxy option set in the environment.
05:37:17.430 [10751] <2> NBClientCURL::NBClientCURL: Using paths as caCertFilePath [/usr/openv/var/webtruststore/cacert.pem] privateKeyPath [/usr/openv/var/vxss/credent
ials/keystore/PrivKeyFile.pem]
05:37:17.430 [10751] <2> NBClientCURL::NBClientCURL: Setting default CURL timeout = [300] seconds
05:37:17.430 [10751] <2> NBClientCURL::NBClientCURL: Disable signal during name resolution.
05:37:17.430 [10751] <2> NBClientCURL::nbcurl_bpconf_keyvalue_state: ENABLE_NBCURL_VERBOSE is not set
05:37:17.430 [10751] <2> NBClientCURL::nbcurl_bpconf_keyvalue_state: DISABLE_NBCURL_CONNECTIONS is not set
05:37:17.430 [10751] <2> NBClientCURL::nbcurl_bpconf_keyvalue_state: DISABLE_NBCURL_PREFERRED_NETWORKS is not set
05:37:17.441 [10751] <2> NBClientCURL::setupCurlOperationTimeout: Setting CURL timeout = [180] seconds
05:37:17.441 [10751] <2> NBClientCURL::Post: POSTing to URL https://prl-ercenmombsbkp:1556/netbackup/catalog/frontenddata
05:37:17.441 [10751] <2> NBClientCURL::Post: Sending payload ...
05:37:17.441 [10751] <2> NBClientCURL::setPostOptions: setting curl post options ...
05:37:17.443 [10751] <2> NBClientCURL::performOperation: Configuration suggests use of tunnel.
05:37:17.443 [10751] <2> NBClientCURL::performOperation: Looking up tunnels for master server [prl-ercenmombsbkp].
05:37:17.443 [10751] <2> NBWebsvcTunnelRouter::resetState: Tunnel Router's current state = [0].
05:37:17.443 [10751] <2> NBWebsvcTunnelRouter::resetState: Tunnel Router's state reset to [0].
05:37:17.443 [10751] <2> NBClientCURL::performOperation: Querying client router for a list of tunnels.
05:37:17.443 [10751] <2> NBWebsvcTunnelRouter::getWebsvcTunnelServers: Read NBU configuration settings.
05:37:17.443 [10751] <2> NBWebsvcTunnelRouter::getWebsvcTunnelServers: Returning [0] tunnel server from in-memory cache.
05:37:17.443 [10751] <2> NBWebsvcTunnelRouter::getWebsvcTunnelServers: Returning [0] tunnel servers from user-specified (override) settings in NBU Configuration.
05:37:17.443 [10751] <2> NBWebsvcTunnelRouter::getTunnelsFromCache: Cache File /usr/openv/netbackup/websvctunnels.cache does not exist.
05:37:17.443 [10751] <2> NBWebsvcTunnelRouter::getWebsvcTunnelServers: Returning [0] tunnel servers from Client router's tunnel servers' cache.
05:37:17.443 [10751] <2> NBWebsvcTunnelRouter::getWebsvcTunnelServers: Returning [0] tunnel servers. Will try to directly connect to the Web service.
05:37:17.443 [10751] <2> NBClientCURL::performOperation: Received a list of potential tunnels.
05:37:17.443 [10751] <2> NBClientCURL::performOperation: Empty tunnel entry encountered. Will try a direct connection to the Master server.
05:37:17.443 [10751] <2> NBClientCURL::performOperation: Disabling any --proxy option previously set.
05:37:17.443 [10751] <2> NBClientCURL::performOperation: Disabling any --proxytunnel option previously set.
05:37:17.443 [10751] <2> LoginWithCertManager::getLocalToken: tokenIssueTimeSec = 1566856007
05:37:17.443 [10751] <2> LoginWithCertManager::getLocalToken: tokenExpTimeSec = 1566942407
05:37:17.443 [10751] <2> LoginWithCertManager::getLocalToken: tokenLastFailTimeSec = 0
05:37:17.443 [10751] <2> LoginWithCertManager::isJWTRefreshRequired: tokenExpTimeSec = 1566942407, timeNow = 1566873437, (tokenExpTimeSec-timeNow) = 68970
05:37:17.443 [10751] <2> LoginWithCertManager::isJWTRefreshRequired: Token-refresh IS NOT required
05:37:17.443 [10751] <2> NBClientCURL::performCurlOperation: Obtained login token
05:37:17.473 [10751] <4> NBClientCURL::performCurlOperation: HTTP response header : HTTP/1.1 201 X-Request-ID: A38307B46654BD6E X-Request-ID: A38307B46654BD6E Content-
Length: 0 Date: Tue, 27 Aug 2019 02:37:17 GMT
05:37:17.473 [10751] <2> NBClientCURL::performCurlOperation: Fetched data = [xxxxx], httpcode = 201
05:37:17.473 [10751] <2> NBClientCURL::performOperation: Disabling any --proxy option previously set.
05:37:17.473 [10751] <2> NBClientCURL::performOperation: Disabling any --proxytunnel option previously set.
05:37:17.473 [10751] <2> NBClientCURL:~NBClientCURL: Performing curl_easy_cleanup()
05:37:17.473 [10751] <2> NBClientCURL:~NBClientCURL: Performing curl_global_cleanup()
05:37:17.473 [10751] <4> bpbkar Exit: INF - bpbkar exit normal
05:37:17.473 [10751] <4> bpbkar Exit: INF - EXIT STATUS 0: the requested operation was successfully completed
05:37:17.473 [10751] <2> bpbkar Exit: INF - Close of stdout complete
05:37:17.473 [10751] <4> bpbkar Exit: INF - setenv FINISHED=1
05:54:17.497 [11904] <16> flush_archive(): ERR - Cannot write to STDOUT. Errno = 110: Connection timed out
05:54:17.499 [11904] <16> bpbkar Exit: ERR - bpbkar FATAL exit status = 24: socket write failed
05:54:17.499 [11904] <4> bpbkar Exit: INF - EXIT STATUS 24: socket write failed
05:54:17.499 [11904] <2> bpbkar Exit: INF - Close of stdout complete
05:54:17.499 [11904] <4> bpbkar Exit: INF - setenv FINISHED=0
Above bpbkar logs shows that at 5:37 when bpbkar finished sending data ..it didnot cose the socket and after 16 minutes at 5:54 its timed out
on days when it is succesful : nothing after finish
05:35:56.186 [40463] <2> NBWebsvcTunnelRouter::getTunnelsFromCache: Cache File /usr/openv/netbackup/websvctunnels.cache does not exist.
05:35:56.186 [40463] <2> NBWebsvcTunnelRouter::getWebsvcTunnelServers: Returning [0] tunnel servers from Client router's tunnel servers' cache.
05:35:56.186 [40463] <2> NBWebsvcTunnelRouter::getWebsvcTunnelServers: Returning [0] tunnel servers. Will try to directly connect to the Web service.
05:35:56.186 [40463] <2> NBClientCURL::performOperation: Received a list of potential tunnels.
05:35:56.186 [40463] <2> NBClientCURL::performOperation: Empty tunnel entry encountered. Will try a direct connection to the Master server.
05:35:56.186 [40463] <2> NBClientCURL::performOperation: Disabling any --proxy option previously set.
05:35:56.186 [40463] <2> NBClientCURL::performOperation: Disabling any --proxytunnel option previously set.
05:35:56.186 [40463] <2> LoginWithCertManager::getLocalToken: tokenIssueTimeSec = 1566872278
05:35:56.186 [40463] <2> LoginWithCertManager::getLocalToken: tokenExpTimeSec = 1566958678
05:35:56.186 [40463] <2> LoginWithCertManager::getLocalToken: tokenLastFailTimeSec = 0
05:35:56.186 [40463] <2> LoginWithCertManager::isJWTRefreshRequired: tokenExpTimeSec = 1566958678, timeNow = 1566873356, (tokenExpTimeSec-timeNow) = 85322
05:35:56.186 [40463] <2> LoginWithCertManager::isJWTRefreshRequired: Token-refresh IS NOT required
05:35:56.186 [40463] <2> NBClientCURL::performCurlOperation: Obtained login token
05:35:56.252 [40463] <4> NBClientCURL::performCurlOperation: HTTP response header : HTTP/1.1 201 X-Request-ID: 1086FD2A20531977 X-Request-ID: 1086FD2A20531977 Content-Length: 0 Date: Tue, 27 Aug 2019 02:35:56 GMT
05:35:56.252 [40463] <2> NBClientCURL::performCurlOperation: Fetched data = [xxxxx], httpcode = 201
05:35:56.252 [40463] <2> NBClientCURL::performOperation: Disabling any --proxy option previously set.
05:35:56.252 [40463] <2> NBClientCURL::performOperation: Disabling any --proxytunnel option previously set.
05:35:56.252 [40463] <2> NBClientCURL:~NBClientCURL: Performing curl_easy_cleanup()
05:35:56.252 [40463] <2> NBClientCURL:~NBClientCURL: Performing curl_global_cleanup()
05:35:56.252 [40463] <4> bpbkar Exit: INF - bpbkar exit normal
05:35:56.252 [40463] <4> bpbkar Exit: INF - EXIT STATUS 0: the requested operation was successfully completed
05:35:56.252 [40463] <2> bpbkar Exit: INF - Close of stdout complete
05:35:56.252 [40463] <4> bpbkar Exit: INF - setenv FINISHED=1
During successful backup : 5 Finished=1 entries
[root@prl-ercenmsvc3 bpbkar]# grep -i FINISHED root.082919_00001.log
05:17:47.137 [12954] <4> bpbkar Exit: INF - setenv FINISHED=1
05:18:19.415 [13541] <4> bpbkar Exit: INF - setenv FINISHED=1
05:18:54.266 [12920] <4> bpbkar Exit: INF - setenv FINISHED=1
05:25:22.667 [12983] <4> bpbkar Exit: INF - setenv FINISHED=1
05:29:12.226 [12670] <4> bpbkar Exit: INF - setenv FINISHED=1
and during failure : after 5 finished=1 entries socket dinot close and one more entry after 16 minutes Finished=0
root@prl-ercenmsvc3 bpbkar]# grep -i finished root.082719_00001.log
05:18:51.366 [11239] <4> bpbkar Exit: INF - setenv FINISHED=1
05:19:20.380 [11842] <4> bpbkar Exit: INF - setenv FINISHED=1
05:19:47.308 [11044] <4> bpbkar Exit: INF - setenv FINISHED=1
05:27:24.686 [11187] <4> bpbkar Exit: INF - setenv FINISHED=1
05:37:17.473 [10751] <4> bpbkar Exit: INF - setenv FINISHED=1
05:54:17.499 [11904] <4> bpbkar Exit: INF - setenv FINISHED=0
Is my understanding correct that sometimes bpbkar is not closing socket after sending all stream data to media server.
and how can I correct it ?
any suggestions please ?
Regards
S