Sybase restore take more time than usual
Hello,
Since some days, the refreshs from sybase are taking more time than usually.
Activity monitor logs from this morning is the following :
Jul 28, 2022 5:18:24 AM - begin Restore
Jul 28, 2022 5:18:25 AM - Info bprd (pid=72446) Found (1) files in (1) images for Restore Job ID 5010796.xxx
Jul 28, 2022 5:18:26 AM - restoring from image vmp-sybase-13_1658964717
Jul 28, 2022 5:18:26 AM - Info bprd (pid=72446) Searched (1) files of (1) files for Restore Job ID 5010796.xxx
Jul 28, 2022 5:18:26 AM - Info bprd (pid=72446) Restoring from copy 1 of image created Thu Jul 28 01:31:57 2022 from policy PRD_DC1_SYB
Jul 28, 2022 5:18:26 AM - requesting resource @aaafR
Jul 28, 2022 5:18:26 AM - granted resource MediaID=@aaafR;DiskVolume=dc1backup;DiskPool=DC1-DD9400-01_dc1backup_dp;Path=dc1backup;StorageServer=dc1-dd9400-01-backup.lux.eib.org;MediaServer=vmp-nbums-05.lux.eib.org
Jul 28, 2022 5:18:27 AM - Info bpbrm (pid=2351893) vmt-sybase-40.lux.eib.org is the host to restore to
Jul 28, 2022 5:18:27 AM - Info bpbrm (pid=2351893) reading file list for client
Jul 28, 2022 5:18:27 AM - Info bpbrm (pid=2351893) listening for client connection
Jul 28, 2022 5:18:28 AM - Info bpbrm (pid=2351893) accepted connection from client
Jul 28, 2022 5:18:28 AM - Info dbclient (pid=28172) Restore started
Jul 28, 2022 5:18:28 AM - Info bpbrm (pid=2351893) bptm pid: 2351906
Jul 28, 2022 5:18:29 AM - Info bptm (pid=2351906) start
Jul 28, 2022 5:18:29 AM - started process bptm (pid=2351906)
Jul 28, 2022 5:18:29 AM - Info bptm (pid=2351906) reading backup image
Jul 28, 2022 5:18:29 AM - Info bptm (pid=2351906) using 512 data buffers
Jul 28, 2022 5:18:29 AM - Info bptm (pid=2351906) spawning a child process
Jul 28, 2022 5:18:29 AM - Info bptm (pid=2351906) child pid: 2351908
Jul 28, 2022 5:18:31 AM - begin reading
Jul 28, 2022 5:55:42 AM - Info bptm (pid=2351906) waited for empty buffer 29858 times, delayed 40615 times
Jul 28, 2022 5:55:42 AM - end reading; read time: 0:37:11
Jul 28, 2022 6:21:22 AM - Info bptm (pid=2351906) completed reading backup image
Jul 28, 2022 6:21:22 AM - Info bptm (pid=2351906) EXITING with status 0 <----------
Jul 28, 2022 6:21:22 AM - Info dbclient (pid=28172) done. status: 0
Jul 28, 2022 6:21:22 AM - Info dbclient (pid=28172) done. status: 0: the requested operation was successfully completed
Jul 28, 2022 6:21:22 AM - restored from image vmp-sybase-13_1658964717; restore time: 1:02:56
Jul 28, 2022 6:21:23 AM - end Restore; elapsed time 1:02:59
The requested operation was successfully completed. (0)
We can see that there is a delay between "end reading; read time: 0:37:11" and "Info bptm (pid=2351906) completed reading backup image"
bptm logs are the following :
05:55:42.794 [2351906] <2> read_data: read short block, bytes = 819200, remainder = 0
05:55:42.794 [2351906] <2> read_data: waited for empty buffer 29858 times, delayed 40615 times
05:55:42.812 [2351906] <2> ConnectionCache::connectAndCache: Acquiring new connection for host bck-srv, query type 1
05:55:42.815 [2351906] <2> vnet_pbxConnect_ex: pbxConnectExEx Succeeded
05:55:42.874 [2351906] <2> vnet_proxy_protocol_from_legacy_ex: remote_proxy_version = 4
05:55:42.875 [2351906] <2> logconnections: PROXY CONNECT FROM 172.19.151.22.58627 TO 172.26.88.130.1556 fd = 19
05:55:42.875 [2351906] <2> logconnections: BPDBM CONNECT FROM 127.0.0.1.36799 TO 127.0.0.1.52298 fd = 19
05:55:42.924 [2351906] <2> db_end: Need to collect reply
05:55:42.933 [2351906] <2> closeConnection: Caching connection for query type 1 for reuse
05:55:42.933 [2351906] <4> report_throughput: VBRT 2 2351906 4 1 dc1backup @aaafR 0 1 0 106439456 106439456 1528 180476 612496 1561757 29858 40615 15 103946 512 (../bptm.c.11992)
[...]
06:21:22.399 [2351908] <2> get_next_file: [2351906] received string: (EXIT STATUS 0)
06:21:22.400 [2351908] <2> terminate_restore: [2351906] waited for full buffer 43 times, delayed 17066 times
06:21:22.400 [2351908] <2> terminate_restore: [2351906] sent 1 directories/files to client vmt-sybase-40.lux.eib.org
06:21:22.402 [2351908] <8> JobdSockList::~JobdSockList: 1 socket left.
06:21:22.402 [2351908] <4> JobdSockList::UnregisterSocket: Unregister socket (1).
06:21:22.402 [2351908] <2> put_length_bytes_nonblocking: cannot write data to network: Bad file descriptor (9)
06:21:22.402 [2351908] <2> job_disconnect: Failed to send request
06:21:22.402 [2351908] <2> vnet_proxy_get_logs: fstat(1) failed: 9
06:21:22.444 [2351906] <2> ConnectionCache::connectAndCache: Acquiring new connection for host bck-srv, query type 1
06:21:22.447 [2351906] <2> vnet_pbxConnect_ex: pbxConnectExEx Succeeded
06:21:22.502 [2351906] <2> vnet_proxy_protocol_from_legacy_ex: remote_proxy_version = 4
06:21:22.503 [2351906] <2> logconnections: PROXY CONNECT FROM 172.19.151.22.37405 TO 172.26.88.130.1556 fd = 19
06:21:22.503 [2351906] <2> logconnections: BPDBM CONNECT FROM 127.0.0.1.51745 TO 127.0.0.1.34094 fd = 19
06:21:22.548 [2351906] <2> db_end: Need to collect reply
06:21:22.557 [2351906] <2> closeConnection: Caching connection for query type 1 for reuse
06:21:22.557 [2351906] <4> read_backup: successfully read (restore) backup id vmp-sybase-13_1658964717, copy 1, fragment 1, 106439456 Kbytes at 47705.219 Kbytes/sec
06:21:22.558 [2351906] <2> notify: executing - /usr/openv/netbackup/bin/restore_notify bptm vmp-sybase-13_1658964717 restore
06:21:22.563 [2351906] <2> Media_dispatch_signal: calling child_wait (../common.c:4234) delay 0 seconds
06:21:22.563 [2351906] <2> bptm: Calling tpunmount for media @aaafR
06:21:22.563 [2351906] <2> send_MDS_msg: MEDIA_DONE 0 5010796 0 @aaafR 373 0 {F1CAC1CA-0E23-11ED-9443-B56714432736}
06:21:22.563 [2351906] <2> packageBptmResourceDoneMsg: msg (MEDIA_DONE 0 5010796 0 @aaafR 373 0 {F1CAC1CA-0E23-11ED-9443-B56714432736})
06:21:22.563 [2351906] <2> packageBptmResourceDoneMsg: keyword MEDIA_DONE version 0 jobid 5010796 copyNum 0 mediaId @aaafR mediaKey 373 unloadDelay 0 allocId {F1CAC1CA-0E23-11ED-9443-B56714432736}
06:21:22.563 [2351906] <2> packageBptmResourceDoneMsg: returns 0
06:21:22.569 [2351906] <2> JobInst::sendIrmMsg: returning
06:21:22.569 [2351906] <2> bptm: EXITING with status 0 <----------
An old refresh was like that :
Jul 19, 2022 5:18:18 AM - begin Restore
Jul 19, 2022 5:18:18 AM - Info bprd (pid=98862) Found (1) files in (1) images for Restore Job ID 4870118.xxx
Jul 19, 2022 5:18:19 AM - restoring from image vmp-sybase-13_1658187148
Jul 19, 2022 5:18:19 AM - Info bprd (pid=98862) Searched (1) files of (1) files for Restore Job ID 4870118.xxx
Jul 19, 2022 5:18:19 AM - Info bprd (pid=98862) Restoring from copy 1 of image created Tue Jul 19 01:32:28 2022 from policy PRD_DC1_SYB
Jul 19, 2022 5:18:19 AM - requesting resource @aaafR
Jul 19, 2022 5:18:19 AM - granted resource MediaID=@aaafR;DiskVolume=dc1backup;DiskPool=DC1-DD9400-01_dc1backup_dp;Path=dc1backup;StorageServer=dc1-dd9400-01-backup.lux.eib.org;MediaServer=vmp-nbums-05.lux.eib.org
Jul 19, 2022 5:18:21 AM - Info bpbrm (pid=1983908) vmt-sybase-40.lux.eib.org is the host to restore to
Jul 19, 2022 5:18:21 AM - Info bpbrm (pid=1983908) reading file list for client
Jul 19, 2022 5:18:21 AM - Info bpbrm (pid=1983908) listening for client connection
Jul 19, 2022 5:18:22 AM - Info bpbrm (pid=1983908) accepted connection from client
Jul 19, 2022 5:18:22 AM - Info dbclient (pid=20498) Restore started
Jul 19, 2022 5:18:22 AM - Info bpbrm (pid=1983908) bptm pid: 1983918
Jul 19, 2022 5:18:22 AM - Info bptm (pid=1983918) start
Jul 19, 2022 5:18:22 AM - started process bptm (pid=1983918)
Jul 19, 2022 5:18:23 AM - Info bptm (pid=1983918) reading backup image
Jul 19, 2022 5:18:23 AM - Info bptm (pid=1983918) using 512 data buffers
Jul 19, 2022 5:18:23 AM - Info bptm (pid=1983918) spawning a child process
Jul 19, 2022 5:18:23 AM - Info bptm (pid=1983918) child pid: 1983920
Jul 19, 2022 5:18:25 AM - begin reading
Jul 19, 2022 5:46:20 AM - Info bptm (pid=1983918) waited for empty buffer 7364 times, delayed 17401 times
Jul 19, 2022 5:46:20 AM - end reading; read time: 0:27:55
Jul 19, 2022 5:47:57 AM - Info bptm (pid=1983918) completed reading backup image
Jul 19, 2022 5:47:57 AM - Info bptm (pid=1983918) EXITING with status 0 <----------
Jul 19, 2022 5:47:57 AM - Info dbclient (pid=20498) done. status: 0
Jul 19, 2022 5:47:57 AM - Info dbclient (pid=20498) done. status: 0: the requested operation was successfully completed
Jul 19, 2022 5:47:57 AM - restored from image vmp-sybase-13_1658187148; restore time: 0:29:38
Jul 19, 2022 5:47:58 AM - end Restore; elapsed time 0:29:40
The requested operation was successfully completed. (0)
Here, there is almost no delay between "end reading" and "Info bptm (pid=1983918) completed reading backup image"
05:46:20.084 [1983918] <2> read_data: read short block, bytes = 360448, remainder = 0
05:46:20.085 [1983918] <2> read_data: waited for empty buffer 7364 times, delayed 17401 times
05:46:20.104 [1983918] <2> ConnectionCache::connectAndCache: Acquiring new connection for host bck-srv, query type 1
05:46:20.107 [1983918] <2> vnet_pbxConnect_ex: pbxConnectExEx Succeeded
05:46:20.163 [1983918] <2> vnet_proxy_protocol_from_legacy_ex: remote_proxy_version = 4
05:46:20.164 [1983918] <2> logconnections: PROXY CONNECT FROM 172.19.151.22.34583 TO 172.26.88.130.1556 fd = 19
05:46:20.164 [1983918] <2> logconnections: BPDBM CONNECT FROM 127.0.0.1.36963 TO 127.0.0.1.49226 fd = 19
05:46:20.211 [1983918] <2> db_end: Need to collect reply
05:46:20.217 [1983918] <2> closeConnection: Caching connection for query type 1 for reuseroot.071922_00001.log
05:46:20.217 [1983918] <4> report_throughput: VBRT 2 1983918 4 1 dc1backup @aaafR 0 1 0 105617760 105617760 4389 1326 264181 1358602 7364 17401 15 103144 512 (../bptm.c.11992)
05:47:57.122 [1983920] <2> get_next_file: [1983918] received string: (EXIT STATUS 0)
05:47:57.122 [1983920] <2> terminate_restore: [1983918] waited for full buffer 11309 times, delayed 16489 times
05:47:57.122 [1983920] <2> terminate_restore: [1983918] sent 1 directories/files to client vmt-sybase-40.lux.eib.org
05:47:57.124 [1983920] <8> JobdSockList::~JobdSockList: 1 socket left.
05:47:57.124 [1983920] <4> JobdSockList::UnregisterSocket: Unregister socket (1).
05:47:57.124 [1983920] <2> put_length_bytes_nonblocking: cannot write data to network: Bad file descriptor (9)
05:47:57.124 [1983920] <2> job_disconnect: Failed to send request
05:47:57.124 [1983920] <2> vnet_proxy_get_logs: fstat(1) failed: 9
05:47:57.163 [1983918] <2> ConnectionCache::connectAndCache: Acquiring new connection for host bck-srv, query type 1
05:47:57.166 [1983918] <2> vnet_pbxConnect_ex: pbxConnectExEx Succeeded
05:47:57.181 [1983918] <2> vnet_proxy_protocol_from_legacy_ex: remote_proxy_version = 4
05:47:57.183 [1983918] <2> logconnections: PROXY CONNECT FROM 172.19.151.22.43387 TO 172.26.88.130.1556 fd = 19
05:47:57.183 [1983918] <2> logconnections: BPDBM CONNECT FROM 127.0.0.1.36963 TO 127.0.0.1.49256 fd = 19
05:47:57.231 [1983918] <2> db_end: Need to collect reply
05:47:57.238 [1983918] <2> closeConnection: Caching connection for query type 1 for reuse
05:47:57.238 [1983918] <4> read_backup: successfully read (restore) backup id vmp-sybase-13_1658187148, copy 1, fragment 1, 105617760 Kbytes at 63059.972 Kbytes/sec
05:47:57.238 [1983918] <2> notify: executing - /usr/openv/netbackup/bin/restore_notify bptm vmp-sybase-13_1658187148 restore
05:47:57.244 [1983918] <2> Media_dispatch_signal: calling child_wait (../common.c:4234) delay 0 seconds
05:47:57.244 [1983918] <2> bptm: Calling tpunmount for media @aaafR
05:47:57.244 [1983918] <2> send_MDS_msg: MEDIA_DONE 0 4870118 0 @aaafR 373 0 {703272B6-0711-11ED-B5E1-81C675697DE0}
05:47:57.244 [1983918] <2> packageBptmResourceDoneMsg: msg (MEDIA_DONE 0 4870118 0 @aaafR 373 0 {703272B6-0711-11ED-B5E1-81C675697DE0})
05:47:57.244 [1983918] <2> packageBptmResourceDoneMsg: keyword MEDIA_DONE version 0 jobid 4870118 copyNum 0 mediaId @aaafR mediaKey 373 unloadDelay 0 allocId {703272B6-0711-11ED-B5E1-81C675697DE0}
05:47:57.244 [1983918] <2> packageBptmResourceDoneMsg: returns 0
05:47:57.250 [1983918] <2> JobInst::sendIrmMsg: returning
05:47:57.250 [1983918] <2> bptm: EXITING with status 0 <----------
Any idea why there can be a delay ? How can I investigate ?