Forum Discussion
- NicolaiModerator
This is a issue when the OS doesn't return a list of files after 60 seconds. Try do a "ls -l" on the directory with the most files and see if it takes more than 60 seconds complete. Be aware that re-issuing the command properly will return the file list pretty quick since the list is in the file cache.
To find the directory that causes the timeout follow this T/N:
http://www.symantec.com/docs/TECH31513
You can't change the 60 second timeout, with CLIENT_READ_TIME or CLIENT_CONNENCT_TIMEOUT. To remedy the issue you can reduce the number of files, exclude it for backup or tune the file system.
- bsbLevel 3
Sorry I should have mentioned this is an RMAN backup not filesystem. I looked at the tech doc you linked to though and turned on the higher level tracing as it suggested. I am seeing some errors from vnetd:
13:21:56.140 [18374] <2> vnet_sock_ready: vnet.c.507: max_time: 120 0x00000078
13:21:56.140 [18374] <2> vnet_sock_ready: vnet.c.508: sock: 8 0x00000008
13:21:56.140 [18374] <2> vnet_sock_ready: vnet.c.509: Function failed: 11 0x0000000b
13:21:56.140 [18374] <2> vnet_pop_byte: vnet.c.173: Function failed: 11 0x0000000b
13:21:56.140 [18374] <2> vnet_pop_string: vnet.c.268: Function failed: 11 0x0000000b
13:21:56.140 [18374] <2> vnet_pop_signed: vnet.c.312: Function failed: 11 0x0000000b
13:21:56.140 [18374] <2> vnet_pop_status: vnet.c.390: Function failed: 11 0x0000000b
13:21:56.140 [18374] <2> vnet_vnetd_connect_forward_socket_end: vnet_vnetd.c.614: try again: 540 0x0000021cThanks
- rizwan84txLevel 6
Could you enable bpcd log on the client (Verbose 5) and check if a response was received.
- bsbLevel 3
I turned on logging. I do not see much in the BPCD logs. In the activity monitor I see a status 54 at 12:57:59 AM and in my RMAN logs I see several errors like the following:
RMAN-03009: failure of backup command on s1 channel at 04/19/2011 01:14:10
ORA-19506: failed to create sequential file, name="c-2237647430-20110412-00", parms=""
ORA-27028: skgfqcre: sbtbackup returned error
ORA-19511: Error received from media manager layer, error text:
VxBSACreateObject: Failed with error:
Server Status: timed out connecting to client
ORA-19600: input file is backup piece (+ARCHIVE/autobackup/2011_04_12/s_748224764.305.748224765)
ORA-19601: output file is backup piece (c-2237647430-20110412-00)
channel s1 disabled, job failed on it will be run on another channelIn the BPCD log at that time I see where the failure occured but there is no actual error message. It just submits a close request then another open.
00:57:59.825 [6374] <2> process_requests: BPCD_OPEN_FOR_WRITE_RQST
00:57:59.826 [6374] <2> process_requests: open for write file name = /usr/openv/netbackup/logs/user_ops/dbext/logs/4769.0.1303189075
00:57:59.828 [6374] <2> process_requests: buffer length = 22
00:57:59.829 [6374] <2> process_requests: our open flags = 0x119
00:57:59.829 [6374] <2> process_requests: client open flags = 0x1441
00:57:59.829 [6374] <2> process_requests: completed the write of the file number to the output socket
00:57:59.833 [6374] <2> process_requests: BPCD_LOG_RQST
00:57:59.900 [6374] <2> process_requests: BPCD_CLOSE_FILE_RQST
00:58:01.858 [6374] <2> process_requests: BPCD_OPEN_FOR_WRITE_RQST
00:58:01.860 [6374] <2> process_requests: open for write file name = /usr/openv/netbackup/logs/user_ops/dbext/logs/4751.0.1303189077
00:58:01.861 [6374] <2> process_requests: buffer length = 22
00:58:01.863 [6374] <2> process_requests: our open flags = 0x119
00:58:01.863 [6374] <2> process_requests: client open flags = 0x1441
00:58:01.863 [6374] <2> process_requests: completed the write of the file number to the output socket
00:58:01.867 [6374] <2> process_requests: BPCD_LOG_RQST
00:58:01.930 [6374] <2> process_requests: BPCD_CLOSE_FILE_RQST
00:58:09.286 [7145] <2> process_requests: BPCD_DISCONNECT_RQST
00:58:09.286 [7145] <2> bpcd exit_bpcd: exit status 0 ----------->exitingSince the backups write many of the files it is not a connection issue. It is consistent but not constant. Do I need to set one of the timeouts to be as long as the max file write time? I do not see any realted to how long a file takes to write.
Thanks
- NicolaiModerator
Try ramp-up CLIENT_READ_TIMEOUT. RMAN sometimes need very long time to determine what archives is needed. All this time Netbackup need to wait. Try setting CLIENT_READ_TIMEOUT = 86400 and retry the operation. Today where all communication is passed thru 13724, more meaningful messages may be masked as sockets errors because of vnetd.
- bsbLevel 3
I set the read timeout to the max it would allow me (327667) and it still timed out.
readCommFile: ERR - timed out after 32767 seconds while reading from /usr/openv/netbackup/logs/user_ops/dbext/logs/20626.0.1303445742
Brian
- NicolaiModerator
Can you confirm that a connection is really established ?
How fragmented is the recover catalog ?. I have seen really long response time from a recover catalog because some household routines of the RMAN catalog wasn't running (index update etc etc). Talk to the DBA about this.
Related Content
- 10 months ago
- 6 years ago
- 13 years ago
- 14 years ago