While I wait for access to a few clients that failed last night, I figured I'd throw this out there in case there is a quick explanation...
The environment is running NetBackup 6.5.6 (on clients and media servers). We have two media servers (one Windows, one Linux) backing up Windows and Red Hat Linux clients.
The storage media is LTO4.
In an effort to improve performance, I increased NET_BUFFER_SZ from 65536 to 262144 and DATA_BUFFER_SZ from 131072 to 262144. I just realized the two higher numbers in the log differ (262144 requested vs. 262142 reported). Perhaps I fat-fingered the value when I made the change, but this is what I see in the bptm log:
Before:
23:07:46.556 [23746] <2> io_set_recvbuf: receive network buffer is 131072 bytes
23:07:49.634 [23856] <2> io_set_recvbuf: setting receive network buffer to 65536 bytes
23:07:49.634 [23856] <2> io_set_recvbuf: receive network buffer is 131072 bytes
23:07:50.634 [23882] <2> io_set_recvbuf: setting receive network buffer to 65536 bytes
23:07:50.634 [23882] <2> io_set_recvbuf: receive network buffer is 131072 bytes
23:07:50.646 [23880] <2> io_set_recvbuf: setting receive network buffer to 65536 bytes
After:
17:03:15.992 [9475] <2> io_set_recvbuf: setting receive network buffer to 262144 bytes
17:03:15.992 [9475] <2> io_set_recvbuf: receive network buffer is 262142 bytes
17:03:41.647 [9520] <2> io_set_recvbuf: setting receive network buffer to 262144 bytes
17:03:41.647 [9520] <2> io_set_recvbuf: receive network buffer is 262142 bytes
17:04:15.462 [9602] <2> io_set_recvbuf: setting receive network buffer to 262144 bytes
17:04:15.462 [9602] <2> io_set_recvbuf: receive network buffer is 262142 bytes
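
For anyone who wants to compare requested vs. reported socket buffer sizes outside of NetBackup, here is a minimal Python sketch. It is not what bptm does internally; it only sets SO_RCVBUF on a throwaway socket and reads the value back. The function name effective_recvbuf is made up for illustration. On Linux the kernel normally doubles the requested value and caps the request at net.core.rmem_max, so the number read back can differ from the number asked for. A reported 262142 would be consistent with a cap of 131071 being doubled, but that is an assumption about this host, not something the log confirms.

```python
import socket


def effective_recvbuf(requested: int) -> int:
    """Ask the OS for a receive buffer of `requested` bytes and
    return the size the OS reports back for that socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # Request the buffer size (the OS may adjust or cap it).
        s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
        # Read back what was actually applied.
        return s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


# Compare the old and new NET_BUFFER_SZ values from this post.
for size in (65536, 262144):
    print(f"requested {size} bytes, OS reports {effective_recvbuf(size)} bytes")
```

Running this on the media server and on a failing client would show whether the larger value is actually being honored on each side.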
For some reason, several (but not all) Linux clients failed with RC=13:
Sep 2, 2010 6:13:30 PM - begin writing
Sep 2, 2010 6:23:48 PM - Error bpbrm (pid=21985) socket read failed: errno = 62 - Timer expired
Sep 2, 2010 6:25:02 PM - end writing; write time: 0:11:32
file read failed (13)
I reverted NET_BUFFER_SZ to 65536, and the jobs reran successfully.
Any thoughts?