Forum Discussion

dcbone
Level 3
5 years ago

Status 41: Network Connection Timed out on full jobs only

The current NetBackup version is 7.7.3. We have no support contract and no plans to renew it to bring NetBackup current, so I'm posting here for some help.

This environment backs up some old UNIX servers, which recently began failing with Status 41: Network Connection Timed out, but only on full backup schedules. I increased the client read timeout (up to 1800) on the master server and increased the client read and file browse timeouts on the client to 7200 each. After doing this, the full backup still timed out after about 45 minutes of the job running.

Recently, I checked "allow multiple data streams" and the total backup ran for 13 hours. The job that has /opt in its file list failed but still wrote a lot of KB and files. The last 2 lines of the status are:

info bpbkar32 (pid = 12833) done. status: 41 network connection timed out

end writing; write time 5:33:25

 

Looking for any assistance please.

7 Replies

  • You may want to check this article: https://www.veritas.com/support/en_US/article.100003560



    You said:
    *I increased the client read timeouts (up to 1800) on the master server and increased the client read/file browse timeouts on the client to 7200 each. After doing this, the full backup timeout still occurred after about 45 minutes of the job running.*

    My question is: is the master server also the media server?

    You should change the 'client read timeout' on the media server to 3600 and set the same parameter on the client to the same value (there is no need to change the file browse timeout; you can set it back to 300).
    After doing this, run another backup.

    Is there a FW between the master/media server and the client? If yes, ask your network team to check its logs for any dropped connections.

    Your full backup is failing because it has to send far more data than an incremental. If there is any delay on the client side, the media server waits up to the timeout that you set, but raising that timeout makes no difference if the FW has already dropped the connection.
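To illustrate the reasoning above, here is a minimal Python sketch of the idea that whichever component has the shortest idle limit ends the session first. The two NetBackup values come from the original post; the firewall idle timeout of 900 seconds is purely hypothetical and would have to come from the network team.

```python
def first_to_give_up(idle_limits):
    """Return (name, seconds) for the component with the shortest idle limit."""
    name = min(idle_limits, key=idle_limits.get)
    return name, idle_limits[name]


limits = {
    "client read timeout set on the master server": 1800,  # from the original post
    "client read timeout set on the client": 7200,         # from the original post
    "firewall idle timeout": 900,                          # HYPOTHETICAL value
}

culprit, seconds = first_to_give_up(limits)
print(culprit, seconds)
```

With these assumed numbers the firewall would cut an idle connection long before either NetBackup timeout expires, which is why raising the NetBackup values alone may not change the outcome.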

    My suggestions are:
    Enable Accelerator and run a full backup.
    If it fails, try another test with client-side deduplication (the client must have enough resources to handle the deduplication).
    If this works, then there is a problem with your connection, almost certainly at the FW.

    One more thing (I know this reply is a bit long, but network/connection status codes, especially 13, 14 and 41, are tricky and many factors can be the root cause): check that your switches, routers and client NICs are running at full duplex, not half duplex.
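On a Linux client, `ethtool <interface>` reports the negotiated speed and duplex. The sketch below parses that kind of output in Python. The interface name eth0 and the sample text are assumptions made up for illustration, not output from the poster's server.

```python
# Made-up sample in the usual `ethtool eth0` layout (not from the real client).
SAMPLE = """\
Settings for eth0:
        Speed: 100Mb/s
        Duplex: Half
        Auto-negotiation: on
        Link detected: yes
"""


def link_settings(ethtool_output):
    """Collect 'Key: value' pairs from ethtool output into a dict."""
    fields = {}
    for line in ethtool_output.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and value.strip():  # skip headers such as 'Settings for eth0:'
            fields[key.strip()] = value.strip()
    return fields


def is_half_duplex(ethtool_output):
    """True when the reported duplex mode is 'Half'."""
    return link_settings(ethtool_output).get("Duplex", "").lower() == "half"


print(link_settings(SAMPLE)["Speed"], is_half_duplex(SAMPLE))
```

To use it, paste the real `ethtool` output from the client in place of SAMPLE. A half-duplex result on a link that should be full duplex would be worth raising with the network team.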

    Good luck
    • Hamza_H
      Moderator

      Hello dcbone,

      Just wondering whether you were able to resolve this. If so, could you please share the solution or mark the post that helped you?

      Thanks :)

      BR.

       

      • dcbone
        Level 3

        It resolved itself; unfortunately, I never found a solution. It just stopped happening.

  • Hello,
    Please share the following:
    The NBU version and OS of the client
    Is client-side deduplication enabled?
    Is Accelerator enabled? If yes, try a forced rescan backup
    The detailed status
    Is the throughput good enough? How much data are you trying to back up?
    Did it work before? And finally, enable bpbkar logs on the client with verbose = 3 and look for any errors in the same timeframe.
    Good luck
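For a UNIX client, legacy debug logging for bpbkar is normally turned on by creating the directory /usr/openv/netbackup/logs/bpbkar and adding a `VERBOSE = 3` line to /usr/openv/netbackup/bp.conf. The read-only Python sketch below only checks whether both are in place. The paths assume a default install location, so verify them against the logging documentation for your version. If Python 3 is not available on an old client, `verbose_level` can be run against a copy of bp.conf elsewhere.

```python
import os
import re

# Assumed default UNIX client locations; adjust if NetBackup lives elsewhere.
BP_CONF = "/usr/openv/netbackup/bp.conf"
BPBKAR_LOG_DIR = "/usr/openv/netbackup/logs/bpbkar"


def verbose_level(bp_conf_text):
    """Return the integer VERBOSE level found in bp.conf text, or None if unset."""
    level = None
    for line in bp_conf_text.splitlines():
        match = re.match(r"\s*VERBOSE\s*=\s*(-?\d+)\s*$", line)
        if match:
            level = int(match.group(1))  # in this sketch the last entry wins
    return level


def report(bp_conf_path=BP_CONF, log_dir=BPBKAR_LOG_DIR):
    """Print whether the log directory exists and which VERBOSE level is set."""
    print("log directory present:", os.path.isdir(log_dir))
    if os.path.isfile(bp_conf_path):
        with open(bp_conf_path) as handle:
            print("VERBOSE level:", verbose_level(handle.read()))
    else:
        print("bp.conf not found at", bp_conf_path)


if __name__ == "__main__":
    report()
```

Once the failing job has been captured, it is worth lowering the verbosity again, since level 3 logs can grow quickly.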
    • dcbone
      Level 3

      Version of NBU and OS of the client: 7.7.3 and RHEL 5

      Is client-side deduplication enabled? Currently set to "always use the media server".

      Accelerator? If yes, try a forced rescan backup: Accelerator is unchecked.

      Is the throughput good enough? I would assume so; these were working as recently as a couple of weeks ago.

      How much data are you trying to back up? ~300 GB

      Did it work before? Yes, this is a recent occurrence. No changes that I'm aware of.

      And finally, enable bpbkar logs on the client with verbose = 3 and look for any errors in the same timeframe: I don't have direct access to this client; I will work with someone who does and report back. Could you point me to instructions for enabling this? I'm a new NBU admin.

      • dcbone
        Level 3

        Detailed status. The job ends at 100% complete, with files and folders written.

         

        05/01/2020 03:10:03 - Info nbjm (pid=2552) starting backup job (jobid=343720) for client *servername*, policy *policyname*, schedule FULL_D2D
        05/01/2020 03:10:03 - Info nbjm (pid=2552) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=343720, request id:{6id}
        05/01/2020 03:10:03 - requesting resource *diskname*
        05/01/2020 03:10:03 - requesting resource *nbuservername*.NBU_CLIENT.MAXJOBS.*name*
        05/01/2020 03:10:03 - requesting resource *nbuservername*.NBU_POLICY.MAXJOBS.*policyname*
        05/01/2020 03:10:03 - granted resource *nbuservername*.NBU_CLIENT.MAXJOBS.*servername*
        05/01/2020 03:10:03 - granted resource *nbuservername*.NBU_POLICY.MAXJOBS.*policyname*
        05/01/2020 03:10:03 - granted resource MediaID=@aaaaZ;Path=*storagepath*;MediaServer=*nbuservername*
        05/01/2020 03:10:03 - granted resource *diskname*
        05/01/2020 03:10:03 - estimated 0 kbytes needed
        05/01/2020 03:10:03 - Info nbjm (pid=2552) started backup (backupid=*servernameid*) job for client *servername*, policy *policyname*, schedule FULL_D2D on storage unit *storagename*
        05/01/2020 03:10:04 - started process bpbrm (pid=588)
        05/01/2020 03:10:05 - Info bpbrm (pid=588) *servername* is the host to backup data from
        05/01/2020 03:10:05 - Info bpbrm (pid=588) reading file list for client
        05/01/2020 03:10:05 - connecting
        05/01/2020 03:10:07 - Info bpbrm (pid=588) starting bpbkar32 on client
        05/01/2020 03:10:07 - Info bpbkar32 (pid=12833) Backup started
        05/01/2020 03:10:07 - connected; connect time: 0:00:00
        05/01/2020 03:10:07 - Info bptm (pid=2596) start
        05/01/2020 03:10:07 - Info bptm (pid=2596) using 262144 data buffer size
        05/01/2020 03:10:07 - Info bptm (pid=2596) setting receive network buffer to 1049600 bytes
        05/01/2020 03:10:07 - Info bptm (pid=2596) using 128 data buffers
        05/01/2020 03:10:09 - Info bptm (pid=2596) start backup
        05/01/2020 03:10:12 - Info bptm (pid=2596) backup child process is pid 6360.6920
        05/01/2020 03:10:12 - Info bptm (pid=6360) start
        05/01/2020 03:10:12 - begin writing
        05/01/2020 08:43:37 - Info bpbkar32 (pid=12833) done. status: 41: network connection timed out
        05/01/2020 08:43:37 - end writing; write time: 5:33:25
        network connection timed out (41)
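As a quick sanity check on the job detail above, this minimal Python sketch measures the time between "begin writing" and "end writing" using lines copied from the log. It assumes the timestamps are MM/DD/YYYY; since both fall on the same day, the result is the same either way.

```python
from datetime import datetime

# Lines copied from the job detail above.
DETAIL = """\
05/01/2020 03:10:12 - begin writing
05/01/2020 08:43:37 - Info bpbkar32 (pid=12833) done. status: 41: network connection timed out
05/01/2020 08:43:37 - end writing; write time: 5:33:25
"""


def parse_stamp(line):
    """Parse the leading 'MM/DD/YYYY HH:MM:SS' timestamp of a job detail line."""
    return datetime.strptime(line[:19], "%m/%d/%Y %H:%M:%S")


def write_window(detail):
    """Return the (start, end) timestamps of the writing phase."""
    start = end = None
    for line in detail.splitlines():
        line = line.strip()
        if line.endswith("- begin writing"):
            start = parse_stamp(line)
        elif "- end writing" in line:
            end = parse_stamp(line)
    return start, end


start, end = write_window(DETAIL)
elapsed = end - start
print(elapsed, elapsed.total_seconds())
```

The elapsed time matches the reported write time of 5:33:25 (20005 seconds), well beyond the 1800 and 7200 second values mentioned earlier. That suggests the timeout fired on a stall partway through the job rather than on total run time, since the client read timeout measures how long the server waits without receiving data.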