Forum Discussion

benjamin-1
Level 2
5 years ago

Duplications issue with 191 error code

Hello dear Veritas VOX community. We have been reading your topics for a long time, and we are very grateful. Thanks!

We have some duplications failing with an OST plugin error: the network exchange breaks after a precise amount of time (something we discovered only recently).

Duplication runs over IP (WAN). On the NetBackup side we have:

<16>:bptm:ddp_filecopy_status() failed, start_offset[0], Err: 5009-filecopy operation failed (nfs: I/O error)

On the Data Domain side:

DDErrNo = 5009 (I/O error)
DDErrNo = 5057 (File handle is stale)

All duplications fail after the same amount of time: the network side breaks after 131 minutes and 30 seconds (2 h 11 min 30 s).

The vendor proposed setting an NFS timeout value of 600000, and the plugin was updated on the media server that performs the duplication over the WAN (one of 10 media servers for this master).

For future reference, we have also seen some status codes other than the master code 191:

84: media write error

190 / 191: below are the child job codes and the plugin error

Error bpduplicate backup id optimized duplication failed, client process aborted (50).
Error bpduplicate Duplicate of backupid failed, client process aborted (50).
Error bpduplicate Status = no images were successfully processed.
2060046: plugin error

The vendor suggests it could be a network break caused by an IPS or an enhanced security feature.

Best regards and thanks in advance for all your answers.

  • Those DD errors need to be resolved by EMC and your network team. NetBackup only tells the DD to replicate/duplicate and then waits for the DD to say when it's done. In your case it's not getting done; it's complaining about NFS I/O. Once that is resolved, you'll see no issue in NetBackup.

    • benjamin-1
      Level 2

      Hello, we found that 131 minutes and 15 s corresponds to these kernel settings:

      net.ipv4.tcp_keepalive_time = 7200
      net.ipv4.tcp_keepalive_intvl = 75
      net.ipv4.tcp_keepalive_probes = 9

      We found that the recommended value seems to be 900 in the Symantec NetBackup Backup Planning and
      Performance Tuning Guide, and applied it on the master and media servers:

      net.ipv4.tcp_keepalive_time = 900

      The issue seems to be solved. We will work further with our network team to check which tcp_keepalive_time suits our WAN link.
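
For readers wondering where 131 minutes and 15 s comes from, here is a minimal sketch of the arithmetic, assuming the usual Linux behaviour: the first keepalive probe is sent after tcp_keepalive_time seconds of idle time, and the connection is dropped once tcp_keepalive_probes probes, sent tcp_keepalive_intvl seconds apart, have all gone unanswered. The function names are hypothetical and only for illustration.

```python
def keepalive_teardown_seconds(keepalive_time, keepalive_intvl, keepalive_probes):
    """Idle seconds before the kernel gives up on a peer that never answers probes."""
    return keepalive_time + keepalive_intvl * keepalive_probes


def as_minutes_seconds(total_seconds):
    """Format a number of seconds as 'M min S s'."""
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes} min {seconds} s"


# Values quoted in the post above.
default = keepalive_teardown_seconds(7200, 75, 9)
# Same formula after lowering tcp_keepalive_time to 900.
tuned = keepalive_teardown_seconds(900, 75, 9)

print(default, "s =", as_minutes_seconds(default))  # 7875 s = 131 min 15 s
print(tuned, "s =", as_minutes_seconds(tuned))      # 1575 s = 26 min 15 s
```

The second figure only applies when the probes really go unanswered. If the first probe at 900 s gets through, the idle timer restarts and the connection stays open.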

      • Nicolai
        Moderator

        tcp_keepalive_time will only make a difference if there is a firewall between the two Data Domains.

        TCP keepalive sends a probe packet to the destination once the connection has been idle for that long, which prevents a firewall from closing the idle connection.

        You should ask the network admin what idle timeout is configured in the firewall to find the right tcp_keepalive_time.
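
Once the network admin has given you the firewall's idle timeout, the check is simple: tcp_keepalive_time has to be shorter than that timeout, so a probe goes out before the firewall forgets the session. Here is a rough sketch that parses sysctl-style text and makes the comparison. The 3600 s firewall timeout is an invented example value, not something from this thread, and the helper names are hypothetical.

```python
def parse_sysctl(text):
    """Parse 'key = value' lines (sysctl -a / sysctl.conf style) into a dict of ints."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        try:
            settings[key.strip()] = int(value.strip())
        except ValueError:
            # Skip non-numeric settings; only the keepalive values matter here.
            continue
    return settings


def keepalive_beats_firewall(settings, firewall_idle_timeout_s):
    """True if the first probe is sent before the firewall's idle timeout expires."""
    return settings["net.ipv4.tcp_keepalive_time"] < firewall_idle_timeout_s


DEFAULTS = """
net.ipv4.tcp_keepalive_time = 7200
net.ipv4.tcp_keepalive_intvl = 75
net.ipv4.tcp_keepalive_probes = 9
"""
TUNED = "net.ipv4.tcp_keepalive_time = 900"

FIREWALL_IDLE_TIMEOUT_S = 3600  # assumed example; ask your network admin for the real one

print(keepalive_beats_firewall(parse_sysctl(DEFAULTS), FIREWALL_IDLE_TIMEOUT_S))
print(keepalive_beats_firewall(parse_sysctl(TUNED), FIREWALL_IDLE_TIMEOUT_S))
```

With the assumed 3600 s timeout, the default 7200 fails the check and the tuned 900 passes it.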