Forum Discussion

Arton
Level 2
10 years ago

Duplication error

Hello,

I'm facing an issue when duplicating data from disk to MEDIA (data size optimization process).

I start the job, and after many hours I get this error:

...

17.06.2015 18:38:38 - end reading; read time: 00:00:26
17.06.2015 18:38:38 - begin reading
17.06.2015 18:39:14 - end reading; read time: 00:00:36
17.06.2015 18:39:14 - begin reading

...

17.06.2015 18:39:37 - Info bptm(pid=33528) waited for full buffer 906382 times, delayed 1534064 times    
17.06.2015 18:39:45 - Info bptm(pid=33528) EXITING with status 0 <----------        
17.06.2015 18:39:46 - end reading; read time: 00:00:32
17.06.2015 18:39:46 - Info bpdm(pid=25172) completed reading backup image         
17.06.2015 18:39:46 - Info bpdm(pid=25172) EXITING with status 0         
client process aborted(50)

 

I do not understand why. I tried several times with several data locations and got the same issue.

Could you please give me some support?

Thanks a lot

  • .... after many hours I have this error .....

    How many hours?

    I remember something about duplication failures after 2 hours.

    Seems you are using NBU 7.5, right? Which patch level?

    Tell us more about source and destination -
    all attached to same media server?  
    different media servers?
    Windows OS versions?

    Do you have log folders on media server(s)?
    bptm and bpbrm will be needed as a start.

  • Hi,

    After approximately 8 to 10 hours.

    v7.5.0.6

    One duplication by one, to one media server by one

    Win Server 2008 R2

    I do not see anything interesting in the logs I have...

    If anyone has any idea..

  • Not sure what this means:

    One duplication by one, to one media server by one 

    Is duplication local on a single media server, or from one media server to another?

    I did not expect to see anything 'interesting' - just an indication of which process has 'aborted', up to where images were read, up to where images were written.

    A good friend fixed failed duplications with these performance tuning settings:

    1. The paging file, previously set to automatic, was changed to a fixed size of 64 GB (twice the physical memory) and moved to the F drive, which has the most free space.
    2. Increased the number of worker threads available to Windows using the following registry values (note that RpcXdr\Parameters\DefaultNumberofWorkerThreads had to be created):

    HKLM\SYSTEM\CurrentControlSet\Services\RpcXdr\Parameters\DefaultNumberofWorkerThreads   64

    HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Executive\AdditionalDelayedWorkerThreads   16

    HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Executive\AdditionalCriticalWorkerThreads   16
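
    For anyone who wants to script the three registry values listed in this reply, here is a minimal sketch that only builds the `reg add` command lines and prints them for review; it does not touch the registry. Assumptions on my side, not from the post: the values are of type REG_DWORD, and the key is spelled `Session Manager` (with a space), as it appears on the Windows systems I know. The helper names are hypothetical.

```python
# Sketch: build (not run) `reg add` commands for the worker-thread values
# quoted in the reply. Assumptions: REG_DWORD type, "Session Manager" spelling.

SETTINGS = [
    (r"HKLM\SYSTEM\CurrentControlSet\Services\RpcXdr\Parameters",
     "DefaultNumberofWorkerThreads", 64),
    (r"HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Executive",
     "AdditionalDelayedWorkerThreads", 16),
    (r"HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Executive",
     "AdditionalCriticalWorkerThreads", 16),
]


def reg_add_command(key, name, value):
    """Return one `reg add` command line; /f overwrites without prompting."""
    return f'reg add "{key}" /v {name} /t REG_DWORD /d {value} /f'


def build_commands(settings=SETTINGS):
    """Return the command lines for every (key, name, value) entry."""
    return [reg_add_command(key, name, value) for key, name, value in settings]


# Print the commands so they can be reviewed before anyone runs them.
for command in build_commands():
    print(command)
```

    Review the printed lines, export the affected keys first, and run them from an elevated prompt only if they match what you intend. Changes of this kind normally need a reboot to take effect.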

  • This is interesting. It looks contradictory to me...


    17.06.2015 18:39:14 - begin reading

    ...

    17.06.2015 18:39:37 - Info bptm(pid=33528) waited for full buffer 906382 times, delayed 1534064 times 

    That "waited for full buffer" message in a duplication job usually means read I/O from the source is very slow, and yet the read took just 23 seconds to complete?! This is the first time I have seen this.

    Error 50 means a client-side process was suddenly terminated. So check whether the media server running your duplication job is too busy, is running out of memory, etc.
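
    The 23 seconds mentioned above is the gap between the last "begin reading" line and the bptm "waited for full buffer" line. A small sketch of that arithmetic, assuming every job-detail line starts with a `DD.MM.YYYY HH:MM:SS` timestamp as in the excerpts in this thread (the function names are hypothetical):

```python
from datetime import datetime

# Timestamp layout used at the start of each job-detail line in this thread.
STAMP_FORMAT = "%d.%m.%Y %H:%M:%S"


def parse_stamp(line):
    """Parse the leading 19-character timestamp of a job-detail line."""
    return datetime.strptime(line.strip()[:19], STAMP_FORMAT)


def seconds_between(first_line, second_line):
    """Whole seconds elapsed from first_line to second_line."""
    delta = parse_stamp(second_line) - parse_stamp(first_line)
    return int(delta.total_seconds())


begin = "17.06.2015 18:39:14 - begin reading"
waited = ("17.06.2015 18:39:37 - Info bptm(pid=33528) waited for full buffer "
          "906382 times, delayed 1534064 times")

print(seconds_between(begin, waited))  # 23
```

    Note that the "waited ... times" counters are totals for the whole bptm process, so they cover the many hours the job ran, not only this last read.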

  • Hi,

    Thanks for your comments. I still do not understand the issue :(

    I launched a new duplication from disk to tape, which aborted on 25 June 2015 (today) at 1:38 am.

    25.06.2015 01:37:39 - begin reading
    25.06.2015 01:38:00 - end reading; read time: 00:00:21
    25.06.2015 01:38:00 - begin reading
    25.06.2015 01:38:09 - Info bptm(pid=11088) waited for full buffer 1063797 times, delayed 1721761 times    
    25.06.2015 01:38:16 - Info bptm(pid=11088) EXITING with status 0 <----------        
    25.06.2015 01:38:16 - end reading; read time: 00:00:16
    25.06.2015 01:38:17 - Info bpdm(pid=7852) completed reading backup image         
    25.06.2015 01:38:17 - Info bpdm(pid=7852) EXITING with status 0         
    client process aborted(50)

     

    I have attached the bptm and bpbrm logs.

    Thanks for your help.

  • bpbrm is not needed because it's a duplication job.

    OK, now with the full bptm logs, I can see what's happening. The job actually started well before "1:30am".

    These error messages keep appearing:

    01:38:09.868 [11088.1676] <16> socksend: failed writing to socket (h_errno=10054)
    01:38:09.868 [11088.1676] <16> socksend: failed writing to socket: An existing connection was forcibly closed by the remote host.
    01:38:09.868 [11088.1676] <16> socksend: failed to send this: <WROTE filer00.hq.root.net_1405216811 80708 1 57823.710 1
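
    To see how often these socket errors occur across a whole bptm log, something like the following sketch could help. It assumes every log line has the `HH:MM:SS.mmm [pid.tid] <severity> function: message` shape of the three lines above, which is all I can infer from this excerpt. Winsock error 10054 is WSAECONNRESET, which matches the "forcibly closed by the remote host" text; the function and table names are hypothetical.

```python
import re
from collections import Counter

# Line shape inferred from the bptm excerpt in this thread (an assumption).
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(?P<pid>\d+)\.(?P<tid>\d+)\] "
    r"<(?P<severity>\d+)> "
    r"(?P<func>\w+): (?P<msg>.*)$"
)
ERRNO_RE = re.compile(r"h_errno=(\d+)")

# Only the code seen in this thread; extend as needed.
WINSOCK_NAMES = {10054: "WSAECONNRESET"}


def socket_errors(lines):
    """Count h_errno codes reported in log lines of the assumed shape."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        errno_match = ERRNO_RE.search(match.group("msg"))
        if errno_match:
            counts[int(errno_match.group(1))] += 1
    return counts


sample = [
    "01:38:09.868 [11088.1676] <16> socksend: failed writing to socket (h_errno=10054)",
    "01:38:09.868 [11088.1676] <16> socksend: failed writing to socket: "
    "An existing connection was forcibly closed by the remote host.",
    "01:38:09.868 [11088.1676] <16> socksend: failed to send this: "
    "<WROTE filer00.hq.root.net_1405216811 80708 1 57823.710 1",
]

counts = socket_errors(sample)
for code, count in counts.items():
    print(code, WINSOCK_NAMES.get(code, "unknown"), count)
```

    On a real log, pass the open file object instead of `sample`. The timestamp of the first hit shows when the connection was actually lost.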

    I believe the backup image is an NDMP backup, judging by the "filer" host. Questions:

    1) Where did you back up the NDMP data to? Which storage unit?

    2) How did you duplicate it (Vault, manual duplication, or SLP)?

    Is there a connection between your source media server and the duplication target? Please check.
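
    A basic way to do that check from the source media server is a plain TCP connect test, sketched below. The host name in the usage comment is a placeholder, and ports 1556 (PBX) and 13724 (vnetd) are the ones I would try first for NetBackup; treat both as assumptions and verify them for your environment. A successful connect only proves the port is reachable now. It does not rule out a firewall or idle timeout dropping a long-running session hours later.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_ports(host, ports, timeout=5.0):
    """Return a {port: reachable} mapping for the given host."""
    return {port: can_connect(host, port, timeout) for port in ports}


# Example usage (hypothetical host name, assumed ports):
# print(check_ports("target-media-server.example.com", [1556, 13724]))
```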