
NDMP Direct Copy just stopped working

Hanzo581
Level 4

Ok, strange issue, not sure what is going on.

All of a sudden this morning my network team informed me that some backup traffic has been going over the network over the weekend instead of over fibre like it normally does. We have the data buffers set to 262144 on our master server, which handles all our NDMP jobs. This is what one of our SLP jobs is displaying:

 

10/27/2014 6:35:27 AM - Info bpduplicate(pid=9052) Suspend window close behavior is not supported for NDMP or NDMP direct copy
10/27/2014 6:35:27 AM - Info bpduplicate(pid=9052) window close behavior: Continue processing the current image    
10/27/2014 6:35:28 AM - Info bptm(pid=7428) start           
10/27/2014 6:35:28 AM - started process bptm (7428)
10/27/2014 6:35:30 AM - Info bptm(pid=7428) using 65536 data buffer size       
10/27/2014 6:35:30 AM - Info bptm(pid=7428) setting receive network buffer to 262144 bytes     
10/27/2014 6:35:32 AM - Info bptm(pid=7428) start backup          
10/27/2014 6:35:32 AM - Info bptm(pid=9436) start           
10/27/2014 6:35:32 AM - started process bptm (9436)
10/27/2014 6:35:33 AM - Info bptm(pid=7428) setting receive network buffer to 262144 bytes     
10/27/2014 6:35:33 AM - Info bptm(pid=9436) reading backup image

Does this mean there is a mismatch on the buffer sizes causing the issue?  Is there a way to set the buffer size on our Netapp Filers somewhere?

 

 

8 REPLIES 8

Michael_G_Ander
Level 6
Certified

You can set the NDMP buffer size in SIZE_DATA_BUFFERS_NDMP, but I don't think this has anything to do with the buffer size. By the way, it is the media server that decides which block size is used.

In this case I would look into the SLP settings and the fiber infrastructure the duplication is supposed to run over, and of course more generally for any changes made in the last week.

The standard questions: Have you checked: 1) What has changed. 2) The manual 3) If there are any tech notes or VOX posts regarding the issue

mph999
Level 6
Employee Accredited

For the read-side, we cannot modify the buffer size. It is determined by record 
size as it currently exists (for the given backup image) on the device we 
are attempting to read.

For write side, the following rules apply:

If there are no touch files that modify it, default is 65536 (64k)

If backup image is NDMP, and duplicating to NDMP device
  Use SIZE_DATA_BUFFERS_NDMP
Else if DISK 
  Use SIZE_DATA_BUFFERS_DISK
Else
  Use SIZE_DATA_BUFFERS

When duplicating NDMP backup images, we can go from 
larger to smaller buffer (break a record up), but we cannot go from smaller to 
larger buffer (lump records together).

In short, source (read) and destination (write) side buffer sizes need to 
  either match, or the read side needs to be even multiple of write side.
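
The rules above can be sketched in a few lines of Python. This is only an illustration of the logic as described in this post, not NetBackup code: the function names, the `dest_kind` values and the `touch_files` dictionary (standing in for whatever values the touch files contain) are all hypothetical, and "even multiple" is assumed to mean an exact integer multiple.

```python
DEFAULT_BUFFER_SIZE = 65536  # 64k default quoted above when no touch file applies


def write_buffer_size(image_is_ndmp, dest_kind, touch_files):
    """Pick the write-side buffer size following the rules in the post.

    dest_kind is a hypothetical label: "ndmp", "disk", or anything else (e.g. "tape").
    touch_files is a hypothetical dict of touch file name -> configured size.
    """
    if image_is_ndmp and dest_kind == "ndmp":
        name = "SIZE_DATA_BUFFERS_NDMP"
    elif dest_kind == "disk":
        name = "SIZE_DATA_BUFFERS_DISK"
    else:
        name = "SIZE_DATA_BUFFERS"
    return touch_files.get(name, DEFAULT_BUFFER_SIZE)


def buffers_compatible(read_size, write_size):
    """True if the sizes match or the read side is an exact multiple of the write side.

    A larger read record can be broken up into smaller writes, but smaller
    read records cannot be lumped together into a larger write.
    """
    return read_size % write_size == 0


# Example: image written with 65536 buffers, NDMP destination configured for 262144.
touch = {"SIZE_DATA_BUFFERS_NDMP": 262144}
write_size = write_buffer_size(True, "ndmp", touch)
print(write_size, buffers_compatible(65536, write_size))
```

With these example values the read side (65536) is smaller than the write side (262144), so the check fails; the reverse combination would pass.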

 

Hanzo581
Level 4

We were told (after a lot of service calls to Symantec) in order to get NDMP Direct Copy to work we needed the buffer size set to 262144 (which it is on the media server here, which is also our master server), but if we can't change the source buffer size from the filer side, how can we get them to match?

mph999
Level 6
Employee Accredited

I don't believe that is correct. The rules are as I explained, as far as I am aware, and there is no mention of a specific buffer size, unless they were referring to your specific environment and had based that suggestion on what you already have set.

If the buffer sizes on the media servers haven't been changed since it was last working, then I don't see this being the problem. Really the best way is to look at the read and write side logs (bptm) to see what it is complaining about.

mph999
Level 6
Employee Accredited

See TN TECH206370 (fixed link - CRZ)

 

Hanzo581
Level 4

That link doesn't work for me.

 

I tried googling this but didn't find anything. What does this line mean?

10/27/2014 6:42:50 AM - Info bpduplicate(pid=2832) Suspend window close behavior is not supported for NDMP or NDMP direct copy
10/27/2014 6:42:50 AM - Info bpduplicate(pid=2832) window close behavior: Continue processing the current image

mph999
Level 6
Employee Accredited
Those lines are just information. Maybe an attempt was made to suspend the operation, not totally sure, but it says continuing, so no issue.

Here are the lines from the TN I was trying to show:

07:47:52.012 [7796.2172] <2> ndmp_receive_direct_copy_enabled: NDMP Direct Copy will not be used because:
07:47:52.012 [7796.2172] <2> ndmp_receive_direct_copy_enabled: read side buffer size = 65536, write side buffer size = 262144
07:47:52.012 [7796.2172] <2> ndmp_receive_direct_copy_enabled: (read side buffer size must be even multiple of write side buffer size)

The lines are actually from a 'direct-copy-to-tape' job, but effectively that is the same as opt dup, just disk to tape as opposed to disk-to-disk, so I was just trying to demonstrate what sort of lines you might look for in the logs.

The TN was referring to a case where there were two media servers, one on the read side (which was on the backup side) and the other on the write side (where the duplication was to be made).
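
If you have a lot of bptm log text to go through, something like the following could pick out the buffer size lines for you. It is only a rough sketch: the function name is made up, the pattern is based solely on the three TN lines quoted in this post, and the wording in other NetBackup versions may differ.

```python
import re

# Pattern taken from the TN lines quoted in this post; other versions may word it differently.
SIZE_LINE = re.compile(
    r"read side buffer size = (\d+), write side buffer size = (\d+)"
)


def find_buffer_mismatches(log_text):
    """Return a list of (read_size, write_size) pairs reported in the log text."""
    pairs = []
    for line in log_text.splitlines():
        match = SIZE_LINE.search(line)
        if match:
            pairs.append((int(match.group(1)), int(match.group(2))))
    return pairs


SAMPLE = """\
07:47:52.012 [7796.2172] <2> ndmp_receive_direct_copy_enabled: NDMP Direct Copy will not be used because:
07:47:52.012 [7796.2172] <2> ndmp_receive_direct_copy_enabled: read side buffer size = 65536, write side buffer size = 262144
07:47:52.012 [7796.2172] <2> ndmp_receive_direct_copy_enabled: (read side buffer size must be even multiple of write side buffer size)
"""

for read_size, write_size in find_buffer_mismatches(SAMPLE):
    print("read side:", read_size, "write side:", write_size)
```

To use it on a real log you would read the file contents into a string and pass that in place of SAMPLE.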

Hanzo581
Level 4

I do not see any of the above referenced in any of the NDMP SLPs that have run recently, either those that ran over the network or those that ran over fibre. Hmmm. I am positive it used to say that on all of them, since the SOP we made a year ago references it, but we are on a newer version of NetBackup now, so I am not sure what is going on...

 

10/27/2014 11:10:31 AM - Info bpduplicate(pid=7500) Suspend window close behavior is not supported for NDMP or NDMP direct copy
10/27/2014 11:10:31 AM - Info bpduplicate(pid=7500) window close behavior: Continue processing the current image    
10/27/2014 11:10:32 AM - Info bptm(pid=6648) start           
10/27/2014 11:10:32 AM - started process bptm (6648)
10/27/2014 11:10:33 AM - Info bptm(pid=6648) using 65536 data buffer size       
10/27/2014 11:10:33 AM - Info bptm(pid=6648) setting receive network buffer to 262144 bytes     
10/27/2014 11:10:35 AM - Info bptm(pid=6648) start backup          
10/27/2014 11:10:36 AM - Info bptm(pid=7948) start          

 

This, for example, is an SLP from an NDMP policy that is running right now over fibre, the way it should, from our DXi8500 VTL to our Quantum i2000 physical library. It looks no different from any NDMP job and SLP that has run over the network incorrectly.