07-18-2014 02:37 AM
Hi, I recently ran into an NDMP backup failure. It used to work fine. We have port 10000 open bi-directionally.
But the backups always end with error 636.
Could you please point me in the right direction?
This has been going on for almost a week.
Thanks in advance!
Master: linux, NBU7.5.0.6
Media: linux, NBU7.5.0.6
on the master
[root@lx0034nbumast bin]# nbemmcmd -listhosts |grep ndmp dsfiler05-101
[root@lx0034nbumast bin]# tpautoconf -verify dsfiler05-101
Connecting to host "dsfiler05-101" as user "ndmp"...
Waiting for connect notification message...
Opening session--attempting with NDMP protocol version 4...
Opening session--successful with NDMP protocol version 4
host supports MD5 authentication
Getting MD5 challenge from host...
Logging in using MD5 method...
Host info is:
host name "dsfiler05"
os type "NetApp"
os version "NetApp Release 7.3.2"
host id "0135080178"
Login was successful
Host supports LOCAL backup/restore
Host supports 3-way backup/restore
Host has SnapVault Secondary license installed
on the Media server
[root@lx0003nbumed01 ~]# nbemmcmd -listhosts |grep ndmp dsfiler05-101
[root@lx0003nbumed01 ~]# tpautoconf -verify dsfiler05-101
Connecting to host "dsfiler05-101" as user "ndmp"...
Waiting for connect notification message...
Opening session--attempting with NDMP protocol version 4...
Opening session--successful with NDMP protocol version 4
host supports MD5 authentication
Getting MD5 challenge from host...
Logging in using MD5 method...
Host info is:
host name "dsfiler05"
os type "NetApp"
os version "NetApp Release 7.3.2"
host id "0135080178"
Login was successful
Host supports LOCAL backup/restore
Host supports 3-way backup/restore
Host has SnapVault Secondary license installed
07-18-2014 02:53 AM
Hello,
Did you enable logging for the NDMP backups so that you can troubleshoot further?
http://www.symantec.com/docs/TECH56492
Check this note as well, http://www.symantec.com/docs/TECH214335
And describe the scenario/environment a bit more.
07-21-2014 01:39 AM
09-03-2014 01:59 AM
09-03-2014 03:38 AM
Have you actually tried what Riann suggested in those technotes, in particular the one about lowering the TCP keepalive time on the master and media servers? Error 636 is usually caused by that.
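Before changing anything, it may help to record what the keepalive settings currently are on each Linux server. Here is a minimal sketch, assuming the standard Linux tunables under /proc/sys/net/ipv4. The helper name read_keepalive is hypothetical, it only reads the values, and the values to aim for come from the technote rather than from this sketch.

```python
from pathlib import Path

# Standard Linux kernel TCP keepalive tunables. The first two are in
# seconds; tcp_keepalive_probes is a count of unanswered probes.
KEEPALIVE_KEYS = ("tcp_keepalive_time", "tcp_keepalive_intvl", "tcp_keepalive_probes")


def read_keepalive(proc_dir="/proc/sys/net/ipv4"):
    """Return the current keepalive tunables as a dict of ints.

    proc_dir is a parameter only so the function can be pointed at a
    copy of the files; on a live server the default is what you want.
    """
    base = Path(proc_dir)
    return {key: int((base / key).read_text().strip()) for key in KEEPALIVE_KEYS}


# Example use on the master or media server:
#   for key, value in read_keepalive().items():
#       print(f"{key} = {value}")
```

Running it on both the master and the media server before and after the change gives a simple record to compare against the technote.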
If port 10000 is opened, I suppose your test of telnet <ndmphost> 10000 is working fine. Can you show us one of your job details with that error 636?
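If telnet is not installed on the server, the same reachability check can be sketched in Python. This is only a rough equivalent of the telnet test: port_open is a hypothetical helper name, and a successful TCP connect shows that the port is reachable, not that an NDMP session would work.

```python
import socket


def port_open(host, port=10000, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and attempts the connect;
        # the socket is closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Example use, run from the master and from the media server:
#   print(port_open("dsfiler05-101"))
```

Checking from both servers matters here, since a firewall rule can differ per source host.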
09-08-2014 02:25 PM
As the author of http://www.symantec.com/docs/TECH214335, which Riian posted, I can confirm that a 636 should have nothing to do with the connection between the media server processes and the NDMP filer. It usually has everything to do with the connection between bpbrm and NBJM. The 636 error is generally "floating" in the detailed status (i.e. it has no timestamp): it is NBJM stating that it was checking for updates from bpbrm and found the socket closed. Further investigation of the bpbrm logs may show when the socket actually closed.
The related article http://www.symantec.com/docs/TECH214335 goes further into the 636 issue.