NDMP restore fails when restoring to the original location

Hemant_123
Level 3

Hi,

I am currently using NetBackup 7.5.

My problem is that the backup completes successfully, but when I use the "restore to original location" option, the restore operation fails.

08/13/2013 03:49:50 - begin Restore
08/13/2013 03:49:50 - restoring from image 10.1.41.11_1376390734
08/13/2013 03:50:05 - Info bprd (pid=16219) Restoring from copy 1 of image created Tue Aug 13 03:45:34 2013
08/13/2013 03:50:06 - Info bpbrm (pid=16237) 10.1.41.11 is the host to restore to
08/13/2013 03:50:06 - Info bpbrm (pid=16237) reading file list from client
08/13/2013 03:50:06 - connecting
08/13/2013 03:50:06 - Info bpbrm (pid=16237) starting bptm
08/13/2013 03:50:06 - Info ndmpagent (pid=16243) Restore started
08/13/2013 03:50:06 - connected; connect time: 0:00:00
08/13/2013 03:50:06 - Info bpbrm (pid=16237) bptm pid: 16244
08/13/2013 03:50:06 - Info bptm (pid=16244) start
08/13/2013 03:50:06 - started process bptm (pid=16244)
08/13/2013 03:50:06 - Info bpdm (pid=16244) reading backup image
08/13/2013 03:50:07 - begin reading
08/13/2013 03:50:07 - Info ndmpagent (pid=16243) INF - Restoring NDMP files from /rb1
08/13/2013 03:50:07 - Info ndmpagent (pid=16243) Attempting normal restore.
08/13/2013 03:50:07 - Critical bpbrm (pid=16237) unexpected termination of client 10.1.41.11
08/13/2013 03:50:07 - Info bptm (pid=16244) EXITING with status 5 <----------
08/13/2013 03:50:08 - Info ndmpagent (pid=16243) done. status: 5: the restore failed to recover the requested files
08/13/2013 03:50:08 - Error bpbrm (pid=16237) client restore EXIT STATUS 5: the restore failed to recover the requested files
08/13/2013 03:50:08 - restored from image 10.1.41.11_1376390734; restore time: 0:00:18
08/13/2013 03:50:08 - end Restore; elapsed time 0:00:18
NDMP policy restore error  (2813)
 
-bash-3.2$ ls /panfs/REALM41/restore/
-bash-3.2$
 
 
It works perfectly when I restore to a different location.

I need help with this.

Thanks in advance,
H
15 REPLIES

Hemant_123
Level 3

I found a similar thread on the Symantec forum:

 

https://www-secure.symantec.com/connect/forums/restore-dfsr-data-original-location-fails-redirect-wo...

 

It mentions that this issue was fixed in 7.5.0.6 under ET 3136691.

But the release notes for version 7.5.0.6 (http://www.symantec.com/business/support/index?page=content&id=DOC6396) do not list ET 3136691.

Please correct me if I am wrong.

Will_Restore
Level 6

That ET would not apply.  That was for DFSR. Your issue is NDMP.

 

Check messages on the NAS device.  And ndmpagent logs on the NetBackup server. 

Hemant_123
Level 3

ndmpagent logs:

 

05:20:22.533 [10154] <2> ndmpagent_setup_for_read: CINDEX 0, TWIN_INDEX 0
05:20:22.533 [10154] <2> NdmpAgentSession_open[0]: opening ndmpagent session on port 33364
05:20:22.534 [10154] <2> NdmpAgentSession[0]: ndmp_xm_session_create: creating session 0x0x29cbee0
05:20:22.535 [10154] <2> NdmpAgentSession: ndmp_open_and_auth_via_host: hostname=localhost, via_hostname=localhost
05:20:22.535 [10154] <2> NdmpAgentSession: ndmp_open_and_auth_via_host: userid is NULL, pathname=NULL
05:20:22.535 [10154] <2> NdmpAgentSession: ndmp_connect_to_server: hostname = localhost, portname = 33364
05:20:22.535 [10154] <8> vnet_cached_getaddrinfo_and_update: [vnet_addrinfo.c:1583] in failed file cache ERR=-8 NAME=NULL SVC=testdaemon
05:20:22.535 [10154] <8> vnet_cached_get_service_port: [vnet_addrinfo.c:2387] vnet_cached_getaddrinfo failed STAT=6 RV=-8 NAME=testdaemon
05:20:22.535 [10154] <8> is_pbxable_service: [vnet_connect.c:2159] vnet_cached_get_service_port() failed 6 0x6
05:20:22.535 [10154] <8> is_pbxable_service: [vnet_connect.c:2160] service 33364
05:20:22.536 [10154] <8> file_to_cache_item: [vnet_addrinfo.c:6555] fopen() failed ERRNO=2 FILE=/usr/openv/var/host_cache/06b/c413626b+33364,1,20,2,1,0+localhost.txt
05:20:22.537 [10154] <2> NdmpAgentSession: ndmp_authenticate_connection: using NDMP protocol version 4
05:20:22.537 [10154] <2> NdmpAgentSession: ndmp_enable_extensions: enabling class_id = 0x7ff0 class_version = 0x2
05:20:22.537 [10154] <2> create_shared_memory: shm_size = 7865096, buffer address = 0x0x7f0ecf394000, buf control = 0x0x7f0ecfb14000, ready ptr = 0x0x7f0ecfb142d0, res_cntl = 0x0x7f0ecfb142d8
05:20:22.537 [10154] <2> NdmpAgentSession[0]: ndmp_xm_send_shm_info is_mpx 0, is_disk 1
05:20:22.537 [10154] <2> NdmpAgentSession[0]: ndmp_xm_send_shm_info 30 262144 0 3932160 1 0
05:20:22.537 [10154] <2> NdmpAgentSession[0]: [0] Sending 2 (SHM_INFO) "30 262144 0 3932160 1 0"
05:20:22.537 [10154] <2> NdmpAgentSession[0]: [0] Reply 0 "", error = 0
05:20:22.537 [10154] <2> NdmpAgentSession[0]: [1] Sending NDMP_MOVER_SET_RECORD_SIZE 262144
05:20:22.537 [10154] <2> NdmpAgentSession[0]: [1] Reply error = 0
05:20:22.537 [10154] <2> NdmpAgentSession[0]: ndmp_xm_send_image_length 5767168
05:20:22.537 [10154] <2> NdmpAgentSession[0]: [2] Sending 17 (IMAGE_LENGTH) "5767168"
05:20:22.538 [10154] <2> NdmpAgentSession[0]: [2] Reply 0 "", error = 0
05:20:22.538 [10154] <2> NdmpAgentSession[0]: [3] Sending NDMP_MOVER_SET_WINDOW 0 6291456
05:20:22.538 [10154] <2> NdmpAgentSession[0]: [3] Reply error = 0
05:20:22.538 [10154] <2> NdmpAgentSession[0]: [4] Sending 3 (START) ""
05:20:22.538 [10154] <2> NdmpAgentSession[0]: [4] Reply 0 "", error = 0
05:20:22.538 [10154] <2> set_job_details: Tfile (93): BEGIN_READING 1377508822
 
05:20:22.538 [10154] <2> send_job_file: job ID 93, ftype = 3 msg len = 25, msg = BEGIN_READING 1377508822
 
05:20:23.126 [10154] <2> NdmpAgentSession: ndmp_net_read: connection 0x29cfc70: (12) end of file detected
05:20:23.126 [10154] <2> NdmpAgentSession: ndmp_message_receive: xdr_ndmp_header failed, unable to XDR decode             ------------> failure condition
05:20:23.126 [10154] <2> NdmpAgentSession: ndmp_net_read: connection 0x29cfc70: (12) end of file detected
05:20:23.126 [10154] <2> NdmpAgentSession[0]: ERROR ndmp_process_message failed, status = 18
05:20:23.126 [10154] <16> check_and_process_ndmpagent_restore_tasks: internal error fetching ndmpagent[0] task
05:20:23.126 [10154] <2> NdmpAgentSession_close_by_index[0]: closing ndmpagent session 0x29cbee0
05:20:23.126 [10154] <2> read_data: ndmp_task = 1
05:20:23.126 [10154] <2> read_data: status_to_return = -8
05:20:23.126 [10154] <2> bp_sts_close_image: 0x2968a20 0 <10.65.188.11_1377084925_C1_F1 0 1> 1 4 0 1 1 5 complete flag (2)(M_READ_COMPLETE ), close type (1)(STS_CLOSEF_IH_COMPLETE ) 11
05:20:23.126 [10154] <2> notify: executing - /usr/openv/netbackup/bin/restore_notify bptm 10.65.188.11_1377084925 restore
05:20:23.135 [10154] <2> Media_dispatch_signal: calling child_wait (tmcommon.c:3963) delay 0 seconds
05:20:23.135 [10154] <2> Media_siginfo_print: 0: delay 0 signo SIGHUP:1 code 0 pid 10151
05:20:23.135 [10154] <2> Media_siginfo_print: 1: delay 0 signo SIGCHLD:17 code 1 pid 10165
05:20:23.135 [10154] <2> Media_dispatch_signal: calling catch_signal for 1 (tmcommon.c:3963) delay 0 seconds
05:20:25.135 [10154] <2> set_job_details: Tfile (93): LOG 1377508825 16 bptm 10154 media manager terminated by parent process
 
05:20:25.135 [10154] <2> send_job_file: job ID 93, ftype = 3 msg len = 72, msg = LOG 1377508825 16 bptm 10154 media manager terminated by parent process
 
05:20:25.149 [10154] <16> catch_signal: media manager terminated by parent process
05:20:25.149 [10154] <2> send_MDS_msg: MEDIA_DONE 0 0 0 *NULL* 0 0
05:20:25.149 [10154] <2> JobInst::sendIrmMsg: starting
05:20:25.149 [10154] <2> Orb::init: Enabling ORBNativeCharCodeSet UTF-8(Orb.cpp:594)
05:20:25.149 [10154] <2> Orb::init: Created anon service name: NB_10154_1711500264(Orb.cpp:697)
05:20:25.149 [10154] <2> Orb::init: endpointvalue is : pbxiop://1556:NB_10154_1711500264(Orb.cpp:714)
05:20:25.149 [10154] <2> Orb::init: initializing ORB Default_DAEMON_Orb with: Unknown -ORBSvcConfDirective "-ORBDottedDecimalAddresses 0" -ORBSvcConfDirective "static Resource_Factory '-ORBNativeCharCodeSet UTF-8'" -ORBSvcConfDirective "static PBXIOP_Factory '-enable_keepalive'" -ORBSvcConfDirective "static EndpointSelectorFactory ''" -ORBSvcConfDirective "static Resource_Factory '-ORBProtocolFactory PBXIOP_Factory'" -ORBSvcConfDirective "static Resource_Factory '-ORBProtocolFactory IIOP_Factory'" -ORBDefaultInitRef '' -ORBSvcConfDirective "static PBXIOP_Evaluator_Factory '-orb Default_DAEMON_Orb'" -ORBSvcConfDirective "static Resource_Factory '-ORBConnectionCacheMax 1024 '" -ORBEndpoint pbxiop://1556:NB_10154_1711500264 -ORBSvcConf /dev/null -ORBSvcConfDirective "static Server_Strategy_Factory '-ORBMaxRecvGIOPPayloadSize 268435456'"(Orb.cpp:825)
05:20:25.150 [10154] <2> Orb::init: caching EndpointSelectorFactory(Orb.cpp:840)
05:20:25.193 [10154] <8> copy_addrinfo: [vnet_addrinfo.c:3692] no valid addresses to copy 0 0x0
 
 
 
 
From these logs I don't understand why the restore-to-original-location option is unable to decode the reply or request, while the restore to a different location works perfectly.
 
 
 

 

Hemant_123
Level 3

 

I ran a restore to a different location while watching the NDMP logs (151).

I found that all commands are sent successfully by NBU.

But when restoring to the original location, the command to start the restore operation is not sent.

How do I proceed from here?

How can I capture the exact command exchange between NDMP and NBU?

Thanks in advance

Mark_Solutions
Level 6
Partner Accredited Certified

When you restore to the original location you will be overwriting files, and those may have different permissions from the files involved when restoring elsewhere. See if the account you use has enough rights. I would have expected the filer itself to log something when the restore ran that explains why it failed.

Hemant_123
Level 3

While doing all the operations, I have root permissions.

 

In the case where the restore to the original location fails, I get the following error in ndmpagent (originator ID 134):

05:20:23.126 [10154] <2> NdmpAgentSession: ndmp_net_read: connection 0x29cfc70: (12) end of file detected

05:20:23.126 [10154] <2> NdmpAgentSession: ndmp_message_receive: xdr_ndmp_header failed, unable to XDR decode
05:20:23.126 [10154] <2> NdmpAgentSession: ndmp_net_read: connection 0x29cfc70: (12) end of file detected
05:20:23.126 [10154] <2> NdmpAgentSession[0]: ERROR ndmp_process_message failed, status = 18
05:20:23.126 [10154] <16> check_and_process_ndmpagent_restore_tasks: internal error fetching ndmpagent[0] task
 
What could be going wrong?

"NdmpAgentSession: ndmp_message_receive: xdr_ndmp_header failed, unable to XDR decode" is the line I suspect.
Is there any difference in the way NBU sends header (command) requests during a restore to the original location compared with a restore to a different location?
 
 

Mark_Solutions
Level 6
Partner Accredited Certified

I have seen similar issues, but nothing that would not have been fixed a long time ago.

I think it is time to open a case. Who knows, a bit of NetBackup V6 code may have crept back in.

Sorry I have not been able to assist further, but please keep us up to date with what support finds.

watsons
Level 6

A few more thoughts..

1) How different is the "original location" from the "alternate location", i.e. are they on two different NAS devices?

EMC? NetApp? Others? Compare their OS versions (DART, ONTAP) and the way they are set up. Finding out how they differ might explain why you can restore to one and not the other.

2) Are you restoring a snapshot path? For instance, on NetApp you can back up something like /vol/vol1/abc/snapshot.weekly

The snapshot itself is a reserved area set up within the NAS. It is READ-ONLY and cannot be overwritten even if you have root. You have to restore to a different path on the "original server".

 

Mark_Solutions
Level 6
Partner Accredited Certified

Good point on it possibly being a snapshot volume. I have seen that a few times before, and it never allows a restore to the original location.
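
If shell access to the NDMP host is available (an assumption; on a filer it usually is not), one quick way to rule out a read-only destination is to ask the OS directly. The sketch below is only an illustration of that check: `dest_is_writable` is a hypothetical helper name, and it relies on the POSIX calls `statvfs` and `access`. It says nothing about what the NDMP server itself does during a restore.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/statvfs.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of NetBackup or any NDMP server).
 * Returns  1 if the calling user may write to `path`,
 *          0 if the filesystem is mounted read-only or write access is denied,
 *         -1 if the path cannot be examined (for example, it does not exist).
 */
int dest_is_writable(const char *path)
{
    struct statvfs vfs;

    assert(path != NULL);

    /* statvfs fails if the path does not exist or cannot be reached. */
    if (statvfs(path, &vfs) != 0)
        return -1;

    /* A read-only mount (such as a snapshot area) rejects writes even for root. */
    if (vfs.f_flag & ST_RDONLY)
        return 0;

    /* Otherwise fall back to the ordinary permission check. */
    return access(path, W_OK) == 0 ? 1 : 0;
}
```

Calling `dest_is_writable` on the original restore path and on the alternate path, as the same user the NDMP server runs as, would show whether the two locations differ at the filesystem level.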

Hemant_123
Level 3

Hi,

 

1. The "original location" and the "alternate location" are just two different volumes on one system.

2. No, I am not restoring a snapshot path; I am restoring just a few files and directories.

 

 

Hemant_123
Level 3

I am using an open-source NDMP implementation.

 

In the file "ndmpc/ndmpkit/common/comm.c", in the function "ndmp_recv_msg":

	/* Decode the header. */
	connection->xdrs.x_op = XDR_DECODE;
	xdrrec_skiprecord(&connection->xdrs);
	if (!xdr_ndmp_header(&connection->xdrs, &connection->msginfo.hdr))
	{
		return(-1);
	}
This check fails and returns -1 when restoring to the original location,
whereas it succeeds when restoring to a different location.
 
What could be going wrong?
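
One way to narrow this down is to make the failing check report why it failed. The NetBackup-side log earlier in the thread shows "end of file detected" immediately before the XDR decode error, which would be consistent with the connection having been closed rather than a malformed header arriving; that is an inference from the log, not something confirmed here. The sketch below is a minimal illustration and not ndmpkit code: `classify_header`, `hdr_status` and `ndmp_hdr_fields` are hypothetical names, and it assumes the record payload (with the record mark already removed) has been read into a buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes for this sketch; they are not part of ndmpkit. */
enum hdr_status {
    HDR_OK = 0,    /* a full header was available and decoded */
    HDR_EOF = 1,   /* zero bytes available: the peer closed the connection */
    HDR_SHORT = 2  /* some bytes arrived, but fewer than a full header */
};

/* The six 32-bit XDR fields of an NDMP message header, in wire order. */
struct ndmp_hdr_fields {
    uint32_t sequence;
    uint32_t time_stamp;
    uint32_t message_type;
    uint32_t message;
    uint32_t reply_sequence;
    uint32_t error;
};

#define NDMP_HDR_BYTES 24u /* 6 fields x 4 bytes each */

/* XDR encodes each integer as 4 big-endian bytes. */
static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Decode a header from `buf`, separating "nothing arrived" from "too short". */
enum hdr_status classify_header(const unsigned char *buf, size_t len,
                                struct ndmp_hdr_fields *out)
{
    assert(out != NULL);

    if (len == 0)
        return HDR_EOF;

    assert(buf != NULL);

    if (len < NDMP_HDR_BYTES)
        return HDR_SHORT;

    out->sequence       = be32(buf);
    out->time_stamp     = be32(buf + 4);
    out->message_type   = be32(buf + 8);
    out->message        = be32(buf + 12);
    out->reply_sequence = be32(buf + 16);
    out->error          = be32(buf + 20);
    return HDR_OK;
}
```

Logging a status like this next to the `return(-1)` in `ndmp_recv_msg` would show whether the server received a truncated message or simply saw the connection close, which in turn points at which side gave up first.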
 

Hemant_123
Level 3

Can I get some pointers on this?

 

Thanks in advance

Will_Restore
Level 6

Not familiar with that open-source code. If it returns an error (-1), then NetBackup cannot do much about it.

 

Hemant_123
Level 3

But the restore succeeds when an alternate location is used.

 

So, is there any difference in the way NBU sends header (command) requests during a restore to the original location compared with a restore to a different location?

 

I would just like to know how NBU deals with these two scenarios.

And if there is some difference, how can I handle it?

Will_Restore
Level 6
Looks like ndmp4linux code, whose site says:

The idea I have for this project to to create a server application for Linux which allows NDMP capable backup servers to connect to a Linux host and backup data. Ideally, the implementation will allow for integration with LVM2 for creating and releasing snapshots before/after the backup begins/ends. It should also integrate with the Linux VFS to sync all information to disk to ensure a consistent environment for backups.

 

I would go with a supported product, or live with the limitations.