Storage Server Data Domain timeout

blanco_adalbert
Level 5
Partner Accredited Certified

Hello everybody,

I am registering a Data Domain storage server on NetBackup 7.0.1. I have set up the Data Domain DD Boost username and password and have also enabled DD Boost on the Data Domain side.

I have a media server that has tape attached via FC, and a Data Domain attached over a 10 Gb network. The media server uses the 1 Gb network to communicate with the server, but to be able to send backups to the Data Domain I have configured another network segment.

When I try to add the storage server and then add credentials using the tpconfig -add .... command, it takes a long time. I have also tried the Java interface and got this error:

timeout.

Best regards

8 REPLIES

blanco_adalbert
Level 5
Partner Accredited Certified

Hello, I found this in the bpcommand logs:

13:44:30.115 [13911] <2> cvvnbsev: [3657:100318920] ddcl_nfs_PLU_update_auth: Updated RPC AUTH with  uid=501 gid=50
13:50:00.160 [13911] <4> cvvnbsev: [3657:100318920] we_waited_toolong_for_reply RPC_TIMEOUT  elapsed time = 330043281703 max_time = 330000000000
13:50:00.161 [13911] <4> cvvnbsev: [3657:100318920] clnt_async_call: RPC error= 5
13:50:00.161 [13911] <4> cvvnbsev: [3657:100318920] RPC failed (ddcl_do_nfs_proc_timeout): procedure=GETMARKERINFO retval=5

13:50:00.161 [13911] <4> cvvnbsev: [3657:100318920] (ddcl_nfs_bounce_connection) RPC failed for job = 0 on ip =132.168.70.11
13:50:00.161 [13911] <4> cvvnbsev: [3657:100318920] clnt_async_destroy: RPC_CANTRECV will fail all the pending jobs and close socket

blanco_adalbert
Level 5
Partner Accredited Certified

The updated log is:

14:03:45.270 [17252] <4> cvvnbsev: [4364:100318920] clnt_async_destroy: RPC_CANTRECV will fail all the pending jobs and close socket
14:03:45.270 [17252] <2> cvvnbsev: /usr/openv/lib/ost-plugins/libstspiDataDomain.so:stspi_close_server err =0 handle =1001f13c0 free'd
14:03:45.270 [17252] <2> cvvnbsev: /usr/openv/lib/ost-plugins/libstspiDataDomain.so:dd_sts_parse_credentials cr_type =0 cr_len 21 cert ddboost-adm
14:03:45.270 [17252] <2> cvvnbsev: /usr/openv/lib/ost-plugins/libstspiDataDomain.so:dd_sts_parse_credentials user ddboost-adm passwd ******
14:03:45.270 [17252] <2> cvvnbsev: /usr/openv/lib/ost-plugins/libstspiDataDomain.so:stspi_open_server STS_ESDEBUG: calling ddcl_PLU2_connect() with User:ddboost-adm Password: ****** MAP:stspi_init @ 0xffffffff71d7ee10
14:03:45.270 [17252] <2> cvvnbsev: [4364:100318920] ddcl_vrapid_get_host_ip: host_name dd890sev has host_ip 132.168.70.11
14:03:45.270 [17252] <4> cvvnbsev: [4364:100318920] NFS connect on IP 132.168.70.11
14:03:45.271 [17252] <2> cvvnbsev: [4364:100318920] clnt_async_do_create: attempting to connect() on port =2049 IP=132.168.70.11
14:03:45.277 [17252] <2> cvvnbsev: [4364:100318920] ddcl_nfs_PLU_connect: MOUNT information over NFS port
14:03:45.288 [17252] <2> cvvnbsev: [4364:100318920] Permission passed on retry 0

14:03:45.288 [17252] <2> cvvnbsev: [4364:100318920] ddcl_nfs_PLU_update_auth: Updated RPC AUTH with  uid=501 gid=50
14:09:15.338 [17252] <4> cvvnbsev: [4364:100318920] we_waited_toolong_for_reply RPC_TIMEOUT  elapsed time = 330048443742 max_time = 330000000000
14:09:15.338 [17252] <4> cvvnbsev: [4364:100318920] clnt_async_call: RPC error= 5
14:09:15.339 [17252] <4> cvvnbsev: [4364:100318920] RPC failed (ddcl_do_nfs_proc_timeout): procedure=GETMARKERINFO retval=5

14:09:15.339 [17252] <4> cvvnbsev: [4364:100318920] (ddcl_nfs_bounce_connection) RPC failed for job = 0 on ip =132.168.70.11
14:09:15.339 [17252] <4> cvvnbsev: [4364:100318920] clnt_async_destroy: RPC_CANTRECV will fail all the pending jobs and close socket
14:09:15.339 [17252] <2> cvvnbsev: [4364:100318920] ddcl_vrapid_get_host_ip: host_name dd890sev has host_ip 132.168.70.11
14:09:15.339 [17252] <4> cvvnbsev: [4364:100318920] NFS connect on IP 132.168.70.11
14:09:15.339 [17252] <2> cvvnbsev: [4364:100318920] clnt_async_do_create: attempting to connect() on port =2049 IP
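A note on reading the timeout line: the gap between the last successful step (14:03:45.288) and the RPC_TIMEOUT entry (14:09:15.338) is about 330 seconds, which suggests that elapsed time and max_time are logged in nanoseconds. That unit is an inference from the timestamps, not something the log states. The small Python sketch below (the function names are made up for illustration) converts the counters under that assumption and compares them with the timestamp gap.

```python
import re
from datetime import datetime

# Matches the counters in a we_waited_toolong_for_reply log line.
TIMEOUT_RE = re.compile(r"elapsed time = (\d+) max_time = (\d+)")


def parse_rpc_timeout(line):
    """Return (elapsed_seconds, max_seconds), or None if the line has no counters.

    Assumption: the counters are nanoseconds (inferred from the timestamps).
    """
    match = TIMEOUT_RE.search(line)
    if match is None:
        return None
    elapsed_ns, max_ns = (int(group) for group in match.groups())
    return elapsed_ns / 1e9, max_ns / 1e9


def seconds_between(start, end):
    """Seconds between two HH:MM:SS.mmm timestamps taken on the same day."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()
```

Fed the 14:09:15.338 line, parse_rpc_timeout returns roughly 330.05 and 330.0 seconds, which matches seconds_between("14:03:45.288", "14:09:15.338"). So the plug-in waited the full 5.5 minutes for a reply to GETMARKERINFO before giving up, even though the connect on port 2049 and the MOUNT step had succeeded.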

RamNagalla
Moderator
Partner    VIP    Certified

It looks like an RPC timeout issue.

Does the media server have a proper route to the DD? Is it communicating over the defined network?

As you said, it is connected to the 10 Gb network. Is it actually using the 10 Gb link to communicate?

Are the media server and the DD in the same network?

How many hops does traceroute show?

Nicolai
Moderator
Partner    VIP   

You need to verify that both ports 111 and 2049 are accessible. If port 111 is blocked by a firewall, the mount will time out.

Cut down on the number of network ports in use until you are sure the basic network is functional.

Can you ping the device?
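If telnet or a similar tool is not handy on the media server, a TCP connect test can be sketched in a few lines of Python. This is only a reachability check under simple assumptions: port_open and check_ports are made-up helper names, and a successful connect shows that the port accepts TCP connections, not that RPC calls on it will be answered.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_ports(host, ports=(111, 2049), timeout=3.0):
    """Map each port (portmapper and NFS by default) to its reachability."""
    return {port: port_open(host, port, timeout) for port in ports}
```

Running check_ports("132.168.70.11") from the media server would test the address that appears in the logs. Keep in mind that those logs already show the connect on port 2049 succeeding before the RPC times out, so a passing result here does not rule out a routing or name resolution problem.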

blanco_adalbert
Level 5
Partner Accredited Certified

Yes, I can ping the Data Domain from the media server and the media server from the Data Domain.

I have also checked ports 111 and 2049 on the media server, and I can reach them.

Marianne
Level 6
Partner    VIP    Accredited Certified

Have you added a hosts entry on the media server for the DD?

And a hosts entry on the DD for the media server?

Extract from Data_Domain_Boost_Admin_Guide_759-0010-0003.pdf:

Follow these recommended practices:

• To avoid the need for frequent DNS lookups, add every media server host/IP address to the Data Domain system using the net hosts add ipaddr {host | "alias host"}... command.
  Note: DNS reverse lookup has been added for every media server accessing the Data Domain system as of DD OS 4.6.3.
• To avoid DNS lookup for every job, also add the Data Domain system IP address into the media server’s /etc/hosts file.
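To see what the media server actually resolves for the DD, here is a minimal Python sketch. check_resolution is a made-up helper, and the resolver functions are passed in as parameters only so the logic can be exercised without a live network; by default it uses the system resolver, which normally consults /etc/hosts as well as DNS.

```python
import socket


def check_resolution(hostname, expected_ip,
                     forward=socket.gethostbyname,
                     reverse=socket.gethostbyaddr):
    """Compare forward and reverse lookups against the expected address."""
    result = {"forward_ip": None, "forward_ok": False, "reverse_name": None}
    try:
        # Forward lookup: name -> IPv4 address.
        result["forward_ip"] = forward(hostname)
        result["forward_ok"] = result["forward_ip"] == expected_ip
    except OSError as exc:
        result["forward_error"] = str(exc)
    try:
        # Reverse lookup: address -> primary host name.
        result["reverse_name"] = reverse(expected_ip)[0]
    except OSError as exc:
        result["reverse_error"] = str(exc)
    return result
```

On the media server, check_resolution("dd890sev", "132.168.70.11") would use the host name and address from the logs in this thread. A slow or failing reverse lookup is worth noting, given the DNS reverse lookup remark in the extract above.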

Nicolai
Moderator
Partner    VIP   

14:09:15.338 [17252] <4> cvvnbsev: [4364:100318920] we_waited_toolong_for_reply

Can you provide a basic drawing of the network and the network cards you mentioned?

quebek
Moderator
   VIP    Certified

Please run traceroute from the DD to the NBU domain servers (master and all media servers), and vice versa.

Maybe you are reaching the DD via one NIC and the DD is replying to the source via another NIC.
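One quick way to see which local address the media server would use as the source when talking to the DD is to ask the kernel's routing table through a connected UDP socket. This is a rough Python sketch and source_ip_for is a made-up helper name. It covers only the outbound side on the media server; which interface the DD uses to reply still has to be checked on the DD itself.

```python
import socket


def source_ip_for(dest_ip, port=2049):
    """Return the local IPv4 address the OS would use to reach dest_ip.

    connect() on a UDP socket sends no packets; it only makes the kernel
    pick a route and therefore a source address.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.connect((dest_ip, port))
        return sock.getsockname()[0]
```

If source_ip_for("132.168.70.11") returns the media server's 1 Gb address instead of its address on the dedicated backup segment, traffic to the DD is not leaving through the interface that was intended for it.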