
Eventual error 41 (Redhat)

pylej
Level 4
Hi Guys,

I have a client that is having some kind of network-related issue. The backup starts and eventually fails with the following error:

Error bpdm (pid=4795) media manager terminated by parent process

It is a Red Hat client. I have read posts from other people on the forum who had similar issues and fixed them by raising the client read timeout value. I have done that, and the backups are still failing.

Our comms team assures me that the switch port is set to 100 meg, and the server admin assures me that the NIC is also set to 100 meg.
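
One way to double-check the NIC side from the client itself is to read what the kernel reports for the interface. This is a minimal sketch only: it assumes a Linux kernel that exposes `speed` and `duplex` under `/sys/class/net/<iface>/` (older kernels may not), and the interface name `eth0` is just a placeholder.

```python
from pathlib import Path


def parse_link_settings(speed_text, duplex_text):
    """Turn raw sysfs strings into (speed_mbps, duplex); unknown values become None."""
    speed = None
    if speed_text is not None:
        try:
            speed = int(speed_text.strip())
        except ValueError:
            speed = None
        # Some drivers report -1 (or 0) when the link speed is unknown.
        if speed is not None and speed <= 0:
            speed = None
    duplex = duplex_text.strip() if duplex_text else None
    return speed, duplex


def read_link_settings(iface, sysfs_root="/sys/class/net"):
    """Read speed and duplex for iface; returns (None, None) if nothing is readable."""
    base = Path(sysfs_root) / iface

    def read(name):
        try:
            return (base / name).read_text()
        except OSError:
            # Attribute missing, link down, or a virtual interface.
            return None

    return parse_link_settings(read("speed"), read("duplex"))
```

Calling `read_link_settings("eth0")` on the client and comparing the result with what the switch port reports would show whether both ends really agree. Duplex is included because the same sysfs directory exposes it next to the speed.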

Any ideas?

Thanks
Jim
9 REPLIES

Andy_Welburn
Level 6
I remember a while back we had an issue with a certain sparse file (lastlog or some such) on new RH servers, which by default was set to something stupid like 1 TB in size. In our case we just deleted & recreated this file.

This is just something to bear in mind as well as all the other EC41 causes.
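
For anyone wanting to hunt for such files, here is a rough sketch that walks a directory and flags large regular files whose apparent size is far bigger than the disk space actually allocated. The thresholds (`min_size`, `min_ratio`) and the function names are made up for illustration; it relies on `st_blocks` being counted in 512-byte units, which is the case on Linux.

```python
import os
import stat


def sparse_ratio(apparent_bytes, allocated_bytes):
    """Fraction of the apparent size that has no disk blocks behind it."""
    if apparent_bytes <= 0:
        return 0.0
    return max(0.0, 1.0 - allocated_bytes / apparent_bytes)


def find_sparse_files(root, min_size=1 << 30, min_ratio=0.5):
    """Yield (path, apparent_bytes, allocated_bytes) for large, mostly-empty files."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks, devices, sockets
            allocated = st.st_blocks * 512  # st_blocks is in 512-byte units
            if st.st_size >= min_size and sparse_ratio(st.st_size, allocated) >= min_ratio:
                yield path, st.st_size, allocated
```

Running `list(find_sparse_files("/var/log"))` on the client would be one place to start, on the assumption that the lastlog file mentioned above lives there.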

In-Depth Troubleshooting Guide for Exit Status Code 41 in NetBackup Server (tm) / NetBackup Enterpri...

Karthikeyan_Sun
Level 6
Can you give us the complete logs from Detailed Status?

Were you able to load the Client Properties window (is connectivity OK)?

pylej
Level 4
Hi Guys,
@Andy, I agree this does sound like a problem with a file, but the client has recently moved locations and is therefore on a different switch. So I really think it is a network issue, even though the speeds are set the same. I did read that some other people had a problem with Red Hat doing this, but I could not find the solution. Which log shows what file the backup is failing on? Is it bpcd on the client with the logging level at 5? Thanks a lot for the help!
@Karthikeyan, as you can see it does connect and starts to write some data:
02/09/2010 14:34:05 - requesting resource 1YearRetention
02/09/2010 14:34:05 - requesting resource dsnsvr42.NBU_CLIENT.MAXJOBS.eumcd01
02/09/2010 14:34:05 - requesting resource dsnsvr42.NBU_POLICY.MAXJOBS.EUMCD01
02/09/2010 14:34:06 - granted resource dsnsvr42.NBU_CLIENT.MAXJOBS.eumcd01
02/09/2010 14:34:06 - granted resource dsnsvr42.NBU_POLICY.MAXJOBS.EUMCD01
02/09/2010 14:34:06 - granted resource 1YearRetention
02/09/2010 14:34:18 - started process bpdm (pid=6402)
02/09/2010 14:34:17 - started process bpbrm (pid=6399)
02/09/2010 14:34:17 - connecting
02/09/2010 14:34:17 - connected; connect time: 0:00:00
02/09/2010 14:34:18 - begin writing
02/09/2010 14:39:55 - begin writing
02/09/2010 14:41:46 - Error bpdm (pid=6402) media manager terminated by parent process
02/09/2010 14:41:47 - end writing; write time: 0:01:52
network connection timed out (41)
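
To see where the time goes in a Detailed Status listing like the one above, the timestamps can be diffed line by line. This is only a sketch: the date format string is an assumption (month/day vs day/month does not matter for same-day gaps), and the sample text is just the last four timestamped lines of the log quoted above.

```python
from datetime import datetime

STATUS_LINES = """\
02/09/2010 14:34:18 - begin writing
02/09/2010 14:39:55 - begin writing
02/09/2010 14:41:46 - Error bpdm (pid=6402) media manager terminated by parent process
02/09/2010 14:41:47 - end writing; write time: 0:01:52
"""


def parse_status(text):
    """Return a list of (timestamp, message) for lines shaped like 'date time - message'."""
    events = []
    for line in text.splitlines():
        stamp, sep, message = line.partition(" - ")
        if not sep:
            continue  # e.g. the final status line, which has no timestamp
        try:
            when = datetime.strptime(stamp.strip(), "%m/%d/%Y %H:%M:%S")
        except ValueError:
            continue
        events.append((when, message.strip()))
    return events


def gaps(events):
    """Seconds elapsed between each pair of consecutive events."""
    return [
        ((later[0] - earlier[0]).total_seconds(), earlier[1], later[1])
        for earlier, later in zip(events, events[1:])
    ]


for seconds, before, after in gaps(parse_status(STATUS_LINES)):
    print(f"{seconds:6.0f} s  {before!r} -> {after!r}")
```

For this log the gaps come out as 337 s between the two "begin writing" lines, then 111 s until the bpdm error. Those numbers may be worth comparing against whatever timeouts are configured on the media server.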

Andy_Welburn
Level 6
You should be able to see it in the Activity Monitor in the Admin Console under "Current File" and on the Detailed Status tab with Job Details also.

Nicolai
Moderator
Partner    VIP   
When bandwidth is in question I always test with the TTCP test utility from Netcordia. ttcp is a small Java applet that transfers packets between two nodes (e.g. media server and client). Since it is a Java applet, you can run the utility across different operating systems.


The default values are a little weak and won't show the true bandwidth. These are the commands I use:

Receiver side:
/usr/openv/java/jre/bin/java ttcp -r -l 65536 -n 16384

Sender side:
/usr/openv/java/jre/bin/java ttcp  -l 65536 -n 16384 -t {hostname}

Output looks like this:

Transmit: 1073741824 bytes in 9299 milli-seconds = 115468.52 KB/sec (923748.2 Kbps).

In the example we are pretty close to the full gigabit capacity.
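
The arithmetic behind that output line can be reproduced in a few lines, which also makes it easy to work out what to expect on a 100 meg link. A small sketch, assuming that ttcp's "KB" means 1000 bytes (this matches the quoted figures to within rounding) and that `-l` and `-n` are buffer length and buffer count; the function names are made up here.

```python
def ttcp_throughput(buffer_len, buffer_count, millis):
    """Return (total_bytes, kb_per_sec, kbps) for one ttcp run."""
    total_bytes = buffer_len * buffer_count
    bytes_per_sec = total_bytes * 1000 / millis
    kb_per_sec = bytes_per_sec / 1000  # assumes 1 KB = 1000 bytes
    kbps = kb_per_sec * 8              # kilobits per second
    return total_bytes, kb_per_sec, kbps


def min_transfer_seconds(total_bytes, link_mbps):
    """Best-case time to move total_bytes over a link of link_mbps megabits/s."""
    return total_bytes * 8 / (link_mbps * 1_000_000)


total, kb_s, kbps = ttcp_throughput(65536, 16384, 9299)
print(total, round(kb_s, 2), round(kbps, 1))
print(round(min_transfer_seconds(total, 100), 1), "s minimum at 100 Mbit/s")
```

With `-l 65536 -n 16384` the transfer is exactly 1073741824 bytes, so on a healthy 100 meg link the same test cannot finish in much under 86 seconds. A run that takes far longer than that would point at the network rather than at NetBackup.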

Link to Netcordia : http://www.netcordia.com/downloads/free-network-tools.asp

PS: running the applet on Windows brings up a GUI.

pylej
Level 4
Hi Andy,

I don't see any Current File in the Admin Console. The Detailed Status tab in Job Details only shows me the path /home; it does not say which file within the /home directory it is stopping on.

Thank you
Jim

Andy_Welburn
Level 6
running - I remember not always being able to see "Current File" in Job Details.

Cannot remember how we isolated the file that was causing an issue (& it may well be that this is not your problem), but I did find the following T/N (altho' it is a bit long-in-the-tooth now)

File system backup job is successful, but bpbkar reports "sparse" files. ...

so maybe the bpbkar log may assist?

***EDIT**
My last note appears to be confirmed by:

"... enable bpbkar logging on the client.  This will capture ... any files that are being backed up."

from GENERAL ERROR: The bpbkar process hangs when attempting to back up a NFS file system. ...

pylej
Level 4
Hi Andy,

I set up the bpcd and bpbkar logs on the client, so I will kick off a backup and see what it tells me. I agree it may not be my problem this time, but it is good for me to understand in more depth what is going on. I will let you know if I see anything strange. The files it is backing up are all local; there are no NFS file systems. I have the feeling this is a network or NIC problem.

Thanks a lot for your help!
Jim

marekkedzierski
Level 6
Partner
Check the value of Client Read Timeout on the Media Server.
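
If you prefer to check this from the command line rather than the GUI, the setting normally shows up as a `CLIENT_READ_TIMEOUT` entry in the media server's `bp.conf`. The sketch below just parses `KEY = value` lines; the path `/usr/openv/netbackup/bp.conf` and the sample values are assumptions for illustration, not taken from this system.

```python
def read_timeouts(conf_text, keys=("CLIENT_READ_TIMEOUT", "CLIENT_CONNECT_TIMEOUT")):
    """Return {key: seconds} for the timeout entries found in bp.conf-style text."""
    found = {}
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        key = key.strip()
        if sep and key in keys:
            try:
                found[key] = int(value.strip())
            except ValueError:
                pass  # ignore entries whose value is not a plain integer
    return found


# Hypothetical bp.conf contents, for illustration only.
sample = """\
SERVER = master1
CLIENT_READ_TIMEOUT = 1800
"""
print(read_timeouts(sample))
```

On a real media server you would pass in the file contents, for example `read_timeouts(open("/usr/openv/netbackup/bp.conf").read())`. A key that is absent from the result simply means the entry is not set in the file and the built-in default applies.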